Similar articles
1.
Summary Absence of a perfect reference test is an acknowledged source of bias in diagnostic studies. In the case of tuberculous pleuritis, standard reference tests such as smear microscopy, culture and biopsy have poor sensitivity. Yet meta‐analyses of new tests for this disease have always assumed the reference standard is perfect, leading to biased estimates of the new test’s accuracy. We describe a method for joint meta‐analysis of sensitivity and specificity of the diagnostic test under evaluation, while considering the imperfect nature of the reference standard. We use a Bayesian hierarchical model that takes into account within‐ and between‐study variability. We show how to obtain pooled estimates of sensitivity and specificity, and how to plot a hierarchical summary receiver operating characteristic curve. We describe extensions of the model to situations where multiple reference tests are used, and where index and reference tests are conditionally dependent. The performance of the model is evaluated using simulations and illustrated using data from a meta‐analysis of nucleic acid amplification tests (NAATs) for tuberculous pleuritis. The estimate of NAAT specificity was higher and the sensitivity lower compared to a model that assumed that the reference test was perfect.

2.
Diagnostic or screening tests are widely used in medical fields to classify patients according to their disease status. Several statistical models for meta‐analysis of diagnostic test accuracy studies have been developed to synthesize test sensitivity and specificity of a diagnostic test of interest. Because of the correlation between test sensitivity and specificity, modeling the two measures using a bivariate model is recommended. In this paper, we extend the current standard bivariate linear mixed model (LMM) by proposing two variance‐stabilizing transformations: the arcsine square root and the Freeman–Tukey double arcsine transformation. We compared the performance of the proposed methods with the standard method through simulations using several performance measures. The simulation results showed that our proposed methods performed better than the standard LMM in terms of bias, root mean square error, and coverage probability in most of the scenarios, even when data were generated assuming the standard LMM. We also illustrated the methods using two real data sets.
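The two variance-stabilizing transformations named in the abstract have simple closed forms; a minimal sketch in plain Python (function names are mine, not from the paper):

```python
import math

def arcsine_sqrt(x, n):
    """Arcsine square root transform of x events in n trials.
    Approximate variance: 1 / (4n)."""
    return math.asin(math.sqrt(x / n))

def freeman_tukey(x, n):
    """Freeman-Tukey double arcsine transform.
    Approximate variance: 1 / (4n + 2)."""
    return 0.5 * (math.asin(math.sqrt(x / (n + 1))) +
                  math.asin(math.sqrt((x + 1) / (n + 1))))
```

Both remain defined at x = 0 and x = n, which is one reason such transforms are attractive for sensitivities and specificities near 1.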

3.
A "gold" standard test, providing definitive verification of disease status, may be quite invasive or expensive. Current technological advances provide less invasive, or less expensive, diagnostic tests. Ideally, a diagnostic test is evaluated by comparing it with a definitive gold standard test. However, the decision to perform the gold standard test to establish the presence or absence of disease is often influenced by the results of the diagnostic test, along with other measured, or unmeasured, risk factors. If only data from patients who received the gold standard test are used to assess test performance, the commonly used measures of diagnostic test performance, sensitivity and specificity, are likely to be biased: sensitivity would often be higher, and specificity lower, than the true values. This bias is called verification bias. Without adjustment for verification bias, one may introduce into medical practice a diagnostic test with apparently, but not truly, high sensitivity. In this article, verification bias is treated as a missing covariate problem. We propose a flexible modeling and computational framework for evaluating the performance of a diagnostic test, with adjustment for nonignorable verification bias. The presented computational method can be utilized with any software that can repetitively use a logistic regression module. The approach is likelihood-based, and allows use of categorical or continuous covariates. An explicit formula for the observed information matrix is presented, so that one can easily compute standard errors of estimated parameters. The methodology is illustrated with a cardiology data example. We perform a sensitivity analysis of the dependence of the verification selection process on disease.

4.
Summary Asbestos exposure is a well‐known risk factor for various lung diseases, and when they occur, workmen's compensation boards need to make decisions concerning the probability the cause is work related. In the absence of a definitive work history, measures of short and long asbestos fibers as well as counts of asbestos bodies in the lung can be used as diagnostic tests for asbestos exposure. Typically, data from one or more lung samples are available to estimate the probability of asbestos exposure, often by comparing the values with those from a reference nonexposed population. As there is no gold standard measure, we explore a variety of latent class models that take into account the mixed discrete/continuous nature of the data, that each subject may provide data from more than one lung sample, and that the within‐subject results across different samples may be correlated. Our methods can be useful to compensation boards in providing individual level probabilities of exposure based on available data, to researchers who are studying the test properties for the various measures used in this area, and more generally, to other test situations with similar data structure.

5.
Summary In diagnostic medicine, estimating the diagnostic accuracy of a group of raters or medical tests relative to the gold standard is often the primary goal. When a gold standard is absent, latent class models where the unknown gold standard test is treated as a latent variable are often used. However, these models have been criticized in the literature from both a conceptual and a robustness perspective. As an alternative, we propose an approach where we exploit an imperfect reference standard with unknown diagnostic accuracy and conduct sensitivity analysis by varying this accuracy over scientifically reasonable ranges. In this article, a latent class model with crossed random effects is proposed for estimating the diagnostic accuracy of regional obstetrics and gynaecological (OB/GYN) physicians in diagnosing endometriosis. To avoid the pitfalls of models without a gold standard, we exploit the diagnostic results of a group of OB/GYN physicians with an international reputation for the diagnosis of endometriosis. We construct an ordinal reference standard based on the discordance among these international experts and propose a mechanism for conducting sensitivity analysis relative to the unknown diagnostic accuracy among them. A Monte Carlo EM algorithm is proposed for parameter estimation and a BIC‐type model selection procedure is presented. Through simulations and data analysis we show that this new approach provides a useful alternative to traditional latent class modeling approaches used in this setting.

6.
Sensitivity and specificity of a monitoring test
The usefulness of a diagnostic test is generally assessed by calculating the sensitivity and specificity, or the positive and negative predictive values, of the test. When subjects are monitored periodically for evidence of disease, these calculations must incorporate the varying amounts of information per individual. If, in addition, the test results lie on a continuous scale, these quantities vary with the cutoff value (cutpoint) used to define a positive test. They are usually calculated for a spectrum of potential cutpoints in order to produce receiver operating characteristic curves. In this paper we use a partial likelihood solution to the discrete logistic model in order to obtain estimates of the diagnostic test indices and to provide a significance test when the diagnostic test is administered repeatedly to individuals.
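Sweeping the cutpoint and recomputing sensitivity and specificity at each value is what traces out the ROC curve; a minimal cross-sectional sketch (it ignores the repeated-monitoring aspect of the paper, and the names are illustrative):

```python
def sens_spec_by_cutpoint(diseased, healthy, cutpoints):
    """Sensitivity and specificity at each cutpoint, calling a test
    positive when its value is >= the cutpoint."""
    rows = []
    for c in cutpoints:
        sens = sum(v >= c for v in diseased) / len(diseased)
        spec = sum(v < c for v in healthy) / len(healthy)
        rows.append((c, sens, spec))
    return rows
```

Plotting sensitivity against 1 - specificity over the returned rows gives the empirical ROC curve.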

7.
In diagnostic medicine, the volume under the receiver operating characteristic (ROC) surface (VUS) is a commonly used index to quantify the ability of a continuous diagnostic test to discriminate between three disease states. In practice, verification of the true disease status may be performed only for a subset of subjects under study since the verification procedure is invasive, risky, or expensive. The selection for disease examination might depend on the results of the diagnostic test and other clinical characteristics of the patients, which in turn can cause bias in estimates of the VUS. This bias is referred to as verification bias. Existing verification bias correction in three‐way ROC analysis focuses on ordinal tests. We propose verification bias‐correction methods to construct ROC surface and estimate the VUS for a continuous diagnostic test, based on inverse probability weighting. By applying U‐statistics theory, we develop asymptotic properties for the estimator. A Jackknife estimator of variance is also derived. Extensive simulation studies are performed to evaluate the performance of the new estimators in terms of bias correction and variance. The proposed methods are used to assess the ability of a biomarker to accurately identify stages of Alzheimer's disease.
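The VUS has a simple empirical U-statistic form: the proportion of triples, one subject from each disease state, whose test values are correctly ordered. A brute-force sketch under full verification (no bias correction; names are mine):

```python
from itertools import product

def empirical_vus(low, mid, high):
    """Empirical VUS: fraction of (low, mid, high) triples with
    low < mid < high. 1/6 corresponds to a useless test and 1 to
    perfect separation of the three disease states."""
    triples = list(product(low, mid, high))
    return sum(a < b < c for a, b, c in triples) / len(triples)
```

The inverse-probability-weighted version in the paper replaces each triple's unit contribution with a product of inverse verification probabilities.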

8.
  1. Obtaining accurate estimates of disease prevalence is crucial for the monitoring and management of wildlife populations but can be difficult if different diagnostic tests yield conflicting results and if the accuracy of each diagnostic test is unknown. Bayesian latent class analysis (BLCA) modeling offers a potential solution, providing estimates of prevalence levels and diagnostic test accuracy under the realistic assumption that no diagnostic test is perfect.
  2. In typical applications of this approach, the specificity of one test is fixed at or close to 100%, allowing the model to simultaneously estimate the sensitivity and specificity of all other tests, in addition to infection prevalence. In wildlife systems, a test with near‐perfect specificity is not always available, so we simulated data to investigate how decreasing this fixed specificity value affects the accuracy of model estimates.
  3. We used simulations to explore how the trade‐off between diagnostic test specificity and sensitivity impacts prevalence estimates and found that directional biases depend on pathogen prevalence. Both the precision and accuracy of results depend on the sample size, the diagnostic tests used, and the true infection prevalence, so these factors should be considered when applying BLCA to estimate disease prevalence and diagnostic test accuracy in wildlife systems. A wildlife disease case study, focusing on leptospirosis in California sea lions, demonstrated the potential for Bayesian latent class methods to provide reliable estimates under real‐world conditions.
  4. We delineate conditions under which BLCA improves upon the results from a single diagnostic across a range of prevalence levels and sample sizes, demonstrating when this method is preferable for disease ecologists working in a wide variety of pathogen systems.
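The paper fits this model by Bayesian computation; as a rough non-Bayesian illustration of the same latent class structure, here is an EM sketch for three conditionally independent tests with the specificity of one "anchor" test held fixed, as described in point 2 above (all names, starting values, and the EM framing are mine, not the authors'):

```python
def latent_class_em(counts, spec1, n_iter=200):
    """EM for a 3-test latent class model under conditional independence.
    counts: {(t1, t2, t3): frequency} over binary test results.
    spec1: specificity of test 1, held fixed (the 'anchor' test)."""
    pi, sens, spec = 0.3, [0.7, 0.7, 0.7], [spec1, 0.7, 0.7]
    n = sum(counts.values())
    for _ in range(n_iter):
        # E-step: posterior probability of infection for each result pattern
        post = {}
        for t in counts:
            pd, pn = pi, 1.0 - pi
            for j in range(3):
                pd *= sens[j] if t[j] else 1.0 - sens[j]
                pn *= 1.0 - spec[j] if t[j] else spec[j]
            post[t] = pd / (pd + pn)
        # M-step: update prevalence, sensitivities, and free specificities
        w = sum(counts[t] * post[t] for t in counts)
        pi = w / n
        sens = [sum(counts[t] * post[t] for t in counts if t[j]) / w
                for j in range(3)]
        spec = [spec1] + [
            sum(counts[t] * (1 - post[t]) for t in counts if not t[j]) / (n - w)
            for j in (1, 2)]
    return pi, sens, spec
```

Lowering the fixed `spec1` below its true value is exactly the misspecification whose consequences the simulations above explore.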

9.
Diagnostic tests play an important role in clinical practice. The objective of a diagnostic test accuracy study is to compare an experimental diagnostic test with a reference standard. The majority of these studies dichotomize test results into two categories: negative and positive. But often the underlying test results may be categorized into more than two ordered categories. This article concerns the situation where multiple studies have evaluated the same diagnostic test with the same multiple thresholds in a population of non‐diseased and diseased individuals. Recently, bivariate meta‐analysis has been proposed for the pooling of sensitivity and specificity, which are likely to be negatively correlated within studies. These ideas have been extended to the situation of diagnostic tests with multiple thresholds, leading to a multinomial model with multivariate normal between‐study variation. This approach is efficient, but computer‐intensive, and its convergence is highly dependent on starting values. Moreover, monotonicity of the sensitivities/specificities for increasing thresholds is not guaranteed. Here, we propose a Poisson‐correlated gamma frailty model, previously applied to a seemingly quite different situation, meta‐analysis of paired survival curves. Since the approach is based on hazards, it guarantees monotonicity of the sensitivities/specificities for increasing thresholds. The approach is less efficient than the multinomial/normal approach. On the other hand, the Poisson‐correlated gamma frailty model makes no assumptions on the relationship between sensitivity and specificity, gives consistent results, appears to be quite robust against different between‐study variation models, and is computationally very fast and reliable with regard to the overall sensitivities/specificities.

10.
Sensitivity and specificity are common measures of the accuracy of a diagnostic test. The usual estimators of these quantities are unbiased if data on the diagnostic test result and the true disease status are obtained from all subjects in an appropriately selected sample. In some studies, verification of the true disease status is performed only for a subset of subjects, possibly depending on the result of the diagnostic test and other characteristics of the subjects. Estimators of sensitivity and specificity based on this subset of subjects are typically biased; this is known as verification bias. Methods have been proposed to correct verification bias under the assumption that the missing data on disease status are missing at random (MAR), that is, the probability of missingness depends on the true (missing) disease status only through the test result and observed covariate information. When some of the covariates are continuous, or the number of covariates is relatively large, the existing methods require parametric models for the probability of disease or the probability of verification (given the test result and covariates), and hence are subject to model misspecification. We propose a new method for correcting verification bias based on the propensity score, defined as the predicted probability of verification given the test result and observed covariates. This is estimated separately for those with positive and negative test results. The new method classifies the verified sample into several subsamples that have homogeneous propensity scores and allows correction for verification bias. Simulation studies demonstrate that the new estimators are more robust to model misspecification than existing methods, but still perform well when the models for the probability of disease and probability of verification are correctly specified.
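The propensity-score method above generalizes the classical Begg–Greenes MAR correction; that simpler covariate-free correction is easy to sketch and shows the basic mechanics (this is not the authors' method, and the argument names are mine):

```python
def begg_greenes(n_pos, n_neg, v_pos_d, v_pos_nd, v_neg_d, v_neg_nd):
    """Begg-Greenes corrected sensitivity/specificity under MAR verification.
    n_pos, n_neg: all tested subjects, by test result.
    v_pos_d .. v_neg_nd: verified subjects, cross-classified by test result
    (pos/neg) and disease status (d = diseased, nd = non-diseased)."""
    p_d_pos = v_pos_d / (v_pos_d + v_pos_nd)   # P(D+ | T+) among verified
    p_d_neg = v_neg_d / (v_neg_d + v_neg_nd)   # P(D+ | T-) among verified
    p_pos = n_pos / (n_pos + n_neg)
    p_neg = 1.0 - p_pos
    sens = p_pos * p_d_pos / (p_pos * p_d_pos + p_neg * p_d_neg)
    spec = p_neg * (1 - p_d_neg) / (p_pos * (1 - p_d_pos) + p_neg * (1 - p_d_neg))
    return sens, spec
```

Under MAR, P(D | T) is estimable from the verified subset alone, and Bayes' theorem then recovers P(T | D); the propensity-score method replaces the two test-result strata with strata homogeneous in the estimated verification probability.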

11.
Background: Like many infectious diseases, there is no practical gold standard for diagnosing clinical visceral leishmaniasis (VL). Latent class modeling has been proposed to estimate a latent gold standard for identifying disease. These proposed models for VL have leveraged information from diagnostic tests with dichotomous serological and PCR assays, but have not employed continuous diagnostic test information. Methods/Principal findings: In this paper, we employ Bayesian latent class models to improve the identification of canine visceral leishmaniasis using the dichotomous PCR assay and the Dual Path Platform (DPP) serology test. The DPP test has historically been used as a dichotomous assay, but can also yield numerical information via the DPP reader. Using data collected from a cohort of hunting dogs across the United States, which were identified as having either negative or symptomatic disease, we evaluate the impact of including numerical DPP reader information as a proxy for immune response. We find that inclusion of DPP reader information allows us to illustrate changes in immune response as a function of age. Conclusions/Significance: Utilization of continuous DPP reader information can improve the correct discrimination between individuals that are negative for disease and those with clinical VL. These models provide a promising avenue for diagnostic testing in contexts with multiple, imperfect diagnostic tests. Specifically, they can easily be applied to human visceral leishmaniasis when diagnostic test results are available. Also, appropriate diagnosis of canine visceral leishmaniasis has important consequences for curtailing spread of disease to humans.

12.
The objective of this study was to develop methods to estimate the optimal threshold of a longitudinal biomarker and its credible interval when the diagnostic test is based on a criterion that reflects a dynamic progression of that biomarker. Two methods are proposed: one parametric and one non‐parametric. In both cases, Bayesian inference was used to derive the posterior distribution of the optimal threshold, from which an estimate and a credible interval could be obtained. A numerical study shows that the bias of the parametric method is low and the coverage probability of the credible interval close to the nominal value, with a small coverage asymmetry in some cases. This is also true for the non‐parametric method for large sample sizes. Both methods were applied to estimate the optimal prostate‐specific antigen nadir value to diagnose prostate cancer recurrence after a high‐intensity focused ultrasound treatment. The parametric method can also be applied to non‐longitudinal biomarkers.

13.
Syndrome differentiation and treatment (bianzheng lunzhi) is a distinctive feature and strength of traditional Chinese medicine. To support research on syndrome differentiation, more than a hundred animal models of TCM syndromes have already been developed in China and have played a positive role. After briefly reviewing the remaining problems and recent progress, this article uses qi deficiency syndrome as an example to introduce an approach to syndrome differentiation in rats and mice: (1) Comparison of qi deficiency manifestations between humans and rodents. Studies show that the qi deficiency manifestations of rats and mice closely resemble those of humans, so the approach used for human qi deficiency diagnosis can be adapted to differentiate the syndrome in rats and mice. (2) Quantitative syndrome differentiation of qi deficiency in rats and mice. Standardized, quantitative differentiation involves four basic questions: how to quantitatively collect the four-examination (diagnostic) indices, how to quantify the differentiation itself, how to set the diagnostic criteria and thresholds, and how to grade the severity of the syndrome and its improvement. Drawing on behavioral pharmacology assays, we compared tail-suspension immobility, jiggle-cage activity, open-field, and grip-strength tests, and finally selected the open-field and grip-strength tests as the quantitative diagnostic indices for qi deficiency, establishing the formula: qi deficiency degree/index = (individual grip strength / normal-group mean) × 0.5 + (individual horizontal movement / normal-group mean) × 0.3 + (individual rearing count / normal-group mean) × 0.2, together with inclusion criteria and thresholds for the diagnosis. Extensive, repeated experiments and applications showed the method and diagnostic criteria to be feasible, and preliminary evidence indicates that qi deficiency syndrome has a broad basis of altered gene expression and splicing in the neuro-endocrine-immune network. Standardized, quantitative four-examination measurement and syndrome differentiation for common deficiency syndromes in rats and mice has now been preliminarily achieved. The approach can dynamically track the evolution of syndromes in diseased animals, evaluate the efficacy of treatment based on syndrome differentiation, and be widely applied to disease-syndrome combined animal experiments, experimental optimization and evaluation of treatment regimens, and assessment of the properties of Chinese medicines, their active components, and formulas, making it suitable for research and development across TCM disciplines including basic theory, pharmacology, diagnostics, and clinical practice.
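The qi deficiency degree/index formula given in the abstract translates directly into code (a direct transcription of the published weights; the variable names are mine):

```python
def qi_deficiency_index(grip, horizontal, rearing,
                        grip_norm, horizontal_norm, rearing_norm):
    """Qi deficiency degree/index for one animal: grip strength weighted 0.5,
    horizontal (open-field) movement 0.3, and rearing count 0.2, each taken
    relative to the normal-group mean."""
    return (grip / grip_norm * 0.5
            + horizontal / horizontal_norm * 0.3
            + rearing / rearing_norm * 0.2)
```

An animal exactly at the normal-group means scores 1.0; lower scores indicate greater qi deficiency, with the diagnostic threshold set empirically as described above.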

14.
Liu et al. (2018) used a virtual species approach to test the effects of outliers on species distribution models. In their simulations, they applied a threshold value over the simulated suitabilities to generate the species distributions, suggesting that using a probabilistic simulation approach would have been more complex and yield the same results. Here, we argue that using a probabilistic approach is not necessarily more complex and may significantly change results. Although the threshold approach may be justified under limited circumstances, the probabilistic approach has multiple advantages. First, it is in line with ecological theory, which largely assumes non‐threshold responses. Second, it is more general, as it includes the threshold as a limiting case. Third, it allows a better separation of the relevant intervening factors that influence model performance. Therefore, we argue that the probabilistic simulation approach should be used as a general standard in virtual species studies.
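The two simulation schemes differ in essentially one line: a hard cutoff on suitability versus a Bernoulli draw per cell. A minimal sketch (standard-library Python, names are mine; note the threshold case is recovered as suitabilities approach 0 and 1):

```python
import random

def threshold_presences(suitability, cutoff):
    """Threshold approach: the species is present wherever suitability >= cutoff."""
    return [int(s >= cutoff) for s in suitability]

def probabilistic_presences(suitability, seed=0):
    """Probabilistic approach: presence in each cell is Bernoulli(suitability)."""
    rng = random.Random(seed)
    return [int(rng.random() < s) for s in suitability]
```

With intermediate suitabilities the two schemes produce different distributions, which is the crux of the argument above.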

15.
Summary Often a binary variable is generated by dichotomizing an underlying continuous variable measured at a specific time point according to a prespecified threshold value. In the event that the underlying continuous measurements are from a longitudinal study, one can use the repeated‐measures model to impute missing data on responder status as a result of subject dropout and apply the logistic regression model on the observed or otherwise imputed responder status. Standard Bayesian multiple imputation techniques (Rubin, 1987, Multiple Imputation for Nonresponse in Surveys) that draw the parameters for the imputation model from the posterior distribution and construct the variance of parameter estimates for the analysis model as a combination of within‐ and between‐imputation variances are found to be conservative. The frequentist multiple imputation approach that fixes the parameters of the imputation model at the maximum likelihood estimates and constructs the variance of parameter estimates for the analysis model using the results of Robins and Wang (2000, Biometrika 87, 113–124) is shown to be more efficient. We propose to apply Kenward and Roger (1997, Biometrics 53, 983–997) degrees of freedom to account for the uncertainty associated with variance–covariance parameter estimates for the repeated‐measures model.
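The Bayesian combination of within- and between-imputation variances referred to above is Rubin's rules; a minimal sketch (function and argument names are mine):

```python
def rubin_combine(estimates, variances):
    """Combine m multiply-imputed estimates by Rubin's rules.
    Returns the pooled estimate and its total variance:
    within-imputation + (1 + 1/m) * between-imputation."""
    m = len(estimates)
    qbar = sum(estimates) / m
    ubar = sum(variances) / m                               # within-imputation
    b = sum((q - qbar) ** 2 for q in estimates) / (m - 1)   # between-imputation
    return qbar, ubar + (1 + 1 / m) * b
```

The conservatism the abstract describes arises because the total variance also absorbs posterior uncertainty in the imputation-model parameters; the frequentist alternative fixes those parameters and uses a different variance formula.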

16.
In diagnostic studies, a new diagnostic test is often compared with a standard test and both tests are applied to the same patients, called a paired design. The true disease state is in general given by the so‐called gold standard (the most reliable method for classification), which has to be known for all patients. The benefit of the new diagnostic test can be evaluated by sensitivity and specificity, which are in fact proportions. This means that, for the comparison of two diagnostic tests, confidence intervals for the difference of the dependent estimated sensitivities and specificities are calculated. In the literature, many comparisons of different approaches can be found, but none explicitly for diagnostic studies. For this reason we compare 13 approaches for a set of scenarios that represent data of diagnostic studies (e.g., with sensitivity and specificity ≥0.8). With simulation studies, we show that the nonparametric interval with normal approximation can be recommended for the difference of two dependent sensitivities or specificities without restriction, the Wald interval with the limitation of slightly anti‐conservative results for small sample sizes, and the nonparametric interval with t‐approximation and the Tango interval with the limitation of conservative results for high correlations.
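Of the 13 approaches compared, the Wald interval for a difference of paired proportions is the simplest to state; a sketch using the discordant counts among diseased subjects (illustrative only, not the authors' simulation code):

```python
import math

def wald_ci_paired(b, c, n, z=1.96):
    """Wald CI for the difference of two dependent sensitivities.
    b: diseased subjects positive on test 1 only; c: positive on test 2 only;
    n: total diseased subjects. The same formula applied to non-diseased
    subjects gives the interval for a difference of specificities."""
    d = (b - c) / n
    var = (b + c - (b - c) ** 2 / n) / n ** 2
    half = z * math.sqrt(var)
    return d - half, d + half
```

Because both sensitivities are estimated on the same diseased subjects, the variance depends only on the discordant counts, which is what makes the paired design efficient.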

17.
Leisenring W, Alonzo T, Pepe MS. Biometrics 2000;56(2):345–351
Positive and negative predictive values of a diagnostic test are key clinically relevant measures of test accuracy. Surprisingly, statistical methods for comparing tests with regard to these parameters have not been available for the most common study design in which each test is applied to each study individual. In this paper, we propose a statistic for comparing the predictive values of two diagnostic tests using this paired study design. The proposed statistic is a score statistic derived from a marginal regression model and bears some relation to McNemar's statistic. As McNemar's statistic can be used to compare sensitivities and specificities of diagnostic tests, parameters that condition on disease status, our statistic can be considered as an analog of McNemar's test for the problem of comparing predictive values, parameters that condition on test outcome. We report on the results of a simulation study designed to examine the properties of this test under a variety of conditions. The method is illustrated with data from a study of methods for diagnosis of coronary artery disease.
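For contrast with the proposed score statistic, McNemar's statistic for comparing paired sensitivities (or specificities) uses only the discordant counts (this is the standard formula, not the authors' new statistic):

```python
def mcnemar_chi2(b, c):
    """McNemar chi-square (no continuity correction) for paired binary tests.
    b, c: discordant counts (subjects positive on one test only); the value
    is compared against a chi-square distribution with 1 degree of freedom."""
    return (b - c) ** 2 / (b + c)
```

The analog in the paper conditions on test outcome rather than disease status, which is why a direct McNemar comparison does not apply to predictive values.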

18.
Diagnostic studies in ophthalmology frequently involve binocular data where pairs of eyes are evaluated, through some diagnostic procedure, for the presence of certain diseases or pathologies. The simplest approach of estimating measures of diagnostic accuracy, such as sensitivity and specificity, treats eyes as independent, consequently yielding incorrect estimates, especially of the standard errors. Approaches that account for the inter‐eye correlation include regression methods using generalized estimating equations and likelihood techniques based on various correlated binomial models. The paper proposes a simple alternative statistical methodology of jointly estimating measures of diagnostic accuracy for binocular tests based on a flexible model for correlated binary data. Method-of-moments estimation of model parameters is outlined and asymptotic inference is discussed. The resulting estimates are straightforward and easy to obtain, requiring no special statistical software but only elementary calculations. Results of simulations indicate that large‐sample and bootstrap confidence intervals based on the estimates have relatively good coverage properties when the model is correctly specified. The computation of the estimates and their standard errors is illustrated with data from a study on diabetic retinopathy.

19.
Dual Processing Theories (DPT) assume that human cognition is governed by two distinct types of processes, typically referred to as type 1 (intuitive) and type 2 (deliberative). Based on DPT we have derived a Dual Processing Model (DPM) to describe and explain therapeutic medical decision-making. The DPM indicates that doctors decide to treat when treatment benefits outweigh its harms, which occurs when the probability of the disease is greater than the so-called “threshold probability” at which treatment benefits are equal to treatment harms. Here we extend our work to include a wider class of decision problems that involve diagnostic testing. We illustrate the applicability of the proposed model in a typical clinical scenario considering the management of a patient with prostate cancer. To that end, we calculate and compare two types of decision thresholds: one that adheres to expected utility theory (EUT) and a second according to the DPM. Our results showed that decisions to administer a diagnostic test could be better explained using the DPM threshold. This is because such decisions depend on objective evidence of test/treatment benefits and harms as well as type 1 cognition of benefits and harms, which are not considered under EUT. Given that type 1 processes are unique to each decision-maker, this means that the DPM threshold will vary among different individuals. We also showed that when type 1 processes exclusively dominate decisions, ordering a diagnostic test does not affect the decision; the decision is based on the assessment of benefits and harms of treatment. These findings could explain variations in the treatment and diagnostic patterns documented in today’s clinical practice.
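Under expected utility theory, the "threshold probability" at which treatment benefits equal harms has a simple closed form (the classical Pauker–Kassirer threshold). A sketch of that baseline (the DPM threshold modifies it with decision-maker-specific type 1 weights, which are not reproduced here):

```python
def treatment_threshold(benefit, harm):
    """Pauker-Kassirer treatment threshold under expected utility theory:
    treat when the probability of disease exceeds harm / (harm + benefit),
    the point at which expected treatment benefits equal expected harms."""
    return harm / (harm + benefit)
```

The larger the benefit relative to the harm, the lower the probability of disease needed to justify treatment.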

20.
Cannibalism may cause considerable mortality in juvenile fish, and it has been hypothesised that it may exert selection on offspring size in that larger offspring may enjoy a size refuge. For this to be evolutionarily advantageous, the survival of individual offspring must compensate for the reduced fecundity implied by larger offspring size. We develop a model which combines standard assumptions of size‐dependent mortality with adult cannibalism to investigate the potential for cannibalism to act as a selective force on offspring size. We find that for this potential to be realised, the mortality due to cannibalism must exceed a threshold value that is a decreasing function of non‐cannibalistic predation intensity, cannibalized size range width and the average cannibalized size. If cannibalism exceeds this threshold, the model predicts evolution of offspring size towards refuges above or below the cannibalized size range, depending on initial offspring size. Cannibalistic mortality cannot be so great that the population is non‐viable; however, the range of parameter values describing cannibalistic intensity allowed within these boundaries is wide. On this basis, we suggest that cannibalism is a potential mechanism for offspring size selection.

