1.
Modeling diagnostic error without a gold standard has been an active area of biostatistical research. In a majority of the approaches, model-based estimates of sensitivity, specificity, and prevalence are derived from a latent class model in which the latent variable represents an individual's true unobserved disease status. For simplicity, initial approaches assumed that the diagnostic test results on the same subject were independent given the true disease status (i.e., the conditional independence assumption). More recently, various authors have proposed approaches for modeling the dependence structure between test results given true disease status. This note discusses a potential problem with these approaches. Namely, we show that when the conditional dependence between tests is misspecified, estimators of sensitivity, specificity, and prevalence can be biased. Importantly, we demonstrate that with small numbers of tests, likelihood comparisons and other model diagnostics may not be able to distinguish between models with different dependence structures. We present asymptotic results that show the generality of the problem. Further, data analysis and simulations demonstrate the practical implications of model misspecification. Finally, we present some guidelines about the use of these models for practitioners.
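As a rough illustration of the conditional independence assumption described in this abstract, the following minimal sketch computes the probability of each binary test-result pattern in a two-class latent class model. The function name, argument names, and example values are hypothetical and are not taken from the paper.

```python
from itertools import product


def pattern_probabilities(prevalence, sensitivities, specificities):
    """Probability of each 0/1 test-result pattern, assuming the tests are
    independent given true disease status (illustrative sketch only)."""
    probs = {}
    for pattern in product((0, 1), repeat=len(sensitivities)):
        p_diseased = prevalence          # weight of the diseased latent class
        p_healthy = 1.0 - prevalence     # weight of the non-diseased latent class
        for result, se, sp in zip(pattern, sensitivities, specificities):
            # Diseased subjects test positive with probability se.
            p_diseased *= se if result == 1 else 1.0 - se
            # Non-diseased subjects test positive with probability 1 - sp.
            p_healthy *= 1.0 - sp if result == 1 else sp
        probs[pattern] = p_diseased + p_healthy
    return probs


# Hypothetical values: prevalence 0.2, two tests.
example = pattern_probabilities(0.2, [0.9, 0.8], [0.95, 0.9])
```

A model with conditional dependence would replace the simple products above with class-specific joint probabilities, which is where the misspecification discussed in the abstract can enter.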
2.
Geoffrey Jones Wesley O. Johnson Timothy E. Hanson Ronald Christensen 《Biometrics》2010,66(3):855-863
Summary We discuss the issue of identifiability of models for multiple dichotomous diagnostic tests in the absence of a gold standard (GS) test. Data arise as multinomial or product‐multinomial counts depending upon the number of populations sampled. Models are generally posited in terms of population prevalences, test sensitivities and specificities, and test dependence terms. It is commonly believed that if the degrees of freedom in the data meet or exceed the number of parameters in a fitted model then the model is identifiable. Goodman (1974, Biometrika 61, 215–231) established long ago that this is not the case. We discuss currently available models for multiple tests and argue in favor of an extension of a model that was developed by Dendukuri and Joseph (2001, Biometrics 57, 158–167). Subsequently, we further develop Goodman's technique, and make geometric arguments to give further insight into the nature of models that lack identifiability. We present illustrations using simulated and real data.
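The counting rule mentioned in this abstract can be sketched as follows, assuming a single population and the simplest conditional independence model (one prevalence plus a sensitivity and a specificity per test). The function names are hypothetical. As the abstract stresses, passing this count is only a necessary condition and does not establish identifiability.

```python
def degrees_of_freedom(num_tests):
    """Free cell probabilities in the 2^k table from k binary tests,
    one population (illustrative assumption)."""
    return 2 ** num_tests - 1


def num_parameters_conditional_independence(num_tests):
    """One prevalence plus a sensitivity and a specificity per test."""
    return 2 * num_tests + 1


def passes_counting_rule(num_tests):
    """True when the data have at least as many degrees of freedom as the
    model has parameters. This is necessary, not sufficient."""
    return degrees_of_freedom(num_tests) >= num_parameters_conditional_independence(num_tests)
```

Under these assumptions, two tests give 3 degrees of freedom against 5 parameters, while three tests give 7 against 7. Adding test dependence terms increases the parameter count further.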
3.
Many analyses of results from multiple diagnostic tests assume the tests are statistically independent conditional on the true disease status of the subject. This assumption may be violated in practice, especially in situations where none of the tests is a perfectly accurate gold standard. Classical inference for models accounting for the conditional dependence between tests requires that results from at least four different tests be used in order to obtain an identifiable solution, but it is not always feasible to have results from this many tests. We use a Bayesian approach to draw inferences about the disease prevalence and test properties while adjusting for the possibility of conditional dependence between tests, particularly when we have only two tests. We propose both fixed and random effects models. Since with fewer than four tests the problem is nonidentifiable, the posterior distributions are strongly dependent on the prior information about the test properties and the disease prevalence, even with large sample sizes. If the degree of correlation between the tests is known a priori with high precision, then our methods adjust for the dependence between the tests. Otherwise, our methods provide adjusted inferences that incorporate all of the uncertainty inherent in the problem, typically resulting in wider interval estimates. We illustrate our methods using data from a study on the prevalence of Strongyloides infection among Cambodian refugees to Canada.
4.
Albert PS 《Biometrics》2007,63(3):947-957
Interest often focuses on estimating sensitivity and specificity of a group of raters or a set of new diagnostic tests in situations in which gold standard evaluation is expensive or invasive. Various authors have proposed semilatent class modeling approaches for estimating diagnostic accuracy in this situation. This article presents imputation approaches for this problem. I show how imputation provides a simpler way of performing diagnostic accuracy and prevalence estimation than the use of semilatent modeling. Furthermore, the imputation approach is more robust to modeling assumptions and, in general, there is only a moderate efficiency loss relative to a correctly specified semilatent class model. I apply imputation to a study designed to estimate the diagnostic accuracy of digital radiography for gastric cancer. The feasibility and robustness of imputation are illustrated with analysis, asymptotic results, and simulations.
5.
Spencer BD 《Biometrics》2012,68(2):559-566
Latent class models are increasingly used to assess the accuracy of medical diagnostic tests and other classifications when no gold standard is available and the true state is unknown. When the latent class is treated as the true class, the latent class models provide measures of components of accuracy including specificity and sensitivity and their complements, type I and type II error rates. The error rates according to the latent class model differ from the true error rates, however, and empirical comparisons with a gold standard suggest the true error rates often are larger. We investigate conditions under which the true type I and type II error rates are larger than those provided by the latent class models. Results from Uebersax (1988, Psychological Bulletin 104, 405-416) are extended to accommodate random effects and covariates affecting the responses. The results are important for interpreting the results of latent class analyses. An error decomposition is presented that incorporates an error component from invalidity of the latent class model.
6.
Pfeiffer RM Carroll RJ Wheeler W Whitby D Mbulaiteye S 《Biostatistics (Oxford, England)》2008,9(1):137-151
For many diseases, it is difficult or impossible to establish a definitive diagnosis because a perfect "gold standard" may not exist or may be too costly to obtain. In this paper, we propose a method to use continuous test results to estimate prevalence of disease in a given population and to estimate the effects of factors that may influence prevalence. Motivated by a study of human herpesvirus 8 among children with sickle-cell anemia in Uganda, where 2 enzyme immunoassays were used to assess infection status, we fit 2-component multivariate mixture models. We model the component densities using parametric densities that include data transformation as well as flexible transformed models. In addition, we model the mixing proportion, the probability of a latent variable corresponding to the true unknown infection status, via a logistic regression to incorporate covariates. This model includes mixtures of multivariate normal densities as a special case and is able to accommodate unusual shapes and skewness in the data. We assess model performance in simulations and present results from applying various parameterizations of the model to the Ugandan study.
7.
In this article, we show that, if subjects are assumed to be homogeneous within a finite set of latent classes, the basic restrictions of the Rasch model (conditional independence and unidimensionality) can be relaxed in a flexible way by simply adding appropriate columns to a basic design matrix. When discrete covariates are available so that subjects may be classified into strata, we show how a joint modeling approach can achieve greater parsimony. Parameter estimates may be obtained by maximizing the conditional likelihood (given the total number of captures) with a combined use of the EM and Fisher scoring algorithms. We also discuss a technique for obtaining confidence intervals for the size of the population under study based on the profile likelihood.
8.
Summary Asbestos exposure is a well‐known risk factor for various lung diseases, and when they occur, workmen's compensation boards need to make decisions concerning the probability that the cause is work related. In the absence of a definitive work history, measures of short and long asbestos fibers as well as counts of asbestos bodies in the lung can be used as diagnostic tests for asbestos exposure. Typically, data from one or more lung samples are available to estimate the probability of asbestos exposure, often by comparing the values with those from a reference nonexposed population. As there is no gold standard measure, we explore a variety of latent class models that take into account the mixed discrete/continuous nature of the data, that each subject may provide data from more than one lung sample, and that the within‐subject results across different samples may be correlated. Our methods can be useful to compensation boards in providing individual level probabilities of exposure based on available data, to researchers who are studying the test properties for the various measures used in this area, and more generally, to other test situations with similar data structure.
9.
Summary In diagnostic medicine, estimating the diagnostic accuracy of a group of raters or medical tests relative to the gold standard is often the primary goal. When a gold standard is absent, latent class models where the unknown gold standard test is treated as a latent variable are often used. However, these models have been criticized in the literature from both a conceptual and a robustness perspective. As an alternative, we propose an approach where we exploit an imperfect reference standard with unknown diagnostic accuracy and conduct sensitivity analysis by varying this accuracy over scientifically reasonable ranges. In this article, a latent class model with crossed random effects is proposed for estimating the diagnostic accuracy of regional obstetrics and gynaecological (OB/GYN) physicians in diagnosing endometriosis. To avoid the pitfalls of models without a gold standard, we exploit the diagnostic results of a group of OB/GYN physicians with an international reputation for the diagnosis of endometriosis. We construct an ordinal reference standard based on the discordance among these international experts and propose a mechanism for conducting sensitivity analysis relative to the unknown diagnostic accuracy among them. A Monte Carlo EM algorithm is proposed for parameter estimation and a BIC‐type model selection procedure is presented. Through simulations and data analysis we show that this new approach provides a useful alternative to traditional latent class modeling approaches used in this setting.
10.
Repeated binary responses provide efficient information for two purposes: (1) estimating two misclassification (false-positive and false-negative error) probabilities and (2) testing the hypothesis that either is zero in a reliability study. We focus on the assessment of reliability of a diagnostic test when there is no gold standard. This paper uses a latent class model and illustrates some of its properties. In addition, application to data containing variation among individuals is considered. We apply this model to the serological data on the MNSs blood group of atomic bomb survivors and their children. The results provide valuable information for examining measurement reliability.
11.
Receiver operating characteristic (ROC) regression methodology is used to identify factors that affect the accuracy of medical diagnostic tests. In this paper, we consider a ROC model for which the ROC curve is a parametric function of covariates but distributions of the diagnostic test results are not specified. Covariates can be either common to all subjects or specific to those with disease. We propose a new estimation procedure based on binary indicators defined by the test result for a diseased subject exceeding various specified quantiles of the distribution of test results from non-diseased subjects with the same covariate values. This procedure is conceptually and computationally simplified relative to existing procedures. Simulation study results indicate that the approach has fairly high statistical efficiency. The new ROC regression methodology is used to evaluate childhood measurements of body mass index as a predictive marker of adult obesity.
12.
Summary Latent class analysis (LCA) and latent class regression (LCR) are widely used for modeling multivariate categorical outcomes in social science and biomedical studies. Standard analyses assume data of different respondents to be mutually independent, excluding application of the methods to familial and other designs in which participants are clustered. In this article, we consider multilevel latent class models, in which subpopulation mixing probabilities are treated as random effects that vary among clusters according to a common Dirichlet distribution. We apply the expectation‐maximization (EM) algorithm for model fitting by maximum likelihood (ML). This approach works well, but is computationally intensive when either the number of classes or the cluster size is large. We propose a maximum pairwise likelihood (MPL) approach via a modified EM algorithm for this case. We also show that a simple latent class analysis, combined with robust standard errors, provides another consistent, robust, but less‐efficient inferential procedure. Simulation studies suggest that the three methods work well in finite samples, and that the MPL estimates often have precision comparable to that of the ML estimates. We apply our methods to the analysis of comorbid symptoms in the obsessive compulsive disorder study. Our models' random effects structure has a more straightforward interpretation than those of competing methods, and thus should usefully augment tools available for LCA of multilevel data.
13.
Prospective studies of diagnostic test accuracy have important advantages over retrospective designs. Yet, when the disease being detected by the diagnostic test(s) has a low prevalence rate, a prospective design can require an enormous sample of patients. We consider two strategies to reduce the costs of prospective studies of binary diagnostic tests: stratification and two-phase sampling. Utilizing neither, one, or both of these strategies provides us with four study design options: (1) the conventional design involving a simple random sample (SRS) of patients from the clinical population; (2) a stratified design where patients from higher-prevalence subpopulations are more heavily sampled; (3) a simple two-phase design using an SRS in the first phase and selection for the second phase based on the test results from the first; and (4) a two-phase design with stratification in the first phase. We describe estimators for sensitivity and specificity and their variances for each design, along with sample size estimation. We offer some recommendations for choosing among the various designs. We illustrate the study designs with two examples.
14.
Habtamu K. Benecha John S. Preisser Kimon Divaris Amy H. Herring Kalyan Das 《Biometrical journal. Biometrische Zeitschrift》2018,60(4):845-858
Unlike zero‐inflated Poisson regression, marginalized zero‐inflated Poisson (MZIP) models for counts with excess zeros provide estimates with direct interpretations for the overall effects of covariates on the marginal mean. In the presence of missing covariates, MZIP and many other count data models are ordinarily fitted using complete case analysis methods due to lack of appropriate statistical methods and software. This article presents an estimation method for MZIP models with missing covariates. The method, which is applicable to other missing data problems, is illustrated and compared with complete case analysis by using simulations and dental data on the caries preventive effects of a school‐based fluoride mouthrinse program.
15.
Summary The rapid development of new biotechnologies allows us to understand biomedical dynamic systems in more detail and at a cellular level. Many of the subject‐specific biomedical systems can be described by a set of differential or difference equations that are similar to engineering dynamic systems. In this article, motivated by HIV dynamic studies, we propose a class of mixed‐effects state‐space models based on the longitudinal feature of dynamic systems. State‐space models with mixed‐effects components are very flexible in modeling the serial correlation of within‐subject observations and between‐subject variations. The Bayesian approach and the maximum likelihood method for standard mixed‐effects models and state‐space models are modified and investigated for estimating unknown parameters in the proposed models. In the Bayesian approach, full conditional distributions are derived and the Gibbs sampler is constructed to explore the posterior distributions. For the maximum likelihood method, we develop a Monte Carlo EM algorithm with a Gibbs sampler step to approximate the conditional expectations in the E‐step. Simulation studies are conducted to compare the two proposed methods. We apply the mixed‐effects state‐space model to a data set from an AIDS clinical trial to illustrate the proposed methodologies. The proposed models and methods may also have potential applications in other biomedical system analyses such as tumor dynamics in cancer research and genetic regulatory network modeling.
16.
Two methods of computing Monte Carlo estimators of variance components using restricted maximum likelihood via the expectation-maximisation algorithm are reviewed. A third approach is suggested and the performance of the methods is compared using simulated data.
17.
Two-level data with hierarchical structure and mixed continuous and polytomous data are very common in biomedical research. In this article, we propose a maximum likelihood approach for analyzing a latent variable model with these data. The maximum likelihood estimates are obtained by a Monte Carlo EM algorithm that uses the Gibbs sampler to approximate the E-step and the M-step and bridge sampling to monitor convergence. The approach is illustrated by a two-level data set concerning the development and preliminary findings from an AIDS preventative intervention for Filipina commercial sex workers, where the relationship between some latent quantities is investigated.
18.
The diagnosis/prognosis problem has already been introduced by the authors in previous papers as a classification problem for survival data. In this paper, the specific aspects of the estimation of the survival functions in diagnostic classes and the evaluation of the posterior probabilities of the diagnostic classes are addressed. A latent random variable Z is defined to denote the classification of censored and uncensored individuals, where early censored individuals cannot be immediately classified because Z is not observed. Parameter estimation of the resulting mixture survival model is carried out using a proper version of the EM algorithm with given prior probabilities on Z, and diagnostic/prognostic information provided by the observable covariates is also included in the model. Numerical examples using AIDS data and a simulation study are used to better outline the main features of the model and of the estimation methodology.
19.
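The EM algorithm is a general method for obtaining maximum likelihood estimates of parameters from incomplete data. This paper derives EM algorithms for estimating the genetic recombination fraction between two loci under three combinations of marker types (codominant-codominant, codominant-dominant, and dominant-dominant), together with a bootstrap method for obtaining the sampling variance of the recombination fraction, and extends them to recombination fraction estimation when some individuals have missing marker genotypes (no electrophoretic band detected). Extensive Monte Carlo simulations show that: (1) when linkage is tight, sample size has little effect on the estimate of the recombination fraction, whereas when linkage is loose, a larger sample is needed to detect linkage and to estimate the recombination fraction with reasonable precision; (2) estimating the recombination fraction from all individuals, including those with missing markers, is more accurate than using only the individuals without missing markers, and it significantly increases the statistical power of linkage detection.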
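To make the EM idea concrete for the codominant-codominant case, here is a minimal sketch under assumptions that the abstract does not state: an F2 population derived from a coupling-phase double heterozygote, no missing genotypes, and genotype counts already grouped by the number of recombinant gametes they imply. The function and argument names are hypothetical, and this is not claimed to be the authors' derivation.

```python
def em_recombination_fraction(n_zero, n_one, n_two, n_double_het,
                              start=0.25, iterations=200):
    """EM estimate of the recombination fraction r (illustrative sketch).

    Assumed setting: F2 from a coupling-phase cross, two codominant markers.
    n_zero       -- count of AABB and aabb (0 recombinant gametes)
    n_one        -- count of single heterozygotes, e.g. AABb (1 recombinant gamete)
    n_two        -- count of AAbb and aaBB (2 recombinant gametes)
    n_double_het -- count of AaBb, which carry either 0 or 2 recombinant
                    gametes; this is the incomplete-data class
    """
    total = n_zero + n_one + n_two + n_double_het
    r = start
    for _ in range(iterations):
        # E-step: probability that a double heterozygote arose from two
        # recombinant gametes, given the current value of r.
        p_double = r * r / ((1.0 - r) ** 2 + r * r)
        expected_recombinants = n_one + 2.0 * n_two + 2.0 * n_double_het * p_double
        # M-step: recombinant gametes divided by all gametes (two per individual).
        r = expected_recombinants / (2.0 * total)
    return r
```

For example, counts of 320, 320, 20, and 340 are the expected class sizes for 1000 individuals at r = 0.2 under these assumptions, and the iteration settles at that value. Dominant markers would merge some genotype classes and require additional E-step terms.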
20.
For many diseases the infection status of individuals cannot be observed directly, but can only be inferred from biomarkers that are subject to measurement error. Diagnosis of infection based on observed symptoms can itself be regarded as an imperfect test of infection status. The temporal relationship between infection and marker outcomes may be complex, especially for recurrent diseases where individuals can experience multiple bouts of infection. We propose an approach that first models the unobserved longitudinal infection status of individuals conditional on relevant covariates, and then jointly models the longitudinal sequence of biomarker outcomes conditional on infection status and covariate information through time, thus resulting in a joint model for longitudinal infection and biomarker sequences. This model can be used to investigate the temporal dynamics of infection, and to evaluate the usefulness of biomarkers for monitoring purposes. Our work is motivated and illustrated by a longitudinal study of bovine digital dermatitis (BDD) on commercial dairy farms in North West England and North Wales, in which the infection of interest is Treponeme spp., and the biomarkers of interest are a continuous enzyme-linked immunosorbent assay test outcome and a dichotomous outcome, foot lesion status. BDD is known to be one of the possible causes of foot lesions in cows.