Similar articles
Found 20 similar articles (search time: 15 ms)
1.
Summary: In estimation of the ROC curve, when the true disease status is subject to nonignorable missingness, the observed likelihood involves the missingness mechanism given by a selection model. In this article, we proposed a likelihood‐based approach to estimate the ROC curve and the area under the ROC curve when the verification bias is nonignorable. We specified a parametric disease model in order to make the nonignorable selection model identifiable. With the estimated verification and disease probabilities, we constructed four types of empirical estimates of the ROC curve and its area based on imputation and reweighting methods. In practice, a reasonably large sample size is required to estimate the nonignorable selection model in our settings. Simulation studies showed that all four estimators of the ROC area performed well, and the imputation estimators were generally more efficient than the other estimators proposed. We applied the proposed method to a data set from research in Alzheimer's disease.

2.
To quantify the ability of a marker to predict the onset of a clinical outcome in the future, time‐dependent estimators of sensitivity, specificity, and the ROC curve have been proposed that account for censoring of the outcome. In this paper, we review these estimators, recall their assumptions about the censoring mechanism, and highlight their relationships and properties. A simulation study shows that marker‐dependent censoring can lead to important biases for the ROC estimators not adapted to this case. A slight modification of the inverse probability of censoring weighting estimators proposed by Uno et al. (2007) and Hung and Chiang (2010a) performs as well as the nearest neighbor estimator of Heagerty et al. (2000) in the simulation study and has interesting practical properties. Finally, the estimators were used to evaluate the ability of a marker combining age and a cognitive test to predict dementia in the elderly. Data were obtained from the French PAQUID cohort. The censoring appears clearly marker‐dependent, leading to appreciable differences between the ROC curves estimated with the different methods.

3.
We compare several nonparametric and parametric weighting methods for adjusting for the effect of strata. In particular, we focus on adjustment methods in the context of receiver‐operating characteristic (ROC) analysis. Nonparametrically, the rank‐based van Elteren test and inverse‐variance (IV) weighting using the area under the ROC curve (AUC) are examined. Parametrically, the stratified t‐test and the IV‐weighted AUC method are applied based on a binormal monotone transformation model. Stratum‐specific, pooled, and adjusted estimates of the AUC are obtained. We illustrate and compare these weighting methods on a multi‐center diagnostic trial and through extensive Monte‐Carlo simulations.
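The inverse‐variance weighting of stratum‐specific AUCs mentioned in this abstract can be illustrated with a minimal sketch. It assumes the empirical (Mann-Whitney) AUC and a DeLong-type variance within each stratum; the function names are hypothetical and the abstract does not say which variance estimator the authors used.

```python
import numpy as np

def auc_mann_whitney(neg, pos):
    """Empirical AUC: P(pos > neg) + 0.5 * P(pos == neg)."""
    neg = np.asarray(neg, dtype=float)
    pos = np.asarray(pos, dtype=float)
    diff = pos[:, None] - neg[None, :]          # rows: positives, columns: negatives
    return float(np.mean((diff > 0) + 0.5 * (diff == 0)))

def auc_variance_delong(neg, pos):
    """DeLong-type variance of the empirical AUC (assumed here, not taken
    from the abstract). Needs at least two subjects in each group."""
    neg = np.asarray(neg, dtype=float)
    pos = np.asarray(pos, dtype=float)
    diff = pos[:, None] - neg[None, :]
    psi = (diff > 0) + 0.5 * (diff == 0)
    v_pos = psi.mean(axis=1)                    # one placement value per positive
    v_neg = psi.mean(axis=0)                    # one placement value per negative
    return float(v_pos.var(ddof=1) / len(pos) + v_neg.var(ddof=1) / len(neg))

def iv_weighted_auc(aucs, variances):
    """Combine stratum-specific AUCs with weights 1 / variance.
    Returns the adjusted AUC and its standard error."""
    aucs = np.asarray(aucs, dtype=float)
    weights = 1.0 / np.asarray(variances, dtype=float)
    adjusted = float(np.sum(weights * aucs) / np.sum(weights))
    std_err = float(np.sqrt(1.0 / np.sum(weights)))
    return adjusted, std_err
```

A stratum with perfect separation has an estimated variance of zero and would receive infinite weight, so real analyses need a rule for such strata; this sketch does not include one.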

4.
Censored quantile regression models, which offer great flexibility in assessing covariate effects on event times, have attracted considerable research interest. In this study, we consider flexible estimation and inference procedures for competing risks quantile regression, which not only provides meaningful interpretations by using cumulative incidence quantiles but also extends the conventional accelerated failure time model by relaxing some of the stringent model assumptions, such as global linearity and unconditional independence. Current methods for censored quantile regression often involve minimizing an L1‐type convex function or solving nonsmooth estimating equations. This approach could lead to multiple roots in practical settings, particularly with multiple covariates. Moreover, variance estimation involves an unknown error distribution, and most methods rely on computationally intensive resampling techniques such as bootstrapping. We extend the induced smoothing procedure for censored quantile regression to the competing risks setting. The proposed procedure permits the fast and accurate computation of quantile regression parameter estimates and standard variances by using conventional numerical methods such as the Newton–Raphson algorithm. Numerical studies show that the proposed estimators perform well and the resulting inference is reliable in practical settings. The method is finally applied to data from a soft tissue sarcoma study.

5.
The receiver operating characteristic (ROC) curve is commonly used to evaluate and compare the accuracy of classification methods or markers. Estimating ROC curves has been an important problem in various fields, including biometric recognition and diagnostic medicine. In real applications, classification markers are often developed under two or more ordered conditions, such that a natural stochastic ordering exists among the observations. Incorporating such a stochastic ordering into estimation can improve statistical efficiency (Davidov and Herman, 2012). In addition, clustered and correlated data arise when multiple measurements are gleaned from the same subject, making estimation of ROC curves complicated due to within-cluster correlations. In this article, we propose to model the ROC curve using a weighted empirical process to jointly account for the order constraint and within-cluster correlation structure. The algebraic properties of the resulting summary statistics of the ROC curve, such as its area and partial area, are also studied. The algebraic expressions reduce to those of Davidov and Herman (2012) for independent observations. We derive asymptotic properties of the proposed order-restricted estimators and show that they have smaller mean-squared errors than the existing estimators. Simulation studies also demonstrate better performance of the newly proposed estimators over existing methods for finite samples. The proposed method is further exemplified with the fingerprint matching data from the National Institute of Standards and Technology Special Database 4.

6.
Liu D, Zhou XH. Biometrics, 2011, 67(3): 906-916
Covariate-specific receiver operating characteristic (ROC) curves are often used to evaluate the classification accuracy of a medical diagnostic test or a biomarker, when the accuracy of the test is associated with certain covariates. In many large-scale screening tests, the gold standard is subject to missingness due to high cost or harmfulness to the patient. In this article, we propose a semiparametric estimation of the covariate-specific ROC curves with a partially missing gold standard. A location-scale model is constructed for the test result to model the covariates' effect, but the residual distributions are left unspecified. Thus the baseline and link functions of the ROC curve both have flexible shapes. Under the assumption that the gold standard is missing at random (MAR), we consider weighted estimating equations for the location-scale parameters, and weighted kernel estimating equations for the residual distributions. Three ROC curve estimators are proposed and compared, namely, imputation-based, inverse probability weighted, and doubly robust estimators. We derive the asymptotic normality of the estimated ROC curve, as well as the analytical form of the standard error estimator. The proposed method is motivated by, and applied to, data from an Alzheimer's disease study.

7.
Summary: In medical research, receiver operating characteristic (ROC) curves can be used to evaluate the performance of biomarkers for diagnosing diseases or predicting the risk of developing a disease in the future. The area under the ROC curve (ROC AUC), as a summary measure of ROC curves, is widely utilized, especially when comparing multiple ROC curves. In observational studies, the estimation of the AUC is often complicated by the presence of missing biomarker values, which means that the existing estimators of the AUC are potentially biased. In this article, we develop robust statistical methods for estimating the ROC AUC; the proposed methods use information from auxiliary variables that are potentially predictive of the missingness of the biomarkers or of the missing biomarker values. We are particularly interested in auxiliary variables that are predictive of the missing biomarker values. In the case of missing at random (MAR), that is, when missingness of biomarker values depends only on the observed data, our estimators have the attractive feature of being consistent if one correctly specifies, conditional on auxiliary variables and disease status, either the model for the probabilities of being missing or the model for the biomarker values. In the case of missing not at random (MNAR), that is, when missingness may depend on the unobserved biomarker values, we propose a sensitivity analysis to assess the impact of MNAR on the estimation of the ROC AUC. The asymptotic properties of the proposed estimators are studied and their finite‐sample behaviors are evaluated in simulation studies. The methods are further illustrated using data from a study of maternal depression during pregnancy.

8.
We study the problem of estimating the density of a random variable G, given observations of a random variable Y = G + E. The random variable E is independent of G and its probability distribution function is considered as known. We build a family of estimators of the density of G using characteristic functions. We then derive a family of estimators of the density of Y based on the model for Y. The estimators are shown to be asymptotically unbiased and consistent. Simulations show that these estimators are better, as measured by integrated squared error, than the standard kernel estimators. Finally, we give an example of the use of this method for the detection of major genes in animal populations.

9.
Understanding the functional relationship between the sample size and the performance of species richness estimators is necessary to optimize limited sampling resources against estimation error. Nonparametric estimators such as Chao and Jackknife demonstrate strong performances, but consensus is lacking as to which estimator performs better under constrained sampling. We explore a method to improve the estimators under such scenarios. The method we propose involves randomly splitting species‐abundance data from a single sample into two equally sized samples, and using an appropriate incidence‐based estimator to estimate richness. To test this method, we assume a lognormal species‐abundance distribution (SAD) with varying coefficients of variation (CV), generate samples using MCMC simulations, and use the expected mean‐squared error as the performance criterion of the estimators. We test this method for the Chao, Jackknife, ICE, and ACE estimators. Between abundance‐based estimators with the single sample, and incidence‐based estimators with the split‐in‐two samples, Chao2 performed the best when CV < 0.65, and incidence‐based Jackknife performed the best when CV > 0.65, given that the ratio of sample size to observed species richness is greater than a critical value given by a power function of CV with respect to abundance of the sampled population. The proposed method increases the performance of the estimators substantially and is more effective when more rare species are in an assemblage. We also show that the splitting method works qualitatively similarly well when the SADs are log series, geometric series, and negative binomial. We demonstrate an application of the proposed method by estimating richness of zooplankton communities in samples of ballast water. The proposed splitting method is an alternative to sampling a large number of individuals to increase the accuracy of richness estimations; therefore, it is appropriate for a wide range of resource‐limited sampling scenarios in ecology.
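The splitting step described in this abstract can be sketched roughly as follows. This is an illustration under assumptions, not the authors' implementation: individuals are assigned to the two halves by a random permutation, and richness is estimated with the bias‐corrected form of Chao2 for two sampling units. The function name `split_sample_chao2` and its parameters are hypothetical.

```python
import numpy as np

def split_sample_chao2(abundances, rng=None):
    """Split one abundance sample into two random halves and apply the
    bias-corrected Chao2 incidence estimator with T = 2 sampling units."""
    rng = np.random.default_rng(rng)            # accepts None, a seed, or a Generator
    abundances = np.asarray(abundances, dtype=int)
    n_species = len(abundances)

    # One species label per individual, shuffled and cut into two halves.
    labels = np.repeat(np.arange(n_species), abundances)
    rng.shuffle(labels)
    half_a, half_b = np.array_split(labels, 2)

    # Incidence: in how many of the two halves does each species occur?
    in_a = np.bincount(half_a, minlength=n_species) > 0
    in_b = np.bincount(half_b, minlength=n_species) > 0
    incidence = in_a.astype(int) + in_b.astype(int)

    s_obs = int(np.sum(incidence > 0))          # observed richness
    q1 = int(np.sum(incidence == 1))            # species found in exactly one half
    q2 = int(np.sum(incidence == 2))            # species found in both halves
    t = 2                                       # number of sampling units
    return s_obs + ((t - 1) / t) * q1 * (q1 - 1) / (2 * (q2 + 1))
```

Because the split is random, repeated calls give different estimates; averaging over many splits is one possible way to stabilize the result, though the abstract does not say whether the authors do this.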

10.
Recent technological advances continue to provide noninvasive and more accurate biomarkers for evaluating disease status. One standard tool for assessing the accuracy of diagnostic tests is the receiver operating characteristic (ROC) curve. Few statistical methods exist to accommodate multiple continuous‐scale biomarkers in the framework of ROC analysis. In this paper, we propose a method to integrate continuous‐scale biomarkers to optimize classification accuracy. Specifically, we develop semiparametric transformation models for multiple biomarkers. We assume that unknown and marker‐specific transformations of biomarkers follow a multivariate normal distribution. Our models accommodate biomarkers subject to limits of detection and account for the dependence among biomarkers by including a subject‐specific random effect. We also propose a diagnostic measure using an optimal linear combination of the transformed biomarkers. Our diagnostic rule does not depend on any monotone transformation of biomarkers and is not sensitive to extreme biomarker values. Nonparametric maximum likelihood estimation (NPMLE) is used for inference. We show that the parameter estimators are asymptotically normal and efficient. We illustrate our semiparametric approach using data from the Endometriosis, Natural History, Diagnosis, and Outcomes (ENDO) study.

11.
We address estimation of the marginal effect of a time‐varying binary treatment on a continuous longitudinal outcome in the context of observational studies using electronic health records, when the relationship of interest is confounded, mediated, and further distorted by an informative visit process. We allow the longitudinal outcome to be recorded only sporadically and assume that its monitoring timing is informed by patients' characteristics. We propose two novel estimators based on linear models for the mean outcome that incorporate an adjustment for confounding and informative monitoring process through generalized inverse probability of treatment weights and a proportional intensity model, respectively. We allow for a flexible modeling of the intercept function as a function of time. Our estimators have closed‐form solutions, and their asymptotic distributions can be derived. Extensive simulation studies show that both estimators outperform standard methods such as the ordinary least squares estimator or estimators that only account for informative monitoring or confounders. We illustrate our methods using data from the Add Health study, assessing the effect of depressive mood on weight in adolescents.

12.
For an r × c table with ordinal responses, odds ratios are commonly used to describe the relationship between the row and column variables. This article presents two types of ordinal odds ratios: local‐global odds ratios are used to compare several groups on a c‐category ordinal response, and a global odds ratio is used to measure the global association between a pair of ordinal responses. When there is a stratification factor, we consider Mantel‐Haenszel (MH) type estimators of these odds ratios to summarize the association from several strata. Like the ordinary MH estimator of the common odds ratio for several 2 × 2 contingency tables, the estimators are used when the association is not expected to vary drastically among the strata. Also, the estimators are consistent under the ordinary asymptotic framework in which the number of strata is fixed and also under sparse asymptotics in which the number of strata grows with the sample size. Simulations find that the MH type estimators perform better than the maximum likelihood estimators, especially when each stratum has few observations. This article provides variance and covariance formulae for the local‐global odds ratio estimators and applies the bootstrap method to obtain a standard error for the global odds ratio estimator. Finally, we discuss possible ways of testing the homogeneity assumption.

13.
We develop time‐varying association analyses for onset ages of two lung infections to address the statistical challenges in utilizing registry data where onset ages are left‐truncated by ages of entry and competing‐risk censored by deaths. Two types of association estimators are proposed based on the conditional cause‐specific hazard function and the cumulative incidence function, which are adapted from unconditional quantities to handle left truncation. Asymptotic properties of the estimators are established using empirical process techniques. Our simulation study shows that the estimators perform well with moderate sample sizes. We apply our methods to the Cystic Fibrosis Foundation Registry data to study the relationship between onset ages of Pseudomonas aeruginosa and Staphylococcus aureus infections.

14.
Xu et al., in this issue of the Journal of Vegetation Science, compare several species richness estimators. All the non‐parametric estimators, such as Chao and jackknife estimators, underestimated the true number, whereas all the area‐based models, based on species–area curves, overestimated it. No reliable method yet exists to predict the number of species in an area that is appreciably larger than the one(s) sampled.

15.
The ROC (receiver operating characteristic) curve is the most commonly used statistical tool for describing the discriminatory accuracy of a diagnostic test. Classical estimation of the ROC curve relies on data from a simple random sample from the target population. In practice, estimation is often complicated because not all subjects undergo a definitive assessment of disease status (verification). Estimation of the ROC curve based on data only from subjects with verified disease status may be badly biased. In this work we investigate the properties of the doubly robust (DR) method for estimating the ROC curve under verification bias, originally developed by Rotnitzky, Faraggi and Schisterman (2006) for estimating the area under the ROC curve. The DR method can be applied to continuous‐scale tests and allows for a non‐ignorable process of selection to verification. We develop the estimator's asymptotic distribution and examine its finite sample properties via a simulation study. We exemplify the DR procedure for estimation of ROC curves with data collected on patients undergoing electron beam computed tomography, a diagnostic test for calcification of the arteries.

16.
In this paper, we focus on measures to evaluate discrimination of prediction models for ordinal outcomes. We review existing extensions of the dichotomous c‐index (equivalent to the area under the receiver operating characteristic, or ROC, curve), suggest a new measure, and study their relationships. The volume under the ROC surface (VUS) scores sets of cases including one case from each outcome category. VUS considers sets as either correctly or incorrectly ordered by the model. All other existing measures assess pairs of cases. We propose an ordinal c‐index (ORC) that is set‐based but, contrary to VUS, scores sets more gradually by indicating the closeness of the model‐based ordering to the perfect ordering. As a result, the ORC does not decrease rapidly as the number of outcome categories increases. It turns out that the ORC can be rewritten as the average of pairwise c‐indexes. Hence, the ORC has both a set‐ and pair‐based interpretation. There are several relationships between the existing measures, leading to only two types of existing measures: a prevalence‐weighted average of pairwise c‐indexes and the VUS. Our suggested measure ORC positions itself in between, as it is set‐based but turns out to equal an unweighted average of pairwise c‐indexes. The measures are demonstrated through a case study on the prediction of six‐month outcome after traumatic brain injury. In conclusion, the set‐based nature and graded scoring system make the ORC an attractive measure with a simple interpretation, together with its prevalence‐independence, which is a natural property of a discrimination measure.
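The pair‐based reading of the ORC stated in this abstract, an unweighted average of pairwise c‐indexes, can be sketched as below. The sketch assumes the model supplies a single scalar score per case, which the abstract does not specify; the function names are hypothetical.

```python
from itertools import combinations

import numpy as np

def pairwise_c_index(lower_scores, higher_scores):
    """c-index for two outcome categories: probability that a case from the
    higher category gets the higher score, counting ties as one half."""
    diff = np.asarray(higher_scores, dtype=float)[:, None] \
        - np.asarray(lower_scores, dtype=float)[None, :]
    return float(np.mean((diff > 0) + 0.5 * (diff == 0)))

def ordinal_c_index(scores, categories):
    """Unweighted average of the pairwise c-indexes over all category pairs.
    Assumes a scalar score per case (an assumption of this sketch)."""
    scores = np.asarray(scores, dtype=float)
    categories = np.asarray(categories)
    levels = np.unique(categories)              # sorted outcome categories
    values = [
        pairwise_c_index(scores[categories == lo], scores[categories == hi])
        for lo, hi in combinations(levels, 2)
    ]
    return float(np.mean(values))
```

Each category pair gets the same weight regardless of how many cases it contains, which is what makes the average independent of prevalence.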

17.
Zhiguo Li, Peter Gilbert, Bin Nan. Biometrics, 2008, 64(4): 1247-1255
Summary: Grouped failure time data arise often in HIV studies. In a recent preventive HIV vaccine efficacy trial, immune responses generated by the vaccine were measured from a case–cohort sample of vaccine recipients, who were subsequently evaluated for the study endpoint of HIV infection at prespecified follow‐up visits. Gilbert et al. (2005, Journal of Infectious Diseases 191, 666–677) and Forthal et al. (2007, Journal of Immunology 178, 6596–6603) analyzed the association between the immune responses and HIV incidence with a Cox proportional hazards model, treating the HIV infection diagnosis time as a right‐censored random variable. The data, however, are of the form of grouped failure time data with case–cohort covariate sampling, and we propose an inverse selection probability‐weighted likelihood method for fitting the Cox model to these data. The method allows covariates to be time dependent, and uses multiple imputation to accommodate covariate data that are missing at random. We establish asymptotic properties of the proposed estimators, and present simulation results showing their good finite sample performance. We apply the method to the HIV vaccine trial data, showing that higher antibody levels are associated with a lower hazard of HIV infection.

18.
Summary: Time‐varying individual covariates are problematic in experiments with marked animals because the covariate can typically only be observed when each animal is captured. We examine three methods to incorporate time‐varying individual covariates of the survival probabilities into the analysis of data from mark‐recapture‐recovery experiments: deterministic imputation, a Bayesian imputation approach based on modeling the joint distribution of the covariate and the capture history, and a conditional approach considering only the events for which the associated covariate data are completely observed (the trinomial model). After describing the three methods, we compare results from their application to the analysis of the effect of body mass on the survival of Soay sheep (Ovis aries) on the Isle of Hirta, Scotland. Simulations based on these results are then used to make further comparisons. We conclude that the trinomial model and the Bayesian imputation method each perform best in different situations. If the capture and recovery probabilities are all high, then the trinomial model produces precise, unbiased estimators that do not depend on any assumptions regarding the distribution of the covariate. In contrast, the Bayesian imputation method performs substantially better when capture and recovery probabilities are low, provided that the specified model of the covariate is a good approximation to the true data‐generating mechanism.

19.
The problem of parallelism for bi‐linear regression lines arises in many real‐life investigations. For two linear regression models with normal errors, the estimation of the slope as well as the intercept parameters is considered when it is suspected a priori that the two lines are parallel. Three different estimators are defined by using both the sample data and the non‐sample uncertain prior information. The relative performances of the unrestricted, restricted, and preliminary test estimators are investigated based on an analysis of the bias and risk functions under quadratic loss. An example based on a medical study is used to illustrate the method.

20.
In diagnostic medicine, the volume under the receiver operating characteristic (ROC) surface (VUS) is a commonly used index to quantify the ability of a continuous diagnostic test to discriminate between three disease states. In practice, verification of the true disease status may be performed only for a subset of subjects under study since the verification procedure is invasive, risky, or expensive. The selection for disease examination might depend on the results of the diagnostic test and other clinical characteristics of the patients, which in turn can cause bias in estimates of the VUS. This bias is referred to as verification bias. Existing verification bias correction in three‐way ROC analysis focuses on ordinal tests. We propose verification bias‐correction methods to construct the ROC surface and estimate the VUS for a continuous diagnostic test, based on inverse probability weighting. By applying U‐statistics theory, we develop asymptotic properties for the estimator. A jackknife estimator of variance is also derived. Extensive simulation studies are performed to evaluate the performance of the new estimators in terms of bias correction and variance. The proposed methods are used to assess the ability of a biomarker to accurately identify stages of Alzheimer's disease.
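An inverse probability weighted VUS of the kind this abstract describes might look like the sketch below. It is an illustration under stated assumptions rather than the authors' estimator: the verification probabilities `pi` are taken as already estimated, each verified subject is weighted by 1 / pi, ties in the test result count as incorrect orderings, and all names are hypothetical.

```python
import numpy as np

def ipw_vus(test, disease, verified, pi):
    """Weighted proportion of correctly ordered triples (class 1 < class 2 < class 3).

    test:     continuous test results
    disease:  class labels 1, 2, 3 (ignored where verified == 0)
    verified: 1 if the disease status was verified, else 0
    pi:       estimated probability of verification for each subject
    """
    test = np.asarray(test, dtype=float)
    disease = np.asarray(disease)
    verified = np.asarray(verified)
    weights = verified / np.asarray(pi, dtype=float)   # zero for unverified subjects

    keep = verified == 1
    t1, w1 = test[keep & (disease == 1)], weights[keep & (disease == 1)]
    t2, w2 = test[keep & (disease == 2)], weights[keep & (disease == 2)]
    t3, w3 = test[keep & (disease == 3)], weights[keep & (disease == 3)]

    # Indicator and weight for every triple, built by broadcasting.
    ordered = (t1[:, None, None] < t2[None, :, None]) & \
              (t2[None, :, None] < t3[None, None, :])
    triple_w = w1[:, None, None] * w2[None, :, None] * w3[None, None, :]
    return float(np.sum(ordered * triple_w) / np.sum(triple_w))
```

With every subject verified and `pi` equal to one, this reduces to the plain empirical VUS. The triple array grows with the product of the three class sizes, so the sketch suits small illustrative data sets only.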
