20 similar articles found (search time: 15 ms)
1.
Identifying biomarkers and building risk scores to predict survival outcomes is a major concern of clinical epidemiology, and so is the evaluation of prognostic models. In this paper, we are concerned with the estimation of the time-dependent AUC (area under the receiver operating characteristic curve), which naturally extends the standard AUC to the setting of survival outcomes and makes it possible to evaluate the discriminative power of prognostic models. We establish a simple and useful relation between the predictiveness curve and the time-dependent AUC, AUC(t). This relation confirms that the predictiveness curve is the key concept for evaluating calibration and discrimination of prognostic models. It also highlights that accurate estimates of the conditional absolute risk function should yield accurate estimates of AUC(t). From this observation, we derive several estimators of AUC(t) relying on distinct estimators of the conditional absolute risk function. An empirical study was conducted to compare our estimators with existing ones and to assess the effect of model misspecification (when estimating the conditional absolute risk function) on the estimation of AUC(t). We further illustrate the methodology on the Mayo PBC and the VA lung cancer data sets.
2.
Boris Leroy Robin Delsol Bernard Hugueny Christine N. Meynard Chéïma Barhoumi Morgane Barbet‐Massin Céline Bellard 《Journal of Biogeography》2018,45(9):1994-2002
The discriminating capacity (i.e. ability to correctly classify presences and absences) of species distribution models (SDMs) is commonly evaluated with metrics such as the area under the receiver operating characteristic curve (AUC), the Kappa statistic and the true skill statistic (TSS). AUC and Kappa have been repeatedly criticized, but TSS has fared relatively well since its introduction, mainly because it has been considered independent of prevalence. In addition, discrimination metrics have been contested because they should be calculated on presence–absence data, but are often used on presence‐only or presence‐background data. Here, we investigate TSS and an alternative set of metrics: similarity indices, also known as F‐measures. We first show that even in ideal conditions (i.e. perfectly random presence–absence sampling), TSS can be misleading because of its dependence on prevalence, whereas similarity/F‐measures provide adequate estimations of model discrimination capacity. Second, we show that in real‐world situations where sample prevalence is different from true species prevalence (i.e. biased sampling or presence‐pseudoabsence), no discrimination capacity metric provides adequate estimation of model discrimination capacity, including metrics specifically designed for modelling with presence‐pseudoabsence data. Our conclusions are twofold. First, they unequivocally impel SDM users to understand the potential shortcomings of discrimination metrics when quality presence–absence data are lacking, and we recommend obtaining such data. Second, in the specific case of virtual species, which are increasingly used to develop and test SDM methodologies, we strongly recommend the use of similarity/F‐measures, which were not biased by prevalence, contrary to TSS.
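As a rough illustration of the two metric families this abstract compares, the sketch below computes TSS and the Sørensen similarity index (numerically the same as the F1 score) from confusion-matrix counts. The function names and the example counts are hypothetical and are not taken from the paper.

```python
def tss(tp, fp, fn, tn):
    """True skill statistic: sensitivity + specificity - 1."""
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return sensitivity + specificity - 1


def sorensen(tp, fp, fn):
    """Sørensen similarity index (equal to F1); true negatives are not used."""
    return 2 * tp / (2 * tp + fp + fn)


# Hypothetical counts: identical tp, fp and fn, but the second case has many
# more true absences (i.e. a lower prevalence of presences).
balanced = dict(tp=40, fp=10, fn=10, tn=40)
low_prevalence = dict(tp=40, fp=10, fn=10, tn=940)

for name, c in [("balanced", balanced), ("low prevalence", low_prevalence)]:
    print(name, round(tss(**c), 3), sorensen(c["tp"], c["fp"], c["fn"]))
```

In this toy setting TSS rises when only the number of true absences grows, while the Sørensen index is unchanged because it ignores true negatives. This is only meant to show how the two formulas respond to the counts; it does not reproduce the paper's simulations.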
3.
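The importance of ROC curve shape in evaluating ecological niche models: a case study of the fall webworm (Hyphantria cunea). Total citations: 2 (self-citations: 0; citations by others: 2)
[Objective] Ecological niche models are widely applied in biogeography, invasion biology and conservation biology, and are increasingly used to predict the potential and realized distributions of species. Using the fall webworm as an example, this paper introduces the pROC approach to niche model evaluation and the points to note when applying it, with the aim of supporting sound evaluation of predicted potential distributions and promoting the appropriate use and development of niche models in China. [Methods] We introduce the basic principles of the ROC curve and the AUC value, summarize their use in niche model evaluation, analyze the strengths and weaknesses of the AUC for model evaluation in terms of the reliability of species presence and absence records, and finally introduce the partial area under the receiver operating characteristic curve approach (pROC) as a remedy for the shortcomings of the traditional AUC. [Results] Although the AUC is independent of the threshold, it combines sensitivity and specificity and thereby masks the individual behavior of the two measures. It cannot assess the sensitivity and the specificity of a prediction separately, nor can it weigh the omission rate against the commission error rate, so it can mislead users in evaluating a model. Compared with the AUC value, the shape of the ROC curve is more valuable and carries rich information for model evaluation. [Conclusion] Model evaluation needs to treat sensitivity and specificity separately. The shape of the ROC curve is more important than the AUC value in niche model evaluation, and the pROC approach has advantages over the traditional AUC, but it is prone to misjudging overfitted models. Model evaluation is closely tied to the purpose of the study: when the aim is to predict a species' potential distribution (e.g. the potential distribution of invasive species, the effects of climate change on species distributions, and phylogeography), evaluation should give more weight to sensitivity (or the omission rate); when the aim is to predict a species' realized distribution (e.g. delimiting protected areas and introducing endangered species), evaluation should give equal weight to sensitivity and specificity.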
4.
Zhanfeng Wang Xiangyu Luo Yuan‐chin I. Chang 《Biometrical journal. Biometrische Zeitschrift》2015,57(5):797-807
As medical research and technology advance, new biomarkers are continually found and new predictive models proposed to improve the diagnosis of diseases. Therefore, how to assess new biomarkers, in addition to the existing biomarkers and predictive models, becomes an important research problem. Many classification performance measures, which are usually based on performance over the whole range of cut‐off values, have been applied directly to this type of problem. However, in medical diagnosis some cut‐off points are more important than others, such as those within the range of high specificity. Thus, by analogy with the relation of the partial area under the ROC curve to the full area under the ROC curve, we study the partial integrated discriminant improvement (pIDI) for evaluating the predictive ability of a newly added marker over a prespecified range of cut‐offs. Theoretical properties of the estimate of the proposed measure are reported. The performance of this new measure is then compared with that of the partial area under an ROC curve. Numerical results using synthesized data are presented, and a liver cancer dataset is used for demonstration purposes.
5.
6.
S. G. CANDY 《Austral ecology》1997,22(2):233-235
Mac Nally (1996), in describing the application of ‘hierarchical partitioning’ to regression modelling of the species richness of breeding passerine birds, with the species count as the response variable, rejects the use of Poisson regression in favour of normal-errors regression on an incorrect basis. Mac Nally uses a function of the residual sum of squares, the root-mean-square prediction error (RMSPE), calculated from the predictions of each regression, and rejects the Poisson regression because its RMSPE was 20% larger. This note points out that the RMSPE will always be larger for the Poisson regression, given that the same link function and linear predictor are used, even if the response is truly Poisson. References to appropriate methods for determining the most suitable response distribution and link function in the context of generalized linear models are given.
7.
8.
Paul Blanche Jean‐François Dartigues Hélène Jacqmin‐Gadda 《Biometrical journal. Biometrische Zeitschrift》2013,55(5):687-704
To quantify the ability of a marker to predict the future onset of a clinical outcome, time‐dependent estimators of sensitivity, specificity, and the ROC curve have been proposed that account for censoring of the outcome. In this paper, we review these estimators, recall their assumptions about the censoring mechanism and highlight their relationships and properties. A simulation study shows that marker‐dependent censoring can lead to substantial biases for ROC estimators not adapted to this case. A slight modification of the inverse probability of censoring weighting estimators proposed by Uno et al. (2007) and Hung and Chiang (2010a) performs as well as the nearest neighbor estimator of Heagerty et al. (2000) in the simulation study and has interesting practical properties. Finally, the estimators were used to evaluate the ability of a marker combining age and a cognitive test to predict dementia in the elderly. Data were obtained from the French PAQUID cohort. The censoring appears clearly marker‐dependent, leading to appreciable differences between the ROC curves estimated with the different methods.
9.
10.
Markus Pauly Thomas Asendorf Frank Konietschke 《Biometrical journal. Biometrische Zeitschrift》2016,58(6):1319-1337
We investigate rank‐based studentized permutation methods for the nonparametric Behrens–Fisher problem, that is, inference methods for the area under the ROC curve. We prove that the studentized permutation distribution of the Brunner‐Munzel rank statistic is asymptotically standard normal, even under the alternative. This incidentally provides the hitherto missing theoretical foundation for the Neubert and Brunner studentized permutation test. In particular, we not only show its consistency, but also that confidence intervals for the underlying treatment effects can be computed by inverting this permutation test. In addition, we derive permutation‐based range‐preserving confidence intervals. Extensive simulation studies show that the permutation‐based confidence intervals appear to maintain the preassigned coverage probability quite accurately (even for rather small sample sizes). For convenient application of the proposed methods, a freely available software package for the statistical software R has been developed. A real data example illustrates the application.
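The link between the nonparametric Behrens–Fisher problem and the area under the ROC curve can be made concrete with the usual point estimate of the relative effect p = P(X < Y) + 0.5 P(X = Y), which coincides with the empirical AUC. The sketch below computes only this point estimate on made-up data; the function name is hypothetical, and the studentized permutation test and confidence intervals described in the abstract are not implemented here.

```python
def auc_relative_effect(controls, cases):
    """Estimate p = P(X < Y) + 0.5 * P(X = Y) over all (control, case) pairs."""
    total = 0.0
    for x in controls:
        for y in cases:
            if x < y:
                total += 1.0      # case scores higher than control
            elif x == y:
                total += 0.5      # ties count one half
    return total / (len(controls) * len(cases))


# Made-up marker values for illustration only
controls = [0.1, 0.4, 0.35, 0.8]
cases = [0.9, 0.4, 0.7, 0.6]
print(auc_relative_effect(controls, cases))  # 0.78125
```

A value of 0.5 corresponds to no tendency for one group to score higher than the other, which is the null hypothesis in the nonparametric Behrens–Fisher setting.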
11.
The rapid advancement in molecular technology has led to the discovery of many markers that have potential applications in disease diagnosis and prognosis. In a prospective cohort study, information on a panel of biomarkers as well as the disease status of a patient is routinely collected over time. Such information is useful for predicting patients' prognosis and selecting patients for targeted therapy. In this article, we develop procedures for constructing a composite test with optimal discrimination power when multiple markers are available to assist in prediction, and we characterize the accuracy of the resulting test by extending the time-dependent receiver operating characteristic (ROC) curve methodology. We employ a modified logistic regression model to derive optimal linear composite scores such that their corresponding ROC curves are maximized at every false positive rate. We provide theoretical justification for using such a model for prognostic accuracy. The proposed method allows for time-varying marker effects and accommodates censored failure time outcomes. When the effects of markers are approximately constant over time, we propose a more efficient estimating procedure under such models. We conduct numerical studies to evaluate the performance of the proposed procedures. Our results indicate that the proposed methods are both flexible and efficient. We illustrate these methods with an application concerning the prognostic accuracies of the expression levels of six genes.
12.
Comparison of the accuracy of two diagnostic tests using their receiver operating characteristic (ROC) curves has typically been conducted using fixed sample designs. On the other hand, the human experimentation inherent in a comparison of diagnostic modalities argues for periodic monitoring of the accruing data to address many issues related to the ethics and efficiency of the medical study. To date, very little research has been done on the use of sequential sampling plans for comparative ROC studies, even when these studies may use expensive and unsafe diagnostic procedures. In this article we propose a nonparametric group sequential design plan. The nonparametric sequential method adapts a nonparametric family of weighted area under the ROC curve statistics (Wieand et al., 1989, Biometrika 76, 585-592) and a group sequential sampling plan. We illustrate the implementation of this nonparametric approach for sequentially comparing ROC curves in the context of diagnostic screening for non-small-cell lung cancer. We also describe a semiparametric sequential method based on proportional hazards models. We compare the statistical properties of the nonparametric approach with alternative semiparametric and parametric analyses in simulation studies. The results show that the nonparametric approach is robust to model misspecification and has excellent finite-sample performance.
13.
Christopher S. McMahan Alexander C. McLain Colin M. Gallagher Enrique F. Schisterman 《Biometrical journal. Biometrische Zeitschrift》2016,58(4):944-961
There is a need for epidemiological and medical researchers to identify new biomarkers (biological markers) that are useful in determining exposure levels and/or for the purposes of disease detection. Often this process is hampered by the high testing costs associated with evaluating new biomarkers. Traditionally, biomarker assessments are individually tested within a target population. Pooling has been proposed to help alleviate the testing costs, where pools are formed by combining several individual specimens. Methods for using pooled biomarker assessments to estimate discriminatory ability have been developed. However, all these procedures have failed to account for confounding factors. In this paper, we propose a regression methodology based on pooled biomarker measurements that allows the assessment of the discriminatory ability of a biomarker of interest. In particular, we develop covariate‐adjusted estimators of the receiver operating characteristic curve, the area under the curve, and Youden's index. We establish the asymptotic properties of these estimators and develop inferential techniques that allow one to assess whether a biomarker is a good discriminator between cases and controls, while controlling for confounders. The finite sample performance of the proposed methodology is illustrated through simulation. We apply our methods to analyze myocardial infarction (MI) data, with the goal of determining whether the pro‐inflammatory cytokine interleukin‐6 is a good predictor of MI after controlling for the subjects' cholesterol levels.
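For readers unfamiliar with Youden's index, the sketch below shows its plain empirical form, J = max over thresholds of (sensitivity + specificity - 1), on individually measured, made-up data. It assumes that higher marker values indicate disease; the function name is hypothetical, and the pooled, covariate-adjusted estimators proposed in the abstract are not implemented here.

```python
def youden_index(cases, controls):
    """Return (J, threshold) maximizing sensitivity + specificity - 1.

    A measurement is classified as positive when it is >= threshold.
    """
    best_j, best_thr = 0.0, None
    for thr in sorted(set(cases) | set(controls)):
        sensitivity = sum(1 for v in cases if v >= thr) / len(cases)
        specificity = sum(1 for v in controls if v < thr) / len(controls)
        j = sensitivity + specificity - 1
        if j > best_j:  # keep the first threshold that attains the maximum
            best_j, best_thr = j, thr
    return best_j, best_thr


# Made-up marker values for illustration only
cases = [2.1, 3.4, 3.9, 4.5]
controls = [1.0, 1.8, 2.5, 3.6]
print(youden_index(cases, controls))  # (0.5, 2.1)
```

J ranges from 0 (no discrimination at any threshold) to 1 (perfect separation of cases and controls).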
14.
The receiver operating characteristic curve is a popular tool to characterize the capabilities of diagnostic tests with continuous or ordinal responses. One common design for assessing the accuracy of diagnostic tests involves multiple readers and multiple tests, in which all readers read all test results from the same patients. This design is most commonly used in a radiology setting, where the results of diagnostic tests depend on a radiologist's subjective interpretation. The most widely used approach for analyzing data from such a study is the Dorfman-Berbaum-Metz (DBM) method (Dorfman et al., 1992), which applies a standard analysis of variance (ANOVA) model to the jackknife pseudovalues of the areas under the ROC curves (AUCs). Although the DBM method has performed well in published simulation studies, there is no clear theoretical basis for this approach. In this paper, focusing on continuous outcomes, we investigate its theoretical basis. Our result indicates that the DBM method does not satisfy the regular assumptions for standard ANOVA models, and thus might lead to erroneous inference. We then propose a marginal model approach based on the AUCs which can adjust for covariates as well. Consistent and asymptotically normal estimators are derived for the regression coefficients. We compare our approach with the DBM method via simulation and by an application to data from a breast cancer study. The simulation results show that both our method and the DBM method perform well when the accuracy of the tests under study is the same, and that our method outperforms the DBM method for inference on individual AUCs when the accuracy of the tests is not the same. The marginal model approach can be easily extended to ordinal outcomes.
15.
Alberto Jiménez‐Valverde 《Global Ecology and Biogeography》2012,21(4):498-507
Aim: The area under the receiver operating characteristic (ROC) curve (AUC) is a widely used statistic for assessing the discriminatory capacity of species distribution models. Here, I used simulated data to examine the interdependence of the AUC and classical discrimination measures (sensitivity and specificity) derived from the application of a threshold. I also use simulated data to illustrate the implications of using the AUC to evaluate potential versus realized distribution models. Innovation: After applying the threshold that makes sensitivity and specificity equal, a strong relationship between the AUC and these two measures was found. This result is corroborated with real data. On the other hand, the AUC penalizes the models that estimate potential distributions (the regions where the species could survive and reproduce due to the existence of suitable environmental conditions), and favours those that estimate realized distributions (the regions where the species actually lives). Main conclusions: Firstly, the independence of the AUC from the threshold selection may be irrelevant in practice. This result also emphasizes the fact that the AUC assumes nothing about the relative costs of errors of omission and commission. However, in most real situations this premise may not be optimal. Measures derived from a contingency table for different cost ratio scenarios, together with the ROC curve, may be more informative than reporting just a single AUC value. Secondly, the AUC is only truly informative when there are true instances of absence available and the objective is the estimation of the realized distribution. When the potential distribution is the goal of the research, the AUC is not an appropriate performance measure because the weight of commission errors is much lower than that of omission errors.
16.
Evaluation of diagnostic performance is typically based on the receiver operating characteristic (ROC) curve and the area under the curve (AUC) as its summary index. The partial area under the curve (pAUC) is an alternative index focusing on the range of practical/clinical relevance. One of the problems preventing more frequent use of the pAUC is the perceived loss of efficiency in cases of noncrossing ROC curves. In this paper, we investigated the statistical properties of comparisons of two correlated pAUCs. We demonstrated that outside of the classic model there are practically reasonable ROC types for which comparisons of noncrossing concave curves would be more powerful when based on a part of the curve rather than the entire curve. We argue that this phenomenon stems in part from the exclusion of noninformative parts of the ROC curves that resemble straight lines. We conducted extensive simulation studies in families of binormal, straight‐line, and bigamma ROC curves. We demonstrated that comparison of pAUCs is statistically more powerful than comparison of full AUCs when ROC curves are close to a “straight line”. For less flat binormal ROC curves, an increase in the integration range often leads to a disproportional increase in the difference between pAUCs, thereby contributing to an increase in statistical power. Thus, the efficiency of differences in pAUCs of noncrossing ROC curves depends on the shape of the curves, and for families of ROC curves that are nearly straight‐line shaped, such as bigamma ROC curves, there are multiple practical scenarios in which comparisons of pAUCs are preferable.
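To make the pAUC concrete, the sketch below builds an empirical ROC curve and integrates it with the trapezoidal rule over false positive rates from 0 up to a chosen bound. The function names, the choice of a lower-FPR range, and the data are assumptions made for illustration; the abstract's comparison of two correlated pAUCs and its binormal and bigamma models are not implemented here.

```python
def roc_points(scores, labels):
    """Empirical ROC points (FPR, TPR), ordered by decreasing threshold."""
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    points = [(0.0, 0.0)]
    for thr in sorted(set(scores), reverse=True):
        tp = sum(1 for s, y in zip(scores, labels) if s >= thr and y == 1)
        fp = sum(1 for s, y in zip(scores, labels) if s >= thr and y == 0)
        points.append((fp / n_neg, tp / n_pos))
    return points


def partial_auc(points, fpr_max=1.0):
    """Trapezoidal area under the ROC curve for FPR in [0, fpr_max]."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        if x0 >= fpr_max:
            break
        if x1 > fpr_max:
            # Cut the segment at fpr_max using linear interpolation
            y1 = y0 + (y1 - y0) * (fpr_max - x0) / (x1 - x0)
            x1 = fpr_max
        area += (x1 - x0) * (y0 + y1) / 2
    return area


# Made-up scores (higher = more suspicious) and labels (1 = diseased)
scores = [0.9, 0.8, 0.7, 0.6, 0.55, 0.5, 0.4, 0.3]
labels = [1, 1, 0, 1, 0, 1, 0, 0]
pts = roc_points(scores, labels)
print(partial_auc(pts))                # full AUC: 0.8125
print(partial_auc(pts, fpr_max=0.25))  # pAUC over FPR in [0, 0.25]: 0.125
```

With fpr_max=1.0 the function returns the full AUC, so the same routine can be used to compare a partial and a full summary of the same curve.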
17.
In early detection of disease, combinations of biomarkers promise improved discrimination over diagnostic tests based on single markers. An example of this is in prostate cancer screening, where additional markers have been sought to improve the specificity of the conventional Prostate-Specific Antigen (PSA) test. A marker of particular interest is the percent free PSA. Studies evaluating the benefits of percent free PSA reflect the need for a methodological approach that is statistically valid and useful in the clinical setting. This article presents methods that address this need. We focus on and-or combinations of biomarker results that we call logic rules and present novel definitions for the ROC curve and the area under the curve (AUC) that are applicable to this class of combination tests. Our estimates of the ROC and AUC are amenable to statistical inference including comparisons of tests and regression analysis. The methods are applied to data on free and total PSA levels among prostate cancer cases and matched controls enrolled in the Physicians' Health Study.
18.
19.
Gemma Siles Julio M. Alcántara Pedro J. Rey Jesús M. Bastida 《Restoration Ecology》2010,18(4):439-448
Vegetation restoration is usually based on predefined species assemblages from large‐scale maps of potential vegetation. However, most restoration plans apply to smaller spatial scales, so a homogeneous species assemblage is usually assigned to the target site. We propose defining species assemblages for restoration by modeling the distribution of individual target species. The example presented here concerns postfire restoration, but the approach can be used in other types of disturbed areas. We surveyed 212 plots in well‐preserved vegetation around the burned area to obtain a list of target species and the physical parameters of the plots. The burned area was divided into a grid of 723 squares, 1 ha each, and then characterized according to the same physical parameters. From these data, we modeled the distribution of 23 target species. A target map of predicted species assemblages was built by combining the species maps. This map largely resembles the native vegetation in terms of species richness per plot, environmental gradients in α‐diversity, spatial variation in β‐diversity, and frequency of species occurrence. Comparison between the target map and the current vegetation (recovery status) indicated that, on average, only half of the potential set of species is already present in each plot. Analysis of the recovery status suggested that both rock outcrops and areas at lower altitude, with gentle slope and deeper soil, recover faster. This illustrates the utility of target maps for identifying the plots most in need of restoration.
20.
Adel Hamza Ning-Ning Wei Ce Hao Zhilong Xiu 《Journal of biomolecular structure & dynamics》2013,31(11):1236-1250
In this work, we extend our previous ligand shape-based virtual screening approach by using the Hamza–Wei–Zhan (HWZ) scoring function and an enhanced molecular shape-density model for the ligands. The performance of the method has been tested against the 40 targets in the Database of Useful Decoys and compared with the performance of our previous HWZ score method. The virtual screening results using the novel ligand shape-based approach demonstrated a favorable improvement (area under the receiver operating characteristic curve AUC = 0.89 ± 0.02) and effectiveness (hit rate HR1% = 53.0% ± 6.3 and HR10% = 71.1% ± 4.9). Comparison of the overall performance of our ligand shape-based method with the best-performing ligand shape-based virtual screening approach, which uses data fusion of multiple queries, showed that our strategy takes deeper account of the chemical information in the set of active ligands. Furthermore, the results indicated that our method is suitable for virtual screening and yields higher prediction accuracy than the other study's approach, which is derived from data fusion using five queries. Therefore, our novel ligand shape-based screening method constitutes a robust and efficient approach to the 3D similarity screening of small compounds and opens the door to a whole new approach to drug design by implementing the method in structure-based virtual screening.