Found 9 similar articles (search time: 15 ms)
1.
Combining predictors for classification using the area under the receiver operating characteristic curve (cited by: 1; self-citations: 0; citations by others: 1)
No single biomarker for cancer is considered adequately sensitive and specific for cancer screening. It is expected that the results of multiple markers will need to be combined in order to yield adequately accurate classification. Typically, the objective function that is optimized for combining markers is the likelihood function. In this article, we consider an alternative objective function: the area under the empirical receiver operating characteristic curve (AUC). We note that it yields consistent estimates of parameters in a generalized linear model for the risk score but does not require specifying the link function. Like logistic regression, it yields consistent estimation with case-control or cohort data. Simulation studies suggest that AUC-based classification scores have performance comparable with logistic likelihood-based scores when the logistic regression model holds. Analysis of data from a proteomics biomarker study shows that performance can be far superior to logistic regression derived scores when the logistic regression model does not hold. Model fitting by maximizing the AUC rather than the likelihood should be considered when the goal is to derive a marker combination score for classification or prediction.
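As a rough illustration of using the empirical AUC as the objective function, the sketch below scores each subject with a linear combination `y1 + beta * y2` of two markers and picks the `beta` with the largest empirical AUC from a candidate grid. The function names, the two-marker setup, and the grid search are assumptions made for illustration; the abstract does not state how the authors carry out the maximization.

```python
def empirical_auc(case_scores, control_scores):
    """Fraction of (case, control) pairs where the case scores higher.

    Ties count as one half, which is the usual Mann-Whitney convention.
    """
    total = 0.0
    for c in case_scores:
        for k in control_scores:
            if c > k:
                total += 1.0
            elif c == k:
                total += 0.5
    return total / (len(case_scores) * len(control_scores))


def best_combination(cases, controls, betas):
    """Grid search (an illustrative choice) for the beta maximizing the AUC.

    cases, controls: lists of (y1, y2) marker pairs.
    Returns (beta, auc) for the first beta that attains the largest AUC.
    """
    best = None
    for beta in betas:
        case_scores = [y1 + beta * y2 for y1, y2 in cases]
        control_scores = [y1 + beta * y2 for y1, y2 in controls]
        auc = empirical_auc(case_scores, control_scores)
        if best is None or auc > best[1]:
            best = (beta, auc)
    return best
```

Because the empirical AUC depends only on the ordering of the scores, no link function has to be chosen, which matches the point made in the abstract.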
2.
The idea of using measurements such as biomarkers, clinical data, or molecular biology assays for classification and prediction is popular in modern medicine. The scientific evaluation of such measures includes assessing the accuracy with which they predict the outcome of interest. Receiver operating characteristic curves are commonly used for evaluating the accuracy of diagnostic tests. They can be applied more broadly, indeed to any problem involving classification to two states or populations (D= 0 or 1). We show that the ROC curve can be interpreted as a cumulative distribution function for the discriminatory measure Y in the affected population (D= 1) after Y has been standardized to the distribution in the reference population (D= 0). The standardized values are called placement values. If the placement values have a uniform(0, 1) distribution, then Y is not discriminatory, because its distribution in the affected population is the same as that in the reference population. The degree to which the distribution of the standardized measure differs from uniform(0, 1) is a natural way to characterize the discriminatory capacity of Y and provides a nontraditional interpretation for the ROC curve. Statistical methods for making inference about distribution functions therefore motivate new approaches to making inference about ROC curves. We demonstrate this by considering the ROC-GLM regression model and observing that it is equivalent to a regression model for the distribution of placement values. The likelihood of the placement values provides a new approach to ROC parameter estimation that appears to be more efficient than previously proposed methods. The method is applied to evaluate a pulmonary function measure in cystic fibrosis patients as a predictor of future occurrence of severe acute pulmonary infection requiring hospitalization. 
Finally, we note the relationship between regression models for the mean placement value and recently proposed models for the area under the ROC curve, which is the classic summary index of discrimination.
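The placement-value idea can be sketched in a few lines. The convention used here is an assumption, since the abstract does not spell it out: the placement value of a measurement y from the affected population is taken to be the proportion of reference-population values strictly greater than y, so the empirical ROC at false-positive rate t is the proportion of placement values at or below t. The function names are hypothetical.

```python
def placement_values(case_values, reference_values):
    """Standardize each case value against the reference distribution.

    Assumed convention: placement value = share of reference values > y.
    """
    n = len(reference_values)
    return [sum(r > y for r in reference_values) / n for y in case_values]


def roc_at(t, pvs):
    """Empirical ROC(t): the empirical CDF of the placement values at t."""
    return sum(u <= t for u in pvs) / len(pvs)


def auc_from_placement_values(pvs):
    """Under the assumed convention, AUC = 1 - mean placement value
    (valid when no case value ties with a reference value)."""
    return 1.0 - sum(pvs) / len(pvs)
```

If the measure is not discriminatory, the placement values are roughly uniform on (0, 1), `roc_at(t, pvs)` stays close to t, and the AUC is close to 0.5.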
3.
Multiple diagnostic tests and risk factors are commonly available for many diseases. This information can be either redundant or complementary. Combining them may improve the diagnostic/predictive accuracy, but may also unnecessarily increase complexity, risks, and/or costs. The improved accuracy gained by including additional variables can be evaluated by the increment in the area under the receiver-operating characteristic curve (AUC) with and without the new variable(s). In this study, we derive a new test statistic to accurately and efficiently determine the statistical significance of this incremental AUC under a multivariate normality assumption. Our test links the AUC difference to a quadratic form of a standardized mean shift in units of the inverse covariance matrix through a suitable linear transformation of all diagnostic variables. The distribution of the quadratic estimator is related to the multivariate Behrens-Fisher problem. We provide explicit mathematical solutions of the estimator and its approximate non-central F-distribution, type I error rate, and sample size formula. We use simulation studies to show that our new test maintains prespecified type I error rates as well as reasonable statistical power under practical sample sizes. We use data from the Study of Osteoporotic Fractures as an application example to illustrate our method.
4.
Plots and tests for goodness of fit with randomly censored data (cited by: 2; self-citations: 0; citations by others: 2)
5.
Summary. The Wilcoxon Mann-Whitney (WMW) U test is commonly used in nonparametric two-group comparisons when the normality of the underlying distribution is questionable. There has been some previous work on estimating power based on this procedure (Lehmann, 1998, Nonparametrics). In this article, we present an approach for estimating type II error, which is applicable to any continuous distribution, and also extend the approach to handle grouped continuous data allowing for ties. We apply these results to obtaining standard errors of the area under the receiver operating characteristic curve (AUROC) for risk-prediction rules under H1 and for comparing AUROC between competing risk-prediction rules applied to the same data set. These results are based on SAS-callable functions to evaluate the bivariate normal integral and are thus easily implemented with standard software.
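The connection between the WMW U statistic and the AUROC that this abstract relies on can be shown with a small sketch: U for the case group is its rank sum minus n1(n1 + 1)/2, and dividing U by n1 * n2 gives the empirical AUROC. Midranks are used so that ties are handled. The function names are hypothetical, and the sketch does not reproduce the standard-error or type II error calculations described in the article.

```python
def midranks(values):
    """Ranks starting at 1, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the current run of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = average_rank
        i = j + 1
    return ranks


def wmw_u_and_auroc(case_scores, control_scores):
    """Return (U, AUROC), where U is the WMW statistic for the case group."""
    n1, n2 = len(case_scores), len(control_scores)
    ranks = midranks(list(case_scores) + list(control_scores))
    rank_sum_cases = sum(ranks[:n1])
    u = rank_sum_cases - n1 * (n1 + 1) / 2
    return u, u / (n1 * n2)
```

For example, `wmw_u_and_auroc([2, 3], [1, 2])` gives U = 3.5 and AUROC = 0.875, the same value obtained by counting concordant case-control pairs with ties scored as one half.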
6.
7.
8.
9.
Screening patients at high risk of recurrence of cancer would allow for more accurate and personalized treatment. In this study, we tried to identify the prognosis-related protein profile by applying two different quantitative proteomic techniques, difference in-gel electrophoresis and cleavable isotope-coded affinity tag method. Six tumor tissues were obtained from stage IV colorectal cancer (CRC) patients, of which three have survived more than five years (good prognostic group, GPG) and the other three died within 25 months (poor prognostic group, PPG) after palliative surgery and subsequent chemotherapy treatment. From the two independent quantitative analyses, we identified 175 proteins with abundance ratios greater than 2-fold. Gene ontology analysis revealed that proteins related to cellular assembly/organization and movement were generally increased in the PPG. Twenty-two proteins, including caveolin-1, were chosen for confirmatory studies by Western blot and immunohistochemistry. The Western blot data for each individual protein were analyzed with Mann-Whitney tests, and a multi-marker panel was generated by logistic regression analysis. Five proteins, fatty acid binding protein 1, intelectin 1, transitional endoplasmic reticulum ATPase, transgelin and tropomyosin 2, were significantly different between the two prognostic groups within 95% confidence. No single protein could completely distinguish the two groups from each other. However, a combination of the five proteins effectively distinguished PPG from GPG patients (AUC=1). Our study suggests a multi-marker panel composed of proteome signatures that provide accurate predictive information and potentially assist personalized therapy. This article is part of a Special Issue entitled: Proteomics: The clinical link.