Similar literature
 Found 20 similar articles; search took 8 ms
1.
2.
A permutation test to compare receiver operating characteristic curves   Total citations: 1, self-citations: 0, citations by others: 1
Venkatraman ES. Biometrics, 2000, 56(4): 1134-1138
We developed a permutation test in our earlier paper (Venkatraman and Begg, 1996, Biometrika 83, 835-848) to test the equality of receiver operating characteristic curves based on continuous paired data. Here we extend the underlying concepts to develop a permutation test for continuous unpaired data, and we study its properties through simulations.
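
The general idea of a permutation test on unpaired ROC data can be sketched as follows. This is not the statistic from the paper, which compares whole curves: as a simplifying assumption, the sketch compares only the two AUCs and supposes both tests are measured on the same scale, so that scores can be shuffled between tests within cases and within controls. The function names `auc` and `permutation_test_auc_diff` are hypothetical.

```python
import random


def auc(cases, controls):
    """Empirical AUC: share of case-control pairs ranked correctly (ties count 1/2)."""
    total = 0.0
    for x in cases:
        for y in controls:
            total += 1.0 if x > y else 0.5 if x == y else 0.0
    return total / (len(cases) * len(controls))


def permutation_test_auc_diff(cases_a, controls_a, cases_b, controls_b,
                              n_perm=2000, seed=0):
    """Two-sided permutation p-value for |AUC_A - AUC_B| on unpaired data.

    Assumption (not from the paper): both tests share a scale, so under the
    null the test labels are exchangeable within cases and within controls.
    """
    rng = random.Random(seed)
    observed = abs(auc(cases_a, controls_a) - auc(cases_b, controls_b))
    pooled_cases = list(cases_a) + list(cases_b)
    pooled_controls = list(controls_a) + list(controls_b)
    n_ca, n_co = len(cases_a), len(controls_a)
    hits = 0
    for _ in range(n_perm):
        # Reassign scores to the two tests at random, keeping disease status fixed.
        rng.shuffle(pooled_cases)
        rng.shuffle(pooled_controls)
        stat = abs(auc(pooled_cases[:n_ca], pooled_controls[:n_co])
                   - auc(pooled_cases[n_ca:], pooled_controls[n_co:]))
        if stat >= observed - 1e-12:
            hits += 1
    # Add-one correction keeps the p-value strictly positive.
    return (hits + 1) / (n_perm + 1)


if __name__ == "__main__":
    p = permutation_test_auc_diff([3.1, 4.2, 5.0, 4.4], [1.0, 2.2, 2.9, 3.3],
                                  [2.5, 3.0, 3.6, 2.8], [1.5, 2.7, 3.1, 2.0])
    print(f"permutation p-value: {p:.3f}")
```

Shuffling within disease status tests a stronger null (equal score distributions) than equality of ROC curves alone, which is one reason the published method works with ranks instead.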

3.
Uno H, Cai T, Tian L, Wei LJ. Biometrics, 2011, 67(4): 1389-1396
Quantitative procedures for evaluating added values from new markers over a conventional risk scoring system for predicting event rates at specific time points have been extensively studied. However, a single summary statistic, for example, the area under the receiver operating characteristic curve or its derivatives, may not provide a clear picture about the relationship between the conventional and the new risk scoring systems. When there are no censored event time observations in the data, two simple scatterplots with individual conventional and new scores for "cases" and "controls" provide valuable information regarding the overall and the subject-specific level incremental values from the new markers. Unfortunately, in the presence of censoring, it is not clear how to construct such plots. In this article, we propose a nonparametric estimation procedure for the distributions of the differences between two risk scores conditional on the conventional score. The resulting quantile curves of these differences over the subject-specific conventional score provide extra information about the overall added value from the new marker. They also help us to identify a subgroup of future subjects who need the new predictors, especially when there is no unified utility function available for cost-risk-benefit decision making. The procedure is illustrated with two data sets. The first is from a well-known Mayo Clinic primary biliary cirrhosis liver study. The second is from a recent breast cancer study on evaluating the added value from a gene score, which is relatively expensive to measure compared with the routinely used clinical biomarkers for predicting the patient's survival after surgery.

4.
Receiver operating characteristic (ROC) analysis is a useful evaluative method of diagnostic accuracy. A Bayesian hierarchical nonlinear regression model for ROC analysis was developed. A validation analysis of diagnostic accuracy was conducted using prospective multi-center clinical trial prostate cancer biopsy data collected from three participating centers. The gold standard was based on radical prostatectomy to determine local and advanced disease. To evaluate the diagnostic performance of PSA level at fixed levels of Gleason score, a normality transformation was applied to the outcome data. A hierarchical regression analysis incorporating the effects of cluster (clinical center) and cancer risk (low, intermediate, and high) was performed, and the area under the ROC curve (AUC) was estimated.

5.
6.
Janes H, Pepe MS. Biometrika, 2009, 96(2): 371-382
Recent scientific and technological innovations have produced an abundance of potential markers that are being investigated for their use in disease screening and diagnosis. In evaluating these markers, it is often necessary to account for covariates associated with the marker of interest. Covariates may include subject characteristics, expertise of the test operator, test procedures or aspects of specimen handling. In this paper, we propose the covariate-adjusted receiver operating characteristic curve, a measure of covariate-adjusted classification accuracy. Nonparametric and semiparametric estimators are proposed, asymptotic distribution theory is provided and finite sample performance is investigated. For illustration we characterize the age-adjusted discriminatory accuracy of prostate-specific antigen as a biomarker for prostate cancer.

7.
Area under the receiver operating characteristic curve (AROC) is commonly used to choose a biomechanical metric from which to construct an injury risk curve (IRC). However, AROC may not handle censored datasets adequately. Survival analysis creates robust estimates of IRCs which accommodate censored data. We present an observation-adjusted ROC (oaROC) which uses the survival-based IRC to estimate the AROC. We verified and evaluated this method using simulated datasets of different censoring statuses and sample sizes. For a dataset with 1000 left and right censored observations, the median AROC closely approached the oaROCTrue, or the oaROC calculated using an assumed “true” IRC, differing by a fraction of a percent, 0.1%. Using simulated datasets with various censoring, we found that oaROC converged onto oaROCTrue in all cases. For datasets with right and non-censored observations, AROC did not converge onto oaROCTrue. oaROC for datasets with only non-censored observations converged the fastest, and for a dataset with 10 observations, the median oaROC differed from oaROCTrue by 2.74% while the corresponding median AROC with left and right censored data differed from oaROCTrue by 9.74%. We also calculated the AROC and oaROC for a published side impact dataset, and differences between the two methods ranged between −24.08% and 24.55% depending on metric. Overall, when compared with AROC, we found oaROC performs equivalently for doubly censored data, better for non-censored data, and can accommodate more types of data than AROC. While more validation is needed, the results indicate that oaROC is a viable alternative which can be incorporated into the metric selection process for IRCs.

8.
Yuan Z, Ghosh D. Biometrics, 2008, 64(2): 431-439
Summary. In medical research, there is great interest in developing methods for combining biomarkers. We argue that selection of markers should also be considered in the process. Traditional model/variable selection procedures ignore the underlying uncertainty after model selection. In this work, we propose a novel model-combining algorithm for classification in biomarker studies. It works by considering weighted combinations of various logistic regression models; five different weighting schemes are considered in the article. The weights and algorithm are justified using decision theory and risk-bound results. Simulation studies are performed to assess the finite-sample properties of the proposed model-combining method. It is illustrated with an application to data from an immunohistochemical study in prostate cancer.

9.
MOTIVATION: Protein expression profiling for differences indicative of early cancer holds promise for improving diagnostics. Due to their high dimensionality, statistical analysis of proteomic data from mass spectrometers is challenging in many aspects such as dimension reduction, feature subset selection as well as construction of classification rules. Search of an optimal feature subset, commonly known as the feature subset selection (FSS) problem, is an important step towards disease classification/diagnostics with biomarkers. METHODS: We develop a parsimonious threshold-independent feature selection (PTIFS) method based on the concept of area under the curve (AUC) of the receiver operating characteristic (ROC). To reduce computational complexity to a manageable level, we use a sigmoid approximation to the empirical AUC as the criterion function. Starting from an anchor feature, the PTIFS method selects a feature subset through an iterative updating algorithm. Highly correlated features that have similar discriminating power are precluded from being selected simultaneously. The classification rule is then determined from the resulting feature subset. RESULTS: The performance of the proposed approach is investigated by extensive simulation studies, and by applying the method to two mass spectrometry data sets of prostate cancer and of liver cancer. We compare the new approach with the threshold gradient descent regularization (TGDR) method. The results show that our method can achieve comparable performance to that of the TGDR method in terms of disease classification, but with fewer features selected. AVAILABILITY: Supplementary Material and the PTIFS implementations are available at http://staff.ustc.edu.cn/~ynyang/PTIFS. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.

10.
Ma S, Huang J. Biometrics, 2007, 63(3): 751-757
In biomedical studies, it is of great interest to develop methodologies for combining multiple markers for the purpose of disease classification. The receiver operating characteristic (ROC) technique has been widely used, where classification performance can be measured with the area under the ROC curve (AUC). In this article, we study an ROC-based method for effectively combining multiple markers for disease classification. We propose a sigmoid AUC (SAUC) estimator that maximizes the sigmoid approximation of the empirical AUC. The SAUC estimator is computationally affordable, n^(1/2)-consistent and achieves the same asymptotic efficiency as the AUC estimator. Inference based on the weighted bootstrap is investigated. We also propose Monte Carlo methods to assess the overall prediction performance and the relative importance of individual markers. Finite sample performance is evaluated using simulation studies and two public data sets.
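
The sigmoid approximation of the empirical AUC mentioned above can be illustrated with a minimal sketch. It only shows the smoothing step for scores that are already computed; the maximization over marker weights is not attempted. The function names and the bandwidth parameter `sigma` are assumptions for illustration, not the paper's notation.

```python
import math


def empirical_auc(cases, controls):
    """Empirical AUC: average of the indicator 1[x > y] over pairs (ties count 1/2)."""
    total = 0.0
    for x in cases:
        for y in controls:
            total += 1.0 if x > y else 0.5 if x == y else 0.0
    return total / (len(cases) * len(controls))


def _sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def sigmoid_auc(cases, controls, sigma=0.1):
    """Smooth surrogate: the indicator 1[x > y] is replaced by sigmoid((x - y) / sigma).

    `sigma` is a hypothetical bandwidth; smaller values track the empirical AUC
    more closely, larger values give a smoother objective.
    """
    total = 0.0
    for x in cases:
        for y in controls:
            total += _sigmoid((x - y) / sigma)
    return total / (len(cases) * len(controls))


if __name__ == "__main__":
    cases = [2.1, 3.4, 1.9, 4.0]
    controls = [1.0, 2.0, 0.5, 2.5]
    print("empirical AUC:", empirical_auc(cases, controls))
    for s in (1.0, 0.1, 0.001):
        print(f"sigmoid AUC (sigma={s}):", round(sigmoid_auc(cases, controls, s), 4))
```

Because the surrogate is differentiable in the scores, it can be handed to a gradient-based optimizer, which the step-function empirical AUC cannot.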

11.
Aim: The area under the receiver operating characteristic (ROC) curve (AUC) is a widely used statistic for assessing the discriminatory capacity of species distribution models. Here, I used simulated data to examine the interdependence of the AUC and classical discrimination measures (sensitivity and specificity) derived for the application of a threshold. I shall further exemplify with simulated data the implications of using the AUC to evaluate potential versus realized distribution models. Innovation: After applying the threshold that makes sensitivity and specificity equal, a strong relationship between the AUC and these two measures was found. This result is corroborated with real data. On the other hand, the AUC penalizes the models that estimate potential distributions (the regions where the species could survive and reproduce due to the existence of suitable environmental conditions), and favours those that estimate realized distributions (the regions where the species actually lives). Main conclusions: Firstly, the independence of the AUC from the threshold selection may be irrelevant in practice. This result also emphasizes the fact that the AUC assumes nothing about the relative costs of errors of omission and commission. However, in most real situations this premise may not be optimal. Measures derived from a contingency table for different cost ratio scenarios, together with the ROC curve, may be more informative than reporting just a single AUC value. Secondly, the AUC is only truly informative when there are true instances of absence available and the objective is the estimation of the realized distribution. When the potential distribution is the goal of the research, the AUC is not an appropriate performance measure because the weight of commission errors is much lower than that of omission errors.
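
One way to see why the AUC and the equal sensitivity/specificity point can move together is a textbook equal-variance binormal model, which is an assumption made here for illustration and not the simulation design of the article. If absences score N(0, 1) and presences score N(d, 1), then AUC = Φ(d/√2), and the threshold d/2 gives sensitivity = specificity = Φ(d/2), so both are increasing functions of the same separation d.

```python
import math
from statistics import NormalDist


def binormal_summary(d):
    """AUC and the common value of sensitivity and specificity at threshold d/2.

    Assumed model (illustrative only): absences ~ N(0, 1), presences ~ N(d, 1).
    """
    phi = NormalDist().cdf
    auc = phi(d / math.sqrt(2))
    sens_equals_spec = phi(d / 2)
    return auc, sens_equals_spec


if __name__ == "__main__":
    for d in (0.0, 0.5, 1.0, 2.0, 3.0):
        auc, se = binormal_summary(d)
        print(f"d={d:.1f}  AUC={auc:.3f}  sensitivity=specificity={se:.3f}")
```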

12.
13.
This article presents a method for estimating the accuracy of psychological screening scales using receiver operating characteristic curves and associated statistics. Screening scales are typically semicontinuous within a known range with distributions that are nearly symmetric when the target condition is present and highly skewed when the condition is absent. We model screening scale outcomes using truncated normal distributions that accommodate these different distributional shapes and use subject-specific random effects to adjust for multiple assessments within individuals. Using the proposed model, we estimate the accuracy of the Symptom Checklist as a measure of major depression from a repeatedly screened sample of patients.

14.
  1. The receiver operating characteristic (ROC) and precision–recall (PR) plots have been widely used to evaluate the performance of species distribution models. Plotting the ROC/PR curves requires a traditional test set with both presence and absence data (namely PA approach), but species absence data are usually not available in reality. Plotting the ROC/PR curves from presence‐only data while treating background data as pseudo absence data (namely PO approach) may provide misleading results.
  2. In this study, we propose a new approach to calibrate the ROC/PR curves from presence and background data with user‐provided information on a constant c, namely PB approach. Here, c defines the probability that species occurrence is detected (labeled), and an estimate of c can also be derived from the PB‐based ROC/PR plots given that a model with good discriminatory ability is available. We used five virtual species and a real aerial photograph to test the effectiveness of the proposed PB‐based ROC/PR plots. Different models (or classifiers) were trained from presence and background data with various sample sizes. The ROC/PR curves plotted by PA approach were used to benchmark the curves plotted by PO and PB approaches.
  3. Experimental results show that the curves and areas under curves from the PB approach are more similar to those from the PA approach than are those from the PO approach. The PB‐based ROC/PR plots also provide highly accurate estimates of c in our experiment.
  4. We conclude that the proposed PB‐based ROC/PR plots can provide valuable complements to the existing model assessment methods, and they also provide an additional way to estimate the constant c (or species prevalence) from presence and background data.

15.
In the setting of a longitudinal study, subjects are followed for the occurrence of some dichotomous outcome. In many of these studies, some markers are also obtained repeatedly during the study period. Emir et al. introduced a non-parametric approach to the estimation of the area under the ROC curve of a repeated marker. Their non-parametric estimate involves assigning a weight to each subject. There are two weighting schemes suggested in their paper: one for the case when within-patient correlation is low, and the other for the case when within-subject correlation is high. However, it is not clear how to assign weights to marker measurements when within-patient correlation is modest. In this paper, we consider the optimal weights that minimize the variance of the estimate of the area under the ROC curve (AUC) of a repeated marker, as well as the optimal weights that minimize the variance of the AUC difference between two repeated markers. Our results in this paper show that the optimal weights depend not only on the within-patient control-case correlation in the longitudinal data, but also on the proportion of subjects that become cases. More importantly, we show that the loss of efficiency by using the two weighting schemes suggested by Emir et al. instead of our optimal weights can be severe when there is a large within-subject control-case correlation and the proportion of subjects that become cases is small, which is often the case in longitudinal study settings.

16.
In many clinical settings, a commonly encountered problem is to assess accuracy of a screening test for early detection of a disease. In these applications, predictive performance of the test is of interest. Variable selection may be useful in designing a medical test. An example is a research study conducted to design a new screening test by selecting variables from an existing screener with a hierarchical structure among variables: there are several root questions followed by their stem questions. The stem questions will only be asked after a subject has answered the root question. It is therefore unreasonable to select a model that only contains stem variables but not its root variable. In this work, we propose methods to perform variable selection with structured variables when predictive accuracy of a diagnostic test is the main concern of the analysis. We take a linear combination of individual variables to form a combined test. We then maximize a direct summary measure of the predictive performance of the test, the area under a receiver operating characteristic curve (AUC of an ROC), subject to a penalty function to control for overfitting. Since maximizing empirical AUC of the ROC of a combined test is a complicated nonconvex problem (Pepe, Cai, and Longton, 2006, Biometrics 62, 221-229), we explore the connection between the empirical AUC and a support vector machine (SVM). We cast the problem of maximizing predictive performance of a combined test as a penalized SVM problem and apply a reparametrization to impose the hierarchical structure among variables. We also describe a penalized logistic regression variable selection procedure for structured variables and compare it with the ROC-based approaches. We use simulation studies based on real data to examine performance of the proposed methods. Finally we apply developed methods to design a structured screener to be used in primary care clinics to refer potentially psychotic patients for further specialty diagnostics and treatment.

17.
An interpretation for the ROC curve and inference using GLM procedures   Total citations: 7, self-citations: 0, citations by others: 7
Pepe MS. Biometrics, 2000, 56(2): 352-359
The accuracy of a medical diagnostic test is often summarized in a receiver operating characteristic (ROC) curve. This paper puts forth an interpretation for each point on the ROC curve as being a conditional probability of a test result from a random diseased subject exceeding that from a random nondiseased subject. This interpretation gives rise to new methods for making inference about ROC curves. It is shown that inference can be achieved with binary regression techniques applied to indicator variables constructed from pairs of test results, one component of the pair being from a diseased subject and the other from a nondiseased subject. Within the generalized linear model (GLM) binary regression framework, ROC curves can be estimated, and we highlight a new semiparametric estimator. Covariate effects can also be evaluated with the GLM models. The methodology is applied to a pancreatic cancer dataset where we use the regression framework to compare two different serum biomarkers. Asymptotic distribution theory is developed to facilitate inference and to provide insight into factors influencing variability of estimated model parameters.
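
The pairwise-indicator view described above can be sketched empirically. For each nondiseased result used as a threshold, the average of the indicators "diseased result is at least this nondiseased result" is the empirical true-positive rate, and the share of nondiseased results at or above the same value is the matching false-positive rate. This is a plain empirical illustration with a hypothetical function name; the GLM binary regression step from the paper is not attempted.

```python
def roc_points_from_pairs(diseased, nondiseased):
    """Empirical ROC points built from diseased/nondiseased pair indicators.

    A result counts as positive when it is >= the threshold (an assumed convention).
    """
    n_d, n_nd = len(diseased), len(nondiseased)
    points = set()
    for y_nd in nondiseased:
        # False-positive rate: share of nondiseased results at or above this one.
        fpr = sum(1 for v in nondiseased if v >= y_nd) / n_nd
        # True-positive rate: mean of the indicators 1[y_d >= y_nd] over diseased subjects.
        tpr = sum(1 for y_d in diseased if y_d >= y_nd) / n_d
        points.add((fpr, tpr))
    return sorted(points)


if __name__ == "__main__":
    for fpr, tpr in roc_points_from_pairs([3, 5, 6], [1, 2, 4]):
        print(f"FPR={fpr:.3f}  TPR={tpr:.3f}")
```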

18.
19.

Background

In silico models have recently been created in order to predict which genetic variants are more likely to contribute to the risk of a complex trait given their functional characteristics. However, there has been no comprehensive review as to which type of predictive accuracy measures and data visualization techniques are most useful for assessing these models.

Methods

We assessed the performance of the models for predicting risk using various methodologies, some of which include: receiver operating characteristic (ROC) curves, histograms of classification probability, and the novel use of the quantile-quantile plot. These measures have variable interpretability depending on factors such as whether the dataset is balanced in terms of numbers of genetic variants classified as risk variants versus those that are not.

Results

We conclude that the area under the curve (AUC) is a suitable starting place, and for models with similar AUCs, violin plots are particularly useful for examining the distribution of the risk scores.

Electronic supplementary material

The online version of this article (doi:10.1186/s12864-015-1616-z) contains supplementary material, which is available to authorized users.

20.
Rosner B, Glynn RJ. Biometrics, 2009, 65(1): 188-197
Summary. The Wilcoxon Mann-Whitney (WMW) U test is commonly used in nonparametric two-group comparisons when the normality of the underlying distribution is questionable. There has been some previous work on estimating power based on this procedure (Lehmann, 1998, Nonparametrics). In this article, we present an approach for estimating type II error, which is applicable to any continuous distribution, and also extend the approach to handle grouped continuous data allowing for ties. We apply these results to obtaining standard errors of the area under the receiver operating characteristic curve (AUROC) for risk-prediction rules under H1 and for comparing AUROC between competing risk prediction rules applied to the same data set. These results are based on SAS-callable functions to evaluate the bivariate normal integral and are thus easily implemented with standard software.
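
The link between the WMW U statistic and the AUROC that this abstract relies on can be shown with a short sketch: with midranks for ties, U equals the rank sum of the cases minus m(m+1)/2, and AUROC = U/(mn). The function names are hypothetical, and the type II error and standard error calculations from the article are not attempted here.

```python
def midranks(values):
    """1-based ranks of `values`, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the whole block of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = average_rank
        i = j + 1
    return ranks


def auroc_from_wmw(cases, controls):
    """AUROC computed as U / (m * n), where U comes from the rank sum of the cases."""
    m, n = len(cases), len(controls)
    ranks = midranks(list(cases) + list(controls))
    rank_sum_cases = sum(ranks[:m])
    u = rank_sum_cases - m * (m + 1) / 2
    return u / (m * n)


if __name__ == "__main__":
    # Grouped scores with ties between and within groups.
    print(auroc_from_wmw([3, 4, 4, 5], [1, 2, 3, 4]))
```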
