Similar Articles
20 similar articles found (search time: 31 ms)
1.
Survival model predictive accuracy and ROC curves
Heagerty PJ, Zheng Y. Biometrics. 2005;61(1):92-105.
The predictive accuracy of a survival model can be summarized using extensions of the proportion of variation explained by the model, or R2, commonly used for continuous response models, or using extensions of sensitivity and specificity, which are commonly used for binary response models. In this article we propose new time-dependent accuracy summaries based on time-specific versions of sensitivity and specificity calculated over risk sets. We connect the accuracy summaries to a previously proposed global concordance measure, which is a variant of Kendall's tau. In addition, we show how standard Cox regression output can be used to obtain estimates of time-dependent sensitivity and specificity, and time-dependent receiver operating characteristic (ROC) curves. Semiparametric estimation methods appropriate for both proportional and nonproportional hazards data are introduced, evaluated in simulations, and illustrated using two familiar survival data sets.
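A minimal sketch of the cumulative/dynamic flavour of these time-dependent summaries may help fix ideas: at horizon t, subjects with events by t are treated as cases and later survivors as controls, and sensitivity and specificity are traced out over score thresholds. Censoring is ignored here for brevity (Heagerty and Zheng's risk-set estimators handle it through the Cox model output), and all names (event_time, risk_score, t) are illustrative:

    import numpy as np

    def cumulative_dynamic_roc(event_time, risk_score, t):
        """ROC at horizon t: cases have T <= t, controls have T > t.

        Censoring is ignored for clarity; risk-set or IPCW estimators
        are needed on real censored data.
        """
        cases = event_time <= t
        controls = ~cases
        thresholds = np.unique(risk_score)[::-1]
        tpr = [np.mean(risk_score[cases] >= c) for c in thresholds]    # sensitivity
        fpr = [np.mean(risk_score[controls] >= c) for c in thresholds] # 1 - specificity
        return np.array(fpr), np.array(tpr)

    # toy usage: higher score should mean higher risk (shorter survival)
    rng = np.random.default_rng(0)
    T = rng.exponential(10, size=200)
    score = -np.log(T) + rng.normal(0, 0.5, size=200)
    fpr, tpr = cumulative_dynamic_roc(T, score, t=5.0)
    auc_t = np.trapz(tpr, fpr)   # time-dependent AUC at t = 5

Integrating the curve with np.trapz then yields a time-dependent AUC at the chosen horizon.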

2.
Finding biomarkers and building risk scores to predict the occurrence of survival outcomes is a major concern of clinical epidemiology, and so is the evaluation of prognostic models. In this paper, we are concerned with the estimation of the time-dependent AUC (area under the receiver operating characteristic curve), which naturally extends the standard AUC to the setting of survival outcomes and makes it possible to evaluate the discriminative power of prognostic models. We establish a simple and useful relation between the predictiveness curve and the time-dependent AUC, denoted AUC(t). This relation confirms that the predictiveness curve is the key concept for evaluating the calibration and discrimination of prognostic models. It also highlights that accurate estimates of the conditional absolute risk function should yield accurate estimates of AUC(t). From this observation, we derive several estimators for AUC(t) relying on distinct estimators of the conditional absolute risk function. An empirical study was conducted to compare our estimators with existing ones and to assess the effect of model misspecification, when estimating the conditional absolute risk function, on the estimation of AUC(t). We further illustrate the methodology on the Mayo PBC and the VA lung cancer data sets.
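For orientation, the cumulative/dynamic time-dependent AUC referred to above is often written as follows (notation chosen here for illustration, not quoted from the paper):

    \mathrm{AUC}(t) = \Pr\bigl( M_i > M_j \mid T_i \le t,\; T_j > t \bigr)

where M is the marker, for instance the estimated conditional absolute risk F(t | X) = P(T <= t | X), and T is the survival time. Written this way, it is plausible that an accurate estimate of the conditional absolute risk function, plugged in as the marker, carries over to an accurate estimate of AUC(t), which is the observation the paper builds on.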

3.
Bayesian methods for estimating dose-response curves from linearized multi-stage models in quantal bioassay are studied. A Gibbs sampling approach with data augmentation is employed to compute the Bayes estimates. In addition, estimation of the “relative additional risk” and the “risk-specific dose” is studied. Model selection based on conditional predictive ordinates from cross-validated data is developed. Model adequacy is addressed by means of a posterior predictive tail-area test.

4.

Background

It is important to accurately determine the performance of peptide:MHC binding predictions, as this enables users to compare and choose between different prediction methods and provides estimates of the expected error rate. Two common approaches to determine prediction performance are cross-validation, in which all available data are iteratively split into training and testing data, and the use of blind sets generated separately from the data used to construct the predictive method. In the present study, we have compared cross-validated prediction performances generated on our last benchmark dataset from 2009 with prediction performances generated on data subsequently added to the Immune Epitope Database (IEDB), which served as a blind set.

Results

We found that cross-validated performances systematically overestimated performance on the blind set. This was found not to be due to the presence of similar peptides in the cross-validation dataset. Rather, we found that small size and low sequence/affinity diversity of either training or blind datasets were associated with large differences in cross-validated vs. blind prediction performances. We use these findings to derive quantitative rules of how large and diverse datasets need to be to provide generalizable performance estimates.
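The mechanics of the comparison can be sketched with scikit-learn on synthetic data; binary binder labels and logistic regression stand in for the actual peptide:MHC affinity predictors, and the overestimation reported above only emerges with real, less diverse data:

    import numpy as np
    from sklearn.linear_model import LogisticRegression
    from sklearn.metrics import roc_auc_score
    from sklearn.model_selection import cross_val_score

    rng = np.random.default_rng(1)
    # historic benchmark (stand-in for the 2009 dataset) ...
    X_old = rng.normal(size=(300, 20))
    y_old = (X_old[:, 0] + rng.normal(size=300) > 0).astype(int)
    # ... and later additions serving as a blind set
    X_blind = rng.normal(size=(150, 20))
    y_blind = (X_blind[:, 0] + rng.normal(size=150) > 0).astype(int)

    clf = LogisticRegression(max_iter=1000)
    cv_auc = cross_val_score(clf, X_old, y_old, cv=5, scoring="roc_auc").mean()
    blind_auc = roc_auc_score(
        y_blind, clf.fit(X_old, y_old).predict_proba(X_blind)[:, 1])
    print(f"cross-validated AUC {cv_auc:.3f} vs blind-set AUC {blind_auc:.3f}")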

Conclusion

It has long been known that cross-validated prediction performance estimates often overestimate performance on independently generated blind set data. Here we identify and quantify the specific factors contributing to this effect for MHC-I binding predictions. An increasing number of peptides for which MHC binding affinities are measured experimentally have been selected based on binding predictions; they are therefore less diverse than historic datasets that sampled the entire sequence and affinity space, which makes them more difficult benchmark data sets. This has to be taken into account when comparing performance metrics between different benchmarks, and when deriving error estimates for predictions based on benchmark performance.

Electronic supplementary material

The online version of this article (doi:10.1186/1471-2105-15-241) contains supplementary material, which is available to authorized users.

5.
Esophageal cancer ranks as the eighth most common cancer and the sixth most common cause of cancer death worldwide. MicroRNAs (miRNAs) are small noncoding RNAs that regulate a wide variety of cancer-related cellular processes. In the current study, a series of previously published gene expression microarray data sets from the Gene Expression Omnibus and The Cancer Genome Atlas were downloaded and divided into training, internal validation, and external validation sets. A least absolute shrinkage and selection operator (LASSO) Cox regression model with 10-fold cross-validation was used to select the miRNAs associated with the prognosis of esophageal squamous cell carcinoma (ESCC) and to construct a six-miRNA signature. The prediction accuracy of this signature was then assessed in the validation and test sets using Kaplan–Meier analysis, time-dependent receiver operating characteristic (ROC) curves, and the dynamic area under the ROC curve. The prediction accuracy of the miRNA signature was much better than that of tumor–node–metastasis (TNM) stage in all three sets. Stratified analysis also demonstrated that the predictive ability of this signature was independent of TNM stage. Finally, functional experiments, including apoptosis and colony formation assays, were performed to further reveal the regulatory role of these miRNAs in ESCC. Our study demonstrates the promising potential of this novel six-miRNA signature as an independent biomarker for survival prediction in ESCC patients.
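A sketch of the penalized Cox step using lifelines; the penalty settings, column names, and data below are placeholders rather than the study's configuration (the study tuned the penalty by 10-fold cross-validation rather than fixing it):

    import numpy as np
    import pandas as pd
    from lifelines import CoxPHFitter

    rng = np.random.default_rng(2)
    n, p = 120, 30                                  # samples x candidate miRNAs
    X = pd.DataFrame(rng.normal(size=(n, p)),
                     columns=[f"miR_{i}" for i in range(p)])
    X["T"] = rng.exponential(20, size=n)            # survival time
    X["E"] = rng.integers(0, 2, size=n)             # event indicator

    # L1-penalized (lasso) Cox regression; coefficients shrunk exactly to
    # zero drop out of the signature
    cph = CoxPHFitter(penalizer=0.1, l1_ratio=1.0)
    cph.fit(X, duration_col="T", event_col="E")
    selected = cph.params_[cph.params_.abs() > 1e-6].index  # surviving miRNAs
    risk_score = cph.predict_partial_hazard(X)              # higher = higher risk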

6.
Assessing influence in regression analysis with censored data.
Escobar LA, Meeker WQ. Biometrics. 1992;48(2):507-528.
In this paper we show how to evaluate the effect that perturbations to the model, data, or case weights have on maximum likelihood estimates from censored survival data. The ideas and methods also apply to other nonlinear estimation problems. We review the ideas behind using log-likelihood displacement and local influence methods. We describe new interpretations for some local influence statistics and show how these statistics extend and complement traditional case deletion influence statistics for linear least squares. These statistics identify individual and combinations of cases that have important influence on estimates of parameters and functions of these parameters. We illustrate the methods by reanalyzing the Stanford Heart Transplant data with a parametric regression model.
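The log-likelihood displacement idea is easy to illustrate on a toy parametric model; the sketch below uses a censored exponential model as a stand-in for the paper's regression models (all data and names are illustrative):

    import numpy as np

    def exp_loglik(rate, t, e):
        # log-likelihood of censored exponential data: events contribute
        # log f(t) = log(rate) - rate*t, censored cases contribute -rate*t
        return np.sum(e * np.log(rate) - rate * t)

    t = np.array([5., 8., 12., 2., 30., 7.])   # observed times
    e = np.array([1, 1, 0, 1, 0, 1])           # 1 = event, 0 = censored
    mle = e.sum() / t.sum()                    # closed-form MLE for the rate

    full = exp_loglik(mle, t, e)
    for i in range(len(t)):
        keep = np.arange(len(t)) != i
        mle_i = e[keep].sum() / t[keep].sum()  # re-estimate without case i
        # log-likelihood displacement: influence of case i on the fit
        ld_i = 2 * (full - exp_loglik(mle_i, t, e))
        print(f"case {i}: LD = {ld_i:.4f}")

Cases with large LD values are those whose deletion moves the MLE enough to noticeably lower the full-data log-likelihood.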

7.
Survival prediction from high-dimensional genomic data is dependent on a proper regularization method. With an increasing number of such methods proposed in the literature, comparative studies are called for, and some have been performed. However, there is currently no consensus on which prediction assessment criterion should be used for time-to-event data. Without firm knowledge about whether the choice of evaluation criterion may affect the conclusions made as to which regularization method performs best, these comparative studies may be of limited value. In this paper, four evaluation criteria are investigated: the log-rank test for two groups, the area under the time-dependent ROC curve (AUC), an R2-measure based on the Cox partial likelihood, and an R2-measure based on the Brier score. The criteria are compared according to how they rank six widely used regularization methods that are based on the Cox regression model, namely univariate selection, principal components regression (PCR), supervised PCR, partial least squares regression, ridge regression, and the lasso. Based on our application to three microarray gene expression data sets, we find that the results obtained from the widely used log-rank test deviate from the other three criteria studied. For future studies, where one might also want to include non-likelihood or non-model-based regularization methods, we argue in favor of the AUC and the R2-measure based on the Brier score, as these neither suffer from the arbitrary splitting into two groups nor depend on the Cox partial likelihood.
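As an illustration of one criterion argued for above, here is a sketch of a Brier-score-based R2 at a fixed horizon, with censoring ignored for brevity (real survival data requires inverse-probability-of-censoring weighting; all names and data are illustrative):

    import numpy as np

    def brier_r2(surv_prob_t, event_time, t):
        """R2 = 1 - BS(model) / BS(null) at horizon t (censoring ignored)."""
        y = (event_time > t).astype(float)        # 1 if event-free at t
        bs_model = np.mean((y - surv_prob_t) ** 2)
        bs_null = np.mean((y - y.mean()) ** 2)    # null model: marginal rate
        return 1.0 - bs_model / bs_null

    rng = np.random.default_rng(3)
    x = rng.normal(size=200)                      # a prognostic covariate
    T = rng.exponential(np.exp(x))                # true scale depends on x
    S_hat = np.exp(-5.0 / np.exp(x))              # model-based S(5 | x)
    print(brier_r2(S_hat, T, t=5.0))

Unlike the log-rank criterion, this needs no split of the cohort into two groups, and unlike the partial-likelihood R2 it applies to any method that outputs survival probabilities.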

8.
9.
Analysis of microarray data is associated with the methodological problems of high dimension and small sample size. Various methods have been used for variable selection in high-dimension, small-sample-size settings with a single survival endpoint. However, little effort has been directed toward addressing competing risks, where there is more than one type of failure. This study compared three typical variable selection techniques, Lasso, elastic net, and likelihood-based boosting, for high-dimensional time-to-event data with competing risks. The performance of these methods was evaluated in a simulation study and by analyzing a real dataset of bladder cancer patients, using time-dependent receiver operating characteristic (ROC) curves and bootstrap .632+ prediction error curves. The elastic net penalization method was shown to outperform Lasso and boosting. Based on the elastic net, 33 genes out of 1381 genes related to bladder cancer were selected. By fitting the Fine and Gray model, eight genes were highly significant (P < 0.001). Among them, expression of RTN4, SON, IGF1R, SNRPE, PTGR1, PLEK, and ETFDH was associated with a decrease in survival time, whereas SMARCAD1 expression was associated with an increase in survival time. This study indicates that the elastic net has a higher capacity than Lasso and boosting for the prediction of survival time in bladder cancer patients. Moreover, genes selected by all methods improved the predictive power of the model based only on clinical variables, indicating the value of the information contained in the microarray features.

10.
The development of high-throughput technology has generated a massive amount of high-dimensional data, much of it of discrete type. Robust and efficient learning algorithms such as LASSO [1] are required for feature selection and overfitting control. However, most feature selection algorithms are only applicable to continuous data. In this paper, we propose a novel method for sparse support vector machines (SVMs) with L_p (p < 1) regularization. Efficient algorithms (LpSVM) are developed for learning a classifier applicable to high-dimensional data sets with both discrete and continuous data types. The regularization parameters are estimated by maximizing the area under the ROC curve (AUC) on cross-validation data. Experimental results on protein sequence and SNP data attest to the accuracy, sparsity, and efficiency of the proposed algorithm. Biomarkers identified with our method are compared with those from other methods in the literature. The software package in Matlab is available upon request.

11.
Accessibility of high-throughput genotyping technology allows genome-wide association studies for common complex diseases. This paper addresses two challenges commonly facing such studies: (i) searching an enormous number of possible gene interactions and (ii) finding reproducible associations. These challenges have traditionally been addressed in statistics, while here we apply computational approaches: optimization and cross-validation. A complex risk factor is modeled as a subset of single nucleotide polymorphisms (SNPs) with specified alleles, and the optimization formulation asks for the subset with the maximum odds ratio. To measure and compare the ability of search methods to find reproducible risk factors, we propose to apply a cross-validation scheme usually used for prediction validation. We have applied and cross-validated known search methods, with proposed enhancements, on real case-control studies for several diseases (Crohn's disease, autoimmune disorder, tick-borne encephalitis, lung cancer, and rheumatoid arthritis). The proposed methods compare favorably with exhaustive search: they are faster, find statistically significant risk factors more frequently, and have a significantly higher leave-half-out cross-validation rate.
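A toy version of this formulation, with a greedy forward search standing in for the paper's optimization methods and synthetic genotypes (all names and the Haldane correction are illustrative choices):

    import numpy as np

    def odds_ratio(snp_subset, X, y):
        """OR for carrying the risk alleles at all SNPs in the subset."""
        carrier = X[:, snp_subset].all(axis=1)
        a = np.sum(carrier & (y == 1)) + 0.5   # +0.5: Haldane correction
        b = np.sum(carrier & (y == 0)) + 0.5
        c = np.sum(~carrier & (y == 1)) + 0.5
        d = np.sum(~carrier & (y == 0)) + 0.5
        return (a * d) / (b * c)

    def greedy_search(X, y, max_size=3):
        chosen = []
        for _ in range(max_size):
            best = max((j for j in range(X.shape[1]) if j not in chosen),
                       key=lambda j: odds_ratio(chosen + [j], X, y))
            chosen.append(best)
        return chosen

    rng = np.random.default_rng(4)
    X = rng.integers(0, 2, size=(400, 50)).astype(bool)   # risk-allele flags
    y = ((X[:, 3] & X[:, 7]) | (rng.random(400) < 0.3)).astype(int)

    # leave-half-out: search on one half, check reproducibility on the other
    perm = rng.permutation(400)
    train, test = perm[:200], perm[200:]
    factor = greedy_search(X[train], y[train])
    print(factor, odds_ratio(factor, X[test], y[test]))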

12.

Purpose

Recent high-throughput sequencing technology has identified numerous somatic mutations across the whole exome in a variety of cancers. In this study, we generate a predictive model employing the whole exome somatic mutational profile of ovarian high-grade serous carcinomas (Ov-HGSCs) obtained from The Cancer Genome Atlas data portal.

Methods

A total of 311 patients were included for modeling overall survival (OS) and 259 patients were included for modeling progression free survival (PFS) in an analysis of 509 genes. The model was validated with complete leave-one-out cross-validation involving re-selecting genes for each iteration of the cross-validation procedure. Cross-validated Kaplan-Meier curves were generated. Cross-validated time dependent receiver operating characteristic (ROC) curves were computed and the area under the curve (AUC) values were calculated from the ROC curves to estimate the predictive accuracy of the survival risk models.
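Schematically, the key point is that gene selection sits inside the leave-one-out loop. The sketch below shows the pattern with scikit-learn, where a binary high/low-risk label and univariate F-test selection stand in for the survival endpoint and the study's actual selection rule:

    import numpy as np
    from sklearn.pipeline import make_pipeline
    from sklearn.feature_selection import SelectKBest, f_classif
    from sklearn.linear_model import LogisticRegression
    from sklearn.model_selection import LeaveOneOut, cross_val_predict

    rng = np.random.default_rng(5)
    X = rng.normal(size=(100, 509))                  # 509 candidate genes
    y = (X[:, 0] - X[:, 1] + rng.normal(size=100) > 0).astype(int)

    # Placing selection inside the pipeline means the genes are re-selected
    # from scratch on each leave-one-out training set -- the "complete"
    # cross-validation that avoids selection bias.
    model = make_pipeline(SelectKBest(f_classif, k=20),
                          LogisticRegression(max_iter=1000))
    pred = cross_val_predict(model, X, y, cv=LeaveOneOut(),
                             method="predict_proba")[:, 1]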

Results

There was a significant difference in OS between the high-risk group (median, 28.1 months) and the low-risk group (median, 61.5 months) (permuted p-value <0.001). There was also a significant difference in PFS between the high-risk group (10.9 months) and the low-risk group (22.3 months) (permuted p-value <0.001). Cross-validated AUC values were 0.807 for OS and 0.747 for PFS based on a defined landmark time of t = 36 months. In comparisons between a predictive model containing only gene variables and a combined model containing both gene variables and clinical covariates, the predictive model containing gene variables without clinical covariates was effective, and high AUC values were observed for both OS and PFS.

Conclusions

We designed a predictive model using a somatic mutation profile obtained from high-throughput genomic sequencing data in Ov-HGSC samples that may represent a new strategy for applying high-throughput sequencing data to clinical practice.

13.
Würschum T, Kraft T. Heredity. 2014;112(4):463-468.
Association mapping has become a widely applied genomic approach to identify quantitative trait loci (QTL) and dissect the genetic architecture of complex traits. However, approaches to assess the quality of the obtained QTL results are lacking. We therefore evaluated the potential of cross-validation in association mapping based on a large sugar beet data set. Our results show that the proportions of the population that should be used as estimation and validation sets, respectively, depend on the size of the mapping population. Generally, fivefold cross-validation, that is, with 20% of the lines as an independent validation set, appears appropriate for commonly used population sizes. The predictive power for the proportion of genotypic variance explained by QTL was overestimated by 38% on average, indicating a strong bias in the estimated QTL effects. The cross-validated predictive power ranged between 4 and 50%, which is a more realistic estimate of this parameter for complex traits. In addition, QTL frequency distributions can be used to assess the precision of QTL position estimates and the robustness of the detected QTL. In summary, cross-validation can be a valuable tool to assess the quality of QTL parameters in association mapping.
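The estimation/validation split logic can be sketched as follows; marginal-correlation QTL detection and a linear model are simplifying stand-ins for an actual association-mapping scan, and all data are synthetic:

    import numpy as np
    from sklearn.linear_model import LinearRegression
    from sklearn.model_selection import KFold

    rng = np.random.default_rng(6)
    n, m = 500, 200                       # lines x markers
    X = rng.integers(0, 2, size=(n, m)).astype(float)
    y = X[:, :5] @ rng.normal(1, 0.2, 5) + rng.normal(0, 2, n)  # 5 true QTL

    r2_cv = []
    for est, val in KFold(n_splits=5, shuffle=True, random_state=0).split(X):
        # "detect" QTL on the estimation set only (top marginal correlations
        # stand in for a genome-wide scan)
        corr = np.abs([np.corrcoef(X[est, j], y[est])[0, 1] for j in range(m)])
        qtl = np.argsort(corr)[-5:]
        fit = LinearRegression().fit(X[est][:, qtl], y[est])
        r2_cv.append(fit.score(X[val][:, qtl], y[val]))  # predictive power
    print(np.mean(r2_cv))

The gap between the in-sample R2 on the estimation set and the cross-validated value mirrors the bias in explained genotypic variance described above.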

14.
We investigate a new method to place patients into risk groups in censored survival data. Properties such as median survival time and end survival rate are implicitly improved by optimizing the area under the survival curve. Artificial neural networks (ANN) are trained to either maximize or minimize this area using a genetic algorithm, and are combined into an ensemble to predict one of low, intermediate, or high risk groups. Estimated patient risk can influence treatment choices and is important for study stratification. A common approach is to sort the patients according to a prognostic index and then group them along the quartile limits; the Cox proportional hazards model (Cox) is one example of this approach. Another method of risk grouping is recursive partitioning (Rpart), which constructs a decision tree where each branch point maximizes the statistical separation between the groups. ANN, Cox, and Rpart are compared on five publicly available data sets with varying properties. Cross-validation, as well as separate test sets, are used to validate the models. Results on the test sets show comparable performance, except for the smallest data set, where Rpart's predicted risk groups turn out to be inverted, an example of crossing survival curves. Cross-validation shows that all three models exhibit crossing of some survival curves on this small data set, but that the ANN model achieves the best separation of groups in terms of median survival time before such crossings. The conclusion is that optimizing the area under the survival curve is a viable approach to identify risk groups. Training ANNs to optimize this area combines two key strengths of prognostic indices and Rpart: first, a desired minimum group size can be specified, as for a prognostic index; second, non-linear effects among the covariates can be utilized, as Rpart is also able to do.

15.
Murray S, Tsiatis AA. Biometrics. 1999;55(4):1085-1092.
This research develops nonparametric strategies for sequentially monitoring clinical trial data where detecting years of life saved is of interest. The recommended test statistic looks at integrated differences in survival estimates during the time frame of interest. In many practical situations, the test statistic presented has an independent increments covariance structure. Hence, with little additional work, we may apply these testing procedures using available methodology. In the case where an independent increments covariance structure is present, we suggest how clinical trial data might be monitored using these statistics in an information-based design. The resulting study design maintains the desired stochastic operating characteristics regardless of the shapes of the survival curves being compared. This offers an advantage over the popular log-rank-based design strategy since more restrictive assumptions relating to the behavior of the hazards are required to guarantee the planned power of the test. Recommendations for how to sequentially monitor clinical trial progress in the nonindependent increments case are also provided along with an example.
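The recommended statistic is essentially the area between two Kaplan-Meier curves over the time frame of interest. A compact sketch on synthetic data, without the sequential-monitoring machinery (ties are handled sequentially for simplicity; all names are illustrative):

    import numpy as np

    def km(time, event, grid):
        """Kaplan-Meier survival estimate evaluated on a common time grid."""
        order = np.argsort(time)
        t, e = time[order], event[order]
        surv, s = [], 1.0
        for g in grid:
            while len(t) and t[0] <= g:
                n_at_risk = len(t)          # subjects with time >= t[0]
                if e[0]:
                    s *= 1 - 1 / n_at_risk  # step down at each event
                t, e = t[1:], e[1:]         # censored cases just leave the risk set
            surv.append(s)
        return np.array(surv)

    rng = np.random.default_rng(7)
    t1, t0 = rng.exponential(12, 150), rng.exponential(9, 150)  # treatment, control
    e1, e0 = rng.random(150) < .8, rng.random(150) < .8         # ~20% censored
    grid = np.linspace(0, 15, 200)                              # time frame of interest
    # integrated difference in survival curves, i.e. years of life saved up to t=15
    yls = np.trapz(km(t1, e1, grid) - km(t0, e0, grid), grid)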

16.
The fate of scientific hypotheses often relies on the ability of a computational model to explain the data, quantified in modern statistical approaches by the likelihood function. The log-likelihood is the key element for parameter estimation and model evaluation. However, the log-likelihood of complex models in fields such as computational biology and neuroscience is often intractable to compute analytically or numerically. In those cases, researchers can often only estimate the log-likelihood by comparing observed data with synthetic observations generated by model simulations. Standard techniques to approximate the likelihood via simulation either use summary statistics of the data or are at risk of producing substantial biases in the estimate. Here, we explore another method, inverse binomial sampling (IBS), which can estimate the log-likelihood of an entire data set efficiently and without bias. For each observation, IBS draws samples from the simulator model until one matches the observation. The log-likelihood estimate is then a function of the number of samples drawn. The variance of this estimator is uniformly bounded, achieves the minimum variance for an unbiased estimator, and we can compute calibrated estimates of the variance. We provide theoretical arguments in favor of IBS and an empirical assessment of the method for maximum-likelihood estimation with simulation-based models. As case studies, we take three model-fitting problems of increasing complexity from computational and cognitive neuroscience. In all problems, IBS generally produces lower error in the estimated parameters and maximum log-likelihood values than alternative sampling methods with the same average number of samples. Our results demonstrate the potential of IBS as a practical, robust, and easy to implement method for log-likelihood evaluation when exact techniques are not available.
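The core of IBS is compact enough to sketch directly: for each observation, draw from the simulator until the first match occurs at draw K, and use -sum_{k=1}^{K-1} 1/k as the unbiased estimate of the log-probability of that observation. The Bernoulli simulator below is a toy stand-in for a real model:

    import numpy as np

    def ibs_loglik(simulate, observations, rng):
        """Unbiased log-likelihood estimate via inverse binomial sampling."""
        total = 0.0
        for obs in observations:
            k = 1
            while simulate(rng) != obs:     # draw until the first match
                k += 1
            total += -np.sum(1.0 / np.arange(1, k))  # empty sum = 0 if k == 1
        return total

    # toy model: coin with p(heads) = 0.3; the exact log-likelihood is known
    rng = np.random.default_rng(8)
    obs = (rng.random(100) < 0.3).astype(int)
    sim = lambda r: int(r.random() < 0.3)
    est = ibs_loglik(sim, obs, rng)
    exact = np.sum(np.where(obs == 1, np.log(0.3), np.log(0.7)))
    print(est, exact)

Because the estimate depends only on the number of draws until a match, no summary statistics or kernel choices are involved, which is where the unbiasedness comes from.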

17.
Acute myeloid leukaemia (AML) is the most common type of adult acute leukaemia and has a poor prognosis. Thus, optimal risk stratification is of the greatest importance for a reasonable choice of treatment and for prognostic evaluation. For our study, a total of 1707 samples of AML patients from three public databases were divided into meta-training, meta-testing, and validation sets. The meta-training set was used to build the risk prediction model, and the other four data sets were employed for validation. Using the log-rank test and univariate Cox regression analysis as well as LASSO-Cox, AML patients were divided into high-risk and low-risk groups based on an AML risk score (AMLRS) constituted by 10 survival-related genes. In the meta-training, meta-testing, and validation sets, the patients in the low-risk group all had a significantly longer overall survival (OS) than those in the high-risk group (P < .001), and the area under the ROC curve (AUC) by time-dependent ROC was 0.5854-0.7905 for 1 year, 0.6652-0.8066 for 3 years, and 0.6622-0.8034 for 5 years. Multivariate Cox regression analysis indicated that AMLRS was an independent prognostic factor in all four data sets. A nomogram combining the AMLRS and two clinical parameters performed well in predicting 1-year, 3-year, and 5-year OS. Finally, we created a web-based prognostic model to predict the prognosis of AML patients (https://tcgi.shinyapps.io/amlrs_nomogram/).

18.
A popular commercially available oligonucleotide microarray technology employs sets of 25-base-pair oligonucleotide probes to measure gene expression levels. A mathematical algorithm is required to compute an estimate of gene expression from the multiple probes. Previously proposed methods for summarizing gene expression data have either been substantially ad hoc or have relied on model assumptions that may be easily violated. Here we present a new algorithm for calculating gene expression from probe sets. Our approach is functionally related to leave-one-out cross-validation, a non-parametric statistical technique that is often applied in limited-data situations. We illustrate this approach using data from our study seeking a molecular fingerprint of STAT3-regulated genes for early detection of human cancer.

19.

Background

Modern experimental techniques deliver data sets containing profiles of tens of thousands of potential molecular and genetic markers that can be used to improve medical diagnostics. Previous studies performed with three different experimental methods on the same set of neuroblastoma patients create an opportunity to examine whether augmenting gene expression profiles with information on copy number variation can lead to improved predictions of patient survival. We propose a methodology based on a comprehensive cross-validation protocol that includes feature selection within the cross-validation loop and classification using machine learning. We also test the dependence of the results on the feature selection process using four different feature selection methods.

Results

The models utilising features selected based on information entropy are slightly, but significantly, better than those using features obtained with the t-test. A synergy between data on genetic variation and gene expression is possible, but not confirmed. A slight, but statistically significant, increase in the predictive power of machine learning models was observed for models built on combined data sets. This was found both when using the out-of-bag estimate and in cross-validation performed on a single set of variables. However, the improvement was smaller and non-significant when models were built within the full cross-validation procedure that included feature selection within the cross-validation loop. Good correlation between the performance of the models in internal and external cross-validation was observed, confirming the robustness of the proposed protocol and results.

Conclusions

We have developed a protocol for building predictive machine learning models. The protocol can provide robust estimates of model performance on unseen data, and it is particularly well-suited for small data sets. We have applied this protocol to develop prognostic models for neuroblastoma, using data on copy number variation and gene expression. We have shown that combining these two sources of information may increase the quality of the models. Nevertheless, the increase is small, and larger samples are required to reduce the noise and bias arising from overfitting.

Reviewers

This article was reviewed by Lan Hu, Tim Beissbarth and Dimitar Vassilev.

20.
In medical statistics, many alternative strategies are available for building a prediction model based on training data. Prediction models are routinely compared by means of their prediction performance on independent validation data. If only one data set is available for training and validation, rival strategies can still be compared based on repeated bootstraps of the same data. Often, however, the overall performance of rival strategies is similar, and it is thus difficult to decide on one model. Here, we investigate the variability of the prediction models that results when the same modelling strategy is applied to different training sets. For each modelling strategy we estimate a confidence score based on the same repeated bootstraps. A new decomposition of the expected Brier score is obtained, as well as estimates of population-average confidence scores. The latter can be used to distinguish rival prediction models with similar prediction performances. Furthermore, at the subject level, a confidence score may provide useful supplementary information for new patients who want to base a medical decision on predicted risk. The ideas are illustrated and discussed using data from cancer studies, including settings with a high-dimensional predictor space.
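The bootstrap-based confidence idea can be sketched as follows: refit the same strategy on bootstrap training sets and measure, per subject, how much the predicted risks vary. The variance-based score and its scaling below are illustrative choices, not the paper's exact decomposition of the Brier score:

    import numpy as np
    from sklearn.linear_model import LogisticRegression
    from sklearn.utils import resample

    rng = np.random.default_rng(9)
    X = rng.normal(size=(200, 10))
    y = (X[:, 0] + rng.normal(size=200) > 0).astype(int)
    X_new = rng.normal(size=(5, 10))                 # new patients

    preds = []
    for b in range(200):                             # repeated bootstraps
        Xb, yb = resample(X, y, random_state=b)      # bootstrap training set
        model = LogisticRegression(max_iter=1000).fit(Xb, yb)
        preds.append(model.predict_proba(X_new)[:, 1])
    preds = np.array(preds)

    # per-subject confidence: low spread across refits = more stable prediction
    mean_risk = preds.mean(axis=0)
    confidence = 1 - 2 * preds.std(axis=0)           # illustrative scaling

Two strategies with similar average Brier scores can then be told apart by which one yields the more stable per-subject predictions across refits.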
