Overfitting, generalization, and MSE in class probability estimation with high‐dimensional data
Authors: Kyung In Kim, Richard Simon
Affiliation: Biometric Research Branch, National Cancer Institute, MSC 9735, Bethesda, USA
Abstract: Accurate class probability estimation is important for medical decision making but is challenging, particularly when the number of candidate features exceeds the number of cases. Special methods have been developed for nonprobabilistic classification, but relatively little attention has been given to class probability estimation with numerous candidate variables. In this paper, we investigate overfitting in the development of regularized class probability estimators. We investigate the relation between overfitting and accurate class probability estimation in terms of mean square error. Using simulation studies based on real datasets, we found that some degree of overfitting can be desirable for reducing mean square error. We also introduce a mean square error decomposition for class probability estimation that helps clarify the relationship between overfitting and prediction accuracy.
Keywords: Class probability estimation; Covariance penalty; High‐dimensional data; Mean square error; Overfitting
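
Illustrative sketch (not from the paper): the abstract evaluates regularized class probability estimators by mean square error in simulations where the number of features exceeds the number of cases. The Python code below is a minimal sketch of that kind of experiment under assumptions that are ours, not the authors': synthetic Gaussian features, a sparse logistic truth, a ridge-penalized logistic regression fitted by plain gradient descent, and the textbook bias-variance split of MSE over simulation replicates. It does not reproduce the paper's own MSE decomposition or its covariance penalty, and the names `fit_ridge_logistic` and `simulate` are hypothetical.

```python
import numpy as np


def sigmoid(z):
    # Clip to avoid overflow in exp for extreme linear predictors.
    return 1.0 / (1.0 + np.exp(-np.clip(z, -50.0, 50.0)))


def fit_ridge_logistic(X, y, lam, n_iter=500, lr=0.1):
    """Gradient descent on mean logistic loss + (lam / 2) * ||w||^2.

    Assumption: no intercept, fixed step size; this is only a sketch.
    """
    n, p = X.shape
    w = np.zeros(p)
    for _ in range(n_iter):
        prob = sigmoid(X @ w)
        grad = X.T @ (prob - y) / n + lam * w
        w -= lr * grad
    return w


def simulate(lams=(0.01, 0.1, 1.0), n=40, p=200, n_test=200, n_rep=30, seed=0):
    """Return {lam: (mse, bias2, variance)} of estimated class probabilities.

    MSE is measured against the TRUE class probabilities, which are known
    here only because the data are simulated (p > n, as in the abstract).
    """
    rng = np.random.default_rng(seed)
    beta = np.zeros(p)
    beta[:5] = 1.0  # hypothetical sparse truth: 5 informative features
    X_test = rng.standard_normal((n_test, p))
    p_true = sigmoid(X_test @ beta)

    results = {}
    for lam in lams:
        preds = np.empty((n_rep, n_test))
        for r in range(n_rep):
            X = rng.standard_normal((n, p))
            y = rng.binomial(1, sigmoid(X @ beta)).astype(float)
            w = fit_ridge_logistic(X, y, lam)
            preds[r] = sigmoid(X_test @ w)
        mse = np.mean((preds - p_true) ** 2)
        # Textbook split over replicates: squared bias + variance.
        bias2 = np.mean((preds.mean(axis=0) - p_true) ** 2)
        variance = np.mean(preds.var(axis=0))
        results[lam] = (mse, bias2, variance)
    return results


if __name__ == "__main__":
    for lam, (mse, bias2, variance) in simulate().items():
        print(f"lambda={lam}: mse={mse:.4f} bias^2={bias2:.4f} var={variance:.4f}")
```

Smaller values of `lam` fit the training cases more closely, so comparing the rows shows how the penalty strength trades squared bias against variance in the probability estimates. Which setting gives the lowest MSE depends on the simulated scenario and should not be read as a result of the paper.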