Efficient approximate k‐fold and leave‐one‐out cross‐validation for ridge regression |
| |
Authors: | Rosa J Meijer, Jelle J Goeman |
| |
Institution: | Department of Medical Statistics and Bioinformatics, Leiden University Medical Center, Postzone S5‐P, P.O. Box 9604, 2300 RC Leiden, The Netherlands |
| |
Abstract: | Cross‐validation is a frequently used resampling method in model building and model evaluation. Unfortunately, it can be quite time consuming. In this article, we discuss a much faster approximation method that can be used in generalized linear models and Cox’ proportional hazards model with a ridge penalty term. The approximation is based on a Taylor expansion around the estimate of the full model, so that all cross‐validated estimates are approximated without refitting the model. The tuning parameter can then be chosen based on these approximations and optimized in less time. The method is most accurate when approximating leave‐one‐out cross‐validation results for large data sets, which is originally the most computationally demanding situation. To demonstrate the method's performance, we apply it to several microarray data sets. The R package penalized, which implements the method, is available on CRAN. |
| |
Keywords: | Approximation method; Cross‐validation; Matrix Inversion Lemma; Ridge regression; Survival analysis |
|
|
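
The idea of obtaining cross‐validated estimates without refitting can be illustrated in the simplest setting, linear ridge regression. There the penalized loss is quadratic, so a single Newton step from the full‐data estimate gives the leave‐one‐out estimate exactly, and the matrix inversion lemma reduces the leave‐one‐out residual to r_i / (1 - h_ii), where r_i is the ordinary residual and h_ii = x_i' (X'X + lambda I)^(-1) x_i. The Python sketch below is only an illustration under these assumptions and is not the implementation in the penalized package: the function names and the toy data are hypothetical, and every coefficient (including the constant column) is penalized for simplicity. For generalized linear models and the Cox model the analogous step is an approximation rather than an identity.

```python
def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(A)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]  # augmented copy
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(M[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (M[r][n] - s) / M[r][r]
    return x


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def ridge_fit(X, y, lam):
    """Return (beta, A) with A = X'X + lam*I and beta = A^{-1} X'y."""
    p = len(X[0])
    A = [[sum(row[j] * row[k] for row in X) + (lam if j == k else 0.0)
          for k in range(p)] for j in range(p)]
    b = [sum(row[j] * yi for row, yi in zip(X, y)) for j in range(p)]
    return solve(A, b), A


def loo_residuals_shortcut(X, y, lam):
    """Leave-one-out residuals from a single fit on the full data."""
    beta, A = ridge_fit(X, y, lam)
    out = []
    for xi, yi in zip(X, y):
        h = dot(xi, solve(A, xi))          # leverage h_ii
        out.append((yi - dot(xi, beta)) / (1.0 - h))
    return out


def loo_residuals_refit(X, y, lam):
    """Leave-one-out residuals by refitting n times (the slow reference)."""
    out = []
    for i in range(len(X)):
        beta, _ = ridge_fit(X[:i] + X[i + 1:], y[:i] + y[i + 1:], lam)
        out.append(y[i] - dot(X[i], beta))
    return out


# Hypothetical toy data: a constant column plus one covariate.
X = [[1.0, 0.5], [1.0, -1.2], [1.0, 2.0], [1.0, 0.3], [1.0, -0.7]]
y = [1.1, -0.4, 3.2, 0.9, -0.1]
lam = 0.5

fast = loo_residuals_shortcut(X, y, lam)
slow = loo_residuals_refit(X, y, lam)
# The two agree up to floating-point rounding in the linear case.
print(max(abs(f - s) for f, s in zip(fast, slow)))
```

The shortcut needs one model fit plus one linear solve per observation, whereas the reference refits the model n times; this is the kind of saving the abstract refers to when it says the tuning parameter can be optimized in less time.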