

A general framework of nonparametric feature selection in high-dimensional data
Authors: Hang Yu, Yuanjia Wang, Donglin Zeng
Affiliation: 1. Department of Statistics and Operations Research, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, USA; 2. Department of Biostatistics, Columbia University, New York, New York, USA; 3. Department of Biostatistics, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, USA
Abstract: Nonparametric feature selection for high-dimensional data is an important and challenging problem in statistics and machine learning. Most existing feature-selection methods focus on parametric or additive models, which may suffer from model misspecification. In this paper, we propose a new framework for nonparametric feature selection in both regression and classification problems. Under this framework, we learn prediction functions through empirical risk minimization over a reproducing kernel Hilbert space generated by a novel tensor product kernel, which depends on a set of parameters that determine the importance of the features. Computationally, we minimize a penalized empirical risk to estimate the prediction function and the kernel parameters simultaneously; the solution is obtained by iteratively solving convex optimization problems. We study the theoretical properties of the kernel feature space and prove the oracle selection property and Fisher consistency of the proposed method. Finally, we demonstrate the superior performance of our approach over existing methods through extensive simulation studies and applications to two real studies.
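The alternating scheme described in the abstract can be sketched as a toy implementation. This is only an illustrative approximation, not the paper's method: the function names are hypothetical, a Gaussian base kernel with per-feature importance weights gamma_j (gamma_j = 0 drops feature j) stands in for the paper's tensor product kernel, squared-error kernel ridge regression stands in for general empirical risk minimization, and a simple projected proximal-gradient step handles the sparsity penalty on the kernel parameters.

```python
import numpy as np

def tensor_product_kernel(X1, X2, gamma):
    """Gaussian tensor-product-style kernel with feature weights gamma (>= 0).

    K(x, x') = prod_j exp(-gamma_j (x_j - x'_j)^2); setting gamma_j = 0
    removes feature j from the kernel entirely.
    """
    diff2 = (X1[:, None, :] - X2[None, :, :]) ** 2   # (n1, n2, p)
    return np.exp(-(diff2 * gamma).sum(axis=-1))     # (n1, n2)

def fit_alternating(X, y, lam=0.1, mu=0.05, n_outer=20, lr=0.1):
    """Alternate two convex subproblems, mimicking the iterative scheme:
    (1) kernel ridge regression for alpha given gamma;
    (2) a projected proximal-gradient step on gamma with an L1 penalty
        mu * sum(gamma) that shrinks unimportant features toward zero.
    """
    n, p = X.shape
    gamma = np.full(p, 0.5)                          # initial importance weights
    diff2 = (X[:, None, :] - X[None, :, :]) ** 2     # precompute (n, n, p)
    for _ in range(n_outer):
        K = np.exp(-(diff2 * gamma).sum(axis=-1))
        alpha = np.linalg.solve(K + lam * n * np.eye(n), y)   # ridge step
        resid = K @ alpha - y
        # Gradient of ||K alpha - y||^2 in gamma_k (alpha held fixed);
        # dK[i,j]/dgamma_k = -diff2[i,j,k] * K[i,j].
        grad = 2 * np.einsum('i,ijk,j->k', resid, -diff2 * K[:, :, None], alpha)
        # Proximal step for the L1 penalty, projected onto gamma >= 0.
        gamma = np.maximum(gamma - lr * (grad / n + mu), 0.0)
    return gamma, alpha
```

Features whose estimated gamma_j hits exactly zero are excluded from the learned kernel, which is the sense in which the kernel parameters perform feature selection; the paper's actual penalty, base kernel, and convergence guarantees differ from this sketch.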
Keywords: Fisher consistency; oracle property; reproducing kernel Hilbert space; tensor product kernel; variable selection