

A general framework of nonparametric feature selection in high-dimensional data
Authors: Hang Yu, Yuanjia Wang, Donglin Zeng
Affiliation: 1. Department of Statistics and Operations Research, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, USA; 2. Department of Biostatistics, Columbia University, New York, New York, USA; 3. Department of Biostatistics, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, USA
Abstract: Nonparametric feature selection for high-dimensional data is an important and challenging problem in statistics and machine learning. Most existing feature-selection methods focus on parametric or additive models, which may suffer from model misspecification. In this paper, we propose a new framework for nonparametric feature selection in both regression and classification problems. Under this framework, we learn prediction functions through empirical risk minimization over a reproducing kernel Hilbert space generated by a novel tensor product kernel, which depends on a set of parameters that determine the importance of the features. Computationally, we minimize a penalized empirical risk to estimate the prediction function and the kernel parameters simultaneously; the solution is obtained by iteratively solving convex optimization problems. We study the theoretical properties of the kernel feature space and prove the oracle selection property and Fisher consistency of the proposed method. Finally, we demonstrate the superior performance of our approach over existing methods through extensive simulation studies and applications to two real studies.
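The alternating scheme described in the abstract can be sketched as a toy implementation. This is only an illustrative approximation, not the paper's method: the function names are hypothetical, a Gaussian base kernel with per-feature importance weights gamma_j (gamma_j = 0 drops feature j) stands in for the paper's tensor product kernel, squared-error kernel ridge regression stands in for general empirical risk minimization, and a simple projected proximal-gradient step handles the sparsity penalty on the kernel parameters.

```python
import numpy as np

def tensor_product_kernel(X1, X2, gamma):
    """Gaussian tensor-product-style kernel with feature weights gamma (>= 0).

    K(x, x') = prod_j exp(-gamma_j (x_j - x'_j)^2); setting gamma_j = 0
    removes feature j from the kernel entirely.
    """
    diff2 = (X1[:, None, :] - X2[None, :, :]) ** 2   # (n1, n2, p)
    return np.exp(-(diff2 * gamma).sum(axis=-1))     # (n1, n2)

def fit_alternating(X, y, lam=0.1, mu=0.05, n_outer=20, lr=0.1):
    """Alternate two convex subproblems, mimicking the iterative scheme:
    (1) kernel ridge regression for alpha given gamma;
    (2) a projected proximal-gradient step on gamma with an L1 penalty
        mu * sum(gamma) that shrinks unimportant features toward zero.
    """
    n, p = X.shape
    gamma = np.full(p, 0.5)                          # initial importance weights
    diff2 = (X[:, None, :] - X[None, :, :]) ** 2     # precompute (n, n, p)
    for _ in range(n_outer):
        K = np.exp(-(diff2 * gamma).sum(axis=-1))
        alpha = np.linalg.solve(K + lam * n * np.eye(n), y)   # ridge step
        resid = K @ alpha - y
        # Gradient of ||K alpha - y||^2 in gamma_k (alpha held fixed);
        # dK[i,j]/dgamma_k = -diff2[i,j,k] * K[i,j].
        grad = 2 * np.einsum('i,ijk,j->k', resid, -diff2 * K[:, :, None], alpha)
        # Proximal step for the L1 penalty, projected onto gamma >= 0.
        gamma = np.maximum(gamma - lr * (grad / n + mu), 0.0)
    return gamma, alpha
```

Features whose estimated gamma_j hits exactly zero are excluded from the learned kernel, which is the sense in which the kernel parameters perform feature selection; the paper's actual penalty, base kernel, and convergence guarantees differ from this sketch.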
Keywords: Fisher consistency; oracle property; reproducing kernel Hilbert space; tensor product kernel; variable selection