Control of population stratification by correlation-selected principal components |
| |
Authors: | Seunggeun Lee, Fred A. Wright, Fei Zou |
| |
Affiliation: | Department of Biostatistics, University of North Carolina, Chapel Hill, North Carolina 27599, USA. slee@bios.unc.edu |
| |
Abstract: | In genome-wide association studies, population stratification is recognized as producing inflated type I error due to the inflation of test statistics. Principal component-based methods applied to genotypes provide information about population structure, and have been widely used to control for stratification. Here we explore the precise relationship between genotype principal components and inflation of association test statistics, thereby drawing a connection between principal component-based stratification control and the alternative approach of genomic control. Our results provide an inherent justification for the use of principal components, but call into question the popular practice of selecting principal components based on significance of eigenvalues alone. We propose a new approach, called EigenCorr, which selects principal components based on both their eigenvalues and their correlation with the (disease) phenotype. Our approach tends to select fewer principal components for stratification control than does testing of eigenvalues alone, providing substantial computational savings and improvements in power. Analyses of simulated and real data demonstrate the usefulness of the proposed approach. |
| |
Keywords: | Genomic control; GWAS; PCA; Population stratification |
This article is indexed in PubMed and other databases. |
|
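The abstract's central idea, choosing principal components by both their eigenvalues and their correlation with the phenotype, can be illustrated with a small sketch. This is not the published EigenCorr statistic: the function name `select_pcs`, the score (eigenvalue times squared correlation), the fixed cutoff `n_keep`, and the simulated two-population data are all assumptions made for illustration only.

```python
import numpy as np

def select_pcs(genotypes, phenotype, n_keep=2):
    """Rank genotype PCs by eigenvalue * squared correlation with the phenotype.

    Hypothetical helper: the score and the n_keep cutoff are illustrative
    assumptions, not the statistic defined in the paper.
    """
    G = np.asarray(genotypes, dtype=float)   # samples x SNPs
    y = np.asarray(phenotype, dtype=float)

    # Standardize each SNP column, dropping monomorphic SNPs.
    G = G - G.mean(axis=0)
    sd = G.std(axis=0)
    G = G[:, sd > 0] / sd[sd > 0]

    # Columns of U are sample-level PCs; s**2 is proportional to the eigenvalues.
    U, s, _ = np.linalg.svd(G, full_matrices=False)
    keep = s > 1e-10 * s[0]                  # discard numerically null components
    U, s = U[:, keep], s[keep]
    eigvals = s**2 / G.shape[1]

    # Retained PCs are mean-zero with unit norm, so this is a Pearson correlation.
    yc = y - y.mean()
    corr = (U.T @ yc) / np.linalg.norm(yc)

    score = eigvals * corr**2
    order = np.argsort(score)[::-1][:n_keep]
    return order, eigvals, corr

# Simulated example (assumed setup): two subpopulations whose allele
# frequencies differ, with a phenotype that depends on subpopulation.
rng = np.random.default_rng(0)
m = 500
p0 = rng.uniform(0.1, 0.9, size=m)
p1 = np.clip(p0 + rng.normal(0.0, 0.15, size=m), 0.05, 0.95)
G = np.vstack([rng.binomial(2, p0, size=(100, m)),
               rng.binomial(2, p1, size=(100, m))])
pop = np.repeat([0.0, 1.0], 100)
y = pop + rng.normal(0.0, 0.5, size=200)

order, eigvals, corr = select_pcs(G, y, n_keep=2)
print("selected PC indices:", order)
print("correlation of leading PC with phenotype:", corr[0])
```

In this toy setting the leading component tracks subpopulation membership, so it has both a large eigenvalue and a strong phenotype correlation. Components with large eigenvalues but negligible phenotype correlation receive low scores, which mirrors the abstract's point that eigenvalue significance alone can select more components than are needed.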