Screening large-scale association study data: exploiting interactions using random forests

Authors:	Kathryn L Lunetta, L Brooke Hayward, Jonathan Segal, Paul Van Eerdewegh

Institution:	(1) Oscient Pharmaceuticals, Inc. (formerly Genome Therapeutics Corporation), Waltham, Massachusetts, USA; (2) Department of Biostatistics, Boston University School of Public Health, Boston, Massachusetts, USA; (3) Genizon BioSciences Inc., Montreal, Quebec, Canada; (4) Department of Psychiatry, Harvard Medical School, Boston, Massachusetts, USA

Abstract:	Background: Genome-wide association studies for complex diseases will produce genotypes on hundreds of thousands of single nucleotide polymorphisms (SNPs). A logical first approach to dealing with massive numbers of SNPs is to screen them with some test, retaining only those that meet some criterion for further study. For example, SNPs can be ranked by p-value, and those with the lowest p-values retained. When SNPs have large interaction effects but small marginal effects in a population, they are unlikely to be retained when univariate tests are used for screening. However, model-based screens that pre-specify interactions are impractical for data sets with thousands of SNPs. Random forest analysis is an alternative method that produces a single measure of importance for each predictor variable, one that takes into account interactions among variables without requiring model specification. Interactions increase the importance of the individual interacting variables, making them more likely to be given high importance relative to other variables. We test the performance of random forests as a screening procedure to identify small numbers of risk-associated SNPs from among large numbers of unassociated SNPs using complex disease models with up to 32 loci, incorporating both genetic heterogeneity and multi-locus interaction.
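The point about large interaction effects with small marginal effects can be illustrated with a small exact calculation. The sketch below is not the disease model used in the study. It assumes a hypothetical pair of unlinked SNPs, each with allele frequency 0.5 under Hardy-Weinberg proportions, and made-up penetrances (`LOW`, `HIGH`) chosen so that risk is elevated only when exactly one of the two SNPs is heterozygous. Under these assumptions the risk conditional on either SNP alone is the same for every genotype, so a univariate screen has nothing to detect, while the joint penetrance table varies fivefold.

```python
from fractions import Fraction

# Hypothetical setup (not from the paper): two unlinked biallelic SNPs,
# allele frequency 0.5, Hardy-Weinberg genotype frequencies for
# 0, 1, or 2 copies of the minor allele.
GENOTYPE_FREQ = {0: Fraction(1, 4), 1: Fraction(1, 2), 2: Fraction(1, 4)}

# Illustrative penetrances (probability of disease); assumed values.
LOW, HIGH = Fraction(1, 20), Fraction(1, 4)


def penetrance(g_a, g_b):
    """Risk is high only when exactly one of the two SNPs is heterozygous."""
    return HIGH if (g_a == 1) != (g_b == 1) else LOW


def marginal_penetrance(genotype):
    """Risk given one SNP's genotype, averaged over the other SNP.

    The model is symmetric, so this holds for either SNP.
    """
    return sum(freq * penetrance(genotype, g_other)
               for g_other, freq in GENOTYPE_FREQ.items())


# Risk seen by a single-SNP (univariate) test, per genotype.
marginal = {g: marginal_penetrance(g) for g in GENOTYPE_FREQ}

# Risk seen when both SNPs are considered together.
joint = {(a, b): penetrance(a, b)
         for a in GENOTYPE_FREQ for b in GENOTYPE_FREQ}

if __name__ == "__main__":
    print("marginal risk by genotype:", marginal)
    print("joint risk by genotype pair:", joint)
```

In this toy model a p-value ranking of single SNPs would treat both loci as unassociated, which is the situation that motivates a screen whose importance measure reflects interactions.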

Keywords:
This article is indexed in SpringerLink and other databases.

Background