Performance of random forest when SNPs are in linkage disequilibrium
Authors: Yan A Meng, Yi Yu, L Adrienne Cupples, Lindsay A Farrer, Kathryn L Lunetta
Affiliation: (1) Genetics Program, Department of Medicine, School of Medicine, Boston University, Boston, MA, USA; (2) Department of Biostatistics, School of Public Health, Boston University, Boston, MA, USA; (3) Center for Human Genetic Research, Massachusetts General Hospital, Boston, MA, USA; (4) Broad Institute of Harvard and Massachusetts Institute of Technology, Cambridge, MA, USA
Abstract:

Background  

Single nucleotide polymorphisms (SNPs) may be correlated due to linkage disequilibrium (LD). Association studies look for both direct and indirect associations with disease loci. In a Random Forest (RF) analysis, correlation between a true risk SNP and SNPs in LD with it may lead to diminished variable importance for the true risk SNP. One approach to address this problem is to select only SNPs in linkage equilibrium (LE) for analysis. Here, we explore alternative methods for dealing with SNPs in LD: changing the tree-building algorithm so that each tree in an RF is built only with SNPs in LE, modifying the importance measure (IM), and using haplotypes instead of SNPs to build an RF.
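The first alternative, building each tree only with SNPs in LE, can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `pairwise_r2` and `sample_le_subset`, the use of squared genotype correlation as a stand-in for LD, the 0.8 threshold, and the toy data are all assumptions made for illustration. For each tree, SNPs are visited in random order and one is kept only if it is not in high LD with any SNP already kept; the tree would then be grown on that subset.

```python
import numpy as np

def pairwise_r2(genotypes):
    """Squared Pearson correlation between SNP columns (0/1/2 allele counts).

    Assumption: genotype correlation is used here as a simple proxy for LD.
    """
    r = np.corrcoef(genotypes, rowvar=False)
    return np.nan_to_num(r) ** 2

def sample_le_subset(r2, threshold, rng):
    """Visit SNPs in random order; keep a SNP only if its r^2 with every
    SNP already kept is below the threshold (hypothetical LE criterion)."""
    kept = []
    for j in rng.permutation(r2.shape[0]):
        if all(r2[j, k] < threshold for k in kept):
            kept.append(int(j))
    return sorted(kept)

# Toy data: SNP 1 is a copy of SNP 0 (perfect LD); SNP 2 is independent.
rng = np.random.default_rng(0)
n_subjects = 500
snp0 = rng.integers(0, 3, size=n_subjects)
snp1 = snp0.copy()
snp2 = rng.integers(0, 3, size=n_subjects)
genotypes = np.column_stack([snp0, snp1, snp2])

r2 = pairwise_r2(genotypes)

# One candidate SNP subset per tree; each tree would be grown on its subset.
subsets = [sample_le_subset(r2, threshold=0.8, rng=rng) for _ in range(5)]
```

Because the visiting order is random, different trees can keep different members of a correlated pair, so across the forest each SNP in an LD block still has a chance to be used.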
This article is indexed in SpringerLink and other databases.