Design and coverage of high throughput genotyping arrays optimized for individuals of East Asian, African American, and Latino race/ethnicity using imputation and a novel hybrid SNP selection algorithm |
| |
Authors: | Hoffmann, Thomas J; Zhan, Yiping; Kvale, Mark N; Hesselson, Stephanie E; Gollub, Jeremy; Iribarren, Carlos; Lu, Yontao; Mei, Gangwu; Purdy, Matthew M; Quesenberry, Charles; Rowell, Sarah; Shapero, Michael H; Smethurst, David; Somkin, Carol P; Van den Eeden, Stephen K; Walter, Larry; Webster, Teresa; Whitmer, Rachel A; Finn, Andrea; Schaefer, Catherine; Kwok, Pui-Yan; Risch, Neil |
| |
Affiliation: | (a) Institute for Human Genetics, University of California, San Francisco, CA, USA; (b) Affymetrix Incorporated, Santa Clara, CA, USA; (c) Kaiser Permanente Northern California Division of Research, Oakland, CA, USA; (d) Cardiovascular Research Institute, University of California, San Francisco, CA, USA; (e) Department of Epidemiology and Biostatistics, University of California, San Francisco, CA, USA |
| |
Abstract: | Four custom Axiom genotyping arrays were designed for a genome-wide association (GWA) study of 100,000 participants from the Kaiser Permanente Research Program on Genes, Environment and Health. The array optimized for individuals of European race/ethnicity was previously described. Here we detail the development of three additional microarrays optimized for individuals of East Asian, African American, and Latino race/ethnicity. For these arrays, we decreased redundancy of high-performing SNPs to increase SNP capacity. The East Asian array was designed using greedy pairwise SNP selection. However, removing SNPs from the target set based on imputation coverage is more efficient than pairwise tagging. Therefore, we developed a novel hybrid SNP selection method for the African American and Latino arrays utilizing rounds of greedy pairwise SNP selection, followed by removal from the target set of SNPs covered by imputation. The arrays provide excellent genome-wide coverage and are valuable additions for large-scale GWA studies. |
| |
Keywords: | Abbreviations: GWA, genome-wide association; MAF, minor allele frequency; KGP, 1000 Genomes Project; RPGEH, Research Program on Genes, Environment and Health; EUR, European and West Asian; EAS, East Asian; AFR, African American; LAT, Latino; 2-rep, 2 features; 1-rep, 1 feature; ASW, African Ancestry in Southwest USA; CEU, Utah residents with ancestry from Northern and Western Europe from Centre d'Etude du Polymorphisme Humain; CHB, Han Chinese in Beijing; CHS, Han Chinese South; CLM, Colombian in Medellin, Colombia; Fin, Finnish from Finland; GBR, British individuals from England and Scotland; IBS, Iberians in Spain; JPT, Japanese in Tokyo; LWK, Luhya in Webuye Kenya; MXL, Mexican in Los Angeles, CA; PUR, Puerto Rican in Puerto Rico; TSI, Toscani in Italia; YRI, Yoruba in Ibadan, Nigeria; KGHP, 1000 Genomes High Pass; KPNC, Kaiser Permanente Northern California; AIMs, Ancestry Informative Markers; KG2011, 1000 Genomes interim June 2011 release |
This article is indexed in ScienceDirect, PubMed, and other databases. |
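
The hybrid selection idea summarized in the abstract (rounds of greedy pairwise SNP selection, each followed by removing imputation-covered SNPs from the target set) can be illustrated with a small Python sketch. This is only a toy under stated assumptions, not the authors' implementation: the function names, the r² threshold of 0.8, the number of picks per round, the toy linkage data, and the `toy_imputed_r2` callback that stands in for a real imputation run are all hypothetical.

```python
from typing import Callable, Dict, Iterable, List, Set, Tuple

# Pairwise r^2 lookup: R2[a][b] is the r^2 between SNPs a and b (assumed symmetric).
R2Table = Dict[str, Dict[str, float]]


def greedy_round(targets: Iterable[str], r2: R2Table, selected: Iterable[str],
                 threshold: float, max_picks: int) -> Tuple[List[str], Set[str]]:
    """Pick up to max_picks tag SNPs, each covering the most remaining targets."""
    remaining = set(targets)
    candidates = set(r2) - set(selected)
    picks: List[str] = []
    for _ in range(max_picks):
        if not remaining:
            break
        best, best_cov = None, set()
        for c in sorted(candidates):  # sorted for deterministic tie-breaking
            # A SNP covers itself and any target it tags at r^2 >= threshold.
            cov = {t for t in remaining
                   if t == c or r2[c].get(t, 0.0) >= threshold}
            if len(cov) > len(best_cov):
                best, best_cov = c, cov
        if best is None:
            break
        picks.append(best)
        candidates.discard(best)
        remaining -= best_cov
    return picks, remaining


def hybrid_select(targets: Iterable[str], r2: R2Table,
                  imputed_r2: Callable[[List[str]], Dict[str, float]],
                  threshold: float = 0.8, picks_per_round: int = 2,
                  max_rounds: int = 10) -> Tuple[List[str], Set[str]]:
    """Alternate greedy pairwise rounds with removal of imputation-covered targets."""
    selected: List[str] = []
    remaining = set(targets)
    for _ in range(max_rounds):
        if not remaining:
            break
        picks, remaining = greedy_round(remaining, r2, selected,
                                        threshold, picks_per_round)
        if not picks:
            break
        selected.extend(picks)
        # Hypothetical callback: imputation r^2 per SNP given the current array.
        covered = {s for s, v in imputed_r2(selected).items() if v >= threshold}
        remaining -= covered
    return selected, remaining


# Toy data (invented): A tags B, C tags D, and E has no pairwise tag.
TOY_R2: R2Table = {
    "A": {"B": 0.9}, "B": {"A": 0.9},
    "C": {"D": 0.85}, "D": {"C": 0.85},
    "E": {},
}


def toy_imputed_r2(selected: List[str]) -> Dict[str, float]:
    """Pretend E is imputed well only once both A and C are on the array."""
    return {"E": 0.9 if {"A", "C"} <= set(selected) else 0.3}


hybrid_chosen, hybrid_left = hybrid_select(TOY_R2, TOY_R2, toy_imputed_r2)
pairwise_chosen, pairwise_left = hybrid_select(TOY_R2, TOY_R2, lambda s: {})
```

In this toy case the pairwise-only run has to spend a third probe on E, while the hybrid run drops E from the target set once imputation from A and C covers it. That is the efficiency argument the abstract makes, shown here on invented data only.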
|