1.

Rapid Assessment of Genetic Ancestry in Populations of Unknown Origin by Genome-Wide Genotyping of Pooled Samples

Charleston W. K. Chiang Zofia K. Z. Gajdos Joshua M. Korn Finny G. Kuruvilla Johannah L. Butler Rachel Hackett Candace Guiducci Thutrang T. Nguyen Rainford Wilks Terrence Forrester Christopher A. Haiman Katherine D. Henderson Loic Le Marchand Brian E. Henderson Mark R. Palmert Colin A. McKenzie Helen N. Lyon Richard S. Cooper Xiaofeng Zhu Joel N. Hirschhorn 《PLoS genetics》2010,6(3)

As we move forward from the current generation of genome-wide association (GWA) studies, additional cohorts of different ancestries will be studied to increase power, fine map association signals, and generalize association results to additional populations. Knowledge of genetic ancestry as well as population substructure will become increasingly important for GWA studies in populations of unknown ancestry. Here we propose genotyping pooled DNA samples using genome-wide SNP arrays as a viable option to efficiently and inexpensively estimate admixture proportion and identify ancestry informative markers (AIMs) in populations of unknown origin. We constructed DNA pools from African American, Native Hawaiian, Latina, and Jamaican samples and genotyped them using the Affymetrix 6.0 array. Aided by individual genotype data from the African American cohort, we established quality control filters to remove poorly performing SNPs and estimated allele frequencies for the remaining SNPs in each panel. We then applied a regression-based method to estimate the proportion of admixture in each cohort using the allele frequencies estimated from pooling and populations from the International HapMap Consortium as reference panels, and identified AIMs unique to each population. In this study, we demonstrated that genotyping pooled DNA samples yields estimates of admixture proportion that are both consistent with our knowledge of population history and similar to those obtained by genotyping known AIMs. Furthermore, through validation by individual genotyping, we demonstrated that pooling is quite effective for identifying SNPs with large allele frequency differences (i.e., AIMs) and that these AIMs are able to differentiate two closely related populations (HapMap JPT and CHB).
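
The regression idea in this abstract can be illustrated with a minimal sketch. It assumes the pooled allele frequency at each SNP is roughly a weighted mix of the reference-panel frequencies, and it recovers the weights by ordinary least squares. The function name, the simulated data, and the crude clip-and-rescale step are illustrative assumptions, not the authors' published procedure.

```python
import numpy as np

def estimate_admixture(pooled_freq, ref_freqs):
    """Least-squares admixture proportions (hypothetical helper).

    pooled_freq: allele frequencies estimated from the DNA pool, shape (n_snps,)
    ref_freqs:   reference-panel allele frequencies, shape (n_snps, n_panels)
    """
    pooled = np.asarray(pooled_freq, dtype=float)
    refs = np.asarray(ref_freqs, dtype=float)
    # Solve pooled ~ refs @ proportions in the least-squares sense.
    coef, *_ = np.linalg.lstsq(refs, pooled, rcond=None)
    # Crude constraint handling: drop negative weights, then rescale to sum to 1.
    coef = np.clip(coef, 0.0, None)
    return coef / coef.sum()

# Simulated example: two reference panels mixed 80/20, plus pooling noise.
rng = np.random.default_rng(0)
ref = rng.uniform(0.05, 0.95, size=(5000, 2))
true_props = np.array([0.8, 0.2])
pooled = ref @ true_props + rng.normal(0.0, 0.02, size=5000)
estimate = estimate_admixture(pooled, ref)
print(estimate.round(3))
```

A properly constrained solver (non-negative weights that sum to one) would be preferable on real data; the clip-and-rescale step only keeps the sketch short.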

2.

Efficiency and power as a function of sequence coverage, SNP array density, and imputation

J Flannick JM Korn P Fontanillas GB Grant E Banks MA Depristo D Altshuler 《PLoS computational biology》2012,8(7):e1002604

High coverage whole genome sequencing provides near complete information about genetic variation. However, other technologies can be more efficient in some settings by (a) reducing redundant coverage within samples and (b) exploiting patterns of genetic variation across samples. To characterize as many samples as possible, many genetic studies therefore employ lower coverage sequencing or SNP array genotyping coupled to statistical imputation. To compare these approaches individually and in conjunction, we developed a statistical framework to estimate genotypes jointly from sequence reads, array intensities, and imputation. In European samples, we find similar sensitivity (89%) and specificity (99.6%) from imputation with either 1× sequencing or 1 M SNP arrays. Sensitivity is increased, particularly for low-frequency polymorphisms (MAF < 5%), when low coverage sequence reads are added to dense genome-wide SNP arrays--the converse, however, is not true. At sites where sequence reads and array intensities produce different sample genotypes, joint analysis reduces genotype errors and identifies novel error modes. Our joint framework informs the use of next-generation sequencing in genome wide association studies and supports development of improved methods for genotype calling.

3.

Evaluating the effects of imputation on the power, coverage, and cost efficiency of genome-wide SNP platforms

Anderson CA Pettersson FH Barrett JC Zhuang JJ Ragoussis J Cardon LR Morris AP 《American journal of human genetics》2008,83(1):112-119

Genotype imputation is potentially a zero-cost method for bridging gaps in coverage and power between genotyping platforms. Here, we quantify these gains in power and coverage by using 1,376 population controls that are from the 1958 British Birth Cohort and were genotyped by the Wellcome Trust Case-Control Consortium with the Illumina HumanHap 550 and Affymetrix SNP Array 5.0 platforms. Approximately 50% of genotypes at single-nucleotide polymorphisms (SNPs) exclusively on the HumanHap 550 can be accurately imputed from direct genotypes on the SNP Array 5.0 or Illumina HumanHap 300. This roughly halves differences in coverage and power between the platforms. When the relative cost of currently available genome-wide SNP platforms is accounted for, and finances are limited but sample size is not, the highest-powered strategy in European populations is to genotype a larger number of individuals with the HumanHap 300 platform and carry out imputation. Platforms consisting of around 1 million SNPs offer poor cost efficiency for SNP association in European populations.

4.

Meta-analysis of Dense Genecentric Association Studies Reveals Common and Uncommon Variants Associated with Height

Lanktree MB Guo Y Murtaza M Glessner JT Bailey SD Onland-Moret NC Lettre G Ongen H Rajagopalan R Johnson T Shen H Nelson CP Klopp N Baumert J Padmanabhan S Pankratz N Pankow JS Shah S Taylor K Barnard J Peters BJ Maloney CM Lobmeyer MT Stanton A Zafarmand MH Romaine SP Mehta A van Iperen EP Gong Y Price TS Smith EN Kim CE Li YR Asselbergs FW Atwood LD Bailey KM Bhatt D Bauer F Behr ER Bhangale T Boer JM Boehm BO Bradfield JP Brown M Braund PS Burton PR Carty C Chandrupatla HR Chen W Connell J 《American journal of human genetics》2011,(1):688-18

Height is a classic complex trait with common variants in a growing list of genes known to contribute to the phenotype. Using a genecentric genotyping array targeted toward cardiovascular-related loci, comprising 49,320 SNPs across approximately 2000 loci, we evaluated the association of common and uncommon SNPs with adult height in 114,223 individuals from 47 studies and six ethnicities. A total of 64 loci contained a SNP associated with height at array-wide significance (p < 2.4 × 10⁻⁶), with 42 loci surpassing the conventional genome-wide significance threshold (p < 5 × 10⁻⁸). Common variants with minor allele frequencies greater than 5% were observed to be associated with height in 37 previously reported loci. In individuals of European ancestry, uncommon SNPs in IL11 and SMAD3, which would not be genotyped with the use of standard genome-wide genotyping arrays, were strongly associated with height (p < 3 × 10⁻¹¹). Conditional analysis within associated regions revealed five additional variants associated with height independent of lead SNPs within the locus, suggesting allelic heterogeneity. Although underpowered to replicate findings from individuals of European ancestry, the direction of effect of associated variants was largely consistent in African American, South Asian, and Hispanic populations. Overall, we show that dense coverage of genes for uncommon SNPs, coupled with large-scale meta-analysis, can successfully identify additional variants associated with a common complex trait.

5.

The impact of natural selection on an ABCC11 SNP determining earwax type

Ohashi J Naka I Tsuchiya N 《Molecular biology and evolution》2011,28(1):849-857

A nonsynonymous single nucleotide polymorphism (SNP), rs17822931-G/A (538G>A; Gly180Arg), in the ABCC11 gene determines human earwax type (i.e., wet or dry) and is one of the most differentiated nonsynonymous SNPs between East Asian and African populations. A recent genome-wide scan for positive selection revealed that a genomic region spanning the ABCC11, LONP2, and SIAH1 genes has been subjected to a selective sweep in East Asians. Considering the potential functional significance as well as the population differentiation of SNPs located in that region, rs17822931 is the most plausible candidate polymorphism to have undergone geographically restricted positive selection. In this study, we estimated the selection intensity or selection coefficient of rs17822931-A in East Asians by analyzing two microsatellite loci flanking rs17822931 in the African (HapMap-YRI) and East Asian (HapMap-JPT and HapMap-CHB) populations. Assuming a recessive selection model, a coalescent-based simulation approach suggested that the selection coefficient of rs17822931-A had been approximately 0.01 in the East Asian population, and a simulation experiment using a pseudo-sampling variable revealed that the mutation of rs17822931-A occurred 2006 generations (95% credible interval, 1,023-3,901 generations) ago. In addition, we show that absolute latitude is significantly associated with the allele frequency of rs17822931-A in Asian, Native American, and European populations, implying that the selective advantage of rs17822931-A is related to an adaptation to a cold climate. Our results provide a striking example of how local adaptation has played a significant role in the diversification of human traits.

6.

Characterizing Race/Ethnicity and Genetic Ancestry for 100,000 Subjects in the Genetic Epidemiology Research on Adult Health and Aging (GERA) Cohort

Yambazi Banda Mark N. Kvale Thomas J. Hoffmann Stephanie E. Hesselson Dilrini Ranatunga Hua Tang Chiara Sabatti Lisa A. Croen Brad P. Dispensa Mary Henderson Carlos Iribarren Eric Jorgenson Lawrence H. Kushi Dana Ludwig Diane Olberg Charles P. Quesenberry Jr. Sarah Rowell Marianne Sadler Lori C. Sakoda Stanley Sciortino Ling Shen David Smethurst Carol P. Somkin Stephen K. Van Den Eeden Lawrence Walter Rachel A. Whitmer Pui-Yan Kwok Catherine Schaefer Neil Risch 《Genetics》2015,200(4):1285-1295

Using genome-wide genotypes, we characterized the genetic structure of 103,006 participants in the Kaiser Permanente Northern California multi-ethnic Genetic Epidemiology Research on Adult Health and Aging Cohort and analyzed the relationship to self-reported race/ethnicity. Participants endorsed any of 23 race/ethnicity/nationality categories, which were collapsed into seven major race/ethnicity groups. By self-report the cohort is 80.8% white and 19.2% minority; 93.8% endorsed a single race/ethnicity group, while 6.2% endorsed two or more. Principal component (PC) and admixture analyses were generally consistent with prior studies. Approximately 17% of subjects had genetic ancestry from more than one continent, and 12% were genetically admixed, considering only nonadjacent geographical origins. Self-reported whites were spread on a continuum along the first two PCs, indicating extensive mixing among European nationalities. Self-identified East Asian nationalities correlated with genetic clustering, consistent with extensive endogamy. Individuals of mixed East Asian–European genetic ancestry were easily identified; we also observed a modest amount of European genetic ancestry in individuals self-identified as Filipinos. Self-reported African Americans and Latinos showed extensive European and African genetic ancestry, and Native American genetic ancestry for the latter. Among 3741 genetically identified parent–child pairs, 93% were concordant for self-reported race/ethnicity; among 2018 genetically identified full-sib pairs, 96% were concordant; the lower rate for parent–child pairs was largely due to intermarriage. The parent–child pairs revealed a trend toward increasing exogamy over time; the presence in the cohort of individuals endorsing multiple race/ethnicity categories creates interesting challenges and future opportunities for genetic epidemiologic studies.

7.

Effect of Genome-Wide Genotyping and Reference Panels on Rare Variants Imputation

Martin Ladouceur Celia M.T. Greenwood J. Brent Richards 《遗传学报》2012,39(10):545-550

Common variants explain little of the variance of most common diseases, prompting large-scale sequencing studies to understand the contribution of rare variants to these diseases. Imputation of rare variants from genome-wide genotypic arrays offers a cost-efficient strategy to achieve necessary sample sizes required for adequate statistical power. To estimate the performance of imputation of rare variants, we imputed 153 individuals, each of whom was genotyped on 3 different genotype arrays including 317k, 610k and 1 million single nucleotide polymorphisms (SNPs), to two different reference panels: HapMap2 and 1000 Genomes pilot March 2010 release (1KGpilot) by using IMPUTE version 2. We found that more than 94% and 84% of all SNPs yield acceptable accuracy (info > 0.4) in HapMap2 and 1KGpilot-based imputation, respectively. For rare variants (minor allele frequency (MAF) < 5%), the proportion of well-imputed SNPs increased as the MAF increased from 0.3% to 5% across all 3 genome-wide association study (GWAS) datasets. The proportion of well-imputed SNPs was 69%, 60% and 49% for SNPs with a MAF from 0.3% to 5% for 1M, 610k and 317k, respectively. None of the very rare variants (MAF < 0.3%) were well imputed. We conclude that the imputation accuracy of rare variants increases with higher density of genome-wide genotyping arrays when the size of the reference panel is small. Variants with lower MAF are more difficult to impute. These findings have important implications in the design and replication of large-scale sequencing studies.

8.

Genetic structure, self-identified race/ethnicity, and confounding in case-control association studies

Tang H Quertermous T Rodriguez B Kardia SL Zhu X Brown A Pankow JS Province MA Hunt SC Boerwinkle E Schork NJ Risch NJ 《American journal of human genetics》2005,76(2):268-275

We have analyzed genetic data for 326 microsatellite markers that were typed uniformly in a large multiethnic population-based sample of individuals as part of a study of the genetics of hypertension (Family Blood Pressure Program). Subjects identified themselves as belonging to one of four major racial/ethnic groups (white, African American, East Asian, and Hispanic) and were recruited from 15 different geographic locales within the United States and Taiwan. Genetic cluster analysis of the microsatellite markers produced four major clusters, which showed near-perfect correspondence with the four self-reported race/ethnicity categories. Of 3,636 subjects of varying race/ethnicity, only 5 (0.14%) showed genetic cluster membership different from their self-identified race/ethnicity. On the other hand, we detected only modest genetic differentiation between different current geographic locales within each race/ethnicity group. Thus, ancient geographic ancestry, which is highly correlated with self-identified race/ethnicity--as opposed to current residence--is the major determinant of genetic structure in the U.S. population. Implications of this genetic structure for case-control association studies are discussed.

9.

Evolution of a length polymorphism in the human PER3 gene, a component of the circadian system

Nadkarni NA Weale ME von Schantz M Thomas MG 《Journal of biological rhythms》2005,20(6):490-499

Period homologue 3 (PER3) is a component of the mammalian circadian system, although its precise role is unknown. A biallelic variable number tandem repeat (VNTR) polymorphism exists in human PER3, consisting of 4 or 5 repeats of a 54-bp sequence in a region encoding a putative phosphorylation domain. This polymorphism has previously been reported to associate with diurnal preference ("morningness" and "eveningness") and delayed sleep-phase syndrome. We have investigated the global allele frequencies of this variant in ethnically distinct indigenous populations. All populations were polymorphic, with the shorter (4-repeat) allele ranging in frequency from 0.19 (Papua New Guinea) to 0.89 (Mongolia). To investigate whether allele frequency has been influenced by natural selection, we (1) tested for a correlation with latitude and mean annual insolation (incident sunlight energy), using classical markers to correct for historical population differentiation; and (2) compared the allele-frequency difference between European American, African American, and East Asian populations, as measured using F(ST), to an empirical null distribution of F(ST) values based on a genome-wide dataset of single nucleotide polymorphisms (SNPs) of presumed neutral loci that were previously typed by The SNP Consortium. The variation in allele frequencies between indigenous populations did not show a pattern that would indicate selective pressure on PER3 resulting from day-length variation or mean annual insolation, and the allele-frequency difference between European Americans, African Americans, and East Asians was not an outlier when compared to the distribution for presumed neutral SNPs. We therefore find no evidence for differential or balancing selection in the contemporary pattern of global PER3 allele frequencies.

10.

OPRM1 and EGFR contribute to skin pigmentation differences between Indigenous Americans and Europeans

Quillen EE Bauchet M Bigham AW Delgado-Burbano ME Faust FX Klimentidis YC Mao X Stoneking M Shriver MD 《Human genetics》2012,131(7):1073-1080

Contemporary variation in skin pigmentation is the result of hundreds of thousands of years of human evolution in new and changing environments. Previous studies have identified several genes involved in skin pigmentation differences among African, Asian, and European populations. However, none have examined skin pigmentation variation among Indigenous American populations, creating a critical gap in our understanding of skin pigmentation variation. This study investigates signatures of selection at 76 pigmentation candidate genes that may contribute to skin pigmentation differences between Indigenous Americans and Europeans. Analysis was performed on two samples of Indigenous Americans genotyped on genome-wide SNP arrays. Using four tests for natural selection--locus-specific branch length (LSBL), ratio of heterozygosities (lnRH), Tajima's D difference, and extended haplotype homozygosity (EHH)--we identified 14 selection-nominated candidate genes (SNCGs). SNPs in each of the SNCGs were tested for association with skin pigmentation in 515 admixed Indigenous American and European individuals from regions of the Americas with high ground-level ultraviolet radiation. In addition to SLC24A5 and SLC45A2, genes previously associated with European/non-European differences in skin pigmentation, OPRM1 and EGFR were associated with variation in skin pigmentation in New World populations for the first time.

11.

Next generation genome-wide association tool: design and coverage of a high-throughput European-optimized SNP array

Hoffmann TJ Kvale MN Hesselson SE Zhan Y Aquino C Cao Y Cawley S Chung E Connell S Eshragh J Ewing M Gollub J Henderson M Hubbell E Iribarren C Kaufman J Lao RZ Lu Y Ludwig D Mathauda GK McGuire W Mei G Miles S Purdy MM Quesenberry C Ranatunga D Rowell S Sadler M Shapero MH Shen L Shenoy TR Smethurst D Van den Eeden SK Walter L Wan E Wearley R Webster T Wen CC Weng L Whitmer RA Williams A Wong SC Zau C Finn A Schaefer C Kwok PY Risch N 《Genomics》2011,98(2):79-89

The success of genome-wide association studies has paralleled the development of efficient genotyping technologies. We describe the development of a next-generation microarray based on the new highly-efficient Affymetrix Axiom genotyping technology that we are using to genotype individuals of European ancestry from the Kaiser Permanente Research Program on Genes, Environment and Health (RPGEH). The array contains 674,517 SNPs, and provides excellent genome-wide as well as gene-based and candidate-SNP coverage. Coverage was calculated using an approach based on imputation and cross validation. Preliminary results for the first 80,301 saliva-derived DNA samples from the RPGEH demonstrate very high quality genotypes, with sample success rates above 94% and over 98% of successful samples having SNP call rates exceeding 98%. At steady state, we have produced 462 million genotypes per week for each Axiom system. The new array provides a valuable addition to the repertoire of tools for large scale genome-wide association studies.

12.

Haplotypes of single cancer driver genes and their local ancestry in a highly admixed long-lived population of Northeast Brazil

Steffany Larissa Galdino Galisa Priscila Lima Jacob Allysson Allan de Farias Renan Barbosa Lemes Leandro Ucela Alves Júlia Cristina Leite Nóbrega Mayana Zatz Silvana Santos Mathias Weller 《Genetics and molecular biology》2022,45(1)

Admixed populations have not been examined in detail in cancer genetic studies. Here, we inferred the local ancestry of cancer-associated single nucleotide polymorphisms (SNPs) and haplotypes of a highly admixed Brazilian population. SNP array was used to genotype 73 unrelated individuals aged 80-102 years. Local ancestry inference was performed by merging genotyped regions with phase three data from the 1000 Genomes Project Consortium using RFmix. The average ancestry tract length was 9.12-81.71 megabases. Strong linkage disequilibrium was detected in 48 haplotypes containing 35 SNPs in 10 cancer driver genes. All together, 19 risk and eight protective alleles were identified in 23 out of 48 haplotypes. Homozygous individuals were mainly of European ancestry, whereas heterozygotes had at least one Native American and one African ancestry tract. Native-American ancestry for homozygous individuals with risk alleles for HNF1B, CDH1, and BRCA1 was inferred for the first time. Results indicated that analysis of SNP polymorphism in the present admixed population has a high potential to identify new ancestry-associated alleles and haplotypes that modify cancer susceptibility differentially in distinct human populations. Future case-control studies with populations with a complex history of admixture could help elucidate ancestry-associated biological differences in cancer incidence and therapeutic outcomes.

13.

Single nucleotide polymorphisms generated by genotyping by sequencing to characterize genome-wide diversity, linkage disequilibrium, and selective sweeps in cultivated watermelon

Padma Nimmakayala Amnon Levi Lavanya Abburi Venkata Lakshmi Abburi Yan R Tomason Thangasamy Saminathan Venkata Gopinath Vajja Sridhar Malkaram Rishi Reddy Todd C Wehner Sharon E Mitchell Umesh K Reddy 《BMC genomics》2014,15(1)

Background

A large single nucleotide polymorphism (SNP) dataset was used to analyze genome-wide diversity in a diverse collection of watermelon cultivars representing globally cultivated, watermelon genetic diversity. The marker density required for conducting successful association mapping depends on the extent of linkage disequilibrium (LD) within a population. Use of genotyping by sequencing reveals large numbers of SNPs that in turn generate opportunities in genome-wide association mapping and marker-assisted selection, even in crops such as watermelon for which few genomic resources are available. In this paper, we used genome-wide genetic diversity to study LD, selective sweeps, and pairwise F_ST distributions among worldwide cultivated watermelons to track signals of domestication.

Results

We examined 183 Citrullus lanatus var. lanatus accessions representing domesticated watermelon and generated a set of 11,485 SNP markers using genotyping by sequencing. With a diverse panel of worldwide cultivated watermelons, we identified a set of 5,254 SNPs with a minor allele frequency of ≥ 0.05, distributed across the genome. All ancestries were traced to Africa and an admixture of various ancestries constituted secondary gene pools across various continents. A sliding window analysis using pairwise F_ST values was used to resolve selective sweeps. We identified strong selection on chromosomes 3 and 9 that might have contributed to the domestication process. Pairwise analysis of adjacent SNPs within a chromosome as well as within a haplotype allowed us to estimate genome-wide LD decay. LD was also detected within individual genes on various chromosomes. Principal component and ancestry analyses were used to account for population structure in a genome-wide association study. We further mapped important genes for soluble solid content using a mixed linear model.

Conclusions

Information concerning the SNP resources, population structure, and LD developed in this study will help in identifying agronomically important candidate genes from the genomic regions underlying selection and for mapping quantitative trait loci using a genome-wide association study in sweet watermelon.

Electronic supplementary material

The online version of this article (doi:10.1186/1471-2164-15-767) contains supplementary material, which is available to authorized users.
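
The sliding-window F_ST analysis mentioned in the Results can be sketched as follows. This is a minimal illustration assuming biallelic SNPs, two populations of equal weight, and the heterozygosity-based definition F_ST = (H_T - H_S) / H_T, combined across each window as a ratio of sums. The function name and the window and step sizes are hypothetical; the abstract does not state which estimator or window settings the authors used.

```python
import numpy as np

def windowed_fst(p1, p2, window=50, step=10):
    """Heterozygosity-based F_ST in sliding windows of SNPs (hypothetical helper).

    p1, p2: allele frequencies of the same SNPs, in genomic order,
            in two populations.
    """
    p1 = np.asarray(p1, dtype=float)
    p2 = np.asarray(p2, dtype=float)
    p_bar = (p1 + p2) / 2.0
    h_total = 2.0 * p_bar * (1.0 - p_bar)                        # H_T per SNP
    h_within = (2.0 * p1 * (1.0 - p1) + 2.0 * p2 * (1.0 - p2)) / 2.0  # H_S per SNP
    numerator = h_total - h_within
    values = []
    for start in range(0, len(p1) - window + 1, step):
        span = slice(start, start + window)
        denom = h_total[span].sum()
        # Windows with no variation at all have undefined F_ST.
        values.append(numerator[span].sum() / denom if denom > 0 else np.nan)
    return np.array(values)

# Two extreme checks: identical frequencies and fixed differences.
same = windowed_fst([0.3] * 60, [0.3] * 60)
fixed = windowed_fst([0.0] * 60, [1.0] * 60)
print(same, fixed)
```

Windows whose F_ST stands well above the genome-wide distribution would be the candidates for selective sweeps; choosing that cutoff is a separate decision not covered here.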

14.

A novel locus for body mass index on 5p15.2: a meta-analysis of two genome-wide association studies

Wang KS Liu X Zheng S Zeng M Pan Y Callahan K 《Gene》2012,500(1):80-84

Objective

Genetic factors play an important role in modulating the vulnerability to body mass index (BMI). The purpose of this study is to identify novel genetic variants for BMI using genome-wide association (GWA) meta-analysis.

Methods

PLINK software was used to perform meta-analysis of two GWA studies (the FUSION and Marshfield samples) of 5218 Caucasian individuals with BMI. A replication study was conducted using the SAGE sample with 762 individuals.

Results

Through meta-analysis we identified 33 SNPs associated with BMI with p < 10⁻⁴. The most significant association was observed with rs2967951 (p = 1.19 × 10⁻⁶) at 5p15.2 within the ROPN1L gene. Two additional SNPs within ROPN1L and 5 SNPs within MARCH6 (the top SNP was rs2607292 with p = 4.27 × 10⁻⁶) further supported the association with BMI on 5p15.2 (p < 1.8 × 10⁻⁵). Conditional analysis on 5p15.2 could not distinguish the effects of ROPN1L and MARCH6. Several SNPs within MARCH6 and ROPN1L were replicated in the SAGE sample (p < 0.05).

Conclusion

We identified a novel locus for BMI. These findings offer the potential for new insights into the pathogenesis of BMI and obesity and will serve as a resource for replication in other populations to elucidate the potential role of these genetic variants in BMI and obesity.
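
Combining per-SNP results from two GWA studies is commonly done with a fixed-effect, inverse-variance-weighted meta-analysis. The sketch below assumes that model; the abstract does not say which model was run in PLINK, and the effect sizes and standard errors used here are made-up illustrative values, not data from the FUSION or Marshfield samples.

```python
import math

def fixed_effect_meta(betas, standard_errors):
    """Inverse-variance-weighted fixed-effect meta-analysis for one SNP.

    Returns the pooled effect, its standard error, the z statistic,
    and a two-sided normal p-value.
    """
    weights = [1.0 / se ** 2 for se in standard_errors]
    pooled_beta = sum(w * b for w, b in zip(weights, betas)) / sum(weights)
    pooled_se = math.sqrt(1.0 / sum(weights))
    z = pooled_beta / pooled_se
    # Two-sided p-value under the standard normal: 2 * (1 - Phi(|z|)).
    p_value = math.erfc(abs(z) / math.sqrt(2.0))
    return pooled_beta, pooled_se, z, p_value

# Hypothetical per-study estimates for a single SNP.
beta, se, z, p = fixed_effect_meta([0.10, 0.10], [0.05, 0.05])
print(beta, se, z, p)
```

With two equally precise studies the pooled standard error shrinks by a factor of the square root of two, which is where the gain in power from meta-analysis comes from.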

15.

The design and application of a 50 K SNP chip for a threatened Aotearoa New Zealand passerine, the hihi

Kate D. Lee Craig D. Millar Patricia Brekke Annabel Whibley John G. Ewen Melanie Hingston Amy Zhu Anna W. Santure 《Molecular ecology resources》2022,22(1):415-429

Next-generation sequencing has transformed the fields of ecological and evolutionary genetics by allowing for cost-effective identification of genome-wide variation. Single nucleotide polymorphism (SNP) arrays, or “SNP chips”, enable very large numbers of individuals to be consistently genotyped at a selected set of these identified markers, and also offer the advantage of being able to analyse samples of variable DNA quality. We used reduced representation restriction-aided digest sequencing (RAD-seq) of 31 birds of the threatened hihi (Notiomystis cincta; stitchbird) and low-coverage whole genome sequencing (WGS) of 10 of these birds to develop an Affymetrix 50 K SNP chip. We overcame the limitations of having no hihi reference genome and a low quantity of sequence data by separate and pooled de novo assembly of each of the 10 WGS birds. Reads from all individuals were mapped back to these de novo assemblies to identify SNPs. A subset of RAD-seq and WGS SNPs were selected for inclusion on the chip, prioritising SNPs with the highest quality scores whose flanking sequence uniquely aligned to the zebra finch (Taeniopygia guttata) genome. Of the 58,466 SNPs manufactured on the chip, 72% passed filtering metrics and were polymorphic. By genotyping 1,536 hihi on the array, we found that SNPs detected in multiple assemblies were more likely to successfully genotype, representing a cost-effective approach to identify SNPs for genotyping. Here, we demonstrate the utility of the SNP chip by describing the high rates of linkage disequilibrium in the hihi genome, reflecting the history of population bottlenecks in the species.

16.

Self-reported ethnicity, genetic structure and the impact of population stratification in a multiethnic study

Hansong Wang Christopher A. Haiman Laurence N. Kolonel Brian E. Henderson Lynne R. Wilkens Loïc Le Marchand Daniel O. Stram 《Human genetics》2010,128(2):165-177

It is well-known that population substructure may lead to confounding in case–control association studies. Here, we examined genetic structure in a large racially and ethnically diverse sample consisting of five ethnic groups of the Multiethnic Cohort study (African Americans, Japanese Americans, Latinos, European Americans and Native Hawaiians) using 2,509 SNPs distributed across the genome. Principal component analysis on 6,213 study participants, 18 Native Americans and 11 HapMap III populations revealed four important principal components (PCs): the first two separated Asians, Europeans and Africans, and the third and fourth corresponded to Native American and Native Hawaiian (Polynesian) ancestry, respectively. Individual ethnic composition derived from self-reported parental information matched well to genetic ancestry for Japanese and European Americans. STRUCTURE-estimated individual ancestral proportions for African Americans and Latinos are consistent with previous reports. We quantified the East Asian (mean 27%), European (mean 27%) and Polynesian (mean 46%) ancestral proportions for the first time, to our knowledge, for Native Hawaiians. Simulations based on realistic settings of case–control studies nested in the Multiethnic Cohort found that the effect of population stratification was modest and readily corrected by adjusting for race/ethnicity or by adjusting for top PCs derived from all SNPs or from ancestry informative markers; the power of these approaches was similar when averaged across causal variants simulated based on allele frequencies of the 2,509 genotyped markers. The bias may be large in case-only analysis of gene by gene interactions but it can be corrected by top PCs derived from all SNPs.

17.

Use of a multiethnic approach to identify rheumatoid-arthritis-susceptibility loci, 1p36 and 17q12

Kurreeman FA Stahl EA Okada Y Liao K Diogo D Raychaudhuri S Freudenberg J Kochi Y Patsopoulos NA Gupta N;CLEAR investigators Sandor C Bang SY Lee HS Padyukov L Suzuki A Siminovitch K Worthington J Gregersen PK Hughes LB Reynolds RJ Bridges SL Bae SC Yamamoto K Plenge RM 《American journal of human genetics》2012,90(3):524-532

We have previously shown that rheumatoid arthritis (RA) risk alleles overlap between different ethnic groups. Here, we utilize a multiethnic approach to show that we can effectively discover RA risk alleles. Thirteen putatively associated SNPs that had not yet exceeded genome-wide significance (p < 5 × 10⁻⁸) in our previous RA genome-wide association study (GWAS) were analyzed in independent sample sets consisting of 4,366 cases and 17,765 controls of European, African American, and East Asian ancestry. Additionally, we conducted an overall association test across all 65,833 samples (a GWAS meta-analysis plus the replication samples). Of the 13 SNPs investigated, four were significantly below the study-wide Bonferroni corrected p value threshold (p < 0.0038) in the replication samples. Two SNPs (rs3890745 at the 1p36 locus [p = 2.3 × 10⁻¹²] and rs2872507 at the 17q12 locus [p = 1.7 × 10⁻⁹]) surpassed genome-wide significance in all 16,659 RA cases and 49,174 controls combined. We used available GWAS data to fine map these two loci in Europeans and East Asians, and we found that the same allele conferred risk in both ethnic groups. A series of bioinformatic analyses identified TNFRSF14-MMEL1 at the 1p36 locus and IKZF3-ORMDL3-GSDMB at the 17q12 locus as the genes most likely associated with RA. These findings demonstrate empirically that a multiethnic approach is an effective strategy for discovering RA risk loci, and they suggest that combining GWASs across ethnic groups represents an efficient strategy for gaining statistical power.
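
The study-wide threshold quoted in this abstract matches a Bonferroni correction of a 0.05 significance level over the 13 SNPs tested. The short check below assumes that 0.05 level, which the abstract does not state explicitly; the function name is hypothetical.

```python
def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold that keeps the family-wise error rate at alpha."""
    return alpha / n_tests

# 13 replication SNPs at an assumed family-wise alpha of 0.05.
replication_threshold = bonferroni_threshold(0.05, 13)
print(round(replication_threshold, 4))

# The conventional genome-wide threshold corresponds to the same correction
# applied to roughly one million independent common-variant tests.
genome_wide_threshold = bonferroni_threshold(0.05, 1_000_000)
print(genome_wide_threshold)
```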

18.

Genotyping Informatics and Quality Control for 100,000 Subjects in the Genetic Epidemiology Research on Adult Health and Aging (GERA) Cohort

《Genetics》2015,200(4):1051-1060

The Kaiser Permanente (KP) Research Program on Genes, Environment and Health (RPGEH), in collaboration with the University of California, San Francisco, undertook genome-wide genotyping of >100,000 subjects that constitute the Genetic Epidemiology Research on Adult Health and Aging (GERA) cohort. The project, which generated >70 billion genotypes, represents the first large-scale use of the Affymetrix Axiom Genotyping Solution. Because genotyping took place over a short 14-month period, creating a near-real-time analysis pipeline for experimental assay quality control and final optimized analyses was critical. Because of the multi-ethnic nature of the cohort, four different ethnic-specific arrays were employed to enhance genome-wide coverage. All assays were performed on DNA extracted from saliva samples. To improve sample call rates and significantly increase genotype concordance, we partitioned the cohort into disjoint packages of plates with similar assay contexts. Using strict QC criteria, the overall genotyping success rate was 103,067 of 109,837 samples assayed (93.8%), with a range of 92.1–95.4% for the four different arrays. Similarly, the SNP genotyping success rate ranged from 98.1 to 99.4% across the four arrays, the variation depending mostly on how many SNPs were included as single copy vs. double copy on a particular array. The high quality and large scale of genotype data created on this cohort, in conjunction with comprehensive longitudinal data from the KP electronic health records of participants, will enable a broad range of highly powered genome-wide association studies on a diversity of traits and conditions.

19.

Integrative analysis of single nucleotide polymorphisms and gene expression efficiently distinguishes samples from closely related ethnic populations

HC Yang PL Wang CW Lin CH Chen CH Chen 《BMC genomics》2012,13(1):346

ABSTRACT: BACKGROUND: Ancestry informative markers (AIMs) are a type of genetic marker that is informative for tracing the ancestral ethnicity of individuals. Application of AIMs has gained substantial attention in population genetics, forensic sciences, and medical genetics. Single nucleotide polymorphisms (SNPs), the materials of AIMs, are useful for classifying individuals from distinct continental origins but cannot discriminate individuals with subtle genetic differences from closely related ancestral lineages. Proof-of-principle studies have shown that gene expression (GE) also is a heritable human variation that exhibits differential intensity distributions among ethnic groups. GE supplies ethnic information supplemental to SNPs; this motivated us to integrate SNP and GE markers to construct AIM panels with a reduced number of required markers and provide high accuracy in ancestry inference. Few studies in the literature have considered GE in this aspect, and none have integrated SNP and GE markers to aid classification of samples from closely related ethnic populations. RESULTS: We integrated a forward variable selection procedure into flexible discriminant analysis to identify key SNP and/or GE markers with the highest cross-validation prediction accuracy. By analyzing genome-wide SNP and/or GE markers in 210 independent samples from four ethnic groups in the HapMap II Project, we found that average testing accuracies for a majority of classification analyses were quite high, except for SNP-only analyses that were performed to discern study samples containing individuals from two close Asian populations. The average testing accuracies ranged from 0.53 to 0.79 for SNP-only analyses and increased to around 0.90 when GE markers were integrated together with SNP markers for the classification of samples from closely related Asian populations. 
Compared to GE-only analyses, integrative analyses of SNP and GE markers showed comparable testing accuracies and a reduced number of selected markers in AIM panels. CONCLUSIONS: Integrative analysis of SNP and GE markers provides high-accuracy and/or cost-effective classification results for assigning samples from closely related or distantly related ancestral lineages to their original ancestral populations. User-friendly BIASLESS (Biomarkers Identification and Samples Subdivision) software was developed as an efficient tool for selecting key SNP and/or GE markers and then building models for sample subdivision. BIASLESS was programmed in R and R-GUI and is available online at http://www.stat.sinica.edu.tw/hsinchou/genetics/prediction/BIASLESS.htm.
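
The forward variable selection described in this abstract can be illustrated with a minimal sketch. It is not the BIASLESS implementation: a simple nearest-centroid classifier stands in for flexible discriminant analysis, the function names are hypothetical, and the data are synthetic toy values rather than HapMap samples.

```python
from statistics import mean

def nearest_centroid_predict(train_X, train_y, test_X, features):
    """Predict labels by nearest class centroid, using only the chosen markers."""
    centroids = {}
    for label in set(train_y):
        rows = [x for x, y in zip(train_X, train_y) if y == label]
        centroids[label] = [mean(r[f] for r in rows) for f in features]
    preds = []
    for x in test_X:
        point = [x[f] for f in features]
        preds.append(min(
            sorted(centroids),  # sorted labels give deterministic tie-breaking
            key=lambda lab: sum((p - c) ** 2 for p, c in zip(point, centroids[lab])),
        ))
    return preds

def cv_accuracy(X, y, features, k=5):
    """k-fold cross-validated accuracy with interleaved, deterministic folds."""
    correct = 0
    for fold in range(k):
        test_idx = [i for i in range(len(X)) if i % k == fold]
        train_idx = [i for i in range(len(X)) if i % k != fold]
        preds = nearest_centroid_predict(
            [X[i] for i in train_idx], [y[i] for i in train_idx],
            [X[i] for i in test_idx], features)
        correct += sum(p == y[i] for p, i in zip(preds, test_idx))
    return correct / len(X)

def forward_select(X, y, max_markers=3, k=5):
    """Greedily add the marker that most improves CV accuracy; stop when none does."""
    selected, best = [], 0.0
    candidates = list(range(len(X[0])))
    while candidates and len(selected) < max_markers:
        # Ties on accuracy are broken in favour of the lowest marker index.
        score, neg_f = max((cv_accuracy(X, y, selected + [f], k), -f) for f in candidates)
        if score <= best:
            break
        selected.append(-neg_f)
        candidates.remove(-neg_f)
        best = score
    return selected, best

# Synthetic toy data (not from the paper): 20 samples, two populations, three markers.
labels = ["A"] * 10 + ["B"] * 10
samples = [
    [float(i % 3),            # marker 0: carries essentially no population signal
     0.0 if i < 8 else 1.0,   # marker 1: mostly informative
     0.0 if i < 10 else 1.0]  # marker 2: fully informative
    for i in range(20)
]
selected, accuracy = forward_select(samples, labels)
print(selected, accuracy)
```

On this toy input the procedure stops after picking the single fully informative marker, which mirrors the abstract's goal of small marker panels with high cross-validated accuracy.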

20.

A 34K SNP genotyping array for Populus trichocarpa: Design, application to the study of natural populations and transferability to other Populus species

A. Geraldes S. P. DiFazio G. T. Slavov P. Ranjan W. Muchero J. Hannemann L. E. Gunter A. M. Wymore C. J. Grassa N. Farzaneh I. Porth A. D. McKown O. Skyba E. Li M. Fujita J. Klápště J. Martin W. Schackwitz C. Pennacchio D. Rokhsar M. C. Friedmann G. O. Wasteneys R. D. Guy Y. A. El‐Kassaby S. D. Mansfield Q. C. B. Cronk J. Ehlting C. J. Douglas G. A. Tuskan 《Molecular ecology resources》2013,13(2):306-323

Genetic mapping of quantitative traits requires genotypic data for large numbers of markers in many individuals. For such studies, the use of large single nucleotide polymorphism (SNP) genotyping arrays still offers the most cost‐effective solution. Herein we report on the design and performance of a SNP genotyping array for Populus trichocarpa (black cottonwood). This genotyping array was designed with SNPs pre‐ascertained in 34 wild accessions covering most of the species' latitudinal range. We adopted a candidate gene approach to the array design that resulted in the selection of 34 131 SNPs, the majority of which are located in, or within 2 kb of, 3543 candidate genes. A subset of the SNPs on the array (539) was selected based on patterns of variation among the SNP discovery accessions. We show that more than 95% of the loci produce high quality genotypes and that the genotyping error rate for these is likely below 2%. We demonstrate that even among small numbers of samples (n = 10) from local populations over 84% of loci are polymorphic. We also tested the applicability of the array to other species in the genus and found that the number of polymorphic loci decreases rapidly with genetic distance, with the largest numbers detected in other species in section Tacamahaca. Finally, we provide evidence for the utility of the array to address evolutionary questions such as intraspecific studies of genetic differentiation, species assignment and the detection of natural hybrids.