Similar articles
1.
2.
Whole-genome and exome data sets continue to be produced at a frenetic pace, resulting in massively large catalogs of human genomic variation. However, a clear picture of the characteristics and patterns of neutral and deleterious variation within and between populations has yet to emerge, given that recent large-scale sequencing studies have often emphasized different aspects of the data and sometimes appear to have conflicting conclusions. Here, we comprehensively studied characteristics of protein-coding variation in high-coverage exome sequence data from 6,515 European American (EA) and African American (AA) individuals. We developed an unbiased approach to identify putatively deleterious variants and investigated patterns of neutral and deleterious single-nucleotide variants and alleles between individuals and populations. We show that there are substantial differences in the composition of genotypes between EA and AA populations and that small but statistically significant differences exist in the average number of deleterious alleles carried by EA and AA individuals. Furthermore, we performed extensive simulations to delineate the temporal dynamics of deleterious alleles for a broad range of demographic models and use these data to inform the interpretation of empirical patterns of deleterious variation. Finally, we illustrate that the effects of demographic perturbations, such as bottlenecks and expansions, often manifest in opposing patterns of neutral and deleterious variation depending on whether the focus is on populations or individuals. Our results clarify seemingly disparate empirical characteristics of protein-coding variation and provide substantial insights into how natural selection and demographic history have patterned neutral and deleterious variation within and between populations.

3.
Here we report a large, extensively characterized set of single-nucleotide polymorphisms (SNPs) covering the human genome. We determined the allele frequencies of 55,018 SNPs in African Americans, Asians (Japanese-Chinese), and European Americans as part of The SNP Consortium's Allele Frequency Project. A subset of 8333 SNPs was also characterized in Koreans. Because these SNPs were ascertained in the same way, the data set is particularly useful for modeling. Our results document that much genetic variation is shared among populations. For autosomes, some 44% of these SNPs have a minor allele frequency ≥10% in each population, and the average allele frequency differences between populations with different continental origins are less than 19%. However, the several percentage point allele frequency differences among the closely related Korean, Japanese, and Chinese populations suggest caution in using mixtures of well-established populations for case-control genetic studies of complex traits. We estimate that approximately 7% of these SNPs are private SNPs with minor allele frequencies <1%. A useful set of characterized SNPs with large allele frequency differences between populations (>60%) can be used for admixture studies. High-density maps of high-quality, characterized SNPs produced by this project are freely available.

4.
5.
Over the next few years, the efficient use of next-generation sequencing (NGS) in human genetics research will depend heavily upon effective mechanisms for the selective enrichment of genomic regions of interest. Recently, comprehensive exome capture arrays have become available for targeting approximately 33 Mb or ∼180,000 coding exons across the human genome. Selective genomic enrichment of the human exome offers an attractive option for new experimental designs aiming to quickly identify potential disease-associated genetic variants, especially in family-based studies. We have evaluated a 2.1 M feature human exome capture array on eight individuals from a three-generation family pedigree. We were able to cover up to 98% of the targeted bases at a long-read sequence read depth of ≥3, 86% at a read depth of ≥10, and over 50% of all targets were covered with ≥20 reads. We identified up to 14,284 SNPs and small indels per individual exome, with up to 1,679 of these representing putative novel polymorphisms. Applying the conservative genotype calling approach HCDiff, the average rate of detection of a variant allele based on Illumina 1 M BeadChips genotypes was 95.2% at ≥10x sequence coverage. Further, we propose an advantageous genotype calling strategy for low-coverage targets that empirically determines cut-off thresholds at a given coverage depth based on existing genotype data. Application of this method was able to detect >99% of SNPs covered ≥8x. Our results offer guidance for "real-world" applications in human genetics and provide further evidence that microarray-based exome capture is an efficient and reliable method to enrich for chromosomal regions of interest in next-generation sequencing experiments.

6.
Full sequencing of individual human genomes has greatly expanded our understanding of human genetic variation and population history. Here, we present a systematic analysis of 50 human genomes from 11 diverse global populations sequenced at high coverage. Our sample includes 12 individuals who have admixed ancestry and who have varying degrees of recent (within the last 500 years) African, Native American, and European ancestry. We found over 21 million single-nucleotide variants that contribute to a 1.75-fold range in nucleotide heterozygosity across diverse human genomes. This heterozygosity ranged from a high of one heterozygous site per kilobase in west African genomes to a low of 0.57 heterozygous sites per kilobase in segments inferred to have diploid Native American ancestry from the genomes of Mexican and Puerto Rican individuals. We show evidence of all three continental ancestries in the genomes of Mexican, Puerto Rican, and African American populations, and the genome-wide statistics are highly consistent across individuals from a population once ancestry proportions have been accounted for. Using a generalized linear model, we identified subtle variations across populations in the proportion of neutral versus deleterious variation and found that genome-wide statistics vary in admixed populations even once ancestry proportions have been factored in. We further infer that multiple periods of gene flow shaped the diversity of admixed populations in the Americas: 70% of the European ancestry in today's African Americans dates back to European gene flow happening only 7–8 generations ago.

7.
Large-scale genotyping of complex DNA (cited 21 times: 0 self-citations, 21 citations by others)
Genetic studies aimed at understanding the molecular basis of complex human phenotypes require the genotyping of many thousands of single-nucleotide polymorphisms (SNPs) across large numbers of individuals. Public efforts have so far identified over two million common human SNPs; however, the scoring of these SNPs is labor-intensive and requires a substantial amount of automation. Here we describe a simple but effective approach, termed whole-genome sampling analysis (WGSA), for genotyping thousands of SNPs simultaneously in a complex DNA sample without locus-specific primers or automation. Our method amplifies highly reproducible fractions of the genome across multiple DNA samples and calls genotypes at >99% accuracy. We rapidly genotyped 14,548 SNPs in three different human populations and identified a subset of them with significant allele frequency differences between groups. We also determined the ancestral allele for 8,386 SNPs by genotyping chimpanzee and gorilla DNA. WGSA is highly scalable and enables the creation of ultrahigh density SNP maps for use in genetic studies.

8.
《Genomics》2020,112(5):3722-3728
Whole exome sequencing is an effective method for revealing novel and disease-related SNPs and INDELs because it screens the actionable areas of the genome. We evaluated the exome-sequenced datasets of patients with Parkinson's disease (PD) of South African ethnic origin. The primary focus of this study was to discover the SNP and INDEL patterns responsible for PD. Variant discovery was performed with the Genome Analysis Toolkit (GATK) best-practices variant detection pipelines. The SNPs were linked to genes and categorized using filter-based annotation from ANNOVAR. We identified a total of 7955 SNPs and 9952 INDELs across all seven datasets. A total of 130 missense nsSNPs were prioritized based on their damaging effects as predicted by SIFT and PolyPhen-2 annotation. We noticed a novel nsSNP, rs111655870, in the gene LRRK2 that changes a leucine to a phenylalanine at position 208, which can alter protein function. The study also filtered seven nsSNPs in the genes NAGA, SULT4A1, MYH8, FLNA, TPM3, ATP13A1, and CLN8 that have potentially deleterious effects as predicted by various computational tools. This analysis suggests that the filtered nsSNPs and INDELs above have a functional impact and provide a footing for genetic studies related to PD. Further screening of these variations will provide deeper insight into the molecular mechanism of disease progression.

9.
Studies of the apportionment of human genetic variation have long established that most human variation is within population groups and that the additional variation between population groups is small but greatest when comparing different continental populations. These studies often used Wright's FST, which apportions the standardized variance in allele frequencies within and between population groups. Because local adaptations increase population differentiation, high FST may be found at closely linked loci under selection and used to identify genes undergoing directional or heterotic selection. We re-examined these processes using HapMap data. We analyzed 3 million SNPs on 602 samples from eight worldwide populations and a consensus subset of 1 million SNPs found in all populations. We identified four major features of the data: First, a hierarchical FST analysis showed that only a small fraction (12%) of the total genetic variation is distributed between continental populations and an even smaller fraction (1%) is found between intra-continental populations. Second, the global FST distribution closely follows an exponential distribution. Third, although the overall FST distribution is similarly shaped (inverse J), FST distributions vary markedly by allele frequency when divided into non-overlapping groups by allele frequency range. Because the mean allele frequency is a crude indicator of allele age, these distributions mark the time-dependent change in genetic differentiation. Finally, the change in mean FST of these groups is linear in allele frequency. These results suggest that investigating the extremes of the FST distribution for each allele frequency group is more efficient for detecting selection. Consequently, we demonstrate that such extreme SNPs are more clustered along the chromosomes than expected from linkage disequilibrium for each allele frequency group. These genomic regions are therefore likely candidates for natural selection.

10.
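Application of DNA pooling combined with DHPLC and direct sequencing to SNP detection in the finless porpoise (cited 6 times: 0 self-citations, 6 citations by others)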
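Two known single nucleotide polymorphism (SNP) loci in the finless porpoise genome were selected and amplified by PCR. The PCR products were combined into 11 DNA pools with allele frequencies ranging from 0 to 50%, and the pools were analyzed by denaturing high performance liquid chromatography (DHPLC) and by direct sequencing to determine the minimum allele frequency required in a DNA pool. The results show that the rare allele can be clearly resolved by DHPLC when its frequency is at least 5%, whereas direct sequencing of a DNA pool requires a frequency of at least 10%. This suggests that, to ensure accurate and reliable DHPLC analysis, a DNA pool should preferably combine equimolar DNA from no more than 10 individuals. The efficiency and accuracy of DNA pooling combined with DHPLC make it useful for large-scale SNP screening.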
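The link between a 5% detection limit and a pool of at most 10 individuals can be checked with a little arithmetic. The sketch below is an illustration only, not the authors' calculation: it assumes diploid individuals pooled in equimolar amounts, with the rare allele present on a single chromosome copy in the pool. The function names are hypothetical.

```python
import math
from fractions import Fraction


def min_pool_allele_frequency(n_individuals: int, ploidy: int = 2) -> Fraction:
    """Frequency of an allele carried on one chromosome copy in an equimolar pool."""
    if n_individuals < 1 or ploidy < 1:
        raise ValueError("n_individuals and ploidy must be positive")
    return Fraction(1, ploidy * n_individuals)


def max_pool_size(detection_limit: Fraction, ploidy: int = 2) -> int:
    """Largest pool in which a single-copy allele still reaches the detection limit."""
    if detection_limit <= 0:
        raise ValueError("detection_limit must be positive")
    # 1 / (ploidy * n) >= limit  is equivalent to  n <= 1 / (ploidy * limit)
    return math.floor(1 / (ploidy * detection_limit))


if __name__ == "__main__":
    print(max_pool_size(Fraction(5, 100)))   # DHPLC limit of 5%
    print(max_pool_size(Fraction(10, 100)))  # direct-sequencing limit of 10%
```

Under these assumptions a 5% limit corresponds to 10 individuals, which matches the pool size recommended in the abstract; the same reasoning gives 5 individuals for the 10% direct-sequencing limit, a figure the abstract itself does not state.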

11.
High-throughput DNA sequencing facilitates the analysis of large portions of the genome in nonmodel organisms, ensuring high accuracy of population genetic parameters. However, empirical studies evaluating the appropriate sample size for these kinds of studies are still scarce. In this study, we use double-digest restriction-associated DNA sequencing (ddRADseq) to recover thousands of single nucleotide polymorphisms (SNPs) for two physically isolated populations of Amphirrhox longifolia (Violaceae), a nonmodel plant species for which no reference genome is available. We used resampling techniques to construct simulated populations with a random subset of individuals and SNPs to determine how many individuals and biallelic markers should be sampled for accurate estimates of intra- and interpopulation genetic diversity. We identified 3646 and 4900 polymorphic SNPs for the two populations of A. longifolia, respectively. Our simulations show that, overall, a sample size greater than eight individuals has little impact on estimates of genetic diversity within A. longifolia populations when 1000 or more SNPs are used. Our results also show that even at a very small sample size (i.e. two individuals), accurate estimates of FST can be obtained with a large number of SNPs (≥1500). These results highlight the potential of high-throughput genomic sequencing approaches to address questions related to evolutionary biology in nonmodel organisms. Furthermore, our findings also provide insights into the optimization of sampling strategies in the era of population genomics.
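The resampling idea in the abstract above (drawing random subsets of individuals and SNPs and re-estimating diversity) might be sketched as follows. This is a minimal illustration under stated assumptions, not the study's pipeline: genotypes are coded as alternate-allele counts (0, 1 or 2), diversity is measured as the uncorrected mean expected heterozygosity 2p(1-p), and all function names are hypothetical.

```python
import random


def expected_heterozygosity(genotypes):
    """Mean of 2p(1-p) across SNPs, with no small-sample correction.

    genotypes: list of individuals, each a list of alternate-allele
    counts (0, 1 or 2), one entry per SNP.
    """
    n_ind = len(genotypes)
    n_snps = len(genotypes[0])
    total = 0.0
    for j in range(n_snps):
        p = sum(ind[j] for ind in genotypes) / (2 * n_ind)
        total += 2 * p * (1 - p)
    return total / n_snps


def subsample_heterozygosity(genotypes, n_ind, n_snps, n_reps=100, seed=0):
    """Re-estimate heterozygosity on random subsets of individuals and SNPs."""
    rng = random.Random(seed)
    snp_indices = range(len(genotypes[0]))
    estimates = []
    for _ in range(n_reps):
        inds = rng.sample(genotypes, n_ind)    # individuals, without replacement
        snps = rng.sample(snp_indices, n_snps)  # SNP columns, without replacement
        subset = [[ind[j] for j in snps] for ind in inds]
        estimates.append(expected_heterozygosity(subset))
    return estimates
```

Comparing the spread of the returned estimates across different values of `n_ind` and `n_snps` gives a rough picture of where additional sampling stops changing the result.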

12.
Hughes AL  Packer B  Welch R  Bergen AW  Chanock SJ  Yeager M 《Genetics》2005,170(3):1181-1187
To develop new strategies for searching for genetic associations with complex human diseases, we analyzed 2784 single-nucleotide polymorphisms (SNPs) in 396 protein-coding genes involved in biological processes relevant to cancer and other complex diseases, with respect to gene diversity within samples of individuals representing the three major historic human populations (African, European, and Asian) and with respect to interpopulation genetic distance. Reduced levels of both intrapopulation gene diversity and interpopulation genetic distance were seen in the case of SNPs located within the 5'-UTR and at nonsynonymous SNPs causing radical changes to protein structure. Reduction of gene diversity at SNP loci in these categories was evidence of purifying selection acting at these sites, which in turn causes a reduction in interpopulation divergence. By contrast, a small number of SNP sites in these categories revealed unusually high genetic distances between the two most diverged populations (African and Asian); these loci may have historically been subject to divergent selection pressures.

13.

Background

Recent genome-wide association (GWA) studies have provided compelling evidence of association between genetic variants and common complex diseases. These studies have made use of cases and controls almost exclusively from populations of European ancestry and little is known about the frequency of risk alleles in other populations. The present study addresses the transferability of disease associations across human populations by examining levels of population differentiation at disease-associated single nucleotide polymorphisms (SNPs).

Methods

We genotyped ~1000 individuals from 53 populations worldwide at 25 SNPs which show robust association with 6 complex human diseases (Crohn's disease, type 1 diabetes, type 2 diabetes, rheumatoid arthritis, coronary artery disease and obesity). Allele frequency differences between populations for these SNPs were measured using Fst. The Fst values for the disease-associated SNPs were compared to Fst values from 2750 random SNPs typed in the same set of individuals.

Results

On average, disease SNPs are not significantly more differentiated between populations than random SNPs in the genome. Risk allele frequencies, however, do show substantial variation across human populations and may contribute to differences in disease prevalence between populations. We demonstrate that, in some cases, risk allele frequency differences are unusually high compared to random SNPs and may be due to the action of local (i.e. geographically-restricted) positive natural selection. Moreover, some risk alleles were absent or fixed in a population, which implies that risk alleles identified in one population do not necessarily account for disease prevalence in all human populations.

Conclusion

Although differences in risk allele frequencies between human populations are not unusually large and are thus likely not due to positive local selection, there is substantial variation in risk allele frequencies between populations which may account for differences in disease prevalence between human populations.
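The Fst measure used in the Methods above can be illustrated for a single biallelic SNP. The sketch below uses the textbook definition Fst = (HT - HS) / HT with equally weighted populations; this is an assumption for illustration, and the study may well have used a different estimator (for example one that corrects for sample size). The function name is hypothetical.

```python
def fst_biallelic(freqs):
    """Fst for one biallelic SNP from per-population allele frequencies.

    HT is the expected heterozygosity of the pooled population and HS the
    mean expected heterozygosity within populations. Returns None when the
    SNP is monomorphic overall, where Fst is undefined.
    """
    if len(freqs) < 2:
        raise ValueError("need at least two populations")
    if any(p < 0 or p > 1 for p in freqs):
        raise ValueError("allele frequencies must lie in [0, 1]")
    p_bar = sum(freqs) / len(freqs)
    h_t = 2 * p_bar * (1 - p_bar)
    if h_t == 0:
        return None
    h_s = sum(2 * p * (1 - p) for p in freqs) / len(freqs)
    return (h_t - h_s) / h_t


if __name__ == "__main__":
    print(fst_biallelic([0.5, 0.5]))  # identical frequencies
    print(fst_biallelic([0.0, 1.0]))  # allele absent in one population, fixed in the other
```

With this definition, populations sharing the same frequency give an Fst of 0, and a risk allele that is absent in one population and fixed in another gives an Fst of 1, the extreme case mentioned in the Results.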

14.
Xiao M  Latif SM  Kwok PY 《BioTechniques》2003,34(1):190-197
Strategies for identifying genetic risk factors in complex diseases by association studies require the comparison of allele frequencies of numerous SNPs between affected and control populations. Theoretically, hundreds of thousands of SNP markers across the genome will have to be genotyped in these studies. Genotyping SNPs one sample at a time is extremely costly and time-consuming. To streamline whole genome association studies, some have proposed to screen SNPs by pooling the DNA samples initially for allele frequency determination and to perform individual genotyping only when there is a significant discrepancy in allele frequencies between the affected and control populations. Here we describe a new method for determining the allele frequency of SNPs in pooled DNA samples using a two-color primer extension assay with real-time monitoring of fluorescence polarization (named kinetic FP-TDI assay). By comparing the ratio of the rate of incorporation of the two allele-specific dye-terminators, one can calculate the relative amounts of each allele in the pooled sample. The accuracy of allele frequency determination with pooled samples is within 3.3 ± 0.8% of that determined by genotyping individual samples that make up the pool.
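One way the rate-ratio calculation described above could work is sketched below. The abstract does not give the formula, so everything here is an assumption: the incorporation rate of each dye-terminator is taken to be proportional to the amount of its allele, and the optional `het_ratio` (the rate ratio observed in a known 1:1 heterozygote) is a hypothetical calibration for unequal dye incorporation efficiency.

```python
def pooled_allele_frequency(rate_a, rate_b, het_ratio=1.0):
    """Estimate the frequency of allele A in a pool from two incorporation rates.

    rate_a, rate_b: incorporation rates of the allele-specific dye-terminators.
    het_ratio: rate_a / rate_b measured in a 1:1 heterozygote (assumed
    calibration; 1.0 means both dyes incorporate equally well).
    """
    if rate_a < 0 or rate_b < 0:
        raise ValueError("rates must be non-negative")
    if het_ratio <= 0:
        raise ValueError("het_ratio must be positive")
    if rate_a == 0 and rate_b == 0:
        raise ValueError("at least one rate must be positive")
    corrected_a = rate_a / het_ratio  # put allele A on the same scale as allele B
    return corrected_a / (corrected_a + rate_b)
```

For example, rates of 3.0 and 1.0 with no calibration give an estimated frequency of 0.75 for allele A.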

15.
The analysis of less common variants in genome-wide association studies promises to elucidate complex trait genetics but is hampered by low power to reliably detect association. We show that addition of population-specific exome sequence data to global reference data allows more accurate imputation, particularly of less common SNPs (minor allele frequency 1–10%) in two very different European populations. The imputation improvement corresponds to an increase in effective sample size of 28–38%, for SNPs with a minor allele frequency in the range 1–3%.

16.
Substantial increases of linkage disequilibrium (LD) both in magnitude and in range have been observed in recently admixed populations such as African-American (AfA). On the other hand, it has also been shown that LD in AfAs was very similar to that of Africans. In this study, we attempted to resolve these contradicting observations by conducting a systematic examination of the LD structure in AfAs by genotyping a sample of AfA individuals at 24,341 single nucleotide polymorphisms (SNPs) spanning almost the entire chromosome 21, with an average density of 1.5 kb/SNP. The overall LD in AfAs is similar to that in African populations and much less than that in European populations. Even when the ancestry-informative markers (AIMs) were used, extended LD in AfA was found to be limited to a certain magnitude range (0.2 ≤ r² ≤ 0.8) and a certain distance range, that is, a between-marker distance of more than 200 kb. Furthermore, the inclusion of AfA individuals with predominant African ancestry was found to reduce the overall magnitude of LD. Elevation of LD in the AfA population, compared with its parental populations, can only be observed in limited scenarios, at markers with large allele frequency differences between the 2 parental populations. AfA individuals of wholly African ancestry contribute little to the extended LD in the AfA population, and further genotyping or association analysis conducted using only admixed individuals may lead to higher statistical power and possibly reduced cost.

17.
Many disease-susceptible SNPs exhibit significant disparity in ancestral and derived allele frequencies across worldwide populations. While previous studies have examined population differentiation of alleles at specific SNPs, global ethnic patterns of ensembles of disease risk alleles across human diseases are unexamined. To examine these patterns, we manually curated ethnic disease association data from 5,065 papers on human genetic studies representing 1,495 diseases, recording the precise risk alleles and their measured population frequencies and estimated effect sizes. We systematically compared the population frequencies of cross-ethnic risk alleles for each disease across 1,397 individuals from 11 HapMap populations, 1,064 individuals from 53 HGDP populations, and 49 individuals with whole-genome sequences from 10 populations. Type 2 diabetes (T2D) demonstrated extreme directional differentiation of risk allele frequencies across human populations, compared with null distributions of European-frequency matched control genomic alleles and risk alleles for other diseases. Most T2D risk alleles share a consistent pattern of decreasing frequencies along human migration into East Asia. Furthermore, we show that these patterns contribute to disparities in predicted genetic risk across 1,397 HapMap individuals, T2D genetic risk being consistently higher for individuals in the African populations and lower in the Asian populations, irrespective of the ethnicity considered in the initial discovery of risk alleles. We observed a similar pattern in the distribution of T2D Genetic Risk Scores, which are associated with an increased risk of developing diabetes in the Diabetes Prevention Program cohort, for the same individuals. This disparity may be attributable to the promotion of energy storage and usage appropriate to environments and inconsistent energy intake. 
Our results indicate that the differential frequencies of T2D risk alleles may contribute to the observed disparity in T2D incidence rates across ethnic populations.

18.
19.
Advances in genomic techniques are greatly facilitating the study of molecular signatures of selection in diverging natural populations. Connecting these signatures to phenotypes under selection remains challenging, but benefits from dissections of the genetic architecture of adaptive divergence. We here perform quantitative trait locus (QTL) mapping using 488 F2 individuals and 2011 single nucleotide polymorphisms (SNPs) to explore the genetic architecture of skeletal divergence in a lake-stream stickleback system from Central Europe. We find QTLs for gill raker, snout, and head length, vertebral number, and the extent of lateral plating (plate number and height). Although two large-effect loci emerge, QTL effect sizes are generally small. Examining the neighborhood of the QTL-linked SNPs identifies several genes involved in bone formation, which emerge as strong candidate genes for skeletal evolution. Finally, we use SNP data from the natural source populations to demonstrate that some SNPs linked to QTLs in our cross also exhibit striking allele frequency differences in the wild, suggesting a causal role of these QTLs in adaptive population divergence. Our study paves the way for comparative analyses across other (lake-stream) stickleback populations, and for functional investigations of the candidate genes.

20.
Potential causes of variability in drug response include intrinsic factors such as ethnicity and genetic differences in the expression of enzymes that metabolize drugs, such as those of the cytochrome P450 (CYP) superfamily. Pharmacogenetic studies search for genetic differences between populations, since relevant alleles occur with varying frequencies among different ethnic populations. The Brazilian population is one of the most heterogeneous in the world, resulting from multiethnic admixture of Amerindians, Europeans, and Africans across centuries. Since knowledge of CYP allele frequency distributions is relevant to pharmacogenetic strategies and these data are scarce in the Brazilian population, this study aimed to describe genotype and allele distributions of 15 single nucleotide polymorphisms (SNPs) in the CYP1A2, CYP2C19, CYP3A4, and CYP3A5 genes in individuals of African and European descent from South Brazil. A sample of 179 healthy individuals of European and African ancestry was genotyped with the MassARRAY SNP genotyping system. CYP3A5*3, CYP1A2*1F, CYP3A4*1B, and CYP2C19*2 were the most frequent alleles found in our sample. Significant differences in genotype and allele distributions between individuals of African and European descent were observed for the CYP3A4 and CYP3A5 genes. CYP3A4*1B was observed at a higher frequency in individuals of African descent (0.379) than in those of European descent (0.098), whereas individuals of European descent showed a higher frequency of CYP3A5*3 (0.810) than those of African descent (0.523). Our results indicate that only a few polymorphisms would have an impact on pharmacogenetic testing in South Brazilians. Further studies with larger sample sizes are also required in other Brazilian regions.
