Similar literature
20 similar articles found.
1.
SNP and haplotype variation in the human genome (cited 19 times: 0 self-citations, 19 by others)
We have surveyed and summarized several aspects of DNA variability among humans. The variation described is the result of mutation followed by a combination of drift, migration and selection bringing the frequencies high enough to be observed. This paper describes what we have learned about how DNA variability differs among genes and populations. We sequenced functional regions of a set of 3950 genes. DNA was sampled from 82 unrelated humans: 20 African-Americans, 20 East Asians, 21 Caucasians, 18 Hispanic-Latinos and 3 Native Americans. Different aspects of variability showed a great deal of concordance. In particular, we studied patterns of single nucleotide polymorphism (SNP) allele and haplotype sharing among the four large sample populations. We also examined how linkage disequilibrium (LD) between SNPs relates to physical distance in the different populations. It is clear from our findings that while many variants are common to all populations, many others have a more restricted distribution. Research that attempts to find genetic variants that explain phenotypic variants must be careful in its choice of study population.

2.
Phenotypic variation in natural populations results from a combination of genetic effects, environmental effects, and gene-by-environment interactions. Despite the vast amount of genomic data becoming available, many pressing questions remain about the nature of genetic mutations that underlie functional variation. We present the results of combining genome-wide association analysis of 41 different phenotypes in ∼5,000 inbred maize lines to analyze patterns of high-resolution genetic association among 28.9 million single-nucleotide polymorphisms (SNPs) and ∼800,000 copy-number variants (CNVs). We show that genic and intergenic regions have opposite patterns of enrichment, minor allele frequencies, and effect sizes, implying tradeoffs among the probability that a given polymorphism will have an effect, the detectable size of that effect, and its frequency in the population. We also find that genes tagged by GWAS are enriched for regulatory functions and are ∼50% more likely to have a paralog than expected by chance, indicating that gene regulation and gene duplication are strong drivers of phenotypic variation. These results will likely apply to many other organisms, especially ones with large and complex genomes like maize.

3.
As part of the GAIT (genetic analysis of idiopathic thrombophilia) project, we analyzed polymorphisms in the factor V (FV) gene to assess their role as genetic determinants of normal phenotypic variation of hemostasis-related traits in a Spanish population. During the analysis of exon 13 polymorphisms, we detected an abnormal PCR-amplified fragment in some members of the GAIT19 family. Direct sequence analysis revealed a deletion of 108 bp in eight out of 20 individuals in this family. This deletion removes exactly 36 amino acids from the B domain of FV; thus it does not alter the reading frame of the sequence. Among the deleted amino acids there is the 4070A>G polymorphism (H1299R), which could affect the level or function of FV. In addition, in the same family we identified three novel DNA variants (L1257I, Q1317Q and T1327T) in exon 13 of the F5 gene. Despite these variants, we did not detect any differences in the coagulant or anticoagulant traits, or in the plasma levels of proteins involved in the blood coagulation cascade, between carriers and their non-carrier relatives. From these results, we can conclude that the mutant allele is expressed and the resultant protein is functional. Moreover, it is unlikely that the 4070A>G polymorphism, within the deletion, and the novel DNA variants alter the functional properties of the mature FV protein. Further analyses of this naturally occurring mutation and the novel DNA variants should yield useful information for the understanding of the function of the B domain of FV.

4.
Allele-specific gene expression in a wild nonhuman primate population (cited 1 time: 0 self-citations, 1 by others)
Natural populations hold enormous potential for evolutionary genetic studies, especially when phenotypic, genetic and environmental data are all available on the same individuals. However, untangling the genotype-phenotype relationship in natural populations remains a major challenge. Here, we describe results of an investigation of one class of phenotype, allele-specific gene expression (ASGE), in the well-studied natural population of baboons of the Amboseli basin, Kenya. ASGE measurements identify cases in which one allele of a gene is overexpressed relative to the alternative allele of the same gene, within individuals, thus providing a control for background genetic and environmental effects. Here, we characterize the incidence of ASGE in the Amboseli baboon population, focusing on the genetic and environmental contributions to ASGE in a set of eleven genes involved in immunity and defence. Within this set, we identify evidence for common ASGE in four genes. We also present examples of two relationships between cis-regulatory genetic variants and the ASGE phenotype. Finally, we identify one case in which this relationship is influenced by a novel gene-environment interaction. Specifically, the dominance rank of an individual's mother during its early life (an aspect of that individual's social environment) influences the expression of the gene CCL5 via an interaction with cis-regulatory genetic variation. These results illustrate how environmental and ecological data can be integrated into evolutionary genetic studies of functional variation in natural populations. They also highlight the potential importance of early life environmental variation in shaping the genetic architecture of complex traits in wild mammals.

5.
Whole-exome or gene-targeted resequencing in hundreds to thousands of individuals has shown that the majority of genetic variants are at low frequency in human populations. Rare variants are enriched for functional mutations and are expected to explain an important fraction of the genetic etiology of human disease, and are therefore of potential medical interest. In this work, we analyze the whole-exome sequences of French-Canadian individuals, a founder population with a unique demographic history that includes an original population bottleneck less than 20 generations ago, followed by a demographic explosion, and the whole exomes of French individuals sampled from France. We show that in less than 20 generations of genetic isolation from the French population, the genetic pool of French-Canadians shows reduced levels of diversity, higher homozygosity, and an excess of rare variants with low variant sharing with Europeans. Furthermore, the French-Canadian population contains a larger proportion of putatively damaging functional variants, which could partially explain the increased incidence of genetic disease in the province. Our results highlight the impact of population demography on genetic fitness and the contribution of rare variants to the human genetic variation landscape, emphasizing the need for deep cataloguing of genetic variants by resequencing worldwide human populations in order to truly assess disease risk.

6.

Background

Rare coding variants constitute an important class of human genetic variation, but are underrepresented in current databases that are based on small population samples. Recent studies show that variants altering amino acid sequence and protein function are enriched at low variant allele frequency, 2 to 5%, but because of insufficient sample size it is not clear if the same trend holds for rare variants below 1% allele frequency.

Results

The 1000 Genomes Exon Pilot Project has collected deep-coverage exon-capture data in roughly 1,000 human genes, for nearly 700 samples. Although medical whole-exome projects are currently afoot, this is still the deepest reported sampling of a large number of human genes with next-generation technologies. According to the goals of the 1000 Genomes Project, we created effective informatics pipelines to process and analyze the data, and discovered 12,758 exonic SNPs, 70% of them novel, and 74% below 1% allele frequency in the seven population samples we examined. Our analysis confirms that coding variants below 1% allele frequency show increased population-specificity and are enriched for functional variants.

Conclusions

This study represents a large step toward detecting and interpreting low frequency coding variation, clearly lays out technical steps for effective analysis of DNA capture data, and articulates functional and population properties of this important class of genetic variation.

7.
Genome-wide association studies (GWAS) have had tremendous success in the identification of common DNA sequence variants associated with complex human diseases and traits. However, because of their design, GWAS are largely inappropriate to characterize the role of rare and low-frequency DNA variants on human phenotypic variation. Rarer genetic variation is geographically more restricted, supporting the need for local whole-genome sequencing (WGS) efforts to study these variants in specific populations. Here, we present the first large-scale low-pass WGS of the French-Canadian population. Specifically, we sequenced at ~5.6× coverage the whole genome of 1970 French Canadians recruited by the Montreal Heart Institute Biobank and identified 29 million bi-allelic variants (31 % novel), including 19 million variants with a minor allele frequency (MAF) <0.5 %. Genotypes from the WGS data are highly concordant with genotypes obtained by exome array on the same individuals (99.8 %), even when restricting this analysis to rare variants (MAF <0.5 %, 99.9 %) or heterozygous sites (98.9 %). To further validate our data set, we showed that we can effectively use it to replicate several genetic associations with myocardial infarction risk and blood lipid levels. Furthermore, we analyze the utility of our WGS data set to generate a French-Canadian-specific imputation reference panel and to infer population structure in the Province of Quebec. Our results illustrate the value of low-pass WGS to study the genetics of human diseases in the founder French-Canadian population.

8.
Human genetic variation is the incarnation of diverse evolutionary history, which reflects both selectively advantageous and selectively neutral change. In this study, we catalogue structural and functional features of proteins that restrain genetic variation leading to single amino acid substitutions. Our variation dataset is divided into three categories: i) Mendelian disease-related variants, ii) neutral polymorphisms and iii) cancer somatic mutations. We characterize structural environments of the amino acid variants by the following properties: i) side-chain solvent accessibility, ii) main-chain secondary structure, and iii) hydrogen bonds from a side chain to a main chain or other side chains. To address functional restraints, amino acid substitutions in proteins are examined to see whether they are located at functionally important sites involved in protein-protein interactions, protein-ligand interactions or catalytic activity of enzymes. We also measure the likelihood of amino acid substitutions and the degree of residue conservation where variants occur. We show that various types of variants are under different degrees of structural and functional restraints, which affect their occurrence in the human proteome.

9.
Copy number variants (CNVs) contribute to human genetic and phenotypic diversity. However, the distribution of larger CNVs in the general population remains largely unexplored. We identify large variants in ~2500 individuals by using Illumina SNP data, with an emphasis on “hotspots” prone to recurrent mutations. We find variants larger than 500 kb in 5%–10% of individuals and variants greater than 1 Mb in 1%–2%. In contrast to previous studies, we find limited evidence for stratification of CNVs in geographically distinct human populations. Importantly, our sample size permits a robust distinction between truly rare and polymorphic but low-frequency copy number variation. We find that a significant fraction of individual CNVs larger than 100 kb are rare and that both gene density and size are strongly anticorrelated with allele frequency. Thus, although large CNVs commonly exist in normal individuals, which suggests that size alone cannot be used as a predictor of pathogenicity, such variation is generally deleterious. Considering these observations, we combine our data with published CNVs from more than 12,000 individuals contrasting control and neurological disease collections. This analysis identifies known disease loci and highlights additional CNVs (e.g., 3q29, 16p12, and 15q25.2) for further investigation. This study provides one of the first analyses of large, rare (0.1%–1%) CNVs in the general population, with insights relevant to future analyses of genetic disease.

10.
Interaction (nonadditive effects) between genetic variants has been highlighted as an important mechanism underlying phenotypic variation, but the discovery of genetic interactions in humans has proved difficult. In this study, we show that the spectrum of variation in the human genome has been shaped by modifier effects of cis-regulatory variation on the functional impact of putatively deleterious protein-coding variants. We analyzed 1000 Genomes population-scale resequencing data from Europe (CEU [Utah residents with Northern and Western European ancestry from the CEPH collection]) and Africa (YRI [Yoruba in Ibadan, Nigeria]) together with gene expression data from arrays and RNA sequencing for the same samples. We observed an underrepresentation of derived putatively functional coding variation on the more highly expressed regulatory haplotype, which suggests stronger purifying selection against deleterious coding variants that have increased penetrance because of their regulatory background. Furthermore, the frequency spectrum and impact size distribution of common regulatory polymorphisms (eQTLs) appear to be shaped in order to minimize the selective disadvantage of having deleterious coding mutations on the more highly expressed haplotype. Interestingly, eQTLs explaining common disease GWAS signals showed an enrichment of putative epistatic effects, suggesting that some disease associations might arise from interactions increasing the penetrance of rare coding variants. In conclusion, our results indicate that regulatory and coding variants often modify the functional impact of each other. This specific type of genetic interaction is detectable from sequencing data in a genome-wide manner, and characterizing these joint effects might help us understand functional mechanisms behind genetic associations to human phenotypes, including both Mendelian and common disease.

11.
The persistence of behaviorally deleterious genes in the human population poses an interesting question for population genetics: If certain alleles at these loci are deleterious, why have they survived in the population? We consider evidence for phenotypic capacitance and/or frequency-dependent selection for an allele that has been putatively shown to have negative associations with human behaviors (the “short” 5-HTT promoter region allele) yet has persisted in human and nonhuman primate populations. Using data from the National Longitudinal Study of Adolescent Health, we compare sibling and twin variation in depression by 5-HTT genotype (specified in several ways) and investigate sibship-level cross-person gene-gene interactions. In support of the “orchid/dandelion” hypothesis, we find evidence that the short allele increases variation in phenotypes in response to environmental (or genetic) differences (i.e., acts as a perturbation of a phenotypic capacitor). Further, we also find some evidence that the effects of allelic variation at this locus are moderated by the genetic environment of the sibship unit (i.e., effects may be susceptible to frequency-dependent selection). We discuss implications of these findings for genetic models in general, specifically with respect to stable unit treatment value assumption violations (i.e., nonindependence of units of analysis).

12.
Population-scale genome sequencing allows the characterization of functional effects of a broad spectrum of genetic variants underlying human phenotypic variation. Here, we investigate the influence of rare and common genetic variants on gene expression patterns, using variants identified from sequencing data from the 1000 Genomes Project in an African and European population sample and gene expression data from lymphoblastoid cell lines. We detect comparable numbers of expression quantitative trait loci (eQTLs) when compared to genotypes obtained from HapMap 3, but as many as 80% of the top expression quantitative trait variants (eQTVs) discovered from 1000 Genomes data are novel. The properties of the newly discovered variants suggest that mapping common causal regulatory variants is challenging even with full resequencing data; however, we observe significant enrichment of regulatory effects in splice-site and nonsense variants. Using RNA sequencing data, we show that 46.2% of nonsynonymous variants are differentially expressed in at least one individual in our sample, creating widespread potential for interactions between functional protein-coding and regulatory variants. We also use allele-specific expression to identify putative rare causal regulatory variants. Furthermore, we demonstrate that outlier expression values can be due to rare variant effects, and we approximate the number of such effects harboured in an individual by effect size. Our results demonstrate that integration of genomic and RNA sequencing analyses allows for the joint assessment of genome sequence and genome function.

13.
Genetic variation in the membrane trafficking adapter protein complex 4 (AP‐4) can result in pathogenic neurological phenotypes including microencephaly, spastic paraplegias, epilepsy, and other developmental defects. We lack molecular mechanisms responsible for impaired AP‐4 function arising from genetic variation, because AP‐4 remains poorly understood structurally. Here, we analyze patterns of AP‐4 genetic evolution and conservation to identify regions that are likely important for function and thus more susceptible to pathogenic variation. We map known variants onto an AP‐4 homology model and predict the likelihood of pathogenic variation at a given location on the structure of AP‐4. We find significant clustering of likely pathogenic variants located at the interface between the β4 and N‐μ4 subunits, as well as throughout the C‐μ4 subunit. Our work offers an integrated perspective on how genetic and evolutionary forces affect AP‐4 structure and function. As more individuals with uncharacterized AP‐4 variants are identified, our work provides a foundation upon which their functional effects and disease relevance can be interpreted.

14.
Individual risk and the population incidence of disease result from the interaction of genetic susceptibility and exposure. DNA repair is an example of a cellular process where genetic variation in families with extreme predisposition is documented to be associated with high disease likelihood, including syndromes of premature aging and cancer. Although the identification and characterization of new genes or variants in cancer families continues to be important, the focus of this paper is the current status of efforts to define the impact of polymorphic amino acid substitutions in DNA repair genes on individual and population cancer risk. There is increasing evidence that mild reductions in DNA repair capacity, assumed to be the consequence of common genetic variation, affect cancer predisposition. The extensive variation being found in the coding regions of DNA repair genes and the large number of genes in each of the major repair pathways result in complex genotypes with potential to impact cancer risk in the general population. The implications of this complexity for molecular epidemiology studies, as well as concepts that may make these challenges more manageable, are discussed. The concepts include both experimental and computational approaches that could be employed to develop predictors of disease susceptibility based on DNA repair genotype, focusing initially on studies to assess functional impact on individual proteins and pathways and then on molecular epidemiology studies to assess exposure-dependent health risk. In closing, we raise some of the non-technical challenges to the utilization of the full richness of the genetic variation to reduce disease occurrence and ultimately improve health care.

15.
Genomics, 2020, 112(1): 442-458
The Russian Federation is the largest and one of the most ethnically diverse countries in the world; however, no centralized reference database of genetic variation exists to date. Such data are crucial for medical genetics and essential for studying population history. The Genome Russia Project aims at filling this gap by performing whole genome sequencing and analysis of peoples of the Russian Federation. Here we report the characterization of genome-wide variation of 264 healthy adults, including 60 newly sequenced samples. People of Russia carry known and novel genetic variants of adaptive, clinical and functional consequence that in many cases show allele frequency divergence from neighboring populations. Population genetics analyses revealed six phylogeographic partitions among indigenous ethnicities corresponding to their geographic locales. This study presents a characterization of population-specific genomic variation in Russia with results important for medical genetics and for understanding the dynamic population history of the world's largest country.

16.
Skin color is a polygenically determined quantitative trait. Although it has been used extensively in studies of between-population variation, there have been relatively few studies of the inheritance of skin color. In this article we use measurements on 359 members of the Jirel population of eastern Nepal to assess the heritabilities and additive genetic correlations of three skin reflectance measures. Skin color was measured at the upper inner arm site at three wavelengths. A maximum likelihood approach was used to estimate sex and age effects on skin reflectance, heritabilities, and phenotypic variances at each wavelength and both additive genetic and environmental correlations between wavelengths. This technique incorporated information from 36 pedigrees with 2-25 members and 173 independent individuals. Likelihood ratio tests were used to assess the significance of specific variance/covariance components. The results indicate that skin reflectances are moderately heritable at all three wavelengths. The pairwise phenotypic correlations ranged from 0.76 to 0.88. The observed additive genetic correlations were not significantly different from 1.00, suggesting that the same loci influence variation at each wavelength. This evidence for relatively complete pleiotropy implies that measurements at multiple wavelengths yield little additional genetic information, although they may be useful for reducing measurement error. Based on estimates of the genetic and phenotypic covariance matrices, we determined that skin reflectance measurements are expected to provide only as much information for assessing local between-population genetic variation as a single two-allele polymorphic marker. Therefore microevolutionary studies based on skin color variation should be viewed with caution.

17.
Color vision in primates is variable across species, and it represents a rare trait in which the genetic mechanisms underlying phenotypic variation are fairly well understood. Research on primate color vision has largely focused on adaptive explanations for observed variation, but it remains unclear why some species have trichromatic or polymorphic color vision while others are red-green color blind. Lemurs, in particular, are highly variable. While some species are polymorphic, many closely related species are strictly dichromatic. We provide the first characterization of color vision in a wild population of red-bellied lemurs (Eulemur rubriventer, Ranomafana National Park, Madagascar) with a sample size (87 individuals; N = 134 X chromosomes) large enough to detect even rare variants (0.95 probability of detection at ≥ 3% frequency). By sequencing exon 5 of the X-linked opsin gene we identified opsin spectral sensitivity based on known diagnostic sites and found this population to be dichromatic and monomorphic for a long wavelength allele. Apparent fixation of this long allele is in contrast to previously published accounts of Eulemur species, which exhibit either polymorphic color vision or only the medium wavelength opsin. This unexpected result may represent loss of color vision variation, which could occur through selective processes and/or genetic drift (e.g., genetic bottleneck). To indirectly assess the latter scenario, we genotyped 55 adult red-bellied lemurs at seven variable microsatellite loci and used heterozygosity excess and M-ratio tests to assess if this population may have experienced a recent genetic bottleneck. Results of heterozygosity excess but not M-ratio tests suggest a bottleneck might have occurred in this red-bellied lemur population. Therefore, while selection may also play a role, the unique color vision observed in this population might have been influenced by a recent genetic bottleneck. These results emphasize the need to consider adaptive and nonadaptive mechanisms of color vision evolution in primates.
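
The detection claim in the abstract above (at least 0.95 probability of observing a variant at ≥ 3% frequency among 134 X chromosomes) can be checked with a simple sampling argument. The sketch below is not taken from the paper: it assumes each sampled chromosome is an independent draw from the population, so the chance of missing an allele of frequency p in n chromosomes is (1 - p)^n. The function names are hypothetical and chosen only for illustration.

```python
def detection_probability(freq: float, n_chromosomes: int) -> float:
    """Probability of seeing an allele at least once among n sampled chromosomes.

    Assumes independent sampling (an assumption of this sketch, not a
    statement of the authors' method): P(detect) = 1 - (1 - freq)^n.
    """
    return 1.0 - (1.0 - freq) ** n_chromosomes


def min_detectable_frequency(power: float, n_chromosomes: int) -> float:
    """Smallest allele frequency detected with probability `power`.

    Obtained by solving 1 - (1 - freq)^n = power for freq.
    """
    return 1.0 - (1.0 - power) ** (1.0 / n_chromosomes)


if __name__ == "__main__":
    n = 134  # X chromosomes sampled, as reported in the abstract
    print(f"P(detect | freq = 3%, n = {n}) = {detection_probability(0.03, n):.3f}")
    print(f"Frequency detectable with 0.95 probability: "
          f"{min_detectable_frequency(0.95, n):.4f}")
```

Under this assumption, a 3% allele would be detected with probability above 0.95 in a sample of this size, which agrees with the figure quoted in the abstract. Relatedness among sampled individuals would lower the effective number of independent chromosomes and therefore the detection probability.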

18.
Census population size, sex-ratio and female reproductive success were monitored in 10 laboratory populations of Drosophila melanogaster selected for different ages of reproduction. With this demographic information, we estimated eigenvalue, variance and probability of allele loss effective population sizes. We conclude that estimates of effective size based on gene-frequency change at a few loci are biased downwards. We analysed the relative roles of selection and genetic drift in maintaining genetic variation in laboratory populations of Drosophila. We suggest that rare, favourable genetic variants in our laboratory populations have a high chance of being lost if their fitness effect is weak, e.g. 1% or less. However, if the fitness effect of this variation is 10% or greater, these rare variants are likely to increase to high frequency. The demographic information developed in this study suggests that some of our laboratory populations harbour more genetic variation than expected. One explanation for this finding is that part of the genetic variation in these outbred laboratory Drosophila populations may be maintained by some form of balancing selection. We suggest that, unlike bacteria, medium-term adaptation of laboratory populations of fruit flies is not primarily driven by new mutations, but rather by changes in the frequency of preexisting alleles.

19.
Interpreting the phenotypic consequences of human structural variation remains challenging. Functional enrichment analysis, which can identify functional enrichments among genes affected by structural variants, is providing significant biological insights into the genotype-phenotype relationship. In this review, we discuss the different approaches and choices in the application of this technique to human structural variation. We consider the importance of choosing the right background distribution for detection, the significance of the gene selection criteria, the effects of tissue-specific gene length biases and discuss sources of functional annotations with a focus on Gene Ontology and mouse phenotypic resources. Throughout this review, we highlight potential sources of significant bias that are of particular concern to the analysis of structural variants, and illustrate the importance of examining the expectations upon which enrichment analysis techniques depend.

20.
How polymorphisms are maintained within populations over long periods of time remains debated, because genetic drift and various forms of selection are expected to reduce variation. Here, we study the genetic architecture and maintenance of phenotypic morphs that confer crypsis in Timema cristinae stick insects, combining phenotypic information and genotyping‐by‐sequencing data from 1,360 samples across 21 populations. We find two highly divergent chromosomal variants that span megabases of sequence and are associated with colour polymorphism. We show that these variants exhibit strongly reduced effective recombination, are geographically widespread and probably diverged millions of generations ago. We detect heterokaryotype excess and signs of balancing selection acting on these variants through the species’ history. A third chromosomal variant in the same genomic region likely evolved more recently from one of the two colour variants and is associated with dorsal pattern polymorphism. Our results suggest that large‐scale genetic variation associated with crypsis has been maintained for long periods of time by potentially complex processes of balancing selection.
