Similar literature
Found 20 similar documents (search time: 15 ms)
1.
The allele frequency spectrum has attracted considerable interest for the simultaneous inference of the demographic and adaptive history of populations. In a recent study, Evans et al. (2007) developed a forward diffusion equation describing the allele frequency spectrum, when the population is subject to size changes, selection and mutation. From the diffusion equation, the authors derived a system of ordinary differential equations (ODEs) for the moments in a Wright–Fisher diffusion with varying population size and constant selection. Here, we present an explicit solution for this system of ODEs with variable population size, but without selection, and apply this result to derive the expected spectrum of a sample for time-varying population size. We use this forward-in-time-solution of the allele frequency spectrum to obtain the backward-in-time-solution previously derived via coalescent theory by Griffiths and Tavaré (1998). Finally, we discuss the applicability of the theoretical results to the analysis of nucleotide polymorphism data.

2.
Publicly available single nucleotide polymorphism (SNP) allele frequencies are an important resource for the selection of genetic markers that may be most useful for gene mapping and association studies. Data mining these allele frequencies through disparate public databases and websites is time consuming and can result in inconsistent findings. We have developed a web-based software tool, Frequency Finder, to acquire SNP allele frequencies from multiple public data sources and return a summarized result to the user. Our software optimizes and automates the search of candidate markers, decreasing the amount of time it would take to extract pertinent data manually. We have included several methods to output the data, including on-screen and as a compressed text file. We show that Frequency Finder accurately retrieves available frequency data from the available sources. Using this tool, we detect significant differences between Asian, African and Caucasian populations in the allele frequency spectra of 246,097 SNPs. While limited to public databases that provide web-based access to allele frequencies, Frequency Finder provides a single, user-friendly interface for retrieving allele frequencies for large batches of SNPs from multiple data sources.

3.
It is well known that the neutral allelic frequency spectrum of a population is affected by the history of population size. A number of authors have used this fact to infer history given observed allele frequency data. We ask whether perfect information concerning the spectrum allows precise recovery of the history, and with an explicit example show that the answer is in the negative. This implies some limitations on how informative allelic spectra can be.

4.
Chen H, Green RE, Pääbo S, Slatkin M. Genetics. 2007;177(1):387-398.
We develop the theory for computing the joint frequency spectra of alleles in two closely related species. We allow for arbitrary population growth in both species after they had a common ancestor. We focus on the case in which a single chromosome is sequenced from one of the species. We use classical diffusion theory to show that, if the ancestral species was at equilibrium under mutation and drift and a chromosome from one of the descendant species carries the derived allele, the frequency spectrum in the other species is uniform, independently of the demographic history of both species. We also predict the expected densities of segregating and fixed sites when the chromosome from the other species carries the ancestral allele. We compare the predictions of our model with the site-frequency spectra of SNPs in the four HapMap populations of humans when the nucleotide present in the Neanderthal DNA sequence is ancestral or derived, using the chimp genome as the outgroup.

5.
The analysis of molecular data from natural populations has allowed researchers to answer diverse ecological questions that were previously intractable. In particular, ecologists are often interested in the demographic history of populations, information that is rarely available from historical records. Methods have been developed to infer demographic parameters from genomic data, but it is not well understood how inferred parameters compare to true population history or depend on aspects of experimental design. Here, we present and evaluate a method of SNP discovery using RNA sequencing and demographic inference using the program δaδi, which uses a diffusion approximation to the allele frequency spectrum to fit demographic models. We test these methods in a population of the checkerspot butterfly Euphydryas gillettii. This population was intentionally introduced to Gothic, Colorado in 1977 and has experienced extreme fluctuations including bottlenecks of fewer than 25 adults, as documented by nearly annual field surveys. Using RNA sequencing of eight individuals from Colorado and eight individuals from a native population in Wyoming, we generate the first genomic resources for this system. While demographic inference is commonly used to examine ancient demography, our study demonstrates that our inexpensive, all-in-one approach to marker discovery and genotyping provides sufficient data to accurately infer the timing of a recent bottleneck. This demographic scenario is relevant for many species of conservation concern, few of which have sequenced genomes. Our results are remarkably insensitive to sample size or number of genomic markers, which has important implications for applying this method to other nonmodel systems.

6.
Inferring the ancestral dynamics of effective population size is a long-standing question in population genetics, which can now be tackled much more accurately thanks to the massive genomic data available in many species. Several promising methods that take advantage of whole-genome sequences have been recently developed in this context. However, they can only be applied to rather small samples, which limits their ability to estimate recent population size history. Besides, they can be very sensitive to sequencing or phasing errors. Here we introduce a new approximate Bayesian computation approach named PopSizeABC that allows estimating the evolution of the effective population size through time, using a large sample of complete genomes. This sample is summarized using the folded allele frequency spectrum and the average zygotic linkage disequilibrium at different bins of physical distance, two classes of statistics that are widely used in population genetics and can be easily computed from unphased and unpolarized SNP data. Our approach provides accurate estimations of past population sizes, from the very first generations before present back to the expected time to the most recent common ancestor of the sample, as shown by simulations under a wide range of demographic scenarios. When applied to samples of 15 or 25 complete genomes in four cattle breeds (Angus, Fleckvieh, Holstein and Jersey), PopSizeABC revealed a series of population declines, related to historical events such as domestication or modern breed creation. We further highlight that our approach is robust to sequencing errors, provided summary statistics are computed from SNPs with common alleles.
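As a rough illustration of the "folded" spectrum mentioned in this abstract, the sketch below merges derived-allele count classes i and n - i into a single minor-allele count class. This is what makes the statistic computable from unpolarized SNP data. The function name `fold_spectrum` and the list layout are assumptions made for this example and are not taken from PopSizeABC.

```python
def fold_spectrum(unfolded):
    """Fold an unfolded site frequency spectrum (illustrative sketch).

    unfolded[i - 1] is the number of segregating sites at which the derived
    allele is carried by i of the n sampled chromosomes (i = 1 .. n - 1).
    The result has one entry per minor-allele count k = 1 .. n // 2.
    """
    n = len(unfolded) + 1  # sample size implied by the number of classes
    folded = []
    for k in range(1, n // 2 + 1):
        count = unfolded[k - 1]
        # Classes k and n - k are indistinguishable without polarization;
        # when n is even, the middle class k == n - k is counted once.
        if k != n - k:
            count += unfolded[n - k - 1]
        folded.append(count)
    return folded


# Six chromosomes: classes 1..5 fold into minor-allele counts 1..3.
print(fold_spectrum([10, 5, 3, 2, 1]))
```

Folding preserves the total number of segregating sites, which is a quick sanity check on any implementation.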

7.
Patterson NJ. Genetics. 2005;169(2):1093-1104.
An important clue to the evolutionary history of an allele is the structure of the neighboring region of the genome, which we term the genomic background of the allele. Consider two copies of the allele. How similar we expect their genomic background to be is strongly influenced by the age of their most recent common ancestor (MRCA). We apply diffusion theory, first used by Motoo Kimura as a tool for predicting the changes in allele frequencies over time and developed by him in many articles in this journal, to prove a variety of new results on the age of the MRCA under the simplest demographic assumptions. In particular, we show that the expected age of the MRCA of two copies of an allele with population frequency f is just 2Nf generations, where N is the effective population size. Our results are a first step in running exact coalescent simulations, where we also simulate the history of the population frequency of an allele.

8.
Positional cloning of genes underlying complex diseases, such as type 2 diabetes mellitus (T2DM), typically follows a two-tiered process in which a chromosomal region is first identified by genome-wide linkage scanning, followed by association analyses using densely spaced single nucleotide polymorphic markers to identify the causal variant(s). The success of genome-wide single nucleotide polymorphism (SNP) detection has resulted in a vast number of potential markers available for use in the construction of such dense SNP maps. However, the cost of genotyping large numbers of SNPs in appropriately sized samples is nearly prohibitive. We have explored pooled DNA genotyping as a means of identifying differences in allele frequency between pools of individuals with T2DM and unaffected controls by using Pyrosequencing technology. We found that allele frequencies in pooled DNA were strongly correlated with those in individuals (r=0.99, P<0.0001) across a wide range of allele frequencies (0.02-0.50). We further investigated the sensitivity of this method to detect allele frequency differences between contrived pools, also over a wide range of allele frequencies. We found that Pyrosequencing was able to detect an allele frequency difference of less than 2% between pools, indicating that this method may be sensitive enough for use in association studies involving complex diseases where a small difference in allele frequency between cases and controls is expected.

9.
Single-nucleotide polymorphism (SNP) arrays have become a popular technology for disease-association studies, but they also have potential for studying the genetic differentiation of human populations. Application of the Affymetrix GeneChip Human Mapping 500K Array Set to a population of 102 individuals representing the major ethnic groups in the United States (African, Asian, European, and Hispanic) revealed patterns of gene diversity and genetic distance that reflected population history. We analyzed allelic frequencies at 388,654 autosomal SNP sites that showed some variation in our study population and 10% or fewer missing values. Despite the small size (23-31 individuals) of each subpopulation, there were no fixed differences at any site between any two subpopulations. As expected from the African origin of modern humans, greater gene diversity was seen in Africans than in either Asians or Europeans, and the genetic distance between the Asian and the European populations was significantly lower than that between either of these two populations and Africans. Principal components analysis applied to a correlation matrix among individuals was able to separate completely the major continental groups of humans (Africans, Asians, and Europeans), while Hispanics overlapped all three of these groups. Genes containing two or more markers with extraordinarily high genetic distance between subpopulations were identified as candidate genes for health differences between subpopulations. The results show that, even with modest sample sizes, genome-wide SNP genotyping technologies have great promise for capturing signatures of gene frequency difference between human subpopulations, with applications in areas as diverse as forensics and the study of ethnic health disparities.

10.
It is becoming routine to obtain data sets on DNA sequence variation across several thousands of chromosomes, providing unprecedented opportunity to infer the underlying biological and demographic forces. Such data make it vital to study summary statistics that offer enough compression to be tractable, while preserving a great deal of information. One well-studied summary is the site frequency spectrum: the empirical distribution, across segregating sites, of the sample frequency of the derived allele. However, most previous theoretical work has assumed that each site has experienced at most one mutation event in its genealogical history, which becomes less tenable for very large sample sizes. In this work we obtain, in closed form, the predicted frequency spectrum of a site that has experienced at most two mutation events, under very general assumptions about the distribution of branch lengths in the underlying coalescent tree. Among other applications, we obtain the frequency spectrum of a triallelic site in a model of historically varying population size. We demonstrate the utility of our formulas in two settings: First, we show that triallelic sites are more sensitive to the parameters of a population that has experienced historical growth, suggesting that they will have use if they can be incorporated into demographic inference. Second, we investigate a recently proposed alternative mechanism of mutation in which the two derived alleles of a triallelic site are created simultaneously within a single individual, and we develop a test to determine whether it is responsible for the excess of triallelic sites in the human genome.
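The definition of the site frequency spectrum given in this abstract can be made concrete with a minimal sketch. It assumes biallelic sites (the one-mutation case, not the triallelic extension the paper develops) and a haplotype matrix coded 0 for the ancestral allele and 1 for the derived allele. The function name and the input layout are hypothetical, chosen only for this example.

```python
def site_frequency_spectrum(haplotypes):
    """Count segregating sites by derived-allele count (illustrative sketch).

    haplotypes: list of n equal-length sequences of 0/1 values, one per
    sampled chromosome (0 = ancestral allele, 1 = derived allele).
    Returns a list of length n - 1 whose entry i - 1 is the number of sites
    at which exactly i chromosomes carry the derived allele.
    """
    n = len(haplotypes)
    num_sites = len(haplotypes[0])
    sfs = [0] * (n - 1)
    for site in range(num_sites):
        derived = sum(h[site] for h in haplotypes)
        # Sites with 0 or n derived copies are not segregating in the sample.
        if 0 < derived < n:
            sfs[derived - 1] += 1
    return sfs


sample = [
    [1, 0, 1, 0, 1],
    [0, 0, 1, 0, 1],
    [0, 1, 1, 0, 0],
    [0, 0, 1, 0, 1],
]
print(site_frequency_spectrum(sample))
```

Dividing each entry by the number of segregating sites turns these counts into the empirical distribution described in the abstract.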

11.
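The fat-tail (fat-rump) trait is essential for the survival of sheep in harsh environments, but the genetic basis and molecular mechanism by which fat is deposited in large amounts in the tail and rump remain unclear. In this study, a previously screened SNP at position 59383635 on the X chromosome was used as a candidate molecular marker. PCR-SSCP was used to detect polymorphism at this site in Chinese breeds with extremely different tail types (Altay sheep, Small-tailed Han sheep, Hu sheep, and Chinese Merino fine-wool sheep) and in the introduced Suffolk breed, and a model was used to analyze its correlation with the tail (rump) trait. The results showed that the T allele at position 59383635 occurred at high frequency in the Altay population, which has a high phenotypic score, whereas the C allele occurred at high frequency in thin-tailed breeds. A model relating the T/C allele frequency ratio to the tail-rump phenotypic score showed that the T/C ratio increases exponentially as the phenotypic score increases. These results suggest that the distribution of the polymorphism at X-chromosome position 59383635 differs markedly between fat-tailed (fat-rumped) and thin-tailed sheep populations, and that this SNP could serve as an ideal molecular marker for breeding high-fat and low-fat sheep breeds, although its biological function requires further study.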

12.
The role of agouti signaling protein (ASIP) in human pigmentation pathways is not definitively understood although its murine homologue regulates, in part, pheomelanogenesis. We have reported an association of a polymorphism in the 3'-untranslated region of ASIP (g.8818A>G) with dark hair and eye color among a group of European-Americans (Am J Hum Genet 2002 March;70:770). Among 147 healthy control subjects, the frequency of the G-allele was 0.12. We hypothesized that this polymorphism would occur at different frequencies among different population groups. Using PCR-RFLP, we genotyped 25 East Asian, 86 African-American, and 207 West African individuals for the ASIP g.8818A>G polymorphism. The g.8818G-allele was present in the West African sample at a frequency of 0.80, in the African-American sample at a frequency of 0.62, and in the East Asian sample at 0.28. The difference in allele frequency among population groups was statistically significant (P < 0.0001). Although the effect of the g.8818A>G polymorphism upon ASIP function is unknown, the large difference in allele frequency between our West African and European-American sample populations lends support to the notion that this gene may be important in human pigmentation.

13.
A major challenge in the analysis of population genomics data consists of isolating signatures of natural selection from background noise caused by random drift and gene flow. Analyses of massive amounts of data from many related populations require high-performance algorithms to determine the likelihood of different demographic scenarios that could have shaped the observed neutral single nucleotide polymorphism (SNP) allele frequency spectrum. In many areas of applied mathematics, Fourier Transforms and Spectral Methods are firmly established tools to analyze spectra of signals and model their dynamics as solutions of certain Partial Differential Equations (PDEs). When spectral methods are applicable, they have excellent error properties and are the fastest possible in high dimension; see Press et al. (2007). In this paper we present an explicit numerical solution, using spectral methods, to the forward Kolmogorov equations for a Wright–Fisher process with migration of K populations, influx of mutations, and multiple population splitting events.

14.
We present a statistical framework for estimation and application of sample allele frequency spectra from New-Generation Sequencing (NGS) data. In this method, we first estimate the allele frequency spectrum using maximum likelihood. In contrast to previous methods, the likelihood function is calculated using a dynamic programming algorithm and numerically optimized using analytical derivatives. We then use a Bayesian method for estimating the sample allele frequency in a single site, and show how the method can be used for genotype calling and SNP calling. We also show how the method can be extended to various other cases including cases with deviations from Hardy-Weinberg equilibrium. We evaluate the statistical properties of the methods using simulations and by application to a real data set.

15.
Dispersal comprises a complex life-history syndrome that influences the demographic dynamics of especially those species that live in fragmented landscapes, the structure of which may in turn be expected to impose selection on dispersal. We have constructed an individual-based evolutionary sexual model of dispersal for species occurring as metapopulations in habitat patch networks. The model assumes correlated random walk dispersal with edge-mediated behaviour (habitat selection) and spatially correlated stochastic local dynamics. The model is parametrized with extensive data for the Glanville fritillary butterfly. Based on empirical results for a single nucleotide polymorphism (SNP) in the phosphoglucose isomerase (Pgi) gene, we assume that dispersal rate in the landscape matrix, fecundity and survival are affected by a locus with two alleles, A and C, individuals with the C allele being more mobile. The model was successfully tested with two independent empirical datasets on spatial variation in Pgi allele frequency. First, at the level of local populations, the frequency of the C allele is the highest in newly established isolated populations and the lowest in old isolated populations. Second, at the level of sub-networks with dissimilar numbers and connectivities of patches, the frequency of C increases with decreasing network size and hence with decreasing average metapopulation size. The frequency of C is the highest in landscapes where local extinction risk is high and where there are abundant opportunities to establish new populations. Our results indicate that the strength of the coupling of the ecological and evolutionary dynamics depends on the spatial scale and is asymmetric, demographic dynamics having a greater immediate impact on genetic dynamics than vice versa.

16.
Cytochrome P450 (CYP) superfamily members CYP2C8 and CYP2C9 are polymorphically expressed enzymes that are involved in the metabolic inactivation of several drugs, including, among others, antiepileptics, NSAIDs, oral hypoglycemics, and anticoagulants. Many of these drugs have a narrow therapeutic index, and growing evidence indicates a prominent role of CYP2C8 and CYP2C9 polymorphisms in the therapeutic efficacy and in the development of adverse effects among patients treated with drugs that are CYP2C8 or CYP2C9 substrates. In this review, we summarize present knowledge on human variability in the frequency of variant CYP2C8 and CYP2C9 alleles. Besides an expected interethnic variability in allele frequencies, a large intraethnic variability exists. Among Asian subjects, for example, statistically significant differences (p < 0.0001) in CYP2C9*3 allele frequencies between Chinese and Japanese individuals have been reported. In addition, individuals from East Asia present different allele frequencies for CYP2C9*2 and CYP2C9*3 compared with South Asian subjects (p < 0.0001). Among Caucasian Europeans, statistically significant differences for the frequency of CYP2C8*3, CYP2C9*2, and CYP2C9*3 exist (p < 0.0001). This indicates that Asian individuals or Caucasian European individuals cannot be considered as homogeneous groups regarding CYP2C8 or CYP2C9 allele frequencies. Caucasian American subjects also show a large variability in allele frequencies, which is likely to be related to ethnic ancestry. A higher frequency of variant CYP2C8 and CYP2C9 alleles is expected among Caucasian Americans with South European ancestry than in individuals with North European ancestry. The findings summarized in this review suggest that among individuals with Asian or European ancestry, intraethnic differences in the risk of developing adverse effects with drugs that are CYP2C8 or CYP2C9 substrates are to be expected. In addition, the observed intraethnic variability reinforces the need for proper selection of control subjects and points against the use of surrogate control groups for studies involving association of CYP2C8 or CYP2C9 alleles with adverse drug reactions or spontaneous diseases.

17.
We investigate the performance of tests of neutrality in admixed populations using plausible demographic models for African-American history as well as resequencing data from African and African-American populations. The analysis of both simulated and human resequencing data suggests that recent admixture does not result in an excess of false-positive results for neutrality tests based on the frequency spectrum after accounting for the population growth in the parental African population. Furthermore, when simulating positive selection, Tajima's D, Fu and Li's D, and haplotype homozygosity have lower power to detect population-specific selection using individuals sampled from the admixed population than from the nonadmixed population. Fay and Wu's H test, however, has more power to detect selection using individuals from the admixed population than from the nonadmixed population, especially when the selective sweep ended long ago. Our results have implications for interpreting recent genome-wide scans for positive selection in human populations.
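To show what a neutrality test "based on the frequency spectrum" computes, here is a minimal sketch of Tajima's D evaluated directly from an unfolded spectrum, using the standard constants from Tajima (1989). It is not the analysis pipeline of this study. The function name and input layout are assumptions made for this example.

```python
import math


def tajimas_d(sfs):
    """Tajima's D from an unfolded site frequency spectrum (sketch).

    sfs[i - 1] is the number of segregating sites at which i of the n
    sampled chromosomes carry the derived allele (i = 1 .. n - 1).
    """
    n = len(sfs) + 1
    num_segregating = sum(sfs)
    if n < 4 or num_segregating == 0:
        raise ValueError("need at least 4 chromosomes and one segregating site")

    # Mean pairwise differences: a site in class i differs in i * (n - i) pairs.
    num_pairs = n * (n - 1) / 2
    pi = sum(c * i * (n - i) for i, c in enumerate(sfs, start=1)) / num_pairs

    a1 = sum(1 / i for i in range(1, n))
    a2 = sum(1 / i ** 2 for i in range(1, n))
    b1 = (n + 1) / (3 * (n - 1))
    b2 = 2 * (n * n + n + 3) / (9 * n * (n - 1))
    c1 = b1 - 1 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1 ** 2
    e1 = c1 / a1
    e2 = c2 / (a1 ** 2 + a2)

    # Difference between two estimators of theta, scaled by its standard deviation.
    variance = e1 * num_segregating + e2 * num_segregating * (num_segregating - 1)
    return (pi - num_segregating / a1) / math.sqrt(variance)


# A spectrum proportional to 1/i (the neutral, constant-size expectation)
# gives D close to zero; an excess of singletons gives a negative D.
print(tajimas_d([12, 6, 4, 3]))
print(tajimas_d([10, 0, 0, 0]))
```

An excess of rare variants, as produced by population growth, pushes D negative, which is consistent with the abstract's point that growth in the parental population has to be accounted for before interpreting these tests.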

18.
Demographic factors such as migration rate and population size can impede or facilitate speciation. In hybrid zones, reproductive boundaries between species are tested and demography mediates the opportunity for admixture between lineages that are partially isolated. Genomic ancestry is a powerful tool for revealing the history of admixed populations, but models and methods based on local ancestry are rarely applied to structured hybrid zones. To understand the effects of demography on ancestry in hybrid zones, we performed individual-based simulations under a stepping-stone model, treating migration rate, deme size, and hybrid zone age as parameters. We find that the number of ancestry junctions (the transition points between genomic regions with different ancestries) and heterogenicity (the genomic proportion heterozygous for ancestry) are often closely connected to demographic history. Reducing deme size reduces junction number and heterogenicity. Elevating migration rate increases heterogenicity, but migration affects junction number in more complex ways. We highlight the junction frequency spectrum as a novel and informative summary of ancestry that responds to demographic history. A substantial proportion of junctions are expected to fix when migration is limited or deme size is small, changing the shape of the spectrum. Our findings suggest that genomic patterns of ancestry could be used to infer demographic history in hybrid zones.

19.
Erythropoietic protoporphyria (EPP) is an inherited disorder of heme biosynthesis that results from a partial deficiency of ferrochelatase (FECH). Recently, we have shown that the inheritance of the common hypomorphic IVS3-48C allele trans to a deleterious mutation reduces FECH activity to below a critical threshold and accounts for the photosensitivity seen in patients. Rare cases of autosomal recessive inheritance have been reported. We studied a cohort of 173 white French EPP families and a group of 360 unrelated healthy subjects from four ethnic groups. The prevalences of the recessive and dominant autosomal forms of EPP are 4% (95% confidence interval 1-8) and 95% (95% confidence interval 91-99), respectively. In 97.9% of dominant cases, an IVS3-48C allele is co-inherited with the deleterious mutation. The frequency of the IVS3-48C allele differs widely in the Japanese (43%), southeast Asian (31%), white French (11%), North African (2.7%), and black West African (<1%) populations. These differences can be related to the prevalence of EPP in these populations and could account for the absence of EPP in black subjects. The phylogenic origin of the IVS3-48C haplotypes strongly suggests that the IVS3-48C allele arose from a single recent mutational event. Estimation of the age of the IVS3-48C allele from haplotype data in white and Asian populations yields an estimated age three to four times younger in the Japanese than in the white population, and this difference may be attributable either to differing demographic histories or to positive selection for the IVS3-48C allele in the Asian population. Finally, by calculating the KA/KS ratio in humans and chimpanzees, we show that the FECH protein sequence is subject to strong negative pressure. Overall, EPP looks like a Mendelian disorder, in which the prevalence of overt disease depends mainly on the frequency of a single common single-nucleotide polymorphism resulting from a unique mutational event that occurred 60,000 years ago.

20.
Phenotypic divergences between modern human populations have developed as a result of genetic adaptation to local environments over the past 100,000 years. To identify genes involved in population-specific phenotypes, it is necessary to detect signatures of recent positive selection in the human genome. Although detection of elongated linkage disequilibrium (LD) has been a powerful tool in the field of evolutionary genetics, current LD-based approaches are not applicable to already fixed loci. Here, we report a method of scanning for population-specific strong selective sweeps that have reached fixation. In this method, genome-wide SNP data is used to analyze differences in the haplotype frequency, nucleotide diversity, and LD between populations, using the ratio of haplotype homozygosity between populations. To estimate the detection power of the statistics used in this study, we performed computer simulations and found that these tests are relatively robust against the density of typed SNPs and demographic parameters if the advantageous allele has reached fixation. Therefore, we could determine the threshold for maintaining high detection power, regardless of SNP density and demographic history. When this method was applied to the HapMap data, it was able to identify the candidates of population-specific strong selective sweeps more efficiently than the outlier approach that depends on the empirical distribution. This study, confirming strong positive selection on genes previously reported to be associated with specific phenotypes, also identifies other candidates that are likely to contribute to phenotypic differences between human populations.
