Similar articles
 Found 20 similar articles (search time: 31 ms)
1.
We consider the genome of a sample of n individuals taken at the end of a selective sweep, which is the fixation of an advantageous allele in the population. When the selective advantage is high, the genealogy at a locus under selective sweep can be approximated by a comb with n teeth. However, because of recombination during the selective sweep, the hitchhiking effect decreases as the distance from the selected site increases, so that far from this locus, the tree can be approximated by a Kingman coalescent tree, as in the neutral case. We first give the distribution of the tree at a given locus. Then we focus on the evolution of this tree along the genome. Since this tree-valued process is not Markovian, we study the evolution of the Ancestral Recombination Graph along the genome in the case of a selective sweep.

2.
Coop G  Ralph P 《Genetics》2012,192(1):205-224
Two major sources of stochasticity in the dynamics of neutral alleles result from resampling of finite populations (genetic drift) and the random genetic background of nearby selected alleles on which the neutral alleles are found (linked selection). There is now good evidence that linked selection plays an important role in shaping polymorphism levels in a number of species. One of the best-investigated models of linked selection is the recurrent full-sweep model, in which newly arisen selected alleles fix rapidly. However, the bulk of selected alleles that sweep into the population may not be destined for rapid fixation. Here we develop a general model of recurrent selective sweeps in a coalescent framework, one that generalizes the recurrent full-sweep model to the case where selected alleles do not sweep to fixation. We show that in a large population, only the initial rapid increase of a selected allele affects the genealogy at partially linked sites, which under fairly general assumptions are unaffected by the subsequent fate of the selected allele. We also apply the theory to a simple model to investigate the impact of recurrent partial sweeps on levels of neutral diversity and find that for a given reduction in diversity, the impact of recurrent partial sweeps on the frequency spectrum at neutral sites is determined primarily by the frequencies rapidly achieved by the selected alleles. Consequently, recurrent sweeps of selected alleles to low frequencies can have a profound effect on levels of diversity but can leave the frequency spectrum relatively unperturbed. In fact, the limiting coalescent model under a high rate of sweeps to low frequency is identical to the standard neutral model. The general model of selective sweeps we describe goes some way toward providing a more flexible framework to describe genomic patterns of diversity than is currently available.

3.
The Genealogy of Samples in Models with Selection   (cited 1 time: 0 self-citations, 1 by others)
C. Neuhauser  S. M. Krone 《Genetics》1997,145(2):519-534
We introduce the genealogy of a random sample of genes taken from a large haploid population that evolves according to random reproduction with selection and mutation. Without selection, the genealogy is described by Kingman's well-known coalescent process. In the selective case, the genealogy of the sample is embedded in a graph with a coalescing and branching structure. We describe this graph, called the ancestral selection graph, and point out differences and similarities with Kingman's coalescent. We present simulations for a two-allele model with symmetric mutation in which one of the alleles has a selective advantage over the other. We find that when the allele frequencies in the population are already in equilibrium, then the genealogy does not differ much from the neutral case. This is supported by rigorous results. Furthermore, we describe the ancestral selection graph for other selective models with finitely many selection classes, such as the K-allele models, infinitely-many-alleles models, DNA sequence models, and infinitely-many-sites models, and briefly discuss the diploid case.

4.
Jeremy J. Berg  Graham Coop 《Genetics》2015,201(2):707-725
The use of genetic polymorphism data to understand the dynamics of adaptation and identify the loci that are involved has become a major pursuit of modern evolutionary genetics. In addition to the classical “hard sweep” hitchhiking model, recent research has drawn attention to the fact that the dynamics of adaptation can play out in a variety of different ways and that the specific signatures left behind in population genetic data may depend somewhat strongly on these dynamics. One particular model for which a large number of empirical examples are already known is that in which a single derived mutation arises and drifts to some low frequency before an environmental change causes the allele to become beneficial and sweeps to fixation. Here, we pursue an analytical investigation of this model, bolstered and extended via simulation study. We use coalescent theory to develop an analytical approximation for the effect of a sweep from standing variation on the genealogy at the locus of the selected allele and sites tightly linked to it. We show that the distribution of haplotypes that the selected allele is present on at the time of the environmental change can be approximated by considering recombinant haplotypes as alleles in the infinite-alleles model. We show that this approximation can be leveraged to make accurate predictions regarding patterns of genetic polymorphism following such a sweep. We then use simulations to highlight which sources of haplotypic information are likely to be most useful in distinguishing this model from neutrality, as well as from other sweep models, such as the classic hard sweep and multiple-mutation soft sweeps. We find that in general, adaptation from a unique standing variant will likely be difficult to detect on the basis of genetic polymorphism data from a single population time point alone, and when it can be detected, it will be difficult to distinguish from other varieties of selective sweeps. Samples from multiple populations and/or time points have the potential to ease this difficulty.

5.
6.
Kim Y 《Genetics》2006,172(3):1967-1978
The allele frequency of a neutral variant in a population is pushed either upward or downward by directional selection on a linked beneficial mutation ("selective sweeps"). DNA sequences sampled after the fixation of the beneficial allele thus contain an excess of rare neutral alleles. This study investigates the allele frequency distribution under selective sweep models using analytic approximation and simulation. First, given a single selective sweep at a fixed time, I derive an expression for the sampling probabilities of neutral mutants. This solution can be used to estimate the time of the fixation of a beneficial allele from sequence data. Next, I obtain an approximation to mean allele frequencies under recurrent selective sweeps. Under recurrent sweeps, the frequency spectrum is skewed toward rare alleles. However, the excess of high-frequency derived alleles, previously shown to be a signature of single selective sweeps, disappears with recurrent sweeps. It is shown that, using this approximation and multilocus polymorphism data, genomewide parameters of directional selection can be estimated.

7.
The coalescent with recombination is a fundamental model to describe the genealogical history of DNA sequence samples from recombining organisms. Considering recombination as a process which acts along genomes and which creates sequence segments with shared ancestry, we study the influence of single recombination events upon tree characteristics of the coalescent. We focus on properties such as tree height and tree balance and quantify analytically the changes in these quantities incurred by recombination in terms of probability distributions. We find that changes in tree topology are often relatively mild under conditions of neutral evolution, while changes in tree height are on average quite large. Our results add to a quantitative understanding of the spatial coalescent and provide the neutral reference to which the impact by other evolutionary scenarios, for instance tree distortion by selective sweeps, can be compared.

8.
M H Schierup  A M Mikkelsen  J Hein 《Genetics》2001,159(4):1833-1844
Using a coalescent model of multiallelic balancing selection with recombination, the genealogical process as a function of recombinational distance from a site under selection is investigated. We find that the shape of the phylogenetic tree is independent of the distance to the site under selection. Only the timescale changes from the value predicted by Takahata's allelic genealogy at the site under selection, converging with increasing recombination to the timescale of the neutral coalescent. However, if nucleotide sequences are simulated over a recombining region containing a site under balancing selection, a phylogenetic tree constructed while ignoring such recombination is strongly affected. This is true even for small rates of recombination. Published studies of multiallelic balancing selection, i.e., the major histocompatibility complex (MHC) of vertebrates, gametophytic and sporophytic self-incompatibility of plants, and incompatibility of fungi, all observe allelic genealogies with unexpected shapes. We conclude that small absolute levels of recombination are compatible with these observed distortions of the shape of the allelic genealogy, suggesting a possible cause of these observations. Furthermore, we illustrate that the variance in the coalescent with recombination process makes it difficult to locate sites under selection and to estimate the selection coefficient from levels of variability.

9.
Zeng K  Charlesworth B 《Genetics》2011,189(1):251-266
Background selection, the effects of the continual removal of deleterious mutations by natural selection on variability at linked sites, is potentially a major determinant of DNA sequence variability. However, the joint effects of background selection and genetic recombination on the shape of the neutral gene genealogy have proved hard to study analytically. The only existing formula concerns the mean coalescent time for a pair of alleles, making it difficult to assess the importance of background selection from genome-wide data on sequence polymorphism. Here we develop a structured coalescent model of background selection with recombination and implement it in a computer program that efficiently generates neutral gene genealogies for an arbitrary sample size. We check the validity of the structured coalescent model against forward-in-time simulations and show that it accurately captures the effects of background selection. The model produces more accurate predictions of the mean coalescent time than the existing formula and supports the conclusion that the effect of background selection is greater in the interior of a deleterious region than at its boundaries. The level of linkage disequilibrium between sites is elevated by background selection, to an extent that is well summarized by a change in effective population size. The structured coalescent model is readily extendable to more realistic situations and should prove useful for analyzing genome-wide polymorphism data.

10.
The fixation of advantageous mutations in a population has the effect of reducing variation in the DNA sequence near that mutation. Kaplan et al. (1989) used a three-phase simulation model to study the effect of selective sweeps on genealogies. However, most subsequent work has simplified their approach by assuming that the number of individuals with the advantageous allele follows the logistic differential equation. We show that the impact of a selective sweep can be accurately approximated by a random partition created by a stick-breaking process. Our simulation results show that ignoring the randomness when the number of individuals with the advantageous allele is small can lead to substantial errors.

11.
We suggest a simple deterministic approximation for the growth of the favored-allele frequency during a selective sweep. Using this approximation we introduce an accurate model for genetic hitchhiking. Only when Ns<10 (N is the population size and s denotes the selection coefficient) are discrepancies between our approximation and direct numerical simulations of a Moran model notable. Our model describes the gene genealogies of a contiguous segment of neutral loci close to the selected one, and it does not assume that the selective sweep happens instantaneously. This enables us to compute SNP distributions on the neutral segment without bias.
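
As a rough illustration of what a deterministic sweep trajectory looks like, the sketch below uses the textbook logistic equation dx/dt = s x (1 - x). This is an assumption made for illustration; the abstract does not state the authors' approximation, and it may well differ. The function names and the choice of starting frequency 1/N are hypothetical.

```python
import math


def logistic_frequency(t, s, x0):
    """Favored-allele frequency at time t under dx/dt = s*x*(1-x).

    Assumption: plain logistic growth, not necessarily the
    approximation proposed in the abstract above.
    """
    return x0 / (x0 + (1.0 - x0) * math.exp(-s * t))


def sweep_duration(s, x0, x_end):
    """Time for the logistic trajectory to rise from x0 to x_end."""
    return math.log(x_end * (1.0 - x0) / (x0 * (1.0 - x_end))) / s


if __name__ == "__main__":
    N, s = 10_000, 0.01          # hypothetical values, so that Ns = 100
    x0, x_end = 1.0 / N, 1.0 - 1.0 / N
    print("sweep duration:", sweep_duration(s, x0, x_end))
    for t in (0, 500, 1000, 1500):
        print(t, logistic_frequency(t, s, x0))
```

With a start at 1/N and an end at 1 - 1/N, the duration reduces to (2/s) ln(N - 1), which is the usual order-of-magnitude estimate for the length of a strong sweep.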

12.
The ratio of singletons to the total number of segregating sites is used to estimate a reproduction parameter in a population model of large offspring numbers without having to jointly estimate the mutation rate. For neutral genetic variation, the ratio of singletons to the total number of segregating sites is equivalent to the ratio of total length of external branches to the total length of the gene genealogy. A multinomial maximum likelihood method that takes into account more frequency classes than just the singletons is developed to estimate the parameter of another large offspring number model. The performance of these methods with regard to sample size, mutation rate, and bias, is investigated by simulation. The expected value of the ratio of the total length of external branches to the total length of the whole tree is, using simulation, shown to decrease for the Kingman coalescent as sample size increases, but can increase or decrease, depending on parameter values, for Λ coalescents. Considering ratios of tree statistics, as opposed to considering lengths of various subtrees separately, can yield better insight into the dynamics of gene genealogies.
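
The Kingman part of this statement can be checked with a few lines of simulation. The following is a minimal sketch, not the authors' code: it simulates only the standard Kingman coalescent (no Λ coalescent), measures time in coalescent units where each pair of lineages merges at rate 1, and uses hypothetical function names.

```python
import random


def external_to_total_ratio(n, rng):
    """One Kingman tree for n samples; return external length / total length."""
    # True marks a lineage that is still an original sample (an external branch).
    is_external = [True] * n
    external = 0.0
    total = 0.0
    while len(is_external) > 1:
        k = len(is_external)
        # Waiting time while k lineages remain: Exp(rate = k choose 2).
        t = rng.expovariate(k * (k - 1) / 2.0)
        total += k * t
        external += sum(is_external) * t
        # Merge two lineages chosen uniformly at random.
        i, j = rng.sample(range(k), 2)
        for idx in sorted((i, j), reverse=True):
            is_external.pop(idx)
        is_external.append(False)  # the merged lineage is internal
    return external / total


def mean_ratio(n, replicates, seed=1):
    """Monte Carlo estimate of E[external length / total length]."""
    rng = random.Random(seed)
    return sum(external_to_total_ratio(n, rng) for _ in range(replicates)) / replicates


if __name__ == "__main__":
    for n in (2, 5, 20, 50):
        print(n, round(mean_ratio(n, 2000), 3))
```

For n = 2 the ratio is exactly 1, since both branches are external; for larger n the estimate should fall, in line with the abstract.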

13.
This paper concerns the genealogical structure of a sample of chromosomes sharing a neutral rare allele. We suppose that the mutation giving rise to the allele has only happened once in the history of the entire population, and that the allele is of known frequency q in the population. Within a coalescent framework C. Wiuf and P. Donnelly (1999, Theor. Popul. Biol. 56, 183-201) derived an exact analysis of the conditional genealogy but it is inconvenient for applications. Here, we develop an approximation to the exact distribution of the conditional genealogy, including an approximation to the distribution of the time at which the mutation arose. The approximations are accurate for frequencies q < 5-10%. In addition, a simple and fast simulation scheme is constructed. We consider a demography parameterized by a d-dimensional vector α = (α_1, ..., α_d). It is shown that the conditional genealogy and the age of the mutation have distributions that depend on a = qα and q only, and that the effect of q is a linear scaling of times in the genealogy; if q is doubled, the lengths of all branches in the genealogy are doubled. The theory is exemplified in two different demographies of some interest in the study of human evolution: (1) a population of constant size and (2) a population of exponentially decreasing size (going backward in time).

14.
In this paper we consider the genealogy of two nested mutant alleles, assuming the constant-size neutral coalescent model with infinite sites mutation. We study the conditional genealogy and derive explicit formulas for the joint and marginal site frequency spectra for the double, single and zero mutant allele. In addition, we find the mean ages of the two mutations. We show that the age of the youngest mutation does not depend on the frequency of the single mutant allele and that the frequency spectra for the single mutant allele and the zero mutant allele are the same.

15.
The Kingman coalescent, which has become the foundation for a wide range of theoretical as well as empirical studies, was derived as an approximation of the Wright-Fisher (WF) model. The approximation heavily relies on the assumption that population size is large and sample size is much smaller than the population size. Whether the sample size is too large compared to the population size is rarely questioned in practice when applying statistical methods based on the Kingman coalescent. Since the WF model is the most widely used population genetics model for reproduction, it is desirable to develop a coalescent framework for the WF model, which can be used whenever there are concerns about the accuracy of the Kingman coalescent as an approximation. This paper describes the exact coalescent theory for the WF model and develops a simulation algorithm, which is then used, together with an analytical approach, to study the properties of the exact coalescent as well as its differences from the Kingman coalescent. We show that the Kingman coalescent differs from the exact coalescent by: (1) shorter waiting time between successive coalescent events; (2) different probability of observing a topological relationship among sequences in a sample; and (3) slightly smaller tree length in the genealogy of a large sample. On the other hand, there is little difference in the age of the most recent common ancestor (MRCA) of the sample. The exact coalescent makes up the longer waiting time between successive coalescent events by having multiple coalescences at the same time. The most significant difference among various summary statistics of a coalescent examined is the sum of lengths of external branches, which can be more than 10% larger for the exact coalescent than for the Kingman coalescent. As a whole, the Kingman coalescent is a remarkably accurate approximation to the exact coalescent for sample and population sizes falling considerably outside the region that was originally anticipated.
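
The comparison of MRCA ages can be sketched with a naive simulation. The code below is not the algorithm developed in the paper; it is a brute-force illustration under the assumption of a haploid WF population of N individuals, in which every lineage picks its parent uniformly at random each generation. The function names are hypothetical.

```python
import random


def wf_tmrca(n, N, rng):
    """Generations back to the MRCA of n lineages in a haploid WF population.

    Each generation every lineage picks a parent uniformly among N, so
    several lineages may merge at once (multiple coalescences).
    """
    k, generations = n, 0
    while k > 1:
        k = len({rng.randrange(N) for _ in range(k)})
        generations += 1
    return generations


def kingman_tmrca(n, N, rng):
    """Kingman TMRCA in generations: Exp waiting times at rate k(k-1)/(2N)."""
    return sum(rng.expovariate(k * (k - 1) / (2.0 * N)) for k in range(2, n + 1))


def mean(values):
    values = list(values)
    return sum(values) / len(values)


if __name__ == "__main__":
    rng = random.Random(42)
    n, N, reps = 10, 100, 2000   # hypothetical sizes
    print("exact WF:", mean(wf_tmrca(n, N, rng) for _ in range(reps)))
    print("Kingman :", mean(kingman_tmrca(n, N, rng) for _ in range(reps)))
    print("Kingman expectation:", 2 * N * (1 - 1 / n))
```

Under the Kingman coalescent the expected TMRCA is 2N(1 - 1/n) generations in this haploid parameterization, which gives a reference value for both estimates.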

16.
The structure of linkage disequilibrium around a selective sweep   (cited 1 time: 0 self-citations, 1 by others)
McVean G 《Genetics》2007,175(3):1395-1406
The fixation of advantageous mutations by natural selection has a profound impact on patterns of linked neutral variation. While it has long been appreciated that such selective sweeps influence the frequency spectrum of nearby polymorphism, it has only recently become clear that they also have dramatic effects on local linkage disequilibrium. By extending previous results on the relationship between genealogical structure and linkage disequilibrium, I obtain simple expressions for the influence of a selective sweep on patterns of allelic association. I show that sweeps can increase, decrease, or even eliminate linkage disequilibrium (LD) entirely depending on the relative position of the selected and neutral loci. I also show the importance of the age of the neutral mutations in predicting their degree of association and describe the consequences of such results for the interpretation of empirical data. In particular, I demonstrate that while selective sweeps can eliminate LD, they generate patterns of genetic variation very different from those expected from recombination hotspots.

17.
Selective sweeps of variation caused by fixation of major genes may have a dramatic impact on the genetic gain from background polygenic variation, particularly in the genome regions closely linked to the major gene. The response to selection can be restrained because of the reduced selection intensity and the reduced effective population size caused by the increase in frequency of the major gene. In the context of a selected population where fixation of a known major gene is desired, the question arises as to which is the optimal path of increase in frequency of the gene so that the selective sweep of variation resulting from its fixation is minimized. Using basic theoretical arguments we propose a frequency path that maximizes simultaneously the effective population size applicable to the selected background and the selection intensity on the polygenic variation by minimizing the average squared selection intensity on the major gene over generations up to a given fixation time. We also propose the use of mating between carriers and non-carriers of the major gene, in order to promote the effective recombination between the major gene and its linked polygenic background. Using a locus-based computer simulation assuming different degrees of linkage, we show that the path proposed is more effective than a similar path recently published, and that the combination of the selection and mating methods provides an efficient way to palliate the negative effects of a selective sweep.

18.
Genomic survey data now permit an unprecedented level of sensitivity in the detection of departures from canonical evolutionary models, including expansions in population size and selective sweeps. Here, we examine the effects of seemingly subtle differences among sampling distributions on goodness of fit analyses of site frequency spectra constructed from single nucleotide polymorphisms. Conditioning on the observation of exactly two alleles in a random sample results in a site frequency spectrum that is independent of the scaled rate of neutral substitution (θ). Other sampling distributions, including conditioning on a single mutational event in the sample genealogy or randomly selecting a single mutation from a genealogy with multiple mutations, have distinct site frequency spectra that show highly significant departures from the predictions of the biallelic model. Some aspects of data filtering may contribute to significant departures of site frequency spectra from expectation, apart from any violation of the standard neutral model.

19.
Molecular sequences obtained at different sampling times from populations of rapidly evolving pathogens and from ancient subfossil and fossil sources are increasingly available with modern sequencing technology. Here, we present a Bayesian statistical inference approach to the joint estimation of mutation rate and population size that incorporates the uncertainty in the genealogy of such temporally spaced sequences by using Markov chain Monte Carlo (MCMC) integration. The Kingman coalescent model is used to describe the time structure of the ancestral tree. We recover information about the unknown true ancestral coalescent tree, population size, and the overall mutation rate from temporally spaced data, that is, from nucleotide sequences gathered at different times, from different individuals, in an evolving haploid population. We briefly discuss the methodological implications and show what can be inferred, in various practically relevant states of prior knowledge. We develop extensions for exponentially growing population size and joint estimation of substitution model parameters. We illustrate some of the important features of this approach on a genealogy of HIV-1 envelope (env) partial sequences.

20.
Natural populations are structured spatially into local populations and genetically into diverse 'genetic backgrounds' defined by different combinations of selected alleles. If selection maintains genetic backgrounds at constant frequency then neutral diversity is enhanced. By contrast, if background frequencies fluctuate then diversity is reduced. Provided that the population size of each background is large enough, these effects can be described by the structured coalescent process. Almost all the extant results based on the coalescent deal with a single selected locus. Yet we know that very large numbers of genes are under selection and that any substantial effects are likely to be due to the cumulative effects of many loci. Here, we set up a general framework for the extension of the coalescent to multilocus scenarios and we use it to study the simplest model, where strong balancing selection acting on a set of n loci maintains 2^n backgrounds at constant frequencies and at linkage equilibrium. Analytical results show that the expected linked neutral diversity increases exponentially with the number of selected loci and can become extremely large. However, simulation results reveal that the structured coalescent approach breaks down when the number of backgrounds approaches the population size, because of stochastic fluctuations in background frequencies. A new method is needed to extend the structured coalescent to cases with large numbers of backgrounds.
