Similar literature
20 similar documents found.
1.
We investigate the expected coalescent in populations growing exponentially. The distribution of expected times to coalescence events may show a linear relationship with the number of ancestral lineages, when the latter is subjected to the "epidemic transformation". However, in a number of viral populations, upward curves are created when the epidemically transformed number of ancestral lineages is plotted against time. We consider possible causes of such upward curves. These include the possibility that a curved line is created through a transformation failure due to a sample size that is too large. We suggest a new formula for predicting such failure. The second cause is a population size increasing at an accelerating rate. However, the combination of recent coalescent events and an upward curve is created by an accelerating population increase only under restricted conditions. Specifically, such a pattern is expected only when, were population growth not to have accelerated, the transformation would have failed anyway. The third cause of nonlinearity arises in the estimated coalescent, as distinct from the real coalescent, if the mutation rate is small. However, coalescence times estimated from data typically give a straight line following epidemic transformation, but the rate of exponential increase, or r value, will be underestimated.
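
The coalescent under exponential growth referred to in this abstract can be sketched with the standard time-rescaling argument. This is a minimal illustration, not the authors' method: it assumes a population of size N(t) = n0 * exp(-r * t) going backward in time, a pairwise coalescence rate of 1/N(t), and time measured in generations. The function names and parameters (`n0`, `r`, `seed`) are hypothetical, and the "epidemic transformation" itself is not implemented here.

```python
import math
import random


def next_coalescence_time(t0, k, n0, r, rng):
    """Time (backward from the present) of the next coalescence among k lineages.

    Assumes N(t) = n0 * exp(-r * t) backward in time, so the pairwise
    coalescence rate at time t is exp(r * t) / n0 (an assumption of this sketch).
    """
    pairs = k * (k - 1) / 2
    # Waiting time on the constant-size scale (cumulative coalescent intensity).
    x = rng.expovariate(pairs)
    # Invert Lambda(t) - Lambda(t0) = x, with Lambda(t) = (exp(r*t) - 1) / (r*n0).
    return math.log(math.exp(r * t0) + r * n0 * x) / r


def simulate_coalescence_times(n, n0, r, seed=0):
    """Return the n - 1 successive coalescence times for a sample of size n."""
    rng = random.Random(seed)
    t, times = 0.0, []
    for k in range(n, 1, -1):
        t = next_coalescence_time(t, k, n0, r, rng)
        times.append(t)
    return times
```

Under these assumptions, `simulate_coalescence_times(20, 1e4, 0.05)` yields one realization of the coalescence times; averaging many realizations would approximate the expected times that the abstract discusses.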

2.
Quantifying epidemiological dynamics is crucial for understanding and forecasting the spread of an epidemic. The coalescent and the birth-death model are used interchangeably to infer epidemiological parameters from the genealogical relationships of the pathogen population under study, which in turn are inferred from the pathogen genetic sequencing data. To compare the performance of these widely applied models, we performed a simulation study. We simulated phylogenetic trees under the constant rate birth-death model and the coalescent model with a deterministic exponentially growing infected population. For each tree, we re-estimated the epidemiological parameters using both a birth-death and a coalescent based method, implemented as an MCMC procedure in BEAST v2.0. In our analyses that estimate the growth rate of an epidemic based on simulated birth-death trees, the point estimates such as the maximum a posteriori/maximum likelihood estimates are not very different. However, the estimates of uncertainty are very different. The birth-death model had a higher coverage than the coalescent model, i.e. contained the true value in the highest posterior density (HPD) interval more often (2–13% vs. 31–75% error). The coverage of the coalescent decreases with decreasing basic reproductive ratio and increasing sampling probability of infecteds. We hypothesize that the biases in the coalescent are due to the assumption of deterministic rather than stochastic population size changes. Both methods performed reasonably well when analyzing trees simulated under the coalescent. The methods can also identify other key epidemiological parameters as long as one of the parameters is fixed to its true value. 
In summary, when using genetic data to estimate epidemic dynamics, our results suggest that the birth-death method will be less sensitive to population fluctuations of early outbreaks than the coalescent method that assumes a deterministic exponentially growing infected population.

3.
Natural populations are structured spatially into local populations and genetically into diverse 'genetic backgrounds' defined by different combinations of selected alleles. If selection maintains genetic backgrounds at constant frequency then neutral diversity is enhanced. By contrast, if background frequencies fluctuate then diversity is reduced. Provided that the population size of each background is large enough, these effects can be described by the structured coalescent process. Almost all the extant results based on the coalescent deal with a single selected locus. Yet we know that very large numbers of genes are under selection and that any substantial effects are likely to be due to the cumulative effects of many loci. Here, we set up a general framework for the extension of the coalescent to multilocus scenarios and we use it to study the simplest model, where strong balancing selection acting on a set of \(n\) loci maintains \(2^n\) backgrounds at constant frequencies and at linkage equilibrium. Analytical results show that the expected linked neutral diversity increases exponentially with the number of selected loci and can become extremely large. However, simulation results reveal that the structured coalescent approach breaks down when the number of backgrounds approaches the population size, because of stochastic fluctuations in background frequencies. A new method is needed to extend the structured coalescent to cases with large numbers of backgrounds.

4.
Hubbarde JE  Wild G  Wahl LM 《Genetics》2007,177(3):1703-1712
Estimating the fixation probability of a beneficial mutation has a rich history in theoretical population genetics. Typically, to attain mathematical tractability, we assume that generation times are fixed, while the number of offspring per individual is stochastic. However, fixation probabilities are extremely sensitive to these assumptions regarding life history. In this article, we compute the fixation probability for a "burst-death" life-history model. The model assumes that generation times are exponentially distributed, but the number of offspring per individual is constant. We estimate the fixation probability for populations of constant size and for populations that grow exponentially between periodic population bottlenecks. We find that the fixation probability is, in general, substantially lower in the burst-death model than in classical models. We also note striking qualitative differences between the fates of beneficial mutations that increase burst size and mutations that increase the burst rate. In particular, once the burst size is sufficiently large relative to the wild type, the burst-death model predicts that fixation probability depends only on burst rate.

5.
Current human sequencing projects observe an abundance of extremely rare genetic variation, suggesting recent acceleration of population growth. To better understand the impact of such accelerating growth on the quantity and nature of genetic variation, we present a new class of models capable of incorporating faster than exponential growth in a coalescent framework. Our work shows that such accelerated growth affects only the population size in the recent past and thus large samples are required to detect the models’ effects on patterns of variation. When we compare models with fixed initial growth rate, models with accelerating growth achieve very large current population sizes and large samples from these populations contain more variation than samples from populations with constant growth. This increase is driven almost entirely by an increase in singleton variation. Moreover, linkage disequilibrium decays faster in populations with accelerating growth. When we instead condition on current population size, models with accelerating growth result in less overall variation and slower linkage disequilibrium decay compared to models with exponential growth. We also find that pairwise linkage disequilibrium of very rare variants contains information about growth rates in the recent past. Finally, we demonstrate that models of accelerating growth may substantially change estimates of present-day effective population sizes and growth times.

6.
This paper is concerned with the structure of the genealogy of a sample in which it is observed that some subset of chromosomes carries a particular mutation, assumed to have arisen uniquely in the history of the population. A rigorous theoretical study of this conditional genealogy is given using coalescent methods. Particular results include the mean, variance, and density of the age of the mutation conditional on its frequency in the sample. Most of the development relates to populations of constant size, but we discuss the extension to populations which have grown exponentially to their present size.

7.
Gene genealogies in a metapopulation   Cited by: 1 (self-citations: 0; citations by others: 1)
Wakeley J  Aliacar N 《Genetics》2001,159(2):893-905
A simple genealogical process is found for samples from a metapopulation, which is a population that is subdivided into a large number of demes, each of which is subject to extinction and recolonization and receives migrants from other demes. As in the migration-only models studied previously, the genealogy of any sample includes two phases: a brief sample-size adjustment followed by a coalescent process that dominates the history. This result will hold for metapopulations that are composed of a large number of demes. It is robust to the details of population structure, as long as the number of possible source demes of migrants and colonists for each deme is large. Analytic predictions about levels of genetic variation are possible, and results for average numbers of pairwise differences within and between demes are given. Further analysis of the expected number of segregating sites in a sample from a single deme illustrates some previously known differences between migration and extinction/recolonization. The ancestral process is also amenable to computer simulation. Simulation results show that migration and extinction/recolonization have very different effects on the site-frequency distribution in a sample from a single deme. Migration can cause a U-shaped site-frequency distribution, which is qualitatively similar to the pattern reported recently for positive selection. Extinction and recolonization, in contrast, can produce a mode in the site-frequency distribution at intermediate frequencies, even in a sample from a single deme.

8.
Several factors including demographic changes, selection, and recombination are known to affect the distribution of the number of pairwise differences between DNA sequences. The effects of each of these forces have previously been used to estimate population parameter values using various assumptions about other factors. In this article, we use the predictions of the mismatch distribution under a standard neutral equilibrium model to design a coalescent simulation-based test and detect any deviation from this equilibrium. When reliable independent estimates are available for the intragenic recombination rate, this test can be used as a neutrality test or a population expansion test in actual studies, under reasonable assumptions.

9.
Do large populations always outcompete smaller ones? Does increasing the mutation rate have a similar effect to increasing the population size, with respect to the adaptation of a population? How important are substitutions in determining the adaptation rate? In this study, we ask how population size and mutation rate interact to affect adaptation on empirical adaptive landscapes. Using such landscapes, we do not need to make many ad hoc assumptions about landscape topography, such as about epistatic interactions among mutations or about the distribution of fitness effects. Moreover, we have a better understanding of all the mutations that occur in a population and their effects on the average fitness of the population than we can have in experimental studies. Our results show that the evolutionary dynamics of a population cannot be fully explained by the population mutation rate \(N\mu\); even at constant \(N\mu\), there can be dramatic differences in the adaptation of populations of different sizes. Moreover, the substitution rate of mutations is not always equivalent to the adaptation rate, because we observed populations adapting to high adaptive peaks without fixing any mutations. Finally, in contrast to some theoretical predictions, even on the most rugged landscapes we study, small population size is never an advantage over larger population size. These results show that complex interactions among multiple factors can affect the evolutionary dynamics of populations, and simple models should be taken with caution.

10.
Detecting population expansion and decline using microsatellites   Cited by: 15 (self-citations: 0; citations by others: 15)
Beaumont MA 《Genetics》1999,153(4):2013-2029
This article considers a demographic model where a population varies in size either linearly or exponentially. The genealogical history of microsatellite data sampled from this population can be described using coalescent theory. A method is presented whereby the posterior probability distribution of the genealogical and demographic parameters can be estimated using Markov chain Monte Carlo simulations. The likelihood surface for the demographic parameters is complicated and its general features are described. The method is then applied to published microsatellite data from two populations. Data from the northern hairy-nosed wombat show strong evidence of decline. Data from European humans show weak evidence of expansion.

11.
Ancient demographic events can be inferred from the distribution of pairwise sequence differences (or mismatches) among individuals. We analyzed a database of 3,677 Y chromosomes typed for 11 biallelic markers in 48 human populations from Europe and the Mediterranean area. Contrary to what is observed in the analysis of mitochondrial polymorphisms, Tajima's test was insignificant for most Y-chromosome samples, and in 47 populations the mismatch distributions had multiple peaks. Taken at face value, these results would suggest either (1) that the size of the male population stayed essentially constant over time, while the female population size increased, or (2) that different selective regimes have shaped mitochondrial and Y-chromosome diversity, leading to an excess of rare alleles only in the mitochondrial genome. An alternative explanation would be that the 11 variable sites of the Y chromosome do not provide sufficient statistical power, so a comparison with mitochondrial data (where more than 200 variable sites are studied in Europe) is impossible at present. To discriminate between these possibilities, we repeatedly analyzed a European mitochondrial database, each time considering only 11 variable sites, and we estimated mismatch distributions in stable and growing populations, generated by simulating coalescent processes. Along with theoretical considerations, these tests suggest that the differences between the mismatch distributions inferred from mitochondrial and Y-chromosome data are not a statistical artifact. Therefore, the observed mismatch distributions appear to reflect different underlying demographic histories and/or selective pressures for maternally and paternally transmitted loci.
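
The mismatch distribution mentioned in this abstract is simply a tally of how many sequence pairs differ at 0, 1, 2, ... sites. The following is a minimal sketch, assuming the sequences are already aligned and of equal length; the function name `mismatch_distribution` is hypothetical and is not taken from the study.

```python
from collections import Counter
from itertools import combinations


def mismatch_distribution(sequences):
    """Count sequence pairs by their number of differing sites."""
    if len({len(s) for s in sequences}) > 1:
        raise ValueError("sequences must be aligned to the same length")
    counts = Counter()
    for a, b in combinations(sequences, 2):
        # Number of positions at which the two sequences differ.
        counts[sum(x != y for x, y in zip(a, b))] += 1
    return dict(counts)
```

For the three toy sequences `["AAAA", "AAAT", "AATT"]`, two pairs differ at one site and one pair differs at two sites. A histogram with a single smooth peak is the pattern usually associated with population growth, whereas a ragged, multi-peaked histogram is what the abstract reports for the Y-chromosome samples.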

12.
Two sequentially Markov coalescent models (SMC and SMC′) are available as tractable approximations to the ancestral recombination graph (ARG). We present a Markov process describing coalescence at two fixed points along a pair of sequences evolving under the SMC′. Using our Markov process, we derive a number of new quantities related to the pairwise SMC′, thereby analytically quantifying for the first time the similarity between the SMC′ and the ARG. We use our process to show that the joint distribution of pairwise coalescence times at recombination sites under the SMC′ is the same as it is marginally under the ARG, which demonstrates that the SMC′ is, in a particular well-defined, intuitive sense, the most appropriate first-order sequentially Markov approximation to the ARG. Finally, we use these results to show that population size estimates under the pairwise SMC are asymptotically biased, while under the pairwise SMC′ they are approximately asymptotically unbiased.

13.
The Kingman coalescent, which has become the foundation for a wide range of theoretical as well as empirical studies, was derived as an approximation of the Wright-Fisher (WF) model. The approximation relies heavily on the assumption that population size is large and sample size is much smaller than the population size. Whether the sample size is too large compared to the population size is rarely questioned in practice when applying statistical methods based on the Kingman coalescent. Since the WF model is the most widely used population genetics model for reproduction, it is desirable to develop a coalescent framework for the WF model, which can be used whenever there are concerns about the accuracy of the Kingman coalescent as an approximation. This paper describes the exact coalescent theory for the WF model and develops a simulation algorithm, which is then used, together with an analytical approach, to study the properties of the exact coalescent as well as its differences from the Kingman coalescent. We show that the Kingman coalescent differs from the exact coalescent by: (1) shorter waiting time between successive coalescent events; (2) different probability of observing a topological relationship among sequences in a sample; and (3) slightly smaller tree length in the genealogy of a large sample. On the other hand, there is little difference in the age of the most recent common ancestor (MRCA) of the sample. The exact coalescent compensates for the longer waiting time between successive coalescent events by allowing multiple coalescences at the same time. The most significant difference among the various summary statistics of a coalescent examined is the sum of lengths of external branches, which can be more than 10% larger for the exact coalescent than for the Kingman coalescent.
As a whole, the Kingman coalescent is a remarkably accurate approximation to the exact coalescent for sample and population sizes falling considerably outside the region that was originally anticipated.
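
For reference, the Kingman quantities that this abstract compares against (waiting times, MRCA age, tree length) have simple closed-form expectations. The sketch below assumes the standard Kingman scaling, with time in units of N generations for a haploid population of size N; the function names are hypothetical, and the exact WF coalescent described in the paper is not implemented here.

```python
def expected_waiting_times(n):
    """Expected time during which exactly k lineages exist, for k = n, ..., 2.

    Under the Kingman coalescent the waiting time with k lineages is
    exponential with rate k * (k - 1) / 2 (time in units of N generations).
    """
    return {k: 2.0 / (k * (k - 1)) for k in range(n, 1, -1)}


def expected_tmrca(n):
    """Expected age of the MRCA; equals 2 * (1 - 1/n)."""
    return sum(expected_waiting_times(n).values())


def expected_tree_length(n):
    """Expected total branch length; equals 2 * sum_{i=1}^{n-1} 1/i."""
    return sum(k * t for k, t in expected_waiting_times(n).items())
```

Because the expected MRCA age approaches 2 as n grows, most of the tree height is contributed by the last few lineages, which is consistent with the abstract's observation that the MRCA age is barely affected by how the many early coalescences are modeled.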

14.
The serial coalescent extends traditional coalescent theory to include genealogies in which not all individuals were sampled at the same time. Inference in this framework is powerful because population size and evolutionary rate may be estimated independently. However, when the sequences in question are affected by selection acting at many sites, the genealogies may differ significantly from their neutral expectation, and inference of demographic parameters may become inaccurate. I demonstrate that this inaccuracy is severe when the mutation rate and strength of selection are jointly large, and I develop a new likelihood calculation that, while approximate, improves the accuracy of population size estimates. When used in a Bayesian parameter estimation context, the new calculation allows for estimation of the shape of the pairwise coalescent rate function and can be used to detect the presence of selection acting at many sites in a sequence. Using the new method, I investigate two sets of dengue virus sequences from Puerto Rico and Thailand, and show that both genealogies are likely to have been distorted by selection.

15.
This paper concerns the genealogical structure of a sample of chromosomes sharing a neutral rare allele. We suppose that the mutation giving rise to the allele has only happened once in the history of the entire population, and that the allele is of known frequency \(q\) in the population. Within a coalescent framework C. Wiuf and P. Donnelly (1999, Theor. Popul. Biol. 56, 183-201) derived an exact analysis of the conditional genealogy but it is inconvenient for applications. Here, we develop an approximation to the exact distribution of the conditional genealogy, including an approximation to the distribution of the time at which the mutation arose. The approximations are accurate for frequencies \(q < 5\text{–}10\%\). In addition, a simple and fast simulation scheme is constructed. We consider a demography parameterized by a d-dimensional vector \(\alpha = (\alpha_1, \ldots, \alpha_d)\). It is shown that the conditional genealogy and the age of the mutation have distributions that depend on \(a = q\alpha\) and \(q\) only, and that the effect of \(q\) is a linear scaling of times in the genealogy; if \(q\) is doubled, the lengths of all branches in the genealogy are doubled. The theory is exemplified in two different demographies of some interest in the study of human evolution: (1) a population of constant size and (2) a population of exponentially decreasing size (going backward in time).

16.
We analyze patterns of genetic variability of populations in the presence of a large seedbank with the help of a new coalescent structure called the seedbank coalescent. This ancestral process appears naturally as a scaling limit of the genealogy of large populations that sustain seedbanks, if the seedbank size and individual dormancy times are of the same order as those of the active population. Mutations appear as Poisson processes on the active lineages and potentially at reduced rate also on the dormant lineages. The presence of “dormant” lineages leads to qualitatively altered times to the most recent common ancestor and nonclassical patterns of genetic diversity. To illustrate this we provide a Wright–Fisher model with a seedbank component and mutation, motivated by recent models of microbial dormancy, whose genealogy can be described by the seedbank coalescent. Based on our coalescent model, we derive recursions for the expectation and variance of the time to most recent common ancestor, number of segregating sites, pairwise differences, and singletons. Estimates (obtained by simulations) of the distributions of commonly employed distance statistics, in the presence and absence of a seedbank, are compared. The effect of a seedbank on the expected site-frequency spectrum is also investigated using simulations. Our results indicate that the presence of a large seedbank considerably alters the distribution of some distance statistics, as well as the site-frequency spectrum. Thus, one should be able to detect from genetic data the presence of a large seedbank in natural populations.

17.
In order to analyze the pattern of DNA polymorphism in detail, we have developed a simple method using a new statistic \(\theta_i\), which estimates \(4N\mu\) from the number of segregating sites whose allelic nucleotide frequency is \(i/n\) among \(n\) DNA sequences, where \(N\) is the effective population size and \(\mu\) is the mutation rate per generation per nucleotide site. Under the assumption that mutations are selectively neutral and population size is constant, the expectation of \(\theta_i\) is equal to that of \(\theta\), which estimates \(4N\mu\) from the number of segregating sites, so that the distribution of \(\theta_i\) is flat. Therefore, the departure of the distribution of \(\theta_i\) from the horizontal line, which represents the value of \(\theta\), reflects change in population size and natural selection. Results of the coalescent simulation show that the distributions of \(\theta_i\) in populations that experienced expansion and reduction are U-shaped and upside-down U-shaped, respectively, and that the distributions of \(\theta_i\) in some populations that experienced a bottleneck are W-shaped. Furthermore, we have applied this method to the SNP data in the International HapMap Project. Results of data analyses show that the distributions of \(\theta_i\) in the CEU (European), CHB and JPT (Asian) populations are different from that in the YRI population (African). From these results of data analyses in nuclear DNA and the pattern of polymorphism in human mitochondrial DNA already known, we infer that the CEU, CHB and JPT populations experienced a bottleneck.
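
The flatness argument in this abstract can be illustrated with a small sketch. It assumes an unfolded site-frequency spectrum in which `sfs[i - 1]` is the number of segregating sites where the derived allele occurs in i of n sequences, and it uses the standard neutral expectation that this count is proportional to 1/i, so that multiplying by i gives a per-class estimate. The paper's exact definition may differ (for example, a folded spectrum or per-site normalization); the function names here are hypothetical.

```python
def theta_i(sfs):
    """Per-frequency-class estimates: i times the count of sites at frequency i/n.

    Assumes an unfolded spectrum, sfs[i - 1] for i = 1, ..., n - 1
    (an assumption of this sketch, not confirmed by the abstract).
    """
    return [i * xi for i, xi in enumerate(sfs, start=1)]


def watterson_theta(sfs):
    """Estimate from the total number of segregating sites (Watterson's estimator)."""
    n = len(sfs) + 1
    a_n = sum(1.0 / i for i in range(1, n))
    return sum(sfs) / a_n
```

With a spectrum that matches the neutral expectation exactly, such as `[12, 6, 4, 3]` for n = 5, every class estimate and the overall estimate coincide, giving the flat line described in the abstract. Dividing by the number of sites surveyed would convert these per-locus values to the per-nucleotide scale used in the abstract.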

18.
This work studies the coalescent (ancestral pedigree, genealogy) of the entire population. The coalescent structure (topology) is robust, but selection changes the rate of coalescence (the time between branching events). The change in the rate of coalescence is not uniform; rather, the reduction in the time between branching events is greatest when the coalescent is small (immediately after the common ancestor is the only member of the coalescent) with little change when the coalescent is large (immediately preceding when that common ancestor becomes fixed and the size of the coalescent is \(N\)). It follows that the reduction in the coalescent time due to selection is much greater than the reduction in the cumulative size of the coalescent (total number of ancestors of the present population after and including the most recent common ancestor) due to selection. If \(Ns \gg 1\), the coalescent and fixation times are approximately equal to , which is much less than the value \(N\) which would result from neutral drift (\(N\) rather than the canonical haploid neutral fixation time \(2N\) is the appropriate comparison for the model considered here); in particular, it is 70% less for \(Ns = 10\) and 95% less for \(Ns = 100\). However, for those values of \(Ns\), and \(N\) ranging between \(10^3\) and \(10^6\), the reduction in the cumulative size of the coalescent of the entire population compared to the neutral case ranges from 17% to 65% (depending on the values of \(N\) and \(s\)). The coalescent time for two individuals for \(Ns\) of 10 and 100 is reduced by approximately 70% and 94%, respectively, compared with the neutral case. Because heterozygosity is proportional to the coalescent time for two individuals and the number of segregating alleles is proportional to the cumulative size of the coalescent, selection reduces heterozygosity more than it reduces the number of segregating alleles.

19.
A logistic (regulated population size) branching process population genetic model is presented. It is a modification of both the Wright-Fisher and (unconstrained) branching process models, and shares several properties including the coalescent time and shape, and structure of the coalescent process with those models. An important feature of the model is that population size fluctuation and regulation are intrinsic to the model rather than externally imposed. A consequence of this model is that the fluctuation in population size enhances the prospects for fixation of a beneficial mutation with constant relative viability, which is contrary to a result for the Wright-Fisher model with fluctuating population size. Explanation of this result follows from distinguishing between expected and realized viabilities, in addition to the contrast between absolute and relative viabilities.

20.
Molecular sequences obtained at different sampling times from populations of rapidly evolving pathogens and from ancient subfossil and fossil sources are increasingly available with modern sequencing technology. Here, we present a Bayesian statistical inference approach to the joint estimation of mutation rate and population size that incorporates the uncertainty in the genealogy of such temporally spaced sequences by using Markov chain Monte Carlo (MCMC) integration. The Kingman coalescent model is used to describe the time structure of the ancestral tree. We recover information about the unknown true ancestral coalescent tree, population size, and the overall mutation rate from temporally spaced data, that is, from nucleotide sequences gathered at different times, from different individuals, in an evolving haploid population. We briefly discuss the methodological implications and show what can be inferred, in various practically relevant states of prior knowledge. We develop extensions for exponentially growing population size and joint estimation of substitution model parameters. We illustrate some of the important features of this approach on a genealogy of HIV-1 envelope (env) partial sequences.
