Similar literature
 20 similar articles found (search time: 31 ms)
1.
A special stochastic process, called the coalescent, is of fundamental interest in population genetics. For a large class of population models this process is the appropriate tool to analyse the ancestral structure of a sample of n individuals or genes, if the total number of individuals in the population is sufficiently large. A corresponding convergence theorem was first proved by Kingman in 1982 for the Wright-Fisher model and the Moran model. Generalizations to a large class of exchangeable population models and to models with overlying mutation processes followed shortly afterwards. One speaks of the "robustness of the coalescent", as this process appears in many models as the total population size tends to infinity. This publication can be considered as an introduction to the theory of the coalescent as well as a review of the most important "convergence-to-the-coalescent" theorems. Convergence theorems are presented not only for the classical exchangeable haploid case but also for larger classes of population models, for example diploid, two-sex or non-exchangeable models. A review-like summary of further examples and applications of convergence to the coalescent is given, including the most important biological forces such as mutation, recombination and selection. The general coalescent process allows for simultaneous multiple mergers of ancestral lines.

2.
The Kingman coalescent, which has become the foundation for a wide range of theoretical as well as empirical studies, was derived as an approximation of the Wright-Fisher (WF) model. The approximation relies heavily on the assumption that the population size is large and the sample size is much smaller than the population size. Whether the sample size is too large compared to the population size is rarely questioned in practice when applying statistical methods based on the Kingman coalescent. Since the WF model is the most widely used population genetics model for reproduction, it is desirable to develop a coalescent framework for the WF model, which can be used whenever there are concerns about the accuracy of the Kingman coalescent as an approximation. This paper describes the exact coalescent theory for the WF model and develops a simulation algorithm, which is then used, together with an analytical approach, to study the properties of the exact coalescent as well as its differences from the Kingman coalescent. We show that the Kingman coalescent differs from the exact coalescent by: (1) shorter waiting time between successive coalescent events; (2) different probability of observing a topological relationship among sequences in a sample; and (3) slightly smaller tree length in the genealogy of a large sample. On the other hand, there is little difference in the age of the most recent common ancestor (MRCA) of the sample. The exact coalescent makes up for the longer waiting time between successive coalescent events by having multiple coalescences at the same time. The most significant difference among the various summary statistics of a coalescent examined is the sum of lengths of external branches, which can be more than 10% larger for the exact coalescent than for the Kingman coalescent. As a whole, the Kingman coalescent is a remarkably accurate approximation to the exact coalescent for sample and population sizes falling considerably outside the region that was originally anticipated.
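The comparison described in this abstract can be illustrated with a minimal sketch. It is not the paper's simulation algorithm: it assumes a haploid population of constant size N, and the function names (`wf_exact_tmrca`, `kingman_tmrca`, `mean_tmrca`) are hypothetical. The exact version traces lineages back one generation at a time and allows several lineages to pick the same parent, while the Kingman version draws exponential waiting times with pairwise mergers only.

```python
import random


def wf_exact_tmrca(n, N, rng):
    """Generations until n sampled lineages share one ancestor in a
    haploid Wright-Fisher population of constant size N (assumed model)."""
    k, t = n, 0
    while k > 1:
        # Every lineage picks its parent uniformly at random, so multiple
        # and simultaneous mergers can happen within a single generation.
        k = len({rng.randrange(N) for _ in range(k)})
        t += 1
    return t


def kingman_tmrca(n, N, rng):
    """Time to the MRCA (in generations) under the Kingman approximation."""
    t = 0.0
    for k in range(n, 1, -1):
        # With k lineages, pairs coalesce at total rate k(k-1)/(2N) per generation.
        rate = k * (k - 1) / (2.0 * N)
        t += rng.expovariate(rate)
    return t


def mean_tmrca(sim, n, N, reps, seed=0):
    """Average the simulated time to the MRCA over `reps` replicates."""
    rng = random.Random(seed)
    return sum(sim(n, N, rng) for _ in range(reps)) / reps


if __name__ == "__main__":
    n, N = 10, 100
    print("exact WF :", mean_tmrca(wf_exact_tmrca, n, N, 2000))
    print("Kingman  :", mean_tmrca(kingman_tmrca, n, N, 2000))
    print("theory   :", 2 * N * (1 - 1 / n))  # Kingman expectation under this scaling
```

Under this haploid scaling the Kingman expectation for the time to the MRCA is 2N(1 - 1/n) generations, so both simulated means should land near 180 for n = 10 and N = 100, in line with the abstract's remark that the age of the MRCA differs little between the two processes.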

3.
We establish convergence to the Kingman coalescent for the genealogy of a geographically (or otherwise) structured version of the Wright-Fisher population model with fast migration. The new feature is that migration probabilities may change in a random fashion. This brings a novel formula for the coalescent effective population size (EPS). We call it a quenched EPS to emphasize the key feature of our model: a random environment. The quenched EPS is compared with an annealed (mean-field) EPS which describes the case of constant migration probabilities obtained by averaging the random migration probabilities over possible environments.

4.
Coalescent process with fluctuating population size and its effective size   (total citations: 3; self-citations: 0; citations by others: 3)
We consider a Wright-Fisher model whose population size is a finite Markov chain. We introduce a sequence of two-dimensional discrete time Markov chains whose components describe the coalescent process and the fluctuation of population size. For the limiting process of the sequence of Markov chains, the relationship of the expectation of coalescence time to the harmonic and the arithmetic means of population sizes is shown, and the Laplace transform of the distribution of coalescence time is calculated. We define the coalescence effective population size (cEPS) by the expectation of coalescence time. We show that cEPS is strictly larger (resp. smaller) than the harmonic (resp. arithmetic) mean. As the population size fluctuates more quickly (resp. slowly), cEPS is closer to the harmonic (resp. arithmetic) mean. For the case of a two-valued Markov chain, we show the explicit expression of cEPS and its dependency on the sample size.
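The two bounds named in this abstract are easy to compute for a concrete case. The sketch below only evaluates the harmonic and arithmetic means of a set of population sizes; it does not compute cEPS itself, and the two-valued example (sizes 100 and 1000, treated as equally likely) is an assumption chosen for illustration.

```python
def arithmetic_mean(sizes):
    """Arithmetic mean of the population sizes (the upper bound on cEPS)."""
    return sum(sizes) / len(sizes)


def harmonic_mean(sizes):
    """Harmonic mean of the population sizes (the lower bound on cEPS)."""
    return len(sizes) / sum(1.0 / s for s in sizes)


if __name__ == "__main__":
    sizes = [100, 1000]            # hypothetical two-valued population size
    print(harmonic_mean(sizes))    # about 181.8
    print(arithmetic_mean(sizes))  # 550.0
```

According to the abstract, cEPS for such a population would lie strictly between these two values, nearer the harmonic mean when the size fluctuates quickly and nearer the arithmetic mean when it fluctuates slowly.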

5.
T. Nagylaki. Genetics 1997, 145(2): 485-491
Three different derivations of models with multinomial sampling of genotypes in a finite population are presented. The three derivations correspond to the operation of random drift through population regulation, conditioning on the total number of progeny, and culling, respectively. Generations are discrete and nonoverlapping; the diploid population mates at random. Each derivation applies to a single multiallelic locus in a monoecious or dioecious population; in the latter case, the locus may be autosomal or X-linked. Mutation and viability selection are arbitrary; there are no fertility differences. In a monoecious population, the model yields the Wright-Fisher model (i.e., multinomial sampling of genes) if and only if the viabilities are multiplicative. In a dioecious population, the analogous reduction does not occur even for pure random drift. Thus, multinomial sampling of genotypes generally does not lead to multinomial sampling of genes. Although the Wright-Fisher model probably lacks a sound biological basis and may be inaccurate for small populations, it is usually (perhaps always) a good approximation for genotypic multinomial sampling in large populations.

6.
Y. X. Fu. Genetics 1994, 138(4): 1375-1386
Mutations resulting in segregating sites of a sample of DNA sequences can be classified by size and type, and the frequencies of mutations of different sizes and types can be inferred from the sample. A framework for estimating the essential parameter θ = 4Nμ utilizing the frequencies of mutations of various sizes and types is developed in this paper, where N is the effective size of a population and μ is the mutation rate per sequence per generation. The framework is a combination of coalescent theory, general linear model and Monte-Carlo integration, which leads to two new estimators θ(ξ) and θ(η) as well as a general Watterson's estimator θ(K) and a general Tajima's estimator θ(π). The greatest strength of the framework is that it can be used under a variety of population models. The properties of the framework and the four estimators θ(K), θ(π), θ(ξ) and θ(η) are investigated under three important population models: the neutral Wright-Fisher model, the neutral model with recombination and the neutral Wright's finite-islands model. Under all these models, it is shown that θ(ξ) is the best estimator among the four even when recombination rate or migration rate has to be estimated. Under the neutral Wright-Fisher model, it is shown that the new estimator θ(ξ) has a variance close to a lower bound of variances of all unbiased estimators of θ, which suggests that θ(ξ) is a very efficient estimator.
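For orientation, the classical forms of the two named estimators can be sketched as follows. These are the standard per-sequence Watterson and Tajima estimators, not the generalized versions or the new estimators θ(ξ) and θ(η) developed in the paper; the function names and the toy alignment are assumptions for illustration only.

```python
from itertools import combinations


def segregating_sites(seqs):
    """Number of alignment columns in which the sample is polymorphic."""
    return sum(1 for column in zip(*seqs) if len(set(column)) > 1)


def watterson_theta(seqs):
    """Classical Watterson estimator: S / a_n, with a_n = sum_{i=1}^{n-1} 1/i."""
    n = len(seqs)
    a_n = sum(1.0 / i for i in range(1, n))
    return segregating_sites(seqs) / a_n


def tajima_theta(seqs):
    """Classical Tajima estimator: mean number of pairwise differences."""
    pairs = list(combinations(seqs, 2))
    total = sum(sum(a != b for a, b in zip(s, t)) for s, t in pairs)
    return total / len(pairs)


if __name__ == "__main__":
    # Toy aligned sample of four sequences (hypothetical data).
    sample = ["AAAA", "AAAT", "AATT", "ATTT"]
    print(watterson_theta(sample))  # S = 3, a_4 = 11/6, so 18/11, about 1.636
    print(tajima_theta(sample))     # 10 differences over 6 pairs, about 1.667
```

Both functions estimate θ per sequence, matching the abstract's definition of μ as the mutation rate per sequence per generation.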

7.
This work studies the coalescent (ancestral pedigree, genealogy) of the entire population. The coalescent structure (topology) is robust, but selection changes the rate of coalescence (the time between branching events). The change in the rate of coalescence is not uniform; rather, the reduction in the time between branching events is greatest when the coalescent is small (immediately after the common ancestor is the only member of the coalescent), with little change when the coalescent is large (immediately preceding when that common ancestor becomes fixed and the size of the coalescent is N). As a result, the reduction in the coalescent time due to selection is much greater than the reduction in the cumulative size of the coalescent (total number of ancestors of the present population after and including the most recent common ancestor) due to selection. If Ns≫1, the coalescent and fixation times are approximately equal to , which is much less than the value N which would result from neutral drift (N rather than the canonical haploid neutral fixation time 2N is the appropriate comparison for the model considered here); in particular, it is 70% less for Ns=10 and 95% less for Ns=100. However, for those values of Ns, and N ranging between 10^3 and 10^6, the reduction in the cumulative size of the coalescent of the entire population compared to the neutral case ranges from 17% to 65% (depending on the values of N and s). The coalescent time for two individuals for Ns of 10 and 100 is reduced by approximately 70% and 94%, respectively, compared with the neutral case. Because heterozygosity is proportional to the coalescent time for two individuals and the number of segregating alleles is proportional to the cumulative size of the coalescent, selection reduces heterozygosity more than it reduces the number of segregating alleles.

8.
Volz EM. Genetics 2012, 190(1): 187-201
Estimates of the coalescent effective population size N(e) can be poorly correlated with the true population size. The relationship between N(e) and the population size is sensitive to the way in which birth and death rates vary over time. The problem of inference is exacerbated when the mechanisms underlying population dynamics are complex and depend on many parameters. In instances where nonparametric estimators of N(e) such as the skyline struggle to reproduce the correct demographic history, model-based estimators that can draw on prior information about population size and growth rates may be more efficient. A coalescent model is developed for a large class of populations such that the demographic history is described by a deterministic nonlinear dynamical system of arbitrary dimension. This class of demographic model differs from those typically used in population genetics. Birth and death rates are not fixed, and no assumptions are made regarding the fraction of the population sampled. Furthermore, the population may be structured in such a way that gene copies reproduce both within and across demes. For this large class of models, it is shown how to derive the rate of coalescence, as well as the likelihood of a gene genealogy with heterochronous sampling and labeled taxa, and how to simulate a coalescent tree conditional on a complex demographic history. This theoretical framework encapsulates many of the models used by ecologists and epidemiologists and should facilitate the integration of population genetics with the study of mathematical population dynamics.

9.
Methods of calculating the distributions of the time to coalescence depend on the underlying model of population demography. In particular, the models assuming deterministic evolution of population size may not be applicable to populations evolving stochastically. Therefore the study of coalescence models involving stochastic demography is important for applications. One interesting approach which includes stochasticity is the O’Connell limit theory of genealogy in branching processes. Our paper explores how many generations are needed for the limiting distributions of O’Connell to become adequate approximations of exact distributions. We perform extensive simulations of slightly supercritical branching processes and compare the results to the O’Connell limits. Coalescent computations under the Wright-Fisher model are compared with limiting O’Connell results and with full genealogy-based predictions. These results are used to estimate the age of the so-called mitochondrial Eve, i.e., the root of the mitochondrial polymorphisms of modern humans, based on DNA from humans and Neanderthal fossils.

10.
Davies JL, Simancík F, Lyngsø R, Mailund T, Hein J. Genetics 2007, 177(4): 2151-2160
Coalescent theory deals with the dynamics of how sampled genetic material has spread through a population from a single ancestor over many generations and is ubiquitous in contemporary molecular population genetics. Inherent in most applications is a continuous-time approximation that is derived under the assumption that sample size is small relative to the actual population size. In effect, this precludes multiple and simultaneous coalescent events that take place in the history of large samples. If sequences do not recombine, the number of sequences ancestral to a large sample is reduced sufficiently after relatively few generations such that use of the continuous-time approximation is justified. However, in tracing the history of large chromosomal segments, a large recombination rate per generation will consistently maintain a large number of ancestors. This can create a major disparity between discrete-time and continuous-time models and we analyze its importance, illustrated with model parameters typical of the human genome. The presence of gene conversion exacerbates the disparity and could seriously undermine applications of coalescent theory to complete genomes. However, we show that multiple and simultaneous coalescent events influence global quantities, such as total number of ancestors, but have negligible effect on local quantities, such as linkage disequilibrium. Reassuringly, most applications of the coalescent model with recombination (including association mapping) focus on local quantities.

11.
The study of sequence diversity under phylogenetic models is now classic. Theoretical studies of diversity under the Kingman coalescent appeared shortly after the introduction of the coalescent. In this paper we revisit this topic under the multispecies coalescent, an extension of the single population model to multiple populations. We derive exact formulas for the sequence dissimilarity of two sequences drawn at random under a basic multispecies setup. The multispecies model uses three parameters: the species tree birth rate under the pure birth process (Yule), the species effective population size and the mutation rate. We also discuss the effects of relaxing some of the model assumptions.

12.
Inferring aspects of the population histories of species using coalescent analyses of non-coding nuclear DNA has grown in popularity. These inferences, such as divergence, gene flow, and changes in population size, assume that genetic data reflect simple population histories and neutral evolutionary processes. However, violating model assumptions can result in a poor fit between empirical data and the models. We sampled 22 nuclear intron sequences from at least 19 different chromosomes (a genomic transect) to test for deviations from selective neutrality in the gadwall (Anas strepera), a Holarctic duck. Nucleotide diversity among these loci varied by nearly two orders of magnitude (from 0.0004 to 0.029), and this heterogeneity could not be explained by differences in substitution rates alone. Using two different coalescent methods to infer models of population history and then simulating neutral genetic diversity under these models, we found that the observed among-locus heterogeneity in nucleotide diversity was significantly higher than expected for these simple models. Defining more complex models of population history demonstrated that a pre-divergence bottleneck was also unlikely to explain this heterogeneity. However, both selection and interspecific hybridization could account for the heterogeneity observed among loci. Regardless of the cause of the deviation, our results illustrate that violating key assumptions of coalescent models can mislead inferences of population history.

13.
The Genealogy of Samples in Models with Selection   (total citations: 1; self-citations: 0; citations by others: 1)
C. Neuhauser, S. M. Krone. Genetics 1997, 145(2): 519-534
We introduce the genealogy of a random sample of genes taken from a large haploid population that evolves according to random reproduction with selection and mutation. Without selection, the genealogy is described by Kingman's well-known coalescent process. In the selective case, the genealogy of the sample is embedded in a graph with a coalescing and branching structure. We describe this graph, called the ancestral selection graph, and point out differences and similarities with Kingman's coalescent. We present simulations for a two-allele model with symmetric mutation in which one of the alleles has a selective advantage over the other. We find that when the allele frequencies in the population are already in equilibrium, then the genealogy does not differ much from the neutral case. This is supported by rigorous results. Furthermore, we describe the ancestral selection graph for other selective models with finitely many selection classes, such as the K-allele models, infinitely-many-alleles models, DNA sequence models, and infinitely-many-sites models, and briefly discuss the diploid case.

14.
The coalescent process in the human-chimpanzee ancestral population is investigated using a model that incorporates a certain time period of gene flow during the speciation process. The parameter a represents the degree and time of gene flow, and the model is identical to the null model with an instantaneous species split when a = ∞. A maximum likelihood (ML) method is developed to estimate a, and its power and reliability are investigated by coalescent simulations. The ML method is applied to nucleotide divergence data between human and chimpanzee. It is found that the null model with an instantaneous species split explains the data best, and no strong evidence for gene flow is detected. The result is discussed in view of the mode of speciation. Another ML method is developed to estimate the male-female ratio (alpha) of mutation rate, in which the coalescent process in the ancestral population is taken into account.

15.
Exact discrete Markov chains are applied to the Wright-Fisher model and the Moran model of haploid random mating. Selection and mutations are neglected. At each discrete value of time t there is a given number n of diploid monoecious organisms. The evolution of the population distribution is given in diffusion variables, to compare the two models of random mating with their common diffusion limit. Only the Moran model converges uniformly to the diffusion limit near the boundary. The Wright-Fisher model allows the population size to change with the generations. Diffusion theory tends to under-predict the loss of genetic information when a population enters a bottleneck.

16.
We investigate the detailed connection between the Wright-Fisher model of random genetic drift and the diffusion approximation, under the assumption that selection and drift are weak and so cause small changes over a single generation. A representation of the mathematics underlying the Wright-Fisher model is introduced which allows the connection to be made with the corresponding mathematics underlying the diffusion approximation. Two ‘hybrid’ models are also introduced which lie ‘between’ the Wright-Fisher model and the diffusion approximation. In model 1 the relative allele frequency takes discrete values while time is continuous; in model 2 time is discrete and relative allele frequency is continuous. While both hybrid models appear to have a similar status and the same level of plausibility, the different nature of time and frequency in the two models leads to significant mathematical differences. Model 2 is mathematically inconsistent and has to be ruled out as being meaningful. Model 1 is used to clarify the content of Kimura's solution of the diffusion equation, which is shown to have the natural interpretation as describing only those populations where alleles are segregating. By contrast the Wright-Fisher model and the solution of the diffusion equation of McKane and Waxman cover populations of all categories, namely populations where alleles segregate, are lost, or fix.

17.
Wilkins JF. Genetics 2004, 168(4): 2227-2244
This article presents an analysis of a model of isolation by distance in a continuous, two-dimensional habitat. An approximate expression is derived for the distribution of coalescence times for a pair of sequences sampled from specific locations in a rectangular habitat. Results are qualitatively similar to previous analyses of isolation by distance, but account explicitly for the location of samples relative to the habitat boundaries. A separation-of-timescales approach takes advantage of the fact that the sampling locations affect only the recent coalescent behavior. When the population size is larger than the number of generations required for a lineage to cross the habitat range, the long-term genealogical process is reasonably well described by Kingman's coalescent with time rescaled by the effective population size. This long-term effective population size is affected by the local dispersal behavior as well as the geometry of the habitat. When the population size is smaller than the time required to cross the habitat, deep branches in the genealogy are longer than would be expected under the standard neutral coalescent, similar to the pattern expected for a panmictic population whose population size was larger in the past.

18.
We study a generalisation of Moran’s population-genetic model that incorporates density dependence. Rather than assuming fixed population size, we allow the number of individuals to vary stochastically with the same events that change allele number, according to a logistic growth process with density dependent mortality. We analyse the expected time to absorption and fixation in the ‘quasi-neutral’ case: both types have the same carrying capacity, achieved through a trade-off of birth and death rates. Such types would be competitively neutral in a classical, fixed-population Wright-Fisher model. Nonetheless, we find that absorption times are skewed compared to the Wright-Fisher model. The absorption time is longer than the Wright-Fisher prediction when the initial proportion of the type with higher birth rate is large, and shorter when it is small. By contrast, demographic stochasticity has no effect on the fixation or absorption times of truly neutral alleles in a large population. Our calculations provide the first analytic results on hitting times in a two-allele model, when the population size varies stochastically.

19.
Arnold B, Bomblies K, Wakeley J. Genetics 2012, 192(1): 195-204
We develop coalescent models for autotetraploid species with tetrasomic inheritance. We show that the ancestral genetic process in a large population without recombination may be approximated using Kingman's standard coalescent, with a coalescent effective population size 4N. Numerical results suggest that this approximation is accurate for population sizes on the order of hundreds of individuals. Therefore, existing coalescent simulation programs can be adapted to study population history in autotetraploids simply by interpreting the timescale in units of 4N generations. We also consider the possibility of double reduction, a phenomenon unique to polysomic inheritance, and show that its effects on gene genealogies are similar to partial self-fertilization.

20.
Der R, Epstein C, Plotkin JB. Genetics 2012, 191(4): 1331-1344
We analyze the dynamics of two alternative alleles in a simple model of a population that allows for large family sizes in the distribution of offspring number. This population model was first introduced by Eldon and Wakeley, who described the backward-time genealogical relationships among sampled individuals, assuming neutrality. We study the corresponding forward-time dynamics of allele frequencies, with or without selection. We derive a continuum approximation, analogous to Kimura's diffusion approximation, and we describe three distinct regimes of behavior that correspond to distinct regimes in the coalescent processes of Eldon and Wakeley. We demonstrate that the effect of selection is strongly amplified in the Eldon-Wakeley model, compared to the Wright-Fisher model with the same variance effective population size. Remarkably, an advantageous allele can even be guaranteed to fix in the Eldon-Wakeley model, despite the presence of genetic drift. We compute the selection coefficient required for such behavior in populations of Pacific oysters, based on estimates of their family sizes. Our analysis underscores that populations with the same effective population size may nevertheless experience radically different forms of genetic drift, depending on the reproductive mechanism, with significant consequences for the resulting allele dynamics.
