Similar articles
20 similar articles found (search time: 31 ms)
1.
A subpopulation D of rare alleles is considered. The subpopulation is part of a large population that evolves according to a Moran model with selection and growth. Conditional on the current frequency, q, of the rare allele, an approximation to the distribution of the genealogy of D is derived. In particular, the density of the age, T(1), of the rare allele is approximated. It is shown that time is naturally measured in units of qN(0) generations, where N(0) is the present-day population size, and that the distribution of the genealogy of D depends only on the compound parameters rho = rqN(0) and sigma = sqN(0). Here, s is the fitness per generation of heterozygote carriers of the rare allele and r is the growth rate per generation of the population. Among other results, it is shown that for constant population size (rho = 0) the distribution of D depends on sigma only through the absolute value |sigma|, not the direction of selection.

2.
In this paper we consider the genealogy of a random sample of n chromosomes from a panmictic population which has evolved with constant size N over many generations. We address two related problems. First, we describe how genealogical information may be usefully partitioned into information on the events (mutations and coalescences) which occur in the genealogy, and the times between these events. We show that the distribution of the times given information on the events is particularly simple and describe how this can considerably reduce the computational burden when performing inference for these times. Second, we investigate the effect on the genealogy of conditioning on a single mutation having occurred during the ancestry of the sample. In particular, we use results from the first part of the paper to derive explicit formulae for the density of the age of a mutant allele, conditional on its frequency in either a sample or the population.
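
For orientation only, the sketch below evaluates the classical Kimura-Ohta diffusion result for the mean age of a neutral allele observed at population frequency x, namely -4 N_e x ln(x) / (1 - x) generations. This is an assumption brought in for illustration and is not the density derived in the abstract above; the function name `mean_age_neutral` and its parameters are hypothetical.

```python
import math


def mean_age_neutral(x, n_e):
    """Mean age (in generations) of a neutral allele at population frequency x.

    Uses the classical Kimura-Ohta diffusion approximation
    -4 * N_e * x * ln(x) / (1 - x), assumed here for illustration.
    n_e is the effective (diploid) population size.
    """
    if not 0.0 < x < 1.0:
        raise ValueError("x must lie strictly between 0 and 1")
    return -4.0 * n_e * x * math.log(x) / (1.0 - x)


if __name__ == "__main__":
    # Rarer alleles are expected to be younger under neutrality.
    for freq in (0.01, 0.1, 0.5):
        print(freq, mean_age_neutral(freq, n_e=10_000))
```

Under this approximation the mean age increases with frequency and approaches 4 N_e generations as x approaches 1.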

3.
4.
This paper is concerned with the structure of the genealogy of a sample in which it is observed that some subset of chromosomes carries a particular mutation, assumed to have arisen uniquely in the history of the population. A rigorous theoretical study of this conditional genealogy is given using coalescent methods. Particular results include the mean, variance, and density of the age of the mutation conditional on its frequency in the sample. Most of the development relates to populations of constant size, but we discuss the extension to populations which have grown exponentially to their present size.

5.
Ewens' sampling formula, the probability distribution of a configuration of alleles in a sample of genes under the infinitely-many-alleles model of mutation, is proved by a direct combinatorial argument. The distribution is extended to a model where the population size may vary back in time. The distribution of age-ordered frequencies in the population is also derived in the model, extending the GEM distribution of age-ordered frequencies in a model with a constant-sized population. The genealogy of a rare allele is studied using a combinatorial approach. A connection is explored between the distribution of age-ordered frequencies and ladder indices and heights in a sequence of random variables. In a sample of n genes the connection is with ladder heights and indices in a sequence of draws from an urn containing balls labelled 1,2,...,n; and in the population the connection is with ladder heights and indices in a sequence of independent uniform random variables.
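
As a reminder of what the formula computes, here is a minimal sketch of the standard constant-size Ewens sampling formula, P(a_1, ..., a_n) = n! / (theta (theta+1) ... (theta+n-1)) * prod_j theta^{a_j} / (j^{a_j} a_j!), where a_j is the number of allele types seen exactly j times. The function name `ewens_probability` and its argument layout are hypothetical, and the variable-population-size extension described in the abstract is not covered.

```python
from math import factorial, prod


def ewens_probability(counts, theta):
    """Probability of an allelic configuration under Ewens' sampling formula.

    counts[j - 1] is a_j, the number of allele types represented exactly
    j times in the sample; theta is the scaled mutation rate.
    """
    # Sample size implied by the configuration: n = sum_j j * a_j.
    n = sum(j * a for j, a in enumerate(counts, start=1))
    # Rising factorial theta * (theta + 1) * ... * (theta + n - 1).
    rising = prod(theta + i for i in range(n))
    weight = prod(
        theta**a / (j**a * factorial(a))
        for j, a in enumerate(counts, start=1)
    )
    return factorial(n) * weight / rising


if __name__ == "__main__":
    # The three configurations of a sample of size 3.
    configs = ([3, 0, 0], [1, 1, 0], [0, 0, 1])
    print([ewens_probability(c, theta=1.5) for c in configs])
```

The probabilities over all configurations of a given sample size sum to one, which is a quick sanity check on any implementation.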

6.
The Kingman coalescent, which has become the foundation for a wide range of theoretical as well as empirical studies, was derived as an approximation of the Wright-Fisher (WF) model. The approximation heavily relies on the assumption that population size is large and sample size is much smaller than the population size. Whether the sample size is too large compared to the population size is rarely questioned in practice when applying statistical methods based on the Kingman coalescent. Since the WF model is the most widely used population genetics model for reproduction, it is desirable to develop a coalescent framework for the WF model, which can be used whenever there are concerns about the accuracy of the Kingman coalescent as an approximation. This paper describes the exact coalescent theory for the WF model and develops a simulation algorithm, which is then used, together with an analytical approach, to study the properties of the exact coalescent as well as its differences from the Kingman coalescent. We show that the Kingman coalescent differs from the exact coalescent by: (1) shorter waiting time between successive coalescent events; (2) different probability of observing a topological relationship among sequences in a sample; and (3) slightly smaller tree length in the genealogy of a large sample. On the other hand, there is little difference in the age of the most recent common ancestor (MRCA) of the sample. The exact coalescent makes up for the longer waiting time between successive coalescent events by having multiple coalescences at the same time. The most significant difference among various summary statistics of a coalescent examined is the sum of lengths of external branches, which can be more than 10% larger for the exact coalescent than for the Kingman coalescent.
As a whole, the Kingman coalescent is a remarkably accurate approximation to the exact coalescent for sample and population sizes falling considerably outside the region that was originally anticipated.
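
The summary statistics compared in this abstract (waiting times, MRCA age, tree length) can be illustrated for the Kingman side with a short simulation. The sketch below assumes the standard Kingman time scale, on which each pair of lineages coalesces at rate 1, so that the time with k ancestors is exponential with rate k(k-1)/2. It does not implement the exact WF coalescent of the paper, and the function names are hypothetical.

```python
import random


def kingman_waiting_times(n, rng):
    """Return [T_n, ..., T_2], where T_k is the time spent with k ancestors.

    Time is in coalescent units (each pair of lineages coalesces at rate 1).
    """
    return [rng.expovariate(k * (k - 1) / 2) for k in range(n, 1, -1)]


def mean_tmrca_and_length(n, replicates=20000, seed=1):
    """Monte Carlo estimates of the mean MRCA age and mean total tree length."""
    rng = random.Random(seed)
    tmrca_sum = 0.0
    length_sum = 0.0
    for _ in range(replicates):
        times = kingman_waiting_times(n, rng)
        # MRCA age is the sum of all waiting times.
        tmrca_sum += sum(times)
        # While k lineages exist, the tree gains k units of length per unit time.
        length_sum += sum(k * t for k, t in zip(range(n, 1, -1), times))
    return tmrca_sum / replicates, length_sum / replicates


if __name__ == "__main__":
    tmrca, length = mean_tmrca_and_length(10)
    # Theory: E[TMRCA] = 2 * (1 - 1/n), E[length] = 2 * sum_{i=1}^{n-1} 1/i.
    print(tmrca, length)
```

Comparing such estimates against an exact WF simulation at the same sample size would be one way to reproduce the kind of contrast the abstract reports.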

7.
Using properties of moment stationarity we develop exact expressions for the mean and covariance of allele frequencies at a single locus for a set of populations subject to drift, mutation, and migration. Some general results can be obtained even for arbitrary mutation and migration matrices, for example: (1) Under quite general conditions, the mean vector depends only on mutation rates, not on migration rates or the number of populations. (2) Allele frequencies covary among all pairs of populations connected by migration. As a result, the drift, mutation, migration process is not ergodic when any finite number of populations is exchanging genes. In addition, we provide closed-form expressions for the mean and covariance of allele frequencies in Wright's finite-island model of migration under several simple models of mutation, and we show that the correlation in allele frequencies among populations can be very large for realistic rates of mutation unless an enormous number of populations are exchanging genes. As a result, the traditional diffusion approximation provides a poor approximation of the stationary distribution of allele frequencies among populations. Finally, we discuss some implications of our results for measures of population structure based on Wright's F-statistics.

8.
In this paper we consider the genealogy of two nested mutant alleles, assuming the constant-size neutral coalescent model with infinite sites mutation. We study the conditional genealogy and derive explicit formulas for the joint and marginal site frequency spectra for the double, single and zero mutant allele. In addition, we find the mean ages of the two mutations. We show that the age of the youngest mutation does not depend on the frequency of the single mutant allele and that the frequency spectra for the single mutant allele and the zero mutant allele are the same.

9.
10.
The ancestral selection graph in population genetics was introduced by Krone and Neuhauser [Krone, S.M., Neuhauser, C., 1997. Ancestral process with selection. Theor. Popul. Biol. 51, 210–237] as an analogue of the coalescent genealogy of a sample of genes from a neutrally evolving population. The number of particles in this graph, followed backwards in time, is a birth and death process with quadratic death and linear birth rates. In this paper an explicit form of the probability distribution of the number of particles is obtained by using the density of the allele frequency in the corresponding diffusion model obtained by Kimura [Kimura, M., 1955. Stochastic process and distribution of gene frequencies under natural selection. Cold Spring Harbor Symposia on Quantitative Biology 20, 33–53]. It is shown that the process of fixation of the allele in the diffusion model corresponds to convergence of the ancestral process to its stationary measure. The time to fixation of the allele conditional on fixation is studied in terms of the ancestral process.

11.
We consider population genetics models where selection acts at a set of unlinked loci. It is known that if the fitness of an individual is multiplicative across loci, then these loci are independent. We consider general selection models, but assume parent-independent mutation at each locus. For such a model, the joint stationary distribution of allele frequencies is proportional to the stationary distribution under neutrality multiplied by a known function of the mean fitness of the population. We further show how knowledge of this stationary distribution enables direct simulation of the genealogy of a sample at a single locus. For a specific selection model appropriate for complex disease genes, we use simulation to determine what features of the genealogy differ between our general selection model and a multiplicative model.

12.
Polygenic variation can be maintained by a balance between mutation and stabilizing selection. When the alleles responsible for variation are rare, many classes of equilibria may be stable. The rate at which drift causes shifts between equilibria is investigated by integrating the gene frequency distribution $\bar{W}^{2N} \prod (pq)^{4N\mu - 1}$. This integral can be found exactly, by numerical integration, or can be approximated by assuming that the full distribution of allele frequencies is approximately Gaussian. These methods are checked against simulations. Over a wide range of population sizes, drift will keep the population near an equilibrium which minimizes the genetic variance and the deviation from the selective optimum. Shifts between equilibria in this class occur at an appreciable rate if the product of population size and selection on each locus is small ($Ns\alpha^2 < 10$). The Gaussian approximation is accurate even when the underlying distribution is strongly skewed. Reproductive isolation evolves as populations shift to new combinations of alleles: however, this process is slow, approaching the neutral rate (approximately $\mu$) in small populations.

13.
The gene genealogy is derived for a rare allele that is descended from a mutant ancestor that arose at a fixed time in the past. Following Thompson (1976, Amer. J. Human Genet. 28, 442–452), the fractional linear branching process is used as a model of the demography of a rare allele. The model does not require the total population size to be constant or the mutant class to be neutral; so long as individuals in the class are selectively equivalent, the class as a whole may have a selective advantage, or disadvantage, relative to other alleles in the population. An exact result is given for the joint probability distribution of the coalescence times among a sample of alleles descended from the mutant. A method is described for rapidly simulating these coalescence times. The relationship between the genealogical structure of a discrete generation branching process and a continuous generation birth–death process is elucidated. The theory may be applied to the problem of estimating the ages of rare nonrecurrent mutations.

14.
Determining the expected distribution of the time to the most recent common ancestor of a sample of individuals may deliver important information about the genetic markers and evolution of the population. In this paper, we introduce a new recursive algorithm to calculate the distribution of the time to the most recent common ancestor of the sample from a population evolved by any conditional multinomial sampling model. The most important advantage of our method is that it can be applied to a sample of any size drawn from a population regardless of its size growth pattern. We also present a very efficient method to implement and store the genealogy tree of the population evolved by the Galton–Watson process. In the final section we present results applied to a simulated population with a single bottleneck event and to real populations of known size histories.

15.
Explicit formulae are given for the effects of a barrier to gene flow on random fluctuations in allele frequency; these formulae can also be seen as generating functions for the distribution of coalescence times. The formulae are derived using a continuous diffusion approximation, which is accurate over all but very small spatial scales. The continuous approximation is confirmed by comparison with the exact solution to the stepping stone model. In both one and two spatial dimensions, the variance of fluctuations in allele frequencies increases near the barrier; when the barrier is very strong, the variance doubles. However, the effect on fluctuations close to the barrier is much greater when the population is spread over two spatial dimensions than when it occupies a linear, one-dimensional habitat: barriers of strength comparable with the dispersal range ($B \approx \sigma$) can have an appreciable effect in two dimensions, whereas only barriers with strength comparable with the characteristic scale ($B \approx L = \sigma/\sqrt{2\mu}$) are significant in one dimension ($\mu$ is the rate of mutation or long-range dispersal). Thus, in a two-dimensional population, barriers to gene flow can be detected through their effect on the spatial pattern of genetic marker alleles.

17.
Private microsatellite alleles tend to be found in the tails rather than in the interior of the allele size distribution. To explain this phenomenon, we have investigated the size distribution of private alleles in a coalescent model of two populations, assuming the symmetric stepwise mutation model as the mode of microsatellite mutation. For the case in which four alleles are sampled, two from each population, we condition on the configuration in which three distinct allele sizes are present, one of which is common to both populations, one of which is private to one population, and the third of which is private to the other population. Conditional on this configuration, we calculate the probability that the two private alleles occupy the two tails of the size distribution. This probability, which increases as a function of mutation rate and divergence time between the two populations, is seen to be greater than the value that would be predicted if there were no relationship between privacy and location in the allele size distribution. In accordance with the prediction of the model, we find that in pairs of human populations, the frequency with which private microsatellite alleles occur in the tails of the allele size distribution increases as a function of genetic differentiation between populations.

18.
Jeremy J. Berg, Graham Coop. Genetics, 2015, 201(2): 707-725
The use of genetic polymorphism data to understand the dynamics of adaptation and identify the loci that are involved has become a major pursuit of modern evolutionary genetics. In addition to the classical “hard sweep” hitchhiking model, recent research has drawn attention to the fact that the dynamics of adaptation can play out in a variety of different ways and that the specific signatures left behind in population genetic data may depend somewhat strongly on these dynamics. One particular model for which a large number of empirical examples are already known is that in which a single derived mutation arises and drifts to some low frequency before an environmental change causes the allele to become beneficial and sweeps to fixation. Here, we pursue an analytical investigation of this model, bolstered and extended via simulation study. We use coalescent theory to develop an analytical approximation for the effect of a sweep from standing variation on the genealogy at the locus of the selected allele and sites tightly linked to it. We show that the distribution of haplotypes that the selected allele is present on at the time of the environmental change can be approximated by considering recombinant haplotypes as alleles in the infinite-alleles model. We show that this approximation can be leveraged to make accurate predictions regarding patterns of genetic polymorphism following such a sweep. We then use simulations to highlight which sources of haplotypic information are likely to be most useful in distinguishing this model from neutrality, as well as from other sweep models, such as the classic hard sweep and multiple-mutation soft sweeps. We find that in general, adaptation from a unique standing variant will likely be difficult to detect on the basis of genetic polymorphism data from a single population time point alone, and when it can be detected, it will be difficult to distinguish from other varieties of selective sweeps. 
Samples from multiple populations and/or time points have the potential to ease this difficulty.

19.
Estimating the age of alleles by use of intraallelic variability.
Cited 9 times in total (6 self-citations, 3 citations by others)
A method is presented for estimating the age of an allele by use of its frequency and the extent of variation among different copies. The method uses the joint distribution of the number of copies in a population sample and the coalescence times of the intraallelic gene genealogy conditioned on the number of copies. The linear birth-death process is used to approximate the dynamics of a rare allele in a finite population. A maximum-likelihood estimate of the age of the allele is obtained by Monte Carlo integration over the coalescence times. The method is applied to two alleles at the cystic fibrosis (CFTR) locus, ΔF508 and G542X, for which intraallelic variability at three intronic microsatellite loci has been examined. Our results indicate that G542X is somewhat older than ΔF508. Although absolute estimates depend on the mutation rates at the microsatellite loci, our results support the hypothesis that ΔF508 arose < 500 generations (approximately 10,000 years) ago.

20.
Molecular sequences obtained at different sampling times from populations of rapidly evolving pathogens and from ancient subfossil and fossil sources are increasingly available with modern sequencing technology. Here, we present a Bayesian statistical inference approach to the joint estimation of mutation rate and population size that incorporates the uncertainty in the genealogy of such temporally spaced sequences by using Markov chain Monte Carlo (MCMC) integration. The Kingman coalescent model is used to describe the time structure of the ancestral tree. We recover information about the unknown true ancestral coalescent tree, population size, and the overall mutation rate from temporally spaced data, that is, from nucleotide sequences gathered at different times, from different individuals, in an evolving haploid population. We briefly discuss the methodological implications and show what can be inferred, in various practically relevant states of prior knowledge. We develop extensions for exponentially growing population size and joint estimation of substitution model parameters. We illustrate some of the important features of this approach on a genealogy of HIV-1 envelope (env) partial sequences.
