Probabilities of monophyly, paraphyly, and polyphyly of two-species gene genealogies are computed for modest sample sizes and compared for two different Λ coalescent processes. Coalescent processes belonging to the Λ coalescent family admit asynchronous multiple mergers of active ancestral lineages. Assigning a timescale to the time of divergence becomes a central issue when different populations have different coalescent processes running on different timescales. Clade probabilities in single populations are also computed, which can be useful for testing for taxonomic distinctiveness of an observed set of monophyletic lineages. The coalescence rates of multiple merger coalescent processes are functions of coalescent parameters. The effect of coalescent parameters on the probabilities studied depends on the coalescent process and on whether the population is ancestral or derived. Under the null hypothesis that two groups come from the same population, the probability of reciprocal monophyly tends to be somewhat lower when associated with a Λ coalescent. However, even for fairly recent divergence times, the probability of monophyly tends to be higher as a function of the number of generations for coalescent processes that admit multiple mergers, and is sensitive to the parameter of one of the example processes.
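The statement that multiple-merger coalescence rates are functions of coalescent parameters can be illustrated with a minimal sketch. The abstract does not name its two example processes, so the Beta(2 − α, α) coalescent used here is an assumption chosen only because its rates have a simple closed form; the function names are hypothetical.

```python
from math import comb, gamma


def beta_fn(a, b):
    """Euler beta function B(a, b) expressed through the gamma function."""
    return gamma(a) * gamma(b) / gamma(a + b)


def beta_coalescent_rate(b, k, alpha):
    """Rate at which one particular group of k out of b lineages merges.

    Assumed example process: the Beta(2 - alpha, alpha) coalescent with
    1 < alpha < 2, for which the rate is
    B(k - alpha, b - k + alpha) / B(2 - alpha, alpha).
    """
    if not 1 < alpha < 2:
        raise ValueError("alpha must lie strictly between 1 and 2")
    if not 2 <= k <= b:
        raise ValueError("need 2 <= k <= b")
    return beta_fn(k - alpha, b - k + alpha) / beta_fn(2 - alpha, alpha)


def total_merger_rate(b, alpha):
    """Total rate of leaving the state with b lineages (any merger size)."""
    return sum(comb(b, k) * beta_coalescent_rate(b, k, alpha)
               for k in range(2, b + 1))


if __name__ == "__main__":
    # Smaller alpha puts more weight on large mergers.
    for alpha in (1.1, 1.5, 1.9):
        print(alpha, beta_coalescent_rate(5, 5, alpha), total_merger_rate(5, alpha))
```

As α approaches 2 the rates of mergers involving more than two lineages vanish and the pairwise rate tends to 1, which is the Kingman coalescent; this is one way to see how a single parameter changes the timescale on which a population's genealogy runs.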

Jesse E. Taylor 《Genetics》2009,182(3):813-837
The genealogical consequences of within-generation fecundity variance polymorphism are studied using coalescent processes structured by genetic backgrounds. I show that these processes have three distinctive features. The first is that the coalescent rates within backgrounds are not jointly proportional to the infinitesimal variance, but instead depend only on the frequencies and traits of genotypes containing each allele. Second, the coalescent processes at unlinked loci are correlated with the genealogy at the selected locus; i.e., fecundity variance polymorphism has a genomewide impact on genealogies. Third, in diploid models, there are infinitely many combinations of fecundity distributions that have the same diffusion approximation but distinct coalescent processes; i.e., in this class of models, ancestral processes and allele frequency dynamics are not in one-to-one correspondence. Similar properties are expected to hold in models that allow for heritable variation in other traits that affect the coalescent effective population size, such as sex ratio or fecundity and survival schedules.
The population genetics of within-generation fecundity variance has been studied from two perspectives. Beginning with Wright (1938), several authors have investigated the relationship between the effective size of a panmictic population with seasonal reproduction and the variance of the number of offspring born to each adult within a season (Crow and Denniston 1988; Nunney 1993, 1996; Waples 2002; Hedrick 2005; Engen et al. 2007). Although the precise form of this relationship depends on other biological factors such as the mating system and the manner in which population regulation operates, each of these studies shows that the effective population size is a decreasing function of fecundity variance. Furthermore, provided that the variance and the coalescent effective population sizes coincide (Ewens 1982; Nordborg and Krone 2002; Sjodin et al. 2005), these results imply that both the rate at which neutral allele frequencies fluctuate from generation to generation and the rate at which lineages coalesce will be positively correlated with within-generation fecundity variance. For example, it has been suggested that the shallow genealogies that have been documented in many marine organisms are a consequence of the high variance of reproductive success in the recruitment sweepstakes operating in these species (Hedgecock 1994; Árnason 2004; Eldon and Wakeley 2006).
These results hold in models in which all individuals have the same within-generation (or within-season) fecundity variance. However, the evolutionary genetics of populations that are polymorphic for alleles that influence demographic traits have also been investigated. The first results of this kind were derived by Gillespie (1974, 1975, 1977), who used diffusion theory to show that natural selection can act directly on within-generation fecundity variance in a haploid population with nonoverlapping generations. By studying a simple model of a population composed of two genotypes, say A1 and A2, Gillespie (1974) showed that the fluctuations in the frequency of allele A1 can be approximated by a diffusion process with drift and variance coefficients m(p) and v(p), where p is the frequency of A1, N is the number of adults, and 1 + μ_i and σ_i^2 are the mean and the variance, respectively, of the number of offspring produced by an individual of type Ai. Most discussions of this class of models have focused on the fitness consequences of differences in fecundity variance, which are summarized by the drift coefficient, m(p), of the diffusion approximation. There are two main conclusions. The first is that because m(p) is an increasing function of the difference σ_2^2 − σ_1^2, selection can favor alleles that reduce within-generation fecundity variance even if these have lower mean fecundity.
Such variance–mean trade-offs can be interpreted as a kind of bet hedging and could explain the evolution of certain risk-spreading traits such as insect oviposition onto multiple host plants (Root and Kareiva 1986) or multiple mating by females (Sarhan and Kokko 2007). On the other hand, because the strength of selection on fecundity variance is inversely proportional to population size, selection for mean–variance trade-offs will usually be dominated by changes in mean fecundity. For this reason, it has been suggested that within-generation bet hedging will be favored only in very small populations (Seger and Brockman 1987; Hopper et al. 2003), although recent theoretical studies have shown that bet hedging can evolve under less restrictive conditions in subdivided populations (Shpak 2005; Lehmann and Balloux 2007; Shpak and Proulx 2007).
Less consideration has been given to the diffusion coefficient, v(p), which differs from the familiar quadratic term, p(1 − p), of the Wright–Fisher diffusion. Because the variance effective population size of a monomorphic population depends on the fecundity variance, it is not surprising that v(p) has an additional dependence on the frequency of A1 whenever the two alleles have different offspring variances. However, as noted by Gillespie (1974), the relationship between allele frequency fluctuations and the allelic composition of the population is counterintuitive. For example, when p is close to 1, so that the population is composed mainly of A1-type individuals, the rate of allele frequency fluctuations is dominated by the variance of the A2 genotype. In particular, if we define the variance effective population size by the expression Np(1 − p)/v(p) (Ewens 1982), then not only is this quantity frequency dependent, but also it depends on the life history traits of the missing genotype whenever the population is fixed for one of the two alleles.
In contrast, the coalescent effective population size of a monomorphic population depends only on the offspring distribution of the fixed allele. The discrepancy between these two quantities raises the following question: how does fecundity variance polymorphism affect the statistical properties of the genealogy of a random sample of individuals?
The answer to this question is of interest for several reasons. First, although the effects of selection on genealogies have received considerable attention (Przeworski et al. 1999; Williamson and Orive 2002; Barton and Etheridge 2004), little is known about the genealogical consequences of variation in traits that alter the coalescent rate. Extrapolating from models in which the effective population size varies under the control of external factors, we might expect the coalescent process in a model with fecundity variance polymorphism to be a stochastic time change of Kingman's coalescent. However, the results derived in the next section show that this intuition is usually wrong. The second motivation is more practical. Even if changes in fecundity variance are usually controlled by selection on other traits, the existence of interspecific differences in fecundity variance suggests that there must be periods when populations are polymorphic for alleles that alter the fecundity variance. In these instances, it might be possible to use sequence data to identify the loci responsible for these changes, but to do so will require the development of methods that exploit patterns that are unique to models in which the effective population size depends on the genetic composition of the population. For example, whereas the effects of genetic hitchhiking are usually restricted to linked sites (Maynard Smith and Haigh 1974; Kim and Stephan 2002; Przeworski 2002; Przeworski et al. 2005), we will see later that selective sweeps by mutations that affect fecundity variance would have a genomewide impact on polymorphism.
Kingman (1982a,b) showed that the genealogy of a sample of individuals from a panmictic, neutrally evolving population of constant size can be described by a simple stochastic process known as the coalescent (or Kingman's coalescent). One of the most important properties of Kingman's coalescent is that it is a Markov process, a fact that is heavily exploited in mathematical analyses and that also allows for efficient simulations of genealogies. Unfortunately, this property generally does not hold in populations composed of nonexchangeable individuals. For example, if there are selective differences between individuals, then although the genealogy of a sample of individuals can still be regarded as a stochastic process, selective interactions between individuals cause this process to also depend on the history of nonancestral lineages. The key to overcoming this difficulty is to embed the genealogical process in a larger process that does satisfy the Markov property. This can be done in two ways. One approach is to embed the coalescent tree within a graphical process called the ancestral selection graph (Krone and Neuhauser 1997; Neuhauser and Krone 1997; Donnelly and Kurtz 1999) in which lineages can either branch, giving rise to pairs of potential ancestors, or coalesce. The intuition behind this construction is that the effects of selection on the genealogy can be accounted for by keeping track of a pool of potential ancestors that includes lineages that have failed to persist due to being outcompeted by individuals of higher fitness. Because the branching rates are linear in the number of lineages, while the coalescence rates are quadratic, this process is certain to reach an ultimate ancestor in finite time.
The process can be stopped at this time, and both the ancestral and the genotypic status of individual branches can be resolved by assigning random mutations to the graph and then traversing it from the root to the leaves.
The second approach is due to Kaplan et al. (1988), who showed that the genealogical history of a sample of genes under selection can be represented by a structured coalescent process. Here we think of the population as being subdivided into several demes, or genetic backgrounds, consisting of individuals that share the same genotype at the selected locus. Because individuals with the same genotype are exchangeable, the rate of coalescence within a background depends only on the size of the background and the number of ancestral lineages sharing that genotype. In addition, mutations at the selected site will move lineages between backgrounds. To obtain a Markov process, we need to keep track of two kinds of information: (i) the types of the ancestral lineages and (ii) the frequencies of the alleles segregating at the selected locus. Fortunately, because one-dimensional diffusion processes are reversible with respect to their stationary distributions (i.e., the detailed balance conditions are satisfied), the ancestral process of allele frequencies at a locus segregating two alleles has the same law as the forward process. Subsequently, Hudson and Kaplan (1988) showed that the genealogy at a linked neutral locus can be described by a structured coalescent defined in terms of the genetic backgrounds at the selected locus; in this case, recombination between the selected and neutral loci can also move lineages between backgrounds.
The objective of this article is to extend the structured coalescent to population genetic models in which within-generation fecundity variance is genotype dependent. (The genealogical consequences of polymorphism affecting between-generation fecundity variance will be described in a separate article.)
In these models, exchangeability is violated not only by selective differences between individuals, but also by differences in life history traits that affect coalescent rates and allele frequency fluctuations. Nonetheless, because lineages are exchangeable within backgrounds, the coalescence and substitution rates can still be calculated conditional on the types of the lineages and the genetic composition of the population. In the next two sections, I derive structured coalescent processes that describe the genealogy at a neutral marker locus that is linked to a second locus (the “selected locus”) that affects fecundity variance. This is first done for a haploid model and then extended to a diploid model in which there may be both sex- and genotype-specific differences in fecundity variance. Results for both models are summarized in
a table [columns: Transition, Haploid model, Diploid model; entries not recovered].
This work shows that coalescent processes in populations with fecundity variance polymorphism differ from the structured coalescent in a monomorphic population in three ways. One difference is that in populations with fecundity variance polymorphism, the coalescent rates in the different genetic backgrounds are not inversely proportional to the variance effective population size. Instead, coalescence within each allelic background depends only on the frequencies and fecundity distributions of genotypes containing that allele. The second difference is that the genealogies at the marker and selected loci are correlated even when these loci are unlinked; i.e., fecundity variance polymorphism has a genomewide impact on genealogies and genetic variation. This follows from the calculations leading up to Equation 28, which show that the genealogical process at an unlinked marker locus can be represented as a stochastic time change of Kingman's coalescent dependent on the ancestral process of allele frequencies at the selected locus. The third and most surprising difference is that the correspondence between ancestral processes and allele frequency processes is many-to-one in diploid models with fecundity variance polymorphism. In fact, there are infinitely many combinations of genotype-dependent fecundity distributions (satisfying Equation 24) that have the same diffusion approximation but different genealogical processes. These results are illustrated numerically using simulations of the structured coalescent under directional and balancing selection. Finally, I examine the scope of the theory and some possible applications in the discussion.
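The frequency dependence of the variance effective population size described in this introduction can be checked numerically with a small sketch. The displayed drift and variance coefficients did not survive in this extract, so the forms below are an assumption: they follow the commonly quoted version of Gillespie's (1974) haploid result, with time measured in units of N generations, and were chosen to agree with the qualitative statements in the text. All function and argument names are hypothetical.

```python
def drift(p, n_adults, mu1, mu2, var1, var2):
    """Assumed drift coefficient m(p) for the frequency p of allele A1.

    It increases with var2 - var1, so a lower-variance A1 can be favored
    even when its mean fecundity is slightly lower.
    """
    return p * (1.0 - p) * (n_adults * (mu1 - mu2) + (var2 - var1))


def diffusion(p, var1, var2):
    """Assumed variance coefficient v(p); note the weight p on var2."""
    return p * (1.0 - p) * (p * var2 + (1.0 - p) * var1)


def variance_effective_size(p, n_adults, var1, var2):
    """N p(1 - p) / v(p), simplified so that p = 0 and p = 1 are defined."""
    return n_adults / (p * var2 + (1.0 - p) * var1)


if __name__ == "__main__":
    N, var_a1, var_a2 = 1000, 1.0, 4.0
    for p in (0.0, 0.5, 1.0):
        print(p, variance_effective_size(p, N, var_a1, var_a2))
```

Under these assumed forms a population fixed for A1 (p = 1) has variance effective size N divided by the offspring variance of A2, the absent genotype, which is the counterintuitive behavior the text attributes to Gillespie (1974).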

An Ancestral Recombination Graph for Diploid Populations with Skewed Offspring Distribution     
Matthias Birkner  Jochen Blath  Bjarki Eldon 《Genetics》2013,193(1):255-290
A large offspring-number diploid biparental multilocus population model of Moran type is our object of study. At each time step, a pair of diploid individuals drawn uniformly at random contributes offspring to the population. The number of offspring can be large relative to the total population size. Similar “heavily skewed” reproduction mechanisms have been recently considered by various authors (cf. e.g., Eldon and Wakeley 2006, 2008) and reviewed by Hedgecock and Pudovkin (2011). Each diploid parental individual contributes exactly one chromosome to each diploid offspring, and hence ancestral lineages can coalesce only when in distinct individuals. A separation-of-timescales phenomenon is thus observed. A result of Möhle (1998) is extended to obtain convergence of the ancestral process to an ancestral recombination graph necessarily admitting simultaneous multiple mergers of ancestral lineages. The usual ancestral recombination graph is obtained as a special case of our model when the parents contribute only one offspring to the population each time. Due to diploidy and large offspring numbers, novel effects appear. For example, the marginal genealogy at each locus admits simultaneous multiple mergers in up to four groups, and different loci remain substantially correlated even as the recombination rate grows large. Thus, genealogies for loci far apart on the same chromosome remain correlated. Correlation in coalescence times for two loci is derived and shown to be a function of the coalescence parameters of our model. Extending the observations by Eldon and Wakeley (2008), predictions of linkage disequilibrium are shown to be functions of the reproduction parameters of our model, in addition to the recombination rate. 
Simulations suggest that, in large offspring-number populations, correlations in ratios of coalescence times between loci can remain high even when the recombination rate is high and the sample size is large, which hints at how to distinguish between different population models.

A Six Nuclear Gene Phylogeny of Citrus (Rutaceae) Taking into Account Hybridization and Lineage Sorting     
Chandrika Ramadugu  Bernard E. Pfeil  Manjunath L. Keremane  Richard F. Lee  Ivan J. Maureira-Butler  Mikeal L. Roose 《PloS one》2013,8(7)


Genus Citrus (Rutaceae) comprises many important cultivated species that generally hybridize easily. Phylogenetic study of a group showing extensive hybridization is challenging. Since the genus Citrus has diverged recently (4–12 Ma), incomplete lineage sorting of ancestral polymorphisms is also likely to cause discrepancies among genes in phylogenetic inferences. Incongruence of gene trees is observed and it is essential to unravel the processes that cause inconsistencies in order to understand the phylogenetic relationships among the species.

Methodology and Principal Findings

(1) We generated phylogenetic trees using haplotype sequences of six low copy nuclear genes. (2) Published simple sequence repeat data were re-analyzed to study population structure and the results were compared with the phylogenetic trees constructed using sequence data and coalescence simulations. (3) To distinguish between hybridization and incomplete lineage sorting, we developed and utilized a coalescence simulation approach. In other studies, species trees have been inferred despite the possibility of hybridization having occurred and used to generate null distributions of the effect of lineage sorting alone (by coalescent simulation). Since this is problematic, we instead generate these distributions directly from observed gene trees. Of the six trees generated, we used the most resolved three to detect hybrids. We found that 11 of 33 samples appear to be affected by historical hybridization. Analysis of the remaining three genes supported the conclusions from the hybrid detection test.


We have identified or confirmed probable hybrid origins for several Citrus cultivars using three different approaches: gene phylogenies, population structure analysis and coalescence simulation. Hybridization and incomplete lineage sorting were identified primarily based on differences among gene phylogenies with reference to null expectations via coalescence simulations. We conclude that identifying hybridization as a frequent cause of incongruence among gene trees is critical to correctly infer the phylogeny among species of Citrus.

Extensions of the Coalescent Effective Population Size
John Wakeley  Ori Sargsyan 《Genetics》2009,181(1):341-345

Population genetics of Cedrela fissilis (Meliaceae) from an ecotone in central Brazil
J. M. Diaz-Soto  A. Huamán-Mera  L. O. Oliveira 《Tree Genetics & Genomes》2018,14(5):73
Cedrela fissilis is an endangered timber species associated with seasonal forests throughout South America. We investigated a population of C. fissilis (PAN) located toward central Brazil to uncover insights on how an ecotone may have shaped the evolutionary history of this species at the local scale. PAN consisted of 18 mother trees and their 283 offspring (18 families), which were genotyped with ten microsatellite loci. We supplemented our dataset with equivalent microsatellite data from 175 specimens representing the east and west lineages of C. fissilis. An array of complementary methods assessed PAN for genetic diversity, population structure, and mating system. In PAN, the gene pool of the east lineage combined with a third (previously unidentified) lineage to form an admixture population. PAN is under inbreeding (Ho = 0.80 and 0.74, uHe = 0.85 and 0.82, Ap = 1.1 and 7.1, F = 0.06 and 0.10, for mother trees and offspring, respectively). Mother trees were predominantly outcrossing (tm = 0.95), with some selfing (1 − tm = 0.05), and crossing between related individuals (tmts = 0.07); they received pollen from few donors (Nep = 9). Restricted gene flow within PAN gave rise to a strong population structure, which split the 18 families into six groups. Some mother trees were reproductively isolated. Conservation perspectives are discussed.
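The reported inbreeding coefficients can be related to the heterozygosity values with a short calculation. The sketch below assumes the common definition F = 1 − Ho/He applied to the unbiased expected heterozygosity (uHe); the abstract does not state which estimator the authors used, and the function name is hypothetical.

```python
def inbreeding_coefficient(h_obs, h_exp):
    """Fixation index F = 1 - Ho / He (assumed definition, see lead-in)."""
    if h_exp <= 0:
        raise ValueError("expected heterozygosity must be positive")
    return 1.0 - h_obs / h_exp


if __name__ == "__main__":
    # (Ho, uHe) pairs as reported in the abstract.
    reported = {"mother trees": (0.80, 0.85), "offspring": (0.74, 0.82)}
    for group, (ho, uhe) in reported.items():
        print(f"{group}: F = {inbreeding_coefficient(ho, uhe):.2f}")
```

With the abstract's values this gives F = 0.06 for mother trees and F = 0.10 for offspring, matching the reported figures, so the assumed definition is at least consistent with the published numbers.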

Simulation of ‘hitch-hiking’ genealogies     
Slade PF 《Journal of mathematical biology》2001,42(1):41-70
An ancestral influence graph is derived, an analogue of the coalescent and a composite of Griffiths' (1991) two-locus ancestral graph and Krone and Neuhauser's (1997) ancestral selection graph. This generalizes their use of branching-coalescing random graphs so as to incorporate both selection and recombination into gene genealogies. Qualitative understanding of a ‘hitch-hiking’ effect on genealogies is pursued via diagrammatic representation of the genealogical process in a two-locus, two-allele haploid model. Extending the simulation technique of Griffiths and Tavaré (1996), computational estimates of expected times to the most recent common ancestor of samples of n genes under recombination and selection in two-locus, two-allele haploid and diploid models are presented. Such times are conditional on sample configuration. Monte Carlo simulations show that ‘hitch-hiking’ is a subtle effect that alters the conditional expected depth of the genealogy at the linked neutral locus depending on a mutation-selection-recombination balance. Received: 21 July 2000 / Published online: 5 December 2000

Topologies of the Conditional Ancestral Trees and Full-Likelihood-Based Inference in the General Coalescent Tree Framework     
Ori Sargsyan 《Genetics》2010,185(4):1355-1368
The general coalescent tree framework is a family of models for determining ancestries among random samples of DNA sequences at a nonrecombining locus. The ancestral models included in this framework can be derived under various evolutionary scenarios. Here, a computationally tractable full-likelihood-based inference method for neutral polymorphisms is presented, using the general coalescent tree framework and the infinite-sites model for mutations in DNA sequences. First, an exact sampling scheme is developed to determine the topologies of conditional ancestral trees. However, this scheme has some computational limitations and to overcome these limitations a second scheme based on importance sampling is provided. Next, these schemes are combined with Monte Carlo integrations to estimate the likelihood of full polymorphism data, the ages of mutations in the sample, and the time of the most recent common ancestor. In addition, this article shows how to apply this method for estimating the likelihood of neutral polymorphism data in a sample of DNA sequences completely linked to a mutant allele of interest. This method is illustrated using the data in a sample of DNA sequences at the APOE gene locus.
The interest in analyzing polymorphism data in contemporary samples of DNA sequences under various evolutionary scenarios creates a demand to design computationally tractable full-likelihood-based inference methods. For an evolutionary scenario of interest, an ancestral-mutation model can be used to design such a method. The ancestral-mutation model for a sample of DNA sequences at a nonrecombining locus is a combination of two processes: one is an ancestral process that traces the lineages of the sample back in time until the most recent common ancestor, constructing an ancestral tree for the sample. The second is a mutation process that is superimposed on the ancestral tree. The complexities of ancestral-mutation models make the design of such methods challenging.
Full data are used instead of summary statistics, which can result in loss of important information in the data (see Felsenstein 1992; Donnelly and Tavaré 1995). In addition, current methods use specific features of the underlying ancestral-mutation models, so they lose flexibility to be applicable to other ancestral-mutation models.
More specifically, Griffiths and Tavaré (1994c, 1995) and Kuhner et al. (1995) developed full-likelihood-based inference methods for neutral polymorphisms at a nonrecombining locus. They used the combinations of the standard coalescent (Kingman 1982a,b,c; Hudson 1983; Tajima 1983) with the finite-sites or infinite-sites (Watterson 1975) models as ancestral-mutation models. Stephens and Donnelly (2000) designed an importance sampling method to estimate the full likelihood of the data using the same settings for the ancestral-mutation models. Hobolth et al. (2008) provided another importance sampling scheme restricted to the infinite-sites model. The last two methods are computationally more efficient than the first two methods, but they lose flexibility to be applicable to ancestral models without standard coalescent features with independent coalescence waiting times, such as the coalescent processes with exponential growth (Slatkin and Hudson 1991; Griffiths and Tavaré 1994b).
To incorporate the coalescent processes with exponential growth, Kuhner et al. (1998) and Griffiths and Tavaré (1994a, 1999) extended their previous methods. For example, the method of Griffiths and Tavaré (1994a, 1999) allows one to consider ancestral models based on coalescent processes with variable population sizes. Coop and Griffiths (2004) modified this inference method and made it applicable for analyzing full polymorphism data in a sample of DNA sequences from a nonrecombining locus completely linked to a mutant allele of interest, either neutral or under selection.
Additionally, ancestral models have been developed for this type of sample, where the mutant allele is either neutral (Griffiths and Tavaré 1998, 2003; Wiuf and Donnelly 1999; Stephens 2000) or under selection (Slatkin and Rannala 1997; Stephens and Donnelly 2003). The ancestral model of Slatkin and Rannala (1997) is part of a family of ancestral models derived by Thompson (1975), Nee et al. (1994), and Rannala (1997), using a linear birth–death process as an evolutionary process in a population. Although all the ancestral models mentioned above differ in their properties and evolutionary scenarios, they are part of the general coalescent tree framework (Griffiths and Tavaré 1998). Therefore, a computationally tractable full-likelihood-based inference method based on this general framework is of great interest.
For a sample of n sequences, an ancestral model in the general coalescent tree framework is described as a bifurcating rooted tree with n − 1 internal nodes and n leaves, where the internal nodes are coalescent events that happen one at a time. The tree is a combination of two independent components: the topology and the branch lengths. The topology of the tree is constructed going backward in time by combining two randomly chosen ancestral lineages of the sample at each node; the branch lengths of the tree are defined by the joint distribution function of the coalescence waiting times. Note that any density function for coalescence waiting times can define an ancestral model in the general coalescent tree framework.
The n leaves (and the sequences in the sample) are labeled from 1 to n; and the n − 1 internal nodes of the ancestral tree are labeled from 1 to n − 1 (in order of occurrence of the coalescent events backward in time). Thus, the topology of an ancestral tree is a leaf-labeled bifurcating rooted tree with totally ordered interior vertices.
These trees are called topological trees.
When using the general coalescent tree framework and the infinite-sites model, an evolutionary process that generates polymorphism data in a sample of DNA sequences can be described in the following way. An ancestral tree is constructed, as described above, and mutations are added independently on different branches of the ancestral tree as Poisson processes with equal rates, θ/2, in which θ is the mutation rate at the locus. Then, at the mutation events, the ancestral sequences of the sample are changed according to the infinite-sites model; that is, each mutation occurs at a site of an ancestral sequence at which no previous mutations occurred. Thus, these changes define polymorphism data.
Naively, this probabilistic framework can be used to estimate the likelihood of the full observed data in a sample of n sequences. That is, data sets are simulated independently as described above and each simulated data set is compared to the observed data. The proportion of the simulated data sets that match the observed data is an estimate of the likelihood of the observed data. Although this approach provides an estimate for the likelihood of the observed data, this method is computationally infeasible, because the topologies of the ancestral trees of the generated data sets are sampled from the space of all the possible topological trees with n leaves. This space has size n!(n − 1)!/2^(n−1) (Edwards 1970), which is huge for moderate values of n. The topologies of the ancestral trees of the generated data sets that match the observed data represent a small portion of that space.
Thus, designing a method that samples topologies of the ancestral trees from this subspace can make the method computationally tractable.
On the basis of this idea, I use the general coalescent tree framework with the infinite-sites model to develop a computationally tractable full-likelihood-based inference method for polymorphisms in DNA sequences at a nonrecombining locus. First, an exact sampling scheme for topologies of the conditional ancestral trees is developed. This method has some computational limitations, so to overcome these limitations a second scheme based on an importance sampling is provided. These sampling schemes are combined with Monte Carlo integrations to estimate the likelihood of the full data, the ages of the mutations in the sample, and the time of the most recent common ancestor of the sample. I describe an application of this method for neutral polymorphism data in a sample of DNA sequences at a nonrecombining locus that is completely linked to a mutant allele of interest, either neutral or under selection. The method is illustrated using the data in a sample of DNA sequences at the APOE gene locus from Fullerton et al. (2000).
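The generative process described in this introduction, and the size of the space of topological trees, can be sketched in a few lines. This is only an illustration of the naive forward simulation the text calls infeasible for inference, not the article's sampling schemes; the function names are hypothetical, and the Kingman waiting times in the usage example are one assumed choice among the many the general framework allows.

```python
import random
from math import factorial


def num_topological_trees(n):
    """Number of topological trees with n leaves: n!(n - 1)!/2^(n - 1)."""
    return factorial(n) * factorial(n - 1) // 2 ** (n - 1)


def poisson_count(rate, length, rng):
    """Number of events of a Poisson process with the given rate on [0, length)."""
    if rate <= 0 or length <= 0:
        return 0
    count, elapsed = 0, rng.expovariate(rate)
    while elapsed < length:
        count += 1
        elapsed += rng.expovariate(rate)
    return count


def kingman_waiting_time(k, rng):
    """Assumed example: exponential waiting time with rate k(k - 1)/2."""
    return rng.expovariate(k * (k - 1) / 2)


def simulate_mutations(n, theta, waiting_time, rng):
    """Build a tree backward in time and drop infinite-sites mutations on it.

    Each mutation is returned as the frozenset of leaf labels (1..n) that
    carry it, which is all the infinite-sites model needs to define the data.
    """
    lineages = [frozenset([i]) for i in range(1, n + 1)]
    mutations = []
    while len(lineages) > 1:
        k = len(lineages)
        t = waiting_time(k, rng)
        # Mutations fall independently on every branch at rate theta / 2.
        for lineage in lineages:
            mutations.extend([lineage] * poisson_count(theta / 2, t, rng))
        # Combine two randomly chosen ancestral lineages.
        a, b = rng.sample(range(k), 2)
        merged = lineages[a] | lineages[b]
        lineages = [x for i, x in enumerate(lineages) if i not in (a, b)]
        lineages.append(merged)
    return mutations


if __name__ == "__main__":
    rng = random.Random(1)
    print([num_topological_trees(n) for n in (2, 3, 4, 10)])
    print(len(simulate_mutations(10, 2.0, kingman_waiting_time, rng)))
```

Passing the waiting-time distribution as an argument mirrors the point that any density for coalescence waiting times defines a model in the framework. The rapid growth of num_topological_trees with n shows why matching simulated data sets against observed data is impractical and why sampling restricted to compatible topologies is attractive.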

Lineage Divergence and Historical Gene Flow in the Chinese Horseshoe Bat (Rhinolophus sinicus)     
Xiuguang Mao  Guimei He  Junpeng Zhang  Stephen J. Rossiter  Shuyi Zhang 《PloS one》2013,8(2)
Closely related taxa living in sympatry provide good opportunities to investigate the origin of barriers to gene flow as well as the extent of reproductive isolation. The only two recognized subspecies of the Chinese rufous horseshoe bat Rhinolophus sinicus are characterized by unusual relative distributions in which R. s. septentrionalis is restricted to a small area within the much wider range of its sister taxon R. s. sinicus. To determine the history of lineage divergence and gene flow between these taxa, we applied phylogenetic, demographic and coalescent analyses to multi-locus datasets. MtDNA gene genealogies and microsatellite-based clustering together revealed three divergent lineages of sinicus, corresponding to Central China, East China and the offshore Hainan Island. However, the central lineage of sinicus showed a closer relationship with septentrionalis than with other lineages of R. s. sinicus, contrary to morphological data. Paraphyly of sinicus could result from past asymmetric mtDNA introgression between these two taxa, or could suggest that septentrionalis evolved in situ from its more widespread sister subspecies. To test between these hypotheses, we applied coalescent-based phylogenetic reconstruction and Approximate Bayesian Computation (ABC). We found that septentrionalis is likely to be the ancestral taxon and therefore a recent origin of this subspecies can be ruled out. On the other hand, we found a clear signature of asymmetric mtDNA gene flow from septentrionalis into central populations of sinicus yet no nuclear gene flow, thus strongly pointing to historical mtDNA introgression. We suggest that the observed deeply divergent lineages within R. sinicus probably evolved in isolation in separate Pleistocene refugia, although their close phylogeographic correspondence with distinct eco-environmental zones suggests that divergent selection might also have promoted broad patterns of population genetic structure.

A mathematical model of biological evolution     
K. Ishii  H. Matsuda  N. Ogita 《Journal of mathematical biology》1982,14(3):327-353
In order to understand generally how the biological evolution rate depends on relevant parameters such as mutation rate, intensity of selection pressure and its persistence time, the following mathematical model is proposed: $dN_n(t)/dt = (m_n(t) - \mu) N_n(t) + \mu N_{n-1}(t)$ ($n = 0, 1, 2, 3, \ldots$), where $N_n(t)$ and $m_n(t)$ are respectively the number and Malthusian parameter of replicons with step number $n$ in a population at time $t$, and $\mu$ is the mutation rate, assumed to be a positive constant. The step number of each replicon is defined as either equal to or larger by one than that of its parent, the latter case occurring when and only when mutation has taken place. The average evolution rate is rigorously obtained for case (i), $m_n(t) = m_n$ independent of $t$ (constant fitness model), where $m_n$ is essentially periodic with respect to $n$, and for case (ii) (periodic fitness model), together with the long-time average of the average Malthusian parameter. The biological meaning of the results is discussed, comparing them with the features of actual molecular evolution and with some results of computer simulation of the model for finite populations. An early version of this study was read at the International Symposium on Mathematical Topics in Biology held in Kyoto, Japan, on September 11–12, 1978, and was published in its Proceedings.
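
As a rough illustration of how such a model can be explored numerically, the sketch below integrates the system with a forward-Euler scheme, assuming it reads dN_n/dt = (m_n − μ)N_n + μN_{n−1} with N_{−1} = 0 and time-independent m_n. The function name `evolve`, the truncation at a finite number of step classes, and the step size are assumptions made for this example and are not taken from the paper.

```python
def evolve(malthusian, mu, dt=1e-3, t_end=1.0):
    """Forward-Euler integration of dN_n/dt = (m_n - mu) N_n + mu N_{n-1}.

    malthusian: list of constant Malthusian parameters m_n, one per step class.
    The population starts entirely in class n = 0; replicons that mutate out of
    the last class are dropped (truncation assumed for this sketch).
    """
    n_classes = len(malthusian)
    counts = [0.0] * n_classes
    counts[0] = 1.0
    for _ in range(round(t_end / dt)):
        updated = []
        for n in range(n_classes):
            # Inflow from the class one mutational step below (none for n = 0).
            inflow = mu * counts[n - 1] if n > 0 else 0.0
            growth = (malthusian[n] - mu) * counts[n]
            updated.append(counts[n] + dt * (growth + inflow))
        counts = updated
    return counts


def mean_step_number(counts):
    """Average step number, weighted by class size."""
    return sum(n * c for n, c in enumerate(counts)) / sum(counts)


if __name__ == "__main__":
    final = evolve([0.0] * 30, mu=0.5)
    print(mean_step_number(final))
```

When all m_n are equal, the equation implies that the mean step number grows at rate μ, so this neutral case is a convenient sanity check before trying step-dependent fitness values.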

Coalescent patterns in diploid exchangeable population models     
Möhle M  Sagitov S 《Journal of mathematical biology》2003,47(4):337-352
A class of two-sex population models is considered with N females and an equal number N of males constituting each generation. Reproduction is assumed to undergo three stages: 1) random mating, 2) exchangeable reproduction, 3) random sex assignment. Treating individuals as pairs of genes at a certain locus, we introduce the diploid ancestral process (the past genealogical tree) for n such genes sampled in the current generation. Neither mutation nor selection is assumed. A convergence criterion for the diploid ancestral process is proved as N goes to infinity while n remains unchanged. Conditions are specified when the limiting process (coalescent) is the Kingman coalescent and situations are discussed when the coalescent allows for multiple mergers of ancestral lines. Work supported by the Bank of Sweden Tercentenary Foundation. Mathematics Subject Classification (2000): Primary 92F25, 60J70; Secondary 92D15, 60F17

Three brown trout Salmo trutta lineages in Corsica described through allozyme variation
P. Berrebi 《Journal of fish biology》2015,86(1):60-73
The brown trout Salmo trutta is represented by three lineages in Corsica: (1) an ancestral Corsican lineage, (2) a Mediterranean lineage and (3) a recently stocked domestic Atlantic S. trutta lineage (all are interfertile); the main focus of this study was the ancestral Corsican S. trutta, but the other lineages were also considered. A total of 38 samples captured between 1993 and 1998 were analysed, with nearly 1000 individuals considered overall. The Corsican ancestral lineage (Adriatic lineage according to the mitochondrial DNA control region nomenclature, AD) mostly inhabits streams in the southern half of the island; the Mediterranean lineage (ME) is present more in the north, especially in the Golu River, but most populations are an admixture of these lineages and the domestic Atlantic S. trutta (AT). Locations where the Corsican ancestral S. trutta is dominant are now protected against stocking and sometimes fishing is also forbidden. The presence of the Corsican S. trutta is unique in France.

The Total Branch Length of Sample Genealogies in Populations of Variable Size     
A. Eriksson  B. Mehlig  M. Rafajlovic  S. Sagitov 《Genetics》2010,186(2):601-611
We consider neutral evolution of a large population subject to changes in its population size. For a population with a time-variable carrying capacity we study the distribution of the total branch lengths of its sample genealogies. Within the coalescent approximation we have obtained a general expression (Equation 20) for the moments of this distribution with a given arbitrary dependence of the population size on time. We investigate how the frequency of population-size variations alters the total branch length.

Models for gene genealogies of biological populations often assume a constant, time-independent population size N. This is the case for the Wright–Fisher model (Fisher 1930; Wright 1931), for the Moran model (Moran 1958), and for their representation in terms of the coalescent (Kingman 1982). In real biological populations, by contrast, the population size changes over time. Such fluctuations may be due to catastrophic events (bottlenecks) and subsequent population expansions or just reflect the randomness in the factors determining the population dynamics. Many authors have argued that genetic variation in a population subject to size fluctuations may nevertheless be described by the Wright–Fisher model, if one replaces the constant population size in this model by an effective population size of the form given in Equation 1, where $N_l$ stands for the population size in generation $l$. The harmonic average in Equation 1 is argued to capture the significant effect of catastrophic events on patterns of genetic variation in a population: if, for example, a population went through a recent bottleneck, a large fraction of individuals in a given sample would originate from few parents. This in turn would lead to significantly reduced genetic variation, parameterized by a small value of $N_{\mathrm{eff}}$. (See, e.g., Ewens 1982 for a review of different measures of the effective population size and Sjödin et al.
2005 and Wakeley and Sargsyan 2009 for recent developments of this concept.)

The concept of an effective population size has been frequently used in the literature, implicitly assuming that the distribution of neutral mutations in a large population of fluctuating size is identical to the distribution in a Wright–Fisher model with the corresponding constant effective population size given by Equation 1. However, recently it was shown that this is true only under certain circumstances (Kaj and Krone 2003; Nordborg and Krone 2003; Jagers and Sagitov 2004). It is argued by Sjödin et al. (2005) that the concept of an effective population size is appropriate when the timescale of fluctuations of $N_l$ is either much smaller or much larger than the typical time between coalescent events in the sample genealogy. In these limits it can be proved that the distribution of the sample genealogies is exactly given by that of the coalescent with a constant, effective population size.

More importantly, it follows from these results that, in populations with variable size, the coalescent with a constant effective population size is not always a valid approximation for the sample genealogies. Deviations between the predictions of the standard coalescent model and empirical data are frequently observed, and there are a number of different statistical tests quantifying the corresponding discrepancies (see, for example, Tajima 1989, Fu and Li 1993, and Zeng et al. 2006). The analysis of such deviations is of crucial importance in understanding, for example, human genetic history (Garrigan and Hammer 2006).
But while there is a substantial amount of work numerically quantifying deviations, often in terms of a single number, little is known about their qualitative origins and their effect upon summary statistics in the population in question.

The question is thus to understand the effect of population-size fluctuations on the patterns of genetic variation, in particular for the case where the scale of the population-size fluctuations is comparable to the time between coalescent events in the ancestral tree. As is well known, many empirical measures of genetic variation can be computed from the total branch length of the sample genealogy (the expected number of single-nucleotide polymorphisms, for example, is proportional to the average total branch length).

The aim of this article is to analyze the distribution of the scaled total branch length $T_n$ for a sample genealogy in a population of fluctuating size, as illustrated in Figure 1. For the genealogy of $n \ge 2$ lineages sampled at the present time, the expression $\lfloor N T_n \rfloor$ gives the total branch length in terms of generations. Here $\lfloor N t \rfloor$ is the largest integer $\le N t$, and the scaling factor $N$ is a suitable measure of the number of genes in the population and serves as a counterpart of the constant generation size of the standard Wright–Fisher model.

Figure 1. The effect of population-size oscillations on the genealogy of a sample of size n = 17 (schematic). Left, genealogy described by Kingman's coalescent for a large population of constant size, illustrated by the light blue rectangle; right, sinusoidally varying population size. Coalescence is accelerated in regions of small population sizes and vice versa.
This significantly alters the tree and gives rise to changes in the distribution of the number of mutations and of the population homozygosity.

A motivating example is given in Figure 2, which shows numerically computed distributions $\rho(T_n)$ of the total branch lengths $T_n$ for a particular population model with a time-dependent carrying capacity. The model is described briefly in the Figure 2 legend and in detail in the section "A model for a population with time-dependent carrying capacity". As Figure 2 shows, the distributions depend in a complex manner on the form of the size changes. We observe that when the frequency of the population-size fluctuations is very small (Figure 2a), the distribution is well described by the standard coalescent result, Equation 2 (Hein et al. 2005). When the frequency is very large (Figure 2e), Equation 2 also applies, but with a different time scaling reflecting an effective population size: $t$ on the right-hand side (rhs) of Equation 2 is replaced by $t/c$ with $c = N/N_{\mathrm{eff}}$. Apart from these special limits, however, the form of the distributions appears to depend in a complicated manner upon the frequency of the population-size variation. The observed behavior is caused by the fact that coalescence proceeds faster for smaller population sizes and more slowly for larger population sizes, as illustrated in Figure 1. But the question is how to quantitatively account for the changes shown in Figure 2.

Figure 2. Numerically computed distributions of the scaled total branch lengths $T_n$ in genealogies of samples of size n = 10. The model employed in the simulations is outlined in the section "A model for a population with time-dependent carrying capacity". It describes a population subject to a time-varying carrying capacity, $K_l = K_0(1 + \varepsilon \sin(2\pi\nu l))$. The frequency of the time changes is determined by $\nu$, and $l = 1, 2, 3, \ldots$ labels discrete generations forward in time.
The parameter $N = K_0$ describes the typical population size, which is taken here to be equal to the time-averaged carrying capacity. Panels a–e show $\rho(T_n)$ for populations with increasingly rapidly oscillating carrying capacity. The dashed red line in a shows that in the limit of low frequencies the standard coalescent result, Equation 2, is obtained. The dashed red line in e shows that also in the limit of large frequencies the standard coalescent result is obtained, but now with an effective population size. The dashed red line in d is a two-parameter distribution, Equation 41, derived in the section "Comparison between numerical simulations and coalescent predictions". Further numerical and analytical results on the frequency dependence of the moments of these distributions are shown in Figure 4. Parameter values used: $K_0 = 10{,}000$, $\varepsilon = 0.9$, and $r = 1$ (see the section "A model for a population with time-dependent carrying capacity" for the exact meaning of the intrinsic growth rate $r$) and (a) $\nu N = 0.001$, (b) $\nu N = 0.1$, (c) $\nu N = 0.316$, (d) $\nu N = 1$, and (e) $\nu N = 100$.

We show in this article that the results of the simulations displayed in Figure 2 are explained by a general expression (Equation 20) for the moments of the distributions shown in Figure 2. Our general result is obtained within the coalescent approximation valid in the limit of large population size. But we find that in most cases, the coalescent approximation works very well down to small population sizes (a few hundred individuals). Our result enables us to understand and quantitatively describe how the distributions shown in Figure 2 depend upon the frequency of the population-size oscillations. It makes it possible to determine, for example, how the variance, skewness, and kurtosis of these distributions depend upon the frequency of demographic fluctuations.
This in turn allows us to compute the population homozygosity and to characterize genetic variation in populations with size fluctuations.

The remainder of this article is organized as follows. The next section summarizes our analytical results for the moments of the total branch length. Following that, we describe the model employed in the computer simulations. Then, corresponding numerical results are compared to the analytical predictions. And finally, we summarize how population-size fluctuations influence the distribution of total branch lengths and conclude with an outlook.
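
The harmonic-average effective population size discussed in this excerpt can be illustrated with a short Python sketch. Equation 1 itself is not reproduced in this listing, so the sketch assumes it is the plain harmonic mean of the per-generation sizes, and it further assumes that the population size equals the carrying capacity K_l = K0(1 + ε sin(2πνl)) in every generation, which is a simplification of the paper's model. The function names and the choice of one oscillation per 1000 generations are illustrative.

```python
import math


def carrying_capacity(l, k0, eps, nu):
    """K_l = K0 (1 + eps sin(2 pi nu l)), as given in the Figure 2 legend."""
    return k0 * (1.0 + eps * math.sin(2.0 * math.pi * nu * l))


def harmonic_mean_size(sizes):
    """Harmonic average of per-generation population sizes."""
    sizes = list(sizes)
    return len(sizes) / sum(1.0 / s for s in sizes)


if __name__ == "__main__":
    k0, eps = 10_000, 0.9        # values quoted in the Figure 2 legend
    period = 1000                # assumed: one oscillation per 1000 generations
    sizes = [carrying_capacity(l, k0, eps, 1.0 / period)
             for l in range(1, period + 1)]
    n_eff = harmonic_mean_size(sizes)
    print(n_eff, k0 / n_eff)     # effective size and the ratio c = N / N_eff
```

Under these assumptions the harmonic mean over a full period approaches K0·sqrt(1 − ε²), so with ε = 0.9 the effective size is well under half of K0: generations with small population size dominate the average.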

Approximating the coalescent with recombination     
McVean GA  Cardin NJ 《Philosophical transactions of the Royal Society of London. Series B, Biological sciences》2005,360(1459):1387-1393
The coalescent with recombination describes the distribution of genealogical histories and resulting patterns of genetic variation in samples of DNA sequences from natural populations. However, using the model as the basis for inference is currently severely restricted by the computational challenge of estimating the likelihood. We discuss why the coalescent with recombination is so challenging to work with and explore whether simpler models, under which inference is more tractable, may prove useful for genealogy-based inference. We introduce a simplification of the coalescent process in which coalescence between lineages with no overlapping ancestral material is banned. The resulting process has a simple Markovian structure when generating genealogies sequentially along a sequence, yet has very similar properties to the full model, both in terms of describing patterns of genetic variation and as the basis for statistical inference.

Chaotic Dynamics of Neuromuscular System Parameters and the Problems of the Evolution of Complexity     
V. V. Eskov  O. E. Filatova  T. V. Gavrilenko  D. V. Gorbunov 《Biophysics》2017,62(6):961-966
The evolution rate v(t) varies among diverse biosystems, but a general theory can be formulated when the dynamics of the biosystem state vector $x = x(t) = (x_1, x_2, \ldots, x_m)^T$ is considered in the m-dimensional space of states. A mathematical approach is proposed for evaluating such processes; it describes them in terms of particular chaos of the statistical distribution functions f(x). In the case of complex multicomponent systems with a high dimension number m ($m \gg 1$) of the phase space of states, we propose using pairwise comparison matrices of samples x(t) when homeostasis is constant and calculating the parameters of quasiattractors. The Glansdorff–Prigogine thermodynamic approach to estimating evolution is inefficient in assessing the third-type systems, while it is applicable and the Prigogine theorem works at the level of molecular systems. Alterations in the state of the human neuromuscular system were found to lead to chaotic changes in the statistical functions f(x) in tremor recording samples, while quasiattractor parameters demonstrate a certain regularity.

Asymptotic line-of-descent distributions     
R. C. Griffiths 《Journal of mathematical biology》1984,21(1):67-75
Asymptotic distributions are derived for the number of non-mutant ancestors, at time t in the past, of a sample of n from a neutral infinite alleles model. Either the number of non-mutant ancestors $L_n(t)$ has a normal distribution or $n - L_n(t)$ has a Poisson distribution as $n \to \infty$, $t \to 0$.

TREES SIFTER 1.0: an approximate method to estimate the time to the most recent common ancestor of a sample of DNA sequences     
PATRICK MARDULYN 《Molecular ecology resources》2007,7(3):418-421
TREES SIFTER 1.0 implements an approximate method to estimate the time to the most recent common ancestor (TMRCA) of a set of DNA sequences, using population evolution modelling. In essence, the program simulates genealogies with a user-defined model of coalescence of lineages, and then compares each simulated genealogy to the genealogy inferred from the real data, through two summary statistics: (i) the number of mutations on the genealogy ($M_n$), and (ii) the number of different sequence types (alleles) observed ($K_n$). The simulated genealogies are then submitted to a rejection algorithm that keeps only those that are the most likely to have generated the observed sequence data. At the end of the process, the accepted genealogies can be used to estimate the posterior probability distribution of the TMRCA.
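
The simulate-and-reject idea described in this abstract can be sketched in a few lines of Python. This is not the program's own algorithm: it assumes the standard constant-size (Kingman) coalescent with time in coalescent units, conditions only on the number of mutations and ignores the number of sequence types, and every function name here is hypothetical.

```python
import random


def poisson(rng, lam):
    """Poisson draw: count unit-rate exponential arrivals falling in [0, lam]."""
    count, t = 0, rng.expovariate(1.0)
    while t < lam:
        count += 1
        t += rng.expovariate(1.0)
    return count


def simulate_genealogy(rng, n, theta):
    """Return (TMRCA, number of mutations) for one simulated genealogy."""
    tmrca, total_length = 0.0, 0.0
    for k in range(n, 1, -1):
        # While k lineages remain, the next coalescence occurs at rate k(k-1)/2.
        t_k = rng.expovariate(k * (k - 1) / 2.0)
        tmrca += t_k
        total_length += k * t_k
    # Mutations fall on the tree at rate theta/2 per unit of branch length.
    return tmrca, poisson(rng, theta / 2.0 * total_length)


def rejection_tmrca(n, theta, observed_mutations, n_sims=20000, seed=1):
    """Keep the TMRCA of every simulation that reproduces the observed count."""
    rng = random.Random(seed)
    accepted = []
    for _ in range(n_sims):
        tmrca, mutations = simulate_genealogy(rng, n, theta)
        if mutations == observed_mutations:
            accepted.append(tmrca)
    return accepted


if __name__ == "__main__":
    sample = rejection_tmrca(n=5, theta=2.0, observed_mutations=3)
    print(len(sample), sum(sample) / len(sample))
```

The accepted values approximate the distribution of the TMRCA conditional on the observed mutation count. Adding a second summary statistic would make the match stricter at the price of a lower acceptance rate.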

Estimating ancestral distributions of lineages with uncertain sister groups: a statistical approach to Dispersal–Vicariance Analysis and a case using Aesculus L. (Sapindaceae) including fossils     
A.J. HARRIS  Qiu‐Yun XIANG 《植物分类学报:英文版》2009,47(5):349-368
Abstract We propose a simple statistical approach for using Dispersal–Vicariance Analysis (DIVA) software to infer biogeographic histories without fully bifurcating trees. In this approach, ancestral ranges are first optimized for a sample of Bayesian trees. The probability P of an ancestral range r at a node is then calculated as $P(r_Y) = \sum_{t=1}^{n} F(r_Y)_t \, P_t$, where Y is a node, $F(r_Y)$ is the frequency of range r among all the optimal solutions resulting from DIVA optimization at node Y, t is one of n topologies optimized, and $P_t$ is the probability of topology t. Node Y is a hypothesized ancestor shared by a specific crown lineage and the sister of that lineage "x", where x may vary due to phylogenetic uncertainty (polytomies and nodes with posterior probability <100%). Using this method, the ancestral distribution at Y can be estimated to provide inference of the geographic origins of the specific crown group of interest. This approach takes into account phylogenetic uncertainty as well as uncertainty from DIVA optimization. It is an extension of the previously described method called Bayes-DIVA, which pairs Bayesian phylogenetic analysis with biogeographic analysis using DIVA. Further, we show that the probability P of an ancestral range at Y calculated using this method does not equate to $pp \cdot F(r_Y)$ on the Bayesian consensus tree when both variables are <100%, where pp is the posterior probability and $F(r_Y)$ is the frequency of range r for the node containing the specific crown group. We tested our DIVA-Bayes approach using Aesculus L., which has major lineages unresolved as a polytomy. We inferred the most probable geographic origins of the five traditional sections of Aesculus and of Aesculus californica Nutt. and examined range subdivisions at parental nodes of these lineages. Additionally, we used the DIVA-Bayes data from Aesculus to quantify the effects on biogeographic inference of including two wildcard fossil taxa in phylogenetic analysis.
Our analysis resolved the geographic ranges of the parental nodes of the lineages of Aesculus with moderate to high probabilities. The probabilities were greater than those estimated using the simple calculation of $pp \cdot F(r_Y)$ at a statistically significant level for two of the six lineages. We also found that adding fossil wildcard taxa in phylogenetic analysis generally increased P for ancestral ranges including the fossil's distribution area. The ΔP was more dramatic for ranges that include the area of a wildcard fossil with a distribution area underrepresented among extant taxa. This indicates the importance of including fossils in biogeographic analysis. Examination of range subdivision at the parental nodes revealed potential range evolution (extinction and dispersal events) along the stems of A. californica and sect. Parryana.

Strong sequence patterns in eukaryotic promoter regions: Potential implications for DNA structure     
《The International journal of biochemistry》1993,25(4):597-607
  1. Analysis of eukaryotic sequences reveals recurring trends in upstream regions. Oligomers composed of $(G/C)_n$ and $(A/T)_m$ blocks are preferentially flanked by $(G/C)_2$ doublets on their 3′ rather than on their 5′ ends, that is, $(G/C)_n(A/T)_m(G/C)_2 > (G/C)_{n+2}(A/T)_m$.
  2. These trends are stronger for larger n and smaller m. Additional trends are outlined below.
  3. The trends are correlated with DNA structural parameters, in particular with twist and roll angles.
  4. Generally, the trends hold if the base pair step joining the 5′ $(G/C)_2$ doublet to the $(G/C)_n(A/T)_m$ oligomer is not undertwisted and is not strongly rolled into the major groove.
  5. Other DNA parameters crucial for DNA-protein interactions are discussed as well.

An application of the central limit theorem to coalescence times in the structured coalescent model with strong migration     
Morihiro Notohara 《Journal of mathematical biology》2010,61(5):695-714
The structured coalescent describes the ancestral relationship among sampled genes from a geographically structured population. The aim of this article is to apply the central limit theorem to functionals of the migration process to study coalescence times and population structure. An application of the law of large numbers to the migration process leads to the strong migration limit for the distributions of coalescence times. The central limit theorem enables us to obtain approximate distributions of coalescence times for strong migration. We show that approximate distributions depend on the population structure. If migration is conservative and strong, we can define a kind of effective population size $N_e^*$, with which the entire population approximately behaves like a panmictic population. On the other hand, the approximate distributions for nonconservative migration are qualitatively different from those for conservative migration, and the entire population does not behave like a panmictic population even though migration is strong.
