Similar articles
 Found 20 similar articles; search took 187 ms
1.
2.
In this paper, we investigate the standard Yule model and a recently studied model of speciation and extinction, the “critical branching process.” We develop an analytic approach, as opposed to the common simulation approach, for calculating the speciation times in a reconstructed phylogenetic tree. Simple expressions for the density and the moments of the speciation times are obtained. Methods for dating a speciation event become valuable when no time scale is available for the reconstructed phylogenetic trees. A missing time scale could be due to supertree methods, morphological data, or molecular data that violate the molecular clock. Our analytic approach is particularly useful for the model with extinction, since simulations of birth-death processes that are conditioned on obtaining n extant species today are quite delicate. Further, simulations are very time consuming for large n under both models.

3.
The relationship between speciation times and the corresponding times of gene divergence is of interest in phylogenetic inference as a means of understanding the past evolutionary dynamics of populations and of estimating the timing of speciation events. It has long been recognized that gene divergence times might substantially pre-date speciation events. Although the distribution of the difference between these has previously been studied for the case of two populations, this distribution has not been explicitly computed for larger species phylogenies. Here we derive a simple method for computing this distribution for trees of arbitrary size. A two-stage procedure is proposed which (i) considers the probability distribution of the time from the speciation event at the root of the species tree to the gene coalescent time conditionally on the number of gene lineages available at the root; and (ii) calculates the probability mass function for the number of gene lineages at the root. This two-stage approach dramatically simplifies numerical analysis, because in the first step the conditional distribution does not depend on an underlying species tree, while in the second step the pattern of gene coalescence prior to the species tree root is irrelevant. In addition, the algorithm provides intuition concerning the properties of the distribution with respect to the various features of the underlying species tree. The methodology is complemented by developing probabilistic formulae and software, written in R. The method and software are tested on five-taxon species trees with varying levels of symmetry. The examples demonstrate that more symmetric species trees tend to have larger mean coalescent times and are more likely to have a unimodal gamma-like distribution with a long right tail, while asymmetric trees tend to have smaller mean coalescent times with an exponential-like distribution. In addition, species trees with longer branches generally have shorter mean coalescent times, with branches closest to the root of the tree being most influential.
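
The two-stage idea in the abstract above can be illustrated with a small Python sketch. It is not the authors' R software. It assumes the standard Kingman coalescent in a constant-size root population, with time measured in coalescent units (pairwise coalescence rate 1), and it takes the stage (ii) probability mass function as a given input; the names `expected_tmrca`, `expected_root_coalescent_time` and `lineage_pmf` are hypothetical.

```python
from math import comb


def expected_tmrca(k: int) -> float:
    """Stage (i): mean time for k lineages at the root to reach one ancestor.

    Assumes a standard coalescent: while j lineages remain, the waiting
    time to the next coalescence is exponential with rate C(j, 2).
    """
    if k < 1:
        raise ValueError("k must be at least 1")
    return sum(1.0 / comb(j, 2) for j in range(2, k + 1))


def expected_root_coalescent_time(lineage_pmf: dict) -> float:
    """Combine stage (i) with a stage (ii) distribution.

    lineage_pmf maps k (number of gene lineages at the root) to its
    probability; computing it from a species tree is not shown here.
    """
    return sum(p * expected_tmrca(k) for k, p in lineage_pmf.items())
```

Under these assumptions the stage (i) mean has the closed form 2(1 - 1/k), which depends only on k and not on the species tree, in line with the separation the abstract describes.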

4.
We propose a model-based approach that uses multiple gene trees to estimate the species tree. The coalescent process requires that gene divergences occur earlier than species divergences when there is any polymorphism in the ancestral species. Under this scenario, speciation times are restricted to be smaller than the corresponding gene split times. The maximum tree (MT) is the tree with the largest possible speciation times in the space of species trees restricted by available gene trees. If all populations have the same population size, the MT is the maximum likelihood estimate of the species tree. It can be shown that the MT is a consistent estimator of the species tree even when it is built upon estimates of the true gene trees, provided the gene tree estimates are statistically consistent. The MT converges in probability to the true species tree at an exponential rate.
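
The constraint described above (each speciation time can be no larger than the corresponding gene split times) can be sketched in Python as follows. This is only an illustration of the bound, not the authors' implementation: the input layout (one symmetric matrix of pairwise gene coalescence times per locus, indexed by species) and the function name `pairwise_upper_bounds` are assumptions, and turning the bounds into a tree is left out.

```python
def pairwise_upper_bounds(gene_coal_times):
    """Largest divergence time allowed for each species pair.

    gene_coal_times: list of loci; each locus is an n x n matrix whose
    entry [i][j] is the coalescence time, in that gene tree, of the
    sequences sampled from species i and species j (hypothetical layout).
    The bound for a pair is the minimum of that entry over all loci.
    """
    n = len(gene_coal_times[0])
    bounds = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j:
                bounds[i][j] = min(locus[i][j] for locus in gene_coal_times)
    return bounds
```

For example, if two loci give coalescence times of 1.0 and 2.5 for the same species pair, the pair's divergence can be dated no earlier than 1.0 under this constraint.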

5.
An improved Bayesian method is presented for estimating phylogenetic trees using DNA sequence data. The birth-death process with species sampling is used to specify the prior distribution of phylogenies and ancestral speciation times, and the posterior probabilities of phylogenies are used to estimate the maximum posterior probability (MAP) tree. Monte Carlo integration is used to integrate over the ancestral speciation times for particular trees. A Markov Chain Monte Carlo method is used to generate the set of trees with the highest posterior probabilities. Methods are described for an empirical Bayesian analysis, in which estimates of the speciation and extinction rates are used in calculating the posterior probabilities, and a hierarchical Bayesian analysis, in which these parameters are removed from the model by an additional integration. The Markov Chain Monte Carlo method avoids the requirement of our earlier method for calculating MAP trees to sum over all possible topologies (which limited the number of taxa in an analysis to about five). The methods are applied to analyze DNA sequences for nine species of primates, and the MAP tree, which is identical to a maximum-likelihood estimate of topology, has a probability of approximately 95%.

6.
Estimates of the timing of divergence are central to testing the underlying causes of speciation. Relaxed molecular clocks and fossil calibration have improved these estimates; however, these advances are implemented in the context of gene trees, which can overestimate divergence times. Here we couple recent innovations for dating speciation events with the analytical power of species trees, where multilocus data are considered in a coalescent context. Divergence times are estimated in the bird genus Aphelocoma to test whether speciation in these jays coincided with mountain uplift or glacial cycles. Gene trees and species trees show general agreement that diversification began in the Miocene amid mountain uplift. However, dates from the multilocus species tree are more recent, occurring predominantly in the Pleistocene, consistent with the theory that divergence times can be significantly overestimated by gene-tree-based approaches that do not correct for genetic divergence that predates speciation. In addition to coalescent stochasticity, Haldane's rule could account for some differences in timing estimates between mitochondrial DNA and nuclear genes. By incorporating a fossil calibration applied to the species tree, in addition to the process of gene lineage coalescence, the present approach provides a more biologically realistic framework for dating speciation events, and hence for testing the links between diversification and specific biogeographic and geologic events.

7.
We investigate some discrete structural properties of evolutionary trees generated under simple null models of speciation, such as the Yule model. These models have been used as priors in Bayesian approaches to phylogenetic analysis, and also to test hypotheses concerning the speciation process. In this paper we describe new results for three properties of trees generated under such models. Firstly, for a rooted tree generated by the Yule model we describe the probability distribution on the depth (number of edges from the root) of the most recent common ancestor of a random subset of k species. Next we show that, for trees generated under the Yule model, the approximate position of the root can be estimated from the associated unrooted tree, even for trees with a large number of leaves. Finally, we analyse a biologically motivated extension of the Yule model and describe its distribution on tree shapes when speciation occurs in rapid bursts.

8.
Drawing inferences about macroevolutionary processes from phylogenetic trees is a fundamental challenge in evolutionary biology. Understanding stochastic models for speciation is an essential step in solving this challenge. We consider a neutral class of stochastic models for speciation, the constant rate birth-death process. For trees with n extant species (which might be derived from bigger trees via random taxon sampling), we calculate the expected time of the kth speciation event (k=1,...,n-1). Further, for a tree with n extant species, we calculate the density and expectation for the number of lineages at any time between the origin of the process and the present. With the developed methods, expected lineages-through-time (LTT) plots can be drawn analytically. The effect of random taxon sampling on LTT plots is discussed.
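
To make the notion of an LTT plot concrete, here is a minimal Python sketch that turns the speciation ages of one reconstructed tree into the step function an LTT plot displays. It is an empirical counterpart for a single tree, not the analytic expectation derived in the abstract above. The function names are hypothetical, and ages are assumed to be measured backwards from the present, with a single stem lineage before the oldest speciation event.

```python
def lineages_through_time(speciation_ages):
    """Return (age, lineage_count) steps, from the oldest event to the youngest.

    A tree with n extant species has n - 1 speciation events; the count
    rises from 2 after the oldest event to n after the youngest one.
    """
    ages = sorted(speciation_ages, reverse=True)
    return [(age, k + 2) for k, age in enumerate(ages)]


def lineage_count_at(speciation_ages, t):
    """Number of lineages alive at age t (time before the present)."""
    return 1 + sum(1 for age in speciation_ages if age > t)
```

Plotting the returned steps with the count on a logarithmic axis gives the usual LTT display.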

9.
An important challenge for phylogenetic studies of closely related species is the existence of deep coalescence and gene tree heterogeneity. However, their effects can vary between species and they are often neglected in phylogenetic analyses. In addition, a practical problem in the reconstruction of shallow phylogenies is to determine the most efficient set of DNA markers for a reliable estimation. To address these questions, we conducted a multilocus simulation study using empirical values of nucleotide diversity and substitution rates obtained from a wide range of mammals and evaluated the performance of both gene tree and species tree approaches to recover the known speciation times and topological relationships. We first show that deep coalescence can be a serious problem, more than usually assumed, for the estimation of speciation times in mammals using traditional gene trees. Furthermore, we tested the performance of different sets of DNA markers in the determination of species trees using a coalescent approach. Although the best estimates of speciation times were obtained, as expected, with the use of an increasing number of nuclear loci, our results show that similar estimations can be obtained with a much lower number of genes and the incorporation of a mitochondrial marker, with its high information content. Thus, the use of the combined information of both nuclear and mitochondrial markers in a species tree framework is the most efficient option to estimate recent speciation times and, consequently, the underlying species tree.

10.
The constant rate birth–death process is a popular null model for speciation and extinction. If one removes extinct and non-sampled lineages, this process induces ‘reconstructed trees’ which describe the relationship between extant lineages. We derive the probability density of the length of a randomly chosen pendant edge in a reconstructed tree. For the special case of a pure-birth process with complete sampling, we also provide the probability density of the length of an interior edge, of the length of an edge descending from the root, and of the diversity (which is the sum of all edge lengths). We show that the results depend on whether the reconstructed trees are conditioned on the number of leaves, the age, or both.

11.

Background

The history of gene families (which are equivalent to event-labeled gene trees) can be reconstructed from empirically estimated evolutionary event-relations containing pairs of orthologous, paralogous or xenologous genes. The question then arises as to whether inferred event-labeled gene trees are biologically feasible, that is, whether there is a possible true history that would explain a given gene tree. In practice, this problem reduces to finding a reconciliation map (also known as a DTL-scenario) between the event-labeled gene trees and a (possibly unknown) species tree.

Results

In this contribution, we first characterize whether there is a valid reconciliation map for binary event-labeled gene trees T that contain speciation, duplication and horizontal gene transfer events and some unknown species tree S in terms of “informative” triples that are displayed in T and provide information on the topology of S. These informative triples are used to infer the unknown species tree S for T. We obtain a similar result for non-binary gene trees. To this end, however, the reconciliation map needs to be further restricted. We provide a polynomial-time algorithm to decide whether there is a species tree for a given event-labeled gene tree, and in the positive case, to construct the species tree and the respective (restricted) reconciliation map. However, informative triples as well as DTL-scenarios have their limitations when they are used to explain the biological feasibility of gene trees. While reconciliation maps imply biological feasibility, we show that the converse is not true in general. Moreover, we show that informative triples do not provide enough information to characterize either “relaxed” DTL-scenarios or non-restricted reconciliation maps for non-binary biologically feasible gene trees.

12.
We implement an isolation with migration model for three species, with migration occurring between two closely related species while an out-group species is used to provide further information concerning gene trees and model parameters. The model is implemented in the likelihood framework for analyzing multilocus genomic sequence alignments, with one sequence sampled from each of the three species. The prior distribution of gene tree topology and branch lengths at every locus is calculated using a Markov chain characterization of the genealogical process of coalescent and migration, which integrates over the histories of migration events analytically. The likelihood function is calculated by integrating over branch lengths in the gene trees (coalescent times) numerically. We analyze the model to study the gene tree-species tree mismatch probability and the time to the most recent common ancestor at a locus. The model is used to construct a likelihood ratio test (LRT) of speciation with gene flow. We conducted computer simulations to evaluate the LRT and found that the test is in general conservative, with the false positive rate well below the significance level. For the test to have substantial power, hundreds of loci are needed. Application of the test to a human-chimpanzee-gorilla genomic data set suggests gene flow around the time of speciation of the human and the chimpanzee.
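
As a point of reference for the mismatch probability mentioned above, the following Python sketch evaluates the classical no-migration baseline for three species with one sequence each: the gene tree topology differs from the species tree with probability (2/3)·exp(-T), where T is the internal branch length in coalescent units. It is not the isolation-with-migration calculation of the abstract. The function name, the diploid scaling by 2N generations and the constant ancestral population size are assumptions of this sketch.

```python
import math


def mismatch_probability(internal_branch_generations, effective_size):
    """Gene tree / species tree mismatch probability without migration.

    internal_branch_generations: time between the two speciation events.
    effective_size: diploid effective size N of the ancestral population
    on that branch (assumed constant), so T = generations / (2N).
    """
    t = internal_branch_generations / (2.0 * effective_size)
    # With probability exp(-t) the two sister lineages fail to coalesce on
    # the internal branch; two of the three resulting topologies mismatch.
    return (2.0 / 3.0) * math.exp(-t)
```

The value approaches 2/3 as the internal branch shrinks to zero and decays towards 0 for long branches.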

13.
Host specificity in parasites can be explained by spatial isolation from other potential hosts or by specialization and speciation of specific parasite species. The first assertion is based on allopatric speciation, the latter on differential lifetime reproductive success on different available hosts. We investigated the host specificity and cophylogenetic histories of four sympatric European bat species of the genus Myotis and their ectoparasitic wing mites of the genus Spinturnix. We sampled >40 parasite specimens from each bat species and reconstructed their phylogenetic COI trees to assess host specificity. To test for cospeciation, we compared host and parasite trees for congruencies in tree topologies. Corresponding divergence events in host and parasite trees were dated using the molecular clock approach. We found two species of wing mites to be host specific and one species to occur on two unrelated hosts. Host specificity cannot be explained by isolation of host species, because we found individual parasites on species other than their native hosts. Furthermore, we found no evidence for cospeciation, but for one host switch and one sorting event. Host-specific wing mites were several million years younger than their hosts. Speciation of hosts did not cause speciation in their respective parasites, but we found that diversification of recent host lineages coincided with a lineage split in some parasites.

14.
The shape of a phylogenetic tree is defined by the sequence of speciation events, represented by its branching points, and extinctions, represented by branch interruptions. In a neutral scenario of parapatry and isolation by distance, species tend to branch off the original population one after the other, leading to highly unbalanced trees. In this case the degree of imbalance, measured by the normalized Sackin index, grows linearly with species richness. Here we claim that moderate values of imbalance for trees with a large number of species can occur if the geographic distribution involves more than one deme (allopatry) and speciation is parapatric within demes. The combined values of balance (normalized Sackin index) and species richness provide an estimate of how many demes were involved in the process if it happened in such a neutral scenario. We also show that the spatial division in demes moderately slows down the diversification process, portraying a neutral mechanism for structuring the branch length distribution of phylogenetic trees.
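
For readers unfamiliar with the Sackin index, a minimal Python sketch follows. It computes the raw index, defined as the sum over all leaves of the number of edges between the leaf and the root. The abstract above does not say which normalization it uses, so none is applied here, and the nested-tuple tree encoding is an assumption made for illustration.

```python
def sackin_index(tree, depth=0):
    """Raw Sackin index of a rooted tree given as nested tuples.

    Internal nodes are tuples of children; anything else is a leaf.
    Each leaf contributes its depth (number of edges from the root).
    """
    if not isinstance(tree, tuple):
        return depth
    return sum(sackin_index(child, depth + 1) for child in tree)


# A fully unbalanced ("caterpillar") tree and a balanced tree on four leaves.
caterpillar = ((("a", "b"), "c"), "d")
balanced = (("a", "b"), ("c", "d"))
```

On four leaves the caterpillar tree scores higher than the balanced one, which matches the abstract's use of the index as a measure of imbalance.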

15.
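Molecular phylogenetic analysis of the genus Rehmannia   Total citations: 4 (self-citations: 0, citations by others: 4)
Rehmannia is a quasi-endemic genus of China, and the relationships among its species remain to be clarified. Using chloroplast and nuclear gene fragments sampled from multiple individuals, this study reconstructed the phylogeny of Rehmannia and explored species divergence within the genus and its possible history. The results show that: (1) Rehmannia is monophyletic; Rehmannia chingii (天目地黄) is the basal lineage of the genus, and it and R. henryi (湖北地黄) are either sister groups to each other or successive sister groups, while R. piasezkii (裂叶地黄) and R. elata (高地黄), and R. glutinosa (地黄) and R. solanifolia (茄叶地黄), each form a pair of sister groups; (2) phylogenetic trees built from multiple individuals better reveal the complexity of species relationships in the genus; (3) ancestral area reconstruction indicates that the genus underwent three expansion events and two isolation events, and that speciation was closely related to historical climate change. Finally, the study raises several important questions about speciation in Rehmannia that urgently need to be resolved.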

16.
A new method is presented for inferring evolutionary trees using nucleotide sequence data. The birth-death process is used as a model of speciation and extinction to specify the prior distribution of phylogenies and branching times. Nucleotide substitution is modeled by a continuous-time Markov process. Parameters of the branching model and the substitution model are estimated by maximum likelihood. The posterior probabilities of different phylogenies are calculated and the phylogeny with the highest posterior probability is chosen as the best estimate of the evolutionary relationship among species. We refer to this as the maximum posterior probability (MAP) tree. The posterior probability provides a natural measure of the reliability of the estimated phylogeny. Two example data sets are analyzed to infer the phylogenetic relationship of human, chimpanzee, gorilla, and orangutan. The best trees estimated by the new method are the same as those from the maximum likelihood analysis of separate topologies, but the posterior probabilities are quite different from the bootstrap proportions. The results of the method are found to be insensitive to changes in the rate parameter of the branching process. Correspondence to: Z. Yang

17.
Species are not independent points for comparative analyses because closely related species share more evolutionary history and are therefore more similar to each other than distantly related species. The extent to which independent-contrast analysis reduces type I and type II statistical error in comparison with cross-species analysis depends on the relative branch lengths in the phylogenetic tree: as deeper branches get relatively long, cross-species analyses have more statistical type I and type II error. Phylogenetic trees reconstructed from extant species, under the assumptions of a branching process with speciation (branching) and extinction rates remaining constant through time, will have relatively longer deep branches as the extinction rate increases relative to the speciation rate. We compare the statistical performance of cross-species and independent-contrast analyses with varying relative extinction rates, and conclude that cross-species comparisons have unacceptable statistical performance, particularly when extinction rates are relatively high.

18.
Gene trees are evolutionary trees representing the ancestry of genes sampled from multiple populations. Species trees represent populations of individuals (each with many genes) splitting into new populations or species. The coalescent process, which models ancestry of gene copies within populations, is often used to model the probability distribution of gene trees given a fixed species tree. This multispecies coalescent model provides a framework for phylogeneticists to infer species trees from gene trees using maximum likelihood or Bayesian approaches. Because the coalescent models a branching process over time, all trees are typically assumed to be rooted in this setting. Often, however, gene trees inferred by traditional phylogenetic methods are unrooted. We investigate probabilities of unrooted gene trees under the multispecies coalescent model. We show that when there are four species with one gene sampled per species, the distribution of unrooted gene tree topologies identifies the unrooted species tree topology and some, but not all, information in the species tree edges (branch lengths). The location of the root on the species tree is not identifiable in this situation. However, for five or more species with one gene sampled per species, we show that the distribution of unrooted gene tree topologies identifies the rooted species tree topology and all its internal branch lengths. The length of any pendant branch leading to a leaf of the species tree is also identifiable for any species from which more than one gene is sampled.

19.
Genealogical data are an important source of evidence for delimiting species, yet few statistical methods are available for calculating the probabilities associated with different species delimitations. Bayesian species delimitation uses reversible-jump Markov chain Monte Carlo (rjMCMC) in conjunction with a user-specified guide tree to estimate the posterior distribution for species delimitation models containing different numbers of species. We apply Bayesian species delimitation to investigate the speciation history of forest geckos (Hemidactylus fasciatus) from tropical West Africa using five nuclear loci (and mtDNA) for 51 specimens representing 10 populations. We find that species diversity in H. fasciatus is currently underestimated, and describe three new species to reflect the most conservative estimate for the number of species in this complex. We examine the impact of the guide tree, and the prior distributions on ancestral population sizes (θ) and root age (τ0), on the posterior probabilities for species delimitation. Mis-specification of the guide tree or the prior distribution for θ can result in strong support for models containing more species. We describe a new statistic for summarizing the posterior distribution of species delimitation models, called speciation probabilities, which summarize the posterior support for each speciation event on the starting guide tree.

20.