Similar articles
1.
We propose a model-based approach that uses multiple gene trees to estimate the species tree. The coalescent process requires that gene divergences occur earlier than species divergences when there is any polymorphism in the ancestral species. Under this scenario, speciation times are restricted to be smaller than the corresponding gene split times. The maximum tree (MT) is the tree with the largest possible speciation times in the space of species trees restricted by available gene trees. If all populations have the same population size, the MT is the maximum likelihood estimate of the species tree. It can be shown that the MT is a consistent estimator of the species tree even when it is built upon estimates of the true gene trees, provided the gene tree estimates are statistically consistent. The MT converges in probability to the true species tree at an exponential rate.
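The construction described in this abstract can be sketched in a few lines. The following is a minimal illustration, not the authors' implementation: it assumes each gene tree is already summarized as a table of pairwise coalescence times (the dictionary layout and the function names `min_coalescence_times` and `maximum_tree` are hypothetical), takes the minimum time per species pair across gene trees as the upper bound on the speciation time, and then applies single-linkage clustering to obtain the ultrametric tree with the largest speciation times that respect those bounds.

```python
from itertools import combinations


def min_coalescence_times(gene_trees):
    """Upper bound on each pairwise speciation time.

    gene_trees: list of dicts mapping frozenset({a, b}) -> coalescence time
    of species a and b in that gene tree (assumed input format).
    """
    bound = {}
    for tree in gene_trees:
        for pair, t in tree.items():
            bound[pair] = min(t, bound.get(pair, float("inf")))
    return bound


def maximum_tree(species, bound):
    """Single-linkage clustering on the pairwise bounds.

    Returns the merges as (time, cluster_a, cluster_b), earliest first.
    Assumes `bound` contains every pair of species.
    """
    clusters = [frozenset([s]) for s in species]
    merges = []
    while len(clusters) > 1:
        best = None
        for a, b in combinations(clusters, 2):
            # Distance between clusters = smallest bound over member pairs.
            t = min(bound[frozenset((x, y))] for x in a for y in b)
            if best is None or t < best[0]:
                best = (t, a, b)
        t, a, b = best
        merges.append((t, a, b))
        clusters = [c for c in clusters if c not in (a, b)] + [a | b]
    return merges


# Two toy gene trees on species A, B, C (times are made-up illustration values).
trees = [
    {frozenset("AB"): 1.0, frozenset("AC"): 3.0, frozenset("BC"): 3.0},
    {frozenset("AB"): 2.5, frozenset("AC"): 2.5, frozenset("BC"): 0.8},
]
bound = min_coalescence_times(trees)
merges = maximum_tree(["A", "B", "C"], bound)
```

In this toy input the two gene trees disagree in topology; the sketch joins B and C first because their smallest observed coalescence time is the tightest bound.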

2.
Estimates of the timing of divergence are central to testing the underlying causes of speciation. Relaxed molecular clocks and fossil calibration have improved these estimates; however, these advances are implemented in the context of gene trees, which can overestimate divergence times. Here we couple recent innovations for dating speciation events with the analytical power of species trees, where multilocus data are considered in a coalescent context. Divergence times are estimated in the bird genus Aphelocoma to test whether speciation in these jays coincided with mountain uplift or glacial cycles. Gene trees and species trees show general agreement that diversification began in the Miocene amid mountain uplift. However, dates from the multilocus species tree are more recent, occurring predominantly in the Pleistocene, consistent with the theory that divergence times can be significantly overestimated with gene‐tree based approaches that do not correct for genetic divergence that predates speciation. In addition to coalescent stochasticity, Haldane's rule could account for some differences in timing estimates between mitochondrial DNA and nuclear genes. By incorporating a fossil calibration applied to the species tree, in addition to the process of gene lineage coalescence, the present approach provides a more biologically realistic framework for dating speciation events, and hence for testing the links between diversification and specific biogeographic and geologic events.

3.
Gene trees are evolutionary trees representing the ancestry of genes sampled from multiple populations. Species trees represent populations of individuals (each with many genes) splitting into new populations or species. The coalescent process, which models ancestry of gene copies within populations, is often used to model the probability distribution of gene trees given a fixed species tree. This multispecies coalescent model provides a framework for phylogeneticists to infer species trees from gene trees using maximum likelihood or Bayesian approaches. Because the coalescent models a branching process over time, all trees are typically assumed to be rooted in this setting. Often, however, gene trees inferred by traditional phylogenetic methods are unrooted. We investigate probabilities of unrooted gene trees under the multispecies coalescent model. We show that when there are four species with one gene sampled per species, the distribution of unrooted gene tree topologies identifies the unrooted species tree topology and some, but not all, information in the species tree edges (branch lengths). The location of the root on the species tree is not identifiable in this situation. However, for 5 or more species with one gene sampled per species, we show that the distribution of unrooted gene tree topologies identifies the rooted species tree topology and all its internal branch lengths. The length of any pendant branch leading to a leaf of the species tree is also identifiable for any species from which more than one gene is sampled.

4.
We studied the phylogenetic relationships among Japanese Leptocarabus ground beetles, which show extensive trans-species polymorphisms in mitochondrial gene genealogies. Simultaneous analysis of combined nuclear data with partial sequences from the long-wavelength rhodopsin, wingless, phosphoenolpyruvate carboxykinase, and 28S rRNA genes resolved the relationships among the five species, although separate analyses of these genes provided topologies with low resolution. For both the nuclear gene tree resulting from the combined data from four genes and a mitochondrial cytochrome oxidase subunit I (COI) gene tree, we applied a Bayesian divergence time estimation using a common calibration method to identify mitochondrial introgression events that occurred after speciation. Three mitochondrial lineages shared by two or three species were likely subject to introgression due to interspecific hybridization because the coalescent times for these lineages were much shorter than the corresponding speciation times estimated from nuclear gene sequences. We demonstrated that when species phylogeny is fully resolved with nuclear gene sequence data, comparative analysis of nuclear and mitochondrial gene trees can be used to infer introgressive hybridization events that might cause trans-species polymorphisms in mitochondrial gene trees.

5.
Several methods have been designed to infer species trees from gene trees while taking into account gene tree/species tree discordance. Although some of these methods provide consistent species tree topology estimates under a standard model, most either do not estimate branch lengths or are computationally slow. An exception, the GLASS method of Mossel and Roch, is consistent for the species tree topology, estimates branch lengths, and is computationally fast. However, GLASS systematically overestimates divergence times, leading to biased estimates of species tree branch lengths. By assuming a multispecies coalescent model in which multiple lineages are sampled from each of two taxa at L independent loci, we derive the distribution of the waiting time until the first interspecific coalescence occurs between the two taxa, considering all loci and measuring from the divergence time. We then use the mean of this distribution to derive a correction to the GLASS estimator of pairwise divergence times. We show that our improved estimator, which we call iGLASS, consistently estimates the divergence time between a pair of taxa as the number of loci approaches infinity, and that it is an unbiased estimator of divergence times when one lineage is sampled per taxon. We also show that many commonly used clustering methods can be combined with the iGLASS estimator of pairwise divergence times to produce a consistent estimator of the species tree topology. Through simulations, we show that iGLASS can greatly reduce the bias and mean squared error in obtaining estimates of divergence times in a species tree.
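The source of the bias, and the shape of the correction, can be seen in the simplest case mentioned in the abstract: one lineage sampled per taxon. The derivation below is a sketch under stated assumptions, not the paper's general result: time is measured in coalescent units in which a pair of lineages in the ancestral population coalesces at rate 1, the ancestral population size is constant, and the interspecific coalescence times $T_1,\dots,T_L$ at the $L$ independent loci are observed without error. The symbols $\tau$, $X_i$, and $\hat\tau_{\text{corrected}}$ are notation introduced here for illustration.

```latex
\begin{align*}
T_i &= \tau + X_i, \qquad X_i \overset{\text{iid}}{\sim} \operatorname{Exp}(1), \quad i = 1,\dots,L \\
\hat\tau_{\text{GLASS}} &= \min_i T_i = \tau + \min_i X_i, \qquad \min_i X_i \sim \operatorname{Exp}(L) \\
\mathbb{E}\big[\hat\tau_{\text{GLASS}}\big] &= \tau + \frac{1}{L} \\
\hat\tau_{\text{corrected}} &= \min_i T_i - \frac{1}{L}, \qquad \mathbb{E}\big[\hat\tau_{\text{corrected}}\big] = \tau
\end{align*}
```

The minimum over loci overshoots the divergence time by $1/L$ on average, so the overestimate shrinks as loci are added but never vanishes for finite $L$; subtracting the mean waiting time removes it in this special case. The correction for multiple lineages per taxon derived in the paper is not reproduced here.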

6.
Interpretations of phylogeographic patterns can change when analyses shift from single gene-tree to multilocus coalescent analyses. Using multilocus coalescent approaches, a species tree and divergence times can be estimated from a set of gene trees while accounting for gene-tree stochasticity. We utilized the conceptual strengths of a multilocus coalescent approach coupled with complete range-wide sampling to examine the speciation history of a broadly distributed, North American warm-desert toad, Anaxyrus punctatus. Phylogenetic analyses provided strong support for three major lineages within A. punctatus. Each lineage broadly corresponded to one of three desert regions. Early speciation in A. punctatus appeared linked to late Miocene-Pliocene development of the Baja California peninsula. This event was likely followed by a Pleistocene divergence associated with the separation of the Chihuahuan and Sonoran Deserts. Our multilocus coalescent-based reconstruction provides an informative contrast to previous single gene-tree estimates of the evolutionary history of A. punctatus.

7.
Because of the stochastic way in which lineages sort during speciation, gene trees may differ in topology from each other and from species trees. Surprisingly, assuming that genetic lineages follow a coalescent model of within-species evolution, we find that for any species tree topology with five or more species, there exist branch lengths for which gene tree discordance is so common that the most likely gene tree topology to evolve along the branches of a species tree differs from the species phylogeny. This counterintuitive result implies that in combining data on multiple loci, the straightforward procedure of using the most frequently observed gene tree topology as an estimate of the species tree topology can be asymptotically guaranteed to produce an incorrect estimate. We conclude with suggestions that can aid in overcoming this new obstacle to accurate genomic inference of species phylogenies.

8.
Genome-scale sequence data have become increasingly available in phylogenetic studies for understanding the evolutionary histories of species. However, it is challenging to develop probabilistic models to account for heterogeneity of phylogenomic data. The multispecies coalescent model describes gene trees as independent random variables generated from a coalescence process occurring along the lineages of the species tree. Since the multispecies coalescent model allows gene trees to vary across genes, coalescent-based methods have been widely used to account for heterogeneous gene trees in phylogenomic data analysis. In this paper, we summarize and evaluate the performance of coalescent-based methods for estimating species trees from genome-scale sequence data. We investigate the effects of deep coalescence and mutation on the performance of species tree estimation methods. We find that the coalescent-based methods perform well in estimating species trees for a large number of genes, regardless of the degree of deep coalescence and mutation. The performance of the coalescent methods is negatively correlated with the lengths of internal branches of the species tree.

9.
We implement an isolation with migration model for three species, with migration occurring between two closely related species while an out-group species is used to provide further information concerning gene trees and model parameters. The model is implemented in the likelihood framework for analyzing multilocus genomic sequence alignments, with one sequence sampled from each of the three species. The prior distribution of gene tree topology and branch lengths at every locus is calculated using a Markov chain characterization of the genealogical process of coalescent and migration, which integrates over the histories of migration events analytically. The likelihood function is calculated by integrating over branch lengths in the gene trees (coalescent times) numerically. We analyze the model to study the gene tree-species tree mismatch probability and the time to the most recent common ancestor at a locus. The model is used to construct a likelihood ratio test (LRT) of speciation with gene flow. We conducted computer simulations to evaluate the LRT and found that the test is in general conservative, with the false positive rate well below the significance level. For the test to have substantial power, hundreds of loci are needed. Application of the test to a human-chimpanzee-gorilla genomic data set suggests gene flow around the time of speciation of the human and the chimpanzee.

10.
Rannala B, Yang Z. Genetics, 2003, 164(4): 1645-1656
The effective population sizes of ancestral as well as modern species are important parameters in models of population genetics and human evolution. The commonly used method for estimating ancestral population sizes, based on counting mismatches between the species tree and the inferred gene trees, is highly biased as it ignores uncertainties in gene tree reconstruction. In this article, we develop a Bayes method for simultaneous estimation of the species divergence times and current and ancestral population sizes. The method uses DNA sequence data from multiple loci and extracts information about conflicts among gene tree topologies and coalescent times to estimate ancestral population sizes. The topology of the species tree is assumed known. A Markov chain Monte Carlo algorithm is implemented to integrate over uncertain gene trees and branch lengths (or coalescence times) at each locus as well as species divergence times. The method can handle any species tree and allows different numbers of sequences at different loci. We apply the method to published noncoding DNA sequences from the human and the great apes. There are strong correlations between posterior estimates of speciation times and ancestral population sizes. With the use of an informative prior for the human-chimpanzee divergence date, the population size of the common ancestor of the two species is estimated to be approximately 20,000, with a 95% credibility interval (8000, 40,000). Our estimates, however, are affected by model assumptions as well as data quality. We suggest that reliable estimates must await more data and more realistic models.

11.
12.
We review recent models to estimate phylogenetic trees under the multispecies coalescent. Although the distinction between gene trees and species trees has come to the fore of phylogenetics, only recently have methods been developed that explicitly estimate species trees. Of the several factors that can cause gene tree heterogeneity and discordance with the species tree, deep coalescence due to random genetic drift in branches of the species tree has been modeled most thoroughly. Bayesian approaches to estimating species trees utilize two likelihood functions, one of which has been widely used in traditional phylogenetics and involves the model of nucleotide substitution, and the second of which is less familiar to phylogeneticists and involves the probability distribution of gene trees given a species tree. Other recent parametric and nonparametric methods for estimating species trees involve parsimony criteria, summary statistics, supertree and consensus methods. Species tree approaches are an appropriate goal for systematics, appear to work well in some cases where concatenation can be misleading, and suggest that sampling many independent loci will be paramount. Such methods can also be challenging to implement because of the complexity of the models and computational time. In addition, further elaboration of the simplest of coalescent models will be required to incorporate commonly known issues such as deviation from the molecular clock, gene flow and other genetic forces.

13.
The great increase in the number of phylogenetic studies of a wide variety of organisms in recent decades has focused considerable attention on the balance of phylogenetic trees (the degree to which sister clades within a tree tend to be of equal size) for at least two reasons: (1) the degree of balance of a tree may affect the accuracy of estimates of it; (2) the degree of balance, or imbalance, of a tree may reveal something about the macroevolutionary processes that produced it. In particular, variation among lineages in rates of speciation or extinction is expected to produce trees that are less balanced than those that result from phylogenetic evolution in which each extant species of a group has the same probability of speciation or extinction. Several coefficients for measuring the balance or imbalance of phylogenetic trees have been proposed. I focused on Colless's coefficient of imbalance (I) for its mathematical tractability and ease of interpretation. Earlier work on this statistic produced exact methods only for calculating the expected value. In those studies, the variance and confidence limits, which are necessary for testing the departure of observed values of I from the expected, were estimated by Monte Carlo simulation. I developed recursion equations that allow exact calculation of the mean, variance, skewness, and complete probability distribution of I for two different probability-generating models for bifurcating tree shapes. The Equal-Rates Markov (ERM) model assumes that trees grow by the random speciation and extinction of extant species, with all species that are extant at a given time having the same probability of speciation or extinction. The Equal Probability (EP) model assumes that all possible labeled trees for a given number of terminal taxa have the same probability of occurring. Examples illustrate how these theoretically derived probabilities and parameters may be used to test whether the evolution of a monophyletic group or set of monophyletic groups has proceeded according to a Markov model with equal rates of speciation and extinction among species, that is, whether there has been significant variation among lineages in expected rates of speciation or extinction.
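For readers unfamiliar with the statistic, Colless's imbalance is commonly computed as the sum, over all internal nodes of a bifurcating tree, of the absolute difference between the numbers of leaves on the two sides of the node. The sketch below assumes a nested-tuple tree representation and a hypothetical helper name `colless`; it returns the unnormalized sum (a normalized form divides by the maximum, (n-1)(n-2)/2 for n leaves) and does not implement the recursion equations for the distribution of I described in the abstract.

```python
def colless(tree):
    """Return (number_of_leaves, unnormalized Colless imbalance).

    tree: a leaf label (any non-tuple value) or a 2-tuple (left, right)
    of subtrees. This representation is an assumption for illustration.
    """
    if not isinstance(tree, tuple):
        return 1, 0
    left, right = tree
    n_left, i_left = colless(left)
    n_right, i_right = colless(right)
    # Each internal node contributes |n_left - n_right| to the total.
    return n_left + n_right, i_left + i_right + abs(n_left - n_right)


caterpillar = ((("a", "b"), "c"), "d")   # maximally unbalanced 4-leaf tree
balanced = (("a", "b"), ("c", "d"))      # perfectly balanced 4-leaf tree

n_cat, i_cat = colless(caterpillar)      # i_cat == 3, the maximum for n = 4
n_bal, i_bal = colless(balanced)         # i_bal == 0
```

Testing an observed value against the ERM or EP model would additionally require the null distribution of I for the given number of taxa, which is what the paper's recursions provide.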

14.
Grasshoppers in the genus Melanoplus have undergone a radiation in the 'sky islands' of western North America, with many species originating during the Pleistocene. Despite their recent origins, phylogenetic analyses indicate that all the species exhibit monophyletic or paraphyletic gene trees. The objectives of this study were to determine whether the monophyletic genealogies are the result of a bottleneck at speciation and to investigate the extent to which the different phylogenetic states of eight species (i.e. monophyletic versus paraphyletic gene trees) can be ascribed to the effects of speciation. A coalescent simulation was used to test for a bottleneck at speciation in each species. The effective population sizes and demographic histories of species were compared across taxa to evaluate the possibility that the paraphyly versus monophyly of the species reflects differential rates of lineage loss rather than speciation mode. While coalescent analyses indicate that the monophyly of Melanoplus species might not be indicative of bottlenecks at speciation, the results suggest that the paraphyletic gene trees may reflect the demography of speciation, involving localized divergences in the ancestral species. With respect to different models of Pleistocene divergence, the data do not support a model of founder-effect speciation but are compatible with divergence in allopatric refugia.

15.
Peripatric speciation and the importance of founder effects have long been controversial, and multilocus sequence data and coalescent methods now allow hypotheses of peripatric speciation to be tested in a rigorous manner. Using a multilocus phylogeographical data set for two species of salamanders (genus Hydromantes) from the Sierra Nevada of California, hypotheses of recent divergence by peripatric speciation and older, allopatric divergence were tested. Phylogeographical analysis revealed two divergent lineages within Hydromantes platycephalus, which were estimated to have diverged in the Pliocene. By contrast, a low‐elevation species, Hydromantes brunus, diverged from within the northern lineage of H. platycephalus much more recently (mid‐Pleistocene), during a time of major climatic change in the Sierra Nevada. Multilocus species tree estimation and coalescent estimates of divergence time, migration rate, and growth rate reject a scenario of ancient speciation of H. brunus with subsequent gene flow and introgression from H. platycephalus, instead supporting a more recent divergence with population expansion. Although the small, peripheral distribution of H. brunus suggests the possibility of peripatric speciation, the estimated founding population size of the species was too large to have allowed founder effects to be important in its divergence. These results provide evidence for both recent speciation, most likely tied to the climatic changes of the Pleistocene, and older lineage divergence, possibly due to geological events, and add to evidence that Pleistocene glacial cycles were an important driver of diversification in the Sierra Nevada.

16.
Gene tree distributions under the coalescent process (total citations: 10; self-citations: 0; citations by others: 10)
Under the coalescent model for population divergence, lineage sorting can cause considerable variability in gene trees generated from any given species tree. In this paper, we derive a method for computing the distribution of gene tree topologies given a bifurcating species tree for trees with an arbitrary number of taxa in the case that there is one gene sampled per species. Applications for gene tree distributions include determining exact probabilities of topological equivalence between gene trees and species trees and inferring species trees from multiple datasets. In addition, we examine the shapes of gene tree distributions and their sensitivity to changes in branch lengths, species tree shape, and tree size. The method for computing gene tree distributions is implemented in the computer program COAL.
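The smallest nontrivial case gives a feel for what such a distribution looks like. For a three-species tree ((A,B),C) with one gene sampled per species and an internal branch of length T in coalescent units, the standard result is that the gene tree matches the species tree with probability 1 - (2/3)e^{-T}, and each of the two discordant topologies has probability (1/3)e^{-T}. The sketch below encodes only this textbook three-taxon case; the function name is hypothetical, and it is not the general algorithm of the paper or the COAL program.

```python
import math


def three_taxon_gene_tree_probs(t):
    """Gene tree topology probabilities for species tree ((A,B),C).

    t: length of the internal branch in coalescent units (t >= 0).
    Assumes one gene copy sampled per species.
    """
    # Probability that the A and B lineages fail to coalesce on the internal
    # branch (e^{-t}), times 1/3 for each way the three lineages can then
    # pair up first in the root population.
    discordant = math.exp(-t) / 3.0
    return {
        "((A,B),C)": 1.0 - 2.0 * discordant,
        "((A,C),B)": discordant,
        "((B,C),A)": discordant,
    }


short_branch = three_taxon_gene_tree_probs(0.0)  # all three topologies equally likely
long_branch = three_taxon_gene_tree_probs(5.0)   # matching topology dominates
```

With three taxa the matching topology is never less probable than a discordant one; the situations in which a discordant topology becomes the most probable arise only for larger trees.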

17.
Lineage, or true ‘species’, trees may differ from gene trees because of stochastic processes in molecular evolution leading to gene‐tree heterogeneity. Problems with inferring species trees because of excessive incomplete lineage sorting may be exacerbated in lineages with rapid diversification or recent divergences, necessitating the use of multiple loci and individuals. Many recent multilocus studies that investigate divergence times identify lineage splitting to be more recent than single‐locus studies, forcing the revision of biogeographic scenarios driving divergence. Here, we use 21 nuclear loci from regional populations to re‐evaluate hypotheses identified in an mtDNA phylogeographic study of the Brown Creeper (Certhia americana), as well as identify processes driving divergence. Nuclear phylogeographic analyses identified hierarchical genetic structure, supporting a basal split at approximately 32°N latitude, splitting northern and southern populations, with mixed patterns of genealogical concordance and discordance between data sets within the major lineages. Coalescent‐based analyses identify isolation, with little to no gene flow, as the primary driver of divergence between lineages. Recent isolation appears to have caused genetic bottlenecks in populations in the Sierra Madre Oriental and coastal mountain ranges of California, which may be targets for conservation concerns.

18.
The macroevolutionary consequences of recent climate change remain controversial, and there is little paleobotanical or morphological evidence that Pleistocene (1.8-0.12 Ma) glacial cycles acted as drivers of speciation, especially among lineages with long generation times, such as trees. We combined genetic and ecogeographic data from 2 closely related North American tree species, Populus balsamifera and P. trichocarpa (Salicaceae), to determine if their divergence coincided with and was possibly caused by Pleistocene climatic events. We analyzed 32 nuclear loci from individuals of P. balsamifera and P. trichocarpa to produce coalescent-based estimates of the divergence time between the 2 species. We coupled the coalescent analyses with paleodistribution models to assess the influence of climate change on species' ranges. Furthermore, measures of niche overlap were used to investigate patterns of ecological differentiation between species. We estimated the divergence date of P. balsamifera and P. trichocarpa at approximately 75 Ka, which corresponds closely with the onset of Marine Isotope Stage 4 (~76 Ka) and a rapid increase in global ice volume. Significance tests of niche overlap, in conjunction with genetic estimates of migration, suggested that speciation occurred in allopatry, possibly resulting from the environmental effects of Pleistocene glacial cycles. Our results indicate that the divergence of keystone tree species, which have shaped community diversity in northern North American ecosystems, was recent and may have been a consequence of Pleistocene-era glaciation and climate change.

19.
Yu Y, Degnan JH, Nakhleh L. PLoS Genetics, 2012, 8(4): e1002660
Gene tree topologies have proven a powerful data source for various tasks, including species tree inference and species delimitation. Consequently, methods for computing probabilities of gene trees within species trees have been developed and widely used in probabilistic inference frameworks. All these methods assume an underlying multispecies coalescent model. However, when reticulate evolutionary events such as hybridization occur, these methods are inadequate, as they do not account for such events. Methods that account for both hybridization and deep coalescence in computing the probability of a gene tree topology currently exist for very limited cases. However, no such methods exist for general cases, owing primarily to the fact that it is currently unknown how to compute the probability of a gene tree topology within the branches of a phylogenetic network. Here we present a novel method for computing the probability of gene tree topologies on phylogenetic networks and demonstrate its application to the inference of hybridization in the presence of incomplete lineage sorting. We reanalyze a Saccharomyces species data set for which multiple analyses had converged on a species tree candidate. Using our method, though, we show that an evolutionary hypothesis involving hybridization in this group has better support than one of strict divergence. A similar reanalysis on a group of three Drosophila species shows that the data are consistent with hybridization. Further, using extensive simulation studies, we demonstrate the power of gene tree topologies for obtaining accurate estimates of branch lengths and hybridization probabilities of a given phylogenetic network. Finally, we discuss identifiability issues with detecting hybridization, particularly in cases that involve extinction or incomplete sampling of taxa.

20.
The properties of random gene tree topologies have recently been studied under a coalescent model that treats a species tree as a fixed parameter. Here we develop the analogous theory for random ranked gene tree topologies, in which both the topology and the sequence of coalescences for a random gene tree are considered. We derive the probability distribution of ranked gene tree topologies conditional on a fixed species tree. We then show that, as in the unranked case, ranked gene trees that do not match either the ranking or the topology of the species tree can have greater probability than the matching ranked gene tree.
