Similar articles
 20 similar articles found (search time: 830 ms)
1.
We tested the metabolic rate hypothesis (whereby rates of mtDNA evolution are postulated to be mediated primarily by mutagenic by-products of respiration) by examining whether mass-specific metabolic rate was correlated with root-to-tip distance on a set of mtDNA trees for the springtail Cryptopygus antarcticus travei from sub-Antarctic Marion Island. Using Bayesian analyses and a novel application of the comparative phylogenetic method, we did not find significant evidence that contemporary metabolic rates directly correlate with mutation rate (i.e., root-to-tip distance) once the underlying phylogeny is taken into account. However, we did find significant evidence that metabolic rate is dependent on the underlying mtDNA tree, or in other words, lineages with related mtDNA also have similar metabolic rates. We anticipate that future analyses which apply this methodology to datasets with longer sequences, more taxa, or greater variability will have more power to detect a significant direct correlation between metabolic rate and mutation rate. We conclude with suggestions for future analyses that would extend the preliminary approach applied here, in particular highlighting ways to tease apart oxidative stress effects from the effects of population size and/or selection coefficients operating on the molecular evolutionary rate.
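The root-to-tip distance used above as a proxy for mutation rate is simply the sum of branch lengths from a tip back to the root. The following is a minimal sketch of that calculation only; the toy tree, its branch lengths, and the function names are hypothetical and are not taken from the study, and the sketch does not attempt the phylogenetically corrected correlation test the abstract describes.

```python
# Hypothetical toy tree: child -> (parent, branch length in substitutions per site).
TREE = {
    "A": ("n1", 0.10),
    "B": ("n1", 0.30),
    "n1": ("root", 0.05),
    "C": ("root", 0.20),
}


def root_to_tip(tree, tip):
    """Sum branch lengths while walking from a tip up to the root."""
    total = 0.0
    node = tip
    while node in tree:  # the root has no parent entry, so the walk stops there
        parent, length = tree[node]
        total += length
        node = parent
    return total


def tips(tree):
    """Tips are nodes that never appear as a parent."""
    parents = {parent for parent, _ in tree.values()}
    return sorted(node for node in tree if node not in parents)


distances = {tip: root_to_tip(TREE, tip) for tip in tips(TREE)}
for tip, dist in distances.items():
    print(f"{tip}: {dist:.2f}")
```

A lineage with a larger root-to-tip distance has accumulated more substitutions since the root, which is why the quantity can stand in for relative evolutionary rate when all tips are sampled at the same time.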

2.
Bayesian estimation of ancestral character states on phylogenies   (Total citations: 17; self-citations: 0; citations by others: 17)
Biologists frequently attempt to infer the character states at ancestral nodes of a phylogeny from the distribution of traits observed in contemporary organisms. Because phylogenies are normally inferences from data, it is desirable to account for the uncertainty in estimates of the tree and its branch lengths when making inferences about ancestral states or other comparative parameters. Here we present a general Bayesian approach for testing comparative hypotheses across statistically justified samples of phylogenies, focusing on the specific issue of reconstructing ancestral states. The method uses Markov chain Monte Carlo techniques for sampling phylogenetic trees and for investigating the parameters of a statistical model of trait evolution. We describe how to combine information about the uncertainty of the phylogeny with uncertainty in the estimate of the ancestral state. Our approach does not constrain the sample of trees only to those that contain the ancestral node or nodes of interest, and we show how to reconstruct ancestral states of uncertain nodes using a most-recent-common-ancestor approach. We illustrate the methods with data on ribonuclease evolution in the Artiodactyla. Software implementing the methods (BayesMultiState) is available from the authors.

3.
Detecting the node-density artifact in phylogeny reconstruction   (Total citations: 4; self-citations: 0; citations by others: 4)
The node-density effect is an artifact of phylogeny reconstruction that can cause branch lengths to be underestimated in areas of the tree with fewer taxa. Webster, Payne, and Pagel (2003, Science 301:478) introduced a statistical procedure (the "delta" test) to detect this artifact, and here we report the results of computer simulations that examine the test's performance. In a sample of 50,000 random data sets, we find that the delta test detects the artifact in 94.4% of cases in which it is present. When the artifact is not present (n = 10,000 simulated data sets) the test showed a type I error rate of approximately 1.69%, incorrectly reporting the artifact in 169 data sets. Three measures of tree shape or "balance" failed to predict the size of the node-density effect. This may reflect the relative homogeneity of our randomly generated topologies, but emphasizes that nearly any topology can suffer from the artifact, the effect not being confined only to highly unevenly sampled or otherwise imbalanced trees. The ability to screen phylogenies for the node-density artifact is important for phylogenetic inference and for researchers using phylogenetic trees to infer evolutionary processes, including their use in molecular clock dating.

4.
Reconciling discordant morphological and molecular phylogenies remains a problem in modern systematics. By examining conflicting DNA-hybridization and morphological phylogenies of sand dollars, I show that morphological criteria may be used to help evaluate the reliability of molecular phylogenies where they differ from morphological trees. All available criteria for assessing the reliability of DNA-hybridization phylogenies suggest that the sand dollar DNA-hybridization phylogeny is robust. Standard homology-recognition criteria are used to assess the a priori reliabilities of the morphological attributes associated with the node drawn into question by the DNA data, and it is shown that these attributes are among the least phylogenetically informative of all the morphological characters. Moreover, the questioned node has the smallest number of supporting characters, and most of these characters are associated with the food grooves, which suggests that they may be functionally correlated. Thus, on the basis of the analysis of the morphological data and given the robustness of the DNA tree, the DNA phylogeny is preferred. Further, paleobiogeographic data support the DNA tree rather than the morphological tree, and a plausible heterochronic mechanism has been proposed that may account for the homoplasious morphological evolution that must have occurred if the DNA tree is correct.

5.
Due to its speed, the distance approach remains the best hope for building phylogenies on very large sets of taxa. Recently (R. Desper and O. Gascuel, J. Comp. Biol. 9:687-705, 2002), we introduced a new "balanced" minimum evolution (BME) principle, based on a branch length estimation scheme of Y. Pauplin (J. Mol. Evol. 51:41-47, 2000). Initial simulations suggested that FASTME, our program implementing the BME principle, was more accurate than or equivalent to all other distance methods we tested, with running time significantly faster than Neighbor-Joining (NJ). This article further explores the properties of the BME principle, and it explains and illustrates its impressive topological accuracy. We prove that the BME principle is a special case of the weighted least-squares approach, with biologically meaningful variances of the distance estimates. We show that the BME principle is statistically consistent. We demonstrate that FASTME only produces trees with positive branch lengths, a feature that separates this approach from NJ (and related methods) that may produce trees with branches with biologically meaningless negative lengths. Finally, we consider a large simulated data set, with 5,000 100-taxon trees generated by the Aldous beta-splitting distribution encompassing a range of distributions from Yule-Harding to uniform, and using a covarion-like model of sequence evolution. FASTME produces trees faster than NJ, and much faster than WEIGHBOR and the weighted least-squares implementation of PAUP*. Moreover, FASTME trees are consistently more accurate at all settings, ranging from Yule-Harding to uniform distributions, and all ranges of maximum pairwise divergence and departure from molecular clock. Interestingly, the covarion parameter has little effect on the tree quality for any of the algorithms. FASTME is freely available on the web.
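The balanced tree-length estimate behind the BME principle can be written, following Pauplin's scheme, as the sum over leaf pairs of d_ij * 2^(1 - p_ij), where p_ij is the number of edges on the path between leaves i and j. The sketch below evaluates that sum for a hypothetical four-taxon example; the adjacency lists, distance values, and function names are illustrative assumptions, and this is not the FASTME implementation, which searches tree space rather than scoring a fixed topology.

```python
from collections import deque
from itertools import combinations


def edge_counts(adj, source):
    """Breadth-first search: number of edges from source to every node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def balanced_tree_length(adj, leaves, d):
    """Sum of d_ij * 2**(1 - p_ij) over all unordered leaf pairs."""
    total = 0.0
    for i, j in combinations(leaves, 2):
        p_ij = edge_counts(adj, i)[j]
        total += d[frozenset((i, j))] * 2.0 ** (1 - p_ij)
    return total


def quartet(a, b, c, e):
    """Unrooted four-leaf topology ((a,b),(c,e)) with internal nodes u and v."""
    return {a: ["u"], b: ["u"], "u": [a, b, "v"],
            "v": ["u", c, e], c: ["v"], e: ["v"]}


# Hypothetical additive distances generated on ((1,2),(3,4)) with
# leaf branches 1, 2, 3, 4 and an internal branch of 0.5.
LEAVES = ["1", "2", "3", "4"]
D = {
    frozenset(("1", "2")): 3.0, frozenset(("3", "4")): 7.0,
    frozenset(("1", "3")): 4.5, frozenset(("1", "4")): 5.5,
    frozenset(("2", "3")): 5.5, frozenset(("2", "4")): 6.5,
}

true_length = balanced_tree_length(quartet("1", "2", "3", "4"), LEAVES, D)
wrong_length = balanced_tree_length(quartet("1", "3", "2", "4"), LEAVES, D)
print(true_length, wrong_length)
```

On these additive distances the generating topology recovers the sum of its branch lengths, and the alternative topology scores longer, which is the behaviour a minimum-evolution criterion relies on.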

6.
Molecular phylogenies are increasingly being used to investigate the patterns and mechanisms of macroevolution. In particular, node heights in a phylogeny can be used to detect changes in rates of diversification over time. Such analyses rest on the assumption that node heights in a phylogeny represent the timing of diversification events, which in turn rests on the assumption that evolutionary time can be accurately predicted from DNA sequence divergence. But there are many influences on the rate of molecular evolution, which might also influence node heights in molecular phylogenies, and thus affect estimates of diversification rate. In particular, a growing number of studies have revealed an association between the net diversification rate estimated from phylogenies and the rate of molecular evolution. Such an association might, by influencing the relative position of node heights, systematically bias estimates of diversification time. We simulated the evolution of DNA sequences under several scenarios where rates of diversification and molecular evolution vary through time, including models where diversification and molecular evolutionary rates are linked. We show that commonly used methods, including metric‐based, likelihood and Bayesian approaches, can have a low power to identify changes in diversification rate when molecular substitution rates vary. Furthermore, the association between the rates of speciation and molecular evolution rate can cause the signature of a slowdown or speedup in speciation rates to be lost or misidentified. These results suggest that the multiple sources of variation in molecular evolutionary rates need to be considered when inferring macroevolutionary processes from phylogenies.

7.
Extant gars represent the remaining members of a formerly diverse assemblage of ancient ray-finned fishes and have been the subject of multiple phylogenetic analyses using morphological data. Here, we present the first hypothesis of phylogenetic relationships among living gar species based on molecular data, through the examination of gene tree heterogeneity and coalescent species tree analyses of a portion of one mitochondrial (COI) and seven nuclear (ENC1, myh6, plagl2, S7 ribosomal protein intron 1, sreb2, tbr1, and zic1) genes. Individual gene trees displayed varying degrees of resolution with regard to species-level relationships, and the gene trees inferred from COI and the S7 intron were the only two that were completely resolved. Coalescent species tree analyses of nuclear genes resulted in a well-resolved and strongly supported phylogenetic tree of living gar species, for which Bayesian posterior node support was further improved by the inclusion of the mitochondrial gene. Species-level relationships among gars inferred from our molecular data set were highly congruent with previously published morphological phylogenies, with the exception of the placement of two species, Lepisosteus osseus and L. platostomus. Re-examination of the character coding used by previous authors provided partial resolution of this topological discordance, resulting in broad concordance in the phylogenies inferred from individual genes, the coalescent species tree analysis, and morphology. The completely resolved phylogeny inferred from the molecular data set with strong Bayesian posterior support at all nodes provided insights into the potential for introgressive hybridization and patterns of allopatric speciation in the evolutionary history of living gars, as well as a solid foundation for future examinations of functional diversification and evolutionary stasis in a "living fossil" lineage.

8.
The Bayesian method for estimating species phylogenies from molecular sequence data provides an attractive alternative to maximum likelihood with nonparametric bootstrap due to the easy interpretation of posterior probabilities for trees and to availability of efficient computational algorithms. However, for many data sets it produces extremely high posterior probabilities, sometimes for apparently incorrect clades. Here we use both computer simulation and empirical data analysis to examine the effect of the prior model for internal branch lengths. We found that posterior probabilities for trees and clades are sensitive to the prior for internal branch lengths, and priors assuming long internal branches cause high posterior probabilities for trees. In particular, uniform priors with high upper bounds bias Bayesian clade probabilities in favor of extreme values. We discuss possible remedies to the problem, including empirical and full Bayesian methods and subjective procedures suggested in Bayesian hypothesis testing. Our results also suggest that the bootstrap proportion and Bayesian posterior probability are different measures of accuracy, and that the bootstrap proportion, if interpreted as the probability that the clade is true, can be either too liberal or too conservative.

9.
Phylogenetic trees inferred from sequence data often have branch lengths measured in the expected number of substitutions and therefore do not have divergence times estimated. These trees give an incomplete view of evolutionary histories since many applications of phylogenies require time trees. Many methods have been developed to convert the inferred branch lengths from substitution units to time units using calibration points, but none is universally accepted as they are challenged in both scalability and accuracy under complex models. Here, we introduce a new method that formulates dating as a nonconvex optimization problem where the variance of log-transformed rate multipliers is minimized across the tree. On simulated and real data, we show that our method, wLogDate, is often more accurate than alternatives and is more robust to various model assumptions.

10.
Metrics of phylogenetic tree reliability, such as parametric bootstrap percentages or Bayesian posterior probabilities, represent internal measures of the topological reproducibility of a phylogenetic tree, while the recently introduced aLRT (approximate likelihood ratio test) assesses the likelihood that a branch exists on a maximum-likelihood tree. Although those values are often equated with phylogenetic tree accuracy, they do not necessarily estimate how well a reconstructed phylogeny represents cladistic relationships that actually exist in nature. The authors have therefore attempted to quantify how well bootstrap percentages, posterior probabilities, and aLRT measures reflect the probability that a deduced phylogenetic clade is present in a known phylogeny. The authors simulated the evolution of bacterial genes of varying lengths under biologically realistic conditions, and reconstructed those known phylogenies using both maximum likelihood and Bayesian methods. Then, they measured how frequently clades in the reconstructed trees exhibiting particular bootstrap percentages, aLRT values, or posterior probabilities were found in the true trees. The authors have observed that none of these values correlate with the probability that a given clade is present in the known phylogeny. The major conclusion is that none of the measures provide any information about the likelihood that an individual clade actually exists. It is also found that the mean of all clade support values on a tree closely reflects the average proportion of all clades that have been assigned correctly, and is thus a good representation of the overall accuracy of a phylogenetic tree.
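The comparison described above, counting how often clades with a given support value appear in the true tree, amounts to binning clades by support and computing the fraction correct in each bin. The following is a minimal sketch of that bookkeeping; the clade records, bin edges, and function name are hypothetical and do not reproduce the authors' simulations or results.

```python
def accuracy_by_support(clades, edges):
    """For each support bin, return (low, high, count, fraction found in the true tree).

    clades: list of (support, in_true_tree) pairs, support between 0 and 1.
    edges:  increasing bin edges; the last bin includes its upper edge.
    """
    rows = []
    for low, high in zip(edges[:-1], edges[1:]):
        is_last = high == edges[-1]
        hits = [ok for support, ok in clades
                if low <= support < high or (is_last and support == high)]
        fraction = sum(hits) / len(hits) if hits else None
        rows.append((low, high, len(hits), fraction))
    return rows


# Hypothetical clades from a reconstructed tree: (support value, present in true tree?).
CLADES = [
    (0.95, True), (0.99, True), (0.92, False), (1.00, True),
    (0.75, True), (0.72, False), (0.55, False),
]

table = accuracy_by_support(CLADES, edges=(0.5, 0.7, 0.9, 1.0))
for low, high, count, fraction in table:
    print(f"support {low:.1f}-{high:.1f}: n={count}, fraction correct={fraction}")

mean_support = sum(s for s, _ in CLADES) / len(CLADES)
overall_correct = sum(ok for _, ok in CLADES) / len(CLADES)
print(f"mean support={mean_support:.2f}, overall fraction correct={overall_correct:.2f}")
```

The last two lines compute the tree-level quantities the abstract contrasts with per-clade values: the mean support across all clades and the overall proportion of clades that are correct.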

11.
A new method, PATHd8, for estimating ultrametric trees from trees with edge (branch) lengths proportional to the number of substitutions is proposed. The method allows for an arbitrary number of reference nodes for time calibration, each defined either as absolute age, minimum age, or maximum age, and the tree need not be fully resolved. The method is based on estimating node ages by mean path lengths from the node to the leaves but correcting for deviations from a molecular clock suggested by reference nodes. As opposed to most existing methods allowing substitution rate variation, the new method smoothes substitution rates locally, rather than simultaneously over the whole tree, thus allowing for analysis of very large trees. The performance of PATHd8 is compared with other frequently used methods for estimating divergence times. In analyses of three separate data sets, PATHd8 gives similar divergence times to other methods, the largest difference being between crown group ages, where unconstrained nodes get younger ages when analyzed with PATHd8. Overall, chronograms obtained from other methods appear smoother, whereas PATHd8 preserves more of the heterogeneity seen in the original edge lengths. Divergence times are most evenly spread over the chronograms obtained from the Bayesian implementation and the clock-based Langley-Fitch method, and these two methods produce very similar ages for most nodes. Evaluations of PATHd8 using simulated data suggest that PATHd8 is slightly less precise compared with penalized likelihood, but it gives more sensible answers for extreme data sets. A clear advantage with PATHd8 is that it is more or less instantaneous even with trees having several thousand leaves, whereas other programs often run into problems when analyzing trees with hundreds of leaves. PATHd8 is implemented in freely available software.
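The core idea named above, estimating a node's relative age from the mean path length between that node and its descendant leaves, can be illustrated on a toy tree. This sketch is a simplification under stated assumptions: it averages over all descendant leaves and rescales with a single hypothetical calibration at the root, and it omits the local rate smoothing and the minimum and maximum age constraints that the abstract attributes to PATHd8. The tree, the calibration age, and the function names are invented for illustration.

```python
# Hypothetical rooted tree: node -> list of (child, branch length in substitutions per site).
CHILDREN = {
    "root": [("n1", 0.05), ("C", 0.20)],
    "n1": [("A", 0.10), ("B", 0.30)],
}


def leaf_paths(children, node):
    """Path lengths from a node down to each of its descendant leaves."""
    if node not in children:  # a leaf is at distance zero from itself
        return [0.0]
    return [length + path
            for child, length in children[node]
            for path in leaf_paths(children, child)]


def mean_path_length(children, node):
    paths = leaf_paths(children, node)
    return sum(paths) / len(paths)


def calibrated_ages(children, calibration_node, calibration_age):
    """Scale mean path lengths so the calibration node gets its assumed age."""
    rate = mean_path_length(children, calibration_node) / calibration_age
    return {node: mean_path_length(children, node) / rate for node in children}


# Assumed calibration: the root is 70 time units old (illustrative value only).
ages = calibrated_ages(CHILDREN, "root", 70.0)
for node, age in ages.items():
    print(f"{node}: {age:.1f}")
```

Because each node's estimate depends only on the paths below it, the calculation is a single traversal of the tree, which is consistent with the speed advantage the abstract reports for very large trees.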

12.
Cophylogeny is the congruence of phylogenetic relationships between two different groups of organisms due to their long‐term interaction. We investigated the use of tree shape distance measures to quantify the degree of cophylogeny. We implemented a reverse‐time simulation model of pathogen phylogenies within a fixed host tree, given cospeciation probability, host switching, and pathogen speciation rates. We used this model to evaluate 18 distance measures between host and pathogen trees including two kernel distances that we developed for labeled and unlabeled trees, which use branch lengths and accommodate different size trees. Finally, we used these measures to revisit published cophylogenetic studies, where authors described the observed associations as representing a high or low degree of cophylogeny. Our simulations demonstrated that some measures are more informative than others with respect to specific coevolution parameters especially when these did not assume extreme values. For real datasets, trees’ associations projection revealed clustering of high concordance studies suggesting that investigators are describing it in a consistent way. Our results support the hypothesis that measures can be useful for quantifying cophylogeny. This motivates their usage in the field of coevolution and supports the development of simulation‐based methods, i.e., approximate Bayesian computation, to estimate the underlying coevolutionary parameters.

13.
Aim: When hypotheses of historical biogeography are evaluated, age estimates of individual nodes in a phylogeny often have a direct impact on what explanation is concluded to be most likely. Confidence intervals of estimated divergence times obtained in molecular dating analyses are usually very large, but the uncertainty is rarely incorporated in biogeographical analyses. The aim of this study is to use the group Urophylleae, which has a disjunct pantropical distribution, to explore how the uncertainty in estimated divergence times affects conclusions in biogeographical analysis. Two hypotheses are evaluated: (1) long‐distance dispersal from Africa to Asia and the Neotropics, and (2) a continuous distribution in the boreotropics, probably involving migration across the North Atlantic Land Bridge, followed by isolation in equatorial refugia.
Location: Tropical and subtropical Asia, tropical Africa, and central and southern tropical America.
Methods: This study uses parsimony and Bayesian phylogenetic analyses of chloroplast DNA and nuclear ribosomal DNA data from 56 ingroup species, BEAST molecular dating and a Bayesian approach to dispersal–vicariance analysis (Bayes‐DIVA) to reconstruct the ancestral area of the group, and the dispersal–extinction–cladogenesis method to test biogeographical hypotheses.
Results: When the two models of geographic range evolution were compared using the maximum likelihood (ML) tree with mean estimates of divergence times, boreotropical migration was indicated to be much more likely than long‐distance dispersal. Analyses of a large sample of dated phylogenies did, however, show that this result was not consistent. The age estimate of one specific node had a major impact on likelihood values and on which model performed best. The results show that boreotropical migration provides a slightly better explanation of the geographical distribution patterns of extant Urophylleae than long‐distance dispersal.
Main conclusions: This study shows that results from biogeographical analyses based on single phylogenetic trees, such as a ML or consensus tree, can be misleading, and that it may be very important to take the uncertainty in age estimates into account. Methods that account for the uncertainty in topology, branch lengths and estimated divergence times are not commonly used in biogeographical inference today but should definitely be preferred in order to avoid unwarranted conclusions.

14.
In popular use of Bayesian phylogenetics, a default branch-length prior is almost universally applied without knowing how a different prior would have affected the outcome. We performed Bayesian and maximum likelihood (ML) inference of phylogeny based on empirical nucleotide sequence data from a family of lichenized ascomycetes, the Psoraceae, the morphological delimitation of which has been controversial. We specifically assessed the influence of the combination of Bayesian branch-length prior and likelihood model on the properties of the Markov chain Monte Carlo tree sample, including node support, branch lengths, and taxon stability. Data included two regions of the mitochondrial ribosomal RNA gene, the internal transcribed spacer region of the nuclear ribosomal RNA gene, and the protein-coding largest subunit of RNA polymerase II. Data partitioning was performed using Bayes' factors, whereas the best-fitting model of each partition was selected using the Bayesian information criterion (BIC). Given the data and model, short Bayesian branch-length priors generate higher numbers of strongly supported nodes as well as short and topologically similar trees sampled from parts of tree space that are largely unexplored by the ML bootstrap. Long branch-length priors generate fewer strongly supported nodes and longer and more dissimilar trees that are sampled mostly from inside the range of tree space sampled by the ML bootstrap. Priors near the ML distribution of branch lengths generate the best marginal likelihood and the highest frequency of "rogue" (unstable) taxa. The branch-length prior was shown to interact with the likelihood model. Trees inferred under complex partitioned models are more affected by the stretching effect of the branch-length prior. Fewer nodes are strongly supported under a complex model given the same branch-length prior. Irrespective of model, internal branches make up a larger proportion of total tree length under the shortest branch-length priors compared with longer priors. Relative effects on branch lengths caused by the branch-length prior can be problematic to downstream phylogenetic comparative methods making use of the branch lengths. Furthermore, given the same branch-length prior, trees are on average more dissimilar under a simple unpartitioned model compared with a more complex partitioned model. The distribution of ML branch lengths was shown to better fit a gamma or Pareto distribution than an exponential one. Model adequacy tests indicate that the best-fitting model selected by the BIC is insufficient for describing data patterns in 5 of 8 partitions. More general substitution models are required to explain the data in three of these partitions, one of which also requires nonstationarity. The two mitochondrial ribosomal RNA gene partitions need heterotachous models. We found no significant correlations between, on the one hand, the amount of ambiguous data or the smallest branch-length distance to another taxon and, on the other hand, the topological stability of individual taxa. Integrating over several exponentially distributed means under the best-fitting model, node support for the family Psoraceae, including Psora, Protoblastenia, and the Micarea sylvicola group, is approximately 0.96. Support for the genus Psora is distinctly lower, but we found no evidence to contradict the current classification.

15.

Background: Phylogenetic comparative methods are often improved by complete phylogenies with meaningful branch lengths (e.g., divergence dates). This study presents a dated molecular supertree for all 34 world pinniped species derived from a weighted matrix representation with parsimony (MRP) supertree analysis of 50 gene trees, each determined under a maximum likelihood (ML) framework. Divergence times were determined by mapping the same sequence data (plus two additional genes) on to the supertree topology and calibrating the ML branch lengths against a range of fossil calibrations. We assessed the sensitivity of our supertree topology in two ways: 1) a second supertree with all mtDNA genes combined into a single source tree, and 2) likelihood-based supermatrix analyses. Divergence dates were also calculated using a Bayesian relaxed molecular clock with rate autocorrelation to test the sensitivity of our supertree results further.

16.
Recent computational advances provide novel opportunities to infer species trees based on multiple independent loci. Thus, single gene trees no longer need suffice as proxies for species phylogenies. Several methods have been developed to deal with the challenges posed by incomplete and stochastic lineage sorting. In this study, we employed four Bayesian methods to infer the phylogeny of a clade of 11 recently diverged oriole species within the genus Icterus. We obtained well-resolved and mostly congruent phylogenies using a set of seven unlinked nuclear intron loci and sampling multiple individuals per species. Most notably, Bayesian concordance analysis generally agreed well with concatenation; the two methods agreed fully on eight of nine nodes. The coalescent-based method BEAST further supported six of these eight nodes. The fourth method used, BEST, failed to converge despite exhaustive efforts to optimize the tree search. Overall, the results obtained by new species tree methods and concatenation generally corroborate our findings from previous analyses and data sets. However, we found striking disagreement between mitochondrial and nuclear DNA involving relationships within the northern oriole group. Our results highlight the danger of reliance on mtDNA alone for phylogenetic inference. We demonstrate that in spite of low variability and incomplete lineage sorting, multiple nuclear loci can produce largely congruent phylogenies based on multiple species tree methods, even for very closely-related species.

17.
Until recently, molecular phylogenies based on a single or few orthologous genes often yielded contradictory results. Using multiple genes in a large concatenation was proposed to end these incongruences. Here we show that single-gene phylogenies often produce incongruences, albeit ones lacking statistically significant support. By contrast, the use of different tree reconstruction methods on different partitions of the concatenated supergene leads to well-resolved, but real (i.e. statistically significant) incongruences. Gathering a large amount of data is not sufficient to produce reliable trees, given the current limitation of tree reconstruction methods, especially when the quality of data is poor. We propose that selecting only data that contain minimal nonphylogenetic signals takes full advantage of phylogenomics and markedly reduces incongruence.

18.
Because of the stochastic way in which lineages sort during speciation, gene trees may differ in topology from each other and from species trees. Surprisingly, assuming that genetic lineages follow a coalescent model of within-species evolution, we find that for any species tree topology with five or more species, there exist branch lengths for which gene tree discordance is so common that the most likely gene tree topology to evolve along the branches of a species tree differs from the species phylogeny. This counterintuitive result implies that in combining data on multiple loci, the straightforward procedure of using the most frequently observed gene tree topology as an estimate of the species tree topology can be asymptotically guaranteed to produce an incorrect estimate. We conclude with suggestions that can aid in overcoming this new obstacle to accurate genomic inference of species phylogenies.

19.
Xu K, Bezakova I, Bunimovich L, Yi SV. Proteomics, 2011, 11(10): 1857-1867
We investigated the biological significance of path lengths in 12 protein-protein interaction (PPI) networks. We put forward three predictions, based on the idea that biological complexity influences path lengths. First, at the network level, path lengths are generally longer in PPIs than in random networks. Second, this pattern is more pronounced in more complex organisms. Third, within a PPI network, path lengths of individual proteins are biologically significant. We found that in 11 of the 12 species, average path lengths in PPI networks are significantly longer than those in randomly rewired networks. The PPI network of the malaria parasite Plasmodium falciparum, however, does not exhibit deviation from rewired networks. Furthermore, eukaryotic PPIs exhibit significantly greater deviation from randomly rewired networks than prokaryotic PPIs. Thus our study highlights the potentially meaningful variation in path lengths of PPI networks. Moreover, node eccentricity, defined as the longest path from a protein to others, is significantly correlated with the levels of gene expression and dispensability in the yeast PPI network. We conclude that biological complexity influences both global and local properties of path lengths in PPI networks. Investigating variation of path lengths may provide new tools to analyze the evolution of functional modules in biological systems.
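The two quantities discussed above, average path length at the network level and eccentricity at the node level, can both be computed with breadth-first search on an unweighted graph. The sketch below does so for a small hypothetical interaction network; the protein names and edges are invented, eccentricity is taken here as the longest shortest path from a node to any other node, and the graph is assumed to be connected. It does not include the comparison against randomly rewired networks described in the abstract.

```python
from collections import deque

# Hypothetical undirected interaction network (adjacency lists), assumed connected.
PPI = {
    "P1": ["P2"],
    "P2": ["P1", "P3", "P5"],
    "P3": ["P2", "P4"],
    "P4": ["P3"],
    "P5": ["P2"],
}


def shortest_paths(adj, source):
    """Breadth-first search: shortest path length (in edges) from source to each node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def eccentricity(adj, node):
    """Longest shortest path from node to any other node."""
    return max(shortest_paths(adj, node).values())


def average_path_length(adj):
    """Mean shortest path length over all ordered pairs of distinct nodes."""
    total = 0
    pairs = 0
    for source in adj:
        for target, d in shortest_paths(adj, source).items():
            if target != source:
                total += d
                pairs += 1
    return total / pairs


print("average path length:", average_path_length(PPI))
print("eccentricity:", {node: eccentricity(PPI, node) for node in PPI})
```

In this toy network the hub P2 has the smallest eccentricity, while the peripheral proteins P1, P4, and P5 have the largest, which is the kind of node-level variation the abstract relates to expression and dispensability.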

20.
Interest in methods that estimate speciation and extinction rates from molecular phylogenies has increased over the last decade. The application of such methods requires reliable estimates of tree topology and node ages, which are frequently obtained using standard phylogenetic inference combining concatenated loci and molecular dating. However, this practice disregards population‐level processes that generate gene tree/species tree discordance. We evaluated the impact of employing concatenation and coalescent‐based phylogeny inference in recovering the correct macroevolutionary regime using simulated data based on the well‐established diversification rate shift of delphinids in Cetacea. We found that under scenarios of strong incomplete lineage sorting, macroevolutionary analysis of phylogenies inferred by concatenating loci failed to recover the delphinid diversification shift, while the coalescent‐based tree consistently retrieved the correct rate regime. We suggest that ignoring microevolutionary processes reduces the power of methods that estimate macroevolutionary regimes from molecular data.
