Similar articles
20 similar articles found
1.
A general comparison of relaxed molecular clock models   (Total citations: 4; self-citations: 0; citations by others: 4)
Several models have been proposed to relax the molecular clock in order to estimate divergence times. However, it is unclear which model has the best fit to real data and should therefore be used to perform molecular dating. In particular, we do not know whether rate autocorrelation should be considered or which prior on divergence times should be used. In this work, we propose a general benchmark of alternative relaxed clock models. We have reimplemented most of the existing models, including the popular lognormal model, as well as various prior choices for divergence times (birth-death, Dirichlet, uniform), in a common Bayesian statistical framework. We also propose a new autocorrelated model, called the "CIR" process, with well-defined stationary properties. We assess the relative fitness of these models and priors, when applied to 3 different protein data sets from eukaryotes, vertebrates, and mammals, by computing Bayes factors using a numerical method called thermodynamic integration. We find that the 2 autocorrelated models, CIR and lognormal, have a similar fit and clearly outperform uncorrelated models on all 3 data sets. In contrast, the optimal choice for the divergence time prior is more dependent on the data investigated. Altogether, our results provide useful guidelines for model choice in the field of molecular dating while opening the way to more extensive model comparisons.
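The abstract above relies on thermodynamic integration to obtain Bayes factors. As a rough illustration only, the sketch below applies the idea to a toy conjugate model (theta ~ N(0, 1), y | theta ~ N(theta, 1)) chosen here because its marginal likelihood is known exactly; the model, the function names and the temperature grid are assumptions for illustration and are not the authors' implementation. The log marginal likelihood is estimated as the integral over beta in [0, 1] of the expected log-likelihood under the power posterior.

```python
import math
import random

def log_marginal_by_ti(y, n_temps=21, n_draws=2000, seed=0):
    """Estimate ln p(y) for the toy model theta ~ N(0,1), y | theta ~ N(theta,1)."""
    rng = random.Random(seed)
    betas = [i / (n_temps - 1) for i in range(n_temps)]
    expected_loglik = []
    for beta in betas:
        # In this toy model the power posterior (likelihood^beta * prior)
        # is Gaussian, so it can be sampled directly without MCMC.
        post_mean = beta * y / (1.0 + beta)
        post_sd = math.sqrt(1.0 / (1.0 + beta))
        total = 0.0
        for _ in range(n_draws):
            theta = rng.gauss(post_mean, post_sd)
            total += -0.5 * math.log(2.0 * math.pi) - 0.5 * (y - theta) ** 2
        expected_loglik.append(total / n_draws)
    # Trapezoidal rule over the temperature beta in [0, 1].
    return sum(
        0.5 * (expected_loglik[i] + expected_loglik[i + 1]) * (betas[i + 1] - betas[i])
        for i in range(n_temps - 1)
    )

def exact_log_marginal(y):
    # Marginally y ~ N(0, 2) in this toy model.
    return -0.5 * math.log(2.0 * math.pi * 2.0) - y * y / 4.0

estimate = log_marginal_by_ti(1.0)
exact = exact_log_marginal(1.0)
```

A log Bayes factor between two models would then be the difference of two such log marginal likelihood estimates; in a real dating analysis the power posteriors would have to be sampled by MCMC rather than drawn directly.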

2.
Ephedra comprises approximately 50 species, which are roughly equally distributed between the Old and New World deserts, but not in the intervening regions (amphitropical range). Great heterogeneity in the substitution rates of Gnetales (Ephedra, Gnetum, and Welwitschia) has made it difficult to infer the ages of the major divergence events in Ephedra, such as the timing of the Beringian disjunction in the genus and the entry into South America. Here, we use data from as many Gnetales species and genes as available from GenBank and from a recent study to investigate the timing of the major divergence events. Because of the tradeoff between the amount of missing data and taxon/gene sampling, we reduced the initial matrix of 265 accessions and 12 loci to 95 accessions and 10 loci, and further to 42 species (and 7736 aligned nucleotides) to achieve stationary distributions in the Bayesian molecular clock runs. Results from a relaxed clock with an uncorrelated rates model and fossil-based calibration reveal that New World species are monophyletic and diverged from their mostly Asian sister clade some 30 mya, fitting with many other Beringian disjunctions. The split between the single North American and the single South American clade occurred approximately 25 mya, well before the closure of the Panamanian Isthmus. Overall, the biogeographic history of Ephedra appears dominated by long-distance dispersal, but finer-scale studies are needed to test this hypothesis.

3.
Ephedra comprises approximately 50 species, which are roughly equally distributed between the Old and New World deserts, but not in the intervening regions (amphitropical range). Great heterogeneity in the substitution rates of Gnetales (Ephedra, Gnetum, and Welwitschia) has made it difficult to infer the ages of the major divergence events in Ephedra, such as the timing of the Beringian disjunction in the genus and the entry into South America. Here, we use data from as many Gnetales species and genes as available from GenBank and from a recent study to investigate the timing of the major divergence events. Because of the tradeoff between the amount of missing data and taxon/gene sampling, we reduced the initial matrix of 265 accessions and 12 loci to 95 accessions and 10 loci, and further to 42 species (and 7736 aligned nucleotides) to achieve stationary distributions in the Bayesian molecular clock runs. Results from a relaxed clock with an uncorrelated rates model and fossil‐based calibration reveal that New World species are monophyletic and diverged from their mostly Asian sister clade some 30 mya, fitting with many other Beringian disjunctions. The split between the single North American and the single South American clade occurred approximately 25 mya, well before the closure of the Panamanian Isthmus. Overall, the biogeographic history of Ephedra appears dominated by long‐distance dispersal, but finer‐scale studies are needed to test this hypothesis.

4.
The chronological scenario of the evolution of hominoid primates has been thoroughly investigated since the advent of the molecular clock hypothesis. With the availability of genomic sequences for all hominid genera and other anthropoids, we may have reached the point at which the information from sequence data alone will not provide further evidence for the inference of the hominid evolution timescale. To verify this conjecture, we have compiled a genomic data set for all of the anthropoid genera. Our estimate places the Homo/Pan divergence at approximately 7.4 Ma, the Gorilla lineage divergence at approximately 9.7 Ma, the basal Hominidae divergence at 18.1 Ma and the basal Hominoidea divergence at 20.6 Ma. By inferring the theoretical limit distribution of posterior densities under a Bayesian framework, we show that it is unlikely that lengthier alignments or the availability of new genomic sequences will provide additional information to reduce the uncertainty associated with the divergence time estimates of the four hominid genera. A reduction of this uncertainty will be achieved only by the inclusion of more informative calibration priors.

5.
Begonia is a mega‐diverse genus comprising c. 1500 species of herbs, shrubs and epiphytes with a near pantropical distribution. Previous date estimates for the most recent common ancestor of Begonia have placed the evolution of this genus into a broad temporal context, but the issue of an absolute date estimate remains open. In this study, we attempt to estimate absolute DNA divergence dates for Begonia and, in doing so, address some of the many factors that can affect such estimates. The largest source of variance in our estimates was uncertainty in the calibration constraints and in the phylogenetic distance between these constraints and Begonia. Another large source of variance was the choice among the alternative methods of analysis investigated. Less variance resulted from the alternative DNA datasets and combinations of calibration constraints assessed. Our date estimates suggest that the most recent common ancestor of Begonia could have diversified from the end of the Cretaceous to the beginning of the Neogene, probably during a period of global cooling from the mid Eocene to early Oligocene. These estimates imply that the near pantropical distribution of extant Begonia was generated by intercontinental dispersal after the ancient inferred break up of the supercontinent, Gondwana. © 2009 The Linnean Society of London, Botanical Journal of the Linnean Society, 2009, 159, 363–380.

6.
Molecular estimates of evolutionary timescales have an important role in a range of biological studies. Such estimates can be made using methods based on molecular clocks, including models that are able to account for rate variation across lineages. All clock models share a dependence on calibrations, which enable estimates to be given in absolute time units. There are many available methods for incorporating fossil calibrations, but geological and climatic data can also provide useful calibrations for molecular clocks. However, a number of strong assumptions need to be made when using these biogeographic calibrations, leading to wide variation in their reliability and precision. In this review, we describe the nature of biogeographic calibrations and the assumptions that they involve. We present an overview of the different geological and climatic events that can provide informative calibrations, and explain how such temporal information can be incorporated into dating analyses.

7.
In recent years, a number of phylogenetic methods have been developed for estimating molecular rates and divergence dates under models that relax the molecular clock constraint by allowing rate change throughout the tree. These methods are being used with increasing frequency, but there have been few studies into their accuracy. We tested the accuracy of several relaxed-clock methods (penalized likelihood and Bayesian inference using various models of rate change) using nucleotide sequences simulated on a nine-taxon tree. When the sequences evolved with a constant rate, the methods were able to infer rates accurately, but estimates were more precise when a molecular clock was assumed. When the sequences evolved under a model of auto-correlated rate change, rates were accurately estimated using penalized likelihood and by Bayesian inference using lognormal and exponential models of rate change, while other models did not perform as well. When the sequences evolved under a model of uncorrelated rate change, only Bayesian inference using an exponential rate model performed well. Collectively, the results provide a strong recommendation for using the exponential model of rate change if a conservative approach to divergence time estimation is required. A case study is presented in which we use a simulation-based approach to examine the hypothesis of elevated rates in the Cambrian period, and it is found that these high rate estimates might be an artifact of the rate estimation method. If this bias is present, then the ages of metazoan divergences would be systematically underestimated. The results of this study have implications for studies of molecular rates and divergence dates.

8.
The age of the angiosperms: a molecular timescale without a clock   (Total citations: 8; self-citations: 0; citations by others: 8)
The age of the angiosperms has long been of interest to botanists and evolutionary biologists. Many early efforts to date the age of the angiosperms and evolutionary divergences within the angiosperm clade using a molecular clock have yielded age estimates that are grossly inconsistent with the fossil record. We investigated the age of angiosperms using Bayesian relaxed clock (BRC) and penalized likelihood (PL) approaches. Both of these methods allow the incorporation of multiple fossil constraints into the optimization procedure. The BRC method allows a range of values for among-lineage rate of substitution, from a nearly clocklike behavior to a condition in which each branch is allowed an optimal substitution rate, and also accounts for variation in molecular evolution across multiple genes. A topology derived from an analysis of genes from all three plant genomes for 71 taxa was used as a backbone. The effects on age estimates of different genes, single-gene versus concatenated datasets, and the inclusion and assumptions of fossils as age constraints were examined. In addition, the influence of prior distributions on estimates of divergence times was also explored. These results indicate that widely divergent age estimates can result from the different methods (198-139 million years ago), different sources of data (275-122 million years ago), and the inclusion of temporal constraints to topologies. Most dates, however, are between 180 and 140 million years ago, suggesting a Middle Jurassic-Early Cretaceous origin of flowering plants, predating the oldest unequivocal fossil angiosperms by about 45-5 million years. Nonetheless, these dates are consistent with other recent studies that have used methods that relax the assumption of a strict molecular clock and also agree with the hypothesis that the angiosperms may be somewhat older than the fossil record indicates.

9.
The molecular clock presents a means of estimating evolutionary rates and timescales using genetic data. These estimates can lead to important insights into evolutionary processes and mechanisms, as well as providing a framework for further biological analyses. To deal with rate variation among genes and among lineages, a diverse range of molecular‐clock methods have been developed. These methods have been implemented in various software packages and differ in their statistical properties, ability to handle different models of rate variation, capacity to incorporate various forms of calibrating information and tractability for analysing large data sets. Choosing a suitable molecular‐clock model can be a challenging exercise, but a number of model‐selection techniques are available. In this review, we describe the different forms of evolutionary rate heterogeneity and explain how they can be accommodated in molecular‐clock analyses. We provide an outline of the various clock methods and models that are available, including the strict clock, local clocks, discrete clocks and relaxed clocks. Techniques for calibration and clock‐model selection are also described, along with methods for handling multilocus data sets. We conclude our review with some comments about the future of molecular clocks.

10.
Divergence time studies rely on calibration information from several sources. The age of volcanic islands is one of the standard references to obtain chronological data to estimate the absolute times of lineage diversifications. This strategy assumes that cladogenesis is necessarily associated with island formation, and punctual calibrations are commonly used to date the splits of endemic island species. Here, we re-examined three studies that inferred divergence times for different Hawaiian lineages assuming fixed calibration points. We show that, by permitting probabilistic calibrations, some divergences are estimated to be significantly younger or older than the age of the island formation, thus yielding distinct ecological scenarios for the speciation process. The results highlight the importance of using calibration information correctly, as well as the possibility of incorporating volcanic island studies into a formal, biogeographical hypothesis-testing framework.

11.
Recurrent events can be stopped by a terminal event, which commonly occurs in biomedical and clinical studies. In this situation, dependent censoring is encountered because of potential dependence between these two event processes, leading to invalid inference if recurrent events are analyzed alone. The joint frailty model is one of the widely used approaches to jointly model these two processes by sharing the same frailty term. One important assumption is that recurrent and terminal event processes are conditionally independent given the subject‐level frailty; however, this could be violated when the dependency also depends on time‐varying covariates across recurrences. Furthermore, the marginal correlation between the two event processes under traditional frailty modeling has no closed‐form solution for estimation and is difficult to interpret. In order to fill these gaps, we propose a novel joint frailty‐copula approach to model recurrent events and a terminal event with relaxed assumptions. A Metropolis–Hastings within Gibbs sampler algorithm is used for parameter estimation. Extensive simulation studies are conducted to evaluate the efficiency, robustness, and predictive performance of our proposal. The simulation results show that, compared with the joint frailty model, the bias and mean squared error of the proposal are smaller when the conditional independence assumption is violated. Finally, we apply our method to a real example extracted from the MarketScan database to study the association between recurrent strokes and mortality.

12.
A common problem in molecular phylogenetics is choosing a model of DNA substitution that does a good job of explaining the DNA sequence alignment without introducing superfluous parameters. A number of methods have been used to choose among a small set of candidate substitution models, such as the likelihood ratio test, the Akaike Information Criterion (AIC), the Bayesian Information Criterion (BIC), and Bayes factors. Current implementations of any of these criteria suffer from the limitation that only a small set of models are examined, or that the test does not allow easy comparison of non-nested models. In this article, we expand the pool of candidate substitution models to include all possible time-reversible models. This set includes seven models that have already been described. We show how Bayes factors can be calculated for these models using reversible jump Markov chain Monte Carlo, and apply the method to 16 DNA sequence alignments. For each data set, we compare the model with the best Bayes factor to the best models chosen using AIC and BIC. We find that the best model under any of these criteria is not necessarily the most complicated one; models with an intermediate number of substitution types typically do best. Moreover, almost all of the models that are chosen as best do not constrain a transition rate to be the same as a transversion rate, suggesting that it is the transition/transversion rate bias that plays the largest role in determining which models are selected. Importantly, the reversible jump Markov chain Monte Carlo algorithm described here allows estimation of phylogeny (and other phylogenetic model parameters) to be performed while accounting for uncertainty in the model of DNA substitution.
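The AIC and BIC comparisons mentioned in the abstract above follow the standard definitions AIC = 2k - 2 ln L and BIC = k ln n - 2 ln L. The short sketch below applies them to three substitution models; the log-likelihood values, the alignment length and the parameter counts (substitution parameters only) are made-up numbers for illustration, not results from the article.

```python
import math

def aic(log_lik, k):
    # Akaike Information Criterion: lower is better.
    return 2.0 * k - 2.0 * log_lik

def bic(log_lik, k, n):
    # Bayesian Information Criterion with n data points (here, alignment sites).
    return k * math.log(n) - 2.0 * log_lik

# Hypothetical fits: model name -> (log-likelihood, free substitution parameters).
fits = {"JC69": (-5230.0, 0), "HKY85": (-5100.0, 4), "GTR": (-5097.0, 8)}
n_sites = 1000  # hypothetical alignment length

best_aic = min(fits, key=lambda m: aic(*fits[m]))
best_bic = min(fits, key=lambda m: bic(*fits[m], n_sites))
```

With these illustrative numbers, the small likelihood gain of the richest model does not offset its extra parameters, so both criteria pick the intermediate model.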

13.
Evolutionary timescales can be estimated from genetic data using phylogenetic methods based on the molecular clock. To account for molecular rate variation among lineages, a number of relaxed‐clock models have been developed. Some of these models assume that rates vary among lineages in an autocorrelated manner, so that closely related species share similar rates. In contrast, uncorrelated relaxed clocks allow all of the branch‐specific rates to be drawn from a single distribution, without assuming any correlation between rates along neighbouring branches. There is uncertainty about which of these two classes of relaxed‐clock models is more appropriate for biological data. We present an R package, NELSI, that allows the evolution of DNA sequences to be simulated according to a range of clock models. Using data generated by this package, we assessed the ability of two Bayesian phylogenetic methods to distinguish among different relaxed‐clock models and to quantify rate variation among lineages. The results of our analyses show that rate autocorrelation is typically difficult to detect, even when there is complete taxon sampling. This provides a potential explanation for past failures to detect rate autocorrelation in a range of data sets.
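The contrast between the two classes of relaxed clock described above can be sketched in a few lines. The code below is not the NELSI API (which is an R package); it is a simplified Python illustration that draws rates along a single chain of successive branches, and all function names and parameter values are assumptions. Uncorrelated rates are independent lognormal draws, while each autocorrelated rate is a lognormal draw centred on the rate of the preceding branch.

```python
import math
import random

def uncorrelated_rates(n_branches, mean_rate, sigma, rng):
    # Independent draws from one lognormal; mu is set so E[rate] = mean_rate.
    mu = math.log(mean_rate) - 0.5 * sigma ** 2
    return [rng.lognormvariate(mu, sigma) for _ in range(n_branches)]

def autocorrelated_rates(n_branches, root_rate, sigma, rng):
    # Each branch rate is lognormal around its parent branch's rate
    # (on the log scale), so neighbouring branches have similar rates.
    rates = []
    parent = root_rate
    for _ in range(n_branches):
        child = rng.lognormvariate(math.log(parent), sigma)
        rates.append(child)
        parent = child
    return rates

def mean_abs_log_step(rates):
    # Average absolute change in log rate between successive branches.
    steps = [abs(math.log(b) - math.log(a)) for a, b in zip(rates, rates[1:])]
    return sum(steps) / len(steps)

rng = random.Random(1)
unc = uncorrelated_rates(500, 0.01, 0.3, rng)
aut = autocorrelated_rates(500, 0.01, 0.3, rng)
```

Comparing `mean_abs_log_step(unc)` with `mean_abs_log_step(aut)` gives a crude view of the difference: successive rates change less under the autocorrelated scheme. On a real tree the parent of each branch would be taken from the topology rather than from list order.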

14.
Violation of the molecular clock has been amply documented, and is now routinely taken into account by molecular dating methods. Comparative analyses have revealed a systematic component in rate variation, relating it to the evolution of life-history traits, such as body size or generation time. Life-history evolution can be reconstructed using Brownian models. However, the resulting estimates are typically uncertain, and potentially sensitive to the underlying assumptions. As a way of obtaining more accurate ancestral trait and divergence time reconstructions, correlations between life-history traits and substitution rates could be used as an additional source of information. In this direction, a Bayesian framework for jointly reconstructing rates, traits, and dates was previously introduced. Here, we apply this model to a 17 protein-coding gene alignment for 73 placental taxa. Our analysis indicates that the coupling between molecules and life history can lead to a reevaluation of ancestral life-history profiles, in particular for groups displaying convergent evolution in body size. However, reconstructions are sensitive to fossil calibrations and to the Brownian assumption. Altogether, our analysis suggests that further integrating inference of rates and traits might be particularly useful for neontological macroevolutionary comparative studies.

15.

Background

Molecular dating has gained ever-increasing interest since the molecular clock hypothesis was proposed in the 1960s. Molecular dating provides detailed temporal frameworks for divergence events in phylogenetic trees, allowing diverse evolutionary questions to be addressed. The key aspect of the molecular clock hypothesis, namely that differences in DNA or protein sequence between two species are proportional to the time elapsed since they diverged, was soon shown to be untenable. Other approaches were proposed to take into account rate heterogeneity among lineages, but the calibration process, by which relative times are transformed into absolute ages, has received little attention until recently. New methods have now been proposed to resolve potential sources of error associated with the calibration of phylogenetic trees, particularly those involving use of the fossil record.

Scope and Conclusions

The use of the fossil record as a source of independent information in the calibration process is the main focus of this paper; other sources of calibration information are also discussed. Particularly error-prone aspects of fossil calibration are identified, such as fossil dating, the phylogenetic placement of the fossil and the incompleteness of the fossil record. Methods proposed to tackle one or more of these potential error sources are discussed (e.g. fossil cross-validation, prior distribution of calibration points and confidence intervals on the fossil record). In conclusion, the fossil record remains the most reliable source of information for the calibration of phylogenetic trees, although associated assumptions and potential bias must be taken into account.
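The calibration step described in this entry, by which relative divergences become absolute ages, can be illustrated under the strict clock hypothesis stated in the Background. In the sketch below, a pair of species with a known (for example fossil-based) split age gives a rate, and that rate converts another pairwise distance into a date. The function names and all numbers are hypothetical, and the strict clock itself is the very assumption the entry notes was shown to be untenable, so this is only the simplest possible case.

```python
def clock_rate(distance, calibration_age):
    # Both lineages accumulate substitutions since the split,
    # so the pairwise distance covers twice the elapsed time.
    return distance / (2.0 * calibration_age)

def divergence_time(distance, rate):
    # Invert the same relationship for an undated pair.
    return distance / (2.0 * rate)

# Hypothetical calibration: 0.10 substitutions/site between two species
# whose split is dated at 50 Myr by a fossil.
rate = clock_rate(0.10, 50.0)          # substitutions/site/Myr per lineage
age = divergence_time(0.04, rate)      # age in Myr of a pair 0.04 apart
```

Any error in the calibration age propagates proportionally to every date derived from it, which is one way to see why the entry treats fossil dating and placement as critical.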

16.
We implement a Bayesian Markov chain Monte Carlo algorithm for estimating species divergence times that uses heterogeneous data from multiple gene loci and accommodates multiple fossil calibration nodes. A birth-death process with species sampling is used to specify a prior for divergence times, which allows easy assessment of the effects of that prior on posterior time estimates. We propose a new approach for specifying calibration points on the phylogeny, which allows the use of arbitrary and flexible statistical distributions to describe uncertainties in fossil dates. In particular, we use soft bounds, so that the probability that the true divergence time is outside the bounds is small but nonzero. A strict molecular clock is assumed in the current implementation, although this assumption may be relaxed. We apply our new algorithm to two data sets concerning divergences of several primate species, to examine the effects of the substitution model and of the prior for divergence times on Bayesian time estimation. We also conduct computer simulation to examine the differences between soft and hard bounds. We demonstrate that divergence time estimation is intrinsically hampered by uncertainties in fossil calibrations, and the error in Bayesian time estimates will not go to zero with increased amounts of sequence data. Our analyses of both real and simulated data demonstrate potentially large differences between divergence time estimates obtained using soft versus hard bounds and a general superiority of soft bounds. Our main findings are as follows. (1) When the fossils are consistent with each other and with the molecular data, and the posterior time estimates are well within the prior bounds, soft and hard bounds produce similar results. (2) When the fossils are in conflict with each other or with the molecules, soft and hard bounds behave very differently; soft bounds allow sequence data to correct poor calibrations, while poor hard bounds are impossible to overcome by any amount of data. (3) Soft bounds eliminate the need for "safe" but unrealistically high upper bounds, which may bias posterior time estimates. (4) Soft bounds allow more reliable assessment of estimation errors, while hard bounds generate misleadingly high precisions when fossils and molecules are in conflict.
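The difference between hard and soft bounds described above can be made concrete with a pair of calibration densities. The sketch below is one possible construction, not necessarily the one used in the article: the density is flat between the bounds and holds 95% of the probability, with a power-law tail below the minimum and an exponential tail above the maximum, each holding 2.5%. The tail shapes, the 2.5% tail mass and the function names are assumptions made for illustration.

```python
import math

def hard_bound_density(t, t_min, t_max):
    # Uniform prior: ages outside the bounds are impossible.
    return 1.0 / (t_max - t_min) if t_min <= t <= t_max else 0.0

def soft_bound_density(t, t_min, t_max, tail=0.025):
    # Flat between the bounds, with small but nonzero probability outside.
    if t <= 0.0:
        return 0.0
    height = (1.0 - 2.0 * tail) / (t_max - t_min)
    if t < t_min:
        # Power-law tail on (0, t_min) with total mass `tail`,
        # matched to `height` at t_min so the density is continuous.
        theta = height * t_min / tail
        return tail * theta / t_min * (t / t_min) ** (theta - 1.0)
    if t > t_max:
        # Exponential tail above t_max with total mass `tail`,
        # also matched to `height` at t_max.
        lam = height / tail
        return tail * lam * math.exp(-lam * (t - t_max))
    return height
```

For bounds of 10 and 20 Myr, `hard_bound_density(21.0, 10.0, 20.0)` is exactly zero, so no amount of sequence data can move the posterior there, whereas `soft_bound_density(21.0, 10.0, 20.0)` is small but positive, which is what lets the data override a poor calibration.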

17.
Spliceosomal introns are present in almost all eukaryotic genes, yet little is known about their origin and turnover in the majority of eukaryotic phyla. There is no agreement whether most introns are ancestral and have been lost in some lineage or have been gained recently. We addressed this question by analyzing the spatial and temporal distribution of introns in actins of foraminifera, a group of testate protists whose exceptionally rich fossil record permits the calibration of molecular phylogenies to date intron origins. We identified 24 introns dispersed along the sequence of two foraminiferan actin paralogues and actin deviating proteins, an unconventional type of fast-evolving actin found in some foraminifera. Comparison of intron positions indicates that 20 of 24 introns are specific to foraminifera. Four introns shared between foraminifera and other eukaryotes were interpreted as parallel gains because they have been found only in single species belonging to phylogenetically distinctive lineages. Moreover, additional recent intron gain due to the transfer between the actin paralogues was observed in two cultured species. Based on a relaxed molecular clock timescale, we conclude that intron gains in actin took place throughout the evolution of foraminifera, with the oldest introns inserted between 550 and 500 million years ago and the youngest ones acquired less than 100 million years ago. [Reviewing Editor: Dr. Debashish Bhattacharya]

18.
A limiting factor in many molecular dating studies is shortage of reliable calibrations. Current methods for choosing calibrations (e.g. cross-validation) treat them as either correct or incorrect, whereas calibrations probably lie on a continuum from highly accurate to very poor. Bayesian relaxed clock analysis permits inclusion of numerous candidate calibrations as priors: provided most calibrations are reliable, the model appropriate and the data informative, the accuracy of each calibration prior can be evaluated. If a calibration is accurate, then the analysis will support the prior so that the posterior estimate reflects the prior; if a calibration is poor, the posterior will be forced away from the prior. We use this approach to test two fossil dates recently proposed as standard calibrations within vertebrates. The proposed bird-crocodile calibration (approx. 247 Myr ago) appears to be accurate, but the proposed bird-lizard calibration (approx. 255 Myr ago) is substantially too recent.

19.
On prequential model assessment in life history analysis   (Total citations: 1; self-citations: 0; citations by others: 1)
Arjas, Elja; Gasbarra, Dario. Biometrika, 1997, 84(3): 505-522

20.
Recent advances have allowed for both morphological fossil evidence and molecular sequences to be integrated into a single combined inference of divergence dates under the rule of Bayesian probability. In particular, the fossilized birth–death tree prior and the Lewis-Mk model of discrete morphological evolution allow for the estimation of both divergence times and phylogenetic relationships between fossil and extant taxa. We exploit this statistical framework to investigate the internal consistency of these models by producing phylogenetic estimates of the age of each fossil in turn, within two rich and well-characterized datasets of fossil and extant species (penguins and canids). We find that the estimation accuracy of fossil ages is generally high with credible intervals seldom excluding the true age and median relative error in the two datasets of 5.7% and 13.2%, respectively. The median relative standard error (RSD) was 9.2% and 7.2%, respectively, suggesting good precision, although with some outliers. In fact, in the two datasets we analyse, the phylogenetic estimate of fossil age is on average less than 2 Myr from the mid-point age of the geological strata from which it was excavated. The high level of internal consistency found in our analyses suggests that the Bayesian statistical model employed is an adequate fit for both the geological and morphological data, and provides evidence from real data that the framework used can accurately model the evolution of discrete morphological traits coded from fossil and extant taxa. We anticipate that this approach will have diverse applications beyond divergence time dating, including dating fossils that are temporally unconstrained, testing of the ‘morphological clock’, and for uncovering potential model misspecification and/or data errors when controversial phylogenetic hypotheses are obtained based on combined divergence dating analyses. This article is part of the themed issue ‘Dating species divergences using rocks and clocks’.
