Similar literature
20 similar articles found
1.
Multicellular animals, or Metazoa, appear in the fossil records between 575 and 509 million years ago (MYA). At odds with paleontological evidence, molecular estimates of basal metazoan divergences have been consistently older than 700 MYA. However, those date estimates were based on the molecular clock hypothesis, which is almost always violated. To relax this hypothesis, we have implemented a Bayesian approach to describe the change of evolutionary rate over time. Analysis of 22 genes from the nuclear and the mitochondrial genomes under the molecular clock assumption produced old date estimates, similar to those from previous studies. However, by allowing rates to vary in time and by taking small species-sampling fractions into account, we obtained much younger estimates, broadly consistent with the fossil records. In particular, the date of protostome-deuterostome divergence was on average 582 +/- 112 MYA. These results were found to be robust to specification of the model of rate change. The clock assumption thus had a dramatic effect on date estimation. However, our results appeared sensitive to the prior model of cladogenesis, although the oldest estimates (791 +/- 246 MYA) were obtained under a suboptimal model. Bayes posterior estimates of evolutionary rates indicated at least one major burst of molecular evolution at the end of the Precambrian when protostomes and deuterostomes diverged. We stress the importance of assumptions about rates on date estimation and suggest that the large discrepancies between the molecular and fossil dates of metazoan divergences might partly be due to biases in molecular date estimation.

2.
In recent years, a number of phylogenetic methods have been developed for estimating molecular rates and divergence dates under models that relax the molecular clock constraint by allowing rate change throughout the tree. These methods are being used with increasing frequency, but there have been few studies into their accuracy. We tested the accuracy of several relaxed-clock methods (penalized likelihood and Bayesian inference using various models of rate change) using nucleotide sequences simulated on a nine-taxon tree. When the sequences evolved with a constant rate, the methods were able to infer rates accurately, but estimates were more precise when a molecular clock was assumed. When the sequences evolved under a model of auto-correlated rate change, rates were accurately estimated using penalized likelihood and by Bayesian inference using lognormal and exponential models of rate change, while other models did not perform as well. When the sequences evolved under a model of uncorrelated rate change, only Bayesian inference using an exponential rate model performed well. Collectively, the results provide a strong recommendation for using the exponential model of rate change if a conservative approach to divergence time estimation is required. A case study is presented in which we use a simulation-based approach to examine the hypothesis of elevated rates in the Cambrian period, and it is found that these high rate estimates might be an artifact of the rate estimation method. If this bias is present, then the ages of metazoan divergences would be systematically underestimated. The results of this study have implications for studies of molecular rates and divergence dates.

3.
The study of biogeography has benefited from the exponential increase of DNA sequence data from recent molecular systematic studies, the development of analytical methods in the last decade concerning divergence time estimation and geographic area analyses, and the availability of large-scale distribution data of species in many groups of organisms. The underlying principle of divergence time estimation from DNA and protein data is that sequence divergence depends on the product of evolutionary rate and time. With their molecular clock hypothesis, Zuckerkandl and Pauling (1965) separated rates of molecular evolution from time by incorporating fossil evidence. Originally,
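The rate-times-time principle in this abstract can be illustrated with a short sketch. It assumes a strict clock and a pairwise distance that accumulates along both lineages since their split (d = 2rt); the function names and all numbers are hypothetical and are not taken from the paper.

```python
def pairwise_rate(distance, calibration_age):
    """Rate per site per unit time for one lineage, from d = 2 * r * t.

    The factor 2 is there because both lineages accumulate
    substitutions after the split (strict-clock assumption).
    """
    return distance / (2.0 * calibration_age)


def divergence_time(distance, rate):
    """Invert d = 2 * r * t to obtain the time of the split."""
    return distance / (2.0 * rate)


# Hypothetical calibration: a pair that differs by 0.10 substitutions
# per site and whose split is dated by a fossil to 50 Myr ago.
rate = pairwise_rate(0.10, 50.0)

# Hypothetical undated pair that differs by 0.04 substitutions per site.
estimated_age = divergence_time(0.04, rate)
print(f"rate = {rate:.4f} per site per Myr, age = {estimated_age:.1f} Myr")
```

Only the product of rate and time is observed, so the fossil calibration is what allows the two to be separated; relaxed-clock methods replace the single `rate` with a model of how it varies across branches.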

4.
Divergence time and substitution rate are seriously confounded in phylogenetic analysis, making it difficult to estimate divergence times when the molecular clock (rate constancy among lineages) is violated. This problem can be alleviated to some extent by analyzing multiple gene loci simultaneously and by using multiple calibration points. While different genes may have different patterns of evolutionary rate change, they share the same divergence times. Indeed, the fact that each gene may violate the molecular clock differently leads to the advantage of simultaneous analysis of multiple loci. Multiple calibration points provide the means for characterizing the local evolutionary rates on the phylogeny. In this paper, we extend previous likelihood models of local molecular clock for estimating species divergence times to accommodate multiple calibration points and multiple genes. Heterogeneity among different genes in evolutionary rate and in substitution process is accounted for by the models. We apply the likelihood models to analyze two mitochondrial protein-coding genes, cytochrome oxidase II and cytochrome b, to estimate divergence times of Malagasy mouse lemurs and related outgroups. The likelihood method is compared with the Bayes method of Thorne et al. (1998, Mol. Biol. Evol. 15:1647-1657), which uses a probabilistic model to describe the change in evolutionary rate over time and uses the Markov chain Monte Carlo procedure to derive the posterior distribution of rates and times. Our likelihood implementation has the drawbacks of failing to accommodate uncertainties in fossil calibrations and of requiring the researcher to classify branches on the tree into different rate groups. Both problems are avoided in the Bayes method. Despite the differences in the two methods, however, data partitions and model assumptions had the greatest impact on date estimation. 
The three codon positions have very different substitution rates and evolutionary dynamics, and assumptions in the substitution model affect date estimation in both likelihood and Bayes analyses. The results demonstrate that the separate analysis is unreliable, with dates variable among codon positions and between methods, and that the combined analysis is much more reliable. When the three codon positions were analyzed simultaneously under the most realistic models using all available calibration information, the two methods produced similar results. The divergence of the mouse lemurs is dated to be around 7-10 million years ago, indicating a surprisingly early species radiation for such a morphologically uniform group of primates.

5.
We introduce a new model for relaxing the assumption of a strict molecular clock for use as a prior in Bayesian methods for divergence time estimation. Lineage-specific rates of substitution are modeled using a Dirichlet process prior (DPP), a type of stochastic process that assumes lineages of a phylogenetic tree are distributed into distinct rate classes. Under the Dirichlet process, the number of rate classes, assignment of branches to rate classes, and the rate value associated with each class are treated as random variables. The performance of this model was evaluated by conducting analyses on data sets simulated under a range of different models. We compared the Dirichlet process model with two alternative models for rate variation: the strict molecular clock and the independent rates model. Our results show that divergence time estimation under the DPP provides robust estimates of node ages and branch rates without significantly reducing power. Further analyses were conducted on a biological data set, and we provide examples of ways to summarize Markov chain Monte Carlo samples under this model.
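As a rough illustration of how a Dirichlet process prior treats the number of rate classes, the branch assignments, and the class rates as random, the sketch below draws one sample from the prior using the Chinese restaurant process construction. The gamma base distribution, its parameters, the concentration value, and all function names are assumptions chosen for the example; the abstract does not specify them, and this is not the authors' implementation.

```python
import random


def sample_rate_classes(n_branches, alpha, rng):
    """Assign branches to rate classes with a Chinese restaurant process.

    Branch i joins an existing class k with probability proportional to
    the class size, or opens a new class with probability proportional
    to the concentration parameter alpha.
    """
    assignments = []
    class_sizes = []
    for _ in range(n_branches):
        weights = class_sizes + [alpha]
        k = rng.choices(range(len(weights)), weights=weights)[0]
        if k == len(class_sizes):
            class_sizes.append(1)      # open a new rate class
        else:
            class_sizes[k] += 1        # join an existing rate class
        assignments.append(k)
    return assignments, class_sizes


def sample_branch_rates(n_branches, alpha, rng, shape=2.0, scale=0.5):
    """Draw one rate per class from a gamma base distribution (an
    assumption of this sketch) and map it onto the branches."""
    assignments, class_sizes = sample_rate_classes(n_branches, alpha, rng)
    class_rates = [rng.gammavariate(shape, scale) for _ in class_sizes]
    return [class_rates[k] for k in assignments], assignments


rng = random.Random(1)
rates, assignments = sample_branch_rates(10, alpha=1.0, rng=rng)
print(assignments)
print([round(r, 3) for r in rates])
```

Small values of `alpha` favour few classes (close to a strict clock), while large values favour many classes (close to an independent-rates model), which is one way to see why the DPP sits between the two alternatives named in the abstract.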

6.
Precise dating of viral subtype divergence enables researchers to correlate divergence with geographic and demographic occurrences. When historical data are absent (that is, the overwhelming majority), viral sequence sampling on a time scale commensurate with the rate of substitution permits the inference of the times of subtype divergence. Currently, researchers use two strategies to approach this task, both requiring strong conditions on the molecular clock assumption of substitution rate. As the underlying structure of the substitution rate process at the time of subtype divergence is not understood and likely highly variable, we present a simple method that estimates rates of substitution, and from there, times of divergence, without use of an assumed molecular clock. We accomplish this by blending estimates of the substitution rate for triplets of dated sequences where each sequence draws from a distinct viral subtype, providing a zeroth-order approximation for the rate between subtypes. As an example, we calculate the time of divergence for three genes among influenza subtypes A-H3N2 and B using subtype C as an outgroup. We show a time of divergence approximately 100 years ago, substantially more recent than previous estimates which range from 250 to 3800 years ago.

7.
Various nucleotide substitution models have been developed to accommodate among-lineage rate heterogeneity, thereby relaxing the assumptions of the strict molecular clock. Recently developed "uncorrelated relaxed clock" and "random local clock" (RLC) models allow decoupling of nucleotide substitution rates between descendant lineages and are thus predicted to perform better in the presence of lineage-specific rate heterogeneity. However, it is uncertain how these models perform in the presence of punctuated shifts in substitution rate, especially between closely related clades. Using cetaceans (whales and dolphins) as a case study, we test the performance of these two substitution models in estimating both molecular rates and divergence times in the presence of substantial lineage-specific rate heterogeneity. Our RLC analyses of whole mitochondrial genome alignments find evidence for up to ten clade-specific nucleotide substitution rate shifts in cetaceans. We provide evidence that in the uncorrelated relaxed clock framework, a punctuated shift in the rate of molecular evolution within a subclade results in posterior rate estimates that are either misled or intermediate between the disparate rate classes present in baleen and toothed whales. Using simulations, we demonstrate that abrupt changes in rate isolated to one or a few lineages in the phylogeny can mislead rate and age estimation, even when the node of interest is calibrated. We further demonstrate how increasing prior age uncertainty can bias rate and age estimates, even while the 95% highest posterior density around age estimates decreases; in other words, increased precision for an inaccurate estimate. We interpret the use of external calibrations in divergence time studies in light of these results, suggesting that rate shifts at deep time scales may mislead inferences of absolute molecular rates and ages.

8.
Rannala B, Yang Z. Genetics, 2003, 164(4): 1645-1656
The effective population sizes of ancestral as well as modern species are important parameters in models of population genetics and human evolution. The commonly used method for estimating ancestral population sizes, based on counting mismatches between the species tree and the inferred gene trees, is highly biased as it ignores uncertainties in gene tree reconstruction. In this article, we develop a Bayes method for simultaneous estimation of the species divergence times and current and ancestral population sizes. The method uses DNA sequence data from multiple loci and extracts information about conflicts among gene tree topologies and coalescent times to estimate ancestral population sizes. The topology of the species tree is assumed known. A Markov chain Monte Carlo algorithm is implemented to integrate over uncertain gene trees and branch lengths (or coalescence times) at each locus as well as species divergence times. The method can handle any species tree and allows different numbers of sequences at different loci. We apply the method to published noncoding DNA sequences from the human and the great apes. There are strong correlations between posterior estimates of speciation times and ancestral population sizes. With the use of an informative prior for the human-chimpanzee divergence date, the population size of the common ancestor of the two species is estimated to be approximately 20,000, with a 95% credibility interval (8000, 40,000). Our estimates, however, are affected by model assumptions as well as data quality. We suggest that reliable estimates have yet to await more data and more realistic models.

9.
Dating evolutionary origins of taxa is essential for understanding rates and timing of evolutionary events, often inciting intense debate when molecular estimates differ from first fossil appearances. For numerous reasons, ostracods present a challenging case study of rates of evolution and congruence of fossil and molecular divergence time estimates. On the one hand, ostracods have one of the densest fossil records of any metazoan group. However, taxonomy of fossil ostracods is controversial, owing at least in part to homoplasy of carapaces, the most commonly fossilized part. In addition, rates of evolution are variable in ostracods. Here, we report evidence of extreme variation in the rate of molecular evolution in different ostracod groups. This rate is significantly elevated in Halocyprid ostracods, a widespread planktonic group, consistent with previous observations that planktonic groups show elevated rates of molecular evolution. At the same time, the rate of molecular evolution is slow in the lineage leading to Manawa staceyi, a relict species that we estimate diverged approximately 500 million years ago from its closest known living relative. We also report multiple cases of significant incongruence between fossil and molecular estimates of divergence times in Ostracoda. Although relaxed clock methods improve the congruence of fossil and molecular divergence estimates over strict clock models, incongruence is present regardless of method. We hypothesize that this observed incongruence is driven largely by problems with taxonomy of fossil Ostracoda. Our results illustrate the difficulty in consistently estimating lineage divergence times, even in the presence of a voluminous fossil record.

10.

11.
We develop a reversible jump Markov chain Monte Carlo approach to estimating the posterior distribution of phylogenies based on aligned DNA/RNA sequences under several hierarchical evolutionary models. Using a proper, yet nontruncated and uninformative prior, we demonstrate the advantages of the Bayesian approach to hypothesis testing and estimation in phylogenetics by comparing different models for the infinitesimal rates of change among nucleotides, for the number of rate classes, and for the relationships among branch lengths. We compare the relative probabilities of these models and the appropriateness of a molecular clock using Bayes factors. Our most general model, first proposed by Tamura and Nei, parameterizes the infinitesimal change probabilities among nucleotides (A, G, C, T/U) into six parameters, consisting of three parameters for the nucleotide stationary distribution, two rate parameters for nucleotide transitions, and another parameter for nucleotide transversions. Nested models include the Hasegawa, Kishino, and Yano model with equal transition rates and the Kimura model with a uniform stationary distribution and equal transition rates. To illustrate our methods, we examine simulated data, 16S rRNA sequences from 15 contemporary eubacteria, halobacteria, eocytes, and eukaryotes, 9 primates, and the entire HIV genome of 11 isolates. We find that the Kimura model is too restrictive, that the Hasegawa, Kishino, and Yano model can be rejected for some data sets, that there is evidence for more than one rate class and a molecular clock among similar taxa, and that a molecular clock can be rejected for more distantly related taxa.
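The nesting of the three substitution models described in this abstract can be sketched by building the Tamura-Nei rate matrix and obtaining the other two as special cases. The base ordering, the parameter names, and the example values are assumptions for illustration, and the matrix is left unnormalised; this is not the authors' code.

```python
BASES = "TCAG"
PYRIMIDINES = {"T", "C"}


def tn93_rate_matrix(pi, alpha_y, alpha_r, beta):
    """Tamura-Nei instantaneous rate matrix (rows and columns ordered TCAG).

    pi      : stationary frequencies in TCAG order (must sum to 1)
    alpha_y : pyrimidine transition rate (T <-> C)
    alpha_r : purine transition rate (A <-> G)
    beta    : transversion rate
    """
    q = [[0.0] * 4 for _ in range(4)]
    for i, x in enumerate(BASES):
        for j, y in enumerate(BASES):
            if i == j:
                continue
            x_py, y_py = x in PYRIMIDINES, y in PYRIMIDINES
            if x_py and y_py:
                rate = alpha_y
            elif not x_py and not y_py:
                rate = alpha_r
            else:
                rate = beta
            q[i][j] = rate * pi[j]
        q[i][i] = -sum(q[i])  # diagonal makes each row sum to zero
    return q


def hky_rate_matrix(pi, alpha, beta):
    """Hasegawa-Kishino-Yano: Tamura-Nei with equal transition rates."""
    return tn93_rate_matrix(pi, alpha, alpha, beta)


def kimura_rate_matrix(alpha, beta):
    """Kimura: HKY with a uniform stationary distribution."""
    return hky_rate_matrix([0.25] * 4, alpha, beta)


pi = [0.3, 0.2, 0.3, 0.2]                    # hypothetical frequencies
q_tn = tn93_rate_matrix(pi, 4.0, 2.0, 1.0)   # hypothetical rates
q_k = kimura_rate_matrix(2.0, 1.0)
for row in q_tn:
    print([round(v, 3) for v in row])
```

Because each simpler model is a constrained version of the more general one, comparing them (for example with Bayes factors, as in the abstract) asks whether the data support the extra free parameters.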

12.
Analyses of a comprehensive morphological character matrix of mammals using ‘relaxed’ clock models (which simultaneously estimate topology, divergence dates and evolutionary rates), either alone or in combination with an 8.5 kb nuclear sequence dataset, retrieve implausibly ancient, Late Jurassic–Early Cretaceous estimates for the initial diversification of Placentalia (crown-group Eutheria). These dates are much older than all recent molecular and palaeontological estimates. They are recovered using two very different clock models, and regardless of whether the tree topology is freely estimated or constrained using scaffolds to match the current consensus placental phylogeny. This raises the possibility that divergence dates have been overestimated in previous analyses that have applied such clock models to morphological and total evidence datasets. Enforcing additional age constraints on selected internal divergences results in only a slight reduction of the age of Placentalia. Constraining Placentalia to less than 93.8 Ma, congruent with recent molecular estimates, does not require major changes in morphological or molecular evolutionary rates. Even constraining Placentalia to less than 66 Ma to match the ‘explosive’ palaeontological model results in only a 10- to 20-fold increase in maximum evolutionary rate for morphology, and fivefold for molecules. The large discrepancies between clock- and fossil-based estimates for divergence dates might therefore be attributable to relatively small changes in evolutionary rates through time, although other explanations (such as overly simplistic models of morphological evolution) need to be investigated. Conversely, dates inferred using relaxed clock models (especially with discrete morphological data and MrBayes) should be treated cautiously, as relatively minor deviations in rate patterns can generate large effects on estimated divergence dates.

13.
Accurate and precise estimation of divergence times during the Neo-Proterozoic is necessary to understand the speciation dynamics of early Eukaryotes. However, such deep divergences are difficult to date, as the molecular clock is seriously violated. Recent improvements in Bayesian molecular dating techniques allow the relaxation of the molecular clock hypothesis as well as incorporation of multiple and flexible fossil calibrations. Divergence times can then be estimated even when the evolutionary rate varies among lineages and even when the fossil calibrations involve substantial uncertainties. In this paper, we used a Bayesian method to estimate divergence times in Foraminifera, a group of unicellular eukaryotes, known for their excellent fossil record but also for the high evolutionary rates of their genomes. Based on multigene data we reconstructed the phylogeny of Foraminifera and dated their origin and the major radiation events. Our estimates suggest that Foraminifera emerged during the Cryogenian (650-920 Ma, Neo-Proterozoic), with a mean time around 770 Ma, about 220 Myr before the first appearance of reliable foraminiferal fossils in sediments (545 Ma). Most dates are in agreement with the fossil record, but in general our results suggest earlier origins of foraminiferal orders. We found that the posterior time estimates were robust to specifications of the prior. Our results highlight inter-species variations of evolutionary rates in Foraminifera. Their effect was partially overcome by using the partitioned Bayesian analysis to accommodate rate heterogeneity among data partitions and using the relaxed molecular clock to account for changing evolutionary rates. However, more coding genes appear necessary to obtain more precise estimates of divergence times and to resolve the conflicts between fossil and molecular date estimates.

14.
Recent work on Bayesian inference of disease mapping models discusses the advantages of the fully Bayesian (FB) approach over its empirical Bayes (EB) counterpart, suggesting that FB posterior standard deviations of small-area relative risks are more reflective of the uncertainty associated with the relative risk estimation than counterparts based on EB inference, since the latter fail to account for the variability in the estimation of the hyperparameters. In this article, an EB bootstrap methodology for relative risk inference with accurate parametric EB confidence intervals is developed, illustrated, and contrasted with the hyperprior Bayes. We elucidate the close connection between the EB bootstrap methodology and hyperprior Bayes, present a comparison between FB inference via hybrid Markov chain Monte Carlo and EB inference via penalized quasi-likelihood, and illustrate the ability of parametric bootstrap procedures to adjust for the undercoverage in the "naive" EB interval estimates. We discuss the important roles that FB and EB methods play in risk inference, map interpretation, and real-life applications. The work is motivated by a recent analysis of small-area infant mortality rates in the province of British Columbia in Canada.

15.
Simultaneous molecular dating of population and species divergences is essential in many biological investigations, including phylogeography, phylodynamics and species delimitation studies. In these investigations, multiple sequence alignments consist of both intra‐ and interspecies samples (mixed samples). As a result, the phylogenetic trees contain interspecies, interpopulation and within‐population divergences. Bayesian relaxed clock methods are often employed in these analyses, but they assume the same tree prior for both inter‐ and intraspecies branching processes and require specification of a clock model for branch rates (independent vs. autocorrelated rates models). We evaluated the impact of a single tree prior on Bayesian divergence time estimates by analysing computer‐simulated data sets. We also examined the effect of the assumption of independence of evolutionary rate variation among branches when the branch rates are autocorrelated. Bayesian approach with coalescent tree priors generally produced excellent molecular dates and highest posterior densities with high coverage probabilities. We also evaluated the performance of a non‐Bayesian method, RelTime, which does not require the specification of a tree prior or a clock model. RelTime's performance was similar to that of the Bayesian approach, suggesting that it is also suitable to analyse data sets containing both populations and species variation when its computational efficiency is needed.

16.
As larger, more complex data sets are being used to infer phylogenies, accuracy of these phylogenies increasingly requires models of evolution that accommodate heterogeneity in the processes of molecular evolution. We investigated the effect of improper data partitioning on phylogenetic accuracy, as well as the type I error rate and sensitivity of Bayes factors, a commonly used method for choosing among different partitioning strategies in Bayesian analyses. We also used Bayes factors to test empirical data for the need to divide data in a manner that has no expected biological meaning. Posterior probability estimates are misleading when an incorrect partitioning strategy is assumed. The error was greatest when the assumed model was underpartitioned. These results suggest that model partitioning is important for large data sets. Bayes factors performed well, giving a 5% type I error rate, which is remarkably consistent with standard frequentist hypothesis tests. The sensitivity of Bayes factors was found to be quite high when the across-class model heterogeneity reflected that of empirical data. These results suggest that Bayes factors represent a robust method of choosing among partitioning strategies. Lastly, results of tests for the inclusion of unexpected divisions in empirical data mirrored the simulation results, although the outcome of such tests is highly dependent on accounting for rate variation among classes. We conclude by discussing other approaches for partitioning data, as well as other applications of Bayes factors.

17.
Violation of the molecular clock has been amply documented, and is now routinely taken into account by molecular dating methods. Comparative analyses have revealed a systematic component in rate variation, relating it to the evolution of life-history traits, such as body size or generation time. Life-history evolution can be reconstructed using Brownian models. However, the resulting estimates are typically uncertain, and potentially sensitive to the underlying assumptions. As a way of obtaining more accurate ancestral trait and divergence time reconstructions, correlations between life-history traits and substitution rates could be used as an additional source of information. In this direction, a Bayesian framework for jointly reconstructing rates, traits, and dates was previously introduced. Here, we apply this model to a 17 protein-coding gene alignment for 73 placental taxa. Our analysis indicates that the coupling between molecules and life history can lead to a reevaluation of ancestral life-history profiles, in particular for groups displaying convergent evolution in body size. However, reconstructions are sensitive to fossil calibrations and to the Brownian assumption. Altogether, our analysis suggests that further integrating inference of rates and traits might be particularly useful for neontological macroevolutionary comparative studies.

18.
Estimation of primate speciation dates using local molecular clocks   Cited by: 16 (self-citations: 0; citations by others: 16)
Protein-coding genes of the mitochondrial genomes from 31 mammalian species were analyzed to estimate the speciation dates within primates and also between rats and mice. Three calibration points were used based on paleontological data: one at 20-25 MYA for the hominoid/cercopithecoid divergence, one at 53-57 MYA for the cetacean/artiodactyl divergence, and the third at 110-130 MYA for the metatherian/eutherian divergence. Both the nucleotide and the amino acid sequences were analyzed, producing conflicting results. The global molecular clock was clearly violated for both the nucleotide and the amino acid data. Models of local clocks were implemented using maximum likelihood, allowing different evolutionary rates for some lineages while assuming rate constancy in others. Surprisingly, the highly divergent third codon positions appeared to contain phylogenetic information and produced more sensible estimates of primate divergence dates than did the amino acid sequences. Estimated dates varied considerably depending on the data type, the calibration point, and the substitution model but differed little among the four tree topologies used. We conclude that the calibration derived from the primate fossil record is too recent to be reliable; we also point out a number of problems in date estimation when the molecular clock does not hold. Despite these obstacles, we derived estimates of primate divergence dates that were well supported by the data and were generally consistent with the paleontological record. Estimation of the mouse-rat divergence date, however, was problematic.

19.
Controversies over the molecular clock hypothesis were reviewed. Since it is evident that the molecular clock does not hold in an exact sense, accounting for evolution of the rate of molecular evolution is a prerequisite when estimating divergence times with molecular sequences. Recently proposed statistical methods that account for this rate variation are overviewed and one of these procedures is applied to the mitochondrial protein sequences and to the nuclear gene sequences from many mammalian species in order to estimate the time scale of eutherian evolution. This Bayesian method not only takes account of the variation of molecular evolutionary rate among lineages and among genes, but it also incorporates fossil evidence via constraints on node times. With denser taxonomic sampling and a more realistic model of molecular evolution, this Bayesian approach is expected to increase the accuracy of divergence time estimates.

20.
Phylogenetic dating is one of the most powerful and commonly used methods of drawing epidemiological interpretations from pathogen genomic data. Building such trees requires considering a molecular clock model which represents the rate at which substitutions accumulate on genomes. When the molecular clock rate is constant throughout the tree then the clock is said to be strict, but this is often not an acceptable assumption. Alternatively, relaxed clock models consider variations in the clock rate, often based on a distribution of rates for each branch. However, we show here that the distributions of rates across branches in commonly used relaxed clock models are incompatible with the biological expectation that the sum of the numbers of substitutions on two neighboring branches should be distributed as the substitution number on a single branch of equivalent length. We call this expectation the additivity property. We further show how assumptions of commonly used relaxed clock models can lead to estimates of evolutionary rates and dates with low precision and biased confidence intervals. We therefore propose a new additive relaxed clock model where the additivity property is satisfied. We illustrate the use of our new additive relaxed clock model on a range of simulated and real data sets, and we show that using this new model leads to more accurate estimates of mean evolutionary rates and ancestral dates.
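The additivity property described in this abstract can be made concrete by comparing variances. The sketch below assumes a simplified setting that is not taken from the paper: in the "uncorrelated" case each branch draws one rate independently and the expected substitution amount is rate times branch length, while in the "additive" case the amount on a branch is gamma distributed with shape proportional to branch length. The second choice is only one distribution that happens to satisfy additivity and may differ from the model the authors propose; all names and numbers are hypothetical.

```python
def uncorrelated_variance(length, rate_variance):
    """Variance of (rate * length) when a single rate with the given
    variance is drawn for the whole branch."""
    return rate_variance * length ** 2


def additive_gamma_variance(length, shape_per_time, scale):
    """Variance of a Gamma(shape = shape_per_time * length, scale) amount.

    Independent gammas with a common scale add their shapes, so the sum
    over two neighbouring branches has the same distribution as one
    branch of the combined length.
    """
    return shape_per_time * length * scale ** 2


# One branch of length 2 versus two neighbouring branches of length 1.
single_uncorrelated = uncorrelated_variance(2.0, 0.25)
split_uncorrelated = 2 * uncorrelated_variance(1.0, 0.25)

single_additive = additive_gamma_variance(2.0, 4.0, 0.5)
split_additive = 2 * additive_gamma_variance(1.0, 4.0, 0.5)

print("uncorrelated:", single_uncorrelated, "vs", split_uncorrelated)
print("additive:    ", single_additive, "vs", split_additive)
```

In the uncorrelated case the variance depends on where a branch happens to be split by a node, which is the kind of incompatibility the abstract points to; in the additive case the split makes no difference.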

