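Ziheng Yang. Acta Zoologica Sinica (《动物学报》), 2004, 50(4): 645-656
It is well known that estimates of species divergence times are sensitive to the molecular clock assumption (constancy of the evolutionary rate). On the other hand, in comparisons of distantly related species (for example, mammals of different orders), the molecular clock almost never holds. It is therefore essential to account for rate differences among evolutionary lineages when estimating divergence times. The maximum likelihood method accommodates such rate differences naturally, and it can analyze data from multiple gene loci simultaneously while using multiple fossil calibrations. Previously proposed likelihood methods require the researcher to assign the branches of the tree to rate groups; this paper proposes an approximate method that automates this step. The method combines features of the earlier likelihood methods, the Bayes method, and approximate rate-smoothing methods. In addition, the algorithm is improved to handle combined analyses in which some genes lack data for some species. The new method is applied to estimate the divergence times of the Malagasy mouse lemurs, and the results are compared with those of earlier likelihood and Bayes analyses.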

Accurate and precise estimation of divergence times during the Neo-Proterozoic is necessary to understand the speciation dynamics of early Eukaryotes. However, such deep divergences are difficult to date, as the molecular clock is seriously violated. Recent improvements in Bayesian molecular dating techniques allow the relaxation of the molecular clock hypothesis as well as the incorporation of multiple and flexible fossil calibrations. Divergence times can then be estimated even when the evolutionary rate varies among lineages and even when the fossil calibrations involve substantial uncertainties. In this paper, we used a Bayesian method to estimate divergence times in Foraminifera, a group of unicellular eukaryotes known for their excellent fossil record but also for the high evolutionary rates of their genomes. Based on multigene data, we reconstructed the phylogeny of Foraminifera and dated their origin and the major radiation events. Our estimates suggest that Foraminifera emerged during the Cryogenian (650-920 Ma, Neo-Proterozoic), with a mean time around 770 Ma, about 220 Myr before the first appearance of reliable foraminiferal fossils in sediments (545 Ma). Most dates are in agreement with the fossil record, but in general our results suggest earlier origins of foraminiferal orders. We found that the posterior time estimates were robust to specifications of the prior. Our results highlight inter-species variations of evolutionary rates in Foraminifera. Their effect was partially overcome by using the partitioned Bayesian analysis to accommodate rate heterogeneity among data partitions and using the relaxed molecular clock to account for changing evolutionary rates. However, more coding genes appear necessary to obtain more precise estimates of divergence times and to resolve the conflicts between fossil and molecular date estimates.

Divergence time and substitution rate are seriously confounded in phylogenetic analysis, making it difficult to estimate divergence times when the molecular clock (rate constancy among lineages) is violated. This problem can be alleviated to some extent by analyzing multiple gene loci simultaneously and by using multiple calibration points. While different genes may have different patterns of evolutionary rate change, they share the same divergence times. Indeed, the fact that each gene may violate the molecular clock differently leads to the advantage of simultaneous analysis of multiple loci. Multiple calibration points provide the means for characterizing the local evolutionary rates on the phylogeny. In this paper, we extend previous likelihood models of local molecular clock for estimating species divergence times to accommodate multiple calibration points and multiple genes. Heterogeneity among different genes in evolutionary rate and in substitution process is accounted for by the models. We apply the likelihood models to analyze two mitochondrial protein-coding genes, cytochrome oxidase II and cytochrome b, to estimate divergence times of Malagasy mouse lemurs and related outgroups. The likelihood method is compared with the Bayes method of Thorne et al. (1998, Mol. Biol. Evol. 15:1647-1657), which uses a probabilistic model to describe the change in evolutionary rate over time and uses the Markov chain Monte Carlo procedure to derive the posterior distribution of rates and times. Our likelihood implementation has the drawbacks of failing to accommodate uncertainties in fossil calibrations and of requiring the researcher to classify branches on the tree into different rate groups. Both problems are avoided in the Bayes method. Despite the differences in the two methods, however, data partitions and model assumptions had the greatest impact on date estimation. 
The three codon positions have very different substitution rates and evolutionary dynamics, and assumptions in the substitution model affect date estimation in both likelihood and Bayes analyses. The results demonstrate that the separate analysis is unreliable, with dates variable among codon positions and between methods, and that the combined analysis is much more reliable. When the three codon positions were analyzed simultaneously under the most realistic models using all available calibration information, the two methods produced similar results. The divergence of the mouse lemurs is dated to be around 7-10 million years ago, indicating a surprisingly early species radiation for such a morphologically uniform group of primates.

The phylogenetic relationships of 46 echinoids, with representatives from 13 of the 14 ordinal-level clades and about 70% of extant families commonly recognized, have been established from 3 genes (3,226 alignable bases) and 119 morphological characters. Morphological and molecular estimates are similar enough to be considered suboptimal estimates of one another, and the combined data provide a tree that, when calibrated against the fossil record, provides paleontological estimates of divergence times and completeness of their fossil record. The order of branching on the cladogram largely agrees with the stratigraphic order of first occurrences and implies that their fossil record is more than 85% complete at family level and at a resolution of 5-Myr time intervals. Molecular estimates of divergence times derived from applying both molecular clock and relaxed molecular clock models are concordant with estimates based on the fossil record in up to 70% of cases, with most concordant results obtained using Sanderson's semiparametric penalized likelihood method and a logarithmic-penalty function. There are 3 regions of the tree where molecular and fossil estimates of divergence time consistently disagree. Comparison with results obtained when molecular divergence dates are estimated from the combined (morphology + gene) tree suggests that errors in phylogenetic reconstruction explain only one of these. In another region the error most likely lies with the paleontological estimates because taxa in this region are demonstrated to have a very poor fossil record. In the third case, morphological and paleontological evidence is much stronger, and the topology for this part of the molecular tree differs from that derived from the combined data. Here the cause of the mismatch is unclear but could be methodological, arising from marked inequality of molecular rates. 
Overall, the level of agreement reached between these different data and methodological approaches leads us to believe that careful application of likelihood and Bayesian methods to molecular data provides realistic divergence time estimates in the majority of cases (almost 80% in this specific example), thus providing a remarkably well-calibrated phylogeny of a character-rich clade of ubiquitous marine benthic invertebrates.

Controversies over the molecular clock hypothesis are reviewed. Since it is evident that the molecular clock does not hold in an exact sense, accounting for evolution of the rate of molecular evolution is a prerequisite when estimating divergence times with molecular sequences. Recently proposed statistical methods that account for this rate variation are surveyed, and one of these procedures is applied to mitochondrial protein sequences and nuclear gene sequences from many mammalian species in order to estimate the time scale of eutherian evolution. This Bayesian method not only takes account of the variation of molecular evolutionary rate among lineages and among genes, but also incorporates fossil evidence via constraints on node times. With denser taxonomic sampling and a more realistic model of molecular evolution, this Bayesian approach is expected to increase the accuracy of divergence time estimates.

The molecular clock theory has greatly enlightened our understanding of macroevolutionary events. Maximum likelihood (ML) estimation of divergence times involves the adoption of fixed calibration points, and the confidence intervals associated with the estimates are generally very narrow. The credibility intervals are inferred assuming that the estimates are normally distributed, which may not be the case. Moreover, calculation of standard errors is usually carried out by the curvature method and is complicated by the difficulty in approximating second derivatives of the likelihood function. In this study, a standard primate phylogeny was used to examine the standard errors of ML estimates via the bootstrap method. Confidence intervals were also assessed from the posterior distribution of divergence times inferred via Bayesian Markov Chain Monte Carlo. For the primate topology under evaluation, no significant differences were found between the bootstrap and the curvature methods. Also, Bayesian confidence intervals were always wider than those obtained by ML.

Molecular clock methods allow biologists to estimate divergence times, which in turn play an important role in comparative studies of many evolutionary processes. It is well known that molecular age estimates can be biased by heterogeneity in rates of molecular evolution, but less attention has been paid to the issue of potentially erroneous fossil calibrations. In this study we estimate the timing of diversification in Centrarchidae, an endemic major lineage of the diverse North American freshwater fish fauna, through a new approach to fossil calibration and molecular evolutionary model selection. Given a completely resolved multi-gene molecular phylogeny and a set of multiple fossil-inferred age estimates, we tested for potentially erroneous fossil calibrations using a recently developed fossil cross-validation. We also used fossil information to guide the selection of the optimal molecular evolutionary model with a new fossil jackknife method in a fossil-based model cross-validation. The centrarchid phylogeny resulted from a mixed-model Bayesian strategy that included 14 separate data partitions sampled from three mtDNA and four nuclear genes. Ten of the 31 interspecific nodes in the centrarchid phylogeny were assigned a minimal age estimate from the centrarchid fossil record. Our analyses identified four fossil dates that were inconsistent with the other fossils, and we removed them from the molecular dating analysis. Using fossil-based model cross-validation to determine the optimal smoothing value in penalized likelihood analysis, and six mutually consistent fossil calibrations, the age of the most recent common ancestor of Centrarchidae was 33.59 million years ago (mya). Penalized likelihood analyses of individual data partitions all converged on a very similar age estimate for this node, indicating that rate heterogeneity among data partitions is not confounding our analyses. 
These results place the origin of the centrarchid radiation at a time of major faunal turnover as the fossil record indicates that the most diverse lineages of the North American freshwater fish fauna originated at the Eocene-Oligocene boundary, approximately 34 mya. This time coincided with major global climate change from warm to cool temperatures and a signature of elevated lineage extinction and origination in the fossil record across the tree of life. Our analyses demonstrate the utility of fossil cross-validation to critically assess individual fossil calibration points, providing the ability to discriminate between consistent and inconsistent fossil age estimates that are used for calibrating molecular phylogenies.

Distance-based phylogenetic methods are widely used in biomedical research. However, there has been little development of rigorous statistical methods and software for dating speciation and gene duplication events by using evolutionary distances. Here we present a simple, fast and accurate dating method based on the least-squares (LS) method that has already been widely used in molecular phylogenetic reconstruction. Dating methods with a global clock or two different local clocks are presented. Single or multiple fossil calibration points can be used, and multiple data sets can be integrated in a combined analysis. The variance of the estimated divergence time is assessed by resampling methods such as bootstrapping or jackknifing. Application of the method to dating the divergence times among seven ape species or among 35 mammalian species including major mammalian orders shows that the divergence times estimated with the LS criterion are nearly identical to those obtained by the likelihood method or Bayesian inference.
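
As a rough illustration of distance-based dating under a single global clock, the sketch below is an assumption-laden toy and not the software described in the abstract above. It uses ordinary (unweighted) least squares, one calibration point, and does not enforce that ancestors are older than descendants. The function name `ls_clock_dates`, the node labels, and the distances are all hypothetical.

```python
from statistics import mean


def ls_clock_dates(distances, mrca, calib_node, calib_age):
    """Toy ordinary-least-squares node ages under one global clock.

    distances:  {(taxon_a, taxon_b): pairwise evolutionary distance}
    mrca:       {(taxon_a, taxon_b): label of the pair's most recent common ancestor}
    calib_node: label of the node whose age is fixed by a fossil calibration
    calib_age:  age assigned to calib_node (any time unit)
    """
    # Under a global clock, d_ij = 2 * rate * age(mrca(i, j)).
    # Writing h_node = rate * age(node), minimising sum (d_ij - 2 * h_node)^2
    # without order constraints gives h_node = mean(d_ij) / 2 for each node.
    by_node = {}
    for pair, d in distances.items():
        by_node.setdefault(mrca[pair], []).append(d)
    heights = {node: mean(ds) / 2.0 for node, ds in by_node.items()}

    # The single calibration converts node heights (in distance units) to time.
    rate = heights[calib_node] / calib_age
    ages = {node: h / rate for node, h in heights.items()}
    return rate, ages


# Hypothetical three-taxon example with tree ((A,B)ab,C)root.
toy_distances = {("A", "B"): 0.10, ("A", "C"): 0.42, ("B", "C"): 0.38}
toy_mrca = {("A", "B"): "ab", ("A", "C"): "root", ("B", "C"): "root"}
rate, ages = ls_clock_dates(toy_distances, toy_mrca, "root", 20.0)
print(rate, ages)
```

With several calibrations, or with local clocks, the height-to-time conversion would no longer be a single ratio, so this sketch covers only the simplest case mentioned in the abstract.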

Molecular divergence time analyses often rely on the age of fossil lineages to calibrate node age estimates. Most divergence time analyses are now performed in a Bayesian framework, where fossil calibrations are incorporated as parametric prior probabilities on node ages. It is widely accepted that an ideal parameterization of such node age prior probabilities should be based on a comprehensive analysis of the fossil record of the clade of interest, but there is currently no generally applicable approach for calculating such informative priors. We provide here a simple and easily implemented method that employs fossil data to estimate the likely amount of missing history prior to the oldest fossil occurrence of a clade, which can be used to fit an informative parametric prior probability distribution on a node age. Specifically, our method uses the extant diversity and the stratigraphic distribution of fossil lineages confidently assigned to a clade to fit a branching model of lineage diversification. Conditioning this on a simple model of fossil preservation, we estimate the likely amount of missing history prior to the oldest fossil occurrence of a clade. The likelihood surface of missing history can then be translated into a parametric prior probability distribution on the age of the clade of interest. We show that the method performs well with simulated fossil distribution data, but that the likelihood surface of missing history can at times be too complex for the distribution-fitting algorithm employed by our software tool. An empirical example of the application of our method is performed to estimate echinoid node ages. A simulation-based sensitivity analysis using the echinoid data set shows that node age prior distributions estimated under poor preservation rates are significantly less informative than those estimated under high preservation rates.

Restriction site-associated DNA sequencing (RAD-seq) and related methods have become relatively common approaches to resolve species-level phylogeny. It is not clear, however, whether RAD-seq data matrices are well suited to relaxed clock inference of divergence times, given the size of the matrices and the abundance of missing data. We investigated the sensitivity of Bayesian relaxed clock estimates of divergence times to alternative analytical decisions on an empirical RAD-seq phylogenetic matrix. We explored the relative contribution of secondary calibration strategies, amount of missing data, and the data partition analyzed to overall variance in divergence times inferred using BEAST MCMC analyses of Carex section Schoenoxiphium (Cyperaceae), a recent radiation for which we have nearly complete species sampling of RAD-seq data. The crown node for Schoenoxiphium was estimated to be 15.22 (9.56–21.18) Ma using a single calibration point and low missing data, 11.93 (8.07–16.03) Ma using multiple calibration points and low missing data, and 8.34 (5.41–11.22) Ma using multiple calibrations but high missing data. We found that matrices in which more than half of the individuals had missing data yielded younger mean ages for all nodes. Moreover, we found that our molecular clock estimates are sensitive to the positions of the calibration(s) in our phylogenetic tree (using matrices with low missing data), especially when only a single calibration was applied to estimate divergence times. These results argue for sensitivity analyses and caution in interpreting divergence time estimates from RAD-seq data.

Inferring speciation times under an episodic molecular clock (total citations: 5; self-citations: 0; citations by others: 5)
We extend our recently developed Markov chain Monte Carlo algorithm for Bayesian estimation of species divergence times to allow variable evolutionary rates among lineages. The method can use heterogeneous data from multiple gene loci and accommodate multiple fossil calibrations. Uncertainties in fossil calibrations are described using flexible statistical distributions. The prior for divergence times for nodes lacking fossil calibrations is specified by use of a birth-death process with species sampling. The prior for lineage-specific substitution rates is specified using either a model with autocorrelated rates among adjacent lineages (based on a geometric Brownian motion model of rate drift) or a model with independent rates among lineages specified by a log-normal probability distribution. We develop an infinite-sites theory, which predicts that when the amount of sequence data approaches infinity, the width of the posterior credibility interval and the posterior mean of divergence times form a perfect linear relationship, with the slope indicating uncertainties in time estimates that cannot be reduced by sequence data alone. Simulations are used to study the influence of among-lineage rate variation and the number of loci sampled on the uncertainty of divergence time estimates. The analysis suggests that posterior time estimates typically involve considerable uncertainties even with an infinite amount of sequence data, and that the reliability and precision of fossil calibrations are critically important to divergence time estimation. We apply our new algorithms to two empirical data sets and compare the results with those obtained in previous Bayesian and likelihood analyses. The results demonstrate the utility of our new algorithms.
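
The infinite-sites prediction in the abstract above suggests a simple diagnostic: regress the widths of the posterior credibility intervals on the posterior means of the node ages and inspect the slope and the fit. The sketch below is only one way this check might be coded; `fit_line` is a hypothetical helper, and the numbers are invented toy values chosen to lie exactly on a line, not results from any analysis.

```python
def fit_line(x, y):
    """Ordinary least-squares line y = intercept + slope * x, plus R^2."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r_squared = (sxy * sxy) / (sxx * syy)
    return slope, intercept, r_squared


# Invented toy values (illustration only): posterior mean node ages and the
# widths of their 95% credibility intervals, in the same time unit.
posterior_means = [10.0, 20.0, 40.0, 80.0]
interval_widths = [4.0, 8.0, 16.0, 32.0]

slope, intercept, r_squared = fit_line(posterior_means, interval_widths)
print(slope, intercept, r_squared)
```

On real output, an R^2 well below 1 would suggest that adding sequence data could still narrow the intervals, whereas a near-perfect line would suggest that the remaining uncertainty comes mainly from the fossil calibrations.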

Molecular sequences not only allow the reconstruction of phylogenetic relationships among species, but also provide information on the approximate divergence times. Whereas the fossil record dates the origin of most multicellular animal phyla during the Cambrian explosion less than 540 million years ago (mya), molecular clock calculations usually suggest much older dates. Here we used a large multiple sequence alignment derived from Expressed Sequence Tags and genomes comprising 129 genes (37,476 amino acid positions) and 117 taxa, including 101 arthropods. We obtained consistent divergence time estimates applying relaxed Bayesian clock models with different priors and multiple calibration points. While the influence of substitution rates, missing data, and model priors was negligible, the clock model had a significant effect. A log-normal autocorrelated model was selected on the basis of cross-validation. We calculated that arthropods emerged ~600 mya. Onychophorans (velvet worms) and euarthropods split ~590 mya, Pancrustacea and Myriochelata ~560 mya, Myriapoda and Chelicerata ~555 mya, and 'Crustacea' and Hexapoda ~510 mya. Endopterygote insects appeared ~390 mya. These dates are considerably younger than most previous molecular clock estimates and in better agreement with the fossil record. Nevertheless, a Precambrian origin of arthropods and other metazoan phyla is still supported. Our results also demonstrate the applicability of large datasets of random nuclear sequences for approximating the timing of multicellular animal evolution.

This study investigated the biogeography and genetic variation in the antitropically distributed genus Micromesistius. A 579 bp fragment of the mitochondrial coI gene was analysed in 279 individuals of Micromesistius poutassou and 163 of Micromesistius australis. The time since divergence was estimated by Bayesian methods, with an externally derived clock rate, to be c. 2 million years before present (Mb.p.). Congruent estimates were obtained with an additional data set of cytochrome b sequences derived from GenBank utilizing a different clock rate. The divergence time of 2 Mb.p. was in disagreement with fossil findings in New Zealand and previous hypotheses which suggested the divergence to be much older. It therefore appears likely that Micromesistius has penetrated into the southern hemisphere at least two times. Paleoceanographic records indicate that conditions that would increase the likelihood of transequatorial dispersals were evident c. 2–1.6 Mb.p. Haplotype frequency differences, along with pairwise F(ST) values, indicated that Mediterranean M. poutassou is a genetically isolated population.

Dating evolutionary origins of taxa is essential for understanding rates and timing of evolutionary events, often inciting intense debate when molecular estimates differ from first fossil appearances. For numerous reasons, ostracods present a challenging case study of rates of evolution and congruence of fossil and molecular divergence time estimates. On the one hand, ostracods have one of the densest fossil records of any metazoan group. However, taxonomy of fossil ostracods is controversial, owing at least in part to homoplasy of carapaces, the most commonly fossilized part. In addition, rates of evolution are variable in ostracods. Here, we report evidence of extreme variation in the rate of molecular evolution in different ostracod groups. This rate is significantly elevated in Halocyprid ostracods, a widespread planktonic group, consistent with previous observations that planktonic groups show elevated rates of molecular evolution. At the same time, the rate of molecular evolution is slow in the lineage leading to Manawa staceyi, a relict species that we estimate diverged approximately 500 million years ago from its closest known living relative. We also report multiple cases of significant incongruence between fossil and molecular estimates of divergence times in Ostracoda. Although relaxed clock methods improve the congruence of fossil and molecular divergence estimates over strict clock models, incongruence is present regardless of method. We hypothesize that this observed incongruence is driven largely by problems with taxonomy of fossil Ostracoda. Our results illustrate the difficulty in consistently estimating lineage divergence times, even in the presence of a voluminous fossil record.

Estimation of primate speciation dates using local molecular clocks (total citations: 16; self-citations: 0; citations by others: 16)
Protein-coding genes of the mitochondrial genomes from 31 mammalian species were analyzed to estimate the speciation dates within primates and also between rats and mice. Three calibration points were used based on paleontological data: one at 20-25 MYA for the hominoid/cercopithecoid divergence, one at 53-57 MYA for the cetacean/artiodactyl divergence, and the third at 110-130 MYA for the metatherian/eutherian divergence. Both the nucleotide and the amino acid sequences were analyzed, producing conflicting results. The global molecular clock was clearly violated for both the nucleotide and the amino acid data. Models of local clocks were implemented using maximum likelihood, allowing different evolutionary rates for some lineages while assuming rate constancy in others. Surprisingly, the highly divergent third codon positions appeared to contain phylogenetic information and produced more sensible estimates of primate divergence dates than did the amino acid sequences. Estimated dates varied considerably depending on the data type, the calibration point, and the substitution model but differed little among the four tree topologies used. We conclude that the calibration derived from the primate fossil record is too recent to be reliable; we also point out a number of problems in date estimation when the molecular clock does not hold. Despite these obstacles, we derived estimates of primate divergence dates that were well supported by the data and were generally consistent with the paleontological record. Estimation of the mouse-rat divergence date, however, was problematic.

ABSTRACT: BACKGROUND: Duikers in the subfamily Cephalophinae are a group of tropical forest mammals believed to have first originated during the late Miocene. However, knowledge of phylogenetic relationships, pattern and timing of their subsequent radiation is poorly understood. Here we present the first multi-locus phylogeny of this threatened group of tropical artiodactyls and use a Bayesian uncorrelated molecular clock to estimate divergence times. RESULTS: A total of 4152 bp of sequence data was obtained from two mitochondrial genes and four nuclear introns. Phylogenies were estimated using maximum parsimony, maximum likelihood, and Bayesian analysis of concatenated mitochondrial, nuclear and combined datasets. A relaxed molecular clock with two fossil calibration points was used to estimate divergence times. The first was based on the age of the split between the two oldest subfamilies within the Bovidae whereas the second was based on the earliest known fossil appearance of the Cephalophinae and molecular divergence time estimates for the oldest lineages within this group. Findings indicate strong support for four major lineages within the subfamily, all of which date to the late Miocene/early Pliocene. The first of these to diverge was the dwarf duiker genus Philantomba, followed by the giant, eastern and western red duiker lineages, all within the genus Cephalophus. While these results uphold the recognition of Philantomba, they do not support the monotypic savanna-specialist genus Sylvicapra, which as sister to the giant duikers leaves Cephalophus paraphyletic. BEAST analyses indicate that most sister species pairs originated during the Pleistocene, suggesting that repeated glacial cycling may have played an important role in the recent diversification of this group. Furthermore, several red duiker sister species pairs appear to be either paraphyletic (C. callipygus/C. ogilbyi and C. harveyi/C. natalensis) or exhibit evidence of mitochondrial admixture (C. nigrifrons and C. rufilatus), consistent with their recent divergence and/or possible hybridization with each other. CONCLUSIONS: Molecular phylogenetic analyses suggest that Pleistocene-era climatic oscillations have played an important role in the speciation of this largely forest-dwelling group. Our results also reveal the most well supported species phylogeny for the subfamily to date, but also highlight several areas of inconsistency between our current understanding of duiker taxonomy and the evolutionary relationships depicted here. These findings may therefore prove particularly relevant to future conservation efforts, given that many species are presently regulated under the Convention for Trade in Endangered Species.

Calibration is a critical step in every molecular clock analysis but it has been the least considered. Bayesian approaches to divergence time estimation make it possible to incorporate the uncertainty in the degree to which fossil evidence approximates the true time of divergence. We explored the impact of different approaches in expressing this relationship, using arthropod phylogeny as an example for which we established novel calibrations. We demonstrate that the parameters distinguishing calibration densities have a major impact upon the prior and posterior of the divergence times, and it is critically important that users evaluate the joint prior distribution of divergence times used by their dating programmes. We illustrate a procedure for deriving calibration densities in Bayesian divergence dating through the use of soft maximum constraints.

In recent years, a number of phylogenetic methods have been developed for estimating molecular rates and divergence dates under models that relax the molecular clock constraint by allowing rate change throughout the tree. These methods are being used with increasing frequency, but there have been few studies into their accuracy. We tested the accuracy of several relaxed-clock methods (penalized likelihood and Bayesian inference using various models of rate change) using nucleotide sequences simulated on a nine-taxon tree. When the sequences evolved with a constant rate, the methods were able to infer rates accurately, but estimates were more precise when a molecular clock was assumed. When the sequences evolved under a model of auto-correlated rate change, rates were accurately estimated using penalized likelihood and by Bayesian inference using lognormal and exponential models of rate change, while other models did not perform as well. When the sequences evolved under a model of uncorrelated rate change, only Bayesian inference using an exponential rate model performed well. Collectively, the results provide a strong recommendation for using the exponential model of rate change if a conservative approach to divergence time estimation is required. A case study is presented in which we use a simulation-based approach to examine the hypothesis of elevated rates in the Cambrian period, and it is found that these high rate estimates might be an artifact of the rate estimation method. If this bias is present, then the ages of metazoan divergences would be systematically underestimated. The results of this study have implications for studies of molecular rates and divergence dates.

A class of discrete-time models of infectious disease spread, referred to as individual-level models (ILMs), is typically fitted in a Bayesian Markov chain Monte Carlo (MCMC) framework. These models quantify probabilistic outcomes regarding the risk of infection of susceptible individuals due to various susceptibility and transmissibility factors, including their spatial distance from infectious individuals. The infectious pressure from infected individuals exerted on susceptible individuals is intrinsic to these ILMs. Unfortunately, quantifying this infectious pressure for data sets containing many individuals can be computationally burdensome, leading to a time-consuming likelihood calculation and, thus, computationally prohibitive MCMC-based analysis. This problem worsens when using data augmentation to allow for uncertainty in infection times. In this paper, we develop sampling methods that can be used to calculate a fast, approximate likelihood when fitting such disease models. A simple random sampling approach is initially considered, followed by various spatially-stratified schemes. We test and compare the performance of our methods with both simulated data and data from the 2001 foot-and-mouth disease (FMD) epidemic in the U.K. Our results indicate that substantial computation savings can be obtained (albeit, of course, with some information loss), suggesting that such techniques may be of use in the analysis of very large epidemic data sets.
