Similar literature
20 similar documents found (search time: 421 ms)
1.
Three commonly used molecular dating methods for correction of variable rates (non-parametric rate smoothing, penalized likelihood, and Bayesian rate correction) as well as the assumption of a global molecular clock were tested for sensitivity to taxon sampling. The test dataset of 6854 basepairs for 300 terminals includes a nearly complete sample of the Restio-clade of the African Restionaceae (272 of the 288 species), as well as 26 outgroup species. From this dataset, nested subsets of 35, 51, 80, 120, and 150 species, together with the full 300 species, were used. Molecular dating experiments with these datasets showed that all methods are sensitive to undersampling, but that this effect is more severe in analyses that use more extreme rate smoothing. Additionally, the undersampling effect is positively related to distance from the calibration node. The combined effect of undersampling and distance from the calibration node resulted in up to threefold differences in the age estimation of nodes from the same dataset with the same calibration point. We suggest that the most suitable methods are penalized likelihood and Bayesian when a global clock assumption has been rejected, as these methods are more successful at finding optimal levels of smoothing to correct for rate heterogeneity, and are less sensitive to undersampling.

2.
Rate heterogeneity among lineages is a common feature of molecular evolution, and it has long impeded our ability to accurately estimate the age of evolutionary divergence events. The development of relaxed molecular clocks, which model variable substitution rates among lineages, was intended to rectify this problem. Major subtypes of pandemic HIV-1 group M are thought to exemplify closely related lineages with different substitution rates. Here, we report that inferring the time of the most recent common ancestor of all these subtypes in a single phylogeny under a single (relaxed) molecular clock produces significantly different dates for many of the subtypes than does analysis of each subtype on its own. We explore various methods to ameliorate this problem. We conclude that current molecular dating methods are inadequate for dealing with this type of substitution rate variation in HIV-1. Through simulation, we show that heterotachy causes root ages to be overestimated.

3.
Phillips MJ. Gene, 2009, 441(1-2): 132-140
Despite recent methodological advances in inferring the time-scale of biological evolution from molecular data, the fundamental question of whether our substitution models are sufficiently well specified to accurately estimate branch-lengths has received little attention. I examine this implicit assumption of all molecular dating methods, on a vertebrate mitochondrial protein-coding dataset. Comparison with analyses in which the data are RY-coded (AG --> R; CT --> Y) suggests that even rates-across-sites maximum likelihood greatly under-compensates for multiple substitutions among the standard (ACGT) NT-coded data, which has been subject to greater phylogenetic signal erosion. Accordingly, the fossil record indicates that branch-lengths inferred from the NT-coded data translate into divergence time overestimates when calibrated from deeper in the tree. Intriguingly, RY-coding led to the opposite result. The underlying NT and RY substitution model misspecifications likely relate respectively to "hidden" rate heterogeneity and changes in substitution processes across the tree, for which I provide simulated examples. Given the magnitude of the inferred molecular dating errors, branch-length estimation biases may partly explain current conflicts with some palaeontological dating estimates.
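The RY-coding mentioned in this abstract (A and G recoded as R, C and T recoded as Y) can be illustrated with a minimal Python sketch. The function name `ry_code` and the choice to pass gaps and ambiguity symbols through unchanged are assumptions made for illustration only; the abstract does not say how the author handled such characters.

```python
# Purines (A, G) are recoded as R; pyrimidines (C, T) are recoded as Y.
RY_MAP = {"A": "R", "G": "R", "C": "Y", "T": "Y"}


def ry_code(sequence: str) -> str:
    """Recode a nucleotide sequence into R/Y classes.

    The input is uppercased first. Any symbol that is not A, C, G or T
    (for example a gap or N) is passed through as-is; that handling is
    an assumption of this sketch, not something stated in the abstract.
    """
    return "".join(RY_MAP.get(base, base) for base in sequence.upper())


if __name__ == "__main__":
    print(ry_code("ACGTTGCA-N"))
```

After recoding, transitions (A to G, C to T) become invisible and only transversions register as changes, which is why the two codings can give different branch-length estimates.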

4.
Estimating divergence dates from molecular sequences (total citations: 25; self-citations: 13; citations by others: 12)
The ability to date the time of divergence between lineages using molecular data provides the opportunity to answer many important questions in evolutionary biology. However, molecular dating techniques have previously been criticized for failing to adequately account for variation in the rate of molecular evolution. We present a maximum-likelihood approach to estimating divergence times that deals explicitly with the problem of rate variation. This method has many advantages over previous approaches including the following: (1) a rate constancy test excludes data for which rate heterogeneity is detected; (2) date estimates are generated with confidence intervals that allow the explicit testing of hypotheses regarding divergence times; and (3) a range of sequences and fossil dates are used, removing the reliance on a single calculated calibration rate. We present tests of the accuracy of our method, which show it to be robust to the effects of some modes of rate variation. In addition, we test the effect of substitution model and length of sequence on the accuracy of the dating technique. We believe that the method presented here offers solutions to many of the problems facing molecular dating and provides a platform for future improvements to such analyses.

5.
Rate heterogeneity within groups of organisms is known to exist even when closely related taxa are examined. A wide variety of phylogenetic and dating methods have been developed that aim either to test for the existence of rate variation or to correct for its bias. However, none of the existing methods track the evolution of features that account for observed rate heterogeneity. Here, we present a likelihood model that assumes that rate variation is caused, in part, by species' intrinsic characteristics, such as a particular life-history trait, morphological feature, or habitat association. The model combines models of sequence and character state evolution such that rates of sequence change depend on the character state of a lineage at each point in time. We test, using simulations, the power and accuracy of the model to determine whether rates of molecular evolution depend on a particular character state and demonstrate its utility using an empirical example with halophilic and freshwater daphniids.

6.

Background

Molecular dating has gained ever-increasing interest since the molecular clock hypothesis was proposed in the 1960s. Molecular dating provides detailed temporal frameworks for divergence events in phylogenetic trees, allowing diverse evolutionary questions to be addressed. The key aspect of the molecular clock hypothesis, namely that differences in DNA or protein sequence between two species are proportional to the time elapsed since they diverged, was soon shown to be untenable. Other approaches were proposed to take into account rate heterogeneity among lineages, but the calibration process, by which relative times are transformed into absolute ages, has received little attention until recently. New methods have now been proposed to resolve potential sources of error associated with the calibration of phylogenetic trees, particularly those involving use of the fossil record.

Scope and Conclusions

The use of the fossil record as a source of independent information in the calibration process is the main focus of this paper; other sources of calibration information are also discussed. Particularly error-prone aspects of fossil calibration are identified, such as fossil dating, the phylogenetic placement of the fossil and the incompleteness of the fossil record. Methods proposed to tackle one or more of these potential error sources are discussed (e.g. fossil cross-validation, prior distribution of calibration points and confidence intervals on the fossil record). In conclusion, the fossil record remains the most reliable source of information for the calibration of phylogenetic trees, although associated assumptions and potential bias must be taken into account.

7.
60 million years of co-divergence in the fig-wasp symbiosis (total citations: 4; self-citations: 0; citations by others: 4)
Figs (Ficus; ca 750 species) and fig wasps (Agaoninae) are obligate mutualists: all figs are pollinated by agaonines that feed exclusively on figs. This extraordinary symbiosis is the most extreme example of specialization in a plant-pollinator interaction and has fuelled much speculation about co-divergence. The hypothesis that pollinator specialization led to the parallel diversification of fig and pollinator lineages (co-divergence) has so far not been tested due to the lack of robust and comprehensive phylogenetic hypotheses for both partners. We produced and combined the most comprehensive molecular phylogenetic trees to date with fossil data to generate independent age estimates for fig and pollinator lineages, using both non-parametric rate smoothing and penalized likelihood dating methods. Molecular dating of ten pairs of interacting lineages provides an unparalleled example of plant-insect co-divergence over a geological time frame spanning at least 60 million years.

8.
Although still controversial, estimation of divergence times using molecular data has emerged as a powerful tool to examine the tempo and mode of evolutionary change. Two primary obstacles in improving the accuracy of molecular dating are heterogeneity in DNA substitution rates and accuracy of the fossil record as calibration points. Recent methodological advances have provided powerful methods that estimate relative divergence times in the face of heterogeneity of nucleotide substitution rates among lineages. However, relatively little attention has focused on the accuracy of fossil calibration points that allow one to translate relative divergence times into absolute time. We present a new cross-validation method that identifies inconsistent fossils when multiple fossil calibrations are available for a clade and apply our method to a molecular phylogeny of living turtles with fossil calibration times for 17 of the 22 internal nodes in the tree. Our cross-validation procedure identified seven inconsistent fossils. Using the consistent fossils as calibration points, we found that despite their overall antiquity as a lineage, the most species-rich clades of turtles diversified well within the Cenozoic. Many of the truly ancient lineages of turtles are currently represented by a few, often endangered species that deserve high priority as conservation targets.

9.
One of the most useful features of molecular phylogenetic analyses is the potential for estimating dates of divergence of evolutionary lineages from the DNA of extant species. But lineage-specific variation in rate of molecular evolution complicates molecular dating, because a calibration rate estimated from one lineage may not be an accurate representation of the rate in other lineages. Many molecular dating studies use a "clock test" to identify and exclude sequences that vary in rate between lineages. However, these clock tests should not be relied upon without a critical examination of their effectiveness at removing rate variable sequences from any given data set, particularly with regard to the sequence length and number of variable sites. As an illustration of this problem we present a power test of a frequently employed triplet relative rates test. We conclude that (1) relative rates tests are unlikely to detect moderate levels of lineage-specific rate variation (where one lineage has a rate of molecular evolution 1.5 to 4.0 times the other) for most commonly used sequences in molecular dating analyses, and (2) this lack of power is likely to result in substantial error in the estimation of dates of divergence. As an example, we show that the well-studied rate difference between murid rodents and great apes will not be detected for many of the sequences used to date the divergence between these two lineages and that this failure to detect rate variation is likely to result in consistent overestimation of the date of the rodent–primate split. Received: 9 June 1999 / Accepted: 22 October 1999

10.
The rate of change in DNA is an important parameter for understanding molecular evolution and hence for inferences drawn from studies of phylogeography and phylogenetics. Most rate calibrations for mitochondrial coding regions in marine species have been made from divergence dating for fossils and vicariant events older than 1-2 My and are typically 0.5-2% per lineage per million years. Recently, calibrations made with ancient DNA (aDNA) from younger dates have yielded faster rates, suggesting that estimates of the molecular rate of change depend on the time of calibration, decaying from the instantaneous mutation rate to the phylogenetic substitution rate. aDNA methods for recent calibrations are not available for most marine taxa so instead we use radiometric dates for sea-level rise onto the Sunda Shelf following the Last Glacial Maximum (starting ~18,000 years ago), which led to massive population expansions for marine species. Instead of divergence dating, we use a two-epoch coalescent model of logistic population growth preceded by a constant population size to infer a time in mutational units for the beginning of these expansion events. This model compares favorably to simpler coalescent models of constant population size, and exponential or logistic growth, and is far more precise than estimates from the mismatch distribution. Mean rates estimated with this method for mitochondrial coding genes in three invertebrate species are elevated in comparison to older calibration points (2.3-6.6% per lineage per million years), lending additional support to the hypothesis of calibration time dependency for molecular rates.

11.
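In recent years, molecular dating methods have come into wide use, providing an indispensable and highly detailed evolutionary timeframe for macroevolutionary research, especially for studies of biodiversity and the history of how its patterns formed. Bayesian methods and Markov chain Monte Carlo can accommodate multidimensional data and parameter settings of many types, so Bayesian node-dating methods, represented by software such as BEAST and PAML-MCMCTree, have gradually become the most widely used class of molecular dating methods. One advantage of the Bayesian framework is that it can use complex models to account for various sources of uncertainty, but each model and parameter setting in these methods may introduce error and thereby affect the reliability of divergence time estimates. This paper introduces the principles and main types of Bayesian molecular dating methods and, taking Bayesian node dating as an example, focuses on how node dating is affected by the molecular clock model, the selection and placement of fossil calibrations, the sampling frequency, and the prior distributions on fossil calibration ages. It provides recommendations for using Bayesian time-tree software, principles for discussing node ages, and methods for comparing time trees under different models, and it analyzes common situations that carry a risk of overestimating or underestimating node ages and offers practical suggestions. We argue that sensibly integrating the results of multiple Bayesian methods and models and selecting the best among them can improve the reliability of dating results; that researchers should discuss how their time-tree results relate to their parameter settings so as to provide a reference for other scholars; and that updates to the fossil record and improvements to molecular dating methods should proceed in step.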

12.
The three-domains tree, which depicts eukaryotes and archaebacteria as monophyletic sister groups, is the dominant model for early eukaryotic evolution. By contrast, the ‘eocyte hypothesis’, where eukaryotes are proposed to have originated from within the archaebacteria as sister to the Crenarchaeota (also called the eocytes), has been largely neglected in the literature. We have investigated support for these two competing hypotheses from molecular sequence data using methods that attempt to accommodate the across-site compositional heterogeneity and across-tree compositional and rate matrix heterogeneity that are manifest features of these data. When ribosomal RNA genes were analysed using standard methods that do not adequately model these kinds of heterogeneity, the three-domains tree was supported. However, this support was eroded or lost when composition-heterogeneous models were used, with concomitant increase in support for the eocyte tree for eukaryotic origins. Analysis of combined amino acid sequences from 41 protein-coding genes supported the eocyte tree, whether or not composition-heterogeneous models were used. The possible effects of substitutional saturation of our data were examined using simulation; these results suggested that saturation is delayed by among-site rate variation in the sequences, and that phylogenetic signal for ancient relationships is plausibly present in these data.

13.
The 22 genera and 64 species of rodents (Muridae: Murinae) distributed in the Philippine Islands provide a unique opportunity to study patterns and processes of diversification in island systems. Over 90% of these rodent species are endemic to the archipelago, but the relative importance of dispersal from the mainland, dispersal within the archipelago, and in situ differentiation as explanations of this diversity remains unclear, as no phylogenetic hypothesis for these species and relevant mainland forms is currently available. Here we report the results of phylogenetic analyses of the endemic Philippine murines and a wide sampling of murine diversity from outside the archipelago, based on the mitochondrial cytochrome b gene and the nuclear-encoded IRBP exon 1. Analysis of our combined gene data set consistently identified five clades comprising endemic Philippine genera, suggesting multiple invasions of the archipelago. Molecular dating analyses using parametric and semiparametric methods suggest that colonization occurred in at least two stages, one ca. 15 Mya, and another 8 to 12 million years later, consistent with the previous recognition of "Old" and "New" endemic rodent faunas. Ancestral area analysis suggests that the Old Endemics invaded landmasses that are now part of the island of Luzon, whereas the three New Endemic clades may have colonized through either Mindanao, Luzon, or both. Further, our results suggest that most of the diversification of Philippine murines took place within the archipelago. Despite heterogeneity between nuclear and mitochondrial genes in most model parameters, combined analysis of the two data sets using both parsimony and likelihood increased phylogenetic resolution; however, the effect of data combination on support for resolved nodes was method dependent. 
In contrast, our results suggest that combination of mitochondrial and nuclear data to estimate relatively ancient divergence times can severely compromise those estimates, even when specific methods that account for rate heterogeneity among genes are employed. [Biogeography; divergence date estimation; mitochondrial DNA; molecular systematics; Murinae; nuclear exon; Philippines; phylogeny.]

14.
A common assumption in dating patrilineal events using Y-chromosome sequencing data is that the Y-chromosome mutation rate is invariant across haplogroups. Previous studies revealed interhaplogroup heterogeneity in phylogenetic branch length. Whether this heterogeneity is caused by interhaplogroup mutation rate variation or nongenetic confounders remains unknown. Here, we analyzed whole-genome sequences from cultured cells derived from >1,700 males. We confirmed the presence of branch length heterogeneity. We demonstrate that sex-chromosome mutations that appear within cell lines, which likely occurred somatically or in vitro (and are thus not influenced by nongenetic confounders) are informative for germline mutational processes. Using within-cell-line mutations, we computed a relative Y-chromosome somatic mutation rate, and uncovered substantial variation (up to 83.3%) in this proxy for germline mutation rate among haplogroups. This rate positively correlates with phylogenetic branch length, indicating that interhaplogroup mutation rate variation is a likely cause of branch length heterogeneity.

15.

Background  

A full understanding of the patterns and processes of biological diversification requires the dating of evolutionary events, yet the fossil record is inadequate for most lineages under study. Alternatively, a molecular clock approach, in which DNA or amino acid substitution rates are calibrated with fossils or geological/climatic events, can provide indirect estimates of clade ages and diversification rates. The utility of this approach depends on the rate constancy of molecular evolution at a genetic locus across time and across lineages. Although the nuclear ribosomal internal transcribed spacer region (nrITS) is increasingly being used to infer clade ages in plants, little is known about the sources or magnitude of variation in its substitution rate. Here, we systematically review the literature to assess substitution rate variation in nrITS among angiosperms, and we evaluate possible correlates of the variation.

16.
Molecular dating of phylogenetic trees is a growing discipline using sequence data to co‐estimate the timing of evolutionary events and rates of molecular evolution. All molecular‐dating methods require converting genetic divergence between sequences into absolute time. Historically, this could only be achieved by associating externally derived dates obtained from fossil or biogeographical evidence to internal nodes of the tree. In some cases, notably for fast‐evolving genomes such as viruses and some bacteria, the time span over which samples were collected may cover a significant proportion of the time since they last shared a common ancestor. This situation allows phylogenetic trees to be calibrated by associating sampling dates directly to the sequences representing the tips (terminal nodes) of the tree. The increasing availability of genomic data from ancient DNA extends the applicability of such tip‐based calibration to a variety of taxa including humans, extinct megafauna and various microorganisms which typically have a scarce fossil record. The development of statistical models accounting for heterogeneity in different aspects of the evolutionary process while accommodating very large data sets (e.g. whole genomes) has allowed using tip‐dating methods to reach inferences on divergence times, substitution rates, past demography or the age of specific mutations on a variety of spatiotemporal scales. In this review, we summarize the current state of the art of tip dating, discuss some recent applications, highlight common pitfalls and provide a ‘how to’ guide to thoroughly perform such analyses.

17.
Statistical methods for molecular dating of viral origins have been used extensively to infer the time of the most recent common ancestor for many rapidly evolving pathogens. However, there are a number of cases in which epidemiological, historical, or genomic evidence suggests much older viral origins than those obtained via molecular dating. We demonstrate how pervasive purifying selection can mask the ancient origins of recently sampled pathogens, in part due to the inability of nucleotide-based substitution models to properly account for complex patterns of spatial and temporal variability in selective pressures. We use codon-based substitution models to infer the length of branches in viral phylogenies; these models produce estimates that are often considerably longer than those obtained with traditional nucleotide-based substitution models. Correcting the apparent underestimation of branch lengths suggests substantially older origins for measles, Ebola, and avian influenza viruses. This work helps to reconcile some of the inconsistencies between molecular dating and other types of evidence concerning the age of viral lineages.

18.
The assumption of a molecular clock for dating events from sequence information is often frustrated by the presence of heterogeneity among evolutionary rates due, among other factors, to positively selected sites. In this work, our goal is to explore methods to estimate infection dates from sequence analysis. One such method, based on site stripping for clock detection, was proposed to unravel the clocklike molecular evolution in sequences showing high variability of evolutionary rates and in the presence of positive selection. Other alternatives imply accommodating heterogeneity in evolutionary rates at various levels, without eliminating any information from the data. Here we present the analysis of a data set of hepatitis C virus (HCV) sequences from 24 patients infected by a single individual with known dates of infection. We first used a simple criterion of relative substitution rate for site removal prior to a regression analysis. Time was regressed on maximum likelihood pairwise evolutionary distances between the sequences sampled from the source individual and infected patients. We show that it is indeed the fastest evolving sites that disturb the molecular clock and that these sites correspond to positively selected codons. The high computational efficiency of the regression analysis allowed us to compare the site-stripping scheme with random removal of sites. We demonstrate that removing the fast-evolving sites significantly increases the accuracy of estimation of infection times based on a single substitution rate. However, the time-of-infection estimations improved substantially when a more sophisticated and computationally demanding Bayesian method was used. This method was used with the same data set but keeping all the sequence positions in the analysis. 
Consequently, despite the distortion introduced by positive selection on evolutionary rates, it is possible to obtain quite accurate estimates of infection dates, a result of especial relevance for molecular epidemiology studies.
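The regression step described in this abstract (time regressed on pairwise evolutionary distance under a single substitution rate) can be sketched as an ordinary least-squares line fit. This is only an illustration of the general idea, not the authors' procedure: the function names and the numbers in the usage example are hypothetical, and the site-stripping and maximum likelihood distance estimation steps are not reproduced here.

```python
from typing import Sequence, Tuple


def fit_line(x: Sequence[float], y: Sequence[float]) -> Tuple[float, float]:
    """Ordinary least-squares fit of y = intercept + slope * x."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need at least two paired observations")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("all x values are identical; slope is undefined")
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict_time(distance: float, slope: float, intercept: float) -> float:
    """Predicted elapsed time for a given evolutionary distance."""
    return intercept + slope * distance


if __name__ == "__main__":
    # Hypothetical values: distances in substitutions per site,
    # times in years since infection. Not data from the study.
    distances = [0.010, 0.020, 0.030, 0.040]
    years = [5.0, 10.0, 15.0, 20.0]
    slope, intercept = fit_line(distances, years)
    print(predict_time(0.025, slope, intercept))
```

With time as the response variable, the slope is in years per unit distance, so its reciprocal plays the role of the single substitution rate mentioned in the abstract.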

19.
Molecular clock methods allow biologists to estimate divergence times, which in turn play an important role in comparative studies of many evolutionary processes. It is well known that molecular age estimates can be biased by heterogeneity in rates of molecular evolution, but less attention has been paid to the issue of potentially erroneous fossil calibrations. In this study we estimate the timing of diversification in Centrarchidae, an endemic major lineage of the diverse North American freshwater fish fauna, through a new approach to fossil calibration and molecular evolutionary model selection. Given a completely resolved multi-gene molecular phylogeny and a set of multiple fossil-inferred age estimates, we tested for potentially erroneous fossil calibrations using a recently developed fossil cross-validation. We also used fossil information to guide the selection of the optimal molecular evolutionary model with a new fossil jackknife method in a fossil-based model cross-validation. The centrarchid phylogeny resulted from a mixed-model Bayesian strategy that included 14 separate data partitions sampled from three mtDNA and four nuclear genes. Ten of the 31 interspecific nodes in the centrarchid phylogeny were assigned a minimal age estimate from the centrarchid fossil record. Our analyses identified four fossil dates that were inconsistent with the other fossils, and we removed them from the molecular dating analysis. Using fossil-based model cross-validation to determine the optimal smoothing value in penalized likelihood analysis, and six mutually consistent fossil calibrations, the age of the most recent common ancestor of Centrarchidae was 33.59 million years ago (mya). Penalized likelihood analyses of individual data partitions all converged on a very similar age estimate for this node, indicating that rate heterogeneity among data partitions is not confounding our analyses. 
These results place the origin of the centrarchid radiation at a time of major faunal turnover as the fossil record indicates that the most diverse lineages of the North American freshwater fish fauna originated at the Eocene-Oligocene boundary, approximately 34 mya. This time coincided with major global climate change from warm to cool temperatures and a signature of elevated lineage extinction and origination in the fossil record across the tree of life. Our analyses demonstrate the utility of fossil cross-validation to critically assess individual fossil calibration points, providing the ability to discriminate between consistent and inconsistent fossil age estimates that are used for calibrating molecular phylogenies.

20.
Mitochondrial DNA remains one of the most widely used molecular markers to reconstruct the phylogeny and phylogeography of closely related birds. It has been proposed that bird mitochondrial genomes evolve at a constant rate of ~0.01 substitution per site per million years, that is, that they evolve according to a strict molecular clock. This molecular clock is often used in studies of bird mitochondrial phylogeny and molecular dating. However, rates of mitochondrial genome evolution vary among bird species and correlate with life history traits such as body mass and generation time. These correlations could cause systematic biases in molecular dating studies that assume a strict molecular clock. In this study, we overcome this issue by estimating corrected molecular rates for birds. Using complete or nearly complete mitochondrial genomes of 475 species, we show that there are strong relationships between body mass and substitution rates across birds. We use this information to build models that use bird species’ body mass to estimate their substitution rates across a wide range of common mitochondrial markers. We demonstrate the use of these corrected molecular rates on two recently published data sets. In one case, we obtained molecular dates that are twice as old as the estimates obtained using the strict molecular clock. We hope that this method to estimate molecular rates will increase the accuracy of future molecular dating studies in birds.
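The strict-clock arithmetic this abstract refers to can be sketched in a few lines of Python. Under a strict clock, two lineages that split T million years ago have each accumulated r × T substitutions per site, so their pairwise distance is d = 2rT and T = d / (2r). The sketch assumes that the ~0.01 figure is a per-lineage rate, which the abstract does not state explicitly; the function name and the example distance are hypothetical, and the body-mass correction models from the study are not reproduced.

```python
def strict_clock_age(pairwise_distance: float,
                     rate_per_lineage: float = 0.01) -> float:
    """Divergence time in million years under a strict molecular clock.

    pairwise_distance: substitutions per site separating two extant lineages.
    rate_per_lineage:  substitutions per site per million years along one
                       lineage (assumed interpretation of the ~0.01 figure).
    """
    if rate_per_lineage <= 0:
        raise ValueError("rate_per_lineage must be positive")
    if pairwise_distance < 0:
        raise ValueError("pairwise_distance must be non-negative")
    # Both lineages accumulate change, hence the factor of two.
    return pairwise_distance / (2.0 * rate_per_lineage)


if __name__ == "__main__":
    d = 0.1  # hypothetical pairwise distance
    print(strict_clock_age(d))          # default rate
    print(strict_clock_age(d, 0.005))   # slower lineage-specific rate
```

Because the age scales inversely with the rate, halving the assumed rate doubles the estimated age, which is the kind of shift the abstract reports when a corrected rate replaces the standard one.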
