Found 20 similar documents; search took 15 ms
1.
David Duchêne 《Molecular ecology resources》2015,15(4):688-696
Evolutionary timescales can be estimated from genetic data using phylogenetic methods based on the molecular clock. To account for molecular rate variation among lineages, a number of relaxed‐clock models have been developed. Some of these models assume that rates vary among lineages in an autocorrelated manner, so that closely related species share similar rates. In contrast, uncorrelated relaxed clocks allow all of the branch‐specific rates to be drawn from a single distribution, without assuming any correlation between rates along neighbouring branches. There is uncertainty about which of these two classes of relaxed‐clock models is more appropriate for biological data. We present an R package, NELSI, that allows the evolution of DNA sequences to be simulated according to a range of clock models. Using data generated by this package, we assessed the ability of two Bayesian phylogenetic methods to distinguish among different relaxed‐clock models and to quantify rate variation among lineages. The results of our analyses show that rate autocorrelation is typically difficult to detect, even when there is complete taxon sampling. This provides a potential explanation for past failures to detect rate autocorrelation in a range of data sets.
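The contrast between the two classes of relaxed clock described in this abstract can be illustrated with a minimal Python sketch. It is not NELSI (which is an R package) and does not reproduce its interface. The function names, the lognormal parameters, and the simplification to a single chain of ancestor-to-descendant branches rather than a full tree are all assumptions made for illustration.

```python
import math
import random


def uncorrelated_rates(n_branches, mean_log=0.0, sd_log=0.5, rng=None):
    """Draw each branch rate independently from one lognormal distribution."""
    if rng is None:
        rng = random.Random()
    return [math.exp(rng.gauss(mean_log, sd_log)) for _ in range(n_branches)]


def autocorrelated_rates(n_branches, start_rate=1.0, sd_log=0.5, rng=None):
    """Let the log-rate drift from parent branch to child branch.

    Each branch inherits its parent's log-rate plus Gaussian noise, so
    neighbouring branches tend to have similar rates (hypothetical model
    chosen for illustration only).
    """
    if rng is None:
        rng = random.Random()
    rates = []
    log_rate = math.log(start_rate)
    for _ in range(n_branches):
        log_rate += rng.gauss(0.0, sd_log)
        rates.append(math.exp(log_rate))
    return rates


if __name__ == "__main__":
    rng = random.Random(42)
    print("uncorrelated  :", [round(r, 2) for r in uncorrelated_rates(6, rng=rng)])
    print("autocorrelated:", [round(r, 2) for r in autocorrelated_rates(6, rng=rng)])
```

In the autocorrelated case each rate depends on the previous one, whereas the uncorrelated draws carry no information about their neighbours; with few branches the two patterns can look alike, which is in keeping with the abstract's point that autocorrelation is hard to detect.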
2.
Many molecular phylogenies show longer root-to-tip path lengths in species-rich groups, encouraging hypotheses linking cladogenesis with accelerated molecular evolution. However, the pattern can also be caused by an artifact called the node density effect (NDE): this effect occurs when the method used to reconstruct a tree underestimates multiple hits that would have been revealed by extra nodes, leading to longer root-to-tip path lengths in clades with more terminal taxa. Here we use a twofold approach to demonstrate that maximum likelihood and Bayesian methods also suffer from the NDE known to affect parsimony. First, simulations deliberately mismatching the simulation and reconstruction models show that the greater the model disparity, the greater the gap between actual and reconstructed tree lengths, and the greater the NDE. Second, taxon sampling manipulation with empirical data shows that NDE can still be present when using optimized models: across 12 datasets, 70 out of 109 sister path comparisons showed significant evidence of NDE. Unless the model fairly accurately reconstructs the real tree length (and given the complexity of real sequence evolution this may be uncommon), it will consistently produce a node density artifact. At commonly encountered divergence levels, a 10% underestimation of tree length results in ≥ 80% of simulated phylogenies showing a positive NDE. Bayesian trees have a slight but consistently stronger effect. This pervasive methodological artifact increases apparent rate heterogeneity, and can compromise investigations of factors influencing molecular evolutionary rate that use path lengths in topologically asymmetric trees.
3.
Mahan Ghafari Louis du Plessis Jayna Raghwani Samir Bhatt Bo Xu Oliver G Pybus Aris Katzourakis 《Molecular biology and evolution》2022,39(2)
High-throughput sequencing enables rapid genome sequencing during infectious disease outbreaks and provides an opportunity to quantify the evolutionary dynamics of pathogens in near real-time. One difficulty of undertaking evolutionary analyses over short timescales is the dependency of the inferred evolutionary parameters on the timespan of observation. Crucially, there are an increasing number of molecular clock analyses using external evolutionary rate priors to infer evolutionary parameters. However, it is not clear which rate prior is appropriate for a given time window of observation due to the time-dependent nature of evolutionary rate estimates. Here, we characterize the molecular evolutionary dynamics of SARS-CoV-2 and 2009 pandemic H1N1 (pH1N1) influenza during the first 12 months of their respective pandemics. We use Bayesian phylogenetic methods to estimate the dates of emergence, evolutionary rates, and growth rates of SARS-CoV-2 and pH1N1 over time and investigate how varying sampling window and data set sizes affect the accuracy of parameter estimation. We further use a generalized McDonald–Kreitman test to estimate the number of segregating nonneutral sites over time. We find that the inferred evolutionary parameters for both pandemics are time dependent, and that the inferred rates of SARS-CoV-2 and pH1N1 decline by ∼50% and ∼100%, respectively, over the course of 1 year. After at least 4 months since the start of sequence sampling, inferred growth rates and emergence dates remain relatively stable and can be inferred reliably using a logistic growth coalescent model. We show that the time dependency of the mean substitution rate is due to elevated substitution rates at terminal branches which are 2–4 times higher than those of internal branches for both viruses. 
The elevated rate at terminal branches is strongly correlated with an increasing number of segregating nonneutral sites, demonstrating the role of purifying selection in generating the time dependency of evolutionary parameters during pandemics.
4.
We have used analysis of variance to partition the variation in synonymous and amino acid substitution rates between three effects (gene, lineage, and a gene-by-lineage interaction) in mammalian nuclear and mitochondrial genes. We find that gene effects are stronger for amino acid substitution rates than for synonymous substitution rates and that lineage effects are stronger for synonymous substitution rates than for amino acid substitution rates. Gene-by-lineage interactions, equivalent to overdispersion corrected for lineage effects, are found in amino acid substitutions but not in synonymous substitutions. The variance in the ratio of amino acid and synonymous substitution rates is dominated by gene effects, but there is also a significant gene-by-lineage interaction.
5.
Senchina DS Alvarez I Cronn RC Liu B Rong J Noyes RD Paterson AH Wing RA Wilkins TA Wendel JF 《Molecular biology and evolution》2003,20(4):633-643
Molecular evolutionary rate variation in Gossypium (cotton) was characterized using sequence data for 48 nuclear genes from both genomes of allotetraploid cotton, models of its diploid progenitors, and an outgroup. Substitution rates varied widely among the 48 genes, with silent and replacement substitution levels varying from 0.018 to 0.162 and from 0.000 to 0.073, respectively, in comparisons between orthologous Gossypium and outgroup sequences. However, about 90% of the genes had silent substitution rates spanning a narrower threefold range. Because there was no evidence of rate heterogeneity among lineages for any gene and because rates were highly correlated in independent tests, evolutionary rate is inferred to be a property of each gene or its genetic milieu rather than the clade to which it belongs. Evidence from approximately 200,000 nucleotides (40,000 per genome) suggests that polyploidy in Gossypium led to a modest enhancement in rates of nucleotide substitution. Phylogenetic analysis for each gene yielded the topology expected from organismal history, indicating an absence of gene conversion or recombination among homoeologs subsequent to allopolyploid formation. Using the mean synonymous substitution rate calculated across the 48 genes, allopolyploid cotton is estimated to have formed circa 1.5 million years ago (MYA), after divergence of the diploid progenitors about 6.7 MYA.
6.
Lanfear R 《Evolution; international journal of organic evolution》2011,65(2):606-611
Rates of molecular evolution vary substantially between lineages, and a growing effort is directed at uncovering the causes and consequences of this variation. Comparing local-clocks (rates of molecular evolution estimated from different sets of branches of a phylogenetic tree) is a common tool in this research effort. Here, I show that a commonly used test (the Likelihood Ratio Test, LRT) will not be statistically valid for comparing local-clocks in most cases. Instead, I propose the local-clock permutation test (LCPT), a simple test that can be used to test the significance of differences between local-clocks. The LCPT could also be used to test for differences between any parameter that can be assigned to individual branches on a phylogenetic tree. Using simulated data, I show that the LCPT has good power to detect differences between local-clocks.
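The abstract does not spell out how the LCPT is carried out, so the following is only a generic two-group permutation test on per-branch values, offered as a rough Python sketch of the permutation idea. It should not be read as the published LCPT procedure. The function name, the use of the difference in group means as the test statistic, and the example numbers are all assumptions.

```python
import random


def permutation_p_value(group_a, group_b, n_perm=10000, rng=None):
    """Two-sided permutation test on the difference in group means.

    Group labels are shuffled across the pooled values; the p-value is
    the share of shuffles whose absolute mean difference is at least as
    large as the observed one (with the usual +1 correction).
    """
    if rng is None:
        rng = random.Random()
    n_a = len(group_a)
    n_b = len(group_b)
    observed = abs(sum(group_a) / n_a - sum(group_b) / n_b)
    pooled = list(group_a) + list(group_b)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        diff = abs(sum(pooled[:n_a]) / n_a - sum(pooled[n_a:]) / n_b)
        if diff >= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)


if __name__ == "__main__":
    # Made-up branch rates for two sets of branches (illustrative only).
    fast = [2.0, 2.1, 1.9, 2.05, 1.95, 2.0]
    slow = [1.0, 1.1, 0.9, 1.05, 0.95, 1.0]
    print(permutation_p_value(fast, slow, n_perm=2000, rng=random.Random(0)))
```

A permutation approach avoids relying on an assumed reference distribution for the test statistic, which is the kind of concern the abstract raises about using the LRT to compare local-clocks.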
7.
Maughan H 《Evolution; international journal of organic evolution》2007,61(2):280-288
Rates of molecular evolution are known to vary considerably among lineages, partially due to differences in life-history traits such as generation time. The generation-time effect has been well documented in some eukaryotes, but its prevalence in prokaryotes is unknown. Because many species of Firmicute bacteria spend long periods of time as metabolically dormant spores, which could result in fewer DNA substitutions per unit time, they present an excellent system for testing predictions of the molecular clock hypothesis. To test whether spore-forming bacteria evolve more slowly than their non-spore-forming relatives, I used phylogenetic methods to determine if there were differences in rates of amino acid substitution between spore-forming and non-spore-forming lineages of Firmicute bacteria. Although rates of evolution do vary among lineages, I find no evidence for an effect of spore-formation on evolutionary rate and, furthermore, evolutionary rates are similar to those calculated for enteric bacteria. These results support the notion that variation in generation time does not affect evolutionary rates in bacterial lineages.
8.
Two measures, amplitude and phase, have been used to describe the characteristics of the endogenous human circadian pacemaker, a biological clock located in the hypothalamus. Although many studies of change in circadian phase with respect to different stimuli have been conducted, the physiologic implications of the amplitude changes (dynamics) of the pacemaker are unknown. It is known that phase changes of the human circadian pacemaker have a significant impact on sleep timing and content, hormone secretion, subjective alertness and neurobehavioral performance. However, the changes in circadian amplitude with respect to different stimuli are less well documented. Although amplitude dynamics of the human circadian pacemaker are observed in physiological rhythms such as plasma cortisol, plasma melatonin and core temperature data, currently methods are not available to accurately characterize the amplitude dynamics from these rhythms. Of the three rhythms core temperature is the only reliable variable that can be monitored continuously in real time with a high sampling rate. To characterize the amplitude dynamics of the circadian pacemaker we propose a stochastic-dynamic model of core temperature data that contains both stochastic and dynamic characteristics. In this model the circadian component that has a dynamic characteristic is represented as a perturbation solution of the van der Pol equation and the thermoregulatory response in the data that has a stochastic characteristic is represented as a first-order autoregressive process. The model parameters are estimated using data with a maximum likelihood procedure and the goodness-of-fit measures along with the associated standard error of the estimated parameters provided inference about the amplitude dynamics of the pacemaker. Using this model we analysed core temperature data from an experiment designed to exhibit amplitude dynamics. 
We found that the circadian pacemaker recovers slowly to an equilibrium level following amplitude suppression. In humans this reaction to perturbation from equilibrium value has potential physiological implications.
9.
Accuracy of rate estimation using relaxed-clock models with a critical focus on the early metazoan radiation (total citations: 7; self-citations: 0; citations by others: 7)
In recent years, a number of phylogenetic methods have been developed for estimating molecular rates and divergence dates under models that relax the molecular clock constraint by allowing rate change throughout the tree. These methods are being used with increasing frequency, but there have been few studies into their accuracy. We tested the accuracy of several relaxed-clock methods (penalized likelihood and Bayesian inference using various models of rate change) using nucleotide sequences simulated on a nine-taxon tree. When the sequences evolved with a constant rate, the methods were able to infer rates accurately, but estimates were more precise when a molecular clock was assumed. When the sequences evolved under a model of auto-correlated rate change, rates were accurately estimated using penalized likelihood and by Bayesian inference using lognormal and exponential models of rate change, while other models did not perform as well. When the sequences evolved under a model of uncorrelated rate change, only Bayesian inference using an exponential rate model performed well. Collectively, the results provide a strong recommendation for using the exponential model of rate change if a conservative approach to divergence time estimation is required. A case study is presented in which we use a simulation-based approach to examine the hypothesis of elevated rates in the Cambrian period, and it is found that these high rate estimates might be an artifact of the rate estimation method. If this bias is present, then the ages of metazoan divergences would be systematically underestimated. The results of this study have implications for studies of molecular rates and divergence dates.
10.
Examining rates and patterns of nucleotide substitution in plants (total citations: 19; self-citations: 0; citations by others: 19)
Muse SV 《Plant molecular biology》2000,42(1):25-43
Driven by rapid improvements in affordable computing power and by the even faster accumulation of genomic data, the statistical analysis of molecular sequence data has become an active area of interdisciplinary research. Maximum likelihood methods have become mainstream because of their desirable properties and, more importantly, their potential for providing statistically sound solutions in complex data analysis settings. In this chapter, a review of recent literature focusing on rates and patterns of nucleotide substitution in the nuclear, chloroplast, and mitochondrial genomes of plants demonstrates the power and flexibility of these new methods. The emerging picture of the nucleotide substitution process in plants is a complex one. Evolutionary rates are seen to be quite variable, both among genes and among plant lineages. However, there are hints, particularly in the chloroplast, that individual factors can have important effects on many genes simultaneously.
11.
Ho SY Lanfear R Bromham L Phillips MJ Soubrier J Rodrigo AG Cooper A 《Molecular ecology》2011,20(15):3087-3101
For over half a century, it has been known that the rate of morphological evolution appears to vary with the time frame of measurement. Rates of microevolutionary change, measured between successive generations, were found to be far higher than rates of macroevolutionary change inferred from the fossil record. More recently, it has been suggested that rates of molecular evolution are also time dependent, with the estimated rate depending on the timescale of measurement. This followed surprising observations that estimates of mutation rates, obtained in studies of pedigrees and laboratory mutation-accumulation lines, exceeded long-term substitution rates by an order of magnitude or more. Although a range of studies have provided evidence for such a pattern, the hypothesis remains relatively contentious. Furthermore, there is ongoing discussion about the factors that can cause molecular rate estimates to be dependent on time. Here we present an overview of our current understanding of time-dependent rates. We provide a summary of the evidence for time-dependent rates in animals, bacteria and viruses. We review the various biological and methodological factors that can cause rates to be time dependent, including the effects of natural selection, calibration errors, model misspecification and other artefacts. We also describe the challenges in calibrating estimates of molecular rates, particularly on the intermediate timescales that are critical for an accurate characterization of time-dependent rates. This has important consequences for the use of molecular-clock methods to estimate timescales of recent evolutionary events.
12.
13.
Bromham L 《Philosophical transactions of the Royal Society of London. Series B, Biological sciences》2011,366(1577):2503-2513
DNA sequences evolve at different rates in different species. This rate variation has been most closely examined in mammals, revealing a large number of characteristics that can shape the rate of molecular evolution. Many of these traits are part of the mammalian life-history continuum: species with small body size, rapid generation turnover, high fecundity and short lifespans tend to have faster rates of molecular evolution. In addition, rate of molecular evolution in mammals might be influenced by behaviour (such as mating system), ecological factors (such as range restriction) and evolutionary history (such as diversification rate). I discuss the evidence for these patterns of rate variation, and the possible explanations of these correlations. I also consider the impact of these systematic patterns of rate variation on the reliability of the molecular date estimates that have been used to suggest a Cretaceous radiation of modern mammals, before the final extinction of the dinosaurs.
14.
Ziheng Yang 《Journal of molecular evolution》1996,42(5):587-596
Models of nucleotide substitution were constructed for combined analyses of heterogeneous sequence data (such as those of multiple genes) from the same set of species. The models account for different aspects of the heterogeneity in the evolutionary process of different genes, such as differences in nucleotide frequencies, in substitution rate bias (for example, the transition/transversion rate bias), and in the extent of rate variation across sites. Model parameters were estimated by maximum likelihood and the likelihood ratio test was used to test hypotheses concerning sequence evolution, such as rate constancy among lineages (the assumption of a molecular clock) and proportionality of branch lengths for different genes. The example data from a segment of the mitochondrial genome of six hominoid species (human, common and pygmy chimpanzees, gorilla, orangutan, and siamang) were analyzed. Nucleotides at the three codon positions in the protein-coding regions and from the tRNA-coding regions were considered heterogeneous data sets. Statistical tests showed that the amount of evolution in the sequence data reflected in the estimated branch lengths can be explained by the codon-position effect and lineage effect of substitution rates. The assumption of a molecular clock could not be rejected when the data were analyzed separately or when the rate variation among sites was ignored. However, significant differences in substitution rate among lineages were found when the data sets were combined and when the rate variation among sites was accounted for in the models. Under the assumption that the orangutan and African apes diverged 13 million years ago, the combined analysis of the sequence data estimated the times for the human-chimpanzee separation and for the separation of the gorilla as 4.3 and 6.8 million years ago, respectively.
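The likelihood ratio test of the molecular clock mentioned in this abstract can be sketched in Python as follows. The sketch assumes the standard setup in which the clock model is nested in the unconstrained model on the same topology, so that for n taxa the test has n − 2 degrees of freedom (2n − 3 free branch lengths versus n − 1 node ages). The function names and the log-likelihood values in the example are hypothetical and are not taken from the paper.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability of a chi-square variable with integer df.

    Uses the recurrence Q(x | v + 2) = Q(x | v) + (x/2)^(v/2) e^(-x/2) / Gamma(v/2 + 1),
    starting from the closed forms for df = 1 and df = 2.
    """
    if x <= 0:
        return 1.0
    half = x / 2.0
    if df % 2 == 0:
        q, nu = math.exp(-half), 2
    else:
        q, nu = math.erfc(math.sqrt(half)), 1
    while nu < df:
        q += half ** (nu / 2.0) * math.exp(-half) / math.gamma(nu / 2.0 + 1.0)
        nu += 2
    return q


def clock_lrt(lnl_clock, lnl_free, n_taxa):
    """Likelihood ratio test of a strict clock against free branch rates."""
    if n_taxa < 3:
        raise ValueError("the clock test needs at least three taxa")
    stat = 2.0 * (lnl_free - lnl_clock)
    df = n_taxa - 2
    return stat, df, chi2_sf(stat, df)


if __name__ == "__main__":
    # Hypothetical log-likelihoods for a six-taxon data set (not from the paper).
    stat, df, p = clock_lrt(lnl_clock=-1000.0, lnl_free=-995.0, n_taxa=6)
    print(f"2*delta(lnL) = {stat:.2f}, df = {df}, p = {p:.4f}")
```

A small p-value would indicate that the clock constraint fits significantly worse than the unconstrained model, which is the sense in which the abstract reports the clock being rejected for the combined data.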
15.
The hepatitis B virus (HBV) has a circular DNA genome of about 3,200 base pairs. Economical use of the genome with overlapping reading frames may have led to severe constraints on nucleotide substitutions along the genome and to highly variable rates of substitution among nucleotide sites. Nucleotide sequences from 13 complete HBV genomes were compared to examine such variability of substitution rates among sites and to examine the phylogenetic relationships among the HBV variants. The maximum likelihood method was employed to fit models of DNA sequence evolution that can account for the complexity of the pattern of nucleotide substitution. Comparison of the models suggests that the rates of substitution are different in different genes and codon positions; for example, the third codon position changes at a rate over ten times higher than the second position. Furthermore, substantial variation of substitution rates was detected even after the effects of genes and codon positions were corrected; that is, rates are different at different sites of the same gene or at the same codon position. Such rates after the correction were also found to be positively correlated at adjacent sites, which indicated the existence of conserved and variable domains in the proteins encoded by the viral genome. A multiparameter model validates the earlier finding that the variation in nucleotide conservation is not random around the HBV genome. The test for the existence of a molecular clock suggests that substitution rates are more or less constant among lineages. The phylogenetic relationships among the viral variants were examined. Although the data do not seem to contain sufficient information to resolve the details of the phylogeny, it appears quite certain that the serotypes of the viral variants do not reflect their genetic relatedness.
Correspondence to: Z. Yang
16.
The decarboxylases are involved in neurotransmitter synthesis in animals, and in pathways of secondary metabolism in plants. Different decarboxylase proteins are characterized for their different substrate specificities, but are encoded by homologous genes. We study, within a maximum-likelihood framework, the evolutionary relationships among dopa decarboxylase (Ddc), histidine decarboxylase (Hdc) and alpha-methyldopa hypersensitive (amd) in animals, and tryptophan decarboxylase (Wdc) and tyrosine decarboxylase (Ydc) in plants. The evolutionary rates are heterogeneous. There are differences between paralogous genes in the same lineages: 4.13 × 10^-10 nucleotide substitutions per site per year in mammalian Ddc vs. 1.95 in Hdc; between orthologous genes in different lineages, 7.62 in dipteran Ddc vs. 4.13 in mammalian Ddc; and very large temporal variations in some lineages, from 3.7 up to 54.9 in the Drosophila Ddc lineage. Our results are inconsistent with the molecular clock hypothesis.
17.
We propose a scaled linear mixed model to assess the effects of exposure and other covariates on multiple continuous outcomes. The most general form of the model allows a different exposure effect for each outcome. An important special case is a model that represents the exposure effects using a common global measure that can be characterized in terms of effect sizes. Correlations among different outcomes within the same subject are accommodated using random effects. We develop two approaches to model fitting, including the maximum likelihood method and the working parameter method. A key feature of both methods is that they can be easily implemented by repeatedly calling software for fitting standard linear mixed models, e.g., SAS PROC MIXED. Compared to the maximum likelihood method, the working parameter method is easier to implement and yields fully efficient estimators of the parameters of interest. We illustrate the proposed methods by analyzing data from a study of the effects of occupational pesticide exposure on semen quality in a cohort of Chinese men.
18.
Testing the molecular clock: molecular and paleontological estimates of divergence times in the Echinoidea (Echinodermata) (total citations: 6; self-citations: 0; citations by others: 6)
Smith AB Pisani D Mackenzie-Dodds JA Stockley B Webster BL Littlewood DT 《Molecular biology and evolution》2006,23(10):1832-1851
The phylogenetic relationships of 46 echinoids, with representatives from 13 of the 14 ordinal-level clades and about 70% of extant families commonly recognized, have been established from 3 genes (3,226 alignable bases) and 119 morphological characters. Morphological and molecular estimates are similar enough to be considered suboptimal estimates of one another, and the combined data provide a tree that, when calibrated against the fossil record, provides paleontological estimates of divergence times and completeness of their fossil record. The order of branching on the cladogram largely agrees with the stratigraphic order of first occurrences and implies that their fossil record is more than 85% complete at family level and at a resolution of 5-Myr time intervals. Molecular estimates of divergence times derived from applying both molecular clock and relaxed molecular clock models are concordant with estimates based on the fossil record in up to 70% of cases, with most concordant results obtained using Sanderson's semiparametric penalized likelihood method and a logarithmic-penalty function. There are 3 regions of the tree where molecular and fossil estimates of divergence time consistently disagree. Comparison with results obtained when molecular divergence dates are estimated from the combined (morphology + gene) tree suggests that errors in phylogenetic reconstruction explain only one of these. In another region the error most likely lies with the paleontological estimates because taxa in this region are demonstrated to have a very poor fossil record. In the third case, morphological and paleontological evidence is much stronger, and the topology for this part of the molecular tree differs from that derived from the combined data. Here the cause of the mismatch is unclear but could be methodological, arising from marked inequality of molecular rates. 
Overall, the level of agreement reached between these different data and methodological approaches leads us to believe that careful application of likelihood and Bayesian methods to molecular data provides realistic divergence time estimates in the majority of cases (almost 80% in this specific example), thus providing a remarkably well-calibrated phylogeny of a character-rich clade of ubiquitous marine benthic invertebrates.
19.
The phylogeny of the Drosophila hydei subgroup, which is a member of the D. repleta species group, was inferred from 1,515 base pairs of mitochondrial DNA sequence of the cytochrome oxidase subunits I, II, and III. Four of the seven species in the subgroup were examined, which are placed into two taxonomic complexes: the D. bifurca complex (D. bifurca and D. nigrohydei) and the D. hydei complex (D. hydei and D. eohydei). Both complexes appear to be monophyletic, although the D. bifurca complex is only weakly supported. The evolution of chromosomal change, interspecific crossability, sperm gigantism, and divergence times of the subgroup is discussed in a phylogenetic context. Correspondence to: G. Spicer
20.
Estabrook GF Smith GR Dowling TE 《Evolution; international journal of organic evolution》2007,61(5):1176-1187
The mass-specific metabolic rate hypothesis of Gillooly and others predicts that DNA mutation and substitution rates are a function of body mass and temperature. We tested this hypothesis with sequence divergences estimated from mtDNA cytochrome b sequences of 54 taxa of cyprinid fish. Branch lengths estimated from a likelihood tree were compared with metabolic rates calculated from body mass and environmental temperatures experienced by those taxa. The problem of unknown age estimates of lineage splitting was avoided by comparing estimated amounts of metabolic activity along phyletic lines leading to pairs of modern taxa from their most recent common ancestor with sequence divergences along those same pairs of phyletic lines. There were significantly more pairs for which the phyletic line with greater genetic change also had the higher metabolic activity, when compared to the prediction of a hypothesis that body mass and temperature are not related to substitution rate.