Similar literature
 Found 20 similar articles; search time 31 ms
1.
The principle of heterotachy states that the substitution rate of sites in a gene can change through time. In this article, we propose a powerful statistical test to detect sites that evolve according to the process of heterotachy. We apply this test to an alignment of 1289 eukaryotic rRNA molecules to 1) determine how widespread the phenomenon of heterotachy is in ribosomal RNA, 2) test whether these heterotachous sites are nonrandomly distributed, that is, linked to secondary structure features of ribosomal RNA, and 3) determine the impact of heterotachous sites on the bootstrap support of monophyletic groupings. Our study revealed that with 21 monophyletic taxa, approximately two-thirds of the sites in the considered set of sequences are heterotachous. Although the detected heterotachous sites do not appear bound to specific structural features of the small subunit rRNA, their presence is shown to have a large beneficial influence on the bootstrap support of monophyletic groups. Using extensive testing, we show that this may not be due to heterotachy itself but merely due to the increased substitution rate at the detected heterotachous sites.

2.
Heterotachy is a general term to describe positions that evolve at different rates in different lineages. Heterotachy also can generally be viewed as multivariate rates-across-sites variation, which can be described as randomly drawing rates (or branch lengths) from a multivariate distribution for each branch at each site (Wu J, Susko E. 2009. General heterotachy and distance method adjustments. Mol Biol Evol. 26:2689-2697). Motivated by this result, we propose three new distance-based tests: a heterogeneity test, a heterotachy test, and a within-gene heterotachy test, and demonstrate with simulations that they perform well under a wide range of conditions. We also applied the first two tests to two real data sets and found that although both data sets showed significant evidence of heterotachy, there were subtrees for which the data were consistent with an equal rates or rates-across-sites model. Keywords: heterogeneity, heterotachy, within-gene heterotachy, covarion model, distance method, hypothesis test.

3.
Evolutionary relationships are typically inferred from molecular sequence data using a statistical model of the evolutionary process. When the model accurately reflects the underlying process, probabilistic phylogenetic methods recover the correct relationships with high accuracy. There is ample evidence, however, that models commonly used today do not adequately reflect real-world evolutionary dynamics. Virtually all contemporary models assume that relatively fast-evolving sites are fast across the entire tree, whereas slower sites always evolve at relatively slower rates. Many molecular sequences, however, exhibit site-specific changes in evolutionary rates, called "heterotachy." Here we examine the accuracy of 2 phylogenetic methods for incorporating heterotachy: the mixed branch length model, which incorporates site-specific rate changes by summing likelihoods over multiple sets of branch lengths on the same tree, and the covarion model, which uses a hidden Markov process to allow sites to switch between variable and invariable as they evolve. Under a variety of simple heterogeneous simulation conditions, the mixed model was dramatically more accurate than homotachous models, which were subject to topological biases as well as biases in branch length estimates. When data were simulated with strong versions of the types of heterotachy observed in real molecular sequences, the mixed branch length model was more accurate than homotachous techniques. Analyses of empirical data sets confirmed that the mixed branch length model can improve phylogenetic accuracy under conditions that cause homotachous models to fail. In contrast, the covarion model did not improve phylogenetic accuracy compared with homotachous models and was sometimes substantially less accurate. We conclude that a mixed branch length approach, although not the solution to all phylogenetic errors, is a valuable strategy for improving the accuracy of inferred trees.

4.
The covarion hypothesis of molecular evolution proposes that selective pressures on an amino acid or nucleotide site change through time, thus causing changes of evolutionary rate along the edges of a phylogenetic tree. Several kinds of Markov models for the covarion process have been proposed. One model, proposed by Huelsenbeck (2002), has 2 substitution rate classes: the substitution process at a site can switch between a single variable rate, drawn from a discrete gamma distribution, and a zero invariable rate. A second model, suggested by Galtier (2001), assumes rate switches among an arbitrary number of rate classes but switching to and from the invariable rate class is not allowed. The latter model allows for some sites that do not participate in the rate-switching process. Here we propose a general covarion model that combines features of both models, allowing evolutionary rates not only to switch between variable and invariable classes but also to switch among different rates when they are in a variable state. We have implemented all 3 covarion models in a maximum likelihood framework for amino acid sequences and tested them on 23 protein data sets. We found significant likelihood increases for all data sets for the 3 models, compared with a model that does not allow site-specific rate switches along the tree. Furthermore, we found that the general model fit the data better than the simpler covarion models in the majority of the cases, highlighting the complexity in modeling the covarion process. The general covarion model can be used for comparing tree topologies, molecular dating studies, and the investigation of protein adaptation.
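As a rough illustration of the variable/invariable switching described in this abstract, the sketch below builds a toy rate matrix for nucleotides in which a site alternates between an "on" class (substitutions allowed) and an "off" class (no substitutions). It is not the authors' implementation: the function name, the parameter names, and the use of equal (Jukes-Cantor-style) substitution rates are assumptions made for illustration, and the discrete gamma component mentioned in the abstract is left out.

```python
def covarion_rate_matrix(s_on_to_off, s_off_to_on, mu=1.0):
    """Build an 8x8 on/off covarion rate matrix (hypothetical helper).

    Indices 0-3: nucleotides A, C, G, T in the "on" (variable) class.
    Indices 4-7: the same nucleotides in the "off" (invariable) class.
    """
    n = 4
    q = [[0.0] * (2 * n) for _ in range(2 * n)]
    for i in range(n):
        # Substitutions occur only while the site is "on";
        # equal rates between nucleotides are an assumption of this sketch.
        for j in range(n):
            if i != j:
                q[i][j] = mu / (n - 1)
        # Class switches keep the nucleotide and change only the class.
        q[i][n + i] = s_on_to_off
        q[n + i][i] = s_off_to_on
    # Each diagonal entry is minus the sum of the other entries in its row,
    # so every row sums to zero as required for a rate matrix.
    for r in range(2 * n):
        q[r][r] = -sum(q[r][c] for c in range(2 * n) if c != r)
    return q


Q = covarion_rate_matrix(s_on_to_off=0.2, s_off_to_on=0.5)
```

Under this two-class switching process, the long-run fraction of time a site spends in the "on" class is `s_off_to_on / (s_on_to_off + s_off_to_on)`.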

5.
Heterotachy, an important process of protein evolution.   Total citations: 10, self-citations: 0, citations by others: 10
Because of functional constraints, substitution rates vary among the positions of a protein but are usually assumed to be constant at a given site during evolution. The distribution of the rates across the sequence positions generally fits a Gamma distribution. Models of sequence evolution were accordingly designed and led to improved phylogenetic reconstruction. However, it has been convincingly demonstrated that the evolutionary rate of a given position is not always constant throughout time. We called such within-site rate variations heterotachy (for "different speed" in Greek). Yet, heterotachy was found among homologous sequences of distantly related organisms, often with different functions. In such cases, the functional constraints are likely different, which would explain the different distribution of variable sites. To evaluate the importance of heterotachy, we focused on amino acid sequences of mitochondrial cytochrome b, for which the function is likely the same in all vertebrates. Using 2,038 sequences, we demonstrate that 95% of the variable positions are heterotachous, i.e., underwent dramatic variations of substitution rate among vertebrate lineages. Heterotachy even occurs at small evolutionary scale, and in these cases it is very unlikely to be related to functional changes. Since a large number of sequences are required to efficiently detect heterotachy, the extent of this phenomenon could not be estimated for all proteins yet. It could be as large as for cytochrome b, since this protein is not a peculiar case. The observations made here open several new avenues of research, such as the understanding of the evolution of functional constraints or the improvement of phylogenetic reconstruction methods.

6.
The aims of the work were (1) to develop statistical tests to identify whether substitution takes place under a covariotide model in sequences used for phylogenetic inference and (2) to determine the influence of covariotide substitution on phylogenetic trees inferred for photosynthetic and other organisms. (Covariotide and covarion models are ones in which sites that are variable in some parts of the underlying tree are invariable in others and vice versa.) Two tests were developed. The first was a contingency test, and the second was an inequality test comparing the expected number of variable sites in two groups with the observed number. Application of these tests to 16S rDNA and tufA sequences from a range of nonphotosynthetic prokaryotes and oxygenic photosynthetic prokaryotes and eukaryotes suggests the occurrence of a covariotide mechanism. The degree of support for partitioning of taxa in reconstructed trees involving these organisms was determined in the presence or absence of sites showing particular substitution patterns. This analysis showed that the support for splits between (1) photosynthetic eukaryotes and prokaryotes and (2) photosynthetic and nonphotosynthetic organisms could be accounted for by patterns arising from covariotide substitution. We show that the additional problem of compositional bias in sequence data needs to be considered in the context of patterns of covariotide/covarion substitution. We argue that while covariotide or covarion substitution may give rise to phylogenetically informative patterns in sequence data, this may not always be so.

7.
The rate at which a given site in a gene sequence alignment evolves over time may vary. This phenomenon, known as heterotachy, can bias or distort phylogenetic trees inferred from models of sequence evolution that assume rates of evolution are constant. Here, we describe a phylogenetic mixture model designed to accommodate heterotachy. The method sums the likelihood of the data at each site over more than one set of branch lengths on the same tree topology. A branch-length set that is best for one site may differ from the branch-length set that is best for some other site, thereby allowing different sites to have different rates of change throughout the tree. Because rate variation may not be present in all branches, we use a reversible-jump Markov chain Monte Carlo algorithm to identify those branches in which reliable amounts of heterotachy occur. We implement the method in combination with our 'pattern-heterogeneity' mixture model, applying it to simulated data and five published datasets. We find that complex evolutionary signals of heterotachy are routinely present over and above variation in the rate or pattern of evolution across sites, that the reversible-jump method requires far fewer parameters than conventional mixture models to describe it, and serves to identify the regions of the tree in which heterotachy is most pronounced. The reversible-jump procedure also removes the need for a posteriori tests of 'significance' such as the Akaike or Bayesian information criterion tests, or Bayes factors. Heterotachy has important consequences for the correct reconstruction of phylogenies as well as for tests of hypotheses that rely on accurate branch-length information. These include molecular clocks, analyses of tempo and mode of evolution, comparative studies and ancestral state reconstruction. The model is available from the authors' website, and can be used for the analysis of both nucleotide and morphological data.
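The per-site summation described in this abstract can be sketched as follows. This is a minimal illustration, not the authors' software: it assumes the per-site likelihoods under each branch-length set have already been computed elsewhere, and it assumes the sum is weighted by mixture proportions that add up to 1, which the abstract does not spell out. The function and variable names are hypothetical.

```python
import math


def mixture_log_likelihood(site_likelihoods, weights):
    """Log-likelihood of an alignment under a branch-length mixture.

    site_likelihoods[k][i] is the likelihood of site i under
    branch-length set k (assumed precomputed on one fixed topology).
    weights[k] is the assumed mixture proportion of set k.
    """
    if abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("weights must sum to 1")
    n_sites = len(site_likelihoods[0])
    total = 0.0
    for i in range(n_sites):
        # Sum over branch-length sets for this site, then take the log.
        site = sum(w * liks[i] for w, liks in zip(weights, site_likelihoods))
        total += math.log(site)
    return total


# Two branch-length sets, two sites; the numbers are made up.
example = [[0.2, 0.01], [0.05, 0.1]]
lnl = mixture_log_likelihood(example, [0.5, 0.5])
```

In the example, each site is well explained by a different branch-length set, which is the situation the mixture is meant to accommodate.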

8.
It has long been recognized that the rates of molecular evolution vary amongst sites in proteins. The usual model for rate heterogeneity assumes independent rate variation according to a rate distribution. In such models the rate at a site, although random, is assumed fixed throughout the evolutionary tree. Recent work by several groups has suggested that rates at sites often vary across subtrees of the larger tree as well as across sites. This phenomenon is not captured by most phylogenetic models but instead is more similar to the covarion model of Fitch and coworkers. In this article we present methods that can be useful in detecting whether different rates occur in two different subtrees of the larger tree and where these differences occur. Parametric bootstrapping and orthogonal regression methodologies are used to test for rate differences and to make statements about the general differences in the rates at sites. Confidence intervals based on the conditional distributions of rates at sites are then used to detect where the rate differences occur. Such methods will be helpful in studying the phylogenetic, structural, and functional bases of changes in evolutionary rates at sites, a phenomenon that has important consequences for deep phylogenetic inference.

9.
Covarion processes allow changes in evolutionary rates at sites along the branches of a phylogenetic tree. Covarion-like evolution is increasingly recognized as an important mode of protein evolution. Several recent reports suggest that maximum likelihood estimation employing covarion models may support different optimal topologies than estimation using standard rates-across-sites (RAS) models. However, it remains to be demonstrated that ignoring covarion evolution will generally result in topological misestimation. In this study we performed analytical and theoretical studies of limiting distances under the covarion model and four-taxon tree simulations to investigate the extent to which the covarion process impacts on phylogenetic estimation. In particular, we assessed the limits of an RAS model-based maximum likelihood method to recover the phylogenies when the sequence data were simulated under the covarion processes. We find that, when ignored, covarion processes can induce systematic errors in phylogeny reconstruction. Surprisingly, when sequences are evolved under a covarion process but an RAS model is used for estimation, we find that a long branch repel bias occurs.

10.
Serial transfer of plastids from one eukaryotic host to another is the key process involved in evolution of secondhand plastids. Such transfers drastically change the environment of the plastids and hence the selection regimes, presumably leading to changes over time in the characteristics of plastid gene evolution and to misleading phylogenetic inferences. About half of the dinoflagellate protists species are photosynthetic and unique in harboring a diversity of plastids acquired from a wide range of eukaryotic algae. They are therefore ideal for studying evolutionary processes of plastids gained through secondary and tertiary endosymbioses. In the light of these processes, we have evaluated the origin of 2 types of dinoflagellate plastids, containing the peridinin or 19'-hexanoyloxyfucoxanthin (19'-HNOF) pigments, by inferring the phylogeny using "covarion" evolutionary models allowing the pattern of among-site rate variation to change over time. Our investigations of genes from secondary and tertiary plastids derived from the rhodophyte plastid lineage clearly reveal "heterotachy" processes characterized as stationary covarion substitution patterns and changes in proportion of variable sites across sequences. Failure to accommodate covarion-like substitution patterns can have strong effects on the plastid tree topology. Importantly, multigene analyses performed with probabilistic methods using among-site rate and covarion models of evolution conflict with proposed single origin of the peridinin- and 19'-HNOF-containing plastids, suggesting that analysis of secondhand plastids can be hampered by convergence in the evolutionary signature of the plastid DNA sequences. Another type of sequence convergence was detected at protein level involving the psaA gene. Excluding the psaA sequence from a concatenated protein alignment grouped the peridinin plastid with haptophytes, congruent with all DNA trees. 
Altogether, taking account of complex processes involved in the evolution of dinoflagellate plastid sequences (both at the DNA and amino acid level), we demonstrate the difficulty of excluding independent, tertiary origin for both the peridinin and 19'-HNOF plastids involving engulfment of haptophyte-like algae. In addition, the refined topologies suggest the red algal order, Porphyridales, as the endosymbiont ancestor of the secondary plastids in cryptophytes, haptophytes, and heterokonts.

11.
Despite the advances in understanding molecular evolution, current phylogenetic methods barely take account of a fraction of the complexity of evolution. We are chiefly constrained by our incomplete knowledge of molecular evolutionary processes and the limits of computational power. These limitations lead to the establishment of either biologically simplistic models that rarely account for a fraction of the complexity involved or overfitting models that add little resolution to the problem. Such oversimplified models may lead us to assign high confidence to an incorrect tree (inconsistency). Rate-across-site (RAS) models are commonly used evolutionary models in phylogenetic studies. These account for heterogeneity in the evolutionary rates among sites but do not account for changing within-site rates across lineages (heterotachy). If heterotachy is common, using RAS models may lead to systematic errors in tree inference. In this work we show possible misleading effects in tree inference when the assumption of constant within-site rates across lineages is violated using maximum likelihood. Using a simulation study, we explore the ways in which gamma stationary models can lead to wrong topology or to deceptive bootstrap support values when the within-site rates change across lineages. More precisely, we show that different degrees of heterotachy mislead phylogenetic inference when the model assumed is stationary. Finally, we propose a geometry-based approach to visualize and to test for the possible existence of bias due to heterotachy.

12.
The covarion (COV)-like properties of sequences are poorly described and their impact on phylogenetic analyses poorly understood. We demonstrate using simulations that, under an evolutionary model where the proportion of variable sites changes in nonadjacent lineages, log likelihood values for rates across site (RAS) and COV models become similar, making models difficult to distinguish. Further, although COV and RAS models provide a great improvement in likelihood scores over a homogeneous model with these simulated data, reconstruction accuracy of tree building is low, suggesting caution when it is suspected that proportions of variable sites differ in different evolutionary lineages. We study the performance of a recently developed contingency test that detects the presence of COV-type evolution modified for protein data. We report that if proportions of variable sites (p(var)) change in a lineage-specific manner such that their distributions in different lineages become sufficiently nonoverlapping, then the contingency test can incorrectly suggest a homogeneous model. Also of concern is the possibility of different proportions of variable sites between the groups being studied. In a study of chloroplast proteins, interpretation of the test is found to be susceptible to different partitioning of taxon groups, making the test very subjective in its implementation. Extreme intergroup differences in the extent of divergence and difference in proportions of variable sites could be contributing to this effect.

13.

Background  

The covarion hypothesis of molecular evolution holds that selective pressures on a given amino acid or nucleotide site are dependent on the identity of other sites in the molecule that change throughout time, resulting in changes of evolutionary rates of sites along the branches of a phylogenetic tree. At the sequence level, covarion-like evolution at a site manifests as conservation of nucleotide or amino acid states among some homologs where the states are not conserved in other homologs (or groups of homologs). Covarion-like evolution has been shown to relate to changes in functions at sites in different clades, and, if ignored, can adversely affect the accuracy of phylogenetic inference.

14.
Variation in rates of molecular evolution (heterotachy) is a common phenomenon among plants. Although multiple theoretical models have been proposed, fundamental questions remain regarding the combined effects of ecological and morphological traits on rate heterogeneity. Here, we used tree ferns to explore the correlation between rates of molecular evolution in chloroplast DNA sequences and several morphological and environmental factors within a Bayesian framework. We revealed direct and indirect effects of body size, biological productivity, and temperature on substitution rates, where smaller tree ferns living in warmer and less productive environments tend to have faster rates of molecular evolution. In addition, we found that variation in the ratio of nonsynonymous to synonymous substitution rates (dN/dS) in the chloroplast rbcL gene was significantly correlated with ecological and morphological variables. Heterotachy in tree ferns may be influenced by effective population size associated with variation in body size and productivity. Macroevolutionary hypotheses should go beyond explaining heterotachy in terms of mutation rates and instead should integrate population-level factors to better understand the processes affecting the tempo of evolution at the molecular level.

15.
Although molecular-based phylogenetic studies of hosts and parasites are increasingly common in the literature, no study to date has examined two congeneric lineages of parasites that live in sympatry on the same lineage of hosts. This study examines phylogenetic relationships among chewing lice (Phthiraptera: Trichodectidae) of the Geomydoecus coronadoi and Geomydoecus mexicanus species complexes and compares these to phylogenetic patterns in their hosts (pocket gophers of the rodent family Geomyidae). Sympatry of congeneric lice provides a natural experiment to test the hypothesis that closely related lineages of parasites will respond similarly to the same host. Sequence data from the mitochondrial COI and the nuclear EF-1alpha genes confirm that the two louse complexes are reciprocally monophyletic and that individual clades within each species complex parasitize a different species of pocket gopher. Phylogenetic comparisons reveal that both louse complexes show a significant pattern of cophylogeny with their hosts. Comparisons of rates of nucleotide substitution at 4-fold degenerate sites in the COI gene indicate that both groups of lice have significantly higher basal mutation rates than their hosts. The two groups of lice have similar basal rates of mutation, but lice of the G. coronadoi complex show significantly elevated rates of nucleotide substitution at all sites. These rate differences are hypothesized to result from population-level phenomena, such as effective population size, founder effects, and drift, that influence rates of nucleotide substitution.

16.
We used Bayesian phylogenetic analysis of 5 kb of chloroplast DNA data from 68 Sapotaceae species to clarify phylogenetic relationships within Sapotoideae, one of the two major clades within Sapotaceae. Variation in substitution rates through time was shown to be a very important aspect of molecular evolution for this data set. Relative rates tests indicated that changes in overall rate have taken place in several lineages during the history of the group and Bayes factors strongly supported a covarion model, which allows the rate of a site to vary over time, over commonly used models that only allow rates to vary across sites. Rate variation over time was actually found to be a more important model component than rate variation across sites. The covarion model was originally developed for coding gene sequences and has so far only been tested for this type of data. The fact that it performed so well with the present data set, consisting mainly of data from noncoding spacer regions, suggests that it deserves a wider consideration in model based phylogenetic inference. Repeatability of phylogenetic results was very difficult to obtain with the more parameter rich models, and analyses with identical settings often supported different topologies. Overparameterization may be the reason why the MCMC did not sample from the posterior distribution in these cases. The problem could, however, be overcome by using less parameter rich evolutionary models, and adjusting the MCMC settings. The phylogenetic results showed that two taxa, previously thought to belong in Sapotoideae, are not part of this group. Eberhardtia aurata is the sister of the two major Sapotaceae clades, Chrysophylloideae and Sapotoideae, and Neohemsleya usambarensis belongs in Chrysophylloideae. Within Sapotoideae two clades, Sideroxyleae and Sapoteae, were strongly supported. 
Bayesian analysis of the character history of some floral morphological traits showed that the ancestral type of flower in Sapotoideae may have been characterized by floral parts (sepals, petals, stamens, and staminodes) in single whorls of five, entire corolla lobes, and seeds with an adaxial hilum.

17.
Heterotachy occurs when the relative evolutionary rates among sites are not the same across lineages. Sequence alignments are likely to exhibit heterotachy with varying severity because the intensity of purifying selection and adaptive forces at a given amino acid or DNA sequence position is unlikely to be the same in different species. In a recent study, the influence of heterotachy on the performance of different phylogenetic methods was examined using computer simulation for a four-species phylogeny. Maximum parsimony (MP) was reported to generally outperform maximum likelihood (ML). However, our comparisons of MP and ML methods using the methods and evaluation criteria employed in that study, but considering the possible range of proportions of sites involved in heterotachy, contradict their findings and indicate that, in fact, ML is significantly superior to MP even under heterotachy.

18.
Testing the covarion hypothesis of molecular evolution   Total citations: 14, self-citations: 8, citations by others: 6
The covarion hypothesis of molecular evolution states that the fixation of mutations may alter the probability that any given position will fix the next change. Tests of this hypothesis using the divergence of real sequences are compromised because models of rate variation among sites (e.g., the gamma version of the one-parameter equation) predict sequence divergence values similar to those for the covarion process. This study therefore focuses on the extent to which the varied and unvaried codons of two well-diverged taxa are the same, because fewer are expected by the covarion hypothesis than by the gamma model. The data for these tests are the protein sequences of Cu, Zn superoxide dismutase (SOD) for mammals and plants. Simulation analyses show that the covarion hypothesis makes better predictions about the frequencies of varied and unhit positions in common between these two taxa than does the gamma version of the one-parameter model. Furthermore, the analysis of SOD tertiary structure demonstrates that mammal and plant variabilities are distributed differently on the protein. These results support the conclusions that the variable and invariable codons of mammal and plant SODs are different and that the covarion model explains the evolution of this protein better than the gamma version of the one-parameter process. Unlike other models, the covarion hypothesis accounts for rate fluctuations among positions over time, which is an important parameter of molecular evolution.

19.
The strength and direction of selection on the identity of an amino acid residue in a protein is typically measured by the ratio of the rate of non-synonymous substitutions to the rate of synonymous substitutions. In attempting to predict positively selected sites from amino acid alignments, we made the unexpected observation that the site likelihood of an alignment column for a given tree tends to be negatively correlated with the posterior probability that the site is in the positive selection class under widely used codon models. This is likely because positively selected sites tend to be more variable and display more “radical” amino acid changes; both of these features are expected to result in low site log-likelihoods. We explored the efficacy of using the site log-likelihood (SLL) score as a predictor for positive selection. Through simulation we show that an SLL-based test has a low false positive rate and power comparable to that of the codon models. In one case where the simulated data violated the assumption that synonymous substitution rates were constant across the sites, the codon models were not able to detect positive selection in the data, whereas the SLL test was. We applied the new method to ten empirical datasets and found that it made predictions similar to those of the codon models in eight of them. For the tax gene dataset the SLL test seemed to produce more reasonable results. The SLL methods are a valuable complement to codon models, especially for some cases where the assumptions of codon models are likely violated.

20.
Testing a covariotide model of DNA substitution   Total citations: 10, self-citations: 0, citations by others: 10
The concomitantly variable codons hypothesis of DNA substitution argues that at any time only a fraction of the codons in a gene are capable of accepting a mutation. However, as mutations are fixed at some positions in a gene, the sites that are potentially variable also change because of changed functional constraints. This hypothesis has been termed the "covarion" hypothesis or, when the model is applied to nucleotides, the "covariotide" hypothesis. The covarion-covariotide model has proven to be remarkably difficult to test. Here I examine a covariotide hypothesis for 11 genes using a likelihood ratio test. I show that in nine of the genes examined a covariotide model provides a better explanation of the data than a model that does not allow constraints to change over time.
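For readers unfamiliar with the likelihood ratio test mentioned in this abstract, the sketch below shows the generic calculation for two nested models. It is an illustration under stated assumptions, not the author's procedure: the abstract gives neither the degrees of freedom nor the null distribution, so the chi-square reference distribution, the choice of two extra parameters in the example, and all function names are assumptions. A chi-square approximation may also be inaccurate when the simpler model lies on a boundary of the parameter space. To stay within the standard library, the helper handles even degrees of freedom only.

```python
import math


def chi2_sf_even_df(x, df):
    """Upper-tail probability of a chi-square variable (even df only)."""
    if df <= 0 or df % 2 != 0:
        raise ValueError("this sketch handles positive even df only")
    half = x / 2.0
    term, total = 1.0, 1.0
    # Closed form for even df: exp(-x/2) * sum_{j < df/2} (x/2)^j / j!
    for j in range(1, df // 2):
        term *= half / j
        total += term
    return math.exp(-half) * total


def likelihood_ratio_test(lnl_null, lnl_alt, df):
    """Return (statistic, approximate p-value) for two nested models."""
    stat = 2.0 * (lnl_alt - lnl_null)
    if stat <= 0.0:
        # The richer model did not fit better; no evidence against the null.
        return stat, 1.0
    return stat, chi2_sf_even_df(stat, df)


# Made-up log-likelihoods; df=2 assumes two extra switching parameters.
stat, p_value = likelihood_ratio_test(lnl_null=-1000.0, lnl_alt=-995.0, df=2)
```

With these invented numbers the statistic is 10 and the approximate p-value is exp(-5), so the richer model would be preferred at conventional significance levels.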


Copyright © 北京勤云科技发展有限公司 (Beijing Qinyun Technology Development Co., Ltd.)  ICP registration no. 京ICP备09084417号