Similar articles
 Found 20 similar articles (search time: 31 ms)
1.
The study of rates of nucleotide substitution in RNA viruses is central to our understanding of their evolution. Herein we report a comprehensive analysis of substitution rates in 50 RNA viruses using a recently developed maximum likelihood phylogenetic method. This analysis revealed a significant relationship between genetic divergence and isolation time for an extensive array of RNA viruses, although more rate variation was usually present among lineages than would be expected under the constraints of a molecular clock. Despite the lack of a molecular clock, the range of statistically significant variation in overall substitution rates was surprisingly narrow for those viruses where a significant relationship between genetic divergence and time was found, as was the case when synonymous sites were considered alone, where the molecular clock was rejected less frequently. An analysis of the ecological and genetic factors that might explain this rate variation revealed some evidence of significantly lower substitution rates in vector-borne viruses, as well as a weak correlation between rate and genome length. Finally, a simulation study revealed that our maximum likelihood estimates of substitution rates are valid, even if the molecular clock is rejected, provided that sufficiently large data sets are analyzed. Received: 23 February 2001 / Accepted: 3 July 2001
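As a rough illustration of the relationship between genetic divergence and isolation time mentioned in this abstract, the sketch below fits an ordinary least-squares line to divergence against sampling year and reads the slope as a substitution rate. This is a deliberately simplified stand-in, not the maximum likelihood method used in the study; the function name `estimate_rate` and the sample values are hypothetical.

```python
def estimate_rate(times, divergences):
    """Return (slope, intercept) of a least-squares fit of divergence on time.

    times       -- isolation dates (e.g. years), one per sequence
    divergences -- genetic divergence from a reference (substitutions per site)
    The slope is a crude rate estimate in substitutions per site per time unit.
    """
    n = len(times)
    if n < 2 or n != len(divergences):
        raise ValueError("need at least two paired observations")
    mean_t = sum(times) / n
    mean_d = sum(divergences) / n
    # Sum of squared deviations of time, and cross-deviations with divergence.
    sxx = sum((t - mean_t) ** 2 for t in times)
    if sxx == 0:
        raise ValueError("all isolation times are identical")
    sxy = sum((t - mean_t) * (d - mean_d) for t, d in zip(times, divergences))
    slope = sxy / sxx
    intercept = mean_d - slope * mean_t
    return slope, intercept


# Hypothetical data: four isolates sampled five years apart.
years = [1990, 1995, 2000, 2005]
divergence = [0.010, 0.015, 0.020, 0.025]
rate, intercept = estimate_rate(years, divergence)
print(f"estimated rate: {rate:.4g} substitutions/site/year")
```

A simple regression like this treats the sequences as independent observations, which sequences sharing ancestry are not; a phylogenetic method such as the one reported in the abstract is meant to account for that.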

2.
To test hypotheses on the differences in retroviral genetic diversity, we compared the evolutionary dynamics of the human immunodeficiency virus type 1 (HIV-1) group M and the primate T-cell lymphotropic virus (PTLV) using a full-genome analysis. Evolutionary rates and nonsynonymous/synonymous substitution rate ratios were estimated across the genome using a maximum likelihood sliding window approach, and molecular clock properties were investigated. We confirm a remarkable difference in genetic stability and selective pressure at the interhost level. While there is evidence for adaptive evolution in HIV-1, the evolution of PTLV is almost exclusively characterized by negative selection or nearly neutral processes. For both retroviruses, evolutionary rate estimates across the genome reflect the differential selective constraints. However, based on the relationship between evolutionary rate and selective pressure and based on the comparison of synonymous substitution rates, the differences in rate between HIV-1 and PTLV cannot be explained by selective forces only. Several evolutionary and statistical assumptions, examined using a Bayesian coalescent method, were shown to have little influence on our inference.

3.
The high rates of RNA virus evolution are generally attributed to replication with error-prone RNA-dependent RNA polymerases. However, these long-term nucleotide substitution rates span three orders of magnitude and do not correlate well with mutation rates or selection pressures. This substitution rate variation may be explained by differences in virus ecology or intrinsic genomic properties. We generated nucleotide substitution rate estimates for mammalian RNA viruses and compiled comparable published rates, yielding a dataset of 118 substitution rates of structural genes from 51 different species, as well as 40 rates of non-structural genes from 28 species. Through ANCOVA analyses, we evaluated the relationships between these rates and four ecological factors: target cell, transmission route, host range, infection duration; and three genomic properties: genome length, genome sense, genome segmentation. Of these seven factors, we found target cells to be the only significant predictors of viral substitution rates, with tropisms for epithelial cells or neurons (P<0.0001) as the most significant predictors. Further, one-tailed t-tests showed that viruses primarily infecting epithelial cells evolve significantly faster than neurotropic viruses (P<0.0001 and P<0.001 for the structural genes and non-structural genes, respectively). These results provide strong evidence that the fastest evolving mammalian RNA viruses infect cells with the highest turnover rates: the highly proliferative epithelial cells. Estimated viral generation times suggest that epithelial-infecting viruses replicate more quickly than viruses with different cell tropisms. Our results indicate that cell tropism is a key factor in viral evolvability.

4.
Phylogenetic analysis using parsimony and likelihood methods (cited by 1; self-citations 0; citations by others 1)
The assumptions underlying the maximum-parsimony (MP) method of phylogenetic tree reconstruction were intuitively examined by studying the way the method works. Computer simulations were performed to corroborate the intuitive examination. Parsimony appears to involve very stringent assumptions concerning the process of sequence evolution, such as constancy of substitution rates between nucleotides, constancy of rates across nucleotide sites, and equal branch lengths in the tree. For practical data analysis, the requirement of equal branch lengths means similar substitution rates among lineages (the existence of an approximate molecular clock), relatively long interior branches, and also few species in the data. However, a small amount of evolution is neither a necessary nor a sufficient requirement of the method. The difficulties involved in the application of current statistical estimation theory to tree reconstruction were discussed, and it was suggested that the approach proposed by Felsenstein (1981, J. Mol. Evol. 17: 368–376) for topology estimation, as well as its many variations and extensions, differs fundamentally from the maximum likelihood estimation of a conventional statistical parameter. Evidence was presented showing that the Felsenstein approach does not share the asymptotic efficiency of the maximum likelihood estimator of a statistical parameter. Computer simulations were performed to study the probability that MP recovers the true tree under a hierarchy of models of nucleotide substitution; its performance relative to the likelihood method was especially noted. The results appeared to support the intuitive examination of the assumptions underlying MP. When a simple model of nucleotide substitution was assumed to generate data, the probability that MP recovers the true topology could be as high as, or even higher than, that for the likelihood method. When the assumed model became more complex and realistic, e.g., when substitution rates were allowed to differ between nucleotides or across sites, the probability that MP recovers the true topology, and especially its performance relative to that of the likelihood method, generally deteriorated. As the complexity of the process of nucleotide substitution in real sequences is well recognized, the likelihood method appears preferable to parsimony. However, the development of a statistical methodology for the efficient estimation of the tree topology remains a difficult open problem.

5.
The hepatitis B virus (HBV) has a circular DNA genome of about 3,200 base pairs. Economical use of the genome with overlapping reading frames may have led to severe constraints on nucleotide substitutions along the genome and to highly variable rates of substitution among nucleotide sites. Nucleotide sequences from 13 complete HBV genomes were compared to examine such variability of substitution rates among sites and to examine the phylogenetic relationships among the HBV variants. The maximum likelihood method was employed to fit models of DNA sequence evolution that can account for the complexity of the pattern of nucleotide substitution. Comparison of the models suggests that the rates of substitution are different in different genes and codon positions; for example, the third codon position changes at a rate over ten times higher than the second position. Furthermore, substantial variation of substitution rates was detected even after the effects of genes and codon positions were corrected; that is, rates are different at different sites of the same gene or at the same codon position. Such rates after the correction were also found to be positively correlated at adjacent sites, which indicated the existence of conserved and variable domains in the proteins encoded by the viral genome. A multiparameter model validates the earlier finding that the variation in nucleotide conservation is not random around the HBV genome. The test for the existence of a molecular clock suggests that substitution rates are more or less constant among lineages. The phylogenetic relationships among the viral variants were examined. Although the data do not seem to contain sufficient information to resolve the details of the phylogeny, it appears quite certain that the serotypes of the viral variants do not reflect their genetic relatedness. Correspondence to: Z. Yang

6.
Models of nucleotide substitution were constructed for combined analyses of heterogeneous sequence data (such as those of multiple genes) from the same set of species. The models account for different aspects of the heterogeneity in the evolutionary process of different genes, such as differences in nucleotide frequencies, in substitution rate bias (for example, the transition/transversion rate bias), and in the extent of rate variation across sites. Model parameters were estimated by maximum likelihood and the likelihood ratio test was used to test hypotheses concerning sequence evolution, such as rate constancy among lineages (the assumption of a molecular clock) and proportionality of branch lengths for different genes. The example data from a segment of the mitochondrial genome of six hominoid species (human, common and pygmy chimpanzees, gorilla, orangutan, and siamang) were analyzed. Nucleotides at the three codon positions in the protein-coding regions and from the tRNA-coding regions were considered heterogeneous data sets. Statistical tests showed that the amount of evolution in the sequence data reflected in the estimated branch lengths can be explained by the codon-position effect and lineage effect of substitution rates. The assumption of a molecular clock could not be rejected when the data were analyzed separately or when the rate variation among sites was ignored. However, significant differences in substitution rate among lineages were found when the data sets were combined and when the rate variation among sites was accounted for in the models. Under the assumption that the orangutan and African apes diverged 13 million years ago, the combined analysis of the sequence data estimated the times for the human-chimpanzee separation and for the separation of the gorilla as 4.3 and 6.8 million years ago, respectively.
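The likelihood ratio test of the molecular clock referred to in this abstract can be sketched as follows. The sketch assumes the textbook setup for a single data set: a clock-constrained rooted tree for n taxa has n - 1 free node ages, an unconstrained unrooted tree has 2n - 3 branch lengths, so the test statistic 2(lnL_free - lnL_clock) is compared with a chi-square distribution with n - 2 degrees of freedom. The log-likelihood values and the function names are hypothetical, and the combined multi-gene models in the paper may imply different degrees of freedom.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability of a chi-square variable with integer df >= 1.

    Uses the recurrence Q(a + 1, y) = Q(a, y) + y**a * exp(-y) / Gamma(a + 1)
    for the regularized upper incomplete gamma function, with y = x / 2.
    """
    if df < 1:
        raise ValueError("df must be a positive integer")
    if x <= 0:
        return 1.0
    y = x / 2.0
    if df % 2 == 0:
        a, q = 1.0, math.exp(-y)             # Q(1, y)
    else:
        a, q = 0.5, math.erfc(math.sqrt(y))  # Q(1/2, y)
    while a < df / 2.0:
        q += math.exp(a * math.log(y) - y - math.lgamma(a + 1.0))
        a += 1.0
    return q


def clock_lrt(lnl_clock, lnl_free, n_taxa):
    """Return (statistic, df, p_value) for the clock vs. no-clock comparison."""
    stat = max(0.0, 2.0 * (lnl_free - lnl_clock))
    df = n_taxa - 2
    return stat, df, chi2_sf(stat, df)


# Hypothetical log-likelihoods for a six-species data set.
stat, df, p = clock_lrt(lnl_clock=-1000.0, lnl_free=-995.0, n_taxa=6)
print(f"2*delta lnL = {stat:.2f}, df = {df}, p = {p:.4f}")
```

A small p-value would lead to rejecting rate constancy among lineages; as the abstract notes, the outcome can depend on whether rate variation among sites is modeled.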

7.
Selecting the best-fit model of nucleotide substitution (cited by 2; self-citations 0; citations by others 2)
Despite the relevant role of models of nucleotide substitution in phylogenetics, choosing among different models remains a problem. Several statistical methods for selecting the model that best fits the data at hand have been proposed, but their absolute and relative performance has not yet been characterized. In this study, we compare under various conditions the performance of different hierarchical and dynamic likelihood ratio tests, and of Akaike and Bayesian information methods, for selecting best-fit models of nucleotide substitution. We specifically examine the role of the topology used to estimate the likelihood of the different models and the importance of the order in which hypotheses are tested. We do this by simulating DNA sequences under a known model of nucleotide substitution and recording how often this true model is recovered by the different methods. Our results suggest that model selection is reasonably accurate and indicate that some likelihood ratio test methods perform overall better than the Akaike or Bayesian information criteria. The tree used to estimate the likelihood scores does not influence model selection unless it is a randomly chosen tree. The order in which hypotheses are tested, and the complexity of the initial model in the sequence of tests, influence model selection in some cases. Model fitting in phylogenetics has been suggested for many years, yet many authors still arbitrarily choose their models, often using the default models implemented in standard computer programs for phylogenetic estimation. We show here that a best-fit model can be readily identified. Consequently, given the relevance of models, model fitting should be routine in any phylogenetic analysis that uses models of evolution.
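The two information criteria compared in this abstract have simple closed forms: AIC = 2k - 2 lnL and BIC = k ln(n) - 2 lnL, where k is the number of free parameters and n the sample size. The sketch below ranks candidate models by either criterion. It assumes that alignment length is used as n and that only substitution-model parameters are counted in k (branch lengths being common to all models on a fixed tree); the log-likelihood values and the helper name `best_model` are hypothetical.

```python
import math


def aic(lnl, k):
    """Akaike information criterion; lower is better."""
    return 2.0 * k - 2.0 * lnl


def bic(lnl, k, n):
    """Bayesian information criterion; lower is better."""
    return k * math.log(n) - 2.0 * lnl


def best_model(fits, n_sites, criterion="aic"):
    """Return the name of the model with the lowest score.

    fits      -- dict mapping model name to (log-likelihood, free parameters)
    n_sites   -- alignment length, used as the sample size for BIC (assumption)
    criterion -- "aic" or "bic"
    """
    if criterion not in ("aic", "bic"):
        raise ValueError("criterion must be 'aic' or 'bic'")

    def score(item):
        lnl, k = item[1]
        return aic(lnl, k) if criterion == "aic" else bic(lnl, k, n_sites)

    return min(fits.items(), key=score)[0]


# Hypothetical fits on a fixed tree; k counts substitution-model parameters only.
fits = {
    "JC69": (-2100.0, 0),
    "HKY85": (-2050.0, 4),
    "GTR": (-2048.0, 8),
}
print(best_model(fits, n_sites=1000, criterion="aic"))
print(best_model(fits, n_sites=1000, criterion="bic"))
```

Because BIC penalizes each extra parameter by ln(n) rather than 2, it tends to favor simpler models than AIC once n exceeds about 7 sites.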

8.
Despite their close phylogenetic relationship, type A and B influenza viruses exhibit major epidemiological differences in humans, with the latter both less common and less often associated with severe disease. However, it is unclear what processes determine the evolutionary dynamics of influenza B virus, and how influenza viruses A and B interact at the evolutionary scale. To address these questions we inferred the phylogenetic history of human influenza B virus using complete genome sequences for which the date (day) of isolation was available. By comparing the phylogenetic patterns of all eight viral segments we determined the occurrence of segment reassortment over a 30-year sampling period. An analysis of rates of nucleotide substitution and selection pressures revealed sporadic occurrences of adaptive evolution, most notably in the viral hemagglutinin and compatible with the action of antigenic drift, yet lower rates of overall and nonsynonymous nucleotide substitution compared to influenza A virus. Overall, these results led us to propose a model in which evolutionary changes within and between the antigenically distinct 'Yam88' and 'Vic87' lineages of influenza B virus are the result of changes in herd immunity, with reassortment continuously generating novel genetic variation. Additionally, we suggest that the interaction with influenza A virus may be central in shaping the evolutionary dynamics of influenza B virus, facilitating the shift of dominance between the Vic87 and the Yam88 lineages.

9.
The evolutionary patterns of hepatitis C virus (HCV), including the best-fitting nucleotide substitution model and the molecular clock hypothesis, were investigated by analyzing full-genome sequences available in the HCV database. The likelihood ratio test allowed us to discriminate among different evolutionary hypotheses. The phylogeny of the six major HCV types was accurately inferred, and the final tree was rooted by reconstructing the hypothetical HCV common ancestor with the maximum likelihood method. The presence of phylogenetic noise and the relative nucleotide substitution rates in the different HCV genes were also examined. These results offer a general guideline for the future of HCV phylogenetic analysis and also provide important insights on HCV origin and evolution. Received: 13 January 2001 / Accepted: 21 June 2001

10.
A Space-Time Process Model for the Evolution of DNA Sequences (cited by 20; self-citations 3; citations by others 17)
Z. Yang. Genetics, 1995, 139(2): 993-1005
We describe a model for the evolution of DNA sequences by nucleotide substitution, whereby nucleotide sites in the sequence evolve over time, whereas the rates of substitution are variable and correlated over sites. The temporal process used to describe substitutions between nucleotides is a continuous-time Markov process, with the four nucleotides as the states. The spatial process used to describe variation and dependence of substitution rates over sites is based on a serially correlated gamma distribution, i.e., an auto-gamma model assuming Markov-dependence of rates at adjacent sites. To achieve computational efficiency, we use several equal-probability categories to approximate the gamma distribution, and the result is an auto-discrete-gamma model for rates over sites. Correlation of rates at sites then is modeled by the Markov chain transition of rates at adjacent sites from one rate category to another, the states of the chain being the rate categories. Two versions of nonparametric models, which place no restrictions on the distributional forms of rates for sites, also are considered, assuming either independence or Markov dependence. The models are applied to data of a segment of mitochondrial genome from nine primate species. Model parameters are estimated by the maximum likelihood method, and models are compared by the likelihood ratio test. Tremendous variation of rates among sites in the sequence is revealed by the analyses, and when rate differences for different codon positions are appropriately accounted for in the models, substitution rates at adjacent sites are found to be strongly (positively) correlated. Robustness of the results to uncertainty of the phylogenetic tree linking the species is examined.
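To make the two ingredients of this abstract concrete, a continuous-time Markov process over the four nucleotides and a small number of equal-probability rate categories, the sketch below uses the simplest possible case. It assumes the Jukes-Cantor (JC69) substitution model and independent rates across sites, so it does not reproduce the Markov dependence between adjacent sites that the auto-discrete-gamma model adds. The category rates are hypothetical values chosen to average 1, not quantities derived from a fitted gamma distribution.

```python
import math


def jc69_prob(same, distance):
    """Transition probability under JC69 over a branch of the given length.

    same     -- True for P(i -> i), False for P(i -> j) with j != i
    distance -- branch length in expected substitutions per site
    """
    decay = math.exp(-4.0 * distance / 3.0)
    return 0.25 + 0.75 * decay if same else 0.25 - 0.25 * decay


def prob_over_categories(same, distance, rates):
    """Average the transition probability over equal-probability rate categories.

    Each category scales the branch length by its relative rate; giving every
    category weight 1/K mirrors the equal-probability discretization.
    """
    return sum(jc69_prob(same, distance * r) for r in rates) / len(rates)


# Hypothetical relative rates for four categories (mean 1.0).
rates = [0.1, 0.5, 1.0, 2.4]
d = 0.5
print("single rate :", jc69_prob(True, d))
print("four classes:", prob_over_categories(True, d, rates))
```

With rate variation, the probability that a site is unchanged is higher than under a single rate with the same mean, because slowly evolving categories contribute disproportionately; this is one reason ignoring among-site variation can bias distance and branch-length estimates.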

11.
Exceptional Convergent Evolution in a Virus (cited by 16; self-citations 5; citations by others 16)
Replicate lineages of the bacteriophage φX174 adapted to growth at high temperature on either of two hosts exhibited high rates of identical, independent substitutions. Typically, a dozen or more substitutions accumulated in the 5.4-kilobase genome during propagation. Across the entire data set of nine lineages, 119 independent substitutions occurred at 68 nucleotide sites. Over half of these substitutions, accounting for one third of the sites, were identical with substitutions in other lineages. Some convergent substitutions were specific to the host used for phage propagation, but others occurred across both hosts. Continued adaptation of an evolved phage at high temperature, but on the other host, led to additional changes that included reversions of previous substitutions. Phylogenetic reconstruction using the complete genome sequence not only failed to recover the correct evolutionary history because of these convergent changes, but the true history was rejected as being a significantly inferior fit to the data. Replicate lineages subjected to similar environmental challenges showed similar rates of substitution and similar rates of fitness improvement across corresponding times of adaptation. Substitution rates and fitness improvements were higher during the initial period of adaptation than during a later period, except when the host was changed.

12.
A maximum likelihood method for independently estimating the relative rate of substitution at different nucleotide sites is presented. With this method, the evolution of DNA sequences can be analyzed without assuming a specific distribution of rates among sites. To investigate the pattern of correlation of rates among sites, the method was applied to a data set consisting of the protein-coding regions of the mitochondrial genome from 10 vertebrate species. Rates appear to be strongly correlated at distances up to 40 codons apart. Furthermore, there appears to be some higher order correlation of sites approximately 75 codons apart. The method of site-by-site estimation of the rate of substitution may also be applied to examine other aspects of rate variation along a DNA sequence and to assess the difference in the support of a tree along the sequence.

13.
Viral evolution and the emergence of SARS coronavirus (cited by 8; self-citations 0; citations by others 8)
The recent appearance of severe acute respiratory syndrome coronavirus (SARS-CoV) highlights the continual threat to human health posed by emerging viruses. However, the central processes in the evolution of emerging viruses are unclear, particularly the selection pressures faced by viruses in new host species. We outline some of the key evolutionary genetic aspects of viral emergence. We emphasize that, although the high mutation rates of RNA viruses provide them with great adaptability and explain why they are the main cause of emerging diseases, their limited genome size means that they are also subject to major evolutionary constraints. Understanding the mechanistic basis of these constraints, particularly the roles played by epistasis and pleiotropy, is likely to be central in explaining why some RNA viruses are more able than others to cross species boundaries. Viral genetic factors have also been implicated in the emergence of SARS-CoV, with the suggestion that this virus is a recombinant between mammalian and avian coronaviruses. We show, however, that the phylogenetic patterns cited as evidence for recombination are more probably caused by a variation in substitution rate among lineages and that recombination is unlikely to explain the appearance of SARS in humans.

14.
Steel demonstrated that the maximum-likelihood function for a phylogenetic tree may have multiple local maxima. If this phenomenon were general, it would compromise the applicability of maximum likelihood as an optimality criterion for phylogenetic trees. In several simulation studies reported on in this paper, the true tree, and other trees of very high likelihood, rarely had multiple maxima. Our results thus provide reassurance that the value of maximum likelihood as a tree selection criterion is not compromised by the presence of multiple local maxima; the best estimates of the true tree are not likely to have them. This result holds true even when an incorrect nucleotide substitution model is used for tree selection.

15.
Nonhomogeneous Markov models of nucleotide substitution have received scant attention. Here we explore the possibility of using nonhomogeneous models to identify host shift nodes along phylogenetic trees of pathogens evolving in different hosts. It has been noticed that influenza viruses show marked differences in nucleotide composition in human and avian hosts. We take advantage of this fact to identify the host shift event that led to the 1918 ‘Spanish’ influenza. This disease killed over 50 million people worldwide, ranking it as the deadliest pandemic in recorded history. Our model suggests that the eight RNA segments which eventually became the 1918 viral genome were introduced into a mammalian host around 1882–1913. The viruses later diverged into the classical swine and human H1N1 influenza lineages around 1913–1915. The last common ancestor of human strains dates from February 1917 to April 1918. Because pigs are more readily infected with avian influenza viruses than humans, it would seem that they were the original recipient of the virus. This would suggest that the virus was introduced into humans sometime between 1913 and 1918.

16.
Phylogenetic analyses frequently rely on models of sequence evolution that detail nucleotide substitution rates, nucleotide frequencies, and site-to-site rate heterogeneity. These models can influence hypothesis testing and can affect the accuracy of phylogenetic inferences. Maximum likelihood methods of simultaneously constructing phylogenetic tree topologies and estimating model parameters are computationally intensive, and are not feasible for sample sizes of 25 or greater using personal computers. Techniques that initially construct a tree topology and then use this non-maximized topology to estimate ML substitution rates, however, can quickly arrive at a model of sequence evolution. The accuracy of this two-step estimation technique was tested using simulated data sets with known model parameters. The results showed that for a star-like topology, as is often seen in human immunodeficiency virus type 1 (HIV-1) subtype B sequences, a random starting topology could produce nucleotide substitution rates that were not statistically different from the true rates. Samples were isolated from 100 HIV-1 subtype B infected individuals from the United States and a 620 nt region of the env gene was sequenced for each sample. The sequence data were used to obtain a substitution model of sequence evolution specific for HIV-1 subtype B env by estimating nucleotide substitution rates and the site-to-site heterogeneity in 100 individuals from the United States. The method of estimating the model should provide users of large data sets with a way to quickly compute a model of sequence evolution, while the nucleotide substitution model we identified should prove useful in the phylogenetic analysis of HIV-1 subtype B env sequences. Received: 4 October 2000 / Accepted: 1 March 2001

17.
McGeoch DJ, Dolan A, Ralph AC. Journal of Virology, 2000, 74(22): 10401-10406
With the aim of deriving a definitive phylogenetic tree for as many mammalian and avian herpesvirus species as possible, alignments were made of amino acid sequences from eight conserved and ubiquitously present genes of herpesviruses, with 48 virus species each represented by at least one gene. Phylogenetic trees for both single-gene and concatenated alignments were evaluated thoroughly by maximum-likelihood methods, with each of the three herpesvirus subfamilies (the Alpha-, Beta-, and Gammaherpesvirinae) examined independently. Composite trees were constructed starting with the top-scoring tree based on the broadest set of genes and supplemented by addition of virus species from trees based on narrower gene sets, to give finally a 46-species tree; branching order for three regions within the tree remained unresolved. Sublineages of the Alpha- and Betaherpesvirinae showed extensive cospeciation with host lineages by criteria of congruence in branching patterns and consistency in extent of divergence. The Gammaherpesvirinae presented a more complex picture, with both higher and lower substitution rates in different sublineages. The final tree obtained represents the most detailed view to date of phylogenetic relationships in any family of large-genome viruses.

18.
Summary. In the maximum likelihood (ML) method for estimating a molecular phylogenetic tree, the pattern of nucleotide substitutions for computing likelihood values is assumed to be simpler than that of the actual evolutionary process, simply because the process, considered to be quite devious, is unknown. The problem, however, is that there has been no guarantee to endorse the simplification. To study this problem, we first evaluated the robustness of the ML method in the estimation of molecular trees against different nucleotide substitution patterns, including Jukes and Cantor's, the simplest ever proposed. Namely, we conducted computer simulations in which we could set up various evolutionary models of a hypothetical gene, and define a true tree to which an estimated tree by the ML method was to be compared. The results show that topology estimation by the ML method is considerably robust against different ratios of transitions to transversions and different GC contents, but branch length estimation is not so. The ML tree estimation based on Jukes and Cantor's model is also revealed to be resistant to GC content, but rather sensitive to the ratio of transitions to transversions. We then applied the ML method with different substitution patterns to nucleotide sequence data on the tax gene from T-cell leukemia viruses whose evolutionary process must have been more complicated than that of the hypothetical gene. The results are in accordance with those from the simulation study, showing that Jukes and Cantor's model is as useful as a more complicated one for making inferences about molecular phylogeny of the viruses.

19.
Using nucleotide sequences from three genomic regions of the human and simian T-cell lymphotropic virus type I (HTLV-I/STLV-I), consisting of 69 sequences from a 140-bp segment of the pol region, 98 sequences from a 503-bp segment of the LTR, and 154 sequences from a 386-bp segment of the env region, we tested two hypotheses concerning the geographic origin and evolution of STLV-I and HTLV-I. First, we tested the assumption of equal rates of evolution along STLV-I and HTLV-I lineages using a likelihood ratio test to ascertain whether current levels of genomic diversity can be used to determine ancestry. We demonstrated that unequal rates of evolution along HTLV-I and STLV-I lineages have occurred throughout evolutionary time, thus calling into question the use of pairwise distances to assign ancestry. Second, we constructed phylogenetic trees using multiple phylogenetic techniques to test for the geographic origin of STLV-I and HTLV-I. Using the principle of likelihood, we chose a statistically justified model of evolution for each data set. We demonstrated the utility of the likelihood ratio test to determine which model of evolution should be chosen for phylogenetic analyses, revealing that using different models of evolution produces conflicting results, and neither the hypothesis of an African origin nor the hypothesis of an Asian origin can be rejected statistically. Our best estimates of phylogenetic relationships, however, support an African origin of PTLV for each gene region.

20.
Mitochondria are the site for the citric acid cycle and oxidative phosphorylation (OXPHOS), the final steps of ATP synthesis via cellular respiration. Each mitochondrion contains its own genome; in vertebrates, this is a small, circular DNA molecule that encodes 13 subunits of the multiprotein OXPHOS electron transport complexes. Vertebrate lineages vary dramatically in metabolic rates; thus, functional constraints on mitochondrial-encoded proteins likely differ, potentially impacting mitochondrial genome evolution. Here, we examine mitochondrial genome evolution in salamanders, which have the lowest metabolic requirements among tetrapods. We show that salamanders experience weaker purifying selection on protein-coding sequences than do frogs, a comparable amphibian clade with higher metabolic rates. In contrast, we find no evidence for weaker selection against mitochondrial genome expansion in salamanders. Together, these results suggest that different aspects of mitochondrial genome evolution (i.e., nucleotide substitution, accumulation of noncoding sequences) are differently affected by metabolic variation across tetrapod lineages.
