A simple method for estimating the transition/transversion ratio was developed. The method can be applied not only to two sequences but also to more than two. The statistical properties of this method and of some other methods were examined by numerical computation and computer simulation. The results showed that, in terms of bias and variance, the new method gives a better estimate of the transition/transversion ratio than the other methods examined. The new method was applied to human and chimpanzee mitochondrial control region sequences. Received: 22 September 1997 / Accepted: 1 November 1997
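
As a point of reference for the quantity being estimated, the sketch below counts the observed transitions and transversions between two aligned sequences. It is only the naive pairwise count, not the estimator described in the abstract; the function name `count_ti_tv` and the choice to skip gaps and ambiguous bases are assumptions made for illustration.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def count_ti_tv(seq_a, seq_b):
    """Count observed transitions and transversions between two aligned sequences.

    Hypothetical helper: sites with gaps or ambiguous bases are skipped.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    transitions = 0
    transversions = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == b or a not in "ACGT" or b not in "ACGT":
            continue  # identical site, gap, or ambiguous base
        # A transition stays within purines (A<->G) or within pyrimidines (C<->T).
        if {a, b} <= PURINES or {a, b} <= PYRIMIDINES:
            transitions += 1
        else:
            transversions += 1
    return transitions, transversions
```

Dividing the first count by the second gives the observed ratio for the pair, which is undefined when no transversions are observed.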

A method is presented for estimating the transition/transversion ratio (TI/TV), based on phylogenetically independent comparisons. TI/TV is a parameter of some models used in phylogeny estimation intended to reflect the fact that nucleotide substitutions are not all equally likely. Previous attempts to estimate TI/TV have commonly faced three problems: (1) few taxa; (2) nonindependence among pairwise comparisons; and (3) multiple hits make the apparent TI/TV between two sequences decrease over time since their divergence, giving a misleading impression of relative substitution probabilities. We have made use of the time dependency, modeling how the observed TI/TV changes over time and extrapolating to estimate the "instantaneous" TI/TV, the relevant parameter for phylogenetic inference. To illustrate our method, TI/TV was estimated for two mammalian mitochondrial genes. For 26 pairs of cytochrome b sequences, the estimate of TI/TV was 5.5; 16 pairs of 12S rRNA yielded an estimate of 9.5. These estimates are higher than those given by the maximum likelihood method and those obtained by averaging all possible pairwise comparisons (with or without a two-parameter correction for multiple substitutions). We discuss strengths, weaknesses, and further uses of our method. Received: 22 August 1995 / Accepted: 26 July 1996
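
The two-parameter correction mentioned above can be sketched as follows, assuming it refers to the standard Kimura two-parameter model. Given the observed proportions of sites showing transitions (P) and transversions (Q), the model estimates the expected numbers of each substitution type per site. This is the pairwise correction used for comparison, not the extrapolation method the abstract proposes; the function name is hypothetical.

```python
import math


def k2p_components(p_ti, q_tv):
    """Kimura two-parameter estimates of transitions and transversions per site.

    p_ti: observed proportion of sites differing by a transition (P)
    q_tv: observed proportion of sites differing by a transversion (Q)
    """
    a = 1.0 - 2.0 * p_ti - q_tv
    b = 1.0 - 2.0 * q_tv
    if a <= 0.0 or b <= 0.0:
        raise ValueError("sequences too divergent for the K2P correction")
    ti_per_site = -0.5 * math.log(a) + 0.25 * math.log(b)
    tv_per_site = -0.5 * math.log(b)
    return ti_per_site, tv_per_site
```

With P = 0.10 and Q = 0.05 the observed ratio is 2.0, while the corrected ratio `ti_per_site / tv_per_site` is larger, which matches the point that multiple hits depress the apparent TI/TV.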

The idea that the pattern of point mutation in Drosophila has remained constant during the evolution of the genus has recently been challenged. A study of nucleotide composition focused on the Drosophila saltans group has revealed unsuspected nucleotide composition differences among lineages. Compositional differences are associated with an accelerated rate of amino acid replacement in functionally less constrained regions. Here we reassess this issue from a different perspective. Adopting a maximum-likelihood estimation approach, we focus on the different predictions that mutation and selection make about the nonsynonymous-to-synonymous rate ratio. We investigate two gene regions, alcohol dehydrogenase (Adh) and xanthine dehydrogenase (Xdh), using a balanced data set that comprises representatives from the melanogaster, obscura, saltans, and willistoni groups. We also consider representatives of the Hawaiian picture-winged group. These Hawaiian species are known to have experienced repeated bottlenecks and are included as a reference for comparison. Our results confirm patterns previously detected. The branch ancestral to the fast-evolving willistoni/saltans lineage, where most of the change in GC content has occurred, exhibits an excess of synonymous substitutions. The shift in mutation bias has affected the extent of the rate variation among sites in Xdh. Received: 4 May 1999 / Accepted: 26 July 1999

The synonymous divergence between Escherichia coli and Salmonella typhimurium is explained in a model where there is a large variation between mutation rates at different nucleotide sites in the genome. The model is based on the experimental observation that spontaneous mutation rates can vary over several orders of magnitude at different sites in a gene. Such site-specific variation must be taken into account when studying synonymous divergence and will result in an apparent saturation below the level expected from an assumption of uniform rates. Recently, it has been suggested that codon preference in enterobacteria has a very large site-specific variation and that the synonymous divergence between different species, e.g., E. coli and Salmonella, is saturated. In the present communication it is shown that when site-specific variation in mutation rates is introduced, there is no need to invoke assumptions of saturation and a large variability in codon preference. The same rate variation will also bring average mutation rates as estimated from synonymous sequence divergence into numerical agreement with experimental values. Received: 10 July 1998 / Accepted: 20 August 1998

Synonymous codon usage in related species may differ as a result of variation in mutation biases, differences in the overall strength and efficiency of selection, and shifts in codon preference, that is, the selective hierarchy of codons within and between amino acids. We have developed a maximum-likelihood method to employ explicit population genetic models to analyze the evolution of parameters determining codon usage. The method is applied to twofold degenerate amino acids in 50 orthologous genes from D. melanogaster and D. virilis. We find that D. virilis has significantly reduced selection on codon usage for all amino acids, but the data are incompatible with a simple model in which there is a single difference in the long-term N_e, or overall strength of selection, between the two species, indicating shifts in codon preference. The strength of selection acting on codon usage in D. melanogaster is estimated to be |N_e s| ≈ 0.4 for most CT-ending twofold degenerate amino acids, but 1.7 times greater for cysteine and 1.4 times greater for AG-ending codons. In D. virilis, the strength of selection acting on codon usage for most amino acids is only half that acting in D. melanogaster but is considerably greater than half for cysteine, perhaps indicating the dual selection pressures of translational efficiency and accuracy. Selection coefficients in orthologues are highly correlated (ρ = 0.46), but a number of genes deviate significantly from this relationship. Received: 20 December 1998 / Accepted: 17 February 1999

We consider a model of the origin of genetic code organization incorporating the biosynthetic relationships between amino acids and their physicochemical properties. We study the behavior of the genetic code in the set of codes subject both to biosynthetic constraints and to the constraint that the biosynthetic classes of amino acids must occupy only their own codon domain, as observed in the genetic code. Therefore, this set contains the smallest number of elements ever analyzed in similar studies. Under these conditions and if, as predicted by physicochemical postulates, the amino acid properties played a fundamental role in genetic code organization, it can be expected that the code must display an extremely high level of optimization. This prediction is not supported by our analysis, which indicates, for instance, a minimization percentage of only 80%. These observations can therefore be more easily explained by the coevolution theory of genetic code origin, which postulates a role that is important but not fundamental for the amino acid properties in the structuring of the code. We have also investigated the shape of the optimization landscape that might have arisen during genetic code origin. Here, too, the results seem to favor the coevolution theory because, for instance, the fact that only a few amino acid exchanges would have been sufficient to transform the genetic code (which is not a local minimum) into a much better optimized code, and that such exchanges did not actually take place, seems to suggest that, for instance, the reduction of translation errors was not the main adaptive theme structuring the genetic code.

With the aim of elucidating evolutionary features of GB virus C/hepatitis G virus (GBV-C/HGV), molecular evolutionary analyses were conducted using the entire coding region of this virus. In particular, the rate of nucleotide substitution for this virus was estimated to be less than 9.0 × 10^-6 per site per year, which was much slower than those for other RNA viruses. The phylogenetic tree reconstructed for GBV-C/HGV, by using GB virus A (GBV-A) as outgroup, indicated that there were three major clusters (the HG, GB, and Asian types) in GBV-C/HGV, and the divergence between the ancestor of GB- and Asian-type strains and that of HG-type strains first took place more than 7000–10,000 years ago. The slow evolutionary rate for GBV-C/HGV suggested that this virus cannot escape from the immune response of the host by means of producing escape mutants, implying that it may have evolved other systems for persistent infection. Received: 2 June 1998 / Accepted: 8 August 1998

Synonymous and nonsynonymous rate variation in nuclear genes of mammals  
A maximum likelihood approach was used to estimate the synonymous and nonsynonymous substitution rates in 48 nuclear genes from primates, artiodactyls, and rodents. A codon-substitution model was assumed, which accounts for the genetic code structure, transition/transversion bias, and base frequency biases at codon positions. Likelihood ratio tests were applied to test the constancy of nonsynonymous to synonymous rate ratios among branches (evolutionary lineages). It is found that at 22 of the 48 nuclear loci examined, the nonsynonymous/synonymous rate ratio varies significantly across branches of the tree. The result provides strong evidence against a strictly neutral model of molecular evolution. Our likelihood estimates of synonymous and nonsynonymous rates differ considerably from previous results obtained from approximate pairwise sequence comparisons. The differences between the methods are explored by detailed analyses of data from several genes. Transition/transversion rate bias and codon frequency biases are found to have significant effects on the estimation of synonymous and nonsynonymous rates, and approximate methods do not adequately account for those factors. The likelihood approach is preferable, even for pairwise sequence comparison, because more-realistic models about the mutation and substitution processes can be incorporated in the analysis. Received: 17 May 1997 / Accepted: 28 September 1997

Algorithmic details to obtain maximum likelihood estimates of parameters on a large phylogeny are discussed. On a large tree, an efficient approach is to optimize branch lengths one at a time while updating parameters in the substitution model simultaneously. Codon substitution models that allow for variable nonsynonymous/synonymous rate ratios (ω = dN/dS) among sites are used to analyze a data set of human influenza virus type A hemagglutinin (HA) genes. The data set has 349 sequences. Methods for obtaining approximate estimates of branch lengths for codon models are explored, and the estimates are used to test for positive selection and to identify sites under selection. Compared with results obtained from the exact method estimating all parameters by maximum likelihood, the approximate methods produced reliable results. The analysis identified a number of sites in the viral gene under diversifying Darwinian selection and demonstrated the importance of including many sequences in the data in detecting positive selection at individual sites. Received: 25 April 2000 / Accepted: 24 July 2000

Seven new Italian and two new British HTLV-II isolates were obtained from injecting drug users and the entire long terminal repeat (LTR) region was sequenced. Restriction analysis showed that all the Italian isolates are of the IIb subtype, whereas the British isolates are of the IIa subtype. To understand whether the further differentiation of each of the two principal HTLV-II subtypes into several subgroups could be statistically supported by phylogenetic analysis, the neighbor-joining, parsimony, and maximum likelihood methods were used. The separation between IIa and IIb is very well supported by all three methods. At least two phylogenetic subgroups exist within the HTLV-IIa subtype and at least three within the HTLV-IIb subtype. In the present analysis, no statistical support was obtained for additional phylogroups. Two particular subgroups seem interesting because they include all European and North American injecting drug user strains within the IIa and IIb subtypes, respectively. These data confirm that European HTLV-II infection among drug users is probably derived from North America. They also suggest that though a certain differentiation by restriction analysis into different subgroups is possible, carefully interpreted phylogenetic analyses remain necessary. Using the likelihood ratio test, a molecular clock for the drug user strains was calibrated. A fixation rate between 1.08 × 10^-4 and 2.7 × 10^-5 nucleotide substitutions per site per year was calculated for the IIa and IIb injecting drug user strains. This is the lowest fixation rate so far reported for RNA viruses, including HIV; such rates typically range between 10^-2 and 10^-4.

We present an analysis of the evolutionary rates of the cytochrome c oxidase subunit I genes of primates and other mammals. Five primate genes were sequenced, and this information was combined with published data from other species. The sequences from simian primates show approximately twofold increases in their nonsynonymous substitution rate compared to those from other primates and other mammals. The species range and the overall magnitude of this rate increase are similar to those previously identified for the cytochrome c oxidase subunit II and cytochrome b genes. Received: 22 July 1999 / Accepted: 21 February 2000

Branch length estimates play a central role in maximum-likelihood (ML) and minimum-evolution (ME) methods of phylogenetic inference. For various reasons, branch length estimates are not statistically independent under ML or ME. We studied the response of correlations among branch length estimates to the degree of among-branch length heterogeneity (BLH) in the model (true) tree. The frequency and magnitude of (especially negative) correlations among branch length estimates were both shown to increase as BLH increases under simulation and analytically. For ML, we used the correct model (Jukes–Cantor). For ME, we employed ordinary least-squares (OLS) branch lengths estimated under both simple p-distances and Jukes–Cantor distances, analyzed with and without an among-site rate heterogeneity parameter. The efficiency of ME and ML was also shown to decrease in response to increased BLH. We note that the shape of the true tree will in part determine BLH and represents a critical factor in the probability of recovering the correct topology. An important finding suggests that researchers cannot expect that different branches that were in fact the same length will have the same probability of being accurately reconstructed when BLH exists in the overall tree. We conclude that methods designed to minimize the interdependencies of branch length estimates (BLEs) may (1) reduce both the variance and the covariance associated with the estimates and (2) increase the efficiency of model-based optimality criteria. We speculate on possible ways to reduce the nonindependence of BLEs under OLS and ML. Received: 9 March 1999 / Accepted: 4 May 1999
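
The two distance measures named above can be sketched as follows: the simple p-distance is the proportion of differing sites, and the Jukes–Cantor distance corrects it for multiple substitutions with d = -(3/4) ln(1 - 4p/3). The function names are hypothetical, and the sketch assumes gap-free aligned sequences without the among-site rate heterogeneity parameter used in the study.

```python
import math


def p_distance(seq_a, seq_b):
    """Proportion of sites that differ between two aligned, gap-free sequences."""
    if len(seq_a) != len(seq_b) or not seq_a:
        raise ValueError("sequences must be non-empty and of equal length")
    differences = sum(1 for a, b in zip(seq_a, seq_b) if a != b)
    return differences / len(seq_a)


def jukes_cantor_distance(p):
    """Jukes-Cantor corrected distance for an observed p-distance."""
    if not 0.0 <= p < 0.75:
        raise ValueError("p must lie in [0, 0.75) for the correction to be defined")
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)
```

The corrected distance always exceeds the p-distance for p > 0, and the gap widens as sequences diverge, which is one reason the two kinds of distance can give different least-squares branch lengths.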

Statistical and biochemical studies of the genetic code have found evidence of nonrandom patterns in the distribution of codon assignments. It has, for example, been shown that the code minimizes the effects of point mutation or mistranslation: erroneous codons are either synonymous or code for an amino acid with chemical properties very similar to those of the one that would have been present had the error not occurred. This work has suggested that the second base of codons is less efficient in this respect, by about three orders of magnitude, than the first and third bases. These results are based on the assumption that all forms of error at all bases are equally likely. We extend this work to investigate (1) the effect of weighting transition errors differently from transversion errors and (2) the effect of weighting each base differently, depending on reported mistranslation biases. We find that if the bias affects all codon positions equally, as might be expected were the code adapted to a mutational environment with transition/transversion bias, then any reasonable transition/transversion bias increases the relative efficiency of the second base by an order of magnitude. In addition, if we employ weightings to allow for biases in translation, then only 1 in every million random alternative codes generated is more efficient than the natural code. We thus conclude not only that the natural genetic code is extremely efficient at minimizing the effects of errors, but also that its structure reflects biases in these errors, as might be expected were the code the product of selection. Received: 25 July 1997 / Accepted: 9 January 1998

The very high AT content of hymenopteran mtDNA has warranted speculation about nucleotide substitution processes in this group. Here we investigate the pattern of honeybee, Apis mellifera, mtDNA nucleotide polymorphisms inferred from phylogeny in terms of differences between the ATPase6, COI, COII, COIII, cytochrome b, and ND2 genes and strand asymmetry in mutation rates. The observed transition/transversion ratios and the distribution of nonsynonymous substitutions between regions differed significantly. The pattern of differences between genes leading to these heterogeneities (the ATPase6 and COIII genes group apart from the rest) differed markedly from that predicted on the basis of long-term evolutionary change and may indicate differences between current and long-term dynamics of sequence evolution. Also, there is strong strand asymmetry in substitutions, which probably results in a mutability of G and C sufficiently high to account for the AT-richness of honeybee mtDNA. Received: 21 October 1998 / Accepted: 27 January 1999

Sequence differences in the tRNA-proline (tRNApro) end of the mitochondrial control-region of three species of Pacific butterflyfishes accumulated 33–43 times more rapidly than did changes within the mitochondrial cytochrome b gene (cytb). Rapid evolution in this region was accompanied by strong transition/transversion bias and large variation in the probability of a DNA substitution among sites. These substitution constraints placed an absolute ceiling on the magnitude of sequence divergence that could be detected between individuals. This divergence "ceiling" was reached rapidly and led to a decay in the relative rate of control-region/cytb evolution. A high rate of evolution in this section of the control-region of butterflyfishes stands in marked contrast to the patterns reported in some other fish lineages. Although the mechanism underlying rate variation remains unclear, all taxa with rapid evolution in the 5′-end of the control-region showed extreme transition biases. By contrast, in taxa with slower control-region evolution, transitions accumulated at nearly the same rate as transversions. More information is needed to understand the relationship between nucleotide bias and the rate of evolution in the 5′-end of the control-region. Despite strong constraints on sequence change, phylogenetic information was preserved in the group of recently differentiated species and supported the clustering of sequences into three major mtDNA groupings. Within these groups, very similar control-region sequences were widely distributed across the Pacific Ocean and were shared between recognized species, indicating a lack of mitochondrial sequence monophyly among species. Received: 30 June 1996 / Accepted: 15 May 1997

As methods of molecular phylogeny have become more explicit and more biologically realistic following the pioneering work of Thomas Jukes, they have had to relax their initial assumption that rates of evolution were equal at all sites. Distance matrix and likelihood methods of inferring phylogenies make this assumption; parsimony, when valid, is less limited by it. Nucleotide sequences, including RNA sequences, can show substantial rate variation; protein sequences show rates that vary much more widely. Assuming a prior distribution of rates such as a gamma distribution or lognormal distribution has deservedly been popular, but for likelihood methods it leads to computational difficulties. These can be resolved using hidden Markov model (HMM) methods which approximate the distribution by one with a modest number of discrete rates. Generalized Laguerre quadrature can be used to improve the selection of rates and their probabilities so as to more nearly approach the desired gamma distribution. A model based on population genetics is presented predicting how the rates of evolution might vary from locus to locus. Challenges for the future include allowing rates at a given site to vary along the tree, as in the "covarion" model, and allowing them to have correlations that reflect three-dimensional structure, rather than position in the coding sequence. Markov chain Monte Carlo likelihood methods may be the only practical way to carry out computations for these models. Received: 8 February 2001 / Accepted: 20 May 2001

Changes in the primary and quaternary structure of vacuolar and archaeal type ATPases that accompany the prokaryote-to-eukaryote transition are analyzed. The gene encoding the vacuolar-type proteolipid of the V-ATPase from Giardia lamblia is reported. Giardia has a typical vacuolar ATPase as observed from the common motifs shared between its proteolipid subunit and other eukaryotic vacuolar ATPases, suggesting that the former enzyme works as a hydrolase in this primitive eukaryote. The phylogenetic analyses of the V-ATPase catalytic subunit and the front and back halves of the proteolipid subunit placed Giardia as the deepest branch within the eukaryotes. Our phylogenetic analysis indicated that at least two independent duplication and fusion events gave rise to the larger proteolipid type found in eukaryotes and in Methanococcus. The spatial distribution of the conserved residues among the vacuolar-type proteolipids suggests a zipper-type interaction among the transmembrane helices and surrounding subunits of the V-ATPase complex. Important residues involved in the function of the F-ATP synthase proteolipid have been replaced during evolution in the V-proteolipid, but in some cases retained in the archaeal A-ATPase. Their possible implication in the evolution of V/F/A-ATPases is discussed. Received: 27 August 1997 / Accepted: 14 January 1998

The study of rates of nucleotide substitution in RNA viruses is central to our understanding of their evolution. Herein we report a comprehensive analysis of substitution rates in 50 RNA viruses using a recently developed maximum likelihood phylogenetic method. This analysis revealed a significant relationship between genetic divergence and isolation time for an extensive array of RNA viruses, although more rate variation was usually present among lineages than would be expected under the constraints of a molecular clock. Despite the lack of a molecular clock, the range of statistically significant variation in overall substitution rates was surprisingly narrow for those viruses where a significant relationship between genetic divergence and time was found, as was the case when synonymous sites were considered alone, where the molecular clock was rejected less frequently. An analysis of the ecological and genetic factors that might explain this rate variation revealed some evidence of significantly lower substitution rates in vector-borne viruses, as well as a weak correlation between rate and genome length. Finally, a simulation study revealed that our maximum likelihood estimates of substitution rates are valid, even if the molecular clock is rejected, provided that sufficiently large data sets are analyzed. Received: 23 February 2001 / Accepted: 3 July 2001

A method is described for estimating rapid rate constants from the distributions of current amplitude observed in single-channel electrical recordings. It has the advantages over previous, similar approaches that it can accommodate both multistate kinetic models and adjustable filtering of the data using an 8-pole Bessel filter. The method is conceptually straightforward: the observed distributions of current amplitude are compared with theoretical distributions derived by combining several simplifying assumptions about the underlying stochastic process with a model of the filter and electrical noise. Parameters are estimated by approximate maximum likelihood. The method was used successfully to estimate rate constants for both a simple two-state kinetic model (the transitions between open and closed states during the rapid gating of an outward-rectifying K+-selective channel in the plasma membrane of Acetabularia) and a complex multistate kinetic model (the blockade of the maxi cation channel in the plasma membrane of rye roots by verapamil). For the two-state model, parameters were estimated well, provided that they were not too fast or too slow in relation to the sampling rate. In the three-state model the precision of estimates depended in a complex way on the values of all rate parameters in the model. Received: 4 October 1996/Revised: 2 September 1997

The complete mitochondrial genomes of two microbats, the horseshoe bat Rhinolophus pumilus and the Japanese pipistrelle Pipistrellus abramus, and that of an insectivore, the long-clawed shrew Sorex unguiculatus, were sequenced and analyzed phylogenetically by a maximum likelihood method in an effort to enhance our understanding of mammalian evolution. Our analysis suggested that (1) a sister relationship exists between moles and shrews, which form a eulipotyphlan clade; (2) chiropterans have a sister relationship with eulipotyphlans; and (3) the Eulipotyphla/Chiroptera clade is closely related to fereuungulates (Cetartiodactyla, Perissodactyla and Carnivora). Divergence times on the mammalian tree were estimated from consideration of a relaxed molecular clock, the amino acid sequences of 12 concatenated mitochondrial proteins and multiple reference criteria. Moles and shrews were estimated to have diverged approximately 48 MyrBP, and bats and eulipotyphlans to have diverged 68 MyrBP. Recent phylogenetic controversy over the polyphyly of microbats, the monophyly of rodents, and the position of hedgehogs is also examined. Received: 21 December 2000 / Accepted: 16 February 2001

