Found 20 similar articles (search time: 15 ms)
1.
Estimating synonymous and nonsynonymous substitution rates under realistic evolutionary models. Cited by: 22 (self-citations: 0; by others: 22)
Approximate methods for estimating the numbers of synonymous and nonsynonymous substitutions between two DNA sequences involve three steps: counting of synonymous and nonsynonymous sites in the two sequences, counting of synonymous and nonsynonymous differences between the two sequences, and correcting for multiple substitutions at the same site. We examine complexities involved in those steps and propose a new approximate method that takes into account two major features of DNA sequence evolution: transition/transversion rate bias and base/codon frequency bias. We compare the new method with maximum likelihood, as well as several other approximate methods, by examining infinitely long sequences, performing computer simulations, and analyzing a real data set. The results suggest that when there are transition/transversion rate biases and base/codon frequency biases, previously described approximate methods for estimating the nonsynonymous/synonymous rate ratio may involve serious biases, and the bias can be both positive and negative. The new method is, in general, superior to earlier approximate methods and may be useful for analyzing large data sets, although maximum likelihood appears to always be the method of choice.
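The third step named in the abstract above (correcting for multiple substitutions at the same site) can be illustrated with the Jukes-Cantor correction, which is the simplest such correction and is used here only as an assumption for illustration. It is not the new method the abstract proposes, which additionally handles transition/transversion and frequency biases. The function names and the site and difference counts below are hypothetical.

```python
import math


def jukes_cantor(p: float) -> float:
    """Convert a proportion of differing sites p into an estimated number
    of substitutions per site under the Jukes-Cantor model."""
    if not 0.0 <= p < 0.75:
        raise ValueError("p must lie in [0, 0.75) for the correction to be defined")
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)


def approximate_rates(syn_sites, nonsyn_sites, syn_diffs, nonsyn_diffs):
    """Steps 1 and 2 (site and difference counts) are taken as inputs;
    step 3 applies the multiple-hit correction to each class of site."""
    p_s = syn_diffs / syn_sites        # proportion of synonymous differences
    p_n = nonsyn_diffs / nonsyn_sites  # proportion of nonsynonymous differences
    return jukes_cantor(p_s), jukes_cantor(p_n)


if __name__ == "__main__":
    # Hypothetical counts, chosen only to show the call.
    d_s, d_n = approximate_rates(100.0, 300.0, 10.0, 6.0)
    print(f"dS = {d_s:.4f}, dN = {d_n:.4f}, dN/dS = {d_n / d_s:.4f}")
```

The corrected value always exceeds the raw proportion for p > 0, which is why skipping the correction underestimates divergence between distant sequences.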
2.
Wen-Hsiung Li. Journal of molecular evolution, 1993, 36(1): 96-99
Summary: The current convention in estimating the number of substitutions per synonymous site (K_S) and per nonsynonymous site (K_A) between two protein-coding genes is to count each twofold degenerate site as one-third synonymous and two-thirds nonsynonymous because one of the three possible changes at such a site is synonymous and the other two are nonsynonymous. This counting rule can considerably overestimate the K_S value because transitional mutations tend to occur more often than transversional mutations and because most transitional mutations at twofold degenerate sites are synonymous. A new method that gives unbiased estimates is proposed. An application of the new and the old method to 14 pairs of mouse and rat genes shows that the new method gives a K_S value very close to the number of substitutions per fourfold degenerate site whereas the old method gives a value 30% higher. Both methods give a K_A value close to the number of substitutions per nondegenerate site.
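The conventional counting rule described in this summary can be written out as a short sketch. It shows only the older convention (each twofold degenerate site counted as one-third synonymous and two-thirds nonsynonymous), not the unbiased method the paper proposes. The function name and the site counts are hypothetical.

```python
def site_counts_one_third_rule(l0: float, l2: float, l4: float):
    """Return (synonymous, nonsynonymous) site counts under the conventional rule.

    l0: number of nondegenerate sites (every change is nonsynonymous)
    l2: number of twofold degenerate sites (one of three changes is synonymous)
    l4: number of fourfold degenerate sites (every change is synonymous)
    """
    synonymous = l2 / 3.0 + l4
    nonsynonymous = l0 + 2.0 * l2 / 3.0
    return synonymous, nonsynonymous


if __name__ == "__main__":
    # Hypothetical gene with 600 nondegenerate, 210 twofold and 100 fourfold sites.
    syn, nonsyn = site_counts_one_third_rule(600, 210, 100)
    print(syn, nonsyn)
```

Because the rule weights all three possible changes at a twofold site equally, it ignores the excess of transitions that the summary identifies as the source of the K_S overestimate.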
3.
Methods for estimating synonymous and nonsynonymous substitution rates among protein-coding sequences adopt different mutation (substitution) models with subtle yet significant differences, which lead to different estimates of evolutionary information. Little attention has been devoted to the comparison of methods for obtaining reliable estimates since the amount of sequence variation within targeted datasets is always unpredictable. To our knowledge, there is little information available in the literature about the evaluation of these different methods. In this study, we compared six widely used methods and report evaluation results using simulated sequences. The results indicate that incorporating sequence features (such as transition/transversion bias and nucleotide/codon frequency bias) into methods could yield better performance. We recommend that conclusions related to or derived from Ka and Ks analyses should not be readily drawn from the results of one method alone.
4.
Using mammalian gene sequences, the variances in the numbers of synonymous and nonsynonymous substitutions among genes were estimated together with the correlation coefficient between the two. The expected correlation coefficient can be obtained under the neutral theory using these estimated values of the variances. The expected coefficient is found to often be one-half to two-thirds of the observed value. Possible causes for the disagreement were discussed, such as correlated selective constraints on the two types of substitutions and excess doublet mutations. The variance of mutation rate and that of selective constraint were also estimated. The results show that the coefficient of variation of the former is 0.2–0.3, whereas that of the latter is 0.7–0.9. Correspondence to: T. Ohta
5.
The rate of molecular evolution can vary among lineages. Sources of this variation have differential effects on synonymous and nonsynonymous substitution rates. Changes in effective population size or patterns of natural selection will mainly alter nonsynonymous substitution rates. Changes in generation length or mutation rates are likely to have an impact on both synonymous and nonsynonymous substitution rates. By comparing changes in synonymous and nonsynonymous rates, the relative contributions of the driving forces of evolution can be better characterized. Here, we introduce a procedure for estimating the chronological rates of synonymous and nonsynonymous substitutions on the branches of an evolutionary tree. Because the widely used ratio of nonsynonymous and synonymous rates is not designed to detect simultaneous increases or simultaneous decreases in synonymous and nonsynonymous rates, the estimation of these rates rather than their ratio can improve characterization of the evolutionary process. With our Bayesian approach, we analyze cytochrome oxidase subunit I evolution in primates and infer that nonsynonymous rates have a greater tendency to change over time than do synonymous rates. Our analysis of these data also suggests that rates have been positively correlated.
6.
We develop a new model for studying the molecular evolution of protein-coding DNA sequences. In contrast to existing models, we incorporate the potential for site-to-site heterogeneity of both synonymous and nonsynonymous substitution rates. We demonstrate that within-gene heterogeneity of synonymous substitution rates appears to be common. Using the new family of models, we investigate the utility of a variety of new statistical inference procedures, and we pay particular attention to issues surrounding the detection of sites undergoing positive selection. We discuss how failure to model synonymous rate variation in the model can lead to misidentification of sites as positively selected.
7.
A likelihood approach for comparing synonymous and nonsynonymous nucleotide substitution rates, with application to the chloroplast genome. Cited by: 5 (self-citations: 24; by others: 5)
A model of DNA sequence evolution applicable to coding regions is
presented. This represents the first evolutionary model that accounts for
dependencies among nucleotides within a codon. The model uses the codon, as
opposed to the nucleotide, as the unit of evolution, and is parameterized
in terms of synonymous and nonsynonymous nucleotide substitution rates. One
of the model's advantages over those used in methods for estimating
synonymous and nonsynonymous substitution rates is that it completely
corrects for multiple hits at a codon, rather than taking a parsimony
approach and considering only pathways of minimum change between homologous
codons. Likelihood-ratio versions of the relative-rate test are constructed
and applied to data from the complete chloroplast DNA sequences of Oryza
sativa, Nicotiana tabacum, and Marchantia polymorpha. Results of these
tests confirm previous findings that substitution rates in the chloroplast
genome are subject to both lineage-specific and locus-specific effects.
Additionally, the new tests suggest that the rate heterogeneity is due
primarily to differences in nonsynonymous substitution rates. Simulations
help confirm previous suggestions that silent sites are saturated, leaving
no evidence of heterogeneity in synonymous substitution rates.
8.
Most methods for estimating the rate of synonymous and nonsynonymous substitution per site define a site as a mutational opportunity: the proportion of sites that are synonymous is equal to the proportion of mutations that would be synonymous under the model of evolution being considered. Here we demonstrate that this definition of a site can give misleading results and that a physical definition of site should be used in some circumstances. We illustrate our point by reexamining the relationship between codon usage bias and the synonymous substitution rate. It has recently been shown that the rate of synonymous substitution, calculated using the Goldman-Yang method, which encapsulates the mutational-opportunity definition of a site at a high level of sophistication, is either positively correlated or uncorrelated to synonymous codon bias in Drosophila. Using other methods, which account for synonymous codon bias but define a site physically, we show that there is a negative correlation between the synonymous substitution rate and codon bias and that the lack of a negative correlation using the Goldman-Yang method is due to the way in which the number of synonymous sites is counted. We also show that there is a positive correlation between the synonymous substitution rate and third position GC content in mammals, but that the relationship is considerably weaker than that obtained using the Goldman-Yang method. We argue that the Goldman-Yang method is misleading in this context and conclude that methods that rely on a mutational-opportunity definition of a site should be used with caution.
9.
A new method for estimating synonymous and nonsynonymous rates of nucleotide substitution considering the relative likelihood of nucleotide and codon changes. Cited by: 71 (self-citations: 7; by others: 71)
A new method is proposed for estimating the number of synonymous and nonsynonymous nucleotide substitutions between homologous genes. In this method, a nucleotide site is classified as nondegenerate, twofold degenerate, or fourfold degenerate, depending on how often nucleotide substitutions will result in amino acid replacement; nucleotide changes are classified as either transitional or transversional, and changes between codons are assumed to occur with different probabilities, which are determined by their relative frequencies among more than 3,000 changes in mammalian genes. The method is applied to a large number of mammalian genes. The rate of nonsynonymous substitution is extremely variable among genes; it ranges from 0.004 × 10^-9 (histone H4) to 2.80 × 10^-9 (interferon gamma), with a mean of 0.88 × 10^-9 substitutions per nonsynonymous site per year. The rate of synonymous substitution is also variable among genes; the highest rate is three to four times higher than the lowest one, with a mean of 4.7 × 10^-9 substitutions per synonymous site per year. The rate of nucleotide substitution is lowest at nondegenerate sites (the average being 0.94 × 10^-9), intermediate at twofold degenerate sites (2.26 × 10^-9), and highest at fourfold degenerate sites (4.2 × 10^-9). The implication of our results for the mechanisms of DNA evolution and that of the relative likelihood of codon interchanges in parsimonious phylogenetic reconstruction are discussed.
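The site classification described in this abstract (nondegenerate, twofold degenerate, fourfold degenerate) can be sketched by testing each possible single-nucleotide change against the standard genetic code. The sketch makes two assumptions that the abstract does not state: the standard nuclear code is used, and the threefold degenerate third position of the isoleucine codons is lumped with twofold sites. The function name `degeneracy` is hypothetical.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code in TCAG order (first base varies slowest); '*' marks stop codons.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def degeneracy(codon: str, position: int) -> int:
    """Classify one codon position (0, 1 or 2) as 0-, 2- or 4-fold degenerate."""
    original = CODON_TABLE[codon]
    synonymous_changes = 0
    for base in BASES:
        if base == codon[position]:
            continue
        mutant = codon[:position] + base + codon[position + 1:]
        if CODON_TABLE[mutant] == original:
            synonymous_changes += 1
    if synonymous_changes == 0:
        return 0  # nondegenerate: every change replaces the amino acid
    if synonymous_changes == 3:
        return 4  # fourfold degenerate: no change replaces the amino acid
    return 2      # assumption: the threefold isoleucine case is treated as twofold


if __name__ == "__main__":
    for codon in ("GGT", "GAT", "ATT", "ATG"):
        print(codon, [degeneracy(codon, i) for i in range(3)])
```

Counting the sites in each class across an alignment gives the site totals that such a method then combines with separate transition and transversion counts.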
10.
Three frequently used methods for estimating the synonymous and nonsynonymous substitution rates (Ks and Ka) were evaluated and compared for their accuracies; these methods are denoted by LWL85, LPB93, and GY94, respectively. For this purpose, we used a codon-evolution model to obtain the expected Ka and Ks values for the above three methods and compared the values with those obtained by the three methods. We also proposed some modifications of LWL85 and LPB93 to increase their accuracies. Our computer simulations under the codon-evolution model showed that for sequences ≤ 300 codons, the performance of GY94 may not be reliable. For longer sequences, GY94 is more accurate for estimating the Ka/Ks ratio than the modified LPB93 and LWL85 in the majority of the cases studied. This is particularly so when k ≥ 3, where k is the transition/transversion (mutation) rate ratio. However, when k is approximately 2 and when the sequence divergence is relatively large, the modified LWL85 performed better than GY94 and the modified LPB93. The inferiority of LPB93 to LWL85 is surprising because LPB93 was intended to improve LWL85. Also, it has been thought that the codon-based method of GY94 is better than the heuristic method of LWL85, but our simulation results showed that in many cases, the opposite was true, even though our simulation was based on the codon-evolution model.
11.
Mitochondrial genomes encode fundamental subunits of the basic energy producing machinery of eukaryotic cells that are under strong functional constraint. Paradoxically, these genes evolve rapidly in general, and there is substantial variation in evolutionary rates among genes within genomes. In order to investigate spatial variation in selection intensity, we conducted tests of neutrality using ratios of nonsynonymous to synonymous substitutions (dN/dS = omega) on numerous protein gene segments from fishes and mammals. Values of omega were very low for nearly all genomic regions. However, values of both omega and dN varied in a clinal pattern with increasing distance from the light-strand origin of replication. Spatial heterogeneity of nonsynonymous substitution rates exhibits a significantly positive correlation with variation in mutation rates that are related to the mode of mitochondrial DNA replication. The finding that nonsynonymous substitution rates are proportional to mutation rates is expected if a majority of substitutions are selectively neutral or slightly deleterious. Spatial patterns of among-gene variation in nonsynonymous rates were highly similar between fishes and mammals, suggesting that forces governing mitochondrial gene evolution have remained relatively constant over 450 Myr of vertebrate evolution. Conservation of substitution patterns despite major shifts in thermal habit and metabolic demands among taxa implicates a conserved replication mechanism controlling relative mutation rates as a major determinant of mitochondrial protein evolution.
12.
Nonsynonymous substitutions in DNA cause amino acid substitutions while synonymous substitutions in DNA leave amino acids unchanged. The cause of the correlation between the substitution rates at nonsynonymous (K(A)) and synonymous (K(S)) sites in mammals is a contentious issue, and one that impacts on many aspects of molecular evolution. Here we use a large set of orthologous mammalian genes to investigate the causes of the K(A)-K(S) correlation in rodents. The strength of the K(A)-K(S) correlation exceeds the neutral theory expectation when substitution rates are estimated using algorithmic methods, but not when substitution rates are estimated by maximum likelihood. Irrespective of this methodological uncertainty the strength of the K(A)-K(S) correlation appears mostly due to tandem substitutions, an excess of which is generated by substitutional nonindependence. Doublet mutations cannot explain the excess of tandem synonymous-nonsynonymous substitutions, and substitution patterns indicate that selection on silent sites is the likely cause. We find no evidence for selection on codon usage. The nature of the relationship between synonymous divergence and base composition is unclear because we find a significant correlation if we use maximum-likelihood methods but not if we use algorithmic methods. Finally, we find that K(S) is reduced at the start of genes, which suggests that selection for RNA structure may affect silent sites in mammalian protein-coding genes.
13.
Jeffrey P Mower, Pascal Touzet, Julie S Gummow, Lynda F Delph, Jeffrey D Palmer. BMC evolutionary biology, 2007, 7(1): 135
Background
It has long been known that rates of synonymous substitutions are unusually low in mitochondrial genes of flowering and other land plants. Although two dramatic exceptions to this pattern have recently been reported, it is unclear how often major increases in substitution rates occur during plant mitochondrial evolution and what the overall magnitude of substitution rate variation is across plants.
14.
The distribution of mutational effects on fitness is of fundamental importance for many aspects of evolution. We develop two methods for characterizing the fitness effects of deleterious, nonsynonymous mutations, using polymorphism data from two related species. These methods also provide estimates of the proportion of amino acid substitutions that are selectively favorable, when combined with data on between-species sequence divergence. The methods are applicable to species with different effective population sizes, but that share the same distribution of mutational effects. The first, simpler, method assumes that diversity for all nonneutral mutations is given by the value under mutation-selection balance, while the second method allows for stronger effects of genetic drift and yields estimates of the parameters of the probability distribution of mutational effects. We apply these methods to data on populations of Drosophila miranda and D. pseudoobscura and find evidence for the presence of deleterious nonsynonymous mutations, mostly with small heterozygous selection coefficients (a mean of the order of 10^-5 for segregating variants). A leptokurtic gamma distribution of mutational effects with a shape parameter between 0.1 and 1 can explain observed diversities, in the absence of a separate class of completely neutral nonsynonymous mutations. We also describe a simple approximate method for estimating the harmonic mean selection coefficient from diversity data on a single species.
15.
Michael E Hellberg. BMC evolutionary biology, 2006, 6(1): 24-8
Background
The mitochondrial DNA (mtDNA) of most animals evolves more rapidly than nuclear DNA, and often shows higher levels of intraspecific polymorphism and population subdivision. The mtDNA of anthozoans (corals, sea fans, and their kin), by contrast, appears to evolve slowly. Slow mtDNA evolution has been reported for several anthozoans; however, this slow pace has been difficult to put in phylogenetic context without parallel surveys of nuclear variation or calibrated rates of synonymous substitution that could permit quantitative rate comparisons across taxa. Here, I survey variation in the coding region of a mitochondrial gene from a coral species (Balanophyllia elegans) known to possess high levels of nuclear gene variation, and estimate synonymous rates of mtDNA substitution by comparison to another coral (Tubastrea coccinea).
16.
We estimated the intensity of selection on preferred codons in Drosophila pseudoobscura and D. miranda at X-linked and autosomal loci, using a published data set on sequence variability at 67 loci, by means of an improved method that takes account of demographic effects. We found evidence for stronger selection at X-linked loci, consistent with their higher levels of codon usage bias. The estimates of the strength of selection and mutational bias in favor of unpreferred codons were similar to those found in other species, after taking into account the fact that D. pseudoobscura showed evidence for a recent expansion in population size. We examined correlates of synonymous and nonsynonymous diversity in these species and found no evidence for effects of recurrent selective sweeps on nonsynonymous mutations, which is probably because this set of genes has much higher than average levels of selective constraints. There was evidence for correlated effects of levels of selective constraints on protein sequences and on codon usage, as expected under models of selection for translational accuracy. Our analysis of a published data set on D. melanogaster provided evidence for the effects of selective sweeps of nonsynonymous mutations on linked synonymous diversity, but only in the subset of loci that experienced the highest rates of nonsynonymous substitutions (about one-quarter of the total) and not at more slowly evolving loci. Our correlational analysis of this data set suggested that both selective constraints on protein sequences and recurrent selective sweeps affect the overall level of codon usage.
17.
New methods for estimating the numbers of synonymous and nonsynonymous substitutions. Cited by: 10 (self-citations: 0; by others: 10)
Yasuo Ina. Journal of molecular evolution, 1995, 40(2): 190-226
New methods for estimating the numbers of synonymous and nonsynonymous substitutions per site were developed. The methods are unweighted pathway methods based on Kimura's two-parameter model. Computer simulations were conducted to evaluate the accuracies of the new methods, Nei and Gojobori's (NG) method, Miyata and Yasunaga's (MY) method, Li, Wu, and Luo's (LWL) method, and Pamilo, Bianchi, and Li's (PBL) method. The following results were obtained: (1) The NG, MY, and LWL methods give overestimates of the number of synonymous substitutions and underestimates of the number of nonsynonymous substitutions. The major cause for the biased estimation is that these three methods underestimate the number of synonymous sites and overestimate the number of nonsynonymous sites. (2) The PBL method gives better estimates of the numbers of synonymous and nonsynonymous substitutions than those obtained by the NG, MY, and LWL methods. (3) The new methods also give better estimates of the numbers of synonymous and nonsynonymous substitutions than those obtained by the NG, MY, and LWL methods. In addition, estimates of the numbers of synonymous and nonsynonymous sites obtained by the new methods are reasonably accurate. (4) In some cases, the new methods and the PBL method give biased estimates of substitution numbers. However, from the number of nucleotide substitutions at the third position of codons, we can examine whether estimates obtained by the new methods are good or not, whereas we cannot make an examination of estimates obtained by the PBL method. (5) When there are strong transition/transversion and nucleotide-frequency biases like mitochondrial genes, all of the above methods give biased estimates of substitution numbers. In such cases, Kondo et al.'s method is recommended to be used for estimating the number of synonymous substitutions, although their method cannot estimate the number of nonsynonymous substitutions and is time-consuming. 
These results, particularly result (1), call for reexaminations of some genes. This is because evolutionary pictures of genes have often been discussed on the basis of results obtained by the NG, MY, and LWL methods, which are favorable for the neutral theory of molecular evolution.
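The abstract above builds on Kimura's two-parameter model, which corrects for multiple hits while treating transitions and transversions separately. A minimal sketch of that underlying distance formula follows. It is applied here to whole aligned sequences, which is an assumption for illustration: the abstract does not describe how the new methods apply the model to synonymous and nonsynonymous sites, and the function name `k2p_distance` is hypothetical.

```python
import math

# Transitions are purine<->purine (A<->G) or pyrimidine<->pyrimidine (C<->T) changes.
TRANSITIONS = {("A", "G"), ("G", "A"), ("C", "T"), ("T", "C")}


def k2p_distance(seq1: str, seq2: str) -> float:
    """Kimura two-parameter distance between two aligned sequences.

    d = -1/2 * ln(1 - 2P - Q) - 1/4 * ln(1 - 2Q), where P and Q are the
    proportions of transitional and transversional differences.
    """
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")
    compared = transitions = transversions = 0
    for a, b in zip(seq1.upper(), seq2.upper()):
        if a not in "ACGT" or b not in "ACGT":
            continue  # skip gaps and ambiguous bases
        compared += 1
        if a == b:
            continue
        if (a, b) in TRANSITIONS:
            transitions += 1
        else:
            transversions += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    p = transitions / compared
    q = transversions / compared
    if 1.0 - 2.0 * p - q <= 0.0 or 1.0 - 2.0 * q <= 0.0:
        raise ValueError("sequences too divergent for the K2P correction")
    return -0.5 * math.log(1.0 - 2.0 * p - q) - 0.25 * math.log(1.0 - 2.0 * q)


if __name__ == "__main__":
    print(k2p_distance("AAAAAAAAAA", "GAAAAAAAAA"))  # one transition in ten sites
    print(k2p_distance("AAAAAAAAAA", "CAAAAAAAAA"))  # one transversion in ten sites
```

When transitions and transversions occur at equal rates the estimate is close to the Jukes-Cantor value, so the two-parameter form matters mainly under transition bias, the situation the abstract highlights for mitochondrial genes.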
18.
We have investigated patterns of within-species polymorphism and between-species divergence for synonymous and nonsynonymous variants at a set of autosomal and X-linked loci of Drosophila miranda. D. pseudoobscura and D. affinis were used for the between-species comparisons. The results suggest the action of purifying selection on nonsynonymous, polymorphic variants. Among synonymous polymorphisms, there is a significant excess of synonymous mutations from preferred to unpreferred codons and of GC to AT mutations. There was no excess of GC to AT mutations among polymorphisms at noncoding sites. This suggests that selection is acting to maintain the use of preferred codons. Indirect evidence suggests that biased gene conversion in favor of GC base pairs may also be operating. The joint intensity of selection and biased gene conversion, in terms of the product of effective population size and the sum of the selection and conversion coefficients, was estimated to be approximately 0.65.
19.
Regulation and evolution of dinoflagellate luciferases are of particular interest since the enzyme is structurally unique and bioluminescence is under circadian control. In this study, three new members of the dinoflagellate luciferase gene family were identified and characterized from Pyrocystis lunula. These genes, lcfA, lcfB, and lcfC, also exhibit the unusual structure and organization previously reported for the luciferase gene of a related dinoflagellate, Lingulodinium polyedrum: three repeated domains, each encoding an active catalytic site, multiple gene copies, and tandem organization. The histidine residues involved in the pH regulation of L. polyedrum luciferase activity, and implicated in the regulation of flashing, are also fully conserved in P. lunula. The interspecific conservation between the individual luciferase domains of P. lunula and L. polyedrum is higher than among domains intramolecularly, indicating that this unique gene structure arose through duplication events that occurred prior to the divergence of these dinoflagellates. However, P. lunula luciferase genes differ from L. polyedrum in several respects, notably, the occurrence of an intron in one gene (lcfC), a 2.25-kb intergenic region connecting lcfA and lcfB, and, of particular interest, an invariant rate of synonymous (silent) substitutions along the repeat domains, in contrast to L. polyedrum luciferase, where the occurrence of synonymous substitutions is practically absent in the central region of the domains.
20.
Parallel evolution of drug resistance in HIV: failure of nonsynonymous/synonymous substitution rate ratio to detect selection. Cited by: 1 (self-citations: 0; by others: 1)
K A Crandall, C R Kelsey, H Imamichi, H C Lane, N P Salzman. Molecular biology and evolution, 1999, 16(3): 372-382
Parallel or convergent evolution at the molecular level has been difficult to demonstrate, especially when rigorous statistical criteria are applied. We present sequence data from the protease gene from eight patients infected with the human immunodeficiency virus (HIV-1). These patients have been on multiple drug therapies for at least 2 years. We present sequence data from two timepoints: time zero (the initiation of drug therapy) and a subsequent timepoint between 59 and 104 weeks after the initiation of drug therapy. In addition to the sequence data, we present viral load data from both initial and final timepoints. Our phylogenetic analyses indicate significant evolution of virus from initial to final timepoints, even in three of eight patients who show low viral loads. Of the five patients who escaped drug therapy, identical amino acid replacements were seen in all five patients at two different codon positions, an indication of parallel evolution. We also measured genetic diversity for these patients and found no correlation between genetic diversity and viral load. Finally, we calculated the nonsynonymous and synonymous substitution rates and showed that the ratio of nonsynonymous to synonymous substitution compared to the value of one may be a poor indicator of natural selection.