Similar literature
1.
Comparison of numbers of synonymous and nonsynonymous substitutions is useful for understanding mechanisms of molecular evolution. In this paper, I examine the statistical properties of six methods of estimating numbers of synonymous and nonsynonymous substitutions. The six methods are Miyata and Yasunaga’s (MY) method; Nei and Gojobori’s (NG) method; Li, Wu and Luo’s (LWL) method; Pamilo, Bianchi and Li’s (PBL) method; and Ina’s (Ina) two methods. When the transition/transversion bias at the mutation level is strong, the numbers of synonymous and nonsynonymous substitutions are estimated more accurately by the PBL and Ina methods than by the NG, MY and LWL methods. When the nucleotide-frequency bias is strong and distantly related sequences are compared, all six methods underestimate the number of synonymous substitutions. The concept of synonymous and nonsynonymous categories is also useful for analysis of DNA polymorphism data.

2.
Two simple methods for estimating the numbers of synonymous and nonsynonymous nucleotide substitutions are presented. Although they give no weights to different types of codon substitutions, these methods give essentially the same results as those obtained by Miyata and Yasunaga's and by Li et al.'s methods. Computer simulation indicates that estimates of synonymous substitutions obtained by the two methods are quite accurate unless the number of nucleotide substitutions per site is very large. It is shown that all available methods tend to give an underestimate of the number of nonsynonymous substitutions when the number is large.

3.
Unbiased estimation of the rates of synonymous and nonsynonymous substitution   Cited: 39 (self-citations: 0, by others: 39)
Summary: The current convention in estimating the number of substitutions per synonymous site (K_S) and per nonsynonymous site (K_A) between two protein-coding genes is to count each twofold degenerate site as one-third synonymous and two-thirds nonsynonymous because one of the three possible changes at such a site is synonymous and the other two are nonsynonymous. This counting rule can considerably overestimate the K_S value because transitional mutations tend to occur more often than transversional mutations and because most transitional mutations at twofold degenerate sites are synonymous. A new method that gives unbiased estimates is proposed. An application of the new and the old method to 14 pairs of mouse and rat genes shows that the new method gives a K_S value very close to the number of substitutions per fourfold degenerate site whereas the old method gives a value 30% higher. Both methods give a K_A value close to the number of substitutions per nondegenerate site.
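
To make the effect of the counting rule concrete, the following Python sketch contrasts the two conventions. It is an illustration under stated assumptions, not the paper's own code: the symbols L0, L2, L4 (numbers of nondegenerate, twofold and fourfold degenerate sites) and A_i, B_i (transitional and transversional substitutions per site in class i) do not appear in the abstract. The "old rule" follows directly from counting a twofold site as 1/3 synonymous and 2/3 nonsynonymous; the "new rule" is the form in which the revised estimator is commonly stated and should be checked against the paper before use.

```python
def ks_ka_old_rule(L0, L2, L4, A0, A2, A4, B0, B2, B4):
    """Twofold sites counted as 1/3 synonymous and 2/3 nonsynonymous."""
    K0, K4 = A0 + B0, A4 + B4  # total substitutions per site in each class
    ks = 3.0 * (L2 * A2 + L4 * K4) / (L2 + 3.0 * L4)
    ka = 3.0 * (L2 * B2 + L0 * K0) / (2.0 * L2 + 3.0 * L0)
    return ks, ka


def ks_ka_new_rule(L0, L2, L4, A0, A2, A4, B0, B2, B4):
    """Transitions averaged over twofold and fourfold sites (assumed form)."""
    ks = (L2 * A2 + L4 * A4) / (L2 + L4) + B4
    ka = A0 + (L0 * B0 + L2 * B2) / (L0 + L2)
    return ks, ka


if __name__ == "__main__":
    # Hypothetical inputs with transitions twice as frequent as transversions.
    args = dict(L0=1, L2=1, L4=1, A0=0.1, A2=0.1, A4=0.1, B0=0.05, B2=0.05, B4=0.05)
    print("old rule:", ks_ka_old_rule(**args))
    print("new rule:", ks_ka_new_rule(**args))
```

With these made-up inputs the old rule returns the larger K_S, which is the direction of bias the abstract describes when transitions outnumber transversions.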

4.
Bielawski JP, Dunn KA, Yang Z. Genetics 2000, 156(3): 1299-1308
Rates and patterns of synonymous and nonsynonymous substitutions have important implications for the origin and maintenance of mammalian isochores and the effectiveness of selection at synonymous sites. Previous studies of mammalian nuclear genes largely employed approximate methods to estimate rates of nonsynonymous and synonymous substitutions. Because these methods did not account for major features of DNA sequence evolution such as transition/transversion rate bias and unequal codon usage, they might not have produced reliable results. To evaluate the impact of the estimation method, we analyzed a sample of 82 nuclear genes from the mammalian orders Artiodactyla, Primates, and Rodentia using both approximate and maximum-likelihood methods. Maximum-likelihood analysis indicated that synonymous substitution rates were positively correlated with GC content at the third codon positions, but independent of nonsynonymous substitution rates. Approximate methods, however, indicated that synonymous substitution rates were independent of GC content at the third codon positions, but were positively correlated with nonsynonymous rates. Failure to properly account for transition/transversion rate bias and unequal codon usage appears to have caused substantial biases in approximate estimates of substitution rates.

5.
A method for estimating the numbers of synonymous (Ks) and nonsynonymous (Ka) substitutions per site is proposed. The method is based on Li's (J. Mol. Evol. 36:96–99, 1993) and Pamilo and Bianchi's (Mol. Biol. Evol. 10:271–281, 1993) methods, but it addresses a putative source of bias. It is proposed that the number of synonymous substitutions that are actually transitions or transversions should be computed by separating the twofold degenerate sites into two types of sites, 2S-fold and 2V-fold, where only transitional and transversional substitutions are synonymous, respectively. Kimura's (J. Mol. Evol. 16:111–120, 1980) two-parameter correcting method for multiple substitutions at a site is then applied using the overall observed synonymous transversion frequency to estimate both the numbers of synonymous transversional (Bs) and transitional (As) substitutions per site. This approach, therefore, also minimizes stochastic errors. Computer simulations indicate that the method presented gives more accurate Ks and Ka estimates than the aforementioned methods. Furthermore, the use of computer simulation to obtain confidence intervals for divergence estimates is proposed.
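
The Kimura two-parameter correction mentioned above can be sketched in a few lines of Python. This shows only the standard correction formulas, given an observed proportion of transitional differences P and transversional differences Q at a class of sites; it does not reproduce the 2S-fold/2V-fold site partition or the pooling of transversion frequencies that the abstract proposes. The function name and arguments are illustrative.

```python
import math


def k2p_components(p_transition, q_transversion):
    """Return (A, B): corrected transitional and transversional
    substitutions per site under Kimura's two-parameter model."""
    a_term = 1.0 - 2.0 * p_transition - q_transversion
    b_term = 1.0 - 2.0 * q_transversion
    if a_term <= 0.0 or b_term <= 0.0:
        # The observed differences are too large for the correction to apply.
        raise ValueError("sites are saturated; correction undefined")
    A = -0.5 * math.log(a_term) + 0.25 * math.log(b_term)
    B = -0.5 * math.log(b_term)
    return A, B


if __name__ == "__main__":
    A, B = k2p_components(0.10, 0.05)  # hypothetical observed proportions
    print("transitions per site:", A)
    print("transversions per site:", B)
    print("total distance:", A + B)
```

The sum A + B equals the usual two-parameter distance, so the split into transitional and transversional components adds no extra assumptions beyond the model itself.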

6.
Nei and Gojobori (1986) developed a simple method to estimate the numbers of synonymous (dS) and nonsynonymous (dN) substitutions per site. In the present paper, we have developed a method for computing variances and covariances of dS's and dN's and of the proportions of synonymous (pS) and nonsynonymous (pN) differences. We have also developed a method for computing the variances of mean dS, dN, pS, and pN without constructing a phylogenetic tree of the genes. We have conducted computer simulations based on simple evolutionary models and have shown that the new method gives good estimates of variances and covariances.

7.
A new method is proposed for estimating the number of synonymous and nonsynonymous nucleotide substitutions between homologous genes. In this method, a nucleotide site is classified as nondegenerate, twofold degenerate, or fourfold degenerate, depending on how often nucleotide substitutions will result in amino acid replacement; nucleotide changes are classified as either transitional or transversional, and changes between codons are assumed to occur with different probabilities, which are determined by their relative frequencies among more than 3,000 changes in mammalian genes. The method is applied to a large number of mammalian genes. The rate of nonsynonymous substitution is extremely variable among genes; it ranges from 0.004 × 10^-9 (histone H4) to 2.80 × 10^-9 (interferon gamma), with a mean of 0.88 × 10^-9 substitutions per nonsynonymous site per year. The rate of synonymous substitution is also variable among genes; the highest rate is three to four times higher than the lowest one, with a mean of 4.7 × 10^-9 substitutions per synonymous site per year. The rate of nucleotide substitution is lowest at nondegenerate sites (the average being 0.94 × 10^-9), intermediate at twofold degenerate sites (2.26 × 10^-9), and highest at fourfold degenerate sites (4.2 × 10^-9). The implications of our results for the mechanisms of DNA evolution, and of the relative likelihood of codon interchanges for parsimonious phylogenetic reconstruction, are discussed.
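
The site classification described above can be illustrated with a short Python sketch. It assumes the standard genetic code and classifies a codon position by how many of the three alternative nucleotides leave the amino acid unchanged. Treating the threefold case (the third position of the isoleucine codons ATT, ATC and ATA) as twofold is a simplification adopted here, and changes to stop codons are counted as amino acid changes; neither choice is stated in the abstract.

```python
from itertools import product

BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
# Standard genetic code in TCAG order; "*" marks stop codons.
CODE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def degeneracy(codon, pos):
    """Return 0, 2 or 4 for a nondegenerate, twofold or fourfold site."""
    same = 0
    for base in BASES:
        if base != codon[pos]:
            mutant = codon[:pos] + base + codon[pos + 1:]
            if CODE[mutant] == CODE[codon]:
                same += 1
    if same == 0:
        return 0
    if same == 3:
        return 4
    return 2  # one or two synonymous alternatives are both treated as twofold


def site_counts(seq):
    """Count sites of each class in an in-frame coding sequence."""
    counts = {0: 0, 2: 0, 4: 0}
    for i in range(0, len(seq) - len(seq) % 3, 3):
        for pos in range(3):
            counts[degeneracy(seq[i:i + 3], pos)] += 1
    return counts


if __name__ == "__main__":
    print(site_counts("ATGGGTTTT"))  # hypothetical three-codon sequence
```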

8.
Approximate methods for estimating the numbers of synonymous and nonsynonymous substitutions between two DNA sequences involve three steps: counting of synonymous and nonsynonymous sites in the two sequences, counting of synonymous and nonsynonymous differences between the two sequences, and correcting for multiple substitutions at the same site. We examine complexities involved in those steps and propose a new approximate method that takes into account two major features of DNA sequence evolution: transition/transversion rate bias and base/codon frequency bias. We compare the new method with maximum likelihood, as well as several other approximate methods, by examining infinitely long sequences, performing computer simulations, and analyzing a real data set. The results suggest that when there are transition/transversion rate biases and base/codon frequency biases, previously described approximate methods for estimating the nonsynonymous/synonymous rate ratio may involve serious biases, and the bias can be both positive and negative. The new method is, in general, superior to earlier approximate methods and may be useful for analyzing large data sets, although maximum likelihood appears to always be the method of choice.
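
The three steps listed above can be sketched with the simplest possible choices at each step: equal weights for all nucleotide changes, unweighted averaging over mutational pathways, and a Jukes-Cantor correction. This is an unweighted, Nei-Gojobori-style illustration, not the new method the abstract proposes, which additionally models transition/transversion and codon-frequency biases. The sketch assumes aligned, gap-free sequences with no stop codons; counting changes to stop codons as nonsynonymous and skipping pathways through stop codons are simplifying assumptions, and all function names are illustrative.

```python
import math
from itertools import permutations, product

BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def syn_sites(codon):
    """Step 1: synonymous sites in one codon (each possible change weighs 1/3)."""
    s = 0.0
    for pos in range(3):
        for base in BASES:
            if base != codon[pos]:
                mutant = codon[:pos] + base + codon[pos + 1:]
                if CODE[mutant] == CODE[codon]:
                    s += 1.0 / 3.0
    return s


def count_diffs(c1, c2):
    """Step 2: (synonymous, nonsynonymous) differences averaged over pathways."""
    diff_pos = [i for i in range(3) if c1[i] != c2[i]]
    paths = []
    for order in permutations(diff_pos):
        cur, sd, nd = c1, 0, 0
        for pos in order:
            nxt = cur[:pos] + c2[pos] + cur[pos + 1:]
            if CODE[nxt] == "*":
                break  # assumption: pathways through stop codons are skipped
            if CODE[nxt] == CODE[cur]:
                sd += 1
            else:
                nd += 1
            cur = nxt
        else:
            paths.append((sd, nd))
    if not paths:  # fallback if every pathway was skipped
        return 0.0, float(len(diff_pos))
    return (sum(p[0] for p in paths) / len(paths),
            sum(p[1] for p in paths) / len(paths))


def jukes_cantor(p):
    """Step 3: correct a proportion of differences for multiple hits."""
    if p >= 0.75:
        return float("nan")  # saturated; the correction is undefined
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)


def estimate_ds_dn(seq1, seq2):
    """Return (dS, dN) for two aligned, gap-free coding sequences."""
    if len(seq1) != len(seq2) or len(seq1) % 3:
        raise ValueError("sequences must be aligned and a multiple of 3 long")
    S = sd = nd = 0.0
    for i in range(0, len(seq1), 3):
        c1, c2 = seq1[i:i + 3], seq2[i:i + 3]
        S += (syn_sites(c1) + syn_sites(c2)) / 2.0
        d = count_diffs(c1, c2)
        sd += d[0]
        nd += d[1]
    N = len(seq1) - S  # every site is either synonymous or nonsynonymous
    d_s = jukes_cantor(sd / S) if S > 0 else float("nan")
    d_n = jukes_cantor(nd / N) if N > 0 else float("nan")
    return d_s, d_n
```

For the codon pair TTT and GTA, the two pathways (via GTT and via TTA) contribute one and zero synonymous changes respectively, so count_diffs averages them to 0.5 synonymous and 1.5 nonsynonymous differences.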

9.
A method for detecting positive selection at single amino acid sites   Cited: 23 (self-citations: 0, by others: 23)
A method was developed for detecting the selective force at single amino acid sites given a multiple alignment of protein-coding sequences. The phylogenetic tree was reconstructed using the number of synonymous substitutions. Then, the neutrality was tested for each codon site using the numbers of synonymous and nonsynonymous changes throughout the phylogenetic tree. Computer simulation showed that this method accurately estimated the numbers of synonymous and nonsynonymous substitutions per site, as long as the substitution number on each branch was relatively small. The false-positive rate for detecting the selective force was generally low. On the other hand, the true-positive rate for detecting the selective force depended on the parameter values. Within the range of parameter values used in the simulation, the true-positive rate increased as the strength of the selective force and the total branch length (namely the total number of synonymous substitutions per site) in the phylogenetic tree increased. In particular, with the relative rate of nonsynonymous substitutions to synonymous substitutions being 5.0, most of the positively selected codon sites were correctly detected when the total branch length in the phylogenetic tree was ≥ 2.5. When this method was applied to the human leukocyte antigen (HLA) gene, which included antigen recognition sites (ARSs), positive selection was detected mainly on ARSs. This finding confirmed the effectiveness of the present method with actual data. Moreover, two amino acid sites were newly identified as positively selected in non-ARSs. The three-dimensional structure of the HLA molecule indicated that these sites might be involved in antigen recognition. Positively selected amino acid sites were also identified in the envelope protein of human immunodeficiency virus and the influenza virus hemagglutinin protein.
This method may be helpful for predicting functions of amino acid sites in proteins, especially in the present situation, in which sequence data are accumulating at an enormous speed.

10.
Evolution of the Zfx and Zfy genes: rates and interdependence between the genes   Cited: 29 (self-citations: 10, by others: 19)
A phylogenetic analysis of sex-chromosomal zinc-finger genes (Zfx and Zfy) indicates that the genes have not evolved completely independently since their initial separation. The sequence similarities suggest gene conversion in the last exon between the duplicated Y-chromosomal genes Zfy-1 and Zfy-2 in the mouse. There are also indications of conversion (or recombination) between the X- and Y-chromosomal genes in the crab-eating fox and in the mouse. The method for estimating synonymous and nonsynonymous substitutions is modified by incorporating the substitutions in the twofold-degenerate sites in a novel way. The estimates of synonymous substitutions support the generation-time hypothesis in that the obtained rates are higher in mice (by a factor of 4.7) than in humans and higher in the Y-chromosomal genes (by a factor of 1.9) than in the X-chromosomal genes.

11.
J. M. Comeron, M. Aguade. Genetics 1996, 144(3): 1053-1062
The Xdh (rosy) region of Drosophila subobscura has been sequenced and compared to the homologous region of D. pseudoobscura and D. melanogaster. Estimates of the numbers of synonymous substitutions per site (Ks) confirm that Xdh has a high synonymous substitution rate. The distributions of both nonsynonymous and synonymous substitutions along the coding region were found to be heterogeneous. Also, no relationship has been detected between Ks estimates and codon usage bias along the gene, in contrast with the generally observed relationship among genes. This heterogeneous distribution of synonymous substitutions along the Xdh gene, which is expression-level independent, could be explained by a differential selection pressure on synonymous sites along the coding region acting on mRNA secondary structure. The synonymous rate in the Xdh coding region is lower in the D. subobscura than in the D. pseudoobscura lineage, whereas the reverse is true for the Adh gene.

12.
To determine the relative importance of gene conversion followed by natural selection and of natural selection for point mutation in generating variability in immunoglobulins, the numbers of synonymous and nonsynonymous substitutions in immunoglobulin sequences of various subgroups were estimated for complementarity-determining regions (CDRs) and for framework regions (FRs). Both the number of synonymous substitutions and the number of nonsynonymous substitutions in the CDR were found to exceed the corresponding numbers in the FR. Therefore, gene conversion is likely to be an important mechanism for providing variability in the CDR of immunoglobulins. The correlation coefficients between the number of synonymous substitutions and the number of nonsynonymous substitutions and between the substitution number in the CDR and that in the FR were found to be very low. Again, gene conversion is thought to be responsible for this finding.

13.
We consider three approaches for estimating the rates of nonsynonymous and synonymous changes at each site in a sequence alignment in order to identify sites under positive or negative selection: (1) a suite of fast likelihood-based "counting methods" that employ either a single most likely ancestral reconstruction, weighting across all possible ancestral reconstructions, or sampling from ancestral reconstructions; (2) a random effects likelihood (REL) approach, which models variation in nonsynonymous and synonymous rates across sites according to a predefined distribution, with the selection pressure at an individual site inferred using an empirical Bayes approach; and (3) a fixed effects likelihood (FEL) method that directly estimates nonsynonymous and synonymous substitution rates at each site. All three methods incorporate flexible models of nucleotide substitution bias and variation in both nonsynonymous and synonymous substitution rates across sites, facilitating the comparison between the methods. We demonstrate that the results obtained using these approaches show broad agreement in levels of Type I and Type II error and in estimates of substitution rates. Counting methods are well suited for large alignments, for which there is high power to detect positive and negative selection, but appear to underestimate the substitution rate. A REL approach, which is more computationally intensive than counting methods, has higher power than counting methods to detect selection in data sets of intermediate size but may suffer from higher rates of false positives for small data sets. A FEL approach appears to capture the pattern of rate variation better than counting methods or random effects models, does not suffer from as many false positives as random effects models for data sets comprising few sequences, and can be efficiently parallelized. 
Our results suggest that previously reported differences between results obtained by counting methods and random effects models arise due to a combination of the conservative nature of counting-based methods, the failure of current random effects models to allow for variation in synonymous substitution rates, and the naive application of random effects models to extremely sparse data sets. We demonstrate our methods on sequence data from the human immunodeficiency virus type 1 env and pol genes and simulated alignments.

14.
A positive correlation between ω, the ratio of the nonsynonymous and synonymous substitution rates, and dS, the synonymous substitution rate, has recently been reported. This correlation is unexpected under simple evolutionary models. Here, we investigate two explanations for this correlation: first, whether it is a consequence of a statistical bias in the estimation of ω and second, whether it is due to substitutions at adjacent sites. Using simulations, we show that estimates of ω are biased when levels of divergence are low. This is true using the methods of Yang and Nielsen, Nei and Gojobori, and Muse and Gaut. Although the bias could generate a positive correlation between ω and dS, we show that it is unlikely to be the main determinant. Instead we show that the correlation is reduced when genes that are high quality in sequence, annotation, and alignment are used. The remaining, likely genuine, positive correlation appears to be due to adjacent tandem substitutions; single substitutions, though far more numerous, do not contribute to the correlation. Genuine adjacent substitutions may be due to mutation or selection.

15.
Natural selection operating on amino acid substitution at single amino acid sites can be detected by comparing the rates of synonymous (r(S)) and nonsynonymous (r(N)) nucleotide substitution at single codon sites. Amino acid substitutions can be classified as conservative or radical according to whether they retain the properties of the substituted amino acid. Here methods for comparing the rates of conservative (r(C)) and radical (r(R)) nonsynonymous substitution with r(S) at single codon sites were developed to detect natural selection operating on these substitutions at single amino acid sites. A method for comparing r(C) and r(R) at single codon sites was also developed to detect biases toward these substitutions at single amino acid sites. Charge was used as the property of the amino acids. In a computer simulation, false-positive rates of these methods were always < 5%, unless termination sites were included in the computation of the numbers of sites and estimates of transition/transversion rate ratio were highly biased. The frequency of detection of natural selection operating on conservative substitution was almost independent of the presence of natural selection operating on radical substitution, and vice versa. Natural selection operating specifically on conservative and radical substitution was detected more efficiently by comparing r(S) with r(C) and r(S) with r(R) than by comparing r(S) with r(N). These methods also appeared to be robust against the occurrence of recombination during evolution. In an analysis of class I human leukocyte antigen, negative selection operating on conservative substitution, but not positive selection operating on radical substitution, was observed at some of the codon sites with r(R) > r(C), suggesting that r(R) > r(C) may not necessarily be an indicator of positive selection operating on radical substitution.

16.
It is often stated that patterns of nonsynonymous rate variation among mammalian lineages are more irregular than expected or overdispersed under the neutral model, whereas synonymous sites conform to the neutral model. Here we reexamined genome-wide patterns of the variance to mean ratio, or index of dispersion (R), of substitutions in proteins from human, mouse, and dog. Contrary to the prevailing notion, we found that the mean index of dispersion for nonsynonymous sites of mammalian proteins is not significantly different from 1. We propose that earlier analyses were biased because the data included disproportionately more protein hormones, which tend to be more dispersed than genes in other functional categories. Synonymous sites exhibit a greater degree of dispersion than nonsynonymous sites, although similar to earlier estimates and potentially due to errors associated with correction for multiple hits. Overall, our analysis identifies a strong genome-wide generation-time effect and natural selection as important determinants of among-lineage variation of protein evolutionary rates. Furthermore, patterns of lineage-specific selective constraint are consistent with the nearly neutral model of molecular evolution.
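
The index of dispersion referred to above is simply the variance of substitution counts across lineages divided by their mean. A minimal Python sketch follows; it uses the sample variance and ignores the lineage-effect weighting that published analyses typically apply before computing R, so it is an illustration of the definition rather than of the study's procedure. The function name and the input counts are hypothetical.

```python
from statistics import mean, variance


def index_of_dispersion(counts):
    """Variance-to-mean ratio R of substitution counts across lineages."""
    if len(counts) < 2:
        raise ValueError("need counts from at least two lineages")
    m = mean(counts)
    if m == 0:
        raise ValueError("mean is zero; R is undefined")
    return variance(counts) / m  # sample variance (n - 1 denominator)


if __name__ == "__main__":
    # Hypothetical substitution counts for one gene in three lineages.
    print(index_of_dispersion([4.0, 6.0, 5.0]))
```

Under a simple Poisson process R is expected to be close to 1, which is the benchmark the abstract compares against.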

17.
There are two tightly linked loci (D and CE) for the human Rh blood group. Their gene products are membrane proteins having 12 transmembrane domains and form a complex with Rh50 glycoprotein on erythrocytes. We constructed phylogenetic networks of human and nonhuman primate Rh genes, and the network patterns suggested the occurrences of gene conversions. We therefore used a modified site-by-site reconstruction method by using two assumed gene trees and detected 9 or 11 converted regions. After eliminating the effect of gene conversions, we estimated numbers of nonsynonymous and synonymous substitutions for each branch of both trees. Whichever gene tree we selected, the branch connecting hominoids and Old World monkeys showed significantly higher nonsynonymous than synonymous substitutions, an indication of positive selection. Many other branches also showed higher nonsynonymous than synonymous substitutions; this suggests that the Rh genes have experienced some kind of positive selection. Received: 16 March 1999 / Accepted: 17 June 1999

18.
Codon Substitution in Evolution and the "Saturation" of Synonymous Changes   Cited: 4 (self-citations: 1, by others: 3)
Takashi Gojobori. Genetics 1983, 105(4): 1011-1027
A mathematical model for codon substitution is presented, taking into account unequal mutation rates among different nucleotides and purifying selection. This model is constructed by using a 61 × 61 transition probability matrix for the 61 nonterminating codons. Under this model, a computer simulation is conducted to study the numbers of silent (synonymous) and amino acid-altering (nonsynonymous) nucleotide substitutions when the underlying mutation rates among the four kinds of nucleotides are not equal. It is assumed that the substitution rates are constant over evolutionary time, the codon frequencies being in equilibrium, and, thus, the numbers of synonymous and nonsynonymous substitutions both increase linearly with evolutionary time. It is shown that, when the mutation rates are not equal, the estimate of synonymous substitutions obtained by F. Perler, A. Efstratiadis, P. Lomedico, W. Gilbert, R. Kolodner and J. Dodgson's "Percent Corrected Divergence" method increases nonlinearly, although the true number of synonymous substitutions increases linearly. It is, therefore, possible that the "saturation" of synonymous substitutions observed by Perler et al. is due to the inefficiency of their method to detect all synonymous substitutions.

19.
The proportion of amino acid substitutions driven by adaptive evolution can potentially be estimated from polymorphism and divergence data by an extension of the McDonald-Kreitman test. We have developed a maximum-likelihood method to do this and have applied our method to several data sets from three Drosophila species: D. melanogaster, D. simulans, and D. yakuba. The estimated number of adaptive substitutions per codon is not uniformly distributed among genes, but follows a leptokurtic distribution. However, the proportion of amino acid substitutions fixed by adaptive evolution seems to be remarkably constant across the genome (i.e., the proportion of amino acid substitutions that are adaptive appears to be the same in fast-evolving and slow-evolving genes; fast-evolving genes have higher numbers of both adaptive and neutral substitutions). Our estimates do not seem to be significantly biased by selection on synonymous codon use or by the assumption of independence among sites. Nevertheless, an accurate estimate is hampered by the existence of slightly deleterious mutations and variations in effective population size. The analysis of several Drosophila data sets suggests that approximately 25% ± 20% of amino acid substitutions were driven by positive selection in the divergence between D. simulans and D. yakuba.
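
The idea behind extending the McDonald-Kreitman test can be illustrated with the simple single-table point estimate of the adaptive proportion, alpha = 1 - (Ds × Pn) / (Dn × Ps). The Python sketch below is this textbook estimator only; it is not the maximum-likelihood method that the abstract describes, and the function name and the counts used are hypothetical.

```python
def mk_alpha(dn, ds, pn, ps):
    """Point estimate of the proportion of adaptive amino acid substitutions.

    dn, ds: nonsynonymous and synonymous fixed differences (divergence)
    pn, ps: nonsynonymous and synonymous polymorphisms
    """
    if dn == 0 or ps == 0:
        raise ValueError("alpha is undefined when dn or ps is zero")
    return 1.0 - (ds * pn) / (dn * ps)


if __name__ == "__main__":
    # Hypothetical counts for a single gene.
    print(mk_alpha(dn=10, ds=20, pn=5, ps=20))
```

Estimates from a single gene are noisy and can be negative, which is one reason for pooling information across genes, as the abstract's method does.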

20.
Summary: The hemagglutinin (HA) genes of influenza type A (H1N1) viruses isolated from swine were cloned into plasmid vectors and their nucleotide sequences were determined. A phylogenetic tree for the HA genes of swine and human influenza viruses was constructed by the neighbor-joining method. It showed that the divergence between swine and human HA genes might have occurred around 1905. The estimated rates of synonymous (silent) substitutions for swine and human influenza viruses were almost the same. For both viruses, the rate of synonymous substitution was much higher than that of nonsynonymous (amino acid altering) substitution. This holds even when only the antigenic sites of the HA are considered. This feature is consistent with the neutral theory of molecular evolution. The rate of nonsynonymous substitution for human influenza viruses was three times the rate for swine influenza viruses. In particular, nonsynonymous substitutions at antigenic sites occurred less frequently in swine than in humans. The difference in the rate of nonsynonymous substitution between swine and human influenza viruses can be explained by the different degrees of functional constraint operating on the amino acid sequence of the HA in both hosts.
