Similar articles
Found 20 similar articles; search took 296 ms.
1.
Yang Z, Ro S, Rannala B. Genetics, 2003, 165(2): 695-705
The role of somatic mutation in cancer is well established and several genes have been identified that are frequent targets. This has enabled large-scale screening studies of the spectrum of somatic mutations in cancers of particular organs. Cancer gene mutation databases compile the results of many studies and can provide insight into the importance of specific amino acid sequences and functional domains in cancer, as well as elucidate aspects of the mutation process. Past studies of the spectrum of cancer mutations (in particular genes) have examined overall frequencies of mutation (at specific nucleotides) and of missense, nonsense, and silent substitution (at specific codons) both in the sequence as a whole and in a specific functional domain. Existing methods ignore features of the genetic code that allow some codons to mutate to missense, or stop, codons more readily than others (i.e., by one nucleotide change, vs. two or three). A new codon-based method to estimate the relative rate of substitution (fixation of a somatic mutation in a cancer cell lineage) of nonsense vs. missense mutations in different functional domains and in different tumor tissues is presented. Models that account for several potential influences on rates of somatic mutation and substitution in cancer progenitor cells and allow biases of mutation rates for particular dinucleotide sequences (CGs and dipyrimidines), transition vs. transversion bias, and variable rates of silent substitution across functional domains (useful in detecting investigator sampling bias) are considered. Likelihood-ratio tests are used to choose among models, using cancer gene mutation data. The method is applied to analyze published data on the spectrum of p53 mutations in cancers. 
A novel finding is that the ratio of the probability of nonsense to missense substitution is much lower in the DNA-binding and transactivation domains (ratios near 1) than in structural domains such as the linker, tetramerization (oligomerization), and proline-rich domains (ratios exceeding 100 in some tissues), implying that the specific amino acid sequence may be less critical in structural domains (e.g., amino acid changes less often lead to cancer). The transition vs. transversion bias and effect of CpG dinucleotides on mutation rates in p53 varied greatly across cancers of different organs, likely reflecting effects of different endogenous and exogenous factors influencing mutation in specific organs.
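
The point that some codons reach a stop or missense codon by one nucleotide change more readily than others can be illustrated with a short sketch. This is not the codon-based likelihood method of the abstract; it only enumerates the nine single-nucleotide neighbours of a codon under the standard genetic code. The function name `one_step_outcomes` is a hypothetical helper introduced here for illustration.

```python
from collections import Counter

BASES = "UCAG"
# Standard genetic code, codons ordered UUU, UUC, UUA, UUG, UCU, ... ('*' = stop).
AMINO = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
         "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODE = {a + b + c: AMINO[16 * i + 4 * j + k]
        for i, a in enumerate(BASES)
        for j, b in enumerate(BASES)
        for k, c in enumerate(BASES)}


def one_step_outcomes(codon):
    """Classify the nine single-nucleotide changes of a sense codon.

    Hypothetical helper: returns counts of silent, missense and nonsense
    outcomes, treating every change as equally likely (no rate biases).
    """
    counts = Counter(silent=0, missense=0, nonsense=0)
    for pos in range(3):
        for base in BASES:
            if base == codon[pos]:
                continue
            mutant = codon[:pos] + base + codon[pos + 1:]
            if CODE[mutant] == "*":
                counts["nonsense"] += 1
            elif CODE[mutant] == CODE[codon]:
                counts["silent"] += 1
            else:
                counts["missense"] += 1
    return dict(counts)
```

For example, `one_step_outcomes("UGG")` shows that the tryptophan codon has two one-step routes to a stop codon, whereas `one_step_outcomes("CUU")` has none. A model that ignores this difference would treat the two codons as equally exposed to nonsense mutation.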

2.
Synonymous and nonsynonymous rate variation in nuclear genes of mammals (total citations: 34; self-citations: 6; citations by others: 28)
A maximum likelihood approach was used to estimate the synonymous and nonsynonymous substitution rates in 48 nuclear genes from primates, artiodactyls, and rodents. A codon-substitution model was assumed, which accounts for the genetic code structure, transition/transversion bias, and base frequency biases at codon positions. Likelihood ratio tests were applied to test the constancy of nonsynonymous to synonymous rate ratios among branches (evolutionary lineages). It is found that at 22 of the 48 nuclear loci examined, the nonsynonymous/synonymous rate ratio varies significantly across branches of the tree. The result provides strong evidence against a strictly neutral model of molecular evolution. Our likelihood estimates of synonymous and nonsynonymous rates differ considerably from previous results obtained from approximate pairwise sequence comparisons. The differences between the methods are explored by detailed analyses of data from several genes. Transition/transversion rate bias and codon frequency biases are found to have significant effects on the estimation of synonymous and nonsynonymous rates, and approximate methods do not adequately account for those factors. The likelihood approach is preferable, even for pairwise sequence comparison, because more realistic models of the mutation and substitution processes can be incorporated in the analysis. Received: 17 May 1997 / Accepted: 28 September 1997

3.
X Liu, H Liu, W Guo, K Yu. Gene, 2012, 509(1): 136-141
Codon models are now widely used to draw evolutionary inferences from alignments of homologous sequence data. Incorporating physicochemical properties of amino acids into codon models, two novel codon substitution models describing the evolution of protein-coding DNA sequences are presented based on the similarity scores of amino acids. To describe substitutions between codons, a continuous-time Markov process is used. Transition/transversion rate bias and nonsynonymous codon usage bias are allowed in the models. In our implementation, the parameters are estimated by the maximum-likelihood (ML) method, as in previous studies. Furthermore, instantaneous mutations involving more than one nucleotide position of a codon are considered in the second model. The two suggested models are then applied to five real data sets. The results indicate that the new codon models, which consider physicochemical properties of amino acids, provide a better fit to the data than existing codon models and thereby produce more reliable estimates of certain biologically important measures than existing methods.

4.
Three frequently used methods for estimating the synonymous and nonsynonymous substitution rates (Ks and Ka) were evaluated and compared for their accuracies; these methods are denoted by LWL85, LPB93, and GY94, respectively. For this purpose, we used a codon-evolution model to obtain the expected Ka and Ks values for the above three methods and compared the values with those obtained by the three methods. We also proposed some modifications of LWL85 and LPB93 to increase their accuracies. Our computer simulations under the codon-evolution model showed that for sequences ≤ 300 codons, the performance of GY94 may not be reliable. For longer sequences, GY94 is more accurate for estimating the Ka/Ks ratio than the modified LPB93 and LWL85 in the majority of the cases studied. This is particularly so when k, the transition/transversion (mutation) rate ratio, is ≥ 3. However, when k is approximately 2 and when the sequence divergence is relatively large, the modified LWL85 performed better than GY94 and the modified LPB93. The inferiority of LPB93 to LWL85 is surprising because LPB93 was intended to improve LWL85. Also, it has been thought that the codon-based method of GY94 is better than the heuristic method of LWL85, but our simulation results showed that in many cases, the opposite was true, even though our simulation was based on the codon-evolution model.

5.
Approximate methods for estimating the numbers of synonymous and nonsynonymous substitutions between two DNA sequences involve three steps: counting of synonymous and nonsynonymous sites in the two sequences, counting of synonymous and nonsynonymous differences between the two sequences, and correcting for multiple substitutions at the same site. We examine complexities involved in those steps and propose a new approximate method that takes into account two major features of DNA sequence evolution: transition/transversion rate bias and base/codon frequency bias. We compare the new method with maximum likelihood, as well as several other approximate methods, by examining infinitely long sequences, performing computer simulations, and analyzing a real data set. The results suggest that when there are transition/transversion rate biases and base/codon frequency biases, previously described approximate methods for estimating the nonsynonymous/synonymous rate ratio may involve serious biases, and the bias can be both positive and negative. The new method is, in general, superior to earlier approximate methods and may be useful for analyzing large data sets, although maximum likelihood appears to always be the method of choice.

6.
A common approach to estimating the strength and direction of selection acting on protein-coding sequences is to calculate the dN/dS ratio. The method for calculating dN/dS has been widely used, and many critical reviews of its application have appeared since it was proposed by Nei and Gojobori in 1986. However, the method is still evolving to account for non-uniform substitution rates and pretermination codons. In our study of SNPs in 586 genes across 156 Escherichia coli strains, synonymous polymorphism in 2-fold degenerate codons was higher than that in 4-fold degenerate codons, which could be attributed to the difference between transition (Ti) and transversion (Tv) substitution rates, where the average rate of a transition is in general four times that of a transversion. We considered both the Ti/Tv ratio and nonsense mutation in pretermination codons to improve estimates of synonymous (S) and non-synonymous (NS) sites. The accuracy of estimating dN/dS has been improved by considering the Ti/Tv ratio and nonsense substitutions in pretermination codons. We showed that applying the modified approach based on the Ti/Tv ratio and pretermination codons results in higher values of dN/dS in 29 common genes of equal reading frames between E. coli and Salmonella enterica. This study emphasizes the robustness of amino acid composition with varying codon degeneracy, as well as the pretermination codons, when calculating dN/dS values.

7.
A model of nucleotide substitution that allows the transition/transversion rate bias to vary across sites was constructed. We examined the fit of this model using likelihood-ratio tests by analyzing 13 protein coding genes and 1 pseudogene. Likelihood-ratio testing indicated that a model that allows variation in the transition/transversion rate bias across sites provided a significant improvement in fit for most protein coding genes but not for the pseudogene. When the analysis was repeated with parameters estimated separately for first, second, and third codon positions, strong heterogeneity was uncovered for the first and second codon positions; the variation in the transition/transversion rate was generally weaker at the third codon position. The transition rate bias and branch lengths are underestimated when variation in the transition/transversion rate is not accommodated, suggesting that it may be important to accommodate variation in the pattern of nucleotide substitution for accurate estimation of evolutionary parameters. Received: 4 November 1997 / Accepted: 19 May 1998

8.
The Selection-Mutation-Drift Theory of Synonymous Codon Usage (total citations: 69; self-citations: 11; citations by others: 58)
M. Bulmer. Genetics, 1991, 129(3): 897-907
It is argued that the bias in synonymous codon usage observed in unicellular organisms is due to a balance between the forces of selection and mutation in a finite population, with greater bias in highly expressed genes reflecting stronger selection for efficiency of translation. A population genetic model is developed taking into account population size and selective differences between synonymous codons. A biochemical model is then developed to predict the magnitude of selective differences between synonymous codons in unicellular organisms in which growth rate (or possibly growth yield) can be equated with fitness. Selection can arise from differences in either the speed or the accuracy of translation. A model for the effect of speed of translation on fitness is considered in detail, a similar model for accuracy more briefly. The model is successful in predicting a difference in the degree of bias between the beginning and the rest of the gene under some circumstances, as observed in Escherichia coli, but grossly overestimates the amount of bias expected. Possible reasons for this discrepancy are discussed.

9.
A codon-based approach to estimating the number of variable sites in a protein is presented. When first and second positions of codons are assumed to be replacement positions, a capture-recapture model can be used to estimate the number of variable codons from every pair of homologous and aligned sequences. The capture-recapture estimate is compared to a maximum likelihood estimate of the number of variable codons and to previous approaches that estimate the number of variable sites (not codons) in a sequence. Computer simulations are presented that show under which circumstances the capture-recapture estimate can be used to correct biases in distance matrices. Analysis of published sequences of two genes, calmodulin and serum albumin, shows that distance corrections that employ a capture-recapture estimate of the number of variable sites may be considerably different from corrections that assume that the number of variable sites is equal to the total number of positions in the sequence. Offprint requests to: A. Sidow

10.
Statistical and biochemical studies of the genetic code have found evidence of nonrandom patterns in the distribution of codon assignments. It has, for example, been shown that the code minimizes the effects of point mutation or mistranslation: erroneous codons are either synonymous or code for an amino acid with chemical properties very similar to those of the one that would have been present had the error not occurred. This work has suggested that the second base of codons is less efficient in this respect, by about three orders of magnitude, than the first and third bases. These results are based on the assumption that all forms of error at all bases are equally likely. We extend this work to investigate (1) the effect of weighting transition errors differently from transversion errors and (2) the effect of weighting each base differently, depending on reported mistranslation biases. We find that if the bias affects all codon positions equally, as might be expected were the code adapted to a mutational environment with transition/transversion bias, then any reasonable transition/transversion bias increases the relative efficiency of the second base by an order of magnitude. In addition, if we employ weightings to allow for biases in translation, then only 1 in every million random alternative codes generated is more efficient than the natural code. We thus conclude not only that the natural genetic code is extremely efficient at minimizing the effects of errors, but also that its structure reflects biases in these errors, as might be expected were the code the product of selection. Received: 25 July 1997 / Accepted: 9 January 1998

11.
A common problem in molecular phylogenetics is choosing a model of DNA substitution that does a good job of explaining the DNA sequence alignment without introducing superfluous parameters. A number of methods have been used to choose among a small set of candidate substitution models, such as the likelihood ratio test, the Akaike Information Criterion (AIC), the Bayesian Information Criterion (BIC), and Bayes factors. Current implementations of any of these criteria suffer from the limitation that only a small set of models is examined, or that the test does not allow easy comparison of non-nested models. In this article, we expand the pool of candidate substitution models to include all possible time-reversible models. This set includes seven models that have already been described. We show how Bayes factors can be calculated for these models using reversible jump Markov chain Monte Carlo, and apply the method to 16 DNA sequence alignments. For each data set, we compare the model with the best Bayes factor to the best models chosen using AIC and BIC. We find that the best model under any of these criteria is not necessarily the most complicated one; models with an intermediate number of substitution types typically do best. Moreover, almost all of the models that are chosen as best do not constrain a transition rate to be the same as a transversion rate, suggesting that it is the transition/transversion rate bias that plays the largest role in determining which models are selected. Importantly, the reversible jump Markov chain Monte Carlo algorithm described here allows estimation of phylogeny (and other phylogenetic model parameters) to be performed while accounting for uncertainty in the model of DNA substitution.
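
Of the criteria named in the abstract above, AIC and BIC are simple enough to sketch directly from their textbook definitions (AIC = 2k − 2 ln L, BIC = k ln n − 2 ln L). The sketch below is only an illustration of those two formulas, not the reversible jump procedure of the article. The helper names, and the choice of the number of alignment sites as the sample size `n_sites`, are assumptions made here.

```python
import math


def aic(log_likelihood, n_params):
    """Akaike Information Criterion: 2k - 2 ln L (lower is better)."""
    return 2 * n_params - 2 * log_likelihood


def bic(log_likelihood, n_params, n_sites):
    """Bayesian Information Criterion: k ln n - 2 ln L (lower is better).

    Assumption: the sample size n is taken to be the number of alignment sites.
    """
    return n_params * math.log(n_sites) - 2 * log_likelihood


def best_model(fits, n_sites, criterion="aic"):
    """Pick the model name with the lowest score.

    `fits` maps a model name to (maximized log-likelihood, free parameters).
    Hypothetical helper for illustration only.
    """
    def score(item):
        _, (lnl, k) = item
        return aic(lnl, k) if criterion == "aic" else bic(lnl, k, n_sites)

    return min(fits.items(), key=score)[0]
```

Because BIC's penalty per parameter, ln n, exceeds AIC's penalty of 2 once n is larger than about 7, BIC tends to prefer simpler models than AIC on alignments of realistic length.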

12.
Mutational changes involving transitions can convert only one sense codon to ochre, two codons to amber, and two codons to UGA. One codon, UGG for tryptophan, can be converted by transitions to either amber or UGA. By transversion changes 15 other codons can be converted to ochre and/or amber and/or UGA. Ten amino acids can never be replaced by chain termination as a result of transition and transversion mutagenesis of single base-pairs. For two systems (bacteriophage T4 lysozyme and Escherichia coli K12 tryptophan synthetase A protein) in which the polypeptide gene product has been completely sequenced, one can construct predictive intragenic distribution maps for the location of all possible chain-terminating mutations arising as a result of transitions and transversions.
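
The transition counts stated in the abstract above can be checked by enumeration. The following is a minimal sketch assuming the standard genetic code and the usual nonsense-codon names (ochre = UAA, amber = UAG); the function names are hypothetical and chosen here for illustration. Since a single-base change is reversible, listing the sense codons one step away from a stop codon is the same as listing the sense codons that can mutate to it.

```python
BASES = "UCAG"
# Standard genetic code, codons ordered UUU, UUC, UUA, UUG, UCU, ... ('*' = stop).
AMINO = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
         "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODE = {a + b + c: AMINO[16 * i + 4 * j + k]
        for i, a in enumerate(BASES)
        for j, b in enumerate(BASES)
        for k, c in enumerate(BASES)}
TRANSITION = {"A": "G", "G": "A", "C": "U", "U": "C"}
STOPS = ("UAA", "UAG", "UGA")  # ochre, amber, UGA


def sense_neighbors(stop, transitions_only):
    """Sense codons one substitution away from `stop`.

    With transitions_only=True only A<->G and C<->U changes are used;
    with False only transversions are used.
    """
    found = set()
    for pos, old in enumerate(stop):
        for base in BASES:
            if base == old:
                continue
            if (TRANSITION[old] == base) != transitions_only:
                continue
            codon = stop[:pos] + base + stop[pos + 1:]
            if CODE[codon] != "*":
                found.add(codon)
    return found


def never_terminated():
    """Amino acids with no codon a single substitution away from any stop."""
    reachable = set()
    for stop in STOPS:
        for flag in (True, False):
            reachable |= {CODE[c] for c in sense_neighbors(stop, flag)}
    return set(CODE.values()) - reachable - {"*"}
```

Calling `sense_neighbors("UAG", True)` returns the two codons that reach amber by a transition, one of which is UGG, and `never_terminated()` returns the amino acids that cannot be replaced by chain termination through any single base change.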

13.
14.
A simple method for estimating the transition/transversion ratio was developed. This method can be applied to not only two sequences but also more than two sequences. The statistical properties of the method and some other methods were examined by numerical computation and computer simulation. The results obtained showed that, in terms of bias and variance, the new method gives a better estimate of the transition/transversion ratio than do the other examined methods. The new method was applied to human and chimpanzee mitochondrial control region sequences. Received: 22 September 1997 / Accepted: 1 November 1997
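
The abstract above does not describe its estimator, so the sketch below is not that method. It shows only the naive starting point that such estimators improve on: counting observed transitional and transversional differences between two aligned sequences, with no correction for multiple substitutions at a site. The function name `count_ti_tv` is an assumption introduced here.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def count_ti_tv(seq1, seq2):
    """Count observed transitions and transversions between aligned sequences.

    Naive, uncorrected counts: sites with gaps or ambiguous characters
    are skipped, and multiple hits at a site are not accounted for.
    """
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")
    ti = tv = 0
    for a, b in zip(seq1.upper(), seq2.upper()):
        if a == b or a not in "ACGT" or b not in "ACGT":
            continue
        if {a, b} <= PURINES or {a, b} <= PYRIMIDINES:
            ti += 1  # purine<->purine or pyrimidine<->pyrimidine
        else:
            tv += 1  # purine<->pyrimidine
    return ti, tv
```

The ratio `ti / tv` of these raw counts tends to fall as sequences diverge, because repeated transitions at the same site are hidden, which is one reason corrected estimators are needed.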

15.
Proteins evolve under a myriad of biophysical selection pressures that collectively control the patterns of amino acid substitutions. These evolutionary pressures are sufficiently consistent over time and across protein families to produce substitution patterns, summarized in global amino acid substitution matrices such as BLOSUM, JTT, WAG, and LG, which can be used to successfully detect homologs, infer phylogenies, and reconstruct ancestral sequences. Although the factors that govern the variation of amino acid substitution rates have received much attention, the influence of thermodynamic stability constraints remains unresolved. Here we develop a simple model to calculate amino acid substitution matrices from evolutionary dynamics controlled by a fitness function that reports on the thermodynamic effects of amino acid mutations in protein structures. This hybrid biophysical and evolutionary model accounts for nucleotide transition/transversion rate bias, multi-nucleotide codon changes, the number of codons per amino acid, and thermodynamic protein stability. We find that our theoretical model accurately recapitulates the complex yet universal pattern observed in common global amino acid substitution matrices used in phylogenetics. These results suggest that selection for thermodynamically stable proteins, coupled with nucleotide mutation bias filtered by the structure of the genetic code, is the primary driver behind the global amino acid substitution patterns observed in proteins throughout the tree of life.

16.
Based on the differences in synonymous codon use between E. coli and S. typhimurium, the synonymous substitution rates can be estimated. In contrast to previous studies on the substitution rates in these two organisms, we use a kinetic model that explicitly takes the selection bias into account. The selection pressure on synonymous codons for a particular amino acid can be calculated from the observed codon bias. This offers a unique opportunity to study systematically the relationship between substitution-rate constants and selection pressure. The results indicate that the codon bias in these organisms is determined by a mutation-selection balance rather than by stabilizing selection. A best fit to the data implies that the mutation rate constant increases about threefold in genes at low expression levels relative to those that are highly expressed. Correspondence to: O.G. Berg

17.
We present a likelihood method for estimating codon usage bias parameters along the lineages of a phylogeny. The method is an extension of the classical codon-based models used for estimating dN/dS ratios along the lineages of a phylogeny. However, we add one extra parameter for each lineage: the selection coefficient for optimal codon usage (S), allowing joint maximum likelihood estimation of S and the dN/dS ratio. We apply the method to previously published data from Drosophila melanogaster, Drosophila simulans, and Drosophila yakuba and show, in accordance with previous results, that the D. melanogaster lineage has experienced a reduction in the selection for optimal codon usage. However, the D. melanogaster lineage has also experienced a change in the biological mutation rates relative to D. simulans, in particular, a relative reduction in the mutation rate from A to G and an increase in the mutation rate from C to T. However, neither a reduction in the strength of selection nor a change in the mutational pattern can alone explain all of the data observed in the D. melanogaster lineage. For example, we also confirm previous results showing that the Notch locus has experienced positive selection for previously classified unpreferred mutations.

18.
Rao Y, Wu G, Wang Z, Chai X, Nie Q, Zhang X. DNA Research, 2011, 18(6): 499-512
Synonymous codons are used with different frequencies both among species and among genes within the same genome and are controlled by neutral processes (such as mutation and drift) as well as by selection. Up to now, a systematic examination of the codon usage for the chicken genome has not been performed. Here, we carried out a whole genome analysis of the chicken genome by the use of the relative synonymous codon usage (RSCU) method and identified 11 putative optimal codons, all of them ending with uracil (U), which departs significantly from the pattern observed in other eukaryotes. Optimal codons in the chicken genome are most likely the ones corresponding to highly expressed transfer RNA (tRNAs) or tRNA gene copy numbers in the cell. Codon bias, measured as the frequency of optimal codons (Fop), is negatively correlated with the G + C content and recombination rate, but positively correlated with gene expression, protein length, gene length and intron length. The positive correlation between codon bias and protein, gene and intron length is quite different from that in other multicellular organisms, as this trend has been found only in unicellular organisms. Our data show that regional G + C content explains a large proportion of the variance of codon bias in chicken. Stepwise selection model analyses indicate that G + C content of coding sequence is the most important factor for codon bias. It appears that variation in the G + C content of CDSs accounts for over 60% of the variation of codon bias. This study suggests that both mutation bias and selection contribute to codon bias. However, mutation bias is the driving force of the codon usage in the Gallus gallus genome. Our data also provide evidence that the negative correlation between codon bias and recombination rates in G. gallus is determined mostly by recombination-dependent mutational patterns.
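
The RSCU measure used in the abstract above is commonly defined as a codon's observed count divided by the count expected if all synonymous codons for that amino acid were used equally. The sketch below follows that common definition under the standard genetic code; it is not the authors' pipeline, and the function name `rscu` and the input format (a dictionary of codon counts) are assumptions made here.

```python
from collections import defaultdict

BASES = "UCAG"
# Standard genetic code, codons ordered UUU, UUC, UUA, UUG, UCU, ... ('*' = stop).
AMINO = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
         "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODE = {a + b + c: AMINO[16 * i + 4 * j + k]
        for i, a in enumerate(BASES)
        for j, b in enumerate(BASES)
        for k, c in enumerate(BASES)}


def rscu(codon_counts):
    """Relative synonymous codon usage for each sense codon.

    RSCU = observed count * (number of synonymous codons) / (total count
    for that amino acid). Amino acids with no observed codons are skipped.
    """
    families = defaultdict(list)
    for codon, aa in CODE.items():
        if aa != "*":
            families[aa].append(codon)

    values = {}
    for codons in families.values():
        total = sum(codon_counts.get(c, 0) for c in codons)
        if total == 0:
            continue
        for c in codons:
            values[c] = codon_counts.get(c, 0) * len(codons) / total
    return values
```

A value above 1 means the codon is used more often than expected under equal usage, and a value below 1 means it is used less often. Identifying putative optimal codons, as in the study, additionally requires comparing RSCU between highly and lowly expressed genes, which this sketch does not attempt.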

19.
Estimation of the Transition/Transversion Rate Bias and Species Sampling (total citations: 7; self-citations: 0; citations by others: 7)
The transition/transversion (ti/tv) rate ratios are estimated by pairwise sequence comparison and joint likelihood analysis using mitochondrial cytochrome b genes of 28 primate species, representing both the Strepsirrhini (lemurs and lories) and the Anthropoidea (monkeys, apes, and humans). Pairwise comparison reveals a strong negative correlation between estimates of the ti/tv ratio and the sequence distance, even when both are corrected for multiple substitutions. The maximum-likelihood estimate of the ti/tv ratio changes with the species included in the analysis. The ti/tv bias within the lemuriform taxa is found to be as strong as in the anthropoids, in contradiction to an earlier study which sampled only one lemuriform. Simulations show the surprising result that both the pairwise correction method and the joint likelihood analysis tend to overcorrect for multiple substitutions and overestimate the ti/tv ratio, especially at low sequence divergence. The bias, however, is not large enough to account for the observed patterns. Nucleotide frequency biases, variation of substitution rates among sites, and different evolutionary dynamics at the three codon positions can be ruled out as possible causes. The likelihood-ratio test suggests that the ti/tv rate ratios may be variable among evolutionary lineages. Without any biological evidence for such a variation, however, we are left with no plausible explanations for the observed patterns other than a possible saturation effect due to the unrealistic nature of the model assumed. Received: 1 October 1997 / Accepted: 29 September 1998

20.
Protein evolution by codon-based random deletions (total citations: 2; self-citations: 1; citations by others: 1)
A method to delete in-phase codons throughout a defined target region of a gene has been developed. This approach, named the codon-based random deletion (COBARDE) method, is able to delete complete codons in a random and combinatorial mode. Robustness, automation and fine-tuning of the mutagenesis rate are essential characteristics of the method, which is based on the assembly of oligonucleotides and on the use of two transient orthogonal protecting groups during the chemical synthesis. The performance of the method for protein function evolution was demonstrated by changing the substrate specificity of TEM-1 β-lactamase. Functional ceftazidime-resistant β-lactamase variants containing several deleted residues inside the catalytically important omega-loop region were found. The results show that the COBARDE method is a useful new molecular tool to access previously unexplorable sequence space.
