The canonical genetic code has been reported both to be error minimizing and to show stereochemical associations between coding triplets and binding sites. To test whether these two properties overlap unexpectedly, we generated 200,000 randomized genetic codes using each of five randomization schemes, with and without randomization of stop codons. Comparison of the code error (difference in polar requirement for single-nucleotide codon interchanges) with the coding triplet concentrations in RNA binding sites for eight amino acids shows that these properties are independent and uncorrelated. Thus, one is not the result of the other, and error minimization and triplet associations probably arose independently during the history of the genetic code. We explicitly show that prior fixation of a stereochemical core is consistent with an effective later minimization of error. [Reviewing Editor: Dr. Stephen Freeland]
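
The "code error" comparison described above can be illustrated with a minimal sketch. It is not the authors' procedure: it uses a single randomization scheme (the 20 amino acids are permuted among the standard code's synonymous codon blocks, with stop codons held fixed), only 500 random codes, and the mean squared difference in polar requirement as the error measure. The polar requirement values in `PR` are the commonly cited Woese values and should be treated as an assumption here; all function and variable names are hypothetical.

```python
import itertools
import random

BASES = "UCAG"
# Standard genetic code, codons enumerated in UCAG order; '*' marks stop codons.
AA_STRING = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
             "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODONS = ["".join(c) for c in itertools.product(BASES, repeat=3)]
STANDARD = dict(zip(CODONS, AA_STRING))

# Polar requirement per amino acid (assumed values, commonly cited from Woese).
PR = {"A": 7.0, "R": 9.1, "N": 10.0, "D": 13.0, "C": 4.8, "Q": 8.6,
      "E": 12.5, "G": 7.9, "H": 8.4, "I": 4.9, "L": 4.9, "K": 10.1,
      "M": 5.3, "F": 5.0, "P": 6.6, "S": 7.5, "T": 6.6, "W": 5.2,
      "Y": 5.4, "V": 5.6}


def code_error(code):
    """Mean squared polar-requirement change over all single-nucleotide
    interchanges between sense codons (changes to or from stop are skipped)."""
    total, count = 0.0, 0
    for codon, aa in code.items():
        if aa == "*":
            continue
        for pos in range(3):
            for base in BASES:
                if base == codon[pos]:
                    continue
                neighbor = code[codon[:pos] + base + codon[pos + 1:]]
                if neighbor == "*":
                    continue
                total += (PR[aa] - PR[neighbor]) ** 2
                count += 1
    return total / count


def random_code(rng):
    """Permute the 20 amino acids among the standard synonymous blocks,
    keeping the stop codons where they are."""
    amino_acids = sorted(set(AA_STRING) - {"*"})
    shuffled = amino_acids[:]
    rng.shuffle(shuffled)
    swap = dict(zip(amino_acids, shuffled))
    swap["*"] = "*"
    return {codon: swap[aa] for codon, aa in STANDARD.items()}


rng = random.Random(0)  # fixed seed so the run is repeatable
std_error = code_error(STANDARD)
rand_errors = [code_error(random_code(rng)) for _ in range(500)]
frac_better = sum(e < std_error for e in rand_errors) / len(rand_errors)

print(f"standard code error: {std_error:.2f}")
print(f"mean random code error: {sum(rand_errors) / len(rand_errors):.2f}")
print(f"fraction of random codes with lower error: {frac_better:.3f}")
```

Other schemes, such as also shuffling the stop codons, would replace `random_code` while leaving `code_error` unchanged.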

The genetic code is not random but instead is organized in such a way that single nucleotide substitutions are more likely to result in changes between similar amino acids. This fidelity, or error minimization, has been proposed to be an adaptation within the genetic code. Many models have been proposed to measure this adaptation. However, we find that none of these consider codon usage differences between species. Furthermore, use of different indices of amino acid physicochemical characteristics leads to different estimates of this adaptation within the code. In this study, we try to establish a more accurate model to address this problem. In our model, a weighting scheme is established for mistranslation biases of the three different codon positions, transition/transversion biases, and codon usage. Different indices of amino acid physicochemical characteristics are also considered. In contrast to previous work, our results show that the natural genetic code is not fully optimized for error minimization. The genetic code, therefore, is not the most optimized one for error minimization, but one that balances flexibility and fidelity for different species.

Distances between amino acids were derived from the polar requirement measure of amino acid polarity and Benner and co-workers' (1994) 74-100 PAM matrix. These distances were used to examine the average effects of amino acid substitutions due to single-base errors in the standard genetic code and equally degenerate randomized variants of the standard code. Second-position transitions conserved all distances on average, an order of magnitude more than did second-position transversions. In contrast, first-position transitions and transversions were about equally conservative. In comparison with randomized codes, second-position transitions in the standard code significantly conserved mean square differences in polar requirement and mean Benner matrix-based distances, but mean absolute value differences in polar requirement were not significantly conserved. The discrepancy suggests that these commonly used distance measures may be insufficient for strict hypothesis testing without more information. The translational consequences of single-base errors were then examined in different codon contexts, and similarities between these contexts explored with a hierarchical cluster analysis. In one cluster of codon contexts corresponding to the RNY and GNR codons, second-position transversions between C and G and transitions between C and U were most conservative of both polar requirement and the matrix-based distance. In another cluster of codon contexts, second-position transitions between A and G were most conservative. Despite the claims of previous authors to the contrary, it is shown theoretically that the standard code may have been shaped by position-invariant forces such as mutation and base content. These forces may have left heterogeneous signatures in the code because of differences in translational fidelity by codon position. 
A scenario for the origin of the code is presented wherein selection for error minimization could have occurred multiple times in disjoint parts of the code through a phyletic process of competition between lineages. This process permits error minimization without the disruption of previously useful messages, and does not predict that the code is optimally error-minimizing with respect to modern error. Instead, the code may be a record of genetic process and patterns of mutation before the radiation of modern organisms and organelles. Received: 28 July 1997 / Accepted: 23 January 1998

Selection on Codon Usage for Error Minimization at the Protein Level
Given the structure of the genetic code, synonymous codons differ in their capacity to minimize the effects of errors due to mutation or mistranslation. I suggest that this may lead, in protein-coding genes, to a preference for codons that minimize the impact of errors at the protein level. I develop a theoretical measure of error minimization for each codon, based on amino acid similarity. This measure is used to calculate the degree of error minimization for 82 genes of Drosophila melanogaster and 432 rodent genes and to study its relationship with CG content, the degree of codon usage bias, and the rate of nucleotide substitution. I show that (i) Drosophila and rodent genes tend to prefer codons that minimize errors; (ii) this cannot be merely the effect of mutation bias; (iii) the degree of error minimization is correlated with the degree of codon usage bias; (iv) the amino acids that contribute more to codon usage bias are the ones for which synonymous codons differ more in the capacity to minimize errors; and (v) the degree of error minimization is correlated with the rate of nonsynonymous substitution. These results suggest that natural selection for error minimization at the protein level plays a role in the evolution of coding sequences in Drosophila and rodents. Reviewing Editor: Dr. Massimo Di Giulio

Studies on the origin of the genetic code compare measures of the degree of error minimization of the standard code with measures produced by random variant codes but do not take into account codon usage, which was probably highly biased during the origin of the code. Codon usage bias could play an important role in the minimization of the chemical distances between amino acids because the importance of errors depends also on the frequency of the different codons. Here I show that when codon usage is taken into account, the degree of error minimization of the standard code may be dramatically reduced, and shifting to alternative codes often increases the degree of error minimization. This is especially true with a high CG content, which was probably the case during the origin of the code. I also show that the frequency of codes that perform better than the standard code, in terms of relative efficiency, is much higher in the neighborhood of the standard code itself, even when not considering codon usage bias; therefore alternative codes that differ only slightly from the standard code are more likely to evolve than some previous analyses suggested. My conclusions are that the standard genetic code is far from being an optimum with respect to error minimization and must have arisen for reasons other than error minimization. [Reviewing Editor: Martin Kreitman]

The Case for an Error Minimizing Standard Genetic Code
Since discovering the pattern by which amino acids are assigned to codons within the standard genetic code, investigators have explored the idea that natural selection placed biochemically similar amino acids near one another in coding space so as to minimize the impact of mutations and/or mistranslations. The analytical evidence to support this theory has grown in sophistication and strength over the years, and counterclaims questioning its plausibility and quantitative support have yet to transcend some significant weaknesses in their approach. These weaknesses are illustrated here by means of a simple simulation model for adaptive genetic code evolution. There remain ill-explored facets of the ‘error minimizing’ code hypothesis, however, including the mechanism and pathway by which an adaptive pattern of codon assignments emerged, the extent to which natural selection created synonym redundancy, its role in shaping the amino acid and nucleotide languages, and even the correct interpretation of the adaptive codon assignment pattern: these represent fertile areas for future research.

We have assumed that the coevolution theory of genetic code origin (Wong JT, Proc Natl Acad Sci USA 72:1909–1912, 1975) is essentially correct. This theory makes it possible to identify at least 10 evolutionary stages through which genetic code organization might have passed prior to reaching its current form. The calculation of the minimization level of all these evolutionary stages leads to the following conclusions. (1) The minimization percentages increased linearly with the number of amino acids codified in the codes of the various evolutionary stages when only the sense changes are considered in the analysis. This seems to favor the physicochemical theory of genetic code origin even if, as discussed in the paper, this observation is also compatible with the coevolution theory. (2) For the first seven evolutionary stages of the genetic code, this trend is less clear and indeed is inverted when we consider the global optimisation of the codes due to both sense changes and synonymous changes. This inverse correlation between minimization percentages and the number of amino acids codified in the codes of the intermediate stages seems to favor neither the physicochemical nor the stereochemical theories of genetic code origin, as it is in the early and intermediate stages of code development that these theories would expect minimization to have played a crucial role, and this does not seem to be the case. However, these results are in agreement with the coevolution theory, which attributes a role to the physicochemical properties of amino acids that, while important, is nevertheless subordinate to the mechanism which concedes codons from the precursor amino acids to the product amino acids as the primary factor determining the evolutionary structuring of the genetic code. The results are therefore discussed in the context of the various theories proposed to explain genetic code origin. Received: 25 October 1998 / Accepted: 19 February 1999

During the RNA World, organisms experienced high rates of genetic errors, which implies that there was strong evolutionary pressure to reduce the errors’ phenotypical impact by suitably structuring the still-evolving genetic code. Therefore, the relative rates of the various types of genetic errors should have left characteristic imprints in the structure of the genetic code. Here, we show that it is therefore possible, to some extent, to reconstruct those error rates, as well as the nucleotide frequencies, for the time when the code was fixed. We find evidence indicating that the frequencies of G and C in the genome were not elevated. Since, for thermodynamic reasons, RNA in thermophiles tends to possess elevated G+C content, this result indicates either that the fixation of the genetic code occurred in organisms which were not thermophiles or that it occurred after the rise of DNA. Supplementary Materials: Original data and programs are available at the author’s web site: .

We have previously proposed an SNS hypothesis on the origin of the genetic code (Ikehara and Yoshida 1998). The hypothesis predicts that the universal genetic code originated from the SNS code composed of 16 codons and 10 amino acids (S means G or C, and N means any of the four bases). However, it must have been very difficult to create the SNS code at one stroke in the beginning. Therefore, we searched for a simpler code than the SNS code, which could still encode water-soluble globular proteins with appropriate three-dimensional structures at a high probability, using four conditions for globular protein formation (hydropathy, α-helix, β-sheet, and β-turn formations). Four amino acids (Gly [G], Ala [A], Asp [D], and Val [V]) encoded by the GNC code satisfied the four structural conditions well, but other codes in rows and columns in the universal genetic code table do not, except for the GNG code, a slightly modified form of the GNC code. Three three-amino acid systems ([D], Leu and Tyr; [D], Tyr and Met; Glu, Pro and Ile) also satisfied the above four conditions. However, some amino acids in the three systems are far more complex than those encoded by the GNC code. In addition, the amino acids in the three-amino acid systems are scattered in the universal genetic code table. Thus, we concluded that the universal genetic code originated not from a three-amino acid system but from a four-amino acid system, the GNC code encoding [GADV]-proteins, as the most primitive genetic code. Received: 11 June 2001 / Accepted: 11 October 2001

Since the early days of the discovery of the genetic code, nonrandom patterns have been searched for in the code in the hope of providing information about its origin and early evolution. Here we present a new classification scheme of the genetic code that is based on a binary representation of the purines and pyrimidines. This scheme reveals known patterns more clearly than the common one, for instance, the classification of strong, mixed, and weak codons as well as the ordering of codon families. Furthermore, new patterns have been found that have not been described before: nearly all quantitative amino acid properties, such as Woese's polarity and the specific volume, show a perfect correlation to Lagerkvist's codon–anticodon binding strength. Our new scheme leads to new ideas about the evolution of the genetic code. It is hypothesized that it started with a binary doublet code and developed via a quaternary doublet code into the contemporary triplet code. Furthermore, arguments are presented against suggestions that a simpler code, where only the midbase was informational, was at the origin of the genetic code.

Summary: We lay new foundations for the hypothesis that the genetic code is adapted to evolutionary retention of information in the antisense strands of natural DNA/RNA sequences. In particular, we show that the genetic code exhibits, beyond the neutral replacement patterns of amino acid substitutions, optimal properties by favoring simultaneous evolution of proteins encoded in DNA/RNA sense-antisense strands. This is borne out in the sense-antisense transformations of the codons of every amino acid, which target amino acids physicochemically similar to each other. Moreover, silent mutations in the sense strand generate conservative ones in its antisense counterpart and vice versa. Coevolution of proteins coded by complementary strands is shown to be a definite possibility, a result which does not depend on any physical interaction between the coevolving proteins. Likewise, the degree to which the present genetic code is dedicated to evolutionary sense-antisense tolerance is demonstrated by comparison with many randomized codes. Double-strand coding is quantified from an information-theoretical point of view.

Statistical and biochemical studies of the genetic code have found evidence of nonrandom patterns in the distribution of codon assignments. It has, for example, been shown that the code minimizes the effects of point mutation or mistranslation: erroneous codons are either synonymous or code for an amino acid with chemical properties very similar to those of the one that would have been present had the error not occurred. This work has suggested that the second base of codons is less efficient in this respect, by about three orders of magnitude, than the first and third bases. These results are based on the assumption that all forms of error at all bases are equally likely. We extend this work to investigate (1) the effect of weighting transition errors differently from transversion errors and (2) the effect of weighting each base differently, depending on reported mistranslation biases. We find that if the bias affects all codon positions equally, as might be expected were the code adapted to a mutational environment with transition/transversion bias, then any reasonable transition/transversion bias increases the relative efficiency of the second base by an order of magnitude. In addition, if we employ weightings to allow for biases in translation, then only 1 in every million random alternative codes generated is more efficient than the natural code. We thus conclude not only that the natural genetic code is extremely efficient at minimizing the effects of errors, but also that its structure reflects biases in these errors, as might be expected were the code the product of selection. Received: 25 July 1997 / Accepted: 9 January 1998
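
The idea of weighting transition errors differently from transversion errors can be sketched as a weighted mean of squared polar-requirement changes, computed separately for each codon position. This is only an illustration under stated assumptions: the weight `ts_weight`, the function name `positional_error`, and the polar requirement values in `PR` (commonly cited Woese values) are not taken from the abstract, and the sketch evaluates the standard code alone rather than its efficiency relative to random codes.

```python
import itertools

BASES = "UCAG"
# Standard genetic code, codons enumerated in UCAG order; '*' marks stop codons.
AA_STRING = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
             "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
STANDARD = dict(zip(("".join(c) for c in itertools.product(BASES, repeat=3)),
                    AA_STRING))

# Polar requirement per amino acid (assumed values, commonly cited from Woese).
PR = {"A": 7.0, "R": 9.1, "N": 10.0, "D": 13.0, "C": 4.8, "Q": 8.6,
      "E": 12.5, "G": 7.9, "H": 8.4, "I": 4.9, "L": 4.9, "K": 10.1,
      "M": 5.3, "F": 5.0, "P": 6.6, "S": 7.5, "T": 6.6, "W": 5.2,
      "Y": 5.4, "V": 5.6}

# Transitions are pyrimidine<->pyrimidine (U, C) or purine<->purine (A, G).
TRANSITIONS = {frozenset("UC"), frozenset("AG")}


def positional_error(code, pos, ts_weight=1.0):
    """Weighted mean squared polar-requirement change for single-base errors
    at codon position `pos` (0, 1, or 2). Transitions get weight `ts_weight`,
    transversions weight 1; changes to or from stop codons are skipped."""
    numerator = denominator = 0.0
    for codon, aa in code.items():
        if aa == "*":
            continue
        for base in BASES:
            if base == codon[pos]:
                continue
            neighbor = code[codon[:pos] + base + codon[pos + 1:]]
            if neighbor == "*":
                continue
            is_transition = frozenset((codon[pos], base)) in TRANSITIONS
            weight = ts_weight if is_transition else 1.0
            numerator += weight * (PR[aa] - PR[neighbor]) ** 2
            denominator += weight
    return numerator / denominator


for ts_weight in (1.0, 2.0, 5.0):
    errors = [positional_error(STANDARD, pos, ts_weight) for pos in range(3)]
    formatted = ", ".join(f"{e:.2f}" for e in errors)
    print(f"ts_weight={ts_weight}: error at positions 1-3 = {formatted}")
```

Setting `ts_weight` to 1 recovers the unweighted case in which all single-base errors are equally likely; to estimate relative efficiency, the same function would be applied to a sample of randomized codes.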

A paper (Amirnovin R, J Mol Evol 44:473–476, 1997) seems to undermine the validity of the coevolution theory of genetic code origin by shedding doubt on the connection between the biosynthetic relationships between amino acids and the organization of the genetic code, at a time when the literature on the topic takes this for granted. However, as a few papers cite this paper as evidence against the coevolution theory, and to cast aside all doubt on the subject, we have decided to reanalyze the statistical bases on which this theory is founded. We come to the following conclusions: (1) the methods used in the above-mentioned paper contain certain mistakes, and (2) the statistical foundations on which the coevolution theory is based are extremely robust. We have done this by critically appraising Amirnovin's paper and suggesting an alternative method based on the generation of random codes which, along with the method reported in the literature, allows us to evaluate the significance, in the genetic code, of different sets of amino acid pairs in biosynthetic relationships. In particular, by using this method and after building up a certain set of amino acid pairs reflecting the expectations of the coevolution theory, we show that the presence of this set in the genetic code would be obtained, purely by chance, with a probability of 6 × 10⁻⁵. This observation seems to provide particularly strong support to the coevolution theory. Received: 28 June 1999 / Accepted: 23 October 1999

The Genetic Code appears to be a non-random triplet code in which both the position of a nucleotide within a codon, as well as its physicochemical nature, contribute to the identity of the expressed amino acid. The non-randomness of the code is manifested in apparent patterns in the mapping from codon to amino acid; some of the patterns seem quite clear, while other more subtle patterns are less obvious or certain. Discussion in the literature has been largely qualitative in nature. In this study, we employ evolution similarity data, widely employed in the field of bioinformatics, to explore the patterns relating nucleotide features to amino acids. The results support a hierarchical order based on position and physicochemical features proposed by Jimenez-Montaño et al. [“The Hypercube Structure of the Genetic Code Explains Conservative and Non-Conservative Amino Acid Substitutions in vivo and in vitro,” Biosystems (1996) 39, pp. 117–125]. The method also provides a quantitative approach to testing the importance of other putative patterns.

A computer program was used to test Wong's coevolution theory of the genetic code. The codon correlations between the codons of biosynthetically related amino acids in the universal genetic code and in randomly generated genetic codes were compared. It was determined that many codon correlations are also present within random genetic codes and that among the random codes there are always several which have many more correlations than that found in the universal code. Although the number of correlations depends on the choice of biosynthetically related amino acids, the probability of choosing a random genetic code with the same or greater number of codon correlations as the universal genetic code was found to vary from 0.1% to 34% (with respect to a fairly complete listing of related amino acids). Thus, Wong's theory that the genetic code arose by coevolution with the biosynthetic pathways of amino acids, based on codon correlations between biosynthetically related amino acids, is statistical in nature. Received: 8 August 1996 / Accepted: 26 December 1996

The Standard Genetic Code is organized such that similar codons encode similar amino acids. One explanation suggested that the Standard Code is the result of natural selection to reduce the fitness “load” that derives from the mutation and mistranslation of protein-coding genes. We review the arguments against the mutational load-minimizing hypothesis and argue that they need to be reassessed. We review recent analyses of the organization of the Standard Code and conclude that under cautious interpretation they support the mutational load-minimizing hypothesis. We then present a deterministic asexual model with which we study the mode of selection for load minimization. In this model, individual fitness is determined by a protein phenotype resulting from the translation of a mutable set of protein-coding genes. We show that an equilibrium fitness may be associated with a population with the same genetic code and that genetic codes that assign similar codons to similar amino acids have a higher fitness. We also show that the number of mutant codons in each individual at equilibrium, which determines the strength of selection for load minimization, reflects a long-term evolutionary balance between mutations in messages and selection on proteins, rather than the number of mutations that occur in a single generation, as has been assumed by previous authors. We thereby establish that selection for mutational load minimization acts at the level of an individual in a single generation. We conclude with comments on the shortcomings and advantages of load minimization over other hypotheses for the origin of the Standard Code. Received: 4 April 2001 / Accepted: 22 October 2001

The coevolution theory proposes that primordial proteins consisted only of those amino acids readily obtainable from the prebiotic environment, representing about half the twenty encoded amino acids of today, and the missing amino acids entered the system as the code expanded along with pathways of amino acid biosynthesis. The isolation of genetic code mutants, and the antiquity of pretran synthesis revealed by the comparative genomics of tRNAs and aminoacyl-tRNA synthetases, have combined to provide a rigorous proof of the four fundamental tenets of the theory, thus solving the riddle of the structure of the universal genetic code. Presented at: International School of Complexity – 4th Course: Basic Questions on the Origins of Life; “Ettore Majorana” Foundation and Centre for Scientific Culture, Erice, Italy, 1–6 October 2006.

We consider a model of the origin of genetic code organization incorporating the biosynthetic relationships between amino acids and their physicochemical properties. We study the behavior of the genetic code in the set of codes subject both to biosynthetic constraints and to the constraint that the biosynthetic classes of amino acids must occupy only their own codon domain, as observed in the genetic code. Therefore, this set contains the smallest number of elements ever analyzed in similar studies. Under these conditions and if, as predicted by physicochemical postulates, the amino acid properties played a fundamental role in genetic code organization, it can be expected that the code must display an extremely high level of optimization. This prediction is not supported by our analysis, which indicates, for instance, a minimization percentage of only 80%. These observations can therefore be more easily explained by the coevolution theory of genetic code origin, which postulates a role that is important but not fundamental for the amino acid properties in the structuring of the code. We have also investigated the shape of the optimization landscape that might have arisen during genetic code origin. Here, too, the results seem to favor the coevolution theory: only a few amino acid exchanges would have been sufficient to transform the genetic code (which is not a local minimum) into a much better optimized code, and the fact that such exchanges did not actually take place suggests that the reduction of translation errors, for instance, was not the main adaptive theme structuring the genetic code.

Explaining the apparent non-random codon distribution and the nature and number of amino acids in the ‘standard’ genetic code remains a challenge, despite the various hypotheses so far proposed. In this paper we propose a simple new hypothesis for code evolution involving a progression from singlet to doublet to triplet codons with a reading mechanism that moves three bases each step. We suggest that triplet codons gradually evolved from two types of ambiguous doublet codons, those in which the first two bases of each three-base window were read (‘prefix’ codons) and those in which the last two bases of each window were read (‘suffix’ codons). This hypothesis explains multiple features of the genetic code such as the origin of the pattern of four-fold degenerate and two-fold degenerate triplet codons, the origin of its error minimising properties, and why there are only 20 amino acids. Reviewing Editor: Dr. Laura Landweber. An erratum to this article can be found at .

