Found 20 similar articles (search time: 15 ms)
1.
The standard genetic code is known to be robust to translation errors and point mutations. We studied how small modifications of the standard code affect its robustness. The robustness was assessed in terms of a proper stability function, the negative variations of which correspond to a more robust code. The fraction of more robust codes obtained under small modifications appeared to be unexpectedly high, about 0.1-0.4 depending on the choice of stability function and code modifications, yet significantly lower than the corresponding fraction in the random codes (about a half). In this sense the standard code ought to be considered distinctly non-random, in accordance with previous observations. The distribution of the negative variations of the stability function revealed a very abrupt drop beyond one standard deviation, much sharper than for a Gaussian distribution or for the random codes with the same number of codons in the sets coding for amino acids or stop codons. This behavior holds both for the standard code as a whole and for its binary NRN-NYN, NWN-NSN, and NMN-NKN blocks. Previously, it has been proved that such a binary block structure is necessary for the robustness of a code and is inherent to the standard genetic code. The modifications of the standard code corresponding to more robust coding may be related to the different variants of the code. These effects may also contribute to the rates of replacement of amino acids. The observed features demonstrate the joint impact of random factors and natural selection during the evolution of the genetic code.
2.
Two ideas have essentially been used to explain the origin of the genetic code: Crick's frozen accident and Woese's amino acid-codon specific chemical interaction. Whatever the origin and codon-amino acid correlation, it is difficult to imagine the sudden appearance of the genetic code in its present form of 64 codons coding for 20 amino acids without appealing to some evolutionary process. On the contrary, it is more reasonable to assume that it evolved from a much simpler initial state in which a few triplets were coding for each of a small number of amino acids. Analysis of the genetic code through information theory and the metabolism of pyrimidine biosynthesis provides evidence suggesting that the genetic code could have begun in an RNA world with the two letters A and U grouped in eight triplets coding for seven amino acids and one stop signal. This code could have progressively evolved by making gradual use of the letters G and C to end with 64 triplets coding for 20 amino acids and three stop signals. According to the proposed evidence, DNA could have appeared after the four-letter structure was already achieved. In the newborn DNA world, T replaced U to achieve higher physicochemical and genetic stability.
3.
A quantitative rationale for the evolution of the genetic code is developed considering the principle of minimal hardware. This principle defines an optimal code as one that minimizes, for a given amount of information encoded, the product of the number of physical devices used and the average complexity of each device. By identifying the number of different amino acids, the number of nucleotide positions per codon and the number of base types that can occupy each such position with, respectively, the amount of information, the number of devices and the complexity, we show that optimal codes occur for 3, 7 and 20 amino acids with codons having one, two and three base positions per codon, respectively. The advantage of a code of exactly 4 symbols is deduced, as well as a plausible evolutionary pathway from a code of doublets to triplets. The present-day code of 20 amino acids encoded by 64 codons is shown to be the closest to optimal in an absolute sense. Using a tetraplet code, further evolution to a code with 55 amino acids is in principle possible, but such a code would deviate slightly more than the present-day code from the minimal hardware configuration. The change from a triplet code to a tetraplet code would occur at about 32 amino acids. Our conclusions are independent of, but consistent with, the observed physico-chemical properties of the amino acids and codon structures. These correlations could have evolved within the constraints imposed by the minimal hardware principle.
4.
By considering two important factors involved in codon-anticodon interactions, the hydrogen bond number and the chemical type of bases, a codon array of the genetic code table was obtained as an increasing code scale of interaction energies of amino acids in proteins. Next, in order to obtain all codons consecutively from the codon AAC, a sum operation has been introduced on the set of codons. The group obtained over the set of codons is isomorphic to the group (Z_64, +) of the integers modulo 64. On the Z_64-algebra of the set of 64^N codon sequences of length N, gene mutations are described by means of endomorphisms f: (Z_64)^N → (Z_64)^N. Endomorphisms and automorphisms helped us describe the gene mutation pathways. For instance, 77.7% of the mutations in 749 HIV protease gene sequences correspond to unique diagonal endomorphisms of the wild type strain HXB2. In particular, most of the reported mutations that confer drug resistance to the HIV protease gene correspond to diagonal automorphisms of the wild type. What is more, in the human beta-globin gene a similar situation appears, where most of the single codon mutations correspond to automorphisms. Hence, in analyses of the molecular evolution process on the set of DNA sequences of length N, the Z_64-algebra will help us explain the quantitative relationships between genes.
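The Z_64 construction in entry 4 can be illustrated with a minimal sketch. The abstract does not give the codon-to-integer mapping, so the base order A=0, C=1, G=2, U=3 and the weighting (second base most significant, then first, then third) are assumptions chosen only so that AAC maps to 1 and generates the group; the function names are hypothetical. A diagonal endomorphism acts position by position as multiplication by a coefficient a_i modulo 64, and it is an automorphism exactly when every a_i is odd.

```python
from math import gcd

BASES = "ACGU"  # ASSUMED base order: A=0, C=1, G=2, U=3


def codon_to_int(codon):
    """Map a codon to Z_64 (ASSUMED weighting: 2nd base > 1st base > 3rd base)."""
    b1, b2, b3 = (BASES.index(b) for b in codon)
    return 16 * b2 + 4 * b1 + b3


def int_to_codon(n):
    """Inverse of codon_to_int, taken modulo 64."""
    n %= 64
    b2, b1, b3 = n // 16, (n // 4) % 4, n % 4
    return BASES[b1] + BASES[b2] + BASES[b3]


def codon_sum(x, y):
    """Sum of two codons, defined through addition in Z_64."""
    return int_to_codon(codon_to_int(x) + codon_to_int(y))


def diagonal_coefficients(wild, mutant):
    """For each codon position i, the smallest a_i with a_i * w_i = m_i (mod 64), else None."""
    coeffs = []
    for w, m in zip(wild, mutant):
        wi, mi = codon_to_int(w), codon_to_int(m)
        coeffs.append(next((a for a in range(64) if a * wi % 64 == mi), None))
    return coeffs


def is_automorphism(coeffs):
    """A diagonal endomorphism is invertible iff every coefficient is a unit of Z_64 (odd)."""
    return all(a is not None and gcd(a, 64) == 1 for a in coeffs)


if __name__ == "__main__":
    wild, mutant = ["AAC", "ACU"], ["AAU", "ACU"]
    coeffs = diagonal_coefficients(wild, mutant)
    print(coeffs, is_automorphism(coeffs))
```

Under this mapping, a mutation with no solution a_i (for example from a codon with an even value to one with an odd value) cannot be described by a diagonal endomorphism at all, which is why the abstract reports a percentage rather than all mutations.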
5.
The standard genetic code is known to be much more efficient in minimizing adverse effects of misreading errors and one-point mutations in comparison with a random code having the same structure, i.e. the same number of codons coding for each particular amino acid. We study the inverse problem: how the code structure affects the optimal physico-chemical parameters of amino acids ensuring the highest stability of the genetic code. It is shown that the choice of two or more amino acids with given properties determines unambiguously all the others. In this sense the code structure strictly determines the optimal parameters of amino acids, or the corresponding scales may be derived directly from the genetic code. In the code with the structure of the standard genetic code, the resulting values for hydrophobicity obtained in the “leave one out” scheme and in the scheme with fixed maximum and minimum parameters correlate significantly with the natural scale. The comparison of the optimal and natural parameters allows assessing the relative impact of physico-chemical and error-minimization factors during the evolution of the genetic code. As the resulting optimal scale depends on the choice of amino acids with given parameters, the technique can also be applied to testing various scenarios of code evolution with an increasing number of encoded amino acids. Our results indicate the co-evolution of the genetic code and the physico-chemical properties of recruited amino acids.
6.
It is known that different codons may be unified into larger groups related to the hierarchical structure, approximate hidden symmetries, and evolutionary origin of the universal genetic code. Using a simplified, evolutionarily motivated two-letter version of the genetic code, the general principles of the most stable coding are discussed. By complete enumeration in such a reduced code, it is strictly proved that the maximum stability with respect to point mutations and shifts in the reading frame requires the fixation of the middle letters within codons in groups with different physico-chemical properties, thus explaining a key feature of the universal genetic code. The translational stability of the genetic code is studied by mapping the code onto a de Bruijn graph, which provides both a compact visual representation of the mutual relationships between different codons, as well as between codons and protein-coding DNA sequences, and a powerful tool for investigating the stability of protein coding. Then, the results are extended to four-letter codes. As is shown, the universal genetic code mainly obeys the principles of optimal coding. These results demonstrate the hierarchical character of the optimization of the universal genetic code, with strictly optimal coding having evolved at the earliest stages of molecular evolution. Finally, the universal genetic code is compared with the other natural variants of genetic codes.
7.
Directed protein evolution is the most versatile method for studying protein structure-function relationships, and for tailoring a protein's properties to the needs of industrial applications. In this review, we performed a statistical analysis on the genetic code to study the extent and consequence of the organization of the genetic code on amino acid substitution patterns generated in directed evolution experiments. In detail, we analyzed amino acid substitution patterns caused by (a) a single nucleotide (nt) exchange at each position of all 64 codons, and (b) two subsequent nt exchanges (first and second nt, first and third nt, second and third nt). Additionally, transition and transversion mutations were compared at the level of amino acid substitution patterns. The latter analysis showed that a single nucleotide substitution in a codon generates only 39.5% of the natural diversity on the protein level, with 5.2-7 amino acid substitutions per codon. Transversions generate more complex amino acid substitution patterns (increased number and chemically more diverse amino acid substitutions) than transitions. Simultaneous nt exchanges at both the first and second nt of a codon generate very diverse amino acid substitution patterns, achieving 83.2% of the natural diversity. The statistical analysis described in this review sets the objectives for novel random mutagenesis methods that address the consequences of the organization of the genetic code. Random mutagenesis methods that favor transversions or introduce consecutive nt exchanges can contribute in this regard.
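The single-nucleotide analysis described in entry 7 can be sketched as follows. This is a minimal illustration using the standard code table, not the review's actual procedure; the function names are hypothetical, and the sketch makes no attempt to reproduce the 39.5% or 5.2-7 figures, whose exact counting conventions are not stated in the abstract.

```python
from itertools import product

BASES = "UCAG"
# Standard genetic code in the conventional UCAG ordering ("*" marks a stop codon).
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}
PURINES = {"A", "G"}


def is_transition(a, b):
    """True for purine<->purine or pyrimidine<->pyrimidine exchanges (a != b assumed)."""
    return (a in PURINES) == (b in PURINES)


def single_nt_neighbors(codon, kind="all"):
    """Yield codons one nucleotide away; kind is 'all', 'transition' or 'transversion'."""
    for pos in range(3):
        for base in BASES:
            if base == codon[pos]:
                continue
            transition = is_transition(codon[pos], base)
            if kind == "transition" and not transition:
                continue
            if kind == "transversion" and transition:
                continue
            yield codon[:pos] + base + codon[pos + 1:]


def reachable_amino_acids(codon, kind="all"):
    """Distinct amino acids (excluding the original one and stops) reachable in one step."""
    own = CODE[codon]
    return {CODE[n] for n in single_nt_neighbors(codon, kind)} - {own, "*"}


if __name__ == "__main__":
    for kind in ("all", "transition", "transversion"):
        sizes = [len(reachable_amino_acids(c, kind)) for c, aa in CODE.items() if aa != "*"]
        print(kind, sum(sizes) / len(sizes))
```

For the tryptophan codon UGG, for instance, transitions alone reach only arginine, while transversions reach arginine, glycine, leucine, serine and cysteine, which is the kind of asymmetry the abstract attributes to transversions.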
9.
The first information system emerged on Earth as a primordial version of the genetic code and genetic texts. The natural appearance of arithmetic power in such a linguistic milieu is theoretically possible and practical for producing information systems of extremely high efficiency. In this case, the arithmetic symbols should be incorporated into an alphabet, i.e. the genetic code. A number is the fundamental arithmetic symbol produced by a system of numeration. If a system of numeration were detected inside the genetic code, it would be natural to expect that its purpose is arithmetic calculation, e.g., for the sake of control, safety, and precise alteration of the genetic texts. The nucleons of amino acids and the bases of nucleic acids seem most suitable as embodiments of digits. These assumptions were used for analyzing the genetic code. The compressed, life-size, and split representations of the Escherichia coli and Euplotes octocarinatus code versions were considered simultaneously. An exact equilibration of the nucleon sums of the amino acid standard blocks and/or side chains was found repeatedly within specified sets of the genetic code. Moreover, the digital notations of the balanced sums acquired, in decimal representation, the unique form 111, 222, …, 999. This form is a consequence of the criterion of divisibility by 037. The criterion could simplify some computing mechanism of a cell, if one exists, and facilitate its computational procedure. The cooperative symmetry of the genetic code demonstrates that possibly a zero was invented and used by this mechanism. Such organization of the genetic code could be explained by the activities of some hypothetical molecular organelles working as natural biocomputers of digital genetic texts. It is well known that if a mutation replaces an amino acid, the change in hydrophobicity is generally weak, while that in size is strong. The antisymmetrical correlation between amino acid size and the degeneracy number is known as well. It is shown that these and some other familiar properties may be a physicochemical effect of arithmetic inside the genetic code. The “frozen accident” model, giving unlimited freedom to the mapping function, could optimally support the appearance of both arithmetic symbols and physicochemical protection inside the genetic code.
10.
Information theoretic analysis of genetic languages indicates that the naturally occurring 20 amino acids and the triplet genetic code arose by duplication of 10 amino acids of class-II and a doublet genetic code having codons NNY and anticodons GNN. Evidence for this scenario is presented based on the properties of aminoacyl-tRNA synthetases, amino acids and nucleotide bases.
12.
A plausible architecture of an ancient genetic code is derived from an extended base triplet vector space over the Galois field of the extended base alphabet {D, A, C, G, U}, where the symbol D represents one or more hypothetical bases with unspecific pairings. We hypothesized that the high degeneracy of a primeval genetic code with five bases and the gradual origin and improvement of a primeval DNA repair system could have made possible the transition from ancient to modern genetic codes. Our results suggest that the Watson-Crick base pairings G ≡ C and A = U and the non-specific base pairing of the hypothetical ancestral base D, used to define the sum and product operations, are sufficient to determine the coding constraints of the primeval and the modern genetic code, as well as the transition from the former to the latter. Geometrical and algebraic properties of this vector space reveal that the present codon assignment of the standard genetic code could be induced from a primeval codon assignment. Besides, the Fourier spectrum of the extended DNA genome sequences derived from the multiple sequence alignment suggests that the so-called period-3 property of present coding DNA sequences could also have existed in ancient coding DNA sequences. The phylogenetic analyses performed with metrics defined in the N-dimensional vector space (B^3)^N of DNA sequences and with the new evolutionary model presented here also suggest that an ancient DNA coding sequence with five or more bases does not contradict the expected evolutionary history.
13.
Summary: One-half of the twenty amino acids of the genetic code are just one mutational step away from the chain-terminator codons UAA, UAG, and UGA. It is postulated that somatic mutation to terminator is a hazard to which the organism has had to respond by adjusting certain proteins in the direction of fewer mutable residues. This view is supported by calculations based on the primary structure of five of the human hemoglobin chains. Each chain is scored for mutability to terminator in accord with the numbers and kinds of amino acids present. Among the adult chains, the most essential one, the alpha, has the lowest mutability. The beta and delta follow, in order of the presumed harm to the organism of a shortage of chain copies. Ante-natal chains tend to have higher mutabilities, supporting the view that cumulative mutational change in DNA can do little harm if the gene ceases to be transcribed early in life. Two other predictions based on the supposition of effective selection against mutability to terminator are also met: the chain length of polypeptides is negatively correlated with their scores for mutability to terminator, and examination of the recently determined sequence of beta messenger RNA shows preferential use of codons that are not readily mutable to terminator. Supported in part by the National Institutes of Health, Grant HL-16005.
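The opening claim of entry 13 can be checked directly against the standard code table. The sketch below lists the amino acids that have at least one codon a single base change away from UAA, UAG or UGA. The per-residue weight and the `mutability_score` function are hypothetical simplifications for illustration only; the abstract does not specify the scoring scheme actually used for the hemoglobin chains.

```python
from itertools import product

BASES = "UCAG"
# Standard genetic code in the conventional UCAG ordering ("*" marks a stop codon).
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def stop_hits(codon):
    """Number of single-base changes that turn this codon into a terminator."""
    hits = 0
    for pos in range(3):
        for base in BASES:
            if base != codon[pos] and CODE[codon[:pos] + base + codon[pos + 1:]] == "*":
                hits += 1
    return hits


def terminator_weights():
    """HYPOTHETICAL weight per amino acid: mean stop_hits over its synonymous codons."""
    weights = {}
    for aa in sorted(set(AMINO) - {"*"}):
        codons = [c for c, a in CODE.items() if a == aa]
        weights[aa] = sum(stop_hits(c) for c in codons) / len(codons)
    return weights


def terminator_adjacent():
    """Amino acids with at least one codon one mutational step from a terminator."""
    return {aa for aa, w in terminator_weights().items() if w > 0}


def mutability_score(protein):
    """HYPOTHETICAL chain score: sum of per-residue weights (one-letter sequence)."""
    weights = terminator_weights()
    return sum(weights[aa] for aa in protein)


if __name__ == "__main__":
    print(sorted(terminator_adjacent()))
    print(mutability_score("MVLSPADKTNVKAAW"))  # arbitrary example peptide
```

A score based on the codons actually used in the gene, rather than an average over synonymous codons, would be closer to the abstract's final point about codon choice in beta messenger RNA.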
14.
Error detection and correction properties are fundamental for informative codes. Hamming's distance allows us to study this noise resistance. We present codes characterized by the optimization of resistance to nonsense mutational effects. The calculation of the cumulated Hamming's distance, which allows the number of optimal codes and their structure to be determined, can be detailed. The principle of these laws of optimization of resistance consists of choosing constituent codons connected by mutational neighbouring in such a way that random application of mutations to such a code minimizes the occurrence of nonsense n-tuples or terminators. New coding symmetries are then described and screened using the properties of Galois polynomials and Baudot's code. Such a study can be applied to any codon length. Here we present the principles of this optimization for the simplest doublet codes. Another constraint is discussed: the distribution of optimal subcodes for synonymity and the frequencies of utilization of the different codons. We compare these results to those of the present genetic code, and we observe that all coded amino acids (except the particular case of SER) use optimal sub-codes of synonymity. This work suggests that the appearance of the genetic code was provoked by mutations while optimizing on several levels its resistance to their effects. Thus genetic coding would have been the best automaton that could be produced in prebiotic conditions.
15.
We describe a compact representation of the genetic code that factorizes the table in quartets. It represents a “least grammar” for the genetic language. It is justified by the Klein-4 group structure of RNA bases and codon doublets. The matrix of the outer product between the column-vector of bases and the corresponding row-vector V^T = (C G U A), considered as signal vectors, has a block structure consisting of the four cosets of the K × K group of base transformations acting on doublet AA. This matrix, translated into weak/strong (W/S) and purine/pyrimidine (R/Y) nucleotide classes, leads to a code table with mixed and unmixed families in separate regions. A basic difference between them is the non-commuting (R/Y) doublets: AC/CA, GU/UG. We describe the degeneracy in the canonical code and the systematic changes in deviant codes in terms of the divisors of 24, employing modulo multiplication groups. We illustrate binary sub-codes characterizing mutations in the quartets. We introduce a decision-tree to predict the mode of tRNA recognition corresponding to each codon, and compare our result with related findings by Jestin and Soulé [Jestin, J.-L., Soulé, C., 2007. Symmetries by base substitutions in the genetic code predict 2′ or 3′ aminoacylation of tRNAs. J. Theor. Biol. 247, 391–394], and the rearrangements of the table by Delarue [Delarue, M., 2007. An asymmetric underlying rule in the assignment of codons: possible clue to a quick early evolution of the genetic code via successive binary choices. RNA 13, 161–169] and Rodin and Rodin [Rodin, S.N., Rodin, A.S., 2008. On the origin of the genetic code: signatures of its primordial complementarity in tRNAs and aminoacyl-tRNA synthetases. Heredity 100, 341–355], respectively.
16.
New insights into the arrangement of the genetic code table, based on the analysis of the physico-chemical properties of its molecular constituents, are reported in this paper. It will be demonstrated that the code has a twofold symmetry that is not apparent from the conventional code table, but becomes apparent when the codon-anticodon energies are listed for each triplet. The evolutionary development of the current code based on single base replacement mutations (transitions) from an 'iso-energetic' degenerate subset of 16 of the 64 codons is discussed. The energy landscape of all 64 codons is presented. A detailed analysis of the energy changes due to mutations in the 3rd, 1st or 2nd position of a codon reveals that the modern genetic code is highly robust. Changes come in small discrete steps that can be quantified in relation to the thermal noise of the system. The relation of the individual codon to its neighbours in the rearranged codon table can be completely understood based on thermodynamic considerations.
17.
Summary: We have calculated the average effect of changing a codon by a single base for all possible single-base changes in the genetic code and for changes in the first, second, and third codon positions separately. Such values were calculated for an amino acid's polar requirement, hydropathy, molecular volume, and isoelectric point. For each attribute the average effect of single-base changes was also calculated for a large number of randomly generated codes that retained the same level of redundancy as the natural code. Amino acids whose codons differed by a single base in the first and third codon positions were very similar with respect to polar requirement and hydropathy. The major differences between amino acids were specified by the second codon position. Codons with U in the second position encode hydrophobic amino acids, whereas most codons with A in the second position encode hydrophilic ones. This accounts for the observation of complementary hydropathy. Single-base changes in the natural code had a smaller average effect on polar requirement than all but 0.02% of random codes. This result is most easily explained by selection to minimize the deleterious effects of translation errors during the early evolution of the code.
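The comparison in entry 17 between the natural code and random codes with the same redundancy can be sketched as follows. This is an illustration under stated assumptions, not the authors' method: it uses the Kyte-Doolittle hydropathy scale rather than polar requirement, the mean squared difference as the "average effect", and random codes built by permuting the 20 amino acids among the existing synonymous blocks while keeping the stop codons fixed. All function names are hypothetical, and the sketch does not attempt to reproduce the 0.02% figure.

```python
import random
from itertools import product

BASES = "UCAG"
# Standard genetic code in the conventional UCAG ordering ("*" marks a stop codon).
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

# Kyte-Doolittle hydropathy index (positive values are hydrophobic).
KD = {"A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5, "Q": -3.5, "E": -3.5,
      "G": -0.4, "H": -3.2, "I": 4.5, "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8,
      "P": -1.6, "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2}


def mean_sq_change(code, pos=None):
    """Mean squared hydropathy change over single-base changes between sense codons.

    pos selects one codon position (0, 1 or 2); None averages over all three.
    """
    positions = range(3) if pos is None else [pos]
    total, count = 0.0, 0
    for codon, aa in code.items():
        if aa == "*":
            continue
        for p in positions:
            for base in BASES:
                if base == codon[p]:
                    continue
                new = code[codon[:p] + base + codon[p + 1:]]
                if new == "*":
                    continue
                total += (KD[aa] - KD[new]) ** 2
                count += 1
    return total / count


def shuffled_code(rng):
    """Random code with the natural block structure: amino acids permuted among blocks."""
    aas = sorted(set(AMINO) - {"*"})
    perm = aas[:]
    rng.shuffle(perm)
    relabel = dict(zip(aas, perm))
    relabel["*"] = "*"
    return {codon: relabel[aa] for codon, aa in CODE.items()}


def fraction_better(n=1000, seed=0):
    """Fraction of n random codes whose mean squared change is below the natural code's."""
    rng = random.Random(seed)
    natural = mean_sq_change(CODE)
    return sum(mean_sq_change(shuffled_code(rng)) < natural for _ in range(n)) / n


if __name__ == "__main__":
    for p in range(3):
        print("position", p + 1, round(mean_sq_change(CODE, p), 2))
    print("fraction of random codes better:", fraction_better())
```

Swapping `KD` for another amino acid property table (polar requirement, molecular volume, isoelectric point) would give the other comparisons mentioned in the abstract.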
18.
Summary: The use of triplet code words in E. coli, φX174, MS2, and rabbit globin was examined. A significant deficiency of purines in the third position of fourfold degenerate codons was noted, although its significance is not understood. There has been no consistent selection against uracil in pyrimidine-restricted codons. For many amino acids the choice between code words appears random, while for arginine, isoleucine, and probably glycine, distinct biases exist which can be explained in terms of tRNA availability.
19.
The genetic code is one of the most highly conserved characters in living organisms. Only a small number of genomes have evolved slight variations on the code, and these non-canonical codes are instrumental in understanding the selective pressures maintaining the code. Here, we describe a new case of a non-canonical genetic code from the oxymonad flagellate Streblomastix strix. We have sequenced four protein-coding genes from S. strix and found that the canonical stop codons TAA and TAG encode the amino acid glutamine. These codons are retained in S. strix mRNAs, and the legitimate termination codons of all genes examined were found to be TGA, supporting the prediction that this should be the only true stop codon in this genome. Only four other lineages of eukaryotes are known to have evolved non-canonical nuclear genetic codes, and our phylogenetic analyses of alpha-tubulin, beta-tubulin, elongation factor-1 alpha (EF-1 alpha), heat-shock protein 90 (HSP90), and small subunit rRNA all confirm that the variant code in S. strix evolved independently of any other known variant. The independent origin of each of these codes is particularly interesting because the code found in S. strix, where TAA and TAG encode glutamine, has evolved in three of the four other nuclear lineages with variant codes, but this code has never evolved in a prokaryote or a prokaryote-derived organelle. The distribution of non-canonical codes is probably the result of a combination of differences in translation termination, tRNAs, and tRNA synthetases, such that the eukaryotic machinery preferentially allows changes involving TAA and TAG.
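The reassignment described in entry 19 amounts to a two-entry change in the translation table. The sketch below is a minimal illustration: `STANDARD`, `VARIANT` and `translate` are hypothetical names, and the input sequence is an invented example, not S. strix data.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code in the conventional TCAG ordering ("*" marks a stop codon).
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
STANDARD = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

# Variant code as described in the abstract: TAA and TAG encode glutamine (Q),
# leaving TGA as the only stop codon.
VARIANT = dict(STANDARD, TAA="Q", TAG="Q")


def translate(dna, code):
    """Translate an in-frame DNA string codon by codon, stopping at the first stop codon."""
    protein = []
    for i in range(0, len(dna) - len(dna) % 3, 3):
        aa = code[dna[i:i + 3]]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


if __name__ == "__main__":
    seq = "ATGTAACAGTGA"  # invented example sequence
    print(translate(seq, STANDARD))
    print(translate(seq, VARIANT))
```

Translating a gene from such an organism with the standard table truncates the protein at the first in-frame TAA or TAG, which is how reassignments of this kind typically become visible in sequence data.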
20.
We formulate the following hypothesis: Life's origin may have occurred during the lower Archaean, at a time when the environmental temperature was higher than it is at present. Preliminary consequences of this hypothesis are studied from the point of view of molecular evolution. We restrict our attention to implications regarding the genetic code. We conclude that the alternative assignment of termination codons may be understood in terms of: (a) the elevated temperatures to which the progenote may initially have been exposed; and (b) the subsequent response of its genome to the opportunity provided by the eventual loss of hyperthermal genetic expression during a thermal transition (TT) period, which was triggered by the evolution of the dynamic Earth.