The organization of the canonical genetic code needs to be thoroughly illuminated. Here we reorder the four nucleotides (adenine, thymine, guanine and cytosine) according to their emergence in evolution, and apply the organizational rules to devise an algebraic representation of the canonical genetic code. Within the framework of the devised code, we quantify codon and amino acid usage in a large collection of 917 prokaryotic genome sequences, and relate that usage to the code's intrinsic structure and classification schemes as well as to amino acid physicochemical properties. Our results show that the algebraic representation of the code is structurally equivalent to a content-centric organization of the code, and that codon and amino acid usage under different classification schemes correlates closely with GC content, implying a set of rules governing composition dynamics across a wide variety of prokaryotic genome sequences. These results also indicate that codons and amino acids are not randomly allocated in the code, and that the six-fold degenerate codons and their amino acids play important balancing roles in error minimization. The content-centric code is therefore highly useful for deciphering the code's hitherto unknown regularities as well as the dynamics of nucleotide, codon, and amino acid compositions.

A plausible architecture of an ancient genetic code is derived from an extended base triplet vector space over the Galois field of the extended base alphabet {D, A, C, G, U}, where symbol D represents one or more hypothetical bases with unspecific pairings. We hypothesized that the high degeneracy of a primeval genetic code with five bases, together with the gradual origin and improvement of a primeval DNA repair system, could have made possible the transition from ancient to modern genetic codes. Our results suggest that the Watson-Crick base pairings G ≡ C and A = U and the non-specific base pairing of the hypothetical ancestral base D, which are used to define the sum and product operations, are sufficient to determine the coding constraints of the primeval and the modern genetic code, as well as the transition from the former to the latter. Geometrical and algebraic properties of this vector space reveal that the present codon assignment of the standard genetic code could be induced from a primeval codon assignment. In addition, the Fourier spectrum of the extended DNA genome sequences derived from the multiple sequence alignment suggests that the so-called period-3 property of present coding DNA sequences could also have existed in ancient coding DNA sequences. The phylogenetic analyses performed with metrics defined in the N-dimensional vector space (B^3)^N of DNA sequences and with the new evolutionary model presented here also suggest that an ancient DNA coding sequence with five or more bases does not contradict the expected evolutionary history.

Reprogramming of the standard genetic code to include non-canonical amino acids (ncAAs) opens new prospects for medicine, industry, and biotechnology. There are several methods of code engineering, which allow us to store new genetic information in DNA sequences and to produce proteins with new properties. Here, we provide a theoretical background for optimal genetic code expansion, which may find application in the experimental design of the genetic code. We assume that the expanded genetic code includes both canonical and non-canonical information stored in the 64 classical codons. Moreover, the new coding system is robust to point mutations and minimizes the possibility of reversion from the new to the old information. To find such codes, we applied graph theory to analyze the properties of optimal codon sets. We present a formal procedure for finding the optimal codes with various numbers of vacant codons that could be assigned to new amino acids. Finally, we discuss the optimal number of newly incorporated ncAAs and the optimal size of the codon groups that can be assigned to ncAAs.

A computer program was used to test Wong's coevolution theory of the genetic code. The codon correlations between the codons of biosynthetically related amino acids in the universal genetic code and in randomly generated genetic codes were compared. It was determined that many codon correlations are also present within random genetic codes and that among the random codes there are always several which have many more correlations than are found in the universal code. Although the number of correlations depends on the choice of biosynthetically related amino acids, the probability of choosing a random genetic code with the same or a greater number of codon correlations as the universal genetic code was found to vary from 0.1% to 34% (with respect to a fairly complete listing of related amino acids). Thus, Wong's theory that the genetic code arose by coevolution with the biosynthetic pathways of amino acids, based on codon correlations between biosynthetically related amino acids, is statistical in nature. Received: 8 August 1996 / Accepted: 26 December 1996

Since the genetic code was first determined, many have claimed that it is organized adaptively, so as to assign similar codons to similar amino acids. This claim has proved difficult to establish due to the absence of relevant comparative data on alternative primordial codes and of objective measures of amino acid exchangeability. Here we use a recently developed measure of exchangeability to evaluate a null hypothesis and two alternative hypotheses about the adaptiveness of the genetic code. The null hypothesis that there is no tendency for exchangeable amino acids to be assigned to similar codons can be excluded here, as expected from earlier work. The first alternative hypothesis is that any such correlation between codon distance and amino acid distance is due to incremental mechanisms of code evolution, and not to adaptation to reduce deleterious effects of future mutations. More specifically, new codon assignments that occur by ambiguity reduction or by codon capture will tend to give rise to correlations, whether due to the condition of amino acid ambiguity, or to the condition of similarity between a new tRNA synthetase (or tRNA) and its parent. The second alternative hypothesis, the adaptive hypothesis, then may be defined as an excess relative to what may be expected given the incremental nature of evolution, reflecting true adaptation for robustness rather than an incidental effect. The results reported here indicate that most of the nonrandomness in the amino acid to codon assignments can be explained by incremental code evolution, with a small residue of orderliness that may reflect code adaptation.

Understanding how codons became associated with their specific amino acids is fundamental to deriving a theory for the origin of the genetic code. Carl Woese and coworkers designed a series of experiments to test associations between amino acids and nucleobases that may have played a role in establishing the genetic code. Through these experiments it was found that a property of amino acids called the polar requirement (PR) is correlated with the organization of the codon table. No other property of amino acids has been found that correlates with the codon table as well as PR, indicating that PR is uniquely related to the modern genetic code. Using molecular dynamics simulations of amino acids in the solutions of water and dimethylpyridine that were used to measure PR experimentally, we show that variations in the partitioning between the two phases, as described by radial distribution functions, correlate well with the measured PRs. Partition coefficients based on probability densities of the amino acids in each phase show the linear behavior with base concentration suggested by the PR experiments.

An algebraic and geometrical approach is used to describe the primaeval RNA code and a proposed Extended RNA code. The former consists of all codons of the type RNY, where R means purines, Y pyrimidines, and N any of them. The latter comprises the 16 codons of the type RNY plus the codons obtained by reading the RNA code in the second (NYR type) and third (YRN type) reading frames. In each of these reading frames there are 16 triplets, which altogether complete a set of 48 triplets that specify 17 out of the 20 amino acids, including AUG, the start codon, and the three known stop codons. The other 16 codons do not belong to the Extended RNA code and constitute the union of the triplets YYY and RRR, which we define as the RNA-less code. The codons in each of the three subsets of the Extended RNA code are represented by a four-dimensional hypercube, and the set of codons of the RNA-less code is portrayed as a four-dimensional hyperprism. Remarkably, the union of these four symmetrical, pairwise disjoint sets comprises precisely the already known six-dimensional hypercube of the Standard Genetic Code (SGC) of 64 triplets. These results suggest a plausible evolutionary path by which the primaeval RNA code could have given rise to the SGC, via the Extended RNA code plus the RNA-less code. We argue that the life forms that probably obeyed the Extended RNA code were intermediate between the ribo-organisms of the RNA World and the last common ancestor (LCA) of the Prokaryotes, Archaea, and Eucarya, that is, the cenancestor. A general encoding function, E, which maps each codon to its corresponding amino acid or the stop signal, is also derived. In 45 out of the 64 cases, this function takes the form of a linear transformation F, which projects the whole six-dimensional hypercube onto a four-dimensional hyperface formed by all triplets that end in cytosine. In the remaining 19 cases the function E adopts the form of an affine transformation, i.e., the composition of F with a particular translation. Graphical representations of the four local encoding functions and of E are illustrated and discussed. For every amino acid and for the stop signal, a single triplet, among those that specify it, is selected as a canonical representative. From this mapping a graphical representation of the 20 amino acids and the stop signal is also derived. We conclude that the general encoding function E represents the SGC itself.
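
The codon counts quoted in this abstract can be checked by direct enumeration. The following is a minimal sketch, not the authors' algebraic construction: it only builds the RNY, NYR and YRN triplet sets and their complement. The helper name `triplets` and the variable names are assumptions introduced here for illustration.

```python
from itertools import product

PURINES = "AG"       # R
PYRIMIDINES = "CU"   # Y
ANY = "ACGU"         # N


def triplets(first, second, third):
    """Return the set of all triplets drawn from the three base classes."""
    return {a + b + c for a, b, c in product(first, second, third)}


# Primaeval RNA code (RNY) and its two frame-shifted readings.
rny = triplets(PURINES, ANY, PYRIMIDINES)
nyr = triplets(ANY, PYRIMIDINES, PURINES)
yrn = triplets(PYRIMIDINES, PURINES, ANY)

# Extended RNA code: union of the three 16-codon subsets.
extended = rny | nyr | yrn

# Everything else among the 64 triplets: the "RNA-less" code.
all_codons = triplets(ANY, ANY, ANY)
rna_less = all_codons - extended

if __name__ == "__main__":
    print(len(rny), len(nyr), len(yrn), len(extended), len(rna_less))
```

Under these definitions the three subsets are pairwise disjoint, and the complement is exactly the union of the RRR and YYY triplets, which is the partition the abstract describes.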

The standard genetic code is known to be much more efficient in minimizing adverse effects of misreading errors and one-point mutations than a random code having the same structure, i.e. the same number of codons coding for each particular amino acid. We study the inverse problem: how the code structure affects the optimal physico-chemical parameters of amino acids that ensure the highest stability of the genetic code. It is shown that the choice of two or more amino acids with given properties determines unambiguously all the others. In this sense the code structure strictly determines the optimal parameters of amino acids, and the corresponding scales may be derived directly from the genetic code. In a code with the structure of the standard genetic code, the resulting values for hydrophobicity obtained in the “leave one out” scheme and in the scheme with fixed maximum and minimum parameters correlate significantly with the natural scale. The comparison of the optimal and natural parameters allows assessing the relative impact of physico-chemical and error-minimization factors during evolution of the genetic code. As the resulting optimal scale depends on the choice of amino acids with given parameters, the technique can also be applied to testing various scenarios of code evolution with an increasing number of encoded amino acids. Our results indicate the co-evolution of the genetic code and the physico-chemical properties of recruited amino acids.

M. A. Soto, C. J. Tohá. Bio Systems, 1985, 18(2): 209-215
A quantitative rationale for the evolution of the genetic code is developed considering the principle of minimal hardware. This principle defines an optimal code as one that minimizes, for a given amount of information encoded, the product of the number of physical devices used and the average complexity of each device. By identifying the number of different amino acids, the number of nucleotide positions per codon and the number of base types that can occupy each such position with, respectively, the amount of information, the number of devices and the complexity, we show that optimal codes occur for 3, 7 and 20 amino acids with codons having one, two and three base positions per codon, respectively. The advantage of a code of exactly 4 symbols is deduced, as well as a plausible evolutionary pathway from a code of doublets to triplets. The present-day code of 20 amino acids encoded by 64 codons is shown to be optimal in an absolute sense. Using a tetraplet code, further evolution to a code in which there would be 55 amino acids is in principle possible, but such a code would deviate slightly more than the present-day code from the minimal hardware configuration. The change from a triplet code to a tetraplet code would occur at about 32 amino acids. Our conclusions are independent of, but consistent with, the observed physico-chemical properties of the amino acids and codon structures. These correlations could have evolved within the constraints imposed by the minimal hardware principle.

The aminoacyl-tRNA synthetases exist as two enzyme families which were apparently generated by divergent evolution from two primordial synthetases. The two classes of enzymes exhibit intriguing familial relationships, in that they are distributed nonrandomly within the codon-amino acid matrix of the genetic code. For example, all XCX codons code for amino acids handled by class II synthetases, and all but one of the XUX codons code for amino acids handled by class I synthetases. One interpretation of these patterns is that the synthetases coevolved with the genetic code. The more likely explanation, however, is that the synthetases evolved in the context of an already-established genetic code, a code which developed earlier in an RNA world. The rules which governed the development of the genetic code, and led to certain patterns in the coding catalog between codons and amino acids, would also have governed the subsequent evolution of the synthetases in the context of a fixed code, leading to patterns in synthetase distribution such as those observed. These rules are (1) conservative evolution of amino acid and adapter binding sites and (2) minimization of the disruptive effects on protein structure caused by codon meaning changes.
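
The XCX and XUX pattern mentioned in this abstract can be reproduced with a short check. This is an illustrative sketch only: the class I and class II sets below are the commonly cited textbook assignment, supplied here as an assumption rather than taken from the abstract (lysine is treated as class II, although a class I lysyl-tRNA synthetase is known in some organisms). Codons are written with T in place of U.

```python
from itertools import product

# Standard genetic code, with bases ordered T, C, A, G and "*" marking stops.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

# Assumed synthetase classes (one-letter amino acid codes).
CLASS_I = set("RCQEILMWYV")
CLASS_II = set("ANDGHKFPST")


def amino_acids_with_second_base(base):
    """Amino acids encoded by codons whose middle position is `base`."""
    return {aa for codon, aa in CODE.items() if codon[1] == base and aa != "*"}


xcx = amino_acids_with_second_base("C")
xux = amino_acids_with_second_base("T")

# Amino acids in the XUX column that are not handled by a class I enzyme.
xux_exceptions = xux - CLASS_I

if __name__ == "__main__":
    print("XCX:", sorted(xcx), "all class II:", xcx <= CLASS_II)
    print("XUX:", sorted(xux), "exceptions:", sorted(xux_exceptions))
```

With these assumed classes, the single exception in the XUX column is phenylalanine.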

Selection for resource conservation can shape the coding sequences of organisms living in nutrient-limited environments. Recently, it was proposed that selection for resource conservation, specifically for nitrogen and carbon content, has also shaped the structure of the standard genetic code, such that the missense mutations the code allows tend to cause small increases in the number of nitrogen and carbon atoms in amino acids. Moreover, it was proposed that this optimization is not confounded by known optimizations of the standard genetic code, such as for polar requirement or hydropathy. We challenge these claims. We show that the proposed optimization for nitrogen conservation is highly sensitive to the choice of null model and that the proposed optimization for carbon conservation is confounded by the known conservative nature of the standard genetic code with respect to the molecular volume of amino acids. There is therefore little evidence that the standard genetic code is optimized for resource conservation. We discuss our findings in the context of null models of the standard genetic code.

The codon table for the canonical genetic code can be rearranged in such a way that the code is divided into four quarters and two halves according to the variability of their GC and purine contents, respectively. For prokaryotic genomes, when the genomic GC content increases, their amino acid contents tend to be restricted to the GC-rich quarter and the purine-content insensitive half, where all codons are fourfold degenerate and relatively mutation-tolerant. Conversely, when the genomic GC content decreases, most of the codons retract to the AU-rich quarter and the purine-content sensitive half; most of these codons not only continue to encode physicochemically diverse amino acids but also change meaning when a transversion (between purine and pyrimidine) happens. Amino acids with sixfold-degenerate codons are distributed into all four quarters and across the two halves; their fourfold-degenerate codons are all partitioned into the purine-insensitive half, favoring robustness against mutations. The features manifested in the rearranged codon table explain most of the intrinsic relationship between protein coding sequences (the informational content) and amino acid compositions (the functional content). The rearranged codon table is useful in predicting abundant amino acids and in positioning amino acids with related or distinct physicochemical properties.

A quantitative measure of error minimization in the genetic code (total citations: 7; self-citations: 0; citations by others: 7)
Summary: We have calculated the average effect of changing a codon by a single base for all possible single-base changes in the genetic code and for changes in the first, second, and third codon positions separately. Such values were calculated for an amino acid's polar requirement, hydropathy, molecular volume, and isoelectric point. For each attribute the average effect of single-base changes was also calculated for a large number of randomly generated codes that retained the same level of redundancy as the natural code. Amino acids whose codons differed by a single base in the first and third codon positions were very similar with respect to polar requirement and hydropathy. The major differences between amino acids were specified by the second codon position. Codons with U in the second position specify hydrophobic amino acids, whereas most codons with A in the second position specify hydrophilic ones. This accounts for the observation of complementary hydropathy. Single-base changes in the natural code had a smaller average effect on polar requirement than all but 0.02% of random codes. This result is most easily explained by selection to minimize deleterious effects of translation errors during the early evolution of the code.
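
The comparison described in this summary can be sketched as follows. This is a simplified illustration under stated assumptions, not the authors' exact procedure: it uses the Kyte-Doolittle hydropathy scale (the abstract does not say which hydropathy scale was used), scores a code by the mean squared property change over all single-base changes between sense codons, and generates random codes by permuting which amino acid is assigned to each synonymous codon block, so redundancy is preserved. The function names are hypothetical.

```python
import random
from itertools import product

# Standard genetic code, with bases ordered T, C, A, G and "*" marking stops.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

# Kyte-Doolittle hydropathy index (assumed scale for this sketch).
HYDROPATHY = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def mean_squared_change(code, scale):
    """Mean squared property change over single-base changes between sense codons."""
    total, count = 0.0, 0
    for codon, aa in code.items():
        if aa == "*":
            continue
        for pos in range(3):
            for base in BASES:
                if base == codon[pos]:
                    continue
                other = code[codon[:pos] + base + codon[pos + 1:]]
                if other == "*":
                    continue  # changes to or from stop codons are ignored here
                total += (scale[aa] - scale[other]) ** 2
                count += 1
    return total / count


def shuffled_code(code, rng):
    """Random code with the same redundancy: permute amino acids among codon blocks."""
    aas = sorted(set(code.values()) - {"*"})
    permuted = aas[:]
    rng.shuffle(permuted)
    mapping = dict(zip(aas, permuted))
    mapping["*"] = "*"  # stop codons keep their positions
    return {codon: mapping[aa] for codon, aa in code.items()}


def fraction_better(code, scale, trials=1000, seed=0):
    """Fraction of random codes with a smaller mean squared change than `code`."""
    rng = random.Random(seed)
    natural = mean_squared_change(code, scale)
    better = sum(
        mean_squared_change(shuffled_code(code, rng), scale) < natural
        for _ in range(trials)
    )
    return better / trials


if __name__ == "__main__":
    print("natural code score:", round(mean_squared_change(CODE, HYDROPATHY), 3))
    print("fraction of random codes that score better:", fraction_better(CODE, HYDROPATHY))
```

Restricting the inner loop to a single value of `pos` would give the position-specific averages that the summary also mentions.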

We have previously proposed an SNS hypothesis on the origin of the genetic code (Ikehara and Yoshida 1998). The hypothesis predicts that the universal genetic code originated from the SNS code composed of 16 codons and 10 amino acids (S and N mean G or C and any of the four bases, respectively). However, it must have been very difficult to create the SNS code at one stroke in the beginning. Therefore, we searched for a simpler code than the SNS code, which could still encode water-soluble globular proteins with appropriate three-dimensional structures at a high probability, using four conditions for globular protein formation (hydropathy, α-helix, β-sheet, and β-turn formations). Four amino acids (Gly [G], Ala [A], Asp [D], and Val [V]) encoded by the GNC code satisfied the four structural conditions well, but other codes in rows and columns of the universal genetic code table did not, except for the GNG code, a slightly modified form of the GNC code. Three three-amino acid systems ([D], Leu and Tyr; [D], Tyr and Met; Glu, Pro and Ile) also satisfied the above four conditions. However, some amino acids in the three systems are far more complex than those encoded by the GNC code. In addition, the amino acids in the three-amino acid systems are scattered in the universal genetic code table. Thus, we concluded that the universal genetic code originated not from a three-amino acid system but from a four-amino acid system, the GNC code encoding [GADV]-proteins, as the most primitive genetic code. Received: 11 June 2001 / Accepted: 11 October 2001
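
The codon and amino acid counts for the SNS and GNC codes can be verified against the standard code table. The sketch below is only a bookkeeping check and does not model the four structural conditions; codons are written with T in place of U, and the variable names are introduced here for illustration.

```python
from itertools import product

# Standard genetic code, with bases ordered T, C, A, G and "*" marking stops.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

STRONG = "GC"  # S: G or C

# SNS code: S in the first and third positions, any base in the second.
sns_codons = {a + b + c for a, b, c in product(STRONG, BASES, STRONG)}
sns_amino_acids = {CODE[codon] for codon in sns_codons}

# GNC code and its slight variant GNG.
gnc_amino_acids = {CODE["G" + b + "C"] for b in BASES}
gng_amino_acids = {CODE["G" + b + "G"] for b in BASES}

if __name__ == "__main__":
    print(len(sns_codons), sorted(sns_amino_acids))
    print(sorted(gnc_amino_acids), sorted(gng_amino_acids))
```

The GNG variant differs from GNC in one amino acid: GAG encodes Glu where GAC encodes Asp.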

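Site-specific incorporation of noncanonical amino acids (ncAAs) through genetic code expansion allows protein structure and function to be manipulated at the atomic level. The technique can currently incorporate more than 200 ncAAs into a wide range of organisms, including higher animals and plants, and it has been widely applied in biomedicine. With its unique ability to introduce controllable bioorthogonal chemical functional groups at defined sites in proteins, the technique can be used not only to develop protein and peptide drugs and to improve their quality and efficacy, but also to provide pioneering solutions for the prevention and treatment of some major human diseases. This review focuses on recent advances in genetic code expansion and on its applications in protein and peptide drugs such as antibodies, cytokines, and antimicrobial peptides, and it also briefly describes the new biotherapeutic approaches derived from the technique.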

Some aspects of the organization and evolution of the genetic code (total citations: 1; self-citations: 0; citations by others: 1)
In this paper, I define a measure of the relative position of each amino acid in the genetic code by means of a 21-dimensional vector describing its potential for mutation, in a single step, to each of the other amino acids, or to a chain termination codon. This measure allows us to make a systematic investigation of the type and number of the physicochemical properties of the amino acids that were involved in evolution. The polar character and size of amino acids are identified in this analysis as properties that played a leading role in the evolutionary history of the genetic code. The application of cluster analysis and discriminant analysis reveals the characteristics of the structural organization of the genetic code. Finally, I suggest the existence of a relationship between the molecular weight of the amino acids and the number of synonymous codons.
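
One way to make the 21-dimensional vector concrete is to count, for each amino acid, how many single-base changes of its codons lead to each possible outcome. This is a sketch under assumptions, since the abstract does not give the exact definition: here the 21 components are the 20 amino acids (including the amino acid itself, for synonymous changes) plus the stop signal, and each component is a raw count. The names `LABELS` and `mutation_vector` are hypothetical.

```python
from itertools import product

# Standard genetic code, with bases ordered T, C, A, G and "*" marking stops.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

# Component order: 20 amino acids alphabetically, then the stop signal.
LABELS = sorted(set(AMINO) - {"*"}) + ["*"]


def mutation_vector(amino_acid):
    """Count single-base changes from the codons of `amino_acid` to each outcome."""
    counts = dict.fromkeys(LABELS, 0)
    for codon, source in CODE.items():
        if source != amino_acid:
            continue
        for pos in range(3):
            for base in BASES:
                if base != codon[pos]:
                    target = CODE[codon[:pos] + base + codon[pos + 1:]]
                    counts[target] += 1
    return [counts[label] for label in LABELS]


if __name__ == "__main__":
    for aa in ("M", "W", "L"):
        print(aa, dict(zip(LABELS, mutation_vector(aa))))
```

Each codon has nine single-base neighbours, so the components of a vector sum to nine times the number of codons for that amino acid. Vectors of this kind could then be passed to a clustering routine, as the abstract describes, although the distance measure used there is not specified.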

Tetrahymena thermophila and Paramecium tetraurelia are ciliates that reassign TAA and TAG from stop codons to glutamine codons. Because of the lack of full genome sequences, few studies have concentrated on analyzing the effects of codon reassignment on protein evolution. We used the recently sequenced genomes of these species to analyze the patterns of amino acid substitution in ciliates that reassign the code. We show that, as expected, the codon reassignment has a large impact on amino acid substitutions in closely related proteins; however, contrary to expectations, these effects also hold for very diverged proteins. Previous studies have used amino acid substitution data to calculate the minimization of the genetic code; our results show that because of the lasting influence of the code on the patterns of substitution, such studies are tautological. These different substitution patterns might affect the alignment of ciliate proteins, as alignment programs use scoring matrices based on substitution patterns of organisms that use the standard code. We also show that glutamine is used more frequently in ciliates than in other species, as often as expected based on the presence of the 2 newly reassigned codons, indicating that the frequencies of amino acids in proteomes are mostly determined by neutral processes based on their number of codons.

The frequencies of A, C, G, and T in mitochondrial DNA vary among species due to unequal rates of mutation between the bases. The frequencies of bases at fourfold degenerate sites respond directly to mutation pressure. At first and second positions, selection reduces the degree of frequency variation. Using a simple evolutionary model, we show that first position sites are less constrained by selection than second position sites and, therefore, that the frequencies of bases at first position are more responsive to mutation pressure than those at second position. We define a measure of distance between amino acids that is dependent on eight measured physical properties and a similarity measure that is the inverse of this distance. Columns 1, 2, 3, and 4 of the genetic code correspond to codons with U, C, A, and G in their second position, respectively. The similarity of amino acids in the four columns decreases systematically from column 1 to column 2 to column 3 to column 4. We then show that the responsiveness of first position bases to mutation pressure is dependent on the second position base and follows the same decreasing trend through the four columns. Again, this shows the correlation between physical properties and responsiveness. We determine a proximity measure for each amino acid, which is the average similarity between an amino acid and all others that are accessible via single point mutations in the mitochondrial genetic code structure. We also define a responsiveness for each amino acid, which measures how rapidly an amino acid frequency changes as a result of mutation pressure acting on the base frequencies. We show that there is a strong correlation between responsiveness and proximity, and that both these quantities are also correlated with the mutability of amino acids estimated from the mtREV substitution rate matrix. We also consider the variation of base frequencies between strands and between genes on a strand. These trends are consistent with the patterns expected from analysis of the variation among genomes. [Reviewing Editor: Dr. David Pollock]
