Similar articles
 20 similar articles found; search took 15 ms
1.
The canonical genetic code has been reported both to be error minimizing and to show stereochemical associations between coding triplets and binding sites. In order to test whether these two properties are unexpectedly overlapping, we generated 200,000 randomized genetic codes using each of five randomization schemes, with and without randomization of stop codons. Comparison of the code error (difference in polar requirement for single-nucleotide codon interchanges) with the coding triplet concentrations in RNA binding sites for eight amino acids shows that these properties are independent and uncorrelated. Thus, one is not the result of the other, and error minimization and triplet associations probably arose independently during the history of the genetic code. We explicitly show that prior fixation of a stereochemical core is consistent with an effective later minimization of error. [Reviewing Editor: Dr. Stephen Freeland]
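The "code error" described in the abstract above can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the function name `code_error`, the use of a squared difference, and the choice to skip interchanges involving stop codons are all assumptions. The amino acid property scale (polar requirement in the abstract) is passed in by the caller rather than hard-coded here.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code in NCBI translation-table order
# (first, second and third base each cycle through T, C, A, G).
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
STANDARD_CODE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def code_error(code, scale):
    """Mean squared change in `scale` over single-nucleotide codon interchanges.

    Assumptions (not taken from the abstract): differences are squared, and
    interchanges to or from a stop codon ("*") are skipped.
    """
    total, count = 0.0, 0
    for codon, aa in code.items():
        if aa == "*":
            continue
        for pos in range(3):
            for base in BASES:
                if base == codon[pos]:
                    continue
                neighbour = codon[:pos] + base + codon[pos + 1:]
                other = code[neighbour]
                if other == "*":
                    continue
                total += (scale[aa] - scale[other]) ** 2
                count += 1
    return total / count
```

To reproduce the kind of comparison the abstract describes, one would call `code_error` with a published polar requirement table as `scale`, once for the standard code and once for each randomized code.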

2.
The multiple codes of nucleotide sequences   Total citations: 4, self-citations: 0, citations by others: 4
Nucleotide sequences carry genetic information of many different kinds, not just instructions for protein synthesis (triplet code). Several codes of nucleotide sequences are discussed including: (1) the translation framing code, responsible for correct triplet counting by the ribosome during protein synthesis; (2) the chromatin code, which provides instructions on appropriate placement of nucleosomes along the DNA molecules and their spatial arrangement; (3) a putative loop code for single-stranded RNA-protein interactions. The codes are degenerate and corresponding messages are not only interspersed but actually overlap, so that some nucleotides belong to several messages simultaneously. Tandemly repeated sequences frequently considered as functionless “junk” are found to be grouped into certain classes of repeat unit lengths. This indicates some functional involvement of these sequences. A hypothesis is formulated according to which the tandem repeats are given the role of weak enhancer-silencers that modulate, in a copy number-dependent way, the expression of proximal genes. Fast amplification and elimination of the repeats provides an attractive mechanism of species adaptation to a rapidly changing environment.

3.
4.
The laws governing degeneration of the genetic code are discussed below. Of fundamental importance in this context is the classification of the amino acids into groups on the basis of the physicochemical behaviour of their residues. From this, it is possible to formulate arithmetic relationships between the number of amino acids in the same group and the number of coding triplets. It is found that the degeneration of the genetic code obeys certain laws, the reasons for this being related to the number and the qualitative properties of the amino acids and triplets. The fact that the three bases of a coding triplet have different priorities must also be a critical factor.

5.
M A Soto, C J Tohá. Bio Systems 1985, 18(2):209-215
A quantitative rationale for the evolution of the genetic code is developed considering the principle of minimal hardware. This principle defines an optimal code as one that minimizes, for a given amount of information encoded, the product of the number of physical devices used and the average complexity of each device. By identifying the number of different amino acids, number of nucleotide positions per codon and number of base types that can occupy each such position with, respectively, the amount of information, number of devices and the complexity, we show that optimal codes occur for 3, 7 and 20 amino acids with codons having one, two and three base positions per codon, respectively. The advantage of a code of exactly 4 symbols is deduced, as well as a plausible evolutionary pathway from a code of doublets to triplets. The present day code of 20 amino acids encoded by 64 codons is shown to be the most optimal in an absolute sense. Using a tetraplet code, further evolution to a code in which there would be 55 amino acids is in principle possible, but such a code would deviate slightly more than the present day code from the minimal hardware configuration. The change from a triplet code to a tetraplet code would occur at about 32 amino acids. Our conclusions are independent of, but consistent with, the observed physico-chemical properties of the amino acids and codon structures. These correlations could have evolved within the constraints imposed by the minimal hardware principle.

6.
7.
A novel concept on mechanisms of evolution of genes and genomes is suggested: the sequences evolve largely by local events of triplet expansion and subsequent mutational changes in the repeats. The immediate memory about the earlier expansion events still resides in the sequences, in the form of the frequently occurring segments of tandemly repeating codons. Other predicted fossils of the original repeats are: (I) the expanding triplets should be accompanied by their point mutation derivatives and (II) the remaining excess of codons formerly belonging to the tandem repeats should be reflected in overall codon usage biases. Both predictions are confirmed by analysis of the largest available database of non-redundant protein coding sequences, of total size ~5 × 10^9 codons. One important conclusion also follows from the results. Life which, presumably, started with replication of expanding triplets and their subsequent mutational changes, is continuing to emerge within the genes and genomes, in the form of new events of triplet expansion.

8.
M Kozak. Cell 1983, 34(3):971-978
Plasmids have been constructed containing reiterated copies of a 66 bp fragment, loosely referred to as the ribosome binding site, that includes the AUG initiator codon of preproinsulin. The extreme test involved plasmid 255/17, which carried four tandem copies of the ribosome binding site, with all four AUG triplets in the same reading frame as the preproinsulin coding sequence downstream. Initiation at any potential start site would generate a polypeptide precipitable with anti-insulin antiserum, and its size would reveal the AUG(s) active in initiation. One insulin-related polypeptide was synthesized in cells transfected by p255/17; its size corresponded to the product initiated at the first ribosome binding site in the tandem array. Inasmuch as the three downstream AUG triplets, which are not used, occur in a sequence context identical with that around the 5'-proximal AUG triplet, which is used, the position of an AUG triplet relative to the 5' end of the mRNA appears to be important in identifying it as a functional initiator codon.

9.
Friedreich ataxia is caused by expansion of a GAA triplet repeat (GAA-TR) in the FRDA gene. Normal alleles contain <30 triplets, and disease-causing expansions (66-1700 triplets) arise via hyperexpansion of premutations (30-65 triplets). To gain insight into GAA-TR instability we analyzed all triplet repeats in the human genome. We identified 988 (GAA)(8+) repeats, 291 with ≥20 triplets, including 29 potential premutations (30-62 triplets). Most other triplet repeats were restricted to <20 triplets. We estimated the expected frequency of (GAA)(6+) repeats to be negligible, further indicating that GAA-TRs have undergone significant expansion. Eighty-nine percent of (GAA)(8+) sequences map within G/A islands, and 58% map within the poly(A) tails of Alu elements. Only two other (GAA)(8+) sequences shared the central Alu location seen at the FRDA locus. One showed allelic variation, including expansions analogous to short Friedreich ataxia mutations. Our data demonstrate that GAA-TRs have expanded throughout primate evolution with the generation of potential premutation alleles at multiple loci.
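A scan for (GAA)(8+) runs of the kind described above might look like the sketch below. It is only an illustration: the names `find_gaa_repeats` and `classify_gaa` are hypothetical, only perfect uninterrupted repeats on the given strand are matched, and the size classes simply reuse the thresholds quoted in the abstract (lengths above 1700 triplets are also labelled "expansion" here, which is an assumption).

```python
import re

# Eight or more uninterrupted GAA triplets, matched greedily.
GAA_RUN = re.compile(r"(?:GAA){8,}")


def classify_gaa(triplets):
    """Label a repeat length using the ranges quoted in the abstract."""
    if triplets < 30:
        return "normal"
    if triplets <= 65:
        return "premutation"
    return "expansion"  # the abstract gives 66-1700 triplets for disease alleles


def find_gaa_repeats(seq):
    """Return (start, triplet_count, label) for every (GAA)8+ run in `seq`."""
    hits = []
    for match in GAA_RUN.finditer(seq.upper()):
        triplets = len(match.group()) // 3
        hits.append((match.start(), triplets, classify_gaa(triplets)))
    return hits
```

A genome-wide survey would also need to scan the reverse complement (TTC runs) and decide how to treat interrupted repeats; neither is handled in this sketch.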

10.
The genetic code is known to have a high level of error robustness and has been shown to be very error robust compared to randomly selected codes, but to be significantly less error robust than a certain code found by a heuristic algorithm. We formulate this optimization problem as a Quadratic Assignment Problem and use this to formally verify that the code found by the heuristic algorithm is the global optimum. We also argue that it is strongly misleading to compare the genetic code only with codes sampled from the fixed block model, because the real code space is orders of magnitude larger. We thus enlarge the space from which random codes can be sampled from approximately 2.433 × 10^18 codes to approximately 5.908 × 10^45 codes. We do this by leaving the fixed block model, and using the wobble rules to formulate the characteristics acceptable for a genetic code. By relaxing more constraints, three larger spaces are also constructed. Using a modified error function, the genetic code is found to be more error robust compared to a background of randomly generated codes with increasing space size. We point out that these results do not necessarily imply that the code was optimized during evolution for error minimization, but that other mechanisms could be the reason for this error robustness.
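The figure of roughly 2.433 × 10^18 codes quoted above matches 20!, the number of ways to assign 20 amino acids to the 20 synonymous codon blocks of the standard code. The sketch below checks that number and samples one code from this fixed block model. The function names are hypothetical, and keeping the stop codons fixed is an assumption about how the model is applied.

```python
import math
import random
from collections import Counter
from itertools import product

BASES = "TCAG"
# Standard genetic code in NCBI translation-table order.
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
STANDARD_CODE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}

# 20 amino acids permuted among 20 fixed synonymous blocks.
FIXED_BLOCK_CODES = math.factorial(20)


def random_fixed_block_code(code, rng):
    """Permute amino acids among the existing codon blocks; stops stay put."""
    amino_acids = sorted(set(code.values()) - {"*"})
    shuffled = amino_acids[:]
    rng.shuffle(shuffled)
    mapping = dict(zip(amino_acids, shuffled))
    mapping["*"] = "*"
    return {codon: mapping[aa] for codon, aa in code.items()}


def block_sizes(code):
    """Sorted sizes of the synonymous codon blocks (stop block included)."""
    return sorted(Counter(code.values()).values())
```

Because only the labels of the blocks are shuffled, every sampled code keeps the block structure of the standard code, which is exactly the restriction the abstract argues is too narrow.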

11.
Code domains in tandem repetitive DNA sequence structures   Total citations: 6, self-citations: 0, citations by others: 6
Peter Vogt. Chromosoma 1992, 101(10):585-589
Traditionally, many people doing research in molecular biology attribute coding properties to a given DNA sequence if this sequence contains an open reading frame for translation into a sequence of amino acids. This protein coding capability of DNA was detected about 30 years ago. The underlying genetic code is highly conserved and present in every biological species studied so far. Today, it is obvious that DNA has a much larger coding potential for other important tasks. Apart from coding for specific RNA molecules such as rRNA, snRNA and tRNA molecules, specific structural and sequence patterns of the DNA chain itself express distinct codes for the regulation and expression of its genetic activity. A chromatin code has been defined for phasing of the histone-octamer protein complex in the nucleosome. A translation frame code has been shown to exist that determines correct triplet counting at the ribosome during protein synthesis. A loop code seems to organize the single stranded interaction of the nascent RNA chain with proteins during the splicing process, and a splicing code phases successive 5' and 3' splicing sites. Most of these DNA codes are not exclusively based on the primary DNA sequence itself, but also seem to include specific features of the corresponding higher order structures. Based on the view that these various DNA codes are genetically instructive for specific molecular interactions or processes, important in the nucleus during interphase and during cell division, the coding capability of tandem repetitive DNA sequences has recently been reconsidered.

12.
The codon-degeneracy model (CDM) predicts that patterns of nucleotide substitution in protein-coding genes are largely determined by the relative frequencies of four-fold (4f), two-fold, and non-degenerate sites, the attributes of which are determined by the structure of the governing genetic code. The CDM thus further predicts that genetic codes with alternative structures will "filter" molecular evolution differentially. A method, therefore, is presented by which the CDM may be applied to the unique structure of any genetic code. The mathematical relationship between the proportion of transitions at 4f degenerate nucleotide sites and the transition-to-transversion ratio is described. Predictions for five individual genetic codes, relative to the relationship between code structure and expected patterns of nucleotide substitution, are clearly defined. To test this "filter" hypothesis of genetic codes, simulated DNA sequence data sets were generated with a variety of input parameter values to estimate the relationship between patterns of nucleotide substitution and best-fit estimates of transition bias at 4f degenerate sites for both the universal genetic code and the vertebrate mitochondrial genetic code. These analyses confirm the prediction of the CDM that, all else being equal, even small differences in the structure of alternative genetic codes may result in significant shifts in the overall pattern of nucleotide substitution.
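How code structure fixes the mix of four-fold, two-fold and non-degenerate sites can be illustrated for third codon positions. The sketch below compares the standard code with the vertebrate mitochondrial code (AGA/AGG as stops, AUA as Met, UGA as Trp). It is a simplified illustration rather than the CDM method itself: only third positions are counted, stop codons are treated as one more class, and the function names are hypothetical.

```python
from collections import Counter
from itertools import product

BASES = "TCAG"
# Amino acid strings in NCBI translation-table order (tables 1 and 2).
STANDARD = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
VERT_MITO = "FFLLSSSSYY**CCWWLLLLPPPPHHQQRRRRIIMMTTTTNNKKSS**VVVVAAAADDEEGGGG"


def make_code(aminos):
    """Map each of the 64 codons to its amino acid (or '*' for stop)."""
    return {"".join(c): a for c, a in zip(product(BASES, repeat=3), aminos)}


def third_position_degeneracy(code):
    """Count codons by how many third-base choices keep the same meaning."""
    counts = Counter()
    for codon, aa in code.items():
        same = sum(code[codon[:2] + base] == aa for base in BASES)
        counts[same] += 1
    return dict(counts)
```

Under these assumptions the standard code has 32 four-fold, 3 three-fold, 26 two-fold and 3 non-degenerate third positions, while the vertebrate mitochondrial code has 32 four-fold and 32 two-fold third positions, so a handful of reassignments removes every odd-sized class.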

13.
It is known that different codons may be unified into larger groups related to the hierarchical structure, approximate hidden symmetries, and evolutionary origin of the universal genetic code. Using a simplified, evolutionarily motivated two-letter version of the genetic code, the general principles of the most stable coding are discussed. By complete enumeration in such a reduced code it is strictly proved that maximum stability with respect to point mutations and shifts in the reading frame requires the fixation of the middle letters within codons in groups with different physico-chemical properties, thus explaining a key feature of the universal genetic code. The translational stability of the genetic code is studied by mapping the code onto a de Bruijn graph, which provides both a compact visual representation of the mutual relationships between different codons, as well as between codons and protein coding DNA sequence, and a powerful tool for the investigation of the stability of protein coding. Then, the results are extended to four-letter codes. As is shown, the universal genetic code obeys mainly the principles of optimal coding. These results demonstrate the hierarchical character of the optimization of the universal genetic code, with strictly optimal coding having evolved at the earliest stages of molecular evolution. Finally, the universal genetic code is compared with the other natural variants of genetic codes.
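One way to picture the de Bruijn mapping mentioned above is sketched below. The abstract does not state the order of the graph, so this assumes the order-2 graph over four letters: each dinucleotide is a node and each codon XYZ is an edge from node XY to node YZ. The function names are hypothetical.

```python
from itertools import product

BASES = "ACGU"


def codon_de_bruijn_graph():
    """Map each codon XYZ to its edge (XY -> YZ) in the order-2 de Bruijn graph."""
    return {x + y + z: (x + y, y + z) for x, y, z in product(BASES, repeat=3)}


def overlapping_codons(seq):
    """All triplets of `seq` read with a step of one base (every reading frame)."""
    return [seq[i:i + 3] for i in range(len(seq) - 2)]


def is_walk(codons, graph):
    """True if consecutive codons follow each other head to tail in the graph."""
    return all(graph[a][1] == graph[b][0] for a, b in zip(codons, codons[1:]))
```

In this picture a sequence read one base at a time traces a walk through the graph, so a frameshift moves the reader to a neighbouring edge, which is why the graph is convenient for studying stability to reading-frame shifts.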

14.
Reprogramming of the standard genetic code to include non-canonical amino acids (ncAAs) opens new prospects for medicine, industry, and biotechnology. There are several methods of code engineering, which allow us to store new genetic information in DNA sequences and to produce proteins with new properties. Here, we provided a theoretical background for the optimal genetic code expansion, which may find application in the experimental design of the genetic code. We assumed that the expanded genetic code includes both canonical and non-canonical information stored in 64 classical codons. What is more, the new coding system is robust to point mutations and minimizes the possibility of reversion from the new to old information. In order to find such codes, we applied graph theory to analyze the properties of optimal codon sets. We presented the formal procedure for finding the optimal codes with various numbers of vacant codons that could be assigned to new amino acids. Finally, we discussed the optimal number of the newly incorporated ncAAs and also the optimal size of codon groups that can be assigned to ncAAs.

15.
A chemical language of the genetic code is suggested in which elementary information code units are represented by functional groups of amino acids and nucleotides. Using this language, the existence of correspondence and conformity between the chemical parameters of amino acids and those of the central nucleotides of their anticodons was demonstrated. These findings confirm the idea that the genetic code is determined by chemical properties of amino acids and nucleotides and that this determination is the result of direct specific interactions between amino acids and nucleotide triplets at the stage of the origin of the code. The data obtained reveal the primary role of anticodon triplets in the origin of the code. The key role of the central nucleotide in triplets for amino acid coding is confirmed.

16.
The updated structural and phylogenetic analyses of tRNA pairs with complementary anticodons provide independent support for our earlier finding, namely that these tRNA pairs concertedly show complementary second bases in the acceptor stem. Two implications immediately follow: first, that a tRNA molecule gained its present, complete, cloverleaf shape via duplication(s) of a shorter precursor. Second, that common ancestry is shared by two major components of the genetic code within the tRNA molecule: the classic code per se embodied in anticodon triplets, and the operational code of aminoacylation embodied primarily in the first three base pairs of the acceptor stems. In this communication we show that it might have been a double, sense-antisense, in-frame translation of the very first protein-encoding genes that directed the code's earliest expansion, thus preserving this fundamental dual-complementary link between acceptors and anticodons. Furthermore, the dual complementarity appears to be consistent with two mirror-symmetrical modes by which class I and II aminoacyl-tRNA synthetases recognize the cognate tRNAs, from the minor and major groove side of the acceptor stem, respectively.

17.
Early fixation of an optimal genetic code   Total citations: 19, self-citations: 0, citations by others: 19
The evolutionary forces that produced the canonical genetic code before the last universal ancestor remain obscure. One hypothesis is that the arrangement of amino acid/codon assignments results from selection to minimize the effects of errors (e.g., mistranslation and mutation) on resulting proteins. If amino acid similarity is measured as polarity, the canonical code does indeed outperform most theoretical alternatives. However, this finding does not hold for other amino acid properties, ignores plausible restrictions on possible code structure, and does not address the naturally occurring nonstandard genetic codes. Finally, other analyses have shown that significantly better code structures are possible. Here, we show that if theoretically possible code structures are limited to reflect plausible biological constraints, and amino acid similarity is quantified using empirical data of substitution frequencies, the canonical code is at or very close to a global optimum for error minimization across plausible parameter space. This result is robust to variation in the methods and assumptions of the analysis. Although significantly better codes do exist under some assumptions, they are extremely rare and thus consistent with reports of an adaptive code: previous analyses which suggest otherwise derive from a misleading metric. However, all extant, naturally occurring, secondarily derived, nonstandard genetic codes do appear less adaptive. The arrangement of amino acid assignments to the codons of the standard genetic code appears to be a direct product of natural selection for a system that minimizes the phenotypic impact of genetic error. Potential criticisms of previous analyses appear to be without substance. That known variants of the standard genetic code appear less adaptive suggests that different evolutionary factors predominated before and after fixation of the canonical code. 
While the evidence for an adaptive code is clear, the process by which the code achieved this optimization requires further attention.

18.
A model for topological coding of proteins is proposed. The model is based on the capacity of hydrogen bonds (property of connectivity) to fix conformations of protein molecules. The protein chain is modeled by an n-arc graph with the following elements: vertices (α-carbon atoms), structural edges (peptide bonds) and connectivity edges (virtual edges connecting non-adjacent atoms). It was shown that 64 conformations of the 4-arc graph can be described in the binary system by matrices of six variables which form a supermatrix containing four blocks. On the basis of correspondences between the pairs of variables in the matrices and the four letters of the genetic code, the matrices and the supermatrix are converted, respectively, into the triplets and the table of the genetic code. An algorithm admitting computer programming is proposed for coding the n-arc graph and protein chain. Connectivity operators (polar amino acids) are assigned to blocks of triplets coding for cyclic conformations (G or A in the second position), while anti-connectivity operators (non-polar amino acids) correspond to blocks of triplets coding for open conformations (C or U in the second position). Amino acids coded by triplets differing by the first base have different structures. The third base for C, U and G, A is degenerate. Properties of the real genetic code are in full agreement with the model. The model provides an insight into the topological nature of the genetic code and can be used for development of algorithms for the prediction of the protein structure.

19.
The genetic code appears to be optimized in its robustness to missense errors and frameshift errors. In addition, the genetic code is near-optimal in terms of its ability to carry information in addition to the sequences of encoded proteins. As evolution has no foresight, optimality of the modern genetic code suggests that it evolved from less optimal code variants. The length of codons in the genetic code is also optimal, as three is the minimal nucleotide combination that can encode the twenty standard amino acids. The apparent impossibility of transitions between codon sizes in a discontinuous manner during evolution has resulted in an unbending view that the genetic code was always triplet. Yet, recent experimental evidence on quadruplet decoding, as well as the discovery of organisms with ambiguous and dual decoding, suggest that the possibility of the evolution of triplet decoding from living systems with non-triplet decoding merits reconsideration and further exploration. To explore this possibility we designed a mathematical model of the evolution of primitive digital coding systems which can decode nucleotide sequences into protein sequences. These coding systems can evolve their nucleotide sequences via genetic events of Darwinian evolution, such as point-mutations. The replication rates of such coding systems depend on the accuracy of the generated protein sequences. Computer simulations based on our model show that decoding systems with codons of length greater than three spontaneously evolve into predominantly triplet decoding systems. Our findings suggest a plausible scenario for the evolution of the triplet genetic code in a continuous manner. This scenario suggests an explanation of how protein synthesis could be accomplished by means of long RNA-RNA interactions prior to the emergence of the complex decoding machinery, such as the ribosome, that is required for stabilization and discrimination of otherwise weak triplet codon-anticodon interactions. 
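The statement that three is the minimal codon length for twenty amino acids follows from a short count. As a worked check, assuming four bases and one extra meaning reserved for a stop signal (the symbol $L$ for codon length is introduced here only for illustration):

```latex
\[
4^{1} = 4 < 21, \qquad 4^{2} = 16 < 21, \qquad 4^{3} = 64 \ge 21
\quad\Longrightarrow\quad
L_{\min} = \lceil \log_{4} 21 \rceil = 3 .
\]
```

Doublets fall short even without a stop signal, since $4^{2} = 16 < 20$, while triplets leave 43 codons spare, which is the redundancy that the error-robustness arguments above rely on.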

20.
S. Nogami, Y. Satow, Y. Ohya, Y. Anraku. Genetics 1997, 147(1):73-85
Protein splicing is a compelling chemical reaction in which two proteins are produced posttranslationally from a single precursor polypeptide by excision of the internal protein segment and ligation of the flanking regions. This unique autocatalytic reaction was first discovered in the yeast Vma1p protozyme where the 50-kD site-specific endonuclease (VDE) is excised from the 120-kD precursor containing the N- and C-terminal regions of the catalytic subunit of the vacuolar H+-ATPase. In this work, we randomized the conserved valine triplet residues three amino acids upstream of the C-terminal splicing junction in the Vma1 protozyme and found that these site-specific random mutations interfere with normal protein splicing to different extents. Intragenic suppressor analysis has revealed that this particular hydrophobic triplet preceding the C-terminal splicing junction genetically interacts with three hydrophobic residues preceding the N-terminal splicing junction. This is the first evidence showing that the N-terminal portion of the V-ATPase subunit is involved in protein splicing. Our genetic evidence is consistent with a structural model that correctly aligns two parallel β-strands ascribed to the triplets. This model delineates spatial interactions between the two conserved regions both residing upstream of the splicing junctions.
