Similar literature
 20 similar articles found.
1.
By combining crystallographic and NMR structural data for RNA-bound amino acids within riboswitches, aptamers, and RNPs, chemical principles governing specific RNA interaction with amino acids can be deduced. Such principles, which we summarize in a “polar profile”, are useful in explaining newly selected specific RNA binding sites for free amino acids bearing varied side chains: charged, neutral polar, aliphatic, and aromatic. Such amino acid sites can be queried for parallels to the genetic code. Using recent sequences for 337 independent binding sites directed to 8 amino acids and containing 18,551 nucleotides in all, we show a highly robust connection between amino acids and cognate coding triplets within their RNA binding sites. The apparent probability (P) that cognate triplets around these sites are unrelated to binding sites is ≅ 5.3 × 10^−45 for codons overall, and P ≅ 2.1 × 10^−46 for cognate anticodons. Therefore, some triplets are unequivocally localized near their present amino acids. Accordingly, there was likely a stereochemical era during evolution of the genetic code, relying on chemical interactions between amino acids and the tertiary structures of RNA binding sites. Use of cognate coding triplets in RNA binding sites is nevertheless sparse, with only 21% of possible triplets appearing. Reasoning from such broad recurrent trends in our results, a majority (approximately 75%) of modern amino acids entered the code in this stereochemical era; nevertheless, a minority (approximately 21%) of modern codons and anticodons were assigned via RNA binding sites. A Direct RNA Template scheme embodying a credible early history for coded peptide synthesis is readily constructed based on these observations.

2.
A diminutive and specific RNA binding site for L-tryptophan   Total citations: 1 (self-citations: 1, citations by others: 0)
Selection for amino acid affinity by elution of RNAs from tryptophan–Sepharose using free L-tryptophan evokes one sequence predominantly (KD = 12 µM), a symmetrical internal loop of 3 nt per side. Though we have also isolated larger sequences with affinity for tryptophan, successively squeezed selections in randomized tracts of 70, 60, 40, 20 and 17 nt show that this internal loop is the simplest sequence that can meet the column affinity selection. From sequence variation in ~50 independent isolates, only 26 bits of information are required to describe this loop (equivalent to only 13 fully conserved nucleotides). Thus, it is among the simplest amino acid binding sites known, as well as selective among hydrophobic side chains. Among site sequences defined as essential to affinity by conservation, protection and modification-interference, there is a recurring CCA sequence (a tryptophan anticodon triplet) which apparently forms one side of the binding site. Such conserved juxtaposition of tryptophan with a cognate coding triplet supports a stereochemical origin for the genetic code.
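The equivalence between 26 bits and 13 fully conserved nucleotides follows from each invariant position in a four-letter alphabet carrying log2(4) = 2 bits. The sketch below is only an illustration of that arithmetic: it sums per-position information (2 bits minus the Shannon entropy of the column) over an alignment. The function names and the toy alignment are hypothetical, no small-sample correction is applied, and this is not claimed to be the procedure the authors used.

```python
import math
from collections import Counter


def position_information(column):
    """Information in bits at one aligned position: 2 minus the Shannon entropy."""
    counts = Counter(column)
    total = len(column)
    entropy = -sum((n / total) * math.log2(n / total) for n in counts.values())
    return 2.0 - entropy


def site_information(aligned_sequences):
    """Sum the per-position information over all columns of an alignment."""
    return sum(position_information(col) for col in zip(*aligned_sequences))


# Hypothetical toy alignment (not data from the abstract): five identical
# isolates of a 13-nt site, so every position is fully conserved.
toy_alignment = ["GCCAUGGACCAUG"] * 5
print(site_information(toy_alignment))  # 13 positions x 2 bits = 26.0
```

A column in which all four nucleotides are equally frequent contributes 0 bits, so partially variable positions add fractional amounts to the total.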

3.
Studies on the origin of the genetic code compare measures of the degree of error minimization of the standard code with measures produced by random variant codes but do not take into account codon usage, which was probably highly biased during the origin of the code. Codon usage bias could play an important role in the minimization of the chemical distances between amino acids because the importance of errors depends also on the frequency of the different codons. Here I show that when codon usage is taken into account, the degree of error minimization of the standard code may be dramatically reduced, and shifting to alternative codes often increases the degree of error minimization. This is especially true with a high CG content, which was probably the case during the origin of the code. I also show that the frequency of codes that perform better than the standard code, in terms of relative efficiency, is much higher in the neighborhood of the standard code itself, even when not considering codon usage bias; therefore alternative codes that differ only slightly from the standard code are more likely to evolve than some previous analyses suggested. My conclusions are that the standard genetic code is far from being an optimum with respect to error minimization and must have arisen for reasons other than error minimization. [Reviewing Editor: Martin Kreitman]

4.
5.
Selection on Codon Usage for Error Minimization at the Protein Level   Total citations: 1 (self-citations: 0, citations by others: 1)
Given the structure of the genetic code, synonymous codons differ in their capacity to minimize the effects of errors due to mutation or mistranslation. I suggest that this may lead, in protein-coding genes, to a preference for codons that minimize the impact of errors at the protein level. I develop a theoretical measure of error minimization for each codon, based on amino acid similarity. This measure is used to calculate the degree of error minimization for 82 genes of Drosophila melanogaster and 432 rodent genes and to study its relationship with CG content, the degree of codon usage bias, and the rate of nucleotide substitution. I show that (i) Drosophila and rodent genes tend to prefer codons that minimize errors; (ii) this cannot be merely the effect of mutation bias; (iii) the degree of error minimization is correlated with the degree of codon usage bias; (iv) the amino acids that contribute more to codon usage bias are the ones for which synonymous codons differ more in the capacity to minimize errors; and (v) the degree of error minimization is correlated with the rate of nonsynonymous substitution. These results suggest that natural selection for error minimization at the protein level plays a role in the evolution of coding sequences in Drosophila and rodents. Reviewing Editor: Dr. Massimo Di Giulio

6.
We isolated RNAs by selection–amplification, selecting for affinity to Phe–Sepharose and elution with free L-phenylalanine. Constant sequences did not contain Phe codons or anticodons, to avoid any possible confounding influence on initially randomized sequences. We examined the eight most frequent Phe-binding RNAs for inclusion of coding triplets. Binding sites were defined by nucleotide conservation, protection, and interference data. Together these RNAs comprise 70% of the 105 sequenced RNAs. The KD for the strongest sites is ≈50 μM free amino acid, with strong stereoselectivity. One site strongly distinguishes free Phe from Trp and Tyr, a specificity not observed previously. In these eight Phe-binding RNAs, Phe codons are not significantly associated with Phe binding sites. However, among 21 characterized RNAs binding Phe, Tyr, Arg, and Ile, containing 1342 total nucleotides, codons are 2.7-fold more frequent within binding sites than in surrounding sequences in the same molecules. If triplets were not specifically related to binding sites, the probability of this distribution would be 4.8 × 10^−11. Therefore, triplet concentration within amino acid binding sites taken together is highly likely. In binding sites for Arg, Tyr, and Ile cognate codons are overrepresented. Thus Arg, Tyr, and Ile may be amino acids whose codons were assigned during an era of direct RNA–amino acid affinity. In contrast, Phe codons arguably were assigned by another criterion, perhaps during later code evolution.
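The kind of reasoning behind a statement such as "2.7-fold more frequent, with probability 4.8 × 10^−11" can be illustrated with a one-sided binomial tail. This is a minimal sketch under stated assumptions: the abstract does not give the underlying counts or name the test used, so every count below is hypothetical and the binomial model is a stand-in, not the authors' statistic.

```python
from math import comb


def binomial_upper_tail(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p), summed exactly."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


# Hypothetical counts, NOT taken from the abstract.
site_positions = 200       # triplet positions inside binding sites
outside_positions = 1000   # triplet positions in surrounding sequence
site_hits = 54             # cognate triplets found inside binding sites
outside_hits = 100         # cognate triplets found outside binding sites

# Background frequency estimated from the surrounding sequence.
p_background = outside_hits / outside_positions

# Fold enrichment of cognate triplets inside the sites.
fold = (site_hits / site_positions) / p_background

# Chance of seeing at least this many hits if sites matched the background.
p_value = binomial_upper_tail(site_hits, site_positions, p_background)
print(f"fold enrichment = {fold:.2f}, one-sided P = {p_value:.2e}")
```

Treating overlapping triplet positions as independent trials is a simplification; a published analysis would need to account for that dependence.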

7.
We have assumed that the coevolution theory of genetic code origin (Wong JT, Proc Natl Acad Sci USA 72:1909–1912, 1975) is essentially correct. This theory makes it possible to identify at least 10 evolutionary stages through which genetic code organization might have passed prior to reaching its current form. The calculation of the minimization level of all these evolutionary stages leads to the following conclusions. (1) The minimization percentages increased linearly with the number of amino acids codified in the codes of the various evolutionary stages when only the sense changes are considered in the analysis. This seems to favor the physicochemical theory of genetic code origin even if, as discussed in the paper, this observation is also compatible with the coevolution theory. (2) For the first seven evolutionary stages of the genetic code, this trend is less clear and indeed is inverted when we consider the global optimisation of the codes due to both sense changes and synonymous changes. This inverse correlation between minimization percentages and the number of amino acids codified in the codes of the intermediate stages seems to favor neither the physicochemical nor the stereochemical theories of genetic code origin, as it is in the early and intermediate stages of code development that these theories would expect minimization to have played a crucial role, and this does not seem to be the case. However, these results are in agreement with the coevolution theory, which attributes a role to the physicochemical properties of amino acids that, while important, is nevertheless subordinate to the mechanism which concedes codons from the precursor amino acids to the product amino acids as the primary factor determining the evolutionary structuring of the genetic code. The results are therefore discussed in the context of the various theories proposed to explain genetic code origin. Received: 25 October 1998 / Accepted: 19 February 1999

8.
The aminoacyl-tRNA synthetases exist as two enzyme families which were apparently generated by divergent evolution from two primordial synthetases. The two classes of enzymes exhibit intriguing familial relationships, in that they are distributed nonrandomly within the codon-amino acid matrix of the genetic code. For example, all XCX codons code for amino acids handled by class II synthetases, and all but one of the XUX codons code for amino acids handled by class I synthetases. One interpretation of these patterns is that the synthetases coevolved with the genetic code. The more likely explanation, however, is that the synthetases evolved in the context of an already-established genetic code: a code which developed earlier in an RNA world. The rules which governed the development of the genetic code, and led to certain patterns in the coding catalog between codons and amino acids, would also have governed the subsequent evolution of the synthetases in the context of a fixed code, leading to patterns in synthetase distribution such as those observed. These rules are (1) conservative evolution of amino acid and adapter binding sites and (2) minimization of the disruptive effects on protein structure caused by codon meaning changes.
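The XCX and XUX pattern described above can be checked directly against the standard codon table. The sketch below assumes the conventional class assignment of the twenty synthetases (lysine is counted as class II, its usual assignment, although a class I lysyl-tRNA synthetase exists in some organisms). The class sets and helper name are supplied for illustration and do not come from the abstract.

```python
from itertools import product

# Standard genetic code in UCAG order (first base varies slowest).
BASES = "UCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

# Conventional synthetase classes, one-letter amino acid codes (assumption).
CLASS_I = set("RCQEILMWYV")
CLASS_II = set("ANDGHKFPST")


def classes_for_middle_base(base):
    """Map each amino acid encoded by codons with `base` in position 2 to its class."""
    result = {}
    for codon, aa in CODE.items():
        if codon[1] == base and aa != "*":
            result[aa] = "I" if aa in CLASS_I else "II"
    return result


xcx = classes_for_middle_base("C")
xux = classes_for_middle_base("U")
print("XCX:", xcx)
print("XUX:", xux)
print("XUX exceptions to class I:", [aa for aa, cls in xux.items() if cls != "I"])
```

Under these assumptions the XCX amino acids (Ser, Pro, Thr, Ala) are all class II, and phenylalanine is the single XUX amino acid outside class I.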

9.
The genetic code is not random but instead is organized in such a way that single nucleotide substitutions are more likely to result in changes between similar amino acids. This fidelity, or error minimization, has been proposed to be an adaptation within the genetic code. Many models have been proposed to measure this adaptation within the genetic code. However, we find that none of these consider codon usage differences between species. Furthermore, use of different indices of amino acid physicochemical characteristics leads to different estimations of this adaptation within the code. In this study, we try to establish a more accurate model to address this problem. In our model, a weighting scheme is established for mistranslation biases of the three different codon positions, transition/transversion biases, and codon usage. Different indices of amino acid physicochemical characteristics are also considered. In contrast to previous work, our results show that the natural genetic code is not fully optimized for error minimization. The genetic code, therefore, is not the most optimized one for error minimization, but one that balances between flexibility and fidelity for different species.
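A weighted error-minimization score of the general kind described above can be sketched as follows. This is an illustration under assumptions, not the model from the abstract: the Kyte-Doolittle hydropathy scale stands in for whichever amino acid indices the authors used, the weights default to 1, substitutions to or from stop codons are ignored, and the function and parameter names are hypothetical.

```python
import random
from itertools import product

BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

# Kyte-Doolittle hydropathy values (a swapped-in index, chosen for illustration).
HYDROPATHY = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5, "Q": -3.5, "E": -3.5,
    "G": -0.4, "H": -3.2, "I": 4.5, "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8,
    "P": -1.6, "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}
TRANSITIONS = {("A", "G"), ("G", "A"), ("C", "T"), ("T", "C")}


def code_cost(code, codon_usage=None, position_weights=(1.0, 1.0, 1.0),
              transition_weight=1.0):
    """Weighted mean squared hydropathy change over single-base substitutions."""
    total = 0.0
    weight_sum = 0.0
    for codon, aa in code.items():
        if aa == "*":
            continue
        usage = 1.0 if codon_usage is None else codon_usage.get(codon, 0.0)
        for pos, new_base in product(range(3), BASES):
            old_base = codon[pos]
            if new_base == old_base:
                continue
            new_aa = code[codon[:pos] + new_base + codon[pos + 1:]]
            if new_aa == "*":
                continue  # nonsense changes are ignored in this sketch
            weight = usage * position_weights[pos]
            if (old_base, new_base) in TRANSITIONS:
                weight *= transition_weight
            total += weight * (HYDROPATHY[aa] - HYDROPATHY[new_aa]) ** 2
            weight_sum += weight
    return total / weight_sum


def shuffled_code(code, rng):
    """Permute amino acids among synonymous codon blocks, keeping stop codons fixed."""
    amino_acids = sorted(set(code.values()) - {"*"})
    permuted = amino_acids[:]
    rng.shuffle(permuted)
    mapping = dict(zip(amino_acids, permuted))
    mapping["*"] = "*"
    return {codon: mapping[aa] for codon, aa in code.items()}


rng = random.Random(0)
standard = code_cost(CODE)
better = sum(code_cost(shuffled_code(CODE, rng)) < standard for _ in range(200))
print(f"standard code cost: {standard:.3f}; random codes scoring lower: {better}/200")
```

Passing a species-specific `codon_usage` dictionary, or non-uniform `position_weights` and `transition_weight`, changes how often alternative codes score lower than the standard code, which is the sensitivity the abstract draws attention to.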

10.
Code domains in tandem repetitive DNA sequence structures   Total citations: 6 (self-citations: 0, citations by others: 6)
Peter Vogt. Chromosoma, 1992, 101(10): 585-589
Traditionally, many people doing research in molecular biology attribute coding properties to a given DNA sequence if this sequence contains an open reading frame for translation into a sequence of amino acids. This protein coding capability of DNA was detected about 30 years ago. The underlying genetic code is highly conserved and present in every biological species studied so far. Today, it is obvious that DNA has a much larger coding potential for other important tasks. Apart from coding for specific RNA molecules such as rRNA, snRNA and tRNA molecules, specific structural and sequence patterns of the DNA chain itself express distinct codes for the regulation and expression of its genetic activity. A chromatin code has been defined for phasing of the histone-octamer protein complex in the nucleosome. A translation frame code has been shown to exist that determines correct triplet counting at the ribosome during protein synthesis. A loop code seems to organize the single stranded interaction of the nascent RNA chain with proteins during the splicing process, and a splicing code phases successive 5' and 3' splicing sites. Most of these DNA codes are not exclusively based on the primary DNA sequence itself, but also seem to include specific features of the corresponding higher order structures. Based on the view that these various DNA codes are genetically instructive for specific molecular interactions or processes, important in the nucleus during interphase and during cell division, the coding capability of tandem repetitive DNA sequences has recently been reconsidered.

11.
RNA-ligand chemistry: a testable source for the genetic code   Total citations: 5 (self-citations: 3, citations by others: 2)
In the genetic code, triplet codons and amino acids can be shown to be related by chemical principles. Such chemical regularities could be created either during the code's origin or during later evolution. One such chemical principle can now be shown experimentally. Natural or particularly selected RNA binding sites for at least three disparate amino acids (arginine, isoleucine, and tyrosine) are enriched in codons for the cognate amino acid. Currently, in 517 total nucleotides, binding sites contain 2.4-fold more codon sequences than surrounding nucleotides. The aggregate probability of this enrichment is 10^−7 to 10^−8, had codons and binding site sequences been independent. Thus, at least some primordial coding assignments appear to have exploited triplets from amino acid binding sites as codons.

12.
We simulate a deterministic population genetic model for the coevolution of genetic codes and protein-coding genes. We use very simple assumptions about translation, mutation, and protein fitness to calculate mutation-selection equilibria of codon frequencies and fitness in a large asexual population with a given genetic code. We then compute the fitnesses of altered genetic codes that compete to invade the population by translating its genes with higher fitness. Codes and genes coevolve in a succession of stages, alternating between genetic equilibration and code invasion, from an initial wholly ambiguous coding state to a diversified frozen coding state. Our simulations almost always resulted in partially redundant frozen genetic codes. Also, the range of simulated physicochemical properties among encoded amino acids in frozen codes was always less than maximal. These results did not require the assumption of historical constraints on the number and type of amino acids available to codes nor on the complexity of proteins, stereochemical constraints on the translational apparatus, nor mechanistic constraints on genetic code change. Both the extent and timing of amino-acid diversification in genetic codes were strongly affected by the message mutation rate and strength of missense selection. Our results suggest that various omnipresent phenomena that distribute codons over sites with different selective requirements (such as the persistence of nonsynonymous mutations at equilibrium, the positive selection of the same codon in different types of sites, and translational ambiguity) predispose the evolution of redundancy and of reduced amino acid diversity in genetic codes. Received: 21 December 2000 / Accepted: 12 March 2001

13.
The genetic code appears to be optimized in its robustness to missense errors and frameshift errors. In addition, the genetic code is near-optimal in terms of its ability to carry information in addition to the sequences of encoded proteins. As evolution has no foresight, optimality of the modern genetic code suggests that it evolved from less optimal code variants. The length of codons in the genetic code is also optimal, as three is the minimal nucleotide combination that can encode the twenty standard amino acids. The apparent impossibility of transitions between codon sizes in a discontinuous manner during evolution has resulted in an unbending view that the genetic code was always triplet. Yet, recent experimental evidence on quadruplet decoding, as well as the discovery of organisms with ambiguous and dual decoding, suggest that the possibility of the evolution of triplet decoding from living systems with non-triplet decoding merits reconsideration and further exploration. To explore this possibility we designed a mathematical model of the evolution of primitive digital coding systems which can decode nucleotide sequences into protein sequences. These coding systems can evolve their nucleotide sequences via genetic events of Darwinian evolution, such as point-mutations. The replication rates of such coding systems depend on the accuracy of the generated protein sequences. Computer simulations based on our model show that decoding systems with codons of length greater than three spontaneously evolve into predominantly triplet decoding systems. Our findings suggest a plausible scenario for the evolution of the triplet genetic code in a continuous manner. This scenario suggests an explanation of how protein synthesis could be accomplished by means of long RNA-RNA interactions prior to the emergence of the complex decoding machinery, such as the ribosome, that is required for stabilization and discrimination of otherwise weak triplet codon-anticodon interactions. 
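The statement that three is the minimal codon length for twenty amino acids is a counting argument: with four nucleotides, doublets give only 4^2 = 16 combinations, while triplets give 4^3 = 64. A minimal sketch of that arithmetic follows; the function name and default arguments are illustrative.

```python
def minimal_codon_length(alphabet_size=4, meanings=20):
    """Smallest codon length whose number of combinations covers `meanings`."""
    length = 1
    while alphabet_size ** length < meanings:
        length += 1
    return length


for length in (1, 2, 3):
    print(length, 4 ** length)  # combinations available at each codon length

print(minimal_codon_length())  # 3
```

Counting a stop signal as a twenty-first meaning does not change the result, since 21 still exceeds 16 and fits within 64.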

14.
15.
The simplest RNA that can meet a column affinity selection for isoleucine was previously defined using selection amplification with decreasing numbers of randomized nucleotides. This simplest UAUU motif was a small asymmetric internal loop. Conserved positions of the loop include isoleucine codon and anticodon triplets (Lozupone C., Changayil, S., Majerfeld, I., and Yarus, M. (2003) RNA (N. Y.) 9, 1315-1322). Using new primer sequences, we now select a somewhat more complex isoleucine binding RNA, requiring 4.7 more bits of information to describe. The newly selected structure is a terminal or hairpin loop of 20 nucleotides, 15 being invariant. An information profile shows that the new binding site contains five short functional loop regions joined by less significant single nucleotide positions. Among the important nucleotides is a conserved isoleucine anticodon, supporting the escaped triplet theory, which posits a stereochemical genetic code originating in RNA amino acid binding sites.

16.
17.
A progene hypothesis has been proposed earlier to explain the mechanism of origin of the self-reproducing genetic system. Progenes (precursors of the genetic system) are mixed anhydrides of an amino acid and deoxyribotrinucleotide at the 3'-gamma-terminal phosphate (NpNpNppp-AA); they are produced from dinucleotides (NpNp) and 3'-gamma-aminoacylnucleotidylates (Nppp-AA) as a result of specific interaction between amino acid and dinucleotide. The postulated mechanism of progene formation accounts for the selection of substances, including chirality, the origin of the genetic code as well as for the mechanisms of formation, self-reproduction and evolution of the simplest genetic system ("gene-polypeptide"). A stereochemical analysis of the progene formation mechanism has allowed us to support the main statements of the hypothesis that relate to the origin of the genetic code and to selection of substances. Atomic groups that could be responsible for the specificity of interaction between dinucleotides and amino acids in progene formation have been revealed. Stereochemical evidence for the physicochemical basis of the origin of the existing genetic code has been produced: 1) a special role of the second nucleotide in the codon is demonstrated in amino acid coding by the progene hypothesis principle; 2) an advantage of T over U in such coding is demonstrated; 3) for 16 amino acids out of 20 an agreement has been obtained between the optimal dinucleotide as revealed by the stereochemical analysis and the codon dinucleotides; 4) an explanation for the third nucleotide selection mechanism is offered. A restoration of the prebiotic code, based on these results, has indicated that the code contains 32 codons, is statistical and group-wise. It encodes 7 groups of isofunctional amino acids: 3 overlapping groups of non-polar amino acids, namely 1) medium-size hydrophobic amino acids (chiefly Val, n-Val and a-But), 2) small and medium-size non-polar amino acids (chiefly Ala, Val, n-Val, a-But and Gly), 3) small non-polar amino acids (Gly, Ala, a-But); and 4 groups of polar amino acids, namely 1) hydroxy + dicarboxylic (Asp, Glu, Ser and Thr), 2) dicarboxylic (Asp and Glu), 3) hydroxy (Ser and Thr) and 4) basic (Arg and Lys). The code includes about 20 amino acids among which are 15-17 canonical and a few common non-canonical. The prebiotic code explains many properties of the existing genetic code and is capable of evolving into the latter by way of a gradual replacement of the physicochemical coding mechanism by the enzymatic coding mechanism.

18.
Explaining the apparent non-random codon distribution and the nature and number of amino acids in the ‘standard’ genetic code remains a challenge, despite the various hypotheses so far proposed. In this paper we propose a simple new hypothesis for code evolution involving a progression from singlet to doublet to triplet codons with a reading mechanism that moves three bases each step. We suggest that triplet codons gradually evolved from two types of ambiguous doublet codons, those in which the first two bases of each three-base window were read (‘prefix’ codons) and those in which the last two bases of each window were read (‘suffix’ codons). This hypothesis explains multiple features of the genetic code such as the origin of the pattern of four-fold degenerate and two-fold degenerate triplet codons, the origin of its error minimising properties, and why there are only 20 amino acids. Reviewing Editor: Dr. Laura Landweber. An erratum to this article can be found at .

19.
A plausible architecture of an ancient genetic code is derived from an extended base triplet vector space over the Galois field of the extended base alphabet {D, A, C, G, U}, where symbol D represents one or more hypothetical bases with unspecific pairings. We hypothesized that the high degeneration of a primeval genetic code with five bases and the gradual origin and improvement of a primeval DNA repair system could make possible the transition from ancient to modern genetic codes. Our results suggest that the Watson-Crick base pairing G ≡ C and A = U and the non-specific base pairing of the hypothetical ancestral base D used to define the sum and product operations are enough features to determine the coding constraints of the primeval and the modern genetic code, as well as the transition from the former to the latter. Geometrical and algebraic properties of this vector space reveal that the present codon assignment of the standard genetic code could be induced from a primeval codon assignment. Besides, the Fourier spectrum of the extended DNA genome sequences derived from the multiple sequence alignment suggests that the so-called period-3 property of the present coding DNA sequences could also exist in the ancient coding DNA sequences. The phylogenetic analyses achieved with metrics defined in the N-dimensional vector space (B^3)^N of DNA sequences and with the new evolutionary model presented here also suggest that an ancient DNA coding sequence with five or more bases does not contradict the expected evolutionary history.

20.
The laws governing degeneration of the genetic code are discussed below. Of fundamental importance in this context is the classification of the amino acids into groups on the basis of the physicochemical behaviour of their residues. From this, it is possible to formulate arithmetic relationships between the number of amino acids in the same group and the number of coding triplets. It is found that the degeneration of the genetic code obeys certain laws, the reasons for this being related to the number and the qualitative properties of the amino acids and triplets. The fact that the three bases of a coding triplet have different priorities must also be a critical factor.


Copyright©北京勤云科技发展有限公司  京ICP备09084417号