Similar articles
 Found 20 similar articles (search time: 203 ms)
1.
Directed protein evolution is the most versatile method for studying protein structure-function relationships, and for tailoring a protein's properties to the needs of industrial applications. In this review, we performed a statistical analysis on the genetic code to study the extent and consequence of the organization of the genetic code on amino acid substitution patterns generated in directed evolution experiments. In detail, we analyzed amino acid substitution patterns caused by (a) a single nucleotide (nt) exchange at each position of all 64 codons, and (b) two subsequent nt exchanges (first and second nt, first and third nt, second and third nt). Additionally, transition and transversion mutations were compared at the level of amino acid substitution patterns. The latter analysis showed that a single nucleotide substitution in a codon generates only 39.5% of the natural diversity on the protein level, with 5.2-7 amino acid substitutions per codon. Transversions generate more complex amino acid substitution patterns (a larger number of, and chemically more diverse, amino acid substitutions) than transitions. Simultaneous nt exchanges at both the first and second nt of a codon generate very diverse amino acid substitution patterns, achieving 83.2% of the natural diversity. The statistical analysis described in this review sets the objectives for novel random mutagenesis methods that address the consequences of the organization of the genetic code. Random mutagenesis methods that favor transversions or introduce consecutive nt exchanges can contribute in this regard.
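
The single-nucleotide analysis described in this abstract can be illustrated with a minimal Python sketch. It assumes the standard genetic code written in the DNA alphabet, and it counts, for one codon, the distinct amino acids reachable by a single nt exchange, optionally restricted to transitions or transversions. The function name `single_nt_substitutions` and the counting conventions (synonymous changes and stop codons are excluded) are illustrative assumptions, not the review's own procedure, so the averages it prints need not match the figures quoted above.

```python
from itertools import product

# Standard genetic code (DNA alphabet), codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

# Transitions exchange purine for purine (A<->G) or pyrimidine for pyrimidine (C<->T).
TRANSITIONS = {("A", "G"), ("G", "A"), ("C", "T"), ("T", "C")}


def single_nt_substitutions(codon, kind="all"):
    """Return the set of amino acids reachable from `codon` by one nt exchange.

    kind is "all", "transition" or "transversion". Synonymous changes and
    stop codons are left out (an assumption made for this sketch).
    """
    original = CODON_TABLE[codon]
    reached = set()
    for pos in range(3):
        for base in BASES:
            if base == codon[pos]:
                continue
            is_transition = (codon[pos], base) in TRANSITIONS
            if kind == "transition" and not is_transition:
                continue
            if kind == "transversion" and is_transition:
                continue
            mutant = codon[:pos] + base + codon[pos + 1:]
            aa = CODON_TABLE[mutant]
            if aa != original and aa != "*":
                reached.add(aa)
    return reached


if __name__ == "__main__":
    sense = [c for c, aa in CODON_TABLE.items() if aa != "*"]
    for kind in ("all", "transition", "transversion"):
        mean = sum(len(single_nt_substitutions(c, kind)) for c in sense) / len(sense)
        print(f"{kind}: {mean:.2f} distinct substitutions per sense codon")
```

For example, the methionine codon ATG reaches Leu, Val, Thr, Lys, Arg and Ile by single exchanges; only Val, Thr and Ile are reached by transitions, which is consistent with the abstract's point that transversions give access to a broader set of substitutions.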

3.
The standard genetic code is known to be much more efficient in minimizing the adverse effects of misreading errors and one-point mutations than a random code having the same structure, i.e. the same number of codons coding for each particular amino acid. We study the inverse problem: how the code structure affects the optimal physico-chemical parameters of amino acids that ensure the highest stability of the genetic code. It is shown that the choice of two or more amino acids with given properties unambiguously determines all the others. In this sense the code structure strictly determines the optimal parameters of amino acids, and the corresponding scales may be derived directly from the genetic code. In a code with the structure of the standard genetic code, the resulting values for hydrophobicity obtained in the “leave one out” scheme and in the scheme with fixed maximum and minimum parameters correlate significantly with the natural scale. Comparing the optimal and natural parameters allows the relative impact of physico-chemical and error-minimization factors during the evolution of the genetic code to be assessed. As the resulting optimal scale depends on the choice of amino acids with given parameters, the technique can also be applied to testing various scenarios of code evolution with an increasing number of encoded amino acids. Our results indicate the co-evolution of the genetic code and the physico-chemical properties of recruited amino acids.

4.
Models of amino acid substitution were developed and compared using maximum likelihood. Two kinds of models are considered. "Empirical" models do not explicitly consider factors that shape protein evolution, but attempt to summarize the substitution pattern from large quantities of real data. "Mechanistic" models are formulated at the codon level and separate mutational biases at the nucleotide level from selective constraints at the amino acid level. They account for features of sequence evolution, such as transition-transversion bias and base or codon frequency biases, and make use of physicochemical distances between amino acids to specify nonsynonymous substitution rates. A general approach is presented that transforms a Markov model of codon substitution into a model of amino acid replacement. Protein sequences from the entire mitochondrial genomes of 20 mammalian species were analyzed using different models. The mechanistic models were found to fit the data better than empirical models derived from large databases. Both the mutational distance between amino acids (determined by the genetic code and mutational biases such as the transition-transversion bias) and the physicochemical distance are found to have strong effects on amino acid substitution rates. A significant proportion of amino acid substitutions appeared to have involved more than one codon position, indicating that nucleotide substitutions at neighboring sites may be correlated. Rates of amino acid substitution were found to be highly variable among sites.

5.
The 1000 Genomes Project data provide a natural background dataset for amino acid germline mutations in humans. Since the direction of mutation is known, the amino acid exchange matrix generated from the observed nucleotide variants is asymmetric, and the mutabilities of the different amino acids differ widely. These differences predominantly reflect preferences for nucleotide mutations in the DNA (especially the high mutation rate of the CpG dinucleotide, which makes the mutability of arginine much higher than that of other amino acids) rather than selection imposed by protein structure constraints, although there is evidence for the latter as well. The variants occur predominantly on the surface of proteins (82%), with a slight preference for sites which are more exposed and less well conserved than random. Mutations to functional residues occur about half as often as expected by chance. The disease-associated amino acid variant distributions in OMIM are radically different from those expected on the basis of the 1000 Genomes dataset. The disease-associated variants preferentially occur in more conserved sites, compared to 1000 Genomes mutations. Many of the amino acid exchange profiles appear to exhibit an anti-correlation, with common exchanges in one dataset being rare in the other. Disease-associated variants exhibit more extreme differences in amino acid size and hydrophobicity. More modelling of the mutational processes at the nucleotide level is needed, but these observations should contribute to an improved prediction of the effects of specific variants in humans.
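
The link between CpG mutability and arginine noted in this abstract can be made concrete with a small Python sketch. It assumes the standard genetic code in the DNA alphabet and simply reports, for each amino acid, the fraction of its codons that contain a CG dinucleotide. The function name `cpg_codon_fraction` is hypothetical, and CpG sites that span two adjacent codons are ignored, so this is only a rough illustration rather than the authors' analysis.

```python
from collections import defaultdict
from itertools import product

# Standard genetic code (DNA alphabet), codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def cpg_codon_fraction():
    """Fraction of each amino acid's codons that contain a CG dinucleotide.

    Only CG pairs inside a single codon are counted; CG pairs formed across
    a codon boundary depend on the neighbouring codon and are not modelled.
    """
    total = defaultdict(int)
    with_cpg = defaultdict(int)
    for codon, aa in CODON_TABLE.items():
        if aa == "*":
            continue
        total[aa] += 1
        if "CG" in codon:
            with_cpg[aa] += 1
    return {aa: with_cpg[aa] / total[aa] for aa in total}


if __name__ == "__main__":
    fractions = cpg_codon_fraction()
    for aa, frac in sorted(fractions.items(), key=lambda kv: -kv[1]):
        if frac > 0:
            print(f"{aa}: {frac:.2f}")
```

Under these assumptions arginine stands out: four of its six codons (CGT, CGC, CGA, CGG) contain CG, whereas no other amino acid has more than one such codon.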

6.
The nucleotide frequencies in the second codon positions of genes are remarkably different for the coding regions that correspond to different secondary structures in the encoded proteins, namely, helix, beta-strand and aperiodic structures. Indeed, hydrophobic and hydrophilic amino acids are encoded by codons having U or A, respectively, in their second position. Moreover, the beta-strand structure is strongly hydrophobic, while aperiodic structures contain more hydrophilic amino acids. The relationship between nucleotide frequencies and protein secondary structures is associated not only with the physico-chemical properties of these structures but also with the organisation of the genetic code. In fact, this organisation seems to have evolved so as to preserve the secondary structures of proteins by preventing deleterious amino acid substitutions that could modify the physico-chemical properties required for an optimal structure.
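
The statement about the second codon position can be checked directly against the code table. The Python sketch below assumes the standard genetic code written in the DNA alphabet (so U appears as T) and groups amino acids by the base at the second position of their codons; the function name `amino_acids_by_second_base` is an illustrative choice.

```python
from itertools import product

# Standard genetic code (DNA alphabet, T stands for U), codons in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def amino_acids_by_second_base():
    """Map each second-position base to the set of amino acids it encodes."""
    groups = {base: set() for base in BASES}
    for codon, aa in CODON_TABLE.items():
        if aa != "*":  # stop codons are skipped
            groups[codon[1]].add(aa)
    return groups


if __name__ == "__main__":
    for base, acids in amino_acids_by_second_base().items():
        print(base, "".join(sorted(acids)))
```

With T (U) in the second position the encoded set is Phe, Leu, Ile, Met and Val, all hydrophobic; with A it is Tyr, His, Gln, Asn, Lys, Asp and Glu, which are predominantly hydrophilic.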

7.
M Pieber, J Tohá. Origins of Life, 1983, 13(2): 139-146
The frequency of amino acid replacements in families of typical proteins has been elegantly analyzed by Argyle (1980) showing that the most frequent replacements involve a conservation of the amino acid chemical properties. The cyclic arrangement of the twenty amino acids resulting from the most frequent replacements has been described as an amino acid chemical ring. In this work, a novel amino acid replacement frequency ring is proposed, for which a conservation of over 90% of the most general physico-chemical properties can be deduced. The amino acid chemical similarity ring is also analyzed in terms of the genetic code base probability changes, showing that the discrepancy that exists between the standard deviation value of the amino acid replacement frequency matrix and its respective ideal value is almost equal to that deduced from the corresponding base codon replacement probability matrices. These differences are finally evaluated and discussed in terms of the restrictions imposed by the structure of the genetic code and the physico-chemical dissimilarities between some codons of amino acids which are chemically similar.

8.
The frequency of amino acid replacements in families of typical proteins has been elegantly analyzed by Argyle (1980) showing that the most frequent replacements involve a conservation of the amino acid chemical properties. The cyclic arrangement of the twenty amino acids resulting from the most frequent replacements has been described as an amino acid chemical ring. In this work, a novel amino acid replacement frequency ring is proposed, for which a conservation of over 90% of the most general physico-chemical properties can be deduced. The amino acid chemical similarity ring is also analyzed in terms of the genetic code base probability changes, showing that the discrepancy that exists between the standard deviation value of the amino acid replacement frequency matrix and its respective ideal value is almost equal to that deduced from the corresponding base codon replacement probability matrices. These differences are finally evaluated and discussed in terms of the restrictions imposed by the structure of the genetic code and the physico-chemical dissimilarities between some codons of amino acids which are chemically similar. This work was partially supported by OEA and Departamento de Desarrollo de la Investigación.

9.
In the past, two kinds of Markov models have been considered to describe protein sequence evolution. Codon-level models have been mechanistic, with a small number of parameters designed to take into account features such as transition-transversion bias, codon frequency bias, and synonymous-nonsynonymous amino acid substitution bias. Amino acid models have been empirical, attempting to summarize the replacement patterns observed in large quantities of data and not explicitly considering the distinct factors that shape protein evolution. We have estimated the first empirical codon model (ECM). Previous codon models assume that protein evolution proceeds only by successive single nucleotide substitutions, but our results indicate that model accuracy is significantly improved by incorporating instantaneous doublet and triplet changes. We also find that the affiliations between codons, the amino acid each encodes and the physicochemical properties of the amino acids are the main factors driving the process of codon evolution. Neither multiple nucleotide changes nor the strong influence of the genetic code nor amino acids' physicochemical properties form a part of standard mechanistic models and their views of how codon evolution proceeds. We have implemented the ECM for likelihood-based phylogenetic analysis, and an assessment of its ability to describe protein evolution shows that it consistently outperforms comparable mechanistic codon models. We point out the biological interpretation of our ECM and possible consequences for studies of selection.

10.
Chou-Fasman parameters, measuring preferences of each amino acid for different conformational regions in proteins, were used to obtain an amino acid difference index of conformational parameter distance (CPD) values. CPD values were found to be significantly lower for amino acid exchanges representing, in the genetic code, transitions of purines (G↔A) than for exchanges representing either transitions of pyrimidines (C↔U) or transversions of purines and pyrimidines. Inasmuch as the distribution of CPD values in these non-G↔A exchanges resembles that obtained for amino acid pairs with double or triple base differences in their underlying codons, we conclude that the genetic code was not particularly designed to minimize effects of mutation on protein conformation. That natural selection minimizes these changes, however, was shown by tabulating results obtained by the maximum parsimony method for eight protein genealogies with a total occurrence of 4574 base substitutions. At the beginning position of the codons G↔A transitions were in very great excess over other base substitutions, and, conversely, C↔U transitions were deficient. At the middle position of the codons only fast evolving proteins showed an excess of G↔A transitions, as though selection mainly preserved conformation in these proteins while weeding out mutations affecting chemical properties of functional sites in slow evolving proteins. In both fast and slow evolving proteins the net direction of transitions and transversions was found to be from G beginning codons to non-G beginning codons, resulting in more commonly occurring amino acids, especially alanine with its generalized conformational properties, being replaced at suitable sites by amino acids with more specialized conformational and chemical properties.
Historical circumstances pertaining to the origin of the genetic code and the nature of primordial proteins could account for such directional changes leading to increases in the functional density of proteins. In order to further explore the course of protein evolution, a modified parsimony algorithm was developed for constructing protein genealogies on the basis of minimum CPD length. The algorithm's ability to judge with finer discrimination that in protein evolution certain pathways of amino acid substitution should occur more readily than others was considered a potential advantage over strict maximum parsimony. In developing this CPD algorithm, the path of minimum CPD length through intermediate amino acids allowed by the genetic code for each pair of amino acids was determined. It was found that amino acid exchanges representing two base changes have a considerably lower average CPD value per base substitution than the amino acid exchanges representing single base changes. Amino acid exchanges representing three base changes have yet a further marked reduction in CPD per base change. This shows how extreme constraining effects of stabilizing selection can be circumvented, for by way of intermediate amino acids almost any amino acid can ultimately be substituted for another without damage to an evolving protein's conformation during the process.

11.
The organization of the canonical genetic code needs to be thoroughly illuminated. Here we reorder the four nucleotides (adenine, thymine, guanine and cytosine) according to their emergence in evolution, and apply the organizational rules to devising an algebraic representation for the canonical genetic code. Under a framework of the devised code, we quantify codon and amino acid usages from a large collection of 917 prokaryotic genome sequences, and associate the usages with its intrinsic structure and classification schemes as well as amino acid physicochemical properties. Our results show that the algebraic representation of the code is structurally equivalent to a content-centric organization of the code and that codon and amino acid usages under different classification schemes were correlated closely with GC content, implying a set of rules governing composition dynamics across a wide variety of prokaryotic genome sequences. These results also indicate that codons and amino acids are not randomly allocated in the code, where the six-fold degenerate codons and their amino acids have important balancing roles for error minimization. Therefore, the content-centric code is highly useful in deciphering its hitherto unknown regularities as well as the dynamics of nucleotide, codon, and amino acid compositions.

12.
13.
Fifty years have passed since the genetic code was deciphered, but how the genetic code came into being has not been satisfactorily addressed. It is now widely accepted that the earliest genetic code did not encode all 20 amino acids found in the universal genetic code, as some amino acids have complex biosynthetic pathways and likely were not available from the environment. Therefore, the genetic code evolved as pathways for synthesis of new amino acids became available. One hypothesis proposes that early in the evolution of the genetic code four amino acids (valine, alanine, aspartic acid, and glycine) were coded by GNC codons (N = any base), with the remaining codons being nonsense codons. The other sixteen amino acids were subsequently added to the genetic code by changing nonsense codons into sense codons for these amino acids. Improvement in protein function is presumed to be the driving force behind the evolution of the code, but how improved function was achieved by adding amino acids has not been examined. Based on an analysis of amino acid function in proteins, an evolutionary mechanism for expansion of the genetic code is described in which individual coded amino acids were replaced by new amino acids that used nonsense codons differing by one base change from the sense codons previously used. The improved or altered protein function afforded by the changes in amino acid function provided the selective advantage underlying the expansion of the genetic code. Analysis of amino acid properties and functions explains why amino acids are found in their respective positions in the genetic code.

14.
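To explore the molecular mechanism of cell adaptation of the live attenuated hepatitis A vaccine (H2 strain), the vaccine strain (HAV H2K7) was adapted by rapid, continuous serial passage in the human embryonic lung diploid cell line KMB17. The replication cycle shortened from the original 28 d to 14 d, and the antigen titer and infectivity titer rose steadily with continued passage; by passage 22 the antigen titer reached 1:1024 and the infectivity titer was 7.83 lg CCID50 per ml. Virus from passages 6 and 22 was amplified by AC-PCR and PCR, respectively. The amplified fragments were ligated into the pGEM-T vector to obtain recombinant plasmids, and the cDNA inserts were sequenced. Comparison of the complete genome sequences and amino acid sequences of the two passages showed that, by passage 6, the whole genome carried 6 nucleotide changes, all located in the coding region (a variation rate of 0.07%), which caused 3 amino acid changes, located in VP2 (A→S), 2C (N→H) and 3A (R→C). By passage 22, the whole genome carried 18 nucleotide mutations (a variation rate of 0.24%), 13 of which arose at this passage; the most variable region was the 5′ untranslated region (5′UTR), with 5 nucleotide changes. Mutations in the coding region caused 7 amino acid changes, 4 of which were specific to this passage in addition to those present at passage 6 (2C, Q→P; 3A, A→S; 3C, T→A; 3D, V→G). The 2C region was the most variable part of the coding region, with 4 nucleotide mutations in total: 3 new mutations arose in addition to the passage-6 change, leading to 1 amino acid change. These results indicate that variation in the 5′UTR and the 2C region plays an important role in raising the translation efficiency and infectivity titer of the virus.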

15.
Natural amino acids having common antiamino acids are divided into families and groups according to the algorithm of the genetic code (a-n-n-a, amino acid-codon-anticodon-antiamino acid). Members of these groups are placed symmetrically in the structure of the genetic code. In the course of evolution, the point mutations predominantly accepted (retained) in homologous proteins of phylogenetically related organisms are those which lead to amino acids belonging to one family or group and having common antiamino acids. This assumption is in agreement with L. B. Mekler's theory (1969) of the amino acid interaction code a-a.

16.
A progene hypothesis has been proposed earlier to explain the mechanism of origin of the self-reproducing genetic system. Progenes (precursors of the genetic system) are mixed anhydrides of an amino acid and a deoxyribotrinucleotide at the 3'-gamma-terminal phosphate (NpNpNppp-AA); they are produced from dinucleotides (NpNp) and 3'-gamma-aminoacylnucleotidylates (Nppp-AA) as a result of specific interaction between amino acid and dinucleotide. The postulated mechanism of progene formation accounts for the selection of substances, including chirality, and the origin of the genetic code, as well as for the mechanisms of formation, self-reproduction and evolution of the simplest genetic system ("gene-polypeptide"). A stereochemical analysis of the progene formation mechanism has allowed us to support the main statements of the hypothesis that relate to the origin of the genetic code and to the selection of substances. Atomic groups that could be responsible for the specificity of interaction between dinucleotides and amino acids in progene formation have been revealed. Stereochemical evidence for the physicochemical basis of the origin of the existing genetic code has been produced: 1) a special role of the second nucleotide in the codon is demonstrated in amino acid coding by the progene hypothesis principle; 2) an advantage of T over U in such coding is demonstrated; 3) for 16 amino acids out of 20 an agreement has been obtained between the optimal dinucleotide as revealed by the stereochemical analysis and the codon dinucleotides; 4) an explanation for the third nucleotide selection mechanism is offered. A restoration of the prebiotic code, based on these results, has indicated that the code contains 32 codons, is statistical and group-wise. It encodes 7 groups of isofunctional amino acids: 3 overlapping groups of non-polar amino acids, namely 1) medium-size hydrophobic amino acids (chiefly Val, n-Val and a-But), 2) small and medium-size non-polar amino acids (chiefly Ala, Val, n-Val, a-But and Gly), 3) small non-polar amino acids (Gly, Ala, a-But); and 4 groups of polar amino acids, namely 1) hydroxy + dicarboxylic (Asp, Glu, Ser and Thr), 2) dicarboxylic (Asp and Glu), 3) hydroxy (Ser and Thr) and 4) basic (Arg and Lys). The code includes about 20 amino acids, among which are 15-17 canonical and a few common non-canonical ones. The prebiotic code explains many properties of the existing genetic code and is capable of evolving into the latter by way of a gradual replacement of the physicochemical coding mechanism by the enzymatic coding mechanism.

17.
M Hasegawa, T A Yano. Origins of Life, 1975, 6(1-2): 219-227
The entropy of the amino acid sequences coded by DNA is considered as a measure of the diversity of proteins, and is taken as a measure of evolution. The DNA or mRNA sequence is considered as a stationary second-order Markov chain composed of four kinds of bases. Because of the biased nature of the genetic code table, an increase in the entropy of amino acid sequences is possible with a biased nucleotide sequence. Thus the biased DNA base composition and the extreme rarity of the base doublet CpG in higher organisms are explained. It is expected that the amino acid composition was highly biased at the time of the origin of the genetic code table, and that the more frequent amino acids have tended to get rarer, and the rarer ones more frequent. This tendency is observed in the evolution of hemoglobin, cytochrome C, fibrinopeptide, immunoglobulin and lysozyme, and of protein as a whole.
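
The entropy measure in this abstract can be sketched numerically. The Python example below is a deliberate simplification: it assumes that the three bases of a codon are drawn independently from a given base composition, not from the second-order Markov chain used in the paper, and it computes the Shannon entropy (in bits) of the resulting amino acid distribution over sense codons under the standard genetic code. The function name `amino_acid_entropy` and the independence assumption are illustrative only.

```python
import math
from itertools import product

# Standard genetic code (DNA alphabet), codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def amino_acid_entropy(base_freq):
    """Shannon entropy (bits) of amino acid usage for a given base composition.

    base_freq maps each of "T", "C", "A", "G" to its frequency. Codon
    probabilities are products of independent base frequencies (a
    simplifying assumption), renormalized over sense codons only.
    """
    probs = {}
    for codon, aa in CODON_TABLE.items():
        if aa == "*":
            continue
        p = base_freq[codon[0]] * base_freq[codon[1]] * base_freq[codon[2]]
        probs[aa] = probs.get(aa, 0.0) + p
    total = sum(probs.values())
    return -sum((p / total) * math.log2(p / total) for p in probs.values() if p > 0)


if __name__ == "__main__":
    uniform = dict.fromkeys(BASES, 0.25)
    print(f"uniform bases: {amino_acid_entropy(uniform):.3f} bits")
    print(f"upper bound (20 equiprobable amino acids): {math.log2(20):.3f} bits")
```

Even with a uniform base composition the entropy stays below the 20-amino-acid maximum, because the code assigns between one and six codons to each amino acid; this gap is what leaves room for a biased base composition to change the entropy.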

18.
The Case for an Error Minimizing Standard Genetic Code   Cited by: 1 (self-citations: 1; citations by others: 0)
Since discovering the pattern by which amino acids are assigned to codons within the standard genetic code, investigators have explored the idea that natural selection placed biochemically similar amino acids near to one another in coding space so as to minimize the impact of mutations and/or mistranslations. The analytical evidence to support this theory has grown in sophistication and strength over the years, and counterclaims questioning its plausibility and quantitative support have yet to transcend some significant weaknesses in their approach. These weaknesses are illustrated here by means of a simple simulation model for adaptive genetic code evolution. There remain ill-explored facets of the 'error minimizing' code hypothesis, however, including the mechanism and pathway by which an adaptive pattern of codon assignments emerged, the extent to which natural selection created synonym redundancy, its role in shaping the amino acid and nucleotide languages, and even the correct interpretation of the adaptive codon assignment pattern: these represent fertile areas for future research.

19.
The genetic code provides the translation table necessary to transform the information contained in DNA into the language of proteins. In this table, a correspondence between each codon and each amino acid is established: tRNA is the main adaptor that links the two. Although the genetic code is nearly universal, several variants of this code have been described in a wide range of nuclear and organellar systems, especially in metazoan mitochondria. These variants are generally found by searching for conserved positions that consistently code for a specific alternative amino acid in a new species. We have devised an accurate computational method to automate these comparisons, and have tested it with 626 metazoan mitochondrial genomes. Our results indicate that several arthropods have a new genetic code and translate the codon AGG as lysine instead of serine (as in the invertebrate mitochondrial genetic code) or arginine (as in the standard genetic code). We have investigated the evolution of the genetic code in the arthropods and found several events of parallel evolution in which the AGG codon was reassigned between serine and lysine. Our analyses also revealed correlated evolution between the arthropod genetic codes and the tRNA-Lys/-Ser, which show specific point mutations at the anticodons. These rather simple mutations, together with a low usage of the AGG codon, might explain the recurrence of the AGG reassignments.

20.
Conflict between Amino Acid and Nucleotide Characters   Cited by: 5 (self-citations: 0; citations by others: 5)
Slowly evolving characters, such as amino acids and replacement substitutions, have generally been favored over faster evolving characters for inferring phylogenetic relationships. However, amino acids constitute composite characters and, because of the degenerate genetic code, are subject to convergence. Based on an analysis of atpB and rbcL in 567 seed plants, we show that silent substitutions may be more phylogenetically informative than replacement substitutions and that artifacts caused by composite characters and/or convergence cause clades on amino acid trees to conflict with nucleotide trees and independent evidence. These findings indicate that coding nucleotide sequences only as amino acid characters for phylogenetic analysis provides little benefit and may yield misleading results.
