Similar literature
 Found 20 similar articles (search time: 15 ms)
1.
The 655 bp cytochrome c oxidase subunit I barcode region of single specimens of 388 species of fishes (four Holocephali, 61 Elasmobranchii and 323 Actinopterygii) was examined. All but two (Urolophus cruciatus and Urolophus sufflavus) showed different cox1 nucleotide sequences (99.5% species discrimination); the two that could not be resolved are suspected to hybridize. Most of the power of cox1 nucleotide sequence analysis for species identification comes from the degenerate nature of the genetic code and the highly variable nature of the third codon position. Variation at the third codon position is bimodally distributed, and the more variable mode is dominated by amino acids with four or six codons, while the less variable mode is dominated by amino acids with two codons. The ratio of nonsynonymous to synonymous changes is much less than one, indicating that this gene is subject to strong purifying selection. Consequently, cox1 amino acid sequence diversity is much less than nucleotide sequence diversity and has very poor species resolution power. Fourteen of the 16 amino acid residues recognized as having important functions in the region of cox1 sequenced were completely conserved over all 388 species (and the bovine cox1 sequence), with one fish species varying at one of these sites, and three fish at another site. No significant differences in amino acid conservation were observed between residues in helices, strands and turns. Patterns of nucleotide and amino acid variability were very similar between elasmobranchs and actinopterygians.

2.
Models of amino acid substitution were developed and compared using maximum likelihood. Two kinds of models are considered. "Empirical" models do not explicitly consider factors that shape protein evolution, but attempt to summarize the substitution pattern from large quantities of real data. "Mechanistic" models are formulated at the codon level and separate mutational biases at the nucleotide level from selective constraints at the amino acid level. They account for features of sequence evolution, such as transition-transversion bias and base or codon frequency biases, and make use of physicochemical distances between amino acids to specify nonsynonymous substitution rates. A general approach is presented that transforms a Markov model of codon substitution into a model of amino acid replacement. Protein sequences from the entire mitochondrial genomes of 20 mammalian species were analyzed using different models. The mechanistic models were found to fit the data better than empirical models derived from large databases. Both the mutational distance between amino acids (determined by the genetic code and mutational biases such as the transition-transversion bias) and the physicochemical distance are found to have strong effects on amino acid substitution rates. A significant proportion of amino acid substitutions appeared to have involved more than one codon position, indicating that nucleotide substitutions at neighboring sites may be correlated. Rates of amino acid substitution were found to be highly variable among sites.

3.
In the past, two kinds of Markov models have been considered to describe protein sequence evolution. Codon-level models have been mechanistic, with a small number of parameters designed to take into account features such as transition-transversion bias, codon frequency bias, and synonymous-nonsynonymous amino acid substitution bias. Amino acid models have been empirical, attempting to summarize the replacement patterns observed in large quantities of data and not explicitly considering the distinct factors that shape protein evolution. We have estimated the first empirical codon model (ECM). Previous codon models assume that protein evolution proceeds only by successive single nucleotide substitutions, but our results indicate that model accuracy is significantly improved by incorporating instantaneous doublet and triplet changes. We also find that the affiliations between codons, the amino acid each encodes and the physicochemical properties of the amino acids are the main factors driving the process of codon evolution. Neither multiple nucleotide changes nor the strong influence of the genetic code nor amino acids' physicochemical properties form a part of standard mechanistic models and their views of how codon evolution proceeds. We have implemented the ECM for likelihood-based phylogenetic analysis, and an assessment of its ability to describe protein evolution shows that it consistently outperforms comparable mechanistic codon models. We point out the biological interpretation of our ECM and possible consequences for studies of selection.

4.
Abstract

The codon usage in the Vibrio cholerae genome is analyzed in this paper. Although there are many more genes on chromosome 1 than on chromosome 2, the codon usage patterns of genes on the two chromosomes are quite similar, indicating that the two chromosomes may have coexisted in the same cell for a very long time. Unlike the base frequency pattern observed in other genomes, the G+C content at the third codon position of the V. cholerae genome varies within a rather small interval. The most notable feature of codon usage in the V. cholerae genome is that a fraction of genes show significant bias in base choice at the second codon position. The 2006 known genes can be classified into two clusters according to the base frequencies at this position. The smaller cluster contains 227 genes, most of which code for proteins involved in transport and binding functions. The products encoded by these genes have a significant bias in amino acid composition compared with other genes. The codon usage patterns of the 1836 ORFs of unknown function are also analyzed, which is useful for studying their functions.

5.
《BBA》2022,1863(8):148597
The origin of the genetic code is an abiding mystery in biology. Hints of a ‘code within the codons’ suggest biophysical interactions, but these patterns have resisted interpretation. Here, we present a new framework, grounded in the autotrophic growth of protocells from CO2 and H2. Recent work suggests that the universal core of metabolism recapitulates a thermodynamically favoured protometabolism right up to nucleotide synthesis. Considering the genetic code in relation to an extended protometabolism allows us to predict most codon assignments. We show that the first letter of the codon corresponds to the distance from CO2 fixation, with amino acids encoded by the purines (G followed by A) being closest to CO2 fixation. These associations suggest a purine-rich early metabolism with a restricted pool of amino acids. The second position of the anticodon corresponds to the hydrophobicity of the amino acid encoded. We combine multiple measures of hydrophobicity to show that this correlation holds strongly for early amino acids but is weaker for later species. Finally, we demonstrate that redundancy at the third position is not randomly distributed around the code: non-redundant amino acids can be assigned based on size, specifically length. We attribute this to additional stereochemical interactions at the anticodon. These rules imply an iterative expansion of the genetic code over time with codon assignments depending on both distance from CO2 and biophysical interactions between nucleotide sequences and amino acids. In this way the earliest RNA polymers could produce non-random peptide sequences with selectable functions in autotrophic protocells.

6.
Directed protein evolution is the most versatile method for studying protein structure–function relationships, and for tailoring a protein's properties to the needs of industrial applications. In this review, we performed a statistical analysis on the genetic code to study the extent and consequence of the organization of the genetic code on amino acid substitution patterns generated in directed evolution experiments. In detail, we analyzed amino acid substitution patterns caused by (a) a single nucleotide (nt) exchange at each position of all 64 codons, and (b) two subsequent nt exchanges (first and second nt, first and third nt, second and third nt). Additionally, transition and transversion mutations were compared at the level of amino acid substitution patterns. The latter analysis showed that a single nucleotide substitution in a codon generates only 39.5% of the natural diversity on the protein level, with 5.2–7 amino acid substitutions per codon. Transversions generate more complex amino acid substitution patterns (increased number and chemically more diverse amino acid substitutions) than transitions. Simultaneous nt exchanges at both the first and second nt of a codon generate very diverse amino acid substitution patterns, achieving 83.2% of the natural diversity. The statistical analysis described in this review sets the objectives for novel random mutagenesis methods that address the consequences of the organization of the genetic code. Random mutagenesis methods that favor transversions or introduce consecutive nt exchanges can contribute in this regard.
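
The per-codon enumeration described in this abstract can be sketched in Python. This is only an illustration under stated assumptions, not the authors' code: the function name `reachable_amino_acids` is hypothetical, the standard genetic code (NCBI translation table 1) is assumed, and stop codons and synonymous changes are left out of the result. It is not meant to reproduce the 39.5% or 83.2% figures, whose exact definition is not given in the abstract.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code (NCBI translation table 1); codons are ordered
# TTT, TTC, TTA, TTG, TCT, ... to match the string below. "*" marks a stop.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}

# A transition swaps purine with purine (A, G) or pyrimidine with pyrimidine (C, T).
TRANSITION_PARTNER = {"A": "G", "G": "A", "C": "T", "T": "C"}


def reachable_amino_acids(codon, kind="any"):
    """Amino acids reachable from `codon` by a single nucleotide exchange.

    kind is "any", "transition" or "transversion".
    Assumption of this sketch: synonymous changes and stop codons are ignored.
    """
    original = CODON_TABLE[codon]
    found = set()
    for pos, old in enumerate(codon):
        for new in BASES:
            if new == old:
                continue
            is_transition = TRANSITION_PARTNER[old] == new
            if kind == "transition" and not is_transition:
                continue
            if kind == "transversion" and is_transition:
                continue
            aa = CODON_TABLE[codon[:pos] + new + codon[pos + 1:]]
            if aa != "*" and aa != original:
                found.add(aa)
    return found


print(sorted(reachable_amino_acids("ATG")))
print(sorted(reachable_amino_acids("ATG", kind="transition")))
print(sorted(reachable_amino_acids("ATG", kind="transversion")))
```

For the methionine codon ATG, transversions reach four distinct amino acids and transitions three, which is in line with the abstract's statement that transversions give more diverse substitution patterns.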

7.
Directed protein evolution is the most versatile method for studying protein structure-function relationships, and for tailoring a protein's properties to the needs of industrial applications. In this review, we performed a statistical analysis on the genetic code to study the extent and consequence of the organization of the genetic code on amino acid substitution patterns generated in directed evolution experiments. In detail, we analyzed amino acid substitution patterns caused by (a) a single nucleotide (nt) exchange at each position of all 64 codons, and (b) two subsequent nt exchanges (first and second nt, first and third nt, second and third nt). Additionally, transition and transversion mutations were compared at the level of amino acid substitution patterns. The latter analysis showed that a single nucleotide substitution in a codon generates only 39.5% of the natural diversity on the protein level, with 5.2-7 amino acid substitutions per codon. Transversions generate more complex amino acid substitution patterns (increased number and chemically more diverse amino acid substitutions) than transitions. Simultaneous nt exchanges at both the first and second nt of a codon generate very diverse amino acid substitution patterns, achieving 83.2% of the natural diversity. The statistical analysis described in this review sets the objectives for novel random mutagenesis methods that address the consequences of the organization of the genetic code. Random mutagenesis methods that favor transversions or introduce consecutive nt exchanges can contribute in this regard.

8.
Okayasu T  Sorimachi K 《Amino acids》2009,36(2):261-271
We recently classified 23 bacteria into two types based on their complete genomes; “S-type” as represented by Staphylococcus aureus and “E-type” as represented by Escherichia coli. Classification was characterized by concentrations of Arg, Ala or Lys in the amino acid composition calculated from the complete genome. Based on these previous classifications, not only prokaryotic but also eukaryotic genome structures were investigated by amino acid compositions and nucleotide contents. Organisms consisting of 112 bacteria, 15 archaea and 18 eukaryotes were classified into two major groups by cluster analysis using GC contents at the three codon positions calculated from complete genomes. The 145 organisms were classified into “AT-type” and “GC-type” represented by high A or T (low G or C) and high G or C (low A or T) contents, respectively, at every third codon position. Reciprocal changes between G or C and A or T contents at the third codon position occurred almost synchronously in every codon among the organisms. Correlations between amino acid concentrations (Ala, Ile and Lys) and the nucleotide contents at the codon position were obtained in both “AT-type” and “GC-type” organisms, but with different regression coefficients. In certain correlations of amino acid concentrations with GC contents, eukaryotes, archaea and bacteria showed different behaviors; thus these kingdoms evolved differently. All organisms are basically classifiable into two groups having characteristic codon patterns; organisms with low GC and high AT contents at the third codon position and their derivatives, and organisms with an inverse relationship.
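
The quantity this abstract clusters on, the GC content at each of the three codon positions, is simple to compute for a single coding sequence. The sketch below is a hedged illustration: the function name `gc_by_codon_position` is hypothetical, the input is assumed to be an in-frame coding sequence, and the cluster analysis itself is not reproduced.

```python
def gc_by_codon_position(cds):
    """Return (GC1, GC2, GC3): the G+C fraction at codon positions 1, 2 and 3.

    Assumptions of this sketch: `cds` is an in-frame coding sequence, and
    trailing bases that do not fill a complete codon are ignored.
    """
    cds = cds.upper().replace("U", "T")
    usable = len(cds) - len(cds) % 3
    if usable == 0:
        raise ValueError("sequence has no complete codon")
    n_codons = usable // 3
    fractions = []
    for pos in range(3):
        # Every third base starting at `pos` belongs to the same codon position.
        column = cds[pos:usable:3]
        fractions.append(sum(base in "GC" for base in column) / n_codons)
    return tuple(fractions)


gc1, gc2, gc3 = gc_by_codon_position("ATGGCCAAA")
print(gc1, gc2, gc3)
```

In the toy sequence ATG GCC AAA, one of three codons has G or C at the first position, one at the second and two at the third. Applied to all genes of a genome, the third value is the one the abstract uses to separate "AT-type" from "GC-type" organisms.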

9.
The amino acid sequences of the amidinotransferases and the nucleotide sequences of their genes or cDNA from four Streptomyces species (seven genes) and from the kidneys of rat, pig, human and human pancreas were compared. The overall amino acid and nucleotide sequences of the prokaryotes and eukaryotes were very similar; furthermore, three regions of high identity were identified. Evidence is presented that there is virtually zero chance that the overall and high identity regions of the amino acid sequence similarities and the overall nucleotide sequence similarities between Streptomyces and mammals represent a random match. Both rat and lamprey amidinotransferases were able to use inosamine phosphate, the amidine group acceptor of Streptomyces. We have concluded that the structure and function of the amidinotransferases and their genes have been highly conserved through evolution from prokaryotes to eukaryotes. The evolution has occurred with: (1) a high degree of retention of nucleotide and amino acid sequences; (2) a high degree of retention of the primitive Streptomyces guanine+cytosine (G+C) third codon position composition in certain high identity regions of the eukaryote cDNA; (3) a decrease in the specificities for the amidine group acceptors; and (4) most of the mutations being silent in the regions suggested to code for active sites in the enzymes.

10.
Phosphorylation must have been one of the key events in prebiotic evolution on earth. In this article, the emergence of phosphoryl amino acid 5′-nucleosides having a P–N bond is described as a model of the origin of amino acid homochirality and the genetic code. It is proposed that the intramolecular interaction between the nucleotide base and the amino acid side-chain influences the stability of particular amino acid 5′-nucleotides, and the interaction also selects for the chirality of amino acids. The differences between l- and d-conformation energies (ΔE_conf) are evaluated by DFT methods at the B3LYP/6-31G(d) level. Although, as expected, these ΔE_conf values are not large, they do give differences in energy that can distinguish the chirality of amino acids. Based on our calculations, the chiral selection of the earliest amino acids for l-enantiomers seems to be determined by a clear stereochemical/physicochemical relationship. As later amino acids developed from the earliest amino acids, we deduce that the chirality of these late amino acids was inherited from that of the early amino acids. This idea reaches far back into evolution, and we hope that it will guide further experiments in this area.

11.
Genetic investigation and in silico analysis of the plantaricin EFI (plnEFI) locus were performed in three indigenous isolates of Lactobacillus plantarum: EL3, L28 and BL1. Amplification with plnEFI-specific primers and production of a protein of ~10 kDa suggested the existence of class II bacteriocins. The analysis demonstrated that the studied fragment included structural bacteriocin, immunity, partial transporter and potential regulatory encoding regions. Based on the results, there was one DNA polymorphic site in plnE as well as in plnF of the studied sequences. One nucleotide substitution in plnE of the BL1 isolate led to the replacement of glycine with valine. Both residues are non-polar, and the substitution did not affect the instability index of the plnE protein. The only nucleotide variation in plnF of the EL3 isolate did not change the amino acid sequence, since the modified nucleotide produced an alternative codon for the original amino acid. The highest DNA polymorphism occurred in the region with immunity function, which in BL1 resulted in the conversion of the start codon to an amino acid codon. In the partial transporter sequence, one variable nucleotide site caused an amino acid replacement in all the isolates, which elevated the stability of the N-terminal domain in the transporter protein compared with the nominated reference isolate L. plantarum C11. The region with possible regulatory function was identical in all three isolates. © 2018 American Institute of Chemical Engineers Biotechnol Progress, 35: e2773, 2019.

12.
An algebraic and geometrical approach is used to describe the primaeval RNA code and a proposed Extended RNA code. The former consists of all codons of the type RNY, where R means purines, Y pyrimidines, and N any of them. The latter comprises the 16 codons of the type RNY plus codons obtained by considering the RNA code but in the second (NYR type) and third (YRN type) reading frames. In each of these reading frames, there are 16 triplets that altogether complete a set of 48 triplets, which specify 17 out of the 20 amino acids, including AUG, the start codon, and the three known stop codons. The other 16 codons do not pertain to the Extended RNA code and constitute the union of the triplets YYY and RRR, which we define as the RNA-less code. The codons in each of the three subsets of the Extended RNA code are represented by a four-dimensional hypercube and the set of codons of the RNA-less code is portrayed as a four-dimensional hyperprism. Remarkably, the union of these four symmetrical pairwise disjoint sets comprises precisely the already known six-dimensional hypercube of the Standard Genetic Code (SGC) of 64 triplets. These results suggest a plausible evolutionary path from which the primaeval RNA code could have originated the SGC, via the Extended RNA code plus the RNA-less code. We argue that the life forms that probably obeyed the Extended RNA code were intermediate between the ribo-organisms of the RNA World and the last common ancestor (LCA) of the Prokaryotes, Archaea, and Eucarya, that is, the cenancestor. A general encoding function, E, which maps each codon to its corresponding amino acid or the stop signal is also derived. In 45 out of the 64 cases, this function takes the form of a linear transformation F, which projects the whole six-dimensional hypercube onto a four-dimensional hyperface formed by all triplets that end in cytosine. In the remaining 19 cases the function E adopts the form of an affine transformation, i.e., the composition of F with a particular translation. Graphical representations of the four local encoding functions and E are illustrated and discussed. For every amino acid and for the stop signal, a single triplet, among those that specify it, is selected as a canonical representative. From this mapping a graphical representation of the 20 amino acids and the stop signal is also derived. We conclude that the general encoding function E represents the SGC itself.
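
The set-counting claims in this abstract (three disjoint 16-codon subsets, 48 triplets in the Extended RNA code, and a remaining 16 made up of YYY and RRR) can be checked by direct enumeration. The Python sketch below does only that; the helper names `matches` and `codons_of_type` are hypothetical, and the hypercube geometry and the encoding function E are not modelled.

```python
from itertools import product

PURINES = set("AG")      # R
PYRIMIDINES = set("CU")  # Y
ALL_CODONS = {"".join(c) for c in product("ACGU", repeat=3)}


def matches(codon, pattern):
    """True if `codon` fits a three-letter pattern over R, Y and N (any base)."""
    classes = {"R": PURINES, "Y": PYRIMIDINES, "N": PURINES | PYRIMIDINES}
    return all(base in classes[symbol] for base, symbol in zip(codon, pattern))


def codons_of_type(pattern):
    """All RNA codons that fit the given R/Y/N pattern."""
    return {codon for codon in ALL_CODONS if matches(codon, pattern)}


rny, nyr, yrn = (codons_of_type(p) for p in ("RNY", "NYR", "YRN"))
extended = rny | nyr | yrn        # Extended RNA code
rna_less = ALL_CODONS - extended  # what the abstract calls the RNA-less code

print(len(rny), len(nyr), len(yrn), len(extended), len(rna_less))
```

The three patterns are pairwise disjoint because each pair conflicts at one position (for example, RNY needs a pyrimidine at the third position while NYR needs a purine there), so the union has 48 members and the remainder has 16.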

13.
Herein, we rigorously develop novel 3-dimensional algebraic models called Genetic Hotels of the Standard Genetic Code (SGC). We start by considering the primeval RNA genetic code which consists of the 16 codons of type RNY (purine-any base-pyrimidine). Using simple algebraic operations, we show how the RNA code could have evolved toward the current SGC via two different intermediate evolutionary stages called Extended RNA code type I and II. By rotations or translations of the subset RNY, we arrive at the SGC via the former (type I) or via the latter (type II), respectively. Biologically, the Extended RNA code type I, consists of all codons of the type RNY plus codons obtained by considering the RNA code but in the second (NYR type) and third (YRN type) reading frames. The Extended RNA code type II, comprises all codons of the type RNY plus codons that arise from transversions of the RNA code in the first (YNY type) and third (RNR) nucleotide bases. Since the dimensions of remarkable subsets of the Genetic Hotels are not necessarily integer numbers, we also introduce the concept of algebraic fractal dimension. A general decoding function which maps each codon to its corresponding amino acid or the stop signals is also derived. The Phenotypic Hotel of amino acids is also illustrated. The proposed evolutionary paths are discussed in terms of the existing theories of the evolution of the SGC. The adoption of 3-dimensional models of the Genetic and Phenotypic Hotels will facilitate the understanding of the biological properties of the SGC.

14.
The organization of the canonical genetic code needs to be thoroughly illuminated. Here we reorder the four nucleotides (adenine, thymine, guanine and cytosine) according to their emergence in evolution, and apply the organizational rules to devise an algebraic representation for the canonical genetic code. Under a framework of the devised code, we quantify codon and amino acid usages from a large collection of 917 prokaryotic genome sequences, and associate the usages with its intrinsic structure and classification schemes as well as amino acid physicochemical properties. Our results show that the algebraic representation of the code is structurally equivalent to a content-centric organization of the code and that codon and amino acid usages under different classification schemes were correlated closely with GC content, implying a set of rules governing composition dynamics across a wide variety of prokaryotic genome sequences. These results also indicate that codons and amino acids are not randomly allocated in the code, where the six-fold degenerate codons and their amino acids have important balancing roles for error minimization. Therefore, the content-centric code is highly useful in deciphering its hitherto unknown regularities as well as the dynamics of nucleotide, codon, and amino acid compositions.

15.
An original tetrahedral representation of the Genetic Code (GC) that better describes its structure, degeneration and evolution trends is defined. The possibility to reduce the dimension of the representation by projecting the GC tetrahedron on an adequately oriented plane is also analyzed, leading to some equivalent complex representations of the GC. On these bases, optimal symbolic-to-digital mappings of the linear, nucleic acid strands into real or complex genomic signals are derived at nucleotide, codon and amino acid levels. By converting the sequences of nucleotides and polypeptides into digital genomic signals, this approach offers the possibility to use a large variety of signal processing methods for their handling and analysis. It is also shown that some essential features of the nucleotide sequences can be better extracted using this representation. Specifically, the paper reports for the first time the existence of a global helicoidal wrapping of the complex representations of the bases along DNA sequences, a large scale trend of genomic signals. New tools for genomic signal analysis, including the use of phase, aggregated phase, unwrapped phase, sequence path, stem representation of components' relative frequencies, as well as analysis of the transitions are introduced at the nucleotide, codon and amino acid levels, and in a multiresolution approach.

16.
The genetic code is not random but instead is organized in such a way that single nucleotide substitutions are more likely to result in changes between similar amino acids. This fidelity, or error minimization, has been proposed to be an adaptation within the genetic code. Many models have been proposed to measure this adaptation within the genetic code. However, we find that none of these consider codon usage differences between species. Furthermore, use of different indices of amino acid physicochemical characteristics leads to different estimations of this adaptation within the code. In this study, we try to establish a more accurate model to address this problem. In our model, a weighting scheme is established for mistranslation biases of the three different codon positions, transition/transversion biases, and codon usage. Different indices of amino acid physicochemical characteristics are also considered. In contrast to previous work, our results show that the natural genetic code is not fully optimized for error minimization. The genetic code, therefore, is not the most optimized one for error minimization, but one that balances between flexibility and fidelity for different species.

17.
It has been suggested that codon volatility (the proportion of the point-mutation neighbors of a codon that encode different amino acids) can be used as an index of past positive selection. We compared codon volatility with patterns of synonymous and nonsynonymous nucleotide substitution in genome-wide comparisons of orthologous genes between three pairs of related genomes: (1) the protists Plasmodium falciparum and P. yoelii, (2) the fungi Saccharomyces cerevisiae and S. paradoxus, and (3) the mammals mouse and rat. Codon volatility was not consistently associated with an elevated rate of nonsynonymous substitution, as would be expected under positive selection. Rather, the most consistent and powerful correlate of elevated codon volatility was nucleotide content at the second codon position, as expected, given the nature of the genetic code.
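
The parenthetical definition of codon volatility in this abstract translates directly into a short calculation. The Python sketch below is an illustration under stated assumptions: the function name `codon_volatility` is hypothetical, the standard genetic code (NCBI translation table 1) is assumed, and neighbors that are stop codons are excluded from the count, a convention the abstract does not spell out.

```python
from fractions import Fraction
from itertools import product

BASES = "TCAG"
# Standard genetic code (NCBI translation table 1); codons are ordered
# TTT, TTC, TTA, TTG, TCT, ... to match the string below. "*" marks a stop.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}


def codon_volatility(codon):
    """Fraction of point-mutation neighbors that encode a different amino acid.

    Assumption of this sketch: neighbors that are stop codons are not counted,
    and volatility is only defined for sense codons.
    """
    original = CODON_TABLE[codon]
    if original == "*":
        raise ValueError("volatility is defined here for sense codons only")
    # The nine codons that differ from `codon` at exactly one position.
    neighbours = [
        codon[:pos] + new + codon[pos + 1:]
        for pos in range(3)
        for new in BASES
        if new != codon[pos]
    ]
    sense = [n for n in neighbours if CODON_TABLE[n] != "*"]
    changed = sum(CODON_TABLE[n] != original for n in sense)
    return Fraction(changed, len(sense))


for codon in ("CTT", "CGA", "ATG"):
    print(codon, codon_volatility(codon))
```

A fourfold-degenerate codon such as CTT (Leu) has three synonymous neighbors at the third position and so scores 2/3, whereas ATG (Met), which has no synonymous neighbor, scores 1.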

18.
Site-directed mutagenesis and nucleotide sequence analysis were used to study the roles of the global and local contexts in suppression of the lys2-90 frameshift (FS) mutation in Saccharomyces cerevisiae. Global context features established for the LYS2 mRNA region containing the extra T (lys2-90) were similar to those characteristic of regions involved in translational frameshifting. These were a potential ability of the region to form a pseudoknot and the presence of the heptanucleotide CUU UGA C with the hungry UGA nonsense codon in the pseudoknot. Some local context features proved to be essential for the phenotypic expression of FS suppression as a result of translational frameshifting. Two amino acid substitutions determined by the nucleotide sequence between the extra U and the UGA nonsense codon lacked expression. A dependence was observed between the efficiency of FS suppression and the type of the nonsense codon located at a particular position downstream of the extra nucleotide (UGA > UAG > UAA). When translation termination was inactivated, nonsense suppression and FS suppression correlated with each other. These results suggest that translational frameshifting, which underlies suppression in the case of inactivation of translation termination, most likely takes place on the nonsense codon arising as a result of insertion of an extra nucleotide.

19.

Background  

The standard genetic code is redundant and has a highly non-random structure. Codons for the same amino acids typically differ only by the nucleotide in the third position, whereas similar amino acids are encoded, mostly, by codon series that differ by a single base substitution in the third or the first position. As a result, the code is highly albeit not optimally robust to errors of translation, a property that has been interpreted either as a product of selection directed at the minimization of errors or as a non-adaptive by-product of evolution of the code driven by other forces.

20.
Summary: The 20 naturally occurring amino acids are characterized by 20 variables: pK_NH2, pK_COOH, pI, molecular weight, substituent van der Waals volume, seven ¹H and ¹³C nuclear magnetic resonance shift variables, and eight hydrophobicity-hydrophilicity scales. The 20-dimensional data set is reduced to a few new dimensions by principal components analysis. The first three principal components reveal relationships between the properties of the amino acids and the genetic code. Thus the amino acids coded for by adenosine (A), uracil (U), or cytosine (C) in their second codon position (corresponding to U, A, or G in the second anticodon position) are grouped in these components. No grouping was detected for the amino acids coded for by guanine (G) in the second codon position (corresponding to C in the second anticodon position). The results show that a relationship exists between the physical-chemical properties of the amino acids and which of the A (U), U (A), or C (G) nucleotide is used in the second codon (anticodon) position. The amino acids coded for by G (C) in the second codon (anticodon) position do not participate in this relationship.
