Similar articles
 20 similar articles found
1.
In the RNA world, RNA is assumed to be the dominant macromolecule performing most, if not all, core "house-keeping" functions. The ribo-cell hypothesis suggests that the genetic code and the translation machinery may both have been born of the RNA world, and that DNA, once introduced to ribo-cells, may have gradually taken over the informational role of RNA, bringing a mature genetic code and a mechanism enabling stable inheritance of sequence and its variation. In this context, we modeled the genetic code in two content variables (GC and purine contents) of protein-coding sequences and measured the purine content sensitivity of each codon when the sensitivity (% usage) is plotted as a function of GC content variation. The analysis leads to a new pattern, the symmetric pattern, in which the sensitivity of purine content variation shows diagonal symmetry in the codon table, most significantly in the two GC-content-invariable quarters. This adds to the two existing patterns, in which the table is divided into either four GC content sensitivity quarters or two amino acid diversity halves. The most insensitive codon sets are GUN (valine) and CAN (CAR for asparagine and CAY for aspartic acid) and the most biased amino acid is valine (always over-estimated) followed by alanine (always under-estimated). The unique position of valine and its codons suggests its key roles in the final recruitment of the complete codon set of the canonical table. The distinct choice may only be attributable to sequence signatures or signals of splice sites for spliceosomal introns shared by all extant eukaryotes.

2.
Lightfield J  Fram NR  Ely B 《PloS one》2011,6(3):e17677
The GC content of bacterial genomes ranges from 16% to 75% and wide ranges of genomic GC content are observed within many bacterial phyla, including both gram negative and gram positive phyla. Thus, divergent genomic GC content has evolved repeatedly in widely separated bacterial taxa. Since genomic GC content influences codon usage, we examined codon usage patterns and predicted protein amino acid content as a function of genomic GC content within eight different phyla or classes of bacteria. We found that similar patterns of codon usage and protein amino acid content have evolved independently in all eight groups of bacteria. For example, in each group, use of amino acids encoded by GC-rich codons increased by approximately 1% for each 10% increase in genomic GC content, while the use of amino acids encoded by AT-rich codons decreased by a similar amount. This consistency within every phylum and class studied led us to conclude that GC content appears to be the primary determinant of the codon and amino acid usage patterns observed in bacterial genomes. These results also indicate that selection for translational efficiency of highly expressed genes is constrained by the genomic parameters associated with the GC content of the host genome.

3.
The organization of the canonical genetic code needs to be thoroughly illuminated. Here we reorder the four nucleotides (adenine, thymine, guanine and cytosine) according to their emergence in evolution, and apply the organizational rules to devise an algebraic representation of the canonical genetic code. Under the framework of the devised code, we quantify codon and amino acid usages from a large collection of 917 prokaryotic genome sequences, and associate the usages with the code's intrinsic structure and classification schemes as well as with amino acid physicochemical properties. Our results show that the algebraic representation of the code is structurally equivalent to a content-centric organization of the code and that codon and amino acid usages under different classification schemes are closely correlated with GC content, implying a set of rules governing composition dynamics across a wide variety of prokaryotic genome sequences. These results also indicate that codons and amino acids are not randomly allocated in the code, where the six-fold degenerate codons and their amino acids have important balancing roles for error minimization. Therefore, the content-centric code is very useful for deciphering hitherto unknown regularities of the code as well as the dynamics of nucleotide, codon, and amino acid compositions.

4.
Palidwor GA  Perkins TJ  Xia X 《PloS one》2010,5(10):e13431

Background

In spite of extensive research on the effect of mutation and selection on codon usage, a general model of codon usage bias due to mutational bias has been lacking. Because most amino acids allow synonymous GC-content-changing substitutions in the third codon position, the overall GC bias of a genome or genomic region is highly correlated with GC3, a measure of third position GC content. For individual amino acids as well, usage of G/C-ending codons generally increases with increasing GC bias and decreases with increasing AT bias. Arginine and leucine, amino acids that allow GC-changing synonymous substitutions in the first and third codon positions, have codons which may be expected to show different usage patterns.

Principal Findings

In analyzing codon usage bias in hundreds of prokaryotic and plant genomes and in human genes, we find that two G-ending codons, AGG (arginine) and TTG (leucine), unlike all other G/C-ending codons, show overall usage that decreases with increasing GC bias, contrary to the usual expectation that G/C-ending codon usage should increase with increasing genomic GC bias. Moreover, the usage of some codons appears nonlinear, even nonmonotone, as a function of GC bias. To explain these observations, we propose a continuous-time Markov chain model of GC-biased synonymous substitution. This model correctly predicts the qualitative usage patterns of all codons, including nonlinear codon usage in isoleucine, arginine and leucine. The model accounts for 72%, 64% and 52% of the observed variability of codon usage in prokaryotes, plants and human respectively. When codons are grouped based on common GC content, 87%, 80% and 68% of the variation in usage is explained for prokaryotes, plants and human respectively.

Conclusions

The model clarifies the sometimes-counterintuitive effects that GC mutational bias can have on codon usage, quantifies the influence of GC mutational bias and provides a natural null model relative to which other influences on codon bias may be measured.
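
As a rough illustration of how a continuous-time Markov chain of GC-biased synonymous substitution can yield non-monotone usage of a G-ending codon, the sketch below builds a toy chain over the six arginine codons. It is not the authors' parameterization: the rate rule (each single-base synonymous change occurs at a rate equal to the frequency of the incoming base, with G and C each at `gc/2`), the Euler integration, and all function names are assumptions made for this example.

```python
# Toy CTMC over the six arginine codons (assumed model, not the paper's).
ARG_CODONS = ["CGA", "CGC", "CGG", "CGT", "AGA", "AGG"]


def base_freqs(gc):
    """Assumed target-base frequencies under a GC bias of `gc` (0 < gc < 1)."""
    return {"G": gc / 2, "C": gc / 2, "A": (1 - gc) / 2, "T": (1 - gc) / 2}


def rate_matrix(codons, gc):
    """Generator matrix: synonymous single-base changes only."""
    pi = base_freqs(gc)
    n = len(codons)
    q = [[0.0] * n for _ in range(n)]
    for i, a in enumerate(codons):
        for j, b in enumerate(codons):
            diffs = [k for k in range(3) if a[k] != b[k]]
            if len(diffs) == 1:
                # Rate equals the frequency of the incoming base.
                q[i][j] = pi[b[diffs[0]]]
        # Diagonal entry makes each row sum to zero.
        q[i][i] = -sum(q[i])
    return q


def stationary(q, steps=5000, dt=0.1):
    """Approximate the stationary distribution by forward Euler steps."""
    n = len(q)
    p = [1.0 / n] * n
    for _ in range(steps):
        p = [p[j] + dt * sum(p[i] * q[i][j] for i in range(n)) for j in range(n)]
    return p


def agg_usage(gc):
    """Stationary share of AGG among arginine codons in this toy model."""
    p = stationary(rate_matrix(ARG_CODONS, gc))
    return p[ARG_CODONS.index("AGG")]


if __name__ == "__main__":
    for gc in (0.1, 0.4, 0.9):
        print(f"GC bias {gc:.1f}: AGG share {agg_usage(gc):.3f}")
```

In this toy chain the AGG share rises at low GC bias and falls at high GC bias, because the A in the first position is penalized as GC bias grows. That qualitative behaviour is all the sketch is meant to show; it makes no claim about the fitted values reported above.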

5.
Correlations between genomic GC contents and amino acid frequencies were studied in the homologous sequences of 12 eubacterial genomes. Results show that the frequencies of amino acids encoded by GC-rich codons increase significantly with genomic GC content, whereas the opposite trend was observed for amino acids encoded by GC-poor codons. Further studies show that not all amino acids change in the direction predicted from genomic GC pressure, suggesting that protein evolution is not entirely dictated by nucleotide frequencies. An amino acid substitution matrix calculated among hydrophobic, amphipathic and hydrophilic amino acid groups shows that amphipathic and hydrophilic amino acids are more frequently substituted by hydrophobic amino acids than hydrophobic amino acids are by hydrophilic or amphipathic ones. This indicates that nucleotide bias induces directional changes in proteome composition that strongly alter hydropathy values. In fact, significant increases in hydrophobicity values have also been observed with increasing genomic GC content. Correlations between GC content and amino acid composition in three different predicted protein secondary structures show that hydropathy values increase significantly with GC content in aperiodic and helix structures, whereas strand structure remains insensitive to genomic GC levels. The relative importance of mutation and selection in the evolution of proteins is discussed on the basis of these results.

6.
Patterns of codon usage bias in three dicot and four monocot plant species   Cited by: 9 in total (0 self-citations, 9 by others)
Codon usage in nuclear genes of four monocot and three dicot species was analyzed to find general patterns in codon choice of plant species. Codon bias was correlated with GC content at the third codon position. GC contents were higher in monocot species than in dicot species at all codon positions. The high GC contents of monocot species might be the result of relatively strong mutational bias that occurred in the lineage of the Poaceae species. In both dicot and monocot species, the effective number of codons (ENC) for most genes was similar to the ENC expected from the GC content at the third codon positions. G- and C-ending codons were detected as the "preferred" codons in monocot species, as in Drosophila. Also, many "preferred" codons are the same in dicot species. Pyrimidines (C and T) are used more frequently than purines (G and A) in four-fold degenerate codon groups.
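
For reference, the ENC expected from third-position GC content is commonly computed with Wright's (1990) null expression, ENC = 2 + s + 29 / (s² + (1 − s)²), where s is GC content at synonymous third positions. The abstract does not say which expression was used, so treating it as Wright's formula is an assumption; the function name is illustrative.

```python
def expected_enc(gc3s):
    """Expected effective number of codons under Wright's null curve.

    gc3s: GC fraction at synonymous third codon positions, in [0, 1].
    Assumes codon usage is shaped only by base composition at third positions.
    """
    if not 0.0 <= gc3s <= 1.0:
        raise ValueError("gc3s must be a fraction between 0 and 1")
    s = gc3s
    return 2.0 + s + 29.0 / (s ** 2 + (1.0 - s) ** 2)


if __name__ == "__main__":
    for s in (0.2, 0.5, 0.8):
        print(f"GC3s = {s:.1f}: expected ENC = {expected_enc(s):.2f}")
```

Genes whose observed ENC falls well below this curve are the usual candidates for codon bias beyond what base composition alone explains.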

7.
The structure of the genetic code is related to a Gray code, which is a plausible theoretical model for an amino acid code. The proposed model implies that the most important factor in shaping the code was the effects of mistakes in translation, not effects of mutations. Another possible implication is that the preservation of stiffness and flexibility at appropriate places in a protein chain is as important in protein structure as the appropriate placement of hydrophilic (external) and hydrophobic (internal) residues. Other results are a simple conceptualization of the relationships among the 20 amino acids and their relations to their codons. The detailed relationships are summarized in the following ‘similarity alphabet’: ala, thr, gly, pro, ser; asp, asn, glu, gln, lys; his, arg, trp, tyr, phe; leu, met, ile, val, cys; (ATGPS DNEQK HRWYF LMIVC in the one-letter code). This alphabet falls into four groups of amino acids: small, external, large, internal. The approximate relation of the groups to their codons is expressed as follows: the first base of a codon controls size (a purine means a small amino acid, a pyrimidine means large); the middle base controls cloisteredness (purine means external, pyrimidine means internal). These relationships express the minimum change principle upon which the code appears to be founded.
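
The defining property of a Gray code, that neighbouring code words differ in exactly one position, can be checked with a short sketch. It uses the standard binary reflected Gray code over 6 bits (64 words, matching the number of codons); how the paper maps bases to bits is not given in the abstract, so no base-to-bit assignment is assumed here and the function names are illustrative.

```python
def gray_code(n_bits):
    """Return the binary reflected Gray code sequence for n_bits bits."""
    return [i ^ (i >> 1) for i in range(2 ** n_bits)]


def hamming(a, b):
    """Number of bit positions in which integers a and b differ."""
    return bin(a ^ b).count("1")


if __name__ == "__main__":
    codes = gray_code(6)  # 64 code words, one per codon-sized slot
    steps = [hamming(a, b) for a, b in zip(codes, codes[1:])]
    print("distinct words:", len(set(codes)))
    print("step sizes seen:", sorted(set(steps)))
```

A single-step error in such an ordering moves to an adjacent word only, which is the sense in which a Gray-code-like layout expresses a minimum change principle.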

8.
New insights into the arrangement of the genetic code table, based on the analysis of the physico-chemical properties of its molecular constituents, are reported in this paper. It will be demonstrated that the code has a twofold symmetry that is not apparent from the conventional code table, but becomes apparent when the codon-anticodon energies are listed for each triplet. The evolutionary development of the current code based on single base replacement mutations (transitions) from an 'iso-energetic' degenerated subset of 16 of the 64 codons is discussed. The energy landscape of all 64 codons is presented. A detailed analysis of the energy changes due to mutations in the 3rd, 1st or 2nd position of a codon reveals that the modern genetic code is highly robust. Changes come in small discrete steps that can be quantified in relation to the thermal noise of the system. The relation of the individual codon to its neighbours in the rearranged codon table can be completely understood based on thermodynamic considerations.

9.
Bellgard MI  Gojobori T 《Gene》1999,238(1):33-37
The relationship between the overall G+C content of the genome (GC) and the GC content at the third codon positions (GC3) of genes, which we refer to as a GC3-plot, was examined using 15 currently available complete genome sequences. A remarkably linear relationship was found between these two quantities, confirming previous observations of a strong positive correlation in the GC3-plot. In order to conduct a more detailed analysis of the GC3-plot, we examined the GC3 content by separating orthologous codons into three categories: synonymously different codons (namely identical amino acids, IA), different amino acids (DA), and identical codons (IC), for a pairwise comparison of two closely related species. When we took pairwise species comparisons between Mycoplasma genitalium (Mg) and Mycoplasma pneumoniae (Mp) and between Mycobacterium tuberculosis (Mt) and Mycobacterium leprae (Ml) as examples, we found that for Mp and Ml, the GC3 for IA deviated the most from the linear expectation in the GC3-plot, whereas for Mg and Mt the deviation was minimal. These findings suggest that the major changes of GC content took place in Mp and Ml, but not in Mg and Mt. This analysis also enables us to predict the future direction of the evolutionary changes of the genomic GC content.
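
The quantities plotted in a GC3-plot can be computed directly from a coding sequence. The sketch below is a minimal illustration, assuming an in-frame DNA coding sequence with no ambiguous bases; the function name and the returned keys are chosen for this example and do not come from the paper.

```python
def positional_gc(cds):
    """Return overall GC and GC at codon positions 1, 2 and 3 as fractions.

    cds: in-frame coding sequence (DNA); length must be a multiple of 3.
    """
    seq = cds.upper()
    if len(seq) == 0 or len(seq) % 3 != 0:
        raise ValueError("sequence length must be a non-zero multiple of 3")

    def gc_fraction(bases):
        # Count G and C, then divide by the number of bases considered.
        return sum(b in "GC" for b in bases) / len(bases)

    return {
        "GC": gc_fraction(seq),
        "GC1": gc_fraction(seq[0::3]),  # first codon positions
        "GC2": gc_fraction(seq[1::3]),  # second codon positions
        "GC3": gc_fraction(seq[2::3]),  # third codon positions
    }


if __name__ == "__main__":
    print(positional_gc("ATGGCC"))
```

Plotting `GC3` against genomic GC for many genomes gives the kind of GC3-plot described above; a whole-genome value would pool positions across all genes rather than average per-gene fractions.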

10.
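Characteristics of synonymous codon preference in human genes and their relationship with gene GC content   Cited by: 24 in total (0 self-citations, 24 by others)
A total of 728 human genes were divided into four groups according to the GC content of their coding regions (from GC<0.43 to GC>0.58), and the characteristics of synonymous codon preference were examined in each group. In all samples, NTG codons (N denotes any of the four bases) were found to be especially favored and NCG codons to be avoided as far as possible. Correlation analysis between the GC content of the gene environment and the C3/G3 content (the content of C and G at the third codon position), together with the codon preferences of the four groups, supports the view that C-ending codons have a special advantage in coding, and that this advantage helps ensure the accuracy of translation. The trends in the content of each amino acid as a function of coding-region GC content were also examined.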

11.
Base composition, codon usage and amino acid usage were analyzed using 529 orthologous sequences of Aquifex aeolicus and Bacillus subtilis, which have different optimal growth temperatures. These two bacteria do not differ significantly in overall GC composition, but GC(1+2) and GC3 levels were found to vary significantly. Significantly higher purine content and GC3 composition were observed in the coding sequences of Aquifex aeolicus than in their Bacillus subtilis counterparts. Correspondence analyses of codon and amino acid usage reveal that variation in base composition influences codon and amino acid usage. Two selection pressures acting at the nucleotide level (GC3 and purine enrichment) cause amino acid usage to vary differently in different protein secondary structures. Our results suggest that adaptation of amino acid usage in the coil structure of Aquifex aeolicus proteins is under the control of both purine increment and GC3 composition, whereas the adaptation of amino acids in the helical region of thermophilic bacteria is strongly influenced by purine content. Evolutionary perspectives concerning the temperature adaptation of DNA and protein molecules of these two bacteria are discussed on the basis of these results.

12.
To expand the genetic code for specification of multiple non-natural amino acids, unique codons for these novel amino acids are needed. As part of a study of the potential of quadruplets as codons, the decoding of tandem UAGA quadruplets by an engineered tRNALeu with an eight-base anticodon loop, has been investigated. When GCC is the codon immediately 5′ of the first UAGA quadruplet, and release factor 1 is partially inactivated, the tandem UAGAs specify two leucines with an overall efficiency of at least 10%. The presence of a purine at anticodon loop position 32 of the tRNA decoding the codon 5′ to the first UAGA seems to influence translation of the following codon. Another finding is intraribosomal dissociation of anticodons from codons and their re-pairing to mRNA at overlapping or nearby codons. In one case where GCC is replaced by CGG, only a single Watson–Crick base pair can form upon re-pairing when decoding is resumed. This has implications for the mechanism of some cases of programmed frameshifting.

13.
Compositional distributions in the three codon positions of the coding sequences of 12 fully sequenced, publicly available prokaryotic genomes were investigated. A universal compositional correlation was observed in most of the genomes under investigation irrespective of their overall genomic GC contents. In all the genomes, the GC contents at the first codon positions are always greater than the overall GC contents of the genomes, whereas the reverse is true for second codon positions. GC contents at the third codon positions are higher than the overall genomic GC contents in high-GC genomes, and the opposite situation was found in low-GC genomes except for Helicobacter pylori. In high-GC genomes, the GC contents at the first + second codon positions are less than the GC contents at the third codon positions, and they are low in low-GC genomes except for Helicobacter pylori. The distributions of the four bases at the three different positions were also investigated for all 12 organisms. It was observed that in the first codon positions G is the most dominant base in high-GC genomes and A is the most dominant base in low-GC genomes. Overall, purine bases, i.e., (A + G), predominate in the first codon position. In the second codon position, A is the most dominant base in most of the organisms and G is the least dominant base in all the organisms. There is no unique regular pattern of individual bases at the third codon positions; however, there are significant differences in (G + C) content at the third codon positions among the different organisms. Calculations of dinucleotide frequencies in the 12 organisms indicate that in GC-rich genomes GG, GC, CC, and CG dinucleotides are the most dominant, whereas the reverse is true in low-GC genomes. Biological implications of these results are discussed in this paper.

14.
Genomic GC (overall G+C content of the coding sequences) variations were reinvestigated between the orthologous genes of Mycobacterium tuberculosis and Mycobacterium leprae. It was observed that overall genomic GC variation between the species mainly originates from the combined effects of GC(1) and GC(2) variations. However, orthologous codon pairs that encode identical amino acids with different codons (IA) are responsible for the genomic GC(3) variation between the organisms, whereas orthologous codons encoding different amino acids (DA) in the two organisms are responsible for the variation in GC(1) levels. Further analyses indicate that duets and quartets change the GC(3) levels in the same direction and with the same magnitude for the IA category, whereas for the DA category the GC(1) levels of duets are significantly lower than the overall GC(1) levels and the GC(1) levels of quartets are significantly higher. GC(3) levels of informational genes in the IA category decrease more rapidly than those of the other functional categories of genes. The biological implications of these results are discussed in this paper.

15.
The number of completely sequenced archaeal genomes is now sufficient for a large-scale bioinformatic study. We analyzed each coding region from 36 archaeal genomes using the original CGS algorithm, calculating the total GC content (G+C), GC content in first, second and third codon positions as well as in fourfold and twofold degenerate sites from third codon positions, levels of arginine codon usage (Arg2: AGA/G; Arg4: CGX), levels of amino acid usage and the entropy of the amino acid content distribution. In archaeal genomes with strong GC pressure, arginine is coded preferably by GC-rich Arg4 codons, whereas in most archaeal genomes with G+C < 0.6, arginine is coded preferably by AT-rich Arg2 codons. In the genome of Haloquadratum walsbyi, which is closely related to GC-rich archaea, GC content has decreased mostly in third codon positions, while the Arg4 > Arg2 bias still persists. Proteomes of archaeal species carry characteristic amino acid biases: levels of isoleucine and lysine are elevated, while levels of alanine, histidine, glutamine and cysteine are relatively decreased. Numerous genomic and proteomic biases observed can be explained by the hypothesis of a previously existing strong mutational AT pressure in the common predecessor of all archaea.

16.
The genetic code, discovered 40 years ago, consists of 64 triplets (codons) of nucleotides. The genetic code is almost universal: the same codons are assigned to the same amino acids and to the same START and STOP signals in the vast majority of genes in animals, plants, and microorganisms. Each codon encodes one of the 20 amino acids used in the synthesis of proteins. This produces some redundancy in the code, with most of the amino acids being encoded by more than one codon. Two cases have been found in which an amino acid that is not one of the standard 20, selenocysteine or pyrrolysine, is inserted by a tRNA into the growing polypeptide.
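
The redundancy described above can be made concrete by tabulating the standard genetic code and counting codons per amino acid. The sketch below uses the usual compact 64-letter encoding of the standard code with bases ordered T, C, A, G, where `*` marks a stop codon; the variable and function names are illustrative.

```python
from collections import Counter
from itertools import product

BASES = "TCAG"
# Standard genetic code, codons enumerated in TCAG order; '*' marks a stop.
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)

# Map each of the 64 codons (e.g. "ATG") to a one-letter amino acid.
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def degeneracy():
    """Count how many codons encode each amino acid (and stop)."""
    return Counter(CODON_TABLE.values())


if __name__ == "__main__":
    for aa, n in sorted(degeneracy().items(), key=lambda kv: (-kv[1], kv[0])):
        print(aa, n)
```

In this table three codons are stop signals, so 61 codons are shared among 20 amino acids, which is why most amino acids have more than one codon.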

17.
We propose that glycine was the first amino acid to be incorporated into the genetic code, followed by serine, aspartic and/or glutamic acid, small hydrophilic amino acids that all have codons in the bottom right-hand corner of the standard genetic code table. Because primordial ribosomal synthesis is presumed to have been rudimentary, this stage would have been characterized by the synthesis of short, water-soluble peptides, the first of which would have comprised polyglycine. Evolution of the code is proposed to have occurred by the duplication and mutation of tRNA sequences, which produced a radiation of codon assignment outwards from the bottom right-hand corner. As a result of this expansion, we propose a trend from small hydrophilic to hydrophobic amino acids, with selection for longer polypeptides requiring a hydrophobic core for folding and stability driving the incorporation of hydrophobic amino acids into the code.

18.
Different synonymous codons are favored by natural selection for translation efficiency and accuracy in different organisms. The rules governing the identities of favored codons in different organisms remain obscure. In fact, it is not known whether such rules exist or whether favored codons are chosen randomly in evolution in a process akin to a series of frozen accidents. Here, we study this question by identifying for the first time the favored codons in 675 bacteria, 52 archaea, and 10 fungi. We use a number of tests to show that the identified codons are indeed likely to be favored and find that across all studied organisms the identity of favored codons tracks the GC content of the genomes. Once the effect of the genomic GC content on selectively favored codon choice is taken into account, additional universal amino acid specific rules governing the identity of favored codons become apparent. Our results provide for the first time a clear set of rules governing the evolution of selectively favored codon usage. Based on these results, we describe a putative scenario for how evolutionary shifts in the identity of selectively favored codons can occur without even temporary weakening of natural selection for codon bias.

19.
20.
Understanding how codons became associated with their specific amino acids is fundamental to deriving a theory for the origin of the genetic code. Carl Woese and coworkers designed a series of experiments to test associations between amino acids and nucleobases that may have played a role in establishing the genetic code. Through these experiments it was found that a property of amino acids called the polar requirement (PR) is correlated with the organization of the codon table. No other property of amino acids has been found that correlates with the codon table as well as PR, indicating that PR is uniquely related to the modern genetic code. Using molecular dynamics simulations of amino acids in solutions of water and dimethylpyridine used to experimentally measure PR, we show that variations in the partitioning between the two phases as described by radial distribution functions correlate well with the measured PRs. Partition coefficients based on probability densities of the amino acids in each phase have the linear behavior with base concentration as suggested by PR experiments.
