Similar articles
 Found 20 similar articles (search time: 500 ms)
1.
Biased usage of synonymous codons has long been explained from the perspective of cellular tRNA abundance. Taking advantage of publicly available gene expression data for Saccharomyces cerevisiae, a systematic analysis of codon and amino acid usage in two different coding regions, corresponding to the regular (helix and strand) and the irregular (coil) protein secondary structures, has been performed. Our analyses suggest that, apart from tRNA abundance, mRNA folding stability is another major evolutionary force shaping the codon and amino acid usage differences between highly and lowly expressed genes in the S. cerevisiae genome and, surprisingly, that it depends on the coding regions corresponding to the secondary structures of the encoded proteins. This offers a new paradigm for understanding codon usage in S. cerevisiae. Differential amino acid usage between highly and lowly expressed genes in the regions coding for the irregular protein secondary structure in S. cerevisiae is explained by the stability of the folded mRNA structure. Irrespective of the protein secondary structural type, the highly expressed genes always tend to encode cheaper amino acids in order to reduce the overall biosynthetic cost of producing the corresponding protein. This study supports the hypothesis that tRNA abundance is a consequence of, and not a reason for, the biased usage of amino acids between highly and lowly expressed genes.

2.
The relationship between synonymous codon usage and different protein secondary structural classes was investigated using 401 Homo sapiens proteins extracted from the Protein Data Bank (PDB). A simple chi-square test was used to assess the significance of the deviation between the observed and expected frequencies of 59 codons at the level of individual synonymous families in the four different protein secondary structural classes. It was observed that synonymous codon families show non-randomness in codon usage in the four different secondary structural classes. However, when the genes were classified according to their GC3 levels, non-randomness increased in the high-GC3 group of genes. The non-randomness in codon usage was further tested among the same protein secondary structures belonging to four different protein folding classes of the high-GC3 group of genes. The results show that in each protein secondary structural unit there exist some synonymous families that show a class-specific codon usage pattern. Moreover, there is increased non-random behaviour of synonymous codons in the sheet structure of all secondary structural classes in the high-GC3 group of genes. Biological implications of these results are discussed.
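
The chi-square comparison described in this abstract can be illustrated with a minimal sketch. It assumes a contingency table for one synonymous family, with one row per codon and one column per secondary structural class; the function name and any counts passed to it are hypothetical and are not taken from the paper. It also assumes every row and column total is positive.

```python
def chi_square_statistic(table):
    """Pearson chi-square statistic and degrees of freedom for a
    codon-by-structure-class table of observed counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count if codon choice were independent of structure class.
            expected = row_totals[i] * col_totals[j] / total
            stat += (observed - expected) ** 2 / expected
    dof = (len(table) - 1) * (len(table[0]) - 1)
    return stat, dof
```

A statistic of zero means the codons of the family are used in the same proportions in every class; the statistic would then be compared against a chi-square distribution with the returned degrees of freedom to judge significance.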

3.
High-quality data about protein structures and their gene sequences are essential to the understanding of the relationship between protein folding and protein coding sequences. Firstly we constructed the EcoPDB database, which is a high-quality database of Escherichia coli genes and their corresponding PDB structures. Based on EcoPDB, we presented a novel approach based on information theory to investigate the correlation between cysteine synonymous codon usages and local amino acids flanking cysteines, the correlation between cysteine synonymous codon usages and synonymous codon usages of local amino acids flanking cysteines, as well as the correlation between cysteine synonymous codon usages and the disulfide bonding states of cysteines in the E. coli genome. The results indicate that the nearest neighboring residues and their synonymous codons of the C-terminus have the greatest influence on the usages of the synonymous codons of cysteines and the usage of the synonymous codons has a specific correlation with the disulfide bond formation of cysteines in proteins. The correlations may result from the regulation mechanism of protein structures at gene sequence level and reflect the biological function restriction that cysteines pair to form disulfide bonds. The results may also be helpful in identifying residues that are important for synonymous codon selection of cysteines to introduce disulfide bridges in protein engineering and molecular biology. The approach presented in this paper can also be utilized as a complementary computational method and be applicable to analyse the synonymous codon usages in other model organisms.  相似文献   

4.
Wang ML  Song JN  Xu WB  Li WJ 《FEBS letters》2004,576(3):336-338
Proline is a special imino acid in protein and the isomerization of the prolyl peptide bond has notable biological significance and influences the final structure of protein greatly, so the correlation between proline synonymous codon usage and local amino acid, the correlation between proline synonymous codon usage and the isomerization of the prolyl peptide bond were both investigated in the Escherichia coli genome by using a novel method based on information theory. The results show that in peptide chain, the residue at the first position C-terminal influences the usage of proline synonymous codon greatly and proline synonymous codons contain some factors influencing the isomerization of the prolyl peptide bond.  相似文献   

5.
The extent of codon usage in the protein-coding genes of the mycobacteriophage Bxz1 and its plating bacterium, M. smegmatis, was determined, and it was observed that codons ending with G and/or C were predominant in both organisms. Multivariate statistical analysis showed that in both organisms, the genes were separated along the first major explanatory axis according to their expression levels and their genomic GC content at the synonymous third positions of the codons. The second major explanatory axis differentiates the genes according to their genome type. A comparison of the relative synonymous codon usage between 20 highly and 20 lowly expressed genes from Bxz1 identified 21 codons that are statistically overrepresented in the former group of genes. Further analysis found that the Bxz1-specific tRNA species could recognize 13 of the 21 overrepresented synonymous codons, which incorporated 13 amino acid residues preferentially into the highly expressed proteins of Bxz1. In contrast, seven amino acid residues were preferentially incorporated into the lowly expressed proteins by 10 other tRNA species of Bxz1. This analysis predicts for the first time that the Bxz1-specific tRNA species modulate the optimal expression of its proteins during development.

6.
Summary We have analyzed the correlation that exists between the GC levels of third and first or second codon position for about 1400 human coding sequences. The linear relationship that was found indicates that the large differences in GC level of third codon positions of human genes are paralleled by smaller differences in GC levels of first and second codon positions. Whereas third codon position differences correspond to very large differences in codon usage within the human genome, the first and second codon position differences correspond to smaller, yet very remarkable, differences in the amino acid composition of encoded proteins. Because GC levels of codon positions are linearly correlated with the GC levels of the isochores harboring the corresponding genes, both codon usage and amino acid composition are different for proteins encoded by genes located in isochores of different GC levels. Furthermore, we have also shown that a linear relationship with a unity slope and a correlation coefficient of 0.77 exists between GC levels of introns and exons from the 238 human genes currently available for this analysis. Introns are, however, about 5% lower in GC, on average, than exons from the same genes.  相似文献   

7.
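How much protein secondary structure information do synonymous codons carry?   Total citations: 4, self-citations: 0, citations by others: 4
Using an information-theoretic approach, the association between synonymous codon usage and protein secondary structure was examined in two organisms, Escherichia coli and human. The results show that in both the E. coli and human genomes there are some synonymous codons that clearly carry protein secondary structure information, although the amount of information is very small, and that the association between synonymous codons and protein secondary structure is species-specific.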

8.
The 'effective number of codons' used in a gene   Total citations: 64, self-citations: 0, citations by others: 64
F Wright 《Gene》1990,87(1):23-29
A simple measure is presented that quantifies how far the codon usage of a gene departs from equal usage of synonymous codons. This measure of synonymous codon usage bias, the 'effective number of codons used in a gene', Nc, can be easily calculated from codon usage data alone, and is independent of gene length and amino acid (aa) composition. Nc can take values from 20, in the case of extreme bias where one codon is exclusively used for each aa, to 61 when the use of alternative synonymous codons is equally likely. Nc thus provides an intuitively meaningful measure of the extent of codon preference in a gene. Codon usage patterns across genes can be investigated by the Nc-plot: a plot of Nc vs. G + C content at synonymous sites. Nc-plots are produced for Homo sapiens, Saccharomyces cerevisiae, Escherichia coli, Bacillus subtilis, Dictyostelium discoideum, and Drosophila melanogaster. A FORTRAN77 program written to calculate Nc is available on request.  相似文献   
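
As an illustration of the measure summarized above, the following is a minimal sketch, not the FORTRAN77 program mentioned in the abstract. It assumes the commonly stated form of the calculation: for each amino acid with n observed codons and codon frequencies p_i, a homozygosity F = (n Σ p_i² − 1)/(n − 1) is computed, the F values are averaged within each degeneracy class, and Nc = 2 + 9/F2 + 1/F3 + 5/F4 + 3/F6, capped at 61. Two simplifications are assumptions of this sketch: amino acids seen fewer than twice are skipped, and a degeneracy class with no usable amino acid falls back to equal usage (F = 1/k).

```python
from collections import Counter, defaultdict

# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {a + b + c: AMINO[16 * i + 4 * j + k]
               for i, a in enumerate(BASES)
               for j, b in enumerate(BASES)
               for k, c in enumerate(BASES)}

def effective_number_of_codons(seq):
    """Sketch of Nc for an in-frame coding sequence (DNA or RNA letters)."""
    seq = seq.upper().replace("U", "T")
    counts = Counter(seq[i:i + 3] for i in range(0, len(seq) - 2, 3))

    # Codon counts grouped by amino acid (stop codons excluded).
    by_aa = defaultdict(dict)
    for codon, aa in CODON_TABLE.items():
        if aa != "*":
            by_aa[aa][codon] = counts.get(codon, 0)

    # Homozygosity F per amino acid, grouped by degeneracy class.
    f_by_class = defaultdict(list)
    for codon_counts in by_aa.values():
        k = len(codon_counts)
        n = sum(codon_counts.values())
        if k == 1 or n < 2:
            continue  # Met/Trp, or too few observations (assumed rule)
        f = (n * sum((c / n) ** 2 for c in codon_counts.values()) - 1) / (n - 1)
        if f > 0:
            f_by_class[k].append(f)

    nc = 2.0  # Met and Trp each contribute exactly one codon
    for k, n_amino_acids in ((2, 9), (3, 1), (4, 5), (6, 3)):
        values = f_by_class[k]
        mean_f = sum(values) / len(values) if values else 1.0 / k
        nc += n_amino_acids / mean_f
    return min(nc, 61.0)
```

Under these assumptions, a sequence that uses a single codon for every amino acid gives the lower bound of 20, and a sequence that uses all synonymous codons equally reaches the cap of 61, matching the range stated in the abstract.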

9.
Synonymous codon replacement can change protein structure and function, indicating that protein structure depends on DNA sequence. During heterologous protein expression, low expression or formation of insoluble aggregates may be attributable to differences in synonymous codon usage between expression and natural hosts. This discordance may be particularly important during translation of the domain boundaries (link/end segments) that separate elements of higher ordered structure. Within such regions, ribosomal progression slows as the ribosome encounters clusters of infrequently used codons that preferentially encode a subset of amino acids. To replicate the modulation of such localized translation rates during heterologous expression, we used known relationships between codon usage frequencies and secondary protein structure to develop an algorithm ("codon harmonization") for identifying regions of slowly translated mRNA that are putatively associated with link/end segments. It then recommends synonymous replacement codons having usage frequencies in the heterologous expression host that are less than or equal to the usage frequencies of native codons in the native expression host. For protein regions other than these putative link/end segments, it recommends synonymous substitutions with codons having usage frequencies matched as nearly as possible to the native expression system. Previous application of this algorithm facilitated E. coli expression, manufacture and testing of two Plasmodium falciparum vaccine candidates. Here we describe the algorithm in detail and apply it to E. coli expression of three additional P. falciparum proteins. Expression of the "recoded" genes exceeded that of the native genes by 4- to 1,000-fold, representing levels suitable for vaccine manufacture. The proteins were soluble and reacted with a variety of functional conformation-specific mAbs, suggesting that they were folded properly and had assumed native conformation. Codon harmonization may further provide a general strategy for improving the expression of soluble functional proteins during heterologous expression in hosts other than E. coli.
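
The selection rule described in this abstract can be sketched for a single codon as follows. This is an illustrative reading of the abstract, not the authors' implementation: the function name, the usage tables and the tie-breaking choices are all hypothetical. In particular, choosing the closest match among the codons that satisfy the "less than or equal" constraint, and falling back to all synonyms when none satisfies it, are assumptions of this sketch.

```python
def pick_host_codon(native_codon, synonyms, native_usage, host_usage,
                    in_link_segment):
    """Choose a synonymous codon for the expression host.

    native_usage / host_usage map codon -> usage frequency in the native
    and heterologous host (hypothetical tables supplied by the caller).
    """
    target = native_usage[native_codon]
    candidates = list(synonyms)
    if in_link_segment:
        # Putative link/end segment: prefer host codons used no more
        # frequently than the native codon is in its own host.
        slower = [c for c in candidates if host_usage[c] <= target]
        if slower:  # assumed fallback: keep all synonyms if none qualifies
            candidates = slower
    # Otherwise (and among the remaining candidates): closest frequency match.
    return min(candidates, key=lambda c: abs(host_usage[c] - target))
```

Applied codon by codon along a gene, with link/end segments flagged beforehand, this yields a recoded sequence; how those segments are identified is outside the scope of the sketch.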

10.
Does the 'non-coding' strand code?   Total citations: 3, self-citations: 2, citations by others: 1
The hypothesis that DNA strands complementary to the coding strand contain in phase coding sequences has been investigated. Statistical analysis of the 50 genes of bacteriophage T7 shows no significant correlation between patterns of codon usage on the coding and non-coding strands. In Bacillus and yeast genes the correlation observed is not different from that expected with random synonymous codon usage, while a high correlation seen in 52 E. coli genes can be explained in terms of an excess of RNY codons. A deficiency of UUA, CUA and UCA codons (complementary to termination) seems to be restricted to the E. coli genes, and may be due to low abundance of the relevant cognate tRNA species. Thus the analysis shows that the non-coding strand has the properties expected of a sequence complementary to a coding strand, with no indications that it encodes, or may have encoded, proteins.  相似文献   

11.
We have constructed a non-homologous database, termed the Integrated Sequence-Structure Database (ISSD) which comprises the coding sequences of genes, amino acid sequences of the corresponding proteins, their secondary structure and straight phi,psi angles assignments, and polypeptide backbone coordinates. Each protein entry in the database holds the alignment of nucleotide sequence, amino acid sequence and the PDB three-dimensional structure data. The nucleotide and amino acid sequences for each entry are selected on the basis of exact matches of the source organism and cell environment. The current version 1.0 of ISSD is available on the WWW at http://www.protein.bio.msu.su/issd/ and includes 107 non-homologous mammalian proteins, of which 80 are human proteins. The database has been used by us for the analysis of synonymous codon usage patterns in mRNA sequences showing their correlation with the three-dimensional structure features in the encoded proteins. Possible ISSD applications include optimisation of protein expression, improvement of the protein structure prediction accuracy, and analysis of evolutionary aspects of the nucleotide sequence-protein structure relationship.  相似文献   

12.
To reveal how the AT-rich genome of bacteriophage PhiKZ has been shaped in order to carry out its growth in the GC-rich host Pseudomonas aeruginosa, synonymous codon and amino acid usage bias of PhiKZ was investigated and the data were compared with those of P. aeruginosa. It was found that synonymous codon and amino acid usage of PhiKZ was distinct from that of P. aeruginosa. In contrast to P. aeruginosa, the third codon position of the synonymous codons of PhiKZ carries mostly an A or T base; codon usage bias in PhiKZ is dictated mainly by mutational bias and, to a lesser extent, by translational selection. A cluster analysis of the relative synonymous codon usage values of 16 myoviruses including PhiKZ shows that PhiKZ is evolutionarily much closer to Escherichia coli phage T4. Further analysis reveals that the three factors of mean molecular weight, aromaticity and cysteine content are mostly responsible for the variation of amino acid usage in PhiKZ proteins, whereas amino acid usage of P. aeruginosa proteins is mainly governed by grand average of hydropathicity, aromaticity and cysteine content. Based on these observations, we suggest that codons of the phage-like PhiKZ have evolved to preferentially incorporate the smaller amino acid residues into their proteins during translation, thereby economizing the cost of its development in GC-rich P. aeruginosa.

13.
To understand the synonymous codon usage pattern in the mitochondrial genome of Antheraea assamensis, we analyzed the 13 mitochondrial protein‐coding genes of this species using a bioinformatic approach, as no such work has been reported yet. The nucleotide composition analysis suggested that the percentages of A, T, G, and C were 33.73, 46.39, 9.7 and 10.17, respectively, and the overall GC content was 19.86%, that is, lower than 50%, so the genes were AT rich. The mean effective number of codons of the mitochondrial protein‐coding genes was 36.30, which indicated low codon usage bias (CUB). Relative synonymous codon usage analysis suggested overrepresented and underrepresented codons in each gene, and the pattern of codon usage differed among genes. Neutrality plot analysis revealed a narrow range of distribution for GC content at the third codon position, and some points were diagonally distributed, suggesting that both mutation pressure and natural selection influenced the CUB.
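
The composition quantities quoted in this abstract can be computed with a short sketch like the one below. It assumes a non-empty DNA coding sequence supplied as a plain string in reading frame; the function names are illustrative and not taken from the paper. As a check on the reported numbers, the stated G and C percentages (9.7 and 10.17) sum to 19.87, so the quoted GC content of 19.86 presumably reflects rounding of the individual values.

```python
def base_composition(seq):
    """Percentage of A, T, G and C among the unambiguous bases of seq."""
    counted = [b for b in seq.upper() if b in "ATGC"]
    return {b: 100.0 * counted.count(b) / len(counted) for b in "ATGC"}

def gc_content(seq):
    """Overall G + C percentage."""
    comp = base_composition(seq)
    return comp["G"] + comp["C"]

def gc3(seq):
    """G + C percentage at the third position of each complete codon."""
    seq = seq.upper()
    thirds = [seq[i + 2] for i in range(0, len(seq) - 2, 3)]
    return 100.0 * sum(b in "GC" for b in thirds) / len(thirds)
```

A neutrality plot of the kind mentioned in the abstract would then compare, gene by gene, the GC content at the first two codon positions against the value returned by `gc3`.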

14.
Summary Ubiquitin is ubiquitous in all eukaryotes and its amino acid sequence shows extreme conservation. Ubiquitin genes comprise direct repeats of the ubiquitin coding unit with no spacers. The nucleotide sequences coding for 13 ubiquitin genes from 11 species reported so far have been compiled and analyzed. The G+C content of codon third base reveals a positive linear correlation with the genome G+C content of the corresponding species. The slope strongly suggests that the overall G+C content of codons of polyubiquitin genes clearly reflects the genome G+C content by AT/GC substitutions at the codon third position. The G+C content of ubiquitin codon third base also shows a positive linear correlation with the overall G+C content of coding regions of compiled genes, indicating the codon choices among synonymous codons reflect the average codon usage pattern of corresponding species. On the other hand, the monoubiquitin gene, which is different from the polyubiquitin gene in gene organization, gene expression, and function of the encoding protein, shows a different codon usage pattern compared with that of the polyubiquitin gene. From comparisons of the levels of synonymous substitutions among ubiquitin repeats and the homology of the amino acid sequence of the tail of monomeric ubiquitin genes, we propose that the molecular evolution of ubiquitin genes occurred as follows: Plural primitive ubiquitin sequences were dispersed on genome in ancestral eukaryotes. Some of them situated in a particular environment fused with the tail sequence to produce monomeric ubiquitin genes that were maintained across species. After divergence of species, polyubiquitin genes were formed by duplication of the other primitive ubiquitin sequences on different chromosomes. Differences in the environments in which ubiquitin genes are embedded reflect the differences in codon choice and in gene expression pattern between poly- and monomeric ubiquitin genes.  相似文献   

15.
Codon usage and tRNA content in unicellular and multicellular organisms   Total citations: 129, self-citations: 17, citations by others: 112
Choices of synonymous codons in unicellular organisms are here reviewed, and differences in synonymous codon usages between Escherichia coli and the yeast Saccharomyces cerevisiae are attributed to differences in the actual populations of isoaccepting tRNAs. There exists a strong positive correlation between codon usage and tRNA content in both organisms, and the extent of this correlation relates to the protein production levels of individual genes. Codon-choice patterns are believed to have been well conserved during the course of evolution. Examination of silent substitutions and tRNA populations in Enterobacteriaceae revealed that the evolutionary constraint imposed by tRNA content on codon usage decelerated rather than accelerated the silent-substitution rate, at least insofar as pairs of taxonomically related organisms were examined. Codon-choice patterns of multicellular organisms are briefly reviewed, and diversity in G+C percentage at the third position of codons in vertebrate genes--as well as a possible causative factor in the production of this diversity--is discussed.   相似文献   

16.
The frequencies of occurrence of nucleotides at the 5' side of codons have been determined in highly and weakly expressed genes from E. coli. Significant constraints on the nucleotide 5' to some codons were found in highly expressed genes. Certain rules of synonymous codon usage depending on the amino acid 3' of the codon were established. For example, codons possessing guanosine in the third position (NNG) are preferred over NNA if the next amino acid is lysine (P < 10^-5). In addition, rules of synonymous codon usage in relation to the 5' flanking nucleotide were found. For example, when coding for aspartic acid, the GAC codon is preferred over GAU (P < 0.001) if uridine is 5' to the codon; conversely, GAU is favoured (P < 0.0001) if guanosine is at the 5' side of the aspartic acid codon. These rules can be used in the chemical synthesis of genes designed for expression in E. coli.

17.
The relationship between synonymous codon usage and different protein secondary structural classes was investigated using 401 Homo sapiens proteins extracted from the Protein Data Bank (PDB). A simple chi-square test was used to assess the significance of the deviation between the observed and expected frequencies of 59 codons at the level of individual synonymous families in the four different protein secondary structural classes. It was observed that synonymous codon families show non-randomness in codon usage in the four different secondary structural classes. However, when the genes were classified according to their GC3 levels, non-randomness increased in the high-GC3 group of genes. The non-randomness in codon usage was further tested among the same protein secondary structures belonging to four different protein folding classes of the high-GC3 group of genes. The results show that in each protein secondary structural unit there exist some synonymous families that show a class-specific codon usage pattern. Moreover, there is increased non-random behaviour of synonymous codons in the sheet structure of all secondary structural classes in the high-GC3 group of genes. Biological implications of these results are discussed.

18.
In the present study, major constraints on codon and amino acid usage of Sulfolobus acidocaldarius, Sulfolobus solfataricus, Sulfolobus tokodaii, Sulfolobus islandicus and 6 other isolates of the islandicus species of the genus Sulfolobus were investigated. Correspondence analysis revealed a highly significant correlation between the major trend of synonymous codon usage and gene expression level, as assessed by the “Codon Adaptation Index” (CAI). There is a significant negative correlation between Nc (effective number of codons) and CAI, demonstrating the role of codon bias as an important determinant of codon usage. The significant correlation between the major trend of synonymous codon usage and GC3s (G + C at the third synonymous position) indicated a dominant role of mutational bias in the codon usage pattern. The result was further supported by SCUO (synonymous codon usage order) analysis. The amino acid usage was found to be significantly influenced by the aromaticity and hydrophobicity of proteins. However, translational selection, which causes a preference for codons that are most rapidly translated by current tRNAs with multiple copy numbers, was not found to be highly dominant for all studied isolates. Notably, 26 codons were found to be optimally used by genes of S. acidocaldarius at higher expression levels, and their comparative analysis with 9 other isolates may provide some useful clues for further in vivo genetic studies on this genus.

19.
In many unicellular organisms, invertebrates, and plants, synonymous codon usage biases result from a coadaptation between codon usage and tRNAs abundance to optimize the efficiency of protein synthesis. However, it remains unclear whether natural selection acts at the level of the speed or the accuracy of mRNAs translation. Here we show that codon usage can improve the fidelity of protein synthesis in multicellular species. As predicted by the model of selection for translational accuracy, we find that the frequency of codons optimal for translation is significantly higher at codons encoding for conserved amino acids than at codons encoding for nonconserved amino acids in 548 genes compared between Caenorhabditis elegans and Homo sapiens. Although this model predicts that codon bias correlates positively with gene length, a negative correlation between codon bias and gene length has been observed in eukaryotes. This suggests that selection for fidelity of protein synthesis is not the main factor responsible for codon biases. The relationship between codon bias and gene length remains unexplained. Exploring the differences in gene expression process in eukaryotes and prokaryotes should provide new insights to understand this key question of codon usage. Received: 18 June 2000 / Accepted: 10 November 2000  相似文献   

20.
Base composition, codon usages and amino acid usages have been analyzed by taking 529 orthologous sequences of Aquifex aeolicus and Bacillus subtilis, having different optimal growth temperatures. These two bacteria do not have significant difference in overall GC composition, but GC(1+2) and GC3 levels were found to vary significantly. Significant increments in purine content and GC3 composition have been observed in the coding sequences of Aquifex aeolicus than its Bacillus subtilis counterparts. Correspondence analyses on codon and amino acid usages reveal that variation in base composition actually influences their codon and amino acid usages. Two selection pressures acting on the nucleotide level (GC3 and purine enrichment), causes variation in the amino acid usage differently in different protein secondary structures. Our results suggest that adaptation of amino acid usages in coil structure of Aquifex aeolicus proteins is under the control of both purine increment and GC3 composition, whereas the adaptation of the amino acids in the helical region of thermophilic bacteria is strongly influenced by the purine content. Evolutionary perspectives concerning the temperature adaptation of DNA and protein molecules of these two bacteria have been discussed on the basis of these results.  相似文献   
