首页 | 本学科首页   官方微博 | 高级检索  
相似文献
 共查询到20条相似文献,搜索用时 546 毫秒
1.
2.
In this study codon usage bias of all experimentally known genes of Lactococcus lactis has been analyzed. Since Lactococcus lactis is an AT rich organism, it is expected to occur A and/or T at the third position of codons and detailed analysis of overall codon usage data indicates that A and/or T ending codons are predominant in this organism. However, multivariate statistical analyses based both on codon count and on relative synonymous codon usage (RSCU) detect a large number of genes, which are supposed to be highly expressed are clustered at one end of the first major axis, while majority of the putatively lowly expressed genes are clustered at the other end of the first major axis. It was observed that in the highly expressed genes C and T ending codons are significantly higher than the lowly expressed genes and also it was observed that C ending codons are predominant in the duets of highly expressed genes, whereas the T endings codons are abundant in the quartets. Abundance of C and T ending codons in the highly expressed genes suggest that, besides, compositional biases, translational selection are also operating in shaping the codon usage variation among the genes in this organism as observed in other compositionally skewed organisms. The second major axis generated by correspondence analysis on simple codon counts differentiates the genes into two distinct groups according to their hydrophobicity values, but the same analysis computed with relative synonymous codon usage values could not discriminate the genes according to the hydropathy values. This suggests that amino acid composition exerts constraints on codon usage in this organism. On the other hand the second major axis produced by correspondence analysis on RSCU values differentiates the genes into two groups according to the synonymous codon usage for cysteine residues (rarest amino acids in this organism), which is nothing but a artifactual effect induced by the RSCU values. Other factors such as length of the genes and the positions of the genes in the leading and lagging strand of replication have practically no influence in the codon usage variation among the genes in this organism.  相似文献   

3.
Studies on codon usage in Entamoeba histolytica   总被引:13,自引:0,他引:13  
Codon usage bias of Entamoeba histolytica, a protozoan parasite, was investigated using the available DNA sequence data. Entamoeba histolytica having AT rich genome, is expected to have A and/or T at the third position of codons. Overall codon usage data analysis indicates that A and/or T ending codons are strongly biased in the coding region of this organism. However, multivariate statistical analysis suggests that there is a single major trend in codon usage variation among the genes. The genes which are supposed to be highly expressed are clustered at one end, while the majority of the putatively lowly expressed genes are clustered at the other end. The codon usage pattern is distinctly different in these two sets of genes. C ending codons are significantly higher in the putatively highly expressed genes suggesting that C ending codons are translationally optimal in this organism. In the putatively lowly expressed genes A and/or T ending codons are predominant, which suggests that compositional constraints are playing the major role in shaping codon usage variation among the lowly expressed genes. These results suggest that both mutational bias and translational selection are operational in the codon usage variation in this organism.  相似文献   

4.
Compositional distributions in three different codon positions as well as codon usage biases of all available DNA sequences of Buchnera aphidicola genome have been analyzed. It was observed that GC levels among the three codon positions is I>II>III as observed in other extremely high AT rich organisms. B. aphidicola being an AT rich organism is expected to have A and/or T at the third positions of codons. Overall codon usage analyses indicate that A and/or T ending codons are predominant in this organism and some particular amino acids are abundant in the coding region of genes. However, multivariate statistical analysis indicates two major trends in the codon usage variation among the genes; one being strongly correlated with the GC contents at the third synonymous positions of codons, and the other being associated with the expression level of genes. Moreover, codon usage biases of the highly expressed genes are almost identical with the overall codon usage biases of all the genes of this organism. These observations suggest that mutational bias is the main factor in determining the codon usage variation among the genes in B. aphidicola.  相似文献   

5.
Biased usage of synonymous codons has been elucidated under the perspective of cellular tRNA abundance for quite a long time now. Taking advantage of publicly available gene expression data for Saccharomyces cerevisiae, a systematic analysis of the codon and amino acid usages in two different coding regions corresponding to the regular (helix and strand) as well as the irregular (coil) protein secondary structures, have been performed. Our analyses suggest that apart from tRNA abundance, mRNA folding stability is another major evolutionary force in shaping the codon and amino acid usage differences between the highly and lowly expressed genes in S. cerevisiae genome and surprisingly it depends on the coding regions corresponding to the secondary structures of the encoded proteins. This is obviously a new paradigm in understanding the codon usage in S. cerevisiae. Differential amino acid usage between highly and lowly expressed genes in the regions coding for the irregular protein secondary structure in S. cerevisiae is expounded by the stability of the mRNA folded structure. Irrespective of the protein secondary structural type, the highly expressed genes always tend to encode cheaper amino acids in order to reduce the overall biosynthetic cost of production of the corresponding protein. This study supports the hypothesis that the tRNA abundance is a consequence of and not a reason for the biased usage of amino acid between highly and lowly expressed genes.  相似文献   

6.
Analysis of synonymous codon usage pattern in the genome of a thermophilic cyanobacterium, Thermosynechococcus elongatus BP-1 using multivariate statistical analysis revealed a single major explanatory axis accounting for codon usage variation in the organism. This axis is correlated with the GC content at third base of synonymous codons (GC3s) in correspondence analysis taking T. elongatus genes. A negative correlation was observed between effective number of codons i.e. Nc and GC3s. Results suggested a mutational bias as the major factor in shaping codon usage in this cyanobacterium. In comparison to the lowly expressed genes, highly expressed genes of this organism possess significantly higher proportion of pyrimidine-ending codons suggesting that besides, mutational bias, translational selection also influenced codon usage variation in T. elongatus. Correspondence analysis of relative synonymous codon usage (RSCU) with A, T, G, C at third positions (A3s, T3s, G3s, C3s, respectively) also supported this fact and expression levels of genes and gene length also influenced codon usage. A role of translational accuracy was identified in dictating the codon usage variation of this genome. Results indicated that although mutational bias is the major factor in shaping codon usage in T. elongatus, factors like translational selection, translational accuracy and gene expression level also influenced codon usage variation.  相似文献   

7.
Multivariate analysis of codon and amino acid usage was performed for three Leishmania species, including L. donovani, L. infantum and L. major. It was revealed that all three species are under mutational bias and translational selection. Lower GC 12 and higher GC 3S in all three parasites suggests that the ancestral highly expressed genes (HEGs), compared to lowly expressed genes (LEGs), might have been rich in AT-content. This also suggests that there must have been a faster rate of evolution under GC-bias in LEGs. It was observed from the estimation of synonymous/non-synonymous substitutions in HEGs that the HEG dataset of L. donovani is much closer to L. major evolutionarily. This is also supported by the higher d N value as compared to d S between L. donovani and L. major, suggesting the conservation of synonymous codon positions between these two species and the role of translational selection in shaping the composition of protein-coding genes.  相似文献   

8.
Base composition, codon usages and amino acid usages have been analyzed by taking 529 orthologous sequences of Aquifex aeolicus and Bacillus subtilis, having different optimal growth temperatures. These two bacteria do not have significant difference in overall GC composition, but GC(1+2) and GC3 levels were found to vary significantly. Significant increments in purine content and GC3 composition have been observed in the coding sequences of Aquifex aeolicus than its Bacillus subtilis counterparts. Correspondence analyses on codon and amino acid usages reveal that variation in base composition actually influences their codon and amino acid usages. Two selection pressures acting on the nucleotide level (GC3 and purine enrichment), causes variation in the amino acid usage differently in different protein secondary structures. Our results suggest that adaptation of amino acid usages in coil structure of Aquifex aeolicus proteins is under the control of both purine increment and GC3 composition, whereas the adaptation of the amino acids in the helical region of thermophilic bacteria is strongly influenced by the purine content. Evolutionary perspectives concerning the temperature adaptation of DNA and protein molecules of these two bacteria have been discussed on the basis of these results.  相似文献   

9.
Genes involved in the symbiotic interactions between the nitrogen-fixing endosymbiont Bradyrhizobium japonicum, and its leguminous host are mostly clustered in a symbiotic island (SI), acquired by the bacterium through a process of horizontal transfer. A comparative analysis of the codon and amino acid usage in core and SI genes/proteins of B. japonicum has been carried out in the present study. The mutational bias, translational selection, and gene length are found to be the major sources of variation in synonymous codon usage in the core genome as well as in SI, the strength of translational selection being higher in core genes than in SI. In core proteins, hydrophobicity is the main source of variation in amino acid usage, expressivity and aromaticity being the second and third important sources. But in SI proteins, aromaticity is the chief source of variation, followed by expressivity and hydrophobicity. In SI proteins, both the mean molecular weight and mean aromaticity of individual proteins exhibit significant positive correlation with gene expressivity, which violate the cost-minimization hypothesis. Investigation of nucleotide substitution patterns in B. japonicum and Mesorhizobium loti orthologous genes reveals that both synonymous and non-synonymous sites of highly expressed genes are more conserved than their lowly expressed counterparts and this conservation is more pronounced in the genes present in core genome than in SI.  相似文献   

10.
To study the possible codon usage and base composition variation in the bacteriophages, fourteen mycobacteriophages were used as a model system here and both the parameters in all these phages and their plating bacteria, M. smegmatis had been determined and compared. As all the organisms are GC-rich, the GC contents at third codon positions were found in fact higher than the second codon positions as well as the first + second codon positions in all the organisms indicating that directional mutational pressure is strongly operative at the synonymous third codon positions. Nc plot indicates that codon usage variation in all these organisms are governed by the forces other than compositional constraints. Correspondence analysis suggests that: (i) there are codon usage variation among the genes and genomes of the fourteen mycobacteriophages and M. smegmatis, i.e., codon usage patterns in the mycobacteriophages is phage-specific but not the M. smegmatis-specific; (ii) synonymous codon usage patterns of Barnyard, Che8, Che9d, and Omega are more similar than the rest mycobacteriophages and M. smegmatis; (iii) codon usage bias in the mycobacteriophages are mainly determined by mutational pressure; and (iv) the genes of comparatively GC rich genomes are more biased than the GC poor genomes. Translational selection in determining the codon usage variation in highly expressed genes can be invoked from the predominant occurrences of C ending codons in the highly expressed genes. Cluster analysis based on codon usage data also shows that there are two distinct branches for the fourteen mycobacteriophages and there is codon usage variation even among the phages of each branch.  相似文献   

11.
Burkholderia pseudomallei is a recognized biothreat agent and the causative agent of melioidosis. Codon usage biases of all protein-coding genes (length greater than or equal to 300 bp) from the complete genome of B. pseudomallei K96243 have been analyzed. As B. pseudomallei is a GC-rich organism (68.5%), overall codon usage data analysis indicates that indeed codons ending in G and/or C are predominant in this organism. But multivariate statistical analysis indicates that there is a single major trend in the codon usage variation among the genes in this organism, which has a strong positively correlation with the expressivities of the genes. The majority of the lowly expressed genes are scattered towards the negative end of the major axis whereas the highly expressed genes are clustered towards the positive end. At the same time, from the results that there were two significant correlations between axis 1 coordinates and the GC, GC3s content at silent sites of each sequence, and clearly significant negatively correlations between the ‘Effective Number of Codons’ values and GC, GC3s content, we inferred that codon usage bias was affected by gene nucleotide composition also. In addition, some other factors such as the lengths of the genes as well as the hydrophobicity of genes also influence the codon usage variation among the genes in this organism in a minor way. At the same time, notably, 21 codons have been defined as ‘optimal codons’ of the B. pseudomallei. In summary, our work have provided a basic understanding of the mechanisms for codon usage bias and some more useful information for improving the expression of target genes in vivo and in vitro. Sheng Zhao and Qin Zhang contributed equally to this work.  相似文献   

12.
13.
Gupta SK  Ghosh TC 《Gene》2001,273(1):63-70
Codon usage biases of all DNA sequences (length greater than or equal to 300 bp) from the complete genome of Pseudomonas aeruginosa have been analyzed. As P. aeruginosa is a GC-rich organism, G and/or C are expected to predominate in their codons. Overall codon usage data analysis indicates that indeed codons ending in G and/or C are predominant in this organism. But multivariate statistical analysis indicates that there is a single major trend in the codon usage variation among the genes in this organism, which has a strong negative correlation with the expressivities of the genes. The majority of the lowly expressed genes are scattered towards the positive end of the major axis whereas the highly expressed genes are clustered towards the negative end. This is the first report where the prokaryotic organism having highly skewed base composition is dictated mainly by translational selection, though some other factors such as the lengths of the genes as well as the hydrophobicity of genes also influence the codon usage variation among the genes in this organism in a minor way.  相似文献   

14.
15.
The extent of codon usage in the protein coding genes of the mycobacteriophage, Bxz1, and its plating bacteria, M. smegmatis, were determined, and it was observed that the codons ending with either G and / or C were predominant in both the organisms. Multivariate statistical analysis showed that in both organisms, the genes were separated along the first major explanatory axis according to their expression levels and their genomic GC content at the synonymous third positions of the codons. The second major explanatory axis differentiates the genes according to their genome type. A comparison of the relative synonymous codon usage between 20 highly- and 20 lowly expressed genes from Bxz1 identified 21 codons, which are statistically over represented in the former group of genes. Further analysis found that the Bxz1- specific tRNA species could recognize 13 out of the 21 over represented synonymous codons, which incorporated 13 amino acid residues preferentially into the highly expressed proteins of Bxz1. In contrast, seven amino acid residues were preferentially incorporated into the lowly expressed proteins by 10 other tRNA species of Bxz1. This analysis predicts for the first time that the Bxz1-specific tRNA species modulates the optimal expression of its proteins during development.  相似文献   

16.
Lightfield J  Fram NR  Ely B 《PloS one》2011,6(3):e17677
The GC content of bacterial genomes ranges from 16% to 75% and wide ranges of genomic GC content are observed within many bacterial phyla, including both gram negative and gram positive phyla. Thus, divergent genomic GC content has evolved repeatedly in widely separated bacterial taxa. Since genomic GC content influences codon usage, we examined codon usage patterns and predicted protein amino acid content as a function of genomic GC content within eight different phyla or classes of bacteria. We found that similar patterns of codon usage and protein amino acid content have evolved independently in all eight groups of bacteria. For example, in each group, use of amino acids encoded by GC-rich codons increased by approximately 1% for each 10% increase in genomic GC content, while the use of amino acids encoded by AT-rich codons decreased by a similar amount. This consistency within every phylum and class studied led us to conclude that GC content appears to be the primary determinant of the codon and amino acid usage patterns observed in bacterial genomes. These results also indicate that selection for translational efficiency of highly expressed genes is constrained by the genomic parameters associated with the GC content of the host genome.  相似文献   

17.
Amino acids are utilized with different frequencies both among species and among genes within the same genome. Up to date, no study on the amino acid usage pattern of chicken has been performed. In the present study, we carried out a systematic examination of the amino acid usage in the chicken proteome. Our data indicated that the relative amino acid usage is positively correlated with the tRNA gene copy number. GC contents, including GC1, GC2, GC3, GC content of CDS and GC content of the introns, were correlated with the most of the amino acid usage, especially for GC rich and GC poor amino acids, however, multiple linear regression analyses indicated that only approximately 10–40% variation of amino acid usage can be explained by GC content for GC rich and GC poor amino acids. For other intermediate GC content amino acids, only approximately 10% variation can be explained. Correspondence analyses demonstrated that the main factors responsible for the variation of amino acid usage in chicken are hydrophobicity, aromaticity and genomic GC content. Gene expression level also influenced the amino acid usage significantly. We argued that the amino acid usage of chicken proteome likely reflects a balance or near balance between the action of selection, mutation, and genetic drift.  相似文献   

18.
The genomic as well as structural relationship of phycobiliproteins (PBPs) in different cyanobacterial species are determined by nucleotides as well as amino acid composition. The genomic GC constituents influence the amino acid variability and codon usage of particular subunit of PBPs. We have analyzed 11 cyanobacterial species to explore the variation of amino acids and causal relationship between GC constituents and codon usage. The study at the first, second and third levels of GC content showed relatively more amino acid variability on the levels of G3 + C3 position in comparison to the first and second positions. The amino acid encoded GC rich level including G rich and C rich or both correlate the codon variability and amino acid availability. The fluctuation in amino acids such as Arg, Ala, His, Asp, Gly, Leu and Glu in α and β subunits was observed at G1C1 position; however, fluctuation in other amino acids such as Ser, Thr, Cys and Trp was observed at G2C2 position. The coding selection pressure of amino acids such as Ala, Thr, Tyr, Asp, Gly, Ile, Leu, Asn, and Ser in α and β subunits of PBPs was more elaborated at G3C3 position. In this study, we observed that each subunit of PBPs is codon specific for particular amino acid. These results suggest that genomic constraint linked with GC constituents selects the codon for particular amino acids and furthermore, the codon level study may be a novel approach to explore many problems associated with genomics and proteomics of cyanobacteria.  相似文献   

19.
The compositional non-randomness was studied in genes of Saccharomyces cerevisiae and Schizosaccharomyces pombe. In both species, codon usage is well correlated with expressivity (measured as the codon adaptation index). Both species generally display higher nucleotide non-randomness in the group of highly expressed genes than in the lowly expressed genes. The highly expressed genes in both species are furthermore characterized by marked peaks in non-randomness at N=3 upstream of start codons, N=2 downstream of start codons and at N=1 and N=7 downstream of stop codons, indicating that these nucleotides may be key elements in translational regulation. Intragenic variation in codon usage was also observed to be linked to expressivity. It is suggested that the firm link between expressivity and codon usage calls for codon optimization. Based on bioinformatic calculations, examples of proteins are given for which codon optimizations might be relevant.  相似文献   

20.
Abstract

Genes involved in the symbiotic interactions between the nitrogen-fixing endosymbiont Bradyrhizobium japonicum, and its leguminous host are mostly clustered in a symbiotic island (SI), acquired by the bacterium through a process of horizontal transfer. A comparative analysis of the codon and amino acid usage in core and SI genes/proteins of B. japonicum has been carried out in the present study. The mutational bias, translational selection, and gene length are found to be the major sources of variation in synonymous codon usage in the core genome as well as in SI, the strength of translational selection being higher in core genes than in SI. In core proteins, hydrophobicity is the main source of variation in amino acid usage, expressivity and aromaticity being the second and third important sources. But in SI proteins, aromaticity is the chief source of variation, followed by expressivity and hydrophobicity. In SI proteins, both the mean molecular weight and mean aromaticity of individual proteins exhibit significant positive correlation with gene expressivity, which violate the cost-minimization hypothesis. Investigation of nucleotide substitution patterns in B. japonicum and Mesorhizobium loti orthologous genes reveals that both synonymous and non-synonymous sites of highly expressed genes are more conserved than their lowly expressed counterparts and this conservation is more pronounced in the genes present in core genome than in SI.  相似文献   

设为首页 | 免责声明 | 关于勤云 | 加入收藏

Copyright©北京勤云科技发展有限公司  京ICP备09084417号