Similar articles
 20 similar articles found (search time: 15 ms)
1.
Highly expressed genes in any species differ in the usage frequency of synonymous codons. The relative frequency of the favored codon pair (amino acid pair) varies between genes and genomes because of differences in gene expression and base composition. Here we propose a new measure for predicting gene expression level, the codon plus amino bias index (CABI). Our approach is based on the relative bias of the favored codon pair among genes and is illustrated by analyzing the CABI scores of Medicago truncatula genes. CABI correlated strongly with the other widely used measures of gene expression (CAI, RCBS, SCUO). Surprisingly, CABI outperformed all of these measures, showing better correlation with the wet-lab data. This emphasizes the importance of the codons neighboring the favored codon in a synonymous group when estimating the expression level of a gene.

2.
This paper reports on the relationship between the number of silent differences and the codon usage changes in the lineages leading to human and rat. Examination of 102 pairs of homologous genes gives rise to four main conclusions: (1) We have previously demonstrated the existence of a codon usage change (called the minor shift) between human and rat; this was confirmed here with a larger sample. For genes with extreme C+G frequencies, the C+G level in the third codon position is less extreme in rat than in human. (2) Protein similarity and percentage of positive differences are the two main factors that discriminate homologous genes when characterized by differences between rat and human. By definition, positive differences result from silent changes between A or T and C or G with a direction implying a C+G content variation in the same direction as the overall gene variation. (3) For genes showing both codon usage change and low protein similarity, a majority of amino acid replacements contribute to C+G level variation in positions I and II in the same direction as the variation in position III. This is thus a new example of protein evolution due to constraints acting at the DNA level. (4) In heavy isochores (high C+G content) no direct correlation exists between codon usage change (measured by the dissymmetry of differences) and silent dissimilarity. In light isochores the opposite situation is observed: modification of codon usage is associated with a high synonymous dissimilarity. This result shows that, in some cases, modification of constraints acting at the DNA level could accelerate divergence between genomes.

3.
The GC contents of 2670 prokaryotic genomes that belong to diverse phylogenetic lineages were analyzed in this paper. These genomes had GC contents that ranged from 13.5% to 74.9%. We analyzed the distance of base frequencies at the three codon positions, codon frequencies, and amino acid compositions across genomes with respect to the differences in the GC content of these prokaryotic species. We found that although the phylogenetic lineages were remote among some species, a similar genomic GC content forced them to adopt similar base usage patterns at the three codon positions, codon usage patterns, and amino acid usage patterns. Our work demonstrates that in prokaryotic genomes: a) base usage, codon usage, and amino acid usage change with GC content with a linear correlation; b) the distance of each usage has a linear correlation with the GC content difference; and c) GC content is more essential than phylogenetic lineage in determining base usage, codon usage, and amino acid usage. This work is exceptional in that we adopted intuitive graphical methods for all analyses, and we used these analyses to examine as many as 2670 prokaryotes. We hope that this work is helpful for understanding common features in the organization of microbial genomes.
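The base usage "at the three codon positions" referred to above can be illustrated with a minimal Python sketch. The function name `gc_by_codon_position` and the example sequence are hypothetical and are not taken from the paper; the sketch simply counts G and C at positions 1, 2 and 3 of an in-frame coding sequence.

```python
def gc_by_codon_position(cds: str) -> tuple[float, float, float]:
    """Return the GC fraction at codon positions 1, 2 and 3 of an in-frame CDS."""
    seq = cds.upper()
    usable = len(seq) - len(seq) % 3  # ignore a trailing partial codon
    if usable == 0:
        raise ValueError("sequence must contain at least one full codon")
    fractions = []
    for pos in range(3):
        bases = seq[pos:usable:3]  # every third base, starting at this position
        gc = sum(1 for base in bases if base in "GC")
        fractions.append(gc / len(bases))
    return fractions[0], fractions[1], fractions[2]


if __name__ == "__main__":
    # Hypothetical toy sequence: codons ATG, GCC, TAA
    print(gc_by_codon_position("ATGGCCTAA"))
```

Applied per genome, the three fractions could then be compared against overall genomic GC content, which is the kind of relationship the abstract describes as linear.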

4.
Synonymous codon usage has long been known as a factor that affects average expression level of proteins in fast-growing microorganisms, but neither its role in dynamic changes of expression in response to environmental changes nor selective factors shaping it in the genomes of higher eukaryotes have been fully understood. Here, we propose that codon usage is ubiquitously selected to synchronize the translation efficiency with the dynamic alteration of protein expression in response to environmental and physiological changes. Our analysis reveals that codon usage is universally correlated with gene function, suggesting its potential contribution to synchronized regulation of genes with similar functions. We directly show that coexpressed genes have similar synonymous codon usages within the genomes of human, yeast, Caenorhabditis elegans and Escherichia coli. We also demonstrate that perturbing the codon usage directly affects the level or even direction of changes in protein expression in response to environmental stimuli. Perturbing tRNA composition also has tangible phenotypic effects on the cell. By showing that codon usage is universally function-specific, our results expand, to almost all organisms, the notion that cells may need to dynamically alter their intracellular tRNA composition in order to adapt to their new environment or physiological role.

5.
In all, 238 and 155 transfer (t)RNA genes were predicted from the genomes of Phytophthora sojae and P. ramorum, respectively. After omitting pseudogenes and undetermined types of tRNA genes, there remained 208 P. sojae tRNA genes and 140 P. ramorum tRNA genes. There were 45 types of tRNA genes, with distinct anticodons, in each species. Fourteen common anticodon types of tRNAs are missing altogether from the genome in the two species; however, these appear to be compensated by wobbling of other tRNA anticodons in a manner which is tied to the codon bias in Phytophthora genes. The most abundant tRNA class was arginine in both P. sojae and P. ramorum. A codon usage table was generated for these two organisms from a total of 9,803,525 codons in P. sojae and 7,496,598 codons in P. ramorum. The most abundant codon type detected from the codon usage tables was GAG (encoding glutamic acid), whereas the most numerous tRNA gene had a methionine anticodon (CAT). The correlation between the frequencies of tRNA genes and the codon frequencies in protein-coding genes was very low (0.12 in P. sojae and 0.19 in P. ramorum); however, the correlation between amino acid tRNA gene frequency and the corresponding amino acid codon frequency in P. sojae and P. ramorum was substantially higher (0.53 in P. sojae and 0.77 in P. ramorum). The codon usage frequencies of P. sojae and P. ramorum were very strongly correlated (0.99), as were tRNA gene frequencies (0.77). Approximately 60% of orthologous tRNA gene pairs in P. sojae and P. ramorum are located in regions that have conserved synteny in the two species.

6.

Introduction

Genomic base composition ranges from less than 25% AT to more than 85% AT in prokaryotes. Since only a small fraction of a prokaryotic genome is not protein coding, even a minor change in genomic base composition will induce profound protein changes. We examined how amino acid and codon frequencies were distributed in over 2000 microbial genomes and how these distributions were affected by base compositional changes. In addition, we wanted to know how genome-wide amino acid usage was biased in the different genomes and how changes to base composition and mutations affected this bias. To carry this out, we used a Generalized Additive Mixed-effects Model (GAMM) to explore non-linear associations and strong data dependences in closely related microbes; principal component analysis (PCA) was used to examine genomic amino acid and codon frequencies, while the concept of relative entropy was used to analyze genomic mutation rates.

Results

We found that genomic amino acid frequencies carried a stronger phylogenetic signal than codon frequencies, but that this signal was weak compared to that of genomic %AT. Further, in contrast to codon usage bias (CUB), amino acid usage bias (AAUB) was differently distributed in AT- and GC-rich genomes in the sense that AT-rich genomes did not prefer specific amino acids over others to the same extent as GC-rich genomes. AAUB was also associated with relative entropy; genomes with low AAUB contained more random mutations as a consequence of relaxed purifying selection than genomes with higher AAUB.

Conclusion

Genomic base composition has a substantial effect on both amino acid and codon frequencies in bacterial genomes. While phylogeny influenced amino acid usage more in GC-rich genomes, AT content drove amino acid usage in AT-rich genomes. We found GAMM to be an excellent tool for analyzing the genomic data used in this study.
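The abstract does not spell out how relative entropy was computed. As a hedged illustration only, the sketch below uses the standard Kullback-Leibler divergence between an observed symbol distribution and a reference distribution; the function name `relative_entropy` and the toy reference are hypothetical and are not the authors' formulation.

```python
import math
from collections import Counter


def relative_entropy(observed: str, reference: dict[str, float]) -> float:
    """Kullback-Leibler divergence D(P || Q) in bits.

    P is the symbol distribution of `observed`; Q is `reference`
    (assumed to sum to 1 and to be non-zero for every observed symbol).
    """
    counts = Counter(observed)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("observed sequence is empty")
    divergence = 0.0
    for symbol, n in counts.items():
        p = n / total
        q = reference.get(symbol, 0.0)
        if q <= 0.0:
            raise ValueError(f"reference has no mass for symbol {symbol!r}")
        divergence += p * math.log2(p / q)
    return divergence


if __name__ == "__main__":
    # Hypothetical uniform reference over a four-letter toy alphabet
    uniform = {aa: 0.25 for aa in "ACDE"}
    print(relative_entropy("AAAA", uniform))  # strongly biased usage
    print(relative_entropy("ACDE", uniform))  # usage identical to the reference
```

A value of zero means the observed usage matches the reference exactly; larger values indicate stronger departure from it.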

7.
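Because the degeneracy of the genetic code increases the information capacity of genes, synonymous codon usage bias is widespread in the genomes of organisms. Although base changes between synonymous codons do not alter the encoded amino acid, studies of mRNA half-life, the translation efficiency of encoded polypeptides, and the accuracy of peptide chain folding and translation have shown that synonymous codon usage bias exerts its genetic function, to some extent, by fine-tuning the translation machinery. Changes in the rate at which synonymous codons direct tRNA recognition at the ribosome during translation are determined by the specific order of amino acids, and during synthesis of the nascent polypeptide chain the co-translational translocation mechanism simultaneously regulates its correct folding, thereby ensuring the normal biological function of the protein. Certain synonymous codon usage biases correlate significantly with the formation of specific protein structures, and a change in codon usage bias may cause the nascent polypeptide to misfold. Drawing on recent research in this field in China and abroad, this review describes how synonymous codon usage bias performs its biological role in fine-tuning translation.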

8.
The genomes of the spirochaetes Borrelia burgdorferi and Treponema pallidum show strong strand-specific skews in nucleotide composition, with the leading strand in replication being richer in G and T than the lagging strand in both species. This mutation bias results in codon usage and amino acid composition patterns that are significantly different between genes encoded on the two strands, in both species. There are also substantial differences between the species, with T. pallidum having a much higher G+C content than B. burgdorferi. These changes in amino acid and codon compositions represent neutral sequence change that has been caused by strong strand- and species-specific mutation pressures. Genes that have been relocated between the leading and lagging strands since B. burgdorferi and T. pallidum diverged from a common ancestor now show codon and amino acid compositions typical of their current locations. There is no evidence that translational selection operates on codon usage in highly expressed genes in these species, and the primary influence on codon usage is whether a gene is transcribed in the same direction as replication, or opposite to it. The dnaA gene in both species has codon usage patterns distinctive of a lagging strand gene, indicating that the origin of replication lies downstream of this gene, possibly within dnaN. Our findings strongly suggest that gene-finding algorithms that ignore variability within the genome may be flawed.
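Strand-specific skews of this kind are commonly summarized as GC skew, (G - C)/(G + C), and TA skew, (T - A)/(T + A). The abstract does not state which statistic was used, so the following Python sketch is an assumption, and the function name `strand_skews` is hypothetical.

```python
def strand_skews(seq: str) -> tuple[float, float]:
    """Return (GC skew, TA skew) for one strand of a DNA sequence.

    GC skew = (G - C) / (G + C); TA skew = (T - A) / (T + A).
    Positive values mean the strand is richer in G (or T) than in C (or A).
    """
    s = seq.upper()
    g, c, a, t = (s.count(base) for base in "GCAT")
    if g + c == 0 or a + t == 0:
        raise ValueError("sequence needs at least one G/C and one A/T base")
    return (g - c) / (g + c), (t - a) / (t + a)


if __name__ == "__main__":
    # Hypothetical G- and T-rich fragment, as reported for leading strands
    print(strand_skews("GGGCTTTA"))
```

Computed in sliding windows along a chromosome, a sign change in these skews is one conventional indicator of the replication origin or terminus.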

9.
Gu W, Zhou T, Ma J, Sun X, Lu Z. Bio Systems, 2004, 73(2): 89-97
The influence of the silent codon position on protein structure is an interesting but still unclear problem. In this paper, 563 Homo sapiens genes and 417 Escherichia coli genes coding for proteins with four different folding types were analyzed using variance analysis, a multivariate analysis method newly applied to codon usage analysis, to find the correlation between amino acid composition, synonymous codon usage, and protein structure in different organisms. In E. coli, both the amino acid compositions of differently folded proteins and the synonymous codon usage of the gene classes coding for them are significantly different. In H. sapiens, only amino acid composition differs between protein classes. There is no universal correlation between synonymous codon usage and protein structure in these two organisms. Further analysis showed that GC content at the second codon position can distinguish genes coding for differently folded proteins in both organisms.

10.
11.
Lightfield J, Fram NR, Ely B. PLoS One, 2011, 6(3): e17677
The GC content of bacterial genomes ranges from 16% to 75% and wide ranges of genomic GC content are observed within many bacterial phyla, including both Gram-negative and Gram-positive phyla. Thus, divergent genomic GC content has evolved repeatedly in widely separated bacterial taxa. Since genomic GC content influences codon usage, we examined codon usage patterns and predicted protein amino acid content as a function of genomic GC content within eight different phyla or classes of bacteria. We found that similar patterns of codon usage and protein amino acid content have evolved independently in all eight groups of bacteria. For example, in each group, use of amino acids encoded by GC-rich codons increased by approximately 1% for each 10% increase in genomic GC content, while the use of amino acids encoded by AT-rich codons decreased by a similar amount. This consistency within every phylum and class studied led us to conclude that GC content appears to be the primary determinant of the codon and amino acid usage patterns observed in bacterial genomes. These results also indicate that selection for translational efficiency of highly expressed genes is constrained by the genomic parameters associated with the GC content of the host genome.

12.
Singer GA, Hickey DA. Gene, 2003, 317(1-2): 39-47
A number of recent studies have shown that thermophilic prokaryotes have distinguishable patterns of both synonymous codon usage and amino acid composition, indicating the action of natural selection related to thermophily. On the other hand, several other studies of whole genomes have illustrated that nucleotide bias can have dramatic effects on synonymous codon usage and also on the amino acid composition of the encoded proteins. This raises the possibility that the thermophile-specific patterns observed at both the codon and protein levels are merely reflections of a single underlying effect at the level of nucleotide composition. Moreover, such an effect at the nucleotide level might be due entirely to mutational bias. In this study, we have compared the genomes of thermophiles and mesophiles at three levels: nucleotide content, codon usage and amino acid composition. Our results indicate that the genomes of thermophiles are distinguishable from mesophiles at all three levels and that the codon and amino acid frequency differences cannot be explained simply by the patterns of nucleotide composition. At the nucleotide level, we see a consistent tendency for the frequency of adenine to increase at all codon positions within the thermophiles. Thermophiles are also distinguished by their pattern of synonymous codon usage for several amino acids, particularly arginine and isoleucine. At the protein level, the most dramatic effect is a two-fold decrease in the frequency of glutamine residues among thermophiles. These results indicate that adaptation to growth at high temperature requires a coordinated set of evolutionary changes affecting (i) mRNA thermostability, (ii) stability of codon-anticodon interactions and (iii) increased thermostability of the protein products. We conclude that elevated growth temperature imposes selective constraints at all three molecular levels: nucleotide content, codon usage and amino acid composition. 
In addition to these multiple selective effects, however, the genomes of both thermophiles and mesophiles are often subject to superimposed large changes in composition due to mutational bias.

13.
Few quantitative measures of genome architecture or organization exist to support assumptions of differences between microorganisms that are broadly defined as being free-living or pathogenic. General principles about complete proteomes exist for codon usage, amino acid biases and essential or core genes. Genome-wide shifts in amino acid usage between free-living and pathogenic microorganisms result in fundamental differences in the complexity of their respective proteomes that are independent of genome size and gene content. These differences are evident across broad phylogenetic groups, a result of environmental factors and population genetic forces rather than phylogenetic distance. A novel comparative analysis of amino acid usage, utilizing linguistic analyses of word frequency in language and text, identified a global pattern of higher peptide word repetition in 376 free-living versus 421 pathogen genomes across broad ranges of genome size, G+C content and phylogenetic ancestry. This imprint of repetitive word usage indicates that free-living microorganisms have a bias for repetitive sequence usage compared to pathogens. These findings quantify fundamental differences in microbial genomes relative to life-history function.

14.
15.
Phylogenetic analyses of first and second codon positions (DNA1 + 2 analysis) and amino acid sequences (protein analysis) are often thought to provide similar estimates of deep-level phylogeny. However, here we report a novel artifact influencing DNA level phylogenetic inference of protein-coding genes introduced by codon usage heterogeneity that causes significant incongruities between DNA1 + 2 and protein analyses. DNA1 + 2 analyses of plastid-encoded psbA genes (encoding of photosystem II D1 proteins) strongly suggest a relationship between haptophyte plastids and typical (peridinin-containing) dinoflagellate plastids. The psbA genes from haptophytes and a subset of the peridinin-type plastids display similar codon usage patterns for Leu, Ser, and Arg, which are each encoded by two separated codon sets that differ at first or first plus second codon positions. Our detailed analyses clearly indicate that these unusual preferences shared by haptophyte and some peridinin-type plastid genes are largely responsible for their strong affinity in DNA analyses. In particular, almost all of the support from DNA level analyses for the monophyly of haptophyte and peridinin-type plastids is lost when the codons corresponding to constant Leu, Ser, and Arg amino acids are excluded, suggesting that this signal comes from rapidly evolving synonymous substitutions, rather than from substitutions that result in amino acid changes. Indeed, protein maximum-likelihood analyses of concatenated PsaA and PsbA amino acid sequences indicate that, although 19' hexanoyloxyfucoxanthin-type (19' HNOF-type) plastids in dinoflagellates group with haptophyte plastids, peridinin-type plastids group weakly with those of stramenopiles. Consequently our results cast doubt on the single origin of peridinin-type and 19' HNOF-type plastids in dinoflagellates previously suggested on the basis of psaA and psbA concatenated gene phylogenetic analyses. 
We suggest that codon usage heterogeneity could be a more general problem for DNA level analyses of protein-coding genes, even when third codon positions are excluded.

16.
The codon usage in the Vibrio cholerae genome is analyzed in this paper. Although there are many more genes on chromosome 1 than on chromosome 2, the codon usage patterns of genes on the two chromosomes are quite similar, indicating that the two chromosomes may have coexisted in the same cell for a very long time. Unlike the base frequency pattern observed in other genomes, the G+C content at the third codon position of the V. cholerae genome varies within a rather small interval. The most notable feature of codon usage in the V. cholerae genome is that a fraction of genes show significant bias in base choice at the second codon position. The 2,006 known genes can be classified into two clusters according to the base frequencies at this position. The smaller cluster contains 227 genes, most of which code for proteins involved in transport and binding functions. The products encoded by these genes have a significant bias in amino acid composition compared with other genes. The codon usage patterns of the 1,836 ORFs of unknown function are also analyzed, which is useful for studying their functions.

17.
Heger A, Ponting CP. Genetics, 2007, 177(3): 1337-1348
Codon usage bias in Drosophila melanogaster genes has been attributed to negative selection of those codons whose cellular tRNA abundance restricts rates of mRNA translation. Previous studies, which involved limited numbers of genes, can now be compared against analyses of the entire gene complements of 12 Drosophila species whose genome sequences have become available. Using large numbers (6138) of orthologs represented in all 12 species, we establish that the codon preferences of more closely related species are better correlated. Differences between codon usage biases are attributed, in part, to changes in mutational biases. These biases are apparent from the strong correlation (r = 0.92, P < 0.001) among these genomes' intronic G + C contents and exonic G + C contents at degenerate third codon positions. To perform a cross-species comparison of selection on codon usage, while accounting for changes in mutational biases, we calibrated each genome in turn using the codon usage bias indices of highly expressed ribosomal protein genes. The strength of translational selection was predicted to have varied between species largely according to their phylogeny, with the D. melanogaster group species exhibiting the strongest degree of selection.

18.
Codon usage bias varies considerably among genomes and even among the genes of a single genome. In eukaryotic organisms, energy production by oxidative phosphorylation (OXPHOS) is the only process under the control of both the nuclear and mitochondrial genomes. Although factors affecting codon usage in a single genome have been studied, this has not been done for cases in which two interacting genomes are involved. Consequently, we investigated whether other factors influence the codon usage of coevolved genes, using Drosophila melanogaster as a model organism. A χ² test showed that the codon counts of nuclear and mitochondrial genes involved in the OXPHOS system differed significantly (χ² = 7945.16, P < 0.01). A plot of the effective number of codons against the GC3s content of nuclear genes showed that few genes lie on the expected curve, indicating that codon usage was random. Correspondence analysis indicated a significant correlation between axis 1 and the codon adaptation index (R = 0.947, P < 0.01) in every nuclear gene sequence. Thus, the codon usage bias of nuclear genes appeared to be affected by translational selection. The correlation between axis 1 coordinates and GC content (R = 0.814, P < 0.01) indicated that the codon usage of nuclear genes was also affected by GC composition. Analysis of mitochondrial genes did not reveal a significant correlation between axis 1 and any parameter. Statistical analyses indicated that the codon usage of both nDNA and mtDNA was subject to context-dependent mutations.
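The "expected curve" in a plot of the effective number of codons (ENC) against GC3s is usually drawn from Wright's approximation, ENC = 2 + s + 29 / (s² + (1 - s)²), where s is GC3s. The abstract does not say which formula the authors used, so treating it as Wright's curve is an assumption; the function name `expected_enc` is hypothetical.

```python
def expected_enc(gc3s: float) -> float:
    """Expected effective number of codons when GC3s alone shapes codon choice.

    Uses Wright's approximation: ENC = 2 + s + 29 / (s^2 + (1 - s)^2).
    """
    if not 0.0 <= gc3s <= 1.0:
        raise ValueError("gc3s must be a fraction between 0 and 1")
    return 2.0 + gc3s + 29.0 / (gc3s ** 2 + (1.0 - gc3s) ** 2)


if __name__ == "__main__":
    # Tabulate the curve at a few GC3s values
    for s in (0.0, 0.25, 0.5, 0.75, 1.0):
        print(f"GC3s={s:.2f}  expected ENC={expected_enc(s):.2f}")
```

Genes whose observed ENC falls well below this curve are conventionally read as being shaped by something other than base composition at synonymous sites.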

19.
Analysis of codon usage patterns is important for understanding the genetic and evolutionary characteristics of genomes. We used bioinformatic approaches to analyze the codon usage bias (CUB) of genes located on the human Y chromosome. The codon bias index (CBI) indicated that the overall extent of codon usage bias was low. Relative synonymous codon usage (RSCU) analysis suggested that approximately half of the 59 synonymous codons were used most frequently and had a T or G at the third codon position. Correspondence analysis (COA) revealed that the codon usage pattern differed among genes. A significant correlation between the effective number of codons (ENC) and various GC contents suggests that both mutation pressure and natural selection affect the codon usage pattern of genes located on the human Y chromosome. In addition, Y-linked genes differ significantly in GC content at the second and third codon positions, in expression level, and in the usage pattern of some codons, like the SPANX genes on the X chromosome.
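RSCU is conventionally defined as the observed count of a codon divided by the mean count of all codons that encode the same amino acid, so a value above 1 marks a codon used more often than expected under equal usage. The Python sketch below follows that standard definition; the function name `rscu`, the `families` argument and the toy sequence are hypothetical, and only one synonymous family is supplied to keep the example short.

```python
from collections import Counter


def rscu(cds: str, families: dict[str, list[str]]) -> dict[str, float]:
    """Relative synonymous codon usage for the codon families supplied.

    `families` maps an amino acid to its list of synonymous codons.
    RSCU = observed count / mean count across that amino acid's codons.
    """
    seq = cds.upper()
    usable = len(seq) - len(seq) % 3  # ignore a trailing partial codon
    counts = Counter(seq[i:i + 3] for i in range(0, usable, 3))
    values: dict[str, float] = {}
    for synonyms in families.values():
        total = sum(counts[codon] for codon in synonyms)
        if total == 0:
            continue  # amino acid absent from this sequence
        mean = total / len(synonyms)
        for codon in synonyms:
            values[codon] = counts[codon] / mean
    return values


if __name__ == "__main__":
    # Hypothetical toy input: three AAA codons and one AAG codon (both lysine)
    print(rscu("AAAAAAAAAAAG", {"Lys": ["AAA", "AAG"]}))
```

A full analysis would pass all 59 synonymous codons grouped by amino acid, excluding Met, Trp and the stop codons.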

20.
Biased usage of synonymous codons has long been explained in terms of cellular tRNA abundance. Taking advantage of publicly available gene expression data for Saccharomyces cerevisiae, we performed a systematic analysis of codon and amino acid usage in two different coding regions, corresponding to the regular (helix and strand) and the irregular (coil) protein secondary structures. Our analyses suggest that, apart from tRNA abundance, mRNA folding stability is another major evolutionary force shaping the differences in codon and amino acid usage between highly and lowly expressed genes in the S. cerevisiae genome and, surprisingly, that its effect depends on the coding regions corresponding to the secondary structures of the encoded proteins. This is a new paradigm for understanding codon usage in S. cerevisiae. Differential amino acid usage between highly and lowly expressed genes in the regions coding for irregular protein secondary structure in S. cerevisiae is explained by the stability of the folded mRNA structure. Irrespective of the protein secondary structural type, highly expressed genes always tend to encode cheaper amino acids, reducing the overall biosynthetic cost of producing the corresponding protein. This study supports the hypothesis that tRNA abundance is a consequence of, and not a reason for, the biased usage of amino acids between highly and lowly expressed genes.
