Similar literature
 20 similar articles found (search time: 31 ms)
1.
In prokaryotes, GC levels range from 25% to 75%, and optimal growth temperature (Topt) from approximately 0 °C to >100 °C. When all species are considered together, no correlation is found between the two variables. Correlations are found, however, when Families of prokaryotes are analysed. Indeed, when Families comprising at least 10 species were studied (a set of 20 Families), positive correlations were found for 15 of them. Furthermore, a comparative analysis by independent contrasts made within the Families in order to control for phylogenetic non-independence showed qualitatively equivalent results. We conclude that Topt is one of the factors that influence genomic GC in prokaryotes.

2.
The GC contents of 2670 prokaryotic genomes that belong to diverse phylogenetic lineages were analyzed in this paper. These genomes had GC contents that ranged from 13.5% to 74.9%. We analyzed the distance of base frequencies at the three codon positions, codon frequencies, and amino acid compositions across genomes with respect to the differences in the GC content of these prokaryotic species. We found that although the phylogenetic lineages were remote among some species, a similar genomic GC content forced them to adopt similar base usage patterns at the three codon positions, codon usage patterns, and amino acid usage patterns. Our work demonstrates that in prokaryotic genomes: a) base usage, codon usage, and amino acid usage change with GC content with a linear correlation; b) the distance of each usage has a linear correlation with the GC content difference; and c) GC content is more essential than phylogenetic lineage in determining base usage, codon usage, and amino acid usage. This work is exceptional in that we adopted intuitive graphical methods for all analyses, and we used these analyses to examine as many as 2670 prokaryotes. We hope that this work is helpful for understanding common features in the organization of microbial genomes.
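
The quantities this abstract relies on, overall GC content and GC content at each of the three codon positions (often written GC1, GC2, GC3), can be illustrated with a minimal Python sketch. The function names `gc_content` and `gc_by_codon_position` are hypothetical and do not come from the paper; the sketch assumes an in-frame coding sequence and simply ignores ambiguous bases.

```python
from collections import Counter


def gc_content(seq):
    """Fraction of G and C among the unambiguous bases (A, C, G, T)."""
    counts = Counter(seq.upper())
    total = sum(counts[b] for b in "ACGT")
    if total == 0:
        raise ValueError("sequence contains no A, C, G or T")
    return (counts["G"] + counts["C"]) / total


def gc_by_codon_position(cds):
    """Return (GC1, GC2, GC3) for an in-frame coding sequence."""
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3  # ignore a trailing partial codon
    # Every third base starting at offset 0, 1 or 2 gives one codon position.
    return tuple(gc_content(cds[pos:usable:3]) for pos in range(3))


if __name__ == "__main__":
    toy_cds = "ATGGCCGCGTAA"  # made-up example: ATG GCC GCG TAA
    print(gc_content(toy_cds))
    print(gc_by_codon_position(toy_cds))
```

Comparing two genomes then amounts to comparing these numbers, or the full base-frequency vectors at each position; the distance measure the authors used is not stated in the abstract, so none is assumed here.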

3.
Amino acids are utilized with different frequencies both among species and among genes within the same genome. To date, no study on the amino acid usage pattern of chicken has been performed. In the present study, we carried out a systematic examination of the amino acid usage in the chicken proteome. Our data indicated that the relative amino acid usage is positively correlated with the tRNA gene copy number. GC contents, including GC1, GC2, GC3, GC content of CDS and GC content of the introns, were correlated with the usage of most amino acids, especially GC-rich and GC-poor amino acids. However, multiple linear regression analyses indicated that only approximately 10–40% of the variation in amino acid usage can be explained by GC content for GC-rich and GC-poor amino acids. For the other amino acids, of intermediate GC content, only approximately 10% of the variation can be explained. Correspondence analyses demonstrated that the main factors responsible for the variation of amino acid usage in chicken are hydrophobicity, aromaticity and genomic GC content. Gene expression level also influenced the amino acid usage significantly. We argue that the amino acid usage of the chicken proteome likely reflects a balance or near balance between the action of selection, mutation, and genetic drift.

4.
We have recently shown that optimal growth temperature (T(opt)) is one of the factors that influence genomic GC in prokaryotes. Our results have been disputed by Marashi and Ghalanbor, who claim that the correlations we show are not "robust" because the elimination of some points (arbitrarily chosen) leads, in some families, to variations in the correlation coefficients and/or significance of correlations. Here, we test whether the correlation between T(opt) and genomic GC is robust by using two independent approaches: detection of possible outliers (using robust Mahalanobis distance) and usage of a non-parametric correlation coefficient that is not sensitive to the presence of outliers. The results presented here reinforce our previous proposal that T(opt) is correlated with genomic GC in prokaryotes.
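
The second approach mentioned in this abstract, a rank-based correlation coefficient, can be sketched in plain Python. The abstract does not say which non-parametric coefficient was used; Spearman's rank correlation is assumed here as one common choice, and the function names are hypothetical. The outlier-detection step (robust Mahalanobis distance) is not covered.

```python
def average_ranks(values):
    """Rank values from 1 to n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Ordinary product-moment correlation of two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / (vx * vy) ** 0.5


def spearman(x, y):
    """Spearman's rho: the Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


if __name__ == "__main__":
    # Made-up numbers for illustration only, not data from the paper.
    topt = [20, 30, 37, 55, 80]
    gc = [35.0, 41.0, 43.5, 50.0, 95.0]  # last point is an extreme value
    print(pearson(topt, gc), spearman(topt, gc))
```

Because only the ordering of the points enters the calculation, a single extreme value changes Spearman's rho far less than it changes the Pearson coefficient, which is the property the abstract appeals to.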

5.
The correlation between genomic G+C content and optimal growth temperature in prokaryotes has gained renewed interest after Musto et al. [H. Musto, H. Naya, A. Zavala, H. Romero, F. Alvarez-Valin, G. Bernardi, Correlations between genomic GC levels and optimal growth temperatures in prokaryotes, FEBS Lett. 573 (2004) 73-77] reported that positive correlations exist in 15 families studied. We have reanalyzed their data and found that when genome size and data quality were adjusted for, there was no significant evidence of a relationship between optimal temperature and GC content for two of the families that had previously shown strongly significant correlations. Using updated temperature optima for Halobacteriaceae species, we found that the correlation is not significant in this family. For the family Enterobacteriaceae, when genome size and optimal temperature are included in a multiple linear regression, only genome size is significant as a predictor of GC content. We showed that more sophisticated statistical methods than simple two-factor correlation analysis should be used for analyzing the complex intrinsic and extrinsic factors that affect genomic GC content. We further found that a positive correlation between temperature and genomic GC is only evident in free-living species with low optimal growth temperatures.

6.
Palidwor GA, Perkins TJ, Xia X. PLoS ONE 2010, 5(10): e13431

Background

In spite of extensive research on the effect of mutation and selection on codon usage, a general model of codon usage bias due to mutational bias has been lacking. Because most amino acids allow synonymous substitutions that change GC content in the third codon position, the overall GC bias of a genome or genomic region is highly correlated with GC3, a measure of third position GC content. For individual amino acids as well, usage of G/C-ending codons generally increases with increasing GC bias and decreases with increasing AT bias. Arginine and leucine, amino acids that allow GC-changing synonymous substitutions in the first and third codon positions, have codons which may be expected to show different usage patterns.

Principal Findings

In analyzing codon usage bias in hundreds of prokaryotic and plant genomes and in human genes, we find that two G-ending codons, AGG (arginine) and TTG (leucine), unlike all other G/C-ending codons, show overall usage that decreases with increasing GC bias, contrary to the usual expectation that G/C-ending codon usage should increase with increasing genomic GC bias. Moreover, the usage of some codons appears nonlinear, even nonmonotone, as a function of GC bias. To explain these observations, we propose a continuous-time Markov chain model of GC-biased synonymous substitution. This model correctly predicts the qualitative usage patterns of all codons, including nonlinear codon usage in isoleucine, arginine and leucine. The model accounts for 72%, 64% and 52% of the observed variability of codon usage in prokaryotes, plants and human respectively. When codons are grouped based on common GC content, 87%, 80% and 68% of the variation in usage is explained for prokaryotes, plants and human respectively.

Conclusions

The model clarifies the sometimes-counterintuitive effects that GC mutational bias can have on codon usage, quantifies the influence of GC mutational bias and provides a natural null model relative to which other influences on codon bias may be measured.

7.
MOTIVATION: Some genomic islands contain horizontally transferred genes, which play critical roles in altering the genotypes and phenotypes of organisms, and horizontal gene transfer has been recognized as a universal event throughout bacterial evolution. A windowless method to display the distribution of genomic GC content, the cumulative GC profile, is proposed to identify genomic islands in genomes whose complete genome sequences are available. Two new indices are proposed to assess the codon usage bias and amino acid usage bias in genomic islands. RESULTS: A 211 kb genomic island (CGGI-1) has been identified in the genome of Corynebacterium glutamicum, and three genomic islands VVGI-1, VVGI-2 and VVGI-3, with lengths 167, 40 and 33 kb, respectively, have been identified in the genome of Vibrio vulnificus CMCP6 chromosome I. CGGI-1 is flanked by two approximately 500 bp direct repeats, and utilizes a Val-tRNA as the integration site. VVGI-1 and VVGI-2 each have an integrase gene at the 5' junction. All the identified genomic islands show unusual GC content, codon usage and amino acid usage, compared with the rest of the genomes. In addition, it is found that genomic islands are fairly homogeneous in terms of GC content variation. An index, h, to quantify the homogeneity of GC content for genomic islands is proposed, and it is shown that h is less than 0.1 for all the genomic islands analyzed. The cumulative GC profile, as well as various indices to assess the codon usage bias, amino acid usage bias and homogeneity of the genomic islands, will be useful in the analysis of other genomes. AVAILABILITY: Programs used in this work and numerical results are available upon request.
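
The idea of a windowless cumulative profile described in this abstract can be illustrated with a simplified Python sketch. This is not the authors' program (which the abstract says is available on request), and the exact definition, sign and scaling of their cumulative GC profile may differ; the function name `cumulative_gc_profile` and the detrending step are assumptions made for illustration.

```python
def cumulative_gc_profile(seq):
    """Detrended cumulative GC excess along a sequence.

    Each G or C adds +1 and each A or T adds -1 to a running sum. The
    straight line joining the origin to the final value is then subtracted,
    so a sequence with uniform GC content gives a profile that stays near
    zero, while a region richer in GC than average shows up as a rising
    segment and a region poorer in GC as a falling one.
    """
    steps = []
    for base in seq.upper():
        if base in "GC":
            steps.append(1)
        elif base in "AT":
            steps.append(-1)
        else:
            steps.append(0)  # ambiguous bases do not move the curve
    if not steps:
        return []
    slope = sum(steps) / len(steps)  # average step over the whole sequence
    profile, running = [], 0
    for n, step in enumerate(steps, start=1):
        running += step
        profile.append(running - slope * n)
    return profile


if __name__ == "__main__":
    # Toy sequence: an AT-only background with a GC-only "island" inside.
    toy = "AAAA" + "GGGG" + "AAAA"
    for position, value in enumerate(cumulative_gc_profile(toy), start=1):
        print(position, round(value, 3))
```

No sliding window is involved, so the boundaries of a compositionally distinct segment appear as turning points of the curve at single-base resolution rather than being smeared over a window width.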

8.
The genomic and structural relationships of phycobiliproteins (PBPs) in different cyanobacterial species are determined by nucleotide as well as amino acid composition. The genomic GC constituents influence the amino acid variability and codon usage of particular PBP subunits. We have analyzed 11 cyanobacterial species to explore the variation of amino acids and the causal relationship between GC constituents and codon usage. Analysis of GC content at the first, second and third codon positions showed relatively more amino acid variability at the G3 + C3 position than at the first and second positions. The amino acid encoded GC rich level including G rich and C rich or both correlate the codon variability and amino acid availability. Fluctuation in amino acids such as Arg, Ala, His, Asp, Gly, Leu and Glu in the α and β subunits was observed at the G1C1 position, whereas fluctuation in other amino acids such as Ser, Thr, Cys and Trp was observed at the G2C2 position. The coding selection pressure on amino acids such as Ala, Thr, Tyr, Asp, Gly, Ile, Leu, Asn, and Ser in the α and β subunits of PBPs was more pronounced at the G3C3 position. In this study, we observed that each subunit of PBPs is codon specific for particular amino acids. These results suggest that genomic constraint linked with GC constituents selects the codon for particular amino acids and, furthermore, that the codon level study may be a novel approach to explore many problems associated with genomics and proteomics of cyanobacteria.

9.
For a long time, the central issue of evolutionary genomics was to find out the adaptive strategy of nucleic acid molecules of various microorganisms having different optimal growth temperatures (Topt). Long-standing controversies exist regarding the correlations between genomic G+C content and Topt, and this debate has not yet been settled. We address this problem by considering the fact that adaptation to growth at high temperature requires a coordinated set of evolutionary changes affecting: (i) nucleic acid thermostability and (ii) stability of codon-anticodon interactions. In the present study, we analyzed 16 prokaryotic genomes having intermediate G+C content and widely varying optimal growth temperatures. Results show that elevated growth temperature not only imposes selective constraints at the nucleic acid level but also affects the stability of the codon-anticodon interaction. We observed a decrease in the frequency of SSC and SSG codons with increasing Topt, which avoids the formation of side-by-side GC base pairs in the codon-anticodon interaction and thereby makes it impossible for a genome to increase GC composition uniformly through the whole coding sequence. Thus, we suggest that any attempt to obtain a generalized relation between genomic GC composition and optimal growth temperature is unlikely to yield a satisfactory result.

10.
Naya H, Romero H, Carels N, Zavala A, Musto H. FEBS Letters 2001, 501(2-3): 127-130
In unicellular species codon usage is determined by mutational biases and natural selection. Among prokaryotes, the influence of these factors is different if the genome is skewed towards AT or GC, since in AT-rich organisms translational selection is absent. On the other hand, in AT-rich unicellular eukaryotes the two factors are present. In order to understand if GC-rich genomes display a similar behavior, the case of Chlamydomonas reinhardtii was studied. Since we found that translational selection strongly influences codon usage in this species, we conclude that there is not a common pattern among unicellular organisms.

11.
Two years ago, we showed that positive correlations between optimal growth temperature (T(opt)) and genome GC are observed in 15 out of the 20 families of prokaryotes we analyzed, thus indicating that "T(opt) is one of the factors that influence genomic GC in prokaryotes". Our results were disputed, but these criticisms were demonstrated to be mistaken and based on misconceptions. In a recent report, Wang et al. [H.C. Wang, E. Susko, A.J. Roger, On the correlation between genomic G+C content and optimal growth temperature in prokaryotes: data quality and confounding factors, Biochem. Biophys. Res. Commun. 342 (2006) 681-684] criticize our results by stating that "all previous simple correlation analyses of GC versus temperature have ignored the fact that genomic GC content is influenced by multiple factors including both intrinsic mutational bias and extrinsic environmental factors". This statement, besides being erroneous, is surprising because it applies in fact not to ours but to the authors' article. Here, we rebut the points raised by Wang et al. and review some issues that have been a matter of debate, regarding the influence of environmental factors upon GC content in prokaryotes. Furthermore, we demonstrate that the relationship that exists between genome size and GC level is valid for aerobic, facultative, and microaerophilic species, but not for anaerobic prokaryotes.

12.
One of the central issues of evolutionary genomics is to find out the adaptive strategies of microorganisms to stabilize nucleic acid molecules under high temperature. The thermal adaptation hypothesis links G+C content and growth temperature if there is considerable variation of guanine and cytosine content between species. However, there has been a long-standing debate regarding the correlations between genomic GC content and optimal growth temperature (Topt). We argue that adaptation to growth at high temperature requires a coordinated set of evolutionary changes affecting: (i) nucleic acid thermostability and (ii) stability of codon-anticodon interactions. Moreover, in the Bacillaceae family we have demonstrated that a higher genomic GC level does not have any role in stabilizing mRNA secondary structure at high growth temperature. Comparative analysis between homologous sequences of thermophilic Thermus thermophilus and mesophilic Deinococcus radiodurans suggests that increased GC content in coding sequences corresponding to strand structure of Thermus thermophilus genes has a stabilizing effect on the mRNA secondary structure, whereas increased GC content in coding sequences corresponding to aperiodic structure has a destabilizing effect on the mRNA secondary structure. In this perspective, a critical review of the thermal adaptation hypothesis is further advocated.

13.
Lightfield J, Fram NR, Ely B. PLoS ONE 2011, 6(3): e17677
The GC content of bacterial genomes ranges from 16% to 75% and wide ranges of genomic GC content are observed within many bacterial phyla, including both gram negative and gram positive phyla. Thus, divergent genomic GC content has evolved repeatedly in widely separated bacterial taxa. Since genomic GC content influences codon usage, we examined codon usage patterns and predicted protein amino acid content as a function of genomic GC content within eight different phyla or classes of bacteria. We found that similar patterns of codon usage and protein amino acid content have evolved independently in all eight groups of bacteria. For example, in each group, use of amino acids encoded by GC-rich codons increased by approximately 1% for each 10% increase in genomic GC content, while the use of amino acids encoded by AT-rich codons decreased by a similar amount. This consistency within every phylum and class studied led us to conclude that GC content appears to be the primary determinant of the codon and amino acid usage patterns observed in bacterial genomes. These results also indicate that selection for translational efficiency of highly expressed genes is constrained by the genomic parameters associated with the GC content of the host genome.

14.
Romero H, Zavala A, Musto H. Gene 2000, 242(1-2): 307-311
It is widely accepted that the compositional pressure is the only factor shaping codon usage in unicellular species displaying extremely biased genomic compositions. This seems to be the case in the prokaryotes Mycoplasma capricolum, Rickettsia prowazekii and Borrelia burgdorferi (GC-poor), and in Micrococcus luteus (GC-rich). However, in the GC-poor unicellular eukaryotes Dictyostelium discoideum and Plasmodium falciparum, there is evidence that selection, acting at the level of translation, influences codon choices. This finding is doubly intriguing, since (1) the genomic GC levels of the above-mentioned eukaryotes are lower than the GC% of any studied bacteria, and (2) bacteria usually have larger effective population sizes than eukaryotes, and hence natural selection is expected to overcome the randomizing effects of genetic drift more efficiently among prokaryotes than among eukaryotes. In order to gain new insight into this problem, we analysed the patterns of codon preferences of the nuclear genes of Entamoeba histolytica, a unicellular eukaryote characterised by an extremely AT-rich genome (GC = 25%). The overall codon usage is strongly biased towards A and T in the third codon positions, and among the presumed highly expressed sequences, there is an increased relative usage of a subset of codons, many of which are C-ending. Since an increase in C in third codon positions is 'against' the compositional bias, we conclude that codon usage in E. histolytica, as happens in D. discoideum and P. falciparum, is the result of an equilibrium between compositional pressure and selection. These findings raise the question of why strongly compositionally biased eukaryotic cells may be more sensitive to the (presumed) slight differences among synonymous codons than compositionally biased bacteria.

15.
The global amino acid compositions as deduced from the complete genomic sequences of six thermophilic archaea, two thermophilic bacteria, 17 mesophilic bacteria and two eukaryotic species were analysed by hierarchical clustering and principal components analysis. Both methods showed an influence of several factors on amino acid composition. Although GC content has a dominant effect, thermophilic species can be identified by their global amino acid compositions alone. This study presents a careful statistical analysis of factors that affect amino acid composition and also yielded specific features of the average amino acid composition of thermophilic species. Moreover, we introduce the first example of a 'compositional tree' of species that takes into account not only homologous proteins, but also proteins unique to particular species. We expect this simple yet novel approach to be a useful additional tool for the study of phylogeny at the genome level.

16.
Prochlorococcus species are the first example of free-living bacteria with a reduced genome. Codon and amino acid usage bias of Prochlorococcus marinus MED4 was investigated using all protein-coding genes with a length greater than or equal to 100 amino acids. Correspondence analysis on relative synonymous codon usage (RSCU) values shows that translational selection has no influence in shaping the codon usage variation among the genes in this organism. However, amino acid usage was markedly different between the highly and lowly expressed genes in this organism and, in particular, GC-rich amino acids were found to occur significantly more often in highly expressed genes than in lowly expressed genes. Comparative analysis of the homologous genes of Synechococcus sp. WH8102 and Prochlorococcus marinus MED4 shows that amino acid conservation in highly expressed genes is significantly higher than in lowly expressed genes. Based on our results we concluded that conservation of GC-rich amino acids in the highly expressed genes relative to the ancestor is the major source of variation in amino acid usage in the organism.
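
The RSCU values mentioned in this abstract follow a standard definition: the observed count of a codon divided by the count expected if all synonymous codons for that amino acid were used equally. A minimal Python sketch is given below; it assumes the standard genetic code and an in-frame coding sequence, and the function name `rscu` is hypothetical rather than taken from the paper.

```python
from collections import Counter
from itertools import product

# Standard genetic code, codons enumerated with bases in the order T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def rscu(cds):
    """Relative synonymous codon usage for one in-frame coding sequence.

    RSCU(codon) = count(codon) * (number of synonymous codons)
                  / (total count of all codons for that amino acid).
    Stop codons, single-codon amino acids (Met, Trp) and amino acids that
    do not occur in the sequence are left out of the result.
    """
    cds = cds.upper()
    codons = [cds[i:i + 3] for i in range(0, len(cds) - 2, 3)]
    counts = Counter(c for c in codons if c in CODON_TABLE)

    families = {}
    for codon, aa in CODON_TABLE.items():
        families.setdefault(aa, []).append(codon)

    values = {}
    for aa, family in families.items():
        total = sum(counts[c] for c in family)
        if aa == "*" or len(family) < 2 or total == 0:
            continue
        for c in family:
            values[c] = counts[c] * len(family) / total
    return values


if __name__ == "__main__":
    toy = "GCTGCTGCTGCCAAA"  # made-up example: Ala x4 (GCT x3, GCC x1), Lys x1
    for codon, value in sorted(rscu(toy).items()):
        print(codon, value)
```

A value of 1 means a codon is used exactly as often as expected under equal synonymous usage; a table of such values per gene is the kind of input on which a correspondence analysis like the one described above would operate.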

17.
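Base composition and codon usage bias in the coding regions of pseudorabies virus genes   Total citations: 6, self-citations: 0, citations by others: 6
Because the G+C content of pseudorabies virus (PRV) is as high as 74%, no strain has yet had its whole genome sequenced. We statistically analysed the base composition and codon usage of the coding sequences of 68 known PRV genes and found very strong codon usage bias in PRV genes. The overall G+C content at the third codon position across the coding regions of all 68 PRV genes is 96.24%, reaching 99.52% in the UL48 gene. PRV genes preferentially use GC-rich codons, especially codons ending in C or G. In addition, the neighbourhoods of genes whose G+C content varies considerably, such as UL48, UL40, UL14 and IE180, coincide with the known origins of replication of the PRV genome. When the PRV genes were divided into six classes according to function, genes with identical or similar functions showed similar codon usage patterns; the relative synonymous codon usage (RSCU) of regulatory genes differed significantly from that of the other genes, and in regulatory genes the RSCU values of C-ending codons were far higher than those of the other synonymous codons. Finally, multivariate analysis of differences in the amino acid composition of PRV genes showed that genes with different functions are distributed differently on the correspondence analysis plot, suggesting that the codon usage pattern of PRV genes may be related to gene function.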

18.
The number of completely sequenced archaeal genomes is now sufficient for a large-scale bioinformatic study. We have conducted analyses for each coding region from 36 archaeal genomes using the original CGS algorithm by calculating the total GC content (G+C), GC content in first, second and third codon positions as well as in fourfold and twofold degenerate sites from third codon positions, levels of arginine codon usage (Arg2: AGA/G; Arg4: CGX), levels of amino acid usage and the entropy of amino acid content distribution. In archaeal genomes with strong GC pressure, arginine is coded preferably by GC-rich Arg4 codons, whereas in most archaeal genomes with G+C < 0.6, arginine is coded preferably by AT-rich Arg2 codons. In the genome of Haloquadratum walsbyi, which is closely related to GC-rich archaea, GC content has decreased mostly in third codon positions, while the Arg4 > Arg2 bias still persists. Proteomes of archaeal species carry characteristic amino acid biases: levels of isoleucine and lysine are elevated, while levels of alanine, histidine, glutamine and cysteine are relatively decreased. Numerous observed genomic and proteomic biases can be explained by the hypothesis of a previously existing strong mutational AT pressure in the common predecessor of all archaea.

19.

Introduction

Genomic base composition ranges from less than 25% AT to more than 85% AT in prokaryotes. Since only a small fraction of prokaryotic genomes is not protein coding even a minor change in genomic base composition will induce profound protein changes. We examined how amino acid and codon frequencies were distributed in over 2000 microbial genomes and how these distributions were affected by base compositional changes. In addition, we wanted to know how genome-wide amino acid usage was biased in the different genomes and how changes to base composition and mutations affected this bias. To carry this out, we used a Generalized Additive Mixed-effects Model (GAMM) to explore non-linear associations and strong data dependences in closely related microbes; principal component analysis (PCA) was used to examine genomic amino acid- and codon frequencies, while the concept of relative entropy was used to analyze genomic mutation rates.

Results

We found that genomic amino acid frequencies carried a stronger phylogenetic signal than codon frequencies, but that this signal was weak compared to that of genomic %AT. Further, in contrast to codon usage bias (CUB), amino acid usage bias (AAUB) was differently distributed in AT- and GC-rich genomes in the sense that AT-rich genomes did not prefer specific amino acids over others to the same extent as GC-rich genomes. AAUB was also associated with relative entropy; genomes with low AAUB contained more random mutations as a consequence of relaxed purifying selection than genomes with higher AAUB.

Conclusion

Genomic base composition has a substantial effect on both amino acid- and codon frequencies in bacterial genomes. While phylogeny influenced amino acid usage more in GC-rich genomes, AT-content was driving amino acid usage in AT-rich genomes. We found the GAMM model to be an excellent tool to analyze the genomic data used in this study.

20.
Correspondence analysis of amino acid usage was applied to 14,815 complete proteins from the human genome. We found that three major factors influence the variability of the amino acid composition of these proteins, explaining, respectively, 20.4%, 14.7%, and 9.9% of the total variability. The first trend is strongly correlated with the GC content of first and second codon positions and is also significantly correlated with the GC level of the corresponding flanking regions and introns. Therefore, the main force shaping amino acid usage among human proteins is the set of compositional constraints determined by the isochore in which each gene is embedded. The second trend correlates with the hydropathy of each protein and with the frequency of beta-strands. Finally, the third trend is strongly associated with the usage of Cys and the frequency of alpha-helices.
