Similar articles
 Found 20 similar articles (search time: 15 ms)
1.
Codon usage data for 56 Bacillus subtilis genes show that synonymous codon usage in B. subtilis is less biased than in Escherichia coli or in Saccharomyces cerevisiae. Nevertheless, certain genes with a high codon bias can be identified by correspondence analysis, and also by various indices of codon bias. These genes are very highly expressed, and a general trend (a decrease) in codon bias across genes seems to correspond to decreasing expression level. This, then, may be a general phenomenon in unicellular organisms. The unusually small effect of translational selection on the pattern of codon usage in lowly expressed genes in B. subtilis yields similar dinucleotide frequencies among different codon positions, and on complementary strands. These patterns could arise through selection on DNA structure, but more probably are largely determined by mutation. This prevalence of mutational bias could lead to difficulties in assessing whether open reading frames encode proteins.

2.

Background  

Codon adaptation indices (CAIs) represent an evolutionary strategy to modulate gene expression and have widely been used to predict potentially highly expressed genes within microbial genomes. Here, we evaluate and compare two very different methods for estimating CAI values, one corresponding to translational codon usage bias and the second obtained mathematically by searching for the most dominant codon bias.

3.
Proteins in general consist not only of globular structural domains (SDs), but also of intrinsically disordered regions (IDRs), i.e. those that do not assume unique three-dimensional structures by themselves. Although IDRs are especially prevalent in eukaryotic proteins, their functions are mostly unknown. To elucidate the functions of IDRs, we first divided eukaryotic proteins into subcellular localizations, identified IDRs by the DICHOT system that accurately divides entire proteins into SDs and IDRs, and examined charge and hydropathy characteristics. On average, mitochondrial proteins have IDRs more positively charged than SDs. Comparison of mitochondrial proteins with orthologous prokaryotic proteins showed that mitochondrial proteins tend to have segments attached at both N and C termini, high fractions of which are IDRs. Segments added to the N-terminus of mitochondrial proteins contain not only signal sequences but also mature proteins and exhibit a positive charge gradient, with the magnitude increasing toward the N-terminus. This finding is consistent with the notion that positively charged residues are added to the N-terminus of proteobacterial proteins so that the extended proteins can be chromosomally encoded and efficiently transported to mitochondria after translation. By contrast, nuclear proteins generally have positively charged SDs and negatively charged IDRs. Among nuclear proteins, DNA-binding proteins have enhanced charge tendencies. We propose that SDs in nuclear proteins tend to be positively charged because of the need to bind to negatively charged nucleotides, while IDRs tend to be negatively charged to interact with other proteins or other regions of the same proteins to avoid premature proteasomal degradation.

4.

Background

Codon adaptation indices (CAIs) represent an evolutionary strategy to modulate gene expression and have widely been used to predict potentially highly expressed genes within microbial genomes. Here, we evaluate and compare two very different methods for estimating CAI values, one corresponding to translational codon usage bias and the second obtained mathematically by searching for the most dominant codon bias.

Results

The level of correlation between these two CAI methods is a simple and intuitive measure of the degree of translational bias in an organism, and from this we confirm that fast replicating bacteria are more likely to have a dominant translational codon usage bias than are slow replicating bacteria, and that this translational codon usage bias may be used for prediction of highly expressed genes. By analyzing more than 300 bacterial genomes, as well as five fungal genomes, we show that codon usage preference provides an environmental signature by which it is possible to group bacteria according to their lifestyle, for instance soil bacteria and soil symbionts, spore formers, enteric bacteria, aquatic bacteria, and intercellular and extracellular pathogens.

Conclusion

The results and the approach described here may be used to acquire new knowledge regarding species lifestyle and to elucidate relationships between organisms that are far apart evolutionarily.

5.
Compositional non-randomness was studied in genes of Saccharomyces cerevisiae and Schizosaccharomyces pombe. In both species, codon usage is well correlated with expressivity (measured as the codon adaptation index). Both species generally display higher nucleotide non-randomness in the group of highly expressed genes than in the lowly expressed genes. The highly expressed genes in both species are furthermore characterized by marked peaks in non-randomness at N=3 upstream of start codons, N=2 downstream of start codons and at N=1 and N=7 downstream of stop codons, indicating that these nucleotides may be key elements in translational regulation. Intragenic variation in codon usage was also observed to be linked to expressivity. It is suggested that the firm link between expressivity and codon usage calls for codon optimization. Based on bioinformatic calculations, examples of proteins are given for which codon optimizations might be relevant.

6.
Codon usage bias (CUB) results from the complex interplay between translational selection and mutational biases. Current methods for CUB analysis apply heuristics to integrate both components, limiting the depth and scope of CUB analysis as a technique to probe into the evolution and optimization of protein-coding genes. Here we introduce a self-consistent CUB index (scnRCA) that incorporates implicit correction for mutational biases, facilitating exploration of the translational selection component of CUB. We validate this technique using gene expression data and we apply it to a detailed analysis of CUB in the Pseudomonadales. Our results illustrate how the selective enrichment of specific codons among highly expressed genes is preserved in the context of genome-wide shifts in codon frequencies, and how the balance between mutational and translational biases leads to varying definitions of codon optimality. We extend this analysis to other moderate and fast growing bacteria and we provide unified support for the hypothesis that C- and A-ending codons of two-box amino acids, and the U-ending codons of four-box amino acids, are systematically enriched among highly expressed genes across bacteria. The use of an unbiased estimator of CUB allows us to report for the first time that the signature of translational selection is strongly conserved in the Pseudomonadales in spite of drastic changes in genome composition, and extends well beyond the core set of highly optimized genes in each genome. We generalize these results to other moderate and fast growing bacteria, hinting at selection for a universal pattern of gene expression that is conserved and detectable in conserved patterns of codon usage bias.

7.
Codon usage bias in prokaryotic genomes is largely a consequence of background substitution patterns in DNA, but highly expressed genes may show a preference towards codons that enable more efficient and/or accurate translation. We introduce a novel approach based on supervised machine learning that detects effects of translational selection on genes, while controlling for local variation in nucleotide substitution patterns represented as sequence composition of intergenic DNA. A cornerstone of our method is a Random Forest classifier that outperformed previous distance measure-based approaches, such as the codon adaptation index, in the task of discerning the (highly expressed) ribosomal protein genes by their codon frequencies. Unlike previous reports, we show evidence that translational selection in prokaryotes is practically universal: in 460 of 461 examined microbial genomes, we find that a subset of genes shows a higher codon usage similarity to the ribosomal proteins than would be expected from the local sequence composition. These genes constitute a substantial part of the genome—between 5% and 33%, depending on genome size—while also exhibiting higher experimentally measured mRNA abundances and tending toward codons that match tRNA anticodons by canonical base pairing. Certain gene functional categories are generally enriched with, or depleted of codon-optimized genes, the trends of enrichment/depletion being conserved between Archaea and Bacteria. Prominent exceptions from these trends might indicate genes with alternative physiological roles; we speculate on specific examples related to detoxication of oxygen radicals and ammonia and to possible misannotations of asparaginyl–tRNA synthetases. 
Since the presence of codon optimizations on genes is a valid proxy for expression levels in fully sequenced genomes, we provide an example of an “adaptome” by highlighting gene functions with expression levels elevated specifically in thermophilic Bacteria and Archaea.

8.
Prochlorococcus species are the first example of free-living bacteria with a reduced genome. Codon and amino acid usage biases of Prochlorococcus marinus MED4 were investigated using all protein-coding genes with a length of at least 100 amino acids. Correspondence analysis on relative synonymous codon usage (RSCU) values shows no apparent influence of translational selection in shaping the codon usage variation among the genes in this organism. However, amino acid usage was markedly different between the highly and lowly expressed genes in this organism; in particular, GC-rich amino acids were found to occur significantly more often in highly expressed genes than in lowly expressed genes. Comparative analysis of the homologous genes of Synechococcus sp. WH8102 and Prochlorococcus marinus MED4 shows that amino acid conservation in highly expressed genes is significantly higher than in lowly expressed genes. Based on our results, we concluded that conservation of GC-rich amino acids in the highly expressed genes relative to the ancestor is the major source of variation in amino acid usage in the organism.
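
The RSCU values mentioned in this abstract can be illustrated with a minimal Python sketch. It assumes the usual definition (observed codon count divided by the count expected if all synonymous codons for that amino acid were used equally). The names `SYNONYMS`, `codons` and `rscu` are hypothetical, and the codon table is deliberately cut down to two amino acids; this is not the authors' pipeline.

```python
from collections import Counter

# Hypothetical, deliberately tiny synonymous-codon table (two amino acids only).
# A real analysis would cover every degenerate amino acid in the genetic code.
SYNONYMS = {
    "Phe": ["TTT", "TTC"],
    "Gly": ["GGT", "GGC", "GGA", "GGG"],
}


def codons(seq):
    """Split an in-frame coding sequence into codons, dropping a trailing partial codon."""
    seq = seq.upper()
    usable = len(seq) - len(seq) % 3
    return [seq[i:i + 3] for i in range(0, usable, 3)]


def rscu(seqs, synonyms=SYNONYMS):
    """Return RSCU per codon: observed count / (family total / family size)."""
    counts = Counter(c for s in seqs for c in codons(s))
    values = {}
    for family in synonyms.values():
        total = sum(counts[c] for c in family)
        if total == 0:
            continue  # amino acid absent from the data: RSCU is undefined
        expected = total / len(family)
        for c in family:
            values[c] = counts[c] / expected
    return values


if __name__ == "__main__":
    for codon, value in sorted(rscu(["TTTTTCTTTGGT"]).items()):
        print(codon, round(value, 3))
```

An RSCU of 1 means a codon is used exactly as often as expected under equal usage; values above 1 indicate over-representation. A matrix of such values (genes by codons) is the usual input to a correspondence analysis.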

9.
Hepatitis C virus (HCV) infection is among the leading causes of hepatocellular carcinoma and liver cirrhosis globally, with a high economic burden. The disease progression is well established, but less is known about spontaneous clearance of HCV infection. This study tries to establish the relationship between codon bias and expression of HCV clearance candidate genes in normal and HCV-infected liver tissues. A total of 112 coding sequences comprising 151,679 codons were subjected to the computation of codon indices, namely relative synonymous codon usage, effective number of codons (Nc), frequency of optimal codons, codon adaptation index, codon bias index, and base composition. The codon index values for GC3s, GC12, hydropathicity, and aromaticity implicate both mutational and translational selection in the candidate gene set. This was further correlated with the differentially expressed genes among the selected genes using BioGPS. A significant correlation is observed between the gene expression of normal liver and cancerous liver tissues with codon bias (Nc). Gene expression is also correlated with relative codon bias values, indicating that the CCL5, APOA2, CD28, IFITM1, and TNFSF4 genes have higher expression. These results are quite encouraging for selecting the highly responsive genes in HCV clearance. However, there could be additional genes that also orchestrate clearance alongside the above-mentioned first-line defensive genes.

10.
A simple, effective measure of synonymous codon usage bias, the Codon Adaptation Index, is detailed. The index uses a reference set of highly expressed genes from a species to assess the relative merits of each codon, and a score for a gene is calculated from the frequency of use of all codons in that gene. The index assesses the extent to which selection has been effective in moulding the pattern of codon usage. In that respect it is useful for predicting the level of expression of a gene, for assessing the adaptation of viral genes to their hosts, and for making comparisons of codon usage in different organisms. The index may also give an approximate indication of the likely success of heterologous gene expression.
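
The two steps this abstract describes (weighting each codon from a reference set, then scoring a gene from the codons it uses) can be sketched in Python as follows. The sketch assumes the commonly cited formulation: a codon's relative adaptiveness w is its reference-set count divided by the count of the most used synonymous codon, and the index is the geometric mean of w over the gene's codons. The function names, the two-amino-acid `SYNONYMS` table and the `pseudocount` of 0.5 for unseen codons are illustrative assumptions, not details taken from the abstract.

```python
import math
from collections import Counter

# Hypothetical, deliberately tiny synonymous-codon table (two amino acids only).
SYNONYMS = {
    "Phe": ["TTT", "TTC"],
    "Gly": ["GGT", "GGC", "GGA", "GGG"],
}


def codons(seq):
    """Split an in-frame coding sequence into codons, dropping a trailing partial codon."""
    seq = seq.upper()
    usable = len(seq) - len(seq) % 3
    return [seq[i:i + 3] for i in range(0, usable, 3)]


def relative_adaptiveness(reference_seqs, synonyms=SYNONYMS, pseudocount=0.5):
    """Weight each codon by its reference count relative to the best synonymous codon.

    Codons never seen in the reference set get `pseudocount` instead of zero
    (an assumption made here so that the logarithm below stays defined).
    """
    counts = Counter(c for s in reference_seqs for c in codons(s))
    weights = {}
    for family in synonyms.values():
        adjusted = {c: counts[c] if counts[c] > 0 else pseudocount for c in family}
        best = max(adjusted.values())
        for c in family:
            weights[c] = adjusted[c] / best
    return weights


def cai(seq, weights):
    """Geometric mean of the weights of all scored codons in `seq`."""
    logs = [math.log(weights[c]) for c in codons(seq) if c in weights]
    if not logs:
        raise ValueError("sequence contains no codons covered by the weight table")
    return math.exp(sum(logs) / len(logs))


if __name__ == "__main__":
    w = relative_adaptiveness(["TTCTTCTTCTTTGGTGGTGGC"])
    print(round(cai("TTCGGT", w), 3), round(cai("TTTGGC", w), 3))
```

Codons missing from the weight table (for example single-codon amino acids such as Met and Trp) are simply skipped, so they neither raise nor lower the score. A gene built only from the most frequent reference codons scores 1.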

11.
Gupta SK, Ghosh TC. Gene, 2001, 273(1): 63-70
Codon usage biases of all DNA sequences (length greater than or equal to 300 bp) from the complete genome of Pseudomonas aeruginosa have been analyzed. As P. aeruginosa is a GC-rich organism, G and/or C are expected to predominate in its codons. Overall codon usage data analysis indicates that indeed codons ending in G and/or C are predominant in this organism. But multivariate statistical analysis indicates that there is a single major trend in the codon usage variation among the genes in this organism, which has a strong negative correlation with the expressivities of the genes. The majority of the lowly expressed genes are scattered towards the positive end of the major axis whereas the highly expressed genes are clustered towards the negative end. This is the first report where the prokaryotic organism having highly skewed base composition is dictated mainly by translational selection, though some other factors such as the lengths of the genes as well as the hydrophobicity of genes also influence the codon usage variation among the genes in this organism in a minor way.

12.
Codon usage patterns in the slime mould Dictyostelium discoideum have been re-examined (a total of 58 genes have been analysed). Considering the extreme A + T-richness of this genome (G + C = 22%), there is a surprising degree of codon usage variation among genes. For example, G + C content at silent sites varies from less than 10% to greater than 30%. It was previously suggested [Warrick, H.M. and Spudich, J.A. (1988) Nucleic Acids Res. 16: 6617-6635] that highly expressed genes contain fewer 'optimal' codons than genes expressed at lower levels. However, it appears that the optimal codons were misidentified. Multivariate statistical analysis shows that the greatest variation among genes is in relative usage of a particular subset of codons (about one per amino acid), many of which are C-ending. We have identified these as optimal codons, since (i) their frequency is positively correlated with gene expression level, and (ii) there is a strong mutation bias in this genome towards A and T nucleotides. Thus, codon usage in D. discoideum can be explained by a balance between the forces of mutational bias and translational selection.

13.
Codon bias is generally thought to be determined by a balance between mutation, genetic drift, and natural selection on translational efficiency. However, natural selection on codon usage is considered to be a weak evolutionary force and selection on codon usage is expected to be strongest in species with large effective population sizes. In this paper, I study associations between codon usage, gene expression, and molecular evolution at synonymous and nonsynonymous sites in the long-lived, woody perennial plant Populus tremula (Salicaceae). Using expression data for 558 genes derived from expressed sequence tag (EST) libraries from 19 different tissues and developmental stages, I study how gene expression levels within single tissues as well as across tissues affect codon usage and rates of sequence evolution at synonymous and nonsynonymous sites. I show that gene expression has direct effects on both codon usage and the level of selective constraint of proteins in P. tremula, although in different ways. Codon usage of genes is primarily determined by how highly expressed a gene is, whereas rates of sequence evolution are primarily determined by how widely expressed genes are. In addition to the effects of gene expression, protein length appears to be an important factor influencing virtually all aspects of molecular evolution in P. tremula.

14.
Divergence in codon usage of Lactobacillus species.   Total citations: 3 (self-citations: 0; citations by others: 3)
We have analyzed codon usage patterns of 70 sequenced genes from different Lactobacillus species. Codon usage in lactobacilli is highly biased. Both inter-species and intra-species heterogeneity of codon usage bias was observed. Codon usage in L. acidophilus is similar to that in L. helveticus, but dissimilar to that in L. bulgaricus, L. casei, L. pentosus and L. plantarum. Codon usage in the latter three organisms is not significantly different, but is different from that in L. bulgaricus. Inter-species differences in codon usage can, at least in part, be explained by differences in mutational drift. L. bulgaricus shows GC drift, whereas all other species show AT drift. L. acidophilus and L. helveticus rarely use NNG in family-box (a set of synonymous) codons, in contrast to all other species. This result may be explained by assuming that L. acidophilus and L. helveticus, but not other species examined, use a single tRNA species for translation of family-box codons. Differences in expression level of genes are positively correlated with codon usage bias. Highly expressed genes show highly biased codon usage, whereas weakly expressed genes show much less biased codon usage. Codon usage patterns at the 5'-end of Lactobacillus genes are not significantly different from those of entire genes. The GC content of codons 2-6 is significantly reduced compared with that of the remainder of the gene. The possible implications of a reduced GC content for the control of translation efficiency are discussed.

15.
Studies on codon usage in Entamoeba histolytica   Total citations: 13 (self-citations: 0; citations by others: 13)
Codon usage bias of Entamoeba histolytica, a protozoan parasite, was investigated using the available DNA sequence data. Entamoeba histolytica, having an AT-rich genome, is expected to have A and/or T at the third position of codons. Overall codon usage data analysis indicates that A- and/or T-ending codons are strongly favored in the coding regions of this organism. However, multivariate statistical analysis suggests that there is a single major trend in codon usage variation among the genes. The genes which are supposed to be highly expressed are clustered at one end, while the majority of the putatively lowly expressed genes are clustered at the other end. The codon usage pattern is distinctly different in these two sets of genes. C-ending codons are significantly more frequent in the putatively highly expressed genes, suggesting that C-ending codons are translationally optimal in this organism. In the putatively lowly expressed genes, A- and/or T-ending codons are predominant, which suggests that compositional constraints play the major role in shaping codon usage variation among the lowly expressed genes. These results suggest that both mutational bias and translational selection are operational in the codon usage variation in this organism.
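
The comparison of A/T-ending and C-ending codons described in this abstract rests on the base composition at the third codon position, which a short Python sketch can tabulate. The function name `third_position_composition` is hypothetical, and the sketch is a simplification: it counts every codon, whereas indices such as GC3s are normally restricted to synonymous positions.

```python
from collections import Counter


def third_position_composition(seq):
    """Return the fraction of A, C, G and T at the third position of each codon.

    Assumes `seq` is an in-frame coding sequence; a trailing partial codon is ignored.
    """
    seq = seq.upper()
    usable = len(seq) - len(seq) % 3
    thirds = Counter(seq[i + 2] for i in range(0, usable, 3))
    total = sum(thirds.values())
    if total == 0:
        raise ValueError("sequence is shorter than one codon")
    return {base: thirds[base] / total for base in "ACGT"}


if __name__ == "__main__":
    comp = third_position_composition("GAAGATGGCTTA")
    print("A/T-ending fraction:", comp["A"] + comp["T"])
    print("C-ending fraction:", comp["C"])
```

Computing these fractions separately for putatively highly and lowly expressed gene sets gives the kind of contrast the abstract reports.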

16.
Codon usage in Tetrahymena and other ciliates   Total citations: 11 (self-citations: 0; citations by others: 11)
Codon usage in ciliates was examined by analyzing the coding regions of 22 ciliate genes corresponding to a total of 26,142 nucleotides (8,714 codons). It was found that Tetrahymena, Paramecium and the hypotrichs (Oxytricha and Stylonychia) differed in which synonymous codons were used most frequently by their genes. In fact, the codon choices in highly expressed Tetrahymena genes were more similar to those of yeast genes than those of Paramecium genes. The ciliates do not appear to have unusually strong biases in codon usage frequency when compared to other protists such as yeast. The analysis of the Tetrahymena genes indicated that genes which are highly expressed during normal cell growth have a stronger bias towards using the "preferred" codons than those expressed at lower levels during growth or for brief periods during processes such as conjugation. This conforms to what is found in other protists.

17.
Codon Usage in Tetrahymena and Other Ciliates   Total citations: 6 (self-citations: 0; citations by others: 6)
Codon usage in ciliates was examined by analyzing the coding regions of 22 ciliate genes corresponding to a total of 26,142 nucleotides (8,714 codons). It was found that Tetrahymena, Paramecium and the hypotrichs (Oxytricha and Stylonychia) differed in which synonymous codons were used most frequently by their genes. In fact, the codon choices in highly expressed Tetrahymena genes were more similar to those of yeast genes than those of Paramecium genes. The ciliates do not appear to have unusually strong biases in codon usage frequency when compared to other protists such as yeast. The analysis of the Tetrahymena genes indicated that genes which are highly expressed during normal cell growth have a stronger bias towards using the "preferred" codons than those expressed at lower levels during growth or for brief periods during processes such as conjugation. This conforms to what is found in other protists.

18.
19.
We have cloned and characterized the cDNA and the macronuclear genomic copy of the highly conserved ribosomal protein (r-protein) L3 of Tetrahymena thermophila. The r-protein L3 is encoded by a single copy gene interrupted by one intron. The organization of the promoter region exhibits features characteristic of ribosomal protein genes in Tetrahymena. The codon usage of the L3 gene is highly biased. A thorough analysis of codon usage in Tetrahymena genes revealed that genes could be categorized into two classes according to codon usage bias. Class A comprises r-protein genes and a number of other highly expressed genes. Class B comprises weakly expressed genes such as the conjugation induced CnjB and CnjC genes, but surprisingly, this class also contains abundantly expressed genes such as the genes encoding the surface antigens SerH3 and SerH1. Codon usage is slightly more restricted in class A than in class B, but both classes exhibit distinct and different codon usage biases. Class A genes preferentially use C and U in the silent third codon positions, whereas class B genes preferentially use A and U in the silent third codon positions. The analysis suggests that two different strategies have been employed for optimization of codon usage in the A+T-rich genome of Tetrahymena.

20.
Highly expressed plastid genes display codon adaptation, which is defined as a bias toward a set of codons which are complementary to abundant tRNAs. This type of adaptation is similar to what is observed in highly expressed Escherichia coli genes and is probably the result of selection to increase translation efficiency. In the current work, the codon adaptation of plastid genes is studied with regard to three specific features that have been observed in E. coli and which may influence translation efficiency. These features are (1) a relatively low codon adaptation at the 5′ end of highly expressed genes, (2) an influence of neighboring codons on codon usage at a particular site (codon context), and (3) a correlation between the level of codon adaptation of a gene and its amino acid content. All three features are found in plastid genes. First, highly expressed plastid genes have a noticeable decrease in codon adaptation over the first 10–20 codons. Second, for the twofold degenerate NNY codon groups, highly expressed genes have an overall bias toward the NNC codon, but this is not observed when the 3′ neighboring base is a G. At these sites highly expressed genes are biased toward NNT instead of NNC. Third, plastid genes that have higher codon adaptations also tend to have an increased usage of amino acids with a high G + C content at the first two codon positions and GNN codons in particular. The correlation between codon adaptation and amino acid content exists separately for both cytosolic and membrane proteins and is not related to any obvious functional property. It is suggested that at certain sites selection discriminates between nonsynonymous codons based on translational, not functional, differences, with the result that the amino acid sequence of highly expressed proteins is partially influenced by selection for increased translation efficiency. Received: 21 July 1999 / Accepted: 5 November 1999
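
The codon-context effect in point (2) of this abstract (NNC versus NNT choice depending on the 3′ neighboring base) can be tabulated with a short Python sketch. It assumes the twofold degenerate NNY groups are Phe, Tyr, His, Asn, Asp and Cys (Ser AGY is left out because it belongs to a sixfold family); the names `NNY_PREFIXES` and `nny_context_counts` are hypothetical, and the abstract does not describe the authors' actual procedure.

```python
from collections import Counter, defaultdict

# First two bases of the twofold degenerate NNY groups assumed here:
# Phe (TTY), Tyr (TAY), His (CAY), Asn (AAY), Asp (GAY), Cys (TGY).
NNY_PREFIXES = {"TT", "TA", "CA", "AA", "GA", "TG"}


def nny_context_counts(seq):
    """Count NNC vs NNT codons, grouped by the base immediately 3' of the codon.

    Returns {next_base: {"C": count, "T": count}}. The last codon of the
    sequence has no 3' neighbor inside `seq` and is therefore skipped.
    """
    seq = seq.upper()
    table = defaultdict(Counter)
    for i in range(0, len(seq) - 3, 3):
        codon = seq[i:i + 3]
        next_base = seq[i + 3]
        if codon[:2] in NNY_PREFIXES and codon[2] in "CT":
            table[next_base][codon[2]] += 1
    return {base: dict(counts) for base, counts in table.items()}


if __name__ == "__main__":
    print(nny_context_counts("TTCGAATTTGCC"))
```

Comparing the C:T ratio under a following G with the ratio under the other three bases, for highly expressed genes only, is one way to look for the pattern the abstract describes.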

