Similar articles
 20 similar articles found (search time: 31 ms)
1.
2.
Predicted highly expressed genes of diverse prokaryotic genomes   Total citations: 13 (self-citations: 0; citations by others: 13)

3.
Comparisons of the codon frequencies of genes with those of several gene classes are used to characterize highly expressed and alien genes in the Synechocystis PCC6803 genome. The primary gene classes are the ensemble of all genes (the average gene), ribosomal protein (RP) genes, translation processing factors (TF), and genes encoding chaperone/degradation proteins (CH). A gene is predicted highly expressed (PHX) if its codon usage is close to that of the RP/TF/CH standards but deviates strongly from that of the average gene. Putative alien (PA) genes are those whose codon usage differs significantly from all four classes of gene standards. In Synechocystis, 380 genes were identified as PHX. The genes with the highest predicted expression levels include many that encode proteins vital for photosynthesis. Nearly all of the genes of the RP/TF/CH gene classes are PHX. The principal glycolysis enzymes, which may also function in CO2 fixation, are PHX, while none of the genes encoding TCA cycle enzymes are PHX. The PA genes are mostly of unknown function or encode transposases. Several PA genes encode polypeptides that function in lipopolysaccharide biosynthesis. Both PHX and PA genes often form significant clusters (operons). The proteins encoded by PHX and PA genes are described with respect to functional classifications, their organization in the genome, and their stoichiometry in multi-subunit complexes.

4.
5.
6.
Predicted highly expressed (PHX) genes are compared for 16 gamma-proteobacteria, and their similarities and differences are interpreted with respect to known or predicted physiological characteristics of the organisms. Predicted highly expressed genes often reflect the organism's predominant lifestyle, habitat, nutrition sources, and metabolic propensities. This technique makes it possible to predict the principal metabolic activities of microorganisms operating in their natural habitats. Among our findings is an unusually high number of PHX enzymes acting in cell wall biosynthesis, amino acid biosynthesis, and replication in the ant endosymbiont Blochmannia floridanus. We ascribe the abundance of these PHX genes to specific aspects of the relationship between the bacterium and its host. Xanthomonas campestris is unique in having a very high number of PHX genes acting in flagellum biosynthesis, which may play a special role in its pathogenicity. Shewanella oneidensis possesses three protein complexes that can all function as complex I in the respiratory chain, but only the Na+-transporting NADH:ubiquinone oxidoreductase nqr-2 operon is PHX. The PHX genes of Vibrio parahaemolyticus are consistent with the microorganism's adaptation to extremely fast growth rates. Comparative analysis of PHX genes from complex environmental genomic sequences, as well as from uncultured pathogenic microbes, can provide a novel, useful tool to predict the global flux of matter and key intermediates.

7.
Our environment is burdened with heavy and toxic metals. Microbes, which are abundant in the environment, adapt well to these metal-stressed conditions. A comparative study of five Cupriavidus/Ralstonia genomes can offer a better understanding of the evolutionary mechanisms by which they adapt to these conditions. We studied codon usage among 1051 genes common to all these organisms and identified 15 optimal codons frequently used in the highly expressed genes within this set. We found that the core genes of Cupriavidus metallidurans CH34 have a different optimal codon choice for arginine, glycine, and alanine in comparison with the other four bacteria. We also found that synonymous codon usage bias within these 1051 core genes is highly correlated with their gene expression. This supports the view that translational selection drives synonymous codon usage in the core genes of these genomes. Synonymous codon usage is highly conserved in the core genes of these five genomes; the only exception is C. metallidurans CH34. This genome-wide shift in synonymous codon choice in C. metallidurans CH34 may have taken place because of the insertion of new genes into its genome that help it survive in heavy-metal-containing environments, together with the co-evolution of the other genes in its genome to achieve a balance in gene expression. Structural studies indicated the presence of a longer N-terminal region containing a copper-binding domain in the cupC proteins of C. metallidurans CH34, which helps them attain higher binding efficacy with copper in comparison with their orthologs.

8.
Synonymous codon usage is commonly used to estimate the expression levels of Escherichia coli genes and has also been used to predict highly expressed genes in a number of prokaryotic genomes. By comparing expression level-dependent features in codon usage with protein abundance data from two proteome studies of exponentially growing E. coli and Bacillus subtilis cells, we evaluate whether the implicit assumption of this approach can be confirmed with experimental data. Log-odds ratio scores are used to model differences in codon usage between highly expressed genes and the genomic average. Using these scores, the strength and significance of expression level-dependent features in codon usage were determined for the genes of the Escherichia coli, Bacillus subtilis, and Haemophilus influenzae genomes. The comparison of codon usage features with protein abundance data confirmed that a relationship between the two is present, although exceptions, possibly related to functional context, were found. For species with expression level-dependent features in their codon usage, the applied methodology could be used to improve in silico simulations of the outcome of two-dimensional gel electrophoretic experiments.
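The log-odds idea in the abstract above can be illustrated with a minimal sketch. It is not the authors' scoring scheme: the function names, the pseudocount, and the choice to average the score per codon are all assumptions made for illustration.

```python
import math
from collections import Counter

def codons(seq):
    """Split a coding sequence into complete codons."""
    seq = seq.upper()
    return [seq[i:i + 3] for i in range(0, len(seq) - len(seq) % 3, 3)]

def codon_freqs(seqs, pseudocount=1.0):
    """Relative codon frequencies over a set of sequences.

    The pseudocount (an assumption) keeps unseen codons from producing log(0).
    """
    all_codons = [a + b + c for a in "ACGT" for b in "ACGT" for c in "ACGT"]
    counts = Counter(c for s in seqs for c in codons(s))
    total = sum(counts[c] for c in all_codons) + pseudocount * 64
    return {c: (counts[c] + pseudocount) / total for c in all_codons}

def log_odds_score(seq, f_high, f_background):
    """Mean per-codon log-odds of the 'highly expressed' model versus the background."""
    cs = [c for c in codons(seq) if c in f_high]  # skips codons with ambiguous bases
    if not cs:
        return 0.0
    return sum(math.log(f_high[c] / f_background[c]) for c in cs) / len(cs)
```

A positive score means the gene's codons look more like the highly expressed set than like the genomic average; a negative score means the opposite. In practice `f_high` would be estimated from a reference set of highly expressed genes and `f_background` from all genes in the genome.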

9.
Predicted highly expressed (PHX) genes are comparatively analyzed for six GC-rich Gram-negative phytopathogens: Ralstonia solanacearum, Agrobacterium tumefaciens, Xanthomonas campestris pv. campestris (Xcc), Xanthomonas axonopodis pv. citri (Xac), Pseudomonas syringae pv. tomato, and Xylella fastidiosa. Enzymes involved in energy metabolism, such as ATP synthase, and genes involved in the TCA cycle are PHX in most of these bacteria except X. fastidiosa, which prefers an anaerobic environment. Most pathogenicity-related factors, including flagellar proteins and some outer membrane proteins, are PHX, except that flagellar proteins are missing in X. fastidiosa, which is spread by insects and does not need to move during invasion. Although the proteins of the type III secretion system apparatus are homologous to flagellar proteins, none of them is PHX, which supports the view that the two types of genes have evolved independently. Furthermore, some biosynthesis-related enzymes are shown to be highly expressed in certain bacteria. The PHX genes may provide potential drug targets for the design of new bactericides.

10.
The Horizontal Gene Transfer DataBase (HGT-DB) is a genomic database that includes statistical parameters such as G+C content and codon and amino-acid usage, as well as information about which genes deviate in these parameters, for complete prokaryotic genomes. Under the hypothesis that genes from distantly related species have different nucleotide compositions, these deviating genes may have been acquired by horizontal gene transfer. The current version of the database contains 88 complete bacterial and archaeal genomes, including multiple chromosomes and strains. For each genome, the database provides statistical parameters for all the genes, as well as averages and standard deviations of G+C content, codon usage, relative synonymous codon usage, and amino-acid content. It also provides information about correspondence analyses of codon usage, plus lists of extraneous groups of genes in terms of G+C content and lists of putatively acquired genes. With this information, researchers can explore the G+C content and codon usage of a gene when they find incongruities in sequence-based phylogenetic trees. A search engine that allows searches for gene names or keywords for a specific organism is also available. HGT-DB is freely accessible at http://www.fut.es/~debb/HGT.
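As a rough illustration of the mean and standard deviation approach described above, the sketch below flags genes whose G+C content lies far from the genome-wide mean. It is not the HGT-DB procedure: the function names and the threshold `k` are assumptions, and it ignores the codon and amino-acid usage checks mentioned in the abstract.

```python
from statistics import mean, pstdev

def gc_content(seq):
    """Fraction of G and C among the unambiguous bases of a sequence."""
    seq = seq.upper()
    acgt = sum(seq.count(b) for b in "ACGT")
    return (seq.count("G") + seq.count("C")) / acgt if acgt else 0.0

def gc_outliers(genes, k=1.5):
    """Return names of genes whose G+C content deviates from the mean by more than k SD.

    genes: dict mapping gene name to nucleotide sequence.
    k: deviation threshold in standard deviations (an assumed default).
    """
    gc = {name: gc_content(s) for name, s in genes.items()}
    mu = mean(gc.values())
    sd = pstdev(gc.values())
    if sd == 0:
        return []  # all genes have identical G+C content
    return sorted(name for name, v in gc.items() if abs(v - mu) > k * sd)
```

Genes flagged this way are only candidates: compositional deviation can have causes other than horizontal transfer, which is why the abstract speaks of "putatively acquired" genes.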

11.
Nocardia farcinica is a Gram-positive, filamentous bacterium and is considered an opportunistic pathogen. In this study, the highly expressed genes in N. farcinica were predicted using the codon adaptation index (CAI) as a numerical estimator of gene expressivity. Using ribosomal protein (RP) genes as references, the top ~10% of the genes were classified as predicted highly expressed (PHX) genes in N. farcinica, using a CAI cutoff of greater than 0.73. Consistent with earlier analyses of Streptomyces genomes, most of the PHX genes in N. farcinica were involved in various "house-keeping" functions important for cell growth. However, 15 genes putatively involved in nocardial virulence were also predicted as PHX genes in N. farcinica. These included genes encoding four Mce proteins; cyclopropane fatty acid synthase, which is involved in a modification of the cell wall that may be important for nocardial virulence; polyketide synthase PKS13 for mycolic acid synthesis; and a non-ribosomal peptide synthetase involved in the biosynthesis of a mycobactin-related siderophore. In addition, multiple genes involved in defense against reactive oxygen species (ROS) produced by phagocytes were predicted to have high expressivity, including alkylhydroperoxide reductase (ahpC), catalase (katG), superoxide dismutase (sodF), thioredoxin, thioredoxin reductase, glutathione peroxidase, and peptide methionine sulfoxide reductase, suggesting that combating ROS is essential for the survival of N. farcinica in host cells. The study also showed that the distribution of PHX genes in the N. farcinica circular chromosome was uneven, with more PHX genes located in the regions close to the replication initiation site. The results provide the first estimates of global gene expression patterns in N. farcinica, which will be useful in guiding experimental design for further investigations.
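The CAI calculation that this study relies on can be sketched as follows, using the standard definition (relative adaptiveness of each codon within its synonymous family, then a geometric mean over the gene). This is a minimal sketch, not the authors' pipeline: the function names are illustrative, the standard genetic code is assumed, and replacing zero counts with 0.5 is one common convention rather than something stated in the abstract.

```python
import math
from collections import Counter, defaultdict

# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {a + b + c: AMINO[16 * i + 4 * j + k]
               for i, a in enumerate(BASES)
               for j, b in enumerate(BASES)
               for k, c in enumerate(BASES)}

def codons(seq):
    """Split a coding sequence into complete codons."""
    seq = seq.upper()
    return [seq[i:i + 3] for i in range(0, len(seq) - len(seq) % 3, 3)]

def relative_adaptiveness(reference_seqs):
    """w(codon) = count / count of the most used synonymous codon in the reference set."""
    counts = Counter(c for s in reference_seqs for c in codons(s) if c in CODON_TABLE)
    families = defaultdict(list)
    for codon, aa in CODON_TABLE.items():
        families[aa].append(codon)
    w = {}
    for aa, fam in families.items():
        if aa == "*" or len(fam) == 1:
            continue  # stop codons, Met and Trp carry no synonymous choice
        top = max(counts[c] for c in fam)
        if top == 0:
            continue  # amino acid absent from the reference set
        for c in fam:
            w[c] = max(counts[c], 0.5) / top  # 0.5 avoids log(0); an assumed convention
    return w

def cai(seq, w):
    """Geometric mean of w over the scorable codons of a gene."""
    logs = [math.log(w[c]) for c in codons(seq) if c in w]
    return math.exp(sum(logs) / len(logs)) if logs else 0.0
```

With `w` computed from the ribosomal protein genes of a genome, a gene would count as PHX under the abstract's criterion when `cai(gene, w) > 0.73`; that cutoff is specific to the N. farcinica analysis.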

12.
Synonymous codon usage variation among Giardia lamblia genes and isolates.   Total citations: 3 (self-citations: 0; citations by others: 3)
The pattern of codon usage in the amitochondriate diplomonad Giardia lamblia has been investigated. Very extensive heterogeneity was evident among a sample of 65 genes. A discrete group of genes featured unusual codon usage due to the amino acid composition of their products: these variant surface proteins (VSPs) are unusually rich in Cys and, to a lesser extent, Gly and Thr. Among the remaining 50 genes, correspondence analysis revealed a single major source of variation in synonymous codon usage. This trend was related to the extent of use of a particular subset of 21 codons that are inferred to be optimal for translation. At one end of this trend were genes expected to be expressed at low levels, with near-random codon usage, while at the other extreme were genes expressed at high levels, in which these optimal codons are used almost exclusively. These optimal codons all end in C or G, so G+C content at silent sites varies enormously among genes, from values around 40%, expected to reflect the background level of the genome, up to nearly 100%. Although VSP genes are occasionally extremely highly expressed, they do not, in general, have high frequencies of optimal codons, presumably because their high expression is only intermittent. These results indicate that natural selection has been very effective in shaping codon usage in G. lamblia. These analyses focused on sequences from strains placed within G. lamblia "assemblage A"; a few sequences from other strains revealed extensive divergence at silent sites, including some divergence in the pattern of codon usage.

13.
Seven GC-rich (group I) and three AT-rich (group II) microbial genomes are analyzed in this paper. The seven microbes in group I belong to different phylogenetic lineages, and even to different domains of life. Their common feature is that they are highly GC-rich organisms, with more than 60% genomic GC content. Group II includes three bacteria that belong to the same subdivision as Pseudomonas aeruginosa in group I. The genomic GC content of these three bacteria is in the range of 26-50%. It is shown that although the phylogenetic lineages of the organisms in group I are remote, their shared high genomic GC content forces them to adopt similar codon usage patterns, which forms the basis of an algorithm that uses a set of universal parameters to recognize known genes in the seven genomes. The common codon usage pattern of genes of known function in the seven genomes is of the GḠS type, where G, Ḡ, and S denote the bases G, non-G, and G/C, respectively. By contrast, although the phylogenetic lineages of the three bacteria in group II are quite close, the codon usage patterns of genes of known function in these genomes are clearly distinct. There are no universal parameters for identifying known genes in the three genomes in group II. It can be deduced that genomic GC content is more important than phylogenetic lineage in gene recognition programs. We hope that this work will be useful for understanding the common characteristics in the organization of microbial genomes.

14.
To study possible variation in codon usage and base composition in bacteriophages, fourteen mycobacteriophages were used as a model system, and both parameters were determined and compared in all these phages and in their plating bacterium, M. smegmatis. As all the organisms are GC-rich, the GC content at third codon positions was found to be higher than at second codon positions, and higher than at first plus second codon positions, in all the organisms, indicating that directional mutational pressure is strongly operative at the synonymous third codon positions. The Nc plot indicates that codon usage variation in all these organisms is governed by forces other than compositional constraints. Correspondence analysis suggests that: (i) there is codon usage variation among the genes and genomes of the fourteen mycobacteriophages and M. smegmatis, i.e., codon usage patterns in the mycobacteriophages are phage-specific rather than M. smegmatis-specific; (ii) the synonymous codon usage patterns of Barnyard, Che8, Che9d, and Omega are more similar to one another than to those of the remaining mycobacteriophages and M. smegmatis; (iii) codon usage bias in the mycobacteriophages is mainly determined by mutational pressure; and (iv) the genes of comparatively GC-rich genomes are more biased than those of GC-poor genomes. A role for translational selection in determining codon usage variation in highly expressed genes can be inferred from the predominance of C-ending codons in those genes. Cluster analysis based on codon usage data also shows that the fourteen mycobacteriophages fall into two distinct branches and that codon usage varies even among the phages of each branch.

15.
16.
E P Rocha, A Danchin, A Viari. Nucleic Acids Research, 1999, 27(17): 3567-3576
We analysed the termini of the Bacillus subtilis protein coding sequences and compared them with those of other genomes. The analysis focused on signals, compositional biases of nucleotides, oligonucleotides, codons and amino acids, and mRNA secondary structure. AUG is the preferred start codon in all genomes, independent of their G+C content, and seems to induce less stable mRNA structures. However, it is not conserved between homologous genes, nor is it preferred in highly expressed genes. In B. subtilis the ribosome binding site is very strong. We found that downstream boxes do not seem to exist either in Escherichia coli or in B. subtilis. UAA stop codon usage is correlated with the G+C content and is strongly selected in highly expressed genes. We found less stable mRNA structures at both termini, which we related to mRNA-ribosome and mRNA-release-factor interactions. This pattern seems to impose a peculiar A-rich nucleotide and codon usage bias in these regions. Finally, the analysis of all proteins from B. subtilis revealed a similar amino acid bias near both termini of proteins, consisting of an over-representation of hydrophilic residues. This bias near the stop codon is partially release-factor specific.

17.
The "expression measure" of a gene, E(g), is a statistic devised to predict the level of gene expression from codon usage bias. E(g) has been used extensively to analyze prokaryotic genome sequences. We discuss two problems with this approach. First, the formulation of E(g) is such that genes with the strongest selected codon usage bias are not likely to have the highest predicted expression levels; indeed, the correlation between E(g) and expression level is weak among moderately to highly expressed genes. Second, in some species, highly expressed genes do not have unusual codon usage, and so codon usage cannot be used to predict expression levels. We outline a simple approach, first to check whether a genome shows evidence of selected codon usage bias and then to assess the strength of bias in genes as a guide to their likely expression level; we illustrate this with an analysis of Shewanella oneidensis.

18.
The codon usage patterns of rhizobia have received increasing attention. However, little information is available regarding the conserved features of the codon usage patterns in a typical rhizobial genus. In the present study, the codon usage patterns of six completely sequenced strains belonging to the genus Rhizobium were analysed as model rhizobia. The relative neutrality plot showed that selection pressure played a role in codon usage in the genus Rhizobium. Spearman's rank correlation analysis combined with correspondence analysis (COA) showed that the codon adaptation index and the effective number of codons (ENC) were strongly correlated with the first axis of the COA, which indicated the important role of gene expression level and the ENC in the codon usage patterns in this genus. The relative synonymous codon usage of Cys codons had the strongest correlation with the second axis of the COA. Accordingly, the usage of Cys codons was another important factor that shaped the codon usage patterns in Rhizobium genomes and was a conserved feature of the genus. Moreover, the comparison of codon usage between highly and lowly expressed genes showed that 20 unique preferred codons were shared among Rhizobium genomes, revealing another conserved feature of the genus. This is the first report of the codon usage patterns in the genus Rhizobium.
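Relative synonymous codon usage (RSCU), the quantity correlated with the COA axes above, is the observed count of a codon divided by the count expected if all synonymous codons for that amino acid were used equally. The sketch below uses that standard definition; the function names are illustrative and the standard genetic code is assumed.

```python
from collections import Counter, defaultdict

# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {a + b + c: AMINO[16 * i + 4 * j + k]
               for i, a in enumerate(BASES)
               for j, b in enumerate(BASES)
               for k, c in enumerate(BASES)}

def codons(seq):
    """Split a coding sequence into complete codons."""
    seq = seq.upper()
    return [seq[i:i + 3] for i in range(0, len(seq) - len(seq) % 3, 3)]

def rscu(seqs):
    """RSCU per codon: observed count * family size / total count for the amino acid."""
    counts = Counter(c for s in seqs for c in codons(s) if c in CODON_TABLE)
    families = defaultdict(list)
    for codon, aa in CODON_TABLE.items():
        families[aa].append(codon)
    values = {}
    for aa, fam in families.items():
        if aa == "*":
            continue  # stop codons are not scored
        total = sum(counts[c] for c in fam)
        if total == 0:
            continue  # amino acid not observed; RSCU undefined
        for c in fam:
            values[c] = counts[c] * len(fam) / total
    return values
```

An RSCU of 1.0 means a codon is used exactly as often as expected under equal usage; for the two Cys codons (TGT and TGC), the two values always sum to 2.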

19.
20.
Wang B, Liu J, Jin L, Feng XY, Chen JQ. Journal of Integrative Plant Biology (植物学报, English edition), 2010, 52(12): 1100-1108
Mutation and selection are two major forces causing codon usage biases. How these two forces influence codon usage in green plant mitochondrial genomes has not been well investigated. In the present study, we surveyed five bryophyte mitochondrial genomes to reveal their codon usage patterns as well as the determining forces. Three interesting findings were made. First, compared with Chara vulgaris, an algal species sister to all extant land plants, bryophytes use more G- and C-ending codons in their mitochondrial genes. This is consistent with the generally higher genomic GC content in bryophyte mitochondria, suggesting an increased mutational pressure toward GC. Second, as indicated by Wright's Nc-GC3s plot, mutation, not selection, is the major force affecting the codon usage of bryophyte mitochondrial genes. However, the real mutational dynamics seem very complex. Context-dependent analysis indicated that the nucleotide at the 2nd codon position slightly affects synonymous codon choices. Finally, in bryophyte mitochondria, tRNA genes apply a weak selection force that fine-tunes synonymous codon frequencies, as revealed by data from the Ser4-Pro-Thr-Val families. In summary, complex mutation and weak selection together determine codon usage in bryophyte mitochondrial genomes.
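The Nc-GC3s plot mentioned above compares each gene's effective number of codons (Nc) with the value expected if GC3s alone determined codon usage, commonly written as Nc = 2 + s + 29 / (s^2 + (1 - s)^2), where s is GC3s. The sketch below computes that expected curve. The function names are illustrative, and `gc3` is a simplification: it counts G/C at all third positions instead of restricting to synonymous sites as GC3s does.

```python
def gc3(seq):
    """Fraction of G or C at third codon positions (an approximation of GC3s)."""
    seq = seq.upper()
    thirds = [seq[i + 2] for i in range(0, len(seq) - len(seq) % 3, 3)]
    thirds = [b for b in thirds if b in "ACGT"]  # ignore ambiguous bases
    if not thirds:
        return 0.0
    return sum(b in "GC" for b in thirds) / len(thirds)

def expected_nc(s):
    """Expected Nc when codon usage is shaped only by GC3s composition."""
    return 2 + s + 29 / (s ** 2 + (1 - s) ** 2)
```

Genes that fall on or near this curve are read as consistent with mutational bias as the main force, while genes well below it suggest additional pressures such as selection.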
