Similar articles
Found 20 similar articles (search time: 15 ms)
1.

Background  

Codon adaptation indices (CAIs) represent an evolutionary strategy to modulate gene expression and have been widely used to predict potentially highly expressed genes within microbial genomes. Here, we evaluate and compare two very different methods for estimating CAI values, one corresponding to translational codon usage bias and the second obtained mathematically by searching for the most dominant codon bias.

2.

Background

Synonymous codon usage varies widely between genomes, and also between genes within genomes. Although there is now a large body of data on variations in codon usage, it is still not clear if the observed patterns reflect the effects of positive Darwinian selection acting at the level of translational efficiency or whether these patterns are due simply to the effects of mutational bias. In this study, we have included both intra-genomic and inter-genomic comparisons of codon usage. This allows us to distinguish more efficiently between the effects of nucleotide bias and translational selection.

Results

We show that there is an extreme degree of heterogeneity in codon usage patterns within the rice genome, and that this heterogeneity is highly correlated with differences in nucleotide content (particularly GC content) between the genes. In contrast to the situation observed within the rice genome, Arabidopsis genes show relatively little variation in both codon usage and nucleotide content. By exploiting a combination of intra-genomic and inter-genomic comparisons, we provide evidence that the differences in codon usage among the rice genes reflect a relatively rapid evolutionary increase in the GC content of some rice genes. We also noted that the degree of codon bias was negatively correlated with gene length.

Conclusion

Our results show that mutational bias can cause a dramatic evolutionary divergence in codon usage patterns within a period of approximately two hundred million years. The heterogeneity of codon usage patterns within the rice genome can be explained by a balance between genome-wide mutational biases and negative selection against these biased mutations. The strength of the negative selection is proportional to the length of the coding sequences. Our results indicate that the large variations in synonymous codon usage are not related to selection acting on the translational efficiency of synonymous codons.

3.
It is well known that an amino acid can be encoded by more than one codon; such codons are called synonymous codons. The preferential use of one particular codon for coding an amino acid is referred to as codon usage bias (CUB). A quantitative analytical method, CUB, and a related tool, the Codon Adaptation Index (CAI), have been applied to comparatively study whole genomes of a few pathogenic Trypanosomatid species. This quantitative attempt is of direct help in the comparison of qualitative features like mutational and translational selection. Pathogens of the genera Leishmania and Trypanosoma cause debilitating disease and suffering in human beings and animals. Of these, whole genome sequences are available for only five species. The complete coding sequences (CDS) and the highly expressed, essential and low expressed genes have all been studied for their CUB signature. The codon usage bias of essential genes and highly expressed genes shows a distribution similar to the codon usage bias of all CDSs in Trypanosomatids. Translational selection is the dominant force selecting the preferred codon, and selection due to mutation is negligible. In contrast to an earlier study done on these pathogens, it is found in this work that CUB and CAI may be used to distinguish the Trypanosomatid genomes at the sub-genus level. Further, CUB may effectively be used as a signature of species differentiation by using Principal Component Analysis (PCA).

Abbreviations

CUB - Codon Usage Bias, CAI - Codon Adaptation Index, CDS - Coding sequences, t-RNA - Transfer RNA, PCA - Principal Component Analysis.

4.

Background

Equine infectious anemia virus (EIAV) is an important animal model for understanding the relationship between viral persistence and the host immune response during lentiviral infections. Comparison and analysis of the codon usage model between EIAV and its hosts is important for the comprehension of viral evolution. In our study, the codon usage pattern of EIAV was analyzed from the available 29 full-length EIAV genomes through multivariate statistical methods.

Finding

The effective number of codons (ENC) suggests that the codon usage among EIAV strains is slightly biased. The ENC-plot analysis demonstrates that mutation pressure plays a substantial role in the codon usage pattern of EIAV, whereas other factors such as geographic distribution and host translation selection also take part in the process of EIAV evolution. Comparative analysis of codon adaptation index (CAI) values among EIAV and its hosts suggests that EIAV utilizes the translational resources of the horse more efficiently than those of the donkey.
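As a rough illustration of the ENC statistic mentioned above, the following sketch computes Wright's effective number of codons for a single coding sequence. It is not taken from the study: the function name `effective_number_of_codons`, the use of the standard genetic code, and the handling of sparse amino acids (families with fewer than two codons observed, or non-positive homozygosity, are skipped) are assumptions made for this example.

```python
from collections import Counter, defaultdict
from itertools import product

# Standard genetic code, built from the usual TCAG ordering (assumed here).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def effective_number_of_codons(cds):
    """Wright's ENC for one in-frame coding sequence (hypothetical helper)."""
    cds = cds.upper()
    codons = (cds[i:i + 3] for i in range(0, len(cds) - len(cds) % 3, 3))
    counts = Counter(c for c in codons if c in CODON_TABLE)

    # Group sense codons into synonymous families, one per amino acid.
    families = defaultdict(list)
    for codon, aa in CODON_TABLE.items():
        if aa != "*":
            families[aa].append(codon)

    # Codon homozygosity F per amino acid, collected by degeneracy class.
    f_by_class = defaultdict(list)
    for family in families.values():
        k = len(family)
        n = sum(counts[c] for c in family)
        if k == 1 or n < 2:
            continue
        f = (n * sum((counts[c] / n) ** 2 for c in family) - 1) / (n - 1)
        if f > 0:
            f_by_class[k].append(f)

    mean_f = {k: sum(v) / len(v) for k, v in f_by_class.items()}
    # If isoleucine is unusable, approximate the 3-fold class from its neighbours.
    if 3 not in mean_f and 2 in mean_f and 4 in mean_f:
        mean_f[3] = (mean_f[2] + mean_f[4]) / 2
    if any(k not in mean_f for k in (2, 3, 4, 6)):
        return float("nan")

    # 2 single-codon amino acids + 9 two-fold + 1 three-fold + 5 four-fold + 3 six-fold.
    nc = 2 + 9 / mean_f[2] + 1 / mean_f[3] + 5 / mean_f[4] + 3 / mean_f[6]
    return min(nc, 61.0)
```

Values near 20 indicate that each amino acid is encoded by essentially one codon, and values near 61 indicate nearly uniform use of synonymous codons, which is why a mean ENC well above 40 is usually read as weak bias.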

Conclusion

The codon usage bias in EIAV is slight and mutation pressure is the main factor that affects codon usage variation in EIAV. These results suggest that EIAV genomic biases are the result of the co-evolution of genome composition and the ability to evade the host’s immune response.

5.

Background

Same-strand overlapping genes may occur in frameshifts of one (phase 1) or two nucleotides (phase 2). In previous studies of bacterial genomes, long phase-1 overlaps were found to be more numerous than long phase-2 overlaps. This bias was explained by either genomic location or an unspecified selection advantage. Models that focused on the ability of the two genes to evolve independently did not predict this phase bias. Here, we propose that a purely compositional model explains the phase bias in a more parsimonious manner. Same-strand overlapping genes may arise through either a mutation at the termination codon of the upstream gene or a mutation at the initiation codon of the downstream gene. We hypothesized that given these two scenarios, the frequencies of initiation and termination codons in the two phases may determine the number of overlapping genes.

Results

We examined the frequencies of initiation- and termination-codons in the two phases, and found that termination codons do not significantly differ between the two phases, whereas initiation codons are more abundant in phase 1. We found that the primary factors explaining the phase inequality are the frequencies of amino acids whose codons may combine to form start codons in the two phases. We show that the frequencies of start codons in each of the two phases, and, hence, the potential for the creation of overlapping genes, are determined by a universal amino-acid frequency and species-specific codon usage, leading to a correlation between long phase-1 overlaps and genomic GC content.
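The kind of count described above can be sketched as follows. This is only an illustration under stated assumptions, not the authors' pipeline: the function name `start_codons_by_phase` is hypothetical, phase k is taken to mean reading triplets at an offset of k nucleotides from the annotated frame, and ATG, GTG and TTG are assumed as the bacterial start codons.

```python
# Assumed set of bacterial initiation codons (DNA alphabet).
START_CODONS = frozenset({"ATG", "GTG", "TTG"})


def start_codons_by_phase(cds, starts=START_CODONS):
    """Count potential start codons in the two shifted frames of a CDS.

    Phase 1 reads triplets from offset 1 and phase 2 from offset 2,
    both relative to the annotated reading frame (offset 0).
    """
    cds = cds.upper()
    counts = {}
    for phase in (1, 2):
        counts[phase] = sum(
            1
            for i in range(phase, len(cds) - 2, 3)
            if cds[i:i + 3] in starts
        )
    return counts
```

For example, `start_codons_by_phase("AATGAAATGA")` finds one ATG in phase 1 and none in phase 2. Summing such counts over all genes of a genome, and normalizing by the number of triplets scanned, would give per-phase start-codon frequencies comparable between species.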

Conclusion

Our model explains the phase bias in same-strand overlapping genes by compositional factors without invoking selection. Therefore, it can be used as a null model of neutral evolution to test selection hypotheses concerning the evolution of overlapping genes.

Reviewers

This article was reviewed by Bill Martin, Itai Yanai, and Mikhail Gelfand.

6.
SK Behura, DW Severson. PLoS ONE 2012, 7(8): e43111

Background

Codon bias is the phenomenon of non-uniform usage of codons, whereas codon context generally refers to a sequential pair of codons in a gene. Although genome sequencing of multiple species of dipteran and hymenopteran insects has been completed, only a few of these species have been analyzed for codon usage bias.

Methods and Principal Findings

Here, we use bioinformatics approaches to analyze codon usage bias and codon context patterns in a genome-wide manner among 15 dipteran and 7 hymenopteran insect species. Results show that GAA is the most frequent codon in the dipteran species whereas GAG is the most frequent codon in the hymenopteran species. The data reveal that codons ending with C or G are frequently used in the dipteran genomes whereas codons ending with A or T are frequently used in the hymenopteran genomes. Synonymous codon usage orders (SCUO) vary within genomes in a pattern that seems to be distinct for each species. Based on a comparison of 30 one-to-one orthologous genes among 17 species, the fruit fly Drosophila willistoni shows the least codon usage bias whereas the honey bee (Apis mellifera) shows the highest bias. Analysis of codon context patterns of these insects shows that specific codons are frequently used as the 3′- and 5′-context of start and stop codons, respectively.

Conclusions

The codon bias pattern is distinct between dipteran and hymenopteran insects. While codon bias is favored by the high GC content of dipteran genomes, the high AT content of genes favors biased usage of synonymous codons in the hymenopteran insects. Also, codon context patterns vary among these species largely according to their phylogeny.

7.
The study of codon usage bias is an important research area that contributes to our understanding of molecular evolution, phylogenetic relationships, respiratory lifestyle, and other characteristics. Translational efficiency bias is perhaps the most well-studied codon usage bias, as it is frequently utilized to predict relative protein expression levels. We present a novel approach to isolating translational efficiency bias in microbial genomes. Several methods for isolating translational efficiency bias already exist, but they are susceptible to the confounding influences of other potentially dominant biases. Additionally, existing approaches to identifying translational efficiency bias generally require both genomic sequence information and prior knowledge of a set of highly expressed genes. This novel approach provides more accurate results from sequence information alone by resisting the confounding effects of other biases. We validate this increase in accuracy in isolating translational efficiency bias on 10 microbial genomes, five of which have proven particularly difficult for existing approaches due to the presence of strong confounding biases.

8.

Background

Duplicated genes frequently experience asymmetric rates of sequence evolution. Relaxed selective constraints and positive selection have both been invoked to explain the observation that one paralog within a gene-duplicate pair exhibits an accelerated rate of sequence evolution. In the majority of studies where asymmetric divergence has been established, there is no indication as to which gene copy, ancestral or derived, is evolving more rapidly. In this study we investigated the effect of local synteny (gene-neighborhood conservation) and codon usage on the sequence evolution of gene duplicates in the S. cerevisiae genome. We further distinguish the gene duplicates into those that originated from a whole-genome duplication (WGD) event (ohnologs) versus small-scale duplications (SSD) to determine if there exist any differences in their patterns of sequence evolution.

Results

For SSD pairs, the derived copy evolves faster than the ancestral copy. However, there is no relationship between rate asymmetry and synteny conservation (ancestral-like versus derived-like) in ohnologs. mRNA abundance and optimal codon usage, as measured by the CAI, are lower in the derived SSD copies relative to ancestral paralogs. Moreover, in the case of ohnologs, the faster-evolving copy has lower CAI and lowered expression.

Conclusions

Together, these results suggest that relaxation of selection for codon usage and gene expression contribute to rate asymmetry in the evolution of duplicated genes and that in SSD pairs, the relaxation of selection stems from the loss of ancestral regulatory information in the derived copy.

9.
It is generally believed that the effect of translational selection on codon usage bias is related to the number of transfer RNA genes in bacteria, and that this effect is stronger in highly expressed genes than in the whole genome. With this in mind, we analyzed codon usage bias for the amino acids asparagine, isoleucine, phenylalanine, and tyrosine. The analysis was done in seventeen bacteria for which gene expression data and tRNA gene numbers were available. In most of the bacteria, codon usage bias and tRNA gene number were not in agreement, which was unexpected. We extended the study to 199 bacteria, limiting it to the codon usage bias in the two highly expressed genes rpoB and rpoC, which encode the RNA polymerase subunits β and β′, respectively. In concordance with the result for the highly expressed genes, codon usage bias in the rpoB and rpoC genes was also found not to be in agreement with tRNA gene number in many of these bacteria. Our study indicates that tRNA gene numbers may not be the sole determining factor for translational selection of codon usage bias in bacterial genomes.

10.
Highly expressed genes in any species differ in the usage frequency of synonymous codons. The relative recurrence of the favored codon pair (amino acid pair) varies between genes and genomes due to varying gene expression and different base composition. Here we propose a new measure for predicting the gene expression level, the codon plus amino bias index (CABI). Our approach is based on the relative bias of the favored codon pair inclination among the genes, illustrated by analyzing the CABI score of the Medicago truncatula genes. CABI showed strong correlation with all other widely used measures (CAI, RCBS, SCUO) for gene expression analysis. Surprisingly, CABI outperforms all other measures by showing better correlation with the wet-lab data. This emphasizes the importance of the neighboring codons of the favored codon in a synonymous group when estimating the expression level of a gene.

11.
The present study aims at a comparative analysis of high-GC Corynebacterium genomes and their evolution by exploring codon and amino acid usage patterns. Phylogenetic study by the MLSA approach, indel analysis and BLAST matrix differentiated Corynebacterium species into pathogenic and non-pathogenic clusters. Correspondence analysis of synonymous codon usage reveals that gene length, optimal codon frequencies and tRNA abundance affect the gene expression of Corynebacterium. Most of the optimal codons, as well as the translationally optimal codons, are C-ending, i.e. RNY (R: purine, N: any nucleotide base, Y: pyrimidine), and reveal translational selection pressure on the codon bias of Corynebacterium. Amino acid usage is affected by hydrophobicity, aromaticity, protein energy cost, etc. Highly expressed genes followed the cost minimization hypothesis and are less diverged at the synonymous positions of their codons. Functional analysis of core genes shows significant differences between pathogenic and non-pathogenic Corynebacterium. The study reveals a close relationship between non-pathogenic and opportunistic pathogenic Corynebacterium as well as between molecular evolution and the survival niches of the organism.

12.

Background

In many microbial genomes, a strong preference for a small number of codons can be observed in genes whose products are needed by the cell in large quantities. This codon usage bias (CUB) improves translational accuracy and speed and is one of several factors optimizing cell growth. Whereas CUB and the overrepresentation of individual proteins have been studied in detail, it is still unclear which high-level metabolic categories are subject to translational optimization in different habitats.

Results

In a systematic study of 388 microbial species, we have identified for each genome a specific subset of genes characterized by a marked CUB, which we named the effectome. As expected, gene products related to protein synthesis are abundant in both archaeal and bacterial effectomes. In addition, enzymes contributing to energy production and gene products involved in protein folding and stabilization are overrepresented. The comparison of genomes from eleven habitats shows that the environment has only a minor effect on the composition of the effectomes. As a paradigmatic example, we detailed the effectome content of 37 bacterial genomes that are most likely exposed to the strongest selective pressure towards translational optimization. These effectomes accommodate a broad range of protein functions, such as enzymes related to glycolysis/gluconeogenesis and the TCA cycle, ATP synthases, aminoacyl-tRNA synthetases, chaperones, proteases that degrade misfolded proteins, protectants against oxidative damage, as well as cold shock and outer membrane proteins.

Conclusions

We show that effectomes consist of specific subsets of the proteome that are involved in several cellular functions. As expected, some functions are related to cell growth and affect the speed and quality of protein synthesis. Additionally, the effectomes contain enzymes of central metabolic pathways and cellular functions sustaining microbial life under stress situations. These findings indicate that cell growth is an important but not the only factor modulating translational accuracy and speed by means of CUB.

13.

Background

Animal mitochondrial genomes are physically separate from the much larger nuclear genomes and have proven useful both for phylogenetic studies and for understanding genome evolution. Within the phylum Arthropoda the subphylum Crustacea includes over 50,000 named species with immense variation in body plans and habitats, yet only 23 complete mitochondrial genomes are available from this subphylum.

Results

I describe here the complete mitochondrial genome of the crustacean Squilla mantis (Crustacea: Malacostraca: Stomatopoda). This 15994-nucleotide genome, the first described from a hoplocarid, contains the standard complement of 13 protein-coding genes, 22 transfer RNA genes, two ribosomal RNA genes, and a non-coding AT-rich region that is found in most other metazoans. The gene order is identical to that considered ancestral for hexapods and crustaceans. The 70% AT base composition is within the range described for other arthropods. A single unusual feature of the genome is a 230 nucleotide non-coding region between a serine transfer RNA and the nad1 gene, which has no apparent function. I also compare gene order, nucleotide composition, and codon usage of the S. mantis genome and eight other malacostracan crustaceans. A translocation of the histidine transfer RNA gene is shared by three taxa in the order Decapoda, infraorder Brachyura; Callinectes sapidus, Portunus trituberculatus and Pseudocarcinus gigas. This translocation may be diagnostic for the Brachyura. For all nine taxa nucleotide composition is biased towards AT-richness, as expected for arthropods, and is within the range reported for other arthropods. Codon usage is biased, and much of this bias is probably due to the skew in nucleotide composition towards AT-richness.

Conclusion

The mitochondrial genome of Squilla mantis contains one unusual feature, a 230 base pair non-coding region that has so far not been described in any other malacostracan. Comparisons with other Malacostraca show that all nine genomes, like most other mitochondrial genomes, share a bias toward AT-richness and a related bias in codon usage. The nine malacostracans included in this analysis are not representative of the diversity of the class Malacostraca, and additional malacostracan sequences would surely reveal other unusual genomic features that could be useful in understanding mitochondrial evolution in this taxon.

14.
The helicase gene of Autographa californica multiple nucleopolyhedrovirus (AcMNPV) is not only involved in viral DNA replication, but also plays a role in viral host range. To identify the codon usage bias of the AcMNPV helicase, the codon usage of helicase was studied in AcMNPV and 41 reference strains of baculoviruses by calculating the codon adaptation index (CAI), effective number of codons (ENc), relative synonymous codon usage (RSCU), and other indices. The helicase of baculovirus is less biased (mean ENc = 50.539 > 40; mean CAI = 0.246). AcMNPV helicase has a strong bias toward the synonymous codons with G and C at the third codon position (GC3s = 53.6%). The plot of GC3s against ENc values revealed that GC compositional constraints are the main factor that determines the codon usage bias of most helicases. Several indicators supported that the codon usage pattern of helicase is mainly subject to mutation pressure. Analysis of variation in codon usage and amino acid composition indicated that AcMNPV helicase shows a significant preference for one or more postulated codons for each amino acid. A cluster analysis based on RSCU values suggested that AcMNPV is evolutionarily closer to members of group I alphabaculovirus. Comparison of the codon usage pattern among E. coli, yeast, mouse, human and AcMNPV showed that yeast is a suitable expression system for AcMNPV helicase. AcMNPV helicase shows weak codon usage bias. This study may help in elucidating the functional mechanism of AcMNPV helicase and the evolution of baculovirus helicases.
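Two of the indices named above, RSCU and GC3s, are simple enough to sketch directly. The code below is an illustrative sketch and not the software used in the study; the function names `count_codons`, `rscu` and `gc3s` are hypothetical, the standard genetic code is assumed, and GC3s is computed over synonymous codons only (Met, Trp and stop codons excluded).

```python
from collections import Counter, defaultdict
from itertools import product

# Standard genetic code, built from the usual TCAG ordering (assumed here).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def count_codons(cds):
    """Count in-frame codons of a coding sequence, ignoring ambiguous triplets."""
    cds = cds.upper()
    codons = (cds[i:i + 3] for i in range(0, len(cds) - len(cds) % 3, 3))
    return Counter(c for c in codons if c in CODON_TABLE)


def rscu(counts):
    """RSCU: observed count divided by the mean count within the synonymous family."""
    families = defaultdict(list)
    for codon, aa in CODON_TABLE.items():
        if aa != "*":
            families[aa].append(codon)
    values = {}
    for family in families.values():
        total = sum(counts[c] for c in family)
        if total == 0:
            continue  # amino acid absent from the sequence
        for c in family:
            values[c] = counts[c] * len(family) / total
    return values


def gc3s(counts):
    """Fraction of synonymous codons that carry G or C at the third position."""
    synonymous = [c for c, aa in CODON_TABLE.items() if aa not in "*MW"]
    total = sum(counts[c] for c in synonymous)
    gc = sum(counts[c] for c in synonymous if c[2] in "GC")
    return gc / total if total else float("nan")
```

An RSCU of 1 means a codon is used exactly as often as expected under uniform synonymous usage; values above 1 mark preferred codons. Vectors of RSCU values per genome are the usual input for the kind of cluster analysis the abstract describes.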

15.
The Selective Advantage of Synonymous Codon Usage Bias in Salmonella   (Total citations: 1; self-citations: 0; citations by others: 1)
The genetic code in mRNA is redundant, with 61 sense codons translated into 20 different amino acids. Individual amino acids are encoded by up to six different codons but within codon families some are used more frequently than others. This phenomenon is referred to as synonymous codon usage bias. The genomes of free-living unicellular organisms such as bacteria have an extreme codon usage bias and the degree of bias differs between genes within the same genome. The strong positive correlation between codon usage bias and gene expression levels in many microorganisms is attributed to selection for translational efficiency. However, this putative selective advantage has never been measured in bacteria and theoretical estimates vary widely. By systematically exchanging optimal codons for synonymous codons in the tuf genes we quantified the selective advantage of biased codon usage in highly expressed genes to be in the range 0.2–4.2 × 10⁻⁴ per codon per generation. These data quantify for the first time the potential for selection on synonymous codon choice to drive genome-wide sequence evolution in bacteria, and in particular to optimize the sequences of highly expressed genes. This quantification may have predictive applications in the design of synthetic genes and for heterologous gene expression in biotechnology.

16.
17.
Palidwor GA, Perkins TJ, Xia X. PLoS ONE 2010, 5(10): e13431

Background

In spite of extensive research on the effect of mutation and selection on codon usage, a general model of codon usage bias due to mutational bias has been lacking. Because most amino acids allow synonymous GC-content-changing substitutions in the third codon position, the overall GC bias of a genome or genomic region is highly correlated with GC3, a measure of third position GC content. For individual amino acids as well, the usage of G/C-ending codons generally increases with increasing GC bias and decreases with increasing AT bias. Arginine and leucine, amino acids that allow GC-changing synonymous substitutions in the first and third codon positions, have codons which may be expected to show different usage patterns.

Principal Findings

In analyzing codon usage bias in hundreds of prokaryotic and plant genomes and in human genes, we find that two G-ending codons, AGG (arginine) and TTG (leucine), unlike all other G/C-ending codons, show overall usage that decreases with increasing GC bias, contrary to the usual expectation that G/C-ending codon usage should increase with increasing genomic GC bias. Moreover, the usage of some codons appears nonlinear, even nonmonotone, as a function of GC bias. To explain these observations, we propose a continuous-time Markov chain model of GC-biased synonymous substitution. This model correctly predicts the qualitative usage patterns of all codons, including nonlinear codon usage in isoleucine, arginine and leucine. The model accounts for 72%, 64% and 52% of the observed variability of codon usage in prokaryotes, plants and human respectively. When codons are grouped based on common GC content, 87%, 80% and 68% of the variation in usage is explained for prokaryotes, plants and human respectively.

Conclusions

The model clarifies the sometimes-counterintuitive effects that GC mutational bias can have on codon usage, quantifies the influence of GC mutational bias and provides a natural null model relative to which other influences on codon bias may be measured.

18.
G + C composition at the third codon position (GC3) is widely reported to be correlated with synonymous codon usage bias. However, no quantitative attempt has been made to compare the extent of this correlation among different genomes. Here, we applied Shannon entropy from information theory to measure the degree of GC3 bias and that of synonymous codon usage bias of each gene. The strength of the correlation of GC3 with synonymous codon usage bias, quantified by a correlation coefficient, varied widely among bacterial genomes, ranging from [inline image not recovered] 0.07 to 0.95. Previous analyses suggesting that the relationship between GC3 and synonymous codon usage bias is independent of species are thus inconsistent with the more detailed analyses obtained here for individual species.
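One plausible way to turn Shannon entropy into per-gene bias scores is sketched below. The abstract does not give the exact formulas, so everything here is an assumption for illustration: bias is defined as one minus the normalized entropy, synonymous bias is averaged over amino acids weighted by their occurrence, and the function names `synonymous_usage_bias` and `gc3_bias` are hypothetical.

```python
import math
from collections import Counter, defaultdict
from itertools import product

# Standard genetic code, built from the usual TCAG ordering (assumed here).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def shannon_entropy(probs):
    """Entropy in bits; zero-probability terms contribute nothing."""
    return -sum(p * math.log2(p) for p in probs if p > 0)


def split_codons(cds):
    cds = cds.upper()
    return [cds[i:i + 3] for i in range(0, len(cds) - len(cds) % 3, 3)]


def synonymous_usage_bias(cds):
    """1 - H/Hmax per amino acid, averaged with amino acid counts as weights."""
    counts = Counter(c for c in split_codons(cds) if c in CODON_TABLE)
    families = defaultdict(list)
    for codon, aa in CODON_TABLE.items():
        if aa not in "*MW":  # only amino acids with synonymous codons
            families[aa].append(codon)
    weighted, n = 0.0, 0
    for family in families.values():
        total = sum(counts[c] for c in family)
        if total == 0:
            continue
        h = shannon_entropy(counts[c] / total for c in family)
        weighted += total * (1 - h / math.log2(len(family)))
        n += total
    return weighted / n if n else float("nan")


def gc3_bias(cds):
    """1 - binary entropy of the G+C fraction at third codon positions."""
    thirds = [c[2] for c in split_codons(cds)]
    if not thirds:
        return float("nan")
    p = sum(b in "GC" for b in thirds) / len(thirds)
    return 1 - shannon_entropy([p, 1 - p])
```

Both scores run from 0 (no bias, maximal entropy) to 1 (a single codon or a single third-position class). Computing them for every gene of a genome and correlating the two lists would give a per-genome coefficient of the kind the abstract compares across species.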

19.

Background

The number of completely sequenced plastid genomes available is growing rapidly. This array of sequences presents new opportunities to perform comparative analyses. In comparative studies, it is often useful to compare across wide phylogenetic spans and, within angiosperms, to include representatives from basally diverging lineages such as the genomes reported here: Nuphar advena (from a basal-most lineage) and Ranunculus macranthus (a basal eudicot). We report these two new plastid genome sequences and make comparisons (within angiosperms, seed plants, or all photosynthetic lineages) to evaluate features such as the status of ycf15 and ycf68 as protein coding genes, the distribution of simple sequence repeats (SSRs) and longer dispersed repeats (SDR), and patterns of nucleotide composition.

Results

The Nuphar [GenBank:NC_008788] and Ranunculus [GenBank:NC_008796] plastid genomes share characteristics of gene content and organization with many other chloroplast genomes. Like other plastid genomes, these genomes are A+T-rich, except for rRNA and tRNA genes. Detailed comparisons of Nuphar with Nymphaea, another member of the Nymphaeaceae, show that more than two-thirds of these genomes exhibit at least 95% sequence identity and that most SSRs are shared. In broader comparisons, SSRs vary among genomes in terms of abundance and length and most contain repeat motifs based on A and T nucleotides.

Conclusion

SSR and SDR abundance varies by genome and, for SSRs, is proportional to genome size. Long SDRs are rare in the genomes assessed. SSRs occur less frequently than predicted and, although the majority of the repeat motifs do include A and T nucleotides, the A+T bias in SSRs is less than that predicted from the underlying genomic nucleotide composition. In codon usage, third positions show an A+T bias; however, variation in codon usage does not correlate with differences in A+T-richness. Thus, although plastome nucleotide composition shows "A+T richness", an A+T bias is not apparent upon more in-depth analysis, at least in these aspects. The pattern of evolution in the sequences identified as ycf15 and ycf68 is not consistent with them being protein-coding genes. In fact, these regions show no evidence of sequence conservation beyond what is normal for non-coding regions of the IR.

20.
Codon adaptation index as a measure of dominating codon bias   (Total citations: 9; self-citations: 0; citations by others: 9)
We propose a simple algorithm to detect dominating synonymous codon usage bias in genomes. The algorithm is based on a precise mathematical formulation of the problem that led us to use the Codon Adaptation Index (CAI) as a 'universal' measure of codon bias. This measure has been previously employed in the specific context of translational bias. With the set of coding sequences as a sole source of biological information, the algorithm provides a reference set of genes which is highly representative of the bias. This set can be used to compute the CAI of genes of prokaryotic and eukaryotic organisms, including those whose functional annotation is not yet available. An important application concerns the detection of a reference set characterizing translational bias which is known to correlate to expression levels; in this case, the algorithm becomes a key tool to predict gene expression levels, to guide regulatory circuit reconstruction, and to compare species. The algorithm also detects leading-lagging strands bias, GC-content bias, GC3 bias, and horizontal gene transfer. The approach is validated on 12 slow-growing and fast-growing bacteria, Saccharomyces cerevisiae, Caenorhabditis elegans and Drosophila melanogaster. AVAILABILITY: http://www.ihes.fr/~materials.
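For readers unfamiliar with the index itself, the sketch below shows the conventional CAI calculation given a reference set of genes: relative adaptiveness weights per codon, then a geometric mean over the codons of a gene. It does not reproduce the reference-set search proposed in the abstract; the function names, the standard genetic code, and the 0.5 pseudocount for codons absent from the reference set are assumptions of this example.

```python
import math
from collections import Counter, defaultdict
from itertools import product

# Standard genetic code, built from the usual TCAG ordering (assumed here).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def split_codons(cds):
    cds = cds.upper()
    return [cds[i:i + 3] for i in range(0, len(cds) - len(cds) % 3, 3)]


def relative_adaptiveness(reference_genes, pseudocount=0.5):
    """Weight of each codon: its count divided by the count of the most used
    synonymous codon in the reference set (single-codon amino acids skipped)."""
    counts = Counter()
    for gene in reference_genes:
        counts.update(c for c in split_codons(gene) if c in CODON_TABLE)
    families = defaultdict(list)
    for codon, aa in CODON_TABLE.items():
        if aa not in "*MW":
            families[aa].append(codon)
    weights = {}
    for family in families.values():
        # Unseen codons get a small pseudocount so their weight is not zero.
        freqs = {c: counts[c] or pseudocount for c in family}
        top = max(freqs.values())
        for c in family:
            weights[c] = freqs[c] / top
    return weights


def cai(gene, weights):
    """Geometric mean of the weights of the gene's codons."""
    logs = [math.log(weights[c]) for c in split_codons(gene) if c in weights]
    return math.exp(sum(logs) / len(logs)) if logs else float("nan")
```

In the classical use the reference set is a list of known highly expressed genes; in the approach described above it would instead be the gene set returned by the search for the dominating bias, with the same weights and geometric mean applied afterwards.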
