Found 20 similar articles (search time: 46 ms)
2.
Edward Daniel, Goodluck U. Onwukwe, Rik K. Wierenga, Susan E. Quaggin, Seppo J. Vainio, Mirja Krause. BMC Bioinformatics, 2015, 16(1)
Background
Codon usage plays a crucial role when recombinant proteins are expressed in different organisms. This is especially the case if the codon usage frequencies of the organism of origin and the target host organism differ significantly, for example when a human gene is expressed in E. coli. Therefore, to enable or enhance efficient gene expression, it is of great importance to identify rare codons in any given DNA sequence and subsequently mutate them to codons that are more frequently used in the expression host.
Results
We describe an open-source web-based application, ATGme, which in a first step identifies rare and highly rare codons from most organisms and in a second step lets the user optimize the sequence.
Conclusions
This application provides a simple, user-friendly interface with three optimization strategies: (1) one-click optimization, (2) bulk optimization (by codon type), and (3) individualized custom (codon-by-codon) optimization. ATGme is an open-source application that is freely available at: http://atgme.org
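The two steps described in this abstract (flag rare codons, then swap them for more frequent synonymous codons) can be illustrated with a minimal Python sketch. This is not ATGme's implementation: the function names, the threshold of 10 occurrences per thousand codons, and the `TOY_USAGE` values are all assumptions made up for illustration, and a real run would need a measured codon usage table for the expression host.

```python
# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

# Hypothetical usage values (per thousand codons); NOT real host data.
TOY_USAGE = {
    "AGA": 2.0, "AGG": 1.5, "CGT": 21.0, "CGC": 22.0,
    "CTG": 52.0, "CTA": 3.9, "ATG": 27.0, "AAA": 33.0,
}


def split_codons(seq):
    """Split a coding sequence into codons (RNA input is converted to DNA)."""
    seq = seq.upper().replace("U", "T")
    if len(seq) % 3:
        raise ValueError("sequence length must be a multiple of 3")
    return [seq[i:i + 3] for i in range(0, len(seq), 3)]


def find_rare_codons(seq, usage, threshold=10.0):
    """Return (codon_index, codon) pairs whose usage is below the threshold.

    Codons missing from the usage table are left unflagged.
    """
    return [
        (i, codon)
        for i, codon in enumerate(split_codons(seq))
        if codon in usage and usage[codon] < threshold
    ]


def optimize(seq, usage, threshold=10.0):
    """Replace each rare codon with the most frequent synonymous codon."""
    best = {}  # amino acid -> most frequent codon present in the usage table
    for codon, aa in CODON_TABLE.items():
        if codon in usage and (aa not in best or usage[codon] > usage[best[aa]]):
            best[aa] = codon
    out = []
    for codon in split_codons(seq):
        if codon in usage and usage[codon] < threshold:
            codon = best.get(CODON_TABLE.get(codon), codon)
        out.append(codon)
    return "".join(out)


print(find_rare_codons("ATGAGACTAAAA", TOY_USAGE))  # [(1, 'AGA'), (2, 'CTA')]
print(optimize("ATGAGACTAAAA", TOY_USAGE))          # ATGCGCCTGAAA
```

The sketch corresponds loosely to the one-click strategy: every rare codon is replaced at once, and the encoded amino acid sequence is unchanged because only synonymous codons are substituted.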
5.
Xian Jia, Shuyu Liu, Hao Zheng, Bo Li, Qi Qi, Lei Wei, Taiyi Zhao, Jian He, Jingchen Sun. BMC Genomics, 2015, 16(1)
Background
The analysis of codon usage is a good way to understand the genetic and evolutionary characteristics of an organism. However, there are only a few reports on the codon usage of the domesticated silkworm, Bombyx mori (B. mori). Hence, the codon usage of B. mori was analyzed here to reveal the constraining factors, which could help improve B. mori-based bioreactors.
Results
A total of 1,097 annotated mRNA sequences from B. mori were analyzed, revealing only a weak codon bias. The analysis also shows that the gene expression level is related to the GC content and to amino acids with higher general average hydropathicity (GRAVY) and aromaticity (Aromo). Genes on the primary axis are strongly positively correlated with the GC content and GC3s. Meanwhile, the effective number of codons (ENc) is strongly correlated with the codon adaptation index (CAI), gene length, and Aromo values. However, the ENc values are correlated with the second axis, which indicates that codon usage in B. mori is affected not only by mutation pressure and natural selection but also by nucleotide composition and the gene expression level. It is also associated with Aromo values and gene length. Additionally, B. mori has a greater relative discrepancy in codon preferences with Drosophila melanogaster (D. melanogaster) or Saccharomyces cerevisiae (S. cerevisiae) than with Arabidopsis thaliana (A. thaliana), Escherichia coli (E. coli), or Caenorhabditis elegans (C. elegans).
Conclusions
The codon usage bias in B. mori is relatively weak, and many influencing factors are found here, such as nucleotide composition, mutation pressure, natural selection, and expression level. It is also associated with Aromo values and gene length. Among these factors, natural selection might play a major role. Moreover, the “optimal codons” of B. mori are all encoded by G and C, which provides useful information for enhancing gene expression in B. mori through codon optimization.
Electronic supplementary material
The online version of this article (doi:10.1186/s12864-015-1596-z) contains supplementary material, which is available to authorized users.
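Two of the composition measures named in this abstract, overall GC content and GC3s, are simple enough to sketch in Python. The sketch assumes the common definition of GC3s as the fraction of G or C at the third position of synonymously variable codons, that is, excluding the Met codon (ATG), the Trp codon (TGG), and the three stop codons; the abstract itself does not spell out the definition. The function names and the example sequence are illustrative only, and ENc and CAI are not covered here.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}
NON_DEGENERATE = {"ATG", "TGG"}  # Met and Trp each have a single codon


def gc_content(seq):
    """Fraction of G or C bases in the whole sequence."""
    seq = seq.upper().replace("U", "T")
    if not seq:
        raise ValueError("empty sequence")
    return sum(base in "GC" for base in seq) / len(seq)


def gc3s(seq):
    """Fraction of G or C at the third position of synonymous codons.

    Met, Trp, and stop codons are skipped (assumed definition, see lead-in).
    Trailing bases that do not form a complete codon are ignored.
    """
    seq = seq.upper().replace("U", "T")
    usable = len(seq) - len(seq) % 3
    codons = [seq[i:i + 3] for i in range(0, usable, 3)]
    informative = [
        c for c in codons if c not in STOP_CODONS and c not in NON_DEGENERATE
    ]
    if not informative:
        raise ValueError("no synonymously variable codons in sequence")
    return sum(c[2] in "GC" for c in informative) / len(informative)


example = "ATGGCCAAAGGTTAA"  # codons: ATG GCC AAA GGT TAA
print(round(gc_content(example), 3))  # 0.4
print(round(gc3s(example), 3))        # 0.333
```

In the example only GCC, AAA, and GGT count toward GC3s, and one of the three ends in G or C, which gives 1/3.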
7.
Background
The genetic code is redundant, meaning that most amino acids can be encoded by more than one codon. Highly expressed genes tend to use optimal codons to increase the accuracy and speed of translation. Thus, codon usage biases provide a signature of the relative expression levels of genes, which can, uniquely, be quantified across the domains of life.
Results
Here we describe a general statistical framework to exploit this phenomenon and to systematically associate genes with environments and phenotypic traits through changes in codon adaptation. By inferring evolutionary signatures of translation efficiency in 911 bacterial and archaeal genomes while controlling for confounding effects of phylogeny and inter-correlated phenotypes, we linked 187 gene families to 24 diverse phenotypic traits. A series of experiments in Escherichia coli revealed that 13 of 15, 19 of 23, and 3 of 6 gene families with changes in codon adaptation in aerotolerant, thermophilic, or halophilic microbes, respectively, confer specific resistance to hydrogen peroxide, heat, and high salinity, respectively. Further, we demonstrate experimentally that changes in codon optimality alone are sufficient to enhance stress resistance. Finally, we present evidence that multiple genes with altered codon optimality in aerobes confer oxidative stress resistance by controlling the levels of iron and NAD(P)H.
Conclusions
Taken together, these results provide experimental evidence for a widespread connection between changes in translation efficiency and phenotypic adaptation. As the number of sequenced genomes increases, this novel genomic context method for linking genes to phenotypes based on sequence alone will become increasingly useful.
8.
Comparative Analysis of Codon Usage Bias and Codon Context Patterns between Dipteran and Hymenopteran Sequenced Genomes (total citations: 1; self-citations: 0; citations by others: 1)
Background
Codon bias is the phenomenon of non-uniform usage of codons, whereas codon context generally refers to sequential pairs of codons in a gene. Although genome sequencing of multiple species of dipteran and hymenopteran insects has been completed, only a few of these species have been analyzed for codon usage bias.
Methods and Principal Findings
Here, we use bioinformatics approaches to analyze codon usage bias and codon context patterns in a genome-wide manner among 15 dipteran and 7 hymenopteran insect species. Results show that GAA is the most frequent codon in the dipteran species whereas GAG is the most frequent codon in the hymenopteran species. The data reveal that codons ending with C or G are frequently used in the dipteran genomes whereas codons ending with A or T are frequently used in the hymenopteran genomes. Synonymous codon usage orders (SCUO) vary within genomes in a pattern that seems to be distinct for each species. Based on a comparison of 30 one-to-one orthologous genes among 17 species, the fruit fly Drosophila willistoni shows the least codon usage bias whereas the honey bee (Apis mellifera) shows the highest bias. Analysis of codon context patterns of these insects shows that specific codons are frequently used as the 3′- and 5′-context of start and stop codons, respectively.
Conclusions
The codon bias pattern is distinct between dipteran and hymenopteran insects. While codon bias is favored by the high GC content of dipteran genomes, the high AT content of genes favors biased usage of synonymous codons in the hymenopteran insects. Also, codon context patterns vary among these species largely according to their phylogeny.
9.
Background
The frequency of synonymous codon usage varies widely between organisms. Suboptimal codon content limits expression of viral, experimental or therapeutic heterologous proteins due to limiting cognate tRNAs. Codon content is therefore often adjusted to match the codon bias of the host organism. Codon content also varies between genes within individual mammalian species. However, little attention has been paid to the consequences of codon content upon translation of host proteins.
Methodology/Principal Findings
In comparing the splicing repressor activities of transfected human PTB and its two tissue-restricted paralogs, nPTB and ROD1, we found that the three proteins were expressed at widely varying levels. nPTB was expressed at 1–3% of the level of PTB despite similar levels of mRNA expression and 74% amino acid identity. The low nPTB expression was due to the high proportion of codons with A or U at the third codon position, which are suboptimal in human mRNAs. Optimization of the nPTB codon content, akin to the “humanization” of foreign ORFs, allowed efficient translation in vivo and in vitro to levels comparable with PTB. We were then able to demonstrate that all three proteins act as splicing repressors.
Conclusions/Significance
Our results provide a striking illustration of the importance of mRNA codon content in determining levels of protein expression, even within cells of the natural host species.
11.
Background
In spite of extensive research on the effect of mutation and selection on codon usage, a general model of codon usage bias due to mutational bias has been lacking. Because most amino acids allow synonymous GC content-changing substitutions in the third codon position, the overall GC bias of a genome or genomic region is highly correlated with GC3, a measure of third-position GC content. For individual amino acids as well, G/C-ending codon usage generally increases with increasing GC bias and decreases with increasing AT bias. Arginine and leucine, amino acids that allow GC-changing synonymous substitutions in the first and third codon positions, have codons which may be expected to show different usage patterns.
Principal Findings
In analyzing codon usage bias in hundreds of prokaryotic and plant genomes and in human genes, we find that two G-ending codons, AGG (arginine) and TTG (leucine), unlike all other G/C-ending codons, show overall usage that decreases with increasing GC bias, contrary to the usual expectation that G/C-ending codon usage should increase with increasing genomic GC bias. Moreover, the usage of some codons appears nonlinear, even nonmonotone, as a function of GC bias. To explain these observations, we propose a continuous-time Markov chain model of GC-biased synonymous substitution. This model correctly predicts the qualitative usage patterns of all codons, including nonlinear codon usage in isoleucine, arginine and leucine. The model accounts for 72%, 64% and 52% of the observed variability of codon usage in prokaryotes, plants and humans, respectively. When codons are grouped based on common GC content, 87%, 80% and 68% of the variation in usage is explained for prokaryotes, plants and humans, respectively.
Conclusions
The model clarifies the sometimes-counterintuitive effects that GC mutational bias can have on codon usage, quantifies the influence of GC mutational bias and provides a natural null model relative to which other influences on codon bias may be measured.
12.
Background
Translation is most often terminated when a ribosome encounters the first in-frame stop codon (UAA, UAG or UGA) in an mRNA. However, many viruses (and some cellular mRNAs) contain “stop” codons that cause a proportion of ribosomes to terminate and others to incorporate an amino acid and continue to synthesize a “readthrough”, or C-terminally extended, protein. This dynamic redefinition of codon meaning is dependent on specific sequence context.
Methodology
We describe two versatile dual reporter systems which facilitate investigation of stop codon readthrough in vivo in intact plants, and identification of the amino acid incorporated at the decoded stop codon. The first is based on the reporter enzymes NAN and GUS, for which sensitive fluorogenic and histochemical substrates are available; the second on GST and GFP.
Conclusions
We show that the NAN-GUS system can be used for direct in planta measurements of readthrough efficiency following transient expression of reporter constructs in leaves, and moreover, that the system is sufficiently sensitive to permit measurement of readthrough in stably transformed plants. We further show that the GST-GFP system can be used to affinity purify readthrough products for mass spectrometric analysis, and we provide the first definitive evidence that tyrosine alone is specified in vivo by a ‘leaky’ UAG codon, and tyrosine and tryptophan, respectively, at decoded UAA and UGA codons in the Tobacco mosaic virus (TMV) readthrough context.
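The termination rule stated in the Background (translation normally ends at the first in-frame UAA, UAG or UGA) can be sketched in a few lines of Python. This only scans a sequence; it does not model readthrough efficiency or sequence context, and the function names and the toy mRNA are assumptions for illustration. Listing every in-frame stop shows where a readthrough product would end if the first stop were leaky.

```python
STOP_CODONS = {"UAA", "UAG", "UGA"}


def in_frame_stops(mrna, start=0):
    """Return (position, codon) for every stop codon in the frame set by start."""
    mrna = mrna.upper().replace("T", "U")
    return [
        (i, mrna[i:i + 3])
        for i in range(start, len(mrna) - 2, 3)
        if mrna[i:i + 3] in STOP_CODONS
    ]


def first_in_frame_stop(mrna, start=0):
    """Return (position, codon) of the first in-frame stop codon, or None."""
    stops = in_frame_stops(mrna, start)
    return stops[0] if stops else None


toy_mrna = "AUGGCUUAGCAAUUAUGA"  # codons: AUG GCU UAG CAA UUA UGA
print(first_in_frame_stop(toy_mrna))  # (6, 'UAG')
print(in_frame_stops(toy_mrna))       # [(6, 'UAG'), (15, 'UGA')]
```

With this toy sequence, normal termination would occur at position 6, while a ribosome that read through the UAG would continue to the next in-frame stop at position 15.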
14.
Background
Animal mitochondrial genomes typically encode one tRNA for each synonymous codon family, so that each tRNA anticodon essentially has to wobble to recognize two or four synonymous codons. Several hypotheses have been proposed to explain what determines the nucleotide at the wobble site of a tRNA anticodon in mitochondrial genomes, such as the codon-anticodon adaptation hypothesis, the wobble versatility hypothesis, the translation initiation and elongation conflict hypothesis, and the wobble cost hypothesis.
Principal Findings
In this study, we analyzed codon usage and tRNA anticodon wobble sites of 29 marine bivalve mitochondrial genomes to evaluate features of the wobble nucleotides in tRNA anticodons. The strand-specific mutation bias favors G and T on the H strand in all 29 marine bivalve mitochondrial genomes. A bias favoring G and T is also visible in the third codon positions of protein-coding genes and the wobble sites of anticodons, which rejects the hypotheses that codon usage bias drives the wobble sites of tRNA anticodons or that tRNA anticodon bias drives the evolution of codon usage. Almost all codon families (98.9%) from marine bivalve mitogenomes support the wobble versatility hypothesis. There are a few interesting exceptions involving tRNATrp with an anticodon CCA fixed in Pectinoida species, tRNASer with a GCU anticodon fixed in Mytiloida mitogenomes, and the uniform anticodon CAU of tRNAMet translating the AUR codon family.
Conclusions/Significance
These results demonstrate that most of the nucleotides at the wobble sites of tRNA anticodons in marine bivalve mitogenomes are determined by wobble versatility. Other factors, such as the translation initiation and elongation conflict and the cost of wobble translation, may contribute to the determination of the wobble nucleotide in tRNA anticodons. The findings presented here add new evidence bearing on the previous hypotheses about the wobble nucleotide in tRNA anticodons.
16.
Mark Welch, Sridhar Govindarajan, Jon E. Ness, Alan Villalobos, Austin Gurney, Jeremy Minshull, Claes Gustafsson. PLoS ONE, 2009, 4(9)
Background
Production of proteins as therapeutic agents, research reagents and molecular tools frequently depends on expression in heterologous hosts. Synthetic genes are increasingly used for protein production because sequence information is easier to obtain than the corresponding physical DNA. Protein-coding sequences are commonly re-designed to enhance expression, but there are no experimentally supported design principles.
Principal Findings
To identify sequence features that affect protein expression, we synthesized and expressed in E. coli two sets of 40 genes encoding two commercially valuable proteins, a DNA polymerase and a single chain antibody. Genes differing only in synonymous codon usage expressed protein at levels ranging from undetectable to 30% of cellular protein. Using partial least squares regression, we tested the correlation of protein production levels with parameters that have been reported to affect expression. We found that the amount of protein produced in E. coli was strongly dependent on the codons used to encode a subset of amino acids. Favorable codons were predominantly those read by tRNAs that are most highly charged during amino acid starvation, not codons that are most abundant in highly expressed E. coli proteins. Finally, we confirmed the validity of our models by designing, synthesizing and testing new genes using codon biases predicted to perform well.
Conclusion
The systematic analysis of gene design parameters shown in this study has allowed us to identify codon usage within a gene as a critical determinant of achievable protein expression levels in E. coli. We propose a biochemical basis for this, as well as design algorithms to ensure high protein production from synthetic genes. Replication of this methodology should allow similar design algorithms to be empirically derived for any expression system.
17.
Background
The degeneracy of the genetic code makes it possible for the same amino acid string to be coded by different messenger RNA (mRNA) sequences. These “synonymous mRNAs” may differ largely in a number of aspects related to their overall translational efficiency, such as secondary structure content and availability of the encoded transfer RNAs (tRNAs). Consequently, they may render different yields of the translated polypeptides. These mRNA features related to translation efficiency also play a role locally, resulting in a non-uniform translation speed along the mRNA, which has previously been related to some protein structural features and also used to explain some dramatic effects of “silent” single-nucleotide polymorphisms (SNPs). In this work we perform the first large-scale analysis of the relationship between three experimental proxies of mRNA local translation efficiency and the local features of the corresponding encoded proteins.
Results
We found that a number of protein functional and structural features are reflected in the patterns of ribosome occupancy, secondary structure and tRNA availability along the mRNA. One or more of these proxies of translation speed have distinctive patterns around the mRNA regions coding for certain protein local features. In some cases the three patterns follow a similar trend. We also show specific examples where these patterns of translation speed point to the protein’s important structural and functional features.
Conclusions
This supports the idea that the genome codes the protein functional features not only as sequences of amino acids but also as subtle patterns of mRNA properties which, probably through local effects on the translation speed, have some consequence on the final polypeptide. These results open the possibility of predicting a protein’s functional regions based on a single genomic sequence, and have implications for heterologous protein expression and fine-tuning protein function.
Electronic supplementary material
The online version of this article (doi:10.1186/s12864-015-1734-7) contains supplementary material, which is available to authorized users.
18.