Similar articles
20 similar articles found
1.

Background

Production of proteins as therapeutic agents, research reagents and molecular tools frequently depends on expression in heterologous hosts. Synthetic genes are increasingly used for protein production because sequence information is easier to obtain than the corresponding physical DNA. Protein-coding sequences are commonly re-designed to enhance expression, but there are no experimentally supported design principles.

Principal Findings

To identify sequence features that affect protein expression we synthesized and expressed in E. coli two sets of 40 genes encoding two commercially valuable proteins, a DNA polymerase and a single chain antibody. Genes differing only in synonymous codon usage expressed protein at levels ranging from undetectable to 30% of cellular protein. Using partial least squares regression we tested the correlation of protein production levels with parameters that have been reported to affect expression. We found that the amount of protein produced in E. coli was strongly dependent on the codons used to encode a subset of amino acids. Favorable codons were predominantly those read by tRNAs that are most highly charged during amino acid starvation, not codons that are most abundant in highly expressed E. coli proteins. Finally we confirmed the validity of our models by designing, synthesizing and testing new genes using codon biases predicted to perform well.

Conclusion

The systematic analysis of gene design parameters shown in this study has allowed us to identify codon usage within a gene as a critical determinant of achievable protein expression levels in E. coli. We propose a biochemical basis for this, as well as design algorithms to ensure high protein production from synthetic genes. Replication of this methodology should allow similar design algorithms to be empirically derived for any expression system.

2.
Codon usages in different gene classes of the Escherichia coli genome   (total citations: 3, self-citations: 0, citations by others: 3)
A new measure for assessing codon bias of one group of genes with respect to a second group of genes is introduced. In this formulation, codon bias correlations for Escherichia coli genes are evaluated for level of expression, for contrasts along genes, for genes in different 200 kb (or longer) contigs around the genome, for effects of gene size, for variation over different function classes, for codon bias in relation to possible lateral transfer and for dicodon bias for some gene classes. Among the function classes, codon biases of ribosomal proteins are the most deviant from the codon frequencies of the average E. coli gene. Other classes of ‘highly expressed genes’ (e.g. amino acyl tRNA synthetases, chaperonins, modification genes essential to translation activities) show less extreme codon biases. Consistently for genes with experimentally determined expression rates in the exponential growth phase, those of highest molar abundances are more deviant from the average gene codon frequencies and are more similar in codon frequencies to the average ribosomal protein gene. Independent of gene size, the codon biases in the 5′ third of genes deviate by more than a factor of two from those in the middle and 3′ thirds. In this context, there appear to be conflicting selection pressures imposed by the constraints of ribosomal binding, or more generally the early phase of protein synthesis (about the first 50 codons) may be more biased than the complete nascent polypeptide. In partitioning the E. coli genome into 10 equal lengths, pronounced differences in codon site 3 G+C frequencies accumulate. Genes near to oriC have 5% greater codon site 3 G+C frequencies than do genes from the ter region. This difference also is observed between small (100–300 codons) and large (>800 codons) genes. This result contrasts with that for eukaryotic genomes (including human, Caenorhabditis elegans and yeast) where long genes tend to have site 3 more AT rich than short genes. 
Many of the above results are specific to E. coli genes and do not apply to genes of most bacterial genomes. A gene is defined as alien (possibly horizontally transferred) if its codon bias relative to the average gene exceeds a high threshold and the codon bias relative to ribosomal proteins is also appropriately high. These are identified, including four clusters (operons). The bulk of these genes have no known function.
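The abstract above does not state the formula for its group-versus-group codon bias measure. As a rough illustration only, the sketch below assumes one plausible form: for each amino acid, take the sum of absolute differences between the synonymous codon frequencies of the two gene groups, then weight by the amino acid frequency in the first group. The function names `family_frequencies` and `codon_bias` are hypothetical, and the toy sequences are not real genes.

```python
from collections import Counter

# Standard genetic code, built from the usual compact TCAG ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {a + b + c: AMINO[16 * i + 4 * j + k]
               for i, a in enumerate(BASES)
               for j, b in enumerate(BASES)
               for k, c in enumerate(BASES)}


def codons(seq):
    """Split a coding sequence into in-frame codons (trailing bases dropped)."""
    seq = seq.upper().replace("U", "T")
    return [seq[i:i + 3] for i in range(0, len(seq) - len(seq) % 3, 3)]


def family_frequencies(genes):
    """Return (codon frequency within its amino acid family, amino acid frequency)."""
    counts = Counter(c for g in genes for c in codons(g)
                     if CODON_TABLE.get(c, "*") != "*")
    aa_totals = Counter()
    for codon, n in counts.items():
        aa_totals[CODON_TABLE[codon]] += n
    total = sum(aa_totals.values())
    codon_freq = {c: n / aa_totals[CODON_TABLE[c]] for c, n in counts.items()}
    aa_freq = {aa: n / total for aa, n in aa_totals.items()}
    return codon_freq, aa_freq


def codon_bias(group_f, group_c):
    """Assumed form: sum_a p_a(F) * sum_{codons of a} |f(codon) - c(codon)|."""
    f, p = family_frequencies(group_f)
    c, _ = family_frequencies(group_c)
    bias = 0.0
    for aa, weight in p.items():
        family = [cod for cod, a in CODON_TABLE.items() if a == aa]
        bias += weight * sum(abs(f.get(cod, 0.0) - c.get(cod, 0.0))
                             for cod in family)
    return bias
```

Under this assumed form, a gene could be flagged as a candidate "alien" gene when `codon_bias([gene], all_genes)` and `codon_bias([gene], ribosomal_genes)` both exceed chosen thresholds. The abstract does not give the thresholds, so they would have to be chosen separately.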

3.
Cry4Aa produced by Bacillus thuringiensis is a dipteran-specific toxin and is, therefore, of great interest for developing a bioinsecticide to control mosquitoes. However, the expression of Cry4Aa in Escherichia coli is relatively low, which is a major disadvantage in its development as a bioinsecticide. In this study, to establish an effective production system, a 1,914-bp modified gene (cry4Aa-S1) encoding Cry4Aa was designed and synthesized in accordance with the G + C content and codon preference of E. coli genes without altering the encoded amino acid sequence. The cry4Aa-S1 gene allowed a significant improvement in expression level, over five-fold, compared to that of the original cry4Aa gene. The product of the cry4Aa-S1 gene showed the same level of insecticidal activity against Culex pipiens larvae as that from cry4Aa. This suggested that unfavorable codon usage was one of the reasons for poor expression of cry4Aa in E. coli, and, therefore, changing the cry4Aa codons to accord with the codon usage in E. coli led to efficient production of Cry4Aa. Efficient production of Cry4Aa in E. coli can be a powerful measure to prepare a sufficient amount of Cry4Aa protein for both basic analytical and applied research.
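Redesigning a gene to match host codon preference without changing the encoded protein can be sketched as follows. This is a deliberately simple "most frequent synonymous codon" strategy, not the procedure used for cry4Aa-S1, which the abstract does not describe in detail. The function `recode_to_preferred` and the toy reference set are hypothetical, and the sketch assumes every codon contains only A, C, G and T.

```python
from collections import Counter

# Standard genetic code, built from the usual compact TCAG ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {a + b + c: AMINO[16 * i + 4 * j + k]
               for i, a in enumerate(BASES)
               for j, b in enumerate(BASES)
               for k, c in enumerate(BASES)}


def codons(seq):
    """Split a coding sequence into in-frame codons (trailing bases dropped)."""
    seq = seq.upper().replace("U", "T")
    return [seq[i:i + 3] for i in range(0, len(seq) - len(seq) % 3, 3)]


def translate(seq):
    """Translate a coding sequence; '*' marks a stop codon."""
    return "".join(CODON_TABLE[c] for c in codons(seq))


def recode_to_preferred(cds, reference_genes):
    """Replace each codon with the synonymous codon most used in the reference set."""
    counts = Counter(c for g in reference_genes for c in codons(g))
    best = {}
    for codon, aa in CODON_TABLE.items():
        # Ties keep the first codon in table order.
        if aa not in best or counts[codon] > counts[best[aa]]:
            best[aa] = codon
    recoded = "".join(best[CODON_TABLE[c]] for c in codons(cds))
    # The whole point of the redesign: the protein must be unchanged.
    assert translate(recoded) == translate(cds)
    return recoded
```

For example, with the toy reference set `["CTGCTGAAA"]`, the sequence `ATGTTAAAGTAA` is recoded to `ATGCTGAAATAA`, which encodes the same peptide.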

4.
Plant carotenoid cleavage dioxygenase (CCD) catalyses the formation of industrially important apocarotenoids. Here, we applied codon-based classification to 72 CCD genes from 35 plant species using hierarchical clustering analysis. The codon adaptation index (CAI) and relative codon bias (RCB) were used to estimate the level of gene expression. The codon-based cluster tree shows neatly clustered subclasses of CCD genes, except for the BoCCD1 gene of Bixa orellana. Correlation analysis of CAI values with RCB indicates an overall low level of CCD expression across species. Similarly, closeness in the codon cluster among genes with the same CAI values was not reflected in the 3-D structures reported for selected CCD genes. These findings not only enhance our insight into the classification of CCD genes across species but also identify the critical factors responsible for this variation, which could aid in predicting gene expression and function for newly reported CCD genes.

5.
An evolutionary perspective on synonymous codon usage in unicellular organisms   (total citations: 64, self-citations: 0, citations by others: 64)
Summary. Observed patterns of synonymous codon usage are explained in terms of the joint effects of mutation, selection, and random drift. Examination of the codon usage in 165 Escherichia coli genes reveals a consistent trend of increasing bias with increasing gene expression level. Selection on codon usage appears to be unidirectional, so that the pattern seen in lowly expressed genes is best explained in terms of an absence of strong selection. A measure of directional synonymous-codon usage bias, the Codon Adaptation Index, has been developed. In enterobacteria, rates of synonymous substitution are seen to vary greatly among genes, and genes with a high codon bias evolve more slowly. A theoretical study shows that the patterns of extreme codon bias observed for some E. coli (and yeast) genes can be generated by rather small selective differences. The relative plausibilities of various theoretical models for explaining nonrandom codon usage are discussed. Presented at the FEBS Symposium on Genome Organization and Evolution, held in Crete, Greece, September 1–5, 1986.
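The Codon Adaptation Index mentioned above is usually computed as the geometric mean of per-codon "relative adaptiveness" values, where each codon's weight is its count in a reference set of highly expressed genes divided by the count of the most used synonymous codon. The sketch below follows that general recipe. The 0.5 pseudocount for unseen codons, the function names, and the toy reference set are assumptions made for illustration.

```python
import math
from collections import Counter

# Standard genetic code, built from the usual compact TCAG ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {a + b + c: AMINO[16 * i + 4 * j + k]
               for i, a in enumerate(BASES)
               for j, b in enumerate(BASES)
               for k, c in enumerate(BASES)}


def codons(seq):
    """Split a coding sequence into in-frame codons (trailing bases dropped)."""
    seq = seq.upper().replace("U", "T")
    return [seq[i:i + 3] for i in range(0, len(seq) - len(seq) % 3, 3)]


def relative_adaptiveness(reference_genes):
    """w(codon) = count / max count among synonymous codons in the reference set."""
    counts = Counter(c for g in reference_genes for c in codons(g))
    weights = {}
    for aa in set(CODON_TABLE.values()):
        family = [c for c, a in CODON_TABLE.items() if a == aa]
        if aa == "*" or len(family) == 1:
            continue  # stops, Met and Trp carry no synonymous choice
        best = max(counts[c] for c in family)
        if best == 0:
            continue  # amino acid absent from the reference set
        for c in family:
            # Assumed pseudocount so that an unseen codon does not give log(0).
            weights[c] = max(counts[c], 0.5) / best
    return weights


def cai(gene, weights):
    """Geometric mean of the weights over the scorable codons of the gene."""
    logs = [math.log(weights[c]) for c in codons(gene) if c in weights]
    return math.exp(sum(logs) / len(logs)) if logs else float("nan")
```

With the toy reference `["AAAAAAAAG"]`, AAA gets weight 1 and AAG weight 0.5, so a gene consisting of one AAA and one AAG codon scores the square root of 0.5.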

6.
The genetic code is universal, but recombinant protein expression in heterologous systems is often hampered by divergent codon usage. Here, we demonstrate that reprogramming by standardized multi‐parameter gene optimization software and de novo gene synthesis is a suitable general strategy to improve heterologous protein expression. This study compares expression levels of 94 full‐length human wt and sequence‐optimized genes coding for pharmaceutically important proteins such as kinases and membrane proteins in E. coli. Fluorescence‐based quantification revealed increased protein yields for 70% of in vivo expressed optimized genes compared to the wt DNA sequences and also resulted in increased amounts of protein that can be purified. The improvement in transgene expression correlated with higher mRNA levels in our analyzed examples. In all cases tested, expression levels using wt genes in tRNA‐supplemented bacterial strains were outperformed by optimized genes expressed in non‐supplemented host cells.

7.
There is a growing interest in the use of microalgae as low‐cost hosts for the synthesis of recombinant products such as therapeutic proteins and bioactive metabolites. In particular, the chloroplast, with its small, genetically tractable genome (plastome) and elaborate metabolism, represents an attractive platform for genetic engineering. In Chlamydomonas reinhardtii, none of the 69 protein‐coding genes in the plastome uses the stop codon UGA, therefore this spare codon can be exploited as a useful synthetic biology tool. Here, we report the assignment of the codon to one for tryptophan and show that this can be used as an effective strategy for addressing a key problem in chloroplast engineering: namely, the assembly of expression cassettes in Escherichia coli when the gene product is toxic to the bacterium. This problem arises because the prokaryotic nature of chloroplast promoters and ribosome‐binding sites used in such cassettes often results in transgene expression in E. coli, and is a potential issue when cloning genes for metabolic enzymes, antibacterial proteins and integral membrane proteins. We show that replacement of tryptophan codons with the spare codon (UGG→UGA) within a transgene prevents functional expression in E. coli and in the chloroplast, and that co‐introduction of a plastidial trnW gene carrying a modified anticodon restores function only in the latter by allowing UGA readthrough. We demonstrate the utility of this system by expressing two genes known to be highly toxic to E. coli and discuss its value in providing an enhanced level of biocontainment for transplastomic microalgae.
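The UGG→UGA replacement described above operates on in-frame codons only. A minimal sketch of that recoding step, written on the DNA sequence (TGG→TGA), might look like this. The function name `recode_trp` is hypothetical, and the sketch assumes the input is a complete coding sequence whose length is a multiple of three.

```python
def recode_trp(cds, new_codon="TGA"):
    """Replace every in-frame tryptophan codon (TGG) with the spare codon."""
    cds = cds.upper().replace("U", "T")
    if len(cds) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    # Walk codon by codon so that out-of-frame 'TGG' substrings are left alone.
    return "".join(new_codon if cds[i:i + 3] == "TGG" else cds[i:i + 3]
                   for i in range(0, len(cds), 3))
```

Note that a plain string replacement would be wrong here: in `ATGTGGCTGGCATAA` the second `TGG` spans two codons (CTG and GCA) and must not be changed.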

8.
Recombinant flounder growth hormone was overproduced in E. coli by using a codon-optimized synthetic gene and optimized expression conditions for high-level production. The gene was cloned into the pET-28a expression vector and transformed into E. coli BL21 (DE3). Induction at lower temperature, lower IPTG concentrations and richer growth media during expression resulted in an increased expression level. The protein expression profile was analyzed by SDS-PAGE, the authenticity was confirmed by western blotting and the concentration was determined by Bradford assay. In addition, several attempts were made to produce soluble product, and all resulted in insoluble product. The overexpressed protein was efficiently purified from inclusion bodies by moderate-speed centrifugation after cell lysis. Among the solubilization buffers examined, buffer with 1% N-lauroylsarcosine in the presence of the reducing agent DTT at alkaline pH resulted in efficient solubilization and recovery. The denaturant was removed by filtration and dialysis. The amount of growth hormone recovered was significantly higher than in previous reports that expressed native growth hormone genes in E. coli. The methodology adopted in this study can be used to produce flounder growth hormone at large scale so that it can be used in aquaculture. This approach may also apply to other proteins if high-level expression and efficient purification are sought in E. coli.

9.
The "expression measure" of a gene, E(g), is a statistic devised to predict the level of gene expression from codon usage bias. E(g) has been used extensively to analyze prokaryotic genome sequences. We discuss 2 problems with this approach. First, the formulation of E(g) is such that genes with the strongest selected codon usage bias are not likely to have the highest predicted expression levels; indeed the correlation between E(g) and expression level is weak among moderate to highly expressed genes. Second, in some species, highly expressed genes do not have unusual codon usage, and so codon usage cannot be used to predict expression levels. We outline a simple approach, first to check whether a genome shows evidence of selected codon usage bias and then to assess the strength of bias in genes as a guide to their likely expression level; we illustrate this with an analysis of Shewanella oneidensis.

10.
Strongly biased codon usage is common in unicellular organisms, particularly in highly expressed genes. The bias is most simply explained as a balance between selection and mutation, with selection favouring those codons which are more efficiently translated. In a review Ikemura (1985) has proposed four rules for predicting which codons will be preferred, based on the properties of the transfer RNAs responsible for translating messenger RNA into protein. In this paper codon usage in E. coli and yeast is re-examined using the recent compilation of Maruyama et al. (1986). The codon adaptation index of Sharp and Li (1986a) is used as a measure of gene expression to investigate the importance of this factor. It is found that Ikemura's rules successfully predict preferred codons for yeast, but that two of them work less well for E. coli, and it is suggested that some of the apparent bias in weakly expressed genes of E. coli may be due to contextual effects on mutation rates.

11.
Enterotoxigenic Escherichia coli (ETEC) is the most common cause of childhood diarrhea in the world. Adhesion of ETEC to the small intestine is an important virulence trait. One of the most prevalent colonization factors (CFs) in humans is the CFA/I fimbria, and CfaE is the binding factor required for adhesion of ETEC to the intestinal mucosa. We optimized the cfaE gene codons according to the codon bias of E. coli to achieve a high level of recombinant protein expression. The optimized gene was expressed in E. coli and the rCfaE protein was used for mouse immunization. Blocking activity of the obtained antibody was examined by a microplate agglutination inhibition test. SDS-PAGE analysis indicated that the optimized cfaE sequence produces a suitable amount of rCfaE in comparison with the native gene sequence. The optimized rCfaE protein induced a strong humoral response in mice, and the antibody obtained against rCfaE inhibited the adhesion of ETEC to human group A erythrocytes. It is concluded that codon optimization is a useful approach for obtaining large quantities of recombinant CfaE protein. Based on the results of the hemagglutination inhibition test, codon optimization and increased production of the recombinant protein in E. coli did not affect the immunogenic potential of CfaE.

12.
13.
Summary. An artificial gene encoding the Escherichia coli translational initiation factor IF1 was synthesized based on the primary structure (71 amino acid residues) of the protein. Codons for individual amino acids were selected on the basis of the preferred codon usage found in the structural genes for the initiation factor IF2 of E. coli and Bacillus stearothermophilus, both of which can be expressed at high levels in E. coli cells. We gave the IF1 gene a modular structure by introducing specific restriction enzyme sites into the sequence, resulting in units of three to ten codons. This was conceived to facilitate site-directed mutagenesis of the gene and thus to obtain IF1 with specific amino acid alterations at desired positions. The IF1 gene was assembled by shot-gun ligation of 9 synthetic oligodeoxyribonucleotides ranging in size from 31 to 65 nucleotides and cloned into an expression vector to place the gene under the control of an inducible promoter. Upon induction, E. coli cells harbouring the artificial gene were found to produce large amounts (60 mg/100 g cells) of a protein indistinguishable from natural IF1 in both chemical and biological properties.

14.
In bacteria, synonymous codon usage can be considerably affected by base composition at neighboring sites. Such context-dependent biases may be caused by either selection against specific nucleotide motifs or context-dependent mutation biases. Here we consider the evolutionary conservation of context-dependent codon bias across 11 completely sequenced bacterial genomes. In particular, we focus on two contextual biases previously identified in Escherichia coli: the avoidance of out-of-frame stop codons and of AGG motifs. By identifying homologues of E. coli genes, we also investigate the effect of gene expression level in Haemophilus influenzae and Mycoplasma genitalium. We find that while context-dependent codon biases are widespread in bacteria, few are conserved across all species considered. Avoidance of out-of-frame stop codons does not apply to all stop codons or amino acids in E. coli, does not hold for different species, does not increase with gene expression level, and is not relaxed in Mycoplasma spp., in which the canonical stop codon, TGA, is recognized as tryptophan. Avoidance of AGG motifs shows some evolutionary conservation and increases with gene expression level in E. coli, suggestive of the action of selection, but the cause of the bias differs between species. These results demonstrate that strong context-dependent forces, both selective and mutational, operate on synonymous codon usage but that these differ considerably between genomes. Received: 6 May 1999 / Accepted: 29 October 1999
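To make "out-of-frame stop codons" concrete, the sketch below counts stop triplets that appear when a coding sequence is read in the +1 and +2 frames. It only tallies occurrences and does not test for avoidance, which would require comparing against an expectation that the abstract does not specify. The function name `off_frame_stops` is hypothetical.

```python
from collections import Counter

STOPS = {"TAA", "TAG", "TGA"}


def off_frame_stops(cds):
    """Count stop triplets in the +1 and +2 reading frames of a coding sequence."""
    cds = cds.upper().replace("U", "T")
    result = {}
    for offset in (1, 2):
        # Step through complete triplets starting at the shifted position.
        triplets = (cds[i:i + 3] for i in range(offset, len(cds) - 2, 3))
        result[offset] = Counter(t for t in triplets if t in STOPS)
    return result
```

For the toy sequence `ATGATAGCTTGA`, the +1 frame reads TGA, TAG, CTT and so contains two stop triplets, while the +2 frame (GAT, AGC, TTG) contains none.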

15.
16.
Along the gene, nucleotides in various codon positions tend to exert a slight but observable influence on the nucleotide choice at neighboring positions. Such context biases are different in different organisms and can be used as genomic signatures. In this paper, we will focus specifically on the dinucleotide composed of a third codon position nucleotide and its succeeding first position nucleotide. Using the 16 possible dinucleotide combinations, we calculate how well individual genes conform to the observed mean dinucleotide frequencies of an entire genome, forming a distance measure for each gene. It is found that genes from different genomes can be separated with a high degree of accuracy, according to these distance values. In particular, we address the problem of recent horizontal gene transfer, and how imported genes may be evaluated by their poor assimilation to the host's context biases. By concentrating on the third- and succeeding first position nucleotides, we eliminate most spurious contributions from codon usage and amino-acid requirements, focusing mainly on mutational effects. Since imported genes are expected to converge only gradually to genomic signatures, it is possible to question whether a gene present in only one of two closely related organisms has been imported into one organism or deleted in the other. Striking correlations between the proposed distance measure and poor homology are observed when Escherichia coli genes are compared to Salmonella typhi, indicating that sets of outlier genes in E. coli may contain a high number of genes that have been imported into E. coli, and not deleted in S. typhi. Received: 16 January 2001 / Accepted: 30 August 2001
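The per-gene distance described above can be illustrated with a short sketch: collect the dinucleotides formed by each codon's third position and the next codon's first position, convert them to 16 frequencies, and compare a gene with the pooled genome. The abstract does not say which distance is used, so the sum of absolute differences below is an assumption, as are the function names.

```python
from collections import Counter
from itertools import product

DINUCS = ["".join(p) for p in product("ACGT", repeat=2)]  # the 16 combinations


def junction_counts(cds):
    """Count dinucleotides made of codon position 3 and the following position 1."""
    cds = cds.upper().replace("U", "T")
    pairs = (cds[i + 2:i + 4] for i in range(0, len(cds) - 3, 3))
    return Counter(p for p in pairs if p in DINUCS)  # skips ambiguous bases


def to_freqs(counts):
    """Normalize counts to frequencies over all 16 dinucleotides."""
    total = sum(counts.values())
    return {d: (counts[d] / total if total else 0.0) for d in DINUCS}


def context_distance(gene, genome_genes):
    """Assumed distance: sum of absolute frequency differences, gene vs genome."""
    genome = Counter()
    for g in genome_genes:
        genome.update(junction_counts(g))
    gene_f, genome_f = to_freqs(junction_counts(gene)), to_freqs(genome)
    return sum(abs(gene_f[d] - genome_f[d]) for d in DINUCS)
```

Under this assumed metric the distance ranges from 0 (the gene matches the genome profile exactly) to 2 (no junction dinucleotide in common). Genes with unusually large values would be the outliers considered as possible imports.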

17.
Halohydrin dehalogenases are attractive biocatalysts in producing a series of important chiral building blocks. Recombinant expression of halohydrin dehalogenase from Arthrobacter sp. AD2 (HheA) in Escherichia coli using T7 promoter-based pGEF(+) system revealed much lower expression level than that of the well-studied halohydrin dehalogenase from Agrobacterium radiobacter AD1 (HheC). In this study, we changed the codon usage in the 5′-end of hheA gene to improve the expression yield of HheA. Our results showed that the expression of HheA could be largely improved by the replacement of G-rich +2 codon (adjacent to the start codon) with less G-containing codons. The expression of one of the resulting mutants HheA-D1 (replaced +2 codon GTG with CCA) was about 4-fold higher and purified yields about 8-fold greater than that of the wild-type HheA. Moreover, the expression level of the resulting HheA variants correlated well with the minimal folding free energy (ΔG) of the mRNA secondary structure surrounding the 5′-end region of the genes. These findings suggested that the G-rich +2 codon of hheA gene might be the main suppressive factor for limiting the recombinant expression of HheA and that +2 codon optimization strategy could be used as a general tool in modulating recombinant protein production in E. coli.

18.
In recent years, a large number of proteins have been expressed in Escherichia coli with high productivity owing to the rapid development of genetic engineering technologies. Many hosts are used for the production of recombinant protein, but the preferred choice is E. coli because of its ease of culture, short life cycle, well-known genetics, and easy genetic manipulation. Expression of foreign genes in E. coli is nevertheless often problematic. Soluble recombinant protein is a prerequisite for structural, functional and biochemical studies of a protein, and researchers often face problems with both the expression and the solubility of over-expressed heterologous proteins. There is no universal strategy to solve these problems, but a few methods can improve expression when the gene of interest is expressed poorly or not at all in E. coli. This review addresses these issues. Five levels of strategies can be used to increase the expression and solubility of over-expressed protein: (1) changing the vector, (2) changing the host, (3) changing the culture parameters of the recombinant host strain, (4) co-expression of other genes and (5) changing the gene sequences, which may help increase expression and the proper folding of the desired protein. Here we present the resources available for the expression of a gene in E. coli to obtain a substantial amount of good-quality recombinant protein. The resources include different strains of E. coli, different E. coli expression vectors, different physical and chemical agents and the co-expression of chaperone-interacting proteins. The proposed solutions to such problems will finally lead to the maturity of the application of recombinant proteins.

19.
In the present study, we examined the codon usage bias between pseudorabies virus (PRV) US1 gene and the US1-like genes of 20 reference alphaherpesviruses. Comparative analysis showed noticeable disparities of the synonymous codon usage bias in the 21 alphaherpesviruses, indicated by codon adaptation index, effective number of codons (ENc) and GC3s value. The codon usage pattern of PRV US1 gene was phylogenetically conserved and similar to that of the US1-like genes of the genus Varicellovirus of alphaherpesvirus, with a strong bias towards the codons with C and G at the third codon position. Cluster analysis of codon usage pattern of PRV US1 gene with its reference alphaherpesviruses demonstrated that the codon usage bias of US1-like genes of 21 alphaherpesviruses had a very close relation with their gene functions. ENc-plot revealed that the genetic heterogeneity in PRV US1 gene and the 20 reference alphaherpesviruses was constrained by G+C content, as well as the gene length. In addition, comparison of codon preferences in the US1 gene of PRV with those of E. coli, yeast and human revealed that there were 50 codons showing distinct usage differences between PRV and yeast, 49 between PRV and human, but 48 between PRV and E. coli. Although there were slightly fewer differences in codon usages between E. coli and PRV, the difference is unlikely to be statistically significant, and experimental studies are necessary to establish the most suitable expression system for PRV US1. In conclusion, these results may improve our understanding of the evolution, pathogenesis and functional studies of PRV, as well as contributing to the area of herpesvirus research or even studies with other viruses.
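Of the indices named above, GC3s is the simplest to illustrate: it is commonly defined as the fraction of G or C at the third position of synonymous codons, leaving out Met, Trp and stop codons. The sketch below follows that common definition, which the abstract itself does not spell out; the function name `gc3s` is hypothetical.

```python
# Standard genetic code, built from the usual compact TCAG ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {a + b + c: AMINO[16 * i + 4 * j + k]
               for i, a in enumerate(BASES)
               for j, b in enumerate(BASES)
               for k, c in enumerate(BASES)}


def codons(seq):
    """Split a coding sequence into in-frame codons (trailing bases dropped)."""
    seq = seq.upper().replace("U", "T")
    return [seq[i:i + 3] for i in range(0, len(seq) - len(seq) % 3, 3)]


def gc3s(cds):
    """Fraction of G/C at third positions of synonymous codons."""
    third = [c[2] for c in codons(cds)
             # Skip ambiguous codons and codons with no synonymous alternative.
             if c in CODON_TABLE and CODON_TABLE[c] not in "MW*"]
    if not third:
        return float("nan")
    return sum(base in "GC" for base in third) / len(third)
```

In `ATGGCCAAATGG`, only GCC and AAA are counted (ATG and TGG are excluded), giving a GC3s of 0.5. A strong bias towards C or G at the third position, as reported for PRV US1, would show up as a value close to 1.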

20.
Recent evidence suggests that cell-to-cell difference at the gene expression level is an order of magnitude greater than previously thought even for isogenic bacterial populations. Such gene expression heterogeneity determines the fate of individual bacterial cells in populations and could also affect the ultimate fate of populations themselves. To quantify the heterogeneity and its biological significance, quantitative methods to measure gene expression in single bacterial cells are needed. In this work, we developed two SYBR Green-based RT-qPCR methods to determine gene expression directly in single bacterial cells. The first method involves a single-tube operation that can analyze one gene from each bacterial cell. The second method features a two-stage protocol that consists of RNA isolation from a single bacterial cell and cDNA synthesis in the first stage, and qPCR in the second stage, which allows the expression levels of multiple genes to be determined simultaneously in single cells of both gram-positive and gram-negative bacteria. We applied the methods to stress-treated (i.e. low pH and high temperature) Escherichia coli populations. The reproducible results demonstrated that the methods are sensitive enough not only for measuring cellular responses at the single-cell level, but also for revealing gene expression heterogeneity among the bacterial cells. Furthermore, our results showed that the two-stage method can reproducibly measure multiple highly expressed genes from a single E. coli cell, which provides an important foundation for future development of a high-throughput, lab-on-chip whole-genome RT-qPCR methodology for single bacterial cells.
