Similar articles
Found 20 similar articles; search took 46 ms
1.
Molecular biology makes it possible to express foreign genes in microorganisms, plants and animals. To improve heterologous expression, it is important that the codon usage of a sequence be optimized to adapt it to the host organism. In this paper, a novel method based on the Quantum-behaved Particle Swarm Optimization (QPSO) algorithm is developed to optimize the codon usage of synthetic genes. Compared with existing probability-based methods, QPSO generates better results when the DNA/RNA sequence is shorter than 6 kb, which is the commonly used range. Whereas software or web services based on probability methods may fail to exclude all defined restriction sites when the sequence contains many undesired sites, our proposed method removes undesired sites efficiently during the optimization process.
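The objective described in this abstract (prefer host-favored codons while excluding restriction sites) can be illustrated with a small sketch. This is not QPSO: it scores every synonymous encoding of a very short peptide by brute force, which only works for toy inputs. The preference table `PREFERRED`, the function name `optimize`, and the penalty value are assumptions made for illustration and are not taken from the paper.

```python
from itertools import product

# Hypothetical host preference lists (best codon first); illustrative only,
# not a real codon usage table.
PREFERRED = {"M": ["ATG"], "G": ["GGA", "GGT"], "S": ["TCC", "AGC"]}

def optimize(protein, forbidden_sites, penalty=1000):
    """Return the lowest-cost synonymous encoding of `protein`.

    Cost = sum of codon ranks (0 = most preferred) plus a large penalty
    for every occurrence of a forbidden site such as a restriction site.
    """
    choices = [PREFERRED[aa] for aa in protein]

    def cost(codons):
        seq = "".join(codons)
        ranks = sum(opts.index(c) for opts, c in zip(choices, codons))
        # str.count counts non-overlapping hits, which is enough for a penalty.
        hits = sum(seq.count(site) for site in forbidden_sites)
        return ranks + penalty * hits

    # Exhaustive search: the number of candidates grows exponentially with
    # peptide length, which is why heuristics are used on real genes.
    return "".join(min(product(*choices), key=cost))

naive = optimize("MGS", [])                 # picks the top codon everywhere
cleaned = optimize("MGS", ["GGATCC"])       # avoids the BamHI site GGATCC
```

With these toy preferences the naive encoding contains GGATCC across the Gly-Ser junction, and the constrained search swaps one codon for a synonym to remove it. A swarm or genetic search would replace the exhaustive `min` with a heuristic over the same kind of cost function.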

2.
The fungal genus Puccinia, comprising several menacing pathogens, has been a persistent peril to global agriculture. Genome sequencing of various members of Puccinia offers scope to explore their genomic riddles. The present study is aimed at exploring the complex niceties of codon and amino acid usage patterns and at elucidating the determinants that drive such behavior. Multivariate statistical analysis revealed a complex interplay of natural selection for translation and compositional bias to be operational on the codon usage patterns. Gene expression level was observed to be the most important factor governing codon usage behavior of the genus. In spite of the subtle AT richness of the genus, potential highly expressed gene sets were found to preferentially employ GC-rich optimal codons. Estimation of relative dinucleotide abundance revealed a preference for GpA, CpA, TpC, and TpG dinucleotides and avoidance of the TpA dinucleotide among the members of the genus. Extensive codon context analysis revealed that codon pairs with GpA, CpA, TpC, and TpG dinucleotides were over-represented and codon pairs with the TpA dinucleotide were extensively avoided at the codon–codon (cP3–cA1) junctions. Amino acid usage signatures of the genus were found to be influenced considerably by several factors such as the aromatic and hydrophobic character of the encoded gene products, genomic compositional constraint, and gene expressivity. Detailed knowledge of the potential highly expressed gene sets and associated optimal codons in the genus promises to be informative for the scientific community engaged in combating Puccinia pathogenesis.
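Relative dinucleotide abundance is commonly computed as an odds ratio, the observed dinucleotide frequency divided by the product of the two mononucleotide frequencies. The sketch below assumes that definition and works on a single strand; the abstract does not state which variant the authors used (some studies symmetrize over both strands), and the function name `dinucleotide_odds` is hypothetical.

```python
from collections import Counter

def dinucleotide_odds(seq):
    """Odds ratio rho_xy = f_xy / (f_x * f_y) for each dinucleotide present.

    Values above 1 mean the pair occurs more often than expected from base
    composition alone; values below 1 mean it is under-represented.
    Dinucleotides that never occur are simply absent from the result.
    """
    seq = seq.upper()
    n = len(seq)
    if n < 2:
        raise ValueError("sequence must contain at least two bases")
    base_freq = {b: c / n for b, c in Counter(seq).items()}
    pairs = Counter(seq[i:i + 2] for i in range(n - 1))  # overlapping pairs
    total_pairs = n - 1
    return {
        p: (c / total_pairs) / (base_freq[p[0]] * base_freq[p[1]])
        for p, c in pairs.items()
    }

ratios = dinucleotide_odds("ATATGCGCTTAACG")
```

Restricting the pairs to codon position 3 followed by position 1 of the next codon would give the junction-specific view (cP3–cA1) that the abstract mentions.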

3.
4.

Background

Production of proteins as therapeutic agents, research reagents and molecular tools frequently depends on expression in heterologous hosts. Synthetic genes are increasingly used for protein production because sequence information is easier to obtain than the corresponding physical DNA. Protein-coding sequences are commonly re-designed to enhance expression, but there are no experimentally supported design principles.

Principal Findings

To identify sequence features that affect protein expression we synthesized and expressed in E. coli two sets of 40 genes encoding two commercially valuable proteins, a DNA polymerase and a single chain antibody. Genes differing only in synonymous codon usage expressed protein at levels ranging from undetectable to 30% of cellular protein. Using partial least squares regression we tested the correlation of protein production levels with parameters that have been reported to affect expression. We found that the amount of protein produced in E. coli was strongly dependent on the codons used to encode a subset of amino acids. Favorable codons were predominantly those read by tRNAs that are most highly charged during amino acid starvation, not codons that are most abundant in highly expressed E. coli proteins. Finally we confirmed the validity of our models by designing, synthesizing and testing new genes using codon biases predicted to perform well.

Conclusion

The systematic analysis of gene design parameters shown in this study has allowed us to identify codon usage within a gene as a critical determinant of achievable protein expression levels in E. coli. We propose a biochemical basis for this, as well as design algorithms to ensure high protein production from synthetic genes. Replication of this methodology should allow similar design algorithms to be empirically derived for any expression system.

5.
Although DNA codon optimization is a standard molecular biology strategy to overcome poor gene expression, to date no public software exists to facilitate this process. Among the uses of codon optimization, human immunodeficiency virus (HIV) vaccine development represents one of the most difficult challenges. A key obstacle to an effective DNA-based vaccine is the low-level expression of HIV genes in mammalian cells, which is due primarily to the instability of HIV mRNAs resulting from AU-rich elements and rare codon usage. In this report we describe the development of a DNA optimization algorithm integrated with a PCR primer design program to redesign specific coding sequences for maximal gene expression. Using this algorithm combination, together with PCR-based gene assembly, we have successfully optimized gene sequences for simian immunodeficiency virus (SIV) strain mac239 structural antigenic proteins gag and env, resulting in high-level gene expression in eukaryotic cells. Our findings demonstrate that our user-friendly algorithm is a valuable tool for DNA-based HIV vaccine development. Moreover, it can be used to optimize any other genes of interest and is freely available online at http://www.vectorcore.pitt.edu/upgene.html.

6.

Background

Synonymous codon usage varies widely between genomes, and also between genes within genomes. Although there is now a large body of data on variations in codon usage, it is still not clear if the observed patterns reflect the effects of positive Darwinian selection acting at the level of translational efficiency or whether these patterns are due simply to the effects of mutational bias. In this study, we have included both intra-genomic and inter-genomic comparisons of codon usage. This allows us to distinguish more efficiently between the effects of nucleotide bias and translational selection.

Results

We show that there is an extreme degree of heterogeneity in codon usage patterns within the rice genome, and that this heterogeneity is highly correlated with differences in nucleotide content (particularly GC content) between the genes. In contrast to the situation observed within the rice genome, Arabidopsis genes show relatively little variation in both codon usage and nucleotide content. By exploiting a combination of intra-genomic and inter-genomic comparisons, we provide evidence that the differences in codon usage among the rice genes reflect a relatively rapid evolutionary increase in the GC content of some rice genes. We also noted that the degree of codon bias was negatively correlated with gene length.

Conclusion

Our results show that mutational bias can cause a dramatic evolutionary divergence in codon usage patterns within a period of approximately two hundred million years. The heterogeneity of codon usage patterns within the rice genome can be explained by a balance between genome-wide mutational biases and negative selection against these biased mutations. The strength of the negative selection is proportional to the length of the coding sequences. Our results indicate that the large variations in synonymous codon usage are not related to selection acting on the translational efficiency of synonymous codons.

7.

Background  

To improve efficiency in high throughput protein structure determination, we have developed a database software package, Gene Composer, which facilitates the information-rich design of protein constructs and their codon engineered synthetic gene sequences. With its modular workflow design and numerous graphical user interfaces, Gene Composer enables researchers to perform all common bio-informatics steps used in modern structure guided protein engineering and synthetic gene engineering.

8.
9.
Codon optimization is a generic technique to achieve optimum expression of a foreign gene in the host's cell system. Selection of optimum codons depends on the codon usage of the host genome and the presence of several desirable and undesirable sequence motifs. Searching for these motifs in all possible combinations of codons increases the search space exponentially with respect to sequence length. GASCO is an algorithm developed for optimum codon selection using genetic algorithms. The algorithm reduces the search space and provides an approximate solution to the problem. The algorithm has applications in DNA vaccine design for successfully eliciting potent immune responses and in synthetic gene design for metabolic pathway engineering. The software for the proposed algorithm is available at http://miracle.igib.res.in/gasco/.

10.
Analysis of synonymous codon usage in H5N1 virus and other influenza A viruses   (Cited by: 11; self-citations: 0; citations by others: 11)
Zhou T, Gu W, Ma J, Sun X, Lu Z. Bio Systems, 2005, 81(1): 77-86
In this study, we calculated the codon usage bias in H5N1 virus and performed a comparative analysis of synonymous codon usage patterns in H5N1 virus, five other evolutionarily related influenza A viruses and an influenza B virus. Codon usage bias in the H5N1 genome is slight and is mainly determined by the base composition at the third codon position. By comparing synonymous codon usage patterns in different viruses, we observed that the codon usage pattern of H5N1 virus is similar to that of other influenza A viruses, but not to that of influenza B virus, and that synonymous codon usage in influenza A virus genes is phylogenetically conserved, but not strain-specific. Synonymous codon usage in genes encoded by different influenza A viruses is conserved at the genus level. Compositional constraints could explain most of the variation of synonymous codon usage among these virus genes, while gene function is also correlated with synonymous codon usage to a certain extent. However, translational selection and gene length have no effect on the variation of synonymous codon usage in these virus genes.

11.
Uropathogenic Escherichia coli (UPEC) bacteria are the principal cause of urinary tract infections (UTIs). Because these bacteria propagate intracellularly, the cellular immune response is an important factor in UTIs. Therefore, we designed a genetic construct to induce a cellular immune response. In order to develop a genetic construct that induces strong cellular immunity against this pathogen, we used a synthetic fimH gene designed according to mammalian codon usage, and its expression was compared with that of the gene with wild type codon usage. Initially, we designed two constructs, pVAX/fimH mam and pVAX/fimH wt, which contain mammalian and wild type codon usage, respectively. The COS-7 cell line was transfected separately with a complex of pVAX/fimH mam-ExGene 500 polycationic polymer and pVAX/fimH wt-ExGene 500 polycationic polymer. Expression of the fimH gene from both constructs in COS-7 cells was confirmed by RT-PCR, SDS-PAGE, and Western blotting. Both of the pVAX/fimH cassettes expressed the inserted fimH genes (mam and wt) in COS-7 cells. Our results suggest that codon optimization successfully expressed the fimH gene because the fimH gene with mammalian codon usage is compatible with the eukaryotic expression system. Therefore, mammalian codon usage could be appropriate in a pVAX/fimH construct as a DNA vaccine.

12.
Mitogen activated protein kinase (MAPK) genes provide resistance to various biotic and abiotic stresses. Codon usage profiling reveals characteristic features of the genes such as nucleotide composition, gene expressivity, and optimal codons. The present study is a comparative analysis of codon usage patterns for different MAPK genes in three organisms, viz. Arabidopsis thaliana, Glycine max (soybean) and Oryza sativa (rice). The study revealed a high AT content in the MAPK genes of Arabidopsis and soybean, whereas rice showed a balanced AT-GC content at the third synonymous codon position. The genes show a low bias in their codon usage profile, as reflected in the high values (50.83 to 56.55) of the effective number of codons (Nc). The prediction of the gene expression profile of the MAPK genes revealed that these genes might be under the selective pressure of translational optimization, as reflected in the low codon adaptation index (CAI) values ranging from 0.147 to 0.208.
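For readers unfamiliar with the CAI values quoted here, the index is usually defined as the geometric mean of per-codon relative adaptiveness weights, where each weight is a codon's count in a reference set of highly expressed genes divided by the count of the most used synonymous codon. CAI therefore lies between 0 and 1, with values near 1 indicating use of the codons favored in the reference set. The sketch below assumes that definition with the standard genetic code; the tiny reference set, the function names, and the choice to skip codons missing from the reference (real tools typically use a pseudocount) are illustrative assumptions, not the study's procedure.

```python
import math
from collections import Counter, defaultdict

# Standard genetic code in TCAG order ("*" marks stop codons).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TO_AA = dict(zip((a + b + c for a in BASES for b in BASES for c in BASES), AMINO))

def codons(seq):
    """Split an in-frame A/C/G/T sequence into codons, dropping a partial tail."""
    seq = seq.upper()
    return [seq[i:i + 3] for i in range(0, len(seq) - len(seq) % 3, 3)]

def relative_adaptiveness(reference_genes):
    """w = count of a codon / count of the most used synonymous codon."""
    counts = Counter(c for gene in reference_genes for c in codons(gene))
    best = defaultdict(int)
    for codon, n in counts.items():
        aa = CODON_TO_AA[codon]
        best[aa] = max(best[aa], n)
    return {codon: n / best[CODON_TO_AA[codon]] for codon, n in counts.items()}

def cai(gene, weights):
    """Geometric mean of weights; Met, Trp, stops and unseen codons are skipped."""
    logs = [
        math.log(weights[c])
        for c in codons(gene)
        if CODON_TO_AA[c] not in "MW*" and c in weights
    ]
    return math.exp(sum(logs) / len(logs)) if logs else float("nan")

# Hypothetical reference set standing in for highly expressed genes.
weights = relative_adaptiveness(["GCTGCTGCTGCC", "AAAAAAAAG"])
score = cai("GCCAAG", weights)
```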

13.
Reverse ecology is the inference of ecological information from patterns of genomic variation. One rich, heretofore underutilized, source of ecologically relevant genomic information is codon optimality or adaptation. Bias toward codons that match the tRNA pool is robustly associated with high gene expression in diverse organisms, suggesting that codon optimization could be used in a reverse ecology framework to identify highly expressed, ecologically relevant genes. To test this hypothesis, we examined the relationship between optimal codon usage in the classic galactose metabolism (GAL) pathway and known ecological niches for 329 species of budding yeasts, a diverse subphylum of fungi. We find that optimal codon usage in the GAL pathway is positively correlated with quantitative growth on galactose, suggesting that GAL codon optimization reflects increased capacity to grow on galactose. Optimal codon usage in the GAL pathway is also positively correlated with human-associated ecological niches in yeasts of the CUG-Ser1 clade and with dairy-associated ecological niches in the family Saccharomycetaceae. For example, optimal codon usage of GAL genes is greater than 85% of all genes in the genome of the major human pathogen Candida albicans (CUG-Ser1 clade) and greater than 75% of genes in the genome of the dairy yeast Kluyveromyces lactis (family Saccharomycetaceae). We further find a correlation between optimization in the GALactose pathway genes and several genes associated with nutrient sensing and metabolism. This work suggests that codon optimization harbors information about the metabolic ecology of microbial eukaryotes. This information may be particularly useful for studying fungal dark matter—species that have yet to be cultured in the lab or have only been identified by genomic material.

Bias toward codons that match the tRNA pool is robustly associated with high gene expression in diverse organisms, suggesting that codon optimization could be used in a reverse ecology framework to identify ecologically relevant genes. This study finds that this is indeed the case for 329 species of budding yeasts, suggesting that codon optimization harbors information about the metabolic ecology of microbial eukaryotes.

14.
Synonymous codon usage patterns of bacteriophage and host genomes were compared. Two indexes, the G + C base composition of a gene (fgc) and the fraction of translationally optimal codons of the gene (fop), were used in the comparison. Synonymous codon usage data for all the coding sequences in a genome are represented as a cloud of points in the plane of fop vs. fgc. The Escherichia coli coding sequences appear to exhibit two phases, a "rising" and a "flat" phase. Genes that are essential for survival and are thought to be native are located in the flat phase, while foreign-type genes from prophages and transposons are found in the rising phase with a slope of nearly unity in the fgc vs. fop plot. Synonymous codon distribution patterns of genes from the temperate phages P4, P2, N15 and lambda are similar to the pattern of E. coli rising phase genes. In contrast, genes from the virulent phages T7 and T4, for which a phage-encoded DNA polymerase is identified, fall on a line with a slope of nearly zero in the fop vs. fgc plane. These results may suggest that the G + C contents of T7, T4 and E. coli flat phase genes are subject to directional mutation pressure and are determined by the DNA polymerase used in replication. There is significant variation in the fop values of the phage genes, suggesting an adjustment to gene expression level. Similar analyses of codon distribution patterns were carried out for Haemophilus influenzae, Bacillus subtilis, Mycobacterium tuberculosis and their phages with complete genomic sequences available.
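The two indexes used in this abstract can be computed per gene in a few lines. The sketch below assumes the usual definition of fop as optimal codons divided by optimal plus non-optimal codons (codons in neither set, such as ATG, are ignored); the optimal and non-optimal sets shown are hypothetical placeholders, since the real sets are organism-specific and are not given in the abstract.

```python
def fgc(gene):
    """Fraction of G and C bases in the gene."""
    gene = gene.upper()
    return (gene.count("G") + gene.count("C")) / len(gene)

def fop(gene, optimal, nonoptimal):
    """Fraction of optimal codons among codons classified as optimal or non-optimal."""
    gene = gene.upper()
    codons = [gene[i:i + 3] for i in range(0, len(gene) - len(gene) % 3, 3)]
    n_opt = sum(c in optimal for c in codons)
    n_non = sum(c in nonoptimal for c in codons)
    if n_opt + n_non == 0:
        raise ValueError("no classified codons in gene")
    return n_opt / (n_opt + n_non)

# Hypothetical codon classes for illustration only.
OPTIMAL = {"GCT", "AAA"}
NONOPTIMAL = {"GCC", "AAG"}

gene = "GCTGCCAAAATG"
point = (fgc(gene), fop(gene, OPTIMAL, NONOPTIMAL))  # one point in the fop vs. fgc plane
```

Computing one such point per coding sequence and plotting them gives the cloud of points in which the abstract distinguishes rising and flat phases.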

15.
The Horizontal Gene Transfer DataBase (HGT-DB) is a genomic database that includes statistical parameters such as G+C content, codon and amino-acid usage, as well as information about which genes deviate in these parameters for prokaryotic complete genomes. Under the hypothesis that genes from distantly related species have different nucleotide compositions, these deviated genes may have been acquired by horizontal gene transfer. The current version of the database contains 88 bacterial and archaeal complete genomes, including multiple chromosomes and strains. For each genome, the database provides statistical parameters for all the genes, as well as averages and standard deviations of G+C content, codon usage, relative synonymous codon usage and amino-acid content. It also provides information about correspondence analyses of the codon usage, plus lists of extraneous groups of genes in terms of G+C content and lists of putatively acquired genes. With this information, researchers can explore the G+C content and codon usage of a gene when they find incongruities in sequence-based phylogenetic trees. A search engine that allows searches for gene names or keywords for a specific organism is also available. HGT-DB is freely accessible at http://www.fut.es/~debb/HGT.
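Relative synonymous codon usage (RSCU), one of the parameters listed above, is conventionally the observed count of a codon divided by the count expected if all synonymous codons for that amino acid were used equally. The following is a minimal sketch under that definition with the standard genetic code; it is not the database's own code, and the function name `rscu` is hypothetical.

```python
from collections import Counter, defaultdict

# Standard genetic code in TCAG order ("*" marks stop codons).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TO_AA = dict(zip((a + b + c for a in BASES for b in BASES for c in BASES), AMINO))

# Group codons into synonymous families keyed by amino acid.
FAMILIES = defaultdict(list)
for codon, aa in CODON_TO_AA.items():
    FAMILIES[aa].append(codon)

def rscu(genes):
    """RSCU = count * family_size / family_total, per sense codon.

    A value of 1 means no bias; amino acids that never occur in the input
    are left out, as are stop codons.
    """
    counts = Counter(
        g[i:i + 3].upper() for g in genes for i in range(0, len(g) - len(g) % 3, 3)
    )
    result = {}
    for aa, family in FAMILIES.items():
        total = sum(counts[c] for c in family)
        if aa == "*" or total == 0:
            continue
        for c in family:
            result[c] = counts[c] * len(family) / total
    return result

values = rscu(["GCTGCTGCTGCC"])  # alanine codons only, as a toy input
```

Genes whose RSCU profile or G+C content departs strongly from the genome average are the kind of candidates the database flags as putatively acquired.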

16.

Background

The construction of customized nucleic acid sequences allows us to have greater flexibility in gene design for recombinant protein expression. Among the various parameters considered for such DNA sequence design, individual codon usage (ICU) has been implicated as one of the most crucial factors affecting mRNA translational efficiency. However, previous works have also reported the significant influence of codon pair usage, also known as codon context (CC), on the level of protein expression.

Results

In this study, we have developed novel computational procedures for evaluating the relative importance of optimizing ICU and CC for enhancing protein expression. By formulating appropriate mathematical expressions to quantify the ICU and CC fitness of a coding sequence, optimization procedures based on genetic algorithm were employed to maximize its ICU and/or CC fitness. Surprisingly, the in silico validation of the resultant optimized DNA sequences for Escherichia coli, Lactococcus lactis, Pichia pastoris and Saccharomyces cerevisiae suggests that CC is a more relevant design criterion than the commonly considered ICU.

Conclusions

The proposed CC optimization framework can complement and enhance the capabilities of current gene design tools, with potential applications to heterologous protein production and even vaccine development in synthetic biotechnology.

17.
There is significant variation in codon usage bias among different species and even among genes within the same organism. Codon optimization, that is, redesigning a gene to use the codons preferred by the specific expression system, results in improved expression of heterologous genes in bacteria, plants, yeast, mammalian cells, and transgenic animals. The mechanisms that prevent genes with rare or low-usage codons from being expressed at adequate levels are not completely elucidated. Human immunodeficiency virus (HIV) represents an interesting model for studying how differences in codon usage affect gene expression in heterologous systems. Construction of synthetic genes with optimized codons demonstrated that codon-usage effects might be a major impediment to the efficient expression of HIV gag/pol and env gene products in mammalian cells. According to another hypothesis, the poor expression of HIV structural proteins even without HIV context is attributed to the so-called cis-acting inhibitory elements (INS), which are located within the protein-coding region. They consist of AU-rich sequences and may be inactivated through the introduction of multiple mutations over large regions of the gag gene. In our work, we evaluated expression of hybrid HIV-1 gag mRNAs in which wild-type (A-rich) gag sequences were combined with artificial sequences. In such "humanized" gag fragments with adapted codon usage, AT content was significantly reduced in favor of G and C nucleotides without any changes in protein sequence. We show that wild-type gag sequences negatively influence expression of the gag reporter, and that the addition of fragments with optimized codons to gag mRNA partially rescues its expression. The results demonstrate that the expression of HIV-1 gag is determined by the ratio of optimized and rare codons within the mRNA. Our data also indicate that some wtgag fragments counteract the influence of the other wtgag sequences, which cause the inhibition of gag expression. The presented data do not contradict the concept of INS; yet they make the definition of INS more complex. This supports the idea of a broader role of the selected codon usage in influencing the expression of HIV proteins in mammalian cells.

18.
The Selective Advantage of Synonymous Codon Usage Bias in Salmonella   (Cited by: 1; self-citations: 0; citations by others: 1)
The genetic code in mRNA is redundant, with 61 sense codons translated into 20 different amino acids. Individual amino acids are encoded by up to six different codons but within codon families some are used more frequently than others. This phenomenon is referred to as synonymous codon usage bias. The genomes of free-living unicellular organisms such as bacteria have an extreme codon usage bias and the degree of bias differs between genes within the same genome. The strong positive correlation between codon usage bias and gene expression levels in many microorganisms is attributed to selection for translational efficiency. However, this putative selective advantage has never been measured in bacteria and theoretical estimates vary widely. By systematically exchanging optimal codons for synonymous codons in the tuf genes we quantified the selective advantage of biased codon usage in highly expressed genes to be in the range 0.2–4.2 × 10⁻⁴ per codon per generation. These data quantify for the first time the potential for selection on synonymous codon choice to drive genome-wide sequence evolution in bacteria, and in particular to optimize the sequences of highly expressed genes. This quantification may have predictive applications in the design of synthetic genes and for heterologous gene expression in biotechnology.

19.
Synthetic biology is a recent scientific approach to engineering biological systems from both pre-existing and novel parts. The aim is to introduce a computer-aided design approach into biology, leading to rapid delivery of useful applications. Though the term reprogramming has been frequently used in the synthetic biology community, the current level of technological sophistication only allows for a probabilistic approach rather than a precise engineering approach. Recently, several human health applications have emerged that suggest increased use of synthetic biology approaches in developing novel drugs. This mini review discusses recent translational developments in the field and tries to identify some of the upcoming developments.

20.

Background  

Direct synthesis of genes is rapidly becoming the most efficient way to make functional genetic constructs and enables applications such as codon optimization, RNAi resistant genes and protein engineering. Here we introduce a software tool that drastically facilitates the design of synthetic genes.
