Similar articles
20 similar articles found
1.
The universal genetic code is degenerate, with 61 codons specifying 20 amino acids, thus creating synonymous codons for a single amino acid. Synonymous codons have been shown to affect protein properties in a given organism. To address this issue and explore how Escherichia coli selects its “codon-preferred” DNA template(s) for synthesis of proteins with required properties, we designed synonymous codon libraries based on an antibody (scFv) sequence and carried out bacterial expression and screening for variants with altered properties. As a result, 342 codon variants were identified, differing significantly in protein solubility and functionality while retaining the identical original amino acid sequence. The soluble expression level varied from completely insoluble aggregates to a soluble yield of ~2.5 mg/liter, whereas the antigen-binding activity ranged from no binding at all to a binding affinity of > 10⁻⁸ M. Not only does our work demonstrate the involvement of the genetic code in regulating protein synthesis and folding, but it also provides a novel screening strategy for producing improved proteins without the need to substitute amino acids.
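
The degeneracy described in this abstract can be made concrete with a short sketch. The Python below is an illustration only, not the authors' library design: it builds the standard genetic code, groups the 61 sense codons by amino acid, and counts how many distinct DNA templates encode a given peptide. The function names and the example peptides are assumptions introduced here.

```python
from itertools import product
from math import prod

# Standard genetic code, codons enumerated with bases in the order T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def synonymous_codons():
    """Group the sense codons by the amino acid they encode (stops excluded)."""
    groups = {}
    for codon, aa in CODON_TABLE.items():
        if aa != "*":
            groups.setdefault(aa, []).append(codon)
    return groups


def count_synonymous_templates(protein):
    """Number of distinct coding sequences that encode the same peptide."""
    groups = synonymous_codons()
    return prod(len(groups[aa]) for aa in protein)
```

For example, `count_synonymous_templates("MLS")` multiplies one codon for Met by six for Leu and six for Ser, which shows how quickly the space of synonymous templates grows with sequence length.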

2.
The limitations of current mutagenesis techniques are analyzed in terms of the number and kinds of codon changes they make and in terms of the population size needed to produce all single or multiple amino acid variants. It is shown how a technique that can alter a single codon of a gene, producing all possible variant codons without affecting the rest of the gene, has certain advantages, if it can be used at each place in the gene in one experiment. Such a technique has advantages when the goals are to understand: (1) how specific structural alterations in a mutant protein cause it to function in a different but specific way, (2) how to predict which amino acids in a protein contact or interact with each other, and (3) why a protein is more or less sensitive to mutational disruption, depending upon the specific mutation. This is because it would generate the maximum number of (1) mutant proteins with different functions, (2) intracistronic suppressors for any starting mutation, and (3) random amino acid substitutions at random places. Furthermore, such a technique could produce useful variants more quickly and on a smaller scale than either evolution or current methods.
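
The population-size argument in this abstract rests on simple combinatorics. As a rough sketch (the function name and the 100-residue example are assumptions, not figures from the paper), the number of distinct variants with exactly k substituted positions is the number of ways to choose the positions times the number of alternatives at each one: 19 for amino acid replacements, or 63 when every other codon is counted.

```python
from math import comb


def variant_count(n_positions, k, alternatives=19):
    """Distinct variants with exactly k substituted positions.

    alternatives=19 counts amino acid replacements per position;
    alternatives=63 counts every other codon at a position instead.
    """
    if not 0 <= k <= n_positions:
        raise ValueError("k must lie between 0 and n_positions")
    return comb(n_positions, k) * alternatives ** k
```

For a hypothetical 100-residue protein this gives 1,900 single and about 1.8 million double amino acid variants, which illustrates why the population needed for complete coverage grows so quickly with the number of simultaneous changes.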

3.
Fluorocarbons are quintessentially man-made molecules, fluorine being all but absent from biology. Perfluorinated molecules exhibit novel physicochemical properties that include extreme chemical inertness, thermal stability, and an unusual propensity for phase segregation. The question we and others have sought to answer is to what extent these properties can be engineered into proteins. Here, we review recent studies in which proteins have been designed that incorporate highly fluorinated analogs of hydrophobic amino acids with the aim of creating proteins with novel chemical and biological properties. Fluorination seems to be a general and effective strategy to enhance the stability of proteins, both soluble and membrane bound, against chemical and thermal denaturation, while retaining structure and biological activity. Most studies have focused on small proteins that can be produced by peptide synthesis, as synthesis of large proteins containing specifically fluorinated residues remains challenging. However, the development of various biosynthetic methods for introducing noncanonical amino acids into proteins promises to expand the utility of fluorinated amino acids in protein design.

4.
MOTIVATION: Fold recognition is a key step in the protein structure discovery process, especially when traditional sequence comparison methods fail to yield convincing structural homologies. Although many methods have been developed for protein fold recognition, their accuracies remain low. This can be attributed to insufficient exploitation of fold discriminatory features. RESULTS: We have developed a new method for protein fold recognition using structural information of amino acid residues and amino acid residue pairs. Since protein fold recognition can be treated as a protein fold classification problem, we have developed a Support Vector Machine (SVM) based classifier approach that uses secondary structural state and solvent accessibility state frequencies of amino acids and amino acid pairs as feature vectors. Among the individual properties examined, secondary structural state frequencies of amino acids gave an overall accuracy of 65.2% for fold discrimination, which is better than the accuracy of any method reported so far in the literature. Combination of secondary structural state frequencies with solvent accessibility state frequencies of amino acids and amino acid pairs further improved the fold discrimination accuracy to more than 70%, which is approximately 8% higher than the best available method. In this study we have also tested, for the first time, an all-together multi-class method known as the Crammer and Singer method for protein fold classification. Our studies reveal that the three multi-class classification methods, namely one versus all, one versus one and the Crammer and Singer method, yield similar predictions. AVAILABILITY: Dataset and stand-alone program are available upon request.
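
One plausible reading of "secondary structural state frequencies of amino acids" is a fixed-length vector with one entry per (amino acid, state) pair. The sketch below follows that reading; the three-state H/E/C alphabet, the normalization by sequence length, and the function name are assumptions made here, since the abstract does not give the exact feature definition. The resulting vector is the kind of input an SVM classifier would consume.

```python
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
STATES = "HEC"  # helix, strand, coil (assumed three-state alphabet)


def ss_state_frequencies(sequence, states):
    """Return a 60-element vector of (amino acid, state) frequencies.

    Entry i * 3 + j holds the fraction of residues that are amino acid
    AMINO_ACIDS[i] in secondary structural state STATES[j].
    """
    if len(sequence) != len(states):
        raise ValueError("sequence and state strings must have equal length")
    index = {
        (aa, s): i * len(STATES) + j
        for i, aa in enumerate(AMINO_ACIDS)
        for j, s in enumerate(STATES)
    }
    counts = [0] * len(index)
    for pair in zip(sequence, states):
        if pair in index:  # silently skip non-standard symbols
            counts[index[pair]] += 1
    total = len(sequence)
    return [c / total for c in counts] if total else [0.0] * len(counts)
```

Solvent accessibility states and residue pairs could be encoded the same way and concatenated, which is how the combined feature sets mentioned in the abstract might be assembled.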

5.
Predicting the effects of amino acid substitutions on protein stability provides invaluable information for protein design, the assignment of biological function, and for understanding disease-associated variations. To understand the effects of substitutions, computational models are preferred to time-consuming and expensive experimental methods. Several methods have been proposed for this task including machine learning-based approaches. However, models trained using limited data have performance problems and many model parameters tend to be over-fitted. To decrease the number of model parameters and to improve the generalization potential, we calculated the amino acid contact energy change for point variations using a structure-based coarse-grained model. Based on the structural properties including contact energy (CE) and further physicochemical properties of the amino acids as input features, we developed two support vector machine classifiers. M47 predicted the stability of variant proteins with an accuracy of 87% and a Matthews correlation coefficient of 0.68 for a large dataset of 1925 variants, whereas M8 performed better when a relatively small dataset of 388 variants was used for 20-fold cross-validation. The performance of the M47 classifier on all six tested contingency table evaluation parameters is better than that of existing machine learning-based models or energy function-based protein stability classifiers.
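
Accuracy and the Matthews correlation coefficient quoted in this abstract are both derived from a binary contingency table. The sketch below shows the standard definitions; the function names are introduced here, and any counts passed in are the caller's own, since the abstract does not report the underlying table.

```python
from math import sqrt


def accuracy(tp, tn, fp, fn):
    """Fraction of correctly classified cases."""
    total = tp + tn + fp + fn
    return (tp + tn) / total if total else 0.0


def matthews_corrcoef(tp, tn, fp, fn):
    """Matthews correlation coefficient for a 2x2 contingency table.

    Returns 0.0 when any marginal sum is zero, a common convention
    for the otherwise undefined case.
    """
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0
    return (tp * tn - fp * fn) / denom
```

Unlike accuracy, the coefficient stays informative when the two classes are of very different sizes, which is one reason it is commonly reported alongside accuracy for stability predictors.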

6.
7.
Diversification of protein sequence-structure space is a major concern in protein engineering. Deletion mutagenesis can generate a protein sequence-structure space different from that produced by substitution mutagenesis, but it has not been widely used in protein engineering, because it causes a relatively large range of structural perturbations in target proteins, which often inactivates them. Using green fluorescent protein (GFP) as a model system, we demonstrate that this drawback of deletional protein engineering can be overcome by employing a protein structure with high stability. The systematic dissection of N-terminal, C-terminal and internal sequences of GFPs with two different stabilities showed that a GFP with high stability (s-GFP) was more tolerant to the elimination of amino acids than a GFP with normal stability (n-GFP). The deletion studies of s-GFP enabled us to obtain three interesting variants, viz. s-DL4, s-N14, and s-C225, which could not be obtained from n-GFP. The deletion of the 191–196 loop sequence led to the variant s-DL4, which was expressed predominantly in insoluble form but was mostly active. The s-N14 and s-C225 variants lack the amino acid residues involved in secondary structures around the N- and C-termini of the GFP fold, respectively, and exhibit biophysical properties comparable to those of n-GFP. Structural analysis of the variants through computational modeling gave a few structural insights that can explain the spectral properties of the variants. Our study suggests that the protein sequence-structure space of deletion mutants can be more efficiently explored by employing a protein structure with higher stability.

8.
Reduced amino acid alphabets are useful for understanding molecular evolution, as they reveal the basal, shared properties of amino acids on which the structures and functions of proteins rely. Several previous studies derived such reduced alphabets and linked them to the origin of life and biotechnological applications. However, all this previous work presupposes that only direct contacts of amino acids in native protein structures are relevant. We show in this work, using information-theoretical measures, that an appropriate alphabet reduction scheme is in fact a function of the maximum distance at which amino acids interact. Although for small distances our results agree with previous ones, we show how long-range interactions change the overall picture and prompt a revised understanding of the protein design process. Proteins 2010. © 2010 Wiley-Liss, Inc.
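
To illustrate what an alphabet reduction does, the sketch below maps a sequence onto a two-letter hydrophobic/polar alphabet and measures how much per-residue Shannon entropy is lost. Both choices are assumptions made for illustration: the grouping is an arbitrary textbook-style split, not one derived in the paper, and the abstract does not say which information-theoretical measures the authors used.

```python
from collections import Counter
from math import log2

# Hypothetical two-letter reduction: H = hydrophobic, P = everything else.
HYDROPHOBIC = set("AVLIMFWC")


def reduce_sequence(sequence):
    """Rewrite a protein sequence in the reduced H/P alphabet."""
    return "".join("H" if aa in HYDROPHOBIC else "P" for aa in sequence)


def shannon_entropy(sequence):
    """Per-symbol Shannon entropy (bits) of the symbol distribution."""
    if not sequence:
        return 0.0
    n = len(sequence)
    return sum(
        (count / n) * log2(n / count)
        for count in Counter(sequence).values()
    )
```

Comparing `shannon_entropy(seq)` with `shannon_entropy(reduce_sequence(seq))` gives a crude view of the information discarded by a grouping; a distance-dependent analysis such as the one the abstract describes would additionally need structural coordinates.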

9.
Identifying the biochemically active amino acids in proteins is key to improving our knowledge of how enzymes work, to predicting the function of newly discovered protein structures of unknown function, and to establishing design principles for enzyme engineering. Here, we explore recently reported computational chemistry-based methods for the prediction of active amino acids in protein 3D structures, including biochemically important distal residues, and their implications for functional genomics, for enzyme design, and for enhancing understanding of the function of enzymes.

10.
Constrained Coding Regions (CCRs) in the human genome have been derived from DNA sequencing data of large cohorts of healthy control populations, available in the Genome Aggregation Database (gnomAD) [1]. They identify regions depleted of protein-changing variants and thus identify segments of the genome that have been constrained during human evolution. By mapping these DNA-defined regions from genomic coordinates onto the corresponding protein positions and combining this information with protein annotations, we have explored the distribution of CCRs and compared their co-occurrence with different protein functional features, previously annotated at the amino acid level in public databases. As expected, our results reveal that functional amino acids involved in interactions with DNA/RNA, protein–protein contacts and catalytic sites are the protein features most likely to be highly constrained for variation in the control population. More surprisingly, we also found that linear motifs, linear interacting peptides (LIPs), disorder–order transitions upon binding with other protein partners and liquid–liquid phase separating (LLPS) regions are also strongly associated with high constraint for variability. We also compared intra-species constraints in the human CCRs with inter-species conservation and functional residues to explore how such CCRs may contribute to the analysis of protein variants. As has been previously observed, CCRs are only weakly correlated with conservation, suggesting that intraspecies constraints complement interspecies conservation and can provide more information to interpret variant effects.
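
The coordinate mapping mentioned in this abstract can be sketched in a few lines. This is a simplified illustration, not the authors' pipeline: it assumes a plus-strand transcript, 1-based inclusive coordinates, and a list of coding exon segments already sorted in transcript order. The function name and example coordinates are hypothetical.

```python
def genomic_to_protein_position(pos, cds_exons):
    """Map a genomic position to a 1-based residue number.

    cds_exons is a list of (start, end) coding segments, 1-based and
    inclusive, on the plus strand and in transcript order.
    Returns None when the position is not in the coding sequence.
    """
    offset = 0  # coding bases that precede the current exon
    for start, end in cds_exons:
        if start <= pos <= end:
            return (offset + pos - start) // 3 + 1
        offset += end - start + 1
    return None
```

With exons `[(100, 108), (200, 208)]`, position 200 is the tenth coding base and therefore falls in residue 4. Minus-strand genes would need the exon order and the within-exon offset reversed.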

11.
Budisa N, Pal PP. Biological Chemistry, 2004, 385(10): 893-904
Fluorescence methods are now well-established and powerful tools to study biological macromolecules. The canonical amino acid tryptophan (Trp), encoded by a single UGG triplet, is the main reporter of intrinsic fluorescence properties of most natural proteins and peptides and is thus an attractive target for tailoring their spectral properties. Recent advances in research have provided substantial evidence that the natural protein translational machinery can be genetically reprogrammed to introduce a large number of non-coded (i.e. noncanonical) Trp analogues and surrogates into various proteins. Especially attractive targets for such an engineering approach are fluorescent proteins in which the chromophore is formed post-translationally from an amino acid sequence, like the green fluorescent protein from Aequorea victoria. With the currently available translationally active fluoro-, hydroxy-, amino-, halogen-, and chalcogen-containing Trp analogues and surrogates, the traditional methods for protein engineering and design can be supplemented or even fully replaced by these novel approaches. Future research will provide a further increase in the number of Trp-like amino acids that are available for redesign (by engineering of the genetic code) of native Trp residues and enable novel strategies to generate proteins with tailored spectral properties.

12.
There is much interest in characterizing the variation in a human individual, because this may elucidate what contributes significantly to a person's phenotype, thereby enabling personalized genomics. We focus here on the variants in a person's 'exome,' which is the set of exons in a genome, because the exome is believed to harbor much of the functional variation. We provide an analysis of the approximately 12,500 variants that affect the protein coding portion of an individual's genome. We identified approximately 10,400 nonsynonymous single nucleotide polymorphisms (nsSNPs) in this individual, of which approximately 15-20% are rare in the human population. We predict approximately 1,500 nsSNPs affect protein function and these tend to be heterozygous, rare, or novel. Of the approximately 700 coding indels, approximately half tend to have lengths that are a multiple of three, which causes insertions/deletions of amino acids in the corresponding protein, rather than introducing frameshifts. Coding indels also occur frequently at the termini of genes, so even if an indel causes a frameshift, an alternative start or stop site in the gene can still be used to make a functional protein. In summary, we reduced the set of approximately 12,500 nonsilent coding variants by approximately 8-fold to a set of variants that are most likely to have major effects on their proteins' functions. This is our first glimpse of an individual's exome and a snapshot of the current state of personalized genomics. The majority of coding variants in this individual are common and appear to be functionally neutral. Our results also indicate that some variants can be used to improve the current NCBI human reference genome. As more genomes are sequenced, many rare variants and non-SNP variants will be discovered. We present an approach to analyze the coding variation in humans by proposing multiple bioinformatic methods to home in on possible functional variation.
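
The distinction this abstract draws between indels whose length is a multiple of three and those that shift the reading frame reduces to a modulo test. The sketch below is a minimal illustration with hypothetical function names; it ignores splice sites and the alternative start or stop sites the abstract mentions.

```python
def classify_indel(length_bp):
    """Label a coding indel by its effect on the reading frame."""
    if length_bp <= 0:
        raise ValueError("indel length must be a positive number of bases")
    return "in-frame" if length_bp % 3 == 0 else "frameshift"


def in_frame_fraction(lengths):
    """Fraction of indels in a collection that preserve the reading frame."""
    lengths = list(lengths)
    if not lengths:
        return 0.0
    kept = sum(1 for n in lengths if classify_indel(n) == "in-frame")
    return kept / len(lengths)
```

An in-frame indel of n bases inserts or removes exactly n / 3 codons, so the downstream amino acid sequence is unchanged.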

13.
This paper describes the structure of a 70-kb porcine gene for nuclear factor I, including its promoter region, comprising a total of 11 exons. Different mRNAs that we have isolated as cDNAs from both porcine liver and human HeLa cells presumably are generated from this gene by differential splicing events. One cDNA species from porcine liver that lacks exon 9 carries coding information for a protein of 439 amino acids. The in vitro translated protein displays all the properties of an NFI-like protein with high affinity toward the sequence element TGG(N)6GCCAA, as shown by gel shift analysis, and little or no affinity toward CCAAT-box-containing sequences. Cotranslation experiments with full-length and truncated variants of the protein demonstrate that it binds as a dimer to its cognate DNA recognition sequence. Its DNA-binding domain, which is retained in all cDNA clones, was mapped by deletion analysis to the 250 N-terminal amino acids of the protein. No structural homologies are observed between this protein and other known DNA-binding proteins; instead, the protein contains a novel alpha-helical sequence motif consisting of several lysine residues spaced at intervals of seven amino acids, which we have termed the "lysine helix". The C-terminal portion of the protein derived from full-length cDNAs encodes a short amino acid sequence which is identical with the heptapeptide repeat CT7 observed in the C-terminal domain of the largest subunits of yeast and mouse RNA polymerase II. This region is removed by differential splicing in some of the NFI/CTF cDNAs and thus may be of functional significance.

14.

Background

Global residue-specific amino acid mutagenesis can provide important biological insight and generate proteins with altered properties, but at the risk of protein misfolding. Further, targeted libraries are usually restricted to a handful of amino acids because there is an exponential correlation between the number of residues randomized and the size of the resulting ensemble. Using GFP as the model protein, we present a strategy, termed protein evolution via amino acid and codon elimination, through which simplified, native-like polypeptides encoded by a reduced genetic code were obtained via screening of reduced-size ensembles.

Methodology/Principal Findings

The strategy involves combining a sequential mutagenesis scheme to reduce library size with structurally stabilizing mutations, chaperone complementation, and reduced temperature of gene expression. In six steps, we eliminated a common buried residue, Phe, from the green fluorescent protein (GFP), while retaining activity. A GFP variant containing 11 Phe residues was used as starting scaffold to generate 10 separate variants in which each Phe was replaced individually (in one construct two adjacent Phe residues were changed simultaneously), while retaining varying levels of activity. Combination of these substitutions to generate a Phe-free variant of GFP abolished fluorescence. Combinatorial re-introduction of five Phe residues, based on the activities of the respective single amino acid replacements, was sufficient to restore GFP activity. Successive rounds of mutagenesis generated active GFP variants containing three, two, and zero Phe residues. These GFPs all displayed progenitor-like fluorescence spectra, temperature-sensitive folding, a reduced structural stability and, for the least stable variants, a reduced steady state abundance.

Conclusions/Significance

The results provide strategies for the design of novel GFP reporters. The described approach offers a means to enable engineering of active proteins that lack certain amino acids, a key step towards expanding the functional repertoire of uniquely labeled proteins in synthetic biology.

15.
In contrast to most gammaretrovirus envelope proteins (Env), the Gibbon ape leukemia virus (GaLV) Env protein does not mediate the infectivity of human immunodeficiency virus type 1 (HIV-1) particles. We made use of this observation to set up a directed evolution system by creating a library of GaLV Env variants diversified at three critical amino acids, all located around the R-peptide cleavage site within the cytoplasmic tail. This library was screened for variants that were able to functionally pseudotype HIV-1 vector particles. All selected Env variants mediated the infectivity of HIV-1 vector particles and encoded novel cytoplasmic tail motifs. They were efficiently incorporated into HIV particles, and the R peptide was processed by the HIV protease. Interestingly, in some of the selected variants, the R-peptide cleavage site had shifted closer to the C terminus. These data demonstrate a valuable approach for the engineering of chimeric viruses and vector particles.

16.
As part of the GAIT (genetic analysis of idiopathic thrombophilia) project, we analyzed polymorphisms in the factor V (FV) gene to assess their role as genetic determinants of normal phenotypic variation of hemostasis-related traits in a Spanish population. During the analysis of exon 13 polymorphisms, we detected an abnormal PCR-amplified fragment in some members of the GAIT19 family. Direct sequence analysis revealed a deletion of 108 bp in eight out of 20 individuals in this family. This deletion removes exactly 36 amino acids from the B domain of FV; thus it does not alter the reading frame of the sequence. Among the deleted amino acids there is the 4070A>G polymorphism (H1299R), which could affect the level or function of FV. In addition, in the same family we identified three novel DNA variants (L1257I, Q1317Q and T1327T) in exon 13 of the F5 gene. Despite these variants, we did not detect any differences in the coagulant or anticoagulant traits, or in the levels of plasma proteins involved in the blood coagulation cascade, between the carriers and their non-carrier relatives. From these results, we can conclude that the mutant allele is expressed and the resultant protein is functional. Moreover, it is unlikely that the 4070A>G polymorphism, within the deletion, and the novel DNA variants alter the functional properties of the mature FV protein. Further analyses of this naturally occurring mutation and the novel DNA variants should yield useful information for the understanding of the function of the B domain of FV.

17.
There are several approaches to creating synthetic-biological systems. Here, we describe a molecular-design approach. First, we lay out a possible synthetic-biology space, which we define with a plot of complexity of components versus divergence from nature. In this scheme, there are basic units, which range from natural amino acids to totally synthetic small molecules. These are linked together to form programmable tectons, for example, amphipathic alpha-helices. In turn, tectons can interact to give self-assembled units, which can combine and organize further to produce functional assemblies and systems. To illustrate one path through this vast landscape, we focus on protein engineering and design. We describe how, for certain protein-folding motifs, polypeptide chains can be instructed to fold. These folds can be combined to give structured complexes, and function can be incorporated through computational design. Finally, we describe how protein-based systems may be encapsulated to control and investigate their functions.

18.
Structural variation in the primary structure of human T200 glycoprotein has been detected. Three cDNA variants have been characterized, each of which encodes a T200 molecule that differs in size as a result of sequence differences in its amino-terminal region. The largest form of the molecule is distinguished from the smallest by an insert of 161 amino acids, after the first eight amino-terminal residues. The other variant has an insert at the same location of 47 amino acids identical to residues 75-121 in the larger insert. Both extra domains are rich in serine and threonine residues and are likely to display multiple O-linked oligosaccharides. These structural variants, which probably arise by cell-type-specific alternative splicing, provide a molecular basis for the previously observed structural and antigenic heterogeneity of T200 glycoprotein. In addition to the variable amino-terminal region, the external domain of human T200 glycoprotein consists of a second cysteine-rich region of about 400 amino acids, a single transmembrane-spanning region and a large cytoplasmic domain of 707 amino acids shared by all of the structural variants and highly conserved between species. The gene encoding human T200 is located on the long arm of chromosome 1.

19.
20.
Much effort has been dedicated to the design of significantly red-shifted variants of the green fluorescent protein (GFP) from Aequorea victoria (av). These approaches have been based on classical engineering with the 20 canonical amino acids. We report here an expansion of these efforts by incorporation of an amino-substituted variant of tryptophan into the "cyan" GFP mutant, which turned it into a "gold" variant. This variant possesses a red shift in emission unprecedented for any avFP, similar to "red" FPs, but with enhanced stability and a very low aggregation tendency. An increasing number of non-natural amino acids are available for chromophore redesign (by engineering of the genetic code) and enable new general strategies to generate novel classes of tailor-made GFP proteins.
