Similar articles
 Found 20 similar articles (search time: 15 ms)
1.
Designing new protein folds requires a method for simultaneously optimizing the conformation of the backbone and the side-chains. One approach to this problem is the use of a parameterized backbone, which allows the systematic exploration of families of structures. We report the crystal structure of RH3, a right-handed, three-helix coiled coil that was designed using a parameterized backbone and detailed modeling of core packing. This crystal structure was determined using another rationally designed feature, a metal-binding site that permitted experimental phasing of the X-ray data. RH3 adopted the intended fold, which has not been observed previously in biological proteins. Unanticipated structural asymmetry in the trimer was a principal source of variation within the RH3 structure. The sequence of RH3 differs from that of a previously characterized right-handed tetramer, RH4, at only one position in each 11 amino acid sequence repeat. This close similarity indicates that the design method is sensitive to the core packing interactions that specify the protein structure. Comparison of the structures of RH3 and RH4 indicates that both steric overlap and cavity formation provide strong driving forces for oligomer specificity.

2.
In recent years, there have been significant advances in the field of computational protein design, including the successful computational design of enzymes based on backbone scaffolds from experimentally solved structures. It is likely that large‐scale sampling of protein backbone conformations will become necessary as further progress is made on more complicated systems. Removing the constraint of having to use scaffolds based on known protein backbones is a potential method of solving the problem. With this application in mind, we describe a method to systematically construct a large number of de novo backbone structures from idealized topological forms in a top–down hierarchical approach. The structural properties of these novel backbone scaffolds were analyzed and compared with a set of high‐resolution experimental structures from the Protein Data Bank (PDB). It was found that the Ramachandran plot distribution and relative γ‐ and β‐turn frequencies were similar to those found in the PDB. The de novo scaffolds were sequence designed with RosettaDesign, and the energy distributions and amino acid compositions were comparable with the results for redesigned experimentally solved backbones. Proteins 2010. © 2009 Wiley‐Liss, Inc.

3.
This paper reports the isolation of cDNAs encoding the protein backbone of two arabinogalactan-proteins (AGPs), one from pear cell suspension cultures (AGP Pc 2) and the other from suspension cultures of Nicotiana alata (AGP Na 2). The proteins encoded by these cDNAs are quite different from the 'classical' AGP backbones described previously for AGPs isolated from pear suspension cultures and extracts of N. alata styles. The cDNA for AGP Pc 2 encodes a 294 amino acid protein, of which a relatively short stretch (35 amino acids) is Hyp/Pro rich; this stretch is flanked by sequences which are dominated by Asn residues. Asn residues are not a feature of the 'classical' AGP backbones in which Hyp/Pro, Ser, Ala and Thr account for most of the amino acids. The cDNA for AGP Na 2 encodes a 437 amino acid protein, which contains two distinct domains: one rich in Hyp/Pro, Ser, Ala, Thr and the other rich in Asn, Tyr and Ser. The composition and sequence of the Pro-rich domain resembles that of the 'classical' AGP backbone. The Asn-rich domains of the two cDNAs described have no sequence similarity; in both cases they are predicted to be processed to give a mature backbone with a composition similar to that of the 'classical' AGPs. The study shows that different AGPs can differ in the amino acid sequence in the protein backbone, as well as the composition and sequence of the arabinogalactan side-chains. It also shows that differential expression of genes encoding AGP protein backbones, as well as differential glycosylation, can contribute to the tissue specificity of AGPs.

4.
Arabinogalactan proteins (AGPs) are extracellular proteoglycans implicated in plant growth and development. We searched for classical AGPs in Arabidopsis by identifying expressed sequence tags based on the conserved domain structure of the predicted protein backbone. To confirm that these genes encoded bona fide AGPs, we purified native AGPs and then deglycosylated and deblocked them for N-terminal protein sequencing. In total, we identified 15 genes encoding the protein backbones of classical AGPs, including genes for AG peptides, which are AGPs with very short backbones (10 to 13 amino acid residues). Seven of the AGPs were verified as AGPs by protein sequencing. A gene encoding a putative cell adhesion molecule with AGP-like domains was also identified. This work provides a firm foundation for beginning functional analysis by using a genetic approach.

5.
Recent efforts to design de novo or redesign the sequence and structure of proteins using computational techniques have met with significant success. Most, if not all, of these computational methodologies attempt to model atomic-level interactions, and hence high-resolution structural characterization of the designed proteins is critical for evaluating the atomic-level accuracy of the underlying design force-fields. We previously used our computational protein design protocol RosettaDesign to completely redesign the sequence of the activation domain of human procarboxypeptidase A2. With 68% of the wild-type sequence changed, the designed protein, AYEdesign, is over 10 kcal/mol more stable than the wild-type protein. Here, we describe the high-resolution crystal structure and solution NMR structure of AYEdesign, which show that the experimentally determined backbone and side-chain conformations are effectively superimposable with the computational model at atomic resolution. To isolate the origins of the remarkable stabilization, we have designed and characterized a new series of procarboxypeptidase mutants that gain significant thermodynamic stability with a minimal number of mutations; one mutant gains more than 5 kcal/mol of stability over the wild-type protein with only four amino acid changes. We explore the relationship between force-field smoothing and conformational sampling by comparing the experimentally determined free energies of the overall design and these focused subsets of mutations to those predicted using modified force-fields, and both fixed and flexible backbone sampling protocols.

6.
Computational protein design methods can complement experimental screening and selection techniques by predicting libraries of low-energy sequences compatible with a desired structure and function. Incorporating backbone flexibility in computational design allows conformational adjustments that should broaden the range of predicted low-energy sequences. Here, we evaluate computational predictions of sequence libraries from different protocols for modeling backbone flexibility using the complex between the therapeutic antibody Herceptin and its target human epidermal growth factor receptor 2 (HER2) as a model system. Within the program RosettaDesign, three methods are compared: The first two use ensembles of structures generated by Monte Carlo protocols for near-native conformational sampling: kinematic closure (KIC) and backrub, and the third method uses snapshots from molecular dynamics (MD) simulations. KIC or backrub methods were better able to identify the amino acid residues experimentally observed by phage display in the Herceptin-HER2 interface than MD snapshots, which generated much larger conformational and sequence diversity. KIC and backrub, as well as fixed backbone simulations, captured the key mutation Asp98Trp in Herceptin, which leads to a further threefold affinity improvement of the already subnanomolar parental Herceptin-HER2 interface. Modeling subtle backbone conformational changes may assist in the design of sequence libraries for improving the affinity of antibody-antigen interfaces and could be suitable for other protein complexes for which structural information is available.

7.
Using a protein design algorithm that considers side-chain packing quantitatively, the effect of explicit backbone motion on the selection of amino acids in protein design was assessed in the core of the streptococcal protein G beta 1 domain (G beta 1). Concerted backbone motion was introduced by varying G beta 1's supersecondary structure parameter values. The stability and structural flexibility of seven of the redesigned proteins were determined experimentally and showed that core variants containing as many as 6 of 10 possible mutations retain native-like properties. This result demonstrates that backbone flexibility can be combined explicitly with amino acid side-chain selection and that the selection algorithm is sufficiently robust to tolerate perturbations as large as 15% of G beta 1's native supersecondary structure parameter values.

8.
We have developed a computational approach for the design and prediction of hydrophobic cores that includes explicit backbone flexibility. The program consists of a two-stage combination of a genetic algorithm and Monte Carlo sampling using a torsional model of the protein. Backbone structures are evaluated either by a canonical force-field or a constraining potential that emphasizes the preservation of local geometry. The utility of the method for protein design and engineering is explored by designing three novel hydrophobic core variants of the protein 434 cro. We use the new method to evaluate these and previously designed 434 cro variants, as well as a series of phage T4 lysozyme variants. In order to properly evaluate the influence of backbone flexibility, we have also analyzed the effects of varying amounts of side-chain flexibility on the performance of fixed backbone methods. Comparison of results using a fixed versus flexible backbone reveals that, surprisingly, the two methods are almost equivalent in their abilities to predict relative experimental stabilities, but only when full side-chain flexibility is allowed. The prediction of core side-chain structure can vary dramatically between methods. In some, but not all, cases the flexible backbone method is a better predictor of structure. The development of a flexible backbone approach to core design is particularly important for attempts at de novo protein design, where there is no prior knowledge of a precise backbone structure.

9.
Prediction of amino acid sequence from structure   Cited by: 2 (self-citations: 0; citations by others: 2)
We have developed a method for the prediction of an amino acid sequence that is compatible with a three-dimensional backbone structure. Using only a backbone structure of a protein as input, the algorithm is capable of designing sequences that closely resemble natural members of the protein family to which the template structure belongs. In general, the predicted sequences are shown to have multiple sequence profile scores that are dramatically higher than those of random sequences, and sometimes better than some of the natural sequences that make up the superfamily. As anticipated, highly conserved but poorly predicted residues are often those that contribute to the functional rather than structural properties of the protein. Overall, our analysis suggests that statistical profile scores of designed sequences are a novel and valuable figure of merit for assessing and improving protein design algorithms.

10.
Recent advances in modeling protein structures at the atomic level have made it possible to tackle "de novo" computational protein design. Most procedures are based on combinatorial optimization using a scoring function that estimates the folding free energy of a protein sequence on a given main-chain structure. However, the computation of the conformational entropy in the folded state is generally an intractable problem, and its contribution to the free energy is not properly evaluated. In this article, we propose a new automated protein design methodology that incorporates such conformational entropy based on statistical mechanics principles. We define the free energy of a protein sequence by the corresponding partition function over rotamer states. The free energy is written in variational form in a pairwise approximation and minimized using the Belief Propagation algorithm. In this way, a free energy is associated with each amino acid sequence: we use this insight to rescore the results obtained with a standard minimization method, with the energy as the cost function. Then, we set up a design method that directly uses the free energy as a cost function in combination with a stochastic search in the sequence space. We validate the methods on the design of three superficial sites of a small SH3 domain, and then apply them to the complete redesign of 27 proteins. Our results indicate that accounting for entropic contribution in the score function affects the outcome in a highly nontrivial way, and might improve current computational design techniques based on protein stability.
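As a rough illustration of the quantity this abstract describes, the free energy of one sequence can be written as F = -kT ln Z, with Z summed over rotamer states. The sketch below evaluates it by brute-force enumeration for a toy system. It is not the Belief Propagation scheme the authors use, and the energy tables, the function names `total_energy` and `free_energy`, and the value of kT are assumptions made for illustration.

```python
import itertools
import math

KT = 0.593  # kcal/mol near 298 K (assumed temperature, not from the abstract)


def total_energy(assignment, self_e, pair_e):
    """Energy of one rotamer assignment: one-body terms plus pairwise terms."""
    energy = sum(self_e[i][r] for i, r in enumerate(assignment))
    for (i, j), table in pair_e.items():
        energy += table[assignment[i]][assignment[j]]
    return energy


def free_energy(self_e, pair_e, kt=KT):
    """Exact F = -kT ln Z by enumerating every rotamer combination.

    Only feasible for tiny systems; approximate methods are needed when the
    number of rotamer states grows exponentially with sequence length.
    """
    states = itertools.product(*(range(len(s)) for s in self_e))
    energies = [total_energy(a, self_e, pair_e) for a in states]
    e_min = min(energies)
    # Shift by the minimum energy so the exponentials stay well scaled.
    z_shifted = sum(math.exp(-(e - e_min) / kt) for e in energies)
    return e_min - kt * math.log(z_shifted)
```

Because Z contains at least the lowest-energy state, the free energy is never higher than the minimum energy; the gap between the two is the entropic contribution that a pure energy minimization ignores.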

11.
Hu X, Kuhlman B. Proteins, 2006, 62(3): 739-748
Loss of side-chain conformational entropy is an important force opposing protein folding and the relative preferences of the amino acids for being buried or solvent exposed may be partially determined by which amino acids lose more side-chain entropy when placed in the core of a protein. To investigate these preferences, we have incorporated explicit modeling of side-chain entropy into the protein design algorithm, RosettaDesign. In the standard version of the program, the energy of a particular sequence for a fixed backbone depends only on the lowest energy side-chain conformations that can be identified for that sequence. In the new model, the free energy of a single amino acid sequence is calculated by evaluating the average energy and entropy of an ensemble of structures generated by Monte Carlo sampling of amino acid side-chain conformations. To evaluate the impact of including explicit side-chain entropy, sequences were designed for 110 native protein backbones with and without the entropy model. In general, the differences between the two sets of sequences are modest, with the largest changes being observed for the longer amino acids: methionine and arginine. Overall, the identity between the designed sequences and the native sequences does not increase with the addition of entropy, unlike what is observed when other key terms are added to the model (hydrogen bonding, Lennard-Jones energies, and solvation energies). These results suggest that side-chain conformational entropy has a relatively small role in determining the preferred amino acid at each residue position in a protein.

12.
One of the classical DNA-binding proteins, bacteriophage lambda Cro, forms a homodimer with a unique fold of alpha-helices and beta-sheets. We have computationally designed an artificial sequence of 60 amino acid residues to stabilize the backbone tertiary structure of the lambda Cro dimer by simulated annealing using knowledge-based structure-sequence compatibility functions. The designed amino acid sequence has 25% identity with that of natural lambda Cro and preserves Phe58, which is important for formation of the stably folded structure of lambda Cro. The designed dimer protein and its monomeric variant, which was redesigned by the insertion of a beta-hairpin sequence at the C-terminal region to prevent dimerization, were synthesized and biochemically characterized to be well folded. The designed protein was monomeric under a wide range of protein concentrations and its solution structure was determined by NMR spectroscopy. The solved structure is similar to that of a monomeric variant of natural lambda Cro, with a root-mean-square deviation of the polypeptide backbones of 2.1 Å, and has a well-packed protein core. Thus, our knowledge-based functions provide approximate but essential relationships between amino acid sequences and protein structures, and are useful for finding novel sequences that are foldable into a given target structure.
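The simulated-annealing search over sequences mentioned in this abstract can be sketched generically as follows. This is a minimal sketch, not the authors' implementation: the scoring function `toy_score` (which simply rewards hydrophobic residues at hypothetical buried positions) stands in for their knowledge-based compatibility functions, and the cooling schedule and all parameter values are assumptions.

```python
import math
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
HYDROPHOBIC = set("AVILMFWC")


def toy_score(seq, buried=(0, 3, 4, 7)):
    """Hypothetical score: count positions whose polarity mismatches burial."""
    return sum(1 for i, aa in enumerate(seq) if (aa in HYDROPHOBIC) != (i in buried))


def anneal_sequence(score, length, n_steps=2000, t_start=2.0, t_end=0.05, seed=0):
    """Minimize score(seq) by single-site mutations under a cooling schedule."""
    rng = random.Random(seed)
    seq = [rng.choice(AMINO_ACIDS) for _ in range(length)]
    current = score(seq)
    best_seq, best = list(seq), current
    for step in range(n_steps):
        # Geometric cooling from t_start down to t_end.
        t = t_start * (t_end / t_start) ** (step / max(1, n_steps - 1))
        pos = rng.randrange(length)
        old = seq[pos]
        seq[pos] = rng.choice(AMINO_ACIDS)
        new = score(seq)
        # Metropolis criterion: always accept improvements, sometimes accept worse.
        if new <= current or rng.random() < math.exp(-(new - current) / t):
            current = new
            if current < best:
                best_seq, best = list(seq), current
        else:
            seq[pos] = old  # reject the mutation
    return "".join(best_seq), best
```

Calling `anneal_sequence(toy_score, length=8)` returns the best sequence found and its score; with a realistic scoring function, only the `score` argument would change.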

13.
A previously developed computer program for protein design, RosettaDesign, was used to predict low free energy sequences for nine naturally occurring protein backbones. RosettaDesign had no knowledge of the naturally occurring sequences and on average 65% of the residues in the designed sequences differ from wild-type. Synthetic genes for ten completely redesigned proteins were generated, and the proteins were expressed, purified, and then characterized using circular dichroism, chemical and temperature denaturation and NMR experiments. Although high-resolution structures have not yet been determined, eight of these proteins appear to be folded and their circular dichroism spectra are similar to those of their wild-type counterparts. Six of the proteins have stabilities equal to or up to 7 kcal/mol greater than their wild-type counterparts, and four of the proteins have NMR spectra consistent with a well-packed, rigid structure. These encouraging results indicate that the computational protein design methods can, with significant reliability, identify amino acid sequences compatible with a target protein backbone.

14.
It is generally accepted that many different protein sequences have similar folded structures, and that there is a relatively high probability that a new sequence possesses a previously observed fold. An indirect consequence of this is that protein design should define the sequence space accessible to a given structure, rather than providing a single optimized sequence. We have recently developed a new approach for protein sequence design, which optimizes the complete sequence of a protein based on the knowledge of its backbone structure, its amino acid composition and a physical energy function including van der Waals interactions, electrostatics, and environment free energy. The specificity of the designed sequence for its template backbone is imposed by keeping the amino acid composition fixed. Here, we show that our procedure converges in sequence space, albeit not to the native sequence of the protein. We observe that while polar residues are well conserved in our designed sequences, non-polar amino acids at the surface of a protein are often replaced by polar residues. The designed sequences provide a multiple alignment of sequences that all adopt the same three-dimensional fold. This alignment is used to derive a profile matrix for chicken triose phosphate isomerase, TIM. The matrix is found to recognize significantly the native sequence for TIM, as well as closely related sequences. Possible application of this approach to protein fold recognition is discussed.

15.
Automated methodologies to design synthetic proteins from first principles use energy computations to estimate the ability of the sequences to adopt a targeted structure. This approach is still far from systematically producing native-like sequences, due, most likely, to inaccuracies when modeling the interactions between the protein and its aqueous environment. This is particularly challenging when engineering small protein domains (with fewer polar pair interactions than interactions with the solvent). We have re-designed a three-helix bundle, domain B, using a fixed backbone and a four amino acid alphabet. We have enlarged the rotamer library with conformers that increase the weight of electrostatic interactions within the design process without altering the energy function used to compute the folding free energy. Our synthetic sequences show less than 15% similarity to any Swissprot sequence. We have characterized our sequences in different solvents using circular dichroism and nuclear magnetic resonance. The targeted structure achieved is dependent on the solvent used. This method can be readily extended to larger domains. Our method will be useful for the engineering of proteins that become active only in a given solvent and for designing proteins in the context of hydrophobic solvents, an important fraction of the situations in the cell.

16.
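A neural-network-based algorithm was used to predict the secondary structure of the full-length EEN (extra eleven nineteen) protein, a novel leukemia-associated protein that we cloned ourselves. The results indicate that EEN may have three domains: the N-terminal region consists of three α-helices and short β-strands, the middle region is a four-helix structure formed by four α-helices, and the C-terminal region is an SH3 domain resembling the C-terminal SH3 domain of SEM-5/GRB2, which plays an important role in receptor tyrosine kinase signal transduction pathways. Homology modeling was used to model the three-dimensional structure of the EEN SH3 domain. The results show that the EEN SH3 domain has a structure similar to that of the SEM-5/GRB2 SH3 domain and that the amino acids forming the proline-binding region are highly conserved. These results suggest that EEN may be a novel signaling protein, possibly involved in a new signal transduction pathway or a new signaling bypass, with the SH3 domain as its functional region.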

17.
18.
Computational design of binding sites in proteins remains difficult, in part due to limitations in our current ability to sample backbone conformations that enable precise and accurate geometric positioning of side chains during sequence design. Here we present a benchmark framework for comparison between flexible-backbone design methods applied to binding interactions. We quantify the ability of different flexible backbone design methods in the widely used protein design software Rosetta to recapitulate observed protein sequence profiles assumed to represent functional protein/protein and protein/small molecule binding interactions. The CoupledMoves method, which combines backbone flexibility and sequence exploration into a single acceptance step during the sampling trajectory, better recapitulates observed sequence profiles than the BackrubEnsemble and FastDesign methods, which separate backbone flexibility and sequence design into separate acceptance steps during the sampling trajectory. Flexible-backbone design with the CoupledMoves method is a powerful strategy for reducing sequence space to generate targeted libraries for experimental screening and selection.

19.
Measurements of protein sequence-structure correlations   Cited by: 1 (self-citations: 0; citations by others: 1)
Crooks GE, Wolfe J, Brenner SE. Proteins, 2004, 57(4): 804-810
Correlations between protein structures and amino acid sequences are widely used for protein structure prediction. For example, secondary structure predictors generally use correlations between a secondary structure sequence and corresponding primary structure sequence, whereas threading algorithms and similar tertiary structure predictors typically incorporate interresidue contact potentials. To investigate the relative importance of these sequence-structure interactions, we measured the mutual information among the primary structure, secondary structure and side-chain surface exposure, both for adjacent residues along the amino acid sequence and for tertiary structure contacts between residues distantly separated along the backbone. We found that local interactions along the amino acid chain are far more important than non-local contacts and that correlations between proximate amino acids are essentially uninformative. This suggests that knowledge-based contact potentials may be less important for structure prediction than is generally believed.
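To make the central measure of this abstract concrete, the sketch below computes the mutual information between two discrete variables, for example residue type and secondary structure class, from paired observations. It uses the simple plug-in estimate from observed frequencies, which is biased for small samples; the authors' actual estimator and data are not given here, and the function name `mutual_information` is an assumption.

```python
import math
from collections import Counter


def mutual_information(pairs):
    """Plug-in mutual information in bits for a list of (x, y) observations."""
    n = len(pairs)
    joint = Counter(pairs)
    left = Counter(x for x, _ in pairs)
    right = Counter(y for _, y in pairs)
    mi = 0.0
    for (x, y), count in joint.items():
        # p(x, y) / (p(x) * p(y)) simplifies to count * n / (left[x] * right[y]).
        mi += (count / n) * math.log2(count * n / (left[x] * right[y]))
    return mi
```

If residue type fully determines the structural class across two equally frequent classes, the result is 1 bit; if the two variables are statistically independent in the sample, it is 0.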

20.
Schug A, Herges T, Wenzel W. Proteins, 2004, 57(4): 792-798
All-atom protein structure prediction from the amino acid sequence alone remains an important goal of biophysical chemistry. Recent progress in force field development and validation suggests that the PFF01 free-energy force field correctly predicts the native conformation of various helical proteins as the global optimum of its free-energy surface. Reproducible protein structure prediction requires the availability of efficient optimization methods to locate the global minima of such complex potentials. Here we investigate an adapted version of the parallel tempering method as an efficient parallel stochastic optimization method for protein structure prediction. Using this approach we report the reproducible all-atom folding of the three-helix 40 amino acid HIV accessory protein from random conformations to within 2.4 Å backbone RMS deviation from the experimental structure with modest computational resources.
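For readers unfamiliar with parallel tempering, the following is a minimal sketch of the textbook method applied to a one-dimensional toy energy function. It is not the adapted version the authors investigate and does not use the PFF01 force field; the function names, step size, temperature ladder, and reduced units (Boltzmann constant set to 1) are all assumptions.

```python
import math
import random


def swap_probability(e_i, e_j, t_i, t_j):
    """Standard replica-exchange acceptance: min(1, exp((b_i - b_j)(E_i - E_j)))."""
    delta = (1.0 / t_i - 1.0 / t_j) * (e_i - e_j)
    return 1.0 if delta >= 0 else math.exp(delta)


def parallel_tempering(energy, x0, temperatures, n_sweeps, step=0.5, seed=0):
    """Run one replica per temperature and return the lowest-energy point seen."""
    rng = random.Random(seed)
    xs = [x0] * len(temperatures)
    es = [energy(x0)] * len(temperatures)
    best_x, best_e = x0, es[0]
    for _ in range(n_sweeps):
        # Metropolis move inside each replica at its own temperature.
        for k, t in enumerate(temperatures):
            cand = xs[k] + rng.uniform(-step, step)
            e_cand = energy(cand)
            if e_cand <= es[k] or rng.random() < math.exp(-(e_cand - es[k]) / t):
                xs[k], es[k] = cand, e_cand
            if es[k] < best_e:
                best_x, best_e = xs[k], es[k]
        # Attempt exchanges between neighbouring temperatures.
        for k in range(len(temperatures) - 1):
            p = swap_probability(es[k], es[k + 1], temperatures[k], temperatures[k + 1])
            if rng.random() < p:
                xs[k], xs[k + 1] = xs[k + 1], xs[k]
                es[k], es[k + 1] = es[k + 1], es[k]
    return best_x, best_e
```

The high-temperature replicas cross barriers easily, and the exchange step lets configurations they discover migrate down to the low-temperature replicas, where they are refined. For example, `parallel_tempering(lambda x: (x * x - 1) ** 2, 3.0, [0.1, 0.5, 2.0], 500)` searches a double-well potential starting far from either minimum.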
