Similar articles
1.

Background

We present a simple method to train a potential function for the protein folding problem which, even though trained using a small number of proteins, is able to place a significantly large number of native conformations near a local minimum. The training relies on generating decoys by energy minimization of the native conformations using the current potential and using a physically meaningful objective function (derivative of energy with respect to torsion angles at the native conformation) during the quadratic programming to place the native conformation near a local minimum.

Results

We also compare the performance of three different types of energy functions and find that while the pairwise energy function is trainable, a solvation energy function by itself is untrainable if decoys are generated by minimizing the current potential starting at the native conformation. The best results are obtained when a pairwise interaction energy function is used with solvation energy function.

Conclusions

Using only six training proteins, we obtain a potential function that places 42 of 91 native conformations within ~4 Å rmsd, and 71 within ~6 Å rmsd, of a local minimum. Furthermore, a threading test on the same 91 proteins ranks 89 native conformations first and the other two second.

2.
Loose C  Klepeis JL  Floudas CA 《Proteins》2004,54(2):303-314
A new force field for pairwise residue interactions as a function of Cα to Cα distances is presented. The force field was developed through the solution of a linear programming formulation with large sets of constraints. The constraints are based on the construction of >80,000 low-energy decoys for a set of proteins and on requiring the decoy energies for each protein system to be higher than the energy of the native conformation of that particular protein. The generation of a robust force field was facilitated by the use of a novel decoy generation process, which involved the rational selection of proteins to add to the training set and included a significant energy minimization of the decoys. The force field was tested on a large set of decoys for various proteins not included in the training set and shown to perform well compared with a leading force field in identifying the native conformation for these proteins.

3.
There are several knowledge-based energy functions that can distinguish the native fold from a pool of grossly misfolded decoys for a given sequence of amino acids. These decoys, which are typically generated by mounting, or “threading”, the sequence onto the backbones of unrelated protein structures, tend to be non-compact and quite different from the native structure: the root-mean-squared (RMS) deviations from the native are commonly in the range of 15 to 20 Å. Effective energy functions should also demonstrate a similar recognition capability when presented with compact decoys that depart only slightly in conformation from the correct structure (i.e. those with RMS deviations of ∼5 Å or less). Recently, we developed a simple yet powerful method for native fold recognition based on the tendency for native folds to form hydrophobic cores. Our energy measure, which we call the hydrophobic fitness score, is challenged to recognize the native fold from 2000 near-native structures generated for each of five small monomeric proteins. First, 1000 conformations for each protein were generated by molecular dynamics simulation at room temperature. The average RMS deviation of this set of 5000 was 1.5 Å. A total of 323 decoys had energies lower than native; however, none of these had RMS deviations greater than 2 Å. Another 1000 structures were generated for each at high temperature, in which a greater range of conformational space was explored (4.3 Å average RMS deviation). Out of this set, only seven decoys were misrecognized. The hydrophobic fitness energy of a conformation is strongly dependent upon the RMS deviation. On average our potential yields energy values which are lowest for the population of structures generated at room temperature, intermediate for those produced at high temperature and highest for those constructed by threading methods. In general, the lowest energy decoy conformations have backbones very close to native structure. 
The possible utility of our method for screening backbone candidates for the purpose of modelling by side-chain packing optimization is discussed.

4.
We present a method to derive contact energy parameters from large sets of proteins. The basic requirement on which our method is based is that for each protein in the database the native contact map has lower energy than all its decoy conformations that are obtained by threading. Only when this condition is satisfied can one use the proposed energy function for fold identification. Such a set of parameters can be found (by perceptron learning) if Mp, the number of proteins in the database, is not too large. Other aspects that influence the existence of such a solution are the exact definition of contact and the value of the critical distance Rc, below which two residues are considered to be in contact. Another important novel feature of our approach is its ability to determine whether an energy function of some suitable proposed form can or cannot be parameterized in a way that satisfies our basic requirement. As a demonstration of this, we determine the region in the (Rc, Mp) plane in which the problem is solvable, i.e., we can find a set of contact parameters that simultaneously stabilize all the native conformations. We show that for large enough databases the contact approximation to the energy cannot stabilize all the native folds even against the decoys obtained by gapless threading.
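
The perceptron condition in this abstract (every native must have lower energy than each of its decoys) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: each conformation is reduced to a vector of contact counts, the energy is the dot product of that vector with the parameter vector `w`, and the function names, learning rate, input format and toy data are all hypothetical.

```python
def energy(w, counts):
    """Contact energy: sum over contact types of (parameter * number of contacts)."""
    return sum(wi * ci for wi, ci in zip(w, counts))


def train_contact_energies(pairs, n_types, lr=0.1, max_epochs=1000):
    """Perceptron-style search for w such that E(native) < E(decoy) for every pair.

    pairs: list of (native_counts, decoy_counts) tuples (assumed input format).
    Returns (w, solved); solved is False when no separating w was found
    within max_epochs, as happens when the requirement cannot be satisfied.
    """
    w = [0.0] * n_types
    for _ in range(max_epochs):
        violated = False
        for native, decoy in pairs:
            # The requirement is E(decoy) - E(native) > 0.
            if energy(w, decoy) - energy(w, native) <= 0:
                violated = True
                # Shift w along (decoy - native) to widen this pair's energy gap.
                w = [wi + lr * (d - n) for wi, d, n in zip(w, decoy, native)]
        if not violated:
            return w, True
    return w, False


# Toy data: two contact types; natives are rich in type 0, decoys in type 1.
toy_pairs = [([5, 1], [2, 4]), ([4, 0], [1, 3]), ([6, 2], [3, 5])]
w, solved = train_contact_energies(toy_pairs, n_types=2)

# Contradictory toy data: no parameter vector can satisfy both inequalities.
unsolvable_pairs = [([1, 0], [0, 1]), ([0, 1], [1, 0])]
_, solved_bad = train_contact_energies(unsolvable_pairs, n_types=2, max_epochs=50)
```

In this sketch a `solved` value of `False` plays the role of the "unsolvable" region of the (Rc, Mp) plane: the inequalities admit no common solution, so the loop never completes a pass without a violation.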

5.
We studied the possibility of approximating a Lennard-Jones interaction by a pairwise contact potential. First we used a Lennard-Jones potential to design off-lattice, protein-like heteropolymer sequences, whose lowest energy (native) conformations were then identified by molecular dynamics. Then we investigated whether one can find a pairwise contact potential whose ground states are the contact maps associated with these native conformations. We show that such a requirement cannot be satisfied exactly, i.e., no such contact parameters exist. Nevertheless, we found that one can find contact energy parameters for which an energy minimization procedure, acting in the space of contact maps, yields maps whose corresponding structures are close to the native ones. Finally, we show that when these structures are used as the initial point of a molecular dynamics energy minimization process, the correct native folds are recovered with high probability.

6.
Liang S  Liu S  Zhang C  Zhou Y 《Proteins》2007,69(2):244-253
Near-native selections from docking decoys have proved challenging, especially when unbound proteins are used in the molecular docking. One reason is that significant atomic clashes in docking decoys lead to poor predictions of binding affinities of near-native decoys. Atomic clashes can be removed by structural refinement through energy minimization. Such an energy minimization, however, will lead to an unrealistic bias toward docked structures with large interfaces. Here, we extend an empirical energy function developed for protein design to protein-protein docking selection by introducing a simple reference state that removes the unrealistic dependence of the binding affinity of docking decoys on the buried solvent-accessible surface area of the interface. The energy function, called EMPIRE (EMpirical Protein-InteRaction Energy), when coupled with a refinement strategy, is found to provide a significantly improved success rate in near-native selections when applied to RosettaDock and refined ZDOCK docking decoys. Our work underlines the importance of removing nonspecific interactions from specific ones in near-native selections from docking decoys.

7.
We suggest a new approach to the generation of candidate structures (decoys) for ab initio prediction of protein structures. Our method is based on random sampling of conformation space and subsequent local energy minimization. At the core of this approach lies the design of a novel type of energy function. This energy function has local minima with native structure characteristics and wide basins of attraction. The current work presents our motivation for deriving such an energy function and also tests the derived energy function. Our approach is novel in that it takes advantage of the inherently rough energy landscape of proteins, which is generally considered a major obstacle for protein structure prediction. When local minima have wide basins of attraction, the protein's conformation space can be greatly reduced by the convergence of large regions of the space into single points, namely the local minima corresponding to these funnels. We have implemented this concept by an iterative process. The potential is first used to generate decoy sets and then we study these sets of decoys to guide further development of the potential. A key feature of our potential is the use of cooperative multi-body interactions that mimic the role of the entropic and solvent contributions to the free energy. The validity and value of our approach are demonstrated by applying it to 14 diverse, small proteins. We show that, for these proteins, the size of conformation space is considerably reduced by the new energy function. In fact, the reduction is so substantial as to allow efficient conformational sampling. As a result we are able to find a significant number of near-native conformations in random searches performed with limited computational resources.

8.
Two methods were proposed recently to derive energy parameters from known native protein conformations and corresponding sets of decoys. One is based on finding, by means of a perceptron learning scheme, energy parameters such that the native conformations have lower energies than the decoys. The second method maximizes the difference between the native energy and the average energy of the decoys, measured in terms of the width of the decoys' energy distribution (Z-score). Whereas the perceptron method is sensitive mainly to "outlier" (i.e., extremal) decoys, the Z-score optimization is governed by the high density regions in decoy-space. We compare the two methods by deriving contact energies for two very different sets of decoys: the first obtained for model lattice proteins and the second by threading. We find that the potentials derived by the two methods are of similar quality and fairly closely related. This finding indicates that standard, naturally occurring sets of decoys are distributed in a way that yields robust energy parameters (that are quite insensitive to the particular method used to derive them). The main practical implication of this finding is that it is not necessary to fine-tune the potential search method to the particular set of decoys used.
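
The contrast between the two criteria can be illustrated with a small sketch. It assumes the usual definition of the Z-score, (native energy minus mean decoy energy) divided by the standard deviation of the decoy energies; the function names and the toy energies are hypothetical and not taken from the paper.

```python
from statistics import mean, pstdev


def z_score(native_energy, decoy_energies):
    """Z-score of the native energy relative to the decoy energy distribution.

    More negative values mean the native lies further below the decoy average,
    measured in units of the decoys' standard deviation.
    """
    spread = pstdev(decoy_energies)
    if spread == 0:
        raise ValueError("decoy energies have zero spread; Z-score is undefined")
    return (native_energy - mean(decoy_energies)) / spread


def perceptron_satisfied(native_energy, decoy_energies):
    """Perceptron criterion: the native must lie below every decoy, outliers included."""
    return native_energy < min(decoy_energies)


decoys = [-2.0, -1.0, 0.0, 1.0, 2.0]
z_deep = z_score(-4.0, decoys)      # native below all decoys
z_shallow = z_score(-1.5, decoys)   # native below the average, above one outlier
```

With these toy numbers the second native still has a negative Z-score even though one extremal decoy lies below it, which is the sense in which the perceptron criterion is driven by outliers while the Z-score is driven by the bulk of the distribution.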

9.
We develop a protocol for estimating the free energy difference between different conformations of the same polypeptide chain. The conformational free energy evaluation combines the CHARMM force field with a continuum treatment of the solvent. In almost all cases studied, experimentally determined structures are predicted to be more stable than misfolded "decoys." This is due in part to the fact that the Coulomb energy of the native protein is consistently lower than that of the decoys. The solvation free energy generally favors the decoys, although the total electrostatic free energy (sum of Coulomb and solvation terms) favors the native structure. The behavior of the solvation free energy is somewhat counterintuitive and, surprisingly, is not correlated with differences in the burial of polar area between native structures and decoys. Rather, the effect is due to a more favorable charge distribution in the native protein, which, as is discussed, will tend to decrease its interaction with the solvent. Our results thus suggest, in keeping with a number of recent studies, that electrostatic interactions may play an important role in determining the native topology of a folded protein. On this basis, a simplified scoring function is derived that combines a Coulomb term with a hydrophobic contact term. This function performs as well as the more complete free energy evaluation in distinguishing the native structure from misfolded decoys. Its computational efficiency suggests that it can be used in protein structure prediction applications, and that it provides a physically well-defined alternative to statistically derived scoring functions.

10.
One of the common methods for assessing energy functions of proteins is selection of native or near-native structures from decoys. This is an efficient but indirect test of the energy functions because decoy structures are typically generated either by sampling procedures or by a separate energy function. As a result, these decoys may not contain the global minimum structure that reflects the true folding accuracy of the energy functions. This paper proposes to assess energy functions by ab initio refolding of fully unfolded terminal segments with secondary structures while keeping the rest of the proteins fixed in their native conformations. Global energy minimization of these short unfolded segments, a challenging yet tractable problem, is a direct test of the energy functions. As an illustrative example, refolding terminal segments is employed to assess two closely related all-atom statistical energy functions, DFIRE (distance-scaled, finite, ideal-gas reference state) and DOPE (discrete optimized protein energy). We found that a simple sequence-position dependence contained in the DOPE energy function leads to an intrinsic bias toward the formation of helical structures. Meanwhile, a finer statistical treatment of short-range interactions yields a significant improvement in the accuracy of segment refolding by DFIRE. The updated DFIRE energy function yields success rates of 100% and 67%, respectively, for its ability to sample and fold fully unfolded terminal segments of 15 proteins to within 3.5 Å global root-mean-squared distance from the corresponding native structures. The updated DFIRE energy function is available as DFIRE 2.0 upon request.

11.
Betancourt MR 《Proteins》2003,53(4):889-907
A protein model that is simple enough to be used in protein-folding simulations but accurate enough to identify a protein native fold is described. Its geometry consists of describing the residues by one, two, or three pseudoatoms, depending on the residue size. Its energy is given by a pairwise, knowledge-based potential obtained for all the pseudoatoms as a function of their relative distance. The pseudoatomic potential is also a function of the primary chain separation and residue order. The model is tested by gapless threading on a large, representative set of known protein and decoy structures obtained from the "Decoys 'R' Us" database. It is also tested by threading on gapped decoys generated for proteins with many homologs. The gapless threading tests show near 98% native-structure recognition as the lowest energy structure and almost 100% as one of the three lowest energy structures for over 2200 test proteins. In decoy threading tests, the model recognized the majority of the native structures. It is also able to recognize native structures among gapped decoys, in spite of close structural similarities. The results indicate that the pseudoatomic model has native recognition ability similar to comparable atomic-based models but much better than equivalent residue-based models.
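
Gapless threading, used as a test here and in several other abstracts on this page, can be sketched roughly as below. Note the substitution: the paper uses a distance-dependent pseudoatomic potential, whereas this sketch uses a much simpler residue-level contact potential with a two-letter (H/P) alphabet. The function names, the toy potential and the toy contact list are all assumptions made for illustration.

```python
def contact_energy(sequence, contacts, potential):
    """Sum pairwise contact energies for a sequence placed on a set of contacts.

    contacts: iterable of (i, j) residue-index pairs that are in contact.
    potential: dict mapping a sorted residue-type pair, e.g. ("H", "P"), to an energy.
    """
    total = 0.0
    for i, j in contacts:
        pair = tuple(sorted((sequence[i], sequence[j])))
        total += potential[pair]
    return total


def gapless_threading(sequence, template_contacts, template_length, potential):
    """Score the sequence at every gapless offset along a longer template.

    Returns a list of (offset, energy). Only contacts lying entirely inside the
    window [offset, offset + len(sequence)) are kept, re-indexed to the sequence.
    """
    n = len(sequence)
    results = []
    for offset in range(template_length - n + 1):
        window = [(i - offset, j - offset)
                  for i, j in template_contacts
                  if offset <= i < offset + n and offset <= j < offset + n]
        results.append((offset, contact_energy(sequence, window, potential)))
    return results


# Toy two-letter potential: only hydrophobic-hydrophobic contacts are favourable.
toy_potential = {("H", "H"): -1.0, ("H", "P"): 0.0, ("P", "P"): 0.0}
template_contacts = [(0, 3), (1, 3), (2, 4), (4, 6)]
scores = gapless_threading("HPPH", template_contacts, 7, toy_potential)
best_offset, best_energy = min(scores, key=lambda item: item[1])
```

Each offset yields one decoy; a recognition test of the kind reported above then asks whether the native placement has the lowest energy among all such decoys.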

12.
Protein structure refinement by optimization
Martin Carlsen  Peter Røgen 《Proteins》2015,83(9):1616-1624
Knowledge‐based protein potentials are simplified potentials designed to improve the quality of protein models, which is important as more accurate models are more useful for biological and pharmaceutical studies. Consequently, knowledge‐based potentials often are designed to be efficient in ordering a given set of deformed structures, denoted decoys, according to how close they are to the relevant native protein structure. This, however, does not necessarily imply that energy minimization of this potential will bring the decoys closer to the native structure. In this study, we introduce an iterative strategy to improve the convergence of decoy structures. It works by adding energy-optimized decoys to the pool of decoys used to construct the next and improved knowledge‐based potential. We demonstrate that this strategy results in significantly improved decoy convergence on Titan high resolution decoys and refinement targets from Critical Assessment of protein Structure Prediction competitions. Our potential is formulated in Cartesian coordinates and has a fixed backbone potential that restricts motions to be close to those of a dihedral model, a fixed hydrogen‐bonding potential, and a variable coarse-grained carbon alpha potential consisting of a pair potential and a novel solvent potential that are B‐spline based, as we use explicit gradient and Hessian for efficient energy optimization. Proteins 2015; 83:1616–1624. © 2015 Wiley Periodicals, Inc.

13.
We propose a novel method of calculation of free energy for coarse-grained models of proteins by combining our newly developed multibody potentials with entropies computed from elastic network models of proteins. Multibody potentials have been of much interest recently because they take into account three-dimensional interactions related to residue packing and capture the cooperativity of these interactions in protein structures. Combining four-body non-sequential, four-body sequential and pairwise short-range potentials with optimized weights for each term, our coarse-grained potential improved recognition of native structure among misfolded decoys, outperforming all other contact potentials for CASP8 decoy sets and achieving performance comparable to the fully atomic empirical DFIRE potentials. By combining statistical contact potentials with entropies from elastic network models of the same structures we can compute free energy changes and improve coarse-grained modeling of protein structure and dynamics. The consideration of protein flexibility and dynamics should improve protein structure prediction and refinement of computational models. This work is the first to combine coarse-grained multibody potentials with an entropic model that takes into account contributions of the entire structure, investigating native-like decoy selection.

14.
One of the approaches to protein structure prediction is to obtain energy functions which can recognize the native conformation of a given sequence among a zoo of conformations. The discrimination can be done by assigning the lowest energy to the native conformation, with the guarantee that the native is in the zoo. Well-adjusted functions, then, can be used in the search for other (near-) natives. Here the aim is discrimination at relatively high resolution (RMSD difference between the native and the closest nonnative is around 1 Å) by pairwise energy potentials. The potential is trained using the experimentally determined native conformation of only one protein, instead of the usual large survey over many proteins. The novel feature is that the native structure is compared to a vastly wider and more challenging array of nonnative structures found not only by the usual threading procedure, but by wide-ranging local minimization of the potential. Because of this extremely demanding search, the native is very close to the apparent global minimum of the potential function. The global minimum property holds up for one other protein having 60% sequence identity, but its performance on completely dissimilar proteins is of course much weaker.

15.
A major challenge of the protein docking problem is to define scoring functions that can distinguish near‐native protein complex geometries from a large number of non‐native geometries (decoys) generated with noncomplexed protein structures (unbound docking). In this study, we have constructed a neural network that employs the information from atom‐pair distance distributions of a large number of decoys to predict protein complex geometries. We found that docking prediction can be significantly improved using two different types of polar hydrogen atoms. To train the neural network, 2000 near‐native decoys of even distance distribution were used for each of the 185 considered protein complexes. The neural network normalizes the information from different protein complexes using an additional protein complex identity input neuron for each complex. The parameters of the neural network were determined such that they mimic a scoring funnel in the neighborhood of the native complex structure. The neural network approach avoids the reference state problem, which occurs in deriving knowledge‐based energy functions for scoring. We show that a distance‐dependent atom pair potential performs much better than a simple atom‐pair contact potential. We have compared the performance of our scoring function with other empirical and knowledge‐based scoring functions such as ZDOCK 3.0, ZRANK, ITScore‐PP, EMPIRE, and RosettaDock. In spite of the simplicity of the method and its functional form, our neural network‐based scoring function achieves a reasonable performance in rigid‐body unbound docking of proteins. Proteins 2010. © 2009 Wiley‐Liss, Inc.

16.
Tobi D  Shafran G  Linial N  Elber R 《Proteins》2000,40(1):71-85
Pairwise interaction models to recognize native folds are designed and analyzed. Different sets of parameters are considered but the focus is on 20 × 20 contact matrices. Simultaneous solution of inequalities and minimization of the variance of the energy find matrices that recognize exactly the native folds of 572 sequences and structures from the Protein Data Bank (PDB). The set includes many homologous pairs, which present a difficult recognition problem. Significant recognition ability is recovered with a small number of parameters (e.g., the H/P model). However, full recognition requires a complete set of amino acids. In addition to structures from the PDB, a folding program (MONSSTER) was used to generate decoy structures for 75 proteins. It is impossible to recognize all the native structures of the extended set by contact potentials. We therefore searched for a new functional form. An energy function U, which is based on a sum of general pairwise interactions limited to a resolution of 1 angstrom, is considered. This set was infeasible too. We therefore conjecture that it is not possible to find a folding potential, resolved to 1 angstrom, which is a sum of pair interactions.

17.
Convergence of the vast sequence space of proteins into a highly restricted fold/conformational space suggests a simple yet unique underlying mechanism of protein folding that has been the subject of much debate in the last several decades. One of the major challenges related to the understanding of protein folding or in silico protein structure prediction is the discrimination of non-native structures/decoys from the native structure. Applications of knowledge-based potentials to attain this goal have been extensively reported in the literature. Also, scoring functions based on accessible surface area and amino acid neighbourhood considerations were used in discriminating the decoys from native structures. In this article, we have explored the potential of protein structure network (PSN) parameters to validate the native proteins against a large number of decoy structures generated by diverse methods. We are guided by two principles: (a) the PSNs capture the local properties from a global perspective and (b) inclusion of non-covalent interactions, at all-atom level, including the side-chain atoms, in the network construction accommodates the sequence-dependent features. Several network parameters such as the size of the largest cluster, community size, and clustering coefficient are evaluated and scored on the basis of the rank of the native structures and the Z-scores. The network analysis of decoy structures highlights the importance of the global properties contributing to the uniqueness of native structures. The analysis also shows that the network parameters can be used as metrics to identify the native structures and filter out non-native structures/decoys in a large number of data-sets; they thus also have the potential to be used in the protein 'structure prediction' problem.

18.
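Yang Lingyun  Lü Qiang 《生物信息学》 (Chinese Journal of Bioinformatics) 2011,9(2):167-170
One of the difficulties in protein-small molecule docking is selecting near-native conformations from the large number of candidate structures that are generated. This paper uses an SVR-based method to select near-native conformations from the GPCR-ligand decoy conformations generated by RosettaLigand. First, an SVR model is trained on existing data to predict the LRMSD of decoy conformations, and near-native conformations are then selected according to the predicted values. Finally, the quality of the near-native decoys selected by this method is compared with that of the decoys selected by RosettaLigand. The results are better than those of RosettaLigand, indicating that the method can effectively select near-native conformations.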

19.
A large set of protein structures resolved by X-ray or NMR techniques has been extracted from the Protein Data Bank and analyzed using statistical methods. In particular, we investigate the interactions between side chains and the interactions between solvent and side chains, pointing out the possibility of including the solvent as part of a knowledge-based potential. The solvent-residue contacts are accounted for on the basis of Voronoi polyhedron analysis. Our investigation confirms the importance of hydrophobic residues in determining protein stability. We observe that in general hydrophobic-hydrophobic interactions and, more specifically, aromatic-aromatic contacts tend to be increasingly distally separated in the primary sequence of proteins, thus connecting distinct secondary structure elements. A simple relation expressing the dependence of the protein free energy on the number of residues is proposed. Such a relation includes both the residue-residue and the solvent-residue contributions. The former is dominant for large proteins, whereas for small ones (fewer than 100 residues) the two terms are comparable. Gapless threading experiments show that the solvent-residue knowledge-based potential makes a significant contribution to discriminating the native structure of proteins. This contribution is especially important for small proteins and is similar to that given by the most favorable residue-residue knowledge-based potential for hydrophobic-hydrophobic interactions, such as isoleucine-leucine. In general, the inclusion of the solvent-residue interaction produces a considerable increase in the free energy gap between the native structures and decoys.

20.
π-π, cation-π, and hydrophobic packing interactions contribute specificity to protein folding and stability to the native state. As a step towards developing improved models of these interactions in proteins, we compare the side-chain packing arrangements in native proteins to those found in compact decoys produced by the Rosetta de novo structure prediction method. We find enrichments in the native distributions for T-shaped and parallel offset arrangements of aromatic residue pairs, in parallel stacked arrangements of cation-aromatic pairs, in parallel stacked pairs involving proline residues, and in parallel offset arrangements for aliphatic residue pairs. We then investigate the extent to which the distinctive features of native packing can be explained using Lennard-Jones and electrostatics models. Finally, we derive orientation-dependent π-π, cation-π and hydrophobic interaction potentials based on the differences between the native and compact decoy distributions and investigate their efficacy for high-resolution protein structure prediction. Surprisingly, the orientation-dependent potential derived from the packing arrangements of aliphatic side-chain pairs distinguishes the native structure from compact decoys better than the orientation-dependent potentials describing π-π and cation-π interactions.
