Similar articles
20 similar articles found
1.
Rigid-body methods, particularly Fourier correlation techniques, are very efficient for docking bound (co-crystallized) protein conformations using measures of surface complementarity as the target function. However, when docking unbound (separately crystallized) conformations, the method generally yields hundreds of false positive structures with good scores but high root mean square deviations (RMSDs). This paper describes a two-step scoring algorithm that can discriminate near-native conformations (with less than 5 Å RMSD) from other structures. The first step includes two rigid-body filters that use the desolvation free energy and the electrostatic energy to select a manageable number of conformations for further processing, but are unable to eliminate all false positives. Complete discrimination is achieved in the second step, which minimizes the molecular mechanics energy of the retained structures and re-ranks them with a combined free-energy function that includes electrostatic, solvation, and van der Waals energy terms. After minimization, the improved fit in near-native complex conformations provides the free-energy gap required for discrimination. The algorithm has been developed and tested using docking decoys, i.e., docked conformations generated by Fourier correlation techniques. The decoy sets are available on the web for testing other discrimination procedures. Proteins 2000;40:525-537.
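The filter-then-re-rank idea in entry 1 can be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' implementation: the `Decoy` fields, the cutoff `keep`, the choice to take the union of the two filters, and the unweighted sum in `combined_score` are all hypothetical, and the energy minimization itself is represented only by a precomputed post-minimization van der Waals term.

```python
from dataclasses import dataclass


@dataclass
class Decoy:
    name: str
    desolvation: float    # rigid-body desolvation free energy (hypothetical units)
    electrostatic: float  # rigid-body electrostatic energy (hypothetical units)
    vdw_minimized: float  # van der Waals energy after minimization (assumed precomputed)


def rigid_body_filter(decoys, keep):
    """Step 1 (assumed form): keep decoys that rank in the best `keep`
    by desolvation or by electrostatics; lower energy is better."""
    by_desolvation = sorted(decoys, key=lambda d: d.desolvation)[:keep]
    by_electrostatic = sorted(decoys, key=lambda d: d.electrostatic)[:keep]
    seen, retained = set(), []
    for d in by_desolvation + by_electrostatic:
        if d.name not in seen:
            seen.add(d.name)
            retained.append(d)
    return retained


def combined_score(d):
    """Step 2 (assumed form): unweighted sum of the three energy terms."""
    return d.electrostatic + d.desolvation + d.vdw_minimized


def rerank(decoys, keep):
    """Filter with the rigid-body terms, then sort survivors by combined score."""
    return sorted(rigid_body_filter(decoys, keep), key=combined_score)


decoys = [
    Decoy("a", -5.0, -1.0, 10.0),
    Decoy("b", -4.0, -6.0, -8.0),
    Decoy("c", 0.0, 0.0, -20.0),
    Decoy("d", -1.0, -5.0, 2.0),
]
ranking = [d.name for d in rerank(decoys, keep=2)]
```

In this toy data, decoy `c` has the lowest combined score but never reaches the second step, because it fails both rigid-body filters. That is the trade-off any such prefilter accepts in exchange for a manageable number of structures to minimize.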

2.
Molecular docking computationally screens thousands to millions of organic molecules against protein structures, looking for those with complementary fits. Many approximations are made, often resulting in low “hit rates.” A strategy to overcome these approximations is to rescore top-ranked docked molecules using a better but slower method. One such method is afforded by molecular mechanics-generalized Born surface area (MM-GBSA) techniques. These more physically realistic methods have improved models for solvation and electrostatic interactions and conformational change compared to most docking programs. To investigate MM-GBSA rescoring, we re-ranked docking hit lists in three small buried sites: a hydrophobic cavity that binds apolar ligands, a slightly polar cavity that binds aryl and hydrogen-bonding ligands, and an anionic cavity that binds cationic ligands. These sites are simple; consequently, incorrect predictions can be attributed to particular errors in the method, and many likely ligands may actually be tested. In retrospective calculations, MM-GBSA techniques with binding-site minimization better distinguished the known ligands for each cavity from the known decoys compared to the docking calculation alone. This encouraged us to test rescoring prospectively on molecules that ranked poorly by docking but that ranked well when rescored by MM-GBSA. A total of 33 molecules highly ranked by MM-GBSA for the three cavities were tested experimentally. Of these, 23 were observed to bind; these are docking false negatives rescued by rescoring. The 10 remaining molecules are true negatives by docking and false positives by MM-GBSA. X-ray crystal structures were determined for 21 of these 23 molecules. In many cases, the geometry prediction by MM-GBSA improved the initial docking pose and more closely resembled the crystallographic result; yet in several cases, the rescored geometry failed to capture large conformational changes in the protein. Intriguingly, rescoring not only rescued docking false negatives, but also introduced several new false positives into the top-ranking molecules. We consider the origins of the successes and failures in MM-GBSA rescoring in these model cavity sites and the prospects for rescoring in biologically relevant targets.

3.
Liang S  Meroueh SO  Wang G  Qiu C  Zhou Y 《Proteins》2009,75(2):397-403
The identification of near-native protein-protein complexes among a set of decoys remains highly challenging. A strategy for improving the success rate of near-native detection is to enrich near-native docking decoys in a small number of top-ranked decoys. Recently, we found that a combination of three scoring functions (energy, conservation, and interface propensity) can predict the location of binding interface regions with reasonable accuracy. Here, these three scoring functions are modified and combined into a consensus scoring function called ENDES for enriching near-native docking decoys. We found that all individual scores result in enrichment for the majority of the 28 targets in the ZDOCK2.3 decoy set and the 22 targets in Benchmark 2.0. Among the three scores, the interface propensity score yields the highest enrichment in both sets of protein complexes. When these scores are combined into the ENDES consensus score, a significant increase in enrichment of near-native structures is found. For example, when 2000 docking decoys are reduced to 200 decoys by ENDES, the fraction of near-native structures in docking decoys increases by a factor of about six on average. ENDES was implemented in a computer program that is available for download at http://sparks.informatics.iupui.edu.
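The enrichment quoted in entry 3 (the near-native fraction among the retained decoys divided by the fraction in the full set) can be computed as in the sketch below. The function name, the convention that a lower score is better, and the toy data are assumptions made for illustration; only the 2000-to-200 reduction comes from the abstract.

```python
def enrichment_factor(scores, is_near_native, top_n):
    """Near-native fraction in the best `top_n` decoys divided by the
    near-native fraction in the whole set. Lower score is treated as better."""
    if len(scores) != len(is_near_native):
        raise ValueError("scores and labels must have the same length")
    total = len(scores)
    total_hits = sum(is_near_native)
    if total_hits == 0 or top_n <= 0:
        raise ValueError("need at least one near-native decoy and top_n > 0")
    top_n = min(top_n, total)
    order = sorted(range(total), key=lambda i: scores[i])
    top_hits = sum(is_near_native[i] for i in order[:top_n])
    # (top_hits / top_n) / (total_hits / total), written to limit rounding error
    return (top_hits * total) / (top_n * total_hits)


# Toy data: 2000 decoys, 20 near-native, 12 of which score within the best 200.
scores = [float(i) for i in range(2000)]
labels = [i < 12 or 1000 <= i < 1008 for i in range(2000)]
factor = enrichment_factor(scores, labels, top_n=200)
```

With these made-up labels the near-native fraction rises from 20/2000 to 12/200, a factor of six, which matches the order of magnitude the abstract reports on average.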

4.
A novel method of parameter optimization is proposed. It makes use of large sets of decoys generated for six nonhomologous proteins with different architecture. Parameter optimization is achieved by creating a free energy gap between sets of nativelike and nonnative conformations. The method is applied to optimize the parameters of a physics-based scoring function consisting of the all-atom ECEPP05 force field coupled with an implicit solvent model (a solvent-accessible surface area model). The optimized force field is able to discriminate near-native from nonnative conformations of the six training proteins when used either for local energy minimization or for short Monte Carlo simulated annealing runs after local energy minimization. The resulting force field is validated with an independent set of six nonhomologous proteins, and appears to be transferable to proteins not included in the optimization; i.e., for five out of the six test proteins, decoys with 1.7- to 4.0-Å all-heavy-atom root mean-square deviations emerge as those with the lowest energy. In addition, we examined the set of misfolded structures created by Park and Levitt using a four-state reduced model. The results from these additional calculations confirm the good discriminative ability of the optimized force field obtained with our decoy sets.

5.
We suggest a new approach to the generation of candidate structures (decoys) for ab initio prediction of protein structures. Our method is based on random sampling of conformation space and subsequent local energy minimization. At the core of this approach lies the design of a novel type of energy function. This energy function has local minima with native structure characteristics and wide basins of attraction. The current work presents our motivation for deriving such an energy function and also tests the derived energy function. Our approach is novel in that it takes advantage of the inherently rough energy landscape of proteins, which is generally considered a major obstacle for protein structure prediction. When local minima have wide basins of attraction, the protein's conformation space can be greatly reduced by the convergence of large regions of the space into single points, namely the local minima corresponding to these funnels. We have implemented this concept by an iterative process. The potential is first used to generate decoy sets and then we study these sets of decoys to guide further development of the potential. A key feature of our potential is the use of cooperative multi-body interactions that mimic the role of the entropic and solvent contributions to the free energy. The validity and value of our approach are demonstrated by applying it to 14 diverse, small proteins. We show that, for these proteins, the size of conformation space is considerably reduced by the new energy function. In fact, the reduction is so substantial as to allow efficient conformational sampling. As a result we are able to find a significant number of near-native conformations in random searches performed with limited computational resources.

6.
One of the common methods for assessing energy functions of proteins is selection of native or near-native structures from decoys. This is an efficient but indirect test of the energy functions because decoy structures are typically generated either by sampling procedures or by a separate energy function. As a result, these decoys may not contain the global minimum structure that reflects the true folding accuracy of the energy functions. This paper proposes to assess energy functions by ab initio refolding of fully unfolded terminal segments with secondary structures while keeping the rest of the proteins fixed in their native conformations. Global energy minimization of these short unfolded segments, a challenging yet tractable problem, is a direct test of the energy functions. As an illustrative example, refolding terminal segments is employed to assess two closely related all-atom statistical energy functions, DFIRE (distance-scaled, finite, ideal-gas reference state) and DOPE (discrete optimized protein energy). We found that a simple sequence-position dependence contained in the DOPE energy function leads to an intrinsic bias toward the formation of helical structures. Meanwhile, a finer statistical treatment of short-range interactions yields a significant improvement in the accuracy of segment refolding by DFIRE. The updated DFIRE energy function yields success rates of 100% and 67%, respectively, for its ability to sample and fold fully unfolded terminal segments of 15 proteins to within 3.5 Å global root-mean-squared distance from the corresponding native structures. The updated DFIRE energy function is available as DFIRE 2.0 upon request.

7.
Statistical energy functions are discrete (or stepwise) energy functions that lack van der Waals repulsion. As a result, they are often applied directly to a given structure (native or decoy) without further energy minimization being performed on the structure. However, the full benefit (or hidden defect) of an energy function cannot be revealed without energy minimization. This paper tests a recently developed, all-atom statistical energy function by energy minimization with a fixed secondary helical structure in dihedral space. This is accomplished by combining the statistical energy function based on a distance-scaled finite ideal-gas reference (DFIRE) state with a simple repulsive interaction and an improper torsion energy function. The energy function was used to minimize 2000 random initial structures of 41 small and medium-sized helical proteins in a dihedral space with a fixed helical region. Results indicate that near-native structures for most studied proteins can be obtained by minimization alone. The average minimum root-mean-squared distance (rmsd) from the native structure for all 41 proteins is 4.1 Å. The energy function (together with a simple clustering of similar structures) also makes a reasonable selection of near-native structures from minimized structures. The average rmsd value and the average rank for the best structure in the top five are 6.8 Å and 2.4, respectively. The accuracy of the structures sampled and the structure selections can be improved significantly with the removal of flexible terminal regions in rmsd calculations and in minimization and with the increase in the number of minimizations. The minimized structures form an excellent decoy set for testing other energy functions because most structures are well-packed with minimum hard-core overlaps with correct hydrophobic/hydrophilic partitioning. They are available online at http://theory.med.buffalo.edu.

8.
A low-resolution scoring function for the selection of native and near-native structures from a set of predicted structures for a given protein sequence has been developed. The scoring function, ProVal (Protein Validate), used several variables that describe an aspect of protein structure for which the proximity to the native structure can be assessed quantitatively. Among the parameters included are a packing estimate, surface areas, and the contact order. A partial least squares for latent variables (PLS) model was built for each candidate set of the 28 decoy sets of structures generated for 22 different proteins using the described parameters as independent variables. The Cα RMS of the candidate structures versus the experimental structure was used as the dependent variable. The final generalized scoring function was an average of all models derived, ensuring that the function was not optimized for specific fold classes or method of structure generation of the candidate folds. The results show that the crystal structure was scored best in 64% of the 28 test sets and was clearly separated from the decoys in many examples. In all the other cases in which the crystal structure did not rank first, it ranked within the top 10%. Thus, although ProVal could not distinguish between predicted structures that were similar overall in fold quality due to its inherently low resolution, it can clearly be used as a primary filter to eliminate approximately 90% of fold candidates generated by current prediction methods from all-atom modeling and further evaluation. The correlation between the predicted and actual Cα RMS values varies considerably between the candidate fold sets.

9.
MOTIVATION: Predicting protein interactions is one of the most challenging problems in functional genomics. Given two proteins known to interact, current docking methods evaluate billions of docked conformations by simple scoring functions, and in addition to near-native structures yield many false positives, i.e. structures with good surface complementarity but far from the native. RESULTS: We have developed a fast algorithm for filtering docked conformations with good surface complementarity, and ranking them based on their clustering properties. The free energy filters select complexes with lowest desolvation and electrostatic energies. Clustering is then used to smooth the local minima and to select the ones with the broadest energy wells, a property associated with the free energy at the binding site. The robustness of the method was tested on sets of 2000 docked conformations generated for 48 pairs of interacting proteins. In 31 of these cases, the top 10 predictions include at least one near-native complex, with an average RMSD of 5 Å from the native structure. The docking and discrimination method also provides good results for a number of complexes that were used as targets in the Critical Assessment of PRedictions of Interactions (CAPRI) experiment. AVAILABILITY: The fully automated docking and discrimination server ClusPro can be found at http://structure.bu.edu
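Ranking by clustering properties, as entry 9 describes, can be illustrated with a simple greedy scheme: repeatedly pick the conformation with the most neighbours within a fixed RMSD radius, report it with its neighbours as one cluster, and remove them. The abstract does not spell out the procedure, so the greedy rule, the `radius` parameter, and the precomputed pairwise `rmsd` matrix are assumptions of this sketch.

```python
def rank_by_cluster_size(rmsd, radius):
    """Greedy clustering on a symmetric pairwise RMSD matrix.

    Returns a list of (center_index, member_indices) tuples, ordered from the
    first cluster found (most neighbours) onward. Ties go to the lowest index.
    """
    remaining = set(range(len(rmsd)))
    clusters = []
    while remaining:
        def neighbour_count(i):
            return sum(1 for j in remaining if rmsd[i][j] <= radius)

        # max() returns the first maximal element, so sorting breaks ties by index
        center = max(sorted(remaining), key=neighbour_count)
        members = sorted(j for j in remaining if rmsd[center][j] <= radius)
        clusters.append((center, members))
        remaining -= set(members)
    return clusters


# Toy example: five conformations placed on a line, "RMSD" = distance between them.
positions = [0.0, 1.0, 2.0, 10.0, 11.0]
rmsd = [[abs(a - b) for b in positions] for a in positions]
clusters = rank_by_cluster_size(rmsd, radius=1.5)
```

Here the first cluster is centred on conformation 1 with three members, which stands in for the broadest energy well. A real application would compute interface RMSDs between docked structures rather than distances on a line.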

10.
A protein-protein docking procedure traditionally consists of two successive tasks: a search algorithm generates a large number of candidate conformations mimicking the complex existing in vivo between two proteins, and a scoring function is used to rank them in order to extract a native-like one. We have already shown that using Voronoi constructions and a well-chosen set of parameters, an accurate scoring function could be designed and optimized. However, to be able to perform large-scale in silico exploration of the interactome, a near-native solution has to be found in the ten best-ranked solutions. This cannot yet be guaranteed by any of the existing scoring functions. In this work, we introduce a new procedure for conformation ranking. We previously developed a set of scoring functions where learning was performed using a genetic algorithm. These functions were used to assign a rank to each possible conformation. We now have a refined rank using different classifiers (decision trees, rules and support vector machines) in a collaborative filtering scheme. The scoring function newly obtained is evaluated using 10-fold cross-validation, and compared to the functions obtained using either genetic algorithms or collaborative filtering taken separately. This new approach was successfully applied to the CAPRI scoring ensembles. We show that for 10 targets out of 12, we are able to find a near-native conformation in the 10 best-ranked solutions. Moreover, for 6 of them, the near-native conformation selected is of high accuracy. Finally, we show that this function dramatically enriches the 100 best-ranking conformations in near-native structures.

11.
Liang S  Liu S  Zhang C  Zhou Y 《Proteins》2007,69(2):244-253
Near-native selections from docking decoys have proved challenging, especially when unbound proteins are used in the molecular docking. One reason is that significant atomic clashes in docking decoys lead to poor predictions of binding affinities of near-native decoys. Atomic clashes can be removed by structural refinement through energy minimization. Such an energy minimization, however, will lead to an unrealistic bias toward docked structures with large interfaces. Here, we extend an empirical energy function developed for protein design to protein-protein docking selection by introducing a simple reference state that removes the unrealistic dependence of binding affinity of docking decoys on the buried solvent accessible surface area of the interface. The energy function called EMPIRE (EMpirical Protein-InteRaction Energy), when coupled with a refinement strategy, is found to provide a significantly improved success rate in near-native selections when applied to RosettaDock and refined ZDOCK docking decoys. Our work underlines the importance of removing nonspecific interactions from specific ones in near-native selections from docking decoys.

12.
Li L  Chen R  Weng Z 《Proteins》2003,53(3):693-707
We present a simple and effective algorithm RDOCK for refining unbound predictions generated by a rigid-body docking algorithm ZDOCK, which has been developed earlier by our group. The main component of RDOCK is a three-stage energy minimization scheme, followed by the evaluation of electrostatic and desolvation energies. Ionic side chains are kept neutral in the first two stages of minimization, and reverted to their full charge states in the last stage of brief minimization. Without side chain conformational search or filtering/clustering of resulting structures, RDOCK represents the simplest approach toward refining unbound docking predictions. Despite its simplicity, RDOCK makes a substantial improvement upon the top predictions by ZDOCK with all three scoring functions and the improvement is observed across all three categories of test cases in a large benchmark of 49 non-redundant unbound test cases. RDOCK makes the most powerful combination with ZDOCK2.1, which uses pairwise shape complementarity as the scoring function. Collectively, they rank a near-native structure as the number-one prediction for 18 test cases (37% of the benchmark), and within the top 4 predictions for 24 test cases (49% of the benchmark). To various degrees, funnel-like energy landscapes are observed for these 24 test cases. To the best of our knowledge, this is the first report of binding funnels starting from global searches for a broad range of test cases. These results are particularly exciting, given that we have not used any biological information that is specific to individual test cases and the whole process is entirely automated. Among three categories of test cases, the best results are seen for enzyme/inhibitor, with a near-native structure ranked as the number-one prediction for 48% of test cases, and within the top 10 predictions for 78% of test cases. RDOCK is freely available to academic users at http://zlab.bu.edu/~rong/dock.
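The benchmark percentages in entry 12 are top-N success rates: the share of test cases whose best-ranked near-native prediction falls within the first N positions. A minimal sketch of that bookkeeping follows. The function name and the individual ranks in the example are invented for illustration; only the totals (18 and 24 successes out of 49 cases) are taken from the abstract.

```python
def success_rate(first_hit_ranks, top_n):
    """Fraction of test cases with a near-native prediction ranked within top_n.

    `first_hit_ranks` holds, per test case, the 1-based rank of the best-ranked
    near-native prediction, or None when no near-native prediction was produced.
    """
    if not first_hit_ranks:
        raise ValueError("need at least one test case")
    hits = sum(1 for rank in first_hit_ranks if rank is not None and rank <= top_n)
    return hits / len(first_hit_ranks)


# Hypothetical ranks chosen only to reproduce the reported totals:
# 18 cases at rank 1, 6 more within the top 4, 25 cases with no hit in the top 4.
ranks = [1] * 18 + [2, 2, 3, 3, 4, 4] + [None] * 25
top1_percent = round(100 * success_rate(ranks, top_n=1))
top4_percent = round(100 * success_rate(ranks, top_n=4))
```

The rounded values reproduce the 37% (18/49) and 49% (24/49) figures quoted above.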

13.
Arriving at the native conformation of a polypeptide chain characterized by the lowest free energy is a problem of long-standing interest in protein structure prediction endeavors. Owing to the computational requirements in developing free energy estimates, scoring functions (energy based or statistical) have received considerable renewed attention in recent years for distinguishing native structures of proteins from non-native-like structures. Several cleverly designed decoy sets, CASP (Critical Assessment of Techniques for Protein Structure Prediction) structures and homology based internet accessible three dimensional model builders are now available for validating the scoring functions. We describe here an all-atom energy based empirical scoring function and examine its performance on a wide series of publicly available decoys. Barring two protein sequences where the native structure is ranked second and seventh, the native is identified as the lowest energy structure in 67 protein sequences from among 61,659 decoys belonging to 12 different decoy sets. We further illustrate a potential application of the scoring function in bracketing native-like structures of two small mixed alpha/beta globular proteins starting from sequence and secondary structural information. The scoring function has been web enabled at www.scfbio-iitd.res.in/utility/proteomics/energy.jsp.

14.
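Yang Lingyun  Lü Qiang 《生物信息学》2011,9(2):167-170
One of the difficulties in protein-small molecule docking is selecting near-native conformations from the large number of candidate structures generated. This paper uses an SVR-based method to select near-native conformations from the GPCR-ligand decoy conformations generated by RosettaLigand. First, an SVR model is trained on existing data to predict the LRMSD of decoy conformations, and near-native conformations are then selected on that basis. Finally, the quality of the near-native decoys selected by this method is compared with that of decoys selected by the RosettaLigand method. The results are better than those of RosettaLigand, indicating that the method can effectively select near-native conformations.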

15.
The methods of continuum electrostatics are used to calculate the binding free energies of a set of protein-protein complexes including experimentally determined structures as well as other orientations generated by a fast docking algorithm. In the native structures, charged groups that are deeply buried were often found to favor complex formation (relative to isosteric nonpolar groups), whereas in nonnative complexes generated by a geometric docking algorithm, they were equally likely to be stabilizing as destabilizing. These observations were used to design a new filter for screening docked conformations that was applied, in conjunction with a number of geometric filters that assess shape complementarity, to 15 antibody-antigen complexes and 14 enzyme-inhibitor complexes. For the bound docking problem, which is the major focus of this paper, native and near-native solutions were ranked first or second in all but two enzyme-inhibitor complexes. Less success was encountered for antibody-antigen complexes, but in all cases studied, the more complete free energy evaluation was able to identify native and near-native structures. A filter based on the enrichment of tyrosines and tryptophans in antibody binding sites was applied to the antibody-antigen complexes and resulted in a native and near-native solution being ranked first and second in all cases. A clear improvement over previously reported results was obtained for the unbound antibody-antigen examples as well. The algorithm and various filters used in this work are quite efficient and are able to reduce the number of plausible docking orientations to a size small enough so that a final more complete free energy evaluation on the reduced set becomes computationally feasible.

16.
We develop a protocol for estimating the free energy difference between different conformations of the same polypeptide chain. The conformational free energy evaluation combines the CHARMM force field with a continuum treatment of the solvent. In almost all cases studied, experimentally determined structures are predicted to be more stable than misfolded "decoys." This is due in part to the fact that the Coulomb energy of the native protein is consistently lower than that of the decoys. The solvation free energy generally favors the decoys, although the total electrostatic free energy (sum of Coulomb and solvation terms) favors the native structure. The behavior of the solvation free energy is somewhat counterintuitive and, surprisingly, is not correlated with differences in the burial of polar area between native structures and decoys. Rather, the effect is due to a more favorable charge distribution in the native protein, which, as is discussed, will tend to decrease its interaction with the solvent. Our results thus suggest, in keeping with a number of recent studies, that electrostatic interactions may play an important role in determining the native topology of a folded protein. On this basis, a simplified scoring function is derived that combines a Coulomb term with a hydrophobic contact term. This function performs as well as the more complete free energy evaluation in distinguishing the native structure from misfolded decoys. Its computational efficiency suggests that it can be used in protein structure prediction applications, and that it provides a physically well-defined alternative to statistically derived scoring functions.

17.
There are several knowledge-based energy functions that can distinguish the native fold from a pool of grossly misfolded decoys for a given sequence of amino acids. These decoys, which are typically generated by mounting, or “threading”, the sequence onto the backbones of unrelated protein structures, tend to be non-compact and quite different from the native structure: the root-mean-squared (RMS) deviations from the native are commonly in the range of 15 to 20 Å. Effective energy functions should also demonstrate a similar recognition capability when presented with compact decoys that depart only slightly in conformation from the correct structure (i.e. those with RMS deviations of ∼5 Å or less). Recently, we developed a simple yet powerful method for native fold recognition based on the tendency for native folds to form hydrophobic cores. Our energy measure, which we call the hydrophobic fitness score, is challenged to recognize the native fold from 2000 near-native structures generated for each of five small monomeric proteins. First, 1000 conformations for each protein were generated by molecular dynamics simulation at room temperature. The average RMS deviation of this set of 5000 was 1.5 Å. A total of 323 decoys had energies lower than native; however, none of these had RMS deviations greater than 2 Å. Another 1000 structures were generated for each at high temperature, in which a greater range of conformational space was explored (4.3 Å average RMS deviation). Out of this set, only seven decoys were misrecognized. The hydrophobic fitness energy of a conformation is strongly dependent upon the RMS deviation. On average our potential yields energy values which are lowest for the population of structures generated at room temperature, intermediate for those produced at high temperature and highest for those constructed by threading methods. In general, the lowest energy decoy conformations have backbones very close to native structure. The possible utility of our method for screening backbone candidates for the purpose of modelling by side-chain packing optimization is discussed.

18.
Murphy J  Gatchell DW  Prasad JC  Vajda S 《Proteins》2003,53(4):840-854
Two structure-based potentials are used for both filtering (i.e., selecting a subset of conformations generated by rigid-body docking), and rescoring and ranking the selected conformations. ACP (atomic contact potential) is an atom-level extension of the Miyazawa-Jernigan potential parameterized on protein structures, whereas RPScore (residue pair potential score) is a residue-level potential, based on interactions in protein-protein complexes. These potentials are combined with other energy terms and applied to 13 sets of protein decoys, as well as to the results of docking 10 pairs of unbound proteins. For both potentials, the ability to discriminate between near-native and non-native docked structures is substantially improved by refining the structures and by adding a van der Waals energy term. It is observed that ACP and RPScore complement each other in a number of ways (e.g., although RPScore yields more hits than ACP, mainly as a result of its better performance for charged complexes, ACP usually ranks the near-native complexes better). As a general solution to the protein-docking problem, we have found that the best discrimination strategies combine either an RPScore filter with an ACP-based scoring function, or an ACP-based filter with an RPScore-based scoring function. Thus, ACP and RPScore capture complementary structural information, and combining them in a multistage postprocessing protocol provides substantially better discrimination than the use of the same potential for both filtering and ranking the docked conformations.

19.
This work presents a novel Cα-Cα distance dependent force field which is successful in selecting native structures from an ensemble of high resolution near-native conformers. An enhanced and diverse protein set, along with an improved decoy generation technique, contributes to the effectiveness of this potential. High quality decoys were generated for 1489 nonhomologous proteins and used to train an optimization based linear programming formulation. The goal in developing a set of high resolution decoys was to develop a simple, distance-dependent force field that yields the native structure as the lowest energy structure and assigns higher energies to decoy structures that are quite similar as well as those that are less similar. The model also includes a set of physical constraints that were based on experimentally observed physical behavior of the amino acids. The force field was tested on two sets of test decoys not in the training set and was found to excel on all the metrics that are widely used to measure the effectiveness of a force field. The high resolution force field was successful in correctly identifying 113 native structures out of 150 test cases and the average rank obtained for this test was 1.87. All the high resolution structures (training and testing) used for this work are available online and can be downloaded from http://titan.princeton.edu/HRDecoys.

20.
Tobi D  Bahar I 《Proteins》2006,62(4):970-981
Protein-protein docking is a challenging computational problem in functional genomics, particularly when one or both proteins undergo conformational change(s) upon binding. The major challenge is to define a scoring function soft enough to tolerate these changes and specific enough to distinguish between near-native and "misdocked" conformations. Using a linear programming technique, we derived protein docking potentials (PDPs) that comply with this requirement. We considered a set of 63 nonredundant complexes to this aim, and generated 400,000 putative docked complexes (decoys) based on a shape complementarity criterion for each complex. The PDPs were required to yield for the native (correctly docked) structure a potential energy lower than those of all the nonnative (misdocked) structures. The energy constraints applied to all complexes led to ca. 25 million inequalities, the simultaneous solution of which yielded an optimal set of PDPs that discriminated the correctly docked (up to 4.0 Å root-mean-square deviation from known complex structure) structure among the 85 top-ranking (0.02%) decoys in 59/63 examined bound-bound cases. The high performance of the potentials was further verified in jackknife tests and by ranking putative docked conformations submitted to CAPRI. In addition to their utility in identifying correctly folded complexes, the PDPs reveal biologically meaningful features that distinguish docking potentials from folding potentials.
