Similar literature
 Found 20 similar articles (search time: 46 ms)
1.
Martin O  Schomburg D 《Proteins》2008,70(4):1367-1378
Biological systems and processes rely on a complex network of molecular interactions. While the association of biological macromolecules is a fundamental biochemical phenomenon crucial for the understanding of complex living systems, protein-protein docking methods aim for the computational prediction of protein complexes from individual subunits. Docking algorithms generally produce large numbers of putative protein complexes with only a few of these conformations resembling the native complex structure within an acceptable degree of structural similarity. A major challenge in the field of docking is to extract near-native structure(s) out of the large pool of solutions, the so-called scoring or ranking problem. A series of structural, chemical, biological and physical properties are used in this work to classify docked protein-protein complexes. These properties include specialized energy functions, evolutionary relationship, class-specific residue interface propensities, gap volume, buried surface area, empirical pair potentials on residue and atom level as well as measures for the tightness of fit. Efficient comprehensive scoring functions have been developed using probabilistic Support Vector Machines in combination with this array of properties on the largest currently available protein-protein docking benchmark. The established classifiers are shown to be specific for certain types of protein-protein complexes and are able to detect near-native complex conformations from large sets of decoys with high sensitivity. Using classification probabilities the ranking of near-native structures was drastically improved, leading to a significant enrichment of near-native complex conformations within the top ranks. It could be shown that the developed schemes outperform five other previously published scoring functions.

2.
The accurate scoring of rigid-body docking orientations represents one of the major difficulties in protein-protein docking prediction. Other challenges are the development of faster and more efficient sampling methods and the introduction of receptor and ligand flexibility during simulations. Overall, good discrimination of near-native docking poses from the very early stages of rigid-body protein docking is an essential step before applying more costly interface refinement to the correct docking solutions. Here we explore a simple approach to scoring of rigid-body docking poses, which has been implemented in a program called pyDock. The scheme is based on Coulombic electrostatics with a distance-dependent dielectric constant, and implicit desolvation energy with atomic solvation parameters previously adjusted for rigid-body protein-protein docking. This scoring function is not highly dependent on the specific geometry of the docking poses and therefore can be used in rigid-body docking sets generated by a variety of methods. We have tested the procedure in a large benchmark set of 80 unbound docking cases. The method is able to detect a near-native solution from 12,000 docking poses and place it within the 100 lowest-energy docking solutions in 56% of the cases, in a completely unrestricted manner and without any other additional information. More specifically, a near-native solution will lie within the top 20 solutions in 37% of the cases. The simplicity of the approach allows for a better understanding of the physical principles behind protein-protein association, and provides a fast tool for the evaluation of large sets of rigid-body docking poses in search of the near-native orientation.
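As a rough illustration of the electrostatic term mentioned in this abstract, the sketch below sums Coulomb interactions between receptor and ligand atoms with a distance-dependent dielectric. It is not pyDock's code: the function name, the dielectric form eps(r) = 4r, and the minimum-distance clamp are assumptions made for the example, and the desolvation term is left out entirely.

```python
import math

# Conversion factor giving kcal/mol for charges in units of e and distances in Å.
COULOMB_CONSTANT = 332.0636


def coulomb_energy(receptor, ligand, dielectric_slope=4.0, min_distance=1.0):
    """Sum intermolecular Coulomb terms with a distance-dependent dielectric.

    Each atom is a tuple (x, y, z, charge). The dielectric is assumed to be
    eps(r) = dielectric_slope * r, so every pair contributes
    COULOMB_CONSTANT * q1 * q2 / (dielectric_slope * r**2).
    Distances are clamped at min_distance (an assumed safeguard) so that
    clashing atoms in a rigid-body pose do not dominate the sum.
    """
    total = 0.0
    for x1, y1, z1, q1 in receptor:
        for x2, y2, z2, q2 in ligand:
            r = math.dist((x1, y1, z1), (x2, y2, z2))
            r = max(r, min_distance)
            total += COULOMB_CONSTANT * q1 * q2 / (dielectric_slope * r * r)
    return total
```

With a dielectric proportional to r, the pair energy falls off as 1/r² rather than 1/r, which damps long-range contributions in a crude imitation of solvent screening.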

3.
4.
Masone D  Vaca IC  Pons C  Recio JF  Guallar V 《Proteins》2012,80(3):818-824
Structural prediction of protein-protein complexes given the structures of the two interacting compounds in their unbound state is a key problem in biophysics. In addition to the problem of sampling near-native orientations, one of the main modeling difficulties is to discriminate true from false positives. Here, we present a hierarchical protocol for docking refinement able to discriminate near-native poses from a group of docking candidates. The main idea is to combine an efficient sampling of the full system hydrogen bond network and side chains, together with an all-atom force field and a surface generalized Born implicit solvent. We tested our method on a set of twenty-two complexes containing a near-native solution within the top 100 docking poses, obtaining a near-native solution as the top pose in 70% of the cases. We show that all-atom force fields with optimized H-bond networks do significantly improve on state-of-the-art scoring functions.

5.
Huang SY  Zou X 《Proteins》2008,72(2):557-579
Using an efficient iterative method, we have developed a distance-dependent knowledge-based scoring function to predict protein-protein interactions. The function, referred to as ITScore-PP, was derived using the crystal structures of a training set of 851 protein-protein dimeric complexes containing true biological interfaces. The key idea of the iterative method for deriving ITScore-PP is to improve the interatomic pair potentials by iteration, until the pair potentials can distinguish true binding modes from decoy modes for the protein-protein complexes in the training set. The iterative method circumvents the challenging reference state problem in deriving knowledge-based potentials. The derived scoring function was used to evaluate the ligand orientations generated by ZDOCK 2.1 and the native ligand structures on a diverse set of 91 protein-protein complexes. For the bound test cases, ITScore-PP yielded a success rate of 98.9% if the top 10 ranked orientations were considered. For the more realistic unbound test cases, the corresponding success rate was 40.7%. Furthermore, for faster orientational sampling purpose, several residue-level knowledge-based scoring functions were also derived following the similar iterative procedure. Among them, the scoring function that uses the side-chain center of mass (SCM) to represent a residue, referred to as ITScore-PP(SCM), showed the best performance and yielded success rates of 71.4% and 30.8% for the bound and unbound cases, respectively, when the top 10 orientations were considered. ITScore-PP was further tested using two other published protein-protein docking decoy sets, the ZDOCK decoy set and the RosettaDock decoy set. In addition to binding mode prediction, the binding scores predicted by ITScore-PP also correlated well with the experimentally determined binding affinities, yielding a correlation coefficient of R = 0.71 on a test set of 74 protein-protein complexes with known affinities. 
ITScore-PP is computationally efficient. The average run time for ITScore-PP was about 0.03 second per orientation (including optimization) on a personal computer with 3.2 GHz Pentium IV CPU and 3.0 GB RAM. The computational speed of ITScore-PP(SCM) is about an order of magnitude faster than that of ITScore-PP. ITScore-PP and/or ITScore-PP(SCM) can be combined with efficient protein docking software to study protein-protein recognition.
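The iterative idea in this abstract can be pictured with a schematic update step: wherever the current potential makes a distance bin more populated in the predicted (decoy-inclusive) ensemble than in the native structures, the potential in that bin is raised, and vice versa. The function below is only a sketch of that general scheme; the name `update_pair_potential`, the linear update form, and the `step` value are assumptions and are not taken from the abstract.

```python
def update_pair_potential(potential, predicted, observed, step=0.5):
    """Return one corrected copy of a binned pair potential.

    potential: current energies per distance bin for one atom-pair type.
    predicted: pair distribution per bin produced with the current potential.
    observed:  pair distribution per bin seen in native training complexes.
    step:      assumed scaling factor (hypothetical, in energy units).

    Bins that are over-populated in the prediction become less favorable;
    under-populated bins become more favorable. When predicted equals
    observed the potential is unchanged, which is the stopping condition.
    """
    if not (len(potential) == len(predicted) == len(observed)):
        raise ValueError("all inputs must have the same number of bins")
    return [
        u + step * (g_pred - g_obs)
        for u, g_pred, g_obs in zip(potential, predicted, observed)
    ]
```

Because the correction is driven only by the mismatch between predicted and observed distributions, no explicit reference state has to be chosen, which is the point the abstract makes about this family of methods.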

6.
7.
Khashan R  Zheng W  Tropsha A 《Proteins》2012,80(9):2207-2217
Accurate prediction of the structure of protein-protein complexes in computational docking experiments remains a formidable challenge. It has been recognized that identifying native or native-like poses among multiple decoys is the major bottleneck of the current scoring functions used in docking. We have developed a novel multibody pose-scoring function that has no theoretical limit on the number of residues contributing to the individual interaction terms. We use a coarse-grain representation of a protein-protein complex where each residue is represented by its side chain centroid. We apply a computational geometry approach called Almost-Delaunay tessellation that transforms protein-protein complexes into a residue contact network, or an undirected graph where vertex-residues are nodes connected by edges. This treatment forms a family of interfacial graphs representing a dataset of protein-protein complexes. We then employ a frequent subgraph mining approach to identify common interfacial residue patterns that appear in at least a subset of native protein-protein interfaces. The geometrical parameters and frequency of occurrence of each "native" pattern in the training set are used to develop the new SPIDER scoring function. SPIDER was validated using the standard "ZDOCK" benchmark dataset that was not used in the development of SPIDER. We demonstrate that the SPIDER scoring function ranks native and native-like poses above geometrical decoys and that it exceeds in performance the popular ZRANK scoring function. SPIDER was ranked among the top scoring functions in a recent round of the CAPRI (Critical Assessment of PRedicted Interactions) blind test of protein-protein docking methods.

8.
Protein-protein docking plays an important role in the computational prediction of the complex structure between two proteins. For years, a variety of docking algorithms have been developed, as witnessed by the Critical Assessment of PRedicted Interactions (CAPRI) experiments. However, despite their successes, many docking algorithms often require a series of manual operations like modeling structures from sequences, incorporating biological information, and selecting final models. The difficulties in these manual steps have significantly limited the applications of protein-protein docking, as most of the users in the community are nonexperts in docking. Therefore, automated docking like a web server, which can give a comparable performance to human docking protocol, is pressingly needed. As such, we have participated in the blind CAPRI experiments for Rounds 38-45 and CASP13-CAPRI challenge for Round 46 with both our HDOCK automated docking web server and human docking protocol. It was shown that our HDOCK server achieved an “acceptable” or higher CAPRI-rated model in the top 10 submitted predictions for 65.5% and 59.1% of the targets in the docking experiments of CAPRI and CASP13-CAPRI, respectively, which are comparable to 66.7% and 54.5% for human docking protocol. Similar trends can also be observed in the scoring experiments. These results validated our HDOCK server as an efficient automated docking protocol for nonexpert users. Challenges and opportunities of automated docking are also discussed.

9.
MOTIVATION: Protein-protein complexes are known to play key roles in many cellular processes. However, they are often not accessible to experimental study because of their low stability and the difficulty of producing the proteins and assembling them in native conformation. Thus, docking algorithms have been developed to provide an in silico approach to the problem. A protein-protein docking procedure traditionally consists of two successive tasks: a search algorithm generates a large number of candidate solutions, and then a scoring function is used to rank them. RESULTS: To address the second step, we developed a scoring function based on a Voronoi tessellation of the protein three-dimensional structure. We showed that the Voronoi representation may be used to describe, in a simplified but useful manner, the geometric and physico-chemical complementarities of two molecular surfaces. We measured a set of parameters on native protein-protein complexes and on decoys, and used them as attributes in several statistical learning procedures: a logistic function, Support Vector Machines (SVM), and a genetic algorithm. For the latter, we used ROGER, a genetic algorithm designed to optimize the area under the receiver operating characteristics curve. To further test the scores derived with ROGER, we ranked models generated by two different docking algorithms on targets of a blind prediction experiment, improving in almost all cases the rank of native-like solutions. AVAILABILITY: http://genomics.eu.org/spip/-Bioinformatics-tools-

10.
Li Y  Cortés J  Siméon T 《Proteins》2011,79(11):3037-3049
Systematic protein-protein docking methods need to evaluate a huge number of different probe configurations, thus leading to high computational cost. We present an efficient filter, the ray casting filter (RCF), that enables a notable speed-up of systematic protein-protein docking. The high efficiency of RCF is the outcome of the following factors: (i) extraction of pockets and protrusions on the surfaces of the proteins using visibilities; (ii) a ray casting method that finds aligned receptor pocket/probe protrusion pairs without explicit similarity computations. The RCF method enables the integration of systematic methods and local shape feature matching methods. To verify the efficiency and the accuracy of RCF, we integrated it with a systematic protein-protein docking approach (ATTRACT) based on a reduced protein representation. The test results show that the integrated docking approach is much faster. At the same time, it ranks the solutions with the lowest ligand root-mean-square deviation (L_rms) higher when docking enzyme-enzyme inhibitor complexes. Consequently, RCF not only enables much faster execution of systematic docking runs but also improves the quality of docking predictions.

11.
Yue Cao  Yang Shen 《Proteins》2020,88(8):1091-1099
Structural information about protein-protein interactions, often missing at the interactome scale, is important for mechanistic understanding of cells and rational discovery of therapeutics. Protein docking provides a computational alternative for such information. However, ranking near-native docked models high among a large number of candidates, often known as the scoring problem, remains a critical challenge. Moreover, estimating model quality, also known as the quality assessment problem, is rarely addressed in protein docking. In this study, the two challenging problems in protein docking are regarded as relative and absolute scoring, respectively, and addressed in one physics-inspired deep learning framework. We represent protein and complex structures as intra- and inter-molecular residue contact graphs with atom-resolution node and edge features. And we propose a novel graph convolutional kernel that aggregates interacting nodes’ features through edges so that generalized interaction energies can be learned directly from 3D data. The resulting energy-based graph convolutional networks (EGCN) with multihead attention are trained to predict intra- and inter-molecular energies, binding affinities, and quality measures (interface RMSD) for encounter complexes. Compared to a state-of-the-art scoring function for model ranking, EGCN significantly improves ranking for a critical assessment of predicted interactions (CAPRI) test set involving homology docking; and is comparable or slightly better for Score_set, a CAPRI benchmark set generated by diverse community-wide docking protocols not known to training data. For Score_set quality assessment, EGCN shows about 27% improvement to our previous efforts. Directly learning from 3D structure data in graph representation, EGCN represents the first successful development of graph convolutional networks for protein docking.

12.
Król M  Tournier AL  Bates PA 《Proteins》2007,68(1):159-169
Molecular Dynamics (MD) simulations have been performed on a set of rigid-body docking poses, carried out over 25 protein-protein complexes. The results show that fully flexible relaxation increases the fraction of native contacts (NC) by up to 70% for certain docking poses. The largest increase in the fraction of NC is observed for docking poses where anchor residues are able to sample their bound conformation. For each MD simulation, structural snap-shots were clustered and the centre of each cluster used as the MD-relaxed docking pose. A comparison between two energy-based scoring schemes, the first calculated for the MD-relaxed poses, the second for energy minimized poses, shows that the former are better in ranking complexes with large hydrophobic interfaces. Furthermore, complexes with large interfaces are generally ranked well, regardless of the type of relaxation method chosen, whereas complexes with small hydrophobic interfaces remain difficult to rank. In general, the results indicate that current force-fields are able to correctly describe direct intermolecular interactions between receptor and ligand molecules. However, these force-fields still fail in cases where protein-protein complexes are stabilized by subtle energy contributions.
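The fraction of native contacts (NC) used in this abstract can be computed once contacts have been listed for the native complex and for a docking pose. The sketch below assumes each contact is already given as a (receptor residue, ligand residue) pair; how contacts are detected (for example, a distance cutoff) is not stated in the abstract and is left outside the function. The function name is hypothetical.

```python
def fraction_native_contacts(native_contacts, pose_contacts):
    """Return the share of native interface contacts reproduced by a pose.

    Both arguments are iterables of (receptor_residue, ligand_residue) pairs.
    Contacts present only in the pose (non-native contacts) do not affect
    the result; only native contacts that the pose recovers are counted.
    """
    native = set(native_contacts)
    if not native:
        raise ValueError("the native complex must have at least one contact")
    recovered = native & set(pose_contacts)
    return len(recovered) / len(native)
```

Comparing this value before and after relaxation of the same pose gives the kind of increase the abstract reports.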

13.
Liang S  Meroueh SO  Wang G  Qiu C  Zhou Y 《Proteins》2009,75(2):397-403
The identification of near-native protein-protein complexes among a set of decoys remains highly challenging. A strategy for improving the success rate of near-native detection is to enrich near-native docking decoys in a small number of top-ranked decoys. Recently, we found that a combination of three scoring functions (energy, conservation, and interface propensity) can predict the location of binding interface regions with reasonable accuracy. Here, these three scoring functions are modified and combined into a consensus scoring function called ENDES for enriching near-native docking decoys. We found that all individual scores result in enrichment for the majority of the 28 targets in the ZDOCK2.3 decoy set and the 22 targets in Benchmark 2.0. Among the three scores, the interface propensity score yields the highest enrichment in both sets of protein complexes. When these scores are combined into the ENDES consensus score, a significant increase in enrichment of near-native structures is found. For example, when 2000 docking decoys are reduced to 200 decoys by ENDES, the fraction of near-native structures in docking decoys increases by a factor of about six on average. ENDES was implemented in a computer program that is available for download at http://sparks.informatics.iupui.edu.
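The enrichment quoted in this abstract (the near-native fraction rising by a factor of about six when 2000 decoys are cut to 200) is a ratio of two fractions. The sketch below shows that arithmetic for any scoring function; it assumes that lower scores are better, and the function name and argument layout are illustrative rather than taken from ENDES.

```python
def enrichment_factor(scores, near_native, keep):
    """Ratio of the near-native fraction after filtering to that before.

    scores:      one score per decoy (assumed: lower is better).
    near_native: one boolean per decoy, True if the decoy is near-native.
    keep:        number of best-scored decoys retained by the filter.
    """
    if len(scores) != len(near_native):
        raise ValueError("scores and near_native must have the same length")
    if not 0 < keep <= len(scores):
        raise ValueError("keep must be between 1 and the number of decoys")
    total_hits = sum(near_native)
    if total_hits == 0:
        raise ValueError("enrichment is undefined without near-native decoys")
    # Indices of decoys ordered from best (lowest) to worst score.
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    kept_hits = sum(near_native[i] for i in order[:keep])
    return (kept_hits / keep) / (total_hits / len(scores))
```

A value of 1 means the filter did no better than random selection; when one tenth of the decoys are kept, the largest possible value is 10, reached only if every near-native decoy survives the cut.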

14.
Lorenzen S  Zhang Y 《Proteins》2007,68(1):187-194
Most state-of-the-art protein-protein docking algorithms use the Fast Fourier Transform (FFT) technique to sample the six-dimensional translational and rotational space. Scoring functions including shape complementarity, electrostatics, and desolvation are usually exploited in ranking the docking conformations. While these rigid-body docking methods provide good performance in bound docking, using unbound structures as input frequently leads to a high number of false positive hits. For the purpose of better selecting correct docking conformations, we structurally cluster the docking decoys generated by four widely-used FFT-based protein-protein docking methods. In all cases, the selection based on cluster size outperforms the ranking based on the inherent scoring function. If we cluster decoys from different servers together, only marginal improvement is obtained in comparison with clustering decoys from the best individual server. A collection of multiple decoy sets of comparable quality will be the key to improve the clustering result from meta-docking servers.
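Selection by cluster size, as described in this abstract, can be sketched with a simple greedy procedure over a precomputed pairwise distance matrix (for example, RMSD between decoys). The abstract does not specify the clustering algorithm or cutoff, so the greedy scheme, the function name, and the `cutoff` parameter below are assumptions made for illustration.

```python
def rank_by_cluster_size(distance, cutoff):
    """Greedily cluster decoys and return clusters from largest to smallest.

    distance: square, symmetric matrix (list of lists) of pairwise
              distances between decoys, with zeros on the diagonal.
    cutoff:   decoys within this distance of a center join its cluster.

    Returns a list of (center_index, member_indices) tuples. At each step
    the unassigned decoy with the most unassigned neighbors becomes a
    center (ties go to the lowest index), and its neighbors are removed.
    """
    unassigned = set(range(len(distance)))
    clusters = []
    while unassigned:
        center = max(
            sorted(unassigned),
            key=lambda i: sum(1 for j in unassigned if distance[i][j] <= cutoff),
        )
        members = sorted(j for j in unassigned if distance[center][j] <= cutoff)
        clusters.append((center, members))
        unassigned -= set(members)
    return clusters
```

The centers of the first few clusters are then reported as the top predictions, replacing the order given by the docking program's own score.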

15.
Most scoring functions for protein-protein docking algorithms are either atom-based or residue-based, with the former being able to produce higher quality structures and the latter being more tolerant to conformational changes upon binding. Earlier, we developed the ZRANK algorithm for reranking docking predictions, with a scoring function that contained only atom-based terms. Here we combine ZRANK's atom-based potentials with five residue-based potentials published by other labs, as well as an atom-based potential IFACE that we published after ZRANK. We simultaneously optimized the weights for selected combinations of terms in the scoring function, using decoys generated with the protein-protein docking algorithm ZDOCK. We performed rigorous cross validation of the combinations using 96 test cases from a docking benchmark. Judged by the integrative success rate of making 1000 predictions per complex, addition of IFACE and the best residue-based pair potential reduced the number of cases without a correct prediction by 38 and 27% relative to ZDOCK and ZRANK, respectively. Thus combination of residue-based and atom-based potentials into a scoring function can improve performance for protein-protein docking. The resulting scoring function is called IRAD (integration of residue- and atom-based potentials for docking) and is available at http://zlab.umassmed.edu.

16.
We report the performance of the protein docking prediction pipeline of our group and the results for Critical Assessment of Prediction of Interactions (CAPRI) rounds 38-46. The pipeline integrates programs developed in our group as well as other existing scoring functions. The core of the pipeline is the LZerD protein-protein docking algorithm. If templates of the target complex are not found in PDB, the first step of our docking prediction pipeline is to run LZerD for a query protein pair. Meanwhile, in the case of human group prediction, we survey the literature to find information that can guide the modeling, such as protein-protein interface information. In addition to any literature information and binding residue prediction, generated docking decoys were selected by a rank aggregation of statistical scoring functions. The top 10 decoys were relaxed by a short molecular dynamics simulation before submission to remove atom clashes and improve side-chain conformations. In these CAPRI rounds, our group, particularly the LZerD server, showed robust performance. On the other hand, there are failed cases where some other groups were successful. To understand weaknesses of our pipeline, we analyzed sources of errors for failed targets. Since we noted that structure refinement is a step that needs improvement, we newly performed a comparative study of several refinement approaches. Finally, we show several examples that illustrate successful and unsuccessful cases by our group.

17.
A major challenge of the protein docking problem is to define scoring functions that can distinguish near‐native protein complex geometries from a large number of non‐native geometries (decoys) generated with noncomplexed protein structures (unbound docking). In this study, we have constructed a neural network that employs the information from atom‐pair distance distributions of a large number of decoys to predict protein complex geometries. We found that docking prediction can be significantly improved using two different types of polar hydrogen atoms. To train the neural network, 2000 near‐native decoys of even distance distribution were used for each of the 185 considered protein complexes. The neural network normalizes the information from different protein complexes using an additional protein complex identity input neuron for each complex. The parameters of the neural network were determined such that they mimic a scoring funnel in the neighborhood of the native complex structure. The neural network approach avoids the reference state problem, which occurs in deriving knowledge‐based energy functions for scoring. We show that a distance‐dependent atom pair potential performs much better than a simple atom‐pair contact potential. We have compared the performance of our scoring function with other empirical and knowledge‐based scoring functions such as ZDOCK 3.0, ZRANK, ITScore‐PP, EMPIRE, and RosettaDock. In spite of the simplicity of the method and its functional form, our neural network‐based scoring function achieves a reasonable performance in rigid‐body unbound docking of proteins. Proteins 2010. © 2009 Wiley‐Liss, Inc.

18.
Treating flexibility in molecular docking is a major challenge in cell biology research. Here we describe the background and the principles of existing flexible protein-protein docking methods, focusing on the algorithms and their rationale. We describe how protein flexibility is treated in different stages of the docking process: in the preprocessing stage, rigid and flexible parts are identified and their possible conformations are modeled. This preprocessing provides information for the subsequent docking and refinement stages. In the docking stage, an ensemble of pre-generated conformations or the identified rigid domains may be docked separately. In the refinement stage, small-scale movements of the backbone and side-chains are modeled and the binding orientation is improved by rigid-body adjustments. For clarity of presentation, we divide the different methods into categories. This should allow the reader to focus on the most suitable method for a particular docking problem.

19.
Zhao N  Pang B  Shyu CR  Korkin D 《Proteomics》2011,11(22):4321-4330
Structural knowledge about protein-protein interactions can provide insights into the basic processes underlying cell function. Recent progress in experimental and computational structural biology has led to a rapid growth of experimentally resolved structures and computationally determined near-native models of protein-protein interactions. However, determining whether a protein-protein interaction is physiological or is an artifact of an experimental or computational method remains a challenging problem. In this work, we have addressed two related problems. The first problem is distinguishing between the experimentally obtained physiological and crystal-packing protein-protein interactions. The second problem is concerned with the classification of near-native and inaccurate docking models. We first defined a universal set of interface features and employed a support vector machines (SVM)-based approach to classify the interactions for both problems, with the accuracy, precision, and recall for the first problem classifier reaching 93%. To improve the classification, we next developed a semi-supervised learning approach for the second problem, using transductive SVM (TSVM). We applied both classifiers to a commonly used protein docking benchmark of 124 complexes. We found that while we reached the classification accuracies of 78.9% for the SVM classifier and 80.3% for the TSVM classifier, improving protein-docking methods by model re-ranking remains a challenging problem.

20.
Tobi D  Bahar I 《Proteins》2006,62(4):970-981
Protein-protein docking is a challenging computational problem in functional genomics, particularly when one or both proteins undergo conformational change(s) upon binding. The major challenge is to define a scoring function soft enough to tolerate these changes and specific enough to distinguish between near-native and "misdocked" conformations. Using a linear programming technique, we derived protein docking potentials (PDPs) that comply with this requirement. We considered a set of 63 nonredundant complexes to this aim, and generated 400,000 putative docked complexes (decoys) based on a shape complementarity criterion for each complex. The PDPs were required to yield for the native (correctly docked) structure a potential energy lower than those of all the nonnative (misdocked) structures. The energy constraints applied to all complexes led to ca. 25 million inequalities, the simultaneous solution of which yielded an optimal set of PDPs that discriminated the correctly docked (up to 4.0 Å root-mean-square deviation from known complex structure) structure among the 85 top-ranking (0.02%) decoys in 59/63 examined bound-bound cases. The high performance of the potentials was further verified in jackknife tests and by ranking putative docked conformations submitted to CAPRI. In addition to their utility in identifying correctly folded complexes, the PDPs reveal biologically meaningful features that distinguish docking potentials from folding potentials.
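The requirement that the native structure score below every decoy can be written as a set of linear inequalities, which is what makes linear programming applicable. The block below is a sketch under one assumption not stated in the abstract: that the energy is a sum of interface contact counts weighted by the potential parameters. The symbols n_k, epsilon_k, N_c and D are introduced here only for illustration.

```latex
% Assumed energy form: linear in the unknown parameters \varepsilon_k,
% where n_k(X) counts interface contacts of type k in structure X.
E(X) = \sum_{k} n_k(X)\,\varepsilon_k

% One inequality per decoy D of each complex c with native structure N_c:
E(D) - E(N_c) = \sum_{k} \bigl[\, n_k(D) - n_k(N_c) \,\bigr]\,\varepsilon_k > 0

% Number of inequalities for 63 complexes with 400,000 decoys each:
63 \times 400{,}000 = 25{,}200{,}000 \approx 2.5 \times 10^{7}
```

The count agrees with the "ca. 25 million inequalities" quoted in the abstract, and 85 of 400,000 decoys is about 0.02%, matching the reported top-ranking fraction.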
