Similar literature
 20 similar articles found
1.
A major challenge of the protein docking problem is to define scoring functions that can distinguish near‐native protein complex geometries from a large number of non‐native geometries (decoys) generated with noncomplexed protein structures (unbound docking). In this study, we have constructed a neural network that employs the information from atom‐pair distance distributions of a large number of decoys to predict protein complex geometries. We found that docking prediction can be significantly improved using two different types of polar hydrogen atoms. To train the neural network, 2000 near‐native decoys of even distance distribution were used for each of the 185 considered protein complexes. The neural network normalizes the information from different protein complexes using an additional protein complex identity input neuron for each complex. The parameters of the neural network were determined such that they mimic a scoring funnel in the neighborhood of the native complex structure. The neural network approach avoids the reference state problem, which occurs in deriving knowledge‐based energy functions for scoring. We show that a distance‐dependent atom pair potential performs much better than a simple atom‐pair contact potential. We have compared the performance of our scoring function with other empirical and knowledge‐based scoring functions such as ZDOCK 3.0, ZRANK, ITScore‐PP, EMPIRE, and RosettaDock. In spite of the simplicity of the method and its functional form, our neural network‐based scoring function achieves a reasonable performance in rigid‐body unbound docking of proteins. Proteins 2010. © 2009 Wiley‐Liss, Inc.
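The difference between a distance-dependent atom-pair description and a plain contact count can be illustrated with a minimal sketch. It is not the authors' feature pipeline: the function names, the atom tuple layout, the 1 Å bin width, and the 10 Å cutoff are all assumptions chosen for illustration.

```python
import math


def pair_distance_histogram(receptor_atoms, ligand_atoms, bin_width=1.0, cutoff=10.0):
    """Count receptor-ligand atom pairs per distance bin.

    Each atom is a tuple (atom_type, x, y, z). The result maps
    (type_a, type_b, bin_index) -> count, with the two types sorted so
    the key does not depend on which partner an atom belongs to.
    bin_width and cutoff are illustrative values, not taken from the paper.
    """
    counts = {}
    for type_a, *xyz_a in receptor_atoms:
        for type_b, *xyz_b in ligand_atoms:
            d = math.dist(xyz_a, xyz_b)
            if d >= cutoff:
                continue  # pairs beyond the cutoff carry no information here
            first, second = sorted((type_a, type_b))
            key = (first, second, int(d // bin_width))
            counts[key] = counts.get(key, 0) + 1
    return counts


def contact_counts(histogram):
    """Collapse the distance bins into a plain contact count per type pair."""
    contacts = {}
    for (type_a, type_b, _bin), n in histogram.items():
        contacts[(type_a, type_b)] = contacts.get((type_a, type_b), 0) + n
    return contacts
```

The histogram keeps the distance at which each pair occurs, whereas `contact_counts` discards it; the abstract reports that the distance-dependent form performs much better as input to the scoring function.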

2.
Zhao Y  Sanner MF 《Proteins》2007,68(3):726-737
Conformational changes of biological macromolecules when binding with ligands have long been observed and remain a challenge for automated docking methods. Here we present a novel protein-ligand docking software called FLIPDock (Flexible LIgand-Protein Docking) allowing the automated docking of flexible ligand molecules into active sites of flexible receptor molecules. In FLIPDock, conformational spaces of molecules are encoded using a data structure that we have developed recently called the Flexibility Tree (FT). While the FT can represent fully flexible ligands, it was initially designed as a hierarchical and multiresolution data structure for the selective encoding of conformational subspaces of large biological macromolecules. These conformational subspaces can be built to span a range of conformations important for the biological activity of a protein. A variety of motions can be combined, ranging from domains moving as rigid bodies or backbone atoms undergoing normal mode-based deformations, to side chains assuming rotameric conformations. In addition, these conformational subspaces are parameterized by a small number of variables which can be searched during the docking process, thus effectively modeling the conformational changes in a flexible receptor. FLIPDock searches the variables using genetic algorithm-based search techniques and evaluates putative docking complexes with a scoring function based on the AutoDock3.05 force-field. In this paper, we describe the concepts behind FLIPDock and the overall architecture of the program. We demonstrate FLIPDock's ability to solve docking problems in which the assumption of a rigid receptor previously prevented the successful docking of known ligands. In particular, we repeat an earlier cross docking experiment and demonstrate an increased success rate of 93.5%, compared to the original 72% success rate achieved by AutoDock over the 400 cross-docking calculations. We also demonstrate FLIPDock's ability to handle conformational changes involving backbone motion by docking balanol to an adenosine-binding pocket of protein kinase A.

3.
Protein‐protein interactions play fundamental roles in biological processes including signaling, metabolism, and trafficking. While the structure of a protein complex reveals crucial details about the interaction, it is often difficult to acquire this information experimentally. As the number of interactions discovered increases faster than they can be characterized, protein‐protein docking calculations may be able to reduce this disparity by providing models of the interacting proteins. Rigid‐body docking is a widely used docking approach, and is often capable of generating a pool of models within which a near‐native structure can be found. These models need to be scored in order to select the acceptable ones from the set of poses. Recently, more than 100 scoring functions from the CCharPPI server were evaluated for this task using decoy structures generated with SwarmDock. Here, we extend this analysis to identify the predictive success rates of the scoring functions on decoys from three rigid‐body docking programs, ZDOCK, FTDock, and SDOCK, allowing us to assess the transferability of the functions. We also apply a set‐theoretic measure to test whether the scoring functions are capable of identifying near‐native poses within different subsets of the benchmark. This information can guide the choice of the most efficient scoring function for each docking method, as well as inform future scoring function development efforts. Proteins 2017; 85:1287–1297. © 2017 Wiley Periodicals, Inc.
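A success rate of the kind compared across scoring functions is commonly computed as a top-N hit fraction. The sketch below assumes that definition; the abstract does not spell out the exact metric, and the function name and the input layout are hypothetical.

```python
def top_n_success_rate(ranked_decoys_per_complex, n):
    """Fraction of complexes with at least one near-native decoy in the top n.

    ranked_decoys_per_complex maps a complex ID to a list of booleans,
    ordered from best to worst score; True marks a near-native decoy.
    This layout is an assumption made for illustration.
    """
    if not ranked_decoys_per_complex:
        raise ValueError("need at least one complex")
    hits = sum(any(flags[:n]) for flags in ranked_decoys_per_complex.values())
    return hits / len(ranked_decoys_per_complex)
```

Evaluating the same scoring function on decoy sets from different docking programs and comparing the resulting rates is one simple way to gauge the transferability the abstract refers to.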

4.
We describe protein-protein recognition within the frame of the random energy model of statistical physics. We simulate, by docking the component proteins, the process of association of two proteins that form a complex. We obtain the energy spectrum of a set of protein-protein complexes of known three-dimensional structure by performing docking in random orientations and scoring the models thus generated. We use a coarse protein representation where each amino acid residue is replaced by its Voronoï cell, and derive a scoring function by applying the evolutionary learning program ROGER to a set of parameters measured on that representation. Taking the scores of the docking models to be interaction energies, we obtain energy spectra for the complexes and fit them to a Gaussian distribution, from which we derive physical parameters such as a glass transition temperature and a specificity transition temperature.
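For orientation, the glass transition temperature of a random energy model follows from a short textbook derivation. The sketch below assumes $\Omega$ docking models whose scores are independent Gaussian variables with mean $\bar{E}$ and standard deviation $\sigma$; these symbols are introduced here for illustration and do not come from the abstract, and the authors' fitting procedure may differ in detail.

```latex
\begin{align}
  n(E) &= \frac{\Omega}{\sqrt{2\pi\sigma^{2}}}
          \exp\!\left[-\frac{(E-\bar{E})^{2}}{2\sigma^{2}}\right], \\
  S(E) &= k_{B}\ln n(E) \approx
          k_{B}\left[\ln\Omega - \frac{(E-\bar{E})^{2}}{2\sigma^{2}}\right], \\
  \frac{1}{T} &= \frac{\partial S}{\partial E}
          = -\,k_{B}\,\frac{E-\bar{E}}{\sigma^{2}}, \\
  S(E_{c}) &= 0 \;\Rightarrow\; E_{c} = \bar{E} - \sigma\sqrt{2\ln\Omega}, \\
  T_{g} &= \frac{\sigma}{k_{B}\sqrt{2\ln\Omega}} .
\end{align}
```

Below $T_{g}$ the entropy vanishes and the system is trapped in its few lowest-energy states, which is why the width of the fitted Gaussian and the number of sampled models together set the transition temperature.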

5.
Venkatraman V  Ritchie DW 《Proteins》2012,80(9):2262-2274
Modeling conformational changes in protein docking calculations is challenging. To make the calculations tractable, most current docking algorithms typically treat proteins as rigid bodies and use soft scoring functions that implicitly accommodate some degree of flexibility. Alternatively, ensembles of structures generated from molecular dynamics (MD) may be cross-docked. However, such combinatorial approaches can produce many thousands or even millions of docking poses, and require fast and sensitive scoring functions to distinguish them. Here, we present a novel approach called "EigenHex," which is based on normal mode analyses (NMAs) of a simple elastic network model of protein flexibility. We initially assume that the proteins to be docked are rigid, and we begin by performing conventional soft docking using the Hex polar Fourier correlation algorithm. We then apply a pose-dependent NMA to each of the top 1000 rigid body docking solutions, and we sample and re-score multiple perturbed docking conformations generated from linear combinations of up to 20 eigenvectors using a multi-threaded particle swarm optimization algorithm. When applied to the 63 "rigid body" targets of the Protein Docking Benchmark version 2.0, our results show that sampling and re-scoring from just one to three eigenvectors gives a modest but consistent improvement for these targets. Thus, pose-dependent NMA avoids the need to sample multiple eigenvectors and it offers a promising alternative to combinatorial cross-docking.

6.
Akio Kitao 《Proteins》2013,81(6):1005-1016
We propose a fast clustering and reranking method, CyClus, for protein–protein docking decoys. This method enables comprehensive clustering of whole decoys generated by rigid‐body docking using cylindrical approximation of the protein–protein interface and hierarchical clustering procedures. We demonstrate the clustering and reranking of 54,000 decoy structures generated by ZDOCK for each complex within a few minutes. After parameter tuning for the test set in ZDOCK benchmark 2.0 with the ZDOCK and ZRANK scoring functions, blind tests for the incremental data in ZDOCK benchmark 3.0 and 4.0 were conducted. CyClus successfully generated smaller subsets of decoys containing near‐native decoys. For example, the number of decoys required to create subsets containing near‐native decoys with 80% probability was reduced to between 22% and 50% of the number required in the original ZDOCK. Although specific ZDOCK and ZRANK results were demonstrated, the CyClus algorithm was designed to be more general and can be applied to a wide range of decoys and scoring functions by adjusting just two parameters, p and T. CyClus results were also compared to those from ClusPro. Proteins 2013; © 2012 Wiley Periodicals, Inc.

7.
8.
The ATTRACT protein-protein docking program has been employed to predict protein-protein complex structures in CAPRI rounds 38-45. For 11 out of 16 targets (~70%), solutions of acceptable or better quality were submitted. These include several cases of peptide-protein docking and the successful prediction of the geometry of carbohydrate-protein interactions. The option of combining rigid body minimization and simultaneous optimization in collective degrees of freedom based on elastic network modes was employed and systematically evaluated. Application to a large benchmark set indicates a modest improvement in docking performance compared to rigid docking. Possible further improvements of the docking approach, in particular at the scoring and the flexible refinement steps, are discussed.

9.
Protein‐protein interactions are abundant in the cell, but to date structural data for a large number of complexes are lacking. Computational docking methods can complement experiments by providing structural models of complexes based on structures of the individual partners. A major caveat for docking success is accounting for protein flexibility. In particular, interface residues undergo significant conformational changes upon binding. This limits the performance of docking methods that keep partner structures rigid or allow limited flexibility. A new docking refinement approach, iATTRACT, has been developed which combines simultaneous full interface flexibility and rigid body optimizations during docking energy minimization. It employs an atomistic molecular mechanics force field for intermolecular interface interactions and a structure‐based force field for intramolecular contributions. The approach was systematically evaluated on a large protein‐protein docking benchmark, starting from an enriched decoy set of rigidly docked protein–protein complexes deviating by up to 15 Å from the native structure at the interface. Large improvements in sampling and slight but significant improvements in scoring/discrimination of near native docking solutions were observed. Complexes with initial deviations at the interface of up to 5.5 Å were refined to significantly better agreement with the native structure. Improvements in the fraction of native contacts were especially favorable, yielding increases of up to 70%. Proteins 2015; 83:248–258. © 2014 Wiley Periodicals, Inc.

10.
Identification and characterization of antigenic determinants on proteins has received considerable attention, utilizing both experimental and computational methods. Computational routines have mostly used structural and physicochemical parameters to predict the antigenic propensity of protein sites. However, the performance of computational routines has been low when compared to experimental alternatives. Here we describe the construction of machine learning based classifiers to enhance the prediction quality for identifying linear B-cell epitopes on proteins. Our approach combines several parameters previously associated with antigenicity, and includes novel parameters based on frequencies of amino acids and amino acid neighborhood propensities. We utilized machine learning algorithms for deriving antigenicity classification functions assigning antigenic propensities to each amino acid of a given protein sequence. We compared the prediction quality of the novel classifiers with respect to established routines for epitope scoring, and tested prediction accuracy on experimental data available for HIV proteins. The major finding is that machine learning classifiers clearly outperform the reference classification systems on the HIV epitope validation set.

11.
Ensemble docking has provided an inexpensive method to account for receptor flexibility in molecular docking for virtual screening. Unfortunately, as there is no rigorous theory to connect the docking scores from multiple structures to measured activity, researchers have not yet come up with effective ways to use these scores to classify compounds into actives and inactives. This shortcoming has led to a decrease, rather than an increase, in the performance of classifying compounds when more structures are added to the ensemble. Previously, we suggested that machine learning, implemented in the form of a naïve Bayesian model, could alleviate this problem. However, the naïve Bayesian model assumed the probabilities of observing the docking scores to different structures to be independent. This approximation might prevent it from achieving even higher performance. In the work presented in this paper, we have relaxed this approximation by using several other machine learning methods (k nearest neighbor, logistic regression, support vector machine, and random forest) to improve ensemble docking. We found significant improvement.

12.
Wu MY  Dai DQ  Yan H 《Proteins》2012,80(9):2137-2153
Protein-ligand docking is widely applied to structure-based virtual screening for drug discovery. This article presents a novel docking technique, PRL-Dock, based on hydrogen bond matching and probabilistic relaxation labeling. It deals with multiple hydrogen bonds and can match many acceptors and donors simultaneously. In the matching process, the initial probability of matching an acceptor with a donor is estimated by an efficient scoring function and the compatibility coefficients are assigned according to the coexisting condition of two hydrogen bonds. After hydrogen bond matching, the geometric complementarity of the interacting donor and acceptor sites is taken into account for displacement of the ligand. It is reduced to an optimization problem to calculate the optimal translation and rotation matrices that minimize the root mean square deviation between two sets of points, which can be solved using the Kabsch algorithm. In addition to the van der Waals interaction, the contribution of intermolecular hydrogen bonds in a complex is included in the scoring function to evaluate the docking quality. A modified Lennard-Jones 12-6 dispersion-repulsion term is used to estimate the van der Waals interaction to make the scoring function fairly "soft" so that ligands are not heavily penalized for small errors in the binding geometry. The calculation of this scoring function is very convenient. The evaluation is carried out on 278 rigid complexes and 93 flexible ones where there is at least one intermolecular hydrogen bond. The experimental results for docking accuracy and prediction of binding affinity demonstrate that the proposed method is highly effective.
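The idea of a "soft" 12-6 term can be sketched as follows. The abstract does not give the form of the modification used in PRL-Dock, so capping the repulsive wall is an assumed stand-in technique, and the parameter names and default values (`well_depth`, `r_min`, `cap`) are hypothetical.

```python
def lennard_jones_12_6(r, well_depth=1.0, r_min=4.0):
    """Standard 12-6 term with its minimum of -well_depth at r == r_min."""
    ratio6 = (r_min / r) ** 6
    return well_depth * (ratio6 * ratio6 - 2.0 * ratio6)


def soft_lennard_jones(r, well_depth=1.0, r_min=4.0, cap=1.0):
    """12-6 term with the repulsive wall truncated at `cap`.

    Truncation is an illustrative softening scheme, not the published one:
    close contacts cost at most `cap` instead of growing as r**-12.
    """
    if r <= 0.0:
        return cap  # overlapping atoms get the maximum penalty
    return min(lennard_jones_12_6(r, well_depth, r_min), cap)
```

With these defaults, a pair at half the optimal separation scores 3968 in the standard form but only 1 in the capped form, so a ligand pose with a small clash is penalized without being rejected outright.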

13.
Bordner AJ  Gorin AA 《Proteins》2007,68(2):488-502
Computational prediction of protein complex structures through docking offers a means to gain a mechanistic understanding of protein interactions that mediate biological processes. This is particularly important as the number of experimentally determined structures of isolated proteins exceeds the number of structures of complexes. A comprehensive docking procedure is described in which efficient sampling of conformations is achieved by matching surface normal vectors, fast filtering for shape complementarity, clustering by RMSD, and scoring the docked conformations using a supervised machine learning approach. Contacting residue pair frequencies, residue propensities, evolutionary conservation, and shape complementarity score for each docking conformation are used as input data to a Random Forest classifier. The performance of the Random Forest approach for selecting correctly docked conformations was assessed by cross-validation using a nonredundant benchmark set of X-ray structures for 93 heterodimer and 733 homodimer complexes. The single highest rank docking solution was the correct (near-native) structure for slightly more than one third of the complexes. Furthermore, the fraction of highly ranked correct structures was significantly higher than the overall fraction of correct structures, for almost all complexes. A detailed analysis of the difficult to predict complexes revealed that the majority of the homodimer cases were explained by incorrect oligomeric state annotation. Evolutionary conservation and shape complementarity score as well as both underrepresented and overrepresented residue types and residue pairs were found to make the largest contributions to the overall prediction accuracy. Finally, the method was also applied to docking unbound subunit structures from a previously published benchmark set.

14.
Hartmann C  Antes I  Lengauer T 《Proteins》2009,74(3):712-726
We describe a scoring and modeling procedure for docking ligands into protein models that have either modeled or flexible side-chain conformations. Our methodical contribution comprises a procedure for generating new potentials of mean force for the ROTA scoring function which we have introduced previously for optimizing side-chain conformations with the tool IRECS. The ROTA potentials are specially trained to tolerate small-scale positional errors of atoms that are characteristic of (i) side-chain conformations that are modeled using a sparse rotamer library and (ii) ligand conformations that are generated using a docking program. We generated both rigid and flexible protein models with our side-chain prediction tool IRECS and docked ligands to proteins using the scoring function ROTA and the docking programs FlexX (for rigid side chains) and FlexE (for flexible side chains). We validated our approach on the forty screening targets of the DUD database. The validation shows that the ROTA potentials are especially well suited for estimating the binding affinity of ligands to proteins. The results also show that our procedure can compensate for the performance decrease in screening that occurs when using protein models with side chains modeled with a rotamer library instead of using X-ray structures. The average runtime per ligand of our method is 168 seconds on an Opteron V20z, which is fast enough to allow virtual screening of compound libraries for drug candidates.

15.
Protein-ligand docking is a computational method to identify the binding mode of a ligand and a target protein, and predict the corresponding binding affinity using a scoring function. This method has great value in drug design. After decades of development, scoring functions nowadays typically can identify the true binding mode, but the prediction of binding affinity still remains a major problem. Here we present CScore, a data-driven scoring function using a modified Cerebellar Model Articulation Controller (CMAC) learning architecture, for accurate binding affinity prediction. The performance of CScore in terms of correlation between predicted and experimental binding affinities is benchmarked under different validation approaches. CScore achieves a prediction with R = 0.7668 and RMSE = 1.4540 when tested on an independent dataset. To the best of our knowledge, this result outperforms other scoring functions tested on the same dataset. The performance of CScore varies on different clusters under the leave-cluster-out validation approach, but still achieves a competitive result. Lastly, the target-specific CScore achieves an even better result with R = 0.8237 and RMSE = 1.0872, trained on a much smaller but more relevant dataset for each target. The large dataset of structural information on protein-ligand complexes and advances in machine learning techniques enable the data-driven approach to binding affinity prediction. CScore is capable of accurate binding affinity prediction. It is also shown that CScore will perform better if sufficient and relevant data are presented. As publicly available structural data continue to grow, further improvement of this scoring scheme can be expected.
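The two figures of merit quoted in the abstract can be computed as in the following minimal sketch, which assumes that R denotes the Pearson correlation coefficient (the abstract does not say which correlation is meant). The function names are chosen here for illustration.

```python
import math


def pearson_r(predicted, experimental):
    """Pearson correlation between predicted and experimental affinities."""
    n = len(predicted)
    mean_p = sum(predicted) / n
    mean_e = sum(experimental) / n
    cov = sum((p - mean_p) * (e - mean_e) for p, e in zip(predicted, experimental))
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    var_e = sum((e - mean_e) ** 2 for e in experimental)
    return cov / math.sqrt(var_p * var_e)


def rmse(predicted, experimental):
    """Root mean square error, in the same units as the affinities."""
    n = len(predicted)
    return math.sqrt(sum((p - e) ** 2 for p, e in zip(predicted, experimental)) / n)
```

The two numbers answer different questions: predictions that are exactly twice the experimental values have a correlation of 1 but a large RMSE, which is why the abstract reports both.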

16.
Qian Wang  Luhua Lai 《Proteins》2014,82(10):2472-2482
Target structure‐based virtual screening, which employs protein‐small molecule docking to identify potential ligands, has been widely used in small‐molecule drug discovery. In the present study, we used a protein–protein docking program to identify proteins that bind to a specific target protein. In the testing phase, an all‐to‐all protein–protein docking run on a large dataset was performed. The three‐dimensional rigid docking program SDOCK was used to examine protein–protein docking on all protein pairs in the dataset. Both the binding affinity and features of the binding energy landscape were considered in the scoring function in order to distinguish positive binding pairs from negative binding pairs. Thus, the lowest docking score, the average Z‐score, and convergency of the low‐score solutions were incorporated in the analysis. The hybrid scoring function was optimized in the all‐to‐all docking test. The docking method and the hybrid scoring function were then used to screen for proteins that bind to tumor necrosis factor‐α (TNFα), which is a well‐known therapeutic target for rheumatoid arthritis and other autoimmune diseases. A protein library containing 677 proteins was used for the screen. Proteins with scores among the top 20% were further examined. Sixteen proteins from the top‐ranking 67 proteins were selected for experimental study. Two of these proteins showed significant binding to TNFα in an in vitro binding study. The results of the present study demonstrate the power and potential application of protein–protein docking for the discovery of novel binding proteins for specific protein targets. Proteins 2014; 82:2472–2482. © 2014 Wiley Periodicals, Inc.
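One plausible reading of the Z‐score ingredient of such a hybrid score is sketched below. The abstract does not define how the average Z‐score is computed, so averaging over the `k` best solutions of one protein pair, the function name, and the default `k` are all assumptions.

```python
import statistics


def average_z_of_best(scores, k=3):
    """Mean Z-score of the k lowest (best) docking scores for one protein pair.

    Z-scores are taken against all docking solutions of the same pair, so a
    strongly negative value means the best solutions stand out from the bulk.
    Both k and this definition are illustrative assumptions.
    """
    mean = statistics.fmean(scores)
    spread = statistics.pstdev(scores)
    if spread == 0.0:
        return 0.0  # identical scores: nothing stands out
    best = sorted(scores)[:k]
    return statistics.fmean([(s - mean) / spread for s in best])
```

A landscape feature of this kind complements the raw lowest score: two pairs can share the same best score while only one of them shows a pronounced gap between its best solutions and the rest.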

17.
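Splice site recognition based on support vector machines (SVM)   Total citations: 14, self-citations: 1, citations by others: 13
Splice site recognition, an important step in gene identification, has long attracted the attention of researchers. Because sequences near splice sites are conserved, several methods based on statistical properties have been applied to splice site recognition, but their performance still leaves room for improvement. The support vector machine (SVM), a relatively new learning machine grounded in statistical learning theory, has developed rapidly in recent years and has been applied to many pattern recognition problems. In this paper, SVMs are applied to splice site recognition. Because false splice sites far outnumber true ones among sequence samples that satisfy the GT-AG rule, an SVM-based "balanced minimum" method (平衡取小法) is proposed to obtain better recognition performance. Experimental results show that SVM-based splice site recognition extracts the statistical features of the conserved sequences near the sites more effectively, generalizes better to the test set, and is simpler to use. This result provides a new method for splice site recognition and offers new leads for solving structure and site recognition problems in the study of biological macromolecules.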

18.
19.
The aim of docking is to accurately predict the structure of a ligand within the constraints of a receptor binding site and to correctly estimate the strength of binding. We discuss, in detail, methodological developments that occurred in the docking field in 2010 and 2011, with a particular focus on the more difficult, and sometimes controversial, aspects of this promising computational discipline. The main developments in docking in this period, covered in this review, are receptor flexibility, solvation, fragment docking, postprocessing, docking into homology models, and docking comparisons. Several new, or at least newly invigorated, advances occurred in areas such as nonlinear scoring functions, using machine‐learning approaches. This review is strongly focused on docking advances in the context of drug design, specifically in virtual screening and fragment‐based drug design. Where appropriate, we refer readers to exemplar case studies. Copyright © 2013 John Wiley & Sons, Ltd.

20.
Proteins play important roles in living organisms, and their function is directly linked with their structure. Due to the growing gap between the number of proteins being discovered and their functional characterization (in particular as a result of experimental limitations), reliable prediction of protein function through computational means has become crucial. This paper reviews the machine learning techniques used in the literature, following their evolution from simple algorithms such as logistic regression to more advanced methods like support vector machines and modern deep neural networks. Hyperparameter optimization methods adopted to boost prediction performance are presented. In parallel, the metamorphosis in the features used by these algorithms from classical physicochemical properties and amino acid composition, up to text-derived features from biomedical literature and learned feature representations using autoencoders, together with feature selection and dimensionality reduction techniques, are also reviewed. The success stories in the application of these techniques to both general and specific protein function prediction are discussed.
