Similar articles
 20 similar articles found
1.
Protein‐protein interactions are abundant in the cell but to date structural data for a large number of complexes is lacking. Computational docking methods can complement experiments by providing structural models of complexes based on structures of the individual partners. A major caveat for docking success is accounting for protein flexibility. In particular, interface residues undergo significant conformational changes upon binding. This limits the performance of docking methods that keep partner structures rigid or allow limited flexibility. A new docking refinement approach, iATTRACT, has been developed which combines simultaneous full interface flexibility and rigid body optimizations during docking energy minimization. It employs an atomistic molecular mechanics force field for intermolecular interface interactions and a structure‐based force field for intramolecular contributions. The approach was systematically evaluated on a large protein‐protein docking benchmark, starting from an enriched decoy set of rigidly docked protein–protein complexes deviating by up to 15 Å from the native structure at the interface. Large improvements in sampling and slight but significant improvements in scoring/discrimination of near native docking solutions were observed. Complexes with initial deviations at the interface of up to 5.5 Å were refined to significantly better agreement with the native structure. Improvements in the fraction of native contacts were especially favorable, yielding increases of up to 70%. Proteins 2015; 83:248–258. © 2014 Wiley Periodicals, Inc.

2.
Protein–protein interactions (PPI) are crucial for protein function. There exist many techniques to identify PPIs experimentally, but to determine the interactions in molecular detail is still difficult and very time‐consuming. The fact that the number of PPIs is vastly larger than the number of individual proteins makes it practically impossible to characterize all interactions experimentally. Computational approaches that can bridge this gap and predict PPIs and model the interactions in molecular detail are greatly needed. Here we present InterPred, a fully automated pipeline that predicts and models PPIs from sequence using structural modeling combined with massive structural comparisons and molecular docking. A key component of the method is the use of a novel random forest classifier that integrates several structural features to distinguish correct from incorrect protein–protein interaction models. We show that InterPred represents a major improvement in protein–protein interaction detection with a performance comparable to or better than experimental high‐throughput techniques. We also show that our full‐atom protein–protein complex modeling pipeline performs better than state‐of‐the‐art protein docking methods on a standard benchmark set. In addition, InterPred was one of the top predictors in the latest CAPRI37 experiment. InterPred source code can be downloaded from http://wallnerlab.org/InterPred Proteins 2017; 85:1159–1170. © 2017 Wiley Periodicals, Inc.

3.
Juliette Martin 《Proteins》2014,82(7):1444-1452
A number of predictive methods have been developed to predict protein–protein binding sites. Each new method is traditionally benchmarked using sets of protein structures of various sizes, and global statistics are used to assess the quality of the prediction. Little attention has been paid to the potential bias due to protein size on these statistics. Indeed, small proteins involve proportionally more residues at interfaces than large ones. If a predictive method is biased toward small proteins, this can lead to an over‐estimation of its performance. Here, we investigate the bias due to the size effect when benchmarking protein‐protein interface prediction on the widely used docking benchmark 4.0. First, we simulate random scores that favor small proteins over large ones. Instead of the 0.5 AUC (Area Under the Curve) value expected by chance, these biased scores result in an AUC equal to 0.6 using hypergeometric distributions, and up to 0.65 using constant scores. We then use real prediction results to illustrate how to detect the size bias by shuffling, and subsequently correct it using a simple conversion of the scores into normalized ranks. In addition, we investigate the scores produced by eight published methods and show that they are all affected by the size effect, which can change their relative ranking. The size effect also has an impact on linear combination scores by modifying the relative contributions of each method. In the future, systematic corrections should be applied when benchmarking predictive methods using data sets with mixed protein sizes. Proteins 2014; 82:1444–1452. © 2014 Wiley Periodicals, Inc.
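
The abstract above does not spell out the conversion into normalized ranks, so the following is only a minimal sketch of one plausible reading: within each protein, raw residue scores are replaced by their rank scaled to [0, 1], with tied scores sharing an average rank. The function name `normalized_ranks` and the tie-handling rule are assumptions made for illustration, not the published procedure.

```python
from typing import List


def normalized_ranks(scores: List[float]) -> List[float]:
    """Convert the raw scores of ONE protein into ranks scaled to [0, 1].

    Hypothetical helper: ties receive the average of the ranks they span,
    so a protein with constant scores maps to 0.5 everywhere, whatever
    its size.
    """
    n = len(scores)
    if n == 0:
        return []
    if n == 1:
        return [0.5]  # a single residue carries no ranking information
    # Indices of residues sorted by increasing score.
    order = sorted(range(n), key=lambda i: scores[i])
    ranks = [0.0] * n
    i = 0
    while i < n:
        # Find the end of the run of tied scores starting at position i.
        j = i
        while j + 1 < n and scores[order[j + 1]] == scores[order[i]]:
            j += 1
        average_rank = (i + j) / 2.0
        for k in range(i, j + 1):
            ranks[order[k]] = average_rank / (n - 1)
        i = j + 1
    return ranks


if __name__ == "__main__":
    print(normalized_ranks([0.2, 0.9, 0.5]))
```

Because every protein is mapped onto the same [0, 1] scale, pooling residues from small and large proteins no longer rewards a method simply for assigning higher scores to small proteins.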

4.
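Research on protein function is the foundation for unraveling the mysteries of life, and machine learning techniques have been widely applied in this field. Using the support vector machine (SVM) method, a general platform for predicting protein functional sites was constructed. The platform first extracts non-homologous protein sequences and then encodes features for these sequences (including basic sequence information, physicochemical features, structural information, and sequence conservation features). The encoded samples serve as training data for SVM training, which yields evaluation metrics such as sensitivity, specificity, the Matthews correlation coefficient, accuracy, and the ROC curve. After repeated testing, the SVM model with the best evaluation metrics is obtained and can then be used to predict functional sites in protein sequences. Beyond predicting protein functional sites, the platform can also be applied to the prediction and analysis of disease-associated single-nucleotide polymorphisms (SNPs), protein domain prediction, interactions between biomolecules, and similar tasks.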
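
The evaluation metrics named in the abstract above (sensitivity, specificity, Matthews correlation coefficient, accuracy) all follow from the four counts of a binary confusion matrix. The sketch below uses their standard textbook definitions; the function name `binary_metrics` and the convention of returning 0.0 when a denominator is zero are assumptions for illustration and are not taken from the platform described.

```python
import math
from typing import Dict


def binary_metrics(tp: int, fp: int, tn: int, fn: int) -> Dict[str, float]:
    """Standard binary-classification metrics from confusion-matrix counts.

    tp/fp/tn/fn are true positives, false positives, true negatives and
    false negatives. Undefined ratios (zero denominator) are reported as
    0.0, which is a convention chosen here, not a universal rule.
    """
    total = tp + fp + tn + fn
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    accuracy = (tp + tn) / total if total else 0.0
    # Matthews correlation coefficient: ranges from -1 to +1.
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denominator if denominator else 0.0
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "accuracy": accuracy,
        "mcc": mcc,
    }


if __name__ == "__main__":
    print(binary_metrics(tp=40, fp=10, tn=40, fn=10))
```

Functional sites are usually a small minority of residues, which is one reason the Matthews correlation coefficient is often reported alongside accuracy: it stays near zero for a classifier that labels everything as negative.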

5.
Proteins are essential elements of biological systems, and their function typically relies on their ability to successfully bind to specific partners. Recently, an emphasis of study into protein interactions has been on hot spots, or residues in the binding interface that make a significant contribution to the binding energetics. In this study, we investigate how conservation of hot spots can be used to guide docking prediction. We show that the use of evolutionary data combined with hot spot prediction highlights near‐native structures across a range of benchmark examples. Our approach explores various strategies for using hot spots and evolutionary data to score protein complexes, using both absolute and chemical definitions of conservation along with refinements to these strategies that look at windowed conservation and filtering to ensure a minimum number of hot spots in each binding partner. Finally, structure‐based models of orthologs were generated for comparison with sequence‐based scoring. Using two data sets of 22 and 85 examples, a high rate of top 10 and top 1 predictions is observed, with up to 82% of examples returning a top 10 hit and 35% returning a top 1 hit depending on the data set and strategy applied; upon inclusion of the native structure among the decoys, up to 55% of examples yielded a top 1 hit. The 20 common examples between data sets show that more carefully curated interolog data yields better predictions, particularly in achieving top 1 hits. Proteins 2015; 83:1940–1946. © 2015 The Authors. Proteins: Structure, Function, and Bioinformatics Published by Wiley Periodicals, Inc.

6.
Protein docking algorithms can be used to study the driving forces and reaction mechanisms of docking processes. They are also able to speed up the lengthy process of experimental structure elucidation of protein complexes by proposing potential structures. In this paper, we discuss a variant of the protein-protein docking problem, where the input consists of the tertiary structures of proteins A and B plus an unassigned one-dimensional 1H-NMR spectrum of the complex AB. We present a new scoring function for evaluating and ranking potential complex structures produced by a docking algorithm. The scoring function computes a 'theoretical' 1H-NMR spectrum for each tentative complex structure and subtracts the calculated spectrum from the experimental one. The absolute areas of the difference spectra are then used to rank the potential complex structures. In contrast to formerly published approaches (e.g. [Morelli et al. (2000) Biochemistry, 39, 2530–2537]) we do not use distance constraints (intermolecular NOE constraints). We have tested the approach with four protein complexes whose three-dimensional structures are stored in the PDB data bank [Bernstein et al. (1977)] and whose 1H-NMR shift assignments are available from the BMRB database. The best result was obtained for an example where all standard scoring functions failed completely. Here, our new scoring function achieved an almost perfect separation between good approximations of the true complex structure and false positives.
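
The ranking step described above (subtract the calculated spectrum from the experimental one, then rank by the absolute area of the difference) can be sketched as follows. This is an illustration only: it assumes both spectra are already sampled as intensities on the same uniform chemical-shift grid, approximates the area with a simple rectangle rule, and does not attempt to compute the theoretical spectrum itself. The names `difference_area` and `rank_candidates` are hypothetical.

```python
from typing import Dict, List, Sequence


def difference_area(experimental: Sequence[float],
                    calculated: Sequence[float],
                    step: float) -> float:
    """Absolute area of the difference spectrum (rectangle rule).

    Both spectra must be intensity lists on the same uniform grid with
    spacing `step` (for example in ppm). This sampling is an assumption.
    """
    if len(experimental) != len(calculated):
        raise ValueError("spectra must be sampled on the same grid")
    return sum(abs(e - c) for e, c in zip(experimental, calculated)) * step


def rank_candidates(experimental: Sequence[float],
                    candidates: Dict[str, Sequence[float]],
                    step: float) -> List[str]:
    """Return candidate names ordered from smallest to largest area."""
    scored = [(difference_area(experimental, spectrum, step), name)
              for name, spectrum in candidates.items()]
    return [name for _, name in sorted(scored)]


if __name__ == "__main__":
    measured = [0.0, 1.0, 2.0, 1.0, 0.0]
    models = {
        "model_a": [0.0, 1.0, 2.0, 1.0, 0.0],
        "model_b": [1.0, 1.0, 1.0, 1.0, 1.0],
    }
    print(rank_candidates(measured, models, step=0.5))
```

A smaller area means the calculated spectrum of a candidate complex agrees more closely with the measurement, so the best-ranked candidate comes first in the returned list.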

7.
Akio Kitao 《Proteins》2013,81(6):1005-1016
We propose a fast clustering and reranking method, CyClus, for protein–protein docking decoys. This method enables comprehensive clustering of whole decoys generated by rigid‐body docking using cylindrical approximation of the protein–protein interface and hierarchical clustering procedures. We demonstrate the clustering and reranking of 54,000 decoy structures generated by ZDOCK for each complex within a few minutes. After parameter tuning for the test set in ZDOCK benchmark 2.0 with the ZDOCK and ZRANK scoring functions, blind tests for the incremental data in ZDOCK benchmark 3.0 and 4.0 were conducted. CyClus successfully generated smaller subsets of decoys containing near‐native decoys. For example, the number of decoys required to create subsets containing near‐native decoys with 80% probability was reduced from 22% to 50% of the number required in the original ZDOCK. Although specific ZDOCK and ZRANK results were demonstrated, the CyClus algorithm was designed to be more general and can be applied to a wide range of decoys and scoring functions by adjusting just two parameters, p and T. CyClus results were also compared to those from ClusPro. Proteins 2013; © 2012 Wiley Periodicals, Inc.

8.
9.
Selecting near‐native conformations from the immense number of conformations generated by docking programs remains a major challenge in molecular docking. We introduce DockRank, a novel approach to scoring docked conformations based on the degree to which the interface residues of the docked conformation match a set of predicted interface residues. DockRank uses interface residues predicted by partner‐specific sequence homology‐based protein–protein interface predictor (PS‐HomPPI), which predicts the interface residues of a query protein with a specific interaction partner. We compared the performance of DockRank with several state‐of‐the‐art docking scoring functions using Success Rate (the percentage of cases that have at least one near‐native conformation among the top m conformations) and Hit Rate (the percentage of near‐native conformations that are included among the top m conformations). In cases where it is possible to obtain partner‐specific (PS) interface predictions from PS‐HomPPI, DockRank consistently outperforms both (i) ZRank and IRAD, two state‐of‐the‐art energy‐based scoring functions (improving Success Rate by up to 4‐fold); and (ii) variants of DockRank that use predicted interface residues obtained from several protein interface predictors that do not take into account the binding partner in making interface predictions (improving Success Rate by up to 39‐fold). The latter result underscores the importance of using partner‐specific interface residues in scoring docked conformations. We show that DockRank, when used to re‐rank the conformations returned by ClusPro, improves upon the original ClusPro rankings in terms of both Success Rate and Hit Rate. DockRank is available as a server at http://einstein.cs.iastate.edu/DockRank/ . Proteins 2014; 82:250–267. © 2013 Wiley Periodicals, Inc.
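
The two measures defined in the abstract above, Success Rate and Hit Rate at a cutoff m, can be computed from ranked lists of near-native flags. The sketch below is a minimal illustration: each docking case is a list of booleans ordered best-ranked first, and Hit Rate is pooled over all cases, which is an assumption because the abstract does not say whether it is pooled or averaged per case. The function names are hypothetical.

```python
from typing import List


def success_rate(cases: List[List[bool]], m: int) -> float:
    """Percentage of cases with at least one near-native conformation
    among the top m ranked conformations."""
    if not cases:
        return 0.0
    successes = sum(1 for ranked in cases if any(ranked[:m]))
    return 100.0 * successes / len(cases)


def hit_rate(cases: List[List[bool]], m: int) -> float:
    """Percentage of all near-native conformations that fall within the
    top m of their case (pooled over cases; pooling is an assumption)."""
    total_near_native = sum(sum(ranked) for ranked in cases)
    if total_near_native == 0:
        return 0.0
    in_top_m = sum(sum(ranked[:m]) for ranked in cases)
    return 100.0 * in_top_m / total_near_native


if __name__ == "__main__":
    # True marks a near-native conformation; lists are best-ranked first.
    example = [
        [True, False, True, False],
        [False, False, True, False],
    ]
    print(success_rate(example, m=2), hit_rate(example, m=2))
```

Success Rate rewards placing any one near-native conformation early in each case, whereas Hit Rate rewards pulling many near-native conformations into the top m, so the two can move independently when a scoring function is changed.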

10.
HADDOCK is one of the few docking programs that can explicitly account for water molecules in the docking process. Its solvated docking protocol starts from hydrated molecules and a fraction of the resulting interfacial waters is subsequently removed in a biased Monte Carlo procedure based on water‐mediated contact probabilities. The latter were derived from an analysis of water contact frequencies from high‐resolution crystal structures. Here, we introduce a simple water‐mediated amino acid–amino acid contact probability scale derived from the Kyte‐Doolittle hydrophobicity scale and assess its performance on the largest high‐resolution dataset developed to date for solvated docking. Both scales yield high‐quality docking results. The novel and simple hydrophobicity scale, which should better reflect the physicochemical principles underlying contact propensities, leads to a performance improvement of around 10% in ranking, cluster quality and water recovery at the interface compared with the statistics‐based original solvated docking protocol. Proteins 2013. © 2012 Wiley Periodicals, Inc.

11.
The identification of protein–protein interactions is vital for understanding protein function, elucidating interaction mechanisms, and for practical applications in drug discovery. With the exponentially growing protein sequence data, fully automated computational methods that predict interactions between proteins are becoming essential components of system‐level function inference. A thorough analysis of protein complex structures demonstrated that binding site locations as well as the interfacial geometry are highly conserved across evolutionarily related proteins. Because the conformational space of protein–protein interactions is highly covered by experimental structures, sensitive protein threading techniques can be used to identify suitable templates for the accurate prediction of interfacial residues. Toward this goal, we developed eFindSitePPI, an algorithm that uses the three‐dimensional structure of a target protein, evolutionarily remotely related templates and machine learning techniques to predict binding residues. Using crystal structures, the average sensitivity (specificity) of eFindSitePPI in interfacial residue prediction is 0.46 (0.92). For weakly homologous protein models, these values only slightly decrease to 0.40–0.43 (0.91–0.92) demonstrating that eFindSitePPI performs well not only using experimental data but also tolerates structural imperfections in computer‐generated structures. In addition, eFindSitePPI detects specific molecular interactions at the interface; for instance, it correctly predicts approximately one half of hydrogen bonds and aromatic interactions, as well as one third of salt bridges and hydrophobic contacts. Comparative benchmarks against several dimer datasets show that eFindSitePPI outperforms other methods for protein‐binding residue prediction. It also features a carefully tuned confidence estimation system, which is particularly useful in large‐scale applications using raw genomic data. 
eFindSitePPI is freely available to the academic community at http://www.brylinski.org/efindsiteppi . Copyright © 2014 John Wiley & Sons, Ltd.

12.
Identifying correct binding modes in a large set of models is an important step in protein–protein docking. We identified a protein docking filter based on overlap area that significantly reduces the number of candidate structures that require detailed examination. We also developed potentials based on residue contacts and overlap areas using a comprehensive learning set of 640 two‐chain protein complexes with mathematical programming. Our potential showed substantially better recognition capacity compared to other publicly accessible protein docking potentials in discriminating between native and nonnative binding modes on a large test set of 84 complexes independent of our training set. We were able to rank a near‐native model at the top in 43 cases and within the top 10 in 51 cases. We also report an atomic potential that ranks a near‐native model at the top in 46 cases and within the top 10 in 58 cases. Our filter+potential is well suited for selecting a small set of models to be refined to atomic resolution. Proteins 2010. © 2009 Wiley‐Liss, Inc.

13.
Mihaly Mezei 《Proteins》2017,85(2):235-241
The recently developed statistical measure for the type of residue–residue contact at protein complex interfaces, based on a parameter‐free definition of contact, has been used to define a contact score that is correlated with the likelihood of correctness of a proposed complex structure. Comparing the proposed contact scores on the native structure and on a set of model structures, the proposed measure was shown to generally favor the native structure but in itself was not able to reliably score the native structure to be the best. Adjusting the scores of redocking experiments with the contact score showed that the adjusted score was able to move up the ranking of the native‐like structure among the proposed complexes when the native‐like structure was not ranked the best by the respective program. Tests on docking of unbound proteins compared the contact scores of the complexes with the contact score of the crystal structure, again showing the tendency of the contact score to favor native‐like conformations. The possibility of using the contact score to improve the determination of biological dimers in a crystal structure was also explored. Proteins 2017; 85:235–241. © 2016 Wiley Periodicals, Inc.

14.
Elucidation of signaling events in a pathogen is potentially important to tackle the infection caused by it. Such events mediated by protein phosphorylation play important roles in infection, and therefore, to predict the phosphosites and substrates of the serine/threonine protein kinases, we have developed a machine learning-based approach for Mycobacterium tuberculosis serine/threonine protein kinases using kinase-peptide structure–sequence data. This approach utilizes features derived from kinase three-dimensional-structure environment and known phosphosite sequences to generate support vector machine (SVM)-based kinase-specific predictions of phosphosites of serine/threonine protein kinases (STPKs) with no or scarce data of their substrates. SVM performed best among the four machine learning algorithms we tried (random forest, logistic regression, SVM, and k-nearest neighbors) with an area under the curve receiver-operating characteristic value of 0.88 on the independent testing dataset and a 10-fold cross-validation accuracy of ~81.6% for the final model. Our predicted phosphosites of M. tuberculosis STPKs form a useful resource for experimental biologists enabling elucidation of STPK mediated posttranslational regulation of important cellular processes.

15.
How to refine a near‐native structure to make it closer to its native conformation is an unsolved problem in protein‐structure and protein–protein complex‐structure prediction. In this article, we first test several scoring functions for selecting locally resampled near‐native protein–protein docking conformations and then propose a computationally efficient protocol for structure refinement via local resampling and energy minimization. The proposed method employs a statistical energy function based on a Distance‐scaled Ideal‐gas REference state (DFIRE) as an initial filter and an empirical energy function EMPIRE (EMpirical Protein‐InteRaction Energy) for optimization and re‐ranking. Significant improvement of final top‐1 ranked structures over initial near‐native structures is observed in the ZDOCK 2.3 decoy set for Benchmark 1.0 (74% whose global rmsd reduced by 0.5 Å or more and only 7% increased by 0.5 Å or more). Less significant improvement is observed for Benchmark 2.0 (38% versus 33%). Possible reasons are discussed. Proteins 2009. © 2008 Wiley‐Liss, Inc.

16.
Structures of proteins and protein–protein complexes are determined by the same physical principles and thus share a number of similarities. At the same time, there could be differences because in order to function, proteins interact with other molecules, undergo conformational changes, and so forth, which might impose different restraints on the tertiary versus quaternary structures. This study focuses on structural properties of protein–protein interfaces in comparison with the protein core, based on the wealth of currently available structural data and new structure‐based approaches. The results showed that physicochemical characteristics, such as amino acid composition, residue–residue contact preferences, and hydrophilicity/hydrophobicity distributions, are similar in protein core and protein–protein interfaces. On the other hand, characteristics that reflect the evolutionary pressure, such as structural composition and packing, are largely different. The results provide important insight into fundamental properties of protein structure and function. At the same time, the results contribute to better understanding of the ways to dock proteins. Recent progress in predicting structures of individual proteins follows the advancement of deep learning techniques and new approaches to residue coevolution data. Protein core could potentially provide large amounts of data for application of deep learning to docking. However, our results showed that the core motifs are significantly different from those at protein–protein interfaces, and thus may not be directly useful for docking. At the same time, such difference may help to overcome a major obstacle in application of the coevolutionary data to docking: discrimination of the intramolecular information not directly relevant to docking.

17.
18.
Molecular docking is a computational method for predicting the placement of ligands in the binding sites of their receptor(s). In this review, we discuss the methodological developments that occurred in the docking field in 2012 and 2013, with a particular focus on the more difficult aspects of this computational discipline. The main challenges and therefore focal points for developments in docking, covered in this review, are receptor flexibility, solvation, scoring, and virtual screening. We specifically deal with such aspects of molecular docking and its applications as selection criteria for constructing receptor ensembles, target dependence of scoring functions, integration of higher‐level theory into scoring, implicit and explicit handling of solvation in the binding process, and comparison and evaluation of docking and scoring methods. Copyright © 2015 John Wiley & Sons, Ltd.

19.
Characterizing the nature of interaction between proteins that have not been experimentally cocrystallized requires a computational docking approach that can successfully predict the spatial conformation adopted in the complex. In this work, the Hydropathic INTeractions (HINT) force field model was used for scoring docked models in a data set of 30 high‐resolution crystallographically characterized “dry” protein–protein complexes and was shown to reliably identify native‐like models. However, most current protein–protein docking algorithms fail to explicitly account for water molecules involved in bridging interactions that mediate and stabilize the association of the protein partners, so we used HINT to illuminate the physical and chemical properties of bridging waters and account for their energetic stabilizing contributions. The HINT water Relevance metric identified the “truly” bridging waters at the 30 protein–protein interfaces and we utilized them in “solvated” docking by manually inserting them into the input files for the rigid body ZDOCK program. By accounting for these interfacial waters, a statistically significant improvement of ~24% in the average hit‐count within the top‐10 predictions for the protein–protein dataset was seen, compared to standard “dry” docking. The results also show scoring improvement, with medium and high accuracy models ranking much better than incorrect ones. These improvements can be attributed to the physical presence of water molecules that alter surface properties and better represent native shape and hydropathic complementarity between interacting partners, with concomitantly more accurate native‐like structure predictions. Proteins 2014; 82:916–932. © 2013 Wiley Periodicals, Inc.

20.
The identification of protein–protein interactions (PPIs) can lead to a better understanding of cellular functions and biological processes of proteins and contribute to the design of drugs to target disease-causing PPIs. In addition, targeting host–pathogen PPIs is useful for elucidating infection mechanisms. Although several experimental methods have been used to identify PPIs, these methods have yet to draw complete PPI networks. Hence, computational techniques are increasingly required for the prediction of potential PPIs that have never been observed experimentally. Recent high-performance sequence-based methods have contributed to the construction of PPI networks and the elucidation of pathogenetic mechanisms in specific diseases. However, the usefulness of these methods depends on the quality and quantity of training data of PPIs. In this brief review, we introduce currently available PPI databases and recent sequence-based methods for predicting PPIs. Also, we discuss key issues in this field and present future perspectives of the sequence-based PPI predictions.
