Similar literature
1.
Computational docking approaches are important as a source of protein-protein complex structures and as a means to understand the principles of protein association. A key element in designing better docking approaches, including search procedures, potentials, and scoring functions, is their validation on experimentally determined structures. Thus, the databases of such structures (benchmark sets) are important. The previous, first release of the DOCKGROUND resource (Douguet et al., Bioinformatics 2006; 22:2612-2618) implemented a comprehensive database of cocrystallized (bound) protein-protein complexes in a relational database of annotated structures. The current release adds important features to the set of bound structures, such as regularly updated downloadable datasets: an automatically generated nonredundant set, built according to the most common criteria, and a manually curated set that includes only biological nonobligate complexes along with a number of additional useful characteristics. The main focus of the current release is unbound (experimental and simulated) protein-protein complexes. Complexes from the bound dataset are used to identify crystallized unbound analogs. If such analogs do not exist, the unbound structures are simulated by rotamer library optimization. Thus, the database contains comprehensive sets of complexes suitable for large-scale benchmarking of docking algorithms. Advanced methodologies for simulating unbound conformations are being explored for the next release. Future releases will include datasets of modeled protein-protein complexes, and systematic sets of docking decoys obtained by different docking algorithms. The growing DOCKGROUND resource is designed to become a comprehensive public environment for developing and validating new docking methodologies.

2.
The protein-protein docking problem is one of the focal points of activity in computational biophysics and structural biology. The three-dimensional structure of a protein-protein complex is generally more difficult to determine experimentally than the structure of an individual protein. Adequate computational techniques to model protein interactions are important because of the growing number of known protein structures, particularly in the context of structural genomics. Docking offers tools for fundamental studies of protein interactions and provides a structural basis for drug design. Protein-protein docking is the prediction of the structure of the complex, given the structures of the individual proteins. At the heart of the docking methodology is the notion of steric and physicochemical complementarity at the protein-protein interface. Originally, mostly high-resolution, experimentally determined (primarily by x-ray crystallography) protein structures were considered for docking. However, more recently, the focus has been shifting toward lower-resolution modeled structures. Docking approaches have to deal with the conformational changes between unbound and bound structures, as well as the inaccuracies of the interacting modeled structures, often in a high-throughput mode needed for modeling of large networks of protein interactions. A growing number of docking developers are engaged in the community-wide assessments of predictive methodologies. The development of more powerful and adequate docking approaches is facilitated by rapidly expanding information and data resources, growing computational capabilities, and a deeper understanding of the fundamental principles of protein interactions.

4.
DIP: the database of interacting proteins   (total citations: 24; self-citations: 3; cited by others: 21)
The Database of Interacting Proteins (DIP; http://dip.doe-mbi.ucla.edu) is a database that documents experimentally determined protein-protein interactions. This database is intended to provide the scientific community with a comprehensive and integrated tool for browsing and efficiently extracting information about protein interactions and interaction networks in biological processes. Beyond cataloging details of protein-protein interactions, the DIP is useful for understanding protein function and protein-protein relationships, studying the properties of networks of interacting proteins, benchmarking predictions of protein-protein interactions, and studying the evolution of protein-protein interactions.

5.
The rapidly growing amount of publicly available information from biomedical research is readily accessible on the Internet, providing a powerful resource for predictive biomolecular modeling. The accumulated data on experimentally determined structures transformed structure prediction of proteins and protein complexes. Instead of exploring the enormous search space, predictive tools can simply proceed to the solution based on similarity to the existing, previously determined structures. A similar major paradigm shift is emerging due to the rapidly expanding amount of information, other than experimentally determined structures, which still can be used as constraints in biomolecular structure prediction. Automated text mining has been widely used in recreating protein interaction networks, as well as in detecting small ligand binding sites on protein structures. Combining and expanding these two well-developed areas of research, we applied text mining to structural modeling of protein-protein complexes (protein docking). Protein docking can be significantly improved when constraints on the docking mode are available. We developed a procedure that retrieves published abstracts on a specific protein-protein interaction and extracts information relevant to docking. The procedure was assessed on protein complexes from Dockground (http://dockground.compbio.ku.edu). The results show that correct information on binding residues can be extracted for about half of the complexes. The amount of irrelevant information was reduced by conceptual analysis of a subset of the retrieved abstracts, based on the bag-of-words (features) approach. Support Vector Machine models were trained and validated on the subset. The remaining abstracts were filtered by the best-performing models, which decreased the irrelevant information for ~25% of the complexes in the dataset. The extracted constraints were incorporated in the docking protocol and tested on the Dockground unbound benchmark set, significantly increasing the docking success rate.
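
The bag-of-words representation mentioned in this abstract can be illustrated with a minimal sketch. It is not the authors' pipeline: the tokenizer, the function name `bag_of_words`, and the two sample sentences are assumptions made only for illustration, and the SVM training step is omitted.

```python
import re
from collections import Counter


def bag_of_words(texts):
    """Turn each text into a word-count vector over a shared vocabulary.

    Hypothetical helper: tokenization is a simple lowercase split on
    non-alphanumeric characters, which is an assumption, not the
    published procedure.
    """
    tokenized = [re.findall(r"[a-z0-9]+", text.lower()) for text in texts]
    # The vocabulary is the sorted set of all tokens seen in any text.
    vocab = sorted({tok for toks in tokenized for tok in toks})
    vectors = []
    for toks in tokenized:
        counts = Counter(toks)
        # Counter returns 0 for words that are absent from this text.
        vectors.append([counts[word] for word in vocab])
    return vocab, vectors


# Made-up example sentences standing in for retrieved abstracts.
vocab, vectors = bag_of_words([
    "Residue Arg45 binds the interface",
    "The interface residue",
])
```

Vectors of this kind are the usual input to a classifier such as an SVM; word order is discarded, which is the defining simplification of the approach.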

6.
Zhao N, Pang B, Shyu CR, Korkin D. Proteomics 2011, 11(22):4321-4330
Structural knowledge about protein-protein interactions can provide insights into the basic processes underlying cell function. Recent progress in experimental and computational structural biology has led to a rapid growth of experimentally resolved structures and computationally determined near-native models of protein-protein interactions. However, determining whether a protein-protein interaction is physiological or an artifact of an experimental or computational method remains a challenging problem. In this work, we have addressed two related problems. The first problem is distinguishing between the experimentally obtained physiological and crystal-packing protein-protein interactions. The second problem is concerned with the classification of near-native and inaccurate docking models. We first defined a universal set of interface features and employed a support vector machine (SVM)-based approach to classify the interactions for both problems, with the accuracy, precision, and recall for the first problem classifier reaching 93%. To improve the classification, we next developed a semi-supervised learning approach for the second problem, using transductive SVM (TSVM). We applied both classifiers to a commonly used protein docking benchmark of 124 complexes. We found that while we reached the classification accuracies of 78.9% for the SVM classifier and 80.3% for the TSVM classifier, improving protein-docking methods by model re-ranking remains a challenging problem.
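
The three figures of merit quoted in this abstract (accuracy, precision, recall) follow the standard binary-classification definitions. The sketch below computes them from predicted and true labels; the function name and the toy labels are assumptions, and the numbers are not taken from the paper.

```python
def classification_metrics(y_true, y_pred):
    """Return (accuracy, precision, recall) for binary labels 0/1.

    Label 1 is treated as the positive class, e.g. "physiological
    interface" or "near-native model" (an illustrative convention).
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    accuracy = (tp + tn) / len(pairs)
    # Guard against division by zero when a class is never predicted/present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return accuracy, precision, recall


# Toy labels, invented for illustration only.
metrics = classification_metrics(
    [1, 1, 1, 0, 0, 0, 1, 0],
    [1, 1, 0, 0, 0, 1, 1, 0],
)
```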

7.
Knowledge about protein interaction sites provides detailed information on protein–protein interactions (PPIs). To date, nearly 20,000 PPIs from Arabidopsis thaliana have been identified. Nevertheless, interaction site information is largely missing from previously published PPI databases. Here, AraPPISite, a database that presents fine-grained interaction details for A. thaliana PPIs, is established. First, the experimentally determined 3D structures of 27 A. thaliana PPIs are collected from the Protein Data Bank database and the predicted 3D structures of 3023 A. thaliana PPIs are modeled by using two well-established template-based docking methods. For each experimental/predicted complex structure, AraPPISite not only provides an interactive user interface for browsing interaction sites, but also lists detailed evolutionary and physicochemical properties of these sites. Second, AraPPISite assigns domain–domain interactions or domain–motif interactions to 4286 PPIs whose 3D structures cannot be modeled. In this case, users can easily query protein interaction regions at the sequence level. AraPPISite is a free and user-friendly database, which does not require user registration or any configuration on local machines. We anticipate AraPPISite can serve as a helpful database resource for users with less experience in structural biology or protein bioinformatics to probe the details of PPIs, and thus accelerate studies of plant genetics and functional genomics. AraPPISite is available at http://systbio.cau.edu.cn/arappisite/index.html.

8.
MOTIVATION: The current need for high-throughput protein interaction detection has resulted in interaction data being generated en masse through such experimental methods as yeast two-hybrid assays and protein chips. Such data can be erroneous and they often do not provide adequate functional information for the detected interactions. Therefore, it is useful to develop an in silico approach to further validate and annotate the detected protein interactions. RESULTS: Given that protein-protein interactions involve physical interactions between protein domains, domain-domain interaction information can be useful for validating, annotating, and even predicting protein interactions. However, large-scale, experimentally determined domain-domain interaction data do not exist. Here, we describe an integrative approach to computationally derive putative domain interactions from multiple data sources, including protein interactions, protein complexes, and Rosetta Stone sequences. We further demonstrate the usefulness of such an integrative approach by applying the derived domain interactions to predict and validate protein-protein interactions. AVAILABILITY: A database of putative protein domain interactions derived using the method described in this paper is available at http://interdom.lit.org.sg.

9.
NLDB (Natural Ligand DataBase; URL: http://nldb.hgc.jp) is a database of automatically collected and predicted 3D protein–ligand interactions for the enzymatic reactions of metabolic pathways registered in KEGG. Structural information about these reactions is important for studying the molecular functions of enzymes; however, a large number of the 3D interactions are still unknown. Therefore, in order to complement such missing information, we predicted protein–ligand complex structures, and constructed a database of the 3D interactions in reactions. NLDB provides three different types of data resources: the natural complexes are experimentally determined protein–ligand complex structures in PDB, the analog complexes are predicted based on known protein structures in a complex with a similar ligand, and the ab initio complexes are predicted by docking simulations. In addition, NLDB shows the known polymorphisms found in the human genome on protein structures. The database has a flexible search function based on various types of keywords, and an enrichment analysis function based on a set of KEGG compound IDs. NLDB will be a valuable resource for experimental biologists studying protein–ligand interactions in specific reactions, and for theoretical researchers wishing to undertake more precise simulations of interactions.

11.
MOTIVATION: Public resources for studying protein interfaces are necessary for better understanding of molecular recognition and developing intermolecular potentials, search procedures and scoring functions for the prediction of protein complexes. RESULTS: The first release of the DOCKGROUND resource implements a comprehensive database of co-crystallized (bound-bound) protein-protein complexes, providing a foundation for the upcoming expansion to unbound (experimental and simulated) protein-protein complexes, modeled protein-protein complexes and systematic sets of docking decoys. The bound-bound part of DOCKGROUND is a relational database of annotated structures based on the Biological Unit file (Biounit) provided by the RCSB as a separate file containing the probable biological molecule. DOCKGROUND is automatically updated to reflect the growth of PDB. It contains 67,220 pairwise complexes that rely on 14,913 Biounit entries from 34,778 PDB entries (January 30, 2006). The database includes a dynamic generation of non-redundant datasets of pairwise complexes based either on the structural similarity (SCOP classification) or on user-defined sequence identity. The growing DOCKGROUND resource is designed to become a comprehensive public environment for developing and validating new methodologies for modeling of protein interactions. AVAILABILITY: DOCKGROUND is available at http://dockground.bioinformatics.ku.edu. The current first release implements the bound-bound part.

12.
The Alanine Scanning Energetics database (ASEdb) is a searchable database of single alanine mutations in protein-protein, protein-nucleic acid, and protein-small molecule interactions for which binding affinities have been experimentally determined. In cases where structures are available, it contains surface areas of the mutated side chain and links to the PDB entries. It is useful for studying the contribution of single amino acids to the energetics of protein interactions, and can be updated by researchers as new data are generated.

13.
Prediction of protein-protein interactions at the structural level on the proteome scale is important because it allows prediction of protein function, helps drug discovery and takes steps toward genome-wide structural systems biology. We provide a protocol (termed PRISM, protein interactions by structural matching) for large-scale prediction of protein-protein interactions and assembly of protein complex structures. The method consists of two components: rigid-body structural comparisons of target proteins to known template protein-protein interfaces and flexible refinement using a docking energy function. The PRISM rationale follows our observation that globally different protein structures can interact via similar architectural motifs. PRISM predicts binding residues by using structural similarity and evolutionary conservation of putative binding residue 'hot spots'. Ultimately, PRISM could help to construct cellular pathways and functional, proteome-scale annotation. PRISM is implemented in Python and runs in a UNIX environment. The program accepts Protein Data Bank-formatted protein structures and is available at http://prism.ccbb.ku.edu.tr/prism_protocol/.

15.
Lu H, Lu L, Skolnick J. Biophysical Journal 2003, 84(3):1895-1901
A residue-based and a heavy atom-based statistical pair potential are developed for use in assessing the strength of protein-protein interactions. To ensure the quality of the potentials, a nonredundant, high-quality dimer database is constructed. The protein complexes in this dataset are checked by a literature search to confirm that they form multimers, the pairwise amino acid preference to interact across a protein-protein interface is analyzed, and pair potentials are constructed. The performance of the residue-based potentials is evaluated by using four jackknife tests and by assessing the potentials' ability to select true protein-protein interfaces from false ones. Compared to potentials developed for monomeric protein structure prediction, the interdomain potential performs much better at distinguishing protein-protein interactions. The potential developed from homodimer interfaces is almost the same as that developed from heterodimer interfaces, with a correlation coefficient of 0.92. The residue-based potential is well suited for genomic-scale protein interaction prediction and analysis, such as in a recently developed threading-based algorithm, MULTIPROSPECTOR. However, the more time-consuming atom-based potential performs better in identifying near-native structures from docking-generated decoys.
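
As a rough illustration of how a residue-based statistical pair potential can be derived from interface contacts, the sketch below uses a generic inverse-Boltzmann form, E(a, b) = -ln(N_obs / N_exp), with a simple composition-based reference state. The reference state, the function name, and the toy contact list are assumptions for illustration; the abstract does not specify the formula the authors used.

```python
import math
from collections import Counter


def pair_potential(contacts):
    """Estimate a residue pair potential from observed interface contacts.

    contacts: list of (residue_a, residue_b) tuples, treated as unordered.
    Returns {(a, b): energy} in units of kT, keyed by sorted residue names.
    Assumed reference state: contacts formed at random according to how
    often each residue type appears at contact endpoints.
    """
    pair_counts = Counter(tuple(sorted(pair)) for pair in contacts)
    total = sum(pair_counts.values())
    residue_counts = Counter(res for pair in contacts for res in pair)
    n_ends = sum(residue_counts.values())

    potential = {}
    for (a, b), n_obs in pair_counts.items():
        freq_a = residue_counts[a] / n_ends
        freq_b = residue_counts[b] / n_ends
        # Unlike pairs can arise in two orders, hence the factor of 2.
        n_exp = total * freq_a * freq_b * (1 if a == b else 2)
        potential[(a, b)] = -math.log(n_obs / n_exp)
    return potential


# Invented contact list: like pairs are enriched, the mixed pair is depleted.
toy_contacts = (
    [("LEU", "LEU")] * 4 + [("ASP", "ASP")] * 4 + [("LEU", "ASP")] * 2
)
potential = pair_potential(toy_contacts)
```

With this sign convention, pairs seen more often than expected get negative (favorable) energies and depleted pairs get positive ones. Pairs that are never observed are simply absent from the result; a production potential would need pseudocounts or a similar correction.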

16.
Structural characterization of protein‐protein interactions is essential for understanding life processes at the molecular level. However, only a fraction of protein interactions have experimentally resolved structures. Thus, reliable computational methods for structural modeling of protein interactions (protein docking) are important for generating such structures and understanding the principles of protein recognition. Template‐based docking techniques that utilize structural similarity between target protein‐protein interaction and cocrystallized protein‐protein complexes (templates) are gaining popularity due to generally higher reliability than that of the template‐free docking. However, the template‐based approach lacks explicit penalties for intermolecular penetration, as opposed to the typical free docking where such penalty is inherent due to the shape complementarity paradigm. Thus, template‐based docking models are commonly assumed to require special treatment to remove large structural penetrations. In this study, we compared clashes in the template‐based and free docking of the same proteins, with crystallographically determined and modeled structures. The results show that for the less accurate protein models, free docking produces fewer clashes than the template‐based approach. However, contrary to the common expectation, in acceptable and better quality docking models of unbound crystallographically determined proteins, the clashes in the template‐based docking are comparable to those in the free docking, due to the overall higher quality of the template‐based docking predictions. This suggests that the free docking refinement protocols can in principle be applied to the template‐based docking predictions as well. Proteins 2016; 85:39–45. © 2016 Wiley Periodicals, Inc.

17.
Currently, there is a major effort to map protein-protein interactions on a genome-wide scale. The utility of the resulting interaction networks will depend on the reliability of the experimental methods and the coverage of the approaches. Known macromolecular complexes provide a defined and objective set of protein interactions with which to compare biochemical and genetic data for validation. Here, we show that a significant fraction of the protein-protein interactions in genome-wide datasets, as well as many of the individual interactions reported in the literature, are inconsistent with the known 3D structures of three recent complexes (RNA polymerase II, Arp2/3 and the proteasome). Furthermore, comparison among genome-wide datasets, and between them and a larger (but less well resolved) group of 174 complexes, also shows marked inconsistencies. Finally, individual interaction datasets, being inherently noisy, are best used when integrated together, and we show how simple Bayesian approaches can combine them, significantly decreasing error rate.
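
One simple way to combine noisy interaction datasets in a Bayesian fashion is a naive Bayes update: the prior odds that two proteins interact are multiplied by one likelihood ratio per dataset that reports the pair. The sketch below assumes the datasets are conditionally independent, and all numbers are invented; it is not necessarily the scheme used in the paper.

```python
def combine_evidence(prior, likelihood_ratios):
    """Return the posterior probability of interaction.

    prior: prior probability that a random protein pair interacts.
    likelihood_ratios: one value per supporting dataset, defined as
        P(observed in dataset | true interaction)
        / P(observed in dataset | no interaction).
    Assumes conditional independence between datasets (naive Bayes).
    """
    if not 0.0 < prior < 1.0:
        raise ValueError("prior must be strictly between 0 and 1")
    odds = prior / (1.0 - prior)
    for ratio in likelihood_ratios:
        odds *= ratio
    # Convert posterior odds back to a probability.
    return odds / (1.0 + odds)


# Hypothetical numbers: a 1% prior and two datasets with ratios 10 and 5.
single = combine_evidence(0.01, [10.0])
combined = combine_evidence(0.01, [10.0, 5.0])
```

In this toy case one dataset alone leaves the pair unlikely (about 9%), while agreement between two independent datasets raises the posterior to about 34%, which is the sense in which integration reduces error.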

18.
Interactions between proteins underpin the life of the cell. A number of experimental, genome-wide, high-throughput studies have been devoted to the determination of protein-protein interactions and the consequent interaction networks. Here, the bioinformatics methods dealing with protein-protein interactions and interaction networks are overviewed: (1) interaction databases developed to collect and annotate this immense amount of data; (2) automated data mining techniques developed to extract information about interactions from the published literature; (3) computational methods to assess the experimental results, developed as a consequence of the finding that the results of high-throughput methods are rather inaccurate; (4) exploitation of the information provided by protein interaction networks in order to predict functional features of the proteins; and (5) prediction of protein-protein interactions.

19.
MOTIVATION: Biological processes in cells are carried out through gene regulation, signal transduction and interactions between proteins. To understand such molecular networks, we propose a statistical method to estimate gene regulatory networks and protein-protein interaction networks simultaneously from DNA microarray data, protein-protein interaction data and other genome-wide data. RESULTS: We unify Bayesian networks and Markov networks for estimating gene regulatory networks and protein-protein interaction networks according to the reliability of each biological information source. Through the simultaneous construction of gene regulatory networks and protein-protein interaction networks of the Saccharomyces cerevisiae cell cycle, we predict the role of several genes whose functions are currently unknown. By using our probabilistic model, we can detect false positives of high-throughput data, such as yeast two-hybrid data. In a genome-wide experiment, we find possible gene regulatory relationships and protein-protein interactions between large protein complexes that underlie complex regulatory mechanisms of biological processes.

20.
Characterization of life processes at the molecular level requires structural details of protein–protein interactions (PPIs). The number of experimentally determined protein structures accounts only for a fraction of known proteins. This gap has to be bridged by modeling, typically using experimentally determined structures as templates to model related proteins. The fraction of experimentally determined PPI structures is even smaller than that for the individual proteins, due to a larger number of interactions than the number of individual proteins, and a greater difficulty of crystallizing protein–protein complexes. The approaches to structural modeling of PPI (docking) often have to rely on modeled structures of the interactors, especially in the case of large PPI networks. Structures of modeled proteins are typically less accurate than the ones determined by X‐ray crystallography or nuclear magnetic resonance. Thus the utility of approaches to dock these structures should be assessed by thorough benchmarking, specifically designed for protein models. To be credible, such benchmarking has to be based on carefully curated sets of structures with levels of distortion typical for modeled proteins. This article presents such a suite of models built for the benchmark set of the X‐ray structures from the Dockground resource (http://dockground.bioinformatics.ku.edu) by a combination of homology modeling and Nudged Elastic Band method. For each monomer, six models were generated with predefined Cα root mean square deviation from the native structure (1, 2, …, 6 Å). The sets and the accompanying data provide a comprehensive resource for the development of docking methodology for modeled proteins. Proteins 2014; 82:278–287. © 2013 Wiley Periodicals, Inc.
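
The Cα root mean square deviation used in this abstract to grade model accuracy can be computed as sketched below. The sketch assumes the two structures are already optimally superimposed and have matching residue order; in practice a superposition step (for example the Kabsch algorithm) comes first. The function name and coordinates are illustrative assumptions.

```python
import math


def ca_rmsd(coords_a, coords_b):
    """Root mean square deviation between two sets of C-alpha coordinates.

    coords_a, coords_b: equal-length sequences of (x, y, z) tuples in
    Angstroms, assumed to be pre-superimposed and residue-matched.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and equal in length")
    squared_sum = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        squared_sum += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(squared_sum / len(coords_a))


# Toy three-residue chain; the "model" is the native shifted by 2 A along y.
native = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0)]
model = [(0.0, 2.0, 0.0), (3.8, 2.0, 0.0), (7.6, 2.0, 0.0)]
rmsd = ca_rmsd(native, model)
```

A rigid shift like this one would be removed entirely by superposition, so the toy case only exercises the arithmetic; real model-to-native deviations come from internal distortions.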
