Similar literature
20 similar articles found
1.
Here, we present a diverse, structurally nonredundant data set of two-chain protein-protein interfaces derived from the PDB. Using a sequence order-independent structural comparison algorithm and hierarchical clustering, 3799 interface clusters are obtained. These yield 103 clusters with at least five nonhomologous members. We divide the clusters into three types. In Type I clusters, the global structures of the chains from which the interfaces are derived are also similar. This cluster type is expected because, in general, related proteins associate in similar ways. In Type II, the interfaces are similar; however, remarkably, the overall structures and functions of the chains are different. The functional spectrum is broad, from enzymes/inhibitors to immunoglobulins and toxins. The fact that structurally different monomers associate in similar ways suggests "good" binding architectures. This observation extends a paradigm in protein science: It is well known that proteins with similar structures may have different functions. Here, we show that it extends to interfaces. In Type III clusters, only one side of the interface is similar across the cluster. This structurally nonredundant data set provides rich data for studies of protein-protein interactions and recognition, cellular networks and drug design. In particular, it may be useful in addressing the difficult question of which ways of interacting are favorable for proteins. (The data set is available at http://protein3d.ncifcrf.gov/~keskino/ and http://home.ku.edu.tr/~okeskin/INTERFACE/INTERFACES.html.)
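
The clustering step can be illustrated with a minimal sketch. It assumes a precomputed table of pairwise structural distances between interfaces and uses single-linkage merging cut at a fixed threshold; the abstract does not state the linkage rule, the similarity measure, or the threshold, so all three are assumptions, as are the names `single_linkage_clusters`, `distance`, and `threshold`.

```python
def single_linkage_clusters(names, distance, threshold):
    """Agglomerative clustering: repeatedly merge the two closest clusters.

    `distance` maps frozenset({a, b}) to a dissimilarity between interfaces
    a and b (hypothetical input; any structural comparison score would do).
    Merging stops once no pair of clusters is within `threshold`.
    """
    clusters = [{name} for name in names]
    while True:
        best = None  # (distance, index_a, index_b) of the closest pair so far
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                # Single linkage: cluster distance is the closest member pair.
                d = min(distance[frozenset((x, y))]
                        for x in clusters[a] for y in clusters[b])
                if d <= threshold and (best is None or d < best[0]):
                    best = (d, a, b)
        if best is None:
            return clusters
        _, a, b = best
        clusters[a] |= clusters[b]
        del clusters[b]  # b > a, so index a is unaffected
```

Other linkage rules (average or complete) differ only in how the distance between two clusters is computed from the member pairs.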

2.
Proteins are highly flexible molecules. Prediction of molecular flexibility aids in the comprehension and prediction of protein function and in providing details of functional mechanisms. The ability to predict the locations, directions, and extent of molecular movements can assist in fitting atomic resolution structures to low-resolution EM density maps and in predicting the complex structures of interacting molecules (docking). There are several types of molecular movements. In this work, we focus on the prediction of hinge movements. Given a single protein structure, the method automatically divides it into the rigid parts and the hinge regions connecting them. The method employs the Elastic Network Model, which is very efficient and was validated against a large data set of proteins. The output can be used in applications such as flexible protein-protein and protein-ligand docking, flexible docking of protein structures into cryo-EM maps, and refinement of low-resolution EM structures. The web server of HingeProt provides convenient visualization of the results and is available with two mirror sites at http://www.prc.boun.edu.tr/appserv/prc/HingeProt3 and http://bioinfo3d.cs.tau.ac.il/HingeProt/.

3.
SUMMARY: We present an algorithmic tool for the identification of biologically significant amino acids in proteins of known three dimensional structure. We estimate the degree of purifying selection and positive Darwinian selection at each site and project these estimates onto the molecular surface of the protein. Thus, patches of functional residues (undergoing either positive or purifying selection), which may be discontinuous in the linear sequence, are revealed. We test for the statistical significance of the site-specific scores in order to obtain reliable and valid estimates. AVAILABILITY: The Selecton web server is available at: http://selecton.bioinfo.tau.ac.il SUPPLEMENTARY INFORMATION: More information is available at http://selecton.bioinfo.tau.ac.il/overview.html. A set of examples is available at http://selecton.bioinfo.tau.ac.il/gallery.html.

4.
5.
Network propagation is a powerful tool for genetic analysis which is widely used to identify genes and genetic modules that underlie a process of interest. Here we provide a graphical, web-based platform (http://anat.cs.tau.ac.il/WebPropagate/) in which researchers can easily apply variants of this method to data sets of interest using up-to-date networks of protein–protein interactions in several organisms.
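
As an illustration of the general idea, the sketch below implements one common variant of network propagation: prior scores on seed genes are repeatedly spread to network neighbors over a degree-normalized interaction graph, with a restart term pulling back toward the prior. This is not taken from the WebPropagate platform; the function name `propagate`, the parameter `alpha`, and the normalization are assumptions chosen for the example.

```python
def propagate(adjacency, prior, alpha=0.8, tol=1e-9, max_iter=1000):
    """Iterate F <- alpha * W * F + (1 - alpha) * prior until it stabilizes.

    `adjacency` maps each node to the set of its neighbors (undirected).
    W is the symmetrically normalized adjacency, W_ij = 1 / sqrt(d_i * d_j),
    and `alpha` (assumed value) controls how far the prior signal spreads.
    """
    nodes = sorted(adjacency)
    degree = {n: len(adjacency[n]) for n in nodes}
    scores = {n: prior.get(n, 0.0) for n in nodes}
    for _ in range(max_iter):
        updated = {}
        for n in nodes:
            spread = sum(scores[m] / (degree[n] * degree[m]) ** 0.5
                         for m in adjacency[n])
            updated[n] = alpha * spread + (1 - alpha) * prior.get(n, 0.0)
        delta = max(abs(updated[n] - scores[n]) for n in nodes)
        scores = updated
        if delta < tol:
            break
    return scores
```

On a toy chain of hypothetical proteins A-B-C-D with the prior placed on A alone, the propagated score decreases with distance from A, which is the behavior used to rank candidate genes near known ones.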

6.
Here, we propose a binding site prediction method based on the high frequency end of the spectrum in the native state of the protein structural dynamics. The spectrum is obtained using an elastic network model (GNM). High frequency vibrating (HFV) residues are determined from the dynamics of the fastest modes. HFV residue clusters and the associated surface patch residues are tested for their likelihood of being located at the binding interfaces using two different data sets, the Benchmark Set of mainly enzymes and antigen/antibodies and the Cluster Set of more diverse structures. The binding interface is identified to be within 7.5 Å of the HFV residue clusters in the Benchmark Set and Cluster Set, for 77% and 70% of the structures, respectively. The success rate increases to 88% and 84%, respectively, by using the surface patches. The results suggest that concave binding interfaces, typically those of enzyme-binding sites, are enriched by the HFV residues. Thus, we expect that the association of HFV residues with the interfaces is mostly for enzymes. If, however, a binding region has invaginations and cavities, as in some of the antigen/antibodies and in cases in the Cluster data set, we expect it would be detected there too. This implies that binding sites possess several (inter-related) properties such as cavities, high packing density, conservation, and disposition for hotspots at binding surfaces. It further suggests that the high frequency vibrating residue-based approach is a potential tool for identification of regions likely to serve as protein-binding sites. The software is available at http://www.prc.boun.edu.tr/PRC/software.html.
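
A minimal sketch of the underlying computation, assuming the usual Gaussian network model formulation: residues within a distance cutoff are connected by identical springs, giving a Kirchhoff (connectivity) matrix whose largest-eigenvalue modes are the fastest ones. The 7.0 Å contact cutoff, the use of a single fastest mode, and the power-iteration solver are assumptions made for the example, not details taken from the abstract or from the authors' software.

```python
import math

def kirchhoff(coords, cutoff=7.0):
    """Build the GNM connectivity matrix from residue coordinates (in Å)."""
    n = len(coords)
    gamma = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            if math.dist(coords[i], coords[j]) <= cutoff:
                gamma[i][j] = gamma[j][i] = -1.0  # contact between i and j
                gamma[i][i] += 1.0                # diagonal holds contact counts
                gamma[j][j] += 1.0
    return gamma

def fastest_mode(gamma, iterations=200):
    """Power iteration toward the largest-eigenvalue (highest-frequency) mode.

    Returns the eigenvalue estimate and each residue's squared amplitude in
    that mode; residues with large amplitudes are HFV candidates.
    """
    n = len(gamma)
    vec = [(-1.0) ** i * (i + 1) for i in range(n)]  # arbitrary start vector
    for _ in range(iterations):
        product = [sum(gamma[i][j] * vec[j] for j in range(n)) for i in range(n)]
        norm = math.sqrt(sum(x * x for x in product))
        if norm == 0.0:
            raise ValueError("no contacts within the cutoff")
        vec = [x / norm for x in product]
    product = [sum(gamma[i][j] * vec[j] for j in range(n)) for i in range(n)]
    eigenvalue = sum(v * p for v, p in zip(vec, product))  # Rayleigh quotient
    return eigenvalue, [v * v for v in vec]
```

For a toy arrangement with one central point contacting four otherwise unconnected neighbors, the fastest mode concentrates on the central, most densely packed point, consistent with the abstract's link between HFV residues and high packing density.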

7.
Shatsky M, Nussinov R, Wolfson HJ. Proteins, 2006, 62(1):209-217
Routinely used multiple-sequence alignment methods use only sequence information. Consequently, they may produce inaccurate alignments. Multiple-structure alignment methods, on the other hand, optimize structural alignment by ignoring sequence information. Here, we present an optimization method that unifies sequence and structure information. The alignment score is based on standard amino acid substitution probabilities combined with newly computed three-dimensional structure alignment probabilities. The advantage of our alignment scheme is in its ability to produce more accurate multiple alignments. We demonstrate the usefulness of the method in three applications: 1) computing more accurate multiple-sequence alignments, 2) analyzing protein conformational changes, and 3) computation of amino acid structure-sequence conservation with application to protein-protein docking prediction. The method is available at http://bioinfo3d.cs.tau.ac.il/staccato/.

8.
The diverse range of cellular functions is performed by a limited number of protein folds existing in nature. One may similarly expect that cellular functional diversity would be covered by a limited number of protein-protein interface architectures. Here, we present 8205 interface clusters, each representing a unique interface architecture. This data set of protein-protein interfaces is analyzed and compared with older data sets. We observe that the number of both biological and crystal interfaces increases significantly compared to the number of Protein Data Bank entries. Furthermore, we find that the number of distinct interface architectures grows at a much faster rate than the number of folds and is yet to level off. We further analyze the growth trend of the functional coverage by constructing functional interaction networks from interfaces. The functional coverage is also found to steadily increase. Interestingly, we also observe that despite the diversity of interface architectures, some are more favorable and frequently used, and of particular interest, are the ones that are also preferred in single chains.

9.
FireDock: fast interaction refinement in molecular docking (cited 3 times: 0 self-citations, 3 citations by others)
Here, we present FireDock, an efficient method for the refinement and rescoring of rigid-body docking solutions. The refinement process consists of two main steps: (1) rearrangement of the interface side-chains and (2) adjustment of the relative orientation of the molecules. Our method accounts for the observation that most interface residues that are important in recognition and binding do not change their conformation significantly upon complexation. Allowing full side-chain flexibility, a common procedure in refinement methods, often causes excessive conformational changes. These changes may distort preformed structural signatures, which have been shown to be important for binding recognition. Here, we restrict side-chain movements, and thus manage to reduce the false-positive rate noticeably. In the later stages of our procedure (orientation adjustments and scoring), we smooth the atomic radii. This allows for the minor backbone and side-chain movements and increases the sensitivity of our algorithm. FireDock succeeds in ranking a near-native structure within the top 15 predictions for 83% of the 30 enzyme-inhibitor test cases, and for 78% of the 18 semiunbound antibody-antigen complexes. Our refinement procedure significantly improves the ranking of the rigid-body PatchDock algorithm for these cases. The FireDock program is fully automated. In particular, to our knowledge, FireDock's prediction results are comparable to current state-of-the-art refinement methods while its running time is significantly lower. The method is available at http://bioinfo3d.cs.tau.ac.il/FireDock/.

10.
We present a set of geometric docking algorithms for rigid, flexible, and cyclic symmetry docking. The algorithms are highly efficient and have demonstrated very good performance in CAPRI Rounds 3-5. The flexible docking algorithm, FlexDock, is unique in its ability to handle any number of hinges in the flexible molecule, without degradation in run-time performance, as compared to rigid docking. The algorithm for reconstruction of cyclically symmetric complexes successfully assembles multimolecular complexes satisfying C(n) symmetry for any n in a matter of minutes on a desktop PC. Most of the algorithms presented here are available at the Tel Aviv University Structural Bioinformatics Web server (http://bioinfo3d.cs.tau.ac.il/).

11.
Nayal M, Honig B. Proteins, 2006, 63(4):892-906
In this article we introduce a new method for the identification and the accurate characterization of protein surface cavities. The method is encoded in the program SCREEN (Surface Cavity REcognition and EvaluatioN). As a first test of the utility of our approach we used SCREEN to locate and analyze the surface cavities of a nonredundant set of 99 proteins cocrystallized with drugs. We find that this set of proteins has on average about 14 distinct cavities per protein. In all cases, a drug is bound at one (and sometimes more than one) of these cavities. Using cavity size alone as a criterion for predicting drug-binding sites yields a high balanced error rate of 15.7%, with only 71.7% coverage. Here we characterize each surface cavity by computing a comprehensive set of 408 physicochemical, structural, and geometric attributes. By applying modern machine learning techniques (Random Forests) we were able to develop a classifier that can identify drug-binding cavities with a balanced error rate of 7.2% and coverage of 88.9%. Only 18 of the 408 cavity attributes had a statistically significant role in the prediction. Of these 18 important attributes, almost all involved size and shape rather than physicochemical properties of the surface cavity. The implications of these results are discussed. A SCREEN Web server is available at http://interface.bioc.columbia.edu/screen.
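
The two figures of merit quoted above can be made concrete with a short sketch. It assumes the standard definitions: the balanced error rate is the mean of the error rates on the two classes (drug-binding and non-binding cavities), and coverage is the fraction of true drug-binding cavities that are recovered. The abstract does not spell out its definitions, so both are assumptions, and the counts in the usage note are hypothetical rather than taken from the paper.

```python
def balanced_error_rate(tp, fn, tn, fp):
    """Mean of the per-class error rates, robust to class imbalance.

    tp/fn count drug-binding cavities predicted correctly/incorrectly;
    tn/fp count non-binding cavities predicted correctly/incorrectly.
    """
    miss_rate = fn / (tp + fn)         # error on the drug-binding class
    false_alarm_rate = fp / (tn + fp)  # error on the non-binding class
    return 0.5 * (miss_rate + false_alarm_rate)

def coverage(tp, fn):
    """Fraction of true drug-binding cavities that the classifier finds."""
    return tp / (tp + fn)
```

With roughly 14 cavities per protein and typically one binding a drug, the classes are heavily imbalanced. A classifier that labels every cavity "non-binding" would score a high plain accuracy but a balanced error rate of 0.5, which is why the balanced measure is the more informative one here. For hypothetical counts `tp=90, fn=10, tn=80, fp=20`, the per-class error rates are 0.10 and 0.20.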

12.
13.
Recent enhancements and current research in the GeneCards (GC) (http://bioinfo.weizmann.ac.il/cards/) project are described, including the addition of gene expression profiles and integrated gene locations. Also highlighted are the contributions of specialized associated human gene-centric databases developed at the Weizmann Institute. These include the Unified Database (UDB) (http://bioinfo.weizmann.ac.il/udb) for human genome mapping, the human Chromosome 21 database at the Weizmann Institute (CroW 21) (http://bioinfo.weizmann.ac.il/crow21), and the Human Olfactory Receptor Data Exploratorium (HORDE) (http://bioinfo.weizmann.ac.il/HORDE). The synergistic relationships amongst these efforts have positively impacted the quality, quantity and usefulness of the GeneCards gene compendium.

14.
Bordner AJ, Abagyan R. Proteins, 2005, 60(3):353-366
Predicting protein-protein interfaces from a three-dimensional structure is a key task of computational structural proteomics. In contrast to geometrically distinct small molecule binding sites, protein-protein interfaces are notoriously difficult to predict. We generated a large nonredundant data set of 1494 true protein-protein interfaces using biological symmetry annotation where necessary. The data set was carefully analyzed and a Support Vector Machine was trained on a combination of a new robust evolutionary conservation signal with the local surface properties to predict protein-protein interfaces. Fivefold cross validation verifies the high sensitivity and selectivity of the model. As much as 97% of the predicted patches had an overlap with the true interface patch while only 22% of the surface residues were included in an average predicted patch. The model allowed the identification of potential new interfaces and the correction of mislabeled oligomeric states.

15.
The analysis and prediction of protein-protein interaction sites from structural data are restricted by the limited availability of structural complexes that represent the complete protein-protein interaction space. The domain classification schemes CATH and SCOP are normally used independently in the analysis and prediction of protein domain-domain interactions. In this article, the effect of different domain classification schemes on the number and type of domain-domain interactions observed in structural data is systematically evaluated for the SCOP and CATH hierarchies. Although there is a large overlap in domain assignments between SCOP and CATH, 23.6% of CATH interfaces had no SCOP equivalent and 37.3% of SCOP interfaces had no CATH equivalent in a nonredundant set. Therefore, combining both classifications gives an increase of between 23.6 and 37.3% in domain-domain interfaces. It is suggested that if possible, both domain classification schemes should be used together, but if only one is selected, SCOP provides better coverage than CATH. Employing both SCOP and CATH reduces the false negative rate of predictive methods, which employ homology matching to structural data to predict protein-protein interaction by an estimated 6.5%.

16.
Chen H, Zhou HX. Proteins, 2005, 61(1):21-35
The number of structures of protein-protein complexes deposited to the Protein Data Bank is growing rapidly. These structures embed important information for predicting structures of new protein complexes. This motivated us to develop the PPISP method for predicting interface residues in protein-protein complexes. In PPISP, sequence profiles and solvent accessibility of spatially neighboring surface residues were used as input to a neural network. The network was trained on native interface residues collected from the Protein Data Bank. The prediction accuracy at the time was 70% with 47% coverage of native interface residues. Now we have extensively improved PPISP. The training set now consists of 1156 nonhomologous protein chains. Tests on a set of 100 nonhomologous protein chains showed that the prediction accuracy has now increased to 80% with 51% coverage. To solve the problem of over-prediction and under-prediction associated with individual neural network models, we developed a consensus method that combines predictions from multiple models with different levels of accuracy and coverage. Applied to a benchmark set of 68 proteins for protein-protein docking, the consensus approach outperformed the best individual models by 3-8 percentage points in accuracy. To demonstrate the predictive power of cons-PPISP, eight complex-forming proteins with interfaces characterized by NMR were tested. These proteins are nonhomologous to the training set and have a total of 144 interface residues identified by chemical shift perturbation. cons-PPISP predicted 174 interface residues with 69% accuracy and 47% coverage and promises to complement experimental techniques in characterizing protein-protein interfaces.

17.
While it has been established that microRNAs (miRNAs) play key roles throughout development and are dysregulated in many human pathologies, the specific processes and pathways regulated by individual miRNAs are mostly unknown. Here, we use computational target predictions in order to automatically infer the processes affected by human miRNAs. Our approach improves upon standard statistical tools by addressing specific characteristics of miRNA regulation. Our analysis is based on a novel compendium of experimentally verified miRNA-pathway and miRNA-process associations that we constructed, which can be a useful resource by itself. Our method also predicts novel miRNA-regulated pathways, refines the annotation of miRNAs for which only crude functions are known, and assigns differential functions to miRNAs with closely related sequences. Applying our approach to groups of co-expressed genes allows us to identify miRNAs and genomic miRNA clusters with functional importance in specific stages of early human development. A full list of the predicted mRNA functions is available at http://acgt.cs.tau.ac.il/fame/.

18.
Symmetric protein complexes are abundant in the living cell. Predicting their atomic structure can shed light on the mechanism of many important biological processes. Symmetric docking methods aim to predict the structure of these complexes given the unbound structure of a single monomer, or its model. Symmetry constraints reduce the search-space of these methods and make the prediction easier compared to asymmetric protein-protein docking. However, the challenge of modeling the conformational changes that the monomer might undergo is a major obstacle. In this article, we present SymmRef, a novel method for refinement and reranking of symmetric docking solutions. The method models backbone and side-chain movements and optimizes the rigid-body orientations of the monomers. The backbone movements are modeled by normal modes minimization and the conformations of the side-chains are modeled by selecting optimal rotamers. Since solved structures of symmetric multimers show asymmetric side-chain conformations, we do not use symmetry constraints in the side-chain optimization procedure. The refined models are re-ranked according to an energy score. We tested the method on a benchmark of unbound docking challenges. The results show that the method significantly improves the accuracy and the ranking of symmetric rigid docking solutions. SymmRef is available for download at http://bioinfo3d.cs.tau.ac.il/SymmRef/download.html.

19.
We present and review coupled two-way clustering, a method designed to mine gene expression data. The method identifies submatrices of the total expression matrix, whose clustering analysis reveals partitions of samples (and genes) into biologically relevant classes. We demonstrate, on data from colon and breast cancer, that we are able to identify partitions that elude standard clustering analysis. AVAILABILITY: Free, at http://ctwc.weizmann.ac.il. SUPPLEMENTARY INFORMATION: http://www.weizmann.ac.il/physics/complex/compphys/bioinfo2/

20.
Mintseris J, Weng Z. Proteins, 2003, 53(3):629-639
The ability to analyze and compare protein-protein interactions on the structural level is critical to our understanding of various aspects of molecular recognition and the functional interplay of components of biochemical networks. In this study, we introduce atomic contact vectors (ACVs) as an intuitive way to represent the physico-chemical characteristics of a protein-protein interface as well as a way to compare interfaces to each other. We test the utility of ACVs in classification by using them to distinguish between homodimers and crystal contacts. Our results compare favorably with those reported by other authors. We then apply ACVs to mine the PDB for all known protein-protein complexes and separate transient recognition complexes from permanent oligomeric ones. Getting at the basis of this difference is important for our understanding of recognition and we achieved a success rate of 91% for distinguishing these two classes of complexes. Although accessible surface area of the interface is a major discriminating feature, we also show that there are distinct differences in the contact preferences between the two kinds of complexes. Illustrating the superiority of ACVs as a basic comparison measure over a sequence-based approach, we derive a general rule of thumb to determine whether two protein-protein interfaces are redundant. With this method, we arrive at a nonredundant set of 209 recognition complexes--the largest set reported so far.
