Similar literature
20 similar articles found.
1.
2.
Identifying features that effectively represent the energetic contribution of an individual interface residue to the interactions between proteins remains problematic. Here, we present several new features and show that they are more effective than conventional features. By combining the proposed features with conventional features, we develop a predictive model for interaction hot spots. Initially, 54 multifaceted features, composed of different levels of information including structure, sequence and molecular interaction information, are quantified. Then, to identify the best subset of features for predicting hot spots, feature selection is performed using a decision tree. Based on the selected features, a predictive model for hot spots is created using a support vector machine (SVM) and tested on an independent test set. Our model shows better overall predictive accuracy than previous methods such as the alanine scanning methods Robetta and FOLDEF, and the knowledge-based method KFC. Subsequent analysis yields several findings about hot spots. As expected, hot spots have a larger relative surface area burial and are more hydrophobic than other residues. Unexpectedly, however, residue conservation displays a rather complicated tendency depending on the types of protein complexes, indicating that this feature is not good for identifying hot spots. Of the selected features, the weighted atomic packing density, relative surface area burial and weighted hydrophobicity are the top 3, with the weighted atomic packing density proving to be the most effective feature for predicting hot spots. Notably, we find that hot spots are closely related to π-related interactions, especially π···π interactions.

3.
Liu Q  Wong L  Li J 《Biochimica et biophysica acta》2012,1824(12):1457-1467
Characterization of binding hot spots of protein interfaces is a fundamental study in molecular biology. Many computational methods have been proposed to identify binding hot spots. However, few studies have assessed the biological significance of binding hot spots. We introduce the notion of biological significance of a contact residue for capturing the probability of the residue occurring in or contributing to protein binding interfaces. We take a statistical Z-score approach to the assessment of the biological significance. The method has three main steps. First, the potential score of a residue is defined by using a knowledge-based potential function with relative accessible surface area calculations. A null distribution of this potential score is then generated from artifact crystal packing contacts. Finally, the Z-score significance of a contact residue with a specific potential score is determined according to this null distribution. We hypothesize that residues at binding hot spots have large absolute Z-scores, as they contribute greatly to binding free energy. Thus, we propose to use the Z-score to predict whether a contact residue is a hot spot residue. Comparison with previously reported methods on two benchmark datasets shows that this Z-score method is mostly superior to earlier methods. This article is part of a Special Issue entitled: Computational Methods for Protein Interaction and Structural Prediction.
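
The Z-score step described in this abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the function names, the use of the sample standard deviation, and the cutoff of 2.0 are assumptions made here, and the potential scores are taken as already computed.

```python
from statistics import mean, stdev


def z_score(potential_score, null_scores):
    """Z-score of one residue's potential score against a null distribution.

    null_scores stands in for scores collected from crystal packing
    contacts; how those scores are computed is outside this sketch.
    """
    mu = mean(null_scores)
    sigma = stdev(null_scores)  # sample standard deviation (an assumption)
    if sigma == 0:
        raise ValueError("null distribution has zero spread")
    return (potential_score - mu) / sigma


def is_predicted_hot_spot(potential_score, null_scores, cutoff=2.0):
    """Flag a residue when |Z| reaches the cutoff (the value is hypothetical)."""
    return abs(z_score(potential_score, null_scores)) >= cutoff
```

The absolute value is used because the abstract hypothesizes large absolute Z-scores for hot spots, so deviations in either direction count.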

4.
Energetic hot spots account for a significant portion of the total binding free energy and correlate with structurally conserved interface residues. Here, we map experimentally determined hot spots and structurally conserved residues to investigate their geometrical organization. Unfilled pockets are pockets that remain unfilled after protein-protein complexation, while complemented pockets are pockets that disappear upon binding, representing tightly fit regions. We find that structurally conserved residues and energetic hot spots are strongly favored to be located in complemented pockets, and are disfavored in unfilled pockets. For the three available protein-protein complexes with complemented pockets where both members of the complex were alanine-scanned, 62% of all hot spots (ΔΔG > 2 kcal/mol) are within these pockets, and 60% of the residues in the complemented pockets are hot spots. 93% of all red-hot residues (ΔΔG ≥ 4 kcal/mol) either protrude into or are located in complemented pockets. The occurrence of hot spots and conserved residues in complemented pockets highlights the role of local tight packing in protein associations, and rationalizes their energetic contribution and conservation. Complemented pockets and their corresponding protruding residues emerge among the most important geometric features in protein-protein interactions. By screening the solvent, this organization shields backbone hydrogen bonds and charge-charge interactions. Complemented pockets often pre-exist binding. For 18 protein-protein complexes with complemented pockets whose unbound structures are available, in 16 the pockets are identified to pre-exist in the unbound structures. The root-mean-squared deviation of the atoms lining the pockets between the bound and unbound states is as small as 0.9 Å, suggesting that such pockets constitute features of the populated native state that may be used in docking.
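
The two ΔΔG thresholds quoted in this abstract can be written as a small classifier. This is an illustrative sketch only: the function names and label strings are made up here, while the thresholds (above 2 kcal/mol for hot spots, 4 kcal/mol or more for red-hot residues) are the ones stated in the abstract.

```python
def classify_by_ddg(ddg_kcal_per_mol):
    """Label a residue from its alanine-scanning ddG (kcal/mol)."""
    if ddg_kcal_per_mol >= 4.0:
        return "red-hot"   # red-hot residues are also hot spots
    if ddg_kcal_per_mol > 2.0:
        return "hot spot"
    return "other"


def fraction_of_hot_spots_in(pocket_residues, ddg_by_residue):
    """Fraction of all hot spots (ddG > 2) that lie in the given residue set."""
    hot = [r for r, ddg in ddg_by_residue.items() if ddg > 2.0]
    if not hot:
        return 0.0
    return sum(1 for r in hot if r in pocket_residues) / len(hot)
```

A fraction computed this way corresponds to the kind of statistic the abstract reports (for example, the share of hot spots found within complemented pockets), given a pocket assignment from elsewhere.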

5.
Zhu X  Mitchell JC 《Proteins》2011,79(9):2671-2683
Hot spots constitute a small fraction of protein-protein interface residues, yet they account for a large fraction of the binding affinity. Based on our previous method (KFC), we present two new methods (KFC2a and KFC2b) that outperform other methods at hot spot prediction. A number of improvements were made in developing these new methods. First, we created a training data set that contained a similar number of hot spot and non-hot spot residues. In addition, we generated 47 different features, and different numbers of features were used to train the models to avoid over-fitting. Finally, two feature combinations were selected: One (used in KFC2a) is composed of eight features that are mainly related to solvent accessible surface area and local plasticity; the other (KFC2b) is composed of seven features, only two of which are identical to those used in KFC2a. The two models were built using support vector machines (SVM). The two KFC2 models were then tested on a mixed independent test set, and compared with other methods such as Robetta, FOLDEF, HotPoint, MINERVA, and KFC. KFC2a showed the highest predictive accuracy for hot spot residues (True Positive Rate: TPR = 0.85); however, the false positive rate was somewhat higher than for other models. KFC2b showed the best predictive accuracy for hot spot residues (True Positive Rate: TPR = 0.62) among all methods other than KFC2a, and the False Positive Rate (FPR = 0.15) was comparable with other highly predictive methods.
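
For readers comparing the TPR and FPR figures above, the two rates follow from the standard confusion-matrix definitions. The sketch below is generic and is not taken from the KFC2 code; the function name and the boolean-list input format are assumptions.

```python
def tpr_fpr(actual, predicted):
    """Return (TPR, FPR) for parallel lists of booleans.

    actual[i] is True when residue i is an experimental hot spot;
    predicted[i] is True when the model calls it a hot spot.
    """
    tp = sum(1 for a, p in zip(actual, predicted) if a and p)
    fn = sum(1 for a, p in zip(actual, predicted) if a and not p)
    fp = sum(1 for a, p in zip(actual, predicted) if not a and p)
    tn = sum(1 for a, p in zip(actual, predicted) if not a and not p)
    tpr = tp / (tp + fn) if (tp + fn) else 0.0  # TP / all real hot spots
    fpr = fp / (fp + tn) if (fp + tn) else 0.0  # FP / all real non-hot spots
    return tpr, fpr
```

A model can raise its TPR simply by predicting more residues as hot spots, which is why the abstract reports the FPR alongside it.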

6.
Proteins interact with each other through binding interfaces that differ greatly in size and physico-chemical properties. Within the binding interface, a few residues called hot spots contribute the majority of the binding free energy and are hence irreplaceable. In contrast, cold spots are occupied by suboptimal amino acids, providing the possibility of affinity enhancement through mutations. In this study, we identify cold spots due to cavities and unfavorable charge interactions in multiple protein–protein interactions (PPIs). For our cold spot analysis, we first use a small affinity database of PPIs with known structures and affinities and then expand our search to nearly 4000 homo- and heterodimers in the Protein Data Bank (PDB). We observe that cold spots due to cavities are present in nearly all PPIs unrelated to their binding affinity, while unfavorable charge interactions are relatively rare. We also find that most cold spots are located in the periphery of the binding interface, with high-affinity complexes showing fewer centrally located cold spots than low-affinity complexes. A larger number of cold spots is also found in non-cognate interactions compared to their cognate counterparts. Furthermore, our analysis reveals that cold spots are more frequent in homo-dimeric complexes compared to hetero-complexes, likely due to symmetry constraints imposed on sequences of homodimers. Finally, we find that glycines, glutamates, and arginines are the most frequent amino acids appearing at cold spot positions. Our analysis emphasizes the importance of cold spot positions to protein evolution and facilitates protein engineering studies directed at enhancing binding affinity and specificity in a wide range of applications.

7.
In this study, we propose a novel method to predict the solvent accessible surface areas of transmembrane residues. For both transmembrane alpha-helix and beta-barrel residues, the correlation coefficients between the predicted and observed accessible surface areas are around 0.65. On the basis of predicted accessible surface areas, residues exposed to the lipid environment or buried inside a protein can be identified by using certain cutoff thresholds. We have extensively examined our approach based on different definitions of accessible surface areas and a variety of sets of control parameters. Given that experimentally determining the structures of membrane proteins is very difficult and membrane proteins are actually abundant in nature, our approach is useful for theoretically modeling membrane protein tertiary structures, particularly for modeling the assembly of transmembrane domains. This approach can be used to annotate the membrane proteins in proteomes to provide extra structural and functional information.

8.
Structurally conserved residues at protein-protein interfaces correlate with the experimental alanine-scanning hot spots. Here, we investigate the organization of these conserved, computational hot spots and their contribution to the stability of protein associations. We find that computational hot spots are not homogeneously distributed along the protein interfaces; rather they are clustered within locally tightly packed regions. Within the dense clusters, they form a network of interactions and consequently their contributions to the stability of the complex are cooperative; however the contributions of independent clusters are additive. This suggests that the binding free energy is not a simple summation of the single hot spot residue contributions. As expected, around the hot spot residues we observe moderately conserved residues, further highlighting the crucial role of the conserved interactions in the local densely packed environment. The conserved occurrence of these organizations suggests that they are advantageous for protein-protein associations. Interestingly, the total number of hydrogen bonds and salt bridges contributed by hot spots is as expected. Thus, H-bond forming residues may use a "hot spot for water exclusion" mechanism. Since conserved residues are located within highly packed regions, water molecules are easily removed upon binding, strengthening electrostatic contributions of charge-charge interactions. Hence, the picture that emerges is that protein-protein associations are optimized locally, with the clustered, networked, highly packed structurally conserved residues contributing dominantly and cooperatively to the stability of the complex. When addressing the crucial question of "what are the preferred ways of proteins to associate", these findings point toward a critical involvement of hot regions in protein-protein interactions.

9.
del Sol A  O'Meara P 《Proteins》2005,58(3):672-682
We show that protein complexes can be represented as small-world networks, exhibiting a relatively small number of highly central amino-acid residues occurring frequently at protein-protein interfaces. We further base our analysis on a set of different biological examples of protein-protein interactions with experimentally validated hot spots, and show that 83% of these predicted highly central residues, which are conserved in sequence alignments and nonexposed to the solvent in the protein complex, correspond to or are in direct contact with an experimentally annotated hot spot. The remaining 17% show a general tendency to be close to an annotated hot spot. On the other hand, although there is no available experimental information on their contribution to the binding free energy, detailed analysis of their properties shows that they are good candidates for being hot spots. Thus, highly central residues have a clear tendency to be located in regions that include hot spots. We also show that some of the central residues in the protein complex interfaces are central in the monomeric structures before dimerization and that possible information relating to hot spots of binding free energy could be obtained from the unbound structures.
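
One way to make "highly central residues" concrete is to compute a centrality score on a residue contact graph. The abstract does not say which centrality measure is used, so the sketch below picks closeness centrality purely as an assumption; the graph representation (a dict mapping each residue to the residues it contacts) and the function name are also hypothetical.

```python
from collections import deque


def closeness_centrality(graph):
    """Closeness centrality for each node of an undirected contact graph.

    graph maps a residue label to an iterable of contacting residues.
    Closeness is (reachable nodes - 1) / (sum of shortest-path lengths).
    """
    centrality = {}
    for source in graph:
        # Breadth-first search gives shortest path lengths in hops.
        dist = {source: 0}
        queue = deque([source])
        while queue:
            node = queue.popleft()
            for neighbor in graph[node]:
                if neighbor not in dist:
                    dist[neighbor] = dist[node] + 1
                    queue.append(neighbor)
        total = sum(dist.values())
        centrality[source] = (len(dist) - 1) / total if total > 0 else 0.0
    return centrality
```

Residues with the highest scores would then be the candidates to compare against experimentally annotated hot spots.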

10.
Hot spot residues contribute dominantly to protein-protein interactions. Statistically, conserved residues correlate with hot spots, and their occurrence can distinguish between binding sites and the remainder of the protein surface. The hot spot and conservation analyses have been carried out on one side of the interface. Here, we show that both experimental hot spots and conserved residues tend to couple across two-chain interfaces. Intriguingly, the local packing density around both hot spots and conserved residues is higher than expected. We further observe a correlation between local packing density and experimental ΔΔG. Favorable conserved pairs include Gly coupled with aromatics, charged and polar residues, as well as aromatic residue coupling. Remarkably, charged residue couples are underrepresented. Overall, protein-protein interactions appear to consist of regions of high and low packing density, with the hot spots organized in the former. The high local packing density in binding interfaces is reminiscent of protein cores.

11.
Single nucleotide polymorphisms (SNPs) are the most frequent variation in the human genome. Nonsynonymous SNPs that lead to missense mutations can be neutral or deleterious, and several computational methods have been presented that predict the phenotype of human missense mutations. These methods use sequence-based and structure-based features in various combinations, relying on different statistical distributions of these features for deleterious and neutral mutations. One structure-based feature that has not been studied significantly is the accessible surface area within biologically relevant oligomeric assemblies. These assemblies are different from the crystallographic asymmetric unit for more than half of X-ray crystal structures. We find that mutations in the core of proteins or in the interfaces in biological assemblies are significantly more likely to be disease-associated than those on the surface of the biological assemblies. For structures with more than one protein in the biological assembly (whether the same sequence or different), we find the accessible surface area from biological assemblies provides a statistically significant improvement in prediction over the accessible surface area of monomers from protein crystal structures (P = 6e-5). When adding this information to sequence-based features such as the difference between wildtype and mutant position-specific profile scores, the improvement from biological assemblies is statistically significant but much smaller (P = 0.018). Combining this information with sequence-based features in a support vector machine leads to 82% accuracy on a balanced dataset of 50% disease-associated mutations from SwissVar and 50% neutral mutations from human/primate sequence differences in orthologous proteins. Proteins 2013. © 2012 Wiley Periodicals, Inc.

12.
The underlying physico-chemical principles of the interactions between domains in protein folding are similar to those between protein molecules in binding. Here we show that conserved residues and experimental hot spots at intermolecular binding interfaces overlap residues that vibrate with high frequencies. Similarly, conserved residues and hot spots are found in protein cores and are also observed to vibrate with high frequencies. In both cases, these residues contribute significantly to the stability. Hence, these observations validate the proposition that binding and folding are similar processes. In both, packing plays a critical role, rationalizing the residue conservation and the experimental alanine scanning hot spots. We further show that high-frequency vibrating residues distinguish between protein binding sites and the remainder of the protein surface.

13.
Protein-carbohydrate interactions play an important role in several biological processes. The mutation of amino acid residues in carbohydrate-binding proteins may alter the binding affinity, affect the functions and lead to diseases. Elucidating the factors influencing the binding affinity change (ΔΔG) of protein-carbohydrate complexes upon mutation is a challenging task. In this work, we have collected the experimental data for the binding affinity change of 318 unique mutants and related them to sequence and structural features of amino acid residues at the mutant sites. We found that accessible surface area, secondary structure, mutation preference, conservation score, hydrophobicity and contact energies are important to understand the binding affinity change upon mutation. We have developed multiple regression equations for predicting the binding affinity change upon mutation and our method showed an average correlation of 0.74 and a mean absolute error of 0.70 kcal/mol between experimental and predicted ΔΔG on a 10-fold cross-validation. Further, we have validated our method using an independent test data set of 124 (62 unique) mutations, which showed a correlation and MAE of 0.79 and 0.56 kcal/mol, respectively. We have developed a web server PCA-MutPred, Protein-CArbohydrate complex Mutation affinity Predictor, for predicting the change in binding affinity of protein–carbohydrate complexes and it is freely accessible at https://web.iitm.ac.in/bioinfo2/pcamutpred. We suggest that the method could be a useful resource for designing protein-carbohydrate complexes with desired affinities.
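
The two evaluation figures quoted in this abstract, correlation and mean absolute error between experimental and predicted ΔΔG, can be computed as in the sketch below. It assumes the correlation is the Pearson coefficient, which the abstract does not state explicitly; the function names are illustrative and unrelated to the PCA-MutPred server.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def mean_absolute_error(experimental, predicted):
    """Average absolute difference, in the same units as the inputs."""
    return sum(abs(e - p) for e, p in zip(experimental, predicted)) / len(experimental)
```

The two metrics are complementary: predictions can correlate perfectly with experiment while still being offset or scaled, which only the MAE would reveal.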

14.
Darnell SJ  Page D  Mitchell JC 《Proteins》2007,68(4):813-823
Protein-protein interactions can be altered by mutating one or more "hot spots," the subset of residues that account for most of the interface's binding free energy. The identification of hot spots requires a significant experimental effort, highlighting the practical value of hot spot predictions. We present two knowledge-based models that improve the ability to predict hot spots: K-FADE uses shape specificity features calculated by the Fast Atomic Density Evaluation (FADE) program, and K-CON uses biochemical contact features. The combined K-FADE/CON (KFC) model displays better overall predictive accuracy than computational alanine scanning (Robetta-Ala). In addition, because these methods predict different subsets of known hot spots, a large and significant increase in accuracy is achieved by combining KFC and Robetta-Ala. The KFC analysis is applied to the calmodulin (CaM)/smooth muscle myosin light chain kinase (smMLCK) interface, and to the bone morphogenetic protein-2 (BMP-2)/BMP receptor-type I (BMPR-IA) interface. The results indicate a strong correlation between KFC hot spot predictions and mutations that significantly reduce the binding affinity of the interface.

15.
In this paper we address the problem of extracting features relevant for predicting protein-protein interaction sites from the three-dimensional structures of protein complexes. Our approach is based on information about evolutionary conservation and surface disposition. We implement a neural network based system, which uses a cross validation procedure and allows the correct detection of 73% of the residues involved in protein interactions in a selected database comprising 226 heterodimers. Our analysis confirms that the chemico-physical properties of interacting surfaces are difficult to distinguish from those of the whole protein surface. However, neural networks trained with a reduced representation of the interacting patch and sequence profile are sufficient to generalize over the different features of the contact patches and to predict whether a residue in the protein surface is or is not in contact. By using a blind test, we report the prediction of the surface interacting sites of three structural components of the DnaK molecular chaperone system, and find close agreement with previously published experimental results. We propose that the predictor can significantly complement results from structural and functional proteomics.

16.
A non-redundant database of 4536 structural domains, comprising more than 790,000 residues, has been used for the calculation of their solvent accessibility in the native protein environment and then in the isolated domain environment. Nearly 140,000 (18%) residues showed a change in accessible surface area between these two conditions. General features of this change have been pointed out. Propensities of these interfacing amino acid residues have been calculated and their variation for different secondary structure types has been analyzed. The actual amount of surface area lost by different secondary structures is higher in the case of helices and strands compared to coil and other conformations. The overall change in surface area in hydrophobic and uncharged residues is higher than that in charged residues. An attempt has been made to assess the predictability of interface residues from sequence environments. This analysis and prediction results have significant implications towards determining interacting residues in proteins and for the prediction of protein-protein, protein-ligand, protein-DNA and similar interactions.
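
The comparison this abstract describes, accessible surface area (ASA) of a residue in the isolated domain versus in the native protein, can be sketched as a simple difference. The input format (dicts of per-residue ASA values in Å², computed elsewhere), the function name, and the `min_change` parameter are assumptions for illustration; the abstract does not give a threshold.

```python
def interfacing_residues(asa_isolated, asa_native, min_change=0.0):
    """Residues that lose accessible surface area in the native environment.

    asa_isolated / asa_native map a residue id to its ASA (square angstroms)
    in the isolated domain and in the full protein, respectively.
    Returns a dict of residue id -> ASA lost, for losses above min_change.
    """
    lost = {}
    for residue, isolated in asa_isolated.items():
        change = isolated - asa_native[residue]
        if change > min_change:
            lost[residue] = change
    return lost
```

With real data, the ASA values would come from a surface-area program run twice on the same coordinates, once with and once without the rest of the protein present.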

17.
Proteins are essential elements of biological systems, and their function typically relies on their ability to successfully bind to specific partners. Recently, an emphasis of study into protein interactions has been on hot spots, or residues in the binding interface that make a significant contribution to the binding energetics. In this study, we investigate how conservation of hot spots can be used to guide docking prediction. We show that the use of evolutionary data combined with hot spot prediction highlights near-native structures across a range of benchmark examples. Our approach explores various strategies for using hot spots and evolutionary data to score protein complexes, using both absolute and chemical definitions of conservation along with refinements to these strategies that look at windowed conservation and filtering to ensure a minimum number of hot spots in each binding partner. Finally, structure-based models of orthologs were generated for comparison with sequence-based scoring. Using two data sets of 22 and 85 examples, a high rate of top 10 and top 1 predictions are observed, with up to 82% of examples returning a top 10 hit and 35% returning top 1 hit depending on the data set and strategy applied; upon inclusion of the native structure among the decoys, up to 55% of examples yielded a top 1 hit. The 20 common examples between data sets show that more carefully curated interolog data yields better predictions, particularly in achieving top 1 hits. Proteins 2015; 83:1940–1946. © 2015 The Authors. Proteins: Structure, Function, and Bioinformatics Published by Wiley Periodicals, Inc.

18.
Hu Z  Ma B  Wolfson H  Nussinov R 《Proteins》2000,39(4):331-342
A number of studies have addressed the question of which are the critical residues at protein-binding sites. These studies examined either a single or a few protein-protein interfaces. The most extensive study to date has been an analysis of alanine-scanning mutagenesis. However, although the total number of mutations was large, the number of protein interfaces was small, with some of the interfaces closely related. Here we show that although overall binding sites are hydrophobic, they are studded with specific, conserved polar residues at specific locations, possibly serving as energy "hot spots." Our results confirm and generalize the alanine-scanning data analysis, despite its limited size. Previously Trp, Arg, and Tyr were shown to constitute energetic hot spots. These were rationalized by their polar interactions and by their surrounding rings of hydrophobic residues. However, there was no compelling reason as to why specifically these residues were conserved. Here we show that other polar residues are similarly conserved. These conserved residues have been detected consistently in all interface families that we have examined. Our results are based on an extensive examination of residues which are in contact across protein interfaces. We utilize all clustered interface families with at least five members and with sequence similarity between the members in the range of 20-90%. There are 11 such clustered interface families, comprising a total of 97 crystal structures. Our three-dimensional superpositioning analysis of the occurrences of matched residues in each of the families identifies conserved residues at spatially similar environments. Additionally, in enzyme inhibitors, we observe that residues are more conserved at the interfaces than at other locations. On the other hand, antibody-protein interfaces have similar surface conservation as compared to their corresponding linear sequence alignment, consistent with the suggestion that evolution has optimized protein interfaces for function.

19.
The interaction between beta-catenin and Tcf family members is crucial for the Wnt signal transduction pathway, which is commonly mutated in cancer. This interaction extends over a very large surface area (4800 Å²), and inhibiting such interactions using low molecular weight inhibitors is a challenge. However, protein surfaces frequently contain "hot spots," small patches that are the main mediators of binding affinity. By making tight interactions with a hot spot, a small molecule can compete with a protein. The Tcf3/Tcf4-binding surface on beta-catenin contains a well-defined hot spot around residues K435 and R469. A 17,700-compound subset of the Pharmacia corporate collection was docked to this hot spot with the QXP program; 22 of the best scoring compounds were put into a biophysical (NMR and ITC) screening funnel, where specific binding to beta-catenin, competition with Tcf4 and finally binding constants were determined. This process led to the discovery of three druglike, low molecular weight Tcf4-competitive compounds with the tightest binder having a K_D of 450 nM. Our approach can be used in several situations (e.g., when selecting compounds from external collections, when no biochemical functional assay is available, or when no HTS is envisioned), and it may be generally applicable to the identification of inhibitors of protein-protein interactions.

20.
Di Cui  Shuching Ou  Sandeep Patel 《Proteins》2014,82(12):3312-3326
Hydrophobic effects, often conflated with hydrophobic forces, are implicated as major determinants in biological association and self-assembly processes. Protein–protein interactions involved in signaling pathways in living systems are a prime example where hydrophobic effects have profound implications. In the context of protein–protein interactions, a priori knowledge of relevant binding interfaces (i.e., clusters of residues involved directly with binding interactions) is difficult to obtain. In the case of hydrophobically mediated interactions, hydropathy-based methods relying on single residue hydrophobicity properties are routinely and widely used to predict propensities for such residues to be present in hydrophobic interfaces. However, recent studies suggest that consideration of hydrophobicity for single residues on a protein surface requires accounting for the local environment dictated by neighboring residues and local water. In this study, we use a method derived from percolation theory to evaluate spanning water networks in the first hydration shells of a series of small proteins. We use residue-based water density and single-linkage clustering methods to predict hydrophobic regions of proteins; these regions are putatively involved in binding interactions. We find that this simple method is able to predict with sufficient accuracy and coverage the binding interface residues of a series of proteins. The approach is competitive with automated servers. The results of this study highlight the importance of accounting for the local environment in determining the hydrophobic nature of individual residues on protein surfaces. Proteins 2014; 82:3312–3326. © 2014 Wiley Periodicals, Inc.
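
Single-linkage clustering, mentioned in this abstract, groups items so that any two items closer than a cutoff end up in the same cluster, directly or through a chain of neighbors. The sketch below is a generic version under stated assumptions: the inputs are 3D coordinates of residues already selected elsewhere (for example by low local water density), and the function name and `cutoff` parameter are hypothetical rather than taken from the paper.

```python
from math import dist


def single_linkage_clusters(points, cutoff):
    """Cluster points so that any pair within `cutoff` shares a cluster.

    Returns a sorted list of clusters, each a list of point indices.
    """
    parent = list(range(len(points)))

    def find(i):
        # Follow parent links to the cluster root, compressing the path.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # Merge the clusters of every pair of points closer than the cutoff.
    for i in range(len(points)):
        for j in range(i + 1, len(points)):
            if dist(points[i], points[j]) <= cutoff:
                parent[find(i)] = find(j)

    clusters = {}
    for i in range(len(points)):
        clusters.setdefault(find(i), []).append(i)
    return sorted(clusters.values())
```

The pairwise loop is quadratic in the number of points, which is acceptable for the surface residues of a small protein; larger inputs would call for a spatial index.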
