Similar literature
 Found 20 similar articles (search time: 328 ms)
1.
Chen H, Zhou HX. Proteins 2005, 61(1): 21-35
The number of structures of protein-protein complexes deposited in the Protein Data Bank is growing rapidly. These structures embed important information for predicting structures of new protein complexes. This motivated us to develop the PPISP method for predicting interface residues in protein-protein complexes. In PPISP, sequence profiles and solvent accessibility of spatially neighboring surface residues were used as input to a neural network. The network was trained on native interface residues collected from the Protein Data Bank. The prediction accuracy at the time was 70% with 47% coverage of native interface residues. We have now extensively improved PPISP. The training set now consists of 1156 nonhomologous protein chains. A test on a set of 100 nonhomologous protein chains showed that the prediction accuracy has increased to 80% with 51% coverage. To solve the problem of over-prediction and under-prediction associated with individual neural network models, we developed a consensus method (cons-PPISP) that combines predictions from multiple models with different levels of accuracy and coverage. Applied to a benchmark set of 68 proteins for protein-protein docking, the consensus approach outperformed the best individual models by 3-8 percentage points in accuracy. To demonstrate the predictive power of cons-PPISP, eight complex-forming proteins with interfaces characterized by NMR were tested. These proteins are nonhomologous to the training set and have a total of 144 interface residues identified by chemical shift perturbation. cons-PPISP predicted 174 interface residues with 69% accuracy and 47% coverage and promises to complement experimental techniques in characterizing protein-protein interfaces.

2.
Is the whole protein surface available for interaction with other proteins, or are specific sites pre-assigned according to their biophysical and structural character? And if so, is it possible to predict the location of the binding site from the surface properties? These questions are answered quantitatively by probing the surfaces of proteins using spheres with a radius of 10 Å on a database (DB) of 57 unique, non-homologous proteins involved in heteromeric, transient protein-protein interactions for which the structures of both the unbound and bound states were determined. In structural terms, we found the binding site to have a preference for beta-sheets and for relatively long non-structured chains, but not for alpha-helices. Chemically, aromatic side-chains show a clear preference for binding sites. While the hydrophobic and polar content of the interface is similar to the rest of the surface, hydrophobic and polar residues tend to cluster in interfaces. In the crystal, the binding site has more bound water molecules surrounding it, and a lower B-factor already in the unbound protein. The same biophysical properties were found to hold for the unbound and bound DBs. All the significant interface properties were combined into ProMate, an interface prediction program. This was followed by an optimization step to choose the best combination of properties, as many of them are correlated. During optimization and prediction, the tested proteins were not used for data collection, to avoid over-fitting. The prediction algorithm is fully automated, and is used to predict the location of potential binding sites on unbound proteins with known structures. The algorithm successfully predicts the location of the interface for about 70% of the proteins. The success rate of the predictor was the same whether applied to the unbound DB or to the disjoint bound DB. A prediction is considered correct if over half of the predicted continuous interface patch is indeed interface.
The ability to predict the location of protein-protein interfaces has far-reaching implications both for our understanding of the specificity and kinetics of binding and for the analysis of the proteome.
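The correctness criterion stated in this abstract (over half of the predicted patch must be true interface) can be sketched as follows. This is only an illustration, not ProMate code: the function names and the representation of residues as plain identifiers are assumptions.

```python
def patch_is_correct(predicted_patch, true_interface):
    """Return True if more than half of the predicted patch is true interface.

    Both arguments are iterables of residue identifiers (hypothetical
    representation; the abstract does not specify one).
    """
    predicted = set(predicted_patch)
    if not predicted:
        return False
    hits = len(predicted & set(true_interface))
    return hits / len(predicted) > 0.5


def success_rate(cases):
    """Fraction of (predicted_patch, true_interface) pairs judged correct."""
    if not cases:
        return 0.0
    return sum(patch_is_correct(p, t) for p, t in cases) / len(cases)
```

Note that a patch with exactly half of its residues in the interface is not counted as correct under this reading of "over half".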

3.
We present a set of four parameters that in combination can predict DNA-binding residues on protein structures to a high degree of accuracy. These are the number of evolutionarily conserved residues (N(cons)) and their spatial clustering (ρ(e)), hydrogen bond donor capability (D(p)) and residue propensity (R(p)). We first used these parameters to characterize 130 interfaces in a set of 126 DNA-binding proteins (DBPs). The applicability of these parameters, both individually and in combination, to distinguish the true binding region from the rest of the protein surface was then analyzed. R(p) shows the best performance, identifying the true interface with the top rank in 83% of cases. Importantly, we also used the unbound-bound test cases of the protein-DNA docking benchmark to test the efficacy of our method. When applied to the unbound form of the DBPs, R(p) can distinguish 86% of cases. Finally, we applied the SVM approach to recognize the interface region using the above parameters along with the individual amino acid composition as attributes. The accuracy of prediction is 90.5% for the bound structures and 93.6% for the unbound form of the proteins.

4.
Protein-water interactions have long been recognized as a major determinant of chain folding, conformational stability, binding specificity and catalysis. However, the detailed effects of water on stabilizing protein-protein interactions remain elusive. One way to test experimentally the contribution of water-mediated interactions is to apply double mutant cycle analysis to pairs of residues that do not form direct interactions but are bridged by water. Seven such interactions within the interface between the TEM1 and BLIP proteins were evaluated. No significant interaction free energy was found for any of them. Water can bridge interactions, but can also stabilize the structure of the monomer. To distinguish between these roles, we performed a bioinformatic analysis using AQUAPROT (http://bioinfo.weizmann.ac.il/aquaprot) to determine the degree of water conservation between the bound and unbound states. In total, 29 structures of twelve complexes and 20 related monomers were analyzed. Of the 262 water molecules located within the interfaces, 145 were conserved between the unbound and bound structures. Strikingly, all 50 buried or partially buried waters in the monomer structures were conserved at the same location in the bound structures. Thus, buried waters have an important role in stabilizing the monomer fold rather than contributing to protein-protein binding, and are not replaced by residues from the incoming protein. Taken together, the experimental and bioinformatics evidence suggests that exposed waters within the interface may be good sites for protein engineering, while buried or mostly buried waters should be left unchanged.
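For readers unfamiliar with double mutant cycle analysis, the interaction free energy between two residues X and Y is commonly written as below. This is the textbook form, given here as an assumption about the convention; the abstract does not state the exact expression or sign convention the authors used.

```latex
% \Delta\Delta G denotes the change in binding free energy relative to wild type.
% X' and Y' are the mutated forms of residues X and Y (hypothetical labels).
\[
\Delta\Delta G_{\mathrm{int}}
  = \Delta\Delta G_{X \to X',\, Y \to Y'}
  - \Delta\Delta G_{X \to X'}
  - \Delta\Delta G_{Y \to Y'}
\]
```

A value of \(\Delta\Delta G_{\mathrm{int}}\) close to zero means the effects of the two single mutations are additive, which is how the finding of no significant interaction free energy for the seven water-bridged pairs would be read.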

5.
Interaction-site prediction for protein complexes: a critical assessment   Cited by: 2 (self-citations: 0, citations by others: 2)
MOTIVATION: Proteins function through interactions with other proteins and biomolecules. Protein-protein interfaces hold key information toward molecular understanding of protein function. In the past few years, there have been intensive efforts in developing methods for predicting protein interface residues. A review that presents the current status of interface prediction, gives an overview of its applications, and projects future developments is in order. SUMMARY: Interface prediction methods rely on a wide range of sequence, structural and physical attributes that distinguish interface residues from non-interface surface residues. The input data are manipulated into either a numerical value or a probability representing the potential for a residue to be inside a protein interface. Predictions are now satisfactory for complex-forming proteins that are well represented in the Protein Data Bank, but less so for under-represented ones. Future developments will be directed at tackling problems such as building structural models for multi-component structural complexes.

6.
Structural and physical properties of DNA provide important constraints on the binding sites formed on surfaces of DNA-targeting proteins. Characteristics of such binding sites may form the basis for predicting DNA-binding sites from the structures of proteins alone. Such an approach has been successfully developed for predicting protein–protein interfaces. Here this approach is adapted for predicting DNA-binding sites. We used a representative set of 264 protein–DNA complexes from the Protein Data Bank to analyze characteristics and to train and test a neural network predictor of DNA-binding sites. The input to the predictor consisted of PSI-BLAST sequence profiles and solvent accessibilities of each surface residue and 14 of its closest neighboring residues. Predicted DNA-contacting residues cover 60% of actual DNA-contacting residues and have an accuracy of 76%. This method significantly outperforms previous attempts at DNA-binding site prediction. Its application to the prion protein yielded a DNA-binding site that is consistent with recent NMR chemical shift perturbation data, suggesting that it can complement experimental techniques in characterizing protein–DNA interfaces.
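The input construction described here (a residue plus its 14 closest neighbors, each contributing a sequence profile and a solvent accessibility) might be assembled roughly as follows. This is a sketch under assumptions: one representative coordinate per residue, Euclidean distance as the neighbor criterion, and 20-column profiles are all hypothetical choices not stated in the abstract.

```python
import math


def nearest_neighbors(coords, index, k=14):
    """Indices of the k residues closest to residue `index`.

    `coords` holds one (x, y, z) point per residue; using a single
    representative point per residue is an assumption of this sketch.
    """
    center = coords[index]
    others = [(math.dist(center, c), j) for j, c in enumerate(coords) if j != index]
    others.sort()  # ties are broken by residue index
    return [j for _, j in others[:k]]


def build_input(coords, profiles, accessibility, index, k=14):
    """Concatenate profile and accessibility for a residue and its k neighbors."""
    window = [index] + nearest_neighbors(coords, index, k)
    features = []
    for j in window:
        features.extend(profiles[j])       # profile row for residue j
        features.append(accessibility[j])  # solvent accessibility of residue j
    return features
```

With 20 profile values and one accessibility per residue, a window of 15 residues gives a vector of 15 × 21 = 315 inputs.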

7.
Protein binding site prediction using an empirical scoring function   Cited by: 4 (self-citations: 1, citations by others: 3)
Liang S, Zhang C, Liu S, Zhou Y. Nucleic Acids Research 2006, 34(13): 3698-3707
Most biological processes are mediated by interactions between proteins and their interacting partners including proteins, nucleic acids and small molecules. This work establishes a method called PINUP for binding site prediction of monomeric proteins. With only two weight parameters to optimize, PINUP produces not only 42.2% coverage of actual interfaces (percentage of correctly predicted interface residues in actual interface residues) but also 44.5% accuracy in predicted interfaces (percentage of correctly predicted interface residues in the predicted interface residues) in a cross validation using a 57-protein dataset. By comparison, the expected accuracy via random prediction (percentage of actual interface residues in surface residues) is only 15%. The binding sites of the 57-protein set are found to be easier to predict than those of an independent test set of 68 proteins. The average coverage and accuracy for this independent test set are 30.5% and 29.4%, respectively. The significant gain of PINUP over expected random prediction is attributed to (i) effective residue-energy score and accessible-surface-area-dependent interface-propensity, (ii) isolation of functional constraints contained in the conservation score from the structural constraints through the combination of residue-energy score (for structural constraints) and conservation score and (iii) a consensus region built on top-ranked initial patches.
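The three quantities defined in parentheses in this abstract (coverage, accuracy, and the expected accuracy of random prediction) translate directly into set arithmetic. The sketch below follows those definitions; the function name and the use of residue identifier sets are assumptions for illustration.

```python
def interface_metrics(predicted, actual, surface):
    """Return (coverage, accuracy, random_accuracy) as fractions.

    coverage        = correctly predicted / actual interface residues
    accuracy        = correctly predicted / predicted interface residues
    random_accuracy = actual interface residues / surface residues
    """
    predicted, actual, surface = set(predicted), set(actual), set(surface)
    correct = predicted & actual
    coverage = len(correct) / len(actual) if actual else 0.0
    accuracy = len(correct) / len(predicted) if predicted else 0.0
    random_accuracy = len(actual & surface) / len(surface) if surface else 0.0
    return coverage, accuracy, random_accuracy
```

For example, with 100 surface residues, 15 true interface residues and 10 predicted residues of which 5 are correct, coverage is 5/15, accuracy is 5/10 and the random baseline is 15/100.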

8.
del Sol A, O'Meara P. Proteins 2005, 58(3): 672-682
We show that protein complexes can be represented as small-world networks, exhibiting a relatively small number of highly central amino-acid residues occurring frequently at protein-protein interfaces. We further base our analysis on a set of different biological examples of protein-protein interactions with experimentally validated hot spots, and show that 83% of these predicted highly central residues, which are conserved in sequence alignments and nonexposed to the solvent in the protein complex, correspond to or are in direct contact with an experimentally annotated hot spot. The remaining 17% show a general tendency to be close to an annotated hot spot. On the other hand, although there is no available experimental information on their contribution to the binding free energy, detailed analysis of their properties shows that they are good candidates for being hot spots. Thus, highly central residues have a clear tendency to be located in regions that include hot spots. We also show that some of the central residues in the protein complex interfaces are central in the monomeric structures before dimerization and that possible information relating to hot spots of binding free energy could be obtained from the unbound structures.

9.
The distinguishing property of Sm protein associations is their high stability. In order to understand this property, we analyzed the interface non-covalent interactions and compared the properties of the Sm protein interfaces with those of a test set, the Binding Interface Database (BID). The comparison revealed that the main differences between the interfaces of Sm proteins and those of the BID set are the content of charged residues, hydrogen bonds, salt bridges, and conservation scores of interface residues. In Sm proteins, the interfaces have more hydrophobic and fewer charged residues than the surface, which is also the case for the BID test set and other proteins. However, in the interfaces, the content of charged residues in Sm proteins (26%) is substantially larger than that in the BID set (22%). The interfaces of Sm proteins and of the test set have a similar number of hydrophobic interactions per 100 Å². The interfaces of Sm proteins have substantially more hydrogen bonds than the interfaces in the test set. The results show clearly that the interfaces of Sm proteins form more salt bridges than those of the test set. On average, there are about 16 salt bridges per interface. The high conservation score of amino acids that are involved in non-covalent interactions in protein interfaces is an additional strong argument for their importance. The overriding conclusion from this study is that the non-covalent interactions in Sm protein interfaces contribute considerably to the stability of higher order structures.

10.
MOTIVATION: Large-scale experiments reveal pairs of interacting proteins but leave the residues involved in the interactions unknown. These interface residues are essential for understanding the mechanism of interaction and are often desired drug targets. Reliable identification of residues that reside in protein-protein interfaces typically requires analysis of protein structure. Therefore, for the vast majority of proteins, for which there is no high-resolution structure, there is no effective way of identifying interface residues. RESULTS: Here we present a machine learning-based method that identifies interacting residues from sequence alone. Although the method is developed using transient protein-protein interfaces from complexes of experimentally known 3D structures, it never explicitly uses 3D information. Instead, we combine predicted structural features with evolutionary information. The strongest predictions of the method reached over 90% accuracy in a cross-validation experiment. Our results suggest that despite the significant diversity in the nature of protein-protein interactions, they all share common basic principles and that these principles are identifiable from sequence alone.

11.
Conserved residues in protein-protein interfaces correlate with residue hot-spots. To obtain insight into their roles, we have studied their mobility. We have performed 39 explicit solvent simulations of 15 complexes and their monomers, with the interfaces varying in size, shape, and function. The dynamic behavior of conserved residues in unbound monomers illustrates significantly lower flexibility as compared to their environment, suggesting that already before binding they are constrained in a boundlike configuration. To understand this behavior, we have analyzed the inter- and intrachain hydrogen-bond residence-time in the interfaces. We find that conserved residues are not involved significantly in hydrogen bonds across the interface as compared to nonconserved residues. However, the monomer simulations reveal that conserved residues contribute dominantly to hydrogen-bond formation before binding. Packing of conserved residues across the trajectories is significantly higher before and after the binding, rationalizing their lower mobility. Backbone torsional angle distributions show that conserved residues assume restricted regions of space and the most visited conformations in the bound and unbound trajectories are similar, suggesting that conserved residues are preorganized. Combined with previous studies, we conclude that conserved residues, hot spots, anchor, and interface-buried residues may be similar residues, fulfilling similar roles.

12.
Small molecules that bind at protein-protein interfaces may either block or stabilize protein-protein interactions in cells. Thus, some of these binding interfaces may turn into prospective targets for drug design. Here, we collected 175 pairs of protein-protein (PP) complexes and protein-ligand (PL) complexes with known three-dimensional structures for which (1) one protein from the PP complex shares at least 40% sequence identity with the protein from the PL complex, and (2) the interface regions of these proteins overlap at least partially with each other. We found that those residues of the interfaces that may bind the other protein as well as the small molecule are evolutionarily more conserved on average, have a higher tendency of being located in pockets and expose a smaller fraction of their surface area to the solvent than the remaining protein-protein interface region. Based on these findings we derived a statistical classifier that predicts patches at binding interfaces that have a higher tendency to bind small molecules. We applied this new prediction method to more than 10 000 interfaces from the Protein Data Bank. For several complexes related to apoptosis the predicted binding patches were in direct contact with co-crystallized small molecules.

13.
Multiprotein systems mediate most regulatory processes in living organisms. Although the structures of the individual proteins are often defined, less is known of the structures of multiprotein systems. Computational methods for predicting interfaces, using evolutionary conservation and/or physicochemical data, have been developed. Here we consider the use of solvent accessibility, residue propensity, and hydrophobicity, in conjunction with secondary structure data, as prediction parameters. We analyze the influence of residue type and secondary structure on solvent accessibility and define a measure of "relative exposedness." Clustering abnormally high scoring residues provides a basis for predicting interaction sites. The analysis is extended to investigate abnormally exposed secondary structure elements, particularly beta-sheet strands. We show that surface-exposed beta-strands lacking protective features are more likely to be found at protein-protein interfaces, allowing us to create an algorithm with approximately 68% and approximately 75% accuracy in differentiating between interacting and edge strands in isolated beta-strands and beta-sheet strands, respectively. These methods of identifying abnormally exposed surface regions are combined in an algorithm, which, on a data set of 77 unbound and disjoint (single chain extracted from complex) structures, predicts 79% of the protein-protein interfaces correctly. If enzyme-inhibitor complexes, where the inhibitor mimics a nonprotein substrate, are excluded, the accuracy increases to 85%.

14.
We compare the geometric and physical-chemical properties of interfaces involved in specific and non-specific protein-protein interactions in crystal structures reported in the Protein Data Bank. Specific interactions are illustrated by 70 protein-protein complexes and by subunit contacts in 122 homodimeric proteins; non-specific interactions are illustrated by 188 pairs of monomeric proteins making crystal-packing contacts selected to bury more than 800 Å² of protein surface. A majority of these pairs have 2-fold symmetry and form "crystal dimers" that cannot be distinguished from real dimers on the basis of the interface size or symmetry. The chemical and amino acid compositions of the large crystal-packing interfaces resemble those of the protein solvent-accessible surface. These interfaces are less hydrophobic than those in homodimers and contain far fewer fully buried atoms. We develop a residue propensity score and a hydrophobic interaction score to assess preferences seen in the chemical and amino acid compositions of the different types of interfaces, and we derive indexes to evaluate the atomic packing, which we find to be less compact at non-specific than at specific interfaces. We test the capacity of these parameters to identify homodimeric proteins in crystal structures, and show that a simple combination of the non-polar interface area and the fraction of buried interface atoms assigns the quaternary structure of 88% of the homodimers and 77% of the monomers in our data set correctly. These success rates increase to 93-95% when the residue propensity score of the interfaces is taken into consideration.

15.
Protein interactions are often accompanied by significant changes in conformation. We have analyzed the relationships between protein structures and the conformational changes they undergo upon binding. Based upon this, we introduce a simple measure, the relative solvent accessible surface area, which can be used to predict the magnitude of binding-induced conformational changes from the structures of either monomeric proteins or bound subunits. Applying this to a large set of protein complexes suggests that large conformational changes upon binding are common. In addition, we observe considerable enrichment of intrinsically disordered sequences in proteins predicted to undergo large conformational changes. Finally, we demonstrate that the relative solvent accessible surface area of monomeric proteins can be used as a simple proxy for protein flexibility. This reveals a powerful connection between the flexibility of unbound proteins and their binding-induced conformational changes, consistent with the conformational selection model of molecular recognition.

16.
Conformational changes on complex formation have been measured for 39 pairs of structures of complexed proteins and unbound equivalents, averaged over interface and non-interface regions and for individual residues. We evaluate their significance by comparison with the differences seen in 12 pairs of independently solved structures of identical proteins, and find that just over half have some substantial overall movement. Movements involve main chains as well as side chains, and large changes in the interface are closely involved with complex formation, while those of exposed non-interface residues are caused by flexibility and disorder. Interface movements in enzymes are similar in extent to those of inhibitors. All eight of the complexes (six enzyme-inhibitor and two antibody-antigen) that have structures of both components in an unbound form available show some significant interface movement. However, predictive docking is successful even when some of the largest changes occur. We note however that the situation may be different in systems other than the enzyme-inhibitors which dominate this study. Thus the general model is induced fit but, because there is only limited conformational change in many systems, recognition can be treated as lock and key to a first approximation.

17.
Mason AC, Jensen JH. Proteins 2008, 71(1): 81-91
pK(a) values of ionizable residues have been calculated using the PROPKA method and structures of 75 protein-protein complexes and their corresponding free forms. These pK(a) values were used to compute changes in protonation state of individual residues, net changes in protonation state of the complex relative to the uncomplexed proteins, and the correction to a binding energy calculated assuming standard protonation states at pH 7. For each complex, two different structures for the uncomplexed form of the proteins were used: the X-ray structures determined for the proteins in the absence of the other protein and the individual protein structures taken from the structure of the complex (referred to as unbound and bound structures, respectively). In 28% and 77% of the cases considered here, protein-protein binding is accompanied by a complete (>95%) or significant (>50%) change, respectively, in protonation state of at least one residue using unbound structures. Furthermore, in 36% and 61% of the cases, protein-protein binding is accompanied by a complete or significant net change in protonation state of the complex relative to the separated monomers. Using bound structures, the corresponding values are 12, 51, 20, and 48%. Comparison to experimental data suggests that using unbound and bound structures leads to over- and underestimation of binding-induced protonation state changes, respectively. Thus, we conclude that protein-protein binding is often associated with changes in protonation state of amino acid residues and with changes in the net protonation state of the proteins. The pH-dependent correction to the binding energy contributes at least one order of magnitude to the binding constant in 45% and 23% of the cases, using unbound and bound structures, respectively.
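To put "one order of magnitude in the binding constant" into energy units, one can use the standard relation between a free energy difference and a ratio of equilibrium constants, |ΔΔG| = RT ln(ratio). The helper below is a small worked example; the default temperature of 298.15 K and the choice of kcal/mol are assumptions, since the abstract states neither.

```python
import math

# Gas constant in kcal/(mol*K)
R_KCAL = 1.987204e-3


def free_energy_for_fold_change(fold_change, temperature_k=298.15):
    """Magnitude of the free energy change (kcal/mol) matching a fold change
    in the binding constant, from |ddG| = R * T * ln(fold_change)."""
    return R_KCAL * temperature_k * math.log(fold_change)


if __name__ == "__main__":
    # A tenfold change in the binding constant at the assumed temperature
    print(round(free_energy_for_fold_change(10.0), 2))
```

At this temperature a tenfold change corresponds to roughly 1.4 kcal/mol, so the corrections described in the abstract are at least of that size in the stated fraction of cases.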

18.
We investigate the extent to which the conformational fluctuations of proteins in solution reflect the conformational changes that they undergo when they form binary protein-protein complexes. To do this, we study a set of 41 proteins that form such complexes and whose three-dimensional structures are known, both bound in the complex and unbound. We carry out molecular dynamics simulations of each protein, starting from the unbound structure, and analyze the resulting conformational fluctuations in trajectories of 5 ns in length, comparing with the structure in the complex. It is found that fluctuations take some parts of the molecules into regions of conformational space close to the bound state (or give information about it), but at no point in the simulation does each protein as a whole sample the complete bound state. Subsequent use of conformations from a clustered MD ensemble in rigid-body docking is nevertheless partially successful when compared to docking the unbound conformations, as long as the unbound conformations are themselves included with the MD conformations and the whole set is globally rescored. For one key example where sub-domain motion is present, a ribonuclease inhibitor, principal components analysis of the MD was applied and was also able to produce conformations for docking that gave enhanced results compared to the unbound. The most significant finding is that core interface residues show a tendency to be less mobile (by size of fluctuation or entropy) than the rest of the surface even when the other binding partner is absent, and conversely the peripheral interface residues are more mobile. This surprising result, consistent across up to 40 of the 41 proteins, suggests different roles for these regions in protein recognition and binding, and suggests ways that docking algorithms could be improved by treating these regions differently in the docking process.

19.
The goal of this article is to reduce the complexity of the side chain search within docking problems. We apply six methods of generating side chain conformers to unbound protein structures and determine their ability to recover the bound conformation in small ensembles of conformers. Methods are evaluated in terms of the positions of side chain end groups. Results for 68 protein complexes yield two important observations. First, the end-group positions change less than 1 Å on association for over 60% of interface side chains. Thus, the unbound protein structure carries substantial information about the side chains in the bound state, and the inclusion of the unbound conformation into the ensemble of conformers is very beneficial. Second, considering each surface side chain separately in its protein environment, small ensembles of low-energy states include the bound conformation for a large fraction of side chains. In particular, the ensemble consisting of the unbound conformation and the two highest probability predicted conformers includes the bound conformer with an accuracy of 1 Å for 78% of interface side chains. As more than 60% of the interface side chains have only one conformer and many others only a few, these ensembles of low-energy states substantially reduce the complexity of side chain search in docking problems. This approach was already used for finding pockets in protein–protein interfaces that can bind small molecules to potentially disrupt protein–protein interactions. Side-chain search with the reduced search space will also be incorporated into protein docking algorithms. Proteins 2012. © 2011 Wiley Periodicals, Inc.

20.
Xie BB, Chen XL, Zhang XY, He HL, Zhang YZ, Zhou BC. Proteins 2008, 71(3): 1461-1474
Identification of protein interaction interfaces is very important for understanding the molecular mechanisms underlying biological phenomena. Here, we present a novel method for predicting protein interaction interfaces from sequences by using a PAM matrix (PIFPAM). Sequence alignments for interacting proteins were constructed and parsed into segments using sliding windows. By calculating a distance matrix for each segment, the correlation coefficients between segments were estimated. The interaction interfaces were predicted by extracting highly correlated segment pairs from the correlation map. The predictions achieved an accuracy of 0.41-0.71 for eight intraprotein interaction examples, and 0.07-0.60 for four interprotein interaction examples. Compared with three previously published methods, PIFPAM predicted more contacting site pairs for 11 out of the 12 example proteins, and predicted at least 34% more contacting site pairs for eight of them. The factors affecting the predictions were also analyzed. Since PIFPAM uses only the alignments of the two interacting proteins as input, it is especially useful when no three-dimensional protein structure data are available.
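The pipeline outlined in this abstract (sliding windows over alignments, a distance matrix per segment, correlation between segments, then selection of highly correlated pairs) can be illustrated with a toy sketch. This is not the PIFPAM implementation: the fraction of mismatched positions stands in for a PAM-based distance, Pearson correlation is an assumed choice, and all function names, window sizes and thresholds are hypothetical. The two alignments are assumed to list the same source organisms in the same order.

```python
import math
from itertools import combinations


def windows(alignment, size, step=1):
    """Yield (start, segment); a segment is the same column slice of every sequence."""
    length = len(alignment[0])
    for start in range(0, length - size + 1, step):
        yield start, [seq[start:start + size] for seq in alignment]


def p_distance(a, b):
    """Fraction of mismatched positions (stand-in for a PAM-based distance)."""
    return sum(x != y for x, y in zip(a, b)) / len(a)


def distance_vector(segment):
    """Upper triangle of the pairwise distance matrix, flattened to a list."""
    return [p_distance(a, b) for a, b in combinations(segment, 2)]


def pearson(x, y):
    """Pearson correlation; returns 0.0 when either vector has no variance."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)


def correlation_map(aln_a, aln_b, size, step=1):
    """Map (start_a, start_b) to the correlation of the two segments' distances."""
    vec_a = [(s, distance_vector(seg)) for s, seg in windows(aln_a, size, step)]
    vec_b = [(s, distance_vector(seg)) for s, seg in windows(aln_b, size, step)]
    return {(sa, sb): pearson(va, vb) for sa, va in vec_a for sb, vb in vec_b}


def top_pairs(cmap, threshold):
    """Segment pairs whose correlation meets the (hypothetical) threshold."""
    return sorted(pair for pair, r in cmap.items() if r >= threshold)
```

Passing the same alignment twice corresponds to the intraprotein case mentioned in the abstract, and two different alignments to the interprotein case.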
