Similar articles
Found 20 similar articles (search time: 281 ms)
1.
MOTIVATION: Protein-protein docking algorithms typically generate large numbers of possible complex structures with only a few of them resembling the native structure. Recently (Duan et al., Protein Sci, 14:316-328, 2005), it was observed that the surface density of conserved residue positions is high at the interface regions of interacting protein surfaces, except for antibody-antigen complexes, where fewer conserved positions than average are observed at the interface regions. Using this observation, we identified putative interacting regions on the surface of interacting partners and significantly improved docking results by assigning top ranks to near-native complex structures. In this paper, we combine the residue conservation information with a widely used shape complementarity algorithm to generate candidate complex structures with a higher percentage of near-native structures (hits). What is new in this work is that the conservation information is used early in the generation stage and not only in the ranking stage of the docking algorithm. This results in a significantly larger number of generated hits and an improved predictive ability in identifying the native structure of protein-protein complexes. RESULTS: We report on results from 48 well-characterized protein complexes, which have enough residue conservation information from the same 59 benchmark complexes used in our previous work. We compute conservation indices of residue positions on the surfaces of interacting proteins using available homologous sequences from UNIPROT and calculate the solvent accessible surface area. We combine this information with shape-complementarity scores to generate candidate protein-protein complex structures. When compared with pure shape-complementarity algorithms, performed by FTDock, our method results in significantly more hits, with the improvement being over 100% in many instances.
We demonstrate that residue conservation information is useful not only in the refinement and scoring of docking solutions but also in the enrichment of near-native structures during the generation of candidate geometries of complex structures.
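The abstract above does not say which conservation index was used. As a minimal sketch, assuming a simple entropy-based score (a common choice, not necessarily the authors'), a per-position conservation profile could be computed from aligned homologous sequences as follows. The function names and the 20-letter alphabet size are illustrative assumptions.

```python
import math
from collections import Counter


def column_conservation(column, alphabet_size=20):
    """Return 1 - normalized Shannon entropy for one alignment column.

    Gaps ('-') are ignored. A fully conserved column scores 1.0;
    a column made only of gaps scores 0.0 (an arbitrary convention).
    """
    residues = [c for c in column if c != "-"]
    if not residues:
        return 0.0
    total = len(residues)
    counts = Counter(residues)
    entropy = -sum((n / total) * math.log(n / total) for n in counts.values())
    return 1.0 - entropy / math.log(alphabet_size)


def conservation_profile(alignment):
    """Score every column of an alignment given as equal-length strings."""
    return [column_conservation(col) for col in zip(*alignment)]
```

In a docking pipeline, such per-residue scores would still have to be restricted to solvent-exposed positions and combined with a shape-complementarity score; how that weighting is done is not described in the abstract and is left out here.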

2.
3.
Protein interfaces are thought to be distinguishable from the rest of the protein surface by their greater degree of residue conservation. We test the validity of this approach on an expanded set of 64 protein-protein interfaces using conservation scores derived from two multiple sequence alignment types, one of close homologs/orthologs and one of diverse homologs/paralogs. Overall, we find that the interface is slightly more conserved than the rest of the protein surface when using either alignment type, with alignments of diverse homologs showing marginally better discrimination. However, using a novel surface-patch definition, we find that the interface is rarely significantly more conserved than other surface patches when using either alignment type. When an interface is among the most conserved surface patches, it tends to be part of an enzyme active site. The most conserved surface patch overlaps with 39% (+/- 28%) and 36% (+/- 28%) of the actual interface for diverse and close homologs, respectively. Contrary to results obtained from smaller data sets, this work indicates that residue conservation is rarely sufficient for complete and accurate prediction of protein interfaces. Finally, we find that obligate interfaces differ from transient interfaces in that the former have significantly fewer alignment gaps at the interface than the rest of the protein surface, as well as having buried interface residues that are more conserved than partially buried interface residues.

4.
La D, Kihara D. Proteins 2012, 80(1):126-141
Protein-protein binding events mediate many critical biological functions in the cell. Typically, functionally important sites in proteins can be well identified by considering sequence conservation. However, protein-protein interaction sites exhibit higher sequence variation than other functional regions, such as catalytic sites of enzymes. Consequently, the mutational behavior leading to weak sequence conservation poses significant challenges to protein-protein interaction site prediction. Here, we present a phylogenetic framework to capture critical sequence variations that favor the selection of residues essential for protein-protein binding. Through the comprehensive analysis of diverse protein families, we show that protein binding interfaces exhibit distinct amino acid substitution patterns as compared with other surface residues. On the basis of this analysis, we have developed a novel method, BindML, which utilizes the substitution models to predict protein-protein binding sites of proteins with unknown interacting partners. BindML estimates the likelihood that a phylogenetic tree of a local surface region in a query protein structure follows the substitution patterns of protein binding interfaces and nonbinding surfaces. BindML is shown to perform well compared to alternative methods for protein binding interface prediction. The methodology developed in this study is very versatile in the sense that it can be generally applied for predicting other types of functional sites, such as DNA, RNA, and membrane binding sites in proteins.

5.
Protein–protein interactions are essential to all aspects of life. Specific interactions result from evolutionary pressure at the interacting interfaces of partner proteins. However, evolutionary pressure is not homogeneous within the interface: for instance, each residue does not contribute equally to the binding energy of the complex. To understand functional differences between residues within the interface, we analyzed their properties in the core and rim regions. Here, we characterized protein interfaces with two evolutionary measures, conservation and coevolution, using a comprehensive dataset of 896 protein complexes. These scores can detect different selection pressures at a given position in a multiple sequence alignment. We also analyzed how the number of interactions in which a residue is involved influences those evolutionary signals. We found that the coevolutionary signal is higher in the interface core than in the interface rim region. Additionally, the difference in coevolution between core and rim regions is comparable to the known difference in conservation between those regions. Considering proteins with multiple interactions, we found that conservation and coevolution increase with the number of different interfaces in which a residue is involved, suggesting that more constraints (i.e., a residue that must satisfy a greater number of interactions) allow fewer sequence changes at those positions, resulting in higher conservation and coevolution values. These findings shed light on the evolution of protein interfaces and provide information useful for identifying protein interfaces and predicting protein–protein interactions.

6.
7.
Protein-protein complex formation involves removal of water from the interface region. Surface regions with a small free energy penalty for water removal or desolvation may correspond to preferred interaction sites. A method to calculate the electrostatic free energy of placing a neutral low-dielectric probe at various protein surface positions has been designed and applied to characterize putative interaction sites. Based on solutions of the finite-difference Poisson equation, this method also includes long-range electrostatic contributions and the protein solvent boundary shape in contrast to accessible-surface-area-based solvation energies. Calculations on a large set of proteins indicate that in many cases (>90%), the known binding site overlaps with one of the six regions of lowest electrostatic desolvation penalty (overlap with the lowest desolvation region for 48% of proteins). Since the onset of electrostatic desolvation occurs even before direct protein-protein contact formation, it may help guide proteins toward the binding region in the final stage of complex formation. It is interesting that the probe desolvation properties associated with residue types were found to depend to some degree on whether the residue was outside of or part of a binding site. The probe desolvation penalty was on average smaller if the residue was part of a binding site compared to other surface locations. Applications to several antigen-antibody complexes demonstrated that the approach might be useful not only to predict protein interaction sites in general but to map potential antigenic epitopes on protein surfaces.

8.
Amino acid residues, which play important roles in protein function, are often conserved. Here, we analyze thermodynamic and structural data of protein-DNA interactions to explore a relationship between free energy, sequence conservation and structural cooperativity. We observe that the most stabilizing residues or putative hotspots are those which occur as clusters of conserved residues. The higher packing density of the clusters and available experimental thermodynamic data of mutations suggest cooperativity between conserved residues in the clusters. Conserved singlets contribute to the stability of protein-DNA complexes to a lesser extent. We also analyze structural features of conserved residues and their clusters and examine their role in identifying DNA-binding sites. We show that about half of the observed conserved residue clusters are in the interface with the DNA, which could be identified from their amino acid composition; whereas the remaining clusters are at the protein-protein or protein-ligand interface, or embedded in the structural scaffolds. In protein-protein interfaces, conserved residues are highly correlated with experimental residue hotspots, contributing dominantly and often cooperatively to the stability of protein-protein complexes. Overall, the conservation patterns of the stabilizing residues in DNA-binding proteins also highlight the significance of clustering as compared to single residue conservation.

9.
In this paper we address the problem of extracting features relevant for predicting protein-protein interaction sites from the three-dimensional structures of protein complexes. Our approach is based on information about evolutionary conservation and surface disposition. We implement a neural network based system, which uses a cross validation procedure and allows the correct detection of 73% of the residues involved in protein interactions in a selected database comprising 226 heterodimers. Our analysis confirms that the chemico-physical properties of interacting surfaces are difficult to distinguish from those of the whole protein surface. However, neural networks trained with a reduced representation of the interacting patch and sequence profile are sufficient to generalize over the different features of the contact patches and to predict whether a residue in the protein surface is or is not in contact. By using a blind test, we report the prediction of the surface interacting sites of three structural components of the DnaK molecular chaperone system, and find close agreement with previously published experimental results. We propose that the predictor can significantly complement results from structural and functional proteomics.

10.
Protein heterodimer complexes are often involved in catalysis, regulation, assembly, immunity and inhibition. This involves the formation of stable interfaces between the interacting partners. Hence, it is of interest to describe heterodimer interfaces using known structural complexes. We use a non-redundant dataset of 192 heterodimer complex structures from the Protein Data Bank (PDB) to identify interface residues and describe their interfaces using amino acid residue property preferences. Analysis of the dataset shows that the heterodimer interfaces are often abundant in polar residues. The analysis also shows the presence of two classes of interfaces in heterodimer complexes. The first class of interfaces (class A), with more polar residues than the core but fewer than the surface, is known. These interfaces are more hydrophobic than surfaces, and protein-protein binding there is largely hydrophobic. The second class of interfaces (class B), with more polar residues than both the core and the surface, is shown. These interfaces are more polar than surfaces, and binding there is mainly polar. Thus, these findings provide insights into the understanding of protein-protein interactions.

11.
Hot spot residues contribute dominantly to protein-protein interactions. Statistically, conserved residues correlate with hot spots, and their occurrence can distinguish between binding sites and the remainder of the protein surface. The hot spot and conservation analyses have been carried out on one side of the interface. Here, we show that both experimental hot spots and conserved residues tend to couple across two-chain interfaces. Intriguingly, the local packing density around both hot spots and conserved residues is higher than expected. We further observe a correlation between local packing density and experimental ΔΔG. Favorable conserved pairs include Gly coupled with aromatics, charged and polar residues, as well as aromatic residue coupling. Remarkably, charged residue couples are underrepresented. Overall, protein-protein interactions appear to consist of regions of high and low packing density, with the hot spots organized in the former. The high local packing density in binding interfaces is reminiscent of protein cores.

12.
Multiprotein systems mediate most regulatory processes in living organisms. Although the structures of the individual proteins are often defined, less is known of the structures of multiprotein systems. Computational methods for predicting interfaces, using evolutionary conservation and/or physicochemical data, have been developed. Here we consider the use of solvent accessibility, residue propensity, and hydrophobicity, in conjunction with secondary structure data, as prediction parameters. We analyze the influence of residue type and secondary structure on solvent accessibility and define a measure of "relative exposedness." Clustering abnormally high scoring residues provides a basis for predicting interaction sites. The analysis is extended to investigate abnormally exposed secondary structure elements, particularly beta-sheet strands. We show that surface-exposed beta-strands lacking protective features are more likely to be found at protein-protein interfaces, allowing us to create an algorithm with approximately 68% and approximately 75% accuracy in differentiating between interacting and edge strands in isolated beta-strands and beta-sheet strands, respectively. These methods of identifying abnormally exposed surface regions are combined in an algorithm, which, on a data set of 77 unbound and disjoint (single chain extracted from complex) structures, predicts 79% of the protein-protein interfaces correctly. If enzyme-inhibitor complexes, where the inhibitor mimics a nonprotein substrate, are excluded, the accuracy increases to 85%.

13.
Structural analysis of a non-redundant data set of 47 immunoglobulin (Ig) proteins was carried out using a combination of criteria: atom-atom contact compatibility, position occupancy rate, conservation of residue type and positional conservation in 3D space. Our analysis shows that roughly half of the interface positions between the light and heavy chains are specific to individual structures while the other half are conserved across the database. The tendency for conservation of a primary subset of positions holds true for the intra-domain faces as well. These subsets, with an average of 12 conserved positions and a contact surface of 630 Å², delineate the inter- and intra-domain core, a refined instrument with a reduced target for analysis of sheet-sheet interactions in sandwich-like proteins. Employing this instrument, we find that a majority of Ig interface core positions are adjoined in sequence to domain core positions. This was derived independently of geometric considerations; however, beta-sheet side-chain geometry clearly dictates it. The geometric wedding of the domain and interface cores supports the concept of a rigid-like substructure on the protein surface involved in complex formation and indicates a close relationship between surface determinants and those involved in protein folding of Ig domains. The definitions developed for the Ig interface and domain cores proved satisfactory to extract first-approximation cores for a group of 24 non-Ig sandwich-like proteins, treated as individual structures due to their diverse strand topologies. We show that the same rule of positional connectivity between the rigid domain core and interface core extends generally to sandwich-like proteins interacting in a sheet-sheet fashion. The non-Ig structures were used as templates to analyze sandwich-like interfaces of unresolved homologous proteins using a database merging structure and sequence conservation.

14.

Background

Protein-protein interactions play a critical role in protein function. Completion of many genomes is being followed rapidly by major efforts to identify interacting protein pairs experimentally in order to decipher the networks of interacting, coordinated-in-action proteins. Identification of protein-protein interaction sites and detection of specific amino acids that contribute to the specificity and the strength of protein interactions is an important problem with broad applications ranging from rational drug design to the analysis of metabolic and signal transduction networks.

Results

In order to increase the power of predictive methods for protein-protein interaction sites, we have developed a consensus methodology for combining four different methods. These approaches include: data mining using Support Vector Machines, threading through protein structures, prediction of conserved residues on the protein surface by analysis of phylogenetic trees, and the Conservatism of Conservatism method of Mirny and Shakhnovich. Results obtained on a dataset of hydrolase-inhibitor complexes demonstrate that the combination of all four methods yields improved predictions over the individual methods.

Conclusions

We developed a consensus method for predicting protein-protein interface residues by combining sequence and structure-based methods. The success of our consensus approach suggests that similar methodologies can be developed to improve prediction accuracies for other bioinformatic problems.
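The abstract above does not describe how the four predictors are combined. Purely as an illustration of the general idea of a consensus, and not as the authors' scheme, a simple vote over per-method residue sets might look like the following sketch. The function name, the method labels and the `min_votes` threshold are all hypothetical.

```python
def consensus_sites(predictions, min_votes=2):
    """Return residues predicted as interface by at least `min_votes` methods.

    predictions: mapping of method name -> set of residue identifiers.
    This plain vote count is an assumed combination rule for illustration.
    """
    votes = {}
    for residues in predictions.values():
        for res in residues:
            votes[res] = votes.get(res, 0) + 1
    return {res for res, n in votes.items() if n >= min_votes}
```

Raising `min_votes` trades coverage for precision: requiring agreement from all methods keeps only the residues every predictor flags.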

15.
Dvorsky R, Ahmadian MR. EMBO reports 2004, 5(12):1130-1136
The signalling functions of Rho-family GTPases are based on the formation of distinctive protein-protein complexes. Invaluable insights into the structure-function relationships of the Rho GTPases have been obtained through the resolution of several of their structures in complex with regulators and downstream effectors. In this review, we use these complexes to compare the binding and specificity-determining sites of the Rho GTPases. Although the properties that characterize these sites are diverse, some fundamental conserved principles that govern their intermolecular interactions have emerged. Notably, all of the interacting partners of the Rho GTPases, irrespective of their function, bind to a common set of conserved amino acids that are clustered on the surface of the switch regions. This conserved region and its specific structural characteristics exemplify the convergence of the Rho GTPases on a consensus binding site.

16.
The representation of protein structures as small-world networks facilitates the search for topological determinants, which may relate to functionally important residues. Here, we aimed to investigate the performance of residue centrality, viewed as a family fold characteristic, in identifying functionally important residues in protein families. Our study is based on 46 families, including 29 enzyme and 17 non-enzyme families. A total of 80% of these central positions corresponded to active site residues or residues in direct contact with these sites. For enzyme families, this percentage increased to 91%, while for non-enzyme families the percentage decreased substantially to 48%. A total of 70% of these central positions are located in catalytic sites in the enzyme families, 64% are in hetero-atom binding sites in those families binding hetero-atoms, and only 16% belong to protein-protein interfaces in families with protein-protein interaction data. These differences reflect the active site shape: enzyme active sites locate in surface clefts, hetero-atom binding residues are in deep cavities, while protein-protein interactions involve a more planar configuration. On the other hand, not all surface cavities or clefts are comprised of central residues. Thus, closeness centrality identifies functionally important residues in enzymes. While here we focus on binding sites, we expect to identify key residues for the integration and transmission of the information to the rest of the protein, reflecting the relationship between fold and function. Residue centrality is more conserved than the protein sequence, emphasizing the robustness of protein structures.
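For readers unfamiliar with the closeness centrality mentioned above, here is a minimal sketch of the standard definition on an unweighted residue contact graph: the number of reachable nodes divided by the sum of shortest-path lengths to them. How the contact graph is built (distance cut-off, atoms used) is not given in the abstract, so the adjacency-dict input is an assumption.

```python
from collections import deque


def closeness_centrality(graph, node):
    """Closeness of `node` in an unweighted, undirected graph.

    graph: dict mapping each node to an iterable of its neighbours.
    Returns (number of reachable nodes) / (sum of shortest-path lengths),
    or 0.0 if no other node is reachable.
    """
    dist = {node: 0}
    queue = deque([node])
    while queue:  # breadth-first search gives shortest paths in hops
        u = queue.popleft()
        for v in graph[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    total = sum(dist.values())
    return (len(dist) - 1) / total if total else 0.0
```

Residues with high closeness are, on average, few contacts away from every other residue, which is the sense in which the abstract calls them central.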

17.
Small molecules that bind at protein-protein interfaces may either block or stabilize protein-protein interactions in cells. Thus, some of these binding interfaces may turn into prospective targets for drug design. Here, we collected 175 pairs of protein-protein (PP) complexes and protein-ligand (PL) complexes with known three-dimensional structures for which (1) one protein from the PP complex shares at least 40% sequence identity with the protein from the PL complex, and (2) the interface regions of these proteins overlap at least partially with each other. We found that those residues of the interfaces that may bind the other protein as well as the small molecule are evolutionarily more conserved on average, have a higher tendency of being located in pockets and expose a smaller fraction of their surface area to the solvent than the remaining protein-protein interface region. Based on these findings we derived a statistical classifier that predicts patches at binding interfaces that have a higher tendency to bind small molecules. We applied this new prediction method to more than 10 000 interfaces from the Protein Data Bank. For several complexes related to apoptosis the predicted binding patches were in direct contact with co-crystallized small molecules.

18.
Protein-protein interactions (PPI) are pivotal to numerous processes in the cell. Therefore, it is of interest to document the analysis of these interactions in terms of binding sites, topology of the interacting structures, physicochemical properties of the interacting interfaces, and the forces of interaction. The interaction interface of obligatory protein-protein complexes differs from that of transient interactions. We have created a large database of protein-protein interactions containing over 100 thousand interfaces. The structural redundancy was eliminated to obtain a non-redundant database of over 2,265 interaction interfaces. The residue interaction propensity and all of the other parametric scores converged to a statistically indistinguishable common sub-range and followed similar distribution trends for all three classes of sequence-based classifications PPInS. This indicates that the principles of molecular recognition are dependent on the preciseness of the fit in the interaction interfaces. Thus, it reinforces the importance of geometrical and electrostatic complementarity as the main determinants for PPIs.

19.
Shukla A, Guptasarma P. Proteins 2004, 57(3):548-557
We show that residues at the interfaces of protein-protein complexes have higher side-chain energy than other surface residues. Eight different sets of protein complexes were analyzed. For each protein pair, the complex structure was used to identify the interface residues in the unbound monomer structures. Side-chain energy was calculated for each surface residue in the unbound monomer using our previously developed scoring function [1]. The mean energy was calculated for the interface residues and the other surface residues. In 15 of the 16 monomers, the mean energy of the interface residues was higher than that of other surface residues. By decomposing the scoring function, we found that the energy term of the buried surface area of non-hydrogen-bonded hydrophilic atoms is the most important factor contributing to the high energy of the interface regions. In spite of lacking hydrophilic residues, the interface regions were found to be rich in buried non-hydrogen-bonded hydrophilic atoms. Although the calculation results could be affected by the inaccuracy of the scoring function, patch analysis of side-chain energy on the surface of an isolated protein may be helpful in identifying the possible protein-protein interface. A patch was defined as 20 residues surrounding the central residue on the protein surface, and patch energy was calculated as the mean value of the side-chain energy of all residues in the patch. In 12 of the studied monomers, the patch with the highest energy overlaps with the observed interface. The results are more remarkable when only the three residues with the highest energy in a patch are averaged to derive the patch energy. All three highest-energy residues of the top energy patch belong to interfacial residues in four of the eight small protomers. We also found that the residue with the highest energy score on the surface of a small protomer is very likely the key interaction residue.
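The patch-energy calculation described above can be sketched as follows. This is a minimal illustration under stated assumptions: "surrounding" is taken to mean nearest by Euclidean distance between one representative coordinate per residue, the central residue is counted as part of the patch, and the side-chain energies are supplied by the caller (the scoring function itself is not reproduced). The function and parameter names are hypothetical.

```python
import math


def patch_energy(center, coords, energies, patch_size=20, top_k=None):
    """Mean side-chain energy over the surface patch around `center`.

    coords: dict residue id -> (x, y, z) for surface residues.
    energies: dict residue id -> side-chain energy.
    The patch is the central residue plus its `patch_size` nearest
    neighbours (an assumed reading of "20 residues surrounding").
    If `top_k` is given, only the `top_k` highest energies are averaged.
    """
    neighbours = sorted(
        (r for r in coords if r != center),
        key=lambda r: math.dist(coords[center], coords[r]),
    )[:patch_size]
    values = [energies[r] for r in [center] + neighbours]
    if top_k is not None:
        values = sorted(values, reverse=True)[:top_k]
    return sum(values) / len(values)
```

Calling this for every surface residue and ranking the results gives the highest-energy patch; passing `top_k=3` corresponds to the variant in which only the three highest-energy residues of a patch are averaged.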

20.
Chen CT, Peng HP, Jian JW, Tsai KC, Chang JY, Yang EW, Chen JB, Ho SY, Hsu WL, Yang AS. PLoS ONE 2012, 7(6):e37706
Protein-protein interactions are key to many biological processes. Computational methodologies devised to predict protein-protein interaction (PPI) sites on protein surfaces are important tools in providing insights into the biological functions of proteins and in developing therapeutics targeting the protein-protein interaction sites. One of the general features of PPI sites is that the core regions from the two interacting protein surfaces are complementary to each other, similar to the interior of proteins in packing density and in the physicochemical nature of the amino acid composition. In this work, we simulated the physicochemical complementarities by constructing three-dimensional probability density maps of non-covalent interacting atoms on the protein surfaces. The interacting probabilities were derived from the interior of known structures. Machine learning algorithms were applied to learn the characteristic patterns of the probability density maps specific to the PPI sites. The trained predictors for PPI sites were cross-validated with the training cases (consisting of 432 proteins) and were tested on an independent dataset (consisting of 142 proteins). The residue-based Matthews correlation coefficient for the independent test set was 0.423; the accuracy, precision, sensitivity, specificity were 0.753, 0.519, 0.677, and 0.779 respectively. The benchmark results indicate that the optimized machine learning models are among the best predictors in identifying PPI sites on protein surfaces. In particular, the PPI site prediction accuracy increases with increasing size of the PPI site and with increasing hydrophobicity in amino acid composition of the PPI interface; the core interface regions are more likely to be recognized with high prediction confidence. 
The results indicate that the physicochemical complementarity patterns on protein surfaces are important determinants in PPIs, and a substantial portion of the PPI sites can be predicted correctly with the physicochemical complementarity features based on the non-covalent interaction data derived from protein interiors.
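The residue-based figures quoted above (Matthews correlation coefficient, accuracy, precision, sensitivity, specificity) follow from a binary confusion matrix. As a reminder of the standard definitions, here is a small sketch; the function name is hypothetical, and ratios with an empty denominator are returned as 0.0 by convention.

```python
import math


def binary_metrics(tp, fp, tn, fn):
    """Standard binary-classification metrics from confusion-matrix counts."""

    def ratio(num, den):
        return num / den if den else 0.0

    mcc_den = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return {
        "accuracy": ratio(tp + tn, tp + fp + tn + fn),
        "precision": ratio(tp, tp + fp),
        "sensitivity": ratio(tp, tp + fn),
        "specificity": ratio(tn, tn + fp),
        "mcc": ratio(tp * tn - fp * fn, mcc_den),
    }
```

Because interface residues are usually a minority of surface residues, accuracy alone can look good for a predictor that labels almost everything as non-interface; the MCC uses all four counts and is less sensitive to that imbalance.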
