Similar articles
20 similar articles found (search time: 15 ms)
1.
Tuncbag N, Keskin O, Nussinov R, Gursoy A. Proteins 2012, 80(4):1239-1249
The similarity between folding and binding led us to posit the concept that the number of protein-protein interface motifs in nature is limited, and interacting protein pairs can use similar interface architectures repeatedly, even if their global folds completely vary. Thus, known protein-protein interface architectures can be used to model the complexes between two target proteins on the proteome scale, even if their global structures differ. This powerful concept is combined with a flexible refinement and global energy assessment tool. The accuracy of the method is highly dependent on the structural diversity of the interface architectures in the template dataset. Here, we validate this knowledge-based combinatorial method on the Docking Benchmark and show that it efficiently finds high-quality models for benchmark complexes and their binding regions even in the absence of template interfaces having sequence similarity to the targets. Compared to "classical" docking, it is computationally faster; as the number of target proteins increases, the difference becomes more dramatic. Further, it is able to distinguish binders from nonbinders. These features allow performing large-scale network modeling. The results on an independent target set (proteins in the p53 molecular interaction map) show that the current method can be used to predict whether a given protein pair interacts. Overall, while constrained by the diversity of the template set, this approach efficiently produces high-quality models of protein-protein complexes. We expect that with the growing number of known interface architectures, knowledge-based methods of this type will be increasingly used by the broad proteomics community.

2.
Here, we present a diverse, structurally nonredundant data set of two-chain protein-protein interfaces derived from the PDB. Using a sequence order-independent structural comparison algorithm and hierarchical clustering, 3799 interface clusters are obtained. These yield 103 clusters with at least five nonhomologous members. We divide the clusters into three types. In Type I clusters, the global structures of the chains from which the interfaces are derived are also similar. This cluster type is expected because, in general, related proteins associate in similar ways. In Type II, the interfaces are similar; however, remarkably, the overall structures and functions of the chains are different. The functional spectrum is broad, from enzymes/inhibitors to immunoglobulins and toxins. The fact that structurally different monomers associate in similar ways suggests "good" binding architectures. This observation extends a paradigm in protein science: it has long been known that proteins with similar structures may have different functions. Here, we show that it extends to interfaces. In Type III clusters, only one side of the interface is similar across the cluster. This structurally nonredundant data set provides rich data for studies of protein-protein interactions and recognition, cellular networks and drug design. In particular, it may be useful in addressing the difficult question of what are the favorable ways for proteins to interact. (The data set is available at http://protein3d.ncifcrf.gov/~keskino/ and http://home.ku.edu.tr/~okeskin/INTERFACE/INTERFACES.html.)

3.
Chen H, Zhou HX. Proteins 2005, 61(1):21-35
The number of structures of protein-protein complexes deposited to the Protein Data Bank is growing rapidly. These structures embed important information for predicting structures of new protein complexes. This motivated us to develop the PPISP method for predicting interface residues in protein-protein complexes. In PPISP, sequence profiles and solvent accessibility of spatially neighboring surface residues were used as input to a neural network. The network was trained on native interface residues collected from the Protein Data Bank. The prediction accuracy at the time was 70% with 47% coverage of native interface residues. Now we have extensively improved PPISP. The training set now consists of 1156 nonhomologous protein chains. Tests on a set of 100 nonhomologous protein chains showed that the prediction accuracy has increased to 80% with 51% coverage. To solve the problem of over-prediction and under-prediction associated with individual neural network models, we developed a consensus method that combines predictions from multiple models with different levels of accuracy and coverage. Applied to a benchmark set of 68 proteins for protein-protein docking, the consensus approach outperformed the best individual models by 3-8 percentage points in accuracy. To demonstrate the predictive power of cons-PPISP, eight complex-forming proteins with interfaces characterized by NMR were tested. These proteins are nonhomologous to the training set and have a total of 144 interface residues identified by chemical shift perturbation. cons-PPISP predicted 174 interface residues with 69% accuracy and 47% coverage and promises to complement experimental techniques in characterizing protein-protein interfaces.
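The accuracy and coverage figures quoted in this abstract can be illustrated with a minimal sketch. It assumes that accuracy means the fraction of predicted residues that are native interface residues and that coverage means the fraction of native interface residues that are predicted. The abstract does not specify the consensus rule, so the `consensus` function below substitutes a simple vote count; it is a hypothetical stand-in, not the cons-PPISP procedure, and the residue numbers in the usage note are invented.

```python
from collections import Counter


def accuracy_and_coverage(predicted, native):
    """Return (accuracy, coverage) for a set of predicted interface residues.

    Assumed definitions (not taken from the paper's methods section):
      accuracy = correct predictions / all predictions
      coverage = correct predictions / all native interface residues
    """
    predicted, native = set(predicted), set(native)
    correct = predicted & native
    accuracy = len(correct) / len(predicted) if predicted else 0.0
    coverage = len(correct) / len(native) if native else 0.0
    return accuracy, coverage


def consensus(predictions, min_votes):
    """Hypothetical consensus: keep residues predicted by at least min_votes models."""
    votes = Counter(res for model in predictions for res in set(model))
    return {res for res, count in votes.items() if count >= min_votes}
```

For example, with three toy models predicting residues `{1, 2}`, `{2, 3}` and `{2, 3, 4}`, `consensus(..., min_votes=2)` keeps residues 2 and 3. Raising `min_votes` trades coverage for accuracy, which is the tension a consensus of models with different accuracy and coverage levels is meant to balance.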

4.
Mintseris J, Weng Z. Proteins 2003, 53(3):629-639
The ability to analyze and compare protein-protein interactions on the structural level is critical to our understanding of various aspects of molecular recognition and the functional interplay of components of biochemical networks. In this study, we introduce atomic contact vectors (ACVs) as an intuitive way to represent the physico-chemical characteristics of a protein-protein interface as well as a way to compare interfaces to each other. We test the utility of ACVs in classification by using them to distinguish between homodimers and crystal contacts. Our results compare favorably with those reported by other authors. We then apply ACVs to mine the PDB for all known protein-protein complexes and separate transient recognition complexes from permanent oligomeric ones. Getting at the basis of this difference is important for our understanding of recognition and we achieved a success rate of 91% for distinguishing these two classes of complexes. Although accessible surface area of the interface is a major discriminating feature, we also show that there are distinct differences in the contact preferences between the two kinds of complexes. Illustrating the superiority of ACVs as a basic comparison measure over a sequence-based approach, we derive a general rule of thumb to determine whether two protein-protein interfaces are redundant. With this method, we arrive at a nonredundant set of 209 recognition complexes, the largest set reported so far.

5.
Garma L, Mukherjee S, Mitra P, Zhang Y. PLoS ONE 2012, 7(6):e38913
"Protein quaternary structure universe" refers to the ensemble of all protein-protein complexes across all organisms in nature. The number of quaternary folds thus corresponds to the number of ways proteins physically interact with other proteins. This study focuses on answering two basic questions: whether the number of protein-protein interactions is limited and, if so, how many different quaternary folds exist in nature. By all-to-all sequence and structure comparisons, we grouped the protein complexes in the Protein Data Bank (PDB) into 3,629 families and 1,761 folds. A statistical model was introduced to obtain the quantitative relation between the numbers of quaternary families and quaternary folds in nature. The total number of possible protein-protein interactions was estimated at around 4,000, which indicates that the current protein repository contains only 42% of quaternary folds in nature and that full coverage needs approximately a quarter century of experimental effort. The results have important implications for protein complex structural modeling and the structural genomics of protein-protein interactions.

6.
The analysis and prediction of protein-protein interaction sites from structural data are restricted by the limited availability of structural complexes that represent the complete protein-protein interaction space. The domain classification schemes CATH and SCOP are normally used independently in the analysis and prediction of protein domain-domain interactions. In this article, the effect of different domain classification schemes on the number and type of domain-domain interactions observed in structural data is systematically evaluated for the SCOP and CATH hierarchies. Although there is a large overlap in domain assignments between SCOP and CATH, 23.6% of CATH interfaces had no SCOP equivalent and 37.3% of SCOP interfaces had no CATH equivalent in a nonredundant set. Therefore, combining both classifications gives an increase of between 23.6 and 37.3% in domain-domain interfaces. It is suggested that if possible, both domain classification schemes should be used together, but if only one is selected, SCOP provides better coverage than CATH. Employing both SCOP and CATH reduces, by an estimated 6.5%, the false negative rate of predictive methods that employ homology matching to structural data to predict protein-protein interactions.

7.
The general similarity in the forces governing protein folding and protein-protein associations has led us to examine the similarity in the architectural motifs between the interfaces and the monomers. We have carried out extensive, all-against-all structural comparisons between the single-chain protein structural dataset and the interface dataset, derived both from all protein-protein complexes in the structural database and from interfaces generated via an automated crystal symmetry operation. We show that despite the absence of chain connections, the global features of the architectural motifs, present in monomers, recur in the interfaces, a reflection of the limited set of the folding patterns. However, although similarity has been observed, the details of the architectural motifs vary. In particular, the extent of the similarity correlates with the consideration of how the interface has been formed. Interfaces derived from two-state model complexes, where the chains fold cooperatively, display a considerable similarity to architectures in protein cores, as judged by the quality of their geometric superposition. On the other hand, the three-state model interfaces, representing binding of already folded molecules, manifest a larger variability and resemble the monomer architecture only in general outline. The origin of the difference between the monomers and the three-state model interfaces can be understood in terms of the different nature of the folding and the binding that are involved. Whereas in the former all degrees of freedom are available to the backbone to maximize favorable interactions, in rigid body, three-state model binding, only six degrees of freedom are allowed. Hence, residue or atom pair-wise potentials derived from protein-protein associations are expected to be less accurate, substantially increasing the number of computationally acceptable alternate binding modes (Finkelstein et al., 1995).

8.
Using a data set of aligned protein domain superfamilies of known three-dimensional structure, we compared the location of interdomain interfaces on the tertiary folds between members of distantly related protein domain superfamilies. The data set analyzed is comprised of interdomain interfaces, with domains occurring within a polypeptide chain and those between two polypeptide chains. We observe that, in general, the interfaces between protein domains are formed entirely in different locations on the tertiary folds in such pairs. This variation in the location of interface happens in protein domains involved in a wide range of functions, such as enzymes, adapters, and domains that bind protein ligands, or cofactors. While basic biochemical functionality is preserved at the domain superfamily level, the effect of biochemical function on protein assemblies is different in these protein domains related by superfamily. The divergence between proteins, in most cases, is coupled with domain recruitment, with different modes of interaction with the recruited domain. This is in complete contrast to the observation that in closely related homologous protein domains, almost always the interaction interfaces are topologically equivalent. In a small subset of interacting domains within proteins related by remote homology, we observe that the relative positioning of domains with respect to one another is preserved. Based on the analysis of multidomain proteins of known or unknown structure, we suggest that variation in protein-protein interactions in members within a superfamily could serve as diverging points in otherwise parallel metabolic or signaling pathways. We discuss a few representative cases of diverging pathways involving domains in a superfamily.

9.
Protein-protein interfaces are regions between 2 polypeptide chains that are not covalently connected. Here, we have created a nonredundant interface data set generated from all 2-chain interfaces in the Protein Data Bank. This data set is unique, since it contains clusters of interfaces with similar shapes and spatial organization of chemical functional groups. The data set allows statistical investigation of similar interfaces, as well as the identification and analysis of the chemical forces that account for the protein-protein associations. Toward this goal, we have developed I2I-SiteEngine (Interface-to-Interface SiteEngine) [Data set available at http://bioinfo3d.cs.tau.ac.il/Interfaces; Web server: http://bioinfo3d.cs.tau.ac.il/I2I-SiteEngine]. The algorithm recognizes similarities between protein-protein binding surfaces. I2I-SiteEngine is independent of the sequence or the fold of the proteins that comprise the interfaces. In addition to geometry, the method takes into account both the backbone and the side-chain physicochemical properties of the interacting atom groups. Its high efficiency makes it suitable for large-scale database searches and classifications. Below, we briefly describe the I2I-SiteEngine method. We focus on the classification process and the obtained nonredundant protein-protein interface data set. In particular, we analyze the biological significance of the clusters and present examples which illustrate that given constellations of chemical groups in protein-protein binding sites may be preferred, and are observed in proteins with different structures and different functions. We expect that these would yield further information regarding the forces stabilizing protein-protein interactions.

10.
La D, Kihara D. Proteins 2012, 80(1):126-141
Protein-protein binding events mediate many critical biological functions in the cell. Typically, functionally important sites in proteins can be well identified by considering sequence conservation. However, protein-protein interaction sites exhibit higher sequence variation than other functional regions, such as catalytic sites of enzymes. Consequently, the mutational behavior leading to weak sequence conservation poses significant challenges to protein-protein interaction site prediction. Here, we present a phylogenetic framework to capture critical sequence variations that favor the selection of residues essential for protein-protein binding. Through the comprehensive analysis of diverse protein families, we show that protein binding interfaces exhibit distinct amino acid substitutions compared with other surface residues. On the basis of this analysis, we have developed a novel method, BindML, which utilizes the substitution models to predict protein-protein binding sites of proteins with unknown interacting partners. BindML estimates the likelihood that a phylogenetic tree of a local surface region in a query protein structure follows the substitution patterns of protein binding interfaces and nonbinding surfaces. BindML is shown to perform well compared to alternative methods for protein binding interface prediction. The methodology developed in this study is very versatile in the sense that it can be generally applied for predicting other types of functional sites, such as DNA, RNA, and membrane binding sites in proteins.

11.
Cho KI, Lee K, Lee KH, Kim D, Lee D. Proteins 2006, 65(3):593-606
In this study, we investigate what types of interactions are specific to their biological function, and what types of interactions are persistent regardless of their functional category in transient protein-protein heterocomplexes. This is the first approach to analyze protein-protein interfaces systematically at the molecular interaction level in the context of protein functions. We perform systematic analysis at the molecular interaction level using classification and feature subset selection techniques prevalent in the field of pattern recognition. To represent the physicochemical properties of protein-protein interfaces, we design 18 molecular interaction types using canonical and noncanonical interactions. Then, we construct an input vector using the frequency of each interaction type in each protein-protein interface. We analyze the 131 interfaces of transient protein-protein heterocomplexes in PDB: 33 protease-inhibitors, 52 antibody-antigens, 46 signaling proteins including 4 cyclin-dependent kinases and 26 G-proteins. Using kNN classification and feature subset selection, we show that there are specific interaction types based on their functional category, and such interaction types are conserved through the common binding mechanism, rather than through sequence or structure conservation. The extracted interaction types are C(alpha)--H...O==C interaction, cation...anion interaction, amine...amine interaction, and amine...cation interaction. With these four interaction types, we achieve a classification success rate of up to 83.2% with leave-one-out cross-validation at k = 15. Of these four interaction types, C(alpha)--H...O==C shows binding specificity for protease-inhibitor complexes, while cation...anion interaction is predominant in signaling complexes. The amine...amine and amine...cation interactions make a minor contribution to the classification accuracy. When combined with the other two interactions, they increase the accuracy by 3.8%. In the case of antibody-antigen complexes, the sign is somewhat ambiguous. From the evolutionary perspective, while protease-inhibitors and signaling proteins have optimized their interfaces to suit their biological functions, antibody-antigen interactions are happenstance, implying that antibody-antigen complexes do not show distinctive interaction types. Persistent interaction types such as pi...pi, amide-carbonyl, and hydroxyl-carbonyl interactions are also investigated. Analyzing the structural orientations of the pi...pi stacking interactions, we find that the herringbone shape is a major configuration in transient protein-protein interfaces. This result is different from that of the protein core, where parallel-displaced configurations are the major configuration. We also analyze the overall trend of amide-carbonyl and hydroxyl-carbonyl interactions. It is noticeable that nearly 82% of the interfaces have at least one hydroxyl-carbonyl interaction.
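The evaluation protocol named in this abstract, kNN classification scored by leave-one-out cross-validation, can be sketched as follows. This is a minimal illustration only: the Euclidean distance, the two-feature vectors and the tie handling are assumptions, and the toy samples in the usage note are invented rather than taken from the study, which used frequencies of interaction types per interface and k = 15.

```python
import math
from collections import Counter


def knn_predict(train, query, k):
    """Predict a label by majority vote among the k nearest training samples.

    train is a list of (feature_vector, label) pairs. Euclidean distance is an
    assumption; the abstract does not state the metric used.
    """
    neighbors = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    votes = Counter(label for _, label in neighbors)
    return votes.most_common(1)[0][0]


def leave_one_out_accuracy(samples, k):
    """Hold out each sample in turn, classify it from the rest, return the hit rate."""
    hits = 0
    for i, (vector, label) in enumerate(samples):
        rest = samples[:i] + samples[i + 1:]
        if knn_predict(rest, vector, k) == label:
            hits += 1
    return hits / len(samples)
```

As a usage example, six invented two-feature samples such as `((0.9, 0.1), "protease-inhibitor")` and `((0.1, 0.9), "signaling")`, three per class and well separated, are all classified correctly at `k = 3`. Feature subset selection, as described in the abstract, would wrap this scoring function and retain the feature subset with the highest leave-one-out accuracy.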

12.
Hu Z, Ma B, Wolfson H, Nussinov R. Proteins 2000, 39(4):331-342
A number of studies have addressed the question of which are the critical residues at protein-binding sites. These studies examined either a single or a few protein-protein interfaces. The most extensive study to date has been an analysis of alanine-scanning mutagenesis. However, although the total number of mutations was large, the number of protein interfaces was small, with some of the interfaces closely related. Here we show that although overall binding sites are hydrophobic, they are studded with specific, conserved polar residues at specific locations, possibly serving as energy "hot spots." Our results confirm and generalize the alanine-scanning data analysis, despite its limited size. Previously Trp, Arg, and Tyr were shown to constitute energetic hot spots. These were rationalized by their polar interactions and by their surrounding rings of hydrophobic residues. However, there was no compelling reason as to why specifically these residues were conserved. Here we show that other polar residues are similarly conserved. These conserved residues have been detected consistently in all interface families that we have examined. Our results are based on an extensive examination of residues which are in contact across protein interfaces. We utilize all clustered interface families with at least five members and with sequence similarity between the members in the range of 20-90%. There are 11 such clustered interface families, comprising a total of 97 crystal structures. Our three-dimensional superpositioning analysis of the occurrences of matched residues in each of the families identifies conserved residues at spatially similar environments. Additionally, in enzyme inhibitors, we observe that residues are more conserved at the interfaces than at other locations. On the other hand, antibody-protein interfaces have similar surface conservation as compared to their corresponding linear sequence alignment, consistent with the suggestion that evolution has optimized protein interfaces for function.

13.
We developed a model of macromolecular interfaces based on the Voronoi diagram and the related alpha-complex, and we tested its properties on a set of 96 protein-protein complexes taken from the Protein Data Bank. The Voronoi model provides a natural definition of the interfaces, and it yields values of the number of interface atoms and of the interface area that have excellent correlation coefficients with those of the classical model based on solvent accessibility. Nevertheless, some atoms that do not lose solvent accessibility are part of the interface defined by the Voronoi model. The Voronoi model provides robust definitions of the curvature and of the connectivity of the interfaces, and leads to estimates of these features that generally agree with other approaches. Our implementation of the model allows an analysis of protein-water contacts that highlights the role of structural water molecules at protein-protein interfaces.

14.
Molecular principles of the interactions of disordered proteins (total citations: 6; self-citations: 0; citations by others: 6)
Thorough knowledge of the molecular principles of protein-protein recognition is essential to our understanding of protein function at the cellular level. Whereas interactions of ordered proteins have been analyzed in great detail, complexes of intrinsically unstructured/disordered proteins (IUPs) have hardly been addressed so far. Here, we have collected a database of 39 complexes of experimentally verified IUPs, and compared their interfaces with those of 72 complexes of ordered, globular proteins. The characteristic differences found between the two types of complexes suggest that IUPs represent a distinct molecular implementation of the principles of protein-protein recognition. The interfaces do not differ in size, but those of IUPs cover a much larger part of the surface of the protein than for their ordered counterparts. Moreover, IUP interfaces are significantly more hydrophobic relative to their overall amino acid composition, but also in absolute terms. They rely more on hydrophobic-hydrophobic than on polar-polar interactions. Their amino acids in the interface realize more intermolecular contacts, which suggests a better fit with the partner due to induced folding upon binding that results in a better adaptation to the partner. The two modes of interaction also differ in that IUPs usually use only a single continuous segment for partner binding, whereas the binding sites of ordered proteins are more segmented. Probably, all these features contribute to the increased evolutionary conservation of IUP interface residues. These noted molecular differences are also manifested in the interaction energies of IUPs. Our approximation of these by low-resolution force-fields shows that IUPs gain much more stabilization energy from intermolecular contacts than from folding, i.e. they use their binding energy for folding. Overall, our findings provide a structural rationale for the prior suggestions that many IUPs are specialized for functions realized by protein-protein interactions.

15.
Bahadur RP, Janin J. Proteins 2008, 71(1):407-414
To evaluate the evolutionary constraints placed on viral proteins by the structure and assembly of the capsid, we calculate Shannon entropies in the aligned sequences of 45 polypeptide chains in 32 icosahedral viruses, and relate these entropies to the residue location in the three-dimensional structure of the capsids. Three categories of residues have entropies lower than the chain average, implying that they are better conserved than average: residues that are buried within a subunit (the protein core), residues that contain atoms buried at an interface between subunits (the interface core), and residues that contribute to several such interfaces. The interface core is also conserved in homomeric proteins and in transient protein-protein complexes, which have only one interface whereas capsids have many. In capsids, the subunit interfaces implicate most of the polypeptide chain: on average, 66% of the capsid residues are at an interface, 34% at more than one, and 47% at the interface core. Nevertheless, we observe that the degree of residue conservation can vary widely between interfaces within a capsid and between regions within an interface. The interfaces and regions of interfaces that show a low sequence variability are likely to play major roles in the self-assembly of the capsid, with implications for its mechanism that we discuss taking adeno-associated virus as an example.
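The conservation measure used in this abstract, Shannon entropy over aligned sequences, can be sketched per alignment column as below. This is a minimal illustration under stated assumptions: entropy is taken in bits (log base 2), gap characters are skipped, and no residue grouping or sequence weighting is applied. The abstract does not say which of these choices the authors made, and the short alignment in the usage note is invented.

```python
import math
from collections import Counter


def column_entropy(column, gap="-"):
    """Shannon entropy (in bits) of one alignment column, ignoring gap characters."""
    residues = [r for r in column if r != gap]
    if not residues:
        return 0.0
    total = len(residues)
    counts = Counter(residues)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def alignment_entropies(sequences):
    """Per-column entropies for a list of equal-length aligned sequences."""
    return [column_entropy(column) for column in zip(*sequences)]
```

Applied to the toy alignment `["MKVL", "MRVL", "MKIL", "MRAL"]`, the first and last columns are invariant and have zero entropy, while the two middle columns vary and score higher. Averaging these values over a chain gives the baseline against which core and interface residues are compared in the abstract: lower entropy means better conservation.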

16.
A global census of stereochemical metrics including interface size, hydropathy, amino acid propensities, packing and hydrogen bonding was carried out on 32 x-ray-elucidated structures of lectin-carbohydrate complexes covering eight different lectin families. It is shown that the interactions at primary binding subsites are more efficient than at other subsites. Another salient behavior found for primary subsites was a marked negative correlation between the interface size and the polar surface content. It is noteworthy that this demographic rule is delineated by lectins with unrelated phylogenetic origin, indicating that independent interface architectures have evolved through common optimization paths. The structural properties of lectin-carbohydrate interfaces were compared with those characterizing a set of 32 protein homodimers. Overall, the analysis shows that the stereochemical bases of lectin-carbohydrate and protein-protein interfaces differ drastically from each other. In comparison with protein-protein complexes, lectin-carbohydrate interfaces have superior packing efficiency, better hydrogen bonding stereochemistry, and higher interaction cooperativity. A similar conclusion holds in the comparison with protein-protein heterocomplexes. We propose that the energetic consequence of this better interaction geometry is a larger decrease in free energy per unit of area buried, a feature that enables lectins and carbohydrates to form stable complexes with relatively small interface areas. These observations lend support to the emerging notion that systems differing from each other in their stereochemical metrics may rely on different energetic bases.

17.
Bordner AJ, Abagyan R. Proteins 2005, 60(3):353-366
Predicting protein-protein interfaces from a three-dimensional structure is a key task of computational structural proteomics. In contrast to geometrically distinct small molecule binding sites, protein-protein interfaces are notoriously difficult to predict. We generated a large nonredundant data set of 1494 true protein-protein interfaces using biological symmetry annotation where necessary. The data set was carefully analyzed and a Support Vector Machine was trained on a combination of a new robust evolutionary conservation signal with the local surface properties to predict protein-protein interfaces. Fivefold cross validation verifies the high sensitivity and selectivity of the model. As much as 97% of the predicted patches had an overlap with the true interface patch while only 22% of the surface residues were included in an average predicted patch. The model allowed the identification of potential new interfaces and the correction of mislabeled oligomeric states.

18.
The C2 domain is one of the most frequent and widely distributed calcium-binding motifs. Its structure comprises an eight-stranded beta-sandwich with two structural types as if the result of a circular permutation. Combining sequence, structural and modelling information, we have explored, at different levels of granularity, the functional characteristics of several families of C2 domains. At the coarsest level, the similarity correlates with key structural determinants of the C2 domain fold and, at the finest level, with the domain architecture of the proteins containing them, highlighting the functional diversity between the various sub-families. The functional diversity appears as different conserved surface patches throughout this common fold. In some cases, these patches are related to substrate-binding sites whereas in others they correspond to interfaces of presumably permanent interaction between other domains within the same polypeptide chain. For those related to substrate-binding sites, the predictions overlap with biochemical data in addition to providing some novel observations. For those acting as protein-protein interfaces, our modelling analysis suggests that slight variations between families are a result of not only complementary adaptations in the interfaces involved but also different domain architecture. In the light of the sequence and structural genomic projects, the work presented here shows that modelling approaches along with careful sub-typing of protein families will be a powerful combination for a broader coverage in proteomics.

19.
The subunit interfaces of 122 homodimers of known three-dimensional structure are analyzed and dissected into sets of surface patches by clustering atoms at the interface; 70 interfaces are single-patch, the others have up to six patches, often contributed by different structural domains. The average interface buries 1,940 Å² of the surface of each monomer, contains one or two patches burying 600-1,600 Å², is 65% nonpolar and includes 18 hydrogen bonds. However, the range of size and of hydrophobicity is wide among the 122 interfaces. Each interface has a core made of residues with atoms buried in the dimer, surrounded by a rim of residues with atoms that remain accessible to solvent. The core, which constitutes 77% of the interface on average, has an amino acid composition that resembles the protein interior except for the presence of arginine residues, whereas the rim is more like the protein surface. These properties of the interfaces in homodimers, which are permanent assemblies, are compared to those of protein-protein complexes where the components associate after they have independently folded. On average, subunit interfaces in homodimers are twice as large as in complexes, and much less polar due to the large fraction belonging to the core, although the amino acid compositions of the cores are similar in the two types of interfaces.

20.
Jiang L, Kuhlman B, Kortemme T, Baker D. Proteins 2005, 58(4):893-904
Water-mediated hydrogen bonds play critical roles at protein-protein and protein-nucleic acid interfaces, and the interactions formed by discrete water molecules cannot be captured using continuum solvent models. We describe a simple model for the energetics of water-mediated hydrogen bonds, and show that, together with knowledge of the positions of buried water molecules observed in X-ray crystal structures, the model improves the prediction of free-energy changes upon mutation at protein-protein interfaces, and the recovery of native amino acid sequences in protein interface design calculations. We then describe a "solvated rotamer" approach to efficiently predict the positions of water molecules, at protein-protein interfaces and in monomeric proteins, that is compatible with widely used rotamer-based side-chain packing and protein design algorithms. Finally, we examine the extent to which the predicted water molecules can be used to improve prediction of amino acid identities and protein-protein interface stability, and discuss avenues for overcoming current limitations of the approach.
