Similar articles (20 found)
1.
Most scoring functions for protein-protein docking algorithms are either atom-based or residue-based, with the former able to produce higher-quality structures and the latter more tolerant of conformational changes upon binding. Earlier, we developed the ZRANK algorithm for reranking docking predictions, with a scoring function that contained only atom-based terms. Here we combine ZRANK's atom-based potentials with five residue-based potentials published by other labs, as well as an atom-based potential, IFACE, that we published after ZRANK. We simultaneously optimized the weights for selected combinations of terms in the scoring function, using decoys generated with the protein-protein docking algorithm ZDOCK. We performed rigorous cross-validation of the combinations using 96 test cases from a docking benchmark. Judged by the integrative success rate of making 1000 predictions per complex, addition of IFACE and the best residue-based pair potential reduced the number of cases without a correct prediction by 38% and 27% relative to ZDOCK and ZRANK, respectively. Thus, combining residue-based and atom-based potentials into a scoring function can improve performance for protein-protein docking. The resulting scoring function is called IRAD (integration of residue- and atom-based potentials for docking) and is available at http://zlab.umassmed.edu.
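
The idea of reranking decoys with a weighted combination of scoring terms can be sketched as follows. This is a minimal illustration only: the term names (`atom`, `residue`), the weight values, and the convention that a lower score is better are all assumptions made for the example; the abstract does not give the actual IRAD terms or optimized weights.

```python
from typing import Dict, List


def combined_score(terms: Dict[str, float], weights: Dict[str, float]) -> float:
    """Weighted sum of the per-term energies of one docking decoy."""
    return sum(weights[name] * value for name, value in terms.items())


def rerank(decoys: List[Dict[str, float]], weights: Dict[str, float]) -> List[int]:
    """Return decoy indices ordered from best (lowest score) to worst."""
    scores = [combined_score(d, weights) for d in decoys]
    return sorted(range(len(decoys)), key=lambda i: scores[i])


# Hypothetical decoys, each with one atom-based and one residue-based term.
decoys = [
    {"atom": -1.0, "residue": 2.0},
    {"atom": -3.0, "residue": 1.0},
]
weights = {"atom": 1.0, "residue": 0.5}  # hypothetical weights
order = rerank(decoys, weights)
```

In a real setting the weights would be fitted on training decoys and checked by cross-validation, as the abstract describes, rather than set by hand.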

2.
H Lu  J Skolnick 《Proteins》2001,44(3):223-232
A heavy-atom, distance-dependent, knowledge-based pairwise potential has been developed. This statistical potential is first evaluated and optimized with the native structure z-scores from gapless threading. The potential is then used to recognize the native and near-native structures from published decoy test sets, as well as from decoys obtained from our group's protein structure prediction program. In the gapless threading test, there is an average z-score improvement of 4 units in the optimized atomic potential over the residue-based quasichemical potential. Examination of the z-scores for individual pairwise distance shells indicates that the specificity for the native protein structure is greatest at pairwise distances of 3.5-6.5 Å, i.e., in the first solvation shell. On applying the current atomic potential to test sets obtained from the web, composed of native protein and decoy structures, the current generation of the potential performs better than residue-based potentials as well as the other published atomic potentials in the task of selecting native and near-native structures. This newly developed potential is also applied to structures of varying quality generated by our group's protein structure prediction program. The current atomic potential tends to pick lower RMSD structures than do residue-based contact potentials. In particular, this atomic pairwise interaction potential has better selectivity for near-native structures. As such, it can be used to select near-native folds generated by structure prediction algorithms as well as for protein structure refinement.
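
Distance-dependent statistical potentials of this kind are commonly derived by an inverse-Boltzmann relation, E(r) = -kT ln(N_obs(r) / N_exp(r)), evaluated per distance shell. The sketch below assumes that generic form; the abstract does not state the exact functional form or reference state used, and the counts, the `kT` value, and the `pseudocount` guard are hypothetical.

```python
import math
from typing import List


def pair_potential(
    observed: List[float],
    expected: List[float],
    kT: float = 1.0,
    pseudocount: float = 1e-6,
) -> List[float]:
    """Energy per distance shell from observed vs. expected pair counts.

    The pseudocount (an assumption of this sketch) avoids log(0) in empty shells.
    """
    return [
        -kT * math.log((o + pseudocount) / (e + pseudocount))
        for o, e in zip(observed, expected)
    ]


# Hypothetical counts for three distance shells of one atom-pair type.
energies = pair_potential(observed=[20, 10, 5], expected=[10, 10, 10])
```

Shells where a pair is seen more often than expected get a negative (favorable) energy, and shells where it is seen less often get a positive one.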

3.
4.
Rykunov D  Fiser A 《Proteins》2007,67(3):559-568
Statistical distance-dependent pair potentials are frequently used in a variety of folding, threading, and modeling studies of proteins. The applicability of these types of potentials is tightly connected to the reliability of statistical observations. We explored the possible origin and extent of false positive signals in statistical potentials by analyzing their distance dependence in a variety of randomized protein-like models. While on average potentials derived from such models are expected to equal zero at any distance, we demonstrate that systematic and significant distortions exist. These distortions originate from the limited statistical counts in local environments of proteins and from the limited size of protein structures at large distances. We suggest that these systematic errors in statistical potentials are connected to the dependence of amino acid composition on protein size and to variation in protein sizes. Additionally, atom-based potentials are dominated by a false positive signal that is due to correlation among distances measured from atoms of one residue to atoms of another residue. The significance of residue-based pairwise potentials at various spatial pair separations was assessed in this study, and it was found that as few as approximately 50% of potential values were statistically significant at distances below 4 Å, and at most approximately 80% of them were significant at larger pair separations. A new definition of the reference state, free of the observed systematic errors, is suggested. It has been demonstrated to generate statistical potentials that compare favorably to other publicly available ones.

5.
Potential of mean force for protein-protein interaction studies.
Calculating protein-protein interaction energies is crucial for understanding protein-protein associations. On the basis of the methodology of mean-field potential, we have developed an empirical approach to estimate binding free energy for protein-protein interactions. This knowledge-based approach has been used to derive distance-dependent free energies of protein complexes from a nonredundant training set in the Protein Data Bank (PDB), with a careful treatment of homology. We calculate atom pair potentials for 16 pair interactions, which can reflect the importance of hydrophobic interactions and specific hydrogen-bonding interactions. The derived potentials for hydrogen-bonding interactions show a valley of favorable interactions at a distance of approximately 3 Å, corresponding to that of an established hydrogen bond. For the test set of 28 protein complexes, the calculated energies have a correlation coefficient of 0.75 compared with experimental binding free energies. The performance of the method in ranking the binding energies of different protein-protein complexes shows that the energy estimation can be applied to evaluate binding free energies for protein-protein associations.

6.
MOTIVATION: Protein assemblies are currently poorly represented in structural databases and their structural elucidation is a key goal in biology. Here we analyse clefts in protein surfaces, likely to correspond to binding 'hot-spots', and rank them according to sequence conservation and simple measures of physical properties including hydrophobicity, desolvation, electrostatic and van der Waals potentials, to predict which are involved in binding in the native complex. RESULTS: The resulting differences between predicting binding sites at protein-protein and protein-ligand interfaces are striking. There is a high level of prediction accuracy (≤93%) for protein-ligand interactions, based on the following attributes: van der Waals potential, electrostatic potential, desolvation and surface conservation. Generally, the prediction accuracy for protein-protein interactions is lower, with the exception of enzymes. Our results show that the ease of cleft desolvation is strongly predictive of interfaces and strongly maintained across all classes of protein-binding interface.

7.
Hydrogen bonding is a key contributor to the specificity of intramolecular and intermolecular interactions in biological systems. Here, we develop an orientation-dependent hydrogen bonding potential based on the geometric characteristics of hydrogen bonds in high-resolution protein crystal structures, and evaluate it using four tests related to the prediction and design of protein structures and protein-protein complexes. The new potential is superior to the widely used Coulomb model of hydrogen bonding in prediction of the sequences of proteins and protein-protein interfaces from their structures, and improves discrimination of correctly docked protein-protein complexes from large sets of alternative structures.

8.
Key to successful protein structure prediction is a potential that recognizes the native state from misfolded structures. Recent advances in empirical potentials based on known protein structures include improved reference states for assessing random interactions, sidechain-orientation-dependent pair potentials, potentials for describing secondary or supersecondary structural preferences and, most importantly, optimization protocols that sculpt the energy landscape to enhance the correlation between native-like features and the energy. Improved clustering algorithms that select native-like structures on the basis of cluster density also resulted in greater prediction accuracy. For template-based modeling, these advances allowed improvement in predicted structures relative to their initial template alignments over a wide range of target-template homology. This represents significant progress and suggests applications to proteome-scale structure prediction.

9.
Clark LA  van Vlijmen HW 《Proteins》2008,70(4):1540-1550
A distance-dependent knowledge-based potential for protein-protein interactions is derived and tested for application in protein design. Information on residue-type-specific Cα and Cβ pair distances is extracted from complex crystal structures in the Protein Data Bank and used in the form of radial distribution functions. The use of only backbone and Cβ position information allows generation of relative protein-protein orientation poses with minimal sidechain information. Further coarse-graining can be done simply in the same theoretical framework to give potentials for residues of known type interacting with unknown type, as in a one-sided interface design problem. Both interface design via pose generation followed by sidechain repacking and localized protein-protein docking tests are performed on 39 nonredundant antibody-antigen complexes for which crystal structures are available. As a reference, Lennard-Jones potentials that are unspecific for residue type and biased toward varying degrees of residue pair separation are used as controls. For interface design, the knowledge-based potentials give the best combination of consistently designable poses, low RMSD to the known structure, and more tightly bound interfaces with no added computational cost. 77% of the poses could be designed to give complexes with negative free energies of binding. Generally, larger interface separation promotes designability, but weakens the binding of the resulting designs. A localized docking test shows that the knowledge-based nature of the potentials improves performance and compares respectably with more sophisticated all-atom potentials.

10.
Herein, we study a set of 146 transient protein-protein interfaces in order to better understand the principles of their interactions. We define and generate the protein interface using tools from computational geometry and topology and then apply statistical analysis to its residue composition. In addition to counting individual occurrences, we evaluate pairing preferences, both across and as neighbors on one side of an interface. Likelihood correction emphasizes novel and unexpected pairs, such as the His-Cys pair found in most complexes of serine proteases with their diverse inhibitors and the Met-Met neighbor pair found in unrelated protein interfaces. We also present a visualization of the protein interface that allows for facile identification of residue-residue contacts and other biochemical properties.

11.
Huang SY  Zou X 《Proteins》2008,72(2):557-579
Using an efficient iterative method, we have developed a distance-dependent knowledge-based scoring function to predict protein-protein interactions. The function, referred to as ITScore-PP, was derived using the crystal structures of a training set of 851 protein-protein dimeric complexes containing true biological interfaces. The key idea of the iterative method for deriving ITScore-PP is to improve the interatomic pair potentials by iteration, until the pair potentials can distinguish true binding modes from decoy modes for the protein-protein complexes in the training set. The iterative method circumvents the challenging reference state problem in deriving knowledge-based potentials. The derived scoring function was used to evaluate the ligand orientations generated by ZDOCK 2.1 and the native ligand structures on a diverse set of 91 protein-protein complexes. For the bound test cases, ITScore-PP yielded a success rate of 98.9% if the top 10 ranked orientations were considered. For the more realistic unbound test cases, the corresponding success rate was 40.7%. Furthermore, for faster orientational sampling, several residue-level knowledge-based scoring functions were also derived following a similar iterative procedure. Among them, the scoring function that uses the side-chain center of mass (SCM) to represent a residue, referred to as ITScore-PP(SCM), showed the best performance and yielded success rates of 71.4% and 30.8% for the bound and unbound cases, respectively, when the top 10 orientations were considered. ITScore-PP was further tested using two other published protein-protein docking decoy sets, the ZDOCK decoy set and the RosettaDock decoy set. In addition to binding mode prediction, the binding scores predicted by ITScore-PP also correlated well with the experimentally determined binding affinities, yielding a correlation coefficient of R = 0.71 on a test set of 74 protein-protein complexes with known affinities. ITScore-PP is computationally efficient. The average run time for ITScore-PP was about 0.03 seconds per orientation (including optimization) on a personal computer with a 3.2 GHz Pentium IV CPU and 3.0 GB RAM. ITScore-PP(SCM) is about an order of magnitude faster than ITScore-PP. ITScore-PP and/or ITScore-PP(SCM) can be combined with efficient protein docking software to study protein-protein recognition.
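
The "success rate within the top 10 ranked orientations" quoted above can be computed as the fraction of complexes that have at least one correct orientation among their top N ranked poses. The sketch below assumes that definition; the data layout (one list of hit flags per complex, ordered by rank) and the complex names are hypothetical. Under this definition, the reported 98.9% and 40.7% on 91 complexes would correspond to 90 and 37 successful cases, which is an inference from the numbers rather than something the abstract states.

```python
from typing import Dict, List


def top_n_success_rate(ranked_hits: Dict[str, List[bool]], n: int = 10) -> float:
    """Fraction of complexes with at least one correct pose among the top n.

    ranked_hits maps a complex name to hit flags ordered from rank 1 downward.
    """
    if not ranked_hits:
        raise ValueError("no complexes given")
    successes = sum(1 for flags in ranked_hits.values() if any(flags[:n]))
    return successes / len(ranked_hits)


# Hypothetical ranking results for three complexes.
hits = {
    "complex_a": [False, True],
    "complex_b": [False, False, True],
    "complex_c": [False, False, False, False, False],
}
rate_top2 = top_n_success_rate(hits, n=2)
rate_top3 = top_n_success_rate(hits, n=3)
```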

12.
For the first time, a statistical potential has been developed to quantitatively describe the CH···O hydrogen bonding interaction at the protein-protein interface. The calculated energies of the CH···O pair interaction show a favorable valley at approximately 3.3 Å, exhibiting a feature typical of an H-bond and similar to the ab initio quantum calculation result (Scheiner, S., Kar, T., and Gu, Y. (2001) J. Biol. Chem. 276, 9832-9837). The potentials have been applied to a set of 469 protein-protein complexes to calculate the contribution of different types of interactions to each protein complex: the average energy contribution of a conventional H-bond is approximately 30%; that of a CH···O H-bond is 17%; and that of a hydrophobic interaction is 50%. In some protein-protein complexes, the contribution of the CH···O H-bond can reach as high as approximately 40-50%, indicating the importance of the CH···O H-bond at the protein interface. At the interfaces of these complexes, CαH···O H-bonds frequently occur between adjacent strands in both parallel and antiparallel orientations, having the obvious structural motif of bifurcated H-bonds. Our study suggests that the weak CH···O H-bond makes an important contribution to the association and stability of protein complexes and needs more attention in protein-protein interaction studies.

13.
The analysis and prediction of protein-protein interaction sites from structural data are restricted by the limited availability of structural complexes that represent the complete protein-protein interaction space. The domain classification schemes CATH and SCOP are normally used independently in the analysis and prediction of protein domain-domain interactions. In this article, the effect of different domain classification schemes on the number and type of domain-domain interactions observed in structural data is systematically evaluated for the SCOP and CATH hierarchies. Although there is a large overlap in domain assignments between SCOP and CATH, 23.6% of CATH interfaces had no SCOP equivalent and 37.3% of SCOP interfaces had no CATH equivalent in a nonredundant set. Therefore, combining both classifications gives an increase of between 23.6% and 37.3% in domain-domain interfaces. It is suggested that, if possible, both domain classification schemes should be used together, but if only one is selected, SCOP provides better coverage than CATH. Employing both SCOP and CATH reduces, by an estimated 6.5%, the false negative rate of predictive methods that use homology matching to structural data to predict protein-protein interactions.

14.
Li X  Hu C  Liang J 《Proteins》2003,53(4):792-805
Protein representation and potential function are two important ingredients for studying protein folding, equilibrium thermodynamics, and sequence design. We introduce a novel geometric representation of protein contact interactions using the edge simplices from the alpha shape of the protein structure. This representation can eliminate implausible neighbors that are not in physical contact, and can avoid spurious contact between two residues when a third residue is between them. We developed a statistical alpha contact potential using an odds-ratio model. A studentized bootstrap method was then introduced to assess the 95% confidence intervals for each of the 210 propensity parameters. We found, with confidence, that there is significant long-range propensity (>30 residues apart) for hydrophobic interactions. We tested the alpha contact potential for native structure discrimination using several sets of decoy structures, and found that it often performs comparably with atom-based potentials requiring many more parameters. We also show that accurate geometric representation is important, and that the alpha contact potential performs better than a potential defined by a cutoff distance between geometric centers of side chains. Hierarchical clustering of alpha contact potentials reveals a natural grouping of residues. To explore the relationship between shape and physicochemical representations, we tested the minimum alphabet size necessary for native structure discrimination. We found that there is no significant difference in discrimination performance when alphabet size varies from 7 to 20, if geometry is represented accurately by alpha simplicial edges. This result suggests that the geometry of packing plays an important role, but the specific residue types are often interchangeable.
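
The 210 propensity parameters mentioned above match the number of unordered pairs (including self-pairs) of the 20 amino acid types. A generic odds-ratio contact propensity can be sketched as the observed fraction of contacts of a pair type divided by the fraction expected from residue composition alone. The null model used here (random pairing weighted by mole fractions) is an assumption of this sketch and may differ from the one in the paper, and the counts are hypothetical.

```python
from itertools import combinations_with_replacement
from typing import Dict, Tuple

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

# All unordered residue-type pairs, self-pairs included.
PAIR_TYPES = list(combinations_with_replacement(AMINO_ACIDS, 2))


def contact_propensity(
    contact_counts: Dict[Tuple[str, str], int],
    composition: Dict[str, int],
) -> Dict[Tuple[str, str], float]:
    """Odds ratio of observed to composition-expected contact fractions."""
    total_contacts = sum(contact_counts.values())
    total_residues = sum(composition.values())
    propensity = {}
    for (a, b), n in contact_counts.items():
        xa = composition[a] / total_residues
        xb = composition[b] / total_residues
        # An unordered hetero-pair can arise in two orders, hence the factor 2.
        expected = xa * xb * (1 if a == b else 2)
        propensity[(a, b)] = (n / total_contacts) / expected
    return propensity


# Hypothetical two-letter example: leucine-rich contacts.
prop = contact_propensity(
    {("L", "L"): 50, ("K", "L"): 25, ("K", "K"): 25},
    {"L": 50, "K": 50},
)
```

A value above 1 marks a pair seen in contact more often than composition alone would predict; confidence intervals on such ratios could then be estimated by resampling, as the abstract describes with a studentized bootstrap.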

15.
A method is presented for the derivation of knowledge-based pair potentials that corrects for the various compositions of different proteins. The resulting statistical pair potential is more specific than that derived from previous approaches, as assessed by gapless threading results. Additionally, a methodology is presented that interpolates between statistical potentials, when no homologous examples to the protein of interest are in the structural database used to derive the potential, and a Go-like potential (in which native interactions are favorable and all nonnative interactions are not), when homologous proteins are present. For cases in which no protein exceeds 30% sequence identity, pairs of weakly homologous interacting fragments are employed to enhance the specificity of the potential. In gapless threading, the mean z-score improves from -10.4 for the best statistical pair potential to -12.8 when the local sequence similarity, fragment-based pair potentials are used. Examination of the ab initio structure prediction of four representative globular proteins consistently reveals a qualitative improvement in the yield of structures in the 4 to 6 Å RMSD from native range when the fragment-based pair potential is used relative to that when the quasichemical pair potential is employed. This suggests that such protein-specific potentials provide a significant advantage relative to generic quasichemical potentials.

16.
MOTIVATION: Experimental evidence suggests that certain short protein segments have stronger amyloidogenic propensities than others. Identification of the fibril-forming segments of proteins is crucial for understanding diseases associated with protein misfolding and for finding favorable targets for therapeutic strategies. RESULT: In this study, we used the microcrystal structure of the NNQQNY peptide from yeast prion protein and residue-based statistical potentials to establish an algorithm to identify the amyloid fibril-forming segments of proteins. Using the same sets of sequences, this study obtained prediction performance comparable to that of the 3D profile method based on the physical atomic-level potential ROSETTADESIGN. The predicted results are consistent with experiments for several representative proteins associated with amyloidosis, and also agree with the idea that peptides that can form fibrils may have strong sequence signatures. Application of the residue-based statistical potentials is computationally more efficient than using atomic-level potentials and can be applied in whole-proteome analysis to investigate the evolutionary pressure effect or forecast other latent diseases related to amyloid deposits. AVAILABILITY: The fibril prediction program is available at ftp://mdl.ipc.pku.edu.cn/pub/software/pre-amyl/. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.

17.
MOTIVATION: Public resources for studying protein interfaces are necessary for better understanding of molecular recognition and developing intermolecular potentials, search procedures and scoring functions for the prediction of protein complexes. RESULTS: The first release of the DOCKGROUND resource implements a comprehensive database of co-crystallized (bound-bound) protein-protein complexes, providing a foundation for the upcoming expansion to unbound (experimental and simulated) protein-protein complexes, modeled protein-protein complexes and systematic sets of docking decoys. The bound-bound part of DOCKGROUND is a relational database of annotated structures based on the Biological Unit file (Biounit), provided by the RCSB as a separate file containing the probable biological molecule. DOCKGROUND is automatically updated to reflect the growth of the PDB. It contains 67,220 pairwise complexes that rely on 14,913 Biounit entries from 34,778 PDB entries (January 30, 2006). The database includes dynamic generation of non-redundant datasets of pairwise complexes based either on structural similarity (SCOP classification) or on user-defined sequence identity. The growing DOCKGROUND resource is designed to become a comprehensive public environment for developing and validating new methodologies for modeling of protein interactions. AVAILABILITY: DOCKGROUND is available at http://dockground.bioinformatics.ku.edu. The current first release implements the bound-bound part.

18.
Crowley PB  Golovin A 《Proteins》2005,59(2):231-239
Arginine is an abundant residue in protein-protein interfaces. The importance of this residue relates to the versatility of its side chain in intermolecular interactions. Different classes of protein-protein interfaces were surveyed for cation-pi interactions. Approximately half of the protein complexes and one-third of the homodimers analyzed were found to contain at least one intermolecular cation-pi pair. Interactions between arginine and tyrosine were found to be the most abundant. The electrostatic interaction energy was calculated to be approximately 3 kcal/mol, on average. A distance-based search of guanidinium:aromatic interactions was also performed using the Macromolecular Structure Database (MSD). This search revealed that half of the guanidinium:aromatic pairs pack in a coplanar manner. Furthermore, it was found that the cationic group of the cation-pi pair is frequently involved in intermolecular hydrogen bonds. In this manner the arginine side chain can participate in multiple interactions, providing a mechanism for inter-protein specificity. Thus, the cation-pi interaction is established as an important contributor to protein-protein interfaces.

19.
Ansari S  Helms V 《Proteins》2005,61(2):344-355
A non-redundant set of 170 protein-protein interfaces of known structure was statistically analyzed for residue and secondary-structure compositions, pairing preferences and side-chain-backbone interaction frequencies. By focussing mainly on transient protein-protein interfaces, the results underline previous findings for protein-protein interfaces but also show some new, interesting aspects of transient interfaces. The residue compositions at interfaces found in this study correlate well with the results of other studies. On average, contacts between pairs of hydrophobic and polar residues were unfavorable, and the charged residues tended to pair subject to charge complementarity. Secondary structure composition analysis shows that neither helices nor beta-sheets are dominantly populated at interfaces. Analyzing the pairing preferences of the secondary structure elements revealed a higher affinity within the same elements and points to tight packing. In addition, the results for the side-chain and backbone interaction frequencies, which were measured under more stringent conditions, showed a high occurrence of side-chain-backbone interactions. A closer look at the helix and beta-sheet binding frequencies for a given side-chain and backbone interaction underlined the relevance of tight packing. The polarity of interfaces increased with decreasing interface size. These types of information may be useful for scoring complexes in protein-protein docking studies or for prediction of protein-protein interfaces from the sequences alone.

20.
Ma XH  Wang CX  Li CH  Chen WZ 《Protein engineering》2002,15(8):677-681
Three useful variables from the interfaces of 20 protein-protein complexes were investigated. These variables are the side-chain accessible number (N(b)), the number of hydrophilic pairs (N(pair)) and the buried apolar solvent-accessible surface area (ΔΔASA(apol)). An empirical model based on the three variables was developed to describe the free energy of protein association. As the results show, the side-chain accessible numbers characterize the loss of side-chain conformational entropy in protein interactions, and the empirical function presented here is effective at estimating the binding free energy. It was found that the interface variables capture most of the significant features of protein-protein association. We also applied the model based on these variables as a rescoring function in docking simulations and found that it has the potential to distinguish the 'true' binding mode. The simple empirical scale developed here is an attractive target function for calculating binding free energy in applications ranging from biological processes to rational protein design.
