Similar articles
 20 similar articles found
1.
Protein–protein interactions are intrinsic to virtually every cellular process. Predicting the binding affinity of protein–protein complexes is one of the challenging problems in computational and molecular biology. In this work, we related sequence features of protein–protein complexes to their binding affinities using machine learning approaches. We set up a database of 185 protein–protein complexes for which the interacting pairs are heterodimers and their experimental binding affinities are available. We also developed a set of 610 features from the sequences of protein complexes and used the Ranker search method, which combines an attribute evaluator with the Ranker method, to select specific features. We analyzed several machine learning algorithms for discriminating protein–protein complexes into high and low affinity groups based on their Kd values. Our results showed a 10‐fold cross‐validation accuracy of 76.1% with a combination of nine features using support vector machines. Further, we observed an accuracy of 83.3% on an independent test set of 30 complexes. We suggest that our method would serve as an effective tool for identifying the interacting partners in protein–protein interaction networks and human–pathogen interactions based on the strength of interactions. Proteins 2014; 82:2088–2096. © 2014 Wiley Periodicals, Inc.
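The grouping by Kd described above can be illustrated with a minimal sketch. It converts a dissociation constant to a binding free energy with the standard relation ΔG = RT ln(Kd) and assigns a high or low affinity label. The function names, the temperature default, and in particular the 10 nM cutoff are assumptions made for illustration; the abstract does not state the threshold the authors used.

```python
import math

# Gas constant in kcal/(mol*K).
R_KCAL = 1.987204e-3


def binding_free_energy(kd_molar, temperature_k=298.15):
    """Return the binding free energy dG = RT ln(Kd) in kcal/mol.

    kd_molar is the dissociation constant in mol/L; a smaller Kd gives
    a more negative dG, i.e. tighter binding.
    """
    if kd_molar <= 0:
        raise ValueError("Kd must be positive")
    return R_KCAL * temperature_k * math.log(kd_molar)


def affinity_class(kd_molar, threshold_molar=1e-8):
    """Label a complex as 'high' or 'low' affinity.

    The 10 nM default threshold is a hypothetical choice for this sketch,
    not a value taken from the abstract.
    """
    return "high" if kd_molar < threshold_molar else "low"


if __name__ == "__main__":
    for kd in (1e-10, 1e-6):
        print(kd, round(binding_free_energy(kd), 2), affinity_class(kd))
```

Labels produced this way would be the targets of a binary classifier such as the support vector machine mentioned in the abstract.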

2.
Proteins 2018, 86(5):536-547
Additivity in the binding affinity of protein‐protein complexes means that the change in free energy of binding (ΔΔGbind) for double (or multiple) mutations is approximately equal to the sum of the corresponding single mutation ΔΔGbind values. In this study, we have explored the additivity effect of double mutants, which shows a linear relationship between the binding affinity of double mutants and the sum of single mutants, with a correlation of 0.90. However, the comparison of ΔΔGbind values showed a mean absolute deviation of 0.86 kcal/mol, and 25.6% of the double mutants show a deviation of more than 1 kcal/mol and are identified as non‐additive. The additivity effects have been analyzed based on the influence of structural features such as accessible surface area, long range order, binding propensity change, surrounding hydrophobicity, flexibility, atomic contacts between the mutations and distance between the 2 mutations. We found that non‐additive mutations tend to be closer to each other and have more contacts. We have also used machine learning methods to discriminate additive and non‐additive mutations using structure‐based features, which showed accuracies in the range of 0.77–0.92 for protein‐protein complexes belonging to different functions. Further, we have compared the additivity effects of protein stability along with binding affinity and explored the similarities and differences between them. The results obtained in this study provide insights into the effects of various structural features on the binding affinity of double mutants, and will aid the development of accurate methods to predict the binding affinity of double mutants.
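The additivity criterion in this abstract can be written down as a short sketch: the deviation is the double-mutant ΔΔGbind minus the sum of the single-mutant values, and a double mutant is called non-additive when the absolute deviation exceeds 1 kcal/mol (the cutoff stated in the abstract). The function names and the example values are hypothetical.

```python
def additivity_deviation(ddg_double, ddg_singles):
    """Deviation (kcal/mol) of a double mutant from perfect additivity."""
    return ddg_double - sum(ddg_singles)


def is_non_additive(ddg_double, ddg_singles, cutoff=1.0):
    """True when |deviation| exceeds the cutoff (1 kcal/mol in the abstract)."""
    return abs(additivity_deviation(ddg_double, ddg_singles)) > cutoff


def mean_absolute_deviation(records):
    """Mean |deviation| over (ddg_double, ddg_singles) records."""
    deviations = [abs(additivity_deviation(d, s)) for d, s in records]
    return sum(deviations) / len(deviations)


if __name__ == "__main__":
    # Made-up values for illustration only, not data from the study.
    records = [(2.5, [0.8, 0.4]), (1.0, [0.6, 0.5])]
    for ddg_double, ddg_singles in records:
        print(ddg_double, ddg_singles, is_non_additive(ddg_double, ddg_singles))
    print(mean_absolute_deviation(records))
```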

3.
Seeliger D, de Groot BL. Proteins 2007, 68(3):595-601
A rigorous quantitative assessment of atomic contacts and packing in native protein structures is presented. The analysis is based on optimized atomic radii derived from a set of high-resolution protein structures and reveals that the distribution of atomic contacts and overlaps is a structural constraint in proteins, irrespective of structural or functional classification and size. Furthermore, a newly developed method for calculating packing properties is introduced and applied to sets of protein structures at different levels of resolution. The results show that limited resolution yields decreasing packing quality, which underscores the relevance of packing considerations for structure prediction, design, dynamics, and docking.

4.
While cryo-electron microscopy (cryo-EM) has revolutionized the structure determination of supramolecular protein complexes that are refractory to structure determination by X-ray crystallography, structure determination by cryo-EM can nonetheless be complicated by excessive conformational flexibility or structural heterogeneity resulting from weak or transient protein–protein association. Since such transient complexes are often critical for function, specialized approaches must be employed for the determination of meaningful structure–function relationships. Here, we outline examples in which transient protein–protein interactions have been visualized successfully by cryo-EM in the biosynthesis of fatty acids, polyketides, and terpenes. These studies demonstrate the utility of chemical crosslinking to stabilize transient protein–protein complexes for cryo-EM structural analysis, as well as the use of partial signal subtraction and localized reconstruction to extract useful structural information out of cryo-EM data collected from inherently dynamic systems. While these approaches do not always yield atomic resolution insights on protein–protein interactions, they nonetheless enable direct experimental observation of complexes in assembly-line biosynthesis that would otherwise be too fleeting for structural analysis.

5.
Mintseris J, Weng Z. Proteins 2003, 53(3):629-639
The ability to analyze and compare protein-protein interactions on the structural level is critical to our understanding of various aspects of molecular recognition and the functional interplay of components of biochemical networks. In this study, we introduce atomic contact vectors (ACVs) as an intuitive way to represent the physico-chemical characteristics of a protein-protein interface as well as a way to compare interfaces to each other. We test the utility of ACVs in classification by using them to distinguish between homodimers and crystal contacts. Our results compare favorably with those reported by other authors. We then apply ACVs to mine the PDB for all known protein-protein complexes and separate transient recognition complexes from permanent oligomeric ones. Getting at the basis of this difference is important for our understanding of recognition, and we achieved a success rate of 91% in distinguishing these two classes of complexes. Although accessible surface area of the interface is a major discriminating feature, we also show that there are distinct differences in the contact preferences between the two kinds of complexes. Illustrating the superiority of ACVs as a basic comparison measure over a sequence-based approach, we derive a general rule of thumb to determine whether two protein-protein interfaces are redundant. With this method, we arrive at a nonredundant set of 209 recognition complexes, the largest set reported so far.

6.
Cytochromes c are very widespread proteins that play key roles in the electron transfer events associated with a wide variety of physiological redox processes. The function of cytochromes c is, at the broad level, to interact with different partners in order to allow electrons to flow from one protein to another. Here, we focused our attention on the protein-protein interactions that involve mono-heme cytochrome c domains in order to identify possible general vs. specific patterns of intermolecular interactions at the structural level. We observed that a number of physico-chemical properties are statistically different in transient vs. permanent and fused complexes. These include the extent of the protein interface area, the amino acid composition and the packing density at the interface. The understanding of the features of transient redox complexes is of particular importance because of the difficulty of obtaining co-crystals that preserve the physiologically relevant configuration. In addition, we identified three different structural modes of interaction that cover all the structurally characterized cytochrome c interactions except one. The mode of interaction does not correlate with the nature of the complex (transient, permanent, fused). Regardless of the mode of interaction, the distance between the heme iron and the partner metal center or organic cofactor center of mass is typically around 19-20 Å for complexes permitting direct electron transfer between the two sites.

7.
Identifying correct binding modes in a large set of models is an important step in protein–protein docking. We identified a protein docking filter based on overlap area that significantly reduces the number of candidate structures that require detailed examination. We also developed potentials based on residue contacts and overlap areas using a comprehensive learning set of 640 two‐chain protein complexes with mathematical programming. Our potential showed substantially better recognition capacity than other publicly accessible protein docking potentials in discriminating between native and nonnative binding modes on a large test set of 84 complexes independent of our training set. We were able to rank a near‐native model at the top in 43 cases and within the top 10 in 51 cases. We also report an atomic potential that ranks a near‐native model at the top in 46 cases and within the top 10 in 58 cases. Our filter+potential is well suited for selecting a small set of models to be refined to atomic resolution. Proteins 2010. © 2009 Wiley‐Liss, Inc.

8.
In the structural models determined by X‐ray crystallography, contacts between molecules can be divided into two categories: biologically relevant contacts and crystal packing contacts. With the growth in the number and quality of available structures with large crystal packing contacts, distinguishing crystal packing contacts from biologically relevant contacts remains a difficult task, and errors can lead to wrong interpretation of structural models. In this study, we performed a systematic analysis of biologically relevant contacts and crystal packing contacts. The results reveal that biologically relevant contacts are more tightly packed than crystal packing contacts. This property of biologically relevant contacts may contribute to the formation of their interfacial core region. Meanwhile, the differences between the core and surface regions in amino acid composition and evolutionary measure are more dramatic for biologically relevant contacts than for crystal packing contacts, and these differences appear to be useful in distinguishing the two categories of contacts. On the basis of the features derived from our analysis, we developed a random forest model to classify biologically relevant contacts and crystal packing contacts. Our method achieves a high area under the receiver operating characteristic curve of 0.923 in 5‐fold cross‐validation and accuracies of 91.4% and 91.7% for two different test sets. Moreover, in a comparison study, our model outperforms other existing methods, such as DiMoVo, Pita, Pisa, and Eppic. We believe that this study will help in the validation of oligomeric proteins and protein complexes. The model and all data used in this paper are freely available at http://cic.scu.edu.cn/bioinformatics/bio‐cry.zip . Proteins 2014; 82:3090–3100. © 2014 Wiley Periodicals, Inc.

9.
During the characterization of mutants and covalently inhibited complexes of Fusarium solani cutinase, nine different crystal forms have been obtained so far. Protein mutants with a different surface charge distribution form new intermolecular salt bridges or long-range electrostatic interactions that are accompanied by a change in the crystal packing. The whole protein surface is involved in the packing contacts, and the hydrophobicities of the protein surfaces in mutual contact turned out to be noncorrelated, which indicates that the packing interactions are nonspecific. In the case of the hydrophobic variants, the packing contacts showed some specificity, as the protein in the crystal tends to form either crystallographic or noncrystallographic dimers, which shield the hydrophobic surface from the solvent. The likelihood of surface atoms being involved in a crystal contact is the same for both polar and nonpolar atoms. However, when areas in the 200–600 Å² range are considered instead of individual atoms, surface regions that are either highly hydrophobic or highly polar were found to have an increased probability of establishing crystal lattice contacts. The protein surface surrounding the active-site crevice of cutinase constitutes a large hydrophobic area that is involved in packing contacts in all the various crystalline contexts. Proteins 31:320–333, 1998. © 1998 Wiley-Liss, Inc.

10.
Protein attribute prediction from primary sequences is an important task, and how to extract discriminative features is one of its most crucial aspects. Because a single-view feature cannot reflect all the information of a protein, fusing multi-view features is considered a promising route to improving prediction accuracy. In this paper, we propose a novel framework for protein multi-view feature fusion. First, features from different views are combined in parallel to form complex feature vectors. Then, we extend the classic principal component analysis to the generalized principal component analysis for further feature extraction from the parallel-combined complex features, which lie in a complex space. Finally, the extracted features are used for prediction. Experimental results on different benchmark datasets and machine learning algorithms demonstrate that the parallel strategy outperforms the traditional serial approach and is particularly helpful for extracting the core information buried among multi-view feature sets. A web server for protein structural class prediction based on the proposed method (COMSPA) is freely available for academic use at: http://www.csbio.sjtu.edu.cn/bioinf/COMSPA/.
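The difference between serial and parallel feature combination can be sketched as follows. Serial fusion concatenates two feature vectors, while parallel fusion places one view in the real part and the other in the imaginary part of a complex vector. The function names are hypothetical, and the zero-padding of the shorter vector is an assumption of this sketch; the abstract does not say how views of unequal length are handled, and the subsequent generalized principal component analysis step is not shown.

```python
def serial_fusion(view_a, view_b):
    """Serial strategy: concatenate the two feature vectors."""
    return list(view_a) + list(view_b)


def parallel_fusion(view_a, view_b):
    """Parallel strategy: build a complex vector a + i*b.

    The shorter view is zero-padded to the length of the longer one
    (an assumption made for this sketch).
    """
    n = max(len(view_a), len(view_b))
    a = list(view_a) + [0.0] * (n - len(view_a))
    b = list(view_b) + [0.0] * (n - len(view_b))
    return [complex(x, y) for x, y in zip(a, b)]


if __name__ == "__main__":
    # Made-up feature values for two views of the same protein.
    composition = [0.1, 0.3, 0.6]
    evolutionary = [0.7, 0.2]
    print(serial_fusion(composition, evolutionary))
    print(parallel_fusion(composition, evolutionary))
```

With views of lengths m and n, the serial vector has m + n real entries, whereas the parallel vector has max(m, n) complex entries.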

11.
IRBM 2020, 41(4):229-239
Feature selection algorithms are a cornerstone of machine learning. As the number of samples and of sample properties increases, a feature selection algorithm selects the significant features. Methods that perform this function are collectively called feature selection algorithms. Their general purpose is to select the properties most relevant to the data classes and to increase classification performance. Thus, features can be selected based on their classification performance. In this study, we have developed a feature selection algorithm, P-Score, based on the classification performance of decision support vectors. The method can work according to two different selection criteria. We tested the classification performance of the features selected with P-Score using three different classifiers. In addition, we compared the performance of P-Score with that of 13 feature selection algorithms from the literature. According to the results of the study, the P-Score feature selection algorithm is a method that can be used in the field of machine learning.

12.
Rong Liu, Jianjun Hu. Proteins 2013, 81(11):1885-1899
Accurate prediction of DNA‐binding residues has become a problem of increasing importance in structural bioinformatics. Here, we present DNABind, a novel hybrid algorithm for identifying these crucial residues by exploiting the complementarity between machine learning‐ and template‐based methods. Our machine learning‐based method is based on the probabilistic combination of a structure‐based and a sequence‐based predictor, both of which were implemented using support vector machine algorithms. The former includes our structural features, such as solvent accessibility, local geometry, topological features, and relative positions, which can effectively quantify the difference between DNA‐binding and nonbinding residues. The latter combines evolutionary conservation features with three other sequence attributes. Our template‐based method depends on structural alignment and uses the template structure from known protein–DNA complexes to infer DNA‐binding residues. We showed that the template method had excellent performance when reliable templates were found for the query proteins but tended to be strongly influenced by the template quality as well as by conformational changes upon DNA binding. In contrast, the machine learning approach yielded better performance when high‐quality templates were not available (about one third of the cases in our dataset) or when the query protein underwent large conformational changes upon DNA binding. Our extensive experiments indicated that the hybrid approach can distinctly improve on the performance of the individual methods for both bound and unbound structures. DNABind also significantly outperformed state‐of‐the‐art algorithms by around 10% in terms of the Matthews correlation coefficient. The proposed methodology could also have wide application in various protein functional site annotations. DNABind is freely available at http://mleg.cse.sc.edu/DNABind/ . Proteins 2013; 81:1885–1899. © 2013 Wiley Periodicals, Inc.

13.
In this Letter, we present a novel methodology for searching for biologically active compounds, based on the combination of docking experiments and analysis of the results by machine learning methods. The study was performed for 5 different protein kinases, and several sets of compounds (active, inactive and assumed inactive) were docked into their targets. The resulting ligand–protein complexes were represented by means of structural interaction fingerprint profiles (SIFts profiles), which constituted the input for the machine learning methods. The developed protocol was found to be superior to the combination of classification algorithms with the standard fingerprint MACCSFP.

14.
Prediction-based fingerprints of protein-protein interactions
Porollo A, Meller J. Proteins 2007, 66(3):630-645
The recognition of protein interaction sites is an important intermediate step toward identification of functionally relevant residues and understanding protein function, facilitating experimental efforts in that regard. Toward that goal, the authors propose a novel representation for the recognition of protein-protein interaction sites that integrates enhanced relative solvent accessibility (RSA) predictions with high resolution structural data. An observation that RSA predictions are biased toward the level of surface exposure consistent with protein complexes led the authors to investigate the difference between the predicted and actual (i.e., observed in an unbound structure) RSA of an amino acid residue as a fingerprint of interaction sites. The authors demonstrate that RSA prediction-based fingerprints of protein interactions significantly improve the discrimination between interacting and noninteracting sites, compared with evolutionary conservation, physicochemical characteristics, structure-derived and other features considered before. On the basis of these observations, the authors developed a new method for the prediction of protein-protein interaction sites, using machine learning approaches to combine the most informative features into the final predictor. For training and validation, the authors used several large sets of protein complexes and derived from them nonredundant representative chains, with interaction sites mapped from multiple complexes. Alternative machine learning techniques are used, including Support Vector Machines and Neural Networks, so as to evaluate the relative effects of the choice of a representation and a specific learning algorithm. The effects of induced fit and uncertainty of the negative (noninteracting) class assignment are also evaluated. Several representative methods from the literature are reimplemented to enable direct comparison of the results. 
Using rigorous validation protocols, the authors estimated that the new method yields an overall classification accuracy of about 74% and a Matthews correlation coefficient of 0.42, as opposed to up to 70% classification accuracy and a Matthews correlation coefficient of up to 0.3 for methods that do not utilize RSA prediction-based fingerprints. The new method is available at http://sppider.cchmc.org.
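For readers comparing the two metrics quoted above, the following minimal sketch computes classification accuracy and the Matthews correlation coefficient from the counts of a binary confusion matrix, using the standard definitions. The function names and the example counts are hypothetical and are not taken from the study.

```python
import math


def accuracy(tp, tn, fp, fn):
    """Fraction of correctly classified residues."""
    return (tp + tn) / (tp + tn + fp + fn)


def matthews_corrcoef(tp, tn, fp, fn):
    """Matthews correlation coefficient in [-1, 1].

    Returns 0.0 when any marginal total is zero, a common convention
    for the otherwise undefined case.
    """
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denominator == 0:
        return 0.0
    return (tp * tn - fp * fn) / denominator


if __name__ == "__main__":
    # Made-up counts: interacting sites are the positive class.
    tp, tn, fp, fn = 30, 120, 30, 20
    print(accuracy(tp, tn, fp, fn))
    print(matthews_corrcoef(tp, tn, fp, fn))
```

Because noninteracting residues usually outnumber interacting ones, accuracy alone can look favorable for a weak predictor; the Matthews correlation coefficient accounts for all four counts, which is one reason both values are commonly reported together.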

15.
MOTIVATION: The experimental difficulties of alpha-helical transmembrane protein structure determination make this class of protein an important target for sequence-based structure prediction tools. The MEMPACK prediction server allows users to submit a transmembrane protein sequence and returns transmembrane topology, lipid exposure, residue contacts, helix-helix interactions and helical packing arrangement predictions in both plain text and graphical formats using a number of novel machine learning-based algorithms. AVAILABILITY: The server can be accessed as a new component of the PSIPRED portal at http://bioinf.cs.ucl.ac.uk/psipred/.

16.
Crowley PB, Carrondo MA. Proteins 2004, 55(3):603-612
Interprotein electron transfer is characterized by protein interactions on the millisecond time scale. Such transient encounters are ensured by extremely high rates of complex dissociation. Computational analysis of the available crystal structures of redox protein complexes reveals features of the binding site that favor fast dissociation. In particular, the complex interface is shown to have low geometric complementarity and poor packing. These features are consistent with the necessity for fast dissociation since the absence of close packing facilitates solvation of the interface and disruption of the complex.

17.
18.
Protein-protein crystal-packing contacts.
Protein-protein contacts in monomeric protein crystal structures have been analyzed and compared to the physiological protein-protein contacts in oligomerization. A number of features differentiate the crystal-packing contacts from the natural contacts occurring in multimeric proteins. The area of the protein surface patches involved in packing contacts is generally smaller and its amino acid composition is indistinguishable from that of the protein surface accessible to the solvent. The fraction of protein surface in crystal contacts is very variable and independent of the number of packing contacts. The thermal motion at the crystal packing interface is intermediate between that of the solvent-accessible surface and that of the protein core, even for large packing interfaces, though the tendency is to be closer to that of the core. These results suggest that protein crystallization depends on random protein-protein interactions, which have little in common with physiological protein-protein recognition processes, and that the possibility of engineering macromolecular crystallization to improve crystal quality could be widened.

19.
Identifying features that effectively represent the energetic contribution of an individual interface residue to the interactions between proteins remains problematic. Here, we present several new features and show that they are more effective than conventional features. By combining the proposed features with conventional features, we develop a predictive model for interaction hot spots. Initially, 54 multifaceted features, composed of different levels of information including structure, sequence and molecular interaction information, are quantified. Then, to identify the best subset of features for predicting hot spots, feature selection is performed using a decision tree. Based on the selected features, a predictive model for hot spots is created using support vector machine (SVM) and tested on an independent test set. Our model shows better overall predictive accuracy than previous methods such as the alanine scanning methods Robetta and FOLDEF, and the knowledge-based method KFC. Subsequent analysis yields several findings about hot spots. As expected, hot spots have a larger relative surface area burial and are more hydrophobic than other residues. Unexpectedly, however, residue conservation displays a rather complicated tendency depending on the types of protein complexes, indicating that this feature is not good for identifying hot spots. Of the selected features, the weighted atomic packing density, relative surface area burial and weighted hydrophobicity are the top 3, with the weighted atomic packing density proving to be the most effective feature for predicting hot spots. Notably, we find that hot spots are closely related to π–related interactions, especially π · · · π interactions.

20.