Similar literature
Found 20 similar documents (search time: 15 ms)
1.
Darnell SJ, Page D, Mitchell JC. Proteins, 2007, 68(4): 813-823
Protein-protein interactions can be altered by mutating one or more "hot spots," the subset of residues that account for most of the interface's binding free energy. The identification of hot spots requires a significant experimental effort, highlighting the practical value of hot spot predictions. We present two knowledge-based models that improve the ability to predict hot spots: K-FADE uses shape specificity features calculated by the Fast Atomic Density Evaluation (FADE) program, and K-CON uses biochemical contact features. The combined K-FADE/CON (KFC) model displays better overall predictive accuracy than computational alanine scanning (Robetta-Ala). In addition, because these methods predict different subsets of known hot spots, a large and significant increase in accuracy is achieved by combining KFC and Robetta-Ala. The KFC analysis is applied to the calmodulin (CaM)/smooth muscle myosin light chain kinase (smMLCK) interface, and to the bone morphogenetic protein-2 (BMP-2)/BMP receptor-type I (BMPR-IA) interface. The results indicate a strong correlation between KFC hot spot predictions and mutations that significantly reduce the binding affinity of the interface.

2.
There is consensus on the need to include a third dimension when estimating Species Distribution Models (SDMs), which is of special interest for marine species. Support for the third dimension is, however, rarely available, so users are obliged to manually combine 2D SDM outputs (i.e., suitability or presence/absence maps) to generate 3D distributions. Herein, the Niche of Occurrence 3D (NOO3D) is presented, a new, simple modelling procedure that provides 3D distributions using both 3D occurrence samples and environmental datasets that consist of one layer per depth value. To avoid errors associated with real data sets, NOO3D performance was evaluated using five virtual marine species (three pelagic species, with wide, medium, and narrow distributions, respectively, a mesopelagic species and an abyssal species). These virtual species are distributed across the North Atlantic Ocean and were built at a 0.5° × 0.5° resolution, considering 49 depth levels (from 0.43 m to a depth of 5274.7 m). NOO3D results were also compared to those provided by 3D Alpha Shapes and Maximum Entropy (MaxEnt). The True Positive Rate (TPR), or sensitivity, True Negative Rate (TNR), or specificity, False Positive Rate (FPR), or commission error, and False Negative Rate (FNR), or omission error, were employed to facilitate comparison between methods. MaxEnt performed best for TPR, the True Skill Statistic (TSS) and FNR, and 3D Alpha Shapes performed best for FPR and TNR. NOO3D ranked second on every metric considered, which indicates that it was the most suitable method overall. The results indicate that NOO3D can be considered a viable alternative for building three-dimensional species distribution models.

3.
Identifying features that effectively represent the energetic contribution of an individual interface residue to the interactions between proteins remains problematic. Here, we present several new features and show that they are more effective than conventional features. By combining the proposed features with conventional features, we develop a predictive model for interaction hot spots. Initially, 54 multifaceted features, composed of different levels of information including structure, sequence and molecular interaction information, are quantified. Then, to identify the best subset of features for predicting hot spots, feature selection is performed using a decision tree. Based on the selected features, a predictive model for hot spots is created using support vector machine (SVM) and tested on an independent test set. Our model shows better overall predictive accuracy than previous methods such as the alanine scanning methods Robetta and FOLDEF, and the knowledge-based method KFC. Subsequent analysis yields several findings about hot spots. As expected, hot spots have a larger relative surface area burial and are more hydrophobic than other residues. Unexpectedly, however, residue conservation displays a rather complicated tendency depending on the types of protein complexes, indicating that this feature is not good for identifying hot spots. Of the selected features, the weighted atomic packing density, relative surface area burial and weighted hydrophobicity are the top 3, with the weighted atomic packing density proving to be the most effective feature for predicting hot spots. Notably, we find that hot spots are closely related to π-related interactions, especially π···π interactions.

4.
5.
Hot spot residues of proteins are fundamental interface residues that help proteins perform their functions. Detecting hot spots by experimental methods is costly and time‐consuming. Sequential and structural information has been widely used in the computational prediction of hot spots. However, structural information is not always available. In this article, we investigated the problem of identifying hot spots using only physicochemical characteristics extracted from amino acid sequences. We first extracted 132 relatively independent physicochemical features from the set of 544 properties in AAindex1, an amino acid index database. Each feature was utilized to train a classification model with a novel encoding schema for hot spot prediction by the IBk algorithm, an extension of the K‐nearest neighbor algorithm. The combinations of the individual classifiers were explored and the classifiers that appeared frequently in the top performing combinations were selected. The hot spot predictor was built as an ensemble of these classifiers that works in a voting manner. Experimental results demonstrated that our method effectively exploited the feature space and allowed flexible weights of features for different queries. On the commonly used hot spot benchmark sets, our method significantly outperformed other machine learning algorithms and state‐of‐the‐art hot spot predictors. The program is available at http://sfb.kaust.edu.sa/pages/software.aspx . Proteins 2013; 81:1351–1362 © 2013 Wiley Periodicals, Inc.

6.
Training and testing of conventional machine learning models on binary classification problems depend on the proportions of the two outcomes in the relevant data sets. This may be especially important in practical terms when real-world applications of the classifier are either highly imbalanced or occur in unknown proportions. Intuitively, it may seem sensible to train machine learning models on data similar to the target data in terms of proportions of the two binary outcomes. However, we show that this is not the case using the example of prediction of deleterious and neutral phenotypes of human missense mutations in human genome data, for which the proportion of the binary outcome is unknown. Our results indicate that using balanced training data (50% neutral and 50% deleterious) results in the highest balanced accuracy (the average of True Positive Rate and True Negative Rate), Matthews correlation coefficient, and area under ROC curves, no matter what the proportions of the two phenotypes are in the testing data. Besides balancing the data by undersampling the majority class, other techniques in machine learning include oversampling the minority class, interpolating minority-class data points and various penalties for misclassifying the minority class. However, these techniques are not commonly used in either the missense phenotype prediction problem or in the prediction of disordered residues in proteins, where the imbalance problem is substantial. The appropriate approach depends on the amount of available data and the specific problem at hand.
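
The two count-based metrics named in this abstract can be computed directly from confusion-matrix counts. The following is a minimal Python sketch, not the authors' code; the function names and the example counts are hypothetical and chosen only to illustrate an imbalanced test set.

```python
import math

def balanced_accuracy(tp, fn, tn, fp):
    """Average of the True Positive Rate and the True Negative Rate."""
    tpr = tp / (tp + fn)  # sensitivity
    tnr = tn / (tn + fp)  # specificity
    return (tpr + tnr) / 2

def matthews_corrcoef(tp, fn, tn, fp):
    """Matthews correlation coefficient; returns 0.0 when the denominator is zero."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0
    return (tp * tn - fp * fn) / denom

# Hypothetical imbalanced test set: 10 deleterious and 90 neutral variants.
counts = dict(tp=8, fn=2, tn=81, fp=9)
print(balanced_accuracy(**counts))
print(matthews_corrcoef(**counts))
```

Both metrics use all four cells of the confusion matrix, which is why they are less sensitive to class proportions than plain accuracy.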

7.
Although predictive modelling is increasingly popular for assessing the geographical distribution of species, there is still a clear gap in explaining geospatial methods for deriving the presence/absence of species in terms of geospatial extent, as well as ambiguity about which models are robust. In this paper, we evaluate four major species distribution modelling methods: Artificial Neural Network (ANN), Support Vector Machines (SVM), Maximum Entropy (MaxEnt) and Generalized Linear Model (GLM), with pseudo-absence and background-absence data. To investigate the efficacy of these models, we present a case study using Coffea arabica L. in Ethiopia, as no species distribution modelling had been done at a local scale, especially in the coffee-growing areas. We made predictions on a 75% subset and validated on the remaining 25% of the 112 species presence records, which were collected from field observation and true colour aerial photographs with 0.5 m spatial resolution. Twelve biophysical explanatory variables (climatic, remote sensing based and landscape variables) were employed in modelling. The results show that MaxEnt with pseudo-absence data and SVM with background-absence data predict the largest areas of understory coffee presence, covering 12.2% and 23.1% of the indigenous forest area, respectively. The model performance test using True Positive Rate (TPR) shows that GLM and SVM performed best with pseudo-absence data (TPR = 0.821). MaxEnt and SVM were the most robust modelling methods (TPR = 0.964) using background-absence data.

8.
The signal recognition particle (SRP) is a ribonucleoprotein complex which is crucial for the delivery of proteins to cellular membranes. Among the six proteins of the eukaryotic SRP, the two largest, SRP68 and SRP72, form a stable SRP68/72 heterodimer of unknown structure which is required for SRP function. Fragments 68e′ (residues 530 to 620) and 72b′ (residues 1 to 166) participate in the SRP68/72 interface. Both polypeptides were expressed in Escherichia coli and assembled into a complex which was stable at high ionic strength. Disruption of 68e′/72b′ and SRP68/72 was achieved by denaturation using moderate concentrations of urea. The four predicted tetratricopeptide repeats (TPR1 to TPR4) of 72b′ were required for stable binding of 68e′. Site‐directed mutagenesis suggested that they provide the structural framework for the binding of SRP68. Deleting the region between TPR3 and TPR4 (h120) also prevented the formation of a heterodimer, but this predicted alpha‐helical region appeared to engage several of its amino acid residues directly at the interface with 68e′. A 39‐residue polypeptide (68h, residues 570–605), rich in prolines and containing an invariant aspartic residue at position 585, was found to be active. Mutagenesis scanning of the central region of 68h demonstrated that D585 was solely responsible for the formation of the heterodimer. Coexpression experiments suggested that 72b′ protects 68h from proteolytic digestion consistent with the assertion that 68h is accommodated inside a groove formed by the superhelically arranged four TPRs of the N‐terminal region of SRP72.

9.

Background

Hot spot residues are functional sites in protein interaction interfaces. The identification of hot spot residues is time-consuming and laborious using experimental methods. In order to address the issue, many computational methods have been developed to predict hot spot residues. Moreover, most prediction methods are based on structural features, sequence characteristics, and/or other protein features.

Results

This paper proposed an ensemble learning method to predict hot spot residues that only uses sequence features and the relative accessible surface area of amino acid sequences. In this work, a novel feature selection technique was developed; an auto-correlation function combined with a sliding window technique was applied to obtain the characteristics of amino acid residues in the protein sequence; and an ensemble classifier with SVM and KNN base classifiers was built to achieve the best classification performance.

Conclusion

The experimental results showed that our model yields the highest F1 score of 0.92 and an MCC value of 0.87 on the ASEdb dataset. Compared with other machine learning methods, our model achieves a substantial improvement in hot spot prediction.

10.
The frequently observed ankyrin repeat motif represents a structural scaffold evolved for mediating protein-protein interactions. As such, these repeats modulate a diverse range of cellular functions. We thermodynamically characterized the heterodimeric GA-binding protein (GABP) αβ complex and focused specifically on the interaction mediated by the ankyrin repeat domain of GABPβ. Our isothermal titration calorimetric analysis of the interaction between the GABP subunits determined an association constant (K_A) of 6.0 × 10⁸ M⁻¹ and that the association is favorably driven by a significant change in enthalpy (ΔH) and a minor change in entropy (−TΔS). A total of 16 GABPβ interface residues were chosen for alanine scanning mutagenesis. The calorimetrically measured differences in the free energy of binding were compared to computationally calculated values, resulting in a correlation coefficient r = 0.71. We identified three spatially contiguous hydrophobic and aromatic residues that form a binding free energy hot spot (ΔΔG > 2.0 kcal/mol). One residue provides structural support to the hot spot residues. Three non-hot spot residues are intermediate contributors (ΔΔG ≈ 1.0 kcal/mol) and create a canopy-like structure over the hot spot residues to possibly occlude solvent and orientate the subunits. The remaining interface residues are located peripherally and have weak contributions. Finally, our mutational analysis revealed a significant entropy-enthalpy compensation for this interaction.
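
To put the thermodynamic quantities in this abstract on a common scale, the standard relation between an association constant and the binding free energy (free energy = −RT ln of the association constant) can be sketched in Python. This is an illustrative sketch, not the authors' analysis: the temperature of 298.15 K is an assumption (the abstract does not state the measurement temperature), and the function names are hypothetical.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol*K)

def binding_free_energy(k_a, temperature_k=298.15):
    """Standard binding free energy (kcal/mol) from an association constant in M^-1.

    The default temperature is an assumption, not a value from the abstract.
    """
    return -R_KCAL * temperature_k * math.log(k_a)

def affinity_fold_change(ddg, temperature_k=298.15):
    """Fold reduction in the association constant implied by a loss of ddg kcal/mol."""
    return math.exp(ddg / (R_KCAL * temperature_k))

# Association constant reported in the abstract.
print(binding_free_energy(6.0e8))
# Hot spot threshold used in the abstract (2.0 kcal/mol).
print(affinity_fold_change(2.0))
```

Under the assumed temperature, the reported association constant corresponds to roughly −12 kcal/mol, and a 2.0 kcal/mol hot spot mutation corresponds to a roughly 30-fold loss of affinity.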

11.
The linear IgE-binding epitopes of non-specific lipid transfer proteins (nsLTP) from plants were predicted using a combination of predictive tools including (1) the hydropathic profiles based on different scales of hydrophilicity, flexibility and exposure to the solvent, (2) the hydrophobic cluster analysis plots, (3) the occurrence of charged residues in the predicted amino acid sequence stretches and (4) the exposure of the predicted linear IgE-binding epitopes as checked on the three-dimensional models built for the nsLTP. A reliable prediction was obtained for nsLTP as compared with the previously characterized IgE-binding epitopes of various proteins. A consensus IgE-binding epitope occurring in other plant nsLTP and responsible for some IgE-binding cross-reactivity among fruit nsLTP has been identified and characterized. Despite some discrepancies, a fairly good prediction resulted from applying our combination of predictive methods to longer nsLTP or plant profilins.

12.
del Sol A, O'Meara P. Proteins, 2005, 58(3): 672-682
We show that protein complexes can be represented as small-world networks, exhibiting a relatively small number of highly central amino-acid residues occurring frequently at protein-protein interfaces. We further base our analysis on a set of different biological examples of protein-protein interactions with experimentally validated hot spots, and show that 83% of these predicted highly central residues, which are conserved in sequence alignments and nonexposed to the solvent in the protein complex, correspond to or are in direct contact with an experimentally annotated hot spot. The remaining 17% show a general tendency to be close to an annotated hot spot. On the other hand, although there is no available experimental information on their contribution to the binding free energy, detailed analysis of their properties shows that they are good candidates for being hot spots. Thus, highly central residues have a clear tendency to be located in regions that include hot spots. We also show that some of the central residues in the protein complex interfaces are central in the monomeric structures before dimerization and that possible information relating to hot spots of binding free energy could be obtained from the unbound structures.

13.
Lipidation catalyzed by protein prenyltransferases is essential for the biological function of a number of eukaryotic proteins, many of which are involved in signal transduction and vesicular traffic regulation. Sequence similarity searches reveal that the α-subunit of protein prenyltransferases (PTα) is a member of the tetratricopeptide repeat (TPR) superfamily. This finding makes the three-dimensional structure of the rat protein farnesyltransferase the first structural model of a TPR protein interacting with its protein partner. Structural comparison of the two TPR domains in protein farnesyltransferase and protein phosphatase 5 indicates that variation in TPR consensus residues may affect protein binding specificity through altering the overall shape of the TPR superhelix. A general approach to evolutionary analysis of proteins with repetitive sequence motifs has been developed and applied to the protein prenyltransferases and other TPR proteins. The results suggest that all members of the PTα family originated from a common multirepeat ancestor, while the common ancestor of PTα and other members of the TPR superfamily is likely to be a single repeat protein.

14.
Cyclophilin 40 (CyP40) is a tetratricopeptide repeat (TPR)-containing immunophilin and a modulator of steroid receptor function through its binding to heat shock protein 90 (Hsp90). Critical to this binding are the carboxyl-terminal MEEVD motif of Hsp90 and the TPR domain of CyP40. Two different models of the CyP40-MEEVD peptide interaction were used as the basis for a comprehensive mutational analysis of the Hsp90-interacting domain of CyP40. Using a carboxyl-terminal CyP40 construct as template, 24 amino acids from the TPR and flanking acidic and basic domains were individually mutated by site-directed mutagenesis, and the mutants were coexpressed in yeast with a carboxyl-terminal Hsp90β construct and qualitatively assessed for binding using a β-galactosidase filter assay. For quantitative assessment, mutants were expressed as glutathione S-transferase fusion proteins and assayed for binding to carboxyl-terminal Hsp90β using conventional pulldown and enzyme-linked immunosorbent assay microtiter plate assays. Collectively, the models predict that the following TPR residues help define a binding groove for the MEEVD peptide: Lys-227, Asn-231, Phe-234, Ser-274, Asn-278, Lys-308, and Arg-312. Mutational analysis identified five of these residues (Lys-227, Asn-231, Asn-278, Lys-308, and Arg-312) as essential for Hsp90 binding. The other two residues (Phe-234 and Ser-274) and another three TPR domain residues not definitively associated with the binding groove (Leu-284, Lys-285, and Asp-329) are required for efficient Hsp90 binding. These data confirm the critical importance of the MEEVD binding groove in CyP40 for Hsp90 recognition and reveal that additional charged and hydrophobic residues within the CyP40 TPR domain are required for Hsp90 binding.

15.
Liu YP, Chang CW, Chang KY. FEBS Letters, 2003, 554(3): 403-409
Structure-based mutagenesis was used to probe the binding surface for the activation domain of sterol-responsive element binding protein (SREBP) in the KIX domain of CREB binding protein. A set of conserved residues scattered across the α2 helix and the extended C-terminal region of the α3 helix in the KIX domain, including two arginines previously characterized as a hot spot for cofactor-mediated methylation, was shown to be crucial for SREBP-KIX interaction but was not essential for phosphorylated KID recognition. Therefore, our results suggest the existence of an SREBP binding site formed by positively charged residues in the C-terminal part of the extended α3 helix of the KIX domain, distinct from the previously identified phosphorylated KID binding site.

16.
The tetratricopeptide repeat (TPR) motif is a protein–protein interaction module that acts as an organizing centre for complexes regulating a multitude of biological processes. Despite accumulating evidence for the formation of TPR oligomers as an additional level of regulation, there is a lack of structural and solution data explaining TPR self‐association. In the present work we characterize the trimeric TPR‐containing protein YbgF, which is linked to the Tol system in Gram‐negative bacteria. By subtracting previously identified TPR consensus residues required for stability of the fold from residues conserved across YbgF homologs, we identified residues involved in oligomerization of the C‐terminal YbgF TPR domain. Grafting these residues, which are located in loop regions between TPR motifs, onto the monomeric consensus TPR protein CTPR3 induced the formation of oligomers. The crystal structure of this engineered oligomer shows an asymmetric trimer where stacking interactions between the introduced tyrosines and displacement of the C‐terminal hydrophilic capping helix, present in most TPR domains, are key to oligomerization. Asymmetric trimerization of the YbgF TPR domain and CTPR3Y3 leads to the formation of higher order oligomers both in the crystal and in solution. However, such open‐ended self‐association does not occur in full‐length YbgF, suggesting that the protein's N‐terminal coiled‐coil domain restricts further oligomerization. This interpretation is borne out in experiments where the coiled‐coil domain of YbgF was engineered onto the N‐terminus of CTPR3Y3 and shown to block self‐association beyond trimerization. Our study lays the foundations for understanding the structural basis for TPR domain self‐association and how such self‐association can be regulated in TPR domain‐containing proteins. Proteins 2010. © 2010 Wiley‐Liss, Inc.

17.
Mammalian mitochondrial fission requires at least two proteins, hFis1 and the dynamin-like GTPase DLP1/Drp1. The mitochondrial protein hFis1 is anchored at the outer membrane by a C-terminal transmembrane domain. The cytosolic domain of hFis1 contains six α helices [α1–α6] out of which [α2–α5] form tetratricopeptide repeat (TPR)-like motifs. DLP1 and possibly other proteins are thought to interact with the hFis1 TPR region during the fission process. It has also been suggested that the α1-helix regulates protein-protein interactions at the TPR. We performed random peptide phage display screening using the hFis1[α2–α6] as the target and identified ten different peptide sequences. Phage ELISA using mutant hFis1 indicates that the peptide binding requires the α2 and α3 helices and the intact TPR structure. Competition experiments and surface plasmon resonance analyses confirmed that a subset of free peptides enriched with proline residues directly bind to the target. Two of these peptides bind to the α1-containing intact cytosolic domain of hFis1 with decreased affinity. Peptide microinjection into cells abolished the mitochondrial swelling induced by overexpression of α1-deleted hFis1, and significantly decreased cytochrome c release from mitochondria upon apoptotic induction. Our data demonstrate that hFis1 can bind to multiple amino acid sequences selectively, and that the TPR constitutes the main binding region of hFis1, providing a first insight into the hFis1 TPR as a potential therapeutic target.

18.
19.
20.
The structures of the oxidized and reduced forms of the rubredoxin from the archaebacterium Pyrococcus furiosus, an organism that grows optimally at 100 °C, have been determined by X-ray crystallography to a resolution of 1.8 Å. Crystals of this rubredoxin grow in space group P2₁2₁2₁ with room temperature cell dimensions a = 34.6 Å, b = 35.5 Å, and c = 44.4 Å. Initial phases were determined by the method of molecular replacement using the oxidized form of the rubredoxin from the mesophilic eubacterium Clostridium pasteurianum as a starting model. The oxidized and reduced models of P. furiosus rubredoxin each contain 414 nonhydrogen protein atoms comprising 53 residues. The model of the oxidized form contains 61 solvent H₂O oxygen atoms and has been refined with X-PLOR and TNT to a final R = 0.178 with root mean square (rms) deviations from ideality in bond distances and bond angles of 0.014 Å and 2.06°, respectively. The model of the reduced form contains 37 solvent H₂O oxygen atoms and has been refined to R = 0.193 with rms deviations from ideality in bond lengths of 0.012 Å and in bond angles of 1.95°. The overall structure of P. furiosus rubredoxin is similar to the structures of mesophilic rubredoxins, with the exception of a more extensive hydrogen-bonding network in the β-sheet region and multiple electrostatic interactions (salt bridge, hydrogen bonds) of the Glu 14 side chain with groups on three other residues (the amino-terminal nitrogen of Ala 1; the indole nitrogen of Trp 3; and the amide nitrogen group of Phe 29). The influence of these and other features upon the thermostability of the P. furiosus protein is discussed.
