Similar literature
 20 similar documents found (search time: 46 ms)
1.

Background  

Accurate small molecule binding site information for a protein can facilitate studies in drug docking, drug discovery and function prediction, but small molecule binding site protein sequence annotation is sparse. The Small Molecule Interaction Database (SMID), a database of protein domain-small molecule interactions, was created using structural data from the Protein Data Bank (PDB). More importantly, it provides a means to predict small molecule binding sites on proteins with a known or unknown structure and, unlike prior approaches, removes large numbers of false positive hits arising from transitive alignment errors, non-biologically significant small molecules and crystallographic conditions that overpredict ion binding sites.

2.
La D  Kihara D 《Proteins》2012,80(1):126-141
Protein-protein binding events mediate many critical biological functions in the cell. Typically, functionally important sites in proteins can be well identified by considering sequence conservation. However, protein-protein interaction sites exhibit higher sequence variation than other functional regions, such as catalytic sites of enzymes. Consequently, the mutational behavior leading to weak sequence conservation poses significant challenges to protein-protein interaction site prediction. Here, we present a phylogenetic framework to capture critical sequence variations that favor the selection of residues essential for protein-protein binding. Through the comprehensive analysis of diverse protein families, we show that protein binding interfaces exhibit distinct amino acid substitution patterns compared with other surface residues. On the basis of this analysis, we have developed a novel method, BindML, which utilizes the substitution models to predict protein-protein binding sites of proteins with unknown interacting partners. BindML estimates the likelihood that a phylogenetic tree of a local surface region in a query protein structure follows the substitution patterns of protein binding interfaces and nonbinding surfaces. BindML is shown to perform well compared to alternative methods for protein binding interface prediction. The methodology developed in this study is very versatile in the sense that it can be generally applied for predicting other types of functional sites, such as DNA, RNA, and membrane binding sites in proteins.

3.
Chung JL  Wang W  Bourne PE 《Proteins》2006,62(3):630-640
A rapid increase in the number of experimentally derived three-dimensional structures provides an opportunity to better understand and subsequently predict protein-protein interactions. In this study, structurally conserved residues were derived from multiple structure alignments of the individual components of known complexes and the assigned conservation score was weighted based on the crystallographic B factor to account for the structural flexibility that will result in a poor alignment. Sequence profile and accessible surface area information were then combined with the conservation score to predict protein-protein binding sites using a Support Vector Machine (SVM). The incorporation of the conservation score significantly improved the performance of the SVM. About 52% of the binding sites were precisely predicted (greater than 70% of the residues in the site were identified); 77% of the binding sites were correctly predicted (greater than 50% of the residues in the site were identified), and 21% of the binding sites were partially covered by the predicted residues (some residues were identified). The results support the hypothesis that in many cases protein interfaces require some residues to provide rigidity to minimize the entropic cost upon complex formation.
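
The coverage thresholds quoted in this abstract (more than 70% of site residues for a "precise" prediction, more than 50% for a "correct" one, any residue for "partial") can be illustrated with a short sketch. The function name, the label strings, and the choice to return only the strictest label that applies are assumptions made for illustration; they are not taken from the paper.

```python
def classify_site_prediction(true_site, predicted):
    """Label one binding site by the fraction of its residues that were predicted.

    Returns the strictest label that applies; the label names are illustrative.
    """
    true_site = set(true_site)
    if not true_site:
        raise ValueError("true_site must contain at least one residue")
    # Fraction of the true site's residues recovered by the prediction.
    covered = len(true_site & set(predicted)) / len(true_site)
    if covered > 0.70:
        return "precise"
    if covered > 0.50:
        return "correct"
    if covered > 0.0:
        return "partial"
    return "missed"


# Hypothetical residue indices: 8 of 10 site residues are recovered.
print(classify_site_prediction(range(10), range(8)))
```

Note that in the abstract the "correct" category appears to include the "precise" one, whereas this sketch reports mutually exclusive labels.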

4.
Small molecules that bind at protein-protein interfaces may either block or stabilize protein-protein interactions in cells. Thus, some of these binding interfaces may turn into prospective targets for drug design. Here, we collected 175 pairs of protein-protein (PP) complexes and protein-ligand (PL) complexes with known three-dimensional structures for which (1) one protein from the PP complex shares at least 40% sequence identity with the protein from the PL complex, and (2) the interface regions of these proteins overlap at least partially with each other. We found that those residues of the interfaces that may bind the other protein as well as the small molecule are evolutionarily more conserved on average, have a higher tendency of being located in pockets and expose a smaller fraction of their surface area to the solvent than the remaining protein-protein interface region. Based on these findings we derived a statistical classifier that predicts patches at binding interfaces that have a higher tendency to bind small molecules. We applied this new prediction method to more than 10 000 interfaces from the Protein Data Bank. For several complexes related to apoptosis the predicted binding patches were in direct contact with co-crystallized small molecules.

5.
Bordner AJ  Abagyan R 《Proteins》2005,60(3):353-366
Predicting protein-protein interfaces from a three-dimensional structure is a key task of computational structural proteomics. In contrast to geometrically distinct small molecule binding sites, protein-protein interfaces are notoriously difficult to predict. We generated a large nonredundant data set of 1494 true protein-protein interfaces using biological symmetry annotation where necessary. The data set was carefully analyzed and a Support Vector Machine was trained on a combination of a new robust evolutionary conservation signal with the local surface properties to predict protein-protein interfaces. Fivefold cross validation verifies the high sensitivity and selectivity of the model. As much as 97% of the predicted patches had an overlap with the true interface patch while only 22% of the surface residues were included in an average predicted patch. The model allowed the identification of potential new interfaces and the correction of mislabeled oligomeric states.

6.
The modulation of protein-protein interactions (PPIs) by small drug-like molecules is a relatively new area of research and has opened up new opportunities in drug discovery. However, the progress made in this area is limited to a handful of known cases of small molecules that target specific diseases. With the increasing availability of protein structure complexes, it is highly important to devise strategies exploiting homologous structure space on a large scale for discovering putative PPIs that could be attractive drug targets. Here, we propose a scheme that allows performing large-scale screening of all protein complexes and finding putative small-molecule and/or peptide binding sites overlapping with protein-protein binding sites (so-called "multibinding sites"). We find more than 600 nonredundant proteins from 60 protein families with multibinding sites. Moreover, we show that the multibinding sites are mostly observed in transient complexes, largely overlap with the binding hotspots and are more evolutionarily conserved than other interface sites. We investigate possible mechanisms of how small molecules may modulate protein-protein binding and discuss examples of new candidates for drug design.

7.
Metals play a variety of roles in biological processes, and hence their presence in a protein structure can yield vital functional information. Because the residues that coordinate a metal often undergo conformational changes upon binding, detection of binding sites based on simple geometric criteria in proteins without bound metal is difficult. However, aspects of the physicochemical environment around a metal binding site are often conserved even when this structural rearrangement occurs. We have developed a Bayesian classifier using known zinc binding sites as positive training examples and nonmetal binding regions that nonetheless contain residues frequently observed in zinc sites as negative training examples. In order to allow variation in the exact positions of atoms, we average a variety of biochemical and biophysical properties in six concentric spherical shells around the site of interest. At a specificity of 99.8%, this method achieves 75.5% sensitivity in unbound proteins at a positive predictive value of 73.6%. We also test its accuracy on predicted protein structures obtained by homology modeling using templates with 30%-50% sequence identity to the target sequences. At a specificity of 99.8%, we correctly identify at least one zinc binding site in 65.5% of modeled proteins. Thus, in many cases, our model is accurate enough to identify metal binding sites in proteins of unknown structure for which no high sequence identity homologs of known structure exist. Both the source code and a Web interface are available to the public at http://feature.stanford.edu/metals.
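
The shell-averaging idea in this abstract (averaging properties in six concentric spherical shells so that exact atom positions matter less) might look roughly like the sketch below. This is not the authors' code: the function name, the input layout, and in particular the 1.25 Å shell width are assumptions chosen only to make the example concrete.

```python
import math


def shell_averages(center, atoms, n_shells=6, shell_width=1.25):
    """Average a per-atom property in concentric spherical shells around center.

    atoms is an iterable of ((x, y, z), value) pairs. shell_width (in angstroms)
    is an assumed value for illustration, not one given in the abstract.
    Returns n_shells means, with None for shells that contain no atoms.
    """
    sums = [0.0] * n_shells
    counts = [0] * n_shells
    for coords, value in atoms:
        # Shell index grows with distance from the site of interest.
        idx = int(math.dist(center, coords) // shell_width)
        if idx < n_shells:  # atoms beyond the outermost shell are ignored
            sums[idx] += value
            counts[idx] += 1
    return [s / c if c else None for s, c in zip(sums, counts)]


# Hypothetical atoms carrying an arbitrary numeric property.
atoms = [((1, 0, 0), 2.0), ((0, 1, 0), 4.0), ((2, 0, 0), 1.0), ((100, 0, 0), 9.0)]
print(shell_averages((0, 0, 0), atoms))
```

Because each shell reports a mean rather than individual atom positions, small shifts of an atom within a shell leave the feature vector unchanged, which is the tolerance the abstract describes.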

8.
Macromolecular interactions are central to most cellular processes. Experimental methods generate diverse data on these interactions ranging from high throughput protein-protein interactions (PPIs) to the crystallised structures of complexes. Despite this, only a fraction of interactions have been identified and therefore predictive methods are essential to fill in the numerous gaps. Many predictive methods use information from related proteins. Accordingly, we review the conservation of interface and ligand binding sites within protein families and their association with conserved residues and Specificity Determining Positions. We then review recent developments in predictive methods for the identification of PPIs, protein interface sites and small molecule ligand binding sites. The challenges that are still faced by the community in these areas are discussed.

9.
Liang S  Liu Z  Li W  Ni L  Lai L 《Biopolymers》2000,54(7):515-523
We have developed a strategy for grafting a protein-protein interface based on the known crystal structure of a native ligand and receptor proteins in a complex. The key interaction residues at the ligand protein binding interface are transferred onto a scaffold protein so that the mutated scaffold protein will bind the receptor protein in the same manner as the ligand protein. First, our method identifies key residues and atoms in the ligand protein, which strongly interact with the receptor protein. Second, this method searches the scaffold protein for combinations of candidate residues, among which the distance between any two candidate residues is similar to that between relevant key interaction residues in the ligand protein. These candidate residues are mutated to key interaction residues in the ligand protein respectively. The scaffold protein is superposed onto the ligand protein based upon the coordinates of corresponding atoms, which are assumed to strongly interact with the receptor protein. Complementarity between scaffold and receptor proteins is evaluated. Scaffold proteins with a low superposing rms difference and high complementary score are accepted for further analysis. Then, the relative position of the scaffold protein is adjusted so that the interfaces between the scaffold and receptor proteins have a reasonable packing density. Other mutations are also considered to reduce the desolvation energy or bad steric contacts. Finally, the scaffold protein is cominimized with the receptor protein and evaluated. To test the method, the binding interface of barstar, the inhibitor of barnase, was grafted onto small proteins. Four scaffold proteins with high complementary scores are accepted.

10.
Lu CH  Lin YF  Lin JJ  Yu CS 《PloS one》2012,7(6):e39252
The structure of a protein determines its function and its interactions with other factors. Regions of proteins that interact with ligands, substrates, and/or other proteins, tend to be conserved both in sequence and structure, and the residues involved are usually in close spatial proximity. More than 70,000 protein structures are currently found in the Protein Data Bank, and approximately one-third contain metal ions essential for function. Identifying and characterizing metal ion-binding sites experimentally is time-consuming and costly. Many computational methods have been developed to identify metal ion-binding sites, and most use only sequence information. For the work reported herein, we developed a method that uses sequence and structural information to predict the residues in metal ion-binding sites. Six types of metal ion-binding templates (those involving Ca²⁺, Cu²⁺, Fe³⁺, Mg²⁺, Mn²⁺, and Zn²⁺) were constructed using the residues within 3.5 Å of the center of the metal ion. Using the fragment transformation method, we then compared known metal ion-binding sites with the templates to assess the accuracy of our method. Our method achieved an overall 94.6% accuracy with a true positive rate of 60.5% at a 5% false positive rate and therefore constitutes a significant improvement in metal-binding site prediction.
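
As a rough illustration of the template construction step, the sketch below collects residues that have at least one atom within a 3.5 Å cutoff of a metal ion center. The function name, the `(residue_id, (x, y, z))` input layout, and the example coordinates are hypothetical, and the sketch does not attempt the fragment transformation comparison itself.

```python
import math


def residues_near_metal(metal_xyz, atoms, cutoff=3.5):
    """Return sorted residue IDs with at least one atom within cutoff of the ion.

    atoms is an iterable of (residue_id, (x, y, z)) pairs; distances are in
    angstroms. The input layout is an assumption made for this example.
    """
    return sorted(
        {res_id for res_id, xyz in atoms if math.dist(metal_xyz, xyz) <= cutoff}
    )


# Hypothetical coordinates around an ion placed at the origin.
atoms = [
    ("HIS94", (2.1, 0.0, 0.0)),
    ("HIS94", (4.0, 0.0, 0.0)),
    ("HIS96", (0.0, 2.0, 0.0)),
    ("GLU117", (5.0, 5.0, 5.0)),
]
print(residues_near_metal((0.0, 0.0, 0.0), atoms))
```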

11.
The metallocarboxypeptidases (MCPs) belonging to the clan MC were studied by the Optimal Docking Area (ODA) method to evaluate protein-protein binding sites and to provide a basis for the identification of binding partners for this class of enzymes. The ODA method identifies surface patches with optimal desolvation energy based on the selection of low-energy docking regions, generated from a set of surface points around the protein. With few exceptions, the ODA method identified surface patches with a significant low-energy docking surface for all the MCPs with known three-dimensional structure. Overall, in 14 out of 24 cases, the detected ODA patches were correctly located (i.e. more than 50% of the predicted residues were in known protein-protein binding sites), yielding a global success rate of 58%. More specifically, the success rate increased up to 80% on the ODA patches detected for the catalytic domains of the M14A subfamily, independently of the partner. Interestingly, the ODA residues on the catalytic domain were correctly located in the interface with the N-terminal pro domain in all MCPs. The spatial distribution of the ODA patches for the different members of the family is in relation to the origin and function of the particular MCP, which allowed distinguishing between them. In good agreement with the experimentally characterized protein interfaces, the total average surface area of the theoretically derived ODA patches for the catalytic domain of MCPs is around 1700 Å² and their content in hydrophobic residues is about 40%. As a particular case, the average surface area of the ODA patches in MCPs of crop insect pests is about twice that of the MCPs of vertebrates, which might be related to their particular function. We recognized two binding regions for the catalytic domain of the MCPs, one of them accounting for nearly all the known intermolecular interactions made up by the enzymes. Protein inhibitors seem to have evolved to dock on this subset of ODA patches, evoking the binding mode of the N-terminal pro domains. The second binding region detected, for which no ligands have been identified so far, seems to be related to the acquisition/maintenance of the native structure of the peptidase. Overall, the ODA method has been successful in identifying low-energy docking areas in a set of structurally and functionally related proteins, suggesting that it can be easily extended to other families in the search for protein-protein binding sites and for their functional significance.

12.
Protein-protein complex formation involves removal of water from the interface region. Surface regions with a small free energy penalty for water removal or desolvation may correspond to preferred interaction sites. A method to calculate the electrostatic free energy of placing a neutral low-dielectric probe at various protein surface positions has been designed and applied to characterize putative interaction sites. Based on solutions of the finite-difference Poisson equation, this method also includes long-range electrostatic contributions and the protein solvent boundary shape in contrast to accessible-surface-area-based solvation energies. Calculations on a large set of proteins indicate that in many cases (>90%), the known binding site overlaps with one of the six regions of lowest electrostatic desolvation penalty (overlap with the lowest desolvation region for 48% of proteins). Since the onset of electrostatic desolvation occurs even before direct protein-protein contact formation, it may help guide proteins toward the binding region in the final stage of complex formation. It is interesting that the probe desolvation properties associated with residue types were found to depend to some degree on whether the residue was outside of or part of a binding site. The probe desolvation penalty was on average smaller if the residue was part of a binding site compared to other surface locations. Applications to several antigen-antibody complexes demonstrated that the approach might be useful not only to predict protein interaction sites in general but to map potential antigenic epitopes on protein surfaces.

13.
Water and ligand binding play critical roles in the structure and function of proteins, yet their binding sites and significance are difficult to predict a priori. Multiple solvent crystal structures (MSCS) is a method where several X-ray crystal structures are solved, each in a unique solvent environment, with organic molecules that serve as probes of the protein surface for sites evolved to bind ligands, while the first hydration shell is essentially maintained. When superimposed, these structures contain a vast amount of information regarding hot spots of protein-protein or protein-ligand interactions, as well as conserved water-binding sites retained with the change in solvent properties. Optimized mining of this information requires reliable structural data and a consistent, objective analysis tool. Detection of related solvent positions (DRoP) was developed to automatically organize and rank the water or small organic molecule binding sites within a given set of structures. It is a flexible tool that can also be used in conserved water analysis given multiple structures of any protein independent of the MSCS method. The DRoP output is an HTML format list of the solvent sites ordered by conservation rank in its population within the set of structures, along with renumbered and recolored PDB files for visualization and facile analysis. Here, we present a previously unpublished set of MSCS structures of bovine pancreatic ribonuclease A (RNase A) and use it together with published structures to illustrate the capabilities of DRoP.

14.
The crystal structure of the olfactory marker protein at 2.3 Å resolution   Total citations: 1 (self-citations: 0, citations by others: 1)
Olfactory marker protein (OMP) is a highly expressed and phylogenetically conserved cytoplasmic protein of unknown function found almost exclusively in mature olfactory sensory neurons. Electrophysiological studies of olfactory epithelia in OMP knock-out mice show strongly retarded recovery following odorant stimulation leading to an impaired response to pulsed odor stimulation. Although these studies show that OMP is a modulator of the olfactory signal-transduction cascade, its biochemical role is not established. In order to facilitate further studies on the molecular function of OMP, its crystal structure has been determined at 2.3 Å resolution using multiwavelength anomalous diffraction experiments on selenium-labeled protein. OMP is observed to form a modified beta-clamshell structure with eight antiparallel beta-strands. While OMP has no significant sequence homology to proteins of known structure, it has a similar fold to a domain found in a variety of existing structures, including in a large family of viral capsid proteins. The surface of OMP is mostly convex and lacking obvious small molecule binding sites, suggesting that it is more likely to be involved in modulating protein-protein interaction than in interacting with small molecule ligands. Three highly conserved regions have been identified as leading candidates for protein-protein interaction sites in OMP. One of these sites represents a loop known to mediate ligand interactions in the structurally homologous EphB2 receptor ligand-binding domain. This site is partially buried in the crystal structure but fully exposed in the NMR solution structure of OMP due to a change in the orientation of an alpha-helix that projects outward from the structurally invariant beta-clamshell core. Gating of this conformational change by molecular interactions in the signal-transduction cascade could be used to control access to OMP's equivalent of the EphB2 ligand-interaction loop, thereby allowing OMP to function as a molecular switch.

15.
We show that reductive methylation of proteins can be used for highly sensitive NMR identification of conformational changes induced by metal- and small molecule binding, as well as protein-protein interactions. Reductive methylation of proteins introduces two ¹³C-methyl groups on each lysine in the protein of interest. This method works well even when the lysines are not actively involved in the interaction, due to changes in the microenvironments of lysine residues. Most lysine residues are located on the protein exterior, and the exposed ¹³C-methyl groups may exhibit rapid localized motions. These motions could be faster than the tumbling rate of the molecule as a whole. Thus, this technique has great potential in the study of large molecular weight systems which are currently beyond the scope of conventional NMR methods.

16.
This work investigates statistical prevalence and overall physical origins of changes in charge states of receptor proteins upon ligand binding. These changes are explored as a function of the ligand type (small molecule, protein, and nucleic acid), and distance from the binding region. Standard continuum solvent methodology is used to compute, on an equal footing, pK changes upon ligand binding for a total of 5899 ionizable residues in 20 protein-protein, 20 protein-small molecule, and 20 protein-nucleic acid high-resolution complexes. The size of the data set combined with an extensive error and sensitivity analysis allows us to make statistically justified and conservative conclusions: in 60% of all protein-small molecule, 90% of all protein-protein, and 85% of all protein-nucleic acid complexes there exists at least one ionizable residue that changes its charge state upon ligand binding at physiological conditions (pH = 6.5). Considering the most biologically relevant pH range of 4-8, the number of ionizable residues that experience substantial pK changes (ΔpK > 1.0) due to ligand binding is appreciable: on average, 6% of all ionizable residues in protein-small molecule complexes, 9% in protein-protein, and 12% in protein-nucleic acid complexes experience a substantial pK change upon ligand binding. These changes are safely above the statistical false-positive noise level. Most of the changes occur in the immediate binding interface region, where approximately one out of five ionizable residues experiences substantial pK change regardless of the ligand type. However, the physical origins of the change differ between the types: in protein-nucleic acid complexes, the pK values of interface residues are predominantly affected by electrostatic effects, whereas in protein-protein and protein-small molecule complexes, structural changes due to the induced-fit effect play an equally important role. In protein-protein and protein-nucleic acid complexes, there is a statistically significant number of substantial pK perturbations, mostly due to the induced-fit structural changes, in regions far from the binding interface.

17.
MOTIVATION: Large-scale experiments reveal pairs of interacting proteins but leave the residues involved in the interactions unknown. These interface residues are essential for understanding the mechanism of interaction and are often desired drug targets. Reliable identification of residues that reside in protein-protein interfaces typically requires analysis of protein structure. Therefore, for the vast majority of proteins, for which there is no high-resolution structure, there is no effective way of identifying interface residues. RESULTS: Here we present a machine learning-based method that identifies interacting residues from sequence alone. Although the method is developed using transient protein-protein interfaces from complexes of experimentally known 3D structures, it never explicitly uses 3D information. Instead, we combine predicted structural features with evolutionary information. The strongest predictions of the method reached over 90% accuracy in a cross-validation experiment. Our results suggest that despite the significant diversity in the nature of protein-protein interactions, they all share common basic principles and that these principles are identifiable from sequence alone.

18.
Protein–protein interactions are challenging targets for modulation by small molecules. Here, we propose an approach that harnesses the increasing structural coverage of protein complexes to identify small molecules that may target protein interactions. Specifically, we identify ligand and protein binding sites that overlap upon alignment of homologous proteins. Of the 2,619 protein structure families observed to bind proteins, 1,028 also bind small molecules (250–1000 Da), and 197 exhibit a statistically significant (p<0.01) overlap between ligand and protein binding positions. These “bi-functional positions”, which bind both ligands and proteins, are particularly enriched in tyrosine and tryptophan residues, similar to “energetic hotspots” described previously, and are significantly less conserved than mono-functional and solvent exposed positions. Homology transfer identifies ligands whose binding sites overlap at least 20% of the protein interface for 35% of domain–domain and 45% of domain–peptide mediated interactions. The analysis recovered known small-molecule modulators of protein interactions as well as predicted new interaction targets based on the sequence similarity of ligand binding sites. We illustrate the predictive utility of the method by suggesting structural mechanisms for the effects of sanglifehrin A on HIV virion production, bepridil on the cellular entry of anthrax edema factor, and fusicoccin on vertebrate developmental pathways. The results, available at http://pibase.janelia.org, represent a comprehensive collection of structurally characterized modulators of protein interactions, and suggest that homologous structures are a useful resource for the rational design of interaction modulators.
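
The "overlap at least 20% of the protein interface" criterion can be read as a simple set computation over aligned positions. The sketch below is one plausible reading under that assumption; the function name and the use of integer alignment positions are illustrative and are not taken from the paper or from the pibase resource.

```python
def interface_overlap_fraction(ligand_positions, interface_positions):
    """Fraction of protein-interface positions that also contact the ligand.

    Both arguments are iterables of aligned position indices (an assumed
    representation for this example).
    """
    interface = set(interface_positions)
    if not interface:
        raise ValueError("interface_positions must not be empty")
    return len(interface & set(ligand_positions)) / len(interface)


# Hypothetical aligned positions: the ligand touches 3 of 10 interface positions.
fraction = interface_overlap_fraction({3, 4, 5, 42}, range(1, 11))
print(fraction, fraction >= 0.20)
```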

19.
Ribonuclease P (RNase P) is the endonuclease responsible for the removal of 5' leader sequences from tRNA precursors. The crystal structure of an archaeal RNase P protein, Ph1771p (residues 36-127) from the hyperthermophilic archaeon Pyrococcus horikoshii OT3, was determined at 2.0 Å resolution by X-ray crystallography. The structure is composed of four helices (alpha1-alpha4) and a six-stranded antiparallel beta-sheet (beta1-beta6) with a protruding beta-strand (beta7) at the C-terminal region. The strand beta7 forms an antiparallel beta-sheet by interacting with strand beta4 in a symmetry-related molecule, suggesting that strands beta4 and beta7 could be involved in protein-protein interactions with other RNase P proteins. Structural comparison showed that the beta-barrel structure of Ph1771p has a topological resemblance to those of Staphylococcus aureus translational regulator Hfq and Haloarcula marismortui ribosomal protein L21E, suggesting that these RNA binding proteins have a common ancestor and then diverged to specifically bind to their cognate RNAs. The structure analysis as well as structural comparison suggested two possible RNA binding sites in Ph1771p, one being a concave surface formed by terminal alpha-helices (alpha1-alpha4) and beta-strand beta6, where positively charged residues are clustered. A second possible RNA binding site is at a loop region connecting strands beta2 and beta3, where conserved hydrophilic residues are exposed to the solvent and interact specifically with a sulfate ion. These two potential sites for RNA binding are located in close proximity. The crystal structure of Ph1771p provides insight into the structure and function relationships of archaeal and eukaryotic RNase P.

20.
The discovery of small-molecule drugs aimed at disrupting protein-protein associations is expected to lead to promising therapeutic strategies. The small molecule binds to the target protein, thus replacing its natural protein partner. Notably, structural analysis of complexes between successful disruptive small molecules and their target proteins has suggested the possibility that such ligands might somehow mimic the binding behavior of the protein they replace. In these cases, the molecules show a spatial and "chemical" (i.e., hydrophobicity) similarity with the residues of the partner protein involved in the protein-protein complex interface. However, other disruptive small molecules do not seem to show such spatial and chemical correspondence with the replaced protein. In turn, recent progress in the understanding of protein-protein interactions and binding hot spots has revealed the main role of intermolecular wrapping interactions: three-body cooperative correlations in which nonpolar groups in the partner protein promote dehydration of a two-body electrostatic interaction of the other protein. Hence, in the present work, we study some successful complexes between already discovered small disruptive drug-like molecules and their target proteins already reported in the literature and we compare them with the complexes between such proteins and their natural protein partners. Our results show that the small molecules do in fact mimic to a great extent the wrapping behavior of the protein they replace. Thus, by revealing the replacement the small molecule performs of relevant wrapping interactions, we convey precise physical meaning to the mimicking concept, a knowledge that might be exploited in future drug-design endeavors.
