Similar literature
 Found 20 similar articles
1.
Nearly half of known protein structures interact with phosphate-containing ligands, such as nucleotides and other cofactors. Many methods have been developed for the identification of metal ion-binding sites and some for bigger ligands such as carbohydrates, but none is yet available for the prediction of phosphate-binding sites. Here we describe Pfinder, a method that predicts binding sites for phosphate groups, either in the form of ions or as parts of other non-peptide ligands, in proteins of known structure. Pfinder uses the Query3D local structural comparison algorithm to scan a protein structure for the presence of a number of structural motifs identified for their ability to bind the phosphate chemical group. Pfinder has been tested on a data set of 52 proteins for which both the apo and holo forms were available. We obtained at least one correct prediction in 63% of the holo structures and in 62% of the apo structures. The ability of Pfinder to recognize a phosphate-binding site in unbound protein structures makes it an ideal tool for functional annotation and for complementing docking and drug design methods. The Pfinder program is available at http://pdbfun.uniroma2.it/pfinder.

2.
Theoretical microscopic titration curves (THEMATICS) is a computational method for the identification of active sites in proteins through deviations in computed titration behavior of ionizable residues. While the sensitivity to catalytic sites is high, the previously reported sensitivity to catalytic residues was not as high, about 50%. Here THEMATICS is combined with support vector machines (SVM) to improve sensitivity for catalytic residue prediction from protein 3D structure alone. For a test set of 64 proteins taken from the Catalytic Site Atlas (CSA), the average recall rate for annotated catalytic residues is 61%; good precision is maintained, with only 4% of all residues selected. The average false positive rate, using the CSA annotations, is only 3.2%, far lower than other 3D-structure-based methods. THEMATICS-SVM returns higher precision, lower false positive rate, and better overall performance, compared with other 3D-structure-based methods. Comparison is also made with the latest machine learning methods that are based on both sequence alignments and 3D structures. For annotated sets of well-characterized enzymes, THEMATICS-SVM performance compares very favorably with methods that utilize sequence homology. However, since THEMATICS depends only on the 3D structure of the query protein, no decline in performance is expected when applied to novel folds, proteins with few sequence homologues, or even orphan sequences. An extension of the method to predict non-ionizable catalytic residues is also presented. THEMATICS-SVM predicts a local network of ionizable residues with strong interactions between protonation events; this appears to be a special feature of enzyme active sites.
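The recall, precision, and false positive rate quoted in this abstract can be illustrated with a minimal sketch. It is not the THEMATICS-SVM code; the function name `residue_metrics` and the residue numbers are hypothetical, and it assumes the standard per-residue definitions, with annotated catalytic residues as positives and every other residue in the protein as a negative.

```python
def residue_metrics(predicted, annotated, all_residues):
    """Per-residue recall, precision and false positive rate.

    predicted    -- residues selected by the predictor (hypothetical input)
    annotated    -- residues annotated as catalytic (e.g. in a curated set)
    all_residues -- every residue in the protein
    """
    predicted, annotated, all_residues = set(predicted), set(annotated), set(all_residues)
    tp = len(predicted & annotated)                 # predicted and annotated
    fp = len(predicted - annotated)                 # predicted but not annotated
    fn = len(annotated - predicted)                 # annotated but missed
    tn = len(all_residues - predicted - annotated)  # correctly left unselected
    return {
        "recall": tp / (tp + fn) if tp + fn else 0.0,
        "precision": tp / (tp + fp) if tp + fp else 0.0,
        "false_positive_rate": fp / (fp + tn) if fp + tn else 0.0,
    }


# Illustrative numbers only: a 100-residue protein with 4 annotated residues.
metrics = residue_metrics(
    predicted={10, 20, 30, 55},
    annotated={10, 20, 30, 40},
    all_residues=range(1, 101),
)
```

Because catalytic residues are a small fraction of any protein, a low false positive rate can coexist with modest precision, which is why the abstract reports both.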

3.
A new approach to the functional classification of protein 3D structures is described with application to some examples from structural genomics. This approach is based on functional site prediction with THEMATICS and POOL. THEMATICS employs calculated electrostatic potentials of the query structure. POOL is a machine learning method that utilizes THEMATICS features and has been shown to predict accurate, precise, highly localized interaction sites. Extension to the functional classification of structural genomics proteins is now described. Predicted functionally important residues are structurally aligned with those of proteins with previously characterized biochemical functions. A 3D structure match at the predicted local functional site then serves as a more reliable predictor of biochemical function than an overall structure match. Annotation is confirmed for a structural genomics protein with the ribulose phosphate binding barrel (RPBB) fold. A putative glucoamylase from Bacteroides fragilis (PDB ID 3eu8) is shown to be, in fact, probably not a glucoamylase. Finally, a structural genomics protein from Streptomyces coelicolor annotated as an enoyl-CoA hydratase (PDB ID 3g64) is shown to be misannotated. Its predicted active site does not match the well-characterized enoyl-CoA hydratases of similar structure but rather bears closer resemblance to those of a dehalogenase with similar fold.

4.
Genomics has posed the challenge of determination of protein function from sequence and/or 3-D structure. Functional assignment from sequence relationships can be misleading, and structural similarity does not necessarily imply functional similarity. Proteins in the DJ-1 family, many of which are of unknown function, are examples of proteins with both sequence and fold similarity that span multiple functional classes. THEMATICS (theoretical microscopic titration curves), an electrostatics-based computational approach to functional site prediction, is used to sort proteins in the DJ-1 family into different functional classes. Active site residues are predicted for the eight distinct DJ-1 proteins with available 3-D structures. Placement of the predicted residues onto a structural alignment for six of these proteins reveals three distinct types of active sites. Each type overlaps only partially with the others, with only one residue in common across all six sets of predicted residues. Human DJ-1 and YajL from Escherichia coli have very similar predicted active sites and belong to the same probable functional group. Protease I, a known cysteine protease from Pyrococcus horikoshii, and PfpI/YhbO from E. coli, a hypothetical protein of unknown function, belong to a separate class. THEMATICS predicts a set of residues that is typical of a cysteine protease for Protease I; the prediction for PfpI/YhbO bears some similarity. YDR533Cp from Saccharomyces cerevisiae, of unknown function, and the known chaperone Hsp31 from E. coli constitute a third group with nearly identical predicted active sites. While the first four proteins have predicted active sites at dimer interfaces, YDR533Cp and Hsp31 both have predicted sites contained within each subunit. Although YDR533Cp and Hsp31 form different dimers with different orientations between the subunits, the predicted active sites are superimposable within the monomer structures. Thus, the three predicted functional classes form four different types of quaternary structures. The computational prediction of the functional sites for protein structures of unknown function provides valuable clues for functional classification.

5.
One of the major challenges in genomics is to understand the function of gene products from their 3D structures. Computational methods are needed for the high-throughput prediction of the function of proteins from their 3D structure. Methods that identify active sites are important for understanding and annotating the function of proteins. Traditional methods exploiting either sequence similarity or structural similarity can be unreliable and cannot be applied to proteins with novel folds or low homology with other proteins. Here, we present a machine-learning application that combines computed electrostatic, evolutionary, and pocket geometric information for high-performance prediction of catalytic residues. Input features consist of our structure-based theoretical microscopic anomalous titration curve shapes (THEMATICS) electrostatics data, enhanced with sequence-based phylogenetic information from INTREPID and topological pocket information from ConCavity. Our THEMATICS-based input features are augmented with an additional metric, the theoretical buffer range. With the integration of the three different types of input, each of which performs admirably on its own, significantly better performance is achieved than that of any of these methods by itself. This combined method achieves 86.7%, 92.5%, and 93.8% recall of annotated functional residues at 5, 8, and 10% false-positive rates, respectively.

6.
Babor M, Gerzon S, Raveh B, Sobolev V, Edelman M. Proteins 2008, 70(1): 208-217
Metal ions are crucial for protein function. They participate in enzyme catalysis, play regulatory roles, and help maintain protein structure. Current tools for predicting metal-protein interactions are based on proteins crystallized with their metal ions present (holo forms). However, a majority of resolved structures are free of metal ions (apo forms). Moreover, metal binding is a dynamic process, often involving conformational rearrangement of the binding pocket. Thus, effective predictions need to be based on the structure of the apo state. Here, we report an approach that identifies transition metal-binding sites in apo forms with a resulting selectivity >95%. Applying the approach to apo forms in the Protein Data Bank and structural genomics initiative identifies a large number of previously unknown, putative metal-binding sites, and their amino acid residues, in some cases providing a first clue to the function of the protein.

7.
Hundreds of protein crystal structures exist for proteins whose function cannot be confidently determined from sequence similarity. Surflex‐PSIM, a previously reported surface‐based protein similarity algorithm, provides an alternative method for hypothesizing function for such proteins. The method now supports fully automatic binding site detection and is fast enough to screen comprehensive databases of protein binding sites. The binding site detection methodology was validated on apo/holo cognate protein pairs, correctly identifying 91% of ligand binding sites in holo structures and 88% in apo structures where corresponding sites existed. For correctly detected apo binding sites, the cognate holo site was the most similar binding site 87% of the time. PSIM was used to screen a set of proteins that had poorly characterized functions at the time of crystallization, but were later biochemically annotated. Using a fully automated protocol, this set of 8 proteins was screened against ~60,000 ligand binding sites from the PDB. PSIM correctly identified functional matches that predated query protein biochemical annotation for five out of the eight query proteins. A panel of 12 currently unannotated proteins was also screened, resulting in a large number of statistically significant binding site matches, some of which suggest likely functions for the poorly characterized proteins. Proteins 2014; 82:679–694. © 2013 Wiley Periodicals, Inc.

8.
Calcium binding in proteins exhibits a wide range of polygonal geometries that relate directly to an equally diverse set of biological functions. The binding process stabilizes protein structures and typically results in local conformational change and/or global restructuring of the backbone. Previously, we established the MUG program, which utilized multiple geometries in the Ca2+‐binding pockets of holoproteins to identify such pockets, ignoring possible Ca2+‐induced conformational change. In this article, we first report our progress in the analysis of Ca2+‐induced conformational changes followed by improved prediction of Ca2+‐binding sites in the large group of Ca2+‐binding proteins that exhibit only localized conformational changes. The MUGSR algorithm was devised to incorporate side chain torsional rotation as a predictor. The output from MUGSR presents groups of residues where each group, typically containing two to five residues, is a potential binding pocket. MUGSR was applied to both X‐ray apo structures and NMR holo structures, which did not use calcium distance constraints in structure calculations. Predicted pockets were validated by comparison with homologous holo structures. Defining a “correct hit” as a group of residues containing at least two true ligand residues, the sensitivity was at least 90%; whereas for a “correct hit” defined as a group of residues containing at least three true ligand residues, the sensitivity was at least 78%. These data suggest that Ca2+‐binding pockets are at least partially prepositioned to chelate the ion in the apo form of the protein.
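The two "correct hit" thresholds in this abstract can be made concrete with a short sketch. It is not the MUGSR implementation: the function names and residue labels are hypothetical, and it assumes that sensitivity is the fraction of known binding sites for which at least one predicted residue group qualifies as a correct hit.

```python
def is_correct_hit(group, true_ligands, min_true=2):
    """A predicted group is a correct hit if it contains at least
    `min_true` of the true ligand residues of the site."""
    return len(set(group) & set(true_ligands)) >= min_true


def sensitivity(cases, min_true=2):
    """Fraction of known sites recovered by at least one predicted group.

    cases -- list of (predicted_groups, true_ligands) pairs, one per known site
    """
    if not cases:
        raise ValueError("at least one known site is required")
    hits = sum(
        1
        for groups, true_ligands in cases
        if any(is_correct_hit(g, true_ligands, min_true) for g in groups)
    )
    return hits / len(cases)


# Hypothetical residue labels, for illustration only.
cases = [
    ([{"D20", "D22", "E31"}], {"D20", "D22", "S24", "E31"}),  # three true ligands found
    ([{"D56", "D58"}], {"D56", "D58", "E67"}),                # two true ligands found
    ([], {"D93", "D95"}),                                     # nothing predicted
]
loose = sensitivity(cases, min_true=2)   # two of the three sites count as hits
strict = sensitivity(cases, min_true=3)  # only the first site still counts
```

Raising `min_true` can only lower the measured sensitivity, which matches the direction of the two figures reported above.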

9.
The characteristics of heme prosthetic groups and their binding sites have been analyzed in detail in a data set of nonhomologous heme proteins. Variations in the shape, volume, and chemical composition of the binding site, in the mode of heme binding and in the number and nature of heme–protein interactions are found to result in significantly different heme environments in proteins with different functions in biology. Differences are also seen in the properties of the apo states of the proteins. The apo states of proteins that bind heme permanently in their functional form show some disorder, ranging from local unfolding in the heme binding pocket to complete unfolding to give a random coil. In contrast, proteins that bind heme transiently are fully folded in their apo and holo states, presumably allowing both apo and holo forms to remain biologically active resisting aggregation or proteolysis. The principles identified here provide a framework for the design of de novo proteins that will exhibit tight heme ligand binding and for the identification of the function of structural genomic target proteins with heme ligands. Proteins 2010. © 2010 Wiley‐Liss, Inc.

10.
Cellular retinoic acid binding protein I (CRABPI) belongs to the family of intracellular lipid binding proteins (iLBPs), all of which bind a hydrophobic ligand within an internal cavity. The structures of several iLBPs reveal minimal structural differences between the apo (ligand-free) and holo (ligand-bound) forms, suggesting that dynamics must play an important role in the ligand recognition and binding processes. Here, a variety of nuclear magnetic resonance (NMR) spectroscopy methods were used to systematically study the dynamics of both apo and holo CRABPI at various time scales. Translational and rotational diffusion constant measurements were used to study the overall motions of the proteins. Both apo and holo forms of CRABPI tend to self-associate at high (1.2 mM) concentrations, while at low concentrations (0.2 mM), they are predominantly monomeric. Rapid amide exchange rate and laboratory frame relaxation rate measurements at two spectrometer field strengths (500 and 600 MHz) were used to probe the internal motions of the individual residues. Several residues in the apo form, notably within the ligand recognition region, exhibit millisecond time scale motions that are significantly arrested in the holo form. In contrast, no significant differences in the high-frequency motions were observed between the two forms. These results provide direct experimental evidence for dynamics-induced ligand recognition and binding at a specifically defined time scale. They also exemplify the importance of dynamics in providing a more comprehensive understanding of how a protein functions.

11.
THEMATICS (Theoretical Microscopic Titration Curves) is a simple, reliable computational predictor of the active sites of enzymes from structure. Our method, based on well-established Finite Difference Poisson-Boltzmann techniques, identifies the ionisable residues with anomalous predicted titration behavior. A cluster of two or more such perturbed residues is a very reliable predictor of the active site. The protein does not have to bear any resemblance in sequence or structure to any previously characterized protein, but the method does require the three-dimensional structure. We now present evidence that THEMATICS can also locate the active site in structures built by comparative modeling from similar structures. Results are given for a total of 21 sets of proteins, including 21 templates and 83 comparative model structures. Detailed results are presented for three sets of orthologous proteins (Triosephosphate isomerase, 6-Hydroxymethyl-7,8-dihydropterin pyrophosphokinase, and Aspartate aminotransferase) and for one set of human homologues of Aldose reductase with different functions. THEMATICS correctly locates the active site in the model structures. This suggests that the method can be applicable to a much larger set of proteins for which an experimentally determined structure is unavailable. With a few exceptions, the predicted active sites in the comparative model structures are similar to that of the corresponding template structure.

12.
A new monotonicity-constrained maximum likelihood approach, called Partial Order Optimum Likelihood (POOL), is presented and applied to the problem of functional site prediction in protein 3D structures, an important current challenge in genomics. The input consists of electrostatic and geometric properties derived from the 3D structure of the query protein alone. Sequence-based conservation information, where available, may also be incorporated. Electrostatics features from THEMATICS are combined with multidimensional isotonic regression to form maximum likelihood estimates of probabilities that specific residues belong to an active site. This allows likelihood ranking of all ionizable residues in a given protein based on THEMATICS features. The corresponding ROC curves and statistical significance tests demonstrate that this method outperforms prior THEMATICS-based methods, which in turn have been shown previously to outperform other 3D-structure-based methods for identifying active site residues. Then it is shown that the addition of one simple geometric property, the size rank of the cleft in which a given residue is contained, yields improved performance. Extension of the method to include predictions of non-ionizable residues is achieved through the introduction of environment variables. This extension results in even better performance than THEMATICS alone and constitutes to date the best functional site predictor based on 3D structure only, achieving nearly the same level of performance as methods that use both 3D structure and sequence alignment data. Finally, the method also easily incorporates such sequence alignment data, and when this information is included, the resulting method is shown to outperform the best current methods using any combination of sequence alignments and 3D structures. Included is an analysis demonstrating that when THEMATICS features, cleft size rank, and alignment-based conservation scores are used individually or in combination, THEMATICS features represent the single most important component of such classifiers.
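The idea of a monotonicity-constrained probability estimate can be sketched in one dimension. The abstract describes multidimensional isotonic regression over a partial order; the code below is only the one-feature special case, solved with the classical pool-adjacent-violators algorithm, and is not the POOL implementation. The function name `pava` and the example labels are hypothetical.

```python
def pava(y, w=None):
    """Pool-adjacent-violators: the non-decreasing sequence closest to `y`
    in weighted least squares. For 0/1 labels ordered by a single feature,
    the result can be read as monotone probability estimates."""
    weights = [1.0] * len(y) if w is None else list(w)
    blocks = []  # each block is [mean, total weight, number of points]
    for yi, wi in zip(y, weights):
        blocks.append([float(yi), float(wi), 1])
        # Merge backwards while the monotonicity constraint is violated.
        while len(blocks) > 1 and blocks[-2][0] > blocks[-1][0]:
            m2, w2, c2 = blocks.pop()
            m1, w1, c1 = blocks.pop()
            total = w1 + w2
            blocks.append([(m1 * w1 + m2 * w2) / total, total, c1 + c2])
    fitted = []
    for mean, _, count in blocks:
        fitted.extend([mean] * count)
    return fitted


# Residues sorted by increasing value of one feature (hypothetical);
# label 1 marks an annotated active-site residue, 0 any other residue.
labels_sorted_by_feature = [0, 1, 0, 1, 1]
probabilities = pava(labels_sorted_by_feature)
```

The fitted values never decrease along the feature ordering, so ranking residues by them preserves the assumption that a larger feature value never makes active-site membership less likely.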

13.
Wong S, Jacobson MP. Proteins 2008, 71(1): 153-164
Ligand binding frequently induces significant conformational changes in a protein receptor. Understanding and predicting such conformational changes represent an important challenge for computational biology, including applications to structure-based drug design. We describe an approach to this problem based on the assumption that the holo state is at least transiently populated in the absence of a ligand; this hypothesis has been referred to as "conformational selection." Here, we apply a method that tests this hypothesis on a challenging class of ligand-induced conformational changes, which we refer to as loop latching: the closing of a loop around an active site that sequesters the ligand from solvent. The method uses a combination of replica exchange molecular dynamics and a loop prediction algorithm to generate low-energy loop structures, and docking to select the conformation appropriate for binding a particular ligand. On a test set of six proteins, it yields loop structures including holo-like conformations, generally below 2 Å RMSD from the liganded structure, for loops that span up to 15 residues. Docking serves as a stringent test of the predictions. In five of the six cases, the predicted loop conformations improve the ranks of cognate ligands relative to using the apo structure, although the results remain, in most cases, significantly worse than using a holo structure. The poses of the cognate ligands are correct in four of the six test cases, while they are correct for five of the six using a holo structure.
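The RMSD criterion used above to judge predicted loops can be sketched as follows. This is a generic root-mean-square deviation, not the authors' protocol: it assumes the two structures are already superimposed in a common frame and that atoms are listed in matching order, and the coordinates are invented for illustration.

```python
import math


def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two equally sized lists of
    (x, y, z) coordinates, in the units of the input (Å for PDB files).
    No superposition is performed; the frames are assumed to match."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    squared = sum(
        (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
        for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b)
    )
    return math.sqrt(squared / len(coords_a))


# Invented coordinates: a predicted loop shifted by 2 Å along z.
predicted_loop = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (3.0, 0.0, 0.0)]
holo_loop = [(0.0, 0.0, 2.0), (1.5, 0.0, 2.0), (3.0, 0.0, 2.0)]
deviation = rmsd(predicted_loop, holo_loop)
```

In practice the value depends on which atoms are compared (for example backbone only) and on how the rest of the protein is aligned first, choices the abstract does not specify.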

14.
Harris R, Olson AJ, Goodsell DS. Proteins 2008, 70(4): 1506-1517
We present a method, termed AutoLigand, for the prediction of ligand-binding sites in proteins of known structure. The method searches the space surrounding the protein and finds the contiguous envelope with the specified volume of atoms, which has the largest possible interaction energy with the protein. It uses a full atomic representation, with atom types for carbon, hydrogen, oxygen, nitrogen and sulfur (and others, if desired), and is designed to minimize the need for artificial geometry. Testing on a set of 187 diverse protein-ligand complexes has shown that the method is successful in predicting the location and approximate volume of the binding site in 73% of cases. Additional testing was performed on a set of 96 protein-ligand complexes with crystallographic structures of apo and holo forms, and AutoLigand was able to predict the binding site in 80% of the apo structures.

15.
New directions in computational methods for the prediction of protein function are discussed. THEMATICS, a method for the location and characterization of the active sites of enzymes, is featured. THEMATICS, for Theoretical Microscopic Titration Curves, is based on well-established finite-difference Poisson-Boltzmann methods for computing the electric field function of a protein. THEMATICS requires only the structure of the subject protein and thus may be applied to proteins that bear no similarity in structure or sequence to any previously characterized protein. The unique features of catalytic sites in proteins are discussed. Discussion of the chemical basis for the predictive powers of THEMATICS is featured in this paper. Some results are given for three illustrative examples: HIV-1 protease, human apurinic/apyrimidinic endonuclease, and human adenosine kinase.

16.
Location of functional binding pockets of bioactive ligands on protein molecules is essential in structural genomics and drug design projects. If the experimental determination of ligand-protein complex structures is complicated, blind docking (BD) and pocket search (PS) calculations can help in the prediction of atomic resolution binding mode and the location of the pocket of a ligand on the entire protein surface. Whereas the number of successful predictions by these methods is increasing even for the complicated cases of exosites or allosteric binding sites, their reliability has not been fully established. For a critical assessment of reliability, we use a set of ligand-protein complexes, which were found to be problematic in previous studies. The robustness of BD and PS methods is addressed in terms of success of the selection of truly functional pockets from among the many putative ones identified on the surfaces of ligand-bound and ligand-free (holo and apo) protein forms. Issues related to BD such as effect of hydration, existence of multiple pockets, and competition of subsidiary ligands are considered. Practical cases of PS are discussed and categorized, and strategies are recommended for handling the different situations. PS can be used in conjunction with BD, as we find that a consensus approach combining the techniques improves predictive power.

17.

Background: Methods are now available for the prediction of interaction sites in protein 3D structures. While many of these methods report high success rates for site prediction, often these predictions are not very selective and have low precision. Precision in site prediction is addressed using Theoretical Microscopic Titration Curves (THEMATICS), a simple computational method for the identification of active sites in enzymes. Recall and precision are measured and compared with other methods for the prediction of catalytic sites.

18.
Cu, Zn superoxide dismutase (SOD1) forms a crucial component of the cellular defence against oxidative stress. Zn-deficient wild-type and mutant human SOD1 have been implicated in the disease familial amyotrophic lateral sclerosis (FALS). We present here the crystal structures of holo and metal-deficient (apo) wild-type protein at 1.8 Å resolution. The P2₁ wild-type holo enzyme structure has nine independently refined dimers and these combine to form a "trimer of dimers" packing motif in each asymmetric unit. There is no significant asymmetry between the monomers in these dimers, in contrast to the subunit structures of the FALS G37R mutant of human SOD1 and in bovine Cu,Zn SOD. Metal-deficient apo SOD1 crystallizes with two dimers in the asymmetric unit and shows changes in the metal-binding sites and disorder in the Zn binding and electrostatic loops of one dimer, which is devoid of metals. The second dimer lacks Cu but has approximately 20% occupancy of the Zn site and remains structurally similar to wild-type SOD1. The apo protein forms a continuous, extended arrangement of beta-barrels stacked up along the short crystallographic b-axis, while perpendicular to this axis, the constituent beta-strands form a zig-zag array of filaments, the overall arrangement of which has a similarity to the common structure associated with amyloid-like fibrils.

19.
The crystal structures of corepressor-bound and free Escherichia coli purine repressor (PurR) have delineated the roles of several residues in corepressor binding and specificity and the intramolecular signal transduction (allosterism) of this LacI/GalR family member. From these structures, residue W147 was implicated as a key component of the allosteric response, but in many members of the LacI/GalR family, position 147 is occupied by an arginine. To understand the role of this tryptophan at position 147, three proteins, substituted by phenylalanine (W147F), alanine (W147A), or arginine (W147R), were constructed and characterized in vivo and in vitro, and their structures were determined. W147F displays a decreased affinity for corepressor and is a poor repressor in vivo. W147A and W147R, on the other hand, are super repressors and bind corepressor 13.6 and 7.9 times more tightly, respectively, than wild-type. Each mutant PurR-hypoxanthine-purF operator holo complex crystallizes isomorphously to wild-type. Whereas the apo corepressor binding domain (CBD) of W147F crystallizes under those conditions used for the wild-type protein, neither the apo CBD of W147R nor W147A crystallizes, although screened extensively for new crystal forms. Structures of the holo repressor mutants have been solved to resolutions between 2.5 and 2.9 Å, and the structure of the apo CBD of W147F has been solved to 2.4 Å resolution. These structures provide insight into the altered biochemical properties and physiological functions of these mutants, which appear to depend on the sometimes subtle preference for one conformation (apo vs holo) over the other.

20.
X-ray diffraction studies show that the diferric (holo) forms of human serum transferrin and lactoferrin have almost the same conformation in the crystal. In solution, however, the two proteins exhibit different characteristics. The differences are even more pronounced in the apo forms. Small-angle X-ray and neutron scattering data show that lactoferrin is less compact, in apo and holo forms, than the corresponding forms of transferrin in solution. The comparison of primary structures of the two proteins suggests that one of the interdomain hinge regions is significantly longer in lactoferrin than its counterpart in transferrin. The difference in flexibility due to the long hinge region in lactoferrin may be responsible for many of the differences in the physicochemical characteristics of the two proteins.
