Similar articles
20 similar articles found (search time: 31 ms)
1.
2.
The function of a protein is often fulfilled via molecular interactions on its surfaces, so identifying the functional surface(s) of a protein is helpful for understanding its function. Here, we introduce the concept of a split pocket, which is a pocket that is split by a cognate ligand. We use a geometric approach that is site‐specific. Specifically, we first compute a set of all pockets in the protein with its ligand(s) and a set of all pockets with the ligand(s) removed and then compare the two sets of pockets to identify the split pocket(s) of the protein. To reduce the search space and expedite the process of surface partitioning, we design probe radii according to the physicochemical textures of molecules. Our method achieves a success rate of 96% on a benchmark test set. We conduct a large‐scale computation to identify ~19,000 split pockets from 11,328 structures (1.16 million potential pockets); for each pocket, we obtain residue composition, solvent‐accessible area, and molecular volume. With this database of split pockets, our method can be used to predict the functional surfaces of unbound structures. Indeed, the functional surface of an unbound protein may often be found from its similarity to remotely related bound forms that belong to distinct folds. Finally, we apply our method to identify glucose‐binding proteins, including unbound structures. Our study demonstrates the power of geometric and evolutionary matching for studying protein functional evolution and provides a framework for classifying protein functions by local spatial patterns of functional surfaces. Proteins 2009. © 2009 Wiley‐Liss, Inc.
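The comparison of the two pocket sets described in this abstract can be illustrated with a small sketch. It is not the authors' implementation: it assumes each pocket is already available as a set of protein atom identifiers (ligand atoms excluded), and the function name `find_split_pockets` and the overlap criterion are hypothetical choices made for illustration only.

```python
from typing import Iterable, List, Set, Tuple


def find_split_pockets(
    pockets_with_ligand: Iterable[Set[int]],
    pockets_without_ligand: Iterable[Set[int]],
) -> List[Tuple[List[int], List[List[int]]]]:
    """Report pockets of the ligand-free structure that the ligand alters.

    Assumption (not from the abstract): a pocket is a set of protein atom
    IDs, and a ligand-free pocket counts as altered when no pocket with an
    identical atom set exists once the ligand is present.
    """
    bound = {frozenset(p) for p in pockets_with_ligand}
    results = []
    for pocket in pockets_without_ligand:
        atoms = frozenset(pocket)
        if atoms in bound:
            continue  # same pocket in both sets: the ligand does not touch it
        # Pockets of the ligand-bound structure that share atoms with this one
        fragments = sorted(sorted(b) for b in bound if b & atoms)
        results.append((sorted(atoms), fragments))
    return results
```

For example, with ligand-bound pockets `{1, 2, 3}`, `{4, 5, 6}`, `{10, 11}` and ligand-free pockets `{1, 2, 3, 4, 5, 6}`, `{10, 11}`, only the six-atom pocket is reported, together with the two fragments it is divided into. A pocket that the ligand fills completely would be reported with an empty fragment list under this simplified criterion.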

3.
We describe a novel approach for inferring functional relationships between proteins by detecting sequence and spatial patterns of protein surfaces. Well-formed concave surface regions in the form of pockets and voids are examined to identify similarity relationships that might be directly related to protein function. We first exhaustively identify and measure analytically all 910,379 surface pockets and interior voids on 12,177 protein structures from the Protein Data Bank. The similarity of patterns of residues forming pockets and voids is then assessed in sequence, in spatial arrangement, and in orientational arrangement. Statistical significance in the form of E- and p-values is then estimated for each of the three types of similarity measurements. Our method is fully automated without human intervention and can be used without input of query patterns. It does not assume any prior knowledge of functional residues of a protein, and can detect similarity based on surface patterns small and large. It also tolerates, to some extent, conformational flexibility of functional sites. We show with examples that this method can detect functional relationships with specificity for members of the same protein family and superfamily, as well as remotely related functional surfaces from proteins of different fold structures. We envision that this method can be used for discovering novel functional relationships between protein surfaces, for functional annotation of protein structures with unknown biological roles, and for further inquiries on evolutionary origins of structural elements important for protein function.

4.
One difficult aspect of the protein‐folding problem is characterizing the nonspecific interactions that define packing in protein tertiary structure. To better understand tertiary structure, this work extends the knob‐socket model by classifying the interactions of a single knob residue packed into a set of contiguous sockets, or a pocket made up of 4 or more residues. The knob‐socket construct allows for a symbolic two‐dimensional mapping of pockets. The two‐dimensional mapping of pockets provides a simple method to investigate the variety of pocket shapes to understand the geometry of protein tertiary surfaces. The diversity of pocket geometries can be organized into groups of pockets that share a common core, which suggests that some interactions in pockets are ancillary to packing. Further analysis of pocket geometries displays a preferred configuration that is right‐handed in α‐helices and left‐handed in β‐sheets. The amino acid composition of pockets illustrates the importance of nonpolar amino acids in packing as well as position specificity. As expected, all pocket shapes prefer to pack with hydrophobic knobs; however, knobs are not selective for the pockets they pack. Investigating side‐chain rotamer preferences for certain pocket shapes uncovers no strong correlations. These findings allow a simple vocabulary based on knobs and sockets to describe protein tertiary packing that supports improved analysis, design, and prediction of protein structure. Proteins 2016; 84:201–216. © 2015 Wiley Periodicals, Inc.

5.
Structural genomics (SG) initiatives are expanding the universe of protein fold space by rapidly determining structures of proteins that were intentionally selected on the basis of low sequence similarity to proteins of known structure. Often these proteins have no associated biochemical or cellular functions. The SG success has resulted in an accelerated deposition of novel structures. In some cases the structural bioinformatics analysis applied to these novel structures has provided specific functional assignment. However, this approach has also uncovered limitations in the functional analysis of uncharacterized proteins using traditional sequence and backbone structure methodologies. A novel method, named pvSOAR (pocket and void Surface of Amino Acid Residues), of comparing the protein surfaces of geometrically defined pockets and voids was developed. pvSOAR was able to detect previously unrecognized and novel functional relationships between surface features of proteins. In this study, pvSOAR is applied to several structural genomics proteins. We examined the surfaces of YecM, BioH, and RpiB from Escherichia coli as well as the CBS domains from inosine-5'-monophosphate dehydrogenase from Streptococcus pyogenes, conserved hypothetical protein Ta549 from Thermoplasma acidophilum, and CBS domain protein mt1622 from Methanobacterium thermoautotrophicum with the goal of inferring information about their biochemical function.

6.
Patterns of receptor-ligand interaction can be conserved in functionally equivalent proteins even in the absence of sequence homology. Therefore, structural comparison of ligand-binding pockets and their pharmacophoric features allows for the characterization of so-called "orphan" proteins with known three-dimensional structure but unknown function, and for the prediction of ligand promiscuity of binding pockets. We present an algorithm for rapid pocket comparison (PoLiMorph), in which protein pockets are represented by self-organizing graphs that fill the volume of the cavity. Vertices in these three-dimensional frameworks contain information about the local ligand-receptor interaction potential coded by fuzzy property labels. For framework matching, we developed a fast heuristic based on the maximum dispersion problem, as an alternative to techniques utilizing clique detection or geometric hashing algorithms. A sophisticated scoring function was applied that incorporates knowledge about property distributions and ligand-receptor interaction patterns. In an all-against-all virtual screening experiment with 207 pocket frameworks extracted from a subset of PDBbind, PoLiMorph correctly assigned 81% of 69 distinct structural classes and demonstrated sustained ability to group pockets accommodating the same ligand chemotype. We determined a score threshold that indicates "true" pocket similarity with high reliability, which not only supports structure-based drug design but also allows for sequence-independent studies of the proteome.

7.
Identifying conserved pockets on the surfaces of a family of proteins can provide insight into conserved geometric features and sites of protein–protein interaction. Here we describe mapping and comparison of the surfaces of aligned crystallographic structures, using the protein kinase family as a model. Pockets are rapidly computed using two computer programs, FADE and Crevasse. FADE uses gradients of atomic density to locate grooves and pockets on the molecular surface. Crevasse, a new piece of software, splits the FADE output into distinct pockets. The computation was run on 10 kinase catalytic cores aligned on the αF‐helix, and the resulting pockets were spatially clustered. The active site cleft appears as a large, contiguous site that can be subdivided into nucleotide and substrate docking sites. Substrate specificity determinants in the active site cleft between serine/threonine and tyrosine kinases are visible and distinct. The active site clefts cluster tightly, showing a conserved spatial relationship between the active site and αF‐helix in the C‐lobe. When the αC‐helix is examined, there are multiple mechanisms for anchoring the helix using spatially conserved docking sites. A novel site at the top of the N‐lobe is present in all the kinases, and there is a large conserved pocket over the hinge and the αC‐β4 loop. Other pockets on the kinase core are strongly conserved but have not yet been mapped to a protein–protein interaction. Sites identified by this algorithm have revealed structural and spatially conserved features of the kinase family and potential conserved intermolecular and intramolecular binding sites.

8.
9.
Kawabata T, Go N. Proteins 2007, 68(2):516-529
One of the simplest ways to predict ligand binding sites is to identify pocket-shaped regions on the protein surface. Many programs have already been proposed to identify these pocket regions. Examination of their algorithms revealed that a pocket intrinsically has two arbitrary properties, "size" and "depth". We proposed a new definition for pockets using two explicit adjustable parameters that correspond to these two arbitrary properties. A pocket region is defined as a space into which a small probe can enter, but a large probe cannot. The radii of small and large probe spheres are the two parameters that correspond to the "size" and "depth" of the pockets, respectively. These values can be adjusted for each individual putative ligand molecule. To determine the optimal value of the large probe sphere radius, we generated pockets for thousands of protein structures in the database, using several sizes of large probe spheres, and examined the correspondence of these pockets with known binding site positions. A new measure of shallowness, a minimum inaccessible radius, R(inaccess), indicated that binding sites of coenzymes are very deep, while those for adenine/guanine mononucleotides have only medium shallowness and those for short peptides and oligosaccharides are shallow. The optimal radius of large probe spheres was 3-4 Å for the coenzymes, 4 Å for adenine/guanine mononucleotides, and 5 Å or more for peptides/oligosaccharides. Comparison of our program with two other popular pocket-finding programs showed that our program had a higher performance in detecting binding pockets, although it required more computational time.
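The two-probe definition in this abstract can be made concrete with a toy sketch. It is a simplified reading, not the authors' program: atoms are hard spheres given as `(x, y, z, radius)`, a point counts as pocket space when a small probe centred on it fits and no clash-free large probe covers it, and large-probe centres are only sampled on a user-supplied grid. All function names here are hypothetical.

```python
import math
from itertools import product


def probe_fits(center, atoms, probe_radius):
    """True if a probe sphere at `center` overlaps no atom (x, y, z, radius)."""
    return all(
        math.dist(center, (x, y, z)) >= r + probe_radius for x, y, z, r in atoms
    )


def is_pocket_point(point, atoms, small_radius, large_radius, candidate_centers):
    """Small probe centred on `point` fits, and no clash-free large probe
    placed at one of `candidate_centers` reaches `point`."""
    if not probe_fits(point, atoms, small_radius):
        return False
    for c in candidate_centers:
        if math.dist(c, point) <= large_radius and probe_fits(c, atoms, large_radius):
            return False  # a large probe can cover this point: too exposed
    return True


def cubic_grid(lo, hi, step):
    """Sampled large-probe centres on a regular cubic grid (an assumption)."""
    n = int(round((hi - lo) / step)) + 1
    axis = [lo + i * step for i in range(n)]
    return list(product(axis, repeat=3))
```

With two atoms of radius 1.5 at (-3, 0, 0) and (3, 0, 0), a small probe of radius 1.0 and a large probe of radius 4.0, the midpoint between the atoms qualifies as pocket space, while a point 10 units away does not. Raising `large_radius` admits shallower regions, which mirrors the "depth" parameter described in the abstract.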

10.
Identification and size characterization of surface pockets and occluded cavities are initial steps in protein structure-based ligand design. A new program, CAST, for automatically locating and measuring protein pockets and cavities, is based on precise computational geometry methods, including alpha shape and discrete flow theory. CAST identifies and measures pockets and pocket mouth openings, as well as cavities. The program specifies the atoms lining pockets, pocket openings, and buried cavities; the volume and area of pockets and cavities; and the area and circumference of mouth openings. CAST analysis of over 100 proteins has been carried out; proteins examined include a set of 51 monomeric enzyme-ligand structures, several elastase-inhibitor complexes, the FK506 binding protein, 30 HIV-1 protease-inhibitor complexes, and a number of small and large protein inhibitors. Medium-sized globular proteins typically have 10-20 pockets/cavities. Most often, binding sites are pockets with 1-2 mouth openings; much less frequently they are cavities. Ligand binding pockets vary widely in size, most within the range 10²-10³ Å³. Statistical analysis reveals that the number of pockets and cavities is correlated with protein size, but there is no correlation between the size of the protein and the size of binding sites. Most frequently, the largest pocket/cavity is the active site, but there are a number of instructive exceptions. Ligand volume and binding site volume are somewhat correlated when binding site volume is ≤ 700 Å³, but the ligand seldom occupies the entire site. Auxiliary pockets near the active site have been suggested as additional binding surface for designed ligands (Mattos C et al., 1994, Nat Struct Biol 1:55-58). Analysis of elastase-inhibitor complexes suggests that CAST can identify ancillary pockets suitable for recruitment in ligand design strategies. Analysis of the FK506 binding protein, and of compounds developed in SAR by NMR (Shuker SB et al., 1996, Science 274:1531-1534), indicates that CAST pocket computation may provide a priori identification of target proteins for linked-fragment design. CAST analysis of 30 HIV-1 protease-inhibitor complexes shows that the flexible active site pocket can vary over a range of 853-1,566 Å³, and that there are two pockets near or adjoining the active site that may be recruited for ligand design.

11.
Protein similarity comparisons may be made on a local or global basis and may consider sequence information or differing levels of structural information. We present a local three‐dimensional method that compares protein binding site surfaces in full atomic detail. The approach is based on the morphological similarity method which has been widely applied for global comparison of small molecules. We apply the method to all‐by‐all comparisons of two sets of human protein kinases, a very diverse set of ATP‐bound proteins from multiple species, and three heterogeneous benchmark protein binding site data sets. Cases of disagreement between sequence‐based similarity and binding site similarity yield informative examples. Where sequence similarity is very low, high pocket similarity can reliably identify important binding motifs. Where sequence similarity is very high, significant differences in pocket similarity are related to ligand binding specificity and similarity. Local protein binding pocket similarity provides qualitatively complementary information to other approaches, and it can yield quantitative information in support of functional annotation. Proteins 2011; © 2011 Wiley‐Liss, Inc.

12.
13.
The function of DNA‐ and RNA‐binding proteins can be inferred from the characterization and accurate prediction of their binding interfaces. However, the main pitfall of various structure‐based methods for predicting nucleic acid binding function is that they are all limited to a relatively small number of proteins for which high‐resolution three‐dimensional structures are available. In this study, we developed a pipeline for extracting functional electrostatic patches from surfaces of protein structural models, obtained using the I‐TASSER protein structure predictor. The largest positive patches are extracted from the protein surface using the patchfinder algorithm. We show that functional electrostatic patches extracted from an ensemble of structural models highly overlap the patches extracted from high‐resolution structures. Furthermore, by testing our pipeline on a set of 55 known nucleic acid binding proteins for which I‐TASSER produces high‐quality models, we show that the method accurately identifies the nucleic acid binding interface on structural models of proteins. Employing a combined patch approach we show that patches extracted from an ensemble of models better predict the real nucleic acid binding interfaces compared with patches extracted from independent models. Overall, these results suggest that combining information from a collection of low‐resolution structural models could be a valuable approach for functional annotation. We suggest that our method will be further applicable for predicting other functional surfaces of proteins with unknown structure. Proteins 2012. © 2011 Wiley Periodicals, Inc.

14.
15.

Background  

Identifying pockets on protein surfaces is of great importance for many structure-based drug design applications and protein-ligand docking algorithms. Over the last ten years, many geometric methods for the prediction of ligand-binding sites have been developed.

16.
The structural genomics projects have been accumulating an increasing number of protein structures, many of which remain functionally unknown. In parallel with experimental methods, computational methods are expected to make a significant contribution to the functional elucidation of such proteins. However, conventional computational methods that transfer functions from homologous proteins do not help much for these uncharacterized protein structures because they do not have apparent structural or sequence similarity with the known proteins. Here, we briefly review two avenues of computational function prediction methods, i.e. structure-based methods and sequence-based methods. The focus is on our recent developments of local structure-based and sequence-based methods, which can effectively extract function information from distantly related proteins. Two structure-based methods, Pocket-Surfer and Patch-Surfer, identify similar known ligand binding sites for pocket regions in a query protein without using global protein fold similarity information. Two sequence-based methods, protein function prediction and extended similarity group, make use of weakly similar sequences that are conventionally discarded in homology-based function annotation. Combined with experimental methods, we hope that computational methods will make a leading contribution to the functional elucidation of protein structures.

17.
Systematic investigation of a protein and its binding site characteristics is crucial for designing small molecules that modulate protein functions. However, fundamental uncertainties in binding site interactions and insufficient knowledge of the properties of even well‐defined binding pockets can make it difficult to design optimal drugs. Herein, we report the development and implementation of a cavity detection algorithm built with HINT toolkit functions that we are naming Vectorial Identification of Cavity Extents (VICE). This very efficient algorithm is based on geometric criteria applied to simple integer grid maps. In testing, we carried out a systematic investigation on a very diverse data set of proteins and protein–protein/protein–polynucleotide complexes for locating and characterizing the indentations, cavities, pockets, grooves, channels, and surface regions. Additionally, we evaluated a curated data set of unbound proteins for which ligand‐bound protein structures are also known; here the VICE algorithm located the actual ligand in the largest cavity in 83% of the cases and in one of the three largest in 90% of the cases. An interactive front‐end provides a quick and simple procedure for locating, displaying and manipulating cavities in these structures. Information describing the cavity, including its volume and surface area metrics, and lists of atoms, residues, and/or chains lining the binding pocket, can be easily obtained and analyzed. For example, the relative cross‐sectional surface area (to total surface area) of cavity openings in well‐enclosed cavities is 0.06 ± 0.04 and in surface clefts or crevices is 0.25 ± 0.09. Proteins 2010. © 2009 Wiley‐Liss, Inc.

18.
Structural location of disease-associated single-nucleotide polymorphisms (cited 7 times in total: 0 self-citations, 7 citations by others)
Non-synonymous single-nucleotide polymorphism (nsSNP) of genes introduces amino acid changes to proteins, and plays an important role in providing genetic functional diversity. To understand the structural characteristics of disease-associated SNPs, we have mapped a set of nsSNPs derived from the Online Mendelian Inheritance in Man (OMIM) database to the structural surfaces of encoded proteins. These nsSNPs are disease-associated or have distinctive phenotypes. As a control dataset, we mapped a set of nsSNPs derived from the SNP database dbSNP to the structural surfaces of those encoded proteins. Using the alpha shape method from computational geometry, we examine the geometric locations of the structural sites of these nsSNPs. We classify each nsSNP site into one of three categories of geometric locations: those in a pocket or a void (type P); those on a convex region or a shallow depressed region (type S); and those that are buried completely in the interior (type I). We find that the majority (88%) of disease-associated nsSNPs are located in voids or pockets, and they are infrequently observed in the interior of proteins (3.2% in the data set). We find that nsSNPs mapped from dbSNP are less likely to be located in pockets or voids (68%). We further introduce a novel application of hidden Markov models (HMM) for analyzing sequence homology of SNPs on various geometric sites. For SNPs on surface pockets or voids, we find that there is no strong tendency for them to occur on conserved residues. For SNPs buried in the interior, we find that disease-associated mutations are more likely to be conserved. The approach of classifying nsSNPs with alpha shape and HMM developed in this study can be integrated with additional methods to improve the accuracy of predictions of whether a given nsSNP is likely to be disease-associated.

19.
Identification and characterization of protein functional surfaces are important for predicting protein function, understanding enzyme mechanism, and docking small compounds to proteins. As the rapid speed of accumulation of protein sequence information far exceeds that of structures, constructing accurate models of protein functional surfaces and identifying their key elements becomes increasingly important. A promising approach is to build comparative models from sequences using known structural templates such as those obtained from structural genome projects. Here we assess how well this approach works in modeling binding surfaces. By systematically building three-dimensional comparative models of proteins using Modeller, we determine how well functional surfaces can be accurately reproduced. We use an alpha shape based pocket algorithm to compute all pockets on the modeled structures, and conduct a large-scale computation of similarity measurements (pocket RMSD and fraction of functional atoms captured) for 26,590 modeled enzyme protein structures. Overall, we find that when the sequence fragment of the binding surfaces has more than 45% identity to that of the template protein, the modeled surfaces have on average an RMSD of 0.5 Å, and contain 48% or more of the binding surface atoms, with nearly all of the important atoms in the signatures of binding pockets captured.
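The two similarity measurements named in this abstract, pocket RMSD and the fraction of functional atoms captured, can be sketched as follows. This is a minimal illustration, not the study's pipeline: it assumes pocket atoms are already paired one-to-one and superimposed, and the function names and the use of plain atom labels are hypothetical.

```python
import math


def pocket_rmsd(model_coords, reference_coords):
    """Root-mean-square deviation between paired 3D atom positions.

    Assumes both lists are in the same atom order and already superimposed;
    no structural alignment is performed here.
    """
    if not model_coords or len(model_coords) != len(reference_coords):
        raise ValueError("coordinate lists must be non-empty and of equal length")
    squared = sum(
        math.dist(a, b) ** 2 for a, b in zip(model_coords, reference_coords)
    )
    return math.sqrt(squared / len(model_coords))


def fraction_captured(model_atoms, reference_atoms):
    """Share of reference binding-surface atoms also found in the model pocket."""
    reference = set(reference_atoms)
    if not reference:
        raise ValueError("reference atom set must not be empty")
    return len(reference & set(model_atoms)) / len(reference)
```

For two atoms each displaced by 1 Å along z, `pocket_rmsd` gives 1.0; a model pocket containing two of four reference atoms gives a captured fraction of 0.5.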

20.
Location of functional binding pockets of bioactive ligands on protein molecules is essential in structural genomics and drug design projects. If the experimental determination of ligand-protein complex structures is complicated, blind docking (BD) and pocket search (PS) calculations can help in the prediction of atomic resolution binding mode and the location of the pocket of a ligand on the entire protein surface. Whereas the number of successful predictions by these methods is increasing even for the complicated cases of exosites or allosteric binding sites, their reliability has not been fully established. For a critical assessment of reliability, we use a set of ligand-protein complexes, which were found to be problematic in previous studies. The robustness of BD and PS methods is addressed in terms of success of the selection of truly functional pockets from among the many putative ones identified on the surfaces of ligand-bound and ligand-free (holo and apo) protein forms. Issues related to BD such as effect of hydration, existence of multiple pockets, and competition of subsidiary ligands are considered. Practical cases of PS are discussed and categorized, and strategies are recommended for handling the different situations. PS can be used in conjunction with BD, as we find that a consensus approach combining the techniques improves predictive power.
