Similar articles
 20 similar articles found (search time: 78 ms)
1.
Most protein chains interact with only one ligand, but a small number of protein chains can bind several ligands, and many examples are available in the protein-ligand complex database of the PDB. Among these proteins, some show preferences for the ligands or types of ligands they bind; however, so far we have only a poor understanding of what determines protein-ligand binding and its specificity. Here we investigate the structural and functional properties of proteins in protein-ligand complexes. Analysis of the protein-ligand complex dataset from the PDB structure database reveals that proteins with more interactions have more disordered contact residues. Those proteins containing few disordered contact residues that bind multiple ligands have a tendency to consist of several domains. Analysis of physicochemical properties of hub contact residues binding multiple ligands indicates that they are enriched for hydrophilic, charged, polar and His-Asp catalytic triad residues. Finally, in order to differentiate proteins binding different classes of ligands, we mapped the three most prominent classes of ligands onto different superfamily domains. Our results demonstrate that contact residue disorder and ordered multiple domains are complementary factors that play a crucial role in determining ligand binding specificity and promiscuity.

2.
Discovering amino acid (AA) patterns on protein binding sites has recently become popular. We propose a method to discover the association relationship among AAs on binding sites. Such knowledge of binding sites is very helpful in predicting protein-protein interactions. In this paper, we focus on protein complexes which have protein-protein recognition. The association rule mining technique is used to discover geographically adjacent amino acids on a binding site of a protein complex. When mining, instead of treating all AAs of binding sites as a transaction, we geographically partition AAs of binding sites in a protein complex. AAs in a partition are treated as a transaction. For the partition process, AAs on a binding site are projected from three-dimensional to two-dimensional. And then, assisted with a circular grid, AAs on the binding site are placed into grid cells. A circular grid has ten rings: a central ring, the second ring with 6 sectors, the third ring with 12 sectors, and later rings are added to four sectors in order. As for the radius of each ring, we examined the complexes and found that 10Å is a suitable range, which can be set by the user. After placing these recognition complexes on the circular grid, we obtain mining records (i.e. transactions) from each sector. A sector is regarded as a record. Finally, we use the association rule to mine these records for frequent AA patterns. If the support of an AA pattern is larger than the predetermined minimum support (i.e. threshold), it is called a frequent pattern. With these discovered patterns, we offer the biologists a novel point of view, which will improve the prediction accuracy of protein-protein recognition. In our experiments, we produced the AA patterns by data mining. As a result, we found that arginine (arg) most frequently appears on the binding sites of two proteins in the recognition protein complexes, while cysteine (cys) appears the fewest. 
In addition, if we further distinguish between concave and convex binding-site shapes, we find that the patterns {arg, glu, asp} and {arg, ser, asp} on the concave binding sites of one protein more frequently (i.e. with higher probability) make contact with {lys} or {arg} on the convex binding sites of another protein, with a confidence of at least 78%. On the other hand, {val, gly, lys} on the convex surface of binding sites is more frequently in contact with {asp} on the concave site of another protein, and the confidence achieved is over 81%. Applying data mining in biology can reveal more facts that may otherwise be ignored or not easily discovered by the naked eye. Furthermore, we can discover more relationships among AAs on binding sites by appropriately rotating these residues from a three-dimensional to a two-dimensional perspective. We designed a circular grid to hold the data, which total 463 records consisting of AAs. Then we used association rules to mine these records for relationships. The proposed method provides insight into the characteristics of binding sites for recognition complexes.
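
The support and confidence measures mentioned in this abstract can be illustrated with a minimal Python sketch. It assumes each grid sector has already been reduced to a set of residue names; the toy records, the `min_support` value, and the function names are hypothetical and are not taken from the paper. It also simplifies the setting to rules within one set of records, whereas the abstract describes rules between concave and convex sites on two different proteins.

```python
def support(pattern, transactions):
    """Fraction of transactions that contain every residue in `pattern`."""
    pattern = frozenset(pattern)
    hits = sum(1 for t in transactions if pattern <= t)
    return hits / len(transactions)


def confidence(antecedent, consequent, transactions):
    """Estimate P(consequent | antecedent) as support(A and C) / support(A)."""
    a = frozenset(antecedent)
    both = a | frozenset(consequent)
    sup_a = support(a, transactions)
    if sup_a == 0:
        return 0.0  # the antecedent never occurs, so the rule has no evidence
    return support(both, transactions) / sup_a


# Hypothetical toy records: each set stands for the residues found in one grid sector.
sectors = [
    frozenset({"arg", "glu", "asp"}),
    frozenset({"arg", "ser", "asp"}),
    frozenset({"arg", "asp", "lys"}),
    frozenset({"val", "gly", "lys"}),
]

min_support = 0.5  # illustrative threshold, not the value used in the paper
is_frequent = support({"arg", "asp"}, sectors) >= min_support
```

With these toy records, {arg, asp} appears in three of the four sectors, so it counts as frequent under the illustrative threshold.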

3.
Ruvinsky AM  Kozintsev AV 《Proteins》2006,62(1):202-208
We present two novel methods to predict native protein-ligand binding positions. Both methods identify the native binding position as the most probable position corresponding to a maximum of a probability distribution function (PDF) of possible binding positions in a protein active site. Possible binding positions are the origins of clusters composed, on the basis of root-mean-square deviations (RMSD), from the multiple ligand positions determined by a docking algorithm. The difference between the methods lies in the ways the PDF is derived. To validate the suggested methods, we compare the averaged RMSD of the predicted ligand docked positions relative to the experimentally determined positions for a set of 135 PDB protein-ligand complexes. We demonstrate that the suggested methods improve docking accuracy by as much as 21-24% in comparison with a method that simply identifies the binding position as the energy top-scored ligand position.

4.
Identification of protein biochemical functions based on their three-dimensional structures is now required in the post-genome-sequencing era. Ligand binding is one of the major biochemical functions of proteins, and thus the identification of ligands and their binding sites is the starting point for the function identification. Previously we reported our first trial on structure-based function prediction, based on the similarity searches of molecular surfaces against the functional site database. Here we describe the extension of our first trial by expanding the search database to whole heteroatom binding sites appearing within the Protein Data Bank (PDB) with the new analysis protocol. In addition, we have determined the similarity threshold line, by using 10 structure pairs with solved free and complex structures. Finally, we extensively applied our method to newly determined hypothetical proteins, including some without annotations, and evaluated the performance of our methods.

5.
A database comprising all ligand-binding sites of known structure aligned with all related protein sequences and structures is described. Currently, the database contains approximately 50000 ligand-binding sites for small molecules found in the Protein Data Bank (PDB). The structure-structure alignments are obtained by the Combinatorial Extension (CE) program (Shindyalov and Bourne, Protein Eng., 11, 739-747, 1998) and sequence-structure alignments are extracted from the ModBase database of comparative protein structure models for all known protein sequences (Sanchez et al., Nucleic Acids Res., 28, 250-253, 2000). It is possible to search for binding sites in LigBase by a variety of criteria. LigBase reports summarize ligand data including relevant structural information from the PDB file, such as ligand type and size, and contain links to all related protein sequences in the TrEMBL database. Residues in the binding sites are graphically depicted for comparison with other structurally defined family members. LigBase provides a resource for the analysis of families of related binding sites.

6.
Water and ligand binding play critical roles in the structure and function of proteins, yet their binding sites and significance are difficult to predict a priori. Multiple solvent crystal structures (MSCS) is a method where several X-ray crystal structures are solved, each in a unique solvent environment, with organic molecules that serve as probes of the protein surface for sites evolved to bind ligands, while the first hydration shell is essentially maintained. When superimposed, these structures contain a vast amount of information regarding hot spots of protein-protein or protein-ligand interactions, as well as conserved water-binding sites retained with the change in solvent properties. Optimized mining of this information requires reliable structural data and a consistent, objective analysis tool. Detection of related solvent positions (DRoP) was developed to automatically organize and rank the water or small organic molecule binding sites within a given set of structures. It is a flexible tool that can also be used in conserved water analysis given multiple structures of any protein independent of the MSCS method. The DRoP output is an HTML format list of the solvent sites ordered by conservation rank in its population within the set of structures, along with renumbered and recolored PDB files for visualization and facile analysis. Here, we present a previously unpublished set of MSCS structures of bovine pancreatic ribonuclease A (RNase A) and use it together with published structures to illustrate the capabilities of DRoP.

7.
Knowledge of protein-ligand binding sites is very important for structure-based drug design. To get information on the binding site of a targeted protein with its ligand in a timely way, many scientists have turned to computational methods. Although several methods have been released in the past few years, their accuracy needs to be improved. In this study, based on the combination of incremental convex hull, traditional geometric algorithm, and solvent accessible surface of proteins, we developed a novel approach for predicting the protein-ligand binding sites. Using the PDBbind database as a benchmark dataset and comparing the new approach with existing methods such as POCKET, Q-SiteFinder, MOE-SiteFinder, and PASS, we found that the new method has the highest accuracy for the Top 2 and Top 3 predictions. Furthermore, our approach can not only successfully predict the protein-ligand binding sites but also provide more detailed information for the interactions between proteins and ligands. It is anticipated that the new method may become a useful tool for drug development, or at least play a complementary role to the other existing methods in this area.

8.
Ligand–protein interactions are essential for biological processes, and precise characterization of protein binding sites is crucial to understand protein functions. MED‐SuMo is a powerful technology to localize similar local regions on protein surfaces. Its heuristic is based on a 3D representation of macromolecules using specific surface chemical features associating chemical characteristics with geometrical properties. MED‐SMA is an automated and fast method to classify binding sites. It is based on MED‐SuMo technology, which builds a similarity graph, and it uses the Markov Clustering algorithm. Purine binding sites are well studied as drug targets. Here, purine binding sites of the Protein DataBank (PDB) are classified. Proteins potentially inhibited or activated through the same mechanism are gathered. Results are analyzed according to PROSITE annotations and to carefully refined functional annotations extracted from the PDB. As expected, binding sites associated with related mechanisms are gathered, for example, the Small GTPases. Nevertheless, protein kinases from different Kinome families are also found together, for example, Aurora‐A and CDK2 proteins, which are inhibited by the same drugs. Representative examples of different clusters are presented. The effectiveness of the MED‐SMA approach is demonstrated as it gathers binding sites of proteins with similar structure‐activity relationships. Moreover, an efficient new protocol associates structures lacking cocrystallized ligands with the purine clusters, enabling those structures to be associated with a specific binding mechanism. Applications of this classification by binding mode similarity include target‐based drug design and prediction of cross‐reactivity and therefore potential toxic side effects.

9.
Knowledge-based scoring function to predict protein-ligand interactions   (cited by: 5; self-citations: 0; citations by others: 5)
The development and validation of a new knowledge-based scoring function (DrugScore) to describe the binding geometry of ligands in proteins is presented. It discriminates efficiently between well-docked ligand binding modes (root-mean-square deviation <2.0 Å with respect to a crystallographically determined reference complex) and those largely deviating from the native structure, e.g. generated by computer docking programs. Structural information is extracted from crystallographically determined protein-ligand complexes using ReLiBase and converted into distance-dependent pair-preferences and solvent-accessible surface (SAS) dependent singlet preferences for protein and ligand atoms. Definition of an appropriate reference state and accounting for inaccuracies inherently present in experimental data is required to achieve good predictive power. The sum of the pair preferences and the singlet preferences is calculated based on the 3D structure of protein-ligand binding modes generated by docking tools. For two test sets of 91 and 68 protein-ligand complexes, taken from the Protein Data Bank (PDB), the calculated score recognizes poses generated by FlexX deviating <2 Å from the crystal structure on rank 1 in three quarters of all possible cases. Compared to FlexX, this is a substantial improvement. For ligand geometries generated by DOCK, DrugScore is superior to the "chemical scoring" implemented into this tool, while comparable results are obtained using the "energy scoring" in DOCK. None of the presently known scoring functions achieves comparable power to extract binding modes in agreement with experiment. It is fast to compute, implicitly accounts for solvation and entropy contributions, and correctly reproduces the geometry of directional interactions. Small deviations in the 3D structure are tolerated and, since only contacts to non-hydrogen atoms are regarded, it is independent of assumptions about protonation states.
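
The "well-docked" criterion in this abstract (RMSD below 2.0 Å relative to the crystal pose) can be sketched as follows. This is a minimal illustration, not DrugScore itself: it assumes both poses list the same heavy atoms in the same order and already share one coordinate frame, so no superposition is performed. The function names and toy coordinates are hypothetical.

```python
import math


def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two equal-length lists of (x, y, z)."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("both poses need the same non-zero number of atoms")
    squared = sum(
        (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
        for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b)
    )
    return math.sqrt(squared / len(coords_a))


def is_well_docked(pose, reference, cutoff=2.0):
    """True if the pose lies within `cutoff` angstroms RMSD of the reference."""
    return rmsd(pose, reference) < cutoff


# Hypothetical toy coordinates in angstroms.
reference = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
near_pose = [(0.0, 0.0, 1.0), (1.0, 0.0, 1.0)]  # shifted 1 angstrom along z
far_pose = [(0.0, 0.0, 3.0), (1.0, 0.0, 3.0)]   # shifted 3 angstroms along z
```

Here `near_pose` passes the 2.0 Å cutoff and `far_pose` does not. Real evaluations usually also have to handle symmetry-equivalent atoms, which this sketch ignores.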

10.
Here, a protein atom-ligand fragment interaction library is described. The library is based on experimentally solved structures of protein-ligand and protein-protein complexes deposited in the Protein Data Bank (PDB) and it is able to characterize binding sites given a ligand structure suitable for a protein. A set of 30 ligand fragment types were defined to include three or more atoms in order to unambiguously define a frame of reference for interactions of ligand atoms with their receptor proteins. Interactions between ligand fragments and 24 classes of protein target atoms plus a water oxygen atom were collected and segregated according to type. The spatial distributions of individual fragment-target atom pairs were visually inspected in order to obtain rough-grained constraints on the interaction volumes. Data fulfilling these constraints were given as input to an iterative expectation-maximization algorithm that produces as output maximum likelihood estimates of the parameters of the finite Gaussian mixture models. Concepts of statistical pattern recognition and the resulting mixture model densities are used (i) to predict the detailed interactions between Chlorella virus DNA ligase and the adenine ring of its ligand and (ii) to evaluate the "error" in prediction for both the training and validation sets of protein-ligand interactions found in the PDB. These analyses demonstrate that this approach can successfully narrow down the possibilities for both the interacting protein atom type and its location relative to a ligand fragment.

11.
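Studying the structure and affinity of protein-ligand interactions not only helps in understanding protein function but is also of great significance for drug discovery and for research into mechanisms of drug action. To date, through manual and semi-automated retrieval, a large amount of protein-ligand affinity information and biologically relevant ligand information has been obtained from the literature and the Protein Data Bank (PDB), and many databases of protein-ligand interaction information have been built. This review introduces three protein-ligand affinity databases and six databases of the biological relevance of ligands in protein crystal structures, and briefly describes their main applications, in the hope of helping to achieve efficient and accurate drug screening and design.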

12.
SuperStar is an empirical method for identifying interaction sites in proteins, based entirely on the experimental information about non-bonded interactions, present in the IsoStar database. The interaction information in IsoStar is contained in scatterplots, which show the distribution of a chosen probe around structure fragments. SuperStar breaks a template molecule (e.g. a protein binding site) into structural fragments which correspond to those in the scatterplots. The scatterplots are then superimposed on the corresponding parts of the template and converted into a composite propensity map. The original version of SuperStar was based entirely on scatterplots from the CSD. Here, scatterplots based on protein-ligand interactions are implemented in SuperStar, and validated on a test set of 122 X-ray structures of protein-ligand complexes. In this validation, propensity maps are compared with the experimentally observed positions of ligand atoms of comparable types. Although non-bonded interaction geometries in small molecule structures are similar to those found in protein-ligand complexes, their relative frequencies of occurrence are different. Polar interactions are more common in the first class of structures, while interactions between hydrophobic groups are more common in protein crystals. In general, PDB and CSD-based SuperStar maps appear equally successful in the prediction of protein-ligand interactions. PDB-based maps are more suitable to identify hydrophobic pockets, and inherently take into account the experimental uncertainties of protein atomic positions. If the protonation state of a histidine, aspartate or glutamate protein side-chain is known, specific CSD-based maps for that protonation state are preferred over PDB-based maps which represent an ensemble of protonation states.

13.
MOTIVATION: The large-scale comparison of protein-ligand binding sites is problematic, in that measures of structural similarity are difficult to quantify and are not easily understood in terms of statistical similarity that can ultimately be related to structure and function. We present a binding site matching score, the Poisson Index (PI), based upon a well-defined statistical model. PI requires only the number of matching atoms between two sites and the size of the two sites, the same information used by the Tanimoto Index (TI), a comparable and widely used measure for molecular similarity. We apply PI and TI to a previously automatically extracted set of binding sites to determine the robustness and usefulness of both scores. RESULTS: We found that PI outperforms TI; moreover, site similarity is poorly defined for TI at values around the 99.5% confidence level for which PI is well defined. A difference map at this confidence level shows that PI gives much more meaningful information than TI. We show individual examples where TI fails to distinguish either a false or a true site pairing, in contrast to PI, which performs much better. TI cannot handle large or small sites very well, or the comparison of large and small sites, in contrast to PI, which is shown to be much more robust. Despite the difficulty of determining a biological 'ground truth' for binding site similarity, we conclude that PI is a suitable measure of binding site similarity and could form the basis for a binding site classification scheme comparable to existing protein domain classification schema.
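
The Tanimoto Index described here uses only the number of matching atoms and the two site sizes, so it can be written down directly. The sketch below uses the standard Tanimoto definition (matches divided by the size of the union); the function and parameter names are hypothetical. The Poisson Index is not sketched because the abstract does not give its formula.

```python
def tanimoto_index(n_match, size_a, size_b):
    """Tanimoto similarity from a match count and two site sizes (in atoms)."""
    if n_match < 0 or n_match > min(size_a, size_b):
        raise ValueError("match count must lie between 0 and the smaller site size")
    union = size_a + size_b - n_match
    if union == 0:
        raise ValueError("at least one site must contain atoms")
    return n_match / union


# Two identical 10-atom sites.
same_size = tanimoto_index(10, 10, 10)       # 1.0

# A 10-atom site fully matched inside a 50-atom site.
small_in_large = tanimoto_index(10, 10, 50)  # 0.2
```

The second case gives one way to read the abstract's remark about comparing large and small sites: even a complete match of the small site scores low, because the size difference dominates the denominator.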

14.
Based on the experimental data and homologous sites in the Protein Data Bank (PDB), a model for metal binding sites in the D1/D2 heterodimer has been proposed. On searching for tetranuclear and binuclear Mn binding sites in the PDB, a suitable sequence homology in thermolysin and D1 could be observed. From the homology and site-directed mutagenesis data, a model for binuclear Mn-Ca or Mn-Mn has been built and it is extended to a tetranuclear Mn centre.

15.
Nayeem A  Krystek S  Stouch T 《Biopolymers》2003,70(2):201-211
Electronic polarizability, an important physical property of biomolecules, is currently ignored in most biomolecular calculations. Yet, it is widely believed that polarization could account for a substantial fraction of the total nonbonded energy of a system. This belief is supported by studies of small complexes in vacuum. This perception is driving the development of a new class of polarizable force fields for biomolecular calculations. However, the quantification of this term for protein-ligand complexes has never been attempted. Here we explore the polarizable nature of protein-ligand complexes in order to evaluate the importance of this effect. We introduce two indexes describing the polarizability of protein binding sites. These we apply to a large range of pharmaceutically relevant complexes. We offer a recommendation of particular complexes as test systems with which to determine the effects of polarizability and as test cases with which to test the new generation of force fields. Additionally, we provide a tabulation of the amino acid composition of these binding sites and show that composition can be specific for certain classes of proteins. We also show that the relative abundance of some amino acids is different in binding sites than elsewhere in a protein's structure.

16.
Ghersi D  Sanchez R 《Proteins》2009,74(2):417-424
The use of predicted binding sites (binding sites calculated from the protein structure alone) is evaluated here as a tool to focus the docking of small molecule ligands into protein structures, simulating cases where the real binding sites are unknown. The resulting approach consists of a few independent docking runs carried out on small boxes, centered on the predicted binding sites, as opposed to one larger blind docking run that covers the complete protein structure. The focused and blind approaches were compared using a set of 77 known protein-ligand complexes and 19 ligand-free structures. The focused approach is shown to: (1) identify the correct binding site more frequently than blind docking; (2) produce more accurate docking poses for the ligand; (3) require less computational time. Additionally, the results show that very few real binding sites are missed in spite of focusing on only three predicted binding sites per target protein. Overall the results indicate that, by improving the sampling in regions that are likely to correspond to binding sites, the focused docking approach increases accuracy and efficiency of protein ligand docking for those cases where the ligand-binding site is unknown. This is especially relevant in applications such as reverse virtual screening and structure-based functional annotation of proteins.

17.
Staphylococcus aureus expresses numerous virulence factors that aid in immune evasion. The four-domain staphylococcal immunoglobulin binding (Sbi) protein interacts with complement component 3 (C3) and its thioester domain (C3d)-containing fragments. Recent structural data suggested two possible modes of binding of Sbi domain IV (Sbi-IV) to C3d, but the physiological binding mode remains unclear. We used a computational approach to provide insight into the C3d-Sbi-IV interaction. Molecular dynamics (MD) simulations showed that the first binding mode (PDB code 2WY8) is more robust than the second (PDB code 2WY7), with more persistent polar and nonpolar interactions, as well as conserved interfacial solvent accessible surface area. Brownian dynamics and steered MD simulations revealed that the first binding mode has faster association kinetics and maintains more stable intermolecular interactions compared to the second binding mode. In light of available experimental and structural data, our data confirm that the first binding mode represents Sbi-IV interaction with C3d (and C3) in a physiological context. Although the second binding mode is inherently less stable, we suggest a possible physiological role. Both binding sites may serve as a template for structure-based design of novel complement therapeutics.

18.
19.
Algorithms for comparing protein structure are frequently used for function annotation. By searching for subtle similarities among very different proteins, these algorithms can identify remote homologs with similar biological functions. In contrast, few comparison algorithms focus on specificity annotation, where the identification of subtle differences among very similar proteins can assist in finding small structural variations that create differences in binding specificity. Few specificity annotation methods consider electrostatic fields, which play a critical role in molecular recognition. To fill this gap, this paper describes VASP-E (Volumetric Analysis of Surface Properties with Electrostatics), a novel volumetric comparison tool based on the electrostatic comparison of protein-ligand and protein-protein binding sites. VASP-E exploits the central observation that three-dimensional solids can be used to fully represent and compare both electrostatic isopotentials and molecular surfaces. With this integrated representation, VASP-E is able to dissect the electrostatic environments of protein-ligand and protein-protein binding interfaces, identifying individual amino acids that have an electrostatic influence on binding specificity. VASP-E was used to examine a nonredundant subset of the serine and cysteine proteases as well as the barnase-barstar and Rap1a-raf complexes. Based on amino acids established by various experimental studies to have an electrostatic influence on binding specificity, VASP-E identified electrostatically influential amino acids with 100% precision and 83.3% recall. We also show that VASP-E can accurately classify closely related ligand binding cavities into groups with different binding preferences. These results suggest that VASP-E should prove a useful tool for the characterization of specific binding and the engineering of binding preferences in proteins.

20.
The genomes of more than 100 species have been sequenced, and the biological functions of encoded proteins are now actively being researched. Protein function is based on interactions between proteins and other molecules. One approach to assuming protein function based on genomic sequence is to predict interactions between an encoded protein and other molecules. As a data source for such predictions, knowledge regarding known protein-small molecule interactions needs to be compiled. We have, therefore, surveyed interactions between proteins and other molecules in Protein Data Bank (PDB), the protein three-dimensional (3D) structure database. Among 20,685 entries in PDB (April, 2003), 4,189 types of small molecules were found to interact with proteins. Biologically relevant small molecules most often found in PDB were metal ions, such as calcium, zinc, and magnesium. Sugars and nucleotides were the next most common. These molecules are known to act as cofactors for enzymes and/or stabilizers of proteins. In each case of interactions between a protein and small molecule, we found preferred amino acid residues at the interaction sites. These preferences can be the basis for predicting protein function from genomic sequence and protein 3D structures. The data pertaining to these small molecules were collected in a database named Het-PDB Navi., which is freely available at http://daisy.nagahama-i-bio.ac.jp/golab/hetpdbnavi.html and linked to the official PDB home page.
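
A survey of this kind starts by tallying the small molecules recorded in each PDB entry. The sketch below shows one way to do that for legacy fixed-column PDB text, where `HETATM` records carry the residue name in columns 18-20, the chain identifier in column 22, and the residue number plus insertion code in columns 23-27. It is only an illustration under those assumptions and is not the procedure used to build Het-PDB Navi; the function names, the helper that builds sample lines, and the sample data are all hypothetical.

```python
from collections import Counter


def count_het_groups(pdb_lines, skip=("HOH",)):
    """Count distinct hetero groups per residue name, ignoring names in `skip`."""
    counts = Counter()
    seen = set()
    for line in pdb_lines:
        if not line.startswith("HETATM"):
            continue
        res_name = line[17:20].strip()            # columns 18-20
        key = (res_name, line[21:22], line[22:27])  # name, chain, number + insertion code
        if res_name in skip or key in seen:
            continue  # skip waters and further atoms of an already counted group
        seen.add(key)
        counts[res_name] += 1
    return counts


def _het_line(serial, atom, res, chain, seq):
    """Build the leading columns of a HETATM record (coordinates omitted)."""
    return "HETATM{:5d} {:<4s} {:>3s} {:1s}{:4d}".format(serial, atom, res, chain, seq)


# Hypothetical sample: one calcium, two zinc ions, one water, one two-atom sugar.
sample = [
    _het_line(1, "CA", "CA", "A", 401),
    _het_line(2, "ZN", "ZN", "A", 402),
    _het_line(3, "ZN", "ZN", "A", 403),
    _het_line(4, "O", "HOH", "A", 501),
    _het_line(5, "C1", "NAG", "B", 601),
    _het_line(6, "C2", "NAG", "B", 601),
    "ATOM      7  CA  ALA A   1",
]
counts = count_het_groups(sample)
```

For the sample lines this yields one `CA`, two `ZN`, and one `NAG` group, with the water skipped. A production survey would more likely rely on a dedicated structure parser and on mmCIF files, and would need extra rules to decide which hetero groups are biologically relevant.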
