Similar articles
20 similar articles found (search time: 31 ms)
1.
SuperStar is an empirical method for identifying interaction sites in proteins, based entirely on experimental information about non-bonded interactions occurring in small-molecule crystal structures, taken from the IsoStar database. We describe recent modifications and additions to SuperStar, validating the results on a test set of 122 X-ray structures of protein-ligand complexes. In this validation, propensity maps are generated for all the binding sites of these proteins, using four different probes: a charged NH3+ nitrogen atom, a carbonyl oxygen atom, a hydroxyl oxygen atom and a methyl carbon atom. Next, the maps are compared with the experimentally observed positions of ligand atoms of these types. A peak-searching algorithm is introduced that highlights potential interaction hot spots. For the three hydrogen-bonding probes (NH3+ nitrogen atom, carbonyl oxygen atom and hydroxyl oxygen atom), the average distance from the ligand atom to the nearest SuperStar peak is 1.0-1.2 Å (0.8-1.0 Å for solvent-inaccessible ligand atoms). For the methyl carbon atom probe, this distance is about 1.5 Å, probably because interactions to methyl groups are much less directional. The most important addition to SuperStar is the enabling of propensity maps around metal centres (Ca2+, Mg2+ and Zn2+) in protein binding sites. The results are validated on a test set of 24 protein-ligand complexes that have a metal ion in their binding site. Coordination geometries are derived automatically, using only the protein atoms that coordinate to the metal ion. The correct coordination geometry is derived in approximately 75% of the cases. If the derived geometry is assumed during the SuperStar calculation, the average distance from a ligand atom coordinating to the metal ion to the nearest peak in the propensity map for an oxygen probe is 0.87(7) Å. If the correct coordination geometry is imposed, this distance reduces to 0.59(7) Å. This indicates that the SuperStar predictions around metal-binding sites are at least as good as those around other protein groups. Using clustering techniques, a non-redundant set of probes is selected from the set of probes available in the IsoStar database. The performance in SuperStar of all these probes is tested on the test set of protein-ligand complexes. With the exception of the "ether oxygen" probe and the "any NH+" probe, all new probes perform as well as the four probes introduced first.
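The validation metric quoted in the abstract above (the average distance from a ligand atom to the nearest map peak) can be illustrated with a minimal Python sketch. The coordinates and function names are hypothetical and chosen only for illustration; this is not SuperStar's own code or peak-searching algorithm.

```python
import math

def nearest_peak_distance(atom, peaks):
    """Distance (in Å) from one ligand atom to the closest propensity-map peak."""
    return min(math.dist(atom, peak) for peak in peaks)

def mean_nearest_peak_distance(atoms, peaks):
    """Average of the nearest-peak distances over all ligand atoms."""
    return sum(nearest_peak_distance(a, peaks) for a in atoms) / len(atoms)

# Hypothetical ligand atom positions and map peaks (x, y, z in Å).
ligand_atoms = [(0.0, 0.0, 0.0), (3.0, 0.0, 0.0)]
peaks = [(0.0, 1.0, 0.0), (3.0, 0.0, 2.0), (10.0, 10.0, 10.0)]

mean_distance = mean_nearest_peak_distance(ligand_atoms, peaks)
print(f"mean nearest-peak distance: {mean_distance:.2f} Å")
```

In practice the atoms and peaks would be restricted to a matching probe type (for example, carbonyl oxygen atoms against carbonyl oxygen peaks) before averaging.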

2.
SuperStar is an empirical method for identifying interaction sites in proteins, based entirely on the experimental information about non-bonded interactions present in the IsoStar database. The interaction information in IsoStar is contained in scatterplots, which show the distribution of a chosen probe around structure fragments. SuperStar breaks a template molecule (e.g. a protein binding site) into structural fragments which correspond to those in the scatterplots. The scatterplots are then superimposed on the corresponding parts of the template and converted into a composite propensity map. The original version of SuperStar was based entirely on scatterplots from the CSD. Here, scatterplots based on protein-ligand interactions are implemented in SuperStar, and validated on a test set of 122 X-ray structures of protein-ligand complexes. In this validation, propensity maps are compared with the experimentally observed positions of ligand atoms of comparable types. Although non-bonded interaction geometries in small-molecule structures are similar to those found in protein-ligand complexes, their relative frequencies of occurrence are different. Polar interactions are more common in the first class of structures, while interactions between hydrophobic groups are more common in protein crystals. In general, PDB-based and CSD-based SuperStar maps appear equally successful in the prediction of protein-ligand interactions. PDB-based maps are better suited to identifying hydrophobic pockets, and inherently take into account the experimental uncertainties of protein atomic positions. If the protonation state of a histidine, aspartate or glutamate protein side-chain is known, specific CSD-based maps for that protonation state are preferred over PDB-based maps, which represent an ensemble of protonation states.

3.
Gorelik B, Goldblum A. Proteins 2008, 71(3): 1373-1386
Multiple near-optimal conformations of protein-ligand complexes provide a better chance for accurate representation of biomolecular interactions, compared with a single structure. We present ISE-dock, a docking program based on the iterative stochastic elimination (ISE) algorithm. ISE eliminates values that consistently lead to the worst results, thus optimizing the search for docking poses. It constructs large sets of such poses with no additional computational cost compared with single poses. ISE-dock was validated using 81 protein-ligand complexes from the PDB, and its performance was compared with those of Glide, GOLD, and AutoDock. ISE-dock has a better chance than the other three to find more than 60% top single poses under RMSD = 2.0 Å and more than 80% under RMSD = 3.0 Å from experimental. ISE alone produced at least one solution of 3.0 Å or better among the top 20 poses in the entire test set. In 98% of the examined molecules, ISE produced solutions that are closer than 2.0 Å from experimental. Paired t-tests (PTT) were used throughout to assess the significance of comparisons between the performances of the different programs. ISE-dock provides more than 100-fold docking solutions in a similar time frame as LGA in AutoDock. We demonstrate the usefulness of the large near-optimal populations of ligand poses by showing a correlation between the docking results and experiments that support multiple binding modes in p38 MAP kinase (Pargellis et al., Nat Struct Biol 2002;9:268-272) and in human transthyretin (Hamilton, Benson, Cell Mol Life Sci 2001;58:1491-1521).

4.
Park MS, Gao C, Stern HA. Proteins 2011, 79(1): 304-314
To investigate the effects of multiple protonation states on protein-ligand recognition, we generated alternative protonation states for selected titratable groups of ligands and receptors. The selection of states was based on the predicted pKa of the unbound receptor and ligand and the proximity of titratable groups of the receptor to the binding site. Various ligand tautomer states were also considered. An independent docking calculation was run for each state. Several protocols were examined: using an ensemble of all generated states of ligand and receptor, using only the most probable state of the unbound ligand/receptor, and using only the state giving the most favorable docking score. The accuracies of these approaches were compared, using a set of 176 protein-ligand complexes (15 receptors) for which crystal structures and measured binding affinities are available. The best agreement with experiment was obtained when ligand poses from experimental crystal structures were used. For 9 of 15 receptors, using an ensemble of all generated protonation states of the ligand and receptor gave the best correlation between calculated and measured affinities.

5.
Ruvinsky AM, Kozintsev AV. Proteins 2006, 62(1): 202-208
We present two novel methods to predict native protein-ligand binding positions. Both methods identify the native binding position as the most probable position, corresponding to a maximum of a probability distribution function (PDF) of possible binding positions in a protein active site. Possible binding positions are the origins of clusters formed, on the basis of root-mean-square deviations (RMSD), from the multiple ligand positions determined by a docking algorithm. The difference between the methods lies in the ways the PDF is derived. To validate the suggested methods, we compare the averaged RMSD of the predicted ligand docked positions relative to the experimentally determined positions for a set of 135 PDB protein-ligand complexes. We demonstrate that the suggested methods improve docking accuracy by as much as 21-24% in comparison with a method that simply identifies the binding position as the energy top-scored ligand position.
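The idea of grouping docked positions into RMSD clusters and choosing the most probable cluster origin can be sketched as follows. This is a deliberately simplified illustration: it uses greedy leader clustering and takes the cluster population as a crude probability estimate, both of which are assumptions of this sketch and not the PDF derivations of the two published methods. All names and coordinates are hypothetical.

```python
import math

def rmsd(pose_a, pose_b):
    """RMSD between two poses given as equal-length lists of (x, y, z) atoms."""
    squared = sum(math.dist(p, q) ** 2 for p, q in zip(pose_a, pose_b))
    return math.sqrt(squared / len(pose_a))

def cluster_poses(poses, cutoff=2.0):
    """Greedy leader clustering: a pose joins the first cluster whose origin
    lies within `cutoff` Å RMSD; otherwise it becomes a new cluster origin."""
    clusters = []  # list of (origin_index, member_indices)
    for i, pose in enumerate(poses):
        for origin, members in clusters:
            if rmsd(poses[origin], pose) <= cutoff:
                members.append(i)
                break
        else:
            clusters.append((i, [i]))
    return clusters

def most_probable_origin(poses, cutoff=2.0):
    """Return the origin of the most populated cluster and its population fraction."""
    clusters = cluster_poses(poses, cutoff)
    origin, members = max(clusters, key=lambda c: len(c[1]))
    return origin, len(members) / len(poses)

# Hypothetical single-atom "poses", assumed to be ordered by docking score.
poses = [[(0.0, 0.0, 0.0)], [(0.5, 0.0, 0.0)], [(10.0, 0.0, 0.0)], [(1.0, 0.0, 0.0)]]
best_origin, probability = most_probable_origin(poses)
print(best_origin, probability)
```

Because the poses are assumed to be sorted by score, each cluster origin is the best-scored member of its cluster; the selected position can then differ from the overall top-scored pose when a lower-ranked cluster is more populated.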

6.
GEMDOCK: a generic evolutionary method for molecular docking   Cited by: 1 (self-citations: 0, by others: 1)
Yang JM, Chen CC. Proteins 2004, 55(2): 288-304
We have developed an evolutionary approach for flexible ligand docking. This approach, GEMDOCK, uses a Generic Evolutionary Method for molecular DOCKing and an empirical scoring function. The former combines both discrete and continuous global search strategies with local search strategies to speed up convergence, whereas the latter results in rapid recognition of potential ligands. GEMDOCK was tested on a diverse data set of 100 protein-ligand complexes from the Protein Data Bank. In 79% of these complexes, the docked lowest-energy ligand structures had root-mean-square deviations (RMSDs) below 2.0 Å with respect to the corresponding crystal structures. The success rate increased to 85% if the structural water molecules were retained. We evaluated GEMDOCK on two cross-docking experiments in which each ligand of a protein ensemble was docked into each protein of the ensemble. Seventy-six percent of the docked structures had RMSDs below 2.0 Å when the ligands were docked into foreign structures. We analyzed and validated GEMDOCK with respect to various search spaces and scoring functions, and found that if the scoring function was perfect, then the predicted accuracy was also essentially perfect. This study suggests that GEMDOCK is a useful tool for molecular recognition and may be used to systematically evaluate and thus improve scoring functions.
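The success rates reported above are fractions of complexes whose lowest-energy docked pose falls below an RMSD threshold. A minimal Python sketch of that bookkeeping, with made-up RMSD values and a hypothetical function name, might look like this:

```python
def success_rate(rmsds, threshold=2.0):
    """Fraction of complexes whose top-ranked pose has RMSD below `threshold` (Å)."""
    return sum(r < threshold for r in rmsds) / len(rmsds)

# Hypothetical RMSDs (Å) of the lowest-energy pose for five complexes.
top_pose_rmsds = [0.8, 1.9, 2.4, 1.2, 5.6]

rate_2A = success_rate(top_pose_rmsds)        # threshold of 2.0 Å
rate_3A = success_rate(top_pose_rmsds, 3.0)   # looser threshold of 3.0 Å
print(f"{rate_2A:.0%} below 2.0 Å, {rate_3A:.0%} below 3.0 Å")
```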

7.
A new approach, MOBILE, is presented that models protein binding-sites including bound ligand molecules as restraints. Initially generated homology models of the target protein are refined iteratively by including information about bioactive ligands as spatial restraints and optimising the mutual interactions between the ligands and the binding-sites. Models optimised in this way can be used for structure-based drug design and virtual screening. In a first step, ligands are docked into an averaged ensemble of crude homology models of the target protein. In the next step, improved homology models are generated, considering explicitly the previously placed ligands by defining restraints between protein and ligand atoms. These restraints are expressed in terms of knowledge-based distance-dependent pair potentials, which were compiled from crystallographically determined protein-ligand complexes. Subsequently, the most favourable models are selected by ranking the interactions between the ligands and the generated pockets using these potentials. Final models are obtained by selecting the best-ranked side-chain conformers from various models, followed by an energy optimisation of the entire complex using a common force-field. Application of the knowledge-based pair potentials proved efficient to restrain the homology modelling process and to score and optimise the modelled protein-ligand complexes. For a test set of 46 protein-ligand complexes, taken from the Protein Data Bank (PDB), the success rate of producing near-native binding-site geometries (rmsd < 2.0 Å) with MODELLER is 70% when the ligand restrains the homology modelling process in its native orientation. When these complexes are scored with the knowledge-based potentials, a pose with rmsd < 2.0 Å is found at rank 1 in 66% of the cases. Finally, MOBILE has been applied to two case studies: modelling factor Xa based on trypsin, and aldose reductase based on aldehyde reductase.

8.
Hu L, Benson ML, Smith RD, Lerner MG, Carlson HA. Proteins 2005, 60(3): 333-340
Binding MOAD (Mother of All Databases) is the largest collection of high-quality protein-ligand complexes available from the Protein Data Bank. At this time, Binding MOAD contains 5331 protein-ligand complexes comprising 1780 unique protein families and 2630 unique ligands. We have searched the crystallography papers for all 5000+ structures and compiled binding data for 1375 (26%) of the protein-ligand complexes. The binding-affinity data span 13 orders of magnitude. This is the largest collection of binding data reported to date in the literature. We have also addressed the issue of redundancy in the data. To create a nonredundant dataset, one protein from each of the 1780 protein families was chosen as a representative. Representatives were chosen by tightest binding, best resolution, etc. Of the 1780 "best" complexes that comprise the nonredundant version of Binding MOAD, 475 (27%) have binding data. This significant collection of protein-ligand complexes will be very useful in elucidating the biophysical patterns of molecular recognition and enzymatic regulation. The complexes with binding-affinity data will help in the development of improved scoring functions and structure-based drug discovery techniques. The dataset can be accessed at http://www.BindingMOAD.org.

9.
Critical Assessment of PRedicted Interactions (CAPRI) has proven to be a catalyst for the development of docking algorithms. An essential step in docking is the scoring of predicted binding modes in order to identify stable complexes. In 2005, CAPRI introduced the scoring experiment, in which, upon completion of a prediction round, a larger set of models predicted by different groups and comprising both correct and incorrect binding modes is made available to all participants for testing new scoring functions independently from docking calculations. Here we present an expanded benchmark data set for testing scoring functions, which comprises the consolidated ensemble of predicted complexes made available in the CAPRI scoring experiment since its inception. This consolidated scoring benchmark contains predicted complexes for 15 published CAPRI targets. These targets were subjected to 23 CAPRI assessments, owing to the existence of multiple binding modes for some targets. The benchmark contains more than 19,000 protein complexes. About 10% of the complexes represent docking predictions of acceptable quality or better; the remainder represent incorrect solutions (decoys). The benchmark set contains models predicted by 47 different predictor groups including web servers, which use different docking and scoring procedures, and is arguably as diverse as one may expect, representing the state of the art in protein docking. The data set is publicly available at the following URL: http://cb.iri.univ‐lille1.fr/Users/lensink/Score_set . Proteins 2014; 82:3163–3169. © 2014 Wiley Periodicals, Inc.

10.
We propose a self-consistent approach to analyze knowledge-based atom-atom potentials used to calculate protein-ligand binding energies. Ligands complexed to actual protein structures were first built using the SMoG growth procedure (DeWitte & Shakhnovich, 1996) with a chosen input potential. These model protein-ligand complexes were used to construct databases from which knowledge-based protein-ligand potentials were derived. We then tested several different modifications to such potentials and evaluated them on their ability to reconstruct the input potential using the statistical information available from a database composed of model complexes. Our data indicate that the most significant improvement resulted from properly accounting for the following key issues when estimating the reference state: (1) the presence of significant nonenergetic effects that influence the contact frequencies and (2) the presence of correlations in contact patterns due to chemical structure. The most successful procedure was applied to derive an atom-atom potential for real protein-ligand complexes. Despite the simplicity of the model (pairwise contact potential with a single interaction distance), the derived binding free energies showed a statistically significant correlation (approximately 0.65) with experimental binding scores for a diverse set of complexes.
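For readers unfamiliar with knowledge-based contact potentials, the generic inverse-Boltzmann construction can be sketched in a few lines of Python. The sketch assumes the simplest possible reference state (expected contacts proportional to the product of the atom-type frequencies), which is exactly the kind of naive reference state the abstract argues must be corrected; it is not the procedure developed in the paper. Atom types, counts and names are hypothetical.

```python
import math
from collections import Counter

def contact_potential(contacts, kT=1.0):
    """Inverse-Boltzmann contact potential, E = -kT * ln(observed / expected).

    `contacts` is a list of (protein_type, ligand_type) pairs observed within a
    single interaction distance. The expected count assumes independent mixing
    of atom types, a simplifying assumption made for this sketch only.
    """
    pair_counts = Counter(contacts)
    protein_counts = Counter(p for p, _ in contacts)
    ligand_counts = Counter(l for _, l in contacts)
    total = len(contacts)
    potential = {}
    for (p, l), observed in pair_counts.items():
        expected = protein_counts[p] * ligand_counts[l] / total
        potential[(p, l)] = -kT * math.log(observed / expected)
    return potential

# Hypothetical contact list: polar-polar and apolar-apolar pairs are over-represented.
contacts = [("O", "N")] * 3 + [("C", "C")] * 3 + [("O", "C")] + [("C", "N")]
potential = contact_potential(contacts)
for pair, energy in sorted(potential.items()):
    print(pair, round(energy, 3))
```

Pairs seen more often than expected receive negative (favourable) energies, and pairs seen less often receive positive ones; only pairs that actually occur in the contact list are assigned a value.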

11.
RNA molecules have recently become attractive as potential drug targets due to the increased awareness of their importance in key biological processes. The increase in the number of experimentally determined RNA 3D structures has enabled structure-based searches for small molecules that can specifically bind to defined sites in RNA molecules, thereby blocking or otherwise modulating their function. However, computational methods for structure-based docking of small-molecule ligands to RNA molecules are not yet as well established as analogous methods for protein-ligand docking. This motivated us to create LigandRNA, a scoring function for the prediction of RNA–small molecule interactions. Our method employs a grid-based algorithm and a knowledge-based potential derived from ligand-binding sites in the experimentally solved RNA–ligand complexes. As an input, LigandRNA takes an RNA receptor file and a file with ligand poses. As an output, it returns a ranking of the poses according to their score. The predictive power of LigandRNA compares favorably to that of five other publicly available methods. We found that the combination of LigandRNA and Dock6 into a “meta-predictor” leads to further improvement in the identification of near-native ligand poses. The LigandRNA program is available free of charge as a web server at http://ligandrna.genesilico.pl.

12.
13.
Ghersi D, Sanchez R. Proteins 2009, 74(2): 417-424
The use of predicted binding sites (binding sites calculated from the protein structure alone) is evaluated here as a tool to focus the docking of small-molecule ligands into protein structures, simulating cases where the real binding sites are unknown. The resulting approach consists of a few independent docking runs carried out on small boxes, centered on the predicted binding sites, as opposed to one larger blind docking run that covers the complete protein structure. The focused and blind approaches were compared using a set of 77 known protein-ligand complexes and 19 ligand-free structures. The focused approach is shown to: (1) identify the correct binding site more frequently than blind docking; (2) produce more accurate docking poses for the ligand; (3) require less computational time. Additionally, the results show that very few real binding sites are missed in spite of focusing on only three predicted binding sites per target protein. Overall, the results indicate that, by improving the sampling in regions that are likely to correspond to binding sites, the focused docking approach increases the accuracy and efficiency of protein-ligand docking for those cases where the ligand-binding site is unknown. This is especially relevant in applications such as reverse virtual screening and structure-based functional annotation of proteins.
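The focused-docking setup described above (a few small boxes centered on the top predicted sites instead of one box around the whole protein) can be sketched as below. The box edge length, the site coordinates and all function names are assumptions of this sketch; the abstract specifies only that three predicted sites per target were used.

```python
def focused_boxes(site_centers, edge=20.0, top_n=3):
    """Build axis-aligned cubic search boxes around the top-ranked predicted sites.

    `site_centers` is assumed to be ordered from best to worst prediction.
    Each box is returned as (lower_corner, upper_corner).
    """
    half = edge / 2.0
    boxes = []
    for cx, cy, cz in site_centers[:top_n]:
        lower = (cx - half, cy - half, cz - half)
        upper = (cx + half, cy + half, cz + half)
        boxes.append((lower, upper))
    return boxes

def contains(box, point):
    """True if `point` lies inside (or on the surface of) the box."""
    lower, upper = box
    return all(lo <= c <= hi for lo, c, hi in zip(lower, point, upper))

# Hypothetical predicted site centers (Å), ranked by the site-prediction method.
predicted_sites = [(0.0, 0.0, 0.0), (30.0, 0.0, 0.0), (0.0, 30.0, 0.0), (50.0, 50.0, 50.0)]
boxes = focused_boxes(predicted_sites)

# Check which boxes would cover a hypothetical experimental ligand center.
ligand_center = (28.0, 3.0, -2.0)
covering = [i for i, box in enumerate(boxes) if contains(box, ligand_center)]
print(len(boxes), covering)
```

Each box would then be passed to an independent docking run, and a real binding site counts as "missed" when no box covers it.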

14.
15.
Grosdidier A, Zoete V, Michielin O. Proteins 2007, 67(4): 1010-1025
In recent years, protein-ligand docking has become a powerful tool for drug development. Although several approaches suitable for high-throughput screening are available, there is a need for methods able to identify binding modes with high accuracy. This accuracy is essential to reliably compute the binding free energy of the ligand. Such methods are needed when the binding mode of lead compounds is not determined experimentally but is needed for structure-based lead optimization. We present here new docking software, called EADock, that aims at this goal. It uses a hybrid evolutionary algorithm with two fitness functions, in combination with a sophisticated management of the diversity. EADock is interfaced with the CHARMM package for energy calculations and coordinate handling. A validation was carried out on 37 crystallized protein-ligand complexes featuring 11 different proteins. The search space was defined as a sphere of 15 Å around the center of mass of the ligand position in the crystal structure, and, in contrast to other benchmarks, our algorithm was fed with optimized ligand positions up to 10 Å root mean square deviation (RMSD) from the crystal structure, excluding the latter. This validation illustrates the efficiency of our sampling strategy, as correct binding modes, defined by an RMSD to the crystal structure lower than 2 Å, were identified and ranked first for 68% of the complexes. The success rate increases to 78% when considering the five best-ranked clusters, and 92% when all clusters present in the last generation are taken into account. Most failures could be explained by the presence of crystal contacts in the experimental structure. Finally, the ability of EADock to accurately predict binding modes in a real application was illustrated by the successful docking of the RGD cyclic pentapeptide on the alphaVbeta3 integrin, starting far away from the binding pocket.

16.
Seebeck B, Reulecke I, Kämper A, Rarey M. Proteins 2008, 71(3): 1237-1254
The accurate modeling of metal coordination geometries plays an important role in structure-based drug design applied to metalloenzymes. For the development of a new metal interaction model, we perform a statistical analysis of metal interaction geometries that are relevant to protein-ligand complexes. A total of 43,061 metal sites of the Protein Data Bank (PDB), containing amongst others magnesium, calcium, zinc, iron, manganese, copper, cadmium, cobalt, and nickel, were evaluated according to their metal coordination geometry. Based on statistical analysis, we derived a model for the automatic calculation and definition of metal interaction geometries for the purpose of molecular docking analyses. It includes the identification of the metal-coordinating ligands, the calculation of the coordination geometry and the superposition of ideal polyhedra to identify the optimal positions for free coordination sites. The new interaction model was integrated in the docking software FlexX and evaluated on a data set of 103 metalloprotein-ligand complexes, which were extracted from the PDB. In a first step, the quality of the automatic calculation of the metal coordination geometry was analyzed. In 74% of the cases, the coordination geometry was predicted correctly on the basis of the protein structure alone. Secondly, the new metal interaction model was tested in terms of predicting protein-ligand complexes. In the majority of test cases, the new interaction model resulted in an improved docking accuracy of the top-ranking placements.

17.
We present a novel notion of binding site local similarity based on the analysis of complete protein environments of ligand fragments. Comparison of a query protein binding site (target) against the 3D structure of another protein (analog) in complex with a ligand enables ligand fragments from the analog complex to be transferred to positions in the target site, so that the complete protein environments of the fragment and its image are similar. The revealed environments are similarity regions, and the fragments transferred to the target site are considered as binding patterns. The set of such binding patterns derived from a database of analog complexes forms a cloud-like structure (fragment cloud), which is a powerful tool for computational drug design. It has been shown on independent test sets that the combined use of a traditional energy-based score together with the cloud-based score, which measures the quality of embedding of a ligand into the fragment cloud, improves the self-docking and screening results dramatically. The use of a fragment cloud as a source of positioned molecular fragments fitting the binding protein environment has been validated by reproduction of experimental ligand optimization results.

18.
Mooij WT, Verdonk ML. Proteins 2005, 61(2): 272-287
We present a novel atom-atom potential derived from a database of protein-ligand complexes. First, we clarify the similarities and differences between two statistical potentials described in the literature, PMF and Drugscore. We highlight shortcomings caused by an important factor unaccounted for in their reference states, and describe a new potential, which we name the Astex Statistical Potential (ASP). ASP's reference state considers the difference in exposure of protein atom types towards ligand binding sites. We show that this new potential predicts binding affinities with an accuracy similar to that of Goldscore and Chemscore. We investigate the influence of the choice of reference state by constructing two additional statistical potentials that differ from ASP only in this respect. The reference states in these two potentials are defined along the lines of Drugscore and PMF. In docking experiments, the potential using the new reference state proposed for ASP gives better success rates than when these literature reference states were used; a success rate similar to that of the established scoring functions Goldscore and Chemscore is achieved with ASP. This is the case both for a large, general validation set of protein-ligand structures and for small test sets of actives against four pharmaceutically relevant targets. Virtual screening experiments for these targets show less discrimination between the different reference states in terms of enrichment. In addition, we describe how statistical potentials can be used in the construction of targeted scoring functions. Examples are given for cdk2, using four different targeted scoring functions, biased towards increasingly large target-specific databases. Using these targeted scoring functions, docking success rates as well as enrichments are significantly better than for the general ASP scoring function. Results improve with the number of structures used in the construction of the targeted scoring functions, thus illustrating that these targeted ASP potentials can be continuously improved as new structural data become available.

19.
Protein-ligand docking: current status and future challenges   Cited by: 1 (self-citations: 0, by others: 1)
Understanding the ruling principles whereby protein receptors recognize, interact, and associate with molecular substrates and inhibitors is of paramount importance in drug discovery efforts. Protein-ligand docking aims to predict and rank the structure(s) arising from the association between a given ligand and a target protein of known 3D structure. Despite the breathtaking advances in the field over the last decades and the widespread application of docking methods, several downsides still exist. In particular, protein flexibility, a critical aspect for a thorough understanding of the principles that guide ligand binding in proteins, is a major hurdle in current protein-ligand docking efforts and needs to be accounted for more efficiently. In this review the key concepts of protein-ligand docking methods are outlined, with major emphasis being given to the general strengths and weaknesses that presently characterize this methodology. Despite the size of the field, the principal types of search algorithms and scoring functions are reviewed and the most popular docking tools are briefly presented. Recent advances that aim to address some of the traditional limitations associated with molecular docking are also described. A selection of hand-picked examples is used to illustrate these features.

20.
A thorough evaluation of some of the most advanced docking and scoring methods currently available is described, and guidelines for the choice of an appropriate protocol for docking and virtual screening are defined. The generation of a large and highly curated test set of pharmaceutically relevant protein-ligand complexes with known binding affinities is described, and three highly regarded docking programs (Glide, GOLD, and ICM) are evaluated on the same set with respect to their ability to reproduce crystallographic binding orientations. Glide correctly identified the crystallographic pose within 2.0 Å in 61% of the cases, versus 48% for GOLD and 45% for ICM. In general, Glide appears to perform most consistently with respect to diversity of binding sites and ligand flexibility, while the performance of ICM and GOLD is more binding site-dependent and is significantly poorer when binding is predominantly driven by hydrophobic interactions. The results also show that energy minimization and reranking of the top N poses can be an effective means to overcome some of the limitations of a given docking function. The same docking programs are evaluated in conjunction with three different scoring functions for their ability to discriminate actives from inactives in virtual screening. The evaluation, performed on three different systems (HIV-1 protease, IMPDH, and p38 MAP kinase), confirms that the relative performance of different docking and scoring methods is to some extent binding site-dependent. GlideScore appears to be an effective scoring function for database screening, with consistent performance across several types of binding sites, while ChemScore appears to be most useful in sterically demanding sites since it is more forgiving of repulsive interactions. Energy minimization of docked poses can significantly improve the enrichments in systems with sterically demanding binding sites. Overall, Glide appears to be a safe general choice for docking, while the choice of the best scoring tool remains to a larger extent system-dependent and should be evaluated on a case-by-case basis.
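The "enrichments" mentioned above measure how strongly a scoring function concentrates known actives at the top of a ranked database. One common definition, the enrichment factor at a given fraction of the ranked list, is sketched below in Python. The abstract does not state which enrichment measure was used, so this particular formula, the convention that lower scores are better, and the toy data are all assumptions of the sketch.

```python
def enrichment_factor(scores, is_active, fraction=0.1):
    """Enrichment factor at the top `fraction` of a score-ranked database.

    EF = (actives in top subset / size of top subset) / (all actives / database size).
    Lower scores are assumed to be better.
    """
    ranked = sorted(zip(scores, is_active), key=lambda pair: pair[0])
    n_top = max(1, round(len(ranked) * fraction))
    actives_top = sum(active for _, active in ranked[:n_top])
    actives_total = sum(is_active)
    return (actives_top / n_top) / (actives_total / len(ranked))

# Hypothetical database of 20 compounds with 4 actives (at score ranks 0, 5, 10, 15).
scores = [float(i) for i in range(20)]
is_active = [i in (0, 5, 10, 15) for i in range(20)]

ef_10 = enrichment_factor(scores, is_active, fraction=0.1)
print(f"EF(10%) = {ef_10:.2f}")
```

An enrichment factor of 1 corresponds to random ranking; values above 1 indicate that actives are found earlier than chance would predict.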
