Similar articles
 20 similar articles found (search time: 31 ms)
1.
Zhang C  Liu S  Zhou H  Zhou Y 《Biophysical journal》2004,86(6):3349-3358
An accurate statistical energy function that is suitable for the prediction of protein structures of all classes should be independent of the structural database used for energy extraction. Here, two high-resolution, low-sequence-identity structural databases of 333 alpha-proteins and 271 beta-proteins were built for examining the database dependence of three all-atom statistical energy functions. They are RAPDF (residue-specific all-atom conditional probability discriminatory function), atomic KBP (atomic knowledge-based potential), and DFIRE (statistical potential based on distance-scaled finite ideal-gas reference state). These energy functions differ in the reference states used for energy derivation. The energy functions extracted from the different structural databases are used to select native structures from multiple decoys of 64 alpha-proteins and 28 beta-proteins. The performance in native structure selections indicates that the DFIRE-based energy function is mostly independent of the structural database whereas RAPDF and KBP have a significant dependence. The construction of two additional structural databases of alpha/beta and alpha + beta-proteins further confirmed the weak dependence of DFIRE on the structural databases of various structural classes. The possible source for the difference between the three all-atom statistical energy functions is that the physical reference state of ideal gas used in the DFIRE-based energy function is least dependent on the structural database.  相似文献   
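The role of the reference state in such a potential can be illustrated with a minimal sketch. It assumes the commonly quoted distance-scaled form, in which the expected pair count in each bin is the cutoff-bin count scaled by (r/r_cut)^alpha. The function name, the argument layout, and the default exponent of 1.61 are illustrative assumptions, not taken from the abstract above.

```python
import math

def dfire_like_potential(n_obs, r_mid, dr, alpha=1.61, eta_rt=1.0):
    """Per-bin pair energy for one atom-type pair (hypothetical helper).

    n_obs : observed pair counts per distance bin; the last bin is the cutoff bin
    r_mid : bin mid-point distances, same length as n_obs
    dr    : bin widths, same length as n_obs
    alpha : distance-scaling exponent (assumed default, not given in the abstract)
    eta_rt: overall energy scale (assumed to be 1.0 here)
    """
    r_cut, dr_cut, n_cut = r_mid[-1], dr[-1], n_obs[-1]
    if n_cut <= 0:
        raise ValueError("the cutoff bin must contain at least one observed pair")
    energies = []
    for n, r, w in zip(n_obs, r_mid, dr):
        # Expected count without interaction: cutoff-bin count scaled by distance and bin width
        n_exp = (r / r_cut) ** alpha * (w / dr_cut) * n_cut
        if n == 0:
            energies.append(float("inf"))  # unobserved bins are treated as forbidden in this sketch
        else:
            energies.append(-eta_rt * math.log(n / n_exp))
    return energies
```

Because the expected counts come only from the cutoff bin and a fixed power law, the reference state in this sketch does not depend on the amino acid composition of the structural database, which is one way to read the weak database dependence reported above.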

2.
Structure prediction on a genomic scale requires a simplified energy function that can efficiently sample the conformational space of polypeptide chains. A good energy function at minimum should discriminate native structures against decoys. Here, we show that a recently developed, residue-specific, all-atom knowledge-based potential (167 atomic types) based on distance-scaled, finite ideal-gas reference state (DFIRE-all-atom) can be substantially simplified to 20 residue types located at side-chain center of mass (DFIRE-SCM) without a significant change in its capability of structure discrimination. Using 96 standard multiple decoy sets, we show that there is only a small reduction (from 80% to 78%) in success rate of ranking native structures as the top 1. The success rate is higher than two previously developed, all-atom distance-dependent statistical pair potentials. Applied to structure selections of 21 docking decoys without modification, the DFIRE-SCM potential is 29% more successful in recognizing native complex structures than an all-atom statistical potential trained by a database of dimeric interfaces. The potential also achieves 92% accuracy in distinguishing true dimeric interfaces from artificial crystal interfaces. In addition, the DFIRE potential with the C(alpha) positions as the interaction centers recognizes 123 native structures out of a comprehensive 125-protein TOUCHSTONE decoy set in which each protein has 24,000 decoys with only C(alpha) positions. Furthermore, the performance by DFIRE-SCM on newly established 25 monomeric and 31 docking Rosetta-decoy sets is comparable to (or better than in the case of monomeric decoy sets) that of a recently developed, all-atom Rosetta energy function enhanced with an orientation-dependent hydrogen bonding potential.  相似文献   

3.
We developed a series of statistical potentials to recognize the native protein from decoys, particularly when using only a reduced representation in which each side chain is treated as a single C(beta) atom. Beginning with a highly successful all-atom statistical potential, the Discrete Optimized Protein Energy function (DOPE), we considered the implications of including additional information in the all-atom statistical potential and subsequently reducing to the C(beta) representation. One of the potentials includes interaction energies conditional on backbone geometries. A second potential separates sequence local from sequence nonlocal interactions and introduces a novel reference state for the sequence local interactions. The resultant potentials perform better than the original DOPE statistical potential in decoy identification. Moreover, even upon passing to a reduced C(beta) representation, these statistical potentials outscore the original (all-atom) DOPE potential in identifying native states for sets of decoys. Interestingly, the backbone-dependent statistical potential is shown to retain nearly all of the information content of the all-atom representation in the C(beta) representation. In addition, these new statistical potentials are combined with existing potentials to model hydrogen bonding, torsion energies, and solvation energies to produce even better performing potentials. The ability of the C(beta) statistical potentials to accurately represent protein interactions bodes well for computational efficiency in protein folding calculations using reduced backbone representations, while the extensions to DOPE illustrate general principles for improving knowledge-based potentials.  相似文献   

4.
An accurate scoring function is a key component for successful protein structure prediction. To address this important unsolved problem, we develop a generalized orientation and distance-dependent all-atom statistical potential. The new statistical potential, generalized orientation-dependent all-atom potential (GOAP), depends on the relative orientation of the planes associated with each heavy atom in interacting pairs. GOAP is a generalization of previous orientation-dependent potentials that consider only representative atoms or blocks of side-chain or polar atoms. GOAP is decomposed into distance- and angle-dependent contributions. The DFIRE distance-scaled finite ideal gas reference state is employed for the distance-dependent component of GOAP. GOAP was tested on 11 commonly used decoy sets containing 278 targets, and recognized 226 native structures as best from the decoys, whereas DFIRE recognized 127 targets. The major improvement comes from decoy sets that have homology-modeled structures that are close to native (all within ∼4.0 Å) or from the ROSETTA ab initio decoy set. For these two kinds of decoys, orientation-independent DFIRE or only side-chain orientation-dependent RWplus performed poorly. Although the OPUS-PSP block-based orientation-dependent, side-chain atom contact potential performs much better (recognizing 196 targets) than DFIRE, RWplus, and dDFIRE, it is still ∼15% worse than GOAP. Thus, GOAP is a promising advance in knowledge-based, all-atom statistical potentials. GOAP is available for download at http://cssb.biology.gatech.edu/GOAP.  相似文献   

5.
The conformations of loops are determined by the water-mediated interactions between amino acid residues. Energy functions that describe the interactions can be derived either from physical principles (physical-based energy function) or statistical analysis of known protein structures (knowledge-based statistical potentials). It is commonly believed that statistical potentials are appropriate for coarse-grained representation of proteins but are not as accurate as physical-based potentials when atomic resolution is required. Several recent applications of physical-based energy functions to loop selections appear to support this view. In this article, we apply a recently developed DFIRE-based statistical potential to three different loop decoy sets (RAPPER, Jacobson, and Forrest-Woolf sets). Together with a rotamer library for side-chain optimization, the performance of the DFIRE-based potential in the RAPPER decoy set (385 loop targets) is comparable to that of AMBER/GBSA for short loops (two to eight residues). DFIRE is more accurate for longer loops (9 to 12 residues). A similar trend is observed when comparing DFIRE with another physical-based energy function, OPLS/SGB-NP, in the large Jacobson decoy set (788 loop targets). In the Forrest-Woolf decoy set for the loops of membrane proteins, the DFIRE potential performs substantially better than the combination of the CHARMM force field with several solvation models. The results suggest that a single-term DFIRE statistical energy function can provide accurate loop prediction at a fraction of the computing cost required for more complicated physical-based energy functions. A Web server for academic users is established for loop selection at the softwares/services section of the Web site http://theory.med.buffalo.edu/.

6.
Huang SY  Zou X 《Proteins》2011,79(9):2648-2661
In this study, we have developed a statistical mechanics-based iterative method to extract statistical atomic interaction potentials from known, nonredundant protein structures. Our method circumvents the long-standing reference state problem in deriving traditional knowledge-based scoring functions, by using rapid iterations through a physical, global convergence function. The rapid convergence of this physics-based method, unlike other parameter optimization methods, warrants the feasibility of deriving distance-dependent, all-atom statistical potentials to keep the scoring accuracy. The derived potentials, referred to as ITScore/Pro, have been validated using three diverse benchmarks: the high-resolution decoy set, the AMBER benchmark decoy set, and the CASP8 decoy set. Significant improvement in performance has been achieved. Finally, comparisons between the potentials of our model and potentials of a knowledge-based scoring function with a randomized reference state have revealed the reason for the better performance of our scoring function, which could provide useful insight into the development of other physical scoring functions. The potentials developed in this study are generally applicable for structural selection in protein structure prediction.  相似文献   

7.
Liu S  Zhang C  Zhou H  Zhou Y 《Proteins》2004,56(1):93-101
Extracting a knowledge-based statistical potential from known protein structures has proved to be a simple, effective method to obtain an approximate free-energy function. However, the different compositions of amino acid residues at the core, the surface, and the binding interface of proteins have prohibited the establishment of a unified statistical potential for folding and binding despite the fact that the physical basis of the interaction (water-mediated interaction between amino acids) is the same. Recently, a physical state of ideal gas, rather than a statistically averaged state, has been used as the reference state for extracting the net interaction energy between amino acid residues of monomeric proteins. Here, we find that this monomer-based potential is more accurate than an existing all-atom knowledge-based potential trained with interfacial structures of dimers in distinguishing native complex structures from docking decoys (100% success rate vs. 52% in 21 dimer/trimer decoy sets). It is also more accurate than a recently developed semiphysical empirical free-energy functional enhanced by an orientation-dependent hydrogen-bonding potential in distinguishing native state from Rosetta docking decoys (94% success rate vs. 74% in 31 antibody-antigen and other complexes based on Z score). In addition, the monomer potential achieved a 93% success rate in distinguishing true dimeric interfaces from artificial crystal interfaces. More importantly, without additional parameters, the potential provides an accurate prediction of binding free energy of protein-peptide and protein-protein complexes (a correlation coefficient of 0.87 and a root-mean-square deviation of 1.76 kcal/mol with 69 experimental data points). This work marks a significant step toward a unified knowledge-based potential that quantitatively captures the common physical principle underlying folding and binding. A Web server for academic users, established for the prediction of binding free energy and the energy evaluation of the protein-protein complexes, may be found at http://theory.med.buffalo.edu.

8.
RNA molecules play integral roles in gene regulation, and understanding their structures gives us important insights into their biological functions. Despite recent developments in template-based and parameterized energy functions, the structure of RNA--in particular the nonhelical regions--is still difficult to predict. Knowledge-based potentials have proven efficient in protein structure prediction. In this work, we describe two differentiable knowledge-based potentials derived from a curated data set of RNA structures, with all-atom or coarse-grained representation, respectively. We focus on one aspect of the prediction problem: the identification of native-like RNA conformations from a set of near-native models. Using a variety of near-native RNA models generated from three independent methods, we show that our potential is able to distinguish the native structure and identify native-like conformations, even at the coarse-grained level. The all-atom version of our knowledge-based potential performs better and appears to be more effective at discriminating near-native RNA conformations than one of the most highly regarded parameterized potentials. The fully differentiable form of our potentials will additionally likely be useful for structure refinement and/or molecular dynamics simulations.

9.
Protein decoy data sets provide a benchmark for testing scoring functions designed for fold recognition and protein homology modeling problems. It is commonly believed that statistical potentials based on reduced atomic models are better able to discriminate native-like from misfolded decoys than scoring functions based on more detailed molecular mechanics models. Recent benchmark tests on small data sets, however, suggest otherwise. In this work, we report the results of extensive decoy detection tests using an effective free energy function based on the OPLS all-atom (OPLS-AA) force field and the Surface Generalized Born (SGB) model for the solvent electrostatic effects. The OPLS-AA/SGB effective free energy is used as a scoring function to detect native protein folds among a total of 48,832 decoys for 32 different proteins from Park and Levitt's 4-state-reduced, Levitt's local-minima, Baker's ROSETTA all-atom, and Skolnick's decoy sets. Solvent electrostatic effects are included through the Surface Generalized Born (SGB) model. All structures are locally minimized without restraints. From an analysis of the individual energy components of the OPLS-AA/SGB energy function for the native and the best-ranked decoy, it is determined that a balance of the terms of the potential is responsible for the minimized energies that most successfully distinguish the native from the misfolded conformations. Different combinations of individual energy terms provide less discrimination than the total energy. The results are consistent with observations that all-atom molecular potentials coupled with intermediate level solvent dielectric models are competitive with knowledge-based potentials for decoy detection and protein modeling problems such as fold recognition and homology modeling.  相似文献   

10.
A variational approach to deriving knowledge-based potentials is considered for the first time. In this approach, the derivation of knowledge-based potentials is posed as an optimization task in a multiparametric model of atom types, reference states, and interaction cutoff radii. By analogy with liquid-state theory, we propose four new reference states and derive the corresponding knowledge-based potentials. The cutoff radii and atom types are optimized to minimize the averaged root-mean-square deviation (RMSD) of the docked ligand positions from the experimentally determined poses. The number of atom types is varied on the developed atom type tree with 6 root types (C, N, O, S, P, and the halogen type) and 49 apical atom types. We show a pronounced effect of atom type choice on docking accuracy and find that splitting the elements C, N, and O into 18 optimal atom types substantially improves docking accuracy.
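The objective minimized in such an optimization, the RMSD of docked poses averaged over complexes, can be sketched as follows. This is a minimal illustration that assumes each pose is a list of (x, y, z) tuples with atoms in matching order and that both poses share the receptor frame, so no superposition is applied. The function names are hypothetical.

```python
import math

def rmsd(docked, experimental):
    """Root-mean-square deviation between two poses with identical atom ordering."""
    if len(docked) != len(experimental) or not docked:
        raise ValueError("poses must be non-empty and contain the same number of atoms")
    squared = sum(
        (a - b) ** 2
        for p, q in zip(docked, experimental)
        for a, b in zip(p, q)
    )
    return math.sqrt(squared / len(docked))

def mean_rmsd(pose_pairs):
    """Average RMSD over (docked, experimental) pose pairs, one pair per complex."""
    return sum(rmsd(d, x) for d, x in pose_pairs) / len(pose_pairs)
```

A parameter search over cutoff radii and atom type assignments would re-dock each ligand with the candidate potential and keep the settings that give the lowest `mean_rmsd`.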

11.
The relationship between the unfolding pseudo free energies of reduced and detailed atomic models of the GCN4 leucine zipper is examined. Starting from the native crystal structure, a large number of conformations ranging from folded to unfolded were generated by all-atom molecular dynamics unfolding simulations in an aqueous environment at elevated temperatures. For the detailed atomic model, the pseudo free energies are obtained by combining the CHARMM all-atom potential with a solvation component from the generalized Born, surface accessibility, GB/SA, model. Reduced model energies were evaluated using a knowledge-based potential. Both energies are highly correlated. In addition, both show a good correlation with the root mean square deviation, RMSD, of the backbone from native. These results suggest that knowledge-based potentials are capable of describing at least some of the properties of the folded as well as the unfolded states of proteins, even though they are derived from a database of native protein structures. Since only conformations generated from an unfolding simulation are used, we cannot assess whether these potentials can discriminate the native conformation from the manifold of alternative, low-energy misfolded states. Nevertheless, these results also have significant implications for the development of a methodology for multiscale modeling of proteins that combines reduced and detailed atomic models.  相似文献   

12.
We propose a self-consistent approach to analyze knowledge-based atom-atom potentials used to calculate protein-ligand binding energies. Ligands complexed to actual protein structures were first built using the SMoG growth procedure (DeWitte & Shakhnovich, 1996) with a chosen input potential. These model protein-ligand complexes were used to construct databases from which knowledge-based protein-ligand potentials were derived. We then tested several different modifications to such potentials and evaluated their performance on their ability to reconstruct the input potential using the statistical information available from a database composed of model complexes. Our data indicate that the most significant improvement resulted from properly accounting for the following key issues when estimating the reference state: (1) the presence of significant nonenergetic effects that influence the contact frequencies and (2) the presence of correlations in contact patterns due to chemical structure. The most successful procedure was applied to derive an atom-atom potential for real protein-ligand complexes. Despite the simplicity of the model (pairwise contact potential with a single interaction distance), the derived binding free energies showed a statistically significant correlation (approximately 0.65) with experimental binding scores for a diverse set of complexes.  相似文献   

13.
Mooij WT  Verdonk ML 《Proteins》2005,61(2):272-287
We present a novel atom-atom potential derived from a database of protein-ligand complexes. First, we clarify the similarities and differences between two statistical potentials described in the literature, PMF and Drugscore. We highlight shortcomings caused by an important factor unaccounted for in their reference states, and describe a new potential, which we name the Astex Statistical Potential (ASP). ASP's reference state considers the difference in exposure of protein atom types towards ligand binding sites. We show that this new potential predicts binding affinities with an accuracy similar to that of Goldscore and Chemscore. We investigate the influence of the choice of reference state by constructing two additional statistical potentials that differ from ASP only in this respect. The reference states in these two potentials are defined along the lines of Drugscore and PMF. In docking experiments, the potential using the new reference state proposed for ASP gives better success rates than when these literature reference states were used; a success rate similar to the established scoring functions Goldscore and Chemscore is achieved with ASP. This is the case both for a large, general validation set of protein-ligand structures and for small test sets of actives against four pharmaceutically relevant targets. Virtual screening experiments for these targets show less discrimination between the different reference states in terms of enrichment. In addition, we describe how statistical potentials can be used in the construction of targeted scoring functions. Examples are given for cdk2, using four different targeted scoring functions, biased towards increasingly large target-specific databases. Using these targeted scoring functions, docking success rates as well as enrichments are significantly better than for the general ASP scoring function. 
Results improve with the number of structures used in the construction of the target scoring functions, thus illustrating that these targeted ASP potentials can be continuously improved as new structural data become available.

14.
Zhang J  Zhang Y 《PloS one》2010,5(10):e15386

Background

An accurate potential function is essential to attack protein folding and structure prediction problems. The key to developing efficient knowledge-based potential functions is to design reference states that can appropriately counteract generic interactions. The reference states of many knowledge-based distance-dependent atomic potential functions were derived from non-interacting particles such as an ideal gas, which, however, ignores the inherent sequence connectivity and entropic elasticity of proteins.

Methodology

We developed a new pair-wise distance-dependent, atomic statistical potential function (RW), using an ideal random-walk chain as reference state, which was optimized on CASP models and then benchmarked on nine structural decoy sets. Second, we incorporated a new side-chain orientation-dependent energy term into RW (RWplus) and found that the side-chain packing orientation specificity can further improve the decoy recognition ability of the statistical potential.

Significance

RW and RWplus demonstrate a significantly better ability than the best performing pair-wise distance-dependent atomic potential functions in both native and near-native model selections. They have higher energy-RMSD and energy-TM-score correlations compared with other potentials of the same type in real-life structure assembly decoys. When benchmarked with a comprehensive list of publicly available potentials, RW and RWplus show comparable performance to the state-of-the-art scoring functions, including those combining terms from multiple resources. These data demonstrate the usefulness of a random-walk chain as the reference state, which correctly accounts for sequence connectivity and entropic elasticity of proteins. The potentials show potential usefulness in structure recognition and protein folding simulations. The RW and RWplus potentials, as well as the newly generated I-TASSER decoys, are freely available at http://zhanglab.ccmb.med.umich.edu/RW.

15.
Xu Y  Rahman NA  Othman R  Hu P  Huang M 《Proteins》2012,80(9):2154-2168
The fusion process is known to be the initial step of viral infection, and hence targeting the entry process is a promising strategy to design antiviral therapy. The self-inhibitory peptides derived from the enveloped (E) proteins function to inhibit the protein-protein interactions in the membrane fusion step mediated by the viral E protein. Thus, they have the potential to be developed into effective antiviral therapy. Herein, we have developed a Monte Carlo-based computational method with the aim to identify and optimize potential peptide hits from the E proteins. The stability of the peptides, which indicates their potential to bind in situ to the E proteins, was evaluated by two different scoring functions, dipolar distance-scaled, finite, ideal-gas reference state and residue-specific all-atom probability discriminatory function. The method was applied to α-helical Class I HIV-1 gp41, β-sheet Class II Dengue virus (DENV) type 2 E proteins, as well as Class III Herpes Simplex virus-1 (HSV-1) glycoprotein, an E protein with a mixture of α-helix and β-sheet structural fold. The peptide hits identified are in line with the druggable regions where the self-inhibitory peptide inhibitors for the three classes of viral fusion proteins were derived. Several novel peptides were identified from either the hydrophobic regions or the functionally important regions on Class II DENV-2 E protein and Class III HSV-1 gB. They have potential to disrupt the protein-protein interaction in the fusion process and may serve as starting points for the development of novel inhibitors for viral E proteins.

16.
Ribonucleic acid (RNA) molecules play important roles in a variety of biological processes. To properly function, RNA molecules usually have to fold to specific structures, and therefore understanding RNA structure is vital in comprehending how RNA functions. One approach to understanding and predicting biomolecular structure is to use knowledge-based potentials built from experimentally determined structures. These types of potentials have been shown to be effective for predicting both protein and RNA structures, but their utility is limited by their significantly rugged nature. This ruggedness (and hence the potential's usefulness) depends heavily on the choice of bin width to sort structural information (e.g. distances) but the appropriate bin width is not known a priori. To circumvent the binning problem, we compared knowledge-based potentials built from inter-atomic distances in RNA structures using different mixture models (Kernel Density Estimation, Expectation Minimization and Dirichlet Process). We show that the smooth knowledge-based potential built from Dirichlet process is successful in selecting native-like RNA models from different sets of structural decoys with comparable efficacy to a potential developed by spline-fitting - a commonly taken approach - to binned distance histograms. The less rugged nature of our potential suggests its applicability in diverse types of structural modeling.  相似文献   
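How a kernel density estimate sidesteps the bin-width choice can be shown with a minimal sketch. It assumes a plain Gaussian kernel with a fixed bandwidth and a reference distance sample supplied by the caller. The function names, the bandwidth, and the choice of reference are illustrative assumptions and are not taken from the abstract above.

```python
import math

def gaussian_kde(samples, bandwidth):
    """Return a smooth density estimate built from 1-D distance samples."""
    norm = 1.0 / (len(samples) * bandwidth * math.sqrt(2.0 * math.pi))

    def density(r):
        return norm * sum(math.exp(-0.5 * ((r - s) / bandwidth) ** 2) for s in samples)

    return density

def smooth_potential(observed, reference, bandwidth, kt=1.0):
    """Energy as the negative log ratio of observed to reference densities (no binning)."""
    p_obs = gaussian_kde(observed, bandwidth)
    p_ref = gaussian_kde(reference, bandwidth)

    def energy(r):
        # Far from all samples both densities can underflow to zero; callers should
        # restrict r to the range covered by the data.
        return -kt * math.log(p_obs(r) / p_ref(r))

    return energy
```

The resulting energy is a continuous function of distance, so it has no jumps at bin edges. The bandwidth still has to be chosen, which is one motivation for the more flexible mixture models compared in the abstract.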

17.
Whether knowledge-based intra-molecular inter-residue potentials are valid to represent inter-molecular interactions taking place at protein-protein interfaces has been questioned in several studies. Differences in the chain connectivity effect and in residue packing geometry between interfaces and single chain monomers have been pointed out as possible sources of distinct energetics for the two cases. In the present study, the interfacial regions of protein-protein complexes are examined to extract inter-molecular inter-residue potentials, using the same statistical methods as those previously adopted for intra-molecular residue pairs. Two sets of energy parameters are derived, corresponding to solvent-mediation and "average residue" mediation. The former set is shown to be highly correlated (correlation coefficient 0.89) with that previously obtained for inter-residue interactions within single chain monomers, while the latter exhibits a weaker correlation (0.69) with its intra-molecular counterpart. In addition to the close similarity of intra- and inter-molecular solvent-mediated potentials, they are shown to be significantly more residue-specific and thereby discriminative compared to the residue-mediated ones, indicating that solvent-mediation plays a major role in controlling the effective inter-residue interactions, either at interfaces, or within single monomers. Based on this observation, a reduced set of energy parameters comprising 20 one-body and 3 two-body terms is proposed (as opposed to the 20 x 20 tables of inter-residue potentials), which reproduces the conventional 20 x 20 tables with a correlation coefficient of 0.99.  相似文献   

18.
19.
The folding specificity of proteins can be simulated using simplified structural models and knowledge-based pair-potentials. However, when the same models are used to simulate systems that contain many proteins, large aggregates tend to form. In other words, these models cannot account for the fact that folded, globular proteins are soluble. Here we show that knowledge-based pair-potentials, which include explicitly calculated energy terms between the solvent and each amino acid, enable the simulation of proteins that are much less aggregation-prone in the folded state. Our analysis clarifies why including a solvent term improves the foldability. The aggregation for potentials without water is due to the unrealistically attractive interactions between polar residues, causing artificial clustering. When a water-based potential is used instead, polar residues prefer to interact with water; this leads to designed protein surfaces rich in polar residues and well-defined hydrophobic cores, as observed in real protein structures. We developed a simple knowledge-based method to calculate interactions between the solvent and amino acids. The method provides a starting point for modeling the folding and aggregation of soluble proteins. Analysis of our simple model suggests that inclusion of these solvent terms may also improve off-lattice potentials for protein simulation, design, and structure prediction.  相似文献   

20.
Clark LA  van Vlijmen HW 《Proteins》2008,70(4):1540-1550
A distance-dependent knowledge-based potential for protein-protein interactions is derived and tested for application in protein design. Information on residue type specific C(alpha) and C(beta) pair distances is extracted from complex crystal structures in the Protein Data Bank and used in the form of radial distribution functions. The use of only backbone and C(beta) position information allows generation of relative protein-protein orientation poses with minimal sidechain information. Further coarse-graining can be done simply in the same theoretical framework to give potentials for residues of known type interacting with unknown type, as in a one-sided interface design problem. Both interface design via pose generation followed by sidechain repacking and localized protein-protein docking tests are performed on 39 nonredundant antibody-antigen complexes for which crystal structures are available. As a reference, Lennard-Jones potentials, unspecific for residue type and biasing toward varying degrees of residue pair separation, are used as controls. For interface design, the knowledge-based potentials give the best combination of consistently designable poses, low RMSD to the known structure, and more tightly bound interfaces with no added computational cost. 77% of the poses could be designed to give complexes with negative free energies of binding. Generally, larger interface separation promotes designability, but weakens the binding of the resulting designs. A localized docking test shows that the knowledge-based nature of the potentials improves performance and compares respectably with more sophisticated all-atom potentials.
