Similar articles
 Found 20 similar articles (search time: 31 ms)
1.
Protein decoy data sets provide a benchmark for testing scoring functions designed for fold recognition and protein homology modeling problems. It is commonly believed that statistical potentials based on reduced atomic models are better able to discriminate native-like from misfolded decoys than scoring functions based on more detailed molecular mechanics models. Recent benchmark tests on small data sets, however, suggest otherwise. In this work, we report the results of extensive decoy detection tests using an effective free energy function based on the OPLS all-atom (OPLS-AA) force field and the Surface Generalized Born (SGB) model for the solvent electrostatic effects. The OPLS-AA/SGB effective free energy is used as a scoring function to detect native protein folds among a total of 48,832 decoys for 32 different proteins from Park and Levitt's 4-state-reduced, Levitt's local-minima, Baker's ROSETTA all-atom, and Skolnick's decoy sets. All structures are locally minimized without restraints. From an analysis of the individual energy components of the OPLS-AA/SGB energy function for the native and the best-ranked decoy, it is determined that a balance of the terms of the potential is responsible for the minimized energies that most successfully distinguish the native from the misfolded conformations. Different combinations of individual energy terms provide less discrimination than the total energy. The results are consistent with observations that all-atom molecular potentials coupled with intermediate level solvent dielectric models are competitive with knowledge-based potentials for decoy detection and protein modeling problems such as fold recognition and homology modeling.

2.
Deng H  Jia Y  Wei Y  Zhang Y 《Proteins》2012,80(9):2311-2322
Many statistical potentials have been developed in the last two decades for protein folding and protein structure recognition. The major difference among these potentials lies in the selection of reference states to offset sampling bias. However, since these potentials used different databases and parameter cutoffs, it is difficult to judge what the best reference states are by examining the original programs. In this study, we aim to address this issue and evaluate the reference states using a unified database and programming environment. We constructed distance-specific atomic potentials using six widely used reference states based on 1022 high-resolution protein structures, which were applied to rank models in six sets of structure decoys. The random-walk chain reference state outperforms the others in three decoy sets, while those using the ideal-gas, quasi-chemical approximation, and averaging reference states each stand out in one set. Nevertheless, the performance of the potentials depends on how the decoys were generated, and no reference state clearly outperforms the others in all decoy sets. Further analysis reveals that the statistical potentials face a tension between universality and pertinence, and optimal reference states should be extracted based on specific application environments and decoy spaces.
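The role of the reference state described in this abstract can be illustrated with a minimal sketch of Boltzmann inversion, u(r) = -kT ln(N_obs(r) / N_exp(r)). The function name, the bin settings, and the reference itself are assumptions made for illustration: the expected counts here simply scale with spherical shell volume (a uniform-density reference in the spirit of an ideal gas), which is not claimed to match any of the six reference states compared in the paper.

```python
import math

KT = 0.593  # kcal/mol near 298 K; the energy unit is an assumption


def distance_potential(distances, r_max=15.0, bin_width=1.0, pseudocount=1e-6):
    """Return one energy per distance bin via u = -kT * ln(N_obs / N_exp).

    N_exp comes from a hypothetical uniform-density reference in which the
    expected number of pairs in a bin is proportional to its shell volume.
    """
    n_bins = int(r_max / bin_width)
    counts = [0] * n_bins
    for d in distances:
        idx = int(d / bin_width)
        if d >= 0.0 and idx < n_bins:
            counts[idx] += 1
    total = sum(counts)

    # Shell volume between i*w and (i+1)*w, up to the constant 4*pi/3.
    shells = [((i + 1) * bin_width) ** 3 - (i * bin_width) ** 3
              for i in range(n_bins)]
    shell_total = sum(shells)

    potential = []
    for observed, shell in zip(counts, shells):
        expected = total * shell / shell_total
        ratio = (observed + pseudocount) / (expected + pseudocount)
        potential.append(-KT * math.log(ratio))
    return potential
```

Swapping the `shells` line for a different expected-count model is the only change needed to try another reference state, which is why the choice of reference dominates the behavior of otherwise identical potentials.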

3.
We propose a novel method for calculating free energy for coarse-grained models of proteins by combining our newly developed multibody potentials with entropies computed from elastic network models of proteins. Multibody potentials have been of much interest recently because they take into account three-dimensional interactions related to residue packing and capture the cooperativity of these interactions in protein structures. Combining four-body non-sequential, four-body sequential and pairwise short-range potentials with optimized weights for each term, our coarse-grained potential improved recognition of native structure among misfolded decoys, outperforming all other contact potentials for CASP8 decoy sets and achieving performance comparable to the fully atomic empirical DFIRE potentials. By combining statistical contact potentials with entropies from elastic network models of the same structures, we can compute free energy changes and improve coarse-grained modeling of protein structure and dynamics. The consideration of protein flexibility and dynamics should improve protein structure prediction and refinement of computational models. This work is the first to combine coarse-grained multibody potentials with an entropic model that takes into account contributions of the entire structure in order to investigate native-like decoy selection.

4.
In this paper, an improved Cα-SC energy potential designed for protein fold recognition is reported. It consists of three extremely simple interaction terms that are supposed to be the dominant interactions in protein folding: residue-residue contact, hydrophobicity and pseudodihedral potentials. The potential function contains only 210 contact parameters, one hydrophobic parameter and one torsion parameter, which have been optimized using an interior point algorithm of linear programming. Tests of the derived potential function on commonly used decoy sets illustrate that it outperforms most of the existing coarse-grained potentials in terms of its capability to recognize native structures and its consistency in achieving high Z-scores across decoy sets, and that its performance is almost equivalent to that of potentials that consider complex intra-molecular interactions. The results show that our scoring function is a promising potential for protein structure prediction and modeling with regard to its recognition performance and computational efficiency.

5.
Using information-theoretic concepts, we examine the role of the reference state, a crucial component of empirical potential functions, in protein fold recognition. We derive an information-based connection between the probability distribution functions of the reference state and those that characterize the decoy set used in threading. In examining commonly used contact reference states, we find that the quasi-chemical approximation is informatically superior to other variant models designed to include characteristics of real protein chains, such as finite length and variable amino acid composition from protein to protein. We observe that in these variant models, the total divergence, the operative function that quantifies discrimination, decreases along with threading performance. We find that any amount of nativeness encoded in the reference state model does not significantly improve threading performance. A promising avenue for the development of better potentials is suggested by our information-theoretic analysis of the action of contact potentials on individual protein sequences. Our results show that contact potentials perform better when the compositional properties of the data set used to derive the score function probabilities are similar to the properties of the sequence of interest. The results also suggest using only sequences of similar composition in deriving contact potentials, to tailor the contact potential specifically for a test sequence. Proteins 2010. © 2009 Wiley-Liss, Inc.

6.
Solis AD  Rackovsky S 《Proteins》2008,71(3):1071-1087
We examine the information-theoretic characteristics of statistical potentials that describe pairwise long-range contacts between amino acid residues in proteins. In our work, we seek to map out an efficient information-based strategy to detect and optimally utilize the structural information latent in empirical data, to make contact potentials, and other statistically derived folding potentials, more effective tools in protein structure prediction. Foremost, we establish fundamental connections between basic information-theoretic quantities (including the ubiquitous Z-score) and contact "energies" or scores used routinely in protein structure prediction, and demonstrate that the informatic quantity that mediates fold discrimination is the total divergence. We find that pairwise contacts between residues bear a moderate amount of fold information, and if optimized, can assist in the discrimination of native conformations from large ensembles of native-like decoys. Using an extensive battery of threading tests, we demonstrate that parameters that affect the information content of contact potentials (e.g., choice of atoms to define residue location and the cut-off distance between pairs) have a significant influence in their performance in fold recognition. We conclude that potentials that have been optimized for mutual information and that have high number of score events per sequence-structure alignment are superior in identifying the correct fold. We derive the quantity "information product" that embodies these two critical factors. We demonstrate that the information product, which does not require explicit threading to compute, is as effective as the Z-score, which requires expensive decoy threading to evaluate. This new objective function may be able to speed up the multidimensional parameter search for better statistical potentials. 
Lastly, by demonstrating the functional equivalence of quasi-chemically approximated "energies" to fundamental informatic quantities, we make statistical potentials less dependent on theoretically tenuous biophysical formalisms and more amenable to direct bioinformatic optimization.

7.
Zhang J  Zhang Y 《PloS one》2010,5(10):e15386

Background

An accurate potential function is essential to attack protein folding and structure prediction problems. The key to developing efficient knowledge-based potential functions is to design reference states that can appropriately counteract generic interactions. The reference states of many knowledge-based distance-dependent atomic potential functions were derived from non-interacting particles such as ideal gas, however, which ignored the inherent sequence connectivity and entropic elasticity of proteins.

Methodology

We developed a new pair-wise distance-dependent, atomic statistical potential function (RW), using an ideal random-walk chain as reference state, which was optimized on CASP models and then benchmarked on nine structural decoy sets. Second, we incorporated a new side-chain orientation-dependent energy term into RW (RWplus) and found that the side-chain packing orientation specificity can further improve the decoy recognition ability of the statistical potential.

Significance

RW and RWplus demonstrate a significantly better ability than the best performing pair-wise distance-dependent atomic potential functions in both native and near-native model selections. They have higher energy-RMSD and energy-TM-score correlations compared with other potentials of the same type in real-life structure assembly decoys. When benchmarked against a comprehensive list of publicly available potentials, RW and RWplus show comparable performance to the state-of-the-art scoring functions, including those combining terms from multiple resources. These data demonstrate the usefulness of the random-walk chain as a reference state that correctly accounts for the sequence connectivity and entropic elasticity of proteins. The potentials show potential usefulness in structure recognition and protein folding simulations. The RW and RWplus potentials, as well as the newly generated I-TASSER decoys, are freely available at http://zhanglab.ccmb.med.umich.edu/RW.
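The energy-RMSD correlation mentioned in this abstract can be sketched as a plain Pearson correlation between each decoy's score and its RMSD from the native structure. The function name and the toy numbers in the usage note are illustrative assumptions and are not taken from the RW benchmark.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)
```

For example, `pearson([-120.0, -110.0, -95.0, -80.0], [1.0, 2.0, 4.0, 6.0])` is strongly positive: lower (better) energies go with lower RMSD, which is the behavior a useful scoring function should show on a decoy set.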

8.
Qiu J  Elber R 《Proteins》2005,61(1):44-55
Atomically detailed potentials for recognition of protein folds are presented. The potentials consist of pair interactions between atoms. One or three distance steps are used to describe the range of interactions between a pair. Training is carried out with the mathematical programming approach on the decoy sets of Baker, Levitt, and some of our own design. Recognition is required not only for decoy-native structural pairs but also for pairs of decoy and homologous structures. Performance is tested on the targets of CASP5 using templates from the Protein Data Bank, on two test ab initio decoy sets from Skolnick's laboratory, and on decoy sets from Moult's laboratory. We conclude that the newly derived potentials have significant recognition capacity, comparable to the best models derived from other techniques. The new potentials require a significantly smaller number of parameters. The enhanced recognition capacity extends primarily to the identification of structures generated by ab initio simulation and less to the recognition of approximate shapes created by homology.

9.
In this paper, we report a knowledge-based potential function, named the OPUS-Ca potential, that requires only Cα positions as input. The contributions from other atomic positions were established from pseudo-positions artificially built from a Cα trace for auxiliary purposes. The potential function is formed based on seven major representative molecular interactions in proteins: distance-dependent pairwise energy with orientational preference, hydrogen bonding energy, short-range energy, packing energy, tri-peptide packing energy, three-body energy, and solvation energy. From the testing of decoy recognition on a number of commonly used decoy sets, it is shown that the new potential function outperforms all known Cα-based potentials and most other coarse-grained ones that require more information than Cα positions. We hope that this potential function adds a new tool for protein structural modeling.

10.
11.
S Miyazawa  R L Jernigan 《Proteins》1999,36(3):357-369
We consider modifications of an empirical energy potential for fold and sequence recognition to represent approximately the stabilities of proteins in various environments. The potential used here includes a secondary structure potential representing short-range interactions for secondary structures of proteins, and a tertiary structure potential consisting of a long-range, pairwise contact potential and a repulsive packing potential. This potential is devised to evaluate together the total conformational energy of a protein at the coarse-grained residue level. It was previously estimated from the observed frequencies of secondary structures, from contact frequencies between residues, and from the distributions of the number of residues in contact in known protein structures by regarding those distributions as the equilibrium distributions with the Boltzmann factor of these interaction energies. The stability of native structures is assumed as a primary requirement for proteins to fold into their native structures. A collapse energy is subtracted from the contact energies to remove the protein size dependence and to represent protein stabilities for monomeric and multimeric states. The free energy of the whole ensemble of protein conformations that is subtracted from the conformational energy to represent protein stability is approximated as the average energy expected for a typical native structure with the same amino acid composition. This term may be constant in fold recognition but essentially varies in sequence recognition. A simple test of threading sequences into structures without gaps is employed to demonstrate the importance of the present modifications that permit the same potential to be utilized for both fold and sequence recognition. Proteins 1999;36:357-369. Published 1999 Wiley-Liss, Inc.

12.
The simplest approximation of the interaction potential between amino acid residues in proteins is the contact potential, which defines the effective free energy of a protein conformation by the set of amino acid contacts formed in this conformation. Finding a contact potential capable of predicting free energies of protein states across a variety of protein families will aid protein folding and engineering in silico on a computationally tractable time-scale. We test the ability of contact potentials to accurately and transferably (across various protein families) predict stability changes of proteins upon mutations. We develop a new methodology to determine the contact potentials in proteins from experimental measurements of changes in proteins' thermodynamic stabilities (ΔΔG) upon mutations. We apply our methodology to derive sets of contact interaction parameters for a hierarchy of interaction models including solvation and multi-body contact parameters. We test how well our models reproduce experimental measurements by statistical tests. We evaluate the maximum accuracy of predictions obtained by using contact potentials and the correlation between parameters derived from different data sets of experimental ΔΔG values. We argue that it is impossible to reach experimental accuracy and derive fully transferable contact parameters using the contact models of potentials. However, contact parameters may yield reliable predictions of ΔΔG for data sets of mutations confined to the same amino acid positions in the sequence of a single protein.
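A contact potential of the kind described in this abstract can be sketched as a sum of pair parameters over the contacts of one conformation. Everything named below (the functions, the parameter table layout, and the fixed contact map) is an illustrative assumption. In particular, the mutation score covers only the folded-state contact term with an unchanged contact map, so it is a crude proxy rather than the ΔΔG model fitted in the paper.

```python
def contact_energy(sequence, contacts, params):
    """Sum pair parameters over the residue contacts (i, j) of a conformation.

    params maps an alphabetically sorted pair of one-letter residue codes,
    e.g. ("A", "L"), to a hypothetical contact energy.
    """
    total = 0.0
    for i, j in contacts:
        pair = tuple(sorted((sequence[i], sequence[j])))
        total += params[pair]
    return total


def mutation_score(sequence, contacts, params, position, new_residue):
    """Change in contact energy for a point mutation, contact map held fixed."""
    mutant = sequence[:position] + new_residue + sequence[position + 1:]
    return (contact_energy(mutant, contacts, params)
            - contact_energy(sequence, contacts, params))
```

With a toy table such as `{("A", "A"): 0.0, ("A", "L"): -0.5, ("L", "L"): -1.0}`, mutating an alanine that contacts a leucine into leucine lowers the energy by 0.5 in the table's units. Only contacts involving the mutated position contribute, which fits the abstract's observation that such parameters work best for mutations confined to the same positions of a single protein.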

13.

Background  

Contradictory evidence has been presented in the literature concerning the effectiveness of empirical contact energies for fold recognition. Empirical contact energies are calculated on the basis of information available from selected protein structures, with respect to a defined reference state, according to the quasi-chemical approximation. Protein-solvent interactions are estimated from residue solvent accessibility.

14.
15.
The relationship between the unfolding pseudo free energies of reduced and detailed atomic models of the GCN4 leucine zipper is examined. Starting from the native crystal structure, a large number of conformations ranging from folded to unfolded were generated by all-atom molecular dynamics unfolding simulations in an aqueous environment at elevated temperatures. For the detailed atomic model, the pseudo free energies are obtained by combining the CHARMM all-atom potential with a solvation component from the generalized Born, surface accessibility, GB/SA, model. Reduced model energies were evaluated using a knowledge-based potential. Both energies are highly correlated. In addition, both show a good correlation with the root mean square deviation, RMSD, of the backbone from native. These results suggest that knowledge-based potentials are capable of describing at least some of the properties of the folded as well as the unfolded states of proteins, even though they are derived from a database of native protein structures. Since only conformations generated from an unfolding simulation are used, we cannot assess whether these potentials can discriminate the native conformation from the manifold of alternative, low-energy misfolded states. Nevertheless, these results also have significant implications for the development of a methodology for multiscale modeling of proteins that combines reduced and detailed atomic models.

16.
Multibody potentials have been of much interest recently because they take into account three-dimensional interactions related to residue packing and capture the cooperativity of these interactions in protein structures. Our goal was to combine long-range multibody potentials and short-range potentials to improve recognition of native structure among misfolded decoys. We optimized the weights for four-body nonsequential, four-body sequential, and short-range potentials to obtain optimal model ranking results for threading and have compared these data against results obtained with other potentials (26 different coarse-grained potentials from the Potentials 'R' Us web server have been used). Our optimized multibody potentials outperform all other contact potentials in the recognition of the native structure among decoys, both for models from homology template-based modeling and from template-free modeling in CASP8 decoy sets. We have compared the results obtained for these optimized coarse-grained potentials, where each residue is represented by a single point, with results obtained by using the DFIRE potential, which takes into account atomic level information of proteins. We found that for all proteins larger than 80 amino acids our optimized coarse-grained potentials yield results comparable to those obtained with the atomic DFIRE potential.

17.
Knowledge-based potentials are widely used in simulations of protein folding, structure prediction, and protein design. Their advantages include limited computational requirements and the ability to deal with low-resolution protein models compatible with long-scale simulations. Their drawbacks include their dependence on specific features of the dataset from which they are derived, such as the size of the proteins it contains, and the fact that their physical meaning is still a subject of debate. We address these issues by probing the theoretical validity of these potentials as mean-force potentials that take the solvent implicitly into account and involve entropic contributions due to atomic degrees of freedom and solvation. The dependence on the size of the system is checked on distance-dependent amino acid pair potentials, derived from six protein structure sets containing proteins of increasing length N. For large inter-residue distances, they are found to display the theoretically predicted 1/N behavior weighted by a factor depending on the boundaries and the compressibility of the system. For short distances, different trends are observed according to the nature of the residue pairs and their ability to form, for example, electrostatic, cation-pi or pi-pi interactions, or hydrophobic packing. The results of this analysis are used to devise a novel protein size-dependent distance potential, which displays an improved performance in discriminating native sequence-structure matches among decoy models.

18.
We investigate the landscape of the internal free-energy of the 36 amino acid villin headpiece with a modified basin hopping method in the all-atom force field PFF01, which was previously used to predictively fold several helical proteins with atomic resolution. We identify near native conformations of the protein as the global optimum of the force field. More than half of the twenty best simulations started from random initial conditions converge to the folding funnel of the native conformation, but several competing low-energy metastable conformations were observed. From 76,000 independently generated conformations we derived a decoy tree which illustrates the topological structure of the entire low-energy part of the free-energy landscape and characterizes the ensemble of metastable conformations. These emerge as similar in secondary content, but differ in tertiary arrangement.

19.
Protein structure prediction is limited by the inaccuracy of the simplified energy functions necessary for efficient sorting over many conformations. It was recently suggested (Finkelstein, Phys Rev Lett 1998;80:4823-4825) that these errors can be reduced by energy averaging over a set of homologous sequences. This conclusion is confirmed in this study by testing protein structure recognition in gapless threading. The accuracy of recognition was estimated by the Z-score values obtained in gapless threading tests. For threading, we used 20 target proteins, each having from 20 to 70 homologs taken from the HSSP sequence base. The energy of the native structures was compared with the energies of 34 to 75 thousand alternative structures generated by threading. The energy calculations were done with our recently developed Cα atom-based phenomenological potentials. We show that averaging of protein energies over homologs lowers the Z-score from approximately -6.1 (average Z-score for individual chains) to approximately -8.1. This means that a correct fold can be found among 3 × 10^9 random folds in the first case and among 3 × 10^15 in the second. Such an increase in selectivity is important for recognition of protein folds.
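The Z-score used in this abstract, and the link between a Z-score and the number of random folds a native structure can be distinguished from, can be sketched as follows. The Gaussian-tail conversion is an assumption about how such figures may be obtained (the abstract does not state the authors' procedure), and both function names are hypothetical.

```python
import math
import statistics


def z_score(native_energy, decoy_energies):
    """Native energy in units of the decoy standard deviation from the mean."""
    mean = statistics.fmean(decoy_energies)
    spread = statistics.pstdev(decoy_energies)
    return (native_energy - mean) / spread


def folds_per_false_positive(z):
    """Return 1 / P(Z <= z) for a standard normal decoy-energy distribution.

    This is roughly how many random folds must be examined before one is
    expected to score at least as low as the native structure.
    """
    tail = 0.5 * math.erfc(-z / math.sqrt(2.0))
    return 1.0 / tail
```

Under this Gaussian assumption, Z-scores of -6.1 and -8.1 give values of the same order of magnitude as the 3 × 10^9 and 3 × 10^15 quoted in the abstract, which shows how a shift of two standard deviations translates into about six orders of magnitude in selectivity.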

20.
Huang SY  Zou X 《Proteins》2011,79(9):2648-2661
In this study, we have developed a statistical mechanics-based iterative method to extract statistical atomic interaction potentials from known, nonredundant protein structures. Our method circumvents the long-standing reference state problem in deriving traditional knowledge-based scoring functions by using rapid iterations through a physical, global convergence function. The rapid convergence of this physics-based method, unlike other parameter optimization methods, makes it feasible to derive distance-dependent, all-atom statistical potentials while preserving scoring accuracy. The derived potentials, referred to as ITScore/Pro, have been validated using three diverse benchmarks: the high-resolution decoy set, the AMBER benchmark decoy set, and the CASP8 decoy set. Significant improvement in performance has been achieved. Finally, comparisons between the potentials of our model and the potentials of a knowledge-based scoring function with a randomized reference state have revealed the reason for the better performance of our scoring function, which could provide useful insight into the development of other physical scoring functions. The potentials developed in this study are generally applicable for structural selection in protein structure prediction.
