Similar articles
20 similar articles found.
1.
Solis AD  Rackovsky S 《Proteins》2008,71(3):1071-1087
We examine the information-theoretic characteristics of statistical potentials that describe pairwise long-range contacts between amino acid residues in proteins. In our work, we seek to map out an efficient information-based strategy to detect and optimally utilize the structural information latent in empirical data, to make contact potentials, and other statistically derived folding potentials, more effective tools in protein structure prediction. Foremost, we establish fundamental connections between basic information-theoretic quantities (including the ubiquitous Z-score) and contact "energies" or scores used routinely in protein structure prediction, and demonstrate that the informatic quantity that mediates fold discrimination is the total divergence. We find that pairwise contacts between residues bear a moderate amount of fold information, and if optimized, can assist in the discrimination of native conformations from large ensembles of native-like decoys. Using an extensive battery of threading tests, we demonstrate that parameters that affect the information content of contact potentials (e.g., choice of atoms to define residue location and the cut-off distance between pairs) have a significant influence in their performance in fold recognition. We conclude that potentials that have been optimized for mutual information and that have high number of score events per sequence-structure alignment are superior in identifying the correct fold. We derive the quantity "information product" that embodies these two critical factors. We demonstrate that the information product, which does not require explicit threading to compute, is as effective as the Z-score, which requires expensive decoy threading to evaluate. This new objective function may be able to speed up the multidimensional parameter search for better statistical potentials. 
Lastly, by demonstrating the functional equivalence of quasi-chemically approximated "energies" to fundamental informatic quantities, we make statistical potentials less dependent on theoretically tenuous biophysical formalisms and more amenable to direct bioinformatic optimization.
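
The link between contact scores and informatic quantities described in this abstract can be illustrated with a minimal sketch. It assumes the textbook quasi-chemical form (score = -ln of observed over reference probability) and uses the Kullback-Leibler divergence as a stand-in for the "total divergence" named above; the function names and the dictionary layout are hypothetical and are not taken from the paper.

```python
import math
from statistics import mean, pstdev


def z_score(native_energy, decoy_energies):
    """Z-score of the native energy relative to the decoy energy distribution."""
    mu = mean(decoy_energies)
    sigma = pstdev(decoy_energies)
    if sigma == 0:
        raise ValueError("decoy energies have zero variance")
    return (native_energy - mu) / sigma


def contact_scores(p_obs, p_ref):
    """Quasi-chemical style score per contact type: -ln(p_obs / p_ref).

    Assumption: both arguments are dicts of strictly positive probabilities
    over the same contact types.
    """
    return {k: -math.log(p_obs[k] / p_ref[k]) for k in p_obs}


def kl_divergence(p_obs, p_ref):
    """Kullback-Leibler divergence D(p_obs || p_ref) in nats."""
    return sum(p * math.log(p / p_ref[k]) for k, p in p_obs.items() if p > 0)
```

Under these definitions the probability-weighted mean contact score is exactly the negative of the divergence, which is one simple way to see how a score table and an information measure can describe the same quantity.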

2.
There are several knowledge-based energy functions that can distinguish the native fold from a pool of grossly misfolded decoys for a given sequence of amino acids. These decoys, which are typically generated by mounting, or “threading”, the sequence onto the backbones of unrelated protein structures, tend to be non-compact and quite different from the native structure: the root-mean-squared (RMS) deviations from the native are commonly in the range of 15 to 20 Å. Effective energy functions should also demonstrate a similar recognition capability when presented with compact decoys that depart only slightly in conformation from the correct structure (i.e. those with RMS deviations of ∼5 Å or less). Recently, we developed a simple yet powerful method for native fold recognition based on the tendency for native folds to form hydrophobic cores. Our energy measure, which we call the hydrophobic fitness score, is challenged to recognize the native fold from 2000 near-native structures generated for each of five small monomeric proteins. First, 1000 conformations for each protein were generated by molecular dynamics simulation at room temperature. The average RMS deviation of this set of 5000 was 1.5 Å. A total of 323 decoys had energies lower than native; however, none of these had RMS deviations greater than 2 Å. Another 1000 structures were generated for each at high temperature, in which a greater range of conformational space was explored (4.3 Å average RMS deviation). Out of this set, only seven decoys were misrecognized. The hydrophobic fitness energy of a conformation is strongly dependent upon the RMS deviation. On average our potential yields energy values which are lowest for the population of structures generated at room temperature, intermediate for those produced at high temperature and highest for those constructed by threading methods. In general, the lowest energy decoy conformations have backbones very close to native structure. 
The possible utility of our method for screening backbone candidates for the purpose of modelling by side-chain packing optimization is discussed.
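
The bookkeeping behind statements such as "323 decoys had energies lower than native; however, none of these had RMS deviations greater than 2 Å" can be sketched as follows. This is an illustrative sketch only: it assumes the structures have already been superimposed, and the function names and the (energy, RMSD) pair layout are hypothetical.

```python
import math


def rmsd(coords_a, coords_b):
    """Root-mean-squared deviation between two equally long lists of (x, y, z).

    Assumes the two structures are already optimally superimposed.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and equally long")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total / len(coords_a))


def misrecognized(native_energy, decoys):
    """Return the RMSDs of decoys that score lower (better) than the native.

    `decoys` is a list of (energy, rmsd_to_native) pairs.
    """
    return [d_rmsd for energy, d_rmsd in decoys if energy < native_energy]
```

Taking the maximum of the returned list gives the largest deviation among the misrecognized decoys, which is the figure quoted in the abstract.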

3.
One of the approaches to protein structure prediction is to obtain energy functions which can recognize the native conformation of a given sequence among a zoo of conformations. The discrimination can be done by assigning the lowest energy to the native conformation, with the guarantee that the native is in the zoo. Well-adjusted functions, then, can be used in the search for other (near-) natives. Here the aim is the discrimination at relatively high resolution (RMSD difference between the native and the closest nonnative is around 1 Å) by pairwise energy potentials. The potential is trained using the experimentally determined native conformation of only one protein, instead of the usual large survey over many proteins. The novel feature is that the native structure is compared to a vastly wider and more challenging array of nonnative structures found not only by the usual threading procedure, but by wide-ranging local minimization of the potential. Because of this extremely demanding search, the native is very close to the apparent global minimum of the potential function. The global minimum property holds up for one other protein having 60% sequence identity, but its performance on completely dissimilar proteins is of course much weaker.

4.
5.
We developed a series of statistical potentials to recognize the native protein from decoys, particularly when using only a reduced representation in which each side chain is treated as a single Cβ atom. Beginning with a highly successful all-atom statistical potential, the Discrete Optimized Protein Energy function (DOPE), we considered the implications of including additional information in the all-atom statistical potential and subsequently reducing to the Cβ representation. One of the potentials includes interaction energies conditional on backbone geometries. A second potential separates sequence local from sequence nonlocal interactions and introduces a novel reference state for the sequence local interactions. The resultant potentials perform better than the original DOPE statistical potential in decoy identification. Moreover, even upon passing to a reduced Cβ representation, these statistical potentials outscore the original (all-atom) DOPE potential in identifying native states for sets of decoys. Interestingly, the backbone-dependent statistical potential is shown to retain nearly all of the information content of the all-atom representation in the Cβ representation. In addition, these new statistical potentials are combined with existing potentials to model hydrogen bonding, torsion energies, and solvation energies to produce even better performing potentials. The ability of the Cβ statistical potentials to accurately represent protein interactions bodes well for computational efficiency in protein folding calculations using reduced backbone representations, while the extensions to DOPE illustrate general principles for improving knowledge-based potentials.

6.
Protein decoy data sets provide a benchmark for testing scoring functions designed for fold recognition and protein homology modeling problems. It is commonly believed that statistical potentials based on reduced atomic models are better able to discriminate native-like from misfolded decoys than scoring functions based on more detailed molecular mechanics models. Recent benchmark tests on small data sets, however, suggest otherwise. In this work, we report the results of extensive decoy detection tests using an effective free energy function based on the OPLS all-atom (OPLS-AA) force field and the Surface Generalized Born (SGB) model for the solvent electrostatic effects. The OPLS-AA/SGB effective free energy is used as a scoring function to detect native protein folds among a total of 48,832 decoys for 32 different proteins from Park and Levitt's 4-state-reduced, Levitt's local-minima, Baker's ROSETTA all-atom, and Skolnick's decoy sets. Solvent electrostatic effects are included through the Surface Generalized Born (SGB) model. All structures are locally minimized without restraints. From an analysis of the individual energy components of the OPLS-AA/SGB energy function for the native and the best-ranked decoy, it is determined that a balance of the terms of the potential is responsible for the minimized energies that most successfully distinguish the native from the misfolded conformations. Different combinations of individual energy terms provide less discrimination than the total energy. The results are consistent with observations that all-atom molecular potentials coupled with intermediate level solvent dielectric models are competitive with knowledge-based potentials for decoy detection and protein modeling problems such as fold recognition and homology modeling.

7.
In order to investigate the level of representation required to simulate folding and predict structure, we test the ability of a variety of reduced representations to identify native states in decoy libraries and to recover the native structure given the advanced knowledge of the very broad native Ramachandran basin assignments. Simplifications include the removal of the entire side-chain or the retention of only the Cβ atoms. Scoring functions are derived from an all-atom statistical potential that distinguishes between atoms and different residue types. Structures are obtained by minimizing the scoring function with a computationally rapid simulated annealing algorithm. Results are compared for simulations in which backbone conformations are sampled from a Protein Data Bank-based backbone rotamer library generated by either ignoring or including a dependence on the identity and conformation of the neighboring residues. Only when the Cβ atoms and nearest neighbor effects are included do the lowest energy structures generally fall within 4 Å of the native backbone root-mean square deviation (RMSD), despite the initial configuration being highly expanded with an average RMSD ≥ 10 Å. The side-chains are reinserted into the Cβ models with minimal steric clash. Therefore, the detailed, all-atom information lost in descending to a Cβ-level representation is recaptured to a large measure using backbone dihedral angle sampling that includes nearest neighbor effects and an appropriate scoring function.

8.
9.
Using information‐theoretic concepts, we examine the role of the reference state, a crucial component of empirical potential functions, in protein fold recognition. We derive an information‐based connection between the probability distribution functions of the reference state and those that characterize the decoy set used in threading. In examining commonly used contact reference states, we find that the quasi‐chemical approximation is informatically superior to other variant models designed to include characteristics of real protein chains, such as finite length and variable amino acid composition from protein to protein. We observe that in these variant models, the total divergence, the operative function that quantifies discrimination, decreases along with threading performance. We find that any amount of nativeness encoded in the reference state model does not significantly improve threading performance. A promising avenue for the development of better potentials is suggested by our information‐theoretic analysis of the action of contact potentials on individual protein sequences. Our results show that contact potentials perform better when the compositional properties of the data set used to derive the score function probabilities are similar to the properties of the sequence of interest. The results also suggest using only sequences of similar composition when deriving contact potentials, so as to tailor the contact potential specifically to a test sequence. Proteins 2010. © 2009 Wiley‐Liss, Inc.

10.

Background

We present a simple method to train a potential function for the protein folding problem which, even though trained using a small number of proteins, is able to place a significantly large number of native conformations near a local minimum. The training relies on generating decoys by energy minimization of the native conformations using the current potential and using a physically meaningful objective function (derivative of energy with respect to torsion angles at the native conformation) during the quadratic programming to place the native conformation near a local minimum.

Results

We also compare the performance of three different types of energy functions and find that while the pairwise energy function is trainable, a solvation energy function by itself is untrainable if decoys are generated by minimizing the current potential starting at the native conformation. The best results are obtained when a pairwise interaction energy function is used with solvation energy function.

Conclusions

We are able to train a potential function using six proteins which places a total of 42 native conformations within ~4 Å rmsd and 71 native conformations within ~6 Å rmsd of a local minimum out of a total of 91 proteins. Furthermore, the threading test using the same 91 proteins ranks 89 native conformations first and the other two second.
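
A threading summary of the kind reported here (how many natives rank first, second, and so on) can be tallied with a short sketch. The names, the input layout, and the tie-breaking rule are assumptions made for illustration; the abstract does not specify them.

```python
def native_rank(native_energy, decoy_energies):
    """1-based rank of the native when all structures are sorted by energy.

    Ties are resolved in favour of the native (an assumption of this sketch).
    """
    return 1 + sum(1 for e in decoy_energies if e < native_energy)


def rank_histogram(results):
    """Count how many proteins have their native at each rank.

    `results` maps a protein name to (native_energy, decoy_energies).
    """
    counts = {}
    for native_energy, decoy_energies in results.values():
        r = native_rank(native_energy, decoy_energies)
        counts[r] = counts.get(r, 0) + 1
    return counts
```

For a benchmark like the one above, the histogram would hold one count for rank 1 and one for rank 2.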

11.
To estimate how extensively the ensemble of denatured-state conformations is constrained by local side-chain–backbone interactions, propensities of each of the 20 amino acids to occur in mono- and dipeptides mapped to discrete regions of the Ramachandran map are computed from proteins of known structure. In addition, propensities are computed for the trans, gauche−, and gauche+ rotamers, with or without consideration of the values of phi and psi. These propensities are used in scoring functions for fragment threading, which estimates the energetic favorability of fragments of protein sequence to adopt the native conformation as opposed to hundreds of thousands of incorrect conformations. As finer subdivisions of the Ramachandran plot, neighboring residue phi/psi angles, and rotamers are incorporated, scoring functions become better at ranking the native conformation as the most favorable. With the best composite propensity function, the native structure can be distinguished from 300,000 incorrect structures for 71% of the 2130 arbitrary protein segments of length 40, 48% of 2247 segments of length 30, and 20% of 2368 segments of length 20. A majority of fragments of length 30–40 are estimated to be folded into the native conformation a substantial fraction of the time. These data suggest that the variations observed in amino acid frequencies in different phi/psi/chi1 environments in folded proteins reflect energetically important local side-chain–backbone interactions, interactions that may severely restrict the ensemble of conformations populated in the denatured state to a relatively small subset with nativelike structure.
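
A propensity-based fragment score of the general kind described above might be sketched as follows. The normalization used here (joint frequency divided by the product of the marginal frequencies) is a common convention and is an assumption, as are the function names and the region labels; the paper's exact definition may differ.

```python
import math
from collections import Counter


def propensities(observations):
    """Propensity of each (amino_acid, region) pair.

    `observations` is an iterable of (amino_acid, region) pairs taken from
    known structures. Propensity = P(aa, region) / (P(aa) * P(region)).
    """
    obs = list(observations)
    n = len(obs)
    if n == 0:
        raise ValueError("no observations")
    pair_counts = Counter(obs)
    aa_counts = Counter(aa for aa, _ in obs)
    region_counts = Counter(region for _, region in obs)
    return {
        (aa, region): (count / n) / ((aa_counts[aa] / n) * (region_counts[region] / n))
        for (aa, region), count in pair_counts.items()
    }


def fragment_score(fragment, table):
    """Sum of -ln(propensity) over a fragment; lower means more favorable.

    Pairs never observed in the training data raise KeyError in this sketch.
    """
    return sum(-math.log(table[pair]) for pair in fragment)
```

Threading a fragment then amounts to evaluating `fragment_score` for the native region assignment and for each alternative assignment, and checking where the native falls in the sorted list.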

12.
Abstract

Arriving at the native conformation of a polypeptide chain, characterized by the minimum free energy, is a problem of long-standing interest in protein structure prediction. Owing to the computational requirements of developing free energy estimates, scoring functions, whether energy based or statistical, have received considerable renewed attention in recent years for distinguishing native structures of proteins from non-native-like structures. Several cleverly designed decoy sets, CASP (Critical Assessment of Techniques for Protein Structure Prediction) structures and homology-based, internet-accessible three-dimensional model builders are now available for validating the scoring functions. We describe here an all-atom energy-based empirical scoring function and examine its performance on a wide series of publicly available decoys. Barring two protein sequences where the native structure is ranked second and seventh, the native is identified as the lowest energy structure in 67 protein sequences from among 61,659 decoys belonging to 12 different decoy sets. We further illustrate a potential application of the scoring function in bracketing native-like structures of two small mixed alpha/beta globular proteins starting from sequence and secondary structural information. The scoring function has been web-enabled at www.scfbio-iitd.res.in/utility/proteomics/energy.jsp

13.

Background

Multibody potentials accounting for cooperative effects of molecular interactions have shown better accuracy than typical pairwise potentials. The main challenge in the development of such potentials is to find relevant structural features that characterize the tightly folded proteins. Also, the side-chains of residues adopt several specific, staggered conformations, known as rotamers within protein structures. Different molecular conformations result in different dipole moments and induce charge reorientations. However, until now modeling of the rotameric state of residues had not been incorporated into the development of multibody potentials for modeling non-bonded interactions in protein structures.

Results

In this study, we develop a new multibody statistical potential which can account for the influence of rotameric states on the specificity of atomic interactions. In this potential, named “rotamer-dependent atomic statistical potential” (ROTAS), the interaction between two atoms is specified by not only the distance and relative orientation but also by two state parameters concerning the rotameric state of the residues to which the interacting atoms belong. It was clearly found that the rotameric state is correlated to the specificity of atomic interactions. Such rotamer-dependencies are not limited to specific type or certain range of interactions. The performance of ROTAS was tested using 13 sets of decoys and was compared to those of existing atomic-level statistical potentials which incorporate orientation-dependent energy terms. The results show that ROTAS performs better than other competing potentials not only in native structure recognition, but also in best model selection and correlation coefficients between energy and model quality.

Conclusions

A new multibody statistical potential, ROTAS, which accounts for the influence of rotameric states on the specificity of atomic interactions, was developed and tested on decoy sets. The results show that ROTAS has an improved ability to recognize the native structure among decoy models compared to other potentials. The effectiveness of ROTAS may provide useful information for the development of applications which require accurate side-chain modeling, such as protein design, mutation analysis, and docking simulation.

Electronic supplementary material

The online version of this article (doi:10.1186/1471-2105-15-307) contains supplementary material, which is available to authorized users.

14.
Lin Chen  Jing He 《Biopolymers》2014,101(6):681-692
Blocks are the selected portions of amino acids. They have been used effectively to represent amino acids in distinguishing the native conformation from the decoys. Although many statistical energy functions exist, most of them rely on the distances between two or more amino acids. In this study, the authors have developed a pairwise energy function “DOKB” that is both distance and orientation dependent, and it is based on the key blocks that bias the distal ends of side chains. The results suggest that both the distance and the orientation are needed to distinguish the fine details of the packing geometry. DOKB appears to perform well in recognizing native conformations when compared with six other energy functions. Highly packed clusters play important roles in stabilizing the structure. An investigation of the highly packed clusters at the residue level suggests that certain residue pairs in a low‐energy region are less likely to appear in the highly packed clusters than in the entire protein. The cluster energy term appears to significantly improve the recognition of the native conformations in the ig_structal decoy set, which contains more highly packed clusters than other decoy sets. © 2013 Wiley Periodicals, Inc. Biopolymers 101: 681–692, 2014.

15.

Background

Elucidating the native structure of a protein molecule from its sequence of amino acids, a problem known as de novo structure prediction, is a long standing challenge in computational structural biology. Difficulties in silico arise due to the high dimensionality of the protein conformational space and the ruggedness of the associated energy surface. The issue of multiple minima is a particularly troublesome hallmark of energy surfaces probed with current energy functions. In contrast to the true energy surface, these surfaces are weakly-funneled and rich in comparably deep minima populated by non-native structures. For this reason, many algorithms seek to be inclusive and obtain a broad view of the low-energy regions through an ensemble of low-energy (decoy) conformations. Conformational diversity in this ensemble is key to increasing the likelihood that the native structure has been captured.

Methods

We propose an evolutionary search approach to address the multiple-minima problem in decoy sampling for de novo structure prediction. Two population-based evolutionary search algorithms are presented that follow the basic approach of treating conformations as individuals in an evolving population. Coarse graining and molecular fragment replacement are used to efficiently obtain protein-like child conformations from parents. Potential energy is used both to bias parent selection and determine which subset of parents and children will be retained in the evolving population. The effect on the decoy ensemble of sampling minima directly is measured by additionally mapping a conformation to its nearest local minimum before considering it for retainment. The resulting memetic algorithm thus evolves not just a population of conformations but a population of local minima.

Results and conclusions

Results show that both algorithms are effective in terms of sampling conformations in proximity of the known native structure. The additional minimization is shown to be key to enhancing sampling capability and obtaining a diverse ensemble of decoy conformations, circumventing premature convergence to sub-optimal regions in the conformational space, and approaching the native structure with proximity that is comparable to state-of-the-art decoy sampling methods. The results are shown to be robust and valid when using two representative state-of-the-art coarse-grained energy functions.
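
The memetic idea summarized above (evolve a population, but map every child to its nearest local minimum before selection) can be sketched on a toy problem. This is not the authors' algorithm: Gaussian perturbation stands in for molecular fragment replacement, every parent produces exactly one child rather than being selected with an energy bias, the local minimizer is a plain greedy coordinate descent, and `rugged_energy` is a hypothetical test function rather than a protein energy function.

```python
import math
import random


def rugged_energy(x):
    """Hypothetical multi-minimum test function (not a protein energy)."""
    return sum(v * v + 2.0 * math.sin(5.0 * v) for v in x)


def local_minimize(x, energy, step=0.1, iters=50):
    """Greedy coordinate descent: accept a +/- step move only if energy drops."""
    x = list(x)
    e = energy(x)
    for _ in range(iters):
        improved = False
        for i in range(len(x)):
            for delta in (step, -step):
                trial = x[:]
                trial[i] += delta
                e_trial = energy(trial)
                if e_trial < e:
                    x, e, improved = trial, e_trial, True
        if not improved:
            break
    return x, e


def memetic_search(energy, dim, pop_size=20, generations=30, sigma=0.5, seed=0):
    """Return the final population of (conformation, energy), best first."""
    rng = random.Random(seed)
    population = [
        local_minimize([rng.uniform(-3.0, 3.0) for _ in range(dim)], energy)
        for _ in range(pop_size)
    ]
    for _ in range(generations):
        children = []
        for parent, _ in population:
            child = [v + rng.gauss(0.0, sigma) for v in parent]
            # Memetic step: each child is replaced by a nearby local minimum.
            children.append(local_minimize(child, energy))
        # Truncation selection over parents and children together.
        population = sorted(population + children, key=lambda p: p[1])[:pop_size]
    return sorted(population, key=lambda p: p[1])
```

Because parents compete with their children under truncation selection, the best energy in the population never gets worse from one generation to the next; keeping the ensemble diverse, which the abstract stresses, would need an additional mechanism that this sketch omits.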

16.
The conformations of loops are determined by the water-mediated interactions between amino acid residues. Energy functions that describe the interactions can be derived either from physical principles (physical-based energy function) or statistical analysis of known protein structures (knowledge-based statistical potentials). It is commonly believed that statistical potentials are appropriate for coarse-grained representation of proteins but are not as accurate as physical-based potentials when atomic resolution is required. Several recent applications of physical-based energy functions to loop selections appear to support this view. In this article, we apply a recently developed DFIRE-based statistical potential to three different loop decoy sets (RAPPER, Jacobson, and Forrest-Woolf sets). Together with a rotamer library for side-chain optimization, the performance of the DFIRE-based potential in the RAPPER decoy set (385 loop targets) is comparable to that of AMBER/GBSA for short loops (two to eight residues). DFIRE is more accurate for longer loops (9 to 12 residues). A similar trend is observed when comparing DFIRE with another physical-based energy function, OPLS/SGB-NP, in the large Jacobson decoy set (788 loop targets). In the Forrest-Woolf decoy set for the loops of membrane proteins, the DFIRE potential performs substantially better than the combination of the CHARMM force field with several solvation models. The results suggest that a single-term DFIRE statistical energy function can provide an accurate loop prediction at a fraction of the computing cost required for more complicated physical-based energy functions. A Web server for academic users is established for loop selection at the softwares/services section of the Web site http://theory.med.buffalo.edu/.

17.
A method is presented for the derivation of knowledge-based pair potentials that corrects for the various compositions of different proteins. The resulting statistical pair potential is more specific than that derived from previous approaches as assessed by gapless threading results. Additionally, a methodology is presented that interpolates between statistical potentials when no homologous examples to the protein of interest are in the structural database used to derive the potential, to a Go-like potential (in which native interactions are favorable and all nonnative interactions are not) when homologous proteins are present. For cases in which no protein exceeds 30% sequence identity, pairs of weakly homologous interacting fragments are employed to enhance the specificity of the potential. In gapless threading, the mean z score increases from -10.4 for the best statistical pair potential to -12.8 when the local sequence similarity, fragment-based pair potentials are used. Examination of the ab initio structure prediction of four representative globular proteins consistently reveals a qualitative improvement in the yield of structures in the 4 to 6 Å rmsd from native range when the fragment-based pair potential is used relative to that when the quasichemical pair potential is employed. This suggests that such protein-specific potentials provide a significant advantage relative to generic quasichemical potentials.
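
Gapless threading, the test used here to assess specificity, can be sketched as sliding a sequence along a template without insertions or deletions and summing pair energies over the template contacts that fall inside the window. The contact list, the `pair_energy` table, and the function names below are hypothetical inputs chosen for illustration, not the potentials of the paper.

```python
def thread_energy(sequence, contacts, offset, pair_energy):
    """Energy of `sequence` mounted on template positions offset..offset+L-1.

    `contacts` is a list of (i, j) template positions that are in contact;
    `pair_energy` maps an alphabetically sorted residue pair to an energy.
    """
    length = len(sequence)
    total = 0.0
    for i, j in contacts:
        a, b = i - offset, j - offset
        # Only contacts with both ends inside the threaded window count.
        if 0 <= a < length and 0 <= b < length:
            key = tuple(sorted((sequence[a], sequence[b])))
            total += pair_energy[key]
    return total


def gapless_threading(sequence, contacts, template_length, pair_energy):
    """Energies of every gapless placement of `sequence` on the template."""
    n_offsets = template_length - len(sequence) + 1
    if n_offsets < 1:
        raise ValueError("template is shorter than the sequence")
    return [
        thread_energy(sequence, contacts, offset, pair_energy)
        for offset in range(n_offsets)
    ]
```

A z score such as the -10.4 and -12.8 values quoted above is then obtained by comparing the native placement's energy with the mean and standard deviation of the energies from all other placements.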

18.
All-atom free-energy methods offer a promising alternative to kinetic molecular mechanics simulations of protein folding and association. Here we report an accurate, transferable all-atom biophysical force field (PFF02) that stabilizes the native conformation of a wide range of proteins as the global optimum of the free-energy landscape. For 32 proteins of the ROSETTA decoy set and six proteins that we have previously folded with PFF01, we find near-native conformations with an average backbone RMSD of 2.14 Å to the native conformation and an average Z-score of −3.46 to the corresponding decoy set. We used nonequilibrium sampling techniques starting from completely extended conformations to exhaustively sample the energy surface of three nonhomologous hairpin-peptides, a three-stranded β-sheet, the all-helical 40 amino-acid HIV accessory protein, and a zinc-finger ββα motif, and find near-native conformations for the minimal energy for each protein. Using a massively parallel evolutionary algorithm, we also obtain a near-native low-energy conformation for the 54 amino-acid engrailed homeodomain. Our force field thus stabilized near-native conformations for a total of 20 proteins of all structure classes with an average RMSD of only 3.06 Å to their respective experimental conformations.

19.
We develop coarse-grained, distance- and orientation-dependent statistical potentials from the growing protein structural databases. For protein structural classes (alpha, beta, and alpha/beta), a substantial number of backbone-backbone and backbone-side-chain contacts stabilize the native folds. By taking into account the importance of backbone interactions with a virtual backbone interaction center as the 21st anisotropic site, we construct a 21 x 21 interaction scheme. The new potentials are studied using spherical harmonics analysis (SHA) and a smooth, continuous version is constructed using spherical harmonic synthesis (SHS). Our approach has the following advantages: (1) The smooth, continuous form of the resulting potentials is more realistic and presents significant advantages for computational simulations, and (2) with SHS, the potential values can be computed efficiently for arbitrary coordinates, requiring only the knowledge of a few spherical harmonic coefficients. The performance of the new orientation-dependent potentials was tested using a standard database of decoy structures. The results show that the ability of the new orientation-dependent potentials to recognize native protein folds from a set of decoy structures is strongly enhanced by the inclusion of anisotropic backbone interaction centers. The anisotropic potentials can be used to develop realistic coarse-grained simulations of proteins, with direct applications to protein design, folding, and aggregation.

20.
H Lu  J Skolnick 《Proteins》2001,44(3):223-232
A heavy atom distance-dependent knowledge-based pairwise potential has been developed. This statistical potential is first evaluated and optimized with the native structure z-scores from gapless threading. The potential is then used to recognize the native and near-native structures from both published decoy test sets, as well as decoys obtained from our group's protein structure prediction program. In the gapless threading test, there is an average z-score improvement of 4 units in the optimized atomic potential over the residue-based quasichemical potential. Examination of the z-scores for individual pairwise distance shells indicates that the specificity for the native protein structure is greatest at pairwise distances of 3.5-6.5 Å, i.e., in the first solvation shell. On applying the current atomic potential to test sets obtained from the web, composed of native protein and decoy structures, the current generation of the potential performs better than residue-based potentials as well as the other published atomic potentials in the task of selecting native and near-native structures. This newly developed potential is also applied to structures of varying quality generated by our group's protein structure prediction program. The current atomic potential tends to pick lower RMSD structures than do residue-based contact potentials. In particular, this atomic pairwise interaction potential has better selectivity especially for near-native structures. As such, it can be used to select near-native folds generated by structure prediction algorithms as well as for protein structure refinement.
