Similar articles
1.
The purpose of this article is to introduce a novel model for discriminating correctly folded proteins from well-designed decoy structures using mechanical interatomic forces. In our model, we treat a protein as a collection of springs and calculate the force exerted on each atom. A potential function is obtained from statistical contact preferences within known protein structures. Combining this function with the spring equation, the interatomic forces are calculated. Finally, we define a score function on the 3D structure of a protein: we compare the force exerted on each atom of a structure with that on the corresponding atom in the other structures and assign larger scores to atoms with lower forces. The total score is the sum of the partial scores of the atoms. The optimal structure is assumed to be the one with the highest score in the data set. To evaluate the performance of our model, we apply it to several decoy sets. Proteins 2009. © 2009 Wiley-Liss, Inc.

2.
3.
Shape information about macromolecules is increasingly available but is difficult to use in modeling efforts. We demonstrate that shape information alone can often distinguish structural models of biological macromolecules. By using a data structure called a surface envelope (SE) to represent the shape of the molecule, we propose a method that generates a fitness score for the shape of a particular molecular model. This score correlates well with the root mean squared deviation (RMSD) of the model from the known test structures and can be used to filter models in decoy sets. The scoring method requires both alignment of the model to the SE in three-dimensional space and assessment of the degree to which atoms in the model fill the SE. Alignment uses a hybrid algorithm that combines principal components with a previously published iterated closest point algorithm. We test our method against models generated by random atom perturbation of crystal structures, published decoy sets used in structure prediction, and models created from the trajectories of atoms in molecular modeling runs. We also test our alignment algorithm against experimental electron microscopic data from rice dwarf virus. The alignment performance is reliable, and we show a high correlation between model RMSD and score function. This correlation is stronger for molecular models with greater oblong character (as measured by the ratio of largest to smallest principal component).

4.
Statistical potential for assessment and prediction of protein structures   (Total citations: 2; self-citations: 0; citations by others: 2)
Protein structures in the Protein Data Bank provide a wealth of data about the interactions that determine the native states of proteins. Using probability theory, we derive an atomic distance-dependent statistical potential from a sample of native structures that does not depend on any adjustable parameters (Discrete Optimized Protein Energy, or DOPE). DOPE is based on an improved reference state that corresponds to noninteracting atoms in a homogeneous sphere with the radius dependent on a sample native structure; it thus accounts for the finite and spherical shape of the native structures. The DOPE potential was extracted from a nonredundant set of 1472 crystallographic structures. We tested DOPE and five other scoring functions by the detection of the native state among six multiple target decoy sets, the correlation between the score and model error, and the identification of the most accurate non-native structure in the decoy set. For all decoy sets, DOPE is the best performing function in terms of all criteria, except for a tie in one criterion for one decoy set. To facilitate its use in various applications, such as model assessment, loop modeling, and fitting into cryo-electron microscopy mass density maps combined with comparative protein structure modeling, DOPE was incorporated into the modeling package MODELLER-8.
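The idea of a distance-dependent statistical potential can be illustrated with the generic inverse-Boltzmann form, E(r) = -kT ln(p_obs(r) / p_ref(r)). The sketch below is only that generic form: it does not implement DOPE's finite-sphere reference state, and the function name, the pseudocount, and the example counts are assumptions made for illustration.

```python
import math


def inverse_boltzmann(observed_counts, reference_counts, kT=1.0, pseudocount=1e-6):
    """Per-bin pseudo-energy E = -kT * ln(p_obs / p_ref) for one atom-type pair.

    Both arguments are lists of counts over the same distance bins.
    The pseudocount (an assumption of this sketch) avoids log(0) for empty bins.
    """
    if len(observed_counts) != len(reference_counts):
        raise ValueError("count lists must have the same number of bins")
    total_obs = sum(observed_counts)
    total_ref = sum(reference_counts)
    energies = []
    for n_obs, n_ref in zip(observed_counts, reference_counts):
        p_obs = n_obs / total_obs + pseudocount
        p_ref = n_ref / total_ref + pseudocount
        energies.append(-kT * math.log(p_obs / p_ref))
    return energies


# Hypothetical counts for three distance bins of a single atom-type pair.
energies = inverse_boltzmann([10, 60, 30], [30, 40, 30])
```

Bins observed less often than the reference predicts get a positive (unfavorable) energy, and bins observed more often get a negative one. A structure would then be scored by summing these values over all its atom pairs.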

5.
In the absence of an experimentally determined protein structure, many biological questions can be addressed using computational structural models. However, the utility of protein structural models depends on their quality. Therefore, the estimation of the quality of predicted structures is an important problem. One of the approaches to this problem is the use of knowledge-based statistical potentials. Such methods typically rely on the statistics of distances and angles of residue-residue or atom-atom interactions collected from experimentally determined structures. Here, we present VoroMQA (Voronoi tessellation-based Model Quality Assessment), a new method for the estimation of protein structure quality. Our method combines the idea of statistical potentials with the use of interatomic contact areas instead of distances. Contact areas, derived using Voronoi tessellation of protein structure, are used to describe and seamlessly integrate both explicit interactions between protein atoms and implicit interactions of protein atoms with solvent. VoroMQA produces scores at atomic, residue, and global levels, all in the fixed range from 0 to 1. The method was tested on the CASP data and compared to several other single-model quality assessment methods. VoroMQA showed strong performance in the recognition of the native structure and in the structural model selection tests, thus demonstrating the efficacy of interatomic contact areas in estimating protein structure quality. The software implementation of VoroMQA is freely available as a standalone application and as a web server at http://bioinformatics.lt/software/voromqa . Proteins 2017; 85:1131–1145. © 2017 Wiley Periodicals, Inc.

6.
Shirota M, Ishida T, Kinoshita K. Proteins 2011, 79(5):1550-1563
In protein structure prediction, it is crucial to evaluate the degree of native-likeness of given model structures. Statistical potentials extracted from protein structure data sets are widely used for such quality assessment problems, but they are applicable only to comparing different models of the same protein. Although various other methods, such as machine learning approaches, were developed to predict the absolute similarity of model structures to the native ones, they required a set of decoy structures in addition to the model structures. In this paper, we tried to reformulate the statistical potentials as absolute quality scores, without using the information from decoy structures. For this purpose, we regarded the native state and the reference state, which are necessary components of statistical potentials, as the good and bad standard states, respectively, and first showed that the statistical potentials can be regarded as state functions, which relate a model structure to the native and reference states. Then, we proposed a standardized measure of protein structure, called native-likeness, by interpolating the score of a model structure between the native and reference state scores defined for each protein. The native-likeness correlated with the similarity to the native structures and discriminated the native structures from the models, with better accuracy than the raw score. Our results show that statistical potentials can quantify the native-like properties of protein structures, if they fully utilize the statistical information obtained from the data set.
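The interpolation step can be sketched as follows. This is a minimal illustration that assumes a simple linear interpolation, which the abstract does not confirm; the function name and argument names are hypothetical, and the paper's exact definition may differ.

```python
def native_likeness(model_score, native_score, reference_score):
    """Linearly rescale a raw score between two per-protein standard states.

    Assumption of this sketch: the result is 1.0 when the model scores like
    the native state and 0.0 when it scores like the reference state.
    """
    if native_score == reference_score:
        raise ValueError("native and reference scores must differ")
    return (reference_score - model_score) / (reference_score - native_score)


# Hypothetical scores for one protein (lower raw score = better).
halfway = native_likeness(-50.0, native_score=-100.0, reference_score=0.0)
```

Because the native and reference scores are defined per protein, the rescaled value can be compared across different proteins, which a raw statistical-potential score does not allow.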

7.
We present an automated method incorporated into a software package, FOLDER, to fold a protein sequence on a given three-dimensional (3D) template. Starting with the sequence alignment of a family of homologous proteins, tertiary structures are modeled using the known 3D structure of one member of the family as a template. Homologous interatomic distances from the template are used as constraints. For nonhomologous regions in the model protein, the lower and the upper bounds for the interatomic distances are imposed by steric constraints and the globular dimensions of the template, respectively. Distance geometry is used to embed an ensemble of structures consistent with these distance bounds. Structures are selected from this ensemble based on minimal distance error criteria, after a penalty function optimization step. These structures are then refined using energy optimization methods. The method is tested by simulating the alpha-chain of horse hemoglobin using the alpha-chain of human hemoglobin as the template and by comparing the generated models with the crystal structure of the alpha-chain of horse hemoglobin. We also test the packing efficiency of this method by reconstructing the atomic positions of the interior side chains beyond the Cβ atoms of a protein domain from a known 3D structure. In both test cases, models retain the template constraints and any additionally imposed constraints while the packing of the interior residues is optimized with no short contacts or bond deformations. To demonstrate the use of this method in simulating structures of proteins with nonhomologous disulfides, we construct a model of murine interleukin (IL)-4 using the NMR structure of human IL-4 as the template. The resulting geometry of the nonhomologous disulfide in the model structure for murine IL-4 is consistent with standard disulfide geometry.

8.
An accurate scoring function is a key component for successful protein structure prediction. To address this important unsolved problem, we develop a generalized orientation and distance-dependent all-atom statistical potential. The new statistical potential, generalized orientation-dependent all-atom potential (GOAP), depends on the relative orientation of the planes associated with each heavy atom in interacting pairs. GOAP is a generalization of previous orientation-dependent potentials that consider only representative atoms or blocks of side-chain or polar atoms. GOAP is decomposed into distance- and angle-dependent contributions. The DFIRE distance-scaled finite ideal gas reference state is employed for the distance-dependent component of GOAP. GOAP was tested on 11 commonly used decoy sets containing 278 targets, and recognized 226 native structures as best from the decoys, whereas DFIRE recognized 127 targets. The major improvement comes from decoy sets that have homology-modeled structures that are close to native (all within ∼4.0 Å) or from the ROSETTA ab initio decoy set. For these two kinds of decoys, orientation-independent DFIRE or only side-chain orientation-dependent RWplus performed poorly. Although the OPUS-PSP block-based orientation-dependent, side-chain atom contact potential performs much better (recognizing 196 targets) than DFIRE, RWplus, and dDFIRE, it is still ∼15% worse than GOAP. Thus, GOAP is a promising advance in knowledge-based, all-atom statistical potentials. GOAP is available for download at http://cssb.biology.gatech.edu/GOAP.

9.
Simplified force fields play an important role in protein structure prediction and de novo protein design by requiring less computational effort than detailed atomistic potentials. A side-chain centroid-based, distance-dependent pairwise interaction potential has been developed. A linear programming-based formulation was used in which non-native "decoy" conformers are forced to take a higher energy than the corresponding native structure. This model was trained on an enhanced and diverse protein set. High-quality decoy structures were generated for approximately 1400 nonhomologous proteins using torsion angle dynamics along with restricted variations of the hydrophobic cores of the native structure. The resulting decoy set was used to train the model, yielding two different side-chain centroid-based force fields that differ in the way distance dependence has been used to calculate energy parameters. These force fields were tested on an independent set of 148 test proteins with 500 decoy structures for each protein. The side-chain centroid force fields correctly identified approximately 86% of the native structures. The Z-scores produced by the proposed centroid-centroid distance-dependent force fields improved compared with other distance-dependent Cα-Cα or side-chain-based force fields.
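The two evaluation criteria mentioned above, native identification and Z-score, can be sketched as follows. This is a minimal illustration under stated assumptions: lower energy is taken to be better, the Z-score is computed against the decoy population standard deviation, and all names and example energies are hypothetical rather than taken from the study.

```python
import statistics


def native_z_score(native_energy, decoy_energies):
    """Z-score of the native energy relative to the decoy energy distribution.

    With lower energy meaning better, a more negative Z-score indicates a
    larger gap between the native structure and the decoys.
    """
    mean = statistics.mean(decoy_energies)
    std = statistics.pstdev(decoy_energies)
    if std == 0:
        raise ValueError("decoy energies have zero spread")
    return (native_energy - mean) / std


def native_is_top_ranked(native_energy, decoy_energies):
    """True if the native structure has strictly lower energy than every decoy."""
    return all(native_energy < energy for energy in decoy_energies)


# Hypothetical energies for one protein and three of its decoys.
decoys = [-8.0, -10.0, -12.0]
z = native_z_score(-15.0, decoys)
top = native_is_top_ranked(-15.0, decoys)
```

The fraction of test proteins for which `native_is_top_ranked` holds corresponds to the kind of success rate reported in the abstract, while the Z-score measures how far the native structure is separated from the decoys.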

10.
Coarse-grained models for protein structure are increasingly used in simulations and structural bioinformatics. In this study, we evaluated the effectiveness of three granularities of protein representation based on their ability to discriminate between correctly folded native structures and incorrectly folded decoy structures. The three levels of representation used one bead per amino acid (coarse), two beads per amino acid (medium), and all atoms (fine). Multiple structure features were compared at each representation level including two-body interactions, three-body interactions, solvent exposure, contact numbers, and angle bending. In most cases, the all-atom level was most successful at discriminating decoys, but the two-bead level provided a good compromise between the number of model parameters which must be estimated and the accuracy achieved. The most effective feature type appeared to be two-body interactions. Considering three-body interactions increased accuracy only marginally when all atoms were used and not at all in medium and coarse representations. Though two-body interactions were most effective for the coarse representations, the accuracy loss for using only solvent exposure or contact number was proportionally less at these levels than in the all-atom representation. We propose an optimization method capable of selecting bead types of different granularities to create a mixed representation of the protein. We illustrate its behavior on decoy discrimination and discuss implications for data-driven protein model selection. Proteins 2013. © 2012 Wiley Periodicals, Inc.

11.
The DOcking decoy-based Optimized Potential (DOOP) energy function for protein structure prediction is based on empirical distance-dependent atom-pair interactions. To optimize the atom-pair interactions, native protein structures are decomposed into polypeptide chain segments that correspond to structural motifs involving complete secondary structure elements. They constitute near-native ligand–receptor systems (or just pairs). Thus, a total of 8609 ligand–receptor systems were prepared from 954 selected proteins. For each of these hypothetical ligand–receptor systems, 1000 evenly sampled docking decoys with 0–10 Å interface root-mean-square deviation (iRMSD) were generated with a method used before for protein–protein docking. A neural network-based optimization method was applied to derive the optimized energy parameters using these decoys so that the energy function mimics the funnel-like energy landscape for the interaction between these hypothetical ligand–receptor systems. Thus, our method hierarchically models the overall funnel-like energy landscape of native protein structures. The resulting energy function was tested on several commonly used decoy sets for native protein structure recognition and compared with other statistical potentials. In combination with a torsion potential term which describes the local conformational preference, the atom-pair-based potential outperforms other reported statistical energy functions in correct ranking of native protein structures for a variety of decoy sets. This is especially the case for the most challenging ROSETTA decoy set, although it does not take into account side-chain orientation dependence explicitly. The DOOP energy function for protein structure prediction, the underlying database of protein structures with hypothetical ligand–receptor systems and their decoys are freely available at http://agknapp.chemie.fu-berlin.de/doop/ . Proteins 2015; 83:881–890. © 2015 Wiley Periodicals, Inc.

12.
This study aims to show that considering only nonlocal interactions (interactions of two atoms with a sequence separation larger than five amino acids) extracted using Delaunay tessellation is sufficient and accurate for protein fold recognition. An atomic knowledge-based potential was extracted based on a Delaunay tessellation with 167 atom types from a sample of the native structures, and the normalized energy was calculated for only nonlocal interactions in each structure. The performance of this method was tested on several decoy sets and compared to a method considering all interactions extracted by Delaunay tessellation and to three other popular scoring functions. Features such as the contents of different types of interactions and atoms with the highest number of interactions were also studied. The results suggest that the nonlocal interactions in a Delaunay tessellation of a protein structure form a discrete structure that captures deep properties of the three-dimensional protein data. Proteins 2014; 82:415–423. © 2013 Wiley Periodicals, Inc.

13.
We have developed a solvation function that combines a Generalized Born model for polarization of protein charge by the high dielectric solvent, with a hydrophobic potential of mean force (HPMF) as a model for hydrophobic interaction, to aid in the discrimination of native structures from other misfolded states in protein structure prediction. We find that our energy function outperforms other reported scoring functions in terms of correct native ranking for 91% of proteins and low Z scores for a variety of decoy sets, including the challenging Rosetta decoys. This work shows that the stabilizing effect of hydrophobic exposure to aqueous solvent that defines the HPMF hydration physics is an apparent improvement over solvent-accessible surface area models that penalize hydrophobic exposure. Decoys generated by thermal sampling around the native-state basin reveal a potentially important role for side-chain entropy in the future development of even more accurate free energy surfaces.

14.
We have developed a free-energy function based on an all-atom model for proteins. It comprises two components, the hydration entropy (HE) and the total dehydration penalty (TDP). Upon a transition to a more compact structure, the number of accessible configurations arising from the translational displacement of water molecules in the system increases, leading to a water-entropy gain. To fully account for this effect, the HE is calculated using a statistical-mechanical theory applied to a molecular model for water. The TDP corresponds to the sum of the hydration energy and the protein intramolecular energy when a fully extended structure, which possesses the maximum number of hydrogen bonds with water molecules and no intramolecular hydrogen bonds, is chosen as the standard one. When a donor and an acceptor (e.g., N and O, respectively) are buried in the interior after the break of hydrogen bonds with water molecules, if they form an intramolecular hydrogen bond, no penalty is imposed. When a donor or an acceptor is buried with no intramolecular hydrogen bond formed, an energetic penalty is imposed. We examine all the donors and acceptors for backbone-backbone, backbone-side chain, and side chain-side chain intramolecular hydrogen bonds and calculate the TDP. Our free-energy function has been tested for three different decoy sets. It is better than any other physics-based or knowledge-based potential function in terms of the accuracy in discriminating the native fold from misfolded decoys and the achievement of high Z-scores. Proteins 2009. © 2009 Wiley-Liss, Inc.

15.
Zhu J, Zhu Q, Shi Y, Liu H. Proteins 2003, 52(4):598-608
One strategy for ab initio protein structure prediction is to generate a large number of possible structures (decoys) and select the most fitting ones based on a scoring or free energy function. The conformational space of a protein is huge, and chances are rare that any heuristically generated structure will directly fall in the neighborhood of the native structure. It is desirable that, instead of being thrown away, the unfitting decoy structures can provide insights into native structures so prediction can be made progressively. First, we demonstrate that a recently parameterized physics-based effective free energy function based on the GROMOS96 force field and a generalized Born/surface area solvent model is, like several other physics-based and knowledge-based models, capable of distinguishing native structures from decoy structures for a number of widely used decoy databases. Second, we observe a substantial increase in correlations of the effective free energies with the degree of similarity between the decoys and the native structure, if the similarity is measured by the content of native inter-residue contacts in a decoy structure rather than its root-mean-square deviation from the native structure. Finally, we investigate the possibility of predicting native contacts based on the frequency of occurrence of contacts in decoy structures. For most proteins contained in the decoy databases, a meaningful amount of native contacts can be predicted based on plain frequencies of occurrence at a relatively high level of accuracy. Relative to using plain frequencies, overwhelming improvements in sensitivity of the predictions are observed for the 4_state_reduced decoy sets by applying energy-dependent weighting of decoy structures in determining the frequency. There, approximately 80% of native contacts can be predicted at an accuracy of approximately 80% using energy-weighted frequencies. The sensitivity of the plain frequency approach is much lower (20% to 40%). Such improvements are, however, not observed for the other decoy databases. The rationalization and implications of the results are discussed.
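The difference between plain and energy-weighted contact frequencies can be sketched as follows. The abstract does not specify the weighting scheme, so this sketch assumes Boltzmann-like weights exp(-(E - E_min)/kT); the function name, the kT value, and the example decoys are hypothetical.

```python
import math
from collections import defaultdict


def contact_frequencies(decoy_contacts, energies=None, kT=1.0):
    """Frequency of each residue-residue contact across a set of decoys.

    decoy_contacts: list of sets of (i, j) residue index pairs, one per decoy.
    energies: optional list of decoy energies; if given, each decoy is
    weighted by exp(-(E - E_min) / kT) (an assumption of this sketch),
    otherwise every decoy counts equally (plain frequency).
    """
    if energies is None:
        weights = [1.0] * len(decoy_contacts)
    else:
        e_min = min(energies)
        weights = [math.exp(-(e - e_min) / kT) for e in energies]
    total = sum(weights)
    freq = defaultdict(float)
    for contacts, weight in zip(decoy_contacts, weights):
        for pair in contacts:
            freq[pair] += weight / total
    return dict(freq)


# Three hypothetical decoys; the first two have the lowest energies.
decoys = [{(1, 5), (2, 8)}, {(1, 5)}, {(3, 9)}]
plain = contact_frequencies(decoys)
weighted = contact_frequencies(decoys, energies=[-10.0, -9.0, -2.0])
```

Contacts whose frequency exceeds a chosen threshold would be predicted as native. With weighting, contacts found mainly in low-energy decoys gain frequency and those found only in high-energy decoys lose it.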

16.
17.
We have calculated the stability of decoy structures of several proteins (from the CASP3 models and the Park and Levitt decoy set) relative to the native structures. The calculations were performed with the force field-consistent ES/IS method, in which an implicit solvent (IS) model is used to calculate the average solvation free energy for snapshots from explicit simulations (ESs). The conformational free energy is obtained by adding the internal energy of the solute from the ESs and an entropic term estimated from the covariance positional fluctuation matrix. The set of atomic Born radii and the cavity-surface free energy coefficient used in the implicit model have been optimized to be consistent with the all-atom force field used in the ESs (cedar/gromos with the simple point charge (SPC) water model). The decoys are found to have a consistently higher free energy than that of the native structure; the gap between the native structure and the best decoy varies between 10 and 15 kcal/mol, on the order of the free energy difference that typically separates the native state of a protein from the unfolded state. The correlation between the free energy and the extent to which the decoy structures differ from the native (as root mean square deviation) is very weak; hence, the free energy is not an accurate measure for ranking the structurally most native-like structures from among a set of models. Analysis of the energy components shows that stability is attained as a result of three major driving forces: (1) minimum size of the protein-water surface interface; (2) minimum total electrostatic energy, which includes solvent polarization; and (3) minimum protein packing energy. The detailed fit required to optimize the last term may underlie difficulties encountered in recovering the native fold from an approximate decoy or model structure.

18.
We describe the derivation and testing of a knowledge-based atomic environment potential for the modeling of protein structural energetics. An analysis of the probabilities of atomic interactions in a dataset of high-resolution protein structures shows that the probabilities of non-bonded inter-atomic contacts are not statistically independent events, and that the multi-body contact frequencies are poorly predicted from pairwise contact potentials. A pseudo-energy function is defined that measures the preferences for protein atoms to be in a given microenvironment defined by the number of contacting atoms in the environment and its atomic composition. This functional form is tested for its ability to recognize native protein structures amongst an ensemble of decoy structures and a detailed relative performance comparison is made with a number of common functions used in protein structure prediction.

19.
20.
We describe a novel method to calculate the packing interactions in protein structural models. The method calculates the interatomic occluded surface areas for each atom in the protein model. The identification of, and degree of interaction with, neighboring atoms is accomplished by extending surface normals from a dot surface of each atom to the point of intersection with neighboring atoms. The combined occluded and non-occluded surface areas may be normalized for the amino acid composition of the protein, providing a single parameter, the normalized protein surface ratio, which is diagnostic for native-like structures. Individual residues in the model which are in infrequent occluded surface environments may be identified. The method provides a means to explicitly describe packing densities and packing environments of individual atoms in a protein model. Finally, the method allows estimation of the complementarity between any interacting molecules, for example a ligand binding to a receptor.

