
1.

Network properties of protein-decoy structures

Chatterjee S Bhattacharyya M Vishveshwara S 《Journal of biomolecular structure & dynamics》2012,29(6):606-622

Convergence of the vast sequence space of proteins into a highly restricted fold/conformational space suggests a simple yet unique underlying mechanism of protein folding that has been the subject of much debate in the last several decades. One of the major challenges related to the understanding of protein folding or in silico protein structure prediction is the discrimination of non-native structures/decoys from the native structure. Applications of knowledge-based potentials to attain this goal have been extensively reported in the literature. Also, scoring functions based on accessible surface area and amino acid neighbourhood considerations have been used to discriminate the decoys from native structures. In this article, we have explored the potential of protein structure network (PSN) parameters to validate the native proteins against a large number of decoy structures generated by diverse methods. We are guided by two principles: (a) the PSNs capture the local properties from a global perspective and (b) inclusion of non-covalent interactions, at all-atom level, including the side-chain atoms, in the network construction accommodates the sequence dependent features. Several network parameters such as the size of the largest cluster, community size, and clustering coefficient are evaluated and scored on the basis of the rank of the native structures and the Z-scores. The network analysis of decoy structures highlights the importance of the global properties contributing to the uniqueness of native structures. The analysis also shows that the network parameters can be used as metrics to identify the native structures and filter out non-native structures/decoys in a large number of data sets, and thus have the potential to be used in the protein 'structure prediction' problem.
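The abstract above scores each network parameter by the rank of the native structure and by its Z-score. The following is a minimal Python sketch of that kind of scoring, not the authors' code. The function name, the `higher_is_better` flag, the use of the population standard deviation, and the example values are all assumptions made for illustration.

```python
from statistics import mean, pstdev


def native_rank_and_zscore(native_value, decoy_values, higher_is_better=True):
    """Rank the native structure among its decoys and compute its Z-score.

    Rank 1 means no decoy has a better value of the parameter.
    The Z-score is (native - mean(decoys)) / std(decoys); the population
    standard deviation is an assumption of this sketch.
    """
    if higher_is_better:
        n_better = sum(1 for v in decoy_values if v > native_value)
    else:
        n_better = sum(1 for v in decoy_values if v < native_value)
    rank = n_better + 1
    sigma = pstdev(decoy_values)
    z = (native_value - mean(decoy_values)) / sigma if sigma > 0 else 0.0
    return rank, z


# Hypothetical values of one network parameter (for example, the size of
# the largest cluster) for five decoys and the native structure.
decoys = [40.0, 42.0, 44.0, 46.0, 48.0]
rank, z = native_rank_and_zscore(52.0, decoys)
print(rank, round(z, 3))
```

A large positive Z-score together with rank 1 indicates that the parameter separates the native structure well from this particular decoy set.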

2.

Optimized distance‐dependent atom‐pair‐based potential DOOP for protein structure prediction

Myong‐Ho Chae Florian Krull Ernst‐Walter Knapp 《Proteins》2015,83(5):881-890

The DOcking decoy‐based Optimized Potential (DOOP) energy function for protein structure prediction is based on empirical distance‐dependent atom‐pair interactions. To optimize the atom‐pair interactions, native protein structures are decomposed into polypeptide chain segments that correspond to structural motifs involving complete secondary structure elements. They constitute near native ligand–receptor systems (or just pairs). Thus, a total of 8609 ligand–receptor systems were prepared from 954 selected proteins. For each of these hypothetical ligand–receptor systems, 1000 evenly sampled docking decoys with 0–10 Å interface root‐mean‐square‐deviation (iRMSD) were generated with a method used before for protein–protein docking. A neural network‐based optimization method was applied to derive the optimized energy parameters using these decoys so that the energy function mimics the funnel‐like energy landscape for the interaction between these hypothetical ligand–receptor systems. Thus, our method hierarchically models the overall funnel‐like energy landscape of native protein structures. The resulting energy function was tested on several commonly used decoy sets for native protein structure recognition and compared with other statistical potentials. In combination with a torsion potential term which describes the local conformational preference, the atom‐pair‐based potential outperforms other reported statistical energy functions in correct ranking of native protein structures for a variety of decoy sets. This is especially the case for the most challenging ROSETTA decoy set, although it does not take into account side chain orientation‐dependence explicitly. The DOOP energy function for protein structure prediction, the underlying database of protein structures with hypothetical ligand–receptor systems and their decoys are freely available at http://agknapp.chemie.fu‐berlin.de/doop/ . Proteins 2015; 83:881–890. © 2015 Wiley Periodicals, Inc.

3.

Protein structure evaluation using an all-atom energy based empirical scoring function

Narang P Bhushan K Bose S Jayaram B 《Journal of biomolecular structure & dynamics》2006,23(4):385-406

Arriving at the native conformation of a polypeptide chain, characterized by the minimum free energy, is a problem of long-standing interest in protein structure prediction endeavors. Owing to the computational requirements in developing free energy estimates, scoring functions (energy based or statistical) have received considerable renewed attention in recent years for distinguishing native structures of proteins from non-native like structures. Several cleverly designed decoy sets, CASP (Critical Assessment of Techniques for Protein Structure Prediction) structures and homology based internet accessible three dimensional model builders are now available for validating the scoring functions. We describe here an all-atom energy based empirical scoring function and examine its performance on a wide series of publicly available decoys. Barring two protein sequences where the native structure is ranked second and seventh, the native is identified as the lowest energy structure in 67 protein sequences from among 61,659 decoys belonging to 12 different decoy sets. We further illustrate a potential application of the scoring function in bracketing native-like structures of two small mixed alpha/beta globular proteins starting from sequence and secondary structural information. The scoring function has been web enabled at www.scfbio-iitd.res.in/utility/proteomics/energy.jsp.

4.

How well can we predict native contacts in proteins based on decoy structures and their energies?

Zhu J Zhu Q Shi Y Liu H 《Proteins》2003,52(4):598-608

One strategy for ab initio protein structure prediction is to generate a large number of possible structures (decoys) and select the most fitting ones based on a scoring or free energy function. The conformational space of a protein is huge, and chances are rare that any heuristically generated structure will directly fall in the neighborhood of the native structure. It is desirable that, instead of being thrown away, the unfitting decoy structures can provide insights into native structures so prediction can be made progressively. First, we demonstrate that a recently parameterized physics-based effective free energy function based on the GROMOS96 force field and a generalized Born/surface area solvent model is, like several other physics-based and knowledge-based models, capable of distinguishing native structures from decoy structures for a number of widely used decoy databases. Second, we observe a substantial increase in correlations of the effective free energies with the degree of similarity between the decoys and the native structure, if the similarity is measured by the content of native inter-residue contacts in a decoy structure rather than its root-mean-square deviation from the native structure. Finally, we investigate the possibility of predicting native contacts based on the frequency of occurrence of contacts in decoy structures. For most proteins contained in the decoy databases, a meaningful amount of native contacts can be predicted based on plain frequencies of occurrence at a relatively high level of accuracy. Relative to using plain frequencies, overwhelming improvements in sensitivity of the predictions are observed for the 4_state_reduced decoy sets by applying energy-dependent weighting of decoy structures in determining the frequency. There, approximately 80% of native contacts can be predicted at an accuracy of approximately 80% using energy-weighted frequencies. The sensitivity of the plain frequency approach is much lower (20% to 40%). Such improvements are, however, not observed for the other decoy databases. The rationalization and implications of the results are discussed.
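The contrast between plain and energy-weighted contact frequencies can be illustrated with a short Python sketch. The abstract does not state the weighting scheme; the Boltzmann-like weight `exp(-(E - E_min) / kT)` used here is an assumption of this sketch, as are the function name, the `kT` parameter, and the toy decoy data.

```python
import math
from collections import defaultdict


def contact_frequencies(decoys, kT=None):
    """Return {contact: frequency} over a decoy set.

    decoys: list of (energy, set_of_contacts), a contact being a residue pair.
    kT=None gives plain frequencies (every decoy counts equally).
    Otherwise each decoy is weighted by exp(-(E - E_min) / kT), which is
    an assumed weighting, not necessarily the one used in the paper.
    """
    if kT is None:
        weights = [1.0] * len(decoys)
    else:
        e_min = min(energy for energy, _ in decoys)
        weights = [math.exp(-(energy - e_min) / kT) for energy, _ in decoys]
    total = sum(weights)
    freq = defaultdict(float)
    for weight, (_, contacts) in zip(weights, decoys):
        for contact in contacts:
            freq[contact] += weight / total
    return dict(freq)


# Hypothetical decoy set: (effective free energy, contacts present).
decoy_set = [
    (-10.0, {(1, 5), (2, 8)}),
    (-2.0, {(1, 5)}),
    (-1.0, {(3, 9)}),
]
plain = contact_frequencies(decoy_set)
weighted = contact_frequencies(decoy_set, kT=1.0)
# Contacts predicted as native: frequency above a chosen threshold.
predicted = {c for c, f in weighted.items() if f > 0.5}
print(plain, weighted, predicted)
```

In this toy set the contact (2, 8) occurs in only one of three decoys, so its plain frequency is low, but it belongs to the lowest-energy decoy and dominates once the weighting is applied.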

5.

Protein Structure Evaluation using an All-Atom Energy Based Empirical Scoring Function

Pooja Narang Kumkum Bhushan Surojit Bose B. Jayaram 《Journal of biomolecular structure & dynamics》2013,31(4):385-406

Abstract

Arriving at the native conformation of a polypeptide chain, characterized by the minimum free energy, is a problem of long-standing interest in protein structure prediction endeavors. Owing to the computational requirements in developing free energy estimates, scoring functions (energy based or statistical) have received considerable renewed attention in recent years for distinguishing native structures of proteins from non-native like structures. Several cleverly designed decoy sets, CASP (Critical Assessment of Techniques for Protein Structure Prediction) structures and homology based internet accessible three dimensional model builders are now available for validating the scoring functions. We describe here an all-atom energy based empirical scoring function and examine its performance on a wide series of publicly available decoys. Barring two protein sequences where the native structure is ranked second and seventh, the native is identified as the lowest energy structure in 67 protein sequences from among 61,659 decoys belonging to 12 different decoy sets. We further illustrate a potential application of the scoring function in bracketing native-like structures of two small mixed alpha/beta globular proteins starting from sequence and secondary structural information. The scoring function has been web enabled at www.scfbio-iitd.res.in/utility/proteomics/energy.jsp

6.

Empirical potential function for simplified protein models: combining contact and local sequence-structure descriptors

Zhang J Chen R Liang J 《Proteins》2006,63(4):949-960

7.

SVR_CAF: An integrated score function for detecting native protein structures among decoys

Guang Hu Bairong Shen 《Proteins》2014,82(4):556-564

An accurate score function for detecting the most native‐like models among a huge number of decoy sets is essential to protein structure prediction. In this work, we developed a novel integrated score function (SVR_CAF) to discriminate native structures from decoys, as well as to rank near‐native structures and select best decoys when native structures are absent. SVR_CAF is a machine learning score, which incorporates the contact energy based score (CE_score), amino acid network based score (AAN_score), and the fast Fourier transform based score (FFT_score). The score function was evaluated with four decoy sets for its discriminative ability and it shows higher overall performance than the state‐of‐the‐art score functions. Proteins 2014; 82:556–564. © 2013 Wiley Periodicals, Inc.

8.

Steiner minimal trees, twist angles, and the protein folding problem

Smith JM Jang Y Kim MK 《Proteins》2007,66(4):889-902

The Steiner Minimal Tree (SMT) problem determines the minimal length network for connecting a given set of vertices in three-dimensional space. SMTs have been shown to be useful in the geometric modeling and characterization of proteins. Even though the SMT problem is an NP-hard optimization problem, one can define planes within the amino acids that have a surprising regularity property for the twist angles of the planes. This angular property is quantified for all amino acids through the Steiner tree topology structure. The twist angle properties and other associated geometric properties unique for the remaining amino acids are documented in this paper. We also examine the relationship between the Steiner ratio rho and the torsion energy in amino acids with respect to the side chain torsion angle chi(1). The rho value is shown to be inversely proportional to the torsion energy. Hence, it should be a useful approximation to the potential energy function. Finally, the Steiner ratio is used to evaluate folded and misfolded protein structures. We examine all the native proteins and their decoys at http://dd.stanford.edu and compare their Steiner ratio values. Because these decoy structures have been delicately misfolded, they look even more favorable than the native proteins from the potential energy viewpoint. However, the rho value of a decoy folded protein is shown to be much closer to the average value of an empirical Steiner ratio for each residue involved than that of the corresponding native one, so that we recognize the native folded structure more easily. The inverse relationship between the Steiner ratio and the energy level in the protein is shown to be a significant measure to distinguish native and decoy structures. These properties should be ultimately useful in the ab initio protein folding prediction.

9.

Using physical features of protein core packing to distinguish real proteins from decoys

Alex T. Grigas Zhe Mei John D. Treado Zachary A. Levine Lynne Regan Corey S. O'Hern 《Protein science : a publication of the Protein Society》2020,29(9):1931-1944

The ability to consistently distinguish real protein structures from computationally generated model decoys is not yet a solved problem. One route to distinguish real protein structures from decoys is to delineate the important physical features that specify a real protein. For example, it has long been appreciated that the hydrophobic cores of proteins contribute significantly to their stability. We used two sources to obtain datasets of decoys to compare with real protein structures: submissions to the biennial Critical Assessment of protein Structure Prediction competition, in which researchers attempt to predict the structure of a protein only knowing its amino acid sequence, and also decoys generated by 3DRobot, which have user‐specified global root‐mean‐squared deviations from experimentally determined structures. Our analysis revealed that both sets of decoys possess cores that do not recapitulate the key features that define real protein cores. In particular, the model structures appear more densely packed (because of energetically unfavorable atomic overlaps), contain too few residues in the core, and have improper distributions of hydrophobic residues throughout the structure. Based on these observations, we developed a feed‐forward neural network, which incorporates key physical features of protein cores, to predict how well a computational model recapitulates the real protein structure without knowledge of the structure of the target sequence. By identifying the important features of protein structure, our method is able to rank decoy structures with similar accuracy to that obtained by state‐of‐the‐art methods that incorporate many additional features. The small number of physical features makes our model interpretable, emphasizing the importance of protein packing and hydrophobicity in protein structure prediction.

10.

Recognizing protein–protein interfaces with empirical potentials and reduced amino acid alphabets

Guillaume Launay Raul Mendez Shoshana Wodak Thomas Simonson 《BMC bioinformatics》2007,8(1):270

Background

In structural genomics, an important goal is the detection and classification of protein–protein interactions, given the structures of the interacting partners. We have developed empirical energy functions to identify native structures of protein–protein complexes among sets of decoy structures. To understand the role of amino acid diversity, we parameterized a series of functions, using a hierarchy of amino acid alphabets of increasing complexity, with 2, 3, 4, 6, and 20 amino acid groups. Compared to previous work, we used the simplest possible functional form, with residue–residue interactions and a stepwise distance-dependence. We used increased computational resources, however, constructing 290,000 decoys for 219 protein–protein complexes, with a realistic docking protocol where the protein partners are flexible and interact through a molecular mechanics energy function. The energy parameters were optimized to correctly assign as many native complexes as possible. To resolve the multiple minimum problem in parameter space, over 64,000 starting parameter guesses were tried for each energy function. The optimized functions were tested by cross validation on subsets of our native and decoy structures, by blind tests on series of native and decoy structures available on the Web, and on models for 13 complexes submitted to the CAPRI structure prediction experiment.

11.

Refinement of pairwise potentials via logistic regression to score protein-protein interactions

Kiyoto A. Tanemura Jun Pei Kenneth M. Merz Jr 《Proteins》2020,88(12):1559-1568

12.

Distance dependent centroid to centroid force fields using high resolution decoys

Rajgaria R McAllister SR Floudas CA 《Proteins》2008,70(3):950-970

Simplified force fields play an important role in protein structure prediction and de novo protein design by requiring less computational effort than detailed atomistic potentials. A side chain centroid based, distance dependent pairwise interaction potential has been developed. A linear programming based formulation was used in which non-native "decoy" conformers are forced to take a higher energy compared with the corresponding native structure. This model was trained on an enhanced and diverse protein set. High quality decoy structures were generated for approximately 1400 nonhomologous proteins using torsion angle dynamics along with restricted variations of the hydrophobic cores of the native structure. The resulting decoy set was used to train the model, yielding two different side chain centroid based force fields that differ in the way distance dependence has been used to calculate energy parameters. These force fields were tested on an independent set of 148 test proteins with 500 decoy structures for each protein. The side chain centroid force fields were successful in correctly identifying approximately 86% of native structures. The Z-scores produced by the proposed centroid-centroid distance dependent force fields improved compared with other distance dependent C(alpha)-C(alpha) or side chain based force fields.

13.

167 Network properties of decoy and CASP predicted models: a comparison with native protein structures

S. Chatterjee S. Ghosh 《Journal of biomolecular structure & dynamics》2013,31(1):108-109

Principles that govern protein folding still remain elusive. Given the huge sequence space, it is reasonable to assume that sequences follow a particular pattern to attain one of the folds already defined in the relatively small structural space. In this study, we have used protein structure networks at different interaction strengths of non-covalent interactions (I_min) (Brinda & Vishveshwara, 2005; Kannan & Vishveshwara, 1999), to identify patterns that can distinguish a native protein from decoy/modelled structures. This is a rigorous extension of an earlier study performed at I_min???0% (Chatterjee, Bhattacharyya et al., 2012). Network properties such as the size of the largest cluster (SLClu), largest k-2 communities (ComSk2) and clustering coefficients (CCoe) are analysed for 5422 native structures and 29543 decoy/modelled structures. A steeper transition profile of the native structures as a function of I_min is consistently observed (see Figure). The network properties generated at different I_min and main-chain hydrogen bonds (MHB) are integrated into a support vector machine to build a classifier, giving an accuracy of 94.11%. The uniqueness of the protein structures through side-chain interactions is captured by the network parameters, while MHB represents the backbone packing. Quality predictions for the recently concluded CASP 10 predicted models are also performed using the model, with the selected ones showing RMSD values < 2.5 Å with respect to the native structures. Amongst the network properties, ComSk2 is maximally able to capture the transition properties of the structures. Importance of ComSk2 has earlier been implicated to capture the percolating behaviour of a protein structure (Deb & Vishveshwara, 2009). Overall, a robust classifier is obtained, and patterns specific to native structures have been analysed and discussed. The study highlights the importance of side-chain interactions at different I_mins, along with backbone level interactions.
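The transition profile described above, the size of the largest cluster as a function of I_min, can be sketched in Python as follows. This is an illustrative sketch only: the edge criterion (an edge is kept when the interaction strength exceeds I_min), the function name, and the toy interaction strengths are assumptions, and a cluster is taken here to be a connected component.

```python
def largest_cluster_size(n_residues, interactions, i_min):
    """Size of the largest connected cluster of a residue network.

    interactions: dict {(i, j): strength_percent} for residue pairs.
    An edge is kept when strength > i_min (assumed criterion).
    """
    adjacency = {i: set() for i in range(n_residues)}
    for (i, j), strength in interactions.items():
        if strength > i_min:
            adjacency[i].add(j)
            adjacency[j].add(i)
    seen = set()
    best = 0
    for start in adjacency:
        if start in seen:
            continue
        # Depth-first traversal of one connected component.
        stack = [start]
        seen.add(start)
        size = 0
        while stack:
            node = stack.pop()
            size += 1
            for neighbour in adjacency[node]:
                if neighbour not in seen:
                    seen.add(neighbour)
                    stack.append(neighbour)
        best = max(best, size)
    return best


# Hypothetical interaction strengths (in percent) for a 5-residue chain.
toy = {(0, 1): 5.0, (1, 2): 3.0, (2, 3): 1.0, (3, 4): 6.0}
profile = [largest_cluster_size(5, toy, i_min) for i_min in (0, 2, 4, 7)]
print(profile)  # → [5, 3, 2, 1]
```

Plotting such a profile for a native structure and its decoys is one way to compare how sharply the largest cluster breaks up as I_min increases.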

14.

A global machine learning based scoring function for protein structure prediction

Eshel Faraggi Andrzej Kloczkowski 《Proteins》2014,82(5):752-759

We present a knowledge‐based function to score protein decoys based on their similarity to native structure. A set of features is constructed to describe the structure and sequence of the entire protein chain. Furthermore, a qualitative relationship is established between the calculated features and the underlying electromagnetic interaction that dominates this scale. The features we use are associated with residue–residue distances, residue–solvent distances, pairwise knowledge‐based potentials and a four‐body potential. In addition, we introduce a new target to be predicted, the fitness score, which measures the similarity of a model to the native structure. This new approach enables us to obtain information both from decoys and from native structures. It is also devoid of previous problems associated with knowledge‐based potentials. These features were obtained for a large set of native and decoy structures and a back‐propagating neural network was trained to predict the fitness score. Overall this new scoring potential proved to be superior to the knowledge‐based scoring functions used as its inputs. In particular, in the latest CASP (CASP10) experiment our method was ranked third for all targets, and second for freely modeled hard targets among about 200 groups for top model prediction. Ours was the only method ranked in the top three for all targets and for hard targets. This shows that initial results from the novel approach are able to capture details that were missed by a broad spectrum of protein structure prediction approaches. Source code and executables from this work are freely available at http://mathmed.org/#Software and http://mamiris.com/ . Proteins 2014; 82:752–759. © 2013 Wiley Periodicals, Inc.

15.

Optimal potentials for predicting inter-helical packing in transmembrane proteins

Dobbs H Orlandini E Bonaccini R Seno F 《Proteins》2002,49(3):342-349

A set of pairwise contact potentials between amino acid residues in transmembrane helices was determined from the known native structure of the transmembrane protein (TMP) bacteriorhodopsin by the method of perceptron learning, using Monte Carlo dynamics to generate suitable "decoy" structures. The procedure of finding these decoys is simpler than for globular proteins, since it is reasonable to assume that helices behave as independent, stable objects and, therefore, the search in the conformational space is greatly reduced. With the learnt potentials, the association of the helices in bacteriorhodopsin was successfully simulated. The folding of a second TMP (the helix-dimer glycophorin A) was then accomplished with only a refinement of the potentials from a small number of decoys.
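Perceptron learning of contact potentials can be sketched in a few lines of Python. The sketch assumes a linear energy model, E = sum over contact types of (potential × number of contacts of that type), and a plain perceptron update; the abstract does not give these details, so the energy form, the learning rate `eta`, the function names, and the toy contact counts are all assumptions.

```python
def energy(potentials, counts):
    """Linear contact energy: sum of potential * contact count per type."""
    return sum(p * c for p, c in zip(potentials, counts))


def learn_contact_potentials(native_counts, decoy_counts_list,
                             eta=0.1, max_epochs=1000):
    """Perceptron update until every decoy has a higher energy than the native.

    native_counts: contact counts per contact type in the native structure.
    decoy_counts_list: the same counts for each decoy.
    """
    potentials = [0.0] * len(native_counts)
    for _ in range(max_epochs):
        updated = False
        for decoy in decoy_counts_list:
            diff = [d - n for d, n in zip(decoy, native_counts)]
            gap = energy(potentials, diff)  # E(decoy) - E(native)
            if gap <= 0.0:
                # Violated constraint: move the potentials along diff.
                potentials = [p + eta * x for p, x in zip(potentials, diff)]
                updated = True
        if not updated:
            break  # every decoy now lies above the native structure
    return potentials


# Hypothetical counts for three contact types.
native = [3, 1, 0]
decoy_list = [[1, 2, 1], [2, 0, 2], [0, 3, 1]]
w = learn_contact_potentials(native, decoy_list)
print(w, energy(w, native), [energy(w, d) for d in decoy_list])
```

If the native and decoy count vectors are not linearly separable, the loop simply stops after `max_epochs` without satisfying every constraint, so the result should be checked before use.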

16.

Protein decoy sets for evaluating energy functions

Gilis D 《Journal of biomolecular structure & dynamics》2004,21(6):725-736

Energy functions are crucial ingredients of protein tertiary structure prediction methods. Assessing the quality of energy functions is therefore of prime importance. It requires the elaboration of a standard evaluation scheme, whose key elements are: (i) sets that contain the native and several non-native structures of proteins (decoys) in order to test whether the energy functions display the expected quality features and (ii) measures to evaluate the reliability of energy functions. We present here a survey of the recent advances in these two related fields. In a first part, we analyze and review the large number of decoy sets that are available on the web, and we summarize the characteristics of a challenging decoy set. We then discuss how to define the quality of energy functions and review the measures related to it.

17.

Structure refinement of protein model decoys requires accurate side‐chain placement

Mark A. Olson Michael S. Lee 《Proteins》2013,81(3):469-478

In this study, the application of temperature‐based replica‐exchange (T‐ReX) simulations for structure refinement of decoys taken from the I‐TASSER dataset was examined. A set of eight nonredundant proteins was investigated using self‐guided Langevin dynamics (SGLD) with a generalized Born implicit solvent model to sample conformational space. For two of the protein test cases, a comparison of the SGLD/T‐ReX method with that of a hybrid explicit/implicit solvent molecular dynamics T‐ReX simulation model is provided. Additionally, the effect of side‐chain placement among the starting decoy structures, using alternative rotamer conformations taken from the SCWRL4 modeling program, was investigated. The simulation results showed that, despite having near‐native backbone conformations among the starting decoys, the determinant of their refinement is side‐chain packing to a level that satisfies a minimum threshold of native contacts to allow efficient excursions toward the downhill refinement regime on the energy landscape. By repacking using SCWRL4 and by applying the RWplus statistical potential for structure identification, the SGLD/T‐ReX simulations achieved refinement to an average of 38% increase in the number of native contacts relative to the original I‐TASSER decoy sets and a 25% reduction in values of C_α root‐mean‐square deviation. The hybrid model succeeded in obtaining a sharper funnel to low‐energy states for a modeled target than the implicit solvent SGLD model; yet, structure identification remained roughly the same. Without meeting a threshold of near‐native packing of side chains, the T‐ReX simulations degrade the accuracy of the decoys, and subsequently, refinement becomes tantamount to the protein folding problem. Proteins 2013. 2012 Published by Wiley Periodicals, Inc.

18.

Sampling and scoring: A marriage made in heaven

Sandor Vajda David R. Hall Dima Kozakov 《Proteins》2013,81(11):1874-1884

Most structure prediction algorithms consist of initial sampling of the conformational space, followed by rescoring and possibly refinement of a number of selected structures. Here we focus on protein docking, and show that while decoupling sampling and scoring facilitates method development, integration of the two steps can lead to substantial improvements in docking results. Since decoupling is usually achieved by generating a decoy set containing both non‐native and near‐native docked structures, which can be then used for scoring function construction, we first review the roles and potential pitfalls of decoys in protein–protein docking, and show that some types of decoys are better than others for method development. We then describe three case studies showing that complete decoupling of scoring from sampling is not the best choice for solving realistic docking problems. Although some of the examples are based on our own experience, the results of the CAPRI docking and scoring experiments also show that performing both sampling and scoring generally yields better results than scoring the structures generated by all predictors. Next we investigate how the selection of training and decoy sets affects the performance of the scoring functions obtained. Finally, we discuss pathways to better alignment of the two steps, and show some algorithms that achieve a certain level of integration. Although we focus on protein–protein docking, our observations most likely also apply to other conformational search problems, including protein structure prediction and the docking of small molecules to proteins. Proteins 2013; 81:1874–1884. © 2013 Wiley Periodicals, Inc.

19.

Analysis of anisotropic side-chain packing in proteins and application to high-resolution structure prediction

Misura KM Morozov AV Baker D 《Journal of molecular biology》2004,342(2):651-664

pi-pi, Cation-pi, and hydrophobic packing interactions contribute specificity to protein folding and stability to the native state. As a step towards developing improved models of these interactions in proteins, we compare the side-chain packing arrangements in native proteins to those found in compact decoys produced by the Rosetta de novo structure prediction method. We find enrichments in the native distributions for T-shaped and parallel offset arrangements of aromatic residue pairs, in parallel stacked arrangements of cation-aromatic pairs, in parallel stacked pairs involving proline residues, and in parallel offset arrangements for aliphatic residue pairs. We then investigate the extent to which the distinctive features of native packing can be explained using Lennard-Jones and electrostatics models. Finally, we derive orientation-dependent pi-pi, cation-pi and hydrophobic interaction potentials based on the differences between the native and compact decoy distributions and investigate their efficacy for high-resolution protein structure prediction. Surprisingly, the orientation-dependent potential derived from the packing arrangements of aliphatic side-chain pairs distinguishes the native structure from compact decoys better than the orientation-dependent potentials describing pi-pi and cation-pi interactions.

20.

Computational protein design and large-scale assessment by I-TASSER structure assembly simulations

Bazzoli A Tettamanzi AG Zhang Y 《Journal of molecular biology》2011,407(5):764-776

Protein design aims at designing new protein molecules of desired structure and functionality. One of the major obstacles to large-scale protein design is the extensive time and manpower required for experimental validation of designed sequences. Recent advances in protein structure prediction have provided potentials for an automated assessment of the designed sequences via folding simulations. We present a new protocol for protein design and validation. The sequence space is initially searched by Monte Carlo sampling guided by a public atomic potential, with candidate sequences selected by the clustering of sequence decoys. The designed sequences are then assessed by I-TASSER folding simulations, which generate full-length atomic structural models by the iterative assembly of threading fragments. The protocol is tested on 52 nonhomologous single-domain proteins, with an average sequence identity of 24% between the designed sequences and the native sequences. Despite this low sequence identity, three-dimensional models predicted for the first designed sequence have an RMSD of < 2 Å to the target structure in 62% of cases. This percentage increases to 77% if we consider the three-dimensional models from the top 10 designed sequences. Such a striking consistency between the target structure and the structural prediction from nonhomologous sequences, despite the fact that the design and folding algorithms adopt completely different force fields, indicates that the design algorithm captures the features essential to the global fold of the target. On average, the designed sequences have a free energy that is 0.39 kcal/(mol residue) lower than in the native sequences, potentially affording a greater stability to synthesized target folds.
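The sequence identity quoted above compares a designed sequence with the native sequence of the same target. A minimal Python sketch of a position-wise identity follows; it assumes the two sequences have equal length and need no alignment, which is an assumption of this sketch rather than a statement about the authors' procedure, and the two example sequences are made up.

```python
def sequence_identity(seq_a, seq_b):
    """Fraction of positions with identical residues (equal-length sequences)."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must have the same length")
    if not seq_a:
        raise ValueError("sequences must not be empty")
    matches = sum(1 for a, b in zip(seq_a, seq_b) if a == b)
    return matches / len(seq_a)


# Hypothetical native and designed sequences of length 8.
native_seq = "MKTAYIAK"
designed_seq = "MRTAWLAK"
identity = sequence_identity(native_seq, designed_seq)
print(f"{identity:.1%}")  # → 62.5%
```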