Similar literature
Found 20 similar articles.
1.
Lange OF, Baker D. Proteins 2012, 80(3):884-895
Recent work has shown that NMR structures can be determined by integrating sparse NMR data with structure prediction methods such as Rosetta. The experimental data serve to guide the search for the lowest energy state towards the deep minimum at the native state, which is frequently missed in Rosetta de novo structure calculations. However, as the protein size increases, sampling again becomes limiting; for example, the standard Rosetta protocol involving Monte Carlo fragment insertion starting from an extended chain fails to converge for proteins over 150 amino acids even with guidance from chemical shifts (CS-Rosetta) and other NMR data. The primary limitation of this protocol (that every folding trajectory is completely independent of every other) was recently overcome with the development of a new approach involving resolution-adapted structural recombination (RASREC). Here we describe the RASREC approach in detail and compare it to standard CS-Rosetta. We show that the improved sampling of RASREC is essential in obtaining accurate structures over a benchmark set of 11 proteins in the 15-25 kDa size range using chemical shifts, backbone RDCs and HN-HN NOE data; in a number of cases the improved sampling methodology makes a larger contribution than incorporation of additional experimental data. Experimental data are invaluable for guiding sampling to the vicinity of the global energy minimum, but for larger proteins, the standard Rosetta fold-from-extended-chain protocol does not converge on the native minimum even with experimental data and the more powerful RASREC approach is necessary to converge to accurate solutions.

2.
Bowman GR, Pande VS. Proteins 2009, 74(3):777-788
Rosetta is a structure prediction package that has been employed successfully in numerous protein design and other applications [1]. Previous reports have attributed the current limitations of the Rosetta de novo structure prediction algorithm to inadequate sampling, particularly during the low-resolution phase [2-5]. Here, we implement the Simulated Tempering (ST) sampling algorithm [6,7] in Rosetta to address this issue. ST is intended to yield canonical sampling by inducing a random walk in temperature space such that broad sampling is achieved at high temperatures and detailed exploration of local free energy minima is achieved at low temperatures. ST should therefore visit basins in accordance with their free energies rather than their energies and achieve more global sampling than the localized scheme currently implemented in Rosetta. However, we find that ST does not improve structure prediction with Rosetta. To understand why, we carried out a detailed analysis of the low-resolution scoring functions and find that they do not provide a strong bias towards the native state. In addition, we find that both ST and standard Rosetta runs started from the native state are biased away from the native state. Although the low-resolution scoring functions could be improved, we propose that working entirely at full-atom resolution is now possible and may be a better option due to superior native-state discrimination at full-atom resolution. Such an approach will require more attention to the kinetics of convergence, however, as functions capable of native state discrimination are not necessarily capable of rapidly guiding non-native conformations to the native state.
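
The random walk in temperature space described in this abstract can be illustrated with a minimal Python sketch. This is not the Rosetta implementation from the paper: the toy double-well `energy` function, the temperature ladder, the default zero weights, and the step size are all assumptions chosen only to show the two kinds of Metropolis moves (in conformation and in temperature).

```python
import math
import random


def energy(x):
    """Toy double-well potential with minima at x = -1 and x = +1 (an assumption)."""
    return (x * x - 1.0) ** 2


def simulated_tempering(n_steps, temperatures, weights=None, step=0.5, seed=0):
    """Run simulated tempering on the toy potential.

    Returns a list of (x, temperature_index) pairs, one per step.
    `weights` are the per-temperature log-weights; zeros are used by default,
    which is valid but does not give uniform sampling of the ladder.
    """
    rng = random.Random(seed)
    betas = [1.0 / t for t in temperatures]
    if weights is None:
        weights = [0.0] * len(temperatures)
    x, k = 1.0, 0
    trajectory = []
    for _ in range(n_steps):
        # Conformational Metropolis move at the current temperature.
        x_new = x + rng.uniform(-step, step)
        d_e = energy(x_new) - energy(x)
        if d_e <= 0 or rng.random() < math.exp(-betas[k] * d_e):
            x = x_new
        # Temperature move to a neighbouring rung of the ladder.
        k_new = k + rng.choice((-1, 1))
        if 0 <= k_new < len(temperatures):
            log_acc = -(betas[k_new] - betas[k]) * energy(x)
            log_acc += weights[k_new] - weights[k]
            if log_acc >= 0 or rng.random() < math.exp(log_acc):
                k = k_new
        trajectory.append((x, k))
    return trajectory
```

A call such as `simulated_tempering(10_000, [0.2, 0.5, 1.0, 2.0])` returns the visited positions together with the temperature index at each step; in practice the weights would be tuned so that every temperature is visited about equally often.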

3.
De novo structure prediction can be defined as a search in conformational space under the guidance of an energy function. The most successful de novo structure prediction methods, such as Rosetta, assemble fragments from known structures to reduce the search space. Therefore, fragment quality is an important factor in structure prediction. In our study, a method is proposed to generate a new set of fragments from the lowest energy de novo models. These fragments were subsequently used to predict the next round of models. In a benchmark of 30 proteins, the new set of fragments showed better performance when used to predict de novo structures. The lowest energy model predicted using our method was closer to the native structure than that of Rosetta for 22 proteins. Following a similar trend, the best model among the top five lowest energy models predicted using our method was closer to the native structure than that of Rosetta for 20 proteins. In addition, our experiment showed that the C-alpha root mean square deviation was improved from 5.99 to 5.03 Å on average compared to Rosetta when the lowest energy models were picked as the best predicted models. Proteins 2014; 82:2240–2252. © 2014 Wiley Periodicals, Inc.
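
For readers unfamiliar with the C-alpha root mean square deviation quoted above, the following is a minimal Python sketch of the quantity. It assumes the model and native C-alpha coordinates are given in Å, are in the same residue order, and have already been superimposed; the abstract does not state how superposition was done, and the function name `ca_rmsd` is hypothetical.

```python
import math


def ca_rmsd(model, native):
    """RMSD between two equal-length lists of (x, y, z) C-alpha coordinates.

    Assumes both structures are already optimally superimposed.
    """
    if len(model) != len(native) or not model:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    squared = sum(
        (a - b) ** 2
        for p, q in zip(model, native)
        for a, b in zip(p, q)
    )
    return math.sqrt(squared / len(model))
```

A lower value means the model is closer to the native structure, which is the sense in which the average improvement from 5.99 to 5.03 Å is reported.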

4.
Thompson J, Baker D. Proteins 2011, 79(8):2380-2388
Prediction of protein structures from sequences is a fundamental problem in computational biology. Algorithms that attempt to predict a structure from sequence primarily use two sources of information. The first source is physical in nature: proteins fold into their lowest energy state. Given an energy function that describes the interactions governing folding, a method for constructing models of protein structures, and the amino acid sequence of a protein of interest, the structure prediction problem becomes a search for the lowest energy structure. Evolution provides an orthogonal source of information: proteins of similar sequences have similar structure, and therefore proteins of known structure can guide modeling. The relatively successful Rosetta approach takes advantage of the first, but not the second source of information during model optimization. Following the classic work by Andrej Sali and colleagues, we develop a probabilistic approach to derive spatial restraints from proteins of known structure using advances in alignment technology and the growth in the number of structures in the Protein Data Bank. These restraints define a region of conformational space that is high-probability, given the template information, and we incorporate them into Rosetta's comparative modeling protocol. The combined approach performs considerably better on a benchmark based on previous CASP experiments. Incorporating evolutionary information into Rosetta is analogous to incorporating sparse experimental data: in both cases, the additional information eliminates large regions of conformational space and increases the probability that energy-based refinement will home in on the deep energy minimum at the native state.

5.
Intrinsically unstructured/disordered proteins and domains (IUPs) lack a well-defined three-dimensional structure under native conditions. The IUPred server presents a novel algorithm for predicting such regions from amino acid sequences by estimating their total pairwise interresidue interaction energy, based on the assumption that IUP sequences do not fold due to their inability to form sufficient stabilizing interresidue interactions. The prediction can optionally use built-in parameter sets optimized for predicting short or long disordered regions and structured domains.

6.
We have developed a solvation function that combines a Generalized Born model for polarization of protein charge by the high dielectric solvent, with a hydrophobic potential of mean force (HPMF) as a model for hydrophobic interaction, to aid in the discrimination of native structures from other misfolded states in protein structure prediction. We find that our energy function outperforms other reported scoring functions in terms of correct native ranking for 91% of proteins and low Z scores for a variety of decoy sets, including the challenging Rosetta decoys. This work shows that the stabilizing effect of hydrophobic exposure to aqueous solvent that defines the HPMF hydration physics is an apparent improvement over solvent-accessible surface area models that penalize hydrophobic exposure. Decoys generated by thermal sampling around the native-state basin reveal a potentially important role for side-chain entropy in the future development of even more accurate free energy surfaces.

7.
The approach described in this paper for predicting folding nuclei in globular proteins with known three-dimensional structures is based on a search for the lowest saddle points through the barrier separating the unfolded state from the native structure on the free-energy landscape of the protein chain. This search is performed by a dynamic programming method. Comparison of theoretical results with experimental data on the folding nuclei of two dozen proteins shows that our model provides good phi value predictions for proteins whose structures have been determined by X-ray analysis, with more limited success for proteins whose structures have been determined by NMR techniques only. Consideration of a full ensemble of transition states results in more successful prediction than consideration of only the transition states with the minimal free energy. In conclusion, we have predicted the localization of folding nuclei for three-dimensional protein structures whose folding kinetics are currently being studied but whose folding nuclei have not yet been localized.

8.
The primary obstacle to de novo protein structure prediction is conformational sampling: the native state generally has lower free energy than nonnative structures but is exceedingly difficult to locate. Structure predictions with atomic level accuracy have been made for small proteins using the Rosetta structure prediction method, but for larger and more complex proteins, the native state is virtually never sampled, and it has been unclear how much of an increase in computing power would be required to successfully predict the structures of such proteins. In this paper, we develop an approach to determining how much computer power is required to accurately predict the structure of a protein, based on a reformulation of the conformational search problem as a combinatorial sampling problem in a discrete feature space. We find that conformational sampling for many proteins is limited by critical "linchpin" features, often the backbone torsion angles of individual residues, which are sampled very rarely in unbiased trajectories and, when constrained, dramatically increase the sampling of the native state. These critical features frequently occur in less regular and likely strained regions of proteins that contribute to protein function. In a number of proteins, the linchpin features are in regions found experimentally to form late in folding, suggesting a correspondence between folding in silico and in reality.

9.
De novo protein structure prediction requires location of the lowest energy state of the polypeptide chain among a vast set of possible conformations. Powerful approaches include conformational space annealing, in which search progressively focuses on the most promising regions of conformational space, and genetic algorithms, in which features of the best conformations thus far identified are recombined. We describe a new approach that combines the strengths of these two approaches. Protein conformations are projected onto a discrete feature space which includes backbone torsion angles, secondary structure, and beta pairings. For each of these there is one "native" value: the one found in the native structure. We begin with a large number of conformations generated in independent Monte Carlo structure prediction trajectories from Rosetta. Native values for each feature are predicted from the frequencies of feature value occurrences and the energy distribution in conformations containing them. A second round of structure prediction trajectories is then guided by the predicted native feature distributions. We show that native features can be predicted at much higher than background rates, and that using the predicted feature distributions improves structure prediction in a benchmark of 28 proteins. The advantages of our approach are that features from many different input structures can be combined simultaneously without producing atomic clashes or otherwise physically inviable models, and that the features being recombined have a relatively high chance of being correct. Proteins 2010. © 2009 Wiley-Liss, Inc.
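
The idea of predicting a native value for each discrete feature from both its frequency and the energies of the conformations containing it can be sketched as an energy-weighted vote. The abstract does not give the actual statistical model, so the Boltzmann-style weighting, the `kt` parameter, the function name, and the feature names below are all assumptions for illustration only.

```python
import math
from collections import defaultdict


def predict_native_features(models, kt=1.0):
    """Pick, for each feature, the value with the largest energy-weighted count.

    `models` is a list of (energy, {feature_name: value}) pairs.
    The weight exp(-(E - E_min) / kt) is an assumed scheme, not the paper's.
    """
    e_min = min(e for e, _ in models)
    votes = defaultdict(lambda: defaultdict(float))
    for e, features in models:
        weight = math.exp(-(e - e_min) / kt)
        for name, value in features.items():
            votes[name][value] += weight
    return {name: max(vals, key=vals.get) for name, vals in votes.items()}
```

With a small `kt` the lowest energy conformations dominate the vote, while a very large `kt` reduces the scheme to plain frequency counting; the predicted values could then be used to bias a second round of sampling.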

10.

Background

Here we continue our efforts to use methods developed in the folding mechanism community to both better understand and improve structure prediction. Our previous work demonstrated that Rosetta's coarse-grained potentials may actually impede accurate structure prediction at full-atom resolution. Based on this work we postulated that it may be time to work completely at full-atom resolution but that doing so may require more careful attention to the kinetics of convergence.

Methodology/Principal Findings

To explore the possibility of working entirely at full-atom resolution, we apply enhanced sampling algorithms and the free energy theory developed in the folding mechanism community to full-atom protein structure prediction with the prominent Rosetta package. We find that Rosetta's full-atom scoring function is indeed able to recognize diverse protein native states and that there is a strong correlation between score and Cα RMSD to the native state. However, we also show that there is a huge entropic barrier to folding under this potential and the kinetics of folding are extremely slow. We then exploit this new understanding to suggest ways to improve structure prediction.

Conclusions/Significance

Based on this work we hypothesize that structure prediction may be improved by taking a more physical approach, i.e. considering the nature of the model thermodynamics and kinetics which result from structure prediction simulations.

11.
12.
Georg Kuenze, Jens Meiler. Proteins 2019, 87(12):1341-1350
Computational methods that produce accurate protein structure models from limited experimental data, for example, from nuclear magnetic resonance (NMR) spectroscopy, hold great potential for biomedical research. The NMR-assisted modeling challenge in CASP13 provided a blind test to explore the capabilities and limitations of current modeling techniques in leveraging NMR data which had high sparsity, ambiguity, and error rate for protein structure prediction. We describe our approach to predict the structure of these proteins leveraging the Rosetta software suite. Protein structure models were predicted de novo using a two-stage protocol. First, low-resolution models were generated with the Rosetta de novo method guided by nonambiguous nuclear Overhauser effect (NOE) contacts and residual dipolar coupling (RDC) restraints. Second, iterative model hybridization and fragment insertion with the Rosetta comparative modeling method was used to refine and regularize models guided by all ambiguous and nonambiguous NOE contacts and RDCs. Nine out of 16 of the Rosetta de novo models had the correct fold (global distance test total score > 45) and in three cases high-resolution models were achieved (root-mean-square deviation < 3.5 Å). We also show that a meta-approach applying iterative Rosetta + NMR refinement on server-predicted models which employed non-NMR-contacts and structural templates leads to substantial improvement in model quality. Integrating these data-assisted refinement strategies with innovative non-data-assisted approaches which became possible in CASP13 such as high precision contact prediction will in the near future enable structure determination for large proteins that are outside of the realm of conventional NMR.

13.
Photochemically generated hydroxyl radicals were used to map solvent-exposed regions in the C14S mutant of the protein Sml1p, a regulator of the ribonucleotide reductase enzyme Rnr1p in Saccharomyces cerevisiae. High-performance mass spectrometry was used to characterize the oxidized peptides created by the hydroxyl radical reactions, yielding amino acid solvent-accessibility data for native and denatured C14S Sml1p that revealed a solvent-excluding tertiary structure in the native state. The data on solvent accessibilities of various amino acids within the protein were then utilized to evaluate the de novo computational models generated by the HMMSTR/Rosetta server. The top five models initially generated by the server all disagreed with both published nuclear magnetic resonance (NMR) data and the solvent-accessibility data obtained in this study. A structural model adjusted to fit the previously reported NMR data satisfied most of the solvent-accessibility constraints. Minor adjustment of the rotamers of two amino acid side chains in this latter structure produced a model that not only provided a lower energy conformation but also completely satisfied previously reported data from NMR and tryptophan fluorescence measurements, in addition to the solvent-accessibility data presented here.

14.
Numerous quantitative stability/flexibility relationships within Escherichia coli thioredoxin (Trx) and its fragments are determined using a minimal distance constraint model (DCM). A one-dimensional free energy landscape as a function of global flexibility reveals Trx to fold in a low-barrier two-state process, with a voluminous transition state. Near the folding transition temperature, the native free energy basin is markedly skewed to allow partial unfolded forms. Under native conditions the skewed shape is lost, and the protein forms a compact structure with some flexibility. Predictions on ten Trx fragments are generally consistent with experimental observations that they are disordered, and that complementary fragments reconstitute. A hierarchical unfolding pathway is uncovered using an exhaustive computational procedure of breaking interfacial cross-linking hydrogen bonds that span over a series of fragment dissociations. The unfolding pathway leads to a stable core structure (residues 22-90), predicted to act as a kinetic trap. Direct connection between degree of rigidity within molecular structure and non-additivity of free energy is demonstrated using a thermodynamic cycle involving fragments and their hierarchical unfolding pathway. Additionally, the model provides insight about molecular cooperativity within Trx in its native state, and about intermediate states populating the folding/unfolding pathways. Native state cooperativity correlation plots highlight several flexibly correlated regions, giving insight into the catalytic mechanism that facilitates access to the active site disulfide bond. Residual native cooperativity correlations are present in the core substructure, suggesting that Trx can function when it is partly unfolded. This natively disordered kinetic trap, interpreted as a molten globule, has a wide temperature range of metastability, and it is identified as the "slow intermediate state" observed in kinetic experiments. These computational results are found to be in overall agreement with a large array of experimental data.

15.
MOTIVATION: Knots in polypeptide chains have been found in very few proteins, and consequently should be generally avoided in protein structure prediction methods. Most effective structure prediction methods do not model the protein folding process itself, but rather seek only to correctly obtain the final native state. Consequently, the mechanisms that prevent knots from occurring in native proteins are not relevant to the modeling process, and as a result, knots can occur with significantly higher frequency in protein models. Here we describe Knotfind, a simple algorithm for knot detection that is fast enough for structure prediction, where tens or hundreds of thousands of conformations may be sampled during the course of a prediction. We have used this algorithm to characterize knots in large populations of model structures generated for targets in CASP5 and CASP6 using the Rosetta homology-based modeling method. RESULTS: Analysis of CASP5 models suggested several possible avenues for introduction of knots into these models, and these insights were applied to structure prediction in CASP6, resulting in a significant decrease in the proportion of knotted models generated. Additionally, using the knot detection algorithm on structures in the Protein Data Bank, a previously unreported deep trefoil knot was found in acetylornithine transcarbamylase. AVAILABILITY: The Knotfind algorithm is available in the Rosetta structure prediction program at http://www.rosettacommons.org.

16.
Mottamal M, Zhang J, Lazaridis T. Proteins 2006, 62(4):996-1009
Using an implicit membrane model (IMM1), we examine whether the structure of the transmembrane domain of Glycophorin A (GpA) could be predicted based on energetic considerations alone. The energetics of native GpA shows that van der Waals interactions make the largest contribution to stability. Although specific electrostatic interactions are stabilizing, the overall electrostatic contribution is close to zero. The GXXXG motif contributes significantly to stability, but residues outside this motif contribute almost twice as much. To generate non-native states a global conformational search was done on two segments of GpA: an 18-residue peptide (GpA74-91) that is embedded in the membrane and a 29-residue peptide (GpA70-98) that has additional polar residues flanking the transmembrane region. Simulated annealing was done on a large number of conformations generated from parallel, antiparallel, left- and right-handed starting structures by rotating each helix at 20-degree intervals around its helical axis. Several crossing points along the helix dimer were considered. For 18-residue parallel topology, an ensemble of native-like structures was found at the lowest effective energy region; the effective energy is lowest for a right-handed structure with an RMSD of 1.0 Å from the solid-state NMR structure with correct orientation of the helices. For the 29-residue peptide, the effective energies of several left-handed structures were lower than that of the native, right-handed structure. This could be due to deficiencies in modeling the interactions between charged sidechains and/or omission of the sidechain entropy contribution to the free energy. For 18-residue antiparallel topology, both IMM1 and a Generalized Born model give effective energies that are lower than that of the native structure. In contrast, the Poisson-Boltzmann solvation model gives lower effective energy for the parallel topology, largely because the electrostatic solvation energy is more favorable for the parallel structure. IMM1 seems to underestimate the solvation free energy advantage when the CO and NH dipoles just outside the membrane are parallel. This highlights the importance of electrostatic interactions even when these are not obvious by looking at the structures.

17.
Recent work on the thermodynamics of protein denatured states is providing insight into the stability of residual structure and the conformational constraints that affect the disordered states of proteins. Current data from native state hydrogen exchange and the pH dependence of protein stability indicate that residual structure can modulate the stability of the denatured state by up to 4 kcal mol⁻¹. NMR structural data have emphasized the role of hydrophobic clusters in stabilizing denatured state residual structures; however, recent results indicate that electrostatic interactions, both favorable and unfavorable, are also important modulators of the stability of the denatured state. Thermodynamic methods that take advantage of histidine-heme ligation chemistry have also been developed to probe the conformational constraints that act on denatured states. These methods have provided insights into the role of excluded volume, chain stiffness, and loop persistence in modulating the conformational preferences of highly disordered proteins. New insights into protein folding and novel methods to manipulate protein stability are emerging from this work.

18.
Dong Xie, Ernesto Freire. Proteins 1994, 19(4):291-301
The heat-denatured state of proteins has usually been assumed to be a fully hydrated random coil. It is now evident that under certain solvent conditions or after chemical or genetic modifications, the protein molecule may exhibit a hydrophobic core and residual secondary structure after thermal denaturation. This state of the protein has been called the "compact denatured" or "molten globule" state. Recently it has been shown that α-lactalbumin at pH < 5 denatures into a molten globule state upon increasing the temperature (Griko, Y., Freire, E., Privalov, P. L. Biochemistry 33:1889–1899, 1994). This state has a lower heat capacity and a higher enthalpy at low temperatures than the unfolded state. At those temperatures the stabilization of the molten globule state is of an entropic origin since the enthalpy contributes unfavorably to the Gibbs free energy. Since the molten globule is more structured than the unfolded state and, therefore, is expected to have a lower configurational entropy, the net entropic gain must originate primarily from solvent related entropy arising from the hydrophobic effect, and to a lesser extent from protonation or electrostatic effects. In this work, we have examined a large ensemble of partly folded states derived from the native structure of α-lactalbumin in order to identify those states that satisfy the energetic criteria of the molten globule. It was found that only a few states satisfied the experimental constraints and that, furthermore, those states were part of the same structural family. In particular, the regions corresponding to the A, B, and C helices were found to be folded, while the β sheet and the D helix were found to be unfolded. At temperatures below 45°C the states exhibiting those structural characteristics are enthalpically higher than the unfolded state in agreement with the experimental data. Interestingly, those states have a heat capacity close to that observed for the acid pH compact denatured state of α-lactalbumin [980 cal (mol·K)⁻¹]. In addition, the folded regions of these states include those residues found to be highly protected by NMR hydrogen exchange experiments. This work represents an initial attempt to model the structural origin of the thermodynamic properties of partly folded states. The results suggest a number of structural features that are consistent with experimental data. © 1994 Wiley-Liss, Inc.
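
The statement that the molten globule is stabilized entropically when its enthalpy is unfavorable follows directly from the Gibbs relation. The short derivation below is a sketch; the subscripts MG (molten globule) and U (unfolded) and the sign convention (molten globule minus unfolded) are notational assumptions, not taken from the abstract.

```latex
\Delta G = G_{\mathrm{MG}} - G_{\mathrm{U}} = \Delta H - T\,\Delta S
% The molten globule is the more stable state when \Delta G < 0.
% At low temperature the abstract reports \Delta H = H_{\mathrm{MG}} - H_{\mathrm{U}} > 0, so
\Delta G < 0 \;\Longrightarrow\; T\,\Delta S > \Delta H > 0
\;\Longrightarrow\; \Delta S > \frac{\Delta H}{T} > 0
```

Because the configurational entropy of the more structured molten globule is expected to be lower than that of the unfolded state, the positive net entropy change must come from other contributions, which the abstract attributes mainly to solvent entropy from the hydrophobic effect.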

19.
Reliable prediction of free energy changes upon amino acid substitutions (ΔΔGs) is crucial to investigate their impact on protein stability and protein–protein interaction. Advances in experimental mutational scans allow high-throughput studies thanks to multiplex techniques. On the other hand, genomics initiatives provide a large amount of data on disease-related variants that can benefit from analyses with structure-based methods. Therefore, the computational field should keep the same pace and provide new tools for fast and accurate high-throughput ΔΔG calculations. In this context, the Rosetta modeling suite implements effective approaches to predict folding/unfolding ΔΔGs in a protein monomer upon amino acid substitutions and calculate the changes in binding free energy in protein complexes. However, their application can be challenging to users without extensive experience with Rosetta. Furthermore, Rosetta protocols for ΔΔG prediction are designed considering one variant at a time, making the setup of high-throughput screenings cumbersome. For these reasons, we devised RosettaDDGPrediction, a customizable Python wrapper designed to run free energy calculations on a set of amino acid substitutions using Rosetta protocols with little intervention from the user. Moreover, RosettaDDGPrediction assists with checking completed runs and aggregates raw data for multiple variants, as well as generates publication-ready graphics. We showed the potential of the tool in four case studies, including variants of uncertain significance in childhood cancer, proteins with known experimental unfolding ΔΔGs values, interactions between target proteins and disordered motifs, and phosphomimetics. RosettaDDGPrediction is available, free of charge and under GNU General Public License v3.0, at https://github.com/ELELAB/RosettaDDGPrediction.
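
As a rough illustration of what aggregating a ΔΔG over several variants involves, the Python sketch below averages replicate scores and takes the mutant minus wild-type difference. It does not use or reproduce the RosettaDDGPrediction interface: the function names, the input layout, the variant label, and the sign convention (positive means destabilizing) are all assumptions for this example.

```python
from statistics import mean


def ddg(wild_type_scores, mutant_scores):
    """Mean mutant score minus mean wild-type score (assumed sign convention)."""
    return mean(mutant_scores) - mean(wild_type_scores)


def aggregate_ddgs(runs):
    """Map each variant label to its ddG.

    `runs` maps a hypothetical variant label such as "A45G" to a
    (wild_type_scores, mutant_scores) pair of replicate score lists.
    """
    return {variant: ddg(wt, mut) for variant, (wt, mut) in runs.items()}
```

For example, `aggregate_ddgs({"A45G": ([-10.0, -10.2], [-8.0, -8.2])})` gives a value of about 2.0 for the hypothetical variant, in whatever energy units the input scores use.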

20.
The native state can be considered as a unique conformation of the protein molecule with the lowest free energy of residue contacts. In this case, all other conformations correspond to the denatured state. The degree of their compactness varies significantly. Under folding conditions, the compact denatured state rather than the random coil is in equilibrium with native protein. The balance between the main forces of protein folding, the solvophobic interactions and conformational entropy, suggests that some properties of the compact denatured state are close to those of native protein, whereas other properties are close to those of the random coil. To investigate the molecular structure of the compact denatured state, the method of molecular dynamics simulation seems to be very useful.
