首页 | 本学科首页   官方微博 | 高级检索  
相似文献
 共查询到20条相似文献,搜索用时 15 毫秒
1.
Misura KM  Baker D 《Proteins》2005,59(1):15-29
Achieving atomic level accuracy in de novo structure prediction presents a formidable challenge even in the context of protein models with correct topologies. High-resolution refinement is a fundamental test of force field accuracy and sampling methodology, and its limited success in both comparative modeling and de novo prediction contexts highlights the limitations of current approaches. We constructed four tests to identify bottlenecks in our current approach and to guide progress in this challenging area. The first three tests showed that idealized native structures are stable under our refinement simulation conditions and that the refinement protocol can significantly decrease the root mean square deviation (RMSD) of perturbed native structures. In the fourth test we applied the refinement protocol to de novo models and showed that accurate models could be identified based on their energies, and in several cases many of the buried side chains adopted native-like conformations. We also showed that the differences in backbone and side-chain conformations between the refined de novo models and the native structures are largely localized to loop regions and regions where the native structure has unusual features such as rare rotamers or atypical hydrogen bonding between beta-strands. The refined de novo models typically have higher energies than refined idealized native structures, indicating that sampling of local backbone conformations and side-chain packing arrangements in a condensed state is a primary obstacle.  相似文献   

2.
Conformational sampling is one of the bottlenecks in fragment-based protein structure prediction approaches. They generally start with a coarse-grained optimization where mainchain atoms and centroids of side chains are considered, followed by a fine-grained optimization with an all-atom representation of proteins. It is during this coarse-grained phase that fragment-based methods sample intensely the conformational space. If the native-like region is sampled more, the accuracy of the final all-atom predictions may be improved accordingly. In this work we present EdaFold, a new method for fragment-based protein structure prediction based on an Estimation of Distribution Algorithm. Fragment-based approaches build protein models by assembling short fragments from known protein structures. Whereas the probability mass functions over the fragment libraries are uniform in the usual case, we propose an algorithm that learns from previously generated decoys and steers the search toward native-like regions. A comparison with Rosetta AbInitio protocol shows that EdaFold is able to generate models with lower energies and to enhance the percentage of near-native coarse-grained decoys on a benchmark of [Formula: see text] proteins. The best coarse-grained models produced by both methods were refined into all-atom models and used in molecular replacement. All atom decoys produced out of EdaFold's decoy set reach high enough accuracy to solve the crystallographic phase problem by molecular replacement for some test proteins. EdaFold showed a higher success rate in molecular replacement when compared to Rosetta. Our study suggests that improving low resolution coarse-grained decoys allows computational methods to avoid subsequent sampling issues during all-atom refinement and to produce better all-atom models. EdaFold can be downloaded from http://www.riken.jp/zhangiru/software/.  相似文献   

3.
Zhu J  Xie L  Honig B 《Proteins》2006,65(2):463-479
In this article, we present an iterative, modular optimization (IMO) protocol for the local structure refinement of protein segments containing secondary structure elements (SSEs). The protocol is based on three modules: a torsion-space local sampling algorithm, a knowledge-based potential, and a conformational clustering algorithm. Alternative methods are tested for each module in the protocol. For each segment, random initial conformations were constructed by perturbing the native dihedral angles of loops (and SSEs) of the segment to be refined while keeping the protein body fixed. Two refinement procedures based on molecular mechanics force fields - using either energy minimization or molecular dynamics - were also tested but were found to be less successful than the IMO protocol. We found that DFIRE is a particularly effective knowledge-based potential and that clustering algorithms that are biased by the DFIRE energies improve the overall results. Results were further improved by adding an energy minimization step to the conformations generated with the IMO procedure, suggesting that hybrid strategies that combine both knowledge-based and physical effective energy functions may prove to be particularly effective in future applications.  相似文献   

4.
Fragment assembly is a powerful method of protein structure prediction that builds protein models from a pool of candidate fragments taken from known structures. Stochastic sampling is subsequently used to refine the models. The structures are first represented as coarse-grained models and then as all-atom models for computational efficiency. Many models have to be generated independently due to the stochastic nature of the sampling methods used to search for the global minimum in a complex energy landscape. In this paper we present , a fragment-based approach which shares information between the generated models and steers the search towards native-like regions. A distribution over fragments is estimated from a pool of low energy all-atom models. This iteratively-refined distribution is used to guide the selection of fragments during the building of models for subsequent rounds of structure prediction. The use of an estimation of distribution algorithm enabled to reach lower energy levels and to generate a higher percentage of near-native models. uses an all-atom energy function and produces models with atomic resolution. We observed an improvement in energy-driven blind selection of models on a benchmark of in comparison with the AbInitioRelax protocol.  相似文献   

5.
We present CATHEDRAL, an iterative protocol for determining the location of previously observed protein folds in novel multidomain protein structures. CATHEDRAL builds on the features of a fast secondary-structure-based method (using graph theory) to locate known folds within a multidomain context and a residue-based, double-dynamic programming algorithm, which is used to align members of the target fold groups against the query protein structure to identify the closest relative and assign domain boundaries. To increase the fidelity of the assignments, a support vector machine is used to provide an optimal scoring scheme. Once a domain is verified, it is excised, and the search protocol is repeated in an iterative fashion until all recognisable domains have been identified. We have performed an initial benchmark of CATHEDRAL against other publicly available structure comparison methods using a consensus dataset of domains derived from the CATH and SCOP domain classifications. CATHEDRAL shows superior performance in fold recognition and alignment accuracy when compared with many equivalent methods. If a novel multidomain structure contains a known fold, CATHEDRAL will locate it in 90% of cases, with <1% false positives. For nearly 80% of assigned domains in a manually validated test set, the boundaries were correctly delineated within a tolerance of ten residues. For the remaining cases, previously classified domains were very remotely related to the query chain so that embellishments to the core of the fold caused significant differences in domain sizes and manual refinement of the boundaries was necessary. To put this performance in context, a well-established sequence method based on hidden Markov models was only able to detect 65% of domains, with 33% of the subsequent boundaries assigned within ten residues. Since, on average, 50% of newly determined protein structures contain more than one domain unit, and typically 90% or more of these domains are already classified in CATH, CATHEDRAL will considerably facilitate the automation of protein structure classification.  相似文献   

6.
The increasing importance of non-coding RNA in biology and medicine has led to a growing interest in the problem of RNA 3-D structure prediction. As is the case for proteins, RNA 3-D structure prediction methods require two key ingredients: an accurate energy function and a conformational sampling procedure. Both are only partly solved problems. Here, we focus on the problem of conformational sampling. The current state of the art solution is based on fragment assembly methods, which construct plausible conformations by stringing together short fragments obtained from experimental structures. However, the discrete nature of the fragments necessitates the use of carefully tuned, unphysical energy functions, and their non-probabilistic nature impairs unbiased sampling. We offer a solution to the sampling problem that removes these important limitations: a probabilistic model of RNA structure that allows efficient sampling of RNA conformations in continuous space, and with associated probabilities. We show that the model captures several key features of RNA structure, such as its rotameric nature and the distribution of the helix lengths. Furthermore, the model readily generates native-like 3-D conformations for 9 out of 10 test structures, solely using coarse-grained base-pairing information. In conclusion, the method provides a theoretical and practical solution for a major bottleneck on the way to routine prediction and simulation of RNA structure and dynamics in atomic detail.  相似文献   

7.

Background

The quality of X-ray crystallographic models for biomacromolecules refined from data obtained at high-resolution is assured by the data itself. However, at low-resolution, >3.0 Å, additional information is supplied by a forcefield coupled with an associated refinement protocol. These resulting structures are often of lower quality and thus unsuitable for downstream activities like structure-based drug discovery.

Methodology

An X-ray crystallography refinement protocol that enhances standard methodology by incorporating energy terms from the HINT (Hydropathic INTeractions) empirical forcefield is described. This protocol was tested by refining synthetic low-resolution structural data derived from 25 diverse high-resolution structures, and referencing the resulting models to these structures. The models were also evaluated with global structural quality metrics, e.g., Ramachandran score and MolProbity clashscore. Three additional structures, for which only low-resolution data are available, were also re-refined with this methodology.

Results

The enhanced refinement protocol is most beneficial for reflection data at resolutions of 3.0 Å or worse. At the low-resolution limit, ≥4.0 Å, the new protocol generated models with Cα positions that have RMSDs that are 0.18 Å more similar to the reference high-resolution structure, Ramachandran scores improved by 13%, and clashscores improved by 51%, all in comparison to models generated with the standard refinement protocol. The hydropathic forcefield terms are at least as effective as Coulombic electrostatic terms in maintaining polar interaction networks, and significantly more effective in maintaining hydrophobic networks, as synthetic resolution is decremented. Even at resolutions ≥4.0 Å, these latter networks are generally native-like, as measured with a hydropathic interactions scoring tool.  相似文献   

8.

Background  

Protein tertiary structure prediction is a fundamental problem in computational biology and identifying the most native-like model from a set of predicted models is a key sub-problem. Consensus methods work well when the redundant models in the set are the most native-like, but fail when the most native-like model is unique. In contrast, structure-based methods score models independently and can be applied to model sets of any size and redundancy level. Additionally, structure-based methods have a variety of important applications including analogous fold recognition, refinement of sequence-structure alignments, and de novo prediction. The purpose of this work was to develop a structure-based model selection method based on predicted structural features that could be applied successfully to any set of models.  相似文献   

9.
Determining an accurate initial native-like protein fold is one of the most important and time-consuming steps of de novo NMR structure determination. Here we demonstrate that high-quality native-like models can be rapidly generated from initial structures obtained using limited NOE assignments, through replica exchange molecular dynamics refinement with a generalized Born implicit solvent (REX/GB). Conventional structure calculations using an initial sparse NOE set were unable to identify a unique topology for the zinc-bound C-terminal domain of E. coli chaperone Hsp33, due to a lack of unambiguous long range NOEs. An accurate overall topology was eventually obtained through laborious hand identification of long range NOEs. However we were able to obtain high-quality models with backbone RMSD values of about 2 Å with respect to the final structures, using REX/GB refinement with the original limited set of initial NOE restraints. These models could then be used to make further assignments of ambiguous NOEs and thereby speed up the structure determination process. The ability to calculate accurate starting structures from the limited unambiguous NOE set available at the beginning of a structure calculation offers the potential of a much more rapid and automated process for NMR structure determination. Jianhan Chen: Authors contributed equally to this work.Hyung-Sik Won: Authors contributed equally to this work.  相似文献   

10.
The prediction of protein structure from sequence remains a major unsolved problem in biology. The most successful protein structure prediction methods make use of a divide-and-conquer strategy to attack the problem: a conformational sampling method generates plausible candidate structures, which are subsequently accepted or rejected using an energy function. Conceptually, this often corresponds to separating local structural bias from the long-range interactions that stabilize the compact, native state. However, sampling protein conformations that are compatible with the local structural bias encoded in a given protein sequence is a long-standing open problem, especially in continuous space. We describe an elegant and mathematically rigorous method to do this, and show that it readily generates native-like protein conformations simply by enforcing compactness. Our results have far-reaching implications for protein structure prediction, determination, simulation, and design.  相似文献   

11.
We present a CPU efficient protocol for refinement of protein structures in a thin layer of explicit solvent and energy parameters with completely revised dihedral angle terms. Our approach is suitable for protein structures determined by theoretical (e.g., homology modeling or threading) or experimental methods (e.g., NMR). In contrast to other recently proposed refinement protocols, we put a strong emphasis on consistency with widely accepted covalent parameters and computational efficiency. We illustrate the method for NMR structure calculations of three proteins: interleukin-4, ubiquitin, and crambin. We show a comparison of their structure ensembles before and after refinement in water with and without a force field energy term for the dihedral angles; crambin was also refined in DMSO. Our results demonstrate the significant improvement of structure quality by a short refinement in a thin layer of solvent. Further, they show that a dihedral angle energy term in the force field is beneficial for structure calculation and refinement. We discuss the optimal weight for the energy constant for the backbone angle omega and include an extensive discussion of meaning and relevance of the calculated validation criteria, in particular root mean square Z scores for covalent parameters such as bond lengths.  相似文献   

12.
Structural refinement of predicted models of biological macromolecules using atomistic or coarse‐grained molecular force fields having various degree of error is investigated. The goal of this analysis is to estimate what is the probability for designing an effective structural refinement based on computations of conformational energies using force field, and starting from a structure predicted from the sequence (using template‐based or template‐free modeling), and refining it to bring the structure into closer proximity to the native state. It is widely believed that it should be possible to develop such a successful structure refinement algorithm by applying an iterative procedure with stochastic sampling and appropriate energy function, which assesses the quality (correctness) of protein decoys. Here, an analysis of noise in an artificially introduced scoring function is investigated for a model of an ideal sampling scheme, where the underlying distribution of RMSDs is assumed to be Gaussian. Sampling of the conformational space is performed by random generation of RMSD values. We demonstrate that whenever the random noise in a force field exceeds some level, it is impossible to obtain reliable structural refinement. The magnitude of the noise, above which a structural refinement, on average is impossible, depends strongly on the quality of sampling scheme and a size of the protein. Finally, possible strategies to overcome the intrinsic limitations in the force fields for impacting the development of successful refinement algorithms are discussed. Proteins 2012. © 2011 Wiley Periodicals, Inc.  相似文献   

13.
14.
Raval A  Piana S  Eastwood MP  Dror RO  Shaw DE 《Proteins》2012,80(8):2071-2079
Accurate computational prediction of protein structure represents a longstanding challenge in molecular biology and structure-based drug design. Although homology modeling techniques are widely used to produce low-resolution models, refining these models to high resolution has proven difficult. With long enough simulations and sufficiently accurate force fields, molecular dynamics (MD) simulations should in principle allow such refinement, but efforts to refine homology models using MD have for the most part yielded disappointing results. It has thus far been unclear whether MD-based refinement is limited primarily by accessible simulation timescales, force field accuracy, or both. Here, we examine MD as a technique for homology model refinement using all-atom simulations, each at least 100 μs long-more than 100 times longer than previous refinement simulations-and a physics-based force field that was recently shown to successfully fold a structurally diverse set of fast-folding proteins. In MD simulations of 24 proteins chosen from the refinement category of recent Critical Assessment of Structure Prediction (CASP) experiments, we find that in most cases, simulations initiated from homology models drift away from the native structure. Comparison with simulations initiated from the native structure suggests that force field accuracy is the primary factor limiting MD-based refinement. This problem can be mitigated to some extent by restricting sampling to the neighborhood of the initial model, leading to structural improvement that, while limited, is roughly comparable to the leading alternative methods.  相似文献   

15.
The structure of human protein HSPC034 has been determined by both solution nuclear magnetic resonance (NMR) spectroscopy and X-ray crystallography. Refinement of the NMR structure ensemble, using a Rosetta protocol in the absence of NMR restraints, resulted in significant improvements not only in structure quality, but also in molecular replacement (MR) performance with the raw X-ray diffraction data using MOLREP and Phaser. This method has recently been shown to be generally applicable with improved MR performance demonstrated for eight NMR structures refined using Rosetta (Qian et al., Nature 2007;450:259-264). Additionally, NMR structures of HSPC034 calculated by standard methods that include NMR restraints have improvements in the RMSD to the crystal structure and MR performance in the order DYANA, CYANA, XPLOR-NIH, and CNS with explicit water refinement (CNSw). Further Rosetta refinement of the CNSw structures, perhaps due to more thorough conformational sampling and/or a superior force field, was capable of finding alternative low energy protein conformations that were equally consistent with the NMR data according to the Recall, Precision, and F-measure (RPF) scores. On further examination, the additional MR-performance shortfall for NMR refined structures as compared with the X-ray structure were attributed, in part, to crystal-packing effects, real structural differences, and inferior hydrogen bonding in the NMR structures. A good correlation between a decrease in the number of buried unsatisfied hydrogen-bond donors and improved MR performance demonstrates the importance of hydrogen-bond terms in the force field for improving NMR structures. The superior hydrogen-bond network in Rosetta-refined structures demonstrates that correct identification of hydrogen bonds should be a critical goal of NMR structure refinement. Inclusion of nonbivalent hydrogen bonds identified from Rosetta structures as additional restraints in the structure calculation results in NMR structures with improved MR performance.  相似文献   

16.
Park H  Ko J  Joo K  Lee J  Seok C  Lee J 《Proteins》2011,79(9):2725-2734
The rapid increase in the number of experimentally determined protein structures in recent years enables us to obtain more reliable protein tertiary structure models than ever by template-based modeling. However, refinement of template-based models beyond the limit available from the best templates is still needed for understanding protein function in atomic detail. In this work, we develop a new method for protein terminus modeling that can be applied to refinement of models with unreliable terminus structures. The energy function for terminus modeling consists of both physics-based and knowledge-based potential terms with carefully optimized relative weights. Effective sampling of both the framework and terminus is performed using the conformational space annealing technique. This method has been tested on a set of termini derived from a nonredundant structure database and two sets of termini from the CASP8 targets. The performance of the terminus modeling method is significantly improved over our previous method that does not employ terminus refinement. It is also comparable or superior to the best server methods tested in CASP8. The success of the current approach suggests that similar strategy may be applied to other types of refinement problems such as loop modeling or secondary structure rearrangement.  相似文献   

17.
The most probable secondary structure of an RNA molecule, given the nucleotide sequence, can be computed efficiently if a stochastic context-free grammar (SCFG) is used as the prior distribution of the secondary structure. The structures of some RNA molecules contain so-called pseudoknots. Allowing all possible configurations of pseudoknots is not compatible with context-free grammar models and makes the search for an optimal secondary structure NP-complete. We suggest a probabilistic model for RNA secondary structures with pseudoknots and present a Markov-chain Monte-Carlo Method for sampling RNA structures according to their posterior distribution for a given sequence. We favor Bayesian sampling over optimization methods in this context, because it makes the uncertainty of RNA structure predictions assessable. We demonstrate the benefit of our method in examples with tmRNA and also with simulated data. McQFold, an implementation of our method, is freely available from http://www.cs.uni-frankfurt.de/~metzler/McQFold.  相似文献   

18.
Crystals of many important biological macromolecules diffract to limited resolution, rendering accurate model building and refinement difficult and time-consuming. We present a torsional optimization protocol that is applicable to many such situations and combines Protein Data Bank-based torsional optimization with real-space refinement against the electron density derived from crystallography or cryo-electron microscopy. Our method converts moderate- to low-resolution structures at initial (e.g., backbone trace only) or late stages of refinement to structures with increased numbers of hydrogen bonds, improved crystallographic R-factors, and superior backbone geometry. This automated method is applicable to DNA-binding and membrane proteins of any size and will aid studies of structural biology by improving model quality and saving considerable effort. The method can be extended to improve NMR and other structures. Our backbone score and its sequence profile provide an additional standard tool for evaluating structural quality.  相似文献   

19.
Chen J  Brooks CL 《Proteins》2007,67(4):922-930
Recent advances in efficient and accurate treatment of solvent with the generalized Born approximation (GB) have made it possible to substantially refine the protein structures generated by various prediction tools through detailed molecular dynamics simulations. As demonstrated in a recent CASPR experiment, improvement can be quite reliably achieved when the initial models are sufficiently close to the native basin (e.g., 3-4 A C(alpha) RMSD). A key element to effective refinement is to incorporate reliable structural information into the simulation protocol. Without intimate knowledge of the target and prediction protocol used to generate the initial structural models, it can be assumed that the regular secondary structure elements (helices and strands) and overall fold topology are largely correct to start with, such that the protocol limits itself to the scope of refinement and focuses the sampling in vicinity of the initial structure. The secondary structures can be enforced by dihedral restraints and the topology through structural contacts, implemented as either multiple pair-wise C(alpha) distance restraints or a single sidechain distance matrix restraint. The restraints are weakly imposed with flat-bottom potentials to allow sufficient flexibility for structural rearrangement. Refinement is further facilitated by enhanced sampling of advanced techniques such as the replica exchange method (REX). In general, for single domain proteins of small to medium sizes, 3-5 nanoseconds of REX/GB refinement simulations appear to be sufficient for reasonable convergence. Clustering of the resulting structural ensembles can yield refined models over 1.0 A closer to the native structure in C(alpha) RMSD. Substantial improvement of sidechain contacts and rotamer states can also be achieved in most cases. Additional improvement is possible with longer sampling and knowledge of the robust structural features in the initial models for a given prediction protocol. Nevertheless, limitations still exist in sampling as well as force field accuracy, manifested as difficulty in refinement of long and flexible loops.  相似文献   

20.
Molecular replacement (MR) is widely used for addressing the phase problem in X-ray crystallography. Historically, crystallographers have had limited success using NMR structures as MR search models. Here, we report a comprehensive investigation of the utility of protein NMR ensembles as MR search models, using data for 25 pairs of X-ray and NMR structures solved and refined using modern NMR methods. Starting from NMR ensembles prepared by an improved protocol, FindCore, correct MR solutions were obtained for 22 targets. Based on these solutions, automatic model rebuilding could be done successfully. Rosetta refinement of NMR structures provided MR solutions for another two proteins. We also demonstrate that such properly prepared NMR ensembles and X-ray crystal structures have similar performance when used as MR search models for homologous structures, particularly for targets with sequence identity >40%.  相似文献   

设为首页 | 免责声明 | 关于勤云 | 加入收藏

Copyright©北京勤云科技发展有限公司  京ICP备09084417号