1.
Katrin Reichel Olivier Fisette Tatjana Braun Oliver F. Lange Gerhard Hummer Lars V. Schäfer 《Proteins》2017,85(5):812-826
We critically test and validate the CS‐Rosetta methodology for de novo structure prediction of α‐helical membrane proteins (MPs) from NMR data, such as chemical shifts and NOE distance restraints. By systematically reducing the number and types of NOE restraints, we focus on determining the regime in which MP structures can be reliably predicted and pinpoint the boundaries of the approach. Five MPs of known structure were used as test systems, phototaxis sensory rhodopsin II (pSRII), a subdomain of pSRII, disulfide binding protein B (DsbB), microsomal prostaglandin E2 synthase‐1 (mPGES‐1), and translocator protein (TSPO). For pSRII and DsbB, where NMR and X‐ray structures are available, resolution‐adapted structural recombination (RASREC) CS‐Rosetta yields structures that are as close to the X‐ray structure as the published NMR structures if all available NMR data are used to guide structure prediction. For mPGES‐1 and Bacillus cereus TSPO, where only X‐ray crystal structures are available, highly accurate structures are obtained using simulated NMR data. One main advantage of RASREC CS‐Rosetta is its robustness with respect to even a drastic reduction of the number of NOEs. Close‐to‐native structures were obtained with one randomly picked long‐range NOE for every 14, 31, 38, and 8 residues for full‐length pSRII, the pSRII subdomain, TSPO, and DsbB, respectively, in addition to using chemical shifts. For mPGES‐1, atomically accurate structures could be predicted even from chemical shifts alone. Our results show that atomic level accuracy for helical membrane proteins is achievable with CS‐Rosetta using very sparse NOE restraint sets to guide structure prediction. Proteins 2017; 85:812–826. © 2016 Wiley Periodicals, Inc.
2.
Nederveen AJ Doreleijers JF Vranken W Miller Z Spronk CA Nabuurs SB Güntert P Livny M Markley JL Nilges M Ulrich EL Kaptein R Bonvin AM 《Proteins》2005,59(4):662-672
State-of-the-art methods based on CNS and CYANA were used to recalculate the nuclear magnetic resonance (NMR) solution structures of 500+ proteins for which coordinates and NMR restraints are available from the Protein Data Bank. Curated restraints were obtained from the BioMagResBank FRED database. Although the original NMR structures were determined by various methods, they all were recalculated by CNS and CYANA and refined subsequently by restrained molecular dynamics (CNS) in a hydrated environment. We present an extensive analysis of the results, in terms of various quality indicators generated by PROCHECK and WHAT_CHECK. On average, the quality indicators for packing and Ramachandran appearance moved one standard deviation closer to the mean of the reference database. The structural quality of the recalculated structures is discussed in relation to various parameters, including number of restraints per residue, NOE completeness and positional root mean square deviation (RMSD). Correlations between pairs of these quality indicators were generally low; for example, there is a weak correlation between the number of restraints per residue and the Ramachandran appearance according to WHAT_CHECK (r = 0.31). The set of recalculated coordinates constitutes a unified database of protein structures in which potential user- and software-dependent biases have been kept as small as possible. The database can be used by the structural biology community for further development of calculation protocols, validation tools, structure-based statistical approaches and modeling. The RECOORD database of recalculated structures is publicly available from http://www.ebi.ac.uk/msd/recoord.
3.
Li W Zhang Y Kihara D Huang YJ Zheng D Montelione GT Kolinski A Skolnick J 《Proteins》2003,53(2):290-306
TOUCHSTONEX, a new method for folding proteins that uses a small number of long-range contact restraints derived from NMR experimental NOE (nuclear Overhauser enhancement) data, is described. The method employs a new lattice-based, reduced model of proteins that explicitly represents C(alpha), C(beta), and the sidechain centers of mass. The force field consists of knowledge-based terms to produce protein-like behavior, including various short-range interactions, hydrogen bonding, and one-body, pairwise, and multibody long-range interactions. Contact restraints were incorporated into the force field as an NOE-specific pairwise potential. We evaluated the algorithm using a set of 125 proteins of various secondary structure types and lengths up to 174 residues. Using N/8 simulated, long-range sidechain contact restraints, where N is the number of residues, 108 proteins were folded to a C(alpha)-root-mean-square deviation (RMSD) from native below 6.5 Å. The average RMSD of the lowest RMSD structures for all 125 proteins (folded and unfolded) was 4.4 Å. The algorithm was also applied to limited experimental NOE data generated for three proteins. Using very few experimental sidechain contact restraints, and a small number of sidechain-main chain and main chain-main chain contact restraints, we folded all three proteins to low-to-medium resolution structures. The algorithm can be applied to the NMR structure determination process or other experimental methods that can provide tertiary restraint information, especially in the early stage of structure determination, when only limited data are available.
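The N/8 restraint density mentioned in the abstract above can be illustrated with a minimal sketch. It is not the TOUCHSTONEX code: the function name `pick_sparse_contacts`, the `min_separation` cutoff used to call a contact "long-range", and the random selection are all assumptions made for illustration only.

```python
import random


def pick_sparse_contacts(contacts, n_residues, residues_per_contact=8,
                         min_separation=6, seed=0):
    """Randomly pick about n_residues / residues_per_contact long-range contacts.

    contacts: iterable of (i, j) residue-index pairs.
    min_separation is an assumed sequence-separation cutoff for "long-range";
    the abstract does not state which cutoff was used.
    """
    # Normalise pair order, drop duplicates, keep only long-range pairs.
    long_range = sorted({(min(i, j), max(i, j)) for i, j in contacts
                         if abs(i - j) >= min_separation})
    # N/8 restraints by default, capped by how many candidates exist.
    n_pick = min(len(long_range), n_residues // residues_per_contact)
    rng = random.Random(seed)  # fixed seed so the selection is reproducible
    return sorted(rng.sample(long_range, n_pick))
```

For a 16-residue chain the default ratio gives two restraints; changing `residues_per_contact` to 12 or 14 would mimic the sparser regimes discussed in other entries of this list.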
4.
《Structure (London, England : 1993)》2019,27(11):1721-1734.e5
5.
The TASSER structure prediction algorithm is employed to investigate whether NMR structures can be moved closer to their corresponding X-ray counterparts by automatic refinement procedures. The benchmark protein dataset includes 61 nonhomologous proteins whose structures have been determined by both NMR and X-ray experiments. Interestingly, by starting from NMR structures, the majority (79%) of TASSER refined models show a structural shift toward their X-ray structures. On average, the TASSER refined models have a root-mean-square-deviation (RMSD) from the X-ray structure of 1.785 Å (1.556 Å) over the entire chain (aligned region), while the average RMSD between NMR and X-ray structures (RMSD(NMR_X-ray)) is 2.080 Å (1.731 Å). For all proteins having an RMSD(NMR_X-ray) >2 Å, the TASSER refined structures show consistent improvement. However, for the 34 proteins with an RMSD(NMR_X-ray) <2 Å, there are only 21 cases (60%) where the TASSER model is closer to the X-ray structure than NMR, which may be due to the inherent resolution of TASSER. We also compare the TASSER models with 12 NMR models in the RECOORD database that have been recalculated recently by Nederveen et al. from original NMR restraints using the newest molecular dynamics tools. In 8 of 12 cases, TASSER models show a smaller RMSD to X-ray structures; in 3 of 12 cases, where RMSD(NMR_X-ray) <1 Å, RECOORD does better than TASSER. These results suggest that TASSER can be a useful tool to improve the quality of NMR structures.
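The RMSD values quoted in the abstract above follow the usual definition, which can be sketched as below. This is a minimal illustration that assumes the two coordinate sets are already optimally superposed and contain the same atoms in the same order; the function name `rmsd` is chosen here, and the superposition step that real comparisons require is not shown.

```python
import math


def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two equally long lists of (x, y, z).

    Assumes the structures are already superposed; no fitting is performed.
    """
    if not coords_a or len(coords_a) != len(coords_b):
        raise ValueError("coordinate lists must be non-empty and equally long")
    # Sum of squared distances between corresponding atoms.
    total = sum((xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
                for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b))
    return math.sqrt(total / len(coords_a))
```

Restricting both lists to the aligned residues gives the "aligned region" figure; passing the full chain gives the "entire chain" figure.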
6.
Rosetta is a structure prediction package that has been employed successfully in numerous protein design and other applications [1]. Previous reports have attributed the current limitations of the Rosetta de novo structure prediction algorithm to inadequate sampling, particularly during the low-resolution phase [2-5]. Here, we implement the Simulated Tempering (ST) sampling algorithm [6,7] in Rosetta to address this issue. ST is intended to yield canonical sampling by inducing a random walk in temperature space such that broad sampling is achieved at high temperatures and detailed exploration of local free energy minima is achieved at low temperatures. ST should therefore visit basins in accordance with their free energies rather than their energies and achieve more global sampling than the localized scheme currently implemented in Rosetta. However, we find that ST does not improve structure prediction with Rosetta. To understand why, we carried out a detailed analysis of the low-resolution scoring functions and find that they do not provide a strong bias towards the native state. In addition, we find that both ST and standard Rosetta runs started from the native state are biased away from the native state. Although the low-resolution scoring functions could be improved, we propose that working entirely at full-atom resolution is now possible and may be a better option due to superior native-state discrimination at full-atom resolution. Such an approach will require more attention to the kinetics of convergence, however, as functions capable of native state discrimination are not necessarily capable of rapidly guiding non-native conformations to the native state.
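The random walk in temperature space described in the abstract above rests on a Metropolis test for temperature moves. The sketch below shows the textbook form of that test, not the Rosetta implementation: it assumes an extended ensemble in which a conformation with energy E at temperature level m has weight exp(-beta_m * E + g_m), and the function and parameter names are chosen here for illustration.

```python
import math


def st_accept_probability(energy, beta_old, beta_new, weight_old, weight_new):
    """Acceptance probability for a simulated-tempering temperature move.

    energy: current energy E of the conformation (unchanged by the move).
    beta_*: inverse temperatures of the current and proposed levels.
    weight_*: the per-level weights g_m, which are normally tuned so that
    all temperature levels are visited about equally often.
    """
    # Log of the ratio of extended-ensemble weights, new over old.
    log_ratio = -(beta_new - beta_old) * energy + (weight_new - weight_old)
    # min(1, exp(log_ratio)) without overflowing for large positive values.
    return 1.0 if log_ratio >= 0.0 else math.exp(log_ratio)
```

A move is then accepted when a uniform random number in [0, 1) falls below the returned probability.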
7.
Bian Li Michaela Fooksa Sten Heinze 《Critical reviews in biochemistry and molecular biology》2018,53(1):1-28
Prediction of protein tertiary structures from amino acid sequence and understanding the mechanisms of how proteins fold, collectively known as “the protein folding problem,” has been a grand challenge in molecular biology for over half a century. Theories have been developed that provide us with an unprecedented understanding of protein folding mechanisms. However, computational simulation of protein folding is still difficult, and prediction of protein tertiary structure from amino acid sequence is an unsolved problem. Progress toward a satisfying solution has been slow due to challenges in sampling the vast conformational space and deriving sufficiently accurate energy functions. Nevertheless, several techniques and algorithms have been adopted to overcome these challenges, and the last two decades have seen exciting advances in enhanced sampling algorithms, computational power and tertiary structure prediction methodologies. This review aims at summarizing these computational techniques, specifically conformational sampling algorithms and energy approximations that have been frequently used to study protein-folding mechanisms or to de novo predict protein tertiary structures. We hope that this review can serve as an overview on how the protein-folding problem can be studied computationally and, in cases where experimental approaches are prohibitive, help the researcher choose the most relevant computational approach for the problem at hand. We conclude with a summary of current challenges faced and an outlook on potential future directions.
8.
Markley JL Ulrich EL Berman HM Henrick K Nakamura H Akutsu H 《Journal of biomolecular NMR》2008,40(3):153-155
We describe the role of the BioMagResBank (BMRB) within the Worldwide Protein Data Bank (wwPDB) and recent policies affecting the deposition of biomolecular NMR data. All PDB depositions of structures based on NMR data must now be accompanied by experimental restraints. A scheme has been devised that allows depositors to specify a representative structure and to define residues within that structure found experimentally to be largely unstructured. The BMRB now accepts coordinate sets representing three-dimensional structural models based on experimental NMR data of molecules of biological interest that fall outside the guidelines of the Protein Data Bank (i.e., the molecule is a peptide with 23 or fewer residues, a polynucleotide with 3 or fewer residues, a polysaccharide with 3 or fewer sugar residues, or a natural product), provided that the coordinates are accompanied by a representation of the covalent structure of the molecule (atom connectivity), assigned NMR chemical shifts, and the structural restraints used in generating the model. The BMRB now contains an archive of NMR data for metabolites and other small molecules found in biological systems.
9.
Varun Mandalaparthy Venkata Ramana Sanaboyana Hitesh Rafalia Shachi Gosavi 《Proteins》2018,86(2):248-262
One of the main barriers to accurate computational protein structure prediction is searching the vast space of protein conformations. Distance restraints or inter‐residue contacts have been used to reduce this search space, easing the discovery of the correct folded state. It has been suggested that about 1 contact for every 12 residues may be sufficient to predict structure at fold level accuracy. Here, we use coarse‐grained structure‐based models in conjunction with molecular dynamics simulations to examine this empirical prediction. We generate sparse contact maps for 15 proteins of varying sequence lengths and topologies and find that given perfect secondary‐structural information, a small fraction of the native contact map (5%‐10%) suffices to fold proteins to their correct native states. We also find that different sparse maps are not equivalent and we make several observations about the type of maps that are successful at such structure prediction. Long range contacts are found to encode more information than shorter range ones, especially for α and αβ‐proteins. However, this distinction reduces for β‐proteins. Choosing contacts that are a consensus from successful maps gives predictive sparse maps as does choosing contacts that are well spread out over the protein structure. Additionally, the folding of proteins can also be used to choose predictive sparse maps. Overall, we conclude that structure‐based models can be used to understand the efficacy of structure‐prediction restraints and could, in future, be tuned to include specific force‐field interactions, secondary structure errors and noise in the sparse maps.
10.
Manuela Gorgel Andreas Bøggild Jakob Jensen Ulstrup Manfred S. Weiss Uwe Müller Poul Nissen Thomas Boesen 《Acta Crystallographica. Section D, Structural Biology》2015,71(5):1095-1101
Exploiting the anomalous signal of the intrinsic S atoms to phase a protein structure is advantageous, as ideally only a single well diffracting native crystal is required. However, sulfur is a weak anomalous scatterer at the typical wavelengths used for X‐ray diffraction experiments, and therefore sulfur SAD data sets need to be recorded with a high multiplicity. In this study, the structure of a small pilin protein was determined by sulfur SAD despite several obstacles such as a low anomalous signal (a theoretical Bijvoet ratio of 0.9% at a wavelength of 1.8 Å), radiation damage‐induced reduction of the cysteines and a multiplicity of only 5.5. The anomalous signal was improved by merging three data sets from different volumes of a single crystal, yielding a multiplicity of 17.5, and a sodium ion was added to the substructure of anomalous scatterers. In general, all data sets were balanced around the threshold values for a successful phasing strategy. In addition, a collection of statistics on structures from the PDB that were solved by sulfur SAD are presented and compared with the data. Looking at the quality indicator Ranom/Rp.i.m., an inconsistency in the documentation of the anomalous R factor is noted and reported.
11.
12.
Davide Sala Yuanpeng Janet Huang Casey A. Cole David A. Snyder Gaohua Liu Yojiro Ishida G.V.T. Swapna Kelly P. Brock Chris Sander Krzysztof Fidelis Andriy Kryshtafovych Masayori Inouye Roberto Tejero Homayoun Valafar Antonio Rosato Gaetano T. Montelione 《Proteins》2019,87(12):1315-1332
CASP13 has investigated the impact of sparse NMR data on the accuracy of protein structure prediction. NOESY and 15N-1H residual dipolar coupling data, typical of that obtained for 15N,13C-enriched, perdeuterated proteins up to about 40 kDa, were simulated for 11 CASP13 targets ranging in size from 80 to 326 residues. For several targets, two prediction groups generated models that are more accurate than those produced using baseline methods. Real NMR data collected for a de novo designed protein were also provided to predictors, including one data set in which only backbone resonance assignments were available. Some NMR-assisted prediction groups also did very well with these data. CASP13 also assessed whether incorporation of sparse NMR data improves the accuracy of protein structure prediction relative to nonassisted regular methods. In most cases, incorporation of sparse, noisy NMR data results in models with higher accuracy. The best NMR-assisted models were also compared with the best regular predictions of any CASP13 group for the same target. For six of 13 targets, the most accurate model provided by any NMR-assisted prediction group was more accurate than the most accurate model provided by any regular prediction group; however, for the remaining seven targets, one or more regular prediction method provided a more accurate model than even the best NMR-assisted model. These results suggest a novel approach for protein structure determination, in which advanced prediction methods are first used to generate structural models, and sparse NMR data is then used to validate and/or refine these models.
13.
David T. Jones Claire M. Moody Julia Uppenbrink John H. Viles Paul M. Doyle C. John Harris Laurence H. Pearl Peter J. Sadler Janet M. Thornton 《Proteins》1996,24(4):502-513
In response to the Paracelsus Challenge (Rose and Creamer, Proteins, 19:1–3, 1994), we present here the design, synthesis, and characterization of a helical protein, whose sequence is 50% identical to that of an all-β protein. The new sequence was derived by applying an inverse protein folding approach, in which the sequence was optimized to “fit” the new helical structure, but constrained to retain 50% of the original amino acid residues. The program utilizes a genetic algorithm to optimize the sequence, together with empirical potentials of mean force to evaluate the sequence-structure compatibility. Although the designed sequence has little ordered (secondary) structure in water, circular dichroism and nuclear magnetic resonance data show clear evidence for significant helical content in water/ethylene glycol and in water/methanol mixtures at low temperatures, as well as melting behavior indicative of cooperative folding. We believe that this represents a significant step toward meeting the Paracelsus Challenge.
14.
Chu Wang Robert Vernon Oliver Lange Michael Tyka David Baker 《Protein science : a publication of the Protein Society》2010,19(3):494-506
Metal ions play an essential role in stabilizing protein structures and contributing to protein function. Ions such as zinc have well‐defined coordination geometries, but it has not been easy to take advantage of this knowledge in protein structure prediction efforts. Here, we present a computational method to predict structures of zinc‐binding proteins given knowledge of the positions of zinc‐coordinating residues in the amino acid sequence. The method takes advantage of the “atom‐tree” representation of molecular systems and modular architecture of the Rosetta3 software suite to incorporate explicit metal ion coordination geometry into previously developed de novo prediction and loop modeling protocols. Zinc cofactors are tethered to their interacting residues based on coordination geometries observed in natural zinc‐binding proteins. The incorporation of explicit zinc atoms and their coordination geometry in both de novo structure prediction and loop modeling significantly improves sampling near the native conformation. The method can be readily extended to predict protein structures bound to other metal and/or small chemical cofactors with well‐defined coordination or ligation geometry.
15.
Computational methods that produce accurate protein structure models from limited experimental data, for example, from nuclear magnetic resonance (NMR) spectroscopy, hold great potential for biomedical research. The NMR-assisted modeling challenge in CASP13 provided a blind test to explore the capabilities and limitations of current modeling techniques in leveraging NMR data which had high sparsity, ambiguity, and error rate for protein structure prediction. We describe our approach to predict the structure of these proteins leveraging the Rosetta software suite. Protein structure models were predicted de novo using a two-stage protocol. First, low-resolution models were generated with the Rosetta de novo method guided by nonambiguous nuclear Overhauser effect (NOE) contacts and residual dipolar coupling (RDC) restraints. Second, iterative model hybridization and fragment insertion with the Rosetta comparative modeling method was used to refine and regularize models guided by all ambiguous and nonambiguous NOE contacts and RDCs. Nine out of 16 of the Rosetta de novo models had the correct fold (global distance test total score > 45) and in three cases high-resolution models were achieved (root-mean-square deviation < 3.5 Å). We also show that a meta-approach applying iterative Rosetta + NMR refinement on server-predicted models which employed non-NMR-contacts and structural templates leads to substantial improvement in model quality. Integrating these data-assisted refinement strategies with innovative non-data-assisted approaches which became possible in CASP13 such as high precision contact prediction will in the near future enable structure determination for large proteins that are outside of the realm of conventional NMR.
16.
R. S. DeWitte S. W. Michnick E. I. Shakhnovich 《Protein science : a publication of the Protein Society》1995,4(9):1780-1791
We present an efficient new algorithm that enumerates all possible conformations of a protein that satisfy a given set of distance restraints. Rapid growth of all possible self-avoiding conformations on the diamond lattice provides construction of alpha-carbon representations of a protein fold. We investigated the dependence of the number of conformations on pairwise distance restraints for the proteins crambin, pancreatic trypsin inhibitor, and ubiquitin. Knowledge of between one and two contacts per monomer is shown to be sufficient to restrict the number of candidate structures to approximately 1,000 conformations. Pairwise RMS deviations of atomic position comparisons between pairs of these 1,000 structures revealed that these conformations can be grouped into about 25 families of structures. These results suggest a new approach to assessing alternative protein folds given a very limited number of distance restraints. Such restraints are available from several experimental techniques such as NMR, NOESY, energy transfer fluorescence spectroscopy, and crosslinking experiments. This work focuses on exhaustive enumeration of protein structures with emphasis on the possible use of NOESY-determined distance restraints.
17.
Distinguishing native from non-native folds remains a challenging problem for protein structure prediction. We describe a method, SCA-distance scoring, based on results from statistical coupling analysis which discriminates between native and non-native folds produced by a de novo protein structure prediction method for four out of five test proteins. The method is particularly good at discriminating non-native folds which are close in RMSD to the true fold but contain a change in an internal structural element. SCA-distance scoring is a useful addition to the tools available for distinguishing native from non-native folds in protein structure prediction.
18.
Go A Kim S Baum J Hecht MH 《Protein science : a publication of the Protein Society》2008,17(5):821-832
Libraries of de novo proteins provide an opportunity to explore the structural and functional potential of biological molecules that have not been biased by billions of years of evolutionary selection. Given the enormity of sequence space, a rational approach to library design is likely to yield a higher fraction of folded and functional proteins than a stochastic sampling of random sequences. We previously investigated the potential of library design by binary patterning of hydrophobic and hydrophilic amino acids. The structure of the most stable protein from a binary patterned library of de novo 4-helix bundles was solved previously and shown to be consistent with the design. One structure, however, cannot fully assess the potential of the design strategy, nor can it account for differences in the stabilities of individual proteins. To more fully probe the quality of the library, we now report the NMR structure of a second protein, S-836. Protein S-836 proved to be a 4-helix bundle, consistent with design. The similarity between the two solved structures reinforces previous evidence that binary patterning can encode stable, 4-helix bundles. Despite their global similarities, the two proteins have cores that are packed at different degrees of tightness. The relationship between packing and dynamics was probed using the Modelfree approach, which showed that regions containing a high frequency of chemical exchange coincide with less well-packed side chains. These studies show (1) that binary patterning can drive folding into a particular topology without the explicit design of residue-by-residue packing, and (2) that within a superfamily of binary patterned proteins, the structures and dynamics of individual proteins are modulated by the identity and packing of residues in the hydrophobic core.
19.
Peter W.A. Howe 《Journal of biomolecular NMR》2001,20(1):61-70
One important problem when calculating structures of biomolecules from NMR data is distinguishing converged structures from outlier structures. This paper describes how Principal Components Analysis (PCA) has the potential to classify calculated structures automatically, according to correlated structural variation across the population. PCA analysis has the additional advantage that it highlights regions of proteins which are varying across the population. To apply PCA, protein structures have to be reduced in complexity and this paper describes two different representations of protein structures which achieve this. The calculated structures of a 28 amino acid peptide are used to demonstrate the methods. The two different representations of protein structure are shown to give equivalent results, and correct results are obtained even though the ensemble of structures used as an example contains two different protein conformations. The PCA analysis also correctly identifies the structural differences between the two conformations.
20.
C Sander G Vriend F Bazan A Horovitz H Nakamura L Ribas A V Finkelstein A Lockhart R Merkl L J Perry 《Proteins》1992,12(2):105-110
What is the current state of the art in protein design? This question was approached in a recent two-week protein design workshop sponsored by EMBO and held at the EMBL in Heidelberg. The goals were to test available design tools and to explore new design strategies. Five novel proteins were designed: Shpilka, a sandwich of two four-stranded β-sheets, a scaffold on which to explore variations in loop topology; Grendel, a four-helical membrane anchor, ready for fusion to water-soluble functional domains; Fingerclasp, a dimer of interdigitating β–β–α units, the simplest variant of the “handshake” structural class; Aida, an antibody binding surface intended to be specific for flavodoxin; Leather, a minimal NAD binding domain, extracted from a larger protein. Each design is available as a set of three-dimensional coordinates, the corresponding amino acid sequence and a set of analytical results. The designs are placed in the public domain for scrutiny, improvement, and possible experimental verification.