Found 20 similar articles (search time: 15 ms)
1.
Bonneau R Ruczinski I Tsai J Baker D 《Protein science : a publication of the Protein Society》2002,11(8):1937-1944
Although much of the motivation for experimental studies of protein folding is to obtain insights for improving protein structure prediction, there has been relatively little connection between experimental protein folding studies and computational structural prediction work in recent years. In the present study, we show that the relationship between protein folding rates and the contact order (CO) of the native structure has implications for ab initio protein structure prediction. Rosetta ab initio folding simulations produce a dearth of high CO structures and an excess of low CO structures, as expected if the computer simulations mimic to some extent the actual folding process. Consistent with this, the majority of failures in ab initio prediction in the CASP4 (critical assessment of structure prediction) experiment involved high CO structures likely to fold much more slowly than the lower CO structures for which reasonable predictions were made. This bias against high CO structures can be partially alleviated by performing large numbers of additional simulations, selecting out the higher CO structures, and eliminating the very low CO structures; this leads to a modest improvement in prediction quality. More significant improvements in predictions for proteins with complex topologies may be possible following significant increases in high-performance computing power, which will be required for thoroughly sampling high CO conformations (high CO proteins can take six orders of magnitude longer to fold than low CO proteins). Importantly for such a strategy, simulations performed for high CO structures converge much less strongly than those for low CO structures, and hence, lack of simulation convergence can indicate the need for improved sampling of high CO conformations. 
The parallels between Rosetta simulations and folding in vivo may extend to misfolding: The very low CO structures that accumulate in Rosetta simulations consist primarily of local up-down beta-sheets that may resemble precursors to amyloid formation.
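The contact order discussed in the abstract above can be illustrated with a short calculation. The sketch below computes relative contact order as the average sequence separation of contacting residue pairs divided by chain length. It is a simplified illustration, not the Rosetta implementation: the function name `relative_contact_order`, the use of one representative atom per residue, the 8 Å cutoff, and the inclusion of sequence neighbours are all assumptions made here.

```python
import math


def relative_contact_order(coords, cutoff=8.0, min_separation=1):
    """Relative contact order from one representative atom per residue.

    coords: list of (x, y, z) tuples in sequence order (e.g. C-alpha atoms).
    cutoff: contact distance in angstroms (assumed value for this sketch).
    min_separation: smallest sequence separation counted as a contact.
    """
    n_res = len(coords)
    total_separation = 0
    n_contacts = 0
    for i in range(n_res):
        for j in range(i + min_separation, n_res):
            if math.dist(coords[i], coords[j]) <= cutoff:
                total_separation += j - i  # sequence separation of this contact
                n_contacts += 1
    if n_contacts == 0:
        return 0.0
    # Average separation of contacts, normalised by chain length.
    return total_separation / (n_res * n_contacts)


# Toy example: four residues on a straight line, 3.8 angstroms apart.
chain = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0), (11.4, 0.0, 0.0)]
print(relative_contact_order(chain))
```

Structures dominated by local contacts (such as helices or local hairpins) give low values under this definition, while structures with many contacts between residues far apart in sequence give high values.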
2.
pi-pi, cation-pi, and hydrophobic packing interactions contribute specificity to protein folding and stability to the native state. As a step towards developing improved models of these interactions in proteins, we compare the side-chain packing arrangements in native proteins to those found in compact decoys produced by the Rosetta de novo structure prediction method. We find enrichments in the native distributions for T-shaped and parallel offset arrangements of aromatic residue pairs, in parallel stacked arrangements of cation-aromatic pairs, in parallel stacked pairs involving proline residues, and in parallel offset arrangements for aliphatic residue pairs. We then investigate the extent to which the distinctive features of native packing can be explained using Lennard-Jones and electrostatics models. Finally, we derive orientation-dependent pi-pi, cation-pi and hydrophobic interaction potentials based on the differences between the native and compact decoy distributions and investigate their efficacy for high-resolution protein structure prediction. Surprisingly, the orientation-dependent potential derived from the packing arrangements of aliphatic side-chain pairs distinguishes the native structure from compact decoys better than the orientation-dependent potentials describing pi-pi and cation-pi interactions.
3.
Proteins with complex, nonlocal beta-sheets are challenging for de novo structure prediction, due in part to the difficulty of efficiently sampling long-range strand pairings. We present a new, multilevel approach to beta-sheet structure prediction that circumvents this difficulty by reformulating structure generation in terms of a folding tree. Nonlocal connections in this tree allow us to explicitly sample alternative beta-strand pairings while simultaneously exploring local conformational space using backbone torsion-space moves. An iterative, energy-biased resampling strategy is used to explore the space of beta-strand pairings; we expect that such a strategy will be generally useful for searching large conformational spaces with a high degree of combinatorial complexity.
4.
Protein structure prediction in genomics (cited by: 1; self-citations: 0; citations by others: 1)
Jones DT 《Briefings in bioinformatics》2001,2(2):111-125
As the number of completely sequenced genomes rapidly increases, including now the complete Human Genome sequence, the post-genomic problems of genome-scale protein structure determination and the issue of gene function identification become ever more pressing. In fact, these problems can be seen as interrelated in that experimentally determining or predicting the structure of proteins encoded by genes of interest is one possible means to glean subtle hints as to the functions of these genes. The applicability of this approach to gene characterisation is reviewed, along with a brief survey of the reliability of large-scale protein structure prediction methods and the prospects for the development of new prediction methods.
5.
Protein β‐sheets often involve nonlocal interactions between parts of the polypeptide chain that are separated by hundreds of residues, raising the question of how these nonlocal contacts form. A recent study of the smallest β‐sheets found that their formation was not driven by signals hidden in the primary sequence. Instead, the strands in these sheets were either local in sequence, or, when separated by large sequential distances, the intervening residues were found to fold into compact modules that anchored distant parts of the chain in close spatial proximity. Here, we examine larger β‐sheets to investigate the extensibility of this principle. From an analysis of the β‐sheets in a nonredundant protein dataset, we find that a highly ordered hierarchical relationship exists in the intervening structure between nonlocal β‐strands. This observation is almost universal: virtually all β‐sheets, no matter their complexity, appear to adopt an antiparallel model to manage the nonlocal aspects of their assembly, one where the chain, having left the vicinity of an unfinished β‐sheet, retraces its steps via the same route to complete the initial sheet. Exceptions typically involve unstructured regions at chain termini. Moreover, an analysis of the residues involved in nonlocal cross‐strand interactions did not produce any evidence of a signal hidden in the sequence that might direct long‐range interactions. These results build on those reported for the smallest sheets, suggesting that sheet formation is either local in sequence or local in space following prior folding events that anchor disparate parts of the chain in close proximity. Proteins 2013. © 2012 Wiley Periodicals, Inc.
6.
Computational methods in protein structure prediction (cited by: 1; self-citations: 0; citations by others: 1)
Floudas CA 《Biotechnology and bioengineering》2007,97(2):207-213
This review presents the advances in protein structure prediction from the computational methods perspective. The approaches are classified into four major categories: comparative modeling, fold recognition, first principles methods that employ database information, and first principles methods without database information. Important advances along with current limitations and challenges are presented.
7.
Protein residues that are critical for structure and function are expected to be conserved throughout evolution. Here, we investigate the extent to which these conserved residues are clustered in three-dimensional protein structures. In 92% of the proteins in a data set of 79 proteins, the most conserved positions in multiple sequence alignments are significantly more clustered than randomly selected sets of positions. The comparison to random subsets is not necessarily appropriate, however, because the signal could be the result of differences in the amino acid composition of sets of conserved residues compared to random subsets (hydrophobic residues tend to be close together in the protein core), or differences in sequence separation of the residues in the different sets. In order to overcome these limits, we compare the degree of clustering of the conserved positions on the native structure and on alternative conformations generated by the de novo structure prediction method Rosetta. For 65% of the 79 proteins, the conserved residues are significantly more clustered in the native structure than in the alternative conformations, indicating that the clustering of conserved residues in protein structures goes beyond that expected purely from sequence locality and composition effects. The differences in the spatial distribution of conserved residues can be utilized in de novo protein structure prediction: We find that for 79% of the proteins, selection of the Rosetta generated conformations with the greatest clustering of the conserved residues significantly enriches the fraction of close-to-native structures.
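The first test described in the abstract above, comparing conserved positions against randomly selected positions, can be sketched as a simple permutation test. The code below is only an illustration under stated assumptions: the clustering measure (mean pairwise distance), the function names, and the empirical p-value are choices made here, not the authors' published procedure, and the sketch does not cover the more careful comparison against Rosetta-generated conformations.

```python
import math
import random
from itertools import combinations


def mean_pairwise_distance(coords, indices):
    """Average distance between all pairs of the selected residues."""
    pairs = list(combinations(indices, 2))
    return sum(math.dist(coords[i], coords[j]) for i, j in pairs) / len(pairs)


def clustering_p_value(coords, conserved, n_random=1000, seed=0):
    """Empirical p-value that random subsets are at least as clustered.

    coords: one (x, y, z) tuple per residue (hypothetical input format).
    conserved: indices of the most conserved positions.
    """
    rng = random.Random(seed)
    observed = mean_pairwise_distance(coords, conserved)
    hits = 0
    for _ in range(n_random):
        subset = rng.sample(range(len(coords)), len(conserved))
        if mean_pairwise_distance(coords, subset) <= observed:
            hits += 1
    # Add-one correction so the estimate is never exactly zero.
    return (hits + 1) / (n_random + 1)


# Toy example: 20 residues on a line; the "conserved" ones are neighbours.
toy_coords = [(3.8 * i, 0.0, 0.0) for i in range(20)]
print(clustering_p_value(toy_coords, [0, 1, 2]))
```

As the abstract points out, a low p-value from this kind of test can also reflect amino acid composition or sequence locality, which is why the authors go on to compare native structures with alternative conformations of the same sequence.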
8.
It is well established that protein structures are more conserved than protein sequences. One-third of all known protein structures can be classified into ten protein folds, which themselves are composed mainly of alpha-helical hairpin, beta hairpin, and beta-alpha-beta supersecondary structural elements. In this study, we explore the ability of a recent Monte Carlo-based procedure to generate the 3D structures of eight polypeptides that correspond to units of supersecondary structure and three-stranded antiparallel beta sheet. Starting from extended or misfolded compact conformations, all Monte Carlo simulations show significant success in predicting the native topology using a simplified chain representation and an energy model optimized on other structures. Preliminary results on model peptides from nucleotide binding proteins suggest that this simple protein folding model can help clarify the relation between sequence and topology.
9.
Systematic evaluation of CS‐Rosetta for membrane protein structure prediction with sparse NOE restraints
Katrin Reichel Olivier Fisette Tatjana Braun Oliver F. Lange Gerhard Hummer Lars V. Schäfer 《Proteins》2017,85(5):812-826
We critically test and validate the CS‐Rosetta methodology for de novo structure prediction of α‐helical membrane proteins (MPs) from NMR data, such as chemical shifts and NOE distance restraints. By systematically reducing the number and types of NOE restraints, we focus on determining the regime in which MP structures can be reliably predicted and pinpoint the boundaries of the approach. Five MPs of known structure were used as test systems, phototaxis sensory rhodopsin II (pSRII), a subdomain of pSRII, disulfide binding protein B (DsbB), microsomal prostaglandin E2 synthase‐1 (mPGES‐1), and translocator protein (TSPO). For pSRII and DsbB, where NMR and X‐ray structures are available, resolution‐adapted structural recombination (RASREC) CS‐Rosetta yields structures that are as close to the X‐ray structure as the published NMR structures if all available NMR data are used to guide structure prediction. For mPGES‐1 and Bacillus cereus TSPO, where only X‐ray crystal structures are available, highly accurate structures are obtained using simulated NMR data. One main advantage of RASREC CS‐Rosetta is its robustness with respect to even a drastic reduction of the number of NOEs. Close‐to‐native structures were obtained with one randomly picked long‐range NOE for every 14, 31, 38, and 8 residues for full‐length pSRII, the pSRII subdomain, TSPO, and DsbB, respectively, in addition to using chemical shifts. For mPGES‐1, atomically accurate structures could be predicted even from chemical shifts alone. Our results show that atomic level accuracy for helical membrane proteins is achievable with CS‐Rosetta using very sparse NOE restraint sets to guide structure prediction. Proteins 2017; 85:812–826. © 2016 Wiley Periodicals, Inc.
10.
Computational methods that produce accurate protein structure models from limited experimental data, for example, from nuclear magnetic resonance (NMR) spectroscopy, hold great potential for biomedical research. The NMR-assisted modeling challenge in CASP13 provided a blind test to explore the capabilities and limitations of current modeling techniques in leveraging NMR data which had high sparsity, ambiguity, and error rate for protein structure prediction. We describe our approach to predict the structure of these proteins leveraging the Rosetta software suite. Protein structure models were predicted de novo using a two-stage protocol. First, low-resolution models were generated with the Rosetta de novo method guided by nonambiguous nuclear Overhauser effect (NOE) contacts and residual dipolar coupling (RDC) restraints. Second, iterative model hybridization and fragment insertion with the Rosetta comparative modeling method was used to refine and regularize models guided by all ambiguous and nonambiguous NOE contacts and RDCs. Nine out of 16 of the Rosetta de novo models had the correct fold (global distance test total score > 45) and in three cases high-resolution models were achieved (root-mean-square deviation < 3.5 Å). We also show that a meta-approach applying iterative Rosetta + NMR refinement on server-predicted models which employed non-NMR-contacts and structural templates leads to substantial improvement in model quality. Integrating these data-assisted refinement strategies with innovative non-data-assisted approaches which became possible in CASP13 such as high precision contact prediction will in the near future enable structure determination for large proteins that are outside of the realm of conventional NMR.
11.
We present an unusual method for parametrizing low-resolution force fields of the type used for protein structure prediction. Force field parameters were determined by assigning each a fictitious mass and using a quasi-molecular dynamics algorithm in parameter space. The quasi-energy term favored folded native structures and specifically penalized folded nonnative structures. The force field was generated after optimizing less than 70 adjustable parameters, but shows a strong ability to discriminate between native structures and compact misfolded alternatives. The functional form of the force field was chosen as in molecular mechanics and is not table-driven. It is continuous with continuous derivatives and is thus suitable for use with algorithms such as energy minimization or newtonian dynamics. Proteins 27:367–384, 1997. © 1997 Wiley-Liss, Inc.
12.
We have improved the original Rosetta centroid/backbone decoy set by increasing the number of proteins and frequency of near native models and by building on sidechains and minimizing clashes. The new set consists of 1,400 model structures for 78 different and diverse protein targets and provides a challenging set for the testing and evaluation of scoring functions. We evaluated the extent to which a variety of all-atom energy functions could identify the native and close-to-native structures in the new decoy sets. Of various implicit solvent models, we found that a solvent-accessible surface area-based solvation provided the best enrichment and discrimination of close-to-native decoys. The combination of this solvation treatment with Lennard-Jones terms and the original Rosetta energy provided better enrichment and discrimination than any of the individual terms. The results also highlight the differences in accuracy of NMR and X-ray crystal structures: a large energy gap was observed between native and non-native conformations for X-ray structures but not for NMR structures.
13.
Li W Zhang Y Kihara D Huang YJ Zheng D Montelione GT Kolinski A Skolnick J 《Proteins》2003,53(2):290-306
TOUCHSTONEX, a new method for folding proteins that uses a small number of long-range contact restraints derived from NMR experimental NOE (nuclear Overhauser enhancement) data, is described. The method employs a new lattice-based, reduced model of proteins that explicitly represents C(alpha), C(beta), and the sidechain centers of mass. The force field consists of knowledge-based terms to produce protein-like behavior, including various short-range interactions, hydrogen bonding, and one-body, pairwise, and multibody long-range interactions. Contact restraints were incorporated into the force field as an NOE-specific pairwise potential. We evaluated the algorithm using a set of 125 proteins of various secondary structure types and lengths up to 174 residues. Using N/8 simulated, long-range sidechain contact restraints, where N is the number of residues, 108 proteins were folded to a C(alpha)-root-mean-square deviation (RMSD) from native below 6.5 Å. The average RMSD of the lowest RMSD structures for all 125 proteins (folded and unfolded) was 4.4 Å. The algorithm was also applied to limited experimental NOE data generated for three proteins. Using very few experimental sidechain contact restraints, and a small number of sidechain-main chain and main chain-main chain contact restraints, we folded all three proteins to low-to-medium resolution structures. The algorithm can be applied to the NMR structure determination process or other experimental methods that can provide tertiary restraint information, especially in the early stage of structure determination, when only limited data are available.
14.
Retrospective analysis of a secondary structure prediction: the catalytic domain of matrix metalloproteinases.
E. E. Hodgkin I. C. Gillman R. J. Gilbert 《Protein science : a publication of the Protein Society》1994,3(6):984-986
Secondary structure prediction of the catalytic domain of matrix metalloproteinases is evaluated in the light of recently published experimentally determined structures. The prediction was made by combining conformational propensity, surface probability, and residue conservation calculated for an alignment of 19 sequences. The position of each observed secondary structure element was correctly predicted with a high degree of accuracy, with a single beta-strand falsely predicted. The domain fold was also anticipated from the prediction by analogy with the structural elements found in the distantly related metalloproteinases thermolysin, astacin, and adamalysin.
15.
16.
17.
The pathways by which proteins fold into their specific native structure are still an unsolved mystery. Currently, many methods for protein structure prediction are available, and most of them tackle the problem by relying on the vast amounts of data collected from known protein structures. These methods are often not concerned with the route the protein follows to reach its final fold. This work is based on the premise that proteins fold in a hierarchical manner. We present FOBIA, an automated method for predicting a protein structure. FOBIA consists of two main stages: the first finds matches between parts of the target sequence and independently folding structural units using profile-profile comparison. The second assembles these units into a 3D structure by searching and ranking their possible orientations toward each other using a docking-based approach. We have previously reported an application of an initial version of this strategy to homology based targets. Since then we have considerably enhanced our method's abilities to allow it to address the more difficult template-based target category. This allows us to now apply FOBIA to the template-based targets of CASP8 and to show that it is both very efficient and promising. Our method can provide an alternative for template-based structure prediction, and in particular, the docking-based ranking technique presented here can be incorporated into any profile-profile comparison based method.
18.
19.
Balancing exploration and exploitation in population‐based sampling improves fragment‐based de novo protein structure prediction
Conformational search space exploration remains a major bottleneck for protein structure prediction methods. Population‐based meta‐heuristics typically enable the possibility to control the search dynamics and to tune the balance between local energy minimization and search space exploration. EdaFold is a fragment‐based approach that can guide search by periodically updating the probability distribution over the fragment libraries used during model assembly. We implement the EdaFold algorithm as a Rosetta protocol and provide two different probability update policies: a cluster‐based variation (EdaRosec) and an energy‐based one (EdaRoseen). We analyze the search dynamics of our new Rosetta protocols and show that EdaRosec is able to provide predictions with lower Cα RMSD to the native structure than EdaRoseen and the Rosetta AbInitio Relax protocol. Our software is freely available as a C++ patch for the Rosetta suite and can be downloaded from http://www.riken.jp/zhangiru/software/ . Our protocols can easily be extended in order to create alternative probability update policies and generate new search dynamics. Proteins 2017; 85:852–858. © 2016 Wiley Periodicals, Inc.
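The idea of periodically updating a probability distribution over fragment libraries, as described in the abstract above, follows the general pattern of an estimation-of-distribution algorithm. The sketch below shows that pattern on a toy problem with an energy-based update. It is not the EdaFold or Rosetta code: the function `eda_fragment_search`, its parameters, the integer encoding of fragments, and the toy energy are all hypothetical and chosen only for illustration.

```python
import random


def eda_fragment_search(n_positions, n_fragments, energy, n_iterations=20,
                        population=50, elite_fraction=0.2, smoothing=0.1,
                        seed=0):
    """Toy estimation-of-distribution search over fragment choices.

    A model is a list with one fragment index per position. `energy` maps a
    model to a number (lower is better). All names here are illustrative.
    """
    rng = random.Random(seed)
    # Start from a uniform distribution over fragments at every position.
    probs = [[1.0 / n_fragments] * n_fragments for _ in range(n_positions)]
    best = None
    for _ in range(n_iterations):
        # Assemble a population of models by sampling from the distribution.
        models = [
            [rng.choices(range(n_fragments), weights=probs[p])[0]
             for p in range(n_positions)]
            for _ in range(population)
        ]
        models.sort(key=energy)
        if best is None or energy(models[0]) < energy(best):
            best = models[0]
        # Energy-based update: re-estimate probabilities from the best models.
        elite = models[:max(1, int(population * elite_fraction))]
        for p in range(n_positions):
            counts = [0] * n_fragments
            for model in elite:
                counts[model[p]] += 1
            for f in range(n_fragments):
                freq = counts[f] / len(elite)
                # Smoothing keeps every fragment reachable (exploration).
                probs[p][f] = (1 - smoothing) * freq + smoothing / n_fragments
    return best, probs


# Toy energy: number of positions that differ from a hidden target model.
target = [2, 0, 1, 3, 2, 0]
toy_energy = lambda model: sum(a != b for a, b in zip(model, target))
best_model, final_probs = eda_fragment_search(len(target), 4, toy_energy)
print(best_model, toy_energy(best_model))
```

In this sketch the `smoothing` parameter plays the role of the exploration versus exploitation balance: a value near 0 lets the distribution collapse quickly onto the current elite, while a larger value keeps sampling less-favoured fragments.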
20.