Similar literature
1.
Contact order and ab initio protein structure prediction   Total citations: 1 (self-citations: 0, citations by others: 1)
Although much of the motivation for experimental studies of protein folding is to obtain insights for improving protein structure prediction, there has been relatively little connection between experimental protein folding studies and computational structural prediction work in recent years. In the present study, we show that the relationship between protein folding rates and the contact order (CO) of the native structure has implications for ab initio protein structure prediction. Rosetta ab initio folding simulations produce a dearth of high CO structures and an excess of low CO structures, as expected if the computer simulations mimic to some extent the actual folding process. Consistent with this, the majority of failures in ab initio prediction in the CASP4 (critical assessment of structure prediction) experiment involved high CO structures likely to fold much more slowly than the lower CO structures for which reasonable predictions were made. This bias against high CO structures can be partially alleviated by performing large numbers of additional simulations, selecting out the higher CO structures, and eliminating the very low CO structures; this leads to a modest improvement in prediction quality. More significant improvements in predictions for proteins with complex topologies may be possible following significant increases in high-performance computing power, which will be required for thoroughly sampling high CO conformations (high CO proteins can take six orders of magnitude longer to fold than low CO proteins). Importantly for such a strategy, simulations performed for high CO structures converge much less strongly than those for low CO structures, and hence, lack of simulation convergence can indicate the need for improved sampling of high CO conformations. 
The parallels between Rosetta simulations and folding in vivo may extend to misfolding: The very low CO structures that accumulate in Rosetta simulations consist primarily of local up-down beta-sheets that may resemble precursors to amyloid formation.
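The abstract above relies on contact order without defining it. As a minimal sketch, assuming the commonly used relative contact order (the mean sequence separation of contacting residue pairs, divided by chain length), the quantity can be computed as follows. The function name and the toy contact lists are hypothetical and are not taken from the paper, which may use a different variant or contact definition.

```python
def relative_contact_order(contacts, n_residues):
    """Relative contact order: mean |i - j| over contacts, divided by chain length.

    contacts   -- iterable of (i, j) residue-index pairs considered to be in contact
    n_residues -- number of residues in the chain
    Assumption: how a "contact" is detected (distance cutoff, atom types) is left
    to the caller; the abstract does not specify it.
    """
    pairs = [(i, j) for i, j in contacts if i != j]
    if not pairs or n_residues <= 0:
        raise ValueError("need at least one contact and a positive chain length")
    total_separation = sum(abs(i - j) for i, j in pairs)
    return total_separation / (len(pairs) * n_residues)


# Toy 10-residue chains (hypothetical data): local contacts give a low CO,
# contacts between chain ends give a high CO.
local_co = relative_contact_order([(1, 4), (2, 3)], n_residues=10)
nonlocal_co = relative_contact_order([(1, 10), (2, 9)], n_residues=10)
```

In this toy case the nonlocal topology scores four times higher than the local one, which is the kind of difference the abstract associates with slower folding and harder sampling.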

2.
π-π, cation-π, and hydrophobic packing interactions contribute specificity to protein folding and stability to the native state. As a step towards developing improved models of these interactions in proteins, we compare the side-chain packing arrangements in native proteins to those found in compact decoys produced by the Rosetta de novo structure prediction method. We find enrichments in the native distributions for T-shaped and parallel offset arrangements of aromatic residue pairs, in parallel stacked arrangements of cation-aromatic pairs, in parallel stacked pairs involving proline residues, and in parallel offset arrangements for aliphatic residue pairs. We then investigate the extent to which the distinctive features of native packing can be explained using Lennard-Jones and electrostatics models. Finally, we derive orientation-dependent π-π, cation-π, and hydrophobic interaction potentials based on the differences between the native and compact decoy distributions and investigate their efficacy for high-resolution protein structure prediction. Surprisingly, the orientation-dependent potential derived from the packing arrangements of aliphatic side-chain pairs distinguishes the native structure from compact decoys better than the orientation-dependent potentials describing π-π and cation-π interactions.

3.
Bradley P, Baker D. Proteins 2006, 65(4):922-929
Proteins with complex, nonlocal beta-sheets are challenging for de novo structure prediction, due in part to the difficulty of efficiently sampling long-range strand pairings. We present a new, multilevel approach to beta-sheet structure prediction that circumvents this difficulty by reformulating structure generation in terms of a folding tree. Nonlocal connections in this tree allow us to explicitly sample alternative beta-strand pairings while simultaneously exploring local conformational space using backbone torsion-space moves. An iterative, energy-biased resampling strategy is used to explore the space of beta-strand pairings; we expect that such a strategy will be generally useful for searching large conformational spaces with a high degree of combinatorial complexity.

4.
Protein structure prediction in genomics   Total citations: 1 (self-citations: 0, citations by others: 1)
As the number of completely sequenced genomes rapidly increases, including now the complete Human Genome sequence, the post-genomic problems of genome-scale protein structure determination and the issue of gene function identification become ever more pressing. In fact, these problems can be seen as interrelated in that experimentally determining or predicting the structure of proteins encoded by genes of interest is one possible means to glean subtle hints as to the functions of these genes. The applicability of this approach to gene characterisation is reviewed, along with a brief survey of the reliability of large-scale protein structure prediction methods and the prospects for the development of new prediction methods.

5.
Protein β‐sheets often involve nonlocal interactions between parts of the polypeptide chain that are separated by hundreds of residues, raising the question of how these nonlocal contacts form. A recent study of the smallest β‐sheets found that their formation was not driven by signals hidden in the primary sequence. Instead, the strands in these sheets were either local in sequence, or, when separated by large sequential distances, the intervening residues were found to fold into compact modules that anchored distant parts of the chain in close spatial proximity. Here, we examine larger β‐sheets to investigate the extensibility of this principle. From an analysis of the β‐sheets in a nonredundant protein dataset, we find that a highly ordered hierarchical relationship exists in the intervening structure between nonlocal β‐strands. This observation is almost universal: virtually all β‐sheets, no matter their complexity, appear to adopt an antiparallel model to manage the nonlocal aspects of their assembly, one where the chain, having left the vicinity of an unfinished β‐sheet, retraces its steps via the same route to complete the initial sheet. Exceptions typically involve unstructured regions at chain termini. Moreover, an analysis of the residues involved in nonlocal cross‐strand interactions did not produce any evidence of a signal hidden in the sequence that might direct long‐range interactions. These results build on those reported for the smallest sheets, suggesting that sheet formation is either local in sequence or local in space following prior folding events that anchor disparate parts of the chain in close proximity. Proteins 2013. © 2012 Wiley Periodicals, Inc.

6.
Computational methods in protein structure prediction   Total citations: 1 (self-citations: 0, citations by others: 1)
This review presents the advances in protein structure prediction from the computational methods perspective. The approaches are classified into four major categories: comparative modeling, fold recognition, first principles methods that employ database information, and first principles methods without database information. Important advances along with current limitations and challenges are presented.

7.
Protein residues that are critical for structure and function are expected to be conserved throughout evolution. Here, we investigate the extent to which these conserved residues are clustered in three-dimensional protein structures. In 92% of the proteins in a data set of 79 proteins, the most conserved positions in multiple sequence alignments are significantly more clustered than randomly selected sets of positions. The comparison to random subsets is not necessarily appropriate, however, because the signal could be the result of differences in the amino acid composition of sets of conserved residues compared to random subsets (hydrophobic residues tend to be close together in the protein core), or differences in sequence separation of the residues in the different sets. In order to overcome these limits, we compare the degree of clustering of the conserved positions on the native structure and on alternative conformations generated by the de novo structure prediction method Rosetta. For 65% of the 79 proteins, the conserved residues are significantly more clustered in the native structure than in the alternative conformations, indicating that the clustering of conserved residues in protein structures goes beyond that expected purely from sequence locality and composition effects. The differences in the spatial distribution of conserved residues can be utilized in de novo protein structure prediction: We find that for 79% of the proteins, selection of the Rosetta generated conformations with the greatest clustering of the conserved residues significantly enriches the fraction of close-to-native structures.
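The first comparison described in this abstract, conserved positions versus randomly selected sets of positions, can be illustrated with a small permutation test. This is only a sketch under stated assumptions: clustering is measured here as the mean pairwise distance between positions, and the function names, the number of random draws, and the toy coordinates are hypothetical. The abstract does not state the authors' clustering measure, and it notes that this random-subset baseline has limits that the comparison against Rosetta conformations is meant to address.

```python
import itertools
import math
import random


def mean_pairwise_distance(coords, positions):
    """Mean Euclidean distance over all pairs of the selected positions."""
    pairs = list(itertools.combinations(positions, 2))
    return sum(math.dist(coords[a], coords[b]) for a, b in pairs) / len(pairs)


def clustering_p_value(coords, conserved, n_random=1000, seed=0):
    """Fraction of random position sets at least as tightly clustered as `conserved`.

    Assumption: random sets are drawn uniformly from all positions, ignoring
    amino acid composition and sequence separation.
    """
    rng = random.Random(seed)
    observed = mean_pairwise_distance(coords, conserved)
    all_positions = list(range(len(coords)))
    at_least_as_tight = 0
    for _ in range(n_random):
        sample = rng.sample(all_positions, len(conserved))
        if mean_pairwise_distance(coords, sample) <= observed:
            at_least_as_tight += 1
    # Add-one correction so the estimate is never exactly zero.
    return (at_least_as_tight + 1) / (n_random + 1)


# Toy structure (hypothetical): 20 positions spaced one unit apart on a line,
# with three adjacent "conserved" positions.
toy_coords = [(float(i), 0.0, 0.0) for i in range(20)]
p_value = clustering_p_value(toy_coords, conserved=[0, 1, 2])
```

The same `mean_pairwise_distance` could, in principle, be evaluated on each alternative conformation instead of on random subsets, which is closer in spirit to the second comparison in the abstract.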

8.
It is well established that protein structures are more conserved than protein sequences. One-third of all known protein structures can be classified into ten protein folds, which themselves are composed mainly of alpha-helical hairpin, beta hairpin, and beta-alpha-beta supersecondary structural elements. In this study, we explore the ability of a recent Monte Carlo-based procedure to generate the 3D structures of eight polypeptides that correspond to units of supersecondary structure and three-stranded antiparallel beta sheet. Starting from extended or misfolded compact conformations, all Monte Carlo simulations show significant success in predicting the native topology using a simplified chain representation and an energy model optimized on other structures. Preliminary results on model peptides from nucleotide binding proteins suggest that this simple protein folding model can help clarify the relation between sequence and topology.

9.
We critically test and validate the CS‐Rosetta methodology for de novo structure prediction of α‐helical membrane proteins (MPs) from NMR data, such as chemical shifts and NOE distance restraints. By systematically reducing the number and types of NOE restraints, we focus on determining the regime in which MP structures can be reliably predicted and pinpoint the boundaries of the approach. Five MPs of known structure were used as test systems, phototaxis sensory rhodopsin II (pSRII), a subdomain of pSRII, disulfide binding protein B (DsbB), microsomal prostaglandin E2 synthase‐1 (mPGES‐1), and translocator protein (TSPO). For pSRII and DsbB, where NMR and X‐ray structures are available, resolution‐adapted structural recombination (RASREC) CS‐Rosetta yields structures that are as close to the X‐ray structure as the published NMR structures if all available NMR data are used to guide structure prediction. For mPGES‐1 and Bacillus cereus TSPO, where only X‐ray crystal structures are available, highly accurate structures are obtained using simulated NMR data. One main advantage of RASREC CS‐Rosetta is its robustness with respect to even a drastic reduction of the number of NOEs. Close‐to‐native structures were obtained with one randomly picked long‐range NOE for every 14, 31, 38, and 8 residues for full‐length pSRII, the pSRII subdomain, TSPO, and DsbB, respectively, in addition to using chemical shifts. For mPGES‐1, atomically accurate structures could be predicted even from chemical shifts alone. Our results show that atomic level accuracy for helical membrane proteins is achievable with CS‐Rosetta using very sparse NOE restraint sets to guide structure prediction. Proteins 2017; 85:812–826. © 2016 Wiley Periodicals, Inc.

10.
Georg Kuenze, Jens Meiler. Proteins 2019, 87(12):1341-1350
Computational methods that produce accurate protein structure models from limited experimental data, for example, from nuclear magnetic resonance (NMR) spectroscopy, hold great potential for biomedical research. The NMR-assisted modeling challenge in CASP13 provided a blind test to explore the capabilities and limitations of current modeling techniques in leveraging NMR data which had high sparsity, ambiguity, and error rate for protein structure prediction. We describe our approach to predict the structure of these proteins leveraging the Rosetta software suite. Protein structure models were predicted de novo using a two-stage protocol. First, low-resolution models were generated with the Rosetta de novo method guided by nonambiguous nuclear Overhauser effect (NOE) contacts and residual dipolar coupling (RDC) restraints. Second, iterative model hybridization and fragment insertion with the Rosetta comparative modeling method was used to refine and regularize models guided by all ambiguous and nonambiguous NOE contacts and RDCs. Nine out of 16 of the Rosetta de novo models had the correct fold (global distance test total score > 45) and in three cases high-resolution models were achieved (root-mean-square deviation < 3.5 Å). We also show that a meta-approach applying iterative Rosetta + NMR refinement on server-predicted models which employed non-NMR-contacts and structural templates leads to substantial improvement in model quality. Integrating these data-assisted refinement strategies with innovative non-data-assisted approaches which became possible in CASP13 such as high precision contact prediction will in the near future enable structure determination for large proteins that are outside of the realm of conventional NMR.
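The two quality thresholds quoted in this abstract (global distance test total score above 45, root-mean-square deviation below 3.5 Å) can be made concrete with a simplified calculation. The sketch below assumes the model has already been superimposed on the reference and that per-residue Cα deviations are available; the standard GDT_TS procedure searches over many superpositions to maximize each fraction, which is not reproduced here. The function names and the toy deviations are hypothetical.

```python
import math


def gdt_ts(deviations, cutoffs=(1.0, 2.0, 4.0, 8.0)):
    """Simplified GDT_TS: average percentage of residues within each cutoff (in Å).

    Assumption: `deviations` come from a single fixed superposition, so this is
    a lower-bound style approximation of the score used in CASP.
    """
    n = len(deviations)
    fractions = [sum(d <= c for d in deviations) / n for c in cutoffs]
    return 100.0 * sum(fractions) / len(cutoffs)


def rmsd(deviations):
    """Root-mean-square deviation over the same per-residue deviations."""
    return math.sqrt(sum(d * d for d in deviations) / len(deviations))


# Toy deviations in Å for a four-residue model (hypothetical data).
toy_deviations = [0.5, 1.5, 3.0, 9.0]
toy_gdt = gdt_ts(toy_deviations)
toy_rmsd = rmsd(toy_deviations)
```

The toy case shows why the two measures are reported together: a single badly placed residue inflates the RMSD well past 3.5 Å, while the GDT_TS value still sits above the "correct fold" threshold of 45.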

11.
We present an unusual method for parametrizing low-resolution force fields of the type used for protein structure prediction. Force field parameters were determined by assigning each a fictitious mass and using a quasi-molecular dynamics algorithm in parameter space. The quasi-energy term favored folded native structures and specifically penalized folded nonnative structures. The force field was generated after optimizing less than 70 adjustable parameters, but shows a strong ability to discriminate between native structures and compact misfolded alternatives. The functional form of the force field was chosen as in molecular mechanics and is not table-driven. It is continuous with continuous derivatives and is thus suitable for use with algorithms such as energy minimization or Newtonian dynamics. Proteins 27:367–384, 1997. © 1997 Wiley-Liss, Inc.

12.
We have improved the original Rosetta centroid/backbone decoy set by increasing the number of proteins and frequency of near native models and by building on sidechains and minimizing clashes. The new set consists of 1,400 model structures for 78 different and diverse protein targets and provides a challenging set for the testing and evaluation of scoring functions. We evaluated the extent to which a variety of all-atom energy functions could identify the native and close-to-native structures in the new decoy sets. Of various implicit solvent models, we found that a solvent-accessible surface area-based solvation provided the best enrichment and discrimination of close-to-native decoys. The combination of this solvation treatment with Lennard-Jones terms and the original Rosetta energy provided better enrichment and discrimination than any of the individual terms. The results also highlight the differences in accuracy of NMR and X-ray crystal structures: a large energy gap was observed between native and non-native conformations for X-ray structures but not for NMR structures.

13.
TOUCHSTONEX, a new method for folding proteins that uses a small number of long-range contact restraints derived from NMR experimental NOE (nuclear Overhauser enhancement) data, is described. The method employs a new lattice-based, reduced model of proteins that explicitly represents Cα, Cβ, and the sidechain centers of mass. The force field consists of knowledge-based terms to produce protein-like behavior, including various short-range interactions, hydrogen bonding, and one-body, pairwise, and multibody long-range interactions. Contact restraints were incorporated into the force field as an NOE-specific pairwise potential. We evaluated the algorithm using a set of 125 proteins of various secondary structure types and lengths up to 174 residues. Using N/8 simulated, long-range sidechain contact restraints, where N is the number of residues, 108 proteins were folded to a Cα root-mean-square deviation (RMSD) from native below 6.5 Å. The average RMSD of the lowest RMSD structures for all 125 proteins (folded and unfolded) was 4.4 Å. The algorithm was also applied to limited experimental NOE data generated for three proteins. Using very few experimental sidechain contact restraints, and a small number of sidechain-main chain and main chain-main chain contact restraints, we folded all three proteins to low-to-medium resolution structures. The algorithm can be applied to the NMR structure determination process or other experimental methods that can provide tertiary restraint information, especially in the early stage of structure determination, when only limited data are available.

14.
Secondary structure prediction of the catalytic domain of matrix metalloproteinases is evaluated in the light of recently published experimentally determined structures. The prediction was made by combining conformational propensity, surface probability, and residue conservation calculated for an alignment of 19 sequences. The position of each observed secondary structure element was correctly predicted with a high degree of accuracy, with a single beta-strand falsely predicted. The domain fold was also anticipated from the prediction by analogy with the structural elements found in the distantly related metalloproteinases thermolysin, astacin, and adamalysin.

15.
16.
17.
Kifer I, Nussinov R, Wolfson HJ. Proteins 2011, 79(6):1759-1773
The pathways by which proteins fold into their specific native structure are still an unsolved mystery. Currently, many methods for protein structure prediction are available, and most of them tackle the problem by relying on the vast amounts of data collected from known protein structures. These methods are often not concerned with the route the protein follows to reach its final fold. This work is based on the premise that proteins fold in a hierarchical manner. We present FOBIA, an automated method for predicting a protein structure. FOBIA consists of two main stages: the first finds matches between parts of the target sequence and independently folding structural units using profile-profile comparison. The second assembles these units into a 3D structure by searching and ranking their possible orientations toward each other using a docking-based approach. We have previously reported an application of an initial version of this strategy to homology-based targets. Since then we have considerably enhanced our method's abilities to allow it to address the more difficult template-based target category. This allows us to now apply FOBIA to the template-based targets of CASP8 and to show that it is both very efficient and promising. Our method can provide an alternative for template-based structure prediction, and in particular, the docking-based ranking technique presented here can be incorporated into any profile-profile comparison based method.

18.
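Ab initio protein structure prediction obtains the native structure from amino acid sequence information alone, without relying on templates. Its key requirements are a correctly defined energy function and a carefully chosen computational search algorithm for locating the energy minimum. On this basis, this paper systematically introduces energy functions and conformational search strategies and lists several relatively successful ab initio prediction methods. The comparison leads to two conclusions: knowledge-based (statistical) energy functions have been the main direction of development in ab initio prediction in recent years, and the conformational search in all existing ab initio methods makes use of the Monte Carlo method. This indicates that, as research on protein structure prediction has deepened, breakthrough progress has been made on key problems such as the construction of energy functions, the choice of conformational search methods, and the ab initio prediction of large protein structures.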
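The Monte Carlo conformational search mentioned in this abstract is most often built on the Metropolis acceptance rule. The following is a minimal sketch of that rule on a toy one-dimensional energy landscape, not a protein model; the function names, the move set, and the temperature are hypothetical choices made only for illustration, and the review itself does not prescribe them.

```python
import math
import random


def metropolis_search(energy, propose, start, n_steps, temperature, seed=0):
    """Metropolis Monte Carlo: always accept downhill moves, accept uphill
    moves with probability exp(-delta / temperature). Returns the best state seen."""
    rng = random.Random(seed)
    current, current_e = start, energy(start)
    best, best_e = current, current_e
    for _ in range(n_steps):
        candidate = propose(current, rng)
        candidate_e = energy(candidate)
        delta = candidate_e - current_e
        if delta <= 0 or rng.random() < math.exp(-delta / temperature):
            current, current_e = candidate, candidate_e
            if current_e < best_e:
                best, best_e = current, current_e
    return best, best_e


def toy_energy(x):
    """Hypothetical landscape with its single minimum at x = 3."""
    return (x - 3) ** 2


def toy_move(x, rng):
    """Hypothetical move set: step one unit left or right."""
    return x + rng.choice((-1, 1))


best_state, best_energy = metropolis_search(
    toy_energy, toy_move, start=0, n_steps=2000, temperature=1.0
)
```

In a real ab initio protocol the state would be a backbone conformation, the move would be something like a fragment insertion or torsion perturbation, and the energy would be the knowledge-based function the review discusses; the acceptance logic stays the same.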

19.
Conformational search space exploration remains a major bottleneck for protein structure prediction methods. Population‐based meta‐heuristics typically make it possible to control the search dynamics and to tune the balance between local energy minimization and search space exploration. EdaFold is a fragment‐based approach that can guide search by periodically updating the probability distribution over the fragment libraries used during model assembly. We implement the EdaFold algorithm as a Rosetta protocol and provide two different probability update policies: a cluster‐based variation (EdaRosec) and an energy‐based one (EdaRoseen). We analyze the search dynamics of our new Rosetta protocols and show that EdaRosec is able to provide predictions with lower Cα RMSD to the native structure than EdaRoseen and Rosetta AbInitio Relax protocol. Our software is freely available as a C++ patch for the Rosetta suite and can be downloaded from http://www.riken.jp/zhangiru/software/ . Our protocols can easily be extended in order to create alternative probability update policies and generate new search dynamics. Proteins 2017; 85:852–858. © 2016 Wiley Periodicals, Inc.
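The idea of periodically updating a probability distribution over fragment libraries can be sketched as a generic estimation-of-distribution step. The code below is an assumption-laden illustration, not the EdaFold or EdaRose update rule: it blends the current per-position fragment probabilities with the fragment frequencies observed in a set of selected ("elite") models, using a hypothetical learning rate. How the elite set is chosen (by energy or by clustering, as in the two policies the abstract names) is left outside the sketch.

```python
from collections import Counter


def update_fragment_probabilities(probs, elite_models, learning_rate=0.5):
    """Blend current fragment probabilities with frequencies seen in elite models.

    probs        -- list (one entry per insertion position) of dicts mapping
                    fragment id -> probability
    elite_models -- list of models, each a list with the fragment id used at
                    every position
    Assumption: every fragment id used by an elite model is present in `probs`,
    so each updated distribution still sums to one.
    """
    n_models = len(elite_models)
    updated = []
    for position, distribution in enumerate(probs):
        counts = Counter(model[position] for model in elite_models)
        updated.append({
            fragment: (1.0 - learning_rate) * p
            + learning_rate * counts[fragment] / n_models
            for fragment, p in distribution.items()
        })
    return updated


# Toy example (hypothetical data): one position, two candidate fragments,
# and four elite models that mostly used fragment "a".
initial = [{"a": 0.5, "b": 0.5}]
elite = [["a"], ["a"], ["b"], ["a"]]
new_probs = update_fragment_probabilities(initial, elite)
```

Subsequent sampling rounds would then draw fragment "a" more often at that position, which is the sense in which the distribution guides the search; a lower learning rate keeps more exploration.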

20.