Found 20 similar documents; search took 0 ms
1.
2.
Ab initio structure prediction for small polypeptides and protein fragments using genetic algorithms
Ab initio folding simulations have been performed on three peptides, using a genetic algorithm-based search method which operates on a full atom representation. Conformations are evaluated with an empirical force field parameterized by a potential of mean force analysis of experimental structures. The dominant terms in the force field are local and nonlocal main chain electrostatics and the hydrophobic effect. Two of the simulated structures were for fragments of complete proteins (eosinophil-derived neurotoxin (EDN) and the subtilisin propeptide) that were identified as being likely initiation sites for folding. The experimental structure of one of these (EDN) was subsequently found to be consistent with that prediction (using local hydrophobic burial as the determinant for independent folding). The simulations of the structures of these two peptides were only partly successful. The most successful folding simulation was that of a 22-residue peptide corresponding to the membrane binding domain of blood coagulation factor VIII (Membind). Three simulations were performed on this peptide and the lowest energy conformation was found to be the most similar to the experimental structure. The conformation of this peptide was determined with a Cα rms deviation of 4.4 Å. Although these simulations were partly successful there are still many unresolved problems, which we expect to be able to address in the next structure prediction experiment. © 1995 Wiley-Liss, Inc.
3.
4.
Recently ab initio protein structure prediction methods have advanced sufficiently so that they often assemble the correct low resolution structure of the protein. To enhance the speed of conformational search, many ab initio prediction programs adopt a reduced protein representation. However, for drug design purposes, better quality structures are probably needed. To achieve this refinement, it is natural to use a more detailed heavy atom representation. Here, as opposed to costly implicit or explicit solvent molecular dynamics simulations, knowledge-based heavy atom pair potentials were employed. By way of illustration, we tried to improve the quality of the predicted structures obtained from the ab initio prediction program TOUCHSTONE by three methods: local constraint refinement, reduced predicted tertiary contact refinement, and statistical pair potential guided molecular dynamics. Sixty-seven predicted structures from 30 small proteins (less than 150 residues in length) representing different structural classes (alpha, beta, alpha/beta) were examined. In 33 cases, the root mean square deviation (RMSD) from native structures improved by more than 0.3 Å; in 19 cases, the improvement was more than 0.5 Å, and sometimes as large as 1 Å. In only seven (four) cases did the refinement procedure increase the RMSD by more than 0.3 (0.5) Å. For the remaining structures, the refinement procedures changed the structures by less than 0.3 Å. While modest, the performance of the current refinement methods is better than the published refinement results obtained using standard molecular dynamics.
5.
Protein folding is a hierarchical process where structure forms locally first, then globally. Some short sequence segments initiate folding through strong structural preferences that are independent of their three‐dimensional context in proteins. We have constructed a knowledge‐based force field in which the energy functions are conditional on local sequence patterns, as expressed in the hidden Markov model for local structure (HMMSTR). Carbon‐alpha force field (CALF) builds sequence specific statistical potentials based on database frequencies for α‐carbon virtual bond opening and dihedral angles, pair‐wise contacts and hydrogen bond donor‐acceptor pairs, and simulates folding via Brownian dynamics. We introduce hydrogen bond donor and acceptor potentials as α‐carbon probability fields that are conditional on the predicted local sequence. Constant temperature simulations were carried out using 27 peptides selected as putative folding initiation sites, each 12 residues in length, representing several different local structure motifs. Each 0.6 μs trajectory was clustered based on structure. Simulation convergence or representativeness was assessed by subdividing trajectories and comparing clusters. For 21 of the 27 sequences, the largest cluster made up more than half of the total trajectory. Of these 21 sequences, 14 had cluster centers that were at most 2.6 Å root mean square deviation (RMSD) from their native structure in the corresponding full‐length protein. To assess the adequacy of the energy function on nonlocal interactions, 11 full length native structures were relaxed using Brownian dynamics simulations. Equilibrated structures deviated from their native states but retained their overall topology and compactness. A simple potential that folds proteins locally and stabilizes proteins globally may enable a more realistic understanding of hierarchical folding pathways. Proteins 2009. © 2008 Wiley‐Liss, Inc.
6.
Bonneau R Ruczinski I Tsai J Baker D 《Protein science : a publication of the Protein Society》2002,11(8):1937-1944
Although much of the motivation for experimental studies of protein folding is to obtain insights for improving protein structure prediction, there has been relatively little connection between experimental protein folding studies and computational structural prediction work in recent years. In the present study, we show that the relationship between protein folding rates and the contact order (CO) of the native structure has implications for ab initio protein structure prediction. Rosetta ab initio folding simulations produce a dearth of high CO structures and an excess of low CO structures, as expected if the computer simulations mimic to some extent the actual folding process. Consistent with this, the majority of failures in ab initio prediction in the CASP4 (critical assessment of structure prediction) experiment involved high CO structures likely to fold much more slowly than the lower CO structures for which reasonable predictions were made. This bias against high CO structures can be partially alleviated by performing large numbers of additional simulations, selecting out the higher CO structures, and eliminating the very low CO structures; this leads to a modest improvement in prediction quality. More significant improvements in predictions for proteins with complex topologies may be possible following significant increases in high-performance computing power, which will be required for thoroughly sampling high CO conformations (high CO proteins can take six orders of magnitude longer to fold than low CO proteins). Importantly for such a strategy, simulations performed for high CO structures converge much less strongly than those for low CO structures, and hence, lack of simulation convergence can indicate the need for improved sampling of high CO conformations. 
The parallels between Rosetta simulations and folding in vivo may extend to misfolding: The very low CO structures that accumulate in Rosetta simulations consist primarily of local up-down beta-sheets that may resemble precursors to amyloid formation.
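The contact order (CO) discussed in the abstract above can be illustrated with a short calculation. The following is a minimal sketch of relative contact order, not the procedure used by the authors: it assumes one coordinate per residue (for example Cα positions), a hypothetical 8 Å distance cutoff, and that sequence neighbours count as contacts. The function name and parameters are illustrative only.

```python
import math


def relative_contact_order(coords, cutoff=8.0, min_separation=1):
    """Relative contact order from one 3D point per residue.

    CO = (sum of sequence separations over all contacts)
         / (number of contacts * chain length)

    The cutoff and min_separation defaults are assumptions made for
    this sketch, not values taken from the abstract.
    """
    n = len(coords)
    total_separation = 0
    n_contacts = 0
    for i in range(n):
        for j in range(i + min_separation, n):
            # Two residues are "in contact" if their points lie within cutoff.
            if math.dist(coords[i], coords[j]) <= cutoff:
                total_separation += j - i
                n_contacts += 1
    if n_contacts == 0:
        return 0.0
    return total_separation / (n_contacts * n)


# Toy example: four residues on a straight line, 3.8 Å apart.
toy_chain = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0), (11.4, 0.0, 0.0)]
toy_co = relative_contact_order(toy_chain)
```

A chain dominated by contacts between sequence-distant residues gives a higher value than one dominated by local contacts, which is the distinction the abstract draws between high CO and low CO structures.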
7.
A hierarchical methodology for ab initio structure prediction is extended to treat oligomeric proteins. Modifications are made to a united-residue (UNRES) force field and a Conformational Space Annealing (CSA) global search method. The computational cost of including additional chains and the increase in speed from symmetry optimizations are evaluated. The native structures of two oligomeric proteins from the CASP3 exercise, the retro-GCN4 leucine zipper and the synthetic domain-swapped dimer, were identified as the lowest-energy families resulting from the search of the proteins when rotational symmetry was imposed. Additional searches in different symmetries and oligomerization states were carried out, and the results indicate some problems in the thoroughness of the search and in the search of packing arrangements if symmetry constraints are not imposed.
8.
We have improved the multiple linear regression (MLR) algorithm for protein secondary structure prediction by combining it with the evolutionary information provided by multiple sequence alignment of PSI-BLAST. On the CB513 dataset, the three-state average overall per-residue accuracy, Q(3), reached 76.4%, while segment overlap accuracy, SOV99, reached 73.2%, using a rigorous jackknife procedure and the strictest reduction of the eight-state DSSP definition to three states. This represents an improvement of approximately 5% on overall per-residue accuracy compared with previous work. The relative solvent accessibility prediction also benefited from this combination of methods. The system achieved 77.7% average jackknifed accuracy for two-state prediction based on a 25% relative solvent accessibility mode, with a Matthews correlation coefficient of 0.548. The improved MLR secondary structure and relative solvent accessibility prediction server is available at http://spg.biosci.tsinghua.edu.cn/.
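The two headline metrics in the abstract above, per-residue accuracy and the Matthews correlation coefficient, are simple to compute from predicted and observed state strings. The following is a minimal sketch using the standard definitions; the function names and the toy strings are hypothetical and are not data from the paper.

```python
import math


def per_residue_accuracy(predicted, observed):
    """Fraction of residues whose predicted state matches the observed state.

    With three states (e.g. H, E, C) this is the Q3 score.
    """
    if len(predicted) != len(observed):
        raise ValueError("predicted and observed must have the same length")
    if not observed:
        raise ValueError("empty input")
    matches = sum(p == o for p, o in zip(predicted, observed))
    return matches / len(observed)


def matthews_cc(predicted, observed, positive):
    """Matthews correlation coefficient for a two-state prediction.

    `positive` names the state treated as the positive class, e.g. "B"
    for buried. Returns 0.0 when the denominator vanishes.
    """
    if len(predicted) != len(observed):
        raise ValueError("predicted and observed must have the same length")
    tp = fp = tn = fn = 0
    for p, o in zip(predicted, observed):
        if p == positive and o == positive:
            tp += 1
        elif p == positive:
            fp += 1
        elif o == positive:
            fn += 1
        else:
            tn += 1
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return 0.0 if denom == 0 else (tp * tn - fp * fn) / denom


# Toy strings for illustration only.
toy_q3 = per_residue_accuracy("HHEEC", "HHECC")
toy_mcc = matthews_cc("BBEE", "BEEE", positive="B")
```

Unlike plain accuracy, the correlation coefficient stays near zero for a predictor that always outputs the majority state, which is why both numbers are usually reported together.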
9.
The results of a protein structure prediction contest are reviewed. Twelve different groups entered predictions on 14 proteins of known sequence whose structures had been determined but not yet disseminated to the scientific community. Thus, these represent true tests of the current state of structure prediction methodologies. From this work, it is clear that accurate tertiary structure prediction is not yet possible. However, protein fold and motif prediction are possible when the motif is recognizably similar to another known structure. Internal symmetry and the information inherent in an aligned family of homologous sequences facilitate predictive efforts. Novel folds remain a major challenge for prediction efforts. © 1995 Wiley-Liss, Inc.
10.
Fragment assembly using structural motifs excised from other solved proteins has been shown to be an efficient method for ab initio protein‐structure prediction. However, how to construct accurate fragments, how to derive optimal restraints from fragments, and what the best fragment length is are basic issues yet to be systematically examined. In this work, we developed a gapless‐threading method to generate position‐specific structure fragments. Distance profiles and torsion angle pairs are then derived from the fragments by statistical consistency analysis, which achieved accuracy comparable to that of machine‐learning‐based methods although the fragments were taken from unrelated proteins. When measured by the accuracies of both the derived distance profiles and the torsion angle pairs, we come to a consistent conclusion that the optimal fragment length for structural assembly is around 10, and at least 100 fragments at each location are needed to achieve optimal structure assembly. The distance profiles and torsion angle pairs derived from the fragments have been successfully used in QUARK for ab initio protein structure assembly and are provided by the QUARK online server at http://zhanglab.ccmb.med.umich.edu/QUARK/ . Proteins 2013. © 2012 Wiley Periodicals, Inc.
11.
We have revisited the protein coarse-grained optimized potential for efficient structure prediction (OPEP). The training and validation sets consist of 13 and 16 protein targets. Because optimization depends on details of how the ensemble of decoys is sampled, trial conformations are generated by molecular dynamics, threading, greedy, and Monte Carlo simulations, or taken from publicly available databases. The OPEP parameters are varied by a genetic algorithm using a scoring function which requires that the native structure has the lowest energy, and the native-like structures have energy higher than the native structure but lower than the remote conformations. Overall, we find that OPEP correctly identifies 24 native or native-like states for 29 targets and has very similar capability to the all-atom discrete optimized protein energy model (DOPE), found recently to outperform five currently used energy models.
12.
We report results from all-atom Monte Carlo simulations of the 36-residue villin headpiece subdomain HP-36. Protein-solvent interactions are approximated by an implicit solvent model. Parallel tempering is used to overcome the problem of slow convergence in low-temperature protein simulations. Our results show that this technique allows one to sample native-like structures of small proteins and point out the need for improved energy functions.
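The core of parallel tempering, as mentioned in the abstract above, is the exchange step between replicas running at different temperatures. The following is a minimal sketch of the standard Metropolis swap criterion only; it is not the authors' implementation. The function names are hypothetical, and reduced units with a Boltzmann constant of 1 are an assumption made for this sketch.

```python
import math
import random


def swap_probability(energy_i, energy_j, temp_i, temp_j, k_b=1.0):
    """Acceptance probability for exchanging two replicas.

    p = min(1, exp((beta_i - beta_j) * (E_i - E_j))), beta = 1 / (k_b * T).
    k_b = 1.0 (reduced units) is an assumption of this sketch.
    """
    beta_i = 1.0 / (k_b * temp_i)
    beta_j = 1.0 / (k_b * temp_j)
    delta = (beta_i - beta_j) * (energy_i - energy_j)
    return 1.0 if delta >= 0 else math.exp(delta)


def attempt_swap(energy_i, energy_j, temp_i, temp_j, rng):
    """Return True if the proposed exchange is accepted.

    `rng` is any object with a random() method returning a float in [0, 1).
    """
    return rng.random() < swap_probability(energy_i, energy_j, temp_i, temp_j)


rng = random.Random(0)
# The cold replica (T = 1) holds the higher energy, so the swap is always accepted.
always = swap_probability(-10.0, -20.0, 1.0, 2.0)
# The cold replica already holds the lower energy, so the swap is accepted only rarely.
rare = swap_probability(-20.0, -10.0, 1.0, 2.0)
accepted = attempt_swap(-10.0, -20.0, 1.0, 2.0, rng)
```

Because low-energy conformations found at high temperature are readily passed down to colder replicas, the low-temperature simulation can escape local minima that would otherwise slow its convergence.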
13.
One of the major limitations of computational protein structure prediction is the deviation of predicted models from their experimentally derived true, native structures. The limitations often hinder the possibility of applying computational protein structure prediction methods in biochemical assignment and drug design that are very sensitive to structural details. Refinement of these low‐resolution predicted models to high‐resolution structures close to the native state, however, has proven to be extremely challenging. Thus, protein structure refinement remains a largely unsolved problem. Critical assessment of techniques for protein structure prediction (CASP) specifically indicated that most predictors participating in the refinement category still did not consistently improve model quality. Here, we propose a two‐step refinement protocol, called 3Drefine, to consistently bring the initial model closer to the native structure. The first step is based on optimization of the hydrogen bonding (HB) network and the second step applies atomic‐level energy minimization on the optimized model using a composite physics‐ and knowledge‐based force field. The approach has been evaluated on the CASP benchmark data and it exhibits consistent improvement over the initial structure in both global and local structural quality measures. The 3Drefine method is also computationally inexpensive, consuming only a few minutes of CPU time to refine a protein of typical length (300 residues). The 3Drefine web server is freely available at http://sysbio.rnet.missouri.edu/3Drefine/ . Proteins 2013. © 2012 Wiley Periodicals, Inc.
14.
A Support Vector Machine learning system has been trained to predict protein solvent accessibility from the primary structure. Different kernel functions and sliding window sizes have been explored to find how they affect the prediction performance. Using a cut-off threshold of 15% that splits the dataset evenly (an equal number of exposed and buried residues), this method was able to achieve a prediction accuracy of 70.1% for single sequence input and 73.9% for multiple alignment sequence input, respectively. The prediction of three and more states of solvent accessibility was also studied and compared with other methods. The prediction accuracies are better than, or comparable to, those obtained by other methods such as neural networks, Bayesian classification, multiple linear regression, and information theory. In addition, our results further suggest that this system may be combined with other prediction methods to achieve more reliable results, and that the Support Vector Machine method is a very useful tool for biological sequence analysis.
15.
An easy and uncomplicated method to predict the solvent accessibility state of a site in a multiple protein sequence alignment is described. The approach is based on amino acid exchange and compositional preference matrices for each of three accessibility states: buried, exposed, and intermediate. Calculations utilized a modified version of the 3D-ali databank, a collection of multiple sequence alignments anchored through protein tertiary structural superpositions. The technique achieves the same accuracy as much more complex methods and thus provides such advantages as computational affordability, facile updating, and easily understood residue substitution patterns useful to biochemists involved in protein engineering, design, and structural prediction. The program is available from the authors; and, due to its simplicity, the algorithm can be readily implemented on any system. For a given alignment site, a hand calculation can yield a comparative prediction. Proteins 32:190–199, 1998. © 1998 Wiley-Liss, Inc.
16.
Fleming PJ Gong H Rose GD 《Protein science : a publication of the Protein Society》2006,15(8):1829-1834
Using a test set of 13 small, compact proteins, we demonstrate that a remarkably simple protocol can capture native topology from secondary structure information alone, in the absence of long-range interactions. It has been a long-standing open question whether such information is sufficient to determine a protein's fold. Indeed, even the far simpler problem of reconstructing the three-dimensional structure of a protein from its exact backbone torsion angles has remained a difficult challenge owing to the small, but cumulative, deviations from ideality in backbone planarity, which, if ignored, cause large errors in structure. As a familiar example, a small change in an elbow angle causes a large displacement at the end of your arm; the longer the arm, the larger the displacement. Here, correct secondary structure assignments (alpha-helix, beta-strand, beta-turn, polyproline II, coil) were used to constrain polypeptide backbone chains devoid of side chains, and the most stable folded conformations were determined, using Monte Carlo simulation. Just three terms were used to assess stability: molecular compaction, steric exclusion, and hydrogen bonding. For nine of the 13 proteins, this protocol restricts the main chain to a surprisingly small number of energetically favorable topologies, with the native one prominent among them.
17.
We describe a method for predicting the three-dimensional (3-D) structure of proteins from their sequence alone. The method is based on the electrostatic screening model for the stability of the protein main-chain conformation. The free energy of a protein as a function of its conformation is obtained from the potentials of mean force analysis of high-resolution x-ray protein structures. The free energy function is simple and contains only 44 fitted coefficients. The minimization of the free energy is performed by the torsion space Monte Carlo procedure using the concept of hierarchic condensation. The Monte Carlo minimization procedure is applied to predict the secondary, super-secondary, and native 3-D structures of 12 proteins with 28–110 amino acids. The 3-D structures of the majority of local secondary and super-secondary structures are predicted accurately. This result suggests that control in forming the native-like local structure is distributed along the entire protein sequence. The native 3-D structure is predicted correctly for 3 of 12 proteins composed mainly of α-helices. The method fails to predict the native 3-D structure of proteins with a predominantly β secondary structure. We suggest that the hierarchic condensation is not an appropriate procedure for simulating the folding of proteins made up primarily of β-strands. The method has been proved accurate in predicting the local secondary and super-secondary structures in the blind ab initio 3-D prediction experiment. Proteins 31:74–96, 1998. © 1998 Wiley-Liss, Inc.
18.
Ab initio structure prediction and de novo protein design are two problems at the forefront of research in the fields of structural biology and chemistry. The goal of ab initio structure prediction of proteins is to correctly characterize the 3D structure of a protein using only the amino acid sequence as input. De novo protein design involves the production of novel protein sequences that adopt a desired fold. In this work, the results of a double-blind study are presented in which a new ab initio method was successfully used to predict the 3D structure of a protein designed through an experimental approach using binary patterned combinatorial libraries of de novo sequences. The predicted structure, which was produced before the experimental structure was known and without consideration of the design goals, and the final NMR analysis both characterize this protein as a 4-helix bundle. The similarity of these structures is evidenced by both small RMSD values between the coordinates of the two structures and a detailed analysis of the helical packing.
19.
A reduced protein model with five to six atoms per amino acid and five amino acid types is developed and tested on a three-helix-bundle protein, a 46-amino acid fragment from staphylococcal protein A. The model does not rely on the widely used Go approximation, which ignores non-native interactions. We find that the collapse transition is considerably more abrupt for the protein A sequence than for random sequences with the same composition. The chain collapse is found to be at least as fast as helix formation. Energy minimization restricted to the thermodynamically favored topology gives a structure that has a root-mean-square deviation of 1.8 Å from the native structure. The sequence-dependent part of our potential is pairwise additive. Our calculations suggest that fine-tuning this potential by parameter optimization is of limited use.
20.
We describe the derivation and testing of a knowledge-based atomic environment potential for the modeling of protein structural energetics. An analysis of the probabilities of atomic interactions in a dataset of high-resolution protein structures shows that the probabilities of non-bonded inter-atomic contacts are not statistically independent events, and that the multi-body contact frequencies are poorly predicted from pairwise contact potentials. A pseudo-energy function is defined that measures the preferences for protein atoms to be in a given microenvironment defined by the number of contacting atoms in the environment and its atomic composition. This functional form is tested for its ability to recognize native protein structures amongst an ensemble of decoy structures and a detailed relative performance comparison is made with a number of common functions used in protein structure prediction.