Found 20 similar articles (search time: 3 ms)
1.
In the study of the protein folding problem with ab initio methods, the protein backbone can be built on some periodic lattices. Any vertex of these lattices can be occupied by a "ball," which can represent the mass center of an amino acid in a simplified coarse-grained model of the protein. The backbone, at a coarse-grained level, can be modeled as a No Reverse Self Avoiding Walk, which cannot intersect itself and cannot go back on itself. There is still much debate between those who use lattices to simplify the study of the protein folding problem and those who prefer an off-lattice approach. Lattices can help to identify the protein tertiary structure in a computationally less expensive way than off-lattice approaches, which have to consider a potentially infinite number of possible structures. However, a lattice built from insufficiently accurate direction vectors constrains the predictive ability of the model. The aim of this study is to perform a systematic screening of 7 known classic and 11 newly proposed lattices in terms of predictive power. The crystal structures of 42 different proteins (14 mainly alpha helical, 14 mainly beta sheet and 14 mixed structure proteins) were compared to the most accurate simulated models for each lattice. This strategy defines a scale of fitness for all the analyzed lattices and demonstrates that an increase in the coordination number and in the degrees of freedom is necessary but not sufficient to reach the best result. Instead, the introduction of a good set of direction vectors, as developed and tested in this study, strongly increases the lattice performance.
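As an illustration of the walk constraint described in this abstract, the following is a minimal sketch that grows a self-avoiding (and therefore non-reversing) walk on a simple cubic lattice. The choice of lattice, the function names, and the restart-on-dead-end strategy are assumptions made for the example; they are not taken from the study, which screens 18 different lattices.

```python
import random

# Six unit direction vectors of the simple cubic lattice (coordination number 6).
# This lattice is an assumption for the sketch, not one singled out by the study.
DIRECTIONS = [(1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 0), (0, 0, 1), (0, 0, -1)]


def is_no_reverse_self_avoiding(path):
    """True if every step is a lattice direction and no vertex is visited twice.

    Visiting no vertex twice also rules out stepping straight back.
    """
    for a, b in zip(path, path[1:]):
        step = (b[0] - a[0], b[1] - a[1], b[2] - a[2])
        if step not in DIRECTIONS:
            return False
    return len(set(path)) == len(path)


def grow_walk(n_vertices, rng, max_attempts=10_000):
    """Grow a walk by random steps, restarting whenever it becomes trapped."""
    for _ in range(max_attempts):
        path = [(0, 0, 0)]
        occupied = {path[0]}
        while len(path) < n_vertices:
            x, y, z = path[-1]
            free = [
                (x + dx, y + dy, z + dz)
                for dx, dy, dz in DIRECTIONS
                if (x + dx, y + dy, z + dz) not in occupied
            ]
            if not free:
                break  # dead end: discard this attempt and start over
            nxt = rng.choice(free)
            path.append(nxt)
            occupied.add(nxt)
        if len(path) == n_vertices:
            return path
    raise RuntimeError("no walk found within max_attempts")


walk = grow_walk(20, random.Random(0))
print(len(walk), is_no_reverse_self_avoiding(walk))
```

Each vertex of the returned walk would stand for one amino acid "ball"; a richer set of direction vectors can be tried by replacing `DIRECTIONS`.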
2.
3.
Bonneau R Ruczinski I Tsai J Baker D 《Protein science : a publication of the Protein Society》2002,11(8):1937-1944
Although much of the motivation for experimental studies of protein folding is to obtain insights for improving protein structure prediction, there has been relatively little connection between experimental protein folding studies and computational structural prediction work in recent years. In the present study, we show that the relationship between protein folding rates and the contact order (CO) of the native structure has implications for ab initio protein structure prediction. Rosetta ab initio folding simulations produce a dearth of high CO structures and an excess of low CO structures, as expected if the computer simulations mimic to some extent the actual folding process. Consistent with this, the majority of failures in ab initio prediction in the CASP4 (critical assessment of structure prediction) experiment involved high CO structures likely to fold much more slowly than the lower CO structures for which reasonable predictions were made. This bias against high CO structures can be partially alleviated by performing large numbers of additional simulations, selecting out the higher CO structures, and eliminating the very low CO structures; this leads to a modest improvement in prediction quality. More significant improvements in predictions for proteins with complex topologies may be possible following significant increases in high-performance computing power, which will be required for thoroughly sampling high CO conformations (high CO proteins can take six orders of magnitude longer to fold than low CO proteins). Importantly for such a strategy, simulations performed for high CO structures converge much less strongly than those for low CO structures, and hence, lack of simulation convergence can indicate the need for improved sampling of high CO conformations. 
The parallels between Rosetta simulations and folding in vivo may extend to misfolding: The very low CO structures that accumulate in Rosetta simulations consist primarily of local up-down beta-sheets that may resemble precursors to amyloid formation.
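Contact order is commonly defined as the average sequence separation between residues that are in contact, divided by the chain length. The sketch below computes this relative contact order from one representative point per residue. The distance cutoff, the minimum sequence separation, and the one-point-per-residue simplification are assumptions for illustration and are not the parameters used in the study.

```python
import math


def relative_contact_order(coords, cutoff=8.0, min_separation=2):
    """Average sequence separation of contacting residue pairs, divided by chain length.

    coords: one (x, y, z) point per residue, in sequence order.
    cutoff and min_separation are illustrative defaults (assumptions).
    """
    n = len(coords)
    total_separation = 0
    n_contacts = 0
    for i in range(n):
        for j in range(i + min_separation, n):
            if math.dist(coords[i], coords[j]) <= cutoff:
                total_separation += j - i
                n_contacts += 1
    if n_contacts == 0:
        return 0.0
    return total_separation / (n * n_contacts)


# Toy geometries: a fully extended chain and a two-strand hairpin, 10 residues each.
extended = [(3.8 * i, 0.0, 0.0) for i in range(10)]
hairpin = [(3.8 * i, 0.0, 0.0) for i in range(5)] + [
    (3.8 * (9 - i), 5.0, 0.0) for i in range(5, 10)
]

print(relative_contact_order(extended))
print(relative_contact_order(hairpin))
```

The extended chain only has local contacts and so scores lower than the hairpin, whose strands bring sequence-distant residues together.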
4.
Fragment-HMM: a new approach to protein structure prediction Cited by: 1 (self-citations: 0, citations by others: 1)
We designed a simple position-specific hidden Markov model to predict protein structure. Our new framework naturally repeats itself to converge to a final target, conglomerating fragment assembly, clustering, target selection, refinement, and consensus, all in one process. Our initial implementation of this theory converges to within 6 Å of the native structures for 100% of decoys on all six standard benchmark proteins used in ROSETTA (discussed by Simons and colleagues in a recent paper), which achieved only 14%-94% for the same data. The qualities of the best decoys and the final decoys our theory converges to are also notably better.
5.
Background
Disordered regions are segments of the protein chain which do not adopt stable structures. Such segments are often of interest because they have a close relationship with protein expression and functionality. As such, protein disorder prediction is important for protein structure prediction, structure determination and function annotation.
6.
NMR residual dipolar couplings (RDCs), in the form of the projection angles between the respective internuclear bond vectors, are used as structural restraints in the ab initio structure prediction of a test set of six proteins. The restraints are applied using a recently developed SICHO (SIde-CHain-Only) lattice protein model that employs a replica exchange Monte Carlo (MC) algorithm to search conformational space. Using a small number of RDC restraints, the quality of the predicted structures is improved as reflected by lower RMSD/dRMSD (root mean square deviation/distance root mean square deviation) values from the corresponding native structures and by the higher correlation of the most cooperative mode of motion of each predicted structure with that of the native structure. The latter, in particular, has possible implications for the structure-based functional analysis of predicted structures.
7.
Is there value in constructing side chains while searching protein conformational space during an ab initio simulation? If so, what is the most computationally efficient method for constructing these side chains? To answer these questions, four published approaches were used to construct side chain conformations on a range of near-native main chains generated by ab initio protein structure prediction methods. The accuracy of these approaches was compared with a naive approach that selects the most frequently observed rotamer for a given amino acid to construct side chains. An all-atom conditional probability discriminatory function is useful for selecting conformations with overall low all-atom root mean square deviation (r.m.s.d.) and the discrimination improves on sets that are closer to the native conformation. In addition, the naive approach performs as well as more sophisticated methods in terms of the percentage of chi(1) angles built accurately and the all-atom r.m.s.d. between the native and near-native conformations. The results suggest that the naive method would be extremely useful for fast and efficient side chain construction on vast numbers of conformations for ab initio prediction of protein structure.
8.
MOTIVATION: The Monte Carlo fragment insertion method for protein tertiary structure prediction (ROSETTA) of Baker and others has been merged with the I-SITES library of sequence structure motifs and the HMMSTR model for local structure in proteins, to form a new public server for the ab initio prediction of protein structure. The server performs several tasks in addition to tertiary structure prediction, including a database search, amino acid profile generation, fragment structure prediction, and backbone angle and secondary structure prediction. Meeting reasonable service goals required improvements in efficiency, in particular for the ROSETTA algorithm. RESULTS: The new server was used for blind predictions of 40 protein sequences as part of the CASP4 blind structure prediction experiment. The results for 31 of those predictions are presented here. 61% of the residues overall were found in topologically correct predictions, which are defined as fragments of 30 residues or more with a root-mean-square deviation in superimposed alpha carbons of less than 6 Å. HMMSTR 3-state secondary structure predictions were 73% correct overall. Tertiary structure predictions did not improve the accuracy of secondary structure prediction.
9.
Ab initio protein structure prediction Cited by: 3 (self-citations: 0, citations by others: 3)
Steady progress has been made in the field of ab initio protein folding. A variety of methods now allow the prediction of low-resolution structures of small proteins or protein fragments up to approximately 100 amino acid residues in length. Such low-resolution structures may be sufficient for the functional annotation of protein sequences on a genome-wide scale. Although no consistently reliable algorithm is currently available, the essential challenges to developing a general theory or approach to protein structure prediction are better understood. The energy landscapes resulting from the structure prediction algorithms are only partially funneled to the native state of the protein. This review focuses on two areas of recent advances in ab initio structure prediction: improvements in the energy functions and strategies to search the caldera region of the energy landscapes.
10.
Chen CT Lin HN Sung TY Hsu WL 《Journal of bioinformatics and computational biology》2006,4(6):1287-1307
Local structure prediction can facilitate ab initio structure prediction, protein threading, and remote homology detection. However, the accuracy of existing methods is limited. In this paper, we propose a knowledge-based prediction method that assigns a measure called the local match rate to each position of an amino acid sequence to estimate the confidence of our method. Empirically, the accuracy of the method correlates positively with the local match rate; therefore, we employ it to predict the local structures of positions with a high local match rate. For positions with a low local match rate, we propose a neural network prediction method. To better utilize the knowledge-based and neural network methods, we design a hybrid prediction method, HYPLOSP (HYbrid method to Protein LOcal Structure Prediction) that combines both methods. To evaluate the performance of the proposed methods, we first perform cross-validation experiments by applying our knowledge-based method, a neural network method, and HYPLOSP to a large dataset of 3,925 protein chains. We test our methods extensively on three different structural alphabets and evaluate their performance by two widely used criteria, Maximum Deviation of backbone torsion Angle (MDA) and Q(N), which is similar to Q(3) in secondary structure prediction. We then compare HYPLOSP with three previous studies using a dataset of 56 new protein chains. HYPLOSP shows promising results in terms of MDA and Q(N) accuracy and demonstrates its alphabet-independent capability.
11.
Ab initio protein structure prediction methods have improved dramatically in the past several years. Because these methods require only the sequence of the protein of interest, they are potentially applicable to the open reading frames in the many organisms whose sequences have been and will be determined. Ab initio methods cannot currently produce models of high enough resolution for use in rational drug design, but there is an exciting potential for using the methods for functional annotation of protein sequences on a genomic scale. Here we illustrate how functional insights can be obtained from low-resolution predicted structures using examples from blind ab initio structure predictions from the third and fourth critical assessment of structure prediction (CASP3, CASP4) experiments.
12.
Giovanni Minervini Giuseppe Evangelista Fabio Polticelli Monika Piwowar Marek Kochanczyk Lukasz Flis Maciej Malawski Tomasz Szepieniec Zdzisław Wiśniowski Ewa Matczyńska Katarzyna Prymula Irena Roterman 《Bioinformation》2008,3(4):177-179
The number of natural proteins, although large, is significantly smaller than the theoretical number of proteins that can be obtained by combining the 20 natural amino acids, the so-called “never born proteins” (NBPs). The study of the structure and properties of these proteins makes it possible to investigate the origin of the unique characteristics or special properties of natural proteins. However, the structural study of NBPs can also be regarded as an ideal test for evaluating the efficiency of software packages for ab initio protein structure prediction. In this research, 10,000 three-dimensional structures of proteins of completely random sequence, generated according to the ROSETTA and FOD models, were compared. The results show the limits of these software packages, but at the same time indicate that in many cases there is significant agreement between the predictions obtained.
13.
We have developed an ab initio protein structure prediction method called chunk-TASSER that uses ab initio folded supersecondary structure chunks of a given target as well as threading templates for obtaining contact potentials and distance restraints. The predicted chunks, selected on the basis of a new fragment comparison method, are folded by a fragment insertion method. Full-length models are built and refined by the TASSER methodology, which searches conformational space via parallel hyperbolic Monte Carlo. We employ an optimized reduced force field that includes knowledge-based statistical potentials and restraints derived from the chunks as well as threading templates. The method is tested on a dataset of 425 hard target proteins ≤250 amino acids in length. The average TM-scores of the best of top five models per target are 0.266, 0.336, and 0.362 by the threading algorithm SP(3), original TASSER and chunk-TASSER, respectively. For a subset of 80 proteins with predicted alpha-helix content ≥50%, these averages are 0.284, 0.356, and 0.403, respectively. The percentages of proteins with the best of top five models having TM-score ≥0.4 (a statistically significant threshold for structural similarity) are 3.76, 20.94, and 28.94% by SP(3), TASSER, and chunk-TASSER, respectively, overall, while for the subset of 80 predominantly helical proteins, these percentages are 2.50, 23.75, and 41.25%. Thus, chunk-TASSER shows a significant improvement over TASSER for modeling hard targets where no good template can be identified. We also tested chunk-TASSER on 21 medium/hard targets <200 amino acids long from CASP7. Chunk-TASSER is approximately 11% (10%) better than TASSER for the total TM-score of the first (best of top five) models. Chunk-TASSER is fully automated and can be used in proteome scale protein structure prediction.
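For readers unfamiliar with the TM-score values quoted in this abstract, the sketch below evaluates the standard TM-score sum for one fixed superposition, given the distances between aligned residue pairs. The full TM-score takes the maximum of this quantity over superpositions, which is not done here. The function names and the lower bound of 0.5 Å on the distance scale are assumptions of this sketch, added as a guard for very short chains.

```python
def tm_distance_scale(target_length):
    """Length-dependent distance scale d0 = 1.24 * (L - 15)^(1/3) - 1.8, in angstroms."""
    if target_length <= 15:
        return 0.5  # guard for very short chains (assumption of this sketch)
    return max(0.5, 1.24 * (target_length - 15) ** (1.0 / 3.0) - 1.8)


def tm_score_fixed_superposition(distances, target_length):
    """TM-score sum for one superposition.

    distances: distances (in angstroms) between aligned residue pairs.
    target_length: number of residues in the target (native) structure.
    """
    d0 = tm_distance_scale(target_length)
    return sum(1.0 / (1.0 + (d / d0) ** 2) for d in distances) / target_length


# A perfect model scores 1.0; aligning only half the target perfectly scores 0.5.
print(tm_score_fixed_superposition([0.0] * 100, 100))
print(tm_score_fixed_superposition([0.0] * 50, 100))
```

Because the score is normalized by the target length and uses a length-dependent scale, values such as the 0.4 threshold in the abstract can be compared across proteins of different sizes.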
14.
Fragment assembly using structural motifs excised from other solved proteins has been shown to be an efficient method for ab initio protein‐structure prediction. However, how to construct accurate fragments, how to derive optimal restraints from fragments, and what the best fragment length is are basic issues yet to be systematically examined. In this work, we developed a gapless‐threading method to generate position‐specific structure fragments. Distance profiles and torsion angle pairs are then derived from the fragments by statistical consistency analysis, which achieved comparable accuracy with the machine‐learning‐based methods although the fragments were taken from unrelated proteins. When measured by the accuracies of both the derived distance profiles and the torsion angle pairs, we come to a consistent conclusion that the optimal fragment length for structural assembly is around 10, and at least 100 fragments at each location are needed to achieve optimal structure assembly. The distance profiles and torsion angle pairs derived from the fragments have been successfully used in QUARK for ab initio protein structure assembly and are provided by the QUARK online server at http://zhanglab.ccmb.med.umich.edu/QUARK/. Proteins 2013. © 2012 Wiley Periodicals, Inc.
15.
Andrzej Kloczkowski Robert L. Jernigan Zhijun Wu Guang Song Lei Yang Andrzej Kolinski Piotr Pokarowski 《Journal of structural and functional genomics》2009,10(1):67-81
Much structural information is encoded in the internal distances; a distance matrix-based approach can be used to predict protein structure and dynamics, and for structural refinement. Our approach is based on the square distance matrix $\mathbf{D} = [r_{ij}^{2}]$ containing all square distances between residues in proteins. This distance matrix contains more information than the contact matrix $\mathbf{C}$, which has elements of either 0 or 1 depending on whether the distance $r_{ij}$ is greater or less than a cutoff value $r_{\text{cutoff}}$. We have performed spectral decomposition of the distance matrices, $\mathbf{D} = \sum_{k} \lambda_{k} \mathbf{v}_{k} \mathbf{v}_{k}^{T}$, in terms of eigenvalues $\lambda_{k}$ and the corresponding eigenvectors $\mathbf{v}_{k}$, and found that it contains at most five nonzero terms. A dominant eigenvector is proportional to $r^{2}$, the square distance of points from the center of mass, with the next three being the principal components of the system of points. By predicting $r^{2}$ from the sequence we can approximate a distance matrix of a protein with an expected RMSD value of about 7.3 Å, and by combining it with the prediction of the first principal component we can improve this approximation to 4.0 Å. We can also explain the role of hydrophobic interactions for the protein structure, because $r$ is highly correlated with the hydrophobic profile of the sequence. Moreover, $r$ is highly correlated with several sequence profiles which are useful in protein structure prediction, such as contact number, the residue-wise contact order (RWCO) or mean square fluctuations (i.e. crystallographic temperature factors). We have also shown that the next three components are related to spatial directionality of the secondary structure elements, and they may also be predicted from the sequence, improving overall structure prediction. We have also shown that the large number of available HIV-1 protease structures provides a remarkable sampling of conformations, which can be viewed as direct structural information about the dynamics. After structure matching, we apply principal component analysis (PCA) to obtain the important apparent motions for both bound and unbound structures. There are significant similarities between the first few key motions and the first few low-frequency normal modes calculated from a static representative structure with an elastic network model (ENM) that is based on the contact matrix $\mathbf{C}$ (related to $\mathbf{D}$), strongly suggesting that the variations among the observed structures and the corresponding conformational changes are facilitated by the low-frequency, global motions intrinsic to the structure. Similarities are also found when the approach is applied to an NMR ensemble, as well as to atomic molecular dynamics (MD) trajectories. Thus, a sufficiently large number of experimental structures can directly provide important information about protein dynamics, but ENM can also provide a similar sampling of conformations. Finally, we use distance constraints from databases of known protein structures for structure refinement. We use the distributions of distances of various types in known protein structures to obtain the most probable ranges or the mean-force potentials for the distances. We then impose these constraints on structures to be refined or include the mean-force potentials directly in the energy minimization so that more plausible structural models can be built. This approach was successfully used by us in 2006 in the CASPR structure refinement.
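The statement that the spectral decomposition of the square distance matrix has at most five nonzero terms follows from a short rank argument. The derivation below is a sketch added for illustration; the symbols $\mathbf{x}_i$, $\mathbf{X}$, $\mathbf{s}$ and $\mathbf{1}$ are notation introduced here and do not appear in the abstract.

```latex
% x_i: position of residue i relative to the center of mass (3-vector)
% X:   N x 3 matrix whose rows are the x_i
% s:   N-vector with entries s_i = r_i^2 = |x_i|^2
% 1:   N-vector of ones
\begin{align}
  r_{ij}^{2} &= \lVert \mathbf{x}_i - \mathbf{x}_j \rVert^{2}
              = r_i^{2} + r_j^{2} - 2\,\mathbf{x}_i \cdot \mathbf{x}_j , \\
  \mathbf{D} &= \mathbf{s}\,\mathbf{1}^{T} + \mathbf{1}\,\mathbf{s}^{T}
              - 2\,\mathbf{X}\mathbf{X}^{T} , \\
  \operatorname{rank}(\mathbf{D})
             &\le \operatorname{rank}(\mathbf{s}\,\mathbf{1}^{T})
              + \operatorname{rank}(\mathbf{1}\,\mathbf{s}^{T})
              + \operatorname{rank}(\mathbf{X}\mathbf{X}^{T})
              \le 1 + 1 + 3 = 5 .
\end{align}
```

Since $\mathbf{D}$ is symmetric, its rank equals its number of nonzero eigenvalues, so the sum $\sum_{k} \lambda_{k} \mathbf{v}_{k} \mathbf{v}_{k}^{T}$ can have at most five nonzero terms for points in three dimensions.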
16.
One of the major bottlenecks in many ab initio protein structure prediction methods is currently the selection of a small number of candidate structures for high‐resolution refinement from large sets of low‐resolution decoys. This step often includes a scoring by low‐resolution energy functions and a clustering of conformations by their pairwise root mean square deviations (RMSDs). As an efficient selection is crucial to reduce the overall computational cost of the predictions, any improvement in this direction can increase the overall performance of the predictions and the range of protein structures that can be predicted. We show here that the use of structural profiles, which can be predicted with good accuracy from the amino acid sequences of proteins, provides an efficient means to identify good candidate structures. Proteins 2010. © 2009 Wiley‐Liss, Inc.
17.
The folding process defines three‐dimensional protein structures from their amino acid chains. A protein's structure determines its activity and properties; thus knowing such conformation on an atomic level is essential for both basic and applied studies of protein function and dynamics. However, the acquisition of such structures by experimental methods is slow and expensive, and current computational methods mostly depend on previously known structures to determine new ones. Here we present new software called GSAFold that applies the generalized simulated annealing (GSA) algorithm to ab initio protein structure prediction. The GSA is a stochastic search algorithm employed in energy minimization and used in global optimization problems, especially those that depend on long‐range interactions, such as gravity models and conformation optimization of small molecules. This new implementation applies, for the first time in ab initio protein structure prediction, an analytical inverse for the Visitation function of GSA. It also employs the broadly used NAMD Molecular Dynamics package to carry out energy calculations, allowing the user to select different force fields and parameterizations. Moreover, the software also allows the execution of several simulations simultaneously. Applications that depend on protein structures include rational drug design and structure‐based protein function prediction. Applying GSAFold to a test peptide, it was possible to predict the structure of mastoparan‐X to a root mean square deviation of 3.00 Å. Proteins 2012; © 2012 Wiley Periodicals, Inc.
18.
19.
Ab initio protein structure prediction using pathway models Cited by: 1 (self-citations: 0, citations by others: 1)
Ab initio prediction is the challenging attempt to predict protein structures based only on sequence information and without using templates. It is often divided into two distinct sub-problems: (a) the scoring function that can distinguish native, or native-like structures, from non-native ones; and (b) the method of searching the conformational space. Currently, there is no reliable scoring function that can always drive a search to the native fold, and there is no general search method that can guarantee a significant sampling of near-natives. Pathway models combine the scoring function and the search. In this short review, we explore some of the ways pathway models are used in folding, in published works since 2001, and present a new pathway model, HMMSTR-CM, that uses a fragment library and a set of nucleation/propagation-based rules. The new method was used for ab initio predictions as part of CASP5. This work was presented at the Winter School in Bioinformatics, Bologna, Italy, 10-14 February 2003.
20.
Recently ab initio protein structure prediction methods have advanced sufficiently so that they often assemble the correct low resolution structure of the protein. To enhance the speed of conformational search, many ab initio prediction programs adopt a reduced protein representation. However, for drug design purposes, better quality structures are probably needed. To achieve this refinement, it is natural to use a more detailed heavy atom representation. Here, as opposed to costly implicit or explicit solvent molecular dynamics simulations, knowledge-based heavy atom pair potentials were employed. By way of illustration, we tried to improve the quality of the predicted structures obtained from the ab initio prediction program TOUCHSTONE by three methods: local constraint refinement, reduced predicted tertiary contact refinement, and statistical pair potential guided molecular dynamics. Sixty-seven predicted structures from 30 small proteins (less than 150 residues in length) representing different structural classes (alpha, beta, alpha/beta) were examined. In 33 cases, the root mean square deviation (RMSD) from native structures improved by more than 0.3 Å; in 19 cases, the improvement was more than 0.5 Å, and sometimes as large as 1 Å. In only seven (four) cases did the refinement procedure increase the RMSD by more than 0.3 (0.5) Å. For the remaining structures, the refinement procedures changed the structures by less than 0.3 Å. While modest, the performance of the current refinement methods is better than the published refinement results obtained using standard molecular dynamics.
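The RMSD bookkeeping reported in this abstract can be illustrated with a minimal sketch. It assumes the two coordinate sets are already optimally superimposed and matched atom for atom; the superposition step itself is omitted. The function names and the three category labels are hypothetical, and the 0.3 Å threshold is taken from the abstract.

```python
import math


def rmsd(coords_a, coords_b):
    """Root mean square deviation between two pre-superimposed, equal-length coordinate sets."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate sets must be non-empty and of equal length")
    squared = sum(math.dist(a, b) ** 2 for a, b in zip(coords_a, coords_b))
    return math.sqrt(squared / len(coords_a))


def classify_refinement(rmsd_before, rmsd_after, threshold=0.3):
    """Label a refinement by how the RMSD to the native structure changed (in angstroms)."""
    change = rmsd_after - rmsd_before
    if change < -threshold:
        return "improved"
    if change > threshold:
        return "worsened"
    return "unchanged"


native = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
model = [(0.0, 0.0, 1.0), (1.0, 0.0, 1.0)]
print(rmsd(native, model))
print(classify_refinement(5.0, 4.5))
```

Applying `classify_refinement` to each pair of before and after values and counting the labels would reproduce the kind of tally given in the abstract.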