20 similar articles found (search time: 0 ms)
1.
We have developed an ab initio protein structure prediction method called chunk-TASSER that uses ab initio folded supersecondary structure chunks of a given target as well as threading templates for obtaining contact potentials and distance restraints. The predicted chunks, selected on the basis of a new fragment comparison method, are folded by a fragment insertion method. Full-length models are built and refined by the TASSER methodology, which searches conformational space via parallel hyperbolic Monte Carlo. We employ an optimized reduced force field that includes knowledge-based statistical potentials and restraints derived from the chunks as well as threading templates. The method is tested on a dataset of 425 hard target proteins ≤250 amino acids in length. The average TM-scores of the best of top five models per target are 0.266, 0.336, and 0.362 by the threading algorithm SP3, original TASSER, and chunk-TASSER, respectively. For a subset of 80 proteins with predicted alpha-helix content ≥50%, these averages are 0.284, 0.356, and 0.403, respectively. The percentages of proteins with the best of top five models having TM-score ≥0.4 (a statistically significant threshold for structural similarity) are 3.76, 20.94, and 28.94% by SP3, TASSER, and chunk-TASSER, respectively, overall, while for the subset of 80 predominantly helical proteins, these percentages are 2.50, 23.75, and 41.25%. Thus, chunk-TASSER shows a significant improvement over TASSER for modeling hard targets where no good template can be identified. We also tested chunk-TASSER on 21 medium/hard targets from CASP7, each fewer than 200 amino acids long. Chunk-TASSER is approximately 11% (10%) better than TASSER for the total TM-score of the first (best of top five) models. Chunk-TASSER is fully automated and can be used in proteome-scale protein structure prediction.
2.
Ab initio protein structure prediction using pathway models (total citations: 1; self-citations: 0; citations by others: 1)
Ab initio prediction is the challenging attempt to predict protein structures based only on sequence information and without using templates. It is often divided into two distinct sub-problems: (a) the scoring function that can distinguish native, or native-like structures, from non-native ones; and (b) the method of searching the conformational space. Currently, there is no reliable scoring function that can always drive a search to the native fold, and there is no general search method that can guarantee a significant sampling of near-natives. Pathway models combine the scoring function and the search. In this short review, we explore some of the ways pathway models are used in folding, in published works since 2001, and present a new pathway model, HMMSTR-CM, that uses a fragment library and a set of nucleation/propagation-based rules. The new method was used for ab initio predictions as part of CASP5. This work was presented at the Winter School in Bioinformatics, Bologna, Italy, 10-14 February 2003.
3.
Rainer Breitling Shawn Ritchie Dayan Goodenowe Mhairi L. Stewart Michael P. Barrett 《Metabolomics : Official journal of the Metabolomic Society》2006,2(3):155-164
Fourier transform mass spectrometry has recently been introduced into the field of metabolomics as a technique that enables the mass separation of complex mixtures at very high resolution and with ultra high mass accuracy. Here we show that this enhanced mass accuracy can be exploited to predict large metabolic networks ab initio, based only on the observed metabolites without recourse to predictions based on the literature. The resulting networks are highly information-rich and clearly non-random. They can be used to infer the chemical identity of metabolites and to obtain a global picture of the structure of cellular metabolic networks. This represents the first reconstruction of metabolic networks based on unbiased metabolomic data and offers a breakthrough in the systems-wide analysis of cellular metabolism.
4.
5.
6.
A statistical mechanics methodology for predicting the solution structures and populations of peptides developed recently is based on a novel method for optimizing implicit solvation models, which was applied initially to a cyclic hexapeptide in DMSO (C. Baysal and H. Meirovitch, Journal of the American Chemical Society, 1998, vol. 120, pp. 800-812). Thus, the molecule has been described by the simplified energy function $E_{\mathrm{tot}} = E_{\mathrm{GRO}} + \sum_k \sigma_k A_k$, where $E_{\mathrm{GRO}}$ is the GROMOS force-field energy, and $\sigma_k$ and $A_k$ are the atomic solvation parameter (ASP) and the solvent accessible surface area of atom $k$, respectively. In a more recent study, these ASPs have been found to be transferable to the cyclic pentapeptide cyclo(D-Pro(1)-Ala(2)-Ala(3)-Ala(4)-Ala(5)) in DMSO (C. Baysal and H. Meirovitch, Biopolymers, 2000, vol. 53, pp. 423-433). In the present paper, our methodology is applied to the cyclic heptapeptides axinastatin 2 [cyclo(Asn(1)-Pro(2)-Phe(3)-Val(4)-Leu(5)-Pro(6)-Val(7))] and axinastatin 3 [cyclo(Asn(1)-Pro(2)-Phe(3)-Ile(4)-Leu(5)-Pro(6)-Val(7))], in DMSO, which were studied by NMR by Mechnich et al. (Helvetica Chimica Acta, 1997, vol. 80, pp. 1338-1354). The calculations for axinastatin 2 show that special ASPs should be optimized for the partially charged side-chain atoms of Asn while the rest of the atoms take their values derived in our previous work; this suggests that similar optimization might be needed for other side chains as well. The solution structures of these peptides are obtained ab initio (i.e., without using experimental restraints) by an extensive conformational search based on $E_{\mathrm{GRO}}$ alone and on $E^{*}_{\mathrm{tot}}$, which uses the new set of ASPs. For $E^{*}_{\mathrm{tot}}$, the theoretical values of proton-proton distances, $^{3}J$ coupling constants, and other properties are found to agree very well with the NMR results, and they are always better than those based on $E_{\mathrm{GRO}}$.
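The simplified energy function in this abstract, the force-field energy plus a sum over atoms of ASP times solvent accessible surface area, can be illustrated with a minimal Python sketch. The function name `total_energy` and all numeric values are hypothetical and chosen only for illustration; they are not taken from the paper, and the sketch does not compute the GROMOS energy or the surface areas themselves.

```python
from typing import Sequence


def total_energy(e_gro: float, asp: Sequence[float], area: Sequence[float]) -> float:
    """Return E_tot = E_GRO + sum_k sigma_k * A_k.

    e_gro : force-field energy, supplied by the caller (assumed precomputed)
    asp   : atomic solvation parameter sigma_k for each atom
    area  : solvent accessible surface area A_k for each atom
    """
    if len(asp) != len(area):
        raise ValueError("need exactly one ASP and one surface area per atom")
    solvation = sum(sigma * a for sigma, a in zip(asp, area))
    return e_gro + solvation


# Made-up values for three atoms (illustrative only, not from the paper).
e_tot = total_energy(-120.0, asp=[0.5, -0.25, 0.0], area=[10.0, 8.0, 5.0])
print(e_tot)  # -117.0
```

An atom with a zero ASP contributes nothing to the solvation term, so in this form optimizing a special set of ASPs for selected atoms only changes the corresponding entries of `asp`.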
7.
The results of a protein structure prediction contest are reviewed. Twelve different groups entered predictions on 14 proteins of known sequence whose structures had been determined but not yet disseminated to the scientific community. Thus, these represent true tests of the current state of structure prediction methodologies. From this work, it is clear that accurate tertiary structure prediction is not yet possible. However, protein fold and motif prediction are possible when the motif is recognizably similar to another known structure. Internal symmetry and the information inherent in an aligned family of homologous sequences facilitate predictive efforts. Novel folds remain a major challenge for prediction efforts. © 1995 Wiley-Liss, Inc.
8.
9.
Ab initio prediction of thermodynamically feasible reaction directions from biochemical network stoichiometry (total citations: 3; self-citations: 0; citations by others: 3)
Analysis of the stoichiometric structure of metabolic networks provides insights into the relationships between structure, function, and regulation of metabolic systems. Based on knowledge of only reaction stoichiometry, certain aspects of network functionality and robustness can be predicted. Current theories focus on breaking a metabolic network down into non-decomposable pathways able to operate in steady state. The physics underlying these theories is based on mass balance and the laws of thermodynamics. However, due to the inherent nonlinearity of the thermodynamic constraints on metabolic fluxes, computational analysis of large-scale biochemical systems can be expensive. In this study, it is shown how the feasible reaction directions may be determined by either computing the allowable ranges under the mass-balance and thermodynamic constraints or by analyzing the stoichiometric structure of the network. The computed reaction directions translate into a set of linear constraints necessary for thermodynamic feasibility. This set of necessary linear constraints is shown to be sufficient to guarantee feasibility in certain cases, thus translating the nonlinear thermodynamic constraints to linear. We show that for a reaction network of 44 internal reactions representing energy metabolism, the computed linear inequality constraints represent necessary and sufficient conditions for thermodynamic feasibility.
10.
11.
12.
Sánchez IE Beltrao P Stricher F Schymkowitz J Ferkinghoff-Borg J Rousseau F Serrano L 《PLoS computational biology》2008,4(4):e1000052
Current experiments likely cover only a fraction of all protein-protein interactions. Here, we developed a method to predict SH2-mediated protein-protein interactions using the structure of SH2-phosphopeptide complexes and the FoldX algorithm. We show that our approach performs similarly to experimentally derived consensus sequences and substitution matrices at predicting known in vitro and in vivo targets of SH2 domains. We use our method to provide a set of high-confidence interactions for human SH2 domains with known structure filtered on secondary structure and phosphorylation state. We validated the predictions using literature-derived SH2 interactions and a probabilistic score obtained from a naive Bayes integration of information on coexpression, conservation of the interaction in other species, shared interaction partners, and functions. We show how our predictions lead to a new hypothesis for the role of SH2 domains in signaling.
13.
Ian Walsh Alberto JM Martin Catherine Mooney Enrico Rubagotti Alessandro Vullo Gianluca Pollastri 《BMC bioinformatics》2009,10(1):195-19
Background
Proteins, especially larger ones, are often composed of individual evolutionary units, domains, which have their own function and structural fold. Predicting domains is an important intermediate step in protein analyses, including the prediction of protein structures.
14.
15.
EM-Fold was used to build models for nine proteins in the maps of GroEL (7.7 Å resolution) and ribosome (6.4 Å resolution) in the ab initio modeling category of the 2010 cryo-electron microscopy modeling challenge. EM-Fold assembles predicted secondary structure elements (SSEs) into regions of the density map that were identified to correspond to either α-helices or β-strands. The assembly uses a Monte Carlo algorithm where loop closure, density-SSE length agreement, and strength of connecting density between SSEs are evaluated. Top-scoring models are refined by translating, rotating, and bending SSEs to yield better agreement with the density map. EM-Fold produces models that contain backbone atoms within SSEs only. The RMSD values of the models with respect to native range from 2.4 to 3.5 Å for six of the nine proteins. EM-Fold failed to predict the correct topology in three cases. Subsequently, Rosetta was used to build loops and side chains for the very best scoring models after EM-Fold refinement. The refinement within Rosetta's force field is driven by a density agreement score that calculates a cross-correlation between a density map simulated from the model and the experimental density map. All-atom RMSDs as low as 3.4 Å are achieved in favorable cases. Values above 10.0 Å are observed for two proteins with low overall content of secondary structure and hence particularly complex loop modeling problems. RMSDs over residues in secondary structure elements range from 2.5 to 4.8 Å.
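The RMSD values quoted in this abstract measure the root mean square deviation between matched atoms of a model and the native structure. A minimal Python sketch of that quantity follows; it assumes the two coordinate sets are already superimposed and listed in the same atom order, and it does not perform the optimal superposition step that structure comparison tools normally apply first. The function name `rmsd` and the coordinates are hypothetical.

```python
import math
from typing import Sequence, Tuple

Point = Tuple[float, float, float]


def rmsd(model: Sequence[Point], native: Sequence[Point]) -> float:
    """Root mean square deviation over matched atoms, in the input units.

    Assumes both structures are already superimposed and that atom i of
    `model` corresponds to atom i of `native`.
    """
    if len(model) != len(native) or len(model) == 0:
        raise ValueError("need two non-empty coordinate lists of equal length")
    squared = sum(
        (a - b) ** 2
        for p, q in zip(model, native)
        for a, b in zip(p, q)
    )
    return math.sqrt(squared / len(model))


# Two atoms; the second is displaced by 2 units along z (made-up data).
model = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
native = [(0.0, 0.0, 0.0), (1.0, 0.0, 2.0)]
print(rmsd(model, native))  # sqrt(4 / 2), about 1.414
```

Restricting the input lists to backbone atoms within secondary structure elements, or extending them to all atoms, gives the different kinds of RMSD the abstract distinguishes.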
16.
Ian Walsh Davide Baù Alberto JM Martin Catherine Mooney Alessandro Vullo Gianluca Pollastri 《BMC structural biology》2009,9(1):5-20
Background
Prediction of protein structures from their sequences is still one of the open grand challenges of computational biology. Some approaches to protein structure prediction, especially ab initio ones, rely to some extent on the prediction of residue contact maps. Residue contact map predictions have been assessed at the CASP competition for several years now. Although it has been shown that exact contact maps generally yield correct three-dimensional structures, this is true only at a relatively low resolution (3–4 Å from the native structure). Another known weakness of contact maps is that they are generally predicted ab initio, that is, without exploiting information about potential homologues of known structure.
Results
We introduce a new class of distance restraints for protein structures: multi-class distance maps. We show that Cα trace reconstructions based on 4-class native maps are significantly better than those from residue contact maps. We then build two predictors of 4-class maps based on recursive neural networks: one ab initio, or relying on the sequence and on evolutionary information; one template-based, or in which homology information to known structures is provided as a further input. We show that virtually any level of sequence similarity to structural templates (down to less than 10%) yields more accurate 4-class maps than the ab initio predictor. We show that template-based predictions by recursive neural networks are consistently better than the best template and than a number of combinations of the best available templates. We also extract binary residue contact maps at an 8 Å threshold (as per CASP assessment) from the 4-class predictors and show that the template-based version is also more accurate than the best template and consistently better than the ab initio one, down to very low levels of sequence identity to structural templates. Furthermore, we test both ab initio and template-based 8 Å predictions on the CASP7 targets using a pre-CASP7 PDB, and find that both predictors are state-of-the-art, with the template-based one far outperforming the best CASP7 systems if templates with sequence identity to the query of 10% or better are available. Although this is not the main focus of this paper, we also report on reconstructions of Cα traces based on both ab initio and template-based 4-class map predictions, showing that the latter are generally more accurate even when homology is dubious.
Conclusion
Accurate predictions of multi-class maps may provide valuable constraints for improved ab initio and template-based prediction of protein structures, naturally incorporate multiple templates, and yield state-of-the-art binary maps. Predictions of protein structures and 8 Å contact maps based on the multi-class distance map predictors described in this paper are freely available to academic users at the URL http://distill.ucd.ie/.
17.
We present a hierarchical method to predict protein tertiary structure models from sequence. We start with complete enumeration of conformations using a simple tetrahedral lattice model. We then build conformations with increasing detail, and at each step select a subset of conformations using empirical energy functions with increasing complexity. After enumeration on lattice, we select a subset of low energy conformations using a statistical residue-residue contact energy function, and generate all-atom models using predicted secondary structure. A combined knowledge-based atomic level energy function is then used to select subsets of the all-atom models. The final predictions are generated using a consensus distance geometry procedure. We test the feasibility of the procedure on a set of 12 small proteins covering a wide range of protein topologies. A rigorous double-blind test of our method was made under the auspices of the CASP3 experiment, where we did ab initio structure predictions for 12 proteins using this approach. The performance of our methodology at CASP3 is reasonably good and completely consistent with our initial tests.
18.
One of the major bottlenecks in many ab initio protein structure prediction methods is currently the selection of a small number of candidate structures for high‐resolution refinement from large sets of low‐resolution decoys. This step often includes a scoring by low‐resolution energy functions and a clustering of conformations by their pairwise root mean square deviations (RMSDs). As an efficient selection is crucial to reduce the overall computational cost of the predictions, any improvement in this direction can increase the overall performance of the predictions and the range of protein structures that can be predicted. We show here that the use of structural profiles, which can be predicted with good accuracy from the amino acid sequences of proteins, provides an efficient means to identify good candidate structures. Proteins 2010. © 2009 Wiley‐Liss, Inc.
19.
We propose a new formulation for the problem of ab initio metabolic pathway reconstruction. Given a set of biochemical reactions together with their substrates and products, we consider the reactions as transfers of atoms between the chemical compounds and we look for successions of reactions transferring a maximal (or preset) number of atoms between a given source and sink compound. We state this problem as the one of finding a composition of partial injections that maximizes the image size. First, we study the theoretical complexity of this problem, state some related problems and then give a practical algorithm to solve them. Finally, we present two applications of this approach to the reconstruction of the tryptophan biosynthesis pathway and to the glycolysis.