Similar articles
 Found 20 similar articles (search time: 125 ms)
1.
Xu D  Zhang Y 《Proteins》2012,80(7):1715-1735
Ab initio protein folding is one of the major unsolved problems in computational biology owing to the difficulties in force field design and conformational search. We developed a novel program, QUARK, for template-free protein structure prediction. Query sequences are first broken into fragments of 1-20 residues where multiple fragment structures are retrieved at each position from unrelated experimental structures. Full-length structure models are then assembled from fragments using replica-exchange Monte Carlo simulations, which are guided by a composite knowledge-based force field. A number of novel energy terms and Monte Carlo movements are introduced and the particular contributions to enhancing the efficiency of both force field and search engine are analyzed in detail. The QUARK prediction procedure is depicted and tested on the structure modeling of 145 nonhomologous proteins. Although no global templates are used and all fragments from experimental structures with template modeling score >0.5 are excluded, QUARK can successfully construct 3D models of correct folds in one-third of the cases for short proteins of up to 100 residues. In the ninth community-wide Critical Assessment of protein Structure Prediction experiment, the QUARK server outperformed the second and third best servers by 18 and 47% based on the cumulative Z-score of global distance test-total scores in the FM category. Although ab initio protein folding remains a significant challenge, these data demonstrate new progress toward the solution of the most important problem in the field.
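The replica-exchange step mentioned in this abstract can be illustrated with a minimal sketch. It assumes the standard Metropolis exchange criterion between neighbouring temperatures; the function names, the unit Boltzmann constant, and the use of plain energy lists are illustrative assumptions and are not taken from QUARK itself.

```python
import math
import random


def swap_probability(energy_i, energy_j, temp_i, temp_j, k_b=1.0):
    """Standard Metropolis probability of exchanging two replicas.

    P = min(1, exp[(beta_i - beta_j) * (E_i - E_j)]) with beta = 1 / (k_b * T).
    """
    beta_i = 1.0 / (k_b * temp_i)
    beta_j = 1.0 / (k_b * temp_j)
    delta = (beta_i - beta_j) * (energy_i - energy_j)
    return 1.0 if delta >= 0 else math.exp(delta)


def attempt_swaps(energies, temps, rng=random):
    """Attempt one exchange for each neighbouring pair of replicas.

    `energies[k]` is the energy of the conformation currently held at
    `temps[k]`; a real simulation would swap the conformations as well.
    """
    energies = list(energies)
    for k in range(len(energies) - 1):
        p = swap_probability(energies[k], energies[k + 1], temps[k], temps[k + 1])
        if rng.random() < p:
            energies[k], energies[k + 1] = energies[k + 1], energies[k]
    return energies
```

With this criterion a low-energy conformation found at a high temperature is always handed down to the colder neighbour, while the reverse move is accepted only with a Boltzmann-weighted probability.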

2.
Methods for automated prediction of deleterious protein mutations have utilized both structural and evolutionary information but the relative contribution of these two factors remains unclear. To address this, we have used a variety of structural and evolutionary features to create simple deleterious mutation models that have been tested on both experimental mutagenesis and human allele data. We find that the most accurate predictions are obtained using a solvent-accessibility term, the C(beta) density, and a score derived from homologous sequences, SIFT. A classification tree using these two features has a cross-validated prediction error of 20.5% on an experimental mutagenesis test set when the prior probability for deleterious and neutral cases is equal, whereas this prediction error is 28.8% and 22.2% using either the C(beta) density or SIFT alone. The improvement imparted by structure increases when fewer homologs are available: when restricted to three homologs the prediction error improves from 26.9% using SIFT alone to 22.4% using SIFT and the C(beta) density, or 24.8% using SIFT and a noisy C(beta) density term approximating the inaccuracy of ab initio structures modeled by the Rosetta method. We conclude that methods for deleterious mutation prediction should include structural information when fewer than five to ten homologs are available, and that ab initio predicted structures may soon be useful in such cases when high-resolution structures are unavailable.
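One way to read "prediction error when the prior probability for deleterious and neutral cases is equal" is as the mean of the two per-class error rates. The sketch below works under that assumption; the function name and label strings are hypothetical, and the abstract does not say how the authors computed their figures.

```python
def equal_prior_error(true_labels, predicted_labels, positive="deleterious"):
    """Prediction error with both classes weighted equally.

    Computes 0.5 * (error rate on positives + error rate on negatives),
    so the result does not depend on how many examples each class has.
    """
    pos_total = pos_wrong = neg_total = neg_wrong = 0
    for truth, pred in zip(true_labels, predicted_labels):
        if truth == positive:
            pos_total += 1
            pos_wrong += pred != truth
        else:
            neg_total += 1
            neg_wrong += pred != truth
    if pos_total == 0 or neg_total == 0:
        raise ValueError("need at least one example of each class")
    return 0.5 * (pos_wrong / pos_total + neg_wrong / neg_total)
```

For example, misclassifying one of four deleterious cases and one of two neutral cases gives 0.5 * (0.25 + 0.5) = 0.375, whereas the unweighted error on the same six cases would be 2/6.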

3.
It is generally accepted that protein structures are more conserved than protein sequences, and 3D structure determination by computer simulations has become an important necessity in the postgenomic era. Despite major successes, no robust, fast, and automated ab initio prediction algorithms for deriving accurate folds of single polypeptide chains or structures of intermolecular complexes exist at present. Here we present a methodology that uses selection and filtering of structural models generated by docking of known substructures such as individual proteins or domains through easily obtainable experimental NMR constraints. In particular, residual dipolar couplings and chemical shift mapping are used. Heuristic inclusion of chemical or biochemical knowledge about point-to-point interactions is combined in our selection strategy with the NMR data and commonly used contact potentials. We demonstrate the approach for the determination of protein-protein complexes using the EIN/HPr complex as an example and for establishing the domain-domain orientation in a chimeric protein, the recently determined hybrid human-Escherichia coli thioredoxin.

4.
PROPAINOR is a new algorithm developed for ab initio prediction of the 3D structures of proteins using knowledge-based nonparametric multivariate statistical methods. This algorithm is found to be most efficient in terms of computational simplicity and prediction accuracy for single-domain proteins as compared to other ab initio methods. In this paper, we have used the algorithm for the atomic structure prediction of a multi-domain (two-domain) calcium-binding protein, whose solution structure has been deposited in the PDB recently (PDB ID: 1JFK). We have studied the sensitivity of the predicted structure to NMR distance restraints with their incorporation as an additional input. Further, we have compared the predicted structures in both these cases with the NMR derived solution structure reported earlier. We have also validated the refined structure for proper stereochemistry and favorable packing environment with good results and elucidated the role of the central linker. Figure: The predicted 3D structure of EhCaBP with bound Ca2+ ions (CaBP-0). In the structure, α-helices are shown in pink and the β-strands in yellow. Ca2+ ions are depicted as fluorescent green balls. Some of the residues in the calcium-binding loops are depicted in space-fill representation.

5.
Ab initio protein structure prediction methods have improved dramatically in the past several years. Because these methods require only the sequence of the protein of interest, they are potentially applicable to the open reading frames in the many organisms whose sequences have been and will be determined. Ab initio methods cannot currently produce models of high enough resolution for use in rational drug design, but there is an exciting potential for using the methods for functional annotation of protein sequences on a genomic scale. Here we illustrate how functional insights can be obtained from low-resolution predicted structures using examples from blind ab initio structure predictions from the third and fourth critical assessment of structure prediction (CASP3, CASP4) experiments.

6.
Pharmacogenomics is the study of the genetic basis for individual variation in response to drugs and other xenobiotics. Successful prediction of effects of genetic variations that change encoded amino acid sequences on protein function and their consequent biomedical implications depends on three-dimensional (3D) structures of the encoded amino acid sequences. To bridge sequence to function, thus facilitating an in-depth pharmacogenomic study, we tested the feasibility of the use of a semi-computational approach to predict 3D structures of rabbit and human indolethylamine N-methyltransferases (INMTs) from their amino acid sequences, which share less than 26% sequence identity with known protein 3D structures. Herein, we report 3D models of INMTs predicted by using the crystal structure of rat catechol O-methyltransferase as a template, testing of the models both computationally and experimentally, and successful use of the models in retrospective prediction of the effects of genetic polymorphisms and in identification of residues that contribute to observed species-specific differences in substrate affinity. The results encourage the use of the semi-computational approach to predict 3D protein structures for use in pharmacogenomic studies when de novo prediction of protein 3D structures from their amino acid sequences is still not feasible and X-ray crystallography and/or solution nuclear magnetic resonance spectroscopy can only determine 3D structures for a small number of known amino acid sequences. Electronic Supplementary Material available.

7.
Contact order and ab initio protein structure prediction   Cited by: 1 (self-citations: 0, citations by others: 1)
Although much of the motivation for experimental studies of protein folding is to obtain insights for improving protein structure prediction, there has been relatively little connection between experimental protein folding studies and computational structural prediction work in recent years. In the present study, we show that the relationship between protein folding rates and the contact order (CO) of the native structure has implications for ab initio protein structure prediction. Rosetta ab initio folding simulations produce a dearth of high CO structures and an excess of low CO structures, as expected if the computer simulations mimic to some extent the actual folding process. Consistent with this, the majority of failures in ab initio prediction in the CASP4 (critical assessment of structure prediction) experiment involved high CO structures likely to fold much more slowly than the lower CO structures for which reasonable predictions were made. This bias against high CO structures can be partially alleviated by performing large numbers of additional simulations, selecting out the higher CO structures, and eliminating the very low CO structures; this leads to a modest improvement in prediction quality. More significant improvements in predictions for proteins with complex topologies may be possible following significant increases in high-performance computing power, which will be required for thoroughly sampling high CO conformations (high CO proteins can take six orders of magnitude longer to fold than low CO proteins). Importantly for such a strategy, simulations performed for high CO structures converge much less strongly than those for low CO structures, and hence, lack of simulation convergence can indicate the need for improved sampling of high CO conformations. 
The parallels between Rosetta simulations and folding in vivo may extend to misfolding: The very low CO structures that accumulate in Rosetta simulations consist primarily of local up-down beta-sheets that may resemble precursors to amyloid formation.
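The abstract above does not define contact order, so the following sketch assumes the commonly used relative contact order: the mean sequence separation of contacting residue pairs divided by the chain length. The function names, the representation of contacts as index pairs, and the `min_co` cut-off are illustrative assumptions; how contacts are detected (for example, by a distance cut-off) is left outside the sketch.

```python
def relative_contact_order(contacts, sequence_length):
    """Relative contact order: sum(|i - j|) / (L * N).

    `contacts` is a list of (i, j) residue-index pairs that are in contact,
    N is the number of contacts and L is the sequence length.
    """
    if not contacts:
        raise ValueError("at least one contact is required")
    total_separation = sum(abs(i - j) for i, j in contacts)
    return total_separation / (sequence_length * len(contacts))


def drop_very_low_co(models, sequence_length, min_co):
    """Keep only models whose contact order is at least `min_co`.

    `models` maps a model name to its contact list; `min_co` is a
    hypothetical threshold chosen by the user, not a published value.
    """
    return {
        name: contacts
        for name, contacts in models.items()
        if relative_contact_order(contacts, sequence_length) >= min_co
    }
```

For a 10-residue chain, the local hairpin-like contacts (1,4), (2,5), (3,6) give 9 / (10 * 3) = 0.3, while the nonlocal contacts (1,10), (2,9) give 16 / (10 * 2) = 0.8, so a filter with `min_co=0.5` would discard the first model and keep the second.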

8.
Tim J. Hubbard  J. Park 《Proteins》1995,23(3):398-402
Protein structure predictions were submitted for 9 of the target sequences in the competition that ran during 1994. Target sequences were selected that had no known homology with any sequence of known structure and were members of a reasonably sized family of related but divergent sequences. The objective was either to recognize a compatible fold for the target sequence in the database of known structures or to predict ab initio its rough 3D topology. The main tools used were Hidden Markov models (HMM) for fold recognition, a β-strand pair potential to predict β-sheet topology, and the PHD server for secondary structure prediction. Compatible folds were correctly identified in a number of cases and the β-strand pair potential was shown to be a useful tool for ab initio topology prediction. © 1995 Wiley-Liss, Inc.

9.
We present a critical assessment of the performance of our homology model refinement method for G protein‐coupled receptors (GPCRs), called LITiCon, which led to top-ranking structures in a recent structure prediction assessment, GPCRDOCK2010. GPCRs form the largest class of drug targets for which only a few crystal structures are currently available. Therefore, accurate homology models are essential for drug design in these receptors. We submitted five models each for human chemokine CXCR4 (bound to small molecule IT1t and peptide CVX15) and dopamine D3DR (bound to small molecule eticlopride) before the crystal structures were published. Our models in both CXCR4/IT1t and D3/eticlopride assessments were ranked first and second, respectively, by ligand RMSD to the crystal structures. For both receptors, we developed two types of protein models: homology models based on known GPCR crystal structures, and ab initio models based on the prediction method MembStruk. The homology‐based models compared better to the crystal structures than the ab initio models. However, a robust refinement procedure for obtaining high accuracy structures is needed. We demonstrate that optimization of the helical tilt, rotation, and translation is vital for GPCR homology model refinement. As a proof of concept, our in‐house refinement program LITiCon captured the distinct orientation of TM2 in CXCR4, which differs from that of adrenoreceptors. These findings would be critical for refining GPCR homology models in the future. Proteins 2013. © 2012 Wiley Periodicals, Inc.

10.
Staphylococcal protein A (SpA) is a virulence factor from Staphylococcus aureus that is able to bind to immunoglobulins. The 3D structures of its immunoglobulin (Ig) binding domains have been extensively studied by NMR and X-ray crystallography, and are often used as model structures in developing de novo or ab initio strategies for predicting protein structure. These small three-helix-bundle structures, reported in free proteins or Ig-bound complexes, have been determined previously using medium- to high-resolution data. Although the location and relative orientation of the three helices in most of these published 3D domain structures are consistent, there are significant differences among the reported structures regarding the tilt angle of the first helix (helix 1). We have applied residual dipolar coupling data, together with nuclear Overhauser enhancement and scalar coupling data, in refining the NMR solution structure of an engineered IgG-binding domain (Z domain) of SpA. Our results demonstrate that the three helices are almost perfectly antiparallel in orientation, with the first helix tilting slightly away from the other two helices. We propose that this high-accuracy structure of the Z domain of SpA is a more suitable target for theoretical predictions of the free domain structure than previously published lower-accuracy structures of protein A domains.

11.
Kuhn M  Meiler J  Baker D 《Proteins》2004,54(2):282-288
Beta-sheet proteins have been particularly challenging for de novo structure prediction methods, which tend to pair adjacent beta-strands into beta-hairpins and produce overly local topologies. To remedy this problem and facilitate de novo prediction of beta-sheet protein structures, we have developed a neural network that classifies strand-loop-strand motifs into local hairpins and nonlocal diverging turns by using the amino acid sequence as input. The neural network is trained with a representative subset of the Protein Data Bank and achieves a prediction accuracy of 75.9 +/- 4.4% compared to a baseline prediction rate of 59.1%. Hairpins are predicted with an accuracy of 77.3 +/- 6.1%, diverging turns with an accuracy of 73.9 +/- 6.0%. Incorporation of the beta-hairpin/diverging turn classification into the ROSETTA de novo structure prediction method led to higher contact order models and somewhat improved tertiary structure predictions for a test set of 11 all-beta proteins and 3 alpha-beta proteins. The beta-hairpin/diverging turn classification from amino acid sequences is available online for academic use (Meiler and Kuhn, 2003; www.jens-meiler.de/turnpred.html).

12.
The use of classical molecular dynamics simulations, performed in explicit water, for the refinement of structural models of proteins generated ab initio or based on homology has been investigated. The study involved a test set of 15 proteins that were previously used by Baker and coworkers to assess the efficiency of the ROSETTA method for ab initio protein structure prediction. For each protein, four models generated using the ROSETTA procedure were simulated for periods of between 5 and 400 nsec in explicit solvent, under identical conditions. In addition, the experimentally determined structure and the experimentally derived structure in which the side chains of all residues had been deleted and then regenerated using the WHATIF program were simulated and used as controls. A significant improvement in the deviation of the model structures from the experimentally determined structures was observed in several cases. In addition, it was found that in certain cases in which the experimental structure deviated rapidly from the initial structure in the simulations, indicating internal strain, the structures were more stable after regenerating the side-chain positions. Overall, the results indicate that molecular dynamics simulations on a tens to hundreds of nanoseconds time scale are useful for the refinement of homology or ab initio models of small to medium-size proteins.

13.
Ab initio protein structure prediction   Cited by: 3 (self-citations: 0, citations by others: 3)
Steady progress has been made in the field of ab initio protein folding. A variety of methods now allow the prediction of low-resolution structures of small proteins or protein fragments up to approximately 100 amino acid residues in length. Such low-resolution structures may be sufficient for the functional annotation of protein sequences on a genome-wide scale. Although no consistently reliable algorithm is currently available, the essential challenges to developing a general theory or approach to protein structure prediction are better understood. The energy landscapes resulting from the structure prediction algorithms are only partially funneled to the native state of the protein. This review focuses on two areas of recent advances in ab initio structure prediction: improvements in the energy functions and strategies to search the caldera region of the energy landscapes.

14.
Fujitsuka Y  Chikenji G  Takada S 《Proteins》2006,62(2):381-398
Predicting protein tertiary structures by in silico folding is still very difficult for proteins that have new folds. Here, we developed a coarse-grained energy function, SimFold, for de novo structure prediction, performed a benchmark test of prediction with fragment assembly simulations for 38 test proteins, and proposed consensus prediction with Rosetta. The SimFold energy consists of many terms that take into account solvent-induced effects on the basis of physicochemical consideration. In the benchmark test, SimFold succeeded in predicting native structures within 6.5 Å for 12 of 38 proteins; this success rate was the same as that by the publicly available version of Rosetta (ab initio version 1.2) run with default parameters. We investigated which energy terms in SimFold contribute to structure prediction performance, finding that the hydrophobic interaction is the most crucial for the prediction, whereas other sequence-specific terms have weak but positive roles. In the benchmark, well-predicted proteins by SimFold and by Rosetta were not the same for 5 of 12 proteins, which led us to introduce consensus prediction. With combined decoys, we succeeded in prediction for 16 proteins, four more than SimFold or Rosetta separately. For each of 38 proteins, structural ensembles generated by SimFold and by Rosetta were qualitatively compared by mapping sampled structural space onto two dimensions. For proteins of which one of the two methods succeeded and the other failed in prediction, the former had a less scattered ensemble located around the native. For proteins of which both methods succeeded in prediction, often two ensembles were mixed up.

15.
16.
17.
We recently developed the Rosetta algorithm for ab initio protein structure prediction, which generates protein structures from fragment libraries using simulated annealing. The scoring function in this algorithm favors the assembly of strands into sheets. However, it does not discriminate between different sheet motifs. After generating many structures using Rosetta, we found that the folding algorithm predominantly generates very local structures. We surveyed the distribution of beta-sheet motifs with two edge strands (open sheets) in a large set of non-homologous proteins. We investigated how much of that distribution can be accounted for by rules previously published in the literature, and developed a filter and a scoring method that enable us to improve protein structure prediction for beta-sheet proteins. Proteins 2002;48:85-97.

18.
Inter-residue interactions in protein folding and stability   Cited by: 6 (self-citations: 0, citations by others: 6)
During the process of protein folding, the amino acid residues along the polypeptide chain interact with each other in a cooperative manner to form the stable native structure. The knowledge about inter-residue interactions in protein structures is very helpful to understand the mechanism of protein folding and stability. In this review, we introduce the classification of inter-residue interactions into short, medium and long range based on a simple geometric approach. The features of these interactions in different structural classes of globular and membrane proteins, and in various folds have been delineated. The development of contact potentials and the application of inter-residue contacts for predicting the structural class and secondary structures of globular proteins, solvent accessibility, fold recognition and ab initio tertiary structure prediction have been evaluated. Further, the relationship between inter-residue contacts and protein-folding rates has been highlighted. Moreover, the importance of inter-residue interactions in protein-folding kinetics and for understanding the stability of proteins has been discussed. In essence, the information gained from the studies on inter-residue interactions provides valuable insights for understanding protein folding and de novo protein design.

19.
We present the results of applying a novel knowledge-based method (FILM) to the prediction of small membrane protein structures. The basis of the method is the addition of a membrane potential to the energy terms (pairwise, solvation, steric, and hydrogen bonding) of a previously developed ab initio technique for the prediction of tertiary structure of globular proteins (FRAGFOLD). The method is based on the assembly of supersecondary structural fragments taken from a library of highly resolved protein structures using a standard simulated annealing algorithm. The membrane potential has been derived by the statistical analysis of a data set made of 640 transmembrane helices with experimentally defined topology and belonging to 133 proteins extracted from the SWISS-PROT database. Results obtained by applying the method to small membrane proteins of known 3D structure show that the method is able to predict, at a reasonable accuracy level, both the helix topology and the conformations of these proteins.
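The idea of adding one extra term to an existing energy function and minimising the sum by simulated annealing can be sketched as follows. This is a toy illustration only: the "conformation" is a single number, the two energy terms and the move generator are hypothetical stand-ins, and the geometric cooling schedule is an assumption; none of it reproduces the actual FILM or FRAGFOLD potentials or their fragment moves.

```python
import math
import random


def total_energy(conformation, terms):
    """Sum of independent energy terms evaluated on one conformation."""
    return sum(term(conformation) for term in terms)


def simulated_annealing(start, propose, terms, t_start=2.0, t_end=0.01,
                        steps=2000, seed=0):
    """Metropolis simulated annealing with a geometric cooling schedule."""
    rng = random.Random(seed)
    current, e_current = start, total_energy(start, terms)
    best, e_best = current, e_current
    for step in range(steps):
        temp = t_start * (t_end / t_start) ** (step / (steps - 1))
        candidate = propose(current, rng)
        e_candidate = total_energy(candidate, terms)
        delta = e_candidate - e_current
        # Always accept downhill moves; accept uphill moves with
        # Boltzmann probability at the current temperature.
        if delta <= 0 or rng.random() < math.exp(-delta / temp):
            current, e_current = candidate, e_candidate
            if e_current < e_best:
                best, e_best = current, e_current
    return best, e_best


# Hypothetical toy terms standing in for the real potentials.
def pairwise_term(x):
    return (x - 3.0) ** 2


def membrane_term(x):
    return 0.5 * abs(x)


def propose(x, rng):
    # In fragment assembly this would replace a fragment; here it nudges x.
    return x + rng.uniform(-0.5, 0.5)
```

Calling `simulated_annealing(0.0, propose, [pairwise_term, membrane_term])` returns the lowest-energy value visited. Keeping each term as a separate function makes it easy to switch the membrane term on or off and compare the resulting minima.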

20.
A significant number of protein sequences in a given proteome have no obvious evolutionarily related protein in the database of solved protein structures, the PDB. Under these conditions, ab initio or template-free modeling methods are the sole means of predicting protein structure. To assess its expected performance on proteomes, the TASSER structure prediction algorithm is benchmarked in the ab initio limit on a representative set of 1129 nonhomologous sequences ranging from 40 to 200 residues that cover the PDB at 30% sequence identity and which adopt alpha, alpha + beta, and beta secondary structures. For sequences in the 40-100 (100-200) residue range, as assessed by their root mean square deviation from native, RMSD, the best of the top five ranked models of TASSER has a global fold that is significantly close to the native structure for 25% (16%) of the sequences, and with a correct identification of the structure of the protein core for 59% (36%). In the absence of a native structure, the structural similarity among the top five ranked models is a moderately reliable predictor of folding accuracy. If we classify the sequences according to their secondary structure content, then 64% (36%) of alpha, 43% (24%) of alpha + beta, and 20% (12%) of beta sequences in the 40-100 (100-200) residue range have a significant TM-score (TM-score ≥ 0.4). TASSER performs best on helical proteins because there are fewer secondary structural elements to arrange in a helical protein than in a beta protein of equal length, since the average length of a helix is longer than that of a strand. In addition, helical proteins have shorter loops and dangling tails. If we exclude these flexible fragments, then TASSER has similar accuracy for sequences containing the same number of secondary structural elements, irrespective of whether they are helices and/or strands. 
Thus, it is the effective configurational entropy of the protein that dictates the average likelihood of correctly arranging the secondary structure elements.
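For readers unfamiliar with the TM-score threshold quoted in this abstract, the sketch below evaluates the usual TM-score formula for one fixed superposition. It assumes the per-residue distances between aligned model and native positions (in Å) are already available; the full TM-score additionally maximises over superpositions, which is omitted here. The function names and the lower bound of 0.5 Å on d0 for very short targets are assumptions of this sketch.

```python
def tm_d0(target_length):
    """Length-dependent distance scale d0 = 1.24 * (L - 15)^(1/3) - 1.8."""
    if target_length <= 15:
        return 0.5  # assumed floor; the formula is not usable for L <= 15
    return max(0.5, 1.24 * (target_length - 15) ** (1.0 / 3.0) - 1.8)


def tm_score(distances, target_length):
    """TM-score for a given superposition.

    `distances` holds one distance per aligned residue pair; residues
    of the target that are not aligned contribute nothing to the sum.
    """
    d0 = tm_d0(target_length)
    return sum(1.0 / (1.0 + (d / d0) ** 2) for d in distances) / target_length


def is_significant(score, threshold=0.4):
    """Apply the TM-score >= 0.4 criterion used in the abstract above."""
    return score >= threshold
```

A perfect model of a 100-residue target scores 1.0, and a model in which every residue is displaced by exactly d0 scores 0.5. Because the sum is divided by the target length rather than the number of aligned residues, partial models are penalised for the residues they leave out.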
