Similar articles
 20 similar articles found; search time 15 ms
1.
MolIDE: a homology modeling framework you can click with   Cited by: 1 (self-citations: 0, citations by others: 1)
SUMMARY: Molecular Integrated Development Environment (MolIDE) is an integrated application designed to provide homology modeling tools and protocols under a uniform, user-friendly graphical interface. Its main purpose is to combine the most frequent modeling steps in a semi-automatic, interactive way, guiding the user from the target protein sequence to the final three-dimensional protein structure. The typical basic homology modeling process is composed of building sequence profiles of the target sequence family, secondary structure prediction, sequence alignment with PDB structures, assisted alignment editing, side-chain prediction and loop building. All of these steps are available through a graphical user interface. MolIDE's user-friendly and streamlined interactive modeling protocol allows the user to focus on the important modeling questions, hiding from the user the raw data generation and conversion steps. MolIDE was designed from the ground up as an open-source, cross-platform, extensible framework. This allows developers to integrate additional third-party programs into MolIDE. AVAILABILITY: http://dunbrack.fccc.edu/molide/molide.php CONTACT: rl_dunbrack@fccc.edu.

2.
Rohl CA, Strauss CE, Chivian D, Baker D. Proteins 2004, 55(3): 656-677
A major limitation of current comparative modeling methods is the accuracy with which regions that are structurally divergent from homologues of known structure can be modeled. Because structural differences between homologous proteins are responsible for variations in protein function and specificity, the ability to model these differences has important functional consequences. Although existing methods can provide reasonably accurate models of short loop regions, modeling longer structurally divergent regions is an unsolved problem. Here we describe a method based on the de novo structure prediction algorithm, Rosetta, for predicting conformations of structurally divergent regions in comparative models. Initial conformations for short segments are selected from the protein structure database, whereas longer segments are built up by using three- and nine-residue fragments drawn from the database and combined by using the Rosetta algorithm. A gap closure term in the potential, in combination with a modified Newton's method for gradient descent minimization, is used to ensure continuity of the peptide backbone. Conformations of variable regions are refined in the context of a fixed template structure using Monte Carlo minimization together with rapid repacking of side-chains to iteratively optimize backbone torsion angles and side-chain rotamers. For short loops, mean accuracies of 0.69, 1.45, and 3.62 Å are obtained for 4, 8, and 12 residue loops, respectively. In addition, the method can provide reasonable models of conformations of longer protein segments: predicted conformations of 3 Å root-mean-square deviation or better were obtained for 5 of 10 examples of segments ranging from 13 to 34 residues. In combination with a sequence alignment algorithm, this method generates complete, ungapped models of protein structures, including regions both similar to and divergent from a homologous structure. 
This combined method was used to make predictions for 28 protein domains in the Critical Assessment of Protein Structure Prediction 4 (CASP 4) and 59 domains in CASP 5, where the method ranked highly among comparative modeling and fold recognition methods. Model accuracy in these blind predictions is dominated by alignment quality, but in the context of accurate alignments, long protein segments can be accurately modeled. Notably, the method correctly predicted the local structure of a 39-residue insertion into a TIM barrel in CASP 5 target T0186.

3.
In recent years, it has been repeatedly demonstrated that the coordinates of the main-chain atoms alone are sufficient to determine the side-chain conformations of buried residues of compact proteins. Given a perfect backbone, the side-chain packing method can predict the side-chain conformations to an accuracy as high as 1.2 Å RMS deviation (RMSD) with greater than 80% of the χ angles correct. However, similarly rigorous studies have not been conducted to determine how well these apply, if at all, to the more important problem of homology modeling per se. Specifically, if the available backbone is imperfect, as expected for practical application of homology modeling, can packing constraints alone achieve sufficiently accurate predictions to be useful? Here, by systematically applying such methods to the pairwise modeling of two repressor and two cro proteins from the closely related bacteriophages 434 and P22, we find that when the backbone RMSD is 0.8 Å, the prediction on buried side chain is accurate with an RMS error of 1.8 Å and approximately 70% of the χ angles correctly predicted. When the backbone RMSD is larger, in the range of 1.6–1.8 Å, the prediction quality is still significantly better than random, with RMS error at 2.2 Å on the buried side chains and 60% accuracy on χ angles. Together these results suggest the following rules-of-thumb for homology modeling of buried side chains. When the sequence identity between the modeled sequence and the template sequence is >50% (or, equivalently, the expected backbone RMSD is <1 Å), side-chain packing methods work well. When sequence identity is between 30–50%, reflecting a backbone RMS error of 1–2 Å, it is still valid to use side-chain packing methods to predict the buried residues, albeit with care. When sequence identity is below 30% (or backbone RMS error greater than 2 Å), the backbone constraint alone is unlikely to produce useful models. 
Other methods, such as those involving the use of database fragments to reconstruct a template backbone, may be necessary as a complementary guide for modeling.
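
The rules of thumb in this abstract can be written down as a small lookup. The sketch below is only an illustration: the function name and the returned wording are hypothetical, and the handling of the exact boundary values (30% and 50%) is an assumption, since the abstract does not say which band they fall in.

```python
def side_chain_packing_advice(sequence_identity_percent: float) -> str:
    """Map target-template sequence identity (%) to the abstract's rule of thumb.

    Hypothetical helper; boundary handling at exactly 30% and 50% is assumed.
    """
    if not 0.0 <= sequence_identity_percent <= 100.0:
        raise ValueError("sequence identity must be between 0 and 100")
    if sequence_identity_percent > 50.0:
        # Expected backbone RMSD below about 1 angstrom.
        return "side-chain packing methods work well"
    if sequence_identity_percent >= 30.0:
        # Expected backbone RMS error of about 1-2 angstroms.
        return "side-chain packing usable for buried residues, with care"
    # Expected backbone RMS error above about 2 angstroms.
    return "backbone constraint alone unlikely to give useful models"


if __name__ == "__main__":
    for identity in (65.0, 40.0, 22.0):
        print(identity, "->", side_chain_packing_advice(identity))
```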

4.
Modeling a protein structure based on a homologous structure is a standard method in structural biology today. In this process an alignment of a target protein sequence onto the structure of a template(s) is used as input to a program that constructs a 3D model. It has been shown that the most important factor in this process is the correctness of the alignment and the choice of the best template structure(s), while it is generally believed that there are no major differences between the best modeling programs. Therefore, a large number of studies to benchmark the alignment qualities and the selection process have been performed. However, to our knowledge no large-scale benchmark has been performed to evaluate the programs used to transform the alignment to a 3D model. In this study, a benchmark of six different homology modeling programs (Modeller, SegMod/ENCAD, SWISS-MODEL, 3D-JIGSAW, nest, and Builder) is presented. The performance of these programs is evaluated using physicochemical correctness and structural similarity to the correct structure. From our analysis it can be concluded that no single modeling program outperforms the others in all tests. However, it is quite clear that three modeling programs, Modeller, nest, and SegMod/ENCAD, perform better than the others. Interestingly, the fastest and oldest modeling program, SegMod/ENCAD, performs very well, although it was written more than 10 years ago and has not undergone any development since. It can also be observed that none of the homology modeling programs builds side chains as well as a specialized program (SCWRL), and therefore there should be room for improvement.

5.
The performance of the self-consistent mean field theory (SCMFT) method for side-chain modeling, employing rotamer energies calculated with the flexible rotamer model (FRM), is evaluated in the context of comparative modeling of protein structure. Predictions were carried out on a test set of 56 model backbones of varying accuracy, to allow side-chain prediction accuracy to be analyzed as a function of backbone accuracy. A progressive decrease in the accuracy of prediction was observed as backbone accuracy decreased. However, even for very low backbone accuracy, prediction was substantially higher than random, indicating that the FRM can, in part, compensate for the errors in the modeled tertiary environment. It was also investigated whether the introduction in the FRM-SCMFT method of knowledge-based biases, derived from a backbone-dependent rotamer library, could enhance its performance. A bias derived from the backbone-dependent rotamer conformations alone did not improve prediction accuracy. However, a bias derived from the backbone-dependent rotamer probabilities improved prediction accuracy considerably. This bias was incorporated through two different strategies. In one (the indirect strategy), rotamer probabilities were used to reject unlikely rotamers a priori, thus restricting prediction by FRM-SCMFT to a subset containing only the most probable rotamers in the library. In the other (the direct strategy), rotamer energies were transformed into pseudo-energies that were added to the average potential energies of the respective rotamers, thereby creating hybrid energy-based/knowledge-based average rotamer energies, which were used by the FRM-SCMFT method for prediction. For all degrees of backbone accuracy, an optimal strength of the knowledge-based bias existed for both strategies for which predictions were more accurate than pure energy-based predictions, and also than pure knowledge-based predictions. 
Hybrid knowledge-based/energy-based methods were obtained from both strategies and compared with the SCWRL method, a hybrid method based on the same backbone-dependent rotamer library. The accuracy of the indirect method was approximately the same as that of the SCWRL method, but that of the direct method was significantly higher.

6.
A graph-theory algorithm for rapid protein side-chain prediction   Cited by: 19 (self-citations: 0, citations by others: 19)
Fast and accurate side-chain conformation prediction is important for homology modeling, ab initio protein structure prediction, and protein design applications. Many methods have been presented, although only a few computer programs are publicly available. The SCWRL program is one such method and is widely used because of its speed, accuracy, and ease of use. A new algorithm for SCWRL is presented that uses results from graph theory to solve the combinatorial problem encountered in the side-chain prediction problem. In this method, side chains are represented as vertices in an undirected graph. Any two residues that have rotamers with nonzero interaction energies are considered to have an edge in the graph. The resulting graph can be partitioned into connected subgraphs with no edges between them. These subgraphs can in turn be broken into biconnected components, which are graphs that cannot be disconnected by removal of a single vertex. The combinatorial problem is reduced to finding the minimum energy of these small biconnected components and combining the results to identify the global minimum energy conformation. This algorithm is able to complete predictions on a set of 180 proteins with 34342 side chains in <7 min of computer time. The total chi(1) and chi(1 + 2) dihedral angle accuracies are 82.6% and 73.7% using a simple energy function based on the backbone-dependent rotamer library and a linear repulsive steric energy. The new algorithm will allow for use of SCWRL in more demanding applications such as sequence design and ab initio structure prediction, as well as the addition of a more complex energy function and conformational flexibility, leading to increased accuracy.
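
The graph decomposition described in this abstract can be sketched in a few lines. This is not SCWRL's code: the function names and the residue labels are hypothetical, and the example only shows the two graph steps (splitting into connected subgraphs, then finding the vertices at which a subgraph can be cut into biconnected components). The energy minimization over each component is left out.

```python
from collections import defaultdict


def build_graph(edges):
    """Undirected graph: one vertex per residue, one edge per interacting pair."""
    graph = defaultdict(set)
    for a, b in edges:
        graph[a].add(b)
        graph[b].add(a)
    return graph


def connected_components(graph):
    """Split the graph into subgraphs that share no edges."""
    seen, components = set(), []
    for start in sorted(graph):
        if start in seen:
            continue
        stack, component = [start], set()
        seen.add(start)
        while stack:
            node = stack.pop()
            component.add(node)
            for neighbour in graph[node]:
                if neighbour not in seen:
                    seen.add(neighbour)
                    stack.append(neighbour)
        components.append(component)
    return components


def articulation_points(graph):
    """Vertices whose removal disconnects a subgraph (depth-first low-link search).

    Biconnected components meet only at these vertices.
    """
    discovered, low, points = {}, {}, set()
    counter = [0]

    def visit(node, parent):
        discovered[node] = low[node] = counter[0]
        counter[0] += 1
        children = 0
        for neighbour in graph[node]:
            if neighbour == parent:
                continue
            if neighbour in discovered:
                low[node] = min(low[node], discovered[neighbour])
            else:
                children += 1
                visit(neighbour, node)
                low[node] = min(low[node], low[neighbour])
                if parent is not None and low[neighbour] >= discovered[node]:
                    points.add(node)
        if parent is None and children > 1:
            points.add(node)

    for node in sorted(graph):
        if node not in discovered:
            visit(node, None)
    return points


if __name__ == "__main__":
    # Hypothetical residues: two triangles sharing residue C, plus a separate pair.
    interactions = [("A", "B"), ("B", "C"), ("C", "A"),
                    ("C", "D"), ("D", "E"), ("E", "C"), ("F", "G")]
    g = build_graph(interactions)
    print(connected_components(g))
    print(articulation_points(g))
```

In the toy graph, residue C is the only vertex shared by the two triangles, so each triangle could be minimized separately for every rotamer of C. The recursive search is fine for small graphs like this one; a large protein would need an iterative version to stay within Python's recursion limit.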

7.
Incorporation of effective backbone sampling into protein simulation and design is an important step in increasing the accuracy of computational protein modeling. Recent analysis of high-resolution crystal structures has suggested a new model, termed backrub, to describe localized, hinge-like alternative backbone and side-chain conformations observed in the crystal lattice. The model involves internal backbone rotations about axes between C-alpha atoms. Based on this observation, we have implemented a backrub-inspired sampling method in the Rosetta structure prediction and design program. We evaluate this model of backbone flexibility using three different tests. First, we show that Rosetta backrub simulations recapitulate the correlation between backbone and side-chain conformations in the high-resolution crystal structures upon which the model was based. As a second test of backrub sampling, we show that backbone flexibility improves the accuracy of predicting point-mutant side-chain conformations over fixed backbone rotameric sampling alone. Finally, we show that backrub sampling of triosephosphate isomerase loop 6 can capture the millisecond/microsecond oscillation between the open and closed states observed in solution. Our results suggest that backrub sampling captures a sizable fraction of localized conformational changes that occur in natural proteins. Application of this simple model of backbone motions may significantly improve both protein design and atomistic simulations of localized protein flexibility.
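
The geometric core of the move described here, a rotation about an axis running between two C-alpha atoms, can be illustrated with Rodrigues' rotation formula. This is a minimal sketch and not Rosetta's implementation: the function name and all coordinates are hypothetical, and a real backrub move also selects the residues and angles to sample, which is not shown.

```python
import math


def rotate_about_axis(point, axis_start, axis_end, angle_degrees):
    """Rotate `point` about the line through `axis_start` and `axis_end`.

    Uses Rodrigues' rotation formula. In a backrub-style move the two axis
    points would be C-alpha positions (assumed here, coordinates in angstroms).
    """
    axis = [e - s for s, e in zip(axis_start, axis_end)]
    norm = math.sqrt(sum(c * c for c in axis))
    if norm == 0.0:
        raise ValueError("axis_start and axis_end must differ")
    k = [c / norm for c in axis]                        # unit axis vector
    v = [p - s for p, s in zip(point, axis_start)]      # point relative to the axis
    theta = math.radians(angle_degrees)
    cos_t, sin_t = math.cos(theta), math.sin(theta)
    k_cross_v = [k[1] * v[2] - k[2] * v[1],
                 k[2] * v[0] - k[0] * v[2],
                 k[0] * v[1] - k[1] * v[0]]
    k_dot_v = sum(a * b for a, b in zip(k, v))
    rotated = [v[i] * cos_t + k_cross_v[i] * sin_t + k[i] * k_dot_v * (1.0 - cos_t)
               for i in range(3)]
    return tuple(r + s for r, s in zip(rotated, axis_start))


if __name__ == "__main__":
    ca_first = (0.0, 0.0, 0.0)   # hypothetical C-alpha defining one end of the axis
    ca_last = (0.0, 0.0, 3.8)    # hypothetical C-alpha defining the other end
    atom = (1.0, 0.0, 1.9)       # hypothetical atom lying between them
    print(rotate_about_axis(atom, ca_first, ca_last, 10.0))
```

Atoms lying on the axis, including the two C-alpha atoms that define it, are left in place, which is what keeps the move local.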

8.
9.
Protein threading by recursive dynamic programming.   Cited by: 4 (self-citations: 0, citations by others: 4)
We present the recursive dynamic programming (RDP) method for the threading approach to three-dimensional protein structure prediction. RDP is based on the divide-and-conquer paradigm and maps the protein sequence whose backbone structure is to be found (the protein target) onto the known backbone structure of a model protein (the protein template) in a stepwise fashion, a technique that is similar to computing local alignments but utilising different cost functions. We begin by mapping parts of the target onto the template that show statistically significant similarity with the template sequence. After mapping, the template structure is modified in order to account for the mapped target residues. Then significant similarities between the yet unmapped parts of the target and the modified template are searched, and the resulting segments of the target are mapped onto the template. This recursive process of identifying segments in the target to be mapped onto the template and modifying the template is continued until no significant similarities between the remaining parts of target and template are found. Those parts which are left unmapped by the procedure are interpreted as gaps. The RDP method is robust in the sense that different local alignment methods can be used, several alternatives of mapping parts of the target onto the template can be handled and compared in the process, and the cost functions can be dynamically adapted to biological needs. Our computer experiments show that the RDP procedure is efficient and effective. We can thread a typical protein sequence against a database of 887 template domains in about 12 hours even on a low-cost workstation (SUN Ultra 5). In statistical evaluations on databases of known protein structures, RDP significantly outperforms competing methods. 
RDP has been especially valuable in providing accurate alignments for modeling active sites of proteins. RDP is part of the ToPLign system (GMD Toolbox for protein alignment) and can be accessed via the WWW independently or in concert with other ToPLign tools at http://cartan.gmd.de/ToPLign.html.

10.
Five models have been built by the ICM method for the Comparative Modeling section of the Meeting on the Critical Assessment of Techniques for Protein Structure Prediction. The targets have homologous proteins with known three-dimensional structure with sequence identity ranging from 25 to 77%. After alignment of the target sequence with the related three-dimensional structure, the modeling procedure consists of two subproblems: side-chain prediction and loop prediction. The ICM method approaches these problems with the following steps: (1) a starting model is created based on the homologous structure with the conserved portion fixed and the nonconserved portion having standard covalent geometry and free torsion angles; (2) the Biased Probability Monte Carlo (BPMC) procedure is applied to search the subspaces of either all the nonconservative side-chain torsion angles or torsion angles in a loop backbone and surrounding side chains. A special algorithm was designed to generate low-energy loop deformations. The BPMC procedure globally optimizes the energy function consisting of ECEPP/3 and solvation energy terms. Comparison of the predictions with the NMR or crystallographic solutions reveals a high proportion of correctly predicted side chains. The loops were not correctly predicted because imprinted distortions of the backbone increased the energy of the near-native conformation and thus made the solution unrecognizable. Interestingly, the energy terms were found to be reliable and the sampling of conformational space sufficient. The implications of this finding for the strategies of future comparative modeling are discussed. © 1995 Wiley-Liss, Inc.

11.

Background  

For successful protein structure prediction by comparative modeling, in addition to identifying a good template protein with known structure, obtaining an accurate sequence alignment between a query protein and a template protein is critical. It has been known that the alignment accuracy can vary significantly depending on our choice of various alignment parameters such as gap opening penalty and gap extension penalty. Because the accuracy of sequence alignment is typically measured by comparing it with its corresponding structure alignment, there is no good way of evaluating alignment accuracy without knowing the structure of a query protein, which is obviously not available at the time of structure prediction. Moreover, there is no universal alignment parameter option that would always yield the optimal alignment.
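
To see why the choice of gap opening and gap extension penalties matters, it helps to score two candidate alignments of the same pair of sequences under different settings. The sketch below is a toy under stated assumptions: the sequences, the match and mismatch scores, and the convention that a gap of length k costs one opening penalty plus (k - 1) extension penalties are all illustrative and not taken from the abstract.

```python
def affine_alignment_score(row_a, row_b, gap_open, gap_extend,
                           match=1.0, mismatch=-1.0):
    """Score a given pairwise alignment with affine gap penalties.

    Assumed convention: the first column of a gap costs `gap_open`,
    each further column of the same gap costs `gap_extend`.
    """
    if len(row_a) != len(row_b):
        raise ValueError("aligned rows must have equal length")
    score = 0.0
    previous_gap_row = None  # which row had a gap in the previous column
    for a, b in zip(row_a, row_b):
        if a == "-" and b == "-":
            raise ValueError("a column cannot contain two gaps")
        if a == "-" or b == "-":
            gap_row = "a" if a == "-" else "b"
            score -= gap_extend if gap_row == previous_gap_row else gap_open
            previous_gap_row = gap_row
        else:
            score += match if a == b else mismatch
            previous_gap_row = None
    return score


if __name__ == "__main__":
    query = "ACGTTGCA"
    two_short_gaps = "AC-TTG-A"   # six matches, two separate one-column gaps
    one_long_gap = "AC--TTGA"     # four matches, two mismatches, one two-column gap
    for gap_open in (2.0, 10.0):
        print(gap_open,
              affine_alignment_score(query, two_short_gaps, gap_open, 0.5),
              affine_alignment_score(query, one_long_gap, gap_open, 0.5))
```

With a low opening penalty the alignment with two short gaps scores higher; with a high opening penalty the single long gap wins, so the preferred alignment depends on the parameters even though the sequences are unchanged.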

12.
Protein structure prediction by comparative modeling benefits greatly from the use of multiple sequence alignment information to improve the accuracy of structural template identification and the alignment of target sequences to structural templates. Unfortunately, this benefit is limited to those protein sequences for which at least several natural sequence homologues exist. We show here that the use of large diverse alignments of computationally designed protein sequences confers many of the same benefits as natural sequences in identifying structural templates for comparative modeling targets. A large-scale massively parallelized application of an all-atom protein design algorithm, including a simple model of peptide backbone flexibility, has allowed us to generate 500 diverse, non-native, high-quality sequences for each of 264 protein structures in our test set. PSI-BLAST searches using the sequence profiles generated from the designed sequences ("reverse" BLAST searches) give near-perfect accuracy in identifying true structural homologues of the parent structure, with 54% coverage. In 41 of 49 genomes scanned using reverse BLAST searches, at least one novel structural template (not found by the standard method of PSI-BLAST against PDB) is identified. Further improvements in coverage, through optimizing the scoring function used to design sequences and continued application to new protein structures beyond the test set, will allow this method to mature into a useful strategy for identifying distantly related structural templates.

13.
Chung SY, Subbiah S. Proteins 1999, 35(2): 184-194
The precision and accuracy of protein structures determined by nuclear magnetic resonance (NMR) spectroscopy depend on the completeness of input experimental data set. Typically, rather than a single structure, an ensemble of up to 20 equally representative conformers is generated and routinely deposited in the Protein Database. There are substantially more experimentally derived restraints available to define the main-chain coordinates than those of the side chains. Consequently, the side-chain conformations among the conformers are more variable and less well defined than those of the backbone. Even when a side chain is determined with high precision and is found to adopt very similar orientations among all the conformers in the ensemble, it is possible that its orientation might still be incorrect. Thus, it would be helpful if there were a method to assess independently the side-chain orientations determined by NMR. Recently, homology modeling by side-chain packing algorithms has been shown to be successful in predicting the side-chain conformations of the buried residues for a protein when the main-chain coordinates and sequence information are given. Since the main-chain coordinates determined by NMR are consistently more reliable than those of the side-chains, we have applied the side-chain packing algorithms to predict side-chain conformations that are compatible with the NMR-derived backbone. Using four test cases where the NMR solution structures and the X-ray crystal structure of the same protein are available, we demonstrate that the side-chain packing method can provide independent validation for the side-chain conformations of NMR structures. Comparison of the side-chain conformations derived by side-chain packing prediction and by NMR spectroscopy demonstrates that when there is agreement between the NMR model and the predicted model, on average 78% of the time the X-ray structure also concurs. 
While the side-chain packing method can confirm the reliable residue conformations in NMR models, more importantly, it can also identify the questionable residue conformations with an accuracy of 60%. This validation method can serve to increase the confidence level for potential users of structural models determined by NMR.

14.
Protein structure prediction   Cited by: 2 (self-citations: 0, citations by others: 2)
The prediction of protein structure, based primarily on sequence and structure homology, has become an increasingly important activity. Homology models have become more accurate and their range of applicability has increased. Progress has come, in part, from the flood of sequence and structure information that has appeared over the past few years, and also from improvements in analysis tools. These include profile methods for sequence searches, the use of three-dimensional structure information in sequence alignment and new homology modeling tools, specifically in the prediction of loop and side-chain conformations. There have also been important advances in understanding the physical-chemical basis of protein stability and the corresponding use of physical-chemical potential functions to distinguish correctly folded from incorrectly folded protein conformations.

15.
MOTIVATION: Two major bottlenecks in advancing comparative protein structure modeling are the efficient combination of multiple template structures and the generation of a correct input target-template alignment. RESULTS: A novel method, Multiple Mapping Method with Multiple Templates (M4T), is introduced that implements an algorithm to automatically select and combine Multiple Template structures (MT) and an alignment optimization protocol (Multiple Mapping Method, MMM). The MT module of M4T selects and combines multiple template structures through an iterative clustering approach that takes into account the 'unique' contribution of each template, their sequence similarity among themselves and to the target sequence, and their experimental resolution. MMM is a sequence-to-structure alignment method that optimally combines alternatively aligned regions according to their fit in the structural environment of the template structure. The resulting M4T alignment is used as input to a comparative modeling module. The performance of M4T has been benchmarked on CASP6 comparative modeling target sequences and on a larger independent test set, and compared favorably with current state-of-the-art methods.

16.
Zhao S, Goodsell DS, Olson AJ. Proteins 2001, 43(3): 271-279
We compiled and analyzed a data set of paired protein structures containing proteins for which multiple high-quality uncomplexed atomic structures were available in the Protein Data Bank. Side-chain flexibility was quantified, yielding a set of residue- and environment-specific confidence levels describing the range of motion around chi1 and chi2 angles. As expected, buried residues were inflexible, adopting similar conformations in different crystal structure analyses. Ile, Thr, Asn, Asp, and the large aromatics also showed limited flexibility when exposed on the protein surface, whereas exposed Ser, Lys, Arg, Met, Gln, and Glu residues were very flexible. This information is different from and complementary to the information available from rotamer surveys. The confidence levels are useful for assessing the significance of observed side-chain motion and estimating the extent of side-chain motion in protein structure prediction. We compare the performance of a simple 40° threshold with these quantitative confidence levels in a critical evaluation of side-chain prediction with the program SCWRL.
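
The simple threshold criterion mentioned in this abstract amounts to comparing two dihedral angles on a circle. The sketch below shows one way to do that; the function names are hypothetical, treating a deviation of exactly 40° as correct is an assumption, and the residue-specific confidence levels from the paper are not reproduced.

```python
def angular_difference(a_degrees, b_degrees):
    """Smallest absolute difference between two angles, in degrees (0 to 180)."""
    diff = abs(a_degrees - b_degrees) % 360.0
    return min(diff, 360.0 - diff)


def chi_is_correct(predicted, observed, threshold=40.0):
    """True when the predicted chi angle lies within `threshold` of the observed one.

    Inclusive comparison is an assumption of this sketch.
    """
    return angular_difference(predicted, observed) <= threshold


def fraction_correct(angle_pairs, threshold=40.0):
    """Fraction of (predicted, observed) chi pairs that pass the threshold."""
    pairs = list(angle_pairs)
    if not pairs:
        raise ValueError("no angle pairs given")
    hits = sum(chi_is_correct(p, o, threshold) for p, o in pairs)
    return hits / len(pairs)


if __name__ == "__main__":
    # Hypothetical (predicted, observed) chi1 angles in degrees.
    chi1_pairs = [(-60.0, -65.0), (170.0, -170.0), (60.0, 180.0), (-175.0, 178.0)]
    print(fraction_correct(chi1_pairs))
```

The wrap-around handling matters because 170° and -170° are only 20° apart, not 340°.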

17.
MOTIVATION: Accurate alignment of a target sequence to a template structure continues to be a bottleneck in producing good quality comparative protein structure models. RESULTS: Multiple Mapping Method (MMM) is a comparative protein structure modeling server with an emphasis on a novel alignment optimization protocol. MMM takes inputs from five profile-to-profile based alignment methods. The alternatively aligned regions from the input alignment set are combined according to their fit in the structural environment of the template structure. The resulting, optimally spliced MMM alignment is used as input to an automated comparative modeling module to produce a full atom model. AVAILABILITY: The MMM server is freely accessible at http://www.fiserlab.org/servers/mmm

18.
Misura KM, Baker D. Proteins 2005, 59(1): 15-29
Achieving atomic level accuracy in de novo structure prediction presents a formidable challenge even in the context of protein models with correct topologies. High-resolution refinement is a fundamental test of force field accuracy and sampling methodology, and its limited success in both comparative modeling and de novo prediction contexts highlights the limitations of current approaches. We constructed four tests to identify bottlenecks in our current approach and to guide progress in this challenging area. The first three tests showed that idealized native structures are stable under our refinement simulation conditions and that the refinement protocol can significantly decrease the root mean square deviation (RMSD) of perturbed native structures. In the fourth test we applied the refinement protocol to de novo models and showed that accurate models could be identified based on their energies, and in several cases many of the buried side chains adopted native-like conformations. We also showed that the differences in backbone and side-chain conformations between the refined de novo models and the native structures are largely localized to loop regions and regions where the native structure has unusual features such as rare rotamers or atypical hydrogen bonding between beta-strands. The refined de novo models typically have higher energies than refined idealized native structures, indicating that sampling of local backbone conformations and side-chain packing arrangements in a condensed state is a primary obstacle.

19.
The ability to predict structure from sequence is particularly important for toxins, virulence factors, allergens, cytokines, and other proteins of public health importance. Many such functions are represented in the parallel beta-helix and beta-trefoil families. A method using pairwise beta-strand interaction probabilities coupled with evolutionary information represented by sequence profiles is developed to tackle these problems for the beta-helix and beta-trefoil folds. The algorithm BetaWrapPro employs a "wrapping" component that may capture folding processes with an initiation stage followed by processive interaction of the sequence with the already-formed motifs. BetaWrapPro outperforms all previous motif recognition programs for these folds, recognizing the beta-helix with 100% sensitivity and 99.7% specificity and the beta-trefoil with 100% sensitivity and 92.5% specificity, in crossvalidation on a database of all nonredundant known positive and negative examples of these fold classes in the PDB. It additionally aligns 88% of residues for the beta-helices and 86% for the beta-trefoils accurately (within four residues of the exact position) to the structural template, which is then used with the side-chain packing program SCWRL to produce 3D structure predictions. One striking result has been the prediction of an unexpected parallel beta-helix structure for a pollen allergen, and its recent confirmation through solution of its structure. A Web server running BetaWrapPro is available and outputs putative PDB-style coordinates for sequences predicted to form the target folds.

20.
Hidetoshi Kono, Junta Doi. Proteins 1994, 19(3): 244-255
Globular proteins have high packing densities as a result of residue side chains in the core achieving a tight, complementary packing. The internal packing is considered the main determinant of native protein structure. From that point of view, we present here a method of energy minimization using an automata network to predict a set of amino acid sequences and their side-chain conformations from a desired backbone geometry for de novo design of proteins. Using discrete side-chain conformations, that is, rotamers, the sequence generation problem from a given backbone geometry becomes a combinatorial problem. We focused on the residues composing the interior core region and predicted a set of amino acid sequences and their side-chain conformations only from a given backbone geometry. The kinds of residues were restricted to six hydrophobic amino acids (Ala, Ile, Met, Leu, Phe, and Val) because the core regions are almost always composed of hydrophobic residues. The obtained sequences were well packed, as was the native sequence. The method can be used for automated sequence generation in the de novo design of proteins. © 1994 Wiley-Liss, Inc.
