Similar literature
Found 20 similar articles (search time: 46 ms)
1.
Lee J  Kim SY  Joo K  Kim I  Lee J 《Proteins》2004,56(4):704-714
A novel method for ab initio prediction of protein tertiary structures, PROFESY (PROFile Enumerating SYstem), is proposed. This method utilizes the secondary structure prediction information of a query sequence and the fragment assembly procedure based on global optimization. Fifteen-residue-long fragment libraries are constructed using the secondary structure prediction method PREDICT, and fragments in these libraries are assembled to generate full-length chains of a query protein. Tertiary structures of 50 to 100 conformations are obtained by minimizing an energy function for proteins, using the conformational space annealing method that enables one to sample diverse low-lying local minima of the energy. We apply PROFESY for benchmark tests to proteins with known structures to demonstrate its feasibility. In addition, we participated in CASP5 and applied PROFESY to four new-fold targets for blind prediction. The results are quite promising, despite the fact that PROFESY was in its early stages of development. In particular, PROFESY successfully provided us with the best model-one structure for the target T0161.

2.
Bordner AJ  Abagyan RA 《Proteins》2004,57(2):400-413
We have developed a method to predict both the geometry and the relative stability of point mutants that may be used for arbitrary mutations. The geometry optimization procedure was first tested on a new benchmark of 2141 ordered pairs of X-ray crystal structures of proteins that differ by a single point mutation, the largest data set to date. An empirical energy function, which includes terms representing the energy contributions of the folded and denatured proteins and uses the predicted mutant side chain conformation, was fit to a training set consisting of half of a diverse set of 1816 experimental stability values for single point mutations in 81 different proteins. The data included a substantial number of small to large residue mutations not considered by previous prediction studies. After removing 22 (approximately 2%) outliers, the stability calculation gave a standard deviation of 1.08 kcal/mol with a correlation coefficient of 0.82. The prediction method was then tested on the remaining half of the experimental data, giving a standard deviation of 1.10 kcal/mol and a correlation coefficient of 0.66 for 97% of the test set. A regression fit of the energy function to a subset of 137 mutants, for which both native and mutant structures were available, gave a prediction error comparable to that for the complete training set with predicted side chain conformations. We found that about half of the variation is due to conformation-independent residue contributions. Finally, a fit to the experimental stability data using these residue parameters exclusively suggests guidelines for improving protein stability in the absence of detailed structure information.
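
Illustrative sketch (not taken from the paper): the two figures of merit quoted above can be computed from paired predicted and measured stability changes as shown below. It is assumed here that "standard deviation" means the sample standard deviation of the prediction errors and that the correlation coefficient is Pearson's r; the function name and the sample values are hypothetical.

```python
import math


def error_std_and_correlation(predicted, measured):
    """Return (sample std of prediction errors, Pearson correlation)."""
    n = len(predicted)
    if n != len(measured) or n < 2:
        raise ValueError("need two equal-length sequences with at least 2 values")

    # Spread of the prediction errors around their mean (n - 1 denominator).
    errors = [p - m for p, m in zip(predicted, measured)]
    mean_err = sum(errors) / n
    std_err = math.sqrt(sum((e - mean_err) ** 2 for e in errors) / (n - 1))

    # Pearson correlation between predicted and measured values.
    mean_p = sum(predicted) / n
    mean_m = sum(measured) / n
    cov = sum((p - mean_p) * (m - mean_m) for p, m in zip(predicted, measured))
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    var_m = sum((m - mean_m) ** 2 for m in measured)
    return std_err, cov / math.sqrt(var_p * var_m)


if __name__ == "__main__":
    # Hypothetical stability changes in kcal/mol, for illustration only.
    print(error_std_and_correlation([0.5, 1.2, -0.3, 2.0], [0.7, 1.0, 0.1, 1.6]))
```

Reporting both numbers is useful because a method can correlate well with experiment while still carrying a large absolute error, or the reverse.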

3.
Deane CM  Blundell TL 《Proteins》2000,40(1):135-144
We present a fast ab initio method for the prediction of local conformations in proteins. The program, PETRA, selects polypeptide fragments from a computer-generated database (APD) encoding all possible peptide fragments up to twelve amino acids long. Each fragment is defined by a representative set of eight straight phi/psi pairs, obtained iteratively from a trial set by calculating how fragments generated from them represent the Protein Data Bank (PDB). Ninety-six percent (96%) of length-five fragments in crystal structures, with a resolution better than 1.5 Å and less than 25% identity, have a conformer in the database with less than 1 Å root-mean-square deviation (rmsd). In order to select segments from APD, PETRA uses a set of simple rule-based filters, thus reducing the number of potential conformations to a manageable total. This reduced set is scored and sorted using rmsd fit to the anchor regions and a knowledge-based energy function dependent on the sequence to be modelled. The best scoring fragments can then be optimized by minimization of contact potentials and rmsd fit to the core model. The quality of the prediction made by PETRA is evaluated by calculating both the differences in rmsd and backbone torsion angles between the final model and the native fragment. The average rmsd ranges from 1.4 Å for three-residue loops to 3.9 Å for eight-residue loops.

4.
Lee J  Lee J  Sasaki TN  Sasai M  Seok C  Lee J 《Proteins》2011,79(8):2403-2417
Ab initio protein structure prediction is a challenging problem that requires both an accurate energetic representation of a protein structure and an efficient conformational sampling method for successful protein modeling. In this article, we present an ab initio structure prediction method which combines a recently suggested way of fragment assembly, dynamic fragment assembly (DFA), with the conformational space annealing (CSA) algorithm. In DFA, model structures are scored by continuous functions constructed based on short- and long-range structural restraint information from a fragment library. Here, DFA is represented by the full-atom model by CHARMM with the addition of the empirical potential of DFIRE. The relative contributions between various energy terms are optimized using linear programming. The conformational sampling was carried out with the CSA algorithm, which can find low energy conformations more efficiently than the simulated annealing used in the existing DFA study. The newly introduced DFA energy function and CSA sampling algorithm are implemented into CHARMM. Test results on 30 small single-domain proteins and 13 template-free modeling targets of the 8th Critical Assessment of protein Structure Prediction show that the current method provides comparable and complementary prediction results to existing top methods.

5.
We recently developed the Rosetta algorithm for ab initio protein structure prediction, which generates protein structures from fragment libraries using simulated annealing. The scoring function in this algorithm favors the assembly of strands into sheets. However, it does not discriminate between different sheet motifs. After generating many structures using Rosetta, we found that the folding algorithm predominantly generates very local structures. We surveyed the distribution of beta-sheet motifs with two edge strands (open sheets) in a large set of non-homologous proteins. We investigated how much of that distribution can be accounted for by rules previously published in the literature, and developed a filter and a scoring method that enable us to improve protein structure prediction for beta-sheet proteins. Proteins 2002;48:85-97.

6.
Forrest LR  Woolf TB 《Proteins》2003,52(4):492-509
The recent determination of crystal structures for several important membrane proteins opens the way for comparative modeling of their membrane-spanning regions. However, the ability to predict correctly the structures of loop regions, which may be critical, for example, in ligand binding, remains a considerable challenge. To meet this challenge, accurate scoring methods have to discriminate between candidate conformations of an unknown loop structure. Some success in loop prediction has been reported for globular proteins; however, the proximity of membrane protein loops to the lipid bilayer casts doubt on the applicability of the same scoring methods to this problem. In this work, we develop "decoy libraries" of non-native folds generated, using the structures of two membrane proteins, with molecular dynamics and Monte Carlo techniques over a range of temperatures. We introduce a new approach for decoy library generation by constructing a flat distribution of conformations covering a wide range of Cα root-mean-square deviation (RMSD) from the native structure; this removes possible bias in subsequent scoring stages. We then score these decoy conformations with effective energy functions, using increasingly CPU-intensive implicit solvent models, including (1) simple Coulombic electrostatics with constant or distance-dependent dielectrics; (2) atomic solvation parameters; (3) the effective energy function (EEF1) of Lazaridis and Karplus; (4) generalized Born/Analytical Continuum Solvent; and (5) finite-difference Poisson-Boltzmann energy functions. We show that distinction of native-like membrane protein loops may be achieved using effective energies with the assumption of a homogeneous environment; thus, the absence of the adjacent lipid bilayer does not affect the scoring ability. In particular, the Analytical Continuum Solvent and finite-difference Poisson-Boltzmann energy functions are seen to be the most powerful scoring functions. Interestingly, the use of the uncharged states of ionizable sidechains is shown to aid prediction, particularly for the simplest energy functions.

7.
Side-chain modeling with an optimized scoring function (total citations: 1; self-citations: 0; citations by others: 1)
Modeling side-chain conformations on a fixed protein backbone has a wide application in structure prediction and molecular design. Each effort in this field requires decisions about a rotamer set, scoring function, and search strategy. We have developed a new and simple scoring function, which operates on side-chain rotamers and consists of the following energy terms: contact surface, volume overlap, backbone dependency, electrostatic interactions, and desolvation energy. The weights of these energy terms were optimized to achieve the minimal average root mean square (rms) deviation between the lowest energy rotamer and real side-chain conformation on a training set of high-resolution protein structures. In the course of optimization, for every residue, its side chain was replaced by varying rotamers, whereas conformations for all other residues were kept as they appeared in the crystal structure. We obtained prediction accuracy of 90.4% for chi(1), 78.3% for chi(1 + 2), and 1.18 Å overall rms deviation. Furthermore, the derived scoring function combined with a Monte Carlo search algorithm was used to place all side chains onto a protein backbone simultaneously. The average prediction accuracy was 87.9% for chi(1), 73.2% for chi(1 + 2), and 1.34 Å rms deviation for 30 protein structures. Our approach was compared with available side-chain construction methods and showed improvement over the best among them: 4.4% for chi(1), 4.7% for chi(1 + 2), and 0.21 Å for rms deviation. We hypothesize that the scoring function rather than the search strategy is the main obstacle in side-chain modeling. Additionally, we show that a more detailed rotamer library is expected to increase chi(1 + 2) prediction accuracy but may have little effect on chi(1) prediction accuracy.
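
Illustrative sketch (not taken from the paper): a chi(1) accuracy such as the percentages quoted above is commonly computed as the fraction of residues whose predicted dihedral falls within a tolerance of the native value. The 40 degree tolerance below is an assumption, as the abstract does not state the criterion used; the function names are hypothetical.

```python
def angle_difference(a, b):
    """Smallest absolute difference between two angles, in degrees (0 to 180)."""
    d = (a - b) % 360.0
    return min(d, 360.0 - d)


def chi1_accuracy(predicted, native, tolerance=40.0):
    """Fraction of residues whose predicted chi1 lies within `tolerance` of native.

    `tolerance` is an assumed cutoff, not a value given in the abstract.
    """
    if len(predicted) != len(native) or not predicted:
        raise ValueError("need two non-empty, equal-length angle lists")
    hits = sum(
        1 for p, n in zip(predicted, native) if angle_difference(p, n) <= tolerance
    )
    return hits / len(predicted)


if __name__ == "__main__":
    # Hypothetical chi1 angles in degrees; -170 and 170 are only 20 degrees apart.
    print(chi1_accuracy([-170.0, 60.0, 180.0], [170.0, 65.0, 60.0]))
```

The periodic difference matters because dihedral angles wrap around at 180 degrees. A chi(1 + 2) accuracy would count a residue as correct only when both chi1 and chi2 pass the same test.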

8.
Prediction of protein structure depends on the accuracy and complexity of the models used. Here, we represent the polypeptide chain by a sequence of rigid fragments that are concatenated without any degrees of freedom. Fragments chosen from a library of representative fragments are fit to the native structure using a greedy build-up method. This gives a one-dimensional representation of native protein three-dimensional structure whose quality depends on the nature of the library. We use a novel clustering method to construct libraries that differ in the fragment length (four to seven residues) and number of representative fragments they contain (25-300). Each library is characterized by the quality of fit (accuracy) and the number of allowed states per residue (complexity). We find that the accuracy depends on the complexity and varies from 2.9 Å for a 2.7-state model based on fragments of length 7 to 0.76 Å for a 15-state model based on fragments of length 5. Our goal is to find representations that are both accurate and economical (low complexity). The models defined here are substantially better in this regard: with ten states per residue we approximate native protein structure to 1 Å compared to over 20 states per residue needed previously. For the same complexity, we find that longer fragments provide better fits. Unfortunately, libraries of longer fragments must be much larger (for ten states per residue, a seven-residue library is 100 times larger than a five-residue library). As the number of known protein native structures increases, it will be possible to construct larger libraries to better exploit this correlation between neighboring residues. Our fragment libraries, which offer a wide range of optimal fragments suited to different accuracies of fit, may prove to be useful for generating better decoy sets for ab initio protein folding and for generating accurate loop conformations in homology modeling.
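
Illustrative sketch (not taken from the paper): the abstract does not state how library size and complexity are related. One relation consistent with its numbers assumes that consecutive fragments overlap by three residues, so a fragment of length f adds f - 3 new residues and a library of s fragments gives s^(1/(f - 3)) states per residue. The overlap of three and the function names are assumptions.

```python
def library_size(states_per_residue, fragment_length, overlap=3):
    """Library size needed for a given complexity (assumed overlap of 3 residues)."""
    new_residues = fragment_length - overlap
    if new_residues < 1:
        raise ValueError("fragment_length must exceed overlap")
    return states_per_residue ** new_residues


def states_per_residue(size, fragment_length, overlap=3):
    """Inverse relation: complexity of a library of `size` fragments."""
    return size ** (1.0 / (fragment_length - overlap))


if __name__ == "__main__":
    # Ratio of seven-residue to five-residue library sizes at ten states per residue.
    print(library_size(10, 7) / library_size(10, 5))
    # Library sizes implied for the two models quoted in the abstract.
    print(library_size(2.7, 7), library_size(15, 5))
```

Under this assumption the ratio at ten states per residue is 10^4 / 10^2 = 100, matching the factor quoted in the abstract, and the two quoted models correspond to libraries of roughly 53 and 225 fragments, both inside the stated 25-300 range.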

9.
Hartmann C  Antes I  Lengauer T 《Proteins》2009,74(3):712-726
We describe a scoring and modeling procedure for docking ligands into protein models that have either modeled or flexible side-chain conformations. Our methodological contribution comprises a procedure for generating new potentials of mean force for the ROTA scoring function which we have introduced previously for optimizing side-chain conformations with the tool IRECS. The ROTA potentials are specially trained to tolerate small-scale positional errors of atoms that are characteristic of (i) side-chain conformations that are modeled using a sparse rotamer library and (ii) ligand conformations that are generated using a docking program. We generated both rigid and flexible protein models with our side-chain prediction tool IRECS and docked ligands to proteins using the scoring function ROTA and the docking programs FlexX (for rigid side chains) and FlexE (for flexible side chains). We validated our approach on the forty screening targets of the DUD database. The validation shows that the ROTA potentials are especially well suited for estimating the binding affinity of ligands to proteins. The results also show that our procedure can compensate for the performance decrease in screening that occurs when using protein models with side chains modeled with a rotamer library instead of using X-ray structures. The average runtime per ligand of our method is 168 seconds on an Opteron V20z, which is fast enough to allow virtual screening of compound libraries for drug candidates.

10.
Flexible loop regions of proteins play a crucial role in many biological functions such as protein–ligand recognition, enzymatic catalysis, and protein–protein association. To date, most computational methods that predict the conformational states of loops only focus on individual loop regions. However, loop regions are often spatially in close proximity to one another and their mutual interactions stabilize their conformations. We have developed a new method, titled CorLps, capable of simultaneously predicting such interacting loop regions. First, an ensemble of individual loop conformations is generated for each loop region. The members of the individual ensembles are combined and are accepted or rejected based on a steric clash filter. After a subsequent side‐chain optimization step, the resulting conformations of the interacting loops are ranked by the statistical scoring function DFIRE that originated from protein structure prediction. Our results show that predicting interacting loops with CorLps is superior to sequential prediction of the two interacting loop regions, and our method is comparable in accuracy to single loop predictions. Furthermore, improved predictive accuracy of the top‐ranked solution is achieved for 12‐residue length loop regions by diversifying the initial pool of individual loop conformations using a quality threshold clustering algorithm. Proteins 2010. © 2010 Wiley‐Liss, Inc.
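
Illustrative sketch (not taken from the paper): the combination step described above can be pictured as pairing every conformation of one loop with every conformation of the other and discarding pairs that clash. The 3.0 Å cutoff, the representation of a loop as a list of atom coordinates, and the function names are assumptions; the abstract does not describe how CorLps defines a clash.

```python
import itertools
import math


def has_clash(loop_a, loop_b, cutoff=3.0):
    """True if any atom of loop_a is closer than `cutoff` (Å) to any atom of loop_b."""
    return any(math.dist(p, q) < cutoff for p in loop_a for q in loop_b)


def combine_loops(ensemble_a, ensemble_b, cutoff=3.0):
    """Return every clash-free (conformation_a, conformation_b) pair."""
    return [
        (a, b)
        for a, b in itertools.product(ensemble_a, ensemble_b)
        if not has_clash(a, b, cutoff)
    ]


if __name__ == "__main__":
    # Two hypothetical single-atom "conformations" per loop, coordinates in Å.
    ensemble_a = [[(0.0, 0.0, 0.0)], [(10.0, 0.0, 0.0)]]
    ensemble_b = [[(1.0, 0.0, 0.0)], [(20.0, 0.0, 0.0)]]
    print(len(combine_loops(ensemble_a, ensemble_b)))
```

The surviving pairs would then go on to side-chain optimization and ranking, which this sketch does not attempt. The all-pairs check costs time proportional to the product of the ensemble sizes and atom counts, so a real implementation would likely use a spatial grid.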

11.
We use a homotopy optimization method, HOPE, to minimize the potential energy associated with a protein model. The method uses the minimum energy conformation of one protein as a template to predict the lowest energy structure of a query sequence. This objective is achieved by following a path of conformations determined by a homotopy between the potential energy functions for the two proteins. Ensembles of solutions are produced by perturbing conformations along the path, increasing the likelihood of predicting correct structures. Successful results are presented for pairs of homologous proteins, where HOPE is compared to a variant of Newton's method and to simulated annealing.
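
Illustrative sketch (not taken from the paper): a homotopy between two energy functions can be written as H(x, λ) = (1 - λ) E_template(x) + λ E_query(x), and the minimum is tracked while λ moves from 0 to 1. The one-dimensional toy energies, the plain gradient-descent inner loop, and all names below are assumptions; the perturbation step that HOPE uses to build ensembles is omitted.

```python
def follow_homotopy(grad_template, grad_query, x_start,
                    n_lambda=20, n_steps=200, step=0.1):
    """Track a minimizer of (1 - lam) * E_template + lam * E_query as lam goes 0 -> 1.

    Returns a list of (lam, x) pairs, one per value of lam.
    """
    x = x_start
    path = []
    for k in range(n_lambda + 1):
        lam = k / n_lambda
        # Re-minimize the blended energy, starting from the previous minimizer.
        for _ in range(n_steps):
            gradient = (1.0 - lam) * grad_template(x) + lam * grad_query(x)
            x -= step * gradient
        path.append((lam, x))
    return path


if __name__ == "__main__":
    # Toy energies: E_template = (x - 1)^2 and E_query = (x - 3)^2.
    path = follow_homotopy(lambda x: 2.0 * (x - 1.0), lambda x: 2.0 * (x - 3.0), 1.0)
    for lam, x in path[::5]:
        print(lam, x)
```

For these quadratic toy energies the tracked minimum moves linearly from the template minimum at x = 1 to the query minimum at x = 3. The appeal of the approach is that each re-minimization starts close to a minimum of the slightly changed energy.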

12.
13.
14.
Energy functions, fragment libraries, and search methods constitute three key components of fragment‐assembly methods for protein structure prediction, which are all crucial for their ability to generate high‐accuracy predictions. All of these components are tightly coupled; efficient searching becomes more important as the quality of fragment libraries decreases. Given these relationships, there is a poor understanding of the strengths and weaknesses of the sampling approaches currently used in fragment‐assembly techniques. Here, we determine how the performance of search techniques can be assessed in a meaningful manner, given the above problems. We describe a set of techniques that aim to reduce the impact of the energy function, and assess exploration in view of the search space defined by a given fragment library. We illustrate our approach using Rosetta and EdaFold, and show how certain features of these methods encourage or limit conformational exploration. We demonstrate that individual trajectories of Rosetta are susceptible to local minima in the energy landscape, and that this can be linked to non‐uniform sampling across the protein chain. We show that EdaFold's novel approach can help balance broad exploration with locating good low‐energy conformations. This occurs through two mechanisms which cannot be readily differentiated using standard performance measures: exclusion of false minima, followed by an increasingly focused search in low‐energy regions of conformational space. Measures such as ours can be helpful in characterizing new fragment‐based methods in terms of the quality of conformational exploration realized. Proteins 2016; 84:411–426. © 2016 The Authors Proteins: Structure, Function, and Bioinformatics Published by Wiley Periodicals, Inc.

15.
We have revisited the protein coarse-grained optimized potential for efficient structure prediction (OPEP). The training and validation sets consist of 13 and 16 protein targets, respectively. Because optimization depends on details of how the ensemble of decoys is sampled, trial conformations are generated by molecular dynamics, threading, greedy, and Monte Carlo simulations, or taken from publicly available databases. The OPEP parameters are varied by a genetic algorithm using a scoring function which requires that the native structure has the lowest energy, and the native-like structures have energy higher than the native structure but lower than the remote conformations. Overall, we find that OPEP correctly identifies 24 native or native-like states for 29 targets and has very similar capability to the all-atom discrete optimized protein energy model (DOPE), found recently to outperform five currently used energy models.
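
Illustrative sketch (not taken from the paper): the ordering requirement described above (native below native-like, native-like below remote) can be turned into a number that a genetic algorithm could minimize, for example by counting violated inequalities. The counting scheme and the function name are assumptions; the abstract does not give the actual form of the scoring function.

```python
def count_violations(e_native, e_native_like, e_remote):
    """Count violated ordering constraints for one target.

    Desired ordering: E(native) < E(native-like) < E(remote).
    """
    # Native-like decoys that score at or below the native structure.
    violations = sum(1 for e in e_native_like if e <= e_native)
    # (native-like, remote) pairs in which the remote decoy does not score higher.
    violations += sum(1 for e_nl in e_native_like for e_r in e_remote if e_nl >= e_r)
    return violations


if __name__ == "__main__":
    # Hypothetical energies in arbitrary units.
    print(count_violations(-10.0, [-8.0, -7.0], [-5.0, -3.0]))
    print(count_violations(-10.0, [-11.0, -4.0], [-5.0]))
```

A parameter set that yields zero violations on every training target satisfies the stated requirement; summing the counts over targets gives one possible fitness value.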

16.
17.
Bordner AJ  Gorin AA 《Proteins》2007,68(2):488-502
Computational prediction of protein complex structures through docking offers a means to gain a mechanistic understanding of protein interactions that mediate biological processes. This is particularly important as the number of experimentally determined structures of isolated proteins exceeds the number of structures of complexes. A comprehensive docking procedure is described in which efficient sampling of conformations is achieved by matching surface normal vectors, fast filtering for shape complementarity, clustering by RMSD, and scoring the docked conformations using a supervised machine learning approach. Contacting residue pair frequencies, residue propensities, evolutionary conservation, and shape complementarity score for each docking conformation are used as input data to a Random Forest classifier. The performance of the Random Forest approach for selecting correctly docked conformations was assessed by cross-validation using a nonredundant benchmark set of X-ray structures for 93 heterodimer and 733 homodimer complexes. The single highest rank docking solution was the correct (near-native) structure for slightly more than one third of the complexes. Furthermore, the fraction of highly ranked correct structures was significantly higher than the overall fraction of correct structures, for almost all complexes. A detailed analysis of the difficult-to-predict complexes revealed that the majority of the homodimer cases were explained by incorrect oligomeric state annotation. Evolutionary conservation and shape complementarity score as well as both underrepresented and overrepresented residue types and residue pairs were found to make the largest contributions to the overall prediction accuracy. Finally, the method was also applied to docking unbound subunit structures from a previously published benchmark set.

18.
We have applied the orthogonal array method to optimize the parameters in the genetic algorithm of the protein folding problem. Our study employed a 210-type lattice model to describe proteins, where the orientation of a residue relative to its neighboring residue is described by two angles. The statistical analysis and graphic representation show that the two angles characterize protein conformations effectively. Our energy function includes a repulsive energy, an energy for the secondary structure preference, and a pairwise contact potential. We used an orthogonal array to optimize the parameters of the population, mating factor, mutation factor, and selection factor in the genetic algorithm. By designing an orthogonal set of trials with representative combinations of these parameters, we efficiently determined the optimal set of parameters through a hierarchical search. The optimal parameters were obtained from the protein crambin and applied to the structure prediction of cytochrome B562. The results indicate that the genetic algorithm with the optimal parameters reduces the computing time to reach a converged energy compared to nonoptimal parameters. It is also less likely to be trapped in a local energy minimum, and predicts a protein structure which is closer to the experimental one. Our method may also be applicable to many other optimization problems in computational biology.
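
Illustrative sketch (not taken from the paper): an orthogonal array lets four parameters be screened in far fewer trials than a full grid. The construction below builds a standard nine-trial array for four factors at three levels each and checks the balance property. The choice of three levels per parameter is an assumption; the abstract does not say which array or how many levels were used.

```python
import itertools
from collections import Counter


def l9_orthogonal_array():
    """Nine trials for four three-level factors (levels 0, 1, 2)."""
    return [
        (a, b, (a + b) % 3, (a + 2 * b) % 3)
        for a in range(3)
        for b in range(3)
    ]


def is_orthogonal(array, levels=3):
    """True if every pair of columns contains each level pair equally often."""
    expected = len(array) // (levels * levels)
    for i, j in itertools.combinations(range(len(array[0])), 2):
        counts = Counter((row[i], row[j]) for row in array)
        if len(counts) != levels * levels or set(counts.values()) != {expected}:
            return False
    return True


if __name__ == "__main__":
    # Columns could stand for population, mating, mutation, and selection factors.
    for trial in l9_orthogonal_array():
        print(trial)
    print(is_orthogonal(l9_orthogonal_array()))
```

Nine trials replace the 3^4 = 81 combinations of a full factorial design, while every pair of parameter levels still appears together exactly once.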

19.
Despite years of effort, the problem of predicting the conformations of protein side chains remains a subject of inquiry. This problem has three major issues, namely defining the conformations that a side chain may adopt within a protein, developing a sampling procedure for generating possible side‐chain packings, and defining a scoring function that can rank these possible packings. To solve the first of these issues, most procedures rely on a rotamer library derived from databases of known protein structures. We introduce an alternative method that is free of statistics. We begin with a rotamer library that is based only on stereochemical considerations; this rotamer library is then optimized independently for each protein under study. We show that this optimization step restores the diversity of conformations observed in native proteins. We combine this protein‐dependent rotamer library (PDRL) method with the self‐consistent mean field (SCMF) sampling approach and a physics‐based scoring function into a new side‐chain prediction method, SCMF–PDRL. Using two large test sets of 831 and 378 proteins, respectively, we show that this new method compares favorably with competing methods such as SCAP, OPUS‐Rota, and SCWRL4 for energy‐minimized structures. Proteins 2014; 82:2000–2017. © 2014 Wiley Periodicals, Inc.

20.
The application of all-atom force fields (and explicit or implicit solvent models) to protein homology-modeling tasks such as side-chain and loop prediction remains challenging both because of the expense of the individual energy calculations and because of the difficulty of sampling the rugged all-atom energy surface. Here we address this challenge for the problem of loop prediction through the development of numerous new algorithms, with an emphasis on multiscale and hierarchical techniques. As a first step in evaluating the performance of our loop prediction algorithm, we have applied it to the problem of reconstructing loops in native structures; we also explicitly include crystal packing to provide a fair comparison with crystal structures. In brief, large numbers of loops are generated by using a dihedral angle-based buildup procedure followed by iterative cycles of clustering, side-chain optimization, and complete energy minimization of selected loop structures. We evaluate this method by using the largest test set yet used for validation of a loop prediction method, with a total of 833 loops ranging from 4 to 12 residues in length. Average/median backbone root-mean-square deviations (RMSDs) to the native structures (superimposing the body of the protein, not the loop itself) are 0.42/0.24 Å for 5-residue loops, 1.00/0.44 Å for 8-residue loops, and 2.47/1.83 Å for 11-residue loops. Median RMSDs are substantially lower than the averages because of a small number of outliers; the causes of these failures are examined in some detail, and many can be attributed to errors in assignment of protonation states of titratable residues, omission of ligands from the simulation, and, in a few cases, probable errors in the experimentally determined structures. When these obvious problems in the data sets are filtered out, average RMSDs to the native structures improve to 0.43 Å for 5-residue loops, 0.84 Å for 8-residue loops, and 1.63 Å for 11-residue loops. In the vast majority of cases, the method locates energy minima that are lower than or equal to that of the minimized native loop, thus indicating that sampling rarely limits prediction accuracy. The overall results are, to our knowledge, the best reported to date, and we attribute this success to the combination of an accurate all-atom energy function, efficient methods for loop buildup and side-chain optimization, and, especially for the longer loops, the hierarchical refinement protocol.
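
Illustrative sketch (not taken from the paper): when the body of the protein has already been superimposed, as described above, the loop RMSD is computed directly from the loop coordinates without any further fitting. The function name and the sample coordinates are hypothetical.

```python
import math


def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between matched atoms, with no re-superposition.

    Both coordinate lists must already be in the same frame of reference.
    """
    n = len(coords_a)
    if n == 0 or n != len(coords_b):
        raise ValueError("need two non-empty, equal-length coordinate lists")
    squared = sum(math.dist(p, q) ** 2 for p, q in zip(coords_a, coords_b))
    return math.sqrt(squared / n)


if __name__ == "__main__":
    # Hypothetical backbone atom positions in Å; every atom is displaced by 1 Å.
    native = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (3.0, 0.0, 0.0)]
    model = [(0.0, 0.0, 1.0), (1.5, 0.0, 1.0), (3.0, 0.0, 1.0)]
    print(rmsd(model, native))
```

Skipping the loop-only superposition makes the measure stricter, because a loop with the right shape but the wrong placement relative to the rest of the protein is penalized.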
