Similar literature
1.
We present an analysis of 10 blind predictions prepared for a recent conference, “Critical Assessment of Techniques for Protein Structure Prediction.” The sequences of these proteins are not detectably similar to those of any protein in the structure database then available, but we attempted, by a threading method, to recognize similarity to known domain folds. Four of the 10 proteins, as we subsequently learned, do indeed show significant similarity to then-known structures. For 2 of these proteins the predictions were accurate, in the sense that a similar structure was at or near the top of the list of threading scores, and the threading alignment agreed well with the corresponding structural alignment. For the best predicted model, the mean alignment error relative to the optimal structural alignment was 2.7 residues, arising entirely from small “register shifts” of strands or helices. In the analysis we attempt to identify factors responsible for these successes and failures. Since our threading method does not use gap penalties, we may readily distinguish between errors arising from our prior definition of the “cores” of known structures and errors arising from inherent limitations in the threading potential. It would appear from the results that successful substructure recognition depends most critically on accurate definition of the “fold” of a database protein. This definition must correctly delineate substructures that are, and are not, likely to be conserved during protein evolution. © 1995 Wiley-Liss, Inc.
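The "mean alignment error" quoted in this abstract can be illustrated with a minimal sketch. It assumes that the error is the average absolute offset, in residues, between the template position assigned by the threading alignment and the position assigned by the structural alignment; the abstract does not give the exact definition, and the function name and toy alignments below are hypothetical.

```python
from typing import Dict


def mean_alignment_shift(predicted: Dict[int, int], reference: Dict[int, int]) -> float:
    """Average absolute offset (in residues) between two alignments.

    Each alignment maps a query residue index to a template residue index.
    Only query residues aligned in BOTH alignments are compared
    (an assumption made for this sketch, not the authors' stated method).
    """
    shared = predicted.keys() & reference.keys()
    if not shared:
        raise ValueError("the two alignments share no aligned residues")
    return sum(abs(predicted[i] - reference[i]) for i in shared) / len(shared)


# Toy example: the last two residues are shifted by two positions,
# mimicking a small "register shift" of a strand or helix.
reference_alignment = {1: 10, 2: 11, 3: 12, 4: 13}
threading_alignment = {1: 10, 2: 11, 3: 14, 4: 15}
shift = mean_alignment_shift(threading_alignment, reference_alignment)
```

Under this definition, a value of 0 means the two alignments agree at every shared position.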

2.
We introduce a side‐chain‐inclusive scoring function, named OPUS‐SSF, for ranking protein structural models. The method builds a scoring function based on the native distributions of the coordinate components of certain anchoring points in a local molecular system for peptide segments of 5, 7, 9, and 11 residues in length. Unlike our previous OPUS‐CSF [Xu et al., Protein Sci. 2018; 27: 286–292], which exclusively uses main chain information, OPUS‐SSF employs anchoring points on side chains so that the effect of side chains is taken into account. The performance of OPUS‐SSF was tested on 15 decoy sets containing 603 proteins in total, and 571 of them had their native structures recognized from their decoys. Like OPUS‐CSF, OPUS‐SSF does not employ the Boltzmann formula in constructing scoring functions. The results indicate that OPUS‐SSF achieves a significant improvement in decoy recognition and should be a very useful tool for protein structural prediction and modeling.

3.
A low-resolution scoring function for the selection of native and near-native structures from a set of predicted structures for a given protein sequence has been developed. The scoring function, ProVal (Protein Validate), used several variables that describe an aspect of protein structure for which the proximity to the native structure can be assessed quantitatively. Among the parameters included are a packing estimate, surface areas, and the contact order. A partial least squares for latent variables (PLS) model was built for each candidate set of the 28 decoy sets of structures generated for 22 different proteins using the described parameters as independent variables. The Cα RMS of the candidate structures versus the experimental structure was used as the dependent variable. The final generalized scoring function was an average of all models derived, ensuring that the function was not optimized for specific fold classes or method of structure generation of the candidate folds. The results show that the crystal structure was scored best in 64% of the 28 test sets and was clearly separated from the decoys in many examples. In all the other cases in which the crystal structure did not rank first, it ranked within the top 10%. Thus, although ProVal could not distinguish between predicted structures that were similar overall in fold quality due to its inherently low resolution, it can clearly be used as a primary filter to eliminate approximately 90% of fold candidates generated by current prediction methods from all-atom modeling and further evaluation. The correlation between the predicted and actual Cα RMS values varies considerably between the candidate fold sets.

4.
QMEAN: A comprehensive scoring function for model quality assessment (cited 3 times: 0 self-citations, 3 by others)

5.
Side-chain modeling with an optimized scoring function (cited 1 time: 0 self-citations, 1 by others)
Modeling side-chain conformations on a fixed protein backbone has wide application in structure prediction and molecular design. Each effort in this field requires decisions about a rotamer set, scoring function, and search strategy. We have developed a new and simple scoring function, which operates on side-chain rotamers and consists of the following energy terms: contact surface, volume overlap, backbone dependency, electrostatic interactions, and desolvation energy. The weights of these energy terms were optimized to achieve the minimal average root mean square (rms) deviation between the lowest energy rotamer and the real side-chain conformation on a training set of high-resolution protein structures. In the course of optimization, for every residue, its side chain was replaced by varying rotamers, whereas conformations for all other residues were kept as they appeared in the crystal structure. We obtained prediction accuracy of 90.4% for χ1, 78.3% for χ1+2, and 1.18 Å overall rms deviation. Furthermore, the derived scoring function combined with a Monte Carlo search algorithm was used to place all side chains onto a protein backbone simultaneously. The average prediction accuracy was 87.9% for χ1, 73.2% for χ1+2, and 1.34 Å rms deviation for 30 protein structures. Our approach was compared with available side-chain construction methods and showed improvement over the best among them: 4.4% for χ1, 4.7% for χ1+2, and 0.21 Å for rms deviation. We hypothesize that the scoring function, rather than the search strategy, is the main obstacle in side-chain modeling. Additionally, we show that a more detailed rotamer library is expected to increase χ1+2 prediction accuracy but may have little effect on χ1 prediction accuracy.

6.
Huang SY, Zou X. Proteins 2008; 72(2):557-579
Using an efficient iterative method, we have developed a distance-dependent knowledge-based scoring function to predict protein-protein interactions. The function, referred to as ITScore-PP, was derived using the crystal structures of a training set of 851 protein-protein dimeric complexes containing true biological interfaces. The key idea of the iterative method for deriving ITScore-PP is to improve the interatomic pair potentials by iteration, until the pair potentials can distinguish true binding modes from decoy modes for the protein-protein complexes in the training set. The iterative method circumvents the challenging reference state problem in deriving knowledge-based potentials. The derived scoring function was used to evaluate the ligand orientations generated by ZDOCK 2.1 and the native ligand structures on a diverse set of 91 protein-protein complexes. For the bound test cases, ITScore-PP yielded a success rate of 98.9% if the top 10 ranked orientations were considered. For the more realistic unbound test cases, the corresponding success rate was 40.7%. Furthermore, for faster orientational sampling, several residue-level knowledge-based scoring functions were also derived following a similar iterative procedure. Among them, the scoring function that uses the side-chain center of mass (SCM) to represent a residue, referred to as ITScore-PP(SCM), showed the best performance and yielded success rates of 71.4% and 30.8% for the bound and unbound cases, respectively, when the top 10 orientations were considered. ITScore-PP was further tested using two other published protein-protein docking decoy sets, the ZDOCK decoy set and the RosettaDock decoy set. In addition to binding mode prediction, the binding scores predicted by ITScore-PP also correlated well with the experimentally determined binding affinities, yielding a correlation coefficient of R = 0.71 on a test set of 74 protein-protein complexes with known affinities.
ITScore-PP is computationally efficient. The average run time for ITScore-PP was about 0.03 second per orientation (including optimization) on a personal computer with a 3.2 GHz Pentium IV CPU and 3.0 GB RAM. ITScore-PP(SCM) is about an order of magnitude faster than ITScore-PP. ITScore-PP and/or ITScore-PP(SCM) can be combined with efficient protein docking software to study protein-protein recognition.
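The "success rate when the top 10 ranked orientations are considered" reported above can be sketched as follows. This is a minimal illustration that assumes a complex counts as a success when at least one of its top-N ranked orientations is correct; the function name, the input layout, and the toy data are hypothetical and not taken from the paper.

```python
from typing import Dict, List


def top_n_success_rate(ranked_hits: Dict[str, List[bool]], n: int = 10) -> float:
    """Fraction of complexes with at least one correct orientation in the top n.

    ranked_hits maps a complex identifier to a list of booleans ordered by
    score rank (best first); True marks an orientation judged correct.
    """
    if not ranked_hits:
        raise ValueError("no complexes supplied")
    successes = sum(any(flags[:n]) for flags in ranked_hits.values())
    return successes / len(ranked_hits)


# Hypothetical toy data: four complexes with ranked correctness flags.
toy_hits = {
    "complex_a": [False, True, False],    # hit at rank 2
    "complex_b": [False] * 12 + [True],   # first hit at rank 13
    "complex_c": [True],                  # hit at rank 1
    "complex_d": [False, False],          # no hit at all
}
rate_top10 = top_n_success_rate(toy_hits, n=10)
```

With 91 test complexes, the reported 98.9% is consistent with 90 successes (90/91 ≈ 0.989) and the reported 40.7% with 37 successes (37/91 ≈ 0.407).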

7.
8.
We present a knowledge‐based function to score protein decoys based on their similarity to native structure. A set of features is constructed to describe the structure and sequence of the entire protein chain. Furthermore, a qualitative relationship is established between the calculated features and the underlying electromagnetic interaction that dominates this scale. The features we use are associated with residue–residue distances, residue–solvent distances, pairwise knowledge‐based potentials and a four‐body potential. In addition, we introduce a new target to be predicted, the fitness score, which measures the similarity of a model to the native structure. This new approach enables us to obtain information both from decoys and from native structures. It is also devoid of previous problems associated with knowledge‐based potentials. These features were obtained for a large set of native and decoy structures, and a back‐propagating neural network was trained to predict the fitness score. Overall, this new scoring potential proved to be superior to the knowledge‐based scoring functions used as its inputs. In particular, in the latest CASP (CASP10) experiment, our method was ranked third for all targets, and second for freely modeled hard targets, among about 200 groups for top model prediction. Ours was the only method ranked in the top three for all targets and for hard targets. This shows that initial results from the novel approach are able to capture details that were missed by a broad spectrum of protein structure prediction approaches. Source code and executables from this work are freely available at http://mathmed.org/#Software and http://mamiris.com/ . Proteins 2014; 82:752–759. © 2013 Wiley Periodicals, Inc.

9.
Critical Assessment of PRedicted Interactions (CAPRI) has proven to be a catalyst for the development of docking algorithms. An essential step in docking is the scoring of predicted binding modes in order to identify stable complexes. In 2005, CAPRI introduced the scoring experiment, in which, upon completion of a prediction round, a larger set of models predicted by different groups and comprising both correct and incorrect binding modes is made available to all participants for testing new scoring functions independently from docking calculations. Here we present an expanded benchmark data set for testing scoring functions, which comprises the consolidated ensemble of predicted complexes made available in the CAPRI scoring experiment since its inception. This consolidated scoring benchmark contains predicted complexes for 15 published CAPRI targets. These targets were subjected to 23 CAPRI assessments, owing to the existence of multiple binding modes for some targets. The benchmark contains more than 19,000 protein complexes. About 10% of the complexes represent docking predictions of acceptable quality or better; the remainder represent incorrect solutions (decoys). The benchmark set contains models predicted by 47 different predictor groups including web servers, which use different docking and scoring procedures, and is arguably as diverse as one may expect, representing the state of the art in protein docking. The data set is publicly available at the following URL: http://cb.iri.univ‐lille1.fr/Users/lensink/Score_set . Proteins 2014; 82:3163–3169. © 2014 Wiley Periodicals, Inc.

10.
Lee MC, Duan Y. Proteins 2004; 55(3):620-634
Recent work has shown the ability of physics-based potentials (e.g., CHARMM and OPLS-AA) and energy minimization to differentiate native protein structures from large ensembles of non-native structures. In this study, we extended previous work by other authors and developed an energy scoring function using a new set of AMBER parameters (also recently developed in our laboratory) in conjunction with molecular dynamics and the Generalized Born solvent model. We evaluated the performance of our new scoring function by examining its ability to distinguish between native and decoy protein structures. Here we present a systematic comparison of our results with those obtained by previous authors using other physics-based potentials. A total of 7 decoy sets, 117 protein sequences, and more than 41,000 structures were evaluated. The results of our study showed that our new scoring function represents a significant improvement over previously published physics-based scoring functions.

11.
We have investigated the effect of rigorous optimization of the MODELLER energy function for possible improvement in protein all‐atom chain‐building. For this we applied the global optimization method called conformational space annealing (CSA) to the standard MODELLER procedure to achieve better energy optimization than what MODELLER provides. The method, which we call MODELLERCSA, is tested on two benchmark sets. The first comprises 298 proteins taken from the HOMSTRAD multiple alignment set. By simply optimizing the MODELLER energy function, we observe significant improvement in side‐chain modeling, where MODELLERCSA provides about 10.7% (14.5%) improvement for χ1 (χ1 + χ2) accuracy compared to the standard MODELLER modeling. The improvement of backbone accuracy by MODELLERCSA is shown to be less prominent, and a similar improvement can be achieved by simply generating many standard MODELLER models and selecting the lowest energy models. However, the level of side‐chain modeling accuracy achieved by MODELLERCSA could not be matched by extensive MODELLER strategies, side‐chain remodeling by SCWRL3, or copying unmutated rotamers. The identical procedure was successfully applied to 100 CASP7 template‐based modeling domains during the prediction season in a blind fashion, and the results are included here for comparison. From this study, we observe a good correlation between the MODELLER energy and the side‐chain accuracy. Our findings indicate that, when a good alignment between a target protein and its templates is provided, thorough optimization of the MODELLER energy function leads to accurate all‐atom models. Proteins 2009. © 2008 Wiley‐Liss, Inc.

12.
In this paper, we report a knowledge-based potential function, named the OPUS-Ca potential, that requires only Cα positions as input. The contributions from other atomic positions were established from pseudo-positions artificially built from a Cα trace for auxiliary purposes. The potential function is formed based on seven major representative molecular interactions in proteins: distance-dependent pairwise energy with orientational preference, hydrogen bonding energy, short-range energy, packing energy, tri-peptide packing energy, three-body energy, and solvation energy. From the testing of decoy recognition on a number of commonly used decoy sets, it is shown that the new potential function outperforms all known Cα-based potentials and most other coarse-grained ones that require more information than Cα positions. We hope that this potential function adds a new tool for protein structural modeling.

13.
The field of computational protein design is reaching its adolescence. Protein design algorithms have been applied to design or engineer proteins that fold, fold faster, catalyze, catalyze faster, signal, and adopt preferred conformational states. Further developments of scoring functions, sampling strategies, and optimization methods will expand the range of applicability of computational protein design to larger and more varied systems, with greater incidence of success. Developments in this field are beginning to have significant impact on biotechnology and chemical biology.

14.
Most scoring functions for protein-protein docking algorithms are either atom-based or residue-based, with the former being able to produce higher quality structures and the latter being more tolerant to conformational changes upon binding. Earlier, we developed the ZRANK algorithm for reranking docking predictions, with a scoring function that contained only atom-based terms. Here we combine ZRANK's atom-based potentials with five residue-based potentials published by other labs, as well as an atom-based potential, IFACE, that we published after ZRANK. We simultaneously optimized the weights for selected combinations of terms in the scoring function, using decoys generated with the protein-protein docking algorithm ZDOCK. We performed rigorous cross validation of the combinations using 96 test cases from a docking benchmark. Judged by the integrative success rate of making 1000 predictions per complex, addition of IFACE and the best residue-based pair potential reduced the number of cases without a correct prediction by 38% and 27% relative to ZDOCK and ZRANK, respectively. Thus, combining residue-based and atom-based potentials into a scoring function can improve performance for protein-protein docking. The resulting scoring function is called IRAD (integration of residue- and atom-based potentials for docking) and is available at http://zlab.umassmed.edu.

15.
Kihara D, Skolnick J. Proteins 2004; 55(2):464-473
The genome-scale threading of five complete microbial genomes is revisited using our state-of-the-art threading algorithm, PROSPECTOR_Q. Considering that structure assignment to an ORF could be useful for predicting biochemical function as well as for analyzing pathways, it is important to assess the current status of genome-scale threading. The fraction of ORFs in each genome to which we could assign protein structures with a reasonably good confidence level is over 72%, which is significantly higher than in earlier studies. Using the assigned structures, we have predicted the function of several ORFs through "single-function" template structures, obtained from an analysis of the relationship between protein fold and function. The fold distribution of the genomes and the effect of the number of homologous sequences on structure assignment are also discussed.

16.
We recently proposed an optimization method to derive energy parameters for simplified models of protein folding. The method is based on the maximization of the thermodynamic average of the overlap between protein native structures and a Boltzmann ensemble of alternative structures. Such a condition enforces protein models whose ground states are most similar to the corresponding native states. We present here an extensive testing of the method for a simple residue-residue contact energy function and for alternative structures generated by threading. The optimized energy function guarantees high stability and a well-correlated energy landscape to most representative structures in the PDB database. Failures in the recognition of the native structure can be attributed to the neglect of interactions between different chains in oligomeric proteins or with cofactors. When these are taken into account, only very few X-ray structures are not recognized. Most of them are short inhibitors or fragments, and one is a structure that presents serious inconsistencies. Finally, we discuss the reasons that make NMR structures more difficult to recognize. Copyright 2001 Wiley-Liss, Inc.
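The quantity being maximized in this abstract, the thermodynamic average of the overlap over a Boltzmann ensemble, can be sketched as below. This is only an illustration of the averaging step under stated assumptions: each alternative structure is assumed to carry an energy and an overlap with the native structure in [0, 1], the temperature factor kT is a free parameter, and the function name is hypothetical. The parameter optimization itself is not shown.

```python
import math
from typing import Sequence


def boltzmann_average_overlap(
    energies: Sequence[float], overlaps: Sequence[float], kT: float = 1.0
) -> float:
    """Boltzmann-weighted mean overlap: sum(q_i * w_i) / sum(w_i), w_i = exp(-E_i / kT)."""
    if len(energies) != len(overlaps) or len(energies) == 0:
        raise ValueError("energies and overlaps must be non-empty and equally long")
    if kT <= 0:
        raise ValueError("kT must be positive")
    # Subtract the minimum energy before exponentiating for numerical stability;
    # the constant factor cancels between numerator and denominator.
    e_min = min(energies)
    weights = [math.exp(-(e - e_min) / kT) for e in energies]
    partition = sum(weights)
    return sum(w * q for w, q in zip(weights, overlaps)) / partition


# Toy ensemble: a native-like low-energy structure and a poor high-energy one.
average = boltzmann_average_overlap([0.0, 100.0], [1.0, 0.0], kT=1.0)
```

When the lowest-energy structures are also the most native-like, the average approaches 1, which is the condition the abstract describes as enforcing ground states similar to the native states.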

17.
The use of knowledge-based scoring functions (KBSFs) for virtual screening and molecular docking has become an established method in drug discovery. The lack of a precise and reliable free energy function that describes the relevant interactions, including water-mediated atomic interactions between amino-acid residues and the ligand, leaves distance-based statistical measures as the only alternative. Until now, all distance-based scoring functions in the KBSF arena have used the atom singularity concept, which neglects the environment of the atom under consideration. We have developed a novel knowledge-based statistical energy function for protein-ligand complexes that takes the atomic environment into account and hence treats a functional group as a singular entity. The proposed knowledge-based scoring function is fast, simple to construct, and easy to use; moreover, it tackles the existing problem of handling molecular orientation in the active-site pocket. We have designed and used a Functional group based Ligand retrieval (FBLR) system, which can identify and detect the orientation of functional groups in a ligand. This decoy searching was used to build the above KBSF to quantify the activity and affinity of high-resolution protein-ligand complexes. We propose the probable use of these decoys in molecular build-up as a de novo drug design approach. We also discuss the possible use of this KBSF in pharmacophore fragment detection and in a pseudo-center-based fragment alignment procedure.

18.
Analysis of the results of the recent protein structure prediction experiment for our method shows that we achieved a high level of success. Of the 18 available prediction targets of known structure, the assessors have identified 11 chains which either entirely match a previously known fold, or which partially match a substantial region of a known fold. Of these 11 chains, we made predictions for 9, and correctly assigned the folds in 5 cases. We have also identified a further 2 chains which also partially match known folds, and both of these were correctly predicted. The success rate for our method under blind testing is therefore 7 out of 11 chains. A further 2 folds could have easily been recognized but failed due to either overzealous filtering of potential matches, or to simple human error on our part. One of the two targets for which we did not submit a prediction, prosubtilisin, would not have been recognized by our usual criteria, but even in this case, it is possible that a correct prediction could have been made by considering a combination of pairwise energy and solvation energy Z-scores. Inspection of the threading alignments for the (αβ)8 barrels provides clues as to how fold recognition by threading works, in that these folds are recognized by parts rather than as a whole. The prospects for developing sequence threading technology further are discussed. © 1995 Wiley-Liss, Inc.

19.
Proteins 2017; 85(4):741-752
Protein–RNA docking is still an open question. One of the main challenges is to develop an effective scoring function that can discriminate near‐native structures from incorrect ones. To address this problem, we have constructed a knowledge‐based residue‐nucleotide pairwise potential that takes secondary structure information into account for nonribosomal protein–RNA docking. Here we developed a weighted combined scoring function, RpveScore, that consists of the pairwise potential and six physics‐based energy terms. The weights were optimized using multiple linear regression by fitting the scoring function to L_rmsd for the bound docking decoys from Benchmark II. The scoring functions were tested on 35 unbound docking cases. The results show that the scoring function RpveScore including all terms performs best. RpveScore was also compared with the statistical mechanics‐based potential ITScore‐PR and the united atom‐based statistical potentials QUASI‐RNP and DARS‐RNP. The success rate of RpveScore is 71.6% for the top 1000 structures, and a near‐native structure is ranked in the top 30 in 25 out of 35 cases. For 32 systems (91.4%), RpveScore can find a binding mode in the top 5 that has at least 50% of the native interface residues on the protein and nucleotides on the RNA. Additionally, it was found that the long‐range electrostatic attractive energy plays an important role in distinguishing near‐native structures from incorrect ones. This work can be helpful for the development of protein–RNA docking methods and for the understanding of protein–RNA interactions. The RpveScore program is available to the public at http://life.bjut.edu.cn/kxyj/kycg/2017116/14845362285362368_1.html . Proteins 2017; 85:741–752. © 2016 Wiley Periodicals, Inc.

20.
Wu S, Zhang Y. Proteins 2008; 72(2):547-556
We develop a new threading algorithm, MUSTER, by extending the previous sequence profile-profile alignment method, PPA. It combines various sequence and structure information into single-body terms which can be conveniently used in dynamic programming search: (1) sequence profiles; (2) secondary structures; (3) structure fragment profiles; (4) solvent accessibility; (5) dihedral torsion angles; (6) hydrophobic scoring matrix. The balance of the weighting parameters is optimized by a grading search based on the average TM-score of 111 training proteins, which shows a better performance than using the conventional optimization methods based on the PROSUP database. The algorithm is tested on 500 nonhomologous proteins independent of the training sets. After removing the homologous templates with a sequence identity to the target >30%, in 224 cases, the first template alignment has the correct topology with a TM-score >0.5. Even with a more stringent cutoff by removing the templates with a sequence identity >20% or detectable by PSI-BLAST with an E-value <0.05, MUSTER is able to identify correct folds in 137 cases with the first model of TM-score >0.5. Depending on the homology cutoffs, the average TM-score of the first threading alignments by MUSTER is 5.1-6.3% higher than that by PPA. This improvement is statistically significant by the Wilcoxon signed rank test with a P-value < 1.0 × 10^-13, which demonstrates the effect of additional structural information on protein fold recognition. The MUSTER server is freely available to the academic community at http://zhang.bioinformatics.ku.edu/MUSTER.
