Similar literature (20 records found)
1.
Most scoring functions for protein-protein docking algorithms are either atom-based or residue-based, with the former able to produce higher-quality structures and the latter more tolerant of conformational changes upon binding. Earlier, we developed the ZRANK algorithm for reranking docking predictions, with a scoring function that contained only atom-based terms. Here we combine ZRANK's atom-based potentials with five residue-based potentials published by other labs, as well as an atom-based potential, IFACE, that we published after ZRANK. We simultaneously optimized the weights for selected combinations of terms in the scoring function, using decoys generated with the protein-protein docking algorithm ZDOCK. We performed rigorous cross validation of the combinations using 96 test cases from a docking benchmark. Judged by the integrative success rate of making 1000 predictions per complex, addition of IFACE and the best residue-based pair potential reduced the number of cases without a correct prediction by 38% and 27% relative to ZDOCK and ZRANK, respectively. Thus, combining residue-based and atom-based potentials into a scoring function can improve performance for protein-protein docking. The resulting scoring function is called IRAD (integration of residue- and atom-based potentials for docking) and is available at http://zlab.umassmed.edu.
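The weighted combination of atom-based and residue-based terms described in this abstract can be illustrated with a minimal Python sketch. The term names, weights, and decoy values below are hypothetical and are not taken from IRAD, ZRANK, or ZDOCK; the sketch only assumes that a lower combined score is better.

```python
from typing import Dict, List


def combined_score(terms: Dict[str, float], weights: Dict[str, float]) -> float:
    """Weighted sum of scoring terms (assumption: lower is better)."""
    return sum(weights[name] * value for name, value in terms.items())


def rerank(decoys: List[Dict[str, float]], weights: Dict[str, float]) -> List[int]:
    """Return decoy indices ordered from best (lowest score) to worst."""
    scores = [combined_score(decoy, weights) for decoy in decoys]
    return sorted(range(len(decoys)), key=lambda i: scores[i])


# Hypothetical weights and per-decoy term values, for illustration only.
weights = {"atom_vdw": 1.0, "atom_elec": 0.5, "residue_pair": 2.0}
decoys = [
    {"atom_vdw": -10.0, "atom_elec": -2.0, "residue_pair": -1.0},
    {"atom_vdw": -8.0, "atom_elec": -4.0, "residue_pair": -3.0},
    {"atom_vdw": -5.0, "atom_elec": -1.0, "residue_pair": 0.0},
]
order = rerank(decoys, weights)
```

In practice the weights would be fitted on training decoys and checked by cross validation, as the abstract describes; here they are fixed by hand.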

2.
Lu H, Lu L, Skolnick J. Biophysical Journal 2003, 84(3): 1895-1901
A residue-based and a heavy atom-based statistical pair potential are developed for use in assessing the strength of protein-protein interactions. To ensure the quality of the potentials, a nonredundant, high-quality dimer database is constructed. The protein complexes in this dataset are checked by a literature search to confirm that they form multimers, and the pairwise amino acid preference to interact across a protein-protein interface is analyzed and pair potentials constructed. The performance of the residue-based potentials is evaluated by using four jackknife tests and by assessing the potentials' ability to select true protein-protein interfaces from false ones. Compared to potentials developed for monomeric protein structure prediction, the interdomain potential performs much better at distinguishing protein-protein interactions. The potential developed from homodimer interfaces is almost the same as that developed from heterodimer interfaces, with a correlation coefficient of 0.92. The residue-based potential is well suited for genomic-scale protein interaction prediction and analysis, such as in a recently developed threading-based algorithm, MULTIPROSPECTOR. However, the more time-consuming atom-based potential performs better in identifying near-native structures from docking-generated decoys.

3.
H Lu, J Skolnick. Proteins 2001, 44(3): 223-232
A heavy atom distance-dependent knowledge-based pairwise potential has been developed. This statistical potential is first evaluated and optimized with the native structure z-scores from gapless threading. The potential is then used to recognize the native and near-native structures from both published decoy test sets and decoys obtained from our group's protein structure prediction program. In the gapless threading test, there is an average z-score improvement of 4 units in the optimized atomic potential over the residue-based quasichemical potential. Examination of the z-scores for individual pairwise distance shells indicates that the specificity for the native protein structure is greatest at pairwise distances of 3.5-6.5 Å, i.e., in the first solvation shell. On applying the current atomic potential to test sets obtained from the web, composed of native protein and decoy structures, the current generation of the potential performs better than residue-based potentials as well as the other published atomic potentials in the task of selecting native and near-native structures. This newly developed potential is also applied to structures of varying quality generated by our group's protein structure prediction program. The current atomic potential tends to pick lower RMSD structures than do residue-based contact potentials. In particular, this atomic pairwise interaction potential has better selectivity, especially for near-native structures. As such, it can be used to select near-native folds generated by structure prediction algorithms as well as for protein structure refinement.

4.
MOTIVATION: Experimental evidence suggests that certain short protein segments have stronger amyloidogenic propensities than others. Identification of the fibril-forming segments of proteins is crucial for understanding diseases associated with protein misfolding and for finding favorable targets for therapeutic strategies. RESULT: In this study, we used the microcrystal structure of the NNQQNY peptide from yeast prion protein and residue-based statistical potentials to establish an algorithm to identify the amyloid fibril-forming segment of proteins. Using the same sets of sequences, this study obtained a prediction performance comparable to that of the 3D profile method based on the physical atomic-level potential ROSETTADESIGN. The predicted results are consistent with experiments for several representative proteins associated with amyloidosis, and also agree with the idea that peptides that can form fibrils may have strong sequence signatures. Application of the residue-based statistical potentials is computationally more efficient than using atomic-level potentials and can be applied in whole proteome analysis to investigate the evolutionary pressure effect or forecast other latent diseases related to amyloid deposits. AVAILABILITY: The fibril prediction program is available at ftp://mdl.ipc.pku.edu.cn/pub/software/pre-amyl/. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.

5.
6.
Statistical potentials for fold assessment
A protein structure model generally needs to be evaluated to assess whether or not it has the correct fold. To improve fold assessment, four types of residue-level statistical potential were optimized: distance-dependent, contact, Phi/Psi dihedral angle, and accessible surface statistical potentials. Approximately 10,000 test models with the correct and incorrect folds were built by automated comparative modeling of protein sequences of known structure. The criterion used to discriminate between the correct and incorrect models was the Z-score of the model energy. The performance of a Z-score was determined as a function of many variables in the derivation and use of the corresponding statistical potential. The performance was measured by the fractions of the correctly and incorrectly assessed test models. The most discriminating combination of the four tested potentials is the sum of the normalized distance-dependent and accessible surface potentials. The distance-dependent potential that is optimal for assessing models of all sizes uses both C(alpha) and C(beta) atoms as interaction centers, distinguishes between all 20 standard residue types, has a distance range of 30 Å, and is derived and used by taking into account the sequence separation of the interacting atom pairs. The terms for the sequentially local interactions are significantly less informative than those for the sequentially nonlocal interactions. The accessible surface potential that is optimal for assessing models of all sizes uses C(beta) atoms as interaction centers and distinguishes between all 20 standard residue types. The performance of the tested statistical potentials is not likely to improve significantly with an increase in the number of known protein structures used in their derivation. The parameters of fold assessment whose optimal values vary significantly with model size include the size of the known protein structures used to derive the potential and the distance range of the accessible surface potential. Fold assessment by statistical potentials is most difficult for the very small models. This difficulty presents a challenge to fold assessment in large-scale comparative modeling, which produces many small and incomplete models. The results described in this study provide a basis for an optimal use of statistical potentials in fold assessment.
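The Z-score criterion mentioned in this abstract can be sketched as follows. This is a minimal illustration, assuming the Z-score is the model energy standardized against a reference set of energies; how that reference set is generated is not stated in the abstract, so the numbers below are made up.

```python
from statistics import mean, pstdev
from typing import Sequence


def energy_z_score(model_energy: float, reference_energies: Sequence[float]) -> float:
    """Standardize a model energy against a reference energy distribution."""
    mu = mean(reference_energies)
    sigma = pstdev(reference_energies)  # population standard deviation
    if sigma == 0:
        raise ValueError("reference energies must not all be identical")
    return (model_energy - mu) / sigma


# Hypothetical reference energies (mean 5, population standard deviation 2).
reference = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
z = energy_z_score(1.0, reference)
```

A strongly negative Z-score means the model energy is well below the reference distribution; a threshold on this value would then separate models assessed as correct from those assessed as incorrect.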

7.
The structure, function, stability, and many other properties of a protein in a fixed environment are fully specified by its sequence, but in a manner that is difficult to discern. We present a general approach for rapidly mapping sequences directly to their energies on a pre-specified rigid backbone, an important sub-problem in computational protein design and in some methods for protein structure prediction. The cluster expansion (CE) method that we employ can, in principle, be extended to model any computable or measurable protein property directly as a function of sequence. Here we show how CE can be applied to the problem of computational protein design, and use it to derive excellent approximations of physical potentials. The approach provides several attractive advantages. First, following a one-time derivation of a cluster expansion, the amount of time necessary to evaluate the energy of a sequence adopting a specified backbone conformation is reduced by a factor of 10^7 compared to standard full-atom methods for the same task. Second, the agreement between two full-atom methods that we tested and their CE sequence-based expressions is very high (root mean square deviation 1.1-4.7 kcal/mol, R^2 = 0.7-1.0). Third, the functional form of the CE energy expression is such that individual terms of the expansion have clear physical interpretations. We derived expressions for the energies of three classic protein design targets (a coiled coil, a zinc finger, and a WW domain) as functions of sequence, and examined the most significant terms. Single-residue and residue-pair interactions are sufficient to accurately capture the energetics of the dimeric coiled coil, whereas higher-order contributions are important for the two more globular folds. For the task of designing novel zinc-finger sequences, a CE-derived energy function provides significantly better solutions than a standard design protocol, in comparable computation time. Given these advantages, CE is likely to find many uses in computational structural modeling.
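A cluster expansion truncated at single-residue and residue-pair terms, the level the abstract reports as sufficient for the coiled coil, can be sketched as a table lookup. The coefficient tables, positions, and values below are hypothetical; the abstract does not give the fitted coefficients or the fitting procedure.

```python
from itertools import combinations
from typing import Dict, Tuple

PointTerms = Dict[Tuple[int, str], float]
PairTerms = Dict[Tuple[int, int, str, str], float]


def ce_energy(sequence: str, constant: float,
              point_terms: PointTerms, pair_terms: PairTerms) -> float:
    """Energy = constant + single-residue terms + residue-pair terms."""
    energy = constant
    # Single-residue contributions, keyed by (position, amino acid).
    for i, aa in enumerate(sequence):
        energy += point_terms.get((i, aa), 0.0)
    # Pair contributions, keyed by (position i, position j, aa at i, aa at j).
    for i, j in combinations(range(len(sequence)), 2):
        energy += pair_terms.get((i, j, sequence[i], sequence[j]), 0.0)
    return energy


# Hypothetical coefficients for a three-position toy backbone.
constant = -1.0
point_terms = {(0, "L"): -0.5, (1, "E"): 0.25, (2, "K"): 0.25}
pair_terms = {(1, 2, "E", "K"): -1.0}
energy = ce_energy("LEK", constant, point_terms, pair_terms)
```

Because evaluation involves only dictionary lookups and additions, it is far cheaper than a full-atom energy calculation, which is the source of the speed-up the abstract reports.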

8.
The prediction of the three-dimensional structures of the native states of proteins from the sequences of their amino acids is one of the most important challenges in molecular biology. An essential task for solving this problem within coarse-grained models is the deduction of effective interaction potentials between the amino acids. Over the years, several techniques have been developed to extract potentials that are able to discriminate satisfactorily between the native and nonnative folds of a preassigned protein sequence. In general, when these potentials are used in actual dynamical folding simulations, they lead to a drift of the native structure outside the quasinative basin. In this article, we present and validate an approach to overcome this difficulty. By exploiting several numerical and analytical tools, we set up a rigorous iterative scheme to extract potentials satisfying a prerequisite of any viable potential: the stabilization of proteins within their native basin (less than 3-4 Å RMSD). The scheme is flexible and is demonstrated to be applicable to a variety of parameterizations of the energy function, and it provides in each case the optimal potentials.

9.
Zheng W, Brooks BR, Hummer G. Proteins 2007, 69(1): 43-57
We develop a mixed elastic network model (MENM) to study large-scale conformational transitions of proteins between two (or more) known structures. Elastic network potentials for the beginning and end states of a transition are combined, in effect, by adding their respective partition functions. The resulting effective MENM energy function smoothly interpolates between the original surfaces, and retains the beginning and end structures as local minima. Saddle points, transition paths, potentials of mean force, and partition functions can be found efficiently by largely analytic methods. To characterize the protein motions during a conformational transition, we follow "transition paths" on the MENM surface that connect the beginning and end structures and are invariant to parameterizations of the model and the mathematical form of the mixing scheme. As illustrations of the general formalism, we study large-scale conformational changes of the motor proteins KIF1A kinesin and myosin II. We generate possible transition paths for these two proteins that reveal details of their conformational motions. The MENM formalism is computationally efficient and generally applicable even for large protein systems that undergo highly collective structural changes.
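The idea of combining two energy surfaces by adding their Boltzmann factors can be illustrated with a one-dimensional toy model. This is a sketch under assumptions: the mixing form -T ln(exp(-E1/T) + exp(-E2/T)), the mixing temperature `t_mix`, and the two harmonic wells are illustrative choices, not the published MENM parameterization, which the abstract does not specify.

```python
import math


def mixed_energy(e1: float, e2: float, t_mix: float) -> float:
    """Return -T * ln(exp(-e1/T) + exp(-e2/T)), evaluated stably."""
    low = min(e1, e2)
    # Factor out the lower energy so both exponents are <= 0.
    total = math.exp(-(e1 - low) / t_mix) + math.exp(-(e2 - low) / t_mix)
    return low - t_mix * math.log(total)


def harmonic(x: float, x0: float, k: float = 1.0) -> float:
    """Harmonic well centred at x0, standing in for one elastic network."""
    return 0.5 * k * (x - x0) ** 2


def menm_1d(x: float, t_mix: float = 1.0) -> float:
    """Toy mixed surface with wells at x = 0 (begin) and x = 4 (end)."""
    return mixed_energy(harmonic(x, 0.0), harmonic(x, 4.0), t_mix)
```

On this toy surface the energy near x = 0 and x = 4 is low and the midpoint x = 2 acts as a barrier, which mirrors the smooth interpolation between two end-state basins described above.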

10.
Zhang J, Zhang Y. PLoS ONE 2010, 5(10): e15386

Background

An accurate potential function is essential to attack protein folding and structure prediction problems. The key to developing efficient knowledge-based potential functions is to design reference states that can appropriately counteract generic interactions. The reference states of many knowledge-based distance-dependent atomic potential functions were derived from non-interacting particles such as an ideal gas, which ignores the inherent sequence connectivity and entropic elasticity of proteins.

Methodology

We developed a new pair-wise distance-dependent atomic statistical potential function (RW), using an ideal random-walk chain as the reference state, which was optimized on CASP models and then benchmarked on nine structural decoy sets. We then incorporated a new side-chain orientation-dependent energy term into RW (RWplus) and found that the side-chain packing orientation specificity can further improve the decoy recognition ability of the statistical potential.

Significance

RW and RWplus demonstrate a significantly better ability than the best performing pair-wise distance-dependent atomic potential functions in both native and near-native model selections. They have higher energy-RMSD and energy-TM-score correlations compared with other potentials of the same type in real-life structure assembly decoys. When benchmarked against a comprehensive list of publicly available potentials, RW and RWplus show comparable performance to the state-of-the-art scoring functions, including those combining terms from multiple resources. These data demonstrate the usefulness of a random-walk chain as the reference state, which correctly accounts for the sequence connectivity and entropic elasticity of proteins. The potentials show potential usefulness in structure recognition and protein folding simulations. The RW and RWplus potentials, as well as the newly generated I-TASSER decoys, are freely available at http://zhanglab.ccmb.med.umich.edu/RW.

11.
S Miyazawa, R L Jernigan. Proteins 1999, 36(3): 347-356
Short-range interactions for secondary structures of proteins are evaluated as potentials of mean force from the observed frequencies of secondary structures in known protein structures which are assumed to have an equilibrium distribution with the Boltzmann factor of secondary structure energies. A secondary conformation at each residue position in a protein is described by a tripeptide, including one nearest neighbor on each side. The secondary structure potentials are approximated as additive contributions from neighboring residues along the sequence. These are part of an empirical potential to provide a crude estimate of protein conformational energy at a residue level. Unlike previous works, interactions are decoupled into intrinsic potentials of residues, potentials of backbone-backbone interactions, and of side chain-backbone interactions. Also interactions are decoupled into one-body, two-body, and higher order interactions between peptide backbone and side chain and between backbones. These decouplings are essential to correctly evaluate the total secondary structure energy of a protein structure without overcounting interactions. Each interaction potential is evaluated separately by taking account of the correlation in the amino acid order of protein sequences. Interactions among side chains are neglected, because of the relatively limited number of protein structures. Proteins 1999;36:347-356. Published 1999 Wiley-Liss, Inc.

12.
Knowledge-based potentials are used widely in protein folding and inverse folding algorithms. Two kinds of derivation methods are used. (1) The interactions in a database of known protein structures are assumed to obey a Boltzmann distribution. (2) The stability of the native folds relative to a manifold of misfolded structures is optimized. Here, a set of previously derived contact and secondary structure propensity potentials, taken as the "true" potentials, are employed to construct an artificial protein structural database from protein fragments. Then, new sets of potentials are derived to see how they are related to the true potentials. Using the Boltzmann distribution method, when the stability of the structures in the database lies within a certain range, both contact potentials and secondary structure propensities can be derived separately with remarkable accuracy. In general, the optimization method was found to be less accurate due to errors in the "excess energy" contribution. When the excess energy terms are kept as a constraint, the true potentials are recovered exactly.

13.
We discuss the derivation of atomic-level potentials of mean force from known protein structures and their applicability for structural evaluation. In the derivation process, rigorous density estimation methodology is used to estimate the probability density functions (PDFs) for the distributions of interatomic distances in the protein structures. Potentials of mean force are then derived from these density functions using the simple Boltzmann relation. We also test the potentials against pairs of current and superseded protein structures in the Protein Data Bank. Using PDF potentials to evaluate each structure pair, we are able to identify, with high accuracy, which of the two structures is of higher resolution or better quality. This result shows that the PDF potentials are sensitive to details in protein structures, as the current and superseded atomic coordinates generally do not differ by more than 1 Å in root-mean-square deviation, and that the PDF potentials could potentially be used for X-ray structure refinement and protein structure prediction.
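The Boltzmann relation used to turn distance distributions into potentials of mean force can be sketched as E(r) = -kT ln(p_obs(r) / p_ref(r)). The sketch below substitutes plain histogram counts for the rigorous density estimation the abstract describes, and the choice of reference distribution and the counts themselves are hypothetical.

```python
import math
from typing import List, Sequence


def boltzmann_inversion(observed_counts: Sequence[int],
                        reference_counts: Sequence[int],
                        kt: float = 1.0) -> List[float]:
    """Per-bin energy -kT * ln(p_obs / p_ref) from two distance histograms."""
    if len(observed_counts) != len(reference_counts):
        raise ValueError("histograms must have the same number of bins")
    if min(observed_counts) <= 0 or min(reference_counts) <= 0:
        raise ValueError("all bins must be populated in this simple sketch")
    n_obs = sum(observed_counts)
    n_ref = sum(reference_counts)
    return [
        -kt * math.log((obs / n_obs) / (ref / n_ref))
        for obs, ref in zip(observed_counts, reference_counts)
    ]


# Hypothetical counts for three distance bins of one atom-pair type.
energies = boltzmann_inversion([10, 30, 60], [25, 50, 25])
```

Bins that are populated more often than the reference predicts receive negative (favourable) energies, and under-populated bins receive positive ones.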

14.
15.
16.
Recognizing structural similarity without significant sequence identity (fold recognition) is an effective method for protein structure prediction. Previously, we developed a fold recognition potential called SORDIS, which incorporated side chain orientation in relation to hydrophobic core centers, distance of the residues from the protein globule center, and secondary structure terms. However, this potential does not include terms based on close contacts between residues. In this paper we present a new fold recognition potential, CONTSOR, which is based on the SORDIS terms plus a term based on contacts between amino acid terminal groups. The performance of this potential was evaluated on the SABmark benchmark for alignment accuracy and on the SABmark and Lindahl benchmarks for fold recognition. The results show that CONTSOR has the best performance among the potentials compared on the SABmark benchmark, both for alignment accuracy and fold recognition, and one of the best performances on the Lindahl benchmark. The CONTSOR software package is available for download at http://www.lifescience.org.ge/downloads/contsor.zip.

17.
We present an energy function for predicting binding free energies of protein-protein complexes, using the three-dimensional structures of the complex and unbound proteins as input. Our function is a linear combination of nine terms and achieves a correlation coefficient of 0.63 with experimental measurements when tested on a benchmark of 144 complexes using leave-one-out cross validation. Although we systematically tested both atomic and residue-based scoring functions, the selected function is dominated by residue-based terms. Our function is stable for subsets of the benchmark stratified by experimental pH and extent of conformational change upon complex formation, with correlation coefficients ranging from 0.61 to 0.66.

18.
Measurements of protein sequence-structure correlations
Crooks GE, Wolfe J, Brenner SE. Proteins 2004, 57(4): 804-810
Correlations between protein structures and amino acid sequences are widely used for protein structure prediction. For example, secondary structure predictors generally use correlations between a secondary structure sequence and corresponding primary structure sequence, whereas threading algorithms and similar tertiary structure predictors typically incorporate interresidue contact potentials. To investigate the relative importance of these sequence-structure interactions, we measured the mutual information among the primary structure, secondary structure and side-chain surface exposure, both for adjacent residues along the amino acid sequence and for tertiary structure contacts between residues distantly separated along the backbone. We found that local interactions along the amino acid chain are far more important than non-local contacts and that correlations between proximate amino acids are essentially uninformative. This suggests that knowledge-based contact potentials may be less important for structure prediction than is generally believed.
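The mutual information between two discrete variables, such as residue type and secondary structure state, can be estimated from paired observations as sketched below. This is the plain plug-in estimator in bits; any bias correction or sampling-error treatment the authors may have applied is not described in the abstract, and the sample data are invented.

```python
import math
from collections import Counter
from typing import Hashable, Sequence, Tuple


def mutual_information(pairs: Sequence[Tuple[Hashable, Hashable]]) -> float:
    """Plug-in estimate of I(X;Y) in bits from (x, y) observations."""
    n = len(pairs)
    joint = Counter(pairs)
    marginal_x = Counter(x for x, _ in pairs)
    marginal_y = Counter(y for _, y in pairs)
    total = 0.0
    for (x, y), count in joint.items():
        # p(x,y) / (p(x) p(y)) simplifies to count * n / (count_x * count_y).
        ratio = count * n / (marginal_x[x] * marginal_y[y])
        total += (count / n) * math.log2(ratio)
    return total


# Hypothetical (residue type, secondary structure) observations.
correlated = [("A", "H"), ("G", "E"), ("A", "H"), ("G", "E")]
independent = [("A", "H"), ("A", "E"), ("G", "H"), ("G", "E")]
```

The perfectly correlated sample carries one bit of information and the independent sample carries none, which brackets the small values typically seen for real sequence-structure pairs.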

19.
We characterize the "sequence landscapes" in several simple, heteropolymer models of proteins by examining their mutation properties. Using an efficient flat-histogram Monte Carlo search method, our approach involves determining the distribution in energy of all sequences of a given length when threaded through a common backbone. These calculations are performed for a number of Protein Data Bank structures using two variants of the 20-letter contact potential developed by Miyazawa and Jernigan [Miyazawa S, Jernigan WL. Macromolecules 1985;18:534], and the 2-monomer HP model of Lau and Dill [Lau KF, Dill KA. Macromolecules 1989;22:3986]. Our results indicate significant differences among the energy functions in terms of the "smoothness" of their landscapes. In particular, one of the Miyazawa-Jernigan contact potentials reveals unusual cooperative behavior among its species' interactions, resulting in what is essentially a set of phase transitions in sequence space. Our calculations suggest that model-specific features can have a profound effect on protein design algorithms, and our methods offer a number of ways by which sequence landscapes can be quantified.
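The distribution in energy of all sequences threaded through a fixed backbone can be illustrated for the two-letter HP model on a toy contact map. This sketch uses exhaustive enumeration, which is only feasible for very short chains, in place of the flat-histogram Monte Carlo search the abstract describes; the contact list is hypothetical and each H-H contact is assumed to contribute -1.

```python
from collections import Counter
from itertools import product
from typing import Sequence, Tuple


def hp_energy(sequence: Sequence[str], contacts: Sequence[Tuple[int, int]]) -> int:
    """HP energy: -1 for every backbone contact between two H residues."""
    return -sum(1 for i, j in contacts if sequence[i] == "H" and sequence[j] == "H")


def energy_histogram(length: int, contacts: Sequence[Tuple[int, int]]) -> Counter:
    """Count how many of the 2**length HP sequences fall at each energy."""
    return Counter(hp_energy(seq, contacts) for seq in product("HP", repeat=length))


# Hypothetical contact map for a four-residue backbone.
contacts = [(0, 2), (1, 3)]
histogram = energy_histogram(4, contacts)
```

For realistic chain lengths and a 20-letter alphabet the sequence space is far too large to enumerate, which is why a sampling method is needed to estimate the same histogram.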

20.