Similar articles
Found 20 similar articles; search took 15 ms
1.
    

2.
    
Shirota M  Ishida T  Kinoshita K 《Proteins》2011,79(5):1550-1563
In protein structure prediction, it is crucial to evaluate the degree of native-likeness of given model structures. Statistical potentials extracted from protein structure data sets are widely used for such quality assessment problems, but they are only applicable for comparing different models of the same protein. Although various other methods, such as machine learning approaches, were developed to predict the absolute similarity of model structures to the native ones, they required a set of decoy structures in addition to the model structures. In this paper, we tried to reformulate the statistical potentials as absolute quality scores, without using the information from decoy structures. For this purpose, we regarded the native state and the reference state, which are necessary components of statistical potentials, as the good and bad standard states, respectively, and first showed that the statistical potentials can be regarded as state functions, which relate a model structure to the native and reference states. Then, we proposed a standardized measure of protein structure, called native-likeness, by interpolating the score of a model structure between the native and reference state scores defined for each protein. The native-likeness correlated with the similarity to the native structures and discriminated the native structures from the models, with better accuracy than the raw score. Our results show that statistical potentials can quantify the native-like properties of protein structures, if they fully utilize the statistical information obtained from the data set.
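The interpolation idea in this abstract can be illustrated with a minimal sketch. The abstract does not state the formula, so the linear form below, the function name `native_likeness`, the assumption that lower scores are better, and the example scores are all hypothetical and are not the authors' actual definition.

```python
def native_likeness(model_score: float, native_score: float, reference_score: float) -> float:
    """Place a model's score on a scale from 0 (reference state) to 1 (native state).

    Assumption: scores are energy-like, so lower means more native-like.
    The linear interpolation is an illustrative choice, not the paper's formula.
    """
    span = reference_score - native_score
    if span == 0:
        raise ValueError("native and reference scores must differ")
    return (reference_score - model_score) / span


# Hypothetical scores for one protein, in arbitrary units.
print(native_likeness(model_score=-80.0, native_score=-120.0, reference_score=0.0))
```

Because the native and reference scores are defined per protein, a value computed this way is comparable across proteins, which a raw score is not.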

3.
    
Simplified force fields play an important role in protein structure prediction and de novo protein design by requiring less computational effort than detailed atomistic potentials. A side chain centroid based, distance dependent pairwise interaction potential has been developed. A linear programming based formulation was used in which non-native "decoy" conformers are forced to take a higher energy compared with the corresponding native structure. This model was trained on an enhanced and diverse protein set. High quality decoy structures were generated for approximately 1400 nonhomologous proteins using torsion angle dynamics along with restricted variations of the hydrophobic cores of the native structure. The resulting decoy set was used to train the model, yielding two different side chain centroid based force fields that differ in the way distance dependence has been used to calculate energy parameters. These force fields were tested on an independent set of 148 test proteins with 500 decoy structures for each protein. The side chain centroid force fields were successful in correctly identifying approximately 86% of the native structures. The Z-scores produced by the proposed centroid-centroid distance dependent force fields improved compared with other distance dependent Cα-Cα or side chain based force fields.
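The two evaluation quantities mentioned in this abstract, whether the native structure is identified and its Z-score against the decoys, can be sketched as follows. This is an illustration only: the function names, the use of the population standard deviation, and the energies are assumptions, and the abstract does not say how the authors computed their Z-scores.

```python
from statistics import mean, pstdev


def native_z_score(native_energy, decoy_energies):
    """Z-score of the native energy relative to the decoy energy distribution.

    More negative values mean the native structure is better separated.
    Assumption: the population standard deviation of the decoys is used.
    """
    sigma = pstdev(decoy_energies)
    if sigma == 0:
        raise ValueError("decoy energies must not all be equal")
    return (native_energy - mean(decoy_energies)) / sigma


def native_rank(native_energy, decoy_energies):
    """Rank of the native structure; rank 1 means no decoy has lower energy."""
    return 1 + sum(e < native_energy for e in decoy_energies)


# Hypothetical energies (arbitrary units) for one protein and four decoys.
decoys = [-8.0, -6.0, -4.0, -2.0]
print(native_rank(-10.0, decoys), native_z_score(-10.0, decoys))
```

A native structure counts as correctly identified when its rank is 1.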

4.
    
This work presents a novel Cα-Cα distance dependent force field which is successful in selecting native structures from an ensemble of high resolution near-native conformers. An enhanced and diverse protein set, along with an improved decoy generation technique, contributes to the effectiveness of this potential. High quality decoys were generated for 1489 nonhomologous proteins and used to train an optimization based linear programming formulation. The goal in developing a set of high resolution decoys was to develop a simple, distance-dependent force field that yields the native structure as the lowest energy structure and assigns higher energies to decoy structures that are quite similar as well as those that are less similar. The model also includes a set of physical constraints that were based on experimentally observed physical behavior of the amino acids. The force field was tested on two sets of test decoys not in the training set and was found to excel on all the metrics that are widely used to measure the effectiveness of a force field. The high resolution force field was successful in correctly identifying 113 native structures out of 150 test cases and the average rank obtained for this test was 1.87. All the high resolution structures (training and testing) used for this work are available online and can be downloaded from http://titan.princeton.edu/HRDecoys.

5.
    
Wallin S  Farwer J  Bastolla U 《Proteins》2003,50(1):144-157

6.
    
Quantitative prediction of protein–protein binding affinity is essential for understanding protein–protein interactions. In this article, an atomic level potential of mean force (PMF) considering volume correction is presented for the prediction of protein–protein binding affinity. The potential is obtained by statistically analyzing X‐ray structures of protein–protein complexes in the Protein Data Bank. This approach circumvents the complicated steps of the volume correction process and is very easy to implement in practice. It yields a more reasonable pair potential than the traditional PMF and shows the classic picture of nonbonded atom pair interaction as a Lennard‐Jones potential. To evaluate the prediction ability for protein–protein binding affinity, six test sets are examined. Sets 1–5 were used as test sets in five published studies, respectively, and set 6 was the union of sets 1–5, with a total of 86 protein–protein complexes. The correlation coefficient (R) and standard deviation (SD) of fitting predicted affinity to experimental data were calculated to compare our performance with that reported in the literature. Our predictions on sets 1–5 were as good as the best prediction reported in the published studies, and for union set 6, R = 0.76, SD = 2.24 kcal/mol. Furthermore, we found that the volume correction can significantly improve the prediction ability. This approach can also promote the research on docking and protein structure prediction.

7.
    
Conditional optimization allows the incorporation of extensive geometrical information in protein structure refinement, without the requirement of an explicit chemical assignment of the individual atoms. Here, a mean‐force potential for the conditional optimization of protein structures is presented that expresses knowledge of common protein conformations in terms of interatomic distances, torsion angles and numbers of neighbouring atoms. Information is included for protein fragments up to several residues long in α‐helical, β‐strand and loop conformations, comprising the main chain and side chains up to the γ position in three distinct rotamers. Using this parameter set, conditional optimization of three small protein structures against 2.0 Å observed diffraction data shows a large radius of convergence, validating the presented force field and illustrating the feasibility of the approach. The generally applicable force field allows the development of novel phase‐improvement procedures using the conditional optimization technique.

8.
Distributions of each amino acid in the trans-membrane domain were calculated as a function of the membrane normal using all currently available alpha-helical membrane protein structures with resolutions better than 4 Å. The results were compared with previous sequence- and structure-based analyses. Calculation of the average hydrophobicity along the membrane normal demonstrated that the protein surface in the membrane domain is in fact much more hydrophobic than the protein core. While hydrophobic residues dominate the membrane domain, the interfacial regions of membrane proteins were found to be abundant in the small residues glycine, alanine, and serine, consistent with previous studies on membrane protein packing. Charged residues displayed nonsymmetric distributions with a preference for the intracellular interface. This effect was more prominent for Arg and Lys resulting in a direct confirmation of the positive inside rule. Potentials of mean force along the membrane normal were derived for each amino acid by fitting Gaussian functions to the residue distributions. The individual potentials agree well with experimental and theoretical considerations. The resulting implicit membrane potential was tested on various membrane proteins as well as single trans-membrane alpha-helices. All membrane proteins were found to be at an energy minimum when correctly inserted into the membrane. For alpha-helices both interfacial (i.e. surface bound) and inserted configurations were found to correspond to energy minima. The results demonstrate that the use of trans-membrane amino acid distributions to derive an implicit membrane representation yields meaningful residue potentials.

9.
    
We introduce a new hydrogen bonding potential of mean force generated from high‐quality crystal structures for use in Xplor‐NIH structure calculations. This term applies to hydrogen bonds involving both backbone and sidechain atoms. When used in structure refinement calculations of 10 example protein systems with experimental distance, dihedral and residual dipolar coupling restraints, we demonstrate that the new term has superior performance to the previously developed hydrogen bonding potential of mean force used in Xplor‐NIH.

10.
    
It is well established that protein structures are more conserved than protein sequences. One-third of all known protein structures can be classified into ten protein folds, which themselves are composed mainly of α-helical hairpin, β-hairpin, and βαβ supersecondary structural elements. In this study, we explore the ability of a recent Monte Carlo-based procedure to generate the 3D structures of eight polypeptides that correspond to units of supersecondary structure and three-stranded antiparallel β-sheet. Starting from extended or misfolded compact conformations, all Monte Carlo simulations show significant success in predicting the native topology using a simplified chain representation and an energy model optimized on other structures. Preliminary results on model peptides from nucleotide binding proteins suggest that this simple protein folding model can help clarify the relation between sequence and topology.

11.
The predictive limits of the amino acid composition for the secondary structural content (percentage of residues in the secondary structural states helix, sheet, and coil) in proteins are assessed quantitatively. For the first time, techniques for prediction of secondary structural content are presented which rely on the amino acid composition as the only information on the query protein. In our first method, the amino acid composition of an unknown protein is represented by the best (in a least square sense) linear combination of the characteristic amino acid compositions of the three secondary structural types computed from a learning set of tertiary structures. The second technique is a generalization of the first one and also takes into account possible compositional couplings between any two sorts of amino acids. Its mathematical formulation results in an eigenvalue/eigenvector problem of the second moment matrix describing the amino acid compositional fluctuations of secondary structural types in various proteins of a learning set. Possible correlations of the principal directions of the eigenspaces with physical properties of the amino acids were also checked. For example, the first two eigenvectors of the helical eigenspace correlate with the size and hydrophobicity of the residue types respectively. As learning and test sets of tertiary structures, we utilized representative, automatically generated subsets of the Protein Data Bank (PDB) consisting of non-homologous protein structures at the resolution thresholds ≤1.8 Å, ≤2.0 Å, ≤2.5 Å, and ≤3.0 Å. We show that the consideration of compositional couplings improves prediction accuracy, albeit not dramatically. Whereas in the self-consistency test (learning with the protein to be predicted), a clear decrease of prediction accuracy with worsening resolution is observed, the jackknife test (leave the predicted protein out) yielded best results for the largest dataset (≤3.0 Å, almost no difference to the self-consistency test!), i.e., only this set, with more than 400 proteins, is sufficient for stable computation of the parameters in the prediction function of the second method. The average absolute errors in predicting the fraction of helix, sheet, and coil from the amino acid composition of the query protein are 13.7, 12.6, and 11.4%, respectively, with r.m.s. deviations in the range of 8.6–11.8% for the 3.0 Å dataset in a jackknife test. The absolute precision of the average absolute errors is in the range of 1–3% as measured for other representative subsets of the PDB. Secondary structural content prediction methods found in the literature have been clustered in accordance with their prediction accuracies. To our surprise, much more complex secondary structure prediction methods utilized for the same purpose of secondary structural content prediction achieve prediction accuracies very similar to those of the present analytic techniques, implying that all the information beyond the amino acid composition is, in fact, mainly utilized for positioning the secondary structural state in the sequence but not for determination of the overall number of residues in a secondary structural type. This result implies that higher prediction accuracies cannot be achieved relying solely on the amino acid composition of an unknown query protein as prediction input. Our prediction program SSCP has been made available as a World Wide Web and E-mail service. © 1996 Wiley-Liss, Inc.
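The first method in this abstract, a least-squares linear combination of characteristic compositions, can be sketched in a few lines. Everything concrete here is an assumption made for illustration: the four-letter toy alphabet (real compositions have 20 entries), the made-up characteristic compositions, the function names, and the choice to solve the normal equations directly. It is not the SSCP program.

```python
def solve_linear(a, b):
    """Solve a small dense system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular system")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def content_fractions(query, basis):
    """Least-squares coefficients f minimising |query - sum_k f_k * basis_k|^2.

    Uses the normal equations (B^T B) f = B^T q, where each basis vector is the
    characteristic composition of one secondary structural type.
    """
    k = len(basis)
    gram = [[sum(u * v for u, v in zip(basis[i], basis[j])) for j in range(k)]
            for i in range(k)]
    rhs = [sum(u * q for u, q in zip(basis[i], query)) for i in range(k)]
    return solve_linear(gram, rhs)


# Hypothetical characteristic compositions over a toy 4-letter alphabet.
helix = [0.4, 0.3, 0.2, 0.1]
sheet = [0.1, 0.4, 0.3, 0.2]
coil = [0.2, 0.1, 0.3, 0.4]
query = [0.28, 0.26, 0.25, 0.21]  # composition of a made-up query protein
print(content_fractions(query, [helix, sheet, coil]))
```

The returned coefficients are read as estimated helix, sheet, and coil fractions. This sketch does not constrain them to be non-negative or to sum to one.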

12.
    
Lee HS  Zhang Y 《Proteins》2012,80(1):93-110
We developed BSP‐SLIM, a new method for ligand–protein blind docking using low‐resolution protein structures. For a given sequence, protein structures are first predicted by I‐TASSER; putative ligand binding sites are transferred from holo‐template structures which are analogous to the I‐TASSER models; ligand–protein docking conformations are then constructed by shape and chemical match of ligand with the negative image of binding pockets. BSP‐SLIM was tested on 71 ligand–protein complexes from the Astex diverse set where the protein structures were predicted by I‐TASSER with an average RMSD of 2.92 Å on the binding residues. Using I‐TASSER models, the median ligand RMSD of BSP‐SLIM docking is 3.99 Å which is 5.94 Å lower than that by AutoDock; the median binding‐site error by BSP‐SLIM is 1.77 Å which is 6.23 Å lower than that by AutoDock and 3.43 Å lower than that by LIGSITEcsc. Compared to the models using crystal protein structures, the median ligand RMSD by BSP‐SLIM using I‐TASSER models increases by 0.87 Å, while that by AutoDock increases by 8.41 Å; the median binding‐site error by BSP‐SLIM increases by 0.69 Å while that by AutoDock and LIGSITEcsc increases by 7.31 Å and 1.41 Å, respectively. As case studies, BSP‐SLIM was used in virtual screening for six target proteins, which prioritized actives of 25% and 50% in the top 9.2% and 17% of the library on average, respectively. These results demonstrate the usefulness of the template‐based coarse‐grained algorithms in the low‐resolution ligand–protein docking and drug‐screening. An on‐line BSP‐SLIM server is freely available at http://zhanglab.ccmb.med.umich.edu/BSP‐SLIM . Proteins 2012. © 2011 Wiley Periodicals, Inc.

13.
    
Rykunov D  Fiser A 《Proteins》2007,67(3):559-568
Statistical distance dependent pair potentials are frequently used in a variety of folding, threading, and modeling studies of proteins. The applicability of these types of potentials is tightly connected to the reliability of statistical observations. We explored the possible origin and extent of false positive signals in statistical potentials by analyzing their distance dependence in a variety of randomized protein-like models. While on average potentials derived from such models are expected to equal zero at any distance, we demonstrate that systematic and significant distortions exist. These distortions originate from the limited statistical counts in local environments of proteins and from the limited size of protein structures at large distances. We suggest that these systematic errors in statistical potentials are connected to the dependence of amino acid composition on protein size and to variation in protein sizes. Additionally, atom-based potentials are dominated by a false positive signal that is due to correlation among distances measured from atoms of one residue to atoms of another residue. The significance of residue-based pairwise potentials at various spatial pair separations was assessed in this study and it was found that as few as approximately 50% of potential values were statistically significant at distances below 4 Å, and only at most approximately 80% of them were significant at larger pair separations. A new definition for reference state, free of the observed systematic errors, is suggested. It has been demonstrated to generate statistical potentials that compare favorably to other publicly available ones.

14.
    
Karchin R  Cline M  Karplus K 《Proteins》2004,55(3):508-518
Residue burial, which describes a protein residue's exposure to solvent and neighboring atoms, is key to protein structure prediction, modeling, and analysis. We assessed 21 alphabets representing residue burial, according to their predictability from amino acid sequence, conservation in structural alignments, and utility in one fold-recognition scenario. This follows upon our previous work in assessing nine representations of backbone geometry.¹ The alphabet found to be most effective overall has seven states and is based on a count of Cβ atoms within a 14 Å radius sphere centered at the Cβ of a residue of interest. When incorporated into a hidden Markov model (HMM), this alphabet gave us a 38% performance boost in fold recognition and 23% in alignment quality.
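The burial measure behind the best-performing alphabet, a count of Cβ atoms inside a 14 Å sphere, is simple to sketch. The function names, the toy coordinates, and especially the state boundaries are hypothetical: the abstract says the alphabet has seven states but does not give the cut-offs.

```python
import math
from bisect import bisect_right

# Hypothetical cut-offs splitting neighbour counts into seven states (0 to 6).
HYPOTHETICAL_BOUNDARIES = [10, 20, 30, 40, 50, 60]


def burial_counts(cb_coords, radius=14.0):
    """For each residue, count the other C-beta atoms within `radius` angstroms."""
    counts = []
    for i, centre in enumerate(cb_coords):
        neighbours = sum(
            1
            for j, other in enumerate(cb_coords)
            if j != i and math.dist(centre, other) <= radius
        )
        counts.append(neighbours)
    return counts


def burial_state(count, boundaries=HYPOTHETICAL_BOUNDARIES):
    """Map a neighbour count to a discrete state index using the cut-offs."""
    return bisect_right(boundaries, count)


# Three made-up C-beta positions on a line, in angstroms.
coords = [(0.0, 0.0, 0.0), (10.0, 0.0, 0.0), (30.0, 0.0, 0.0)]
print([burial_state(c) for c in burial_counts(coords)])
```

The pairwise loop is quadratic in the number of residues, which is fine for single chains. A spatial grid would be the usual replacement for large complexes.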

15.
16.
Knowledge-based potentials are used widely in protein folding and inverse folding algorithms. Two kinds of derivation methods are used. (1) The interactions in a database of known protein structures are assumed to obey a Boltzmann distribution. (2) The stability of the native folds relative to a manifold of misfolded structures is optimized. Here, a set of previously derived contact and secondary structure propensity potentials, taken as the "true" potentials, are employed to construct an artificial protein structural database from protein fragments. Then, new sets of potentials are derived to see how they are related to the true potentials. Using the Boltzmann distribution method, when the stability of the structures in the database lies within a certain range, both contact potentials and secondary structure propensities can be derived separately with remarkable accuracy. In general, the optimization method was found to be less accurate due to errors in the "excess energy" contribution. When the excess energy terms are kept as a constraint, the true potentials are recovered exactly.
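Derivation method (1) is commonly written as an inverse Boltzmann relation, e = -kT ln(N_obs / N_exp). The sketch below illustrates that general relation only. The function name, the counts, the choice of reference state behind the expected counts, and the unit kT = 1 are assumptions, and the abstract does not specify the authors' exact procedure.

```python
import math


def contact_potentials(observed, expected, kT=1.0):
    """Inverse-Boltzmann contact energies from observed and expected contact counts.

    Pairs seen more often than expected get negative (favourable) energies.
    Pairs with a non-positive count are skipped because the logarithm is undefined.
    """
    energies = {}
    for pair, n_obs in observed.items():
        n_exp = expected[pair]
        if n_obs <= 0 or n_exp <= 0:
            continue
        energies[pair] = -kT * math.log(n_obs / n_exp)
    return energies


# Hypothetical counts: Leu-Leu contacts are enriched, Lys-Lys contacts depleted.
observed = {("L", "L"): 200, ("K", "K"): 50}
expected = {("L", "L"): 100, ("K", "K"): 100}
print(contact_potentials(observed, expected))
```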

17.
    
To examine the possible relationship of guanine-dependent GpA conformations with ribonucleotide cleavage, two potential of mean force (PMF) calculations were performed in aqueous solution. In the first calculation, the guanosine glycosidic (Gχ) angle was used as the reaction coordinate, and computations were performed on two GpA ionic species: protonated (neutral) or deprotonated (negatively charged) guanosine ribose O2′. Similar energetic profiles featuring two minima corresponding to the anti and syn Gχ regions were obtained for both ionic forms. For both forms the anti conformation was more stable than the syn, and barriers of approximately 4 kcal/mol were obtained for the anti → syn transition. Structural analysis showed a remarkable sensitivity of the phosphate moiety to the conformation of the Gχ angle, suggesting a possible connection between this conformation and the mechanism of ribonucleotide cleavage. This hypothesis was confirmed by the second PMF calculation, for which the O2′-P distance for the deprotonated GpA was used as the reaction coordinate. The computations were performed from two selected starting points: the anti and syn minima determined in the first PMF study of the deprotonated guanosine ribose O2′. The simulations revealed that the O2′ attack along the syn Gχ was more favorable than that along the anti Gχ: energetically, significantly lower barriers were obtained in the syn than in the anti conformation for the O-P bond formation; structurally, a shorter initial O2′-P distance and a better suited orientation for an in-line attack were observed in the syn relative to the anti conformation. These results are consistent with the catalytically competent conformation of the barnase-ribonucleotide complex, which requires a guanine syn conformation of the substrate to enable abstraction of the ribose H2′ proton by the general base Glu73, thereby suggesting a coupling between the reactive substrate conformation and enzyme structure and mechanism.

18.
Tim J. Hubbard  J. Park 《Proteins》1995,23(3):398-402
Protein structure predictions were submitted for 9 of the target sequences in the competition that ran during 1994. Target sequences were selected that had no known homology with any sequence of known structure and were members of a reasonably sized family of related but divergent sequences. The objective was either to recognize a compatible fold for the target sequence in the database of known structures or to predict ab initio its rough 3D topology. The main tools used were Hidden Markov models (HMM) for fold recognition, a β-strand pair potential to predict β-sheet topology, and the PHD server for secondary structure prediction. Compatible folds were correctly identified in a number of cases and the β-strand pair potential was shown to be a useful tool for ab initio topology prediction. © 1995 Wiley-Liss, Inc.

19.
    
Mehdi Mirzaie 《Proteins》2018,86(4):467-474
Evaluation of protein structures needs a trustworthy potential function. Although several knowledge‐based potential functions exist, the impact of different types of amino acids in the scoring functions has not been studied yet. Previously, we have reported the importance of nonlocal interactions in a scoring function (based on Delaunay tessellation) in discrimination of native structures. Then, we have questioned the structural impact of hydrophobic amino acids in protein fold recognition. Therefore, a Hydrophobic Reduced Model (HRM) was designed to reduce the protein structure from FS (Full Structure) to RS (Reduced Structure). RS is a reduced structure comprising only seven hydrophobic amino acids (L, V, F, I, A, W, Y) and all their interactions. The presented model was evaluated via four different performance metrics including the number of correctly identified natives, the Z‐score of the native energy, the RMSD of the minimum score, and the Pearson correlation coefficient between the energy and the model quality. Results indicated that only nonlocal interactions between hydrophobic amino acids could be sufficient and accurate enough for protein fold recognition. Interestingly, the results of HRM are significantly close to those of the model that considers all amino acids (20‐amino acid model) in discriminating the native structure of the proteins on eleven decoy sets. This indicates that the power of knowledge‐based potential functions in protein fold recognition is mostly due to hydrophobic interactions. Hence, we suggest combining a different well‐designed scoring function for non‐hydrophobic interactions with HRM to achieve better performance in fold recognition.

20.