Similar articles
 Found 20 similar articles (search time: 31 ms)
1.
Crippen GM. Biopolymers 2004, 75(3):278-289
This is our second type of model for protein folding, in which the configurational parameters and the effective potential energy function are chosen so that all conformations are described and the canonical partition function can be evaluated analytically. Structure is described in terms of distances between pairs of sequentially contiguous blocks of eight residues, and all possible conformations are grouped into 71 subsets in terms of bounds on these distances. The energy is taken to be a sum of pairwise interactions between such blocks. The 210 energy parameters were adjusted so that the native folds of 32 small proteins are favored in free energy over the denatured state. We then found 146 proteins having negligible sequence similarity to any of the training proteins, yet the free energies of their respective correct native states were favored over the denatured state.

2.
We present a method to derive contact energy parameters from large sets of proteins. The basic requirement on which our method is based is that, for each protein in the database, the native contact map has lower energy than all its decoy conformations that are obtained by threading. Only when this condition is satisfied can one use the proposed energy function for fold identification. Such a set of parameters can be found (by perceptron learning) if Mp, the number of proteins in the database, is not too large. Other aspects that influence the existence of such a solution are the exact definition of contact and the value of the critical distance Rc, below which two residues are considered to be in contact. Another important novel feature of our approach is its ability to determine whether an energy function of some suitable proposed form can or cannot be parameterized in a way that satisfies our basic requirement. As a demonstration of this, we determine the region in the (Rc, Mp) plane in which the problem is solvable, i.e., we can find a set of contact parameters that simultaneously stabilize all the native conformations. We show that for large enough databases the contact approximation to the energy cannot stabilize all the native folds even against the decoys obtained by gapless threading.
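The requirement in the abstract above (every native contact map must have lower energy than each of its decoys) can be written as a set of linear inequalities in the contact parameters, which is what makes perceptron learning applicable. The following is a minimal sketch under stated assumptions, not the authors' implementation: the energy is assumed to be a dot product between a parameter vector `w` and a vector of contact counts, and the function name `learn_contact_energies`, the learning rate, and the epoch limit are all hypothetical.

```python
def learn_contact_energies(diffs, max_epochs=1000, rate=1.0):
    """Perceptron search for w such that w . d > 0 for every d in diffs.

    Each d is (decoy contact counts) - (native contact counts), so
    w . d > 0 means the native map has lower energy than that decoy.
    Returns (w, solved). solved is False if max_epochs ran out, which
    suggests, but does not prove, that no solution exists.
    """
    w = [0.0] * len(diffs[0])
    for _ in range(max_epochs):
        updated = False
        for d in diffs:
            # A violated inequality: the decoy is not above the native.
            if sum(wi * di for wi, di in zip(w, d)) <= 0.0:
                w = [wi + rate * di for wi, di in zip(w, d)]
                updated = True
        if not updated:
            return w, True  # every inequality holds
    return w, False


# Toy data (assumed): three contact types, three decoys.
toy_diffs = [(-1, 1, 0), (-2, 0, 2), (0, -1, 2)]
weights, solved = learn_contact_energies(toy_diffs)
```

For linearly separable data the perceptron is guaranteed to converge; when the inequalities are contradictory it cycles, which this sketch only detects through the epoch limit.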

3.
Tobi D, Shafran G, Linial N, Elber R. Proteins 2000, 40(1):71-85
Pairwise interaction models to recognize native folds are designed and analyzed. Different sets of parameters are considered, but the focus is on 20 × 20 contact matrices. Simultaneous solution of inequalities and minimization of the variance of the energy find matrices that recognize exactly the native folds of 572 sequences and structures from the Protein Data Bank (PDB). The set includes many homologous pairs, which present a difficult recognition problem. Significant recognition ability is recovered with a small number of parameters (e.g., the H/P model). However, full recognition requires a complete set of amino acids. In addition to structures from the PDB, a folding program (MONSSTER) was used to generate decoy structures for 75 proteins. It is impossible to recognize all the native structures of the extended set by contact potentials. We therefore searched for a new functional form. An energy function U, which is based on a sum of general pairwise interactions limited to a resolution of 1 angstrom, is considered. This set was infeasible too. We therefore conjecture that it is not possible to find a folding potential, resolved to 1 angstrom, which is a sum of pair interactions.

4.
Native states of proteins are flexible, populating more than just the unique native conformation. The energetics and dynamics resulting from this conformational ensemble are inherently linked to protein function and regulation. Proteolytic susceptibility is one feature determined by this conformational energy landscape. As an attempt to investigate energetics of proteins on a proteomic scale, we challenged the Escherichia coli proteome with extensive proteolysis and determined which proteins, if any, have optimized their energy landscape for resistance to proteolysis. To our surprise, multiple soluble proteins survived the challenge. Maltose binding protein, a survivor from thermolysin digestion, was characterized by in vitro biophysical studies to identify the physical origin of proteolytic resistance. This experimental characterization shows that kinetic stability is responsible for the unusual resistance in maltose binding protein. The biochemical functions of the identified survivors suggest that many of these proteins may have evolved extreme proteolytic resistance because of their critical roles under stressed conditions. Our results suggest that under functional selection proteins can evolve extreme proteolysis resistance by modulating their conformational energy landscapes without the need to invent new folds, and that proteins can be profiled on a proteomic scale according to their energetic properties by using proteolysis as a structural probe.

5.
Vincent J. Hilser. Proteins 2016, 84(4):435-447
Knowing the determinants of conformational specificity is essential for understanding protein structure, stability, and fold evolution. To address this issue, a novel statistical measure of energetic compatibility between sequence and structure was developed using an experimentally validated model of the energetics of the native state ensemble. This approach successfully matched sequences from a diverse subset of the human proteome to their respective folds. Unexpectedly, significant energetic compatibility between ostensibly unrelated sequences and structures was also observed. Interrogation of these matches revealed a general framework for understanding the origins of conformational specificity within a proteome: specificity is a complex function of both the ability of a sequence to adopt folds other than the native and the ability of a fold to accommodate sequences other than the native. The regional variation in energetic compatibility indicates that the compatibility is dominated by incompatibility of sequence for alternative fold segments, suggesting that evolution of protein sequences has involved substantial negative selection, with certain segments serving as “gatekeepers” that presumably prevent alternative structures. Beyond these global trends, a size dependence exists in the degree to which the energetic compatibility is determined by negative selection, with smaller proteins displaying more negative selection. This partially explains how short sequences can adopt unique folds, despite the higher probability in shorter proteins for small numbers of mutations to increase compatibility with other folds. In providing evolutionary ground rules for the thermodynamic relationship between sequence and fold, this framework imparts valuable insight for rational design of unique folds or fold switches.

6.
MOTIVATION: Protein design aims to identify sequences compatible with a given protein fold but incompatible with any alternative folds. To select the correct sequences and to guide the search process, a design scoring function is critically important. Such a scoring function should be able to characterize the global fitness landscape of many proteins simultaneously. RESULTS: To find optimal design scoring functions, we introduce two geometric views and propose a formulation using a mixture of non-linear Gaussian kernel functions. We aim to solve a simplified protein sequence design problem. Our goal is to distinguish each native sequence for a major portion of representative protein structures from a large number of alternative decoy sequences, each a fragment from proteins of different folds. Our scoring function discriminates perfectly a set of 440 native proteins from 14 million sequence decoys. We show that no linear scoring function can succeed in this task. In a blind test of unrelated proteins, our scoring function misclassifies only 13 native proteins out of 194. This compares favorably with about three to four times more misclassifications when optimal linear functions reported in the literature are used. We also discuss how to develop protein folding scoring functions.

7.
We studied whether a Lennard-Jones interaction can be approximated by a pairwise contact potential. First we used a Lennard-Jones potential to design off-lattice, protein-like heteropolymer sequences, whose lowest energy (native) conformations were then identified by molecular dynamics. We then investigated whether one can find a pairwise contact potential whose ground states are the contact maps associated with these native conformations. We show that such a requirement cannot be satisfied exactly, i.e., no such contact parameters exist. Nevertheless, we found that one can find contact energy parameters for which an energy minimization procedure, acting in the space of contact maps, yields maps whose corresponding structures are close to the native ones. Finally, we show that when these structures are used as the initial point of a molecular dynamics energy minimization process, the correct native folds are recovered with high probability.

8.
MOTIVATION: The discovery of new protein folds is a relatively rare occurrence even as the rate of protein structure determination increases. This rarity reinforces the concept of folds as reusable units of structure and function shared by diverse proteins. If the folding mechanism of proteins is largely determined by their topology, then the folding pathways of members of existing folds could encompass the full set used by globular protein domains. RESULTS: We have used recent versions of three common protein domain dictionaries (SCOP, CATH and Dali) to generate a consensus domain dictionary (CDD). Surprisingly, 40% of the metafolds in the CDD are not composed of autonomous structural domains, i.e. they are not plausible independent folding units. This finding has serious ramifications for bioinformatics studies mining these domain dictionaries for globular protein properties. However, our main purpose in deriving this CDD was to generate an updated CDD to choose targets for MD simulation as part of our dynameomics effort, which aims to simulate the native and unfolding pathways of representatives of all globular protein consensus folds (metafolds). Consequently, we also compiled a list of representative protein targets of each metafold in the CDD. AVAILABILITY AND IMPLEMENTATION: This domain dictionary is available at www.dynameomics.org.

9.
Using parallel tempering simulations with high statistics, we investigate the folding and thermodynamic properties of three small proteins with distinct native folds: the all-helical 1RIJ, the all-sheet beta3s, and BBA5, which has a mixed helix-sheet fold. In all three cases, simulations with our energy function find the native structures as global minima in free energy at experimentally relevant temperatures. However, the folding process strongly differs for the three molecules, indicating that the folding mechanism is correlated with the form of the native structure.
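The abstract above does not describe its simulation protocol, so the following is only a sketch of the standard replica-exchange step used in parallel tempering in general: two replicas at inverse temperatures `beta_i` and `beta_j` swap configurations with the Metropolis probability min(1, exp[(beta_i - beta_j)(E_i - E_j)]). The function names are hypothetical.

```python
import math
import random


def swap_probability(beta_i, beta_j, energy_i, energy_j):
    """Metropolis acceptance probability for exchanging two replicas."""
    delta = (beta_i - beta_j) * (energy_i - energy_j)
    return 1.0 if delta >= 0.0 else math.exp(delta)


def attempt_swap(beta_i, beta_j, energy_i, energy_j, rng=random):
    """Return True if the exchange is accepted on this attempt."""
    p = swap_probability(beta_i, beta_j, energy_i, energy_j)
    return rng.random() < p
```

A swap is always accepted when the colder replica (larger beta) currently holds the higher-energy configuration; otherwise it is accepted with exponentially decreasing probability.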

10.
Fujitsuka Y, Chikenji G, Takada S. Proteins 2006, 62(2):381-398
Predicting protein tertiary structures by in silico folding is still very difficult for proteins that have new folds. Here, we developed a coarse-grained energy function, SimFold, for de novo structure prediction, performed a benchmark test of prediction with fragment assembly simulations for 38 test proteins, and proposed consensus prediction with Rosetta. The SimFold energy consists of many terms that take into account solvent-induced effects on the basis of physicochemical consideration. In the benchmark test, SimFold succeeded in predicting native structures within 6.5 Å for 12 of 38 proteins; this success rate was the same as that of the publicly available version of Rosetta (ab initio version 1.2) run with default parameters. We investigated which energy terms in SimFold contribute to structure prediction performance, finding that the hydrophobic interaction is the most crucial for the prediction, whereas other sequence-specific terms have weak but positive roles. In the benchmark, well-predicted proteins by SimFold and by Rosetta were not the same for 5 of 12 proteins, which led us to introduce consensus prediction. With combined decoys, we succeeded in prediction for 16 proteins, four more than SimFold or Rosetta separately. For each of the 38 proteins, structural ensembles generated by SimFold and by Rosetta were qualitatively compared by mapping sampled structural space onto two dimensions. For proteins for which one of the two methods succeeded and the other failed in prediction, the former had a less scattered ensemble located around the native. For proteins for which both methods succeeded in prediction, the two ensembles often overlapped.

11.
Protein structure prediction remains an unsolved problem. Since prediction of the native structure seems very difficult, one usually tries to predict the correct fold of a protein. Here the "fold" is defined by the approximate backbone structure of the protein. However, physicochemical factors that determine the correct fold are not well understood. It has recently been reported that molecular mechanics energy functions combined with effective solvent terms can discriminate the native structures from misfolded ones. Using such a physicochemical energy function, we studied the factors necessary for discrimination of correct and incorrect folds. We first selected correct and incorrect folds by a conventional threading method. Then, all-atom models of those folds were constructed by simply minimizing the atomic overlaps. The constructed correct model representing the native fold has almost the same backbone structure as the native structure but differs in side-chain packing. Finally, the energy values of the constructed models were compared with that of the experimentally determined native structure. The correct model as well as the native structure showed lower energy than misfolded models. However, a large energy gap was found between the native structure and the correct model. By decomposing the energy values into their components, it was found that solvent effects such as the hydrophobic interaction or solvent shielding and the Born energy stabilized the correct model rather than the native structure. The large energetic stabilization of the native structure was attained by specific side-chain packing. The stabilization by solvent effects is small compared to that by side-chain packing. Therefore, it is suggested that in order to confidently predict the correct fold of a protein, it is also necessary to predict correct side-chain packing.

12.
13.
We present an approach that is able to detect native folds amongst a large number of non-native conformations. The method is based on the compilation of potentials of mean force of the interactions of the Cβ atoms of all amino acid pairs from a database of known three-dimensional protein structures. These potentials are used to calculate the conformational energy of amino acid sequences in a number of different folds. For a substantial number of proteins we find that the conformational energy of the native state is lowest amongst the alternatives. Exceptions are proteins containing large prosthetic groups, Fe-S clusters or polypeptide chains that do not adopt globular folds. We discuss briefly potential applications in various fields of protein structural research.
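Potentials of mean force of this kind are commonly compiled by the inverse Boltzmann relation, E(r) = -kT ln[f_pair(r) / f_ref(r)], where f_pair is the observed distance distribution for one amino acid pair and f_ref is a reference distribution. The abstract does not give the authors' exact recipe, so the sketch below rests on assumptions: the reference counts, the pseudocount used to avoid taking the logarithm of zero, the toy numbers, and the function name are all illustrative.

```python
import math


def mean_force_potential(pair_counts, reference_counts, kT=1.0, pseudo=1.0):
    """Return -kT * ln(f_pair / f_ref) for each distance bin.

    pair_counts: histogram of distances for one residue pair type.
    reference_counts: histogram for the reference state (assumed here
    to be the distribution pooled over all pair types).
    pseudo: pseudocount added to every bin (an assumption of this sketch).
    """
    n_bins = len(pair_counts)
    pair_total = sum(pair_counts) + pseudo * n_bins
    ref_total = sum(reference_counts) + pseudo * n_bins
    potential = []
    for p, r in zip(pair_counts, reference_counts):
        f_pair = (p + pseudo) / pair_total
        f_ref = (r + pseudo) / ref_total
        potential.append(-kT * math.log(f_pair / f_ref))
    return potential


# Toy histograms (assumed): the pair is enriched at short distance.
toy_potential = mean_force_potential([30, 10, 10], [10, 10, 30])
```

Bins where the pair occurs more often than the reference get a negative (favorable) energy, and depleted bins get a positive one.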

14.
The prediction of the three-dimensional structures of the native states of proteins from the sequences of their amino acids is one of the most important challenges in molecular biology. An essential task for solving this problem within coarse-grained models is the deduction of effective interaction potentials between the amino acids. Over the years, several techniques have been developed to extract potentials that are able to discriminate satisfactorily between the native and nonnative folds of a preassigned protein sequence. In general, when these potentials are used in actual dynamical folding simulations, they lead to a drift of the native structure outside the quasinative basin. In this article, we present and validate an approach to overcome this difficulty. By exploiting several numerical and analytical tools, we set up a rigorous iterative scheme to extract potentials satisfying a prerequisite of any viable potential: the stabilization of proteins within their native basin (less than 3-4 Å RMSD). The scheme is flexible and is demonstrated to be applicable to a variety of parameterizations of the energy function, and it provides in each case the optimal potentials.

15.
We describe the construction of a scoring function designed to model the free energy of protein folding. An optimization technique is used to determine the best functional forms of the hydrophobic, residue-residue and hydrogen-bonding components of the potential. The scoring function is expanded by use of Chebyshev polynomials, the coefficients of which are determined by minimizing the score, in units of standard deviation, of native structures in the ensembles of alternate decoy conformations. The derived effective potential is then tested on decoy sets used conventionally in such studies. Using our scoring function, we achieve a high level of discrimination between correct and incorrect folds. In addition, our method is able to represent functions of arbitrary shape with fewer parameters than the usual histogram potentials of similar resolution. Finally, our representation can be combined easily with many optimization methods, because the total energy is a linear function of the parameters. Our results show that the techniques of Z-score optimization and Chebyshev expansion work well.
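The two ingredients named in this abstract can be illustrated together: a potential expanded in Chebyshev polynomials is linear in its coefficients, and the quantity being minimized is the native structure's score measured in standard deviations of the decoy ensemble (its Z-score). The sketch below assumes that each structure has already been reduced to a feature vector (for example, summed Chebyshev terms); the function names and toy numbers are hypothetical, and no optimizer is included.

```python
import statistics


def chebyshev_features(x, order):
    """Values T_0(x)..T_order(x) for x in [-1, 1], via the recurrence
    T_{k+1}(x) = 2 x T_k(x) - T_{k-1}(x)."""
    values = [1.0, x]
    for _ in range(order - 1):
        values.append(2.0 * x * values[-1] - values[-2])
    return values[: order + 1]


def energy(params, features):
    """Total energy as a linear function of the parameters."""
    return sum(p * f for p, f in zip(params, features))


def native_z_score(params, native_features, decoy_features):
    """(E_native - mean decoy energy) / decoy standard deviation.

    More negative values mean better discrimination. The decoy energies
    must not all be identical, or the spread is zero.
    """
    decoy_energies = [energy(params, f) for f in decoy_features]
    mean = statistics.fmean(decoy_energies)
    spread = statistics.pstdev(decoy_energies)
    return (energy(params, native_features) - mean) / spread


# Toy example (assumed): two parameters, four decoys.
z = native_z_score([1.0, 2.0], [-1, -1], [[0, 0], [1, 0], [0, 1], [1, 1]])
```

Because the energy is linear in `params`, the mean and variance of the decoy energies are simple functions of the parameters, which is one reason this representation combines easily with standard optimizers.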

16.
Interest centers here on whether the use of a fixed charge distribution of a protein solute, or a treatment that considers proton-binding equilibria by solving the Poisson equation, is a better approach to discriminate native from non-native conformations of proteins. In this analysis of the charge distribution of 7 proteins, we estimate the solvation free energy contribution to the total free energy by exploring the 2^ζ possible ionization states of the whole molecule, with ζ being the number of ionizable groups in the amino acid sequence, for every conformation in the ensembles of 7 proteins. As an additional consideration of the role of electrostatic interactions in determining the charge distribution of native folds, we carried out a comparison of alternative charge assignment models for the ionizable residues in a set of 21 native-like proteins. The results of this work indicate that (1) for 6 out of 7 proteins, estimation of solvent polarization based on the Generalized Born model with a fixed charge distribution provides the optimal trade-off between accuracy, with respect to the Poisson equation, and speed when compared to the accessible surface area model; for the seventh protein, consideration of all possible ionization states of the whole molecule appears to be crucial to discriminate the native from non-native conformations; (2) significant differences in the degree of ionization and hence the charge distribution for native folds are found between the different charge models examined; (3) the stability of the native state is determined by a delicate balance of all the energy components; and (4) conformational entropy, and hence the dynamics of folding, may play a crucial role for a successful ab initio protein folding prediction.
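The number of ionization states quoted in this abstract (2 to the power ζ, for ζ ionizable groups) follows from each group being either protonated or deprotonated. A small sketch of the enumeration, with a hypothetical function name, shows why exhaustive exploration is only practical for modest ζ:

```python
from itertools import product


def ionization_states(n_groups):
    """Lazily yield every assignment of 0 (neutral) or 1 (charged)
    to n_groups ionizable groups; there are 2 ** n_groups of them."""
    return product((0, 1), repeat=n_groups)


# With 3 groups there are 8 states; with 30 groups, about 1e9.
states = list(ionization_states(3))
```

The generator is lazy, so states can be scored one at a time without holding all of them in memory, but the running time still doubles with every added group.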

17.
Understanding the relationship between the amino-acid sequence of a protein and its ability to fold and to function is one of the major challenges of protein science. Here, cases are reviewed in which mutagenesis, biochemistry, structure determination, protein engineering, and single-molecule biophysics have illuminated the sequence determinants of folding, binding specificity, and biological function for DNA-binding proteins and ATP-fueled machines that forcibly unfold native proteins as a prelude to degradation. In addition to structure-function relationships, these studies provide information about folding intermediates, mutations that accelerate folding, slow unfolding, and stabilize proteins against denaturation, show how new binding specificities and folds can evolve, and reveal strategies that proteolytic machines use to recognize, unfold, and degrade thousands of distinct substrates.

18.
We developed a high-throughput methodology, termed fluorescent tagging of full-length proteins (FTFLP), to analyze expression patterns and subcellular localization of Arabidopsis gene products in planta. Determination of these parameters is a logical first step in functional characterization of the approximately one-third of all known Arabidopsis genes that encode novel proteins of unknown function. Our FTFLP-based approach offers two significant advantages: first, it produces internally-tagged full-length proteins that are likely to exhibit native intracellular localization, and second, it yields information about the tissue specificity of gene expression by the use of native promoters. To demonstrate how FTFLP may be used for characterization of the Arabidopsis proteome, we tagged a series of known proteins with diverse subcellular targeting patterns as well as several proteins with unknown function and unassigned subcellular localization.

19.
The routine prediction of three-dimensional protein structure from sequence remains a challenge in computational biochemistry. It has been intuited that calculated energies from physics-based scoring functions are able to distinguish native from nonnative folds based on previous performance with small proteins and that conformational sampling is the fundamental bottleneck to successful folding. We demonstrate that as protein size increases, errors in the computed energies become a significant problem. We show, by using error probability density functions, that physics-based scores contain significant systematic and random errors relative to accurate reference energies. These errors propagate throughout an entire protein and distort its energy landscape to such an extent that modern scoring functions should have little chance of success in finding the free energy minima of large proteins. Nonetheless, by understanding errors in physics-based score functions, they can be reduced in a post-hoc manner, improving accuracy in energy computation and fold discrimination.

20.
We suggest a new approach to the generation of candidate structures (decoys) for ab initio prediction of protein structures. Our method is based on random sampling of conformation space and subsequent local energy minimization. At the core of this approach lies the design of a novel type of energy function. This energy function has local minima with native structure characteristics and wide basins of attraction. The current work presents our motivation for deriving such an energy function and also tests the derived energy function. Our approach is novel in that it takes advantage of the inherently rough energy landscape of proteins, which is generally considered a major obstacle for protein structure prediction. When local minima have wide basins of attraction, the protein's conformation space can be greatly reduced by the convergence of large regions of the space into single points, namely the local minima corresponding to these funnels. We have implemented this concept by an iterative process. The potential is first used to generate decoy sets, and then we study these sets of decoys to guide further development of the potential. A key feature of our potential is the use of cooperative multi-body interactions that mimic the role of the entropic and solvent contributions to the free energy. The validity and value of our approach are demonstrated by applying it to 14 diverse, small proteins. We show that, for these proteins, the size of conformation space is considerably reduced by the new energy function. In fact, the reduction is so substantial as to allow efficient conformational sampling. As a result we are able to find a significant number of near-native conformations in random searches performed with limited computational resources.
