Similar literature
Found 20 similar documents; search took 31 ms
1.
Protein folding speeds are known to vary over more than eight orders of magnitude. Plaxco, Simons, and Baker (see References) first showed a correlation of folding speed with the topology of the native protein. That and subsequent studies showed that, if the native structure of a protein is known, its folding speed can be predicted reasonably well through a correlation with the "localness" of the contacts in the protein. In the present work, we develop a related measure, the geometric contact number, N(alpha), which is the number of nonlocal contacts that are well packed by a Voronoi criterion. We find, first, that in 80 proteins, the largest such database of proteins yet studied, N(alpha) is a consistently excellent predictor of folding speeds of both two-state fast folders and more complex multistate folders. Second, we show that folding rates can also be predicted from amino acid sequences directly, without the need to know the native topology or other structural properties.

2.
Huang JT, Tian J. Proteins 2006, 63(3): 551-554
The significant correlation between protein folding rates and the sequence-predicted secondary structure suggests that folding rates are largely determined by the amino acid sequence. Here, we present a method for predicting the folding rates of proteins from sequences using the intrinsic properties of amino acids, which does not require any information on secondary structure prediction or structural topology. The contribution of a residue to the folding rate is expressed by the residue's Omega value. For a given residue, its Omega depends on the amino acid properties (amino acid rigidity and the amino acid's dislike of secondary structures). Our investigation achieves 82% correlation with folding rates determined experimentally for the simple, two-state proteins studied to date, suggesting that the amino acid sequence of a protein is an important determinant of the protein-folding rate and mechanism.

3.
Folding rates of small single-domain proteins that fold through simple two-state kinetics can be estimated from details of the three-dimensional protein structure. Previously, predictions of secondary structure had been exploited to predict folding rates from sequence. Here, we estimate two-state folding rates from predictions of internal residue-residue contacts in proteins of unknown structure. Our estimate is based on the correlation between the folding rate and the number of predicted long-range contacts normalized by the square of the protein length. It is well known that long-range order derived from known structures correlates with folding rates. The surprise was that estimates based on very noisy contact predictions were almost as accurate as the estimates based on known contacts. On average, our estimates were similar to those previously published from secondary structure predictions. The combination of these methods that exploit different sources of information improved performance. It appeared that the combined method reliably distinguished fast from slow two-state folders.
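
The quantity this abstract correlates with folding rate (the number of long-range contacts divided by the square of the protein length) can be illustrated with a minimal sketch. The abstract does not say how "long-range" is defined, so the sequence-separation cutoff `min_separation=12` below is an assumption, as are the function name and the toy contact list; this is not the authors' implementation.

```python
def long_range_contact_density(contacts, length, min_separation=12):
    """Number of long-range contacts divided by length squared.

    contacts: iterable of (i, j) residue-index pairs that are in contact.
    length: number of residues in the protein.
    min_separation: hypothetical cutoff; a contact counts as long-range
        when |i - j| >= min_separation (not specified in the abstract).
    """
    if length <= 0:
        raise ValueError("length must be positive")
    n_long = sum(1 for i, j in contacts if abs(i - j) >= min_separation)
    return n_long / length ** 2


# Toy contact list, invented purely for illustration.
toy_contacts = [(1, 5), (2, 20), (3, 40), (10, 30)]
density = long_range_contact_density(toy_contacts, length=50)
print(density)
```

In practice the contact list would come from a contact predictor or from a known structure, and the resulting value would be regressed against the logarithm of measured folding rates.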

4.
Protein folding experiments demonstrate that the folding behaviors of many proteins can be roughly classified into two types: two-state kinetics and multi-state kinetics. Although the two types of protein folding kinetics have been observed for a long time, what determines the folding type of a protein is still largely unclear. The present work performed a comparative study based on a dataset of 43 two-state and 42 multi-state folders at different levels of the proteins' intrinsic properties, from the simplest (sequence length) to native structure topology. The results show that a protein's amino acid composition and its long-range interaction-based topological complexity, rather than its secondary structure content, are the major determinants of protein folding type. Furthermore, a sequence-based folding type prediction achieved an accuracy of more than 80%. These findings imply that there is no clear boundary between secondary and tertiary structure formation during the protein folding process and support the existence of a continuum of folding mechanisms between the two ends of the hierarchic and nucleation folding scenarios.

5.
Huang JT, Xing DJ, Huang W. Amino Acids 2012, 43(2): 567-572
The successful prediction of protein-folding rates based on the sequence-predicted secondary structure suggests that the folding rates might be predicted from sequence alone. To pursue this question, we directly predict the folding rates from amino acid sequences, which does not require any information on secondary or tertiary structure. Our work achieves 88% correlation with folding rates determined experimentally for proteins of all folding types and peptides, suggesting that almost all of the information needed to specify a protein's folding kinetics and mechanism is contained within its amino acid sequence. The influence of a residue on the folding rate is related to amino acid properties. The hydrophobic character of amino acids may be an important determinant of folding kinetics, whereas other amino acid properties (size, flexibility, polarity, and isoelectric point) contribute little to the folding rate constant.

6.
Proteins fold by either a two-state or a multistate kinetic mechanism. We observe that amino acids play different roles in the different mechanisms. Many residues that readily form regular secondary structures (α helices, β sheets, and turns) can promote the two-state folding reactions of small proteins. Most hydrophilic residues can speed up the multistate folding reactions of large proteins. Folding rates of large proteins are equally responsive to the flexibility of some amino acids. Other properties of amino acids (including volume, polarity, accessible surface, exposure degree, isoelectric point, and phase transfer energy) contribute little to the folding kinetics of the proteins. Cysteine is a special residue: it triggers the two-state folding reaction but inhibits the multistate folding reaction. These findings not only provide new insight into protein structure prediction, but could also be used to direct point mutations that change the folding rate. Proteins 2014; 82:2375–2382. © 2014 Wiley Periodicals, Inc.

7.
We demonstrate that chain length is the main determinant of the folding rate for proteins with three-state folding kinetics. The logarithm of their folding rate in water (k(f)) strongly anticorrelates with their chain length L (the correlation coefficient being -0.80). At the same time, the chain length has no correlation with the folding rate for two-state folding proteins (the correlation coefficient is -0.07). Another significant difference between these two groups of proteins is a strong anticorrelation between the folding rate and Baker's "relative contact order" for the two-state folders and the complete absence of such a correlation for the three-state folders.

8.
What are the key building blocks that would have been needed to construct complex protein folds? This is an important issue for understanding the protein folding mechanism and guiding de novo protein design. The twenty naturally occurring amino acids and eight secondary structures constitute a 28-letter alphabet that determines folding kinetics and mechanism. Here we predict the folding kinetic rates of proteins from many reduced alphabets. We find that a reduced alphabet of 10 letters achieves good correlation with folding rates, close to the one achieved by the full 28-letter alphabet. Many other reduced alphabets are not significantly correlated with folding rates. The finding suggests that not all amino acids and secondary structures are equally important for protein folding. The foldable sequence of a protein could be designed using at least 10 folding units, which can either promote or inhibit protein folding. Reducing alphabet cardinality without losing key folding kinetic information opens the door to potentially faster machine learning and data mining applications in protein structure prediction, sequence alignment, and protein design. Proteins 2015; 83:631–639. © 2015 Wiley Periodicals, Inc.

9.
Contact order revisited: influence of protein size on the folding rate   (Cited by: 13 total; 0 self-citations; 13 by others)
Guided by the recent success of the empirical model predicting the folding rates of small two-state folding proteins from the relative contact order (CO) of their native structures, by a theoretical model of protein folding that predicts that the logarithm of the folding rate decreases with the protein chain length L as L^(2/3), and by the finding that the folding rates of multistate folding proteins strongly correlate with their sizes and correlate very poorly with CO, we reexamined the dependence of folding rate on CO and L in an attempt to find a structural parameter that determines folding rates for the totality of proteins. We show that Abs_CO = CO × L is able to predict folding rates rather accurately for both two-state and multistate folding proteins, as well as short peptides, and that this Abs_CO scales with the protein chain length as L^(0.70 ± 0.07) for the totality of studied single-domain proteins and peptides.
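
As a rough illustration of the two quantities named in this abstract, the sketch below computes the relative contact order using the standard definition from Plaxco, Simons, and Baker (the sum of sequence separations over all contacts, divided by the number of contacts times the chain length) and then Abs_CO = CO × L as stated above. How contacts are derived from a structure is not described in the abstract, so the contact list is simply taken as input; the function names and toy data are assumptions.

```python
def relative_contact_order(contacts, length):
    """Relative contact order: mean sequence separation of contacts / length.

    contacts: list of (i, j) residue-index pairs that are in contact.
    length: chain length L in residues.
    """
    if not contacts:
        raise ValueError("at least one contact is required")
    if length <= 0:
        raise ValueError("length must be positive")
    total_separation = sum(abs(i - j) for i, j in contacts)
    return total_separation / (length * len(contacts))


def absolute_contact_order(contacts, length):
    """Abs_CO = CO x L, i.e. the mean sequence separation of contacts."""
    return relative_contact_order(contacts, length) * length


# Toy contact list, invented purely for illustration.
toy_contacts = [(1, 5), (2, 20), (3, 40), (10, 30)]
co = relative_contact_order(toy_contacts, length=50)
abs_co = absolute_contact_order(toy_contacts, length=50)
print(co, abs_co)
```

Because Abs_CO is just the mean sequence separation between contacting residues, it grows with chain length, whereas the relative CO is normalized by L.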

10.
Many single-domain proteins exhibit two-state folding kinetics, with folding rates that span more than six orders of magnitude. A quantity of much recent interest for such proteins is their contact order, the average separation in sequence between contacting residue pairs. Numerous studies have reached the surprising conclusion that contact order is well-correlated with the logarithm of the folding rate for these small, well-characterized molecules. Here, we investigate the physico-chemical basis for this finding by asking whether contact order is actually a composite number that measures the fraction of local secondary structure in the protein; viz. turns, helices, and hairpins. To pursue this question, we calculated the secondary structure content for 24 two-state proteins and obtained coefficients that predict their folding rates. The predicted rates correlate strongly with experimentally determined rates, comparable to the correlation with contact order. Further, these predicted folding rates are correlated strongly with contact order. Our results suggest that the folding rate of two-state proteins is a function of their local secondary structure content, consistent with the hierarchic model of protein folding. Accordingly, it should be possible to utilize secondary structure prediction methods to predict folding rates from sequence alone.

11.
12.
We have collected the kinetic folding data for non-two-state and two-state globular proteins reported in the literature, and investigated the relationships between the folding kinetics and the native three-dimensional structure of these proteins. The rate constants of formation of both the intermediate and the native state of non-two-state folders were found to be significantly correlated with protein chain length and native backbone topology, which is represented by the absolute contact order and sequence-distant native pairs. The folding rate of two-state folders, which is known to be correlated with the native backbone topology, apparently does not correlate significantly with protein chain length. On the basis of a comparison of the folding rates of the non-two-state and two-state folders, it was found that they are similarly dependent on the parameters that reflect the native backbone topology. This suggests that the mechanisms behind non-two-state and two-state folding are essentially identical. The present results lead us to propose a unified mechanism of protein folding, in which folding occurs in a hierarchical manner, reflecting the hierarchy of the native three-dimensional structure, as embodied in the case of non-two-state folding with an accumulation of the intermediate. Apparently, two-state folding is merely a simplified version of hierarchical folding caused either by an alteration in the rate-limiting step of folding or by destabilization of the intermediate.

13.
We perform a statistical analysis of amino-acid contacts to investigate possible preferences of amino-acid interactions. We include in the analysis only tertiary contacts, because they are less constrained--compared to secondary contacts--by proteins' backbone rigidity. Using proteins from the Protein Data Bank, our analysis reveals an unusually high frequency of cysteine pairings relative to that expected at random. To elucidate the possible effects of cysteine interactions in folding, we perform molecular simulations on three cysteine-rich proteins. In particular, we investigate the difference in folding dynamics between a Gō-like model (where attraction only occurs between amino acids forming a native contact) and a variant model (where attraction between any two cysteines is introduced to mimic the formation/dissociation of native/nonnative disulfide bonds). We find that when attraction among cysteines is nonspecific and comparable to a solvent-averaged interaction, it produces a target-focusing effect that expedites folding of cysteine-rich proteins as a result of a reduction of conformational search space. In addition, the target-focusing effect also helps reduce glassiness by lowering activation energy barriers and kinetic frustration in the system. The concept of target-focusing also provides a qualitative understanding of a correlation between the rates of protein folding and parameters such as contact order and total contact distance.

14.
Protein folding is the process by which a protein proceeds from its denatured state to its specific biologically active conformation. Understanding the relationship between sequences and the folding rates of proteins remains an important challenge. Most previous methods of predicting protein folding rate require the tertiary structure of a protein as an input. In this study, the long-range and short-range contacts in proteins were used to derive an extended version of the pseudo amino acid composition based on a sliding-window method. This method is capable of predicting protein folding rates from the amino acid sequence alone, without the aid of any structural class information. We systematically studied the contributions of individual features to folding rate prediction. The optimal feature selection procedure combines forward feature selection with sequential backward selection. The method was demonstrated on a large dataset using the jackknife cross-validation test. The predictor was built on numerous physicochemical and statistical features of proteins using a nonlinear support vector machine (SVM) regression model, and it obtained excellent agreement between predicted and experimentally observed folding rates of proteins. The correlation coefficient is 0.9313 and the standard error is 2.2692. The prediction server is freely available at http://www.jci‐bioinfo.cn/swfrate/input.jsp . Proteins 2013. © 2012 Wiley Periodicals, Inc.

15.
Although the folding rates of proteins have been studied extensively, both experimentally and theoretically, and many native state topological parameters have been proposed to correlate with or predict these rates, unfolding rates have received much less attention. Moreover, unfolding rates have generally been thought either to not relate to native topology in the same manner as folding rates, perhaps depending on different topological parameters, or to be more difficult to predict. Using a dataset of 108 proteins including two-state and multistate folders, we find that both unfolding and folding rates correlate strongly, and comparably well, with well-established measures of native topology, the absolute contact order and the long range order, with correlation coefficient values of 0.75 or higher. In addition, compared to folding rates, the absolute values of unfolding rates vary more strongly with native topology, have a larger range of values, and correlate better with thermodynamic stability. Similar trends are observed for subsets of different protein structural classes. Taken together, these results suggest that choosing a scaffold for protein engineering may require a compromise between a simple topology that will fold sufficiently quickly but also unfold quickly, and a complex topology that will unfold slowly and hence have kinetic stability, but fold slowly. These observations, together with the established role of kinetic stability in determining resistance to thermal and chemical denaturation as well as proteases, have important implications for understanding fundamental aspects of protein unfolding and folding and for protein engineering and design.

16.
It is a challenging task to understand the relationship between the sequences and folding rates of proteins. Previous studies have found that each of contact order (CO), long-range order (LRO), total contact distance (TCD), chain topology parameter (CTP), and effective length (Leff) correlates significantly with the folding rate of proteins. In this paper, we introduce a new parameter called n-order contact distance (nOCD) and use it to predict the folding rate of proteins with two- and three-state folding kinetics. A good linear correlation between the folding rate logarithm ln kf and nOCD is found with n=1.2, alpha=0.6 for two-state folders (correlation coefficient is -0.809, P-value<0.0001) and with n=2.8, alpha=1.5 for three-state folders (correlation coefficient is -0.816, P-value<0.0001). However, this correlation is completely absent for three-state folders with n=1.2, alpha=0.6 (correlation coefficient is 0.0943, P-value=0.661) and for two-state folders with n=2.8, alpha=1.5 (correlation coefficient is -0.235, P-value=0.2116). We also find that the average number of contacts per residue Pm in the interval of m is smaller for two-state folders than for three-state folders. The probability distribution P(gamma) of a residue having gamma pairs of contacts fits a Gaussian distribution for both two- and three-state folders. We observe that the squared radius of gyration S^2 correlates well with the number of residues for both two- and three-state folders, with correlation coefficients of 0.908 and 0.901 and fitted slopes of 1.202 and 0.795, respectively. This suggests that three-state folders may be more compact than two-state folders. Comparisons with nTCD and nCTP are also made, and nOCD is found to be the best of the three for folding rate prediction.

17.
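Predicting protein folding rates from amino acid sequence   (Cited by: 1 total; 0 self-citations; 1 by others)
Predicting protein folding rates is one of the most challenging problems in biophysics today. In recent years, many researchers have carried out extensive work to explore the determinants of folding rates, and many parameters and methods have been proposed. However, the influence on folding rate of information such as the interactions between amino acid residues and the sequence order of the amino acids has never been addressed. We use the pseudo amino acid composition method to extract amino acid sequence-order information, use a Monte Carlo method to select the best feature factors, and build a linear regression model to predict folding rates. The method predicts folding rates directly from a protein's amino acid sequence, without requiring any (explicit) structural information. Under jackknife cross-validation on a dataset of 99 proteins, the predicted folding rates correlate well with the experimental values, with a correlation coefficient of 0.81 and a prediction error of only 2.54. This accuracy is clearly better than that of other sequence-based methods, which shows that a protein's sequence-order information is an important factor influencing its folding rate.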

18.
Huang JT, Cheng JP. Proteins 2008, 72(1): 44-49
Prediction of protein-folding rates follows different rules in two-state and multi-state kinetics. The prerequisite for the prediction is to recognize the folding kinetic pathway of proteins. Here, we use logistic regression and a support vector machine to discriminate between two-state and multi-state folding proteins. We find that chain length is sufficient to accurately recognize multi-state proteins. There is a transition boundary between the two kinetic models. A protein folds with multi-state kinetics if its length is larger than 112 residues. The logistic prediction from amino acid composition shows that the kinetic pathway of folding is closely related to amino acid volume. Small amino acids make two-state folding easier, and vice versa. However, cysteine, alanine, arginine, lysine, histidine, and methionine do not conform to this rule.
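
The chain-length boundary reported in this abstract can be written as a one-line rule. The sketch below encodes only that stated threshold (multi-state if the length exceeds 112 residues); it is not the logistic regression or support vector machine models the authors trained, and the function and constant names are hypothetical.

```python
# Boundary reported in the abstract: longer than 112 residues -> multi-state.
LENGTH_THRESHOLD = 112


def predict_folding_type(sequence):
    """Classify a protein as 'two-state' or 'multi-state' from length alone.

    sequence: amino acid sequence as a string of one-letter codes.
    This is only the simple threshold rule, not a trained classifier.
    """
    if len(sequence) > LENGTH_THRESHOLD:
        return "multi-state"
    return "two-state"


# Placeholder sequences of 60 and 150 residues, for illustration only.
short_label = predict_folding_type("A" * 60)
long_label = predict_folding_type("A" * 150)
print(short_label, long_label)
```

A rule this simple ignores the composition effects the abstract also describes (for example, amino acid volume), so it should be read as a baseline rather than a full predictor.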

19.
Bastolla U, Porto M, Ortíz AR. Proteins 2008, 71(1): 278-299
We adopt a model of inverse folding in which folding stability results from the combination of the hydrophobic effect with local interactions responsible for secondary structure preferences. Site-specific amino acid distributions can be calculated analytically for this model. We determine optimal parameters for the local interactions by fitting the complete inverse folding model to the site-specific amino acid distributions found in the Protein Data Bank. This procedure drastically reduces the influence on the derived parameters of the preference of different secondary structures for buriedness, which affects local interaction parameters determined through the standard approach based on amino acid propensities. The quality of the fit is evaluated through the likelihood of the observed amino acid distributions given the model and the Bayesian Information Criterion, which indicate that the model with optimal local interaction parameters is strongly preferable to the model where local interaction parameters are determined through propensities. The optimal model yields a mean correlation coefficient r = 0.96 between observed and predicted amino acid distributions. The local interaction parameters are then tested in threading experiments, in combination with contact interactions, for their capacity to recognize the native structure and structures similar to the native against unrelated ones. In a challenging test, proteins structurally aligned with the Mammoth algorithm are scored with the effective free energy function. The native structure gets the highest stability score in 100% of the cases, a high recognition rate comparable to that achieved against easier decoys generated by gapless threading. We then examine proteins for which at least one highly similar template exists.
In 61% of the cases, the structure with the highest stability score excluding the native belongs to the native fold, compared to 60% if we use local interaction parameters derived from the usual amino acid propensities and 52% if we use only contact interactions. A highly similar structure is present within the five best stability scores in 82%, 81%, and 76% of the cases, for local interactions determined through inverse folding, through propensity, and set to zero, respectively. These results indicate that local interactions substantially improve the performance of contact free energy functions in fold recognition, and that similar structures tend to get high stability scores, although these are often not high enough to discriminate them from unrelated structures. This work highlights the importance of applying more challenging tests, such as the recognition of homologous structures, when testing stability scores for protein folding.

20.
Fernández A, Colubri A. Proteins 2002, 48(2): 293-310
We generate ab initio folding pathways in two single-domain proteins, a hyperthermophile variant of protein G domain (1gb4) and ubiquitin (1ubi), both presumed to be two-state folders. Both proteins are endowed with the same topology but, as shown in this work, rely to a different extent on large-scale context to find their native folds. First, we demonstrate a generic feature of two-state folders: a downsizing of structural fluctuations is achieved only when the protein reaches a stationary plateau maximizing the number of highly protected hydrogen bonds. This enables us to identify the folding nucleus and show that folding does not become expeditious until a topology is generated that is able to protect intramolecular hydrogen bonds from water attack. Pathway heterogeneity is shown to be dependent on the extent to which the protein relies on large-scale context to fold, rather than on contact order: proteins that can only stabilize native secondary structure by packing it against scaffolding hydrophobic moieties are meant to have a heterogeneous transition-state ensemble if they are to become successful folders (otherwise, successful folding would be too fortuitous an event). We estimate mutational Phi values as ensemble averages and deconvolute individual-route contributions to the averaged two-state kinetic picture. Our results find experimental corroboration in the well-studied chymotrypsin inhibitor (CI2), while leading to verifiable predictions for the other two study cases.
