Similar literature
 Found 20 similar articles (search time: 31 ms)
1.
Ma BG  Guo JX  Zhang HY 《Proteins》2006,65(2):362-372
Discovering the mechanism of protein folding is a great challenge in molecular biology. A key step to this end is to find factors that correlate with protein folding rates. Over the past few years, many empirical parameters, such as contact order, long-range order, total contact distance, and secondary structure content, have been developed to reflect the correlation between folding rates and protein tertiary or secondary structures. However, the correlation between proteins' folding rates and their amino acid compositions has not been explored. In the present work, we systematically examined the correlation between proteins' folding rates and their amino acid compositions for two-state and multistate folders and found that different amino acids contributed differently to the folding process. The relation between the amino acids' molecular weight and degeneracy and the folding rates was examined, and the role of hydrophobicity in the protein folding process was also inspected. As a result, a new indicator called the composition index was derived, which takes no structural factors into account and is determined solely by the amino acid composition of a protein. This indicator is found to be highly correlated with the protein's folding rate (r > 0.7). Three conclusions follow from this work. (1) Two-state folders and multistate folders have different rate-determining amino acids. (2) The main determining information of a protein's folding rate is largely reflected in its amino acid composition. (3) From the standpoint of practical application, the composition index may be the best predictor for ab initio prediction of protein folding rates directly from sequence.
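
As a rough illustration of what a composition-only indicator looks like, the sketch below computes the fractional amino acid composition of a sequence and a weighted sum over it. The function names and the weights are hypothetical placeholders chosen for this example; they are not the coefficients or the exact definition derived by Ma et al.

```python
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues

def composition(sequence):
    """Return the fraction of each standard residue in the sequence."""
    counts = Counter(res for res in sequence.upper() if res in AMINO_ACIDS)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence contains no standard residues")
    return {aa: counts[aa] / total for aa in AMINO_ACIDS}

def composition_index(sequence, weights):
    """Weighted sum of residue fractions; residues without a weight count as 0."""
    comp = composition(sequence)
    return sum(weights.get(aa, 0.0) * frac for aa, frac in comp.items())

# Hypothetical weights, for illustration only (not fitted to any data).
EXAMPLE_WEIGHTS = {"A": 1.0, "G": -1.0}
```

A structure-free indicator of this kind needs only the sequence string, which is what makes it attractive for large-scale screening.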

2.
Huang JT  Xing DJ  Huang W 《Amino acids》2012,43(2):567-572
The successful prediction of protein-folding rates based on the sequence-predicted secondary structure suggests that the folding rates might be predicted from sequence alone. To pursue this question, we directly predict the folding rates from amino acid sequences, without requiring any information on secondary or tertiary structure. Our work achieves 88% correlation with experimentally determined folding rates for proteins of all folding types and for peptides, suggesting that almost all of the information needed to specify a protein's folding kinetics and mechanism is contained within its amino acid sequence. The influence of a residue on the folding rate is related to its amino acid properties. The hydrophobic character of amino acids may be an important determinant of folding kinetics, whereas other properties (size, flexibility, polarity, and isoelectric point) contribute little to the folding rate constant.

3.
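Predicting protein folding rates from amino acid sequences   Total citations: 1, self-citations: 0, citations by others: 1
Predicting protein folding rates is one of the most challenging problems in biophysics today. In recent years, many researchers have carried out extensive work to explore the determinants of folding rate, and many parameters and methods have been proposed. However, the influence of information such as interactions between amino acid residues and the sequence order of amino acids on the folding rate has not been addressed. Here, pseudo amino acid composition is used to extract sequence-order information, a Monte Carlo method is used to select the best feature factors, and a linear regression model is built to predict folding rates. The method can predict the folding rate directly from a protein's amino acid sequence without requiring any explicit structural information. Under jackknife cross-validation on a dataset of 99 proteins, the predicted folding rates correlate well with the experimental values, with a correlation coefficient of 0.81 and a prediction error of only 2.54. This accuracy is clearly better than that of other sequence-based methods, which indicates that sequence-order information is an important factor influencing protein folding rates.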
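
The jackknife (leave-one-out) validation mentioned in this abstract can be sketched as follows. This is a simplified, single-feature version written for illustration: the abstract describes a multi-feature linear regression on pseudo amino acid composition, whereas the helper names and the one-variable fit below are assumptions of this example.

```python
from math import sqrt

def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x (one feature)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

def jackknife_predictions(xs, ys):
    """Predict each y from a line fitted to all the other points."""
    preds = []
    for i in range(len(xs)):
        train_x = xs[:i] + xs[i + 1:]  # leave sample i out
        train_y = ys[:i] + ys[i + 1:]
        intercept, slope = fit_line(train_x, train_y)
        preds.append(intercept + slope * xs[i])
    return preds

def pearson(a, b):
    """Pearson correlation coefficient between two equal-length lists."""
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / sqrt(var_a * var_b)
```

Calling `pearson(jackknife_predictions(xs, ys), ys)` on lists of feature values and measured log folding rates gives the kind of cross-validated correlation coefficient quoted above.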

4.
Many single-domain proteins exhibit two-state folding kinetics, with folding rates that span more than six orders of magnitude. A quantity of much recent interest for such proteins is their contact order, the average separation in sequence between contacting residue pairs. Numerous studies have reached the surprising conclusion that contact order is well-correlated with the logarithm of the folding rate for these small, well-characterized molecules. Here, we investigate the physico-chemical basis for this finding by asking whether contact order is actually a composite number that measures the fraction of local secondary structure in the protein; viz. turns, helices, and hairpins. To pursue this question, we calculated the secondary structure content for 24 two-state proteins and obtained coefficients that predict their folding rates. The predicted rates correlate strongly with experimentally determined rates, comparable to the correlation with contact order. Further, these predicted folding rates are correlated strongly with contact order. Our results suggest that the folding rate of two-state proteins is a function of their local secondary structure content, consistent with the hierarchic model of protein folding. Accordingly, it should be possible to utilize secondary structure prediction methods to predict folding rates from sequence alone.
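
Contact order as defined in this abstract (the average sequence separation between contacting residue pairs) is simple to compute once a contact list is available. The sketch below assumes the contacts are already given as pairs of residue indices; how contacts are detected (distance cutoff, atom selection) is left out. The function name and the optional length normalization, which yields the commonly used relative contact order, are choices made for this example.

```python
def contact_order(contacts, length=None):
    """Average sequence separation |i - j| over contacting residue pairs.

    contacts: iterable of (i, j) residue index pairs that are in contact.
    length:   if given, the result is divided by the chain length
              (relative contact order); otherwise it is the absolute value.
    """
    pairs = list(contacts)
    if not pairs:
        raise ValueError("at least one contact is required")
    mean_separation = sum(abs(i - j) for i, j in pairs) / len(pairs)
    return mean_separation if length is None else mean_separation / length
```

For example, two contacts with separations 4 and 8 give an absolute contact order of 6; in a 12-residue chain the relative contact order is 0.5.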

5.
Proteins fold by either a two‐state or a multistate kinetic mechanism. We observe that amino acids play different roles in the different mechanisms. Many residues that readily form regular secondary structures (α helices, β sheets and turns) can promote the two‐state folding reactions of small proteins. Most hydrophilic residues can speed up the multistate folding reactions of large proteins. Folding rates of large proteins are likewise responsive to the flexibility of some amino acids. Other properties of amino acids (including volume, polarity, accessible surface, exposure degree, isoelectric point, and phase transfer energy) contribute little to the folding kinetics of the proteins. Cysteine is a special residue: it triggers the two‐state folding reaction but inhibits the multistate folding reaction. These findings not only provide new insight into protein structure prediction, but could also be used to direct point mutations that change the folding rate. Proteins 2014; 82:2375–2382. © 2014 Wiley Periodicals, Inc.

6.
Protein folding is the process by which a protein proceeds from its denatured state to its specific biologically active conformation. Understanding the relationship between sequences and the folding rates of proteins remains an important challenge. Most previous methods of predicting protein folding rate require the tertiary structure of a protein as an input. In this study, long‐range and short‐range contacts in proteins were used to derive an extended version of the pseudo amino acid composition based on a sliding window method. This method is capable of predicting protein folding rates from the amino acid sequence alone, without the aid of any structural class information. We systematically studied the contributions of individual features to folding rate prediction. The optimal feature selection procedure combines forward feature selection with sequential backward selection. Using the jackknife cross‐validation test, the method was demonstrated on a large dataset. The predictor was built from numerous physicochemical and statistical features of proteins using a nonlinear support vector machine (SVM) regression model, and it obtained excellent agreement between predicted and experimentally observed folding rates. The correlation coefficient is 0.9313 and the standard error is 2.2692. The prediction server is freely available at http://www.jci‐bioinfo.cn/swfrate/input.jsp . Proteins 2013. © 2012 Wiley Periodicals, Inc.

7.
It has been shown for 20 proteins that amino acid residues included into the protein folding nucleus, determined experimentally, are often involved in the theoretically determined amyloidogenic fragments. For 18 proteins, Φ-values indicative of the extent of residue involvement into the folding nucleus are on average higher for amino acid residues within amyloidogenic regions. Amyloidogenic fragments were predicted for 20 proteins by two methods chosen from four on the basis of comparison of prediction of amyloidogenic regions known from experimental data. Since theoretical folding nuclei are detected by the protein three-dimensional structure and amyloidogenic regions by the protein chain primary structure, the detected regularity makes possible predictions of folding nucleation sites on the basis of amino acid sequence.

8.
Understanding the folding behavior of proteins is an important and challenging problem in modern molecular biology. In the present investigation, a large number of features representing protein sequences were developed based on sequence autocorrelation weighted by properties of amino acid residues. A genetic algorithm (GA) combined with multiple linear regression (MLR) was employed to select significant features related to protein folding rates and to build a global predictive model. Moreover, the local lazy regression (LLR) method was also used to predict protein folding rates. The results indicated that LLR performed much better than the global MLR model. The important properties of amino acid residues affecting protein folding rates were also analyzed. The results of this study will be helpful for understanding the mechanism of protein folding. Our results also demonstrate that features based on amino acid sequence autocorrelation are effective in representing the relationship between protein sequence and folding rates, and that the local method is a powerful tool for predicting protein folding rates.
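
One common way to build property-weighted autocorrelation features is to map each residue to a numeric property value and average the products of values separated by a fixed lag. The sketch below shows that form; whether it matches the exact descriptor used in this study is an assumption, and the tiny property scale is a made-up placeholder rather than a published hydrophobicity or volume scale.

```python
# Hypothetical property values, for illustration only.
TOY_SCALE = {"A": 1.0, "G": 0.0, "L": 2.0}

def autocorrelation(sequence, scale, lag):
    """Mean product of property values for residues `lag` positions apart."""
    values = [scale[res] for res in sequence.upper()]
    n = len(values)
    if not 0 < lag < n:
        raise ValueError("lag must satisfy 0 < lag < sequence length")
    products = (values[i] * values[i + lag] for i in range(n - lag))
    return sum(products) / (n - lag)

def autocorrelation_features(sequence, scale, max_lag):
    """Feature vector of autocorrelations for lags 1..max_lag."""
    return [autocorrelation(sequence, scale, lag) for lag in range(1, max_lag + 1)]
```

Repeating this over several property scales and lags yields the kind of large feature pool from which a selection step (a genetic algorithm in the abstract) would then pick a subset.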

9.
What are the key building blocks that would have been needed to construct complex protein folds? This is an important issue for understanding the protein folding mechanism and guiding de novo protein design. The twenty naturally occurring amino acids and eight secondary structures constitute a 28‐letter alphabet that determines folding kinetics and mechanism. Here we predict folding kinetic rates of proteins from many reduced alphabets. We find that a reduced alphabet of 10 letters achieves good correlation with folding rates, close to that achieved by the full 28‐letter alphabet. Many other reduced alphabets are not significantly correlated with folding rates. The finding suggests that not all amino acids and secondary structures are equally important for protein folding. The foldable sequence of a protein could be designed using at least 10 folding units, which can either promote or inhibit protein folding. Reducing alphabet cardinality without losing key folding kinetic information opens the door to potentially faster machine learning and data mining applications in protein structure prediction, sequence alignment and protein design. Proteins 2015; 83:631–639. © 2015 Wiley Periodicals, Inc.

10.
The persistent difficulties in the production of protein at high levels in heterologous systems, as well as the inability to understand pathologies associated with protein aggregation, highlight our limited knowledge on the mechanisms of protein folding in vivo. Attempts to improve yield and quality of recombinant proteins are diverse, frequently involving optimization of the cell growth temperature, the use of synonymous codons and/or the co-expression of tRNAs, chaperones and folding catalysts among others. Although protein secondary structure can be determined largely by the amino acid sequence, protein folding within the cell is affected by a range of factors beyond amino acid sequence. The folding pathway of a nascent polypeptide can be affected by transient interactions with other proteins and ligands, the ribosome, translocation through a pore membrane, redox conditions, among others. The translation rate as well as the translation machinery itself can dramatically affect protein folding, and thus the structure and function of the protein product. This review addresses current efforts to better understand how the use of synonymous codons in the mRNA and the availability of tRNAs can modulate translation kinetics, affecting the folding, the structure and the biological activity of proteins.

11.
Fuzzy cluster analysis has been applied to the 20 amino acids by using 65 physicochemical properties as a basis for classification. The clustering products, the fuzzy sets (i.e., classical sets with associated membership functions), have provided a new measure of amino acid similarities for use in protein folding studies. This work demonstrates that fuzzy sets of simple molecular attributes, when assigned to amino acid residues in a protein's sequence, can predict the secondary structure of the sequence with reasonable accuracy. An approach is presented for discriminating standard folding states, using near-optimum information splitting in half-overlapping segments of the sequence of assigned membership functions. The method is applied to a nonredundant set of 252 proteins and yields approximately 73% matching for correctly predicted and correctly rejected residues with approximately 60% overall success rate for the correctly recognized ones in three folding states: alpha-helix, beta-strand, and coil. The most useful attributes for discriminating these states appear to be related to size, polarity, and thermodynamic factors. Van der Waals volume, apparent average thickness of surrounding molecular free volume, and a measure of dimensionless surface electron density can explain approximately 95% of prediction results. Hydrogen bonding and hydrophobicity indices do not yet enable clear clustering and prediction.

12.
Despite the large number of publications on three‐helix protein folding, there is no study devoted to the influence of handedness on the rate of three‐helix protein folding. From the experimental studies, we conclude that left‐handed three‐helix proteins fold faster than right‐handed ones. What may explain this difference? An important question arising in this paper is whether the modeling of protein folding can capture the difference between the folding rates of proteins with similar structures but different folding mechanisms. To answer this question, the folding of eight three‐helix proteins (four right‐handed and four left‐handed), which are similar in size, was modeled using the Monte Carlo and dynamic programming methods. The studies allowed us to determine the order of folding of the secondary‐structure elements in these domains and the amino acid residues that are important for folding. The obtained data are in good correlation with each other and with the experimental data. Structural analysis of these proteins demonstrated that the left‐handed domains have fewer contacts per residue and a smaller radius of cross section than the right‐handed domains. This may be one of the explanations of the observed fact. The same tendency is observed for a large dataset consisting of 332 three‐helix proteins (238 right‐ and 94 left‐handed). From our analysis, we found that the left‐handed three‐helix proteins have somewhat less dense packing, which should result in faster folding for some proteins compared with right‐handed proteins. Proteins 2013; © 2013 Wiley Periodicals, Inc.

13.
14.
Protein folding speeds are known to vary over more than eight orders of magnitude. Plaxco, Simons, and Baker (see References) first showed a correlation of folding speed with the topology of the native protein. That and subsequent studies showed, if the native structure of a protein is known, its folding speed can be predicted reasonably well through a correlation with the "localness" of the contacts in the protein. In the present work, we develop a related measure, the geometric contact number, N (alpha), which is the number of nonlocal contacts that are well-packed, by a Voronoi criterion. We find, first, that in 80 proteins, the largest such database of proteins yet studied, N (alpha) is a consistently excellent predictor of folding speeds of both two-state fast folders and more complex multistate folders. Second, we show that folding rates can also be predicted from amino acid sequences directly, without the need to know the native topology or other structural properties.

15.
Burns LL  Ropson IJ 《Proteins》2001,43(3):292-302
The folding mechanisms of cellular retinol binding protein II (CRBP II), cellular retinoic acid binding protein I (CRABP I), and cellular retinoic acid binding protein II (CRABP II) were examined. These beta-sheet proteins have very similar structures and higher sequence homologies than most proteins in this diverse family. They have similar stabilities and show completely reversible folding at equilibrium with urea as a denaturant. The kinetics of these proteins were monitored during folding and unfolding by circular dichroism (CD) and fluorescence. During unfolding, CRABP II showed no intermediates, CRABP I had an intermediate with nativelike secondary structure, and CRBP II had an intermediate that lacked secondary structure. The refolding kinetics of these proteins were more similar. Each protein showed a burst-phase change in intensity by both CD and fluorescence, followed by a single observed phase by both CD and fluorescence and one or two additional refolding phases by fluorescence. The fluorescence spectral properties of the intermediate states were similar and suggested a gradual increase in the amount of native tertiary structure present for each step in a sequential path. However, the rates of folding differed by as much as 3 orders of magnitude and were slower than those expected from the contact order and topology of these proteins. As such, proteins with the same final structure may not follow the same route to the native state.

16.
Some amino acid substitutions in phage P22 coat protein cause a temperature-sensitive folding (tsf) phenotype. In vivo, these tsf amino acid substitutions cause coat protein to aggregate and form intracellular inclusion bodies when folded at high temperatures, but at low temperatures the proteins fold properly. Here the effects of tsf amino acid substitutions on folding and unfolding kinetics and the stability of coat protein in vitro have been investigated to determine how the substitutions change the ability of coat protein to fold properly. The equilibrium unfolding transitions of the tsf variants were best fit to a three-state model, N ⇌ I ⇌ U, where all species concerned were monomeric, a result confirmed by velocity sedimentation analytical ultracentrifugation. The primary effect of the tsf amino acid substitutions on the equilibrium unfolding pathway was to decrease the stability (ΔG) and the solvent accessibility (m-value) of the N ⇌ I transition. The kinetics of folding and unfolding of the tsf coat proteins were investigated using tryptophan fluorescence and circular dichroism (CD) at 222 nm. The tsf amino acid substitutions increased the rate of unfolding by 8-14-fold, with little effect on the rate of folding, when monitored by tryptophan fluorescence. In contrast, when folding or unfolding reactions were monitored by CD, the reactions were too fast to be observed. The tsf coat proteins are natural substrates for the molecular chaperones, GroEL/S. When native tsf coat protein monomers were incubated with GroEL, they bound efficiently, indicating that a folding intermediate was significantly populated even without denaturant. Thus, the tsf coat proteins aggregate in vivo because of an increased propensity to populate this unfolding intermediate.

17.
Extended proteins such as calmodulin and troponin C have two globular terminal domains linked by a central region that is exposed to water and often acts as a function-regulating element. The mechanisms that stabilize the tertiary structure of extended proteins appear to differ greatly from those of globular proteins. Identifying such differences in physical properties of amino acid sequences between extended proteins and globular proteins can provide clues useful for identification of extended proteins from complete genomes including orphan sequences. In the present study, we examined the structure and amino acid sequence of extended proteins. We found that extended proteins have a large net electric charge, high charge density, and an even balance of charge between the terminal domains, indicating that electrostatic interaction is a dominant factor in stabilization of extended proteins. Additionally, the central domain exposed to water contained many amphiphilic residues. Extended proteins can be identified from these physical properties of the tertiary structure, which can be deduced from the amino acid sequence. Analysis of physical properties of amino acid sequences can provide clues to the mechanism of protein folding. Also, structural changes in extended proteins may be caused by formation of molecular complexes. Long-range effects of electrostatic interactions also appear to play important roles in structural changes of extended proteins.

18.
Melo F  Marti-Renom MA 《Proteins》2006,63(4):986-995
Reduced or simplified amino acid alphabets group the 20 naturally occurring amino acids into a smaller number of representative protein residues. To date, several reduced amino acid alphabets have been proposed, which have been derived and optimized by a variety of methods. The resulting reduced amino acid alphabets have been applied to pattern recognition, generation of consensus sequences from multiple alignments, protein folding, and protein structure prediction. In this work, amino acid substitution matrices and statistical potentials were derived based on several reduced amino acid alphabets and their performance assessed in a large benchmark for the tasks of sequence alignment and fold assessment of protein structure models, using as a reference frame the standard alphabet of 20 amino acids. The results showed that a large reduction in the total number of residue types does not necessarily translate into a significant loss of discriminative power for sequence alignment and fold assessment. Therefore, some definitions of a few residue types are able to encode most of the relevant sequence/structure information that is present in the 20 standard amino acids. Based on these results, we suggest that the use of reduced amino acid alphabets may make it possible to increase the accuracy of current substitution matrices and statistical potentials for the prediction of protein structure of remote homologs.
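
Mechanically, applying a reduced alphabet is a lookup from each of the 20 residues to its group letter. The sketch below shows that step with a three-group partition (hydrophobic, polar, charged) invented for this example; it is not one of the alphabets evaluated by Melo and Marti-Renom, and the names `GROUPS` and `reduce_sequence` are hypothetical.

```python
# Illustrative three-letter alphabet: h = hydrophobic, p = polar, c = charged.
GROUPS = {
    "h": "AVLIMFWC",
    "p": "STNQYGPH",
    "c": "DEKR",
}

# Invert the grouping into a residue -> group-letter lookup table.
RESIDUE_TO_LETTER = {
    residue: letter for letter, members in GROUPS.items() for residue in members
}

def reduce_sequence(sequence, unknown="x"):
    """Rewrite a protein sequence in the reduced alphabet.

    Residues outside the 20 standard amino acids map to `unknown`.
    """
    return "".join(RESIDUE_TO_LETTER.get(res, unknown) for res in sequence.upper())
```

Substitution matrices or statistical potentials derived on the reduced sequences then have far fewer parameters to estimate than their 20-letter counterparts.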

19.
Folding rates of small single-domain proteins that fold through simple two-state kinetics can be estimated from details of the three-dimensional protein structure. Previously, predictions of secondary structure had been exploited to predict folding rates from sequence. Here, we estimate two-state folding rates from predictions of internal residue-residue contacts in proteins of unknown structure. Our estimate is based on the correlation between the folding rate and the number of predicted long-range contacts normalized by the square of the protein length. It is well known that long-range order derived from known structures correlates with folding rates. The surprise was that estimates based on very noisy contact predictions were almost as accurate as the estimates based on known contacts. On average, our estimates were similar to those previously published from secondary structure predictions. The combination of these methods that exploit different sources of information improved performance. It appeared that the combined method reliably distinguished fast from slow two-state folders.
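
The quantity described here, the number of predicted long-range contacts divided by the square of the protein length, can be sketched as below. The sequence-separation threshold that defines "long-range" is not stated in the abstract, so the default of 12 residues is an assumption of this example, as is the function name.

```python
def long_range_contact_density(contacts, length, min_separation=12):
    """Count contacts with |i - j| >= min_separation, normalized by length**2.

    contacts:       iterable of (i, j) residue index pairs (predicted or known).
    length:         number of residues in the protein.
    min_separation: assumed cutoff for calling a contact long-range.
    """
    if length <= 0:
        raise ValueError("length must be positive")
    n_long = sum(1 for i, j in contacts if abs(i - j) >= min_separation)
    return n_long / length ** 2
```

The same function works for contacts taken from a known structure, which is how the predicted-contact estimate can be compared against the structure-based one.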

20.
Huang JT  Cheng JP  Chen H 《Proteins》2007,67(1):12-17
We present a simple method for determining the folding rates of two- and three-state proteins from the number of residues in their secondary structures (secondary structure length). The method is based on the hypothesis that two- and three-state foldings share a common pattern. Three-state proteins first condense into metastable intermediates, with subsequent formation of alpha-helices, turns, and beta-sheets in a slow rate-limiting step. The folding rate of such proteins anticorrelates with the length of these beta-secondary structures. It is also assumed that in two-state folding, rapidly folded alpha-helices and turns may facilitate formation of fleeting unobservable intermediates and thus show two-state behavior. There is an inverse relationship between the folding rate and the length of beta-sheets and loops. Our study achieves 94.0 and 88.1% correlations with folding rates determined experimentally for 21 three- and 38 two-state proteins, respectively, suggesting that protein-folding rates are determined by the secondary structure length. The kinetic types are selected on the basis of a competition between hydrophobic collapse and formation of alpha-structure in early intermediates.
