Similar literature
 Found 20 similar documents (search time: 31 ms)
1.
An empirical relation between the amino acid composition and three-dimensional folding pattern of several classes of proteins has been determined. Computer-simulated neural networks have been used to assign proteins to one of the following classes based on their amino acid composition and size: (1) 4α-helical bundles, (2) parallel (α/β)8 barrels, (3) nucleotide binding fold, (4) immunoglobulin fold, or (5) none of these. Networks trained on the known crystal structures as well as sequences of closely related proteins are shown to correctly predict folding classes of proteins not represented in the training set with an average accuracy of 87%. Other folding motifs can easily be added to the prediction scheme once larger databases become available. Analysis of the neural network weights reveals that amino acids favoring prediction of a folding class are usually overrepresented in that class and amino acids with unfavorable weights are underrepresented in composition. The neural networks utilize combinations of these multiple small variations in amino acid composition in order to make a prediction. The favorably weighted amino acids in a given class also form the most intramolecular interactions with other residues in proteins of that class. A detailed examination of the contacts of these amino acids reveals some general patterns that may help stabilize each folding class. © 1993 Wiley-Liss, Inc.
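
The abstract above describes networks whose input is the amino acid composition and size of a protein. As a minimal sketch of how such an input vector could be built, the function below returns the 20 residue fractions followed by the chain length. The function name `composition_features` and the feature ordering are illustrative assumptions, not the encoding used by the authors.

```python
from collections import Counter

# The 20 standard residues in one-letter code (ordering chosen for illustration).
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def composition_features(sequence: str) -> list[float]:
    """Return 20 residue fractions followed by the number of standard residues."""
    seq = sequence.upper()
    # Non-standard symbols (X, gaps, whitespace) are ignored.
    counts = Counter(ch for ch in seq if ch in AMINO_ACIDS)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence contains no standard residues")
    fractions = [counts[aa] / total for aa in AMINO_ACIDS]
    return fractions + [float(total)]
```

A vector of this kind could then be passed to any classifier; the network architecture and training data are not specified in the abstract.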

2.

Background

Formation of the folding nucleus of globular proteins starts with the mutual interaction of a group of hydrophobic amino acids whose close contacts allow subsequent formation and stability of the 3D structure. These early steps can be predicted by simulation of the folding process through a Monte Carlo (MC) coarse-grained model in a discrete space. We previously defined MIRs (Most Interacting Residues) as the set of residues presenting a large number of non-covalent neighbour interactions during such simulation. MIRs are good candidates to define the minimal number of residues giving rise to a given fold instead of another one, although their proportion is rather high, typically 15-20% of the sequences. Having in mind experiments with two sequences of very high levels of sequence identity (up to 90%) but different folds, we combined the MIR method, which takes sequence as single input, with the “fuzzy oil drop” (FOD) model that requires a 3D structure, in order to estimate the residues coding for the fold. FOD assumes that a globular protein follows an idealised 3D Gaussian distribution of hydrophobicity density, with the maximum in the centre and minima at the surface of the “drop”. If the actual local density of hydrophobicity around a given amino acid is as high as the ideal one, then this amino acid is assigned to the core of the globular protein, and it is assumed to follow the FOD model. Therefore one obtains a distribution of the amino acids of a protein according to their agreement or disagreement with the FOD model.

Results

We compared and combined MIR and FOD methods to define the minimal nucleus, or keystone, of two populated folds: immunoglobulin-like (Ig) and flavodoxins (Flav). The combination of these two approaches defines some positions both predicted as a MIR and assigned as accordant with the FOD model. It is shown here that for these two folds, the intersection of the predicted sets of residues significantly differs from random selection. It reduces the number of selected residues by each individual method and allows a reasonable agreement with experimentally determined key residues coding for the particular fold. In addition, the intersection of the two methods significantly increases the specificity of the prediction, providing a robust set of residues that constitute the folding nucleus.

3.
Proteins fold by either a two-state or a multistate kinetic mechanism. We observe that amino acids play different roles in the different mechanisms. Many residues that readily form regular secondary structures (α helices, β sheets and turns) can promote the two-state folding reactions of small proteins. Most hydrophilic residues can speed up the multistate folding reactions of large proteins. Folding rates of large proteins are equally responsive to the flexibility of partial amino acids. Other properties of amino acids (including volume, polarity, accessible surface, exposure degree, isoelectric point, and phase transfer energy) contribute little to the folding kinetics of the proteins. Cysteine is a special residue: it triggers the two-state folding reaction but inhibits the multistate folding reaction. These findings not only provide a new insight into protein structure prediction, but also could be used to direct point mutations that change the folding rate. Proteins 2014; 82:2375–2382. © 2014 Wiley Periodicals, Inc.

4.
Here, we provide an analysis of molecular evolution of five of the most populated protein folds: immunoglobulin fold, oligonucleotide-binding fold, Rossmann fold, alpha/beta plait, and TIM barrels. In order to distinguish between "historic", functional and structural reasons for amino acid conservation, we consider proteins that acquire the same fold and have no evident sequence homology. For each fold we identify positions that are conserved within each individual family and coincide when non-homologous proteins are structurally superimposed. As a baseline for statistical assessment we use the conservatism expected based on the solvent accessibility. The analysis is based on a new concept of "conservatism-of-conservatism". This approach allows us to identify the structural features that are stabilized in all proteins having a given fold, despite the fact that actual interactions that provide such stabilization may vary from protein to protein. Comparison with experimental data on thermodynamics, folding kinetics and function of the proteins reveals that such universally conserved clusters correspond to either: (i) super-sites (common location of active site in proteins having common tertiary structures but not function) or (ii) folding nuclei whose stability is an important determinant of folding rate, or both (in the case of Rossmann fold). The analysis also helps to clarify the relation between folding and function that is apparent for some folds.

5.
The design of a protein folding approximation algorithm is not straightforward even when a simplified model is used. The folding problem is a combinatorial problem, where approximation and heuristic algorithms are usually used to find near-optimal folds of protein primary structures. Approximation algorithms provide guarantees on the distance to the optimal solution. The folding approximation approach proposed here depends on two-dimensional cellular automata to fold proteins presented in a well-studied simplified model called the hydrophobic–hydrophilic model. Cellular automata are discrete computational models that rely on local rules to produce some overall global behavior. One-third and one-fourth approximation algorithms choose a subset of the hydrophobic amino acids to form H–H contacts. Those algorithms start with finding a point to fold the protein sequence into two sides where one side ignores H’s at even positions and the other side ignores H’s at odd positions. In addition, blocks or groups of amino acids fold the same way according to a predefined normal form. We intend to improve approximation algorithms by considering all hydrophobic amino acids and folding based on the local neighborhood instead of using normal forms. The CA does not assume a fixed folding point. The proposed approach guarantees a one-half approximation minus the H–H endpoints. This guaranteed lower bound applies to short sequences only. This is proved as the core and the folds of the protein will have two identical sides for all short sequences.
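
The quantity that these approximation algorithms try to maximize is the number of H–H contacts in the hydrophobic–hydrophilic model. As a minimal sketch, assuming a two-dimensional square lattice and the usual convention that a contact is a pair of H residues on adjacent lattice sites that are not consecutive in the chain, the score of a given fold can be counted as follows. The function name `hh_contacts` is hypothetical, and the sketch only scores a fold; it does not implement the cellular automaton from the abstract.

```python
def hh_contacts(sequence: str, coords: list[tuple[int, int]]) -> int:
    """Count H-H contacts of an HP sequence folded on a 2D square lattice."""
    if len(sequence) != len(coords):
        raise ValueError("one lattice coordinate is needed per residue")
    if len(set(coords)) != len(coords):
        raise ValueError("fold is not self-avoiding")
    for (x1, y1), (x2, y2) in zip(coords, coords[1:]):
        if abs(x1 - x2) + abs(y1 - y2) != 1:
            raise ValueError("consecutive residues must be lattice neighbours")

    index_at = {xy: i for i, xy in enumerate(coords)}
    contacts = 0
    for i, (x, y) in enumerate(coords):
        if sequence[i] != "H":
            continue
        # Looking only right and up counts every adjacent pair exactly once.
        for neighbour in ((x + 1, y), (x, y + 1)):
            j = index_at.get(neighbour)
            if j is not None and sequence[j] == "H" and abs(i - j) > 1:
                contacts += 1
    return contacts
```

For example, folding `HPPH` into a unit square brings the two terminal H residues together and yields one contact, whereas the fully extended chain yields none.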

6.
How tightly packed is the hydrophobic core of a folding transition state structure? We have addressed this question by characterizing the effects on folding kinetics of > 40 substitutions of both large and small amino acids in the hydrophobic core of the Fyn SH3 domain. Our results show that residues at three positions, which we designate as the 'core folding nucleus', are tightly packed in the transition state, and substitutions at these positions cause the largest changes in the folding rate. The other six positions examined appear to be loosely packed; thus, substitutions at these positions with larger hydrophobic residues generally accelerate folding, presumably by increasing the rate of nonspecific hydrophobic collapse. Surprisingly, the folding rate can be greatly accelerated by residues that also significantly destabilize the native state structure. Furthermore, mutants with identical thermodynamic stability can differ by up to 55-fold in their folding rates. These results highlight the importance of hydrophobic core composition, as opposed to only topology, in determining the folding rate of a protein. They also provide a new explanation for the 'abnormal' phi-values observed in many protein folding kinetics studies.

7.

Background  

The wealth of information on protein structure has led to a variety of statistical analyses of the role played by individual amino acid types in the protein fold. In particular, the contact propensities between the various amino acids can be converted into folding energies that have proved useful in structure prediction. The present study addresses the relationship of protein folding propensities to the evolutionary relationship between residues.

8.
Parallel folding pathways in the SH3 domain protein   Total citations: 2 (self-citations: 0, citations by others: 2)
The transition-state ensemble (TSE) is the set of protein conformations with an equal probability to fold or unfold. Its characterization is crucial for an understanding of the folding process. We determined the TSE of the src-SH3 domain protein by using extensive molecular dynamics simulations of the Go model and computing the folding probability of a generated set of TSE candidate conformations. We found that the TSE possesses a well-defined hydrophobic core with variable enveloping structures resulting from the superposition of three parallel folding pathways. The most preferred pathway agrees with the experimentally determined TSE, while the two least preferred pathways differ significantly. The knowledge of the different pathways allows us to design the interactions between amino acids that guide the protein to fold through the least preferred pathway. This particular design is akin to a circular permutation of the protein. The finding motivates the hypothesis that the different experimentally observed TSEs in homologous proteins and circular permutants may represent potentially available pathways to the wild-type protein.

9.
A Poupon, J P Mornon. FEBS Letters, 1999, 452(3):283-289
Understanding the mechanism of protein folding would allow prediction of the three-dimensional structure from sequence data alone. It has been shown that small proteins fold in a small number of kinetic steps and that significantly populated intermediate states exist for some of them. Studies of these intermediates have demonstrated the existence of specific interactions established during the initial stages of folding. Comparison of the amino acids participating in these specific and essential interactions and constituting the folding nucleus with conserved hydrophobic positions of a given fold shows a striking correspondence. This finding opens the perspective of predicting the folding nucleus knowing only a set of divergent sequences of a protein family.

10.

Background  

Residue depth indicates how deeply a given residue is buried, in contrast to solvent accessibility, which only differentiates between buried and solvent-exposed residues. Compared with solvent accessibility, depth allows the study of deep-level structures and functional sites, and of the formation of the protein folding nucleus. Accurate prediction of residue depth would provide valuable information for fold recognition, prediction of functional sites, and protein design.

11.
Molecular dissection was employed to identify minimal independent folding units in dihydrofolate reductase (DHFR) from Escherichia coli. Eight overlapping fragments of DHFR, spanning the entire sequence and ranging in size from 36 to 123 amino acids, were constructed by chemical cleavage. These fragments were designed to examine the effect of tethering multiple elements of secondary structure on folding and to test if the secondary structural domains represent autonomous folding units. CD and fluorescence spectroscopy demonstrated that six fragments containing up to a total of seven alpha-helices or beta-strands and, in three cases, the adenine binding domain (residues 37-86), are largely disordered. A stoichiometric mixture of the two fragments comprising the large discontinuous domain, 1-36 and 87-159, also showed no evidence for folding beyond that observed for the isolated fragments. A fragment containing residues 1-107 appears to have secondary and tertiary structure; however, spontaneous self-association made it impossible to determine if this structure solely reflects the behavior of the monomeric form. In contrast, a monomeric fragment spanning residues 37-159 possesses significant secondary and tertiary structure. The urea-induced unfolding of fragment 37-159 in the presence of 0.5 M ammonium sulfate was found to be a well-defined, two-state process. The observation that fragment 37-159 can adopt a stable native fold with unique, aromatic side-chain packing is quite striking because residues 1-36 form an integral part of the structural core of the full-length protein.

12.
Structural genomics projects as well as ab initio protein structure prediction methods provide structures of proteins with no sequence or fold similarity to proteins with known functions. These are often low-resolution structures that may only include the positions of C alpha atoms. We present a fast and efficient method to predict DNA-binding proteins from just the amino acid sequences and low-resolution, C alpha-only protein models. The method uses the relative proportions of certain amino acids in the protein sequence, the asymmetry of the spatial distribution of certain other amino acids as well as the dipole moment of the molecule. These quantities are used in a linear formula, with coefficients derived from logistic regression performed on a training set, and DNA-binding is predicted based on whether the result is above a certain threshold. We show that the method is insensitive to errors in the atomic coordinates and provides correct predictions even on inaccurate protein models. We demonstrate that the method is capable of predicting proteins with novel binding site motifs and structures solved in an unbound state. The accuracy of our method is close to another, published method that uses all-atom structures, time-consuming calculations and information on conserved residues.
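
The decision rule described above, a linear formula with logistic-regression coefficients compared against a threshold, can be sketched generically as follows. The function name `predict_dna_binding`, the default threshold of 0.5, and any feature values or coefficients passed in are assumptions for illustration only; the abstract does not report the fitted coefficients or the threshold used.

```python
import math


def predict_dna_binding(
    features: list[float],
    coefficients: list[float],
    intercept: float,
    threshold: float = 0.5,
) -> tuple[float, bool]:
    """Apply a fitted logistic model and return (probability, predicted_binding)."""
    if len(features) != len(coefficients):
        raise ValueError("features and coefficients must have the same length")
    # Linear score: intercept plus the weighted sum of the input quantities.
    score = intercept + sum(w * x for w, x in zip(coefficients, features))
    # The logistic function maps the score to a value between 0 and 1.
    probability = 1.0 / (1.0 + math.exp(-score))
    return probability, probability >= threshold
```

In the setting of the abstract, `features` would hold quantities such as residue proportions, a spatial asymmetry measure and the dipole moment, each computed elsewhere.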

13.
Our ability to predict the three-dimensional conformation of a polypeptide, given its amino acid sequence, remains limited despite advances in structure analysis. Analysis of structures and sequences of protein families with similar secondary structural elements, but varying topologies, might help in addressing this problem. We have studied the small beta-barrel class of proteins characterized by four strands (n = 4) and a shear number of 8 (S = 8) to understand the principles of barrel formation. Multiple alignments of the various protein sequences were generated for the analysis. Positional entropy, as a measure of residue conservation, indicated conservation of non-polar residues at the core positions. The presence of a type II beta-turn among the various barrel proteins considered was another strikingly invariant feature. A conserved glycyl-aspartyl dipeptide at the beta-turn appeared to be important in guiding the protein sequence into the barrel fold. Molecular dynamics simulations of the type II beta-turn peptide suggested that aspartate is a key residue in the folding of the protein sequence into the barrel. Our study suggests that the conserved type II beta-turn and the non-polar residues in the barrel core are crucial for the folding of the protein's primary sequence into the beta-barrel conformation.
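
Positional entropy, used above as a conservation measure, is commonly computed as the Shannon entropy of the residue distribution in each alignment column. The sketch below assumes that definition (in bits), ignores gap characters, and uses the hypothetical function name `column_entropies`; the abstract does not state which variant of the measure the authors applied.

```python
import math
from collections import Counter


def column_entropies(alignment: list[str], gap: str = "-") -> list[float]:
    """Return the Shannon entropy (bits) of each column of an aligned sequence set."""
    if not alignment:
        raise ValueError("alignment is empty")
    length = len(alignment[0])
    if any(len(row) != length for row in alignment):
        raise ValueError("all aligned sequences must have the same length")

    entropies = []
    for column in zip(*alignment):
        residues = [c for c in column if c != gap]
        if not residues:
            # A column made only of gaps carries no residue information.
            entropies.append(0.0)
            continue
        n = len(residues)
        counts = Counter(residues)
        entropy = sum(-(k / n) * math.log2(k / n) for k in counts.values())
        entropies.append(entropy)
    return entropies
```

A fully conserved column scores 0 bits, and higher values indicate more variable positions.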

14.
We examine sequence-to-structure specificity of beta-structural fragments of immunoglobulin domains. The structure specificity of separate chain fragments is estimated by computing the Z-score values in recognition of the native structure in gapless threading tests. To improve the accuracy of our calculations we use energy averaging over diverse homologs of immunoglobulin domains. We show that the interactions between residues of beta-structure are more determinant in recognition of the native structure than the interactions within the whole chain molecule. This result distinguishes immunoglobulins from more typical proteins where the interactions between residues of the whole chain normally recognize the native fold more accurately than interactions between the residues of the secondary structure alone [Reva,B. and Topiol,S. (2000) BIOCOMPUTING: Proceedings of the Pacific Symposium. World Scientific Publishing Co., pp. 168-178]. We also find that the predominant contributions of the secondary structure are produced by the four central beta-strands that form the core of the molecule. The results of this study allow us through quantitative means to understand the architecture of immunoglobulin molecules. Comparing the fold recognition data for different chain fragments one can say that beta-strands form a rigid frame for immunoglobulin molecules, whereas loops, with no structural role, can develop a broad variety of binding specificities. It is well known that protein function is determined by specific portions of a protein chain. This study suggests that the whole protein structure can be predominantly determined by a few fragments of chain which form the structural framework of the molecule. This idea may help in better understanding the mechanisms of protein evolution: strengthening a protein structure in the key framework-forming regions allows mutations and flexibility in other chain regions.
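
The Z-score mentioned above is typically the distance of the native-structure energy from the mean energy of the alternative (decoy) threadings, in units of their standard deviation. The sketch below assumes that common definition with a population standard deviation; the function name `native_z_score` is hypothetical, and the energy function and threading procedure of the study are not reproduced here.

```python
from statistics import mean, pstdev


def native_z_score(native_energy: float, decoy_energies: list[float]) -> float:
    """Return (E_native - mean(E_decoys)) / std(E_decoys)."""
    if len(decoy_energies) < 2:
        raise ValueError("at least two decoy energies are required")
    spread = pstdev(decoy_energies)
    if spread == 0:
        raise ValueError("decoy energies have zero variance")
    return (native_energy - mean(decoy_energies)) / spread
```

Under this convention a more negative Z-score means the native structure is recognized more clearly against the decoys.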

15.
The three-dimensional structures of two aminoacyl-tRNA synthetases, the methionyl-tRNA synthetase from Escherichia coli (MetRS) and the tyrosyl-tRNA synthetase from Bacillus stearothermophilus (TyrRS), show a remarkable similarity over a span of about 140 amino acids. The region of homologous folding corresponds to a five-stranded parallel beta-sheet, including a mononucleotide-binding fold. One cysteine and two histidine residues that were found to be invariant in the amino acid sequences occupy similar places in the nucleotide-binding fold. In TyrRS, these residues are close to the adenylate binding site, and in MetRS to the Mg2+-ATP binding site.

16.
An obligatory alpha-helical amino acid residue   Total citations: 6 (self-citations: 0, citations by others: 6)
A W Burgess, S J Leach. Biopolymers, 1973, 12(11):2599-2605
Stereochemical studies predict that α-amino isobutyric acid, one of the amino acids found in antibiotics, can fold only into left- or right-handed α-helical conformations. Such residues will direct chain folding and should be useful in synthetic analogs of protein sequences to increase helix stability.

17.
We present a verified computational model of the SH3 domain transition state (TS) ensemble. This model was built for three separate SH3 domains using experimental phi-values as structural constraints in all-atom protein folding simulations. While averaging over all conformations incorrectly considers non-TS conformations as transition states, quantifying structures as pre-TS, TS, and post-TS by measurement of their transmission coefficient ("probability to fold", or p(fold)) allows for rigorous conclusions regarding the structure of the folding nucleus and a full mechanistic analysis of the folding process. Through analysis of the TS, we observe a highly polarized nucleus in which many residues are solvent-exposed. Mechanistic analysis suggests the hydrophobic core forms largely after an early nucleation step. SH3 presents an ideal system for studying the nucleation-condensation mechanism and highlights the synergistic relationship between experiment and simulation in the study of protein folding.

18.
One still cannot predict the 3D fold of a protein from its amino acid sequence, mainly because of errors in the energy estimates underlying the prediction. However, a recently developed theory [1] shows that, given a set of homologs (i.e., chains that share the same 3D fold despite numerous mutations), one can average the potential of each interaction over the homologs and thus predict the common 3D fold of a protein family even when a correct fold prediction for an individual sequence is impossible because the energies are known only approximately. This theoretical conclusion has been verified by simulation of the energy spectra of simplified models of protein chains [2], and further investigation of these simplified models shows that their true "native" fold can be found by folding of the chain where each interaction potential is averaged over the homologs. Finally, the applicability of the "homolog-averaging" approach is tested by recognition of real protein 3D structures. Both the gapless threading of sequences onto the known protein folds [3] and the more practically important gapped threading (which allows one to consider not only the known 3D structures, but also folds more or less similar to them) show a significant increase in selectivity of the native chain fold recognition.

19.
R A Broglia, G Tiana. Proteins, 2001, 45(4):421-427
While all the information required for the folding of a protein is contained in its amino acid sequence, one has not yet learned how to extract this information to predict the detailed, biologically active, three-dimensional structure of a protein whose sequence is known. Using insight obtained from lattice model simulations of the folding of small proteins (fewer than 100 residues), in particular of the fact that this phenomenon is essentially controlled by conserved contacts (Mirny et al., Proc Natl Acad Sci USA 1995;92:1282) among (few) strongly interacting ("hot") amino acids (Tiana et al., J Chem Phys 1998;108:757-761), which also stabilize local elementary structures formed early in the folding process and leading to the (postcritical) folding core when they assemble together (Broglia et al., Proc Natl Acad Sci USA 1998;95:12930, Broglia & Tiana, J Chem Phys 2001;114:7267), we have worked out a successful strategy for reading the three-dimensional structure of lattice model-designed proteins from the knowledge of only their amino acid sequence and of the contact energies among the amino acids.

20.
The amino acid sequence of the P2 protein of peripheral myelin was analyzed with regard to regions of probable alpha-helix, beta-structure, beta-turn, and unordered conformation by means of several algorithms commonly used to predict secondary structure in proteins. Because of the high beta-sheet content and virtual absence of alpha-helix shown by the circular dichroic spectra of the protein, a bias was introduced into the algorithms to favor the beta-structure over the alpha-helical conformation. In order to define those beta-sheet residues that could lie on the external hydrophilic surface of the protein and those that could lie in its hydrophobic interior, the predicted beta-strands were examined for charged and uncharged amino acids located at alternating positions in the sequence. The sequential beta-strands in the predicted secondary structure were then ordered into beta-sheets and aligned according to generally accepted tertiary folding principles and certain chemical properties peculiar to the P2 protein. The general model of the P2 protein that emerged was a "Greek key" beta-barrel, consisting of eight antiparallel beta-strands with a two-stranded ribbon of antiparallel beta-structure emerging from one end. The model has an uncharged, hydrophobic core and a highly hydrophilic surface. The two Cys residues, which form a disulfide, occur in a loop connecting two adjacent antiparallel strands. Two hydrophilic loops, each containing a cluster of acidic residues and a single Phe, protrude from one end of the molecule. The general model is consistent with many of the properties of the actual protein, including the relatively weak nature of its association with myelin lipids and the positions of amino acid substitutions. Alternative beta-strand orderings yield three specific models having different interstrand connections across the barrel ends.
