Similar literature
20 similar articles found (search time: 15 ms)
1.
Intrinsically unstructured/disordered proteins and domains (IUPs) lack a well-defined three-dimensional structure under native conditions. The IUPred server presents a novel algorithm for predicting such regions from amino acid sequences by estimating their total pairwise interresidue interaction energy, based on the assumption that IUP sequences do not fold due to their inability to form sufficient stabilizing interresidue interactions. The server offers optional built-in parameter sets optimized for predicting short or long disordered regions and structured domains.

2.
Intrinsically unstructured proteins (IUPs) are proteins lacking a fixed three-dimensional structure or containing long disordered regions. IUPs play an important role in biology and disease. Identifying disordered regions in protein sequences can provide useful information on protein structure and function, and can assist high-throughput protein structure determination. In this paper we present a system for predicting disordered regions in proteins based on decision trees and reduced amino acid composition. Concise rules based on biochemical properties of amino acid side chains are generated for prediction. Coarser information extracted from the composition of amino acids can not only improve the prediction accuracy but also increase the learning efficiency. In cross-validation tests, with four groups of reduced amino acid composition, our system can achieve a recall of 80% at a 13% false positive rate for predicting disordered regions, and the overall accuracy can reach 83.4%. This prediction accuracy is comparable to most, and better than some, existing predictors. Advantages of our approach are high prediction accuracy for long disordered regions and efficiency for large-scale sequence analysis. Our software is freely available for academic use upon request.
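
The reduced-composition idea in the abstract above can be illustrated with a minimal Python sketch. The abstract does not list its residue groupings or its learned rules, so the four-class alphabet `REDUCED_GROUPS` and the one-level rule `toy_rule` below are hypothetical stand-ins chosen only for illustration.

```python
from collections import Counter

# Hypothetical reduced alphabet (an assumption, not the grouping used in the paper).
REDUCED_GROUPS = {
    "hydrophobic": "AVLIMFWC",
    "polar": "STNQYGPH",
    "positive": "KR",
    "negative": "DE",
}
RESIDUE_TO_GROUP = {aa: g for g, aas in REDUCED_GROUPS.items() for aa in aas}


def reduced_composition(window):
    """Return the fraction of residues in each reduced group for one window."""
    counts = Counter(RESIDUE_TO_GROUP[aa] for aa in window if aa in RESIDUE_TO_GROUP)
    total = sum(counts.values())
    return {g: (counts[g] / total if total else 0.0) for g in REDUCED_GROUPS}


def toy_rule(features):
    """Toy one-level rule (illustrative only): call a window disordered
    when charged residues outnumber hydrophobic ones."""
    return features["positive"] + features["negative"] > features["hydrophobic"]
```

Collapsing 20 residue types into a few classes shrinks the feature space, which is one plausible reading of why the abstract reports faster learning from coarser information. A real system would learn its rules from labelled data rather than hard-code them.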

3.
Until recently, the point of view that the unique tertiary structure is necessary for protein function has prevailed. However, recent data have demonstrated that many cell proteins do not possess such structure in isolation, although displaying a distinct function under physiological conditions. These proteins were named the naturally, or intrinsically, disordered proteins. The fraction of intrinsically disordered regions in such proteins may vary from several amino acid residues to a completely unordered sequence of several tens or even several hundreds of residues. The main distinction of these proteins from structured (globular) proteins is that they have no unique tertiary structure in isolation and acquire it only upon interaction with their partners. The conformation of these proteins in a complex is determined not only by their own amino acid sequence (as is typical of structured, or globular, proteins) but also by the interacting partner. This review discusses the structure-function relationships in structured and intrinsically disordered proteins. The intricateness of this problem and the possible ways to solve it are illustrated by the example of the EF1A elongation factor family.

4.
MOTIVATION: Partially and wholly unstructured proteins have now been identified in all kingdoms of life, more commonly in eukaryotic organisms. This intrinsic disorder is related to certain critical functions. Apart from their fundamental interest, unstructured regions in proteins may prevent crystallization. Therefore, the prediction of disordered regions is an important aspect for the understanding of protein function, but may also help to devise genetic constructs. RESULTS: In this paper we present a computational tool for the detection of unstructured regions in proteins based on two properties of unfolded fragments: (1) disordered regions have a biased composition and (2) they usually contain either small or no hydrophobic clusters. In order to quantify these two facts we first calculate the amino acid distributions in structured and unstructured regions. Using these distributions, we calculate for a given sequence fragment the probability of being part of either a structured or an unstructured region. For each amino acid, the distance to the nearest hydrophobic cluster is also computed. Using these three values along a protein sequence allows us to predict unstructured regions with very simple rules. This method requires only the primary sequence, and no multiple alignment, which makes it an adequate method for orphan proteins. AVAILABILITY: http://genomics.eu.org/
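
The two ingredients named in the abstract above (a composition-based score and the distance to the nearest hydrophobic cluster) can be sketched roughly as follows. The hydrophobic residue set, the minimum cluster length `min_len`, and the log-odds form of the composition score are all assumptions made for this illustration; the abstract does not specify them.

```python
import math

HYDROPHOBIC = set("VILFMWY")  # assumed hydrophobic set, not taken from the paper


def hydrophobic_clusters(seq, min_len=3):
    """Return (start, end) pairs (end exclusive) of hydrophobic runs of at least min_len."""
    clusters, start = [], None
    for i, aa in enumerate(seq + "-"):  # the sentinel closes a trailing run
        if aa in HYDROPHOBIC:
            if start is None:
                start = i
        else:
            if start is not None and i - start >= min_len:
                clusters.append((start, i))
            start = None
    return clusters


def distance_to_nearest_cluster(seq, min_len=3):
    """Per-residue distance to the nearest cluster (0 inside one, inf if there is none)."""
    clusters = hydrophobic_clusters(seq, min_len)
    distances = []
    for i in range(len(seq)):
        best = math.inf
        for start, end in clusters:
            if start <= i < end:
                best = 0
            else:
                best = min(best, start - i if i < start else i - (end - 1))
        distances.append(best)
    return distances


def log_odds(fragment, freq_disordered, freq_ordered):
    """Sum of log(p_disordered / p_ordered) over a fragment; positive favours disorder.
    Both frequency tables are supplied by the caller."""
    return sum(math.log(freq_disordered[aa] / freq_ordered[aa]) for aa in fragment)
```

For example, `distance_to_nearest_cluster("GGVILGG")` gives 0 for the three residues of the `VIL` run and increasing distances on either side. Simple thresholds on these values would then play the role of the "very simple rules" the abstract mentions.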

5.
Molecular principles of the interactions of disordered proteins   Cited by: 6 (self-citations: 0, citations by others: 6)
Thorough knowledge of the molecular principles of protein-protein recognition is essential to our understanding of protein function at the cellular level. Whereas interactions of ordered proteins have been analyzed in great detail, complexes of intrinsically unstructured/disordered proteins (IUPs) have hardly been addressed so far. Here, we have collected a database of 39 complexes of experimentally verified IUPs, and compared their interfaces with those of 72 complexes of ordered, globular proteins. The characteristic differences found between the two types of complexes suggest that IUPs represent a distinct molecular implementation of the principles of protein-protein recognition. The interfaces do not differ in size, but those of IUPs cover a much larger part of the surface of the protein than for their ordered counterparts. Moreover, IUP interfaces are significantly more hydrophobic relative to their overall amino acid composition, but also in absolute terms. They rely more on hydrophobic-hydrophobic than on polar-polar interactions. Their amino acids in the interface realize more intermolecular contacts, which suggests a better fit with the partner due to induced folding upon binding that results in a better adaptation to the partner. The two modes of interaction also differ in that IUPs usually use only a single continuous segment for partner binding, whereas the binding sites of ordered proteins are more segmented. Probably, all these features contribute to the increased evolutionary conservation of IUP interface residues. These noted molecular differences are also manifested in the interaction energies of IUPs. Our approximation of these by low-resolution force-fields shows that IUPs gain much more stabilization energy from intermolecular contacts, than from folding, i.e. they use their binding energy for folding. 
Overall, our findings provide a structural rationale to the prior suggestions that many IUPs are specialized for functions realized by protein-protein interactions.

6.
Collapse of unfolded protein chains is an early event in folding. It affects structural properties of intrinsically disordered proteins, which take a considerable fraction of the human proteome. Collapse is generally believed to be driven by hydrophobic forces imposed by the presence of nonpolar amino acid side chains. Contributions from backbone hydrogen bonds to protein folding and stability, however, are controversial. To date, the experimental dissection of side-chain and backbone contributions has not yet been achieved because both types of interactions are integral parts of protein structure. Here, we realized this goal by applying mutagenesis and chemical modification on a set of disordered peptides and proteins. We measured the protein dimensions and kinetics of intra-chain diffusion of modified polypeptides at the level of individual molecules using fluorescence correlation spectroscopy, thereby avoiding artifacts commonly caused by aggregation of unfolded protein material in bulk. We found no contributions from side chains to collapse but, instead, identified backbone interactions as a source sufficient to form globules of native-like dimensions. The presence of backbone hydrogen bonds decreased polypeptide water solubility dramatically and accelerated the nanosecond kinetics of loop closure, in agreement with recent predictions from computer simulation. The presence of side chains, instead, slowed loop closure and modulated the dimensions of intrinsically disordered domains. It appeared that the transient formation of backbone interactions facilitates the diffusive search for productive conformations at the early stage of folding and within intrinsically disordered proteins.

7.
Compared to eukaryotes, the occurrence of "intrinsically disordered" or "natively unfolded" proteins in prokaryotes has not been explored extensively. Here, we report the occurrence of an intrinsically disordered protein from the mesophilic human pathogen Mycobacterium tuberculosis. The Histidine-tagged recombinant Rv3221c biotin-binding protein is intrinsically disordered at ambient and physiological growth temperatures as revealed by circular dichroism and Fourier transform infrared (FTIR) spectroscopic studies. However, an increase in temperature induces a transition from disordered to structured state with a folding temperature of approximately 53 °C. Addition of a structure inducing solvent trifluoroethanol (TFE) causes the protein to fold at lower temperatures suggesting that TFE fosters hydrophobic interactions, which drives protein folding. Differential Scanning Calorimetry studies revealed that folding is endothermic and the transition from a disordered to structured state is continuous (higher-order), implying existence of intermediates during folding process. Secondary structure analysis revealed that the protein has propensity to form beta-sheets. This is in conformity with FTIR spectrum that showed an absorption peak at wave number of 1636 cm⁻¹, indicative of disordered beta-sheet conformation in the native state. These data suggest that although Rv3221c may be disordered under ambient or optimal growth temperature conditions, it has the potential to fold into ordered structure at high temperature driven by increased hydrophobic interactions. In contrast to the generally known behavior of other intrinsically disordered proteins folding at high temperature, Rv3221c does not appear to oligomerize or aggregate as revealed through numerous experiments including Congo red binding, Thioflavin T-binding, turbidity measurements, and examining molar ellipticity as a function of protein concentration.
The amino acid composition of Rv3221c reveals that it has 24% charged and 54.9% hydrophobic amino acid residues. In this respect, this protein, although belonging to the class of intrinsically disordered proteins, has distinct features. The intrinsically disordered state and the biotin-binding feature of this protein suggest that it may participate in many biochemical processes requiring biotin as a cofactor and adopt suitable conformations upon binding other folded targets.

8.
We have performed a statistical analysis of unstructured amino acid residues in protein structures available in the databank of protein structures. Data on the occurrence of disordered regions at the ends and in the middle part of protein chains have been obtained: the regions near the ends (at a distance of less than 30 residues from the N- or C-terminus) contain 66% of unstructured residues (38% near the N-terminus and 28% near the C-terminus), although these terminal regions include only 23% of the amino acid residues. The frequencies of occurrence of unstructured residues have been calculated for each of the 20 residue types in different positions in the protein chain. It has been shown that the relative frequencies of occurrence of unstructured residues of the 20 types at the termini of protein chains differ from those in the middle part of the protein chain; amino acid residues of the same type have different probabilities of being unstructured in the terminal regions and in the middle part of the protein chain. The obtained frequencies of occurrence of unstructured residues in the middle part of the protein chain have been used as a scale for predicting disordered regions from amino acid sequence using the method (FoldUnfold) previously developed by us. This scale of frequencies of occurrence of unstructured residues correlates with the contact scale (previously developed by us and used for the same purpose) at a level of 95%. Testing the new scale on a database of 427 unstructured proteins and 559 completely structured proteins has shown that this scale can be successfully used for the prediction of disordered regions in protein chains.
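
The percentages quoted in the abstract above imply a strong enrichment of unstructured residues near the chain termini. The short Python calculation below makes that arithmetic explicit. It uses only the two figures from the abstract (66% and 23%); the helper name `enrichment` and the derived ratios are this sketch's own and are not reported in the source.

```python
def enrichment(share_of_unstructured, share_of_all_residues):
    """Ratio of a region's share of unstructured residues to its share of all residues.
    A value above 1 means the region is enriched in unstructured residues."""
    return share_of_unstructured / share_of_all_residues


# Figures from the abstract: terminal regions hold 66% of unstructured
# residues but only 23% of all residues.
terminal = enrichment(0.66, 0.23)          # roughly 2.9
middle = enrichment(1 - 0.66, 1 - 0.23)    # roughly 0.44
relative = terminal / middle               # roughly 6.5

print(f"terminal {terminal:.2f}, middle {middle:.2f}, relative {relative:.2f}")
```

Under these figures a residue within 30 positions of a terminus is about six to seven times more likely to be unstructured than one in the middle of the chain, which is consistent with the abstract's decision to build the prediction scale from mid-chain frequencies only.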

9.
A truly disordered protein lacks a stable fold and its backbone amide protons exchange with solvent at rates predicted from studies of unstructured peptides. We have measured the exchange rates of two model disordered proteins, FlgM and α-synuclein, in buffer and in Escherichia coli using the NMR experiment, SOLEXSY. The rates are similar in buffer and cells and are close to the rates predicted from data on small, unstructured peptides. This result indicates that true disorder can persist inside the crowded cellular interior and that weak interactions between proteins and macromolecules in cells do not necessarily affect intrinsic rates of exchange.

10.
Serine/arginine-rich (SR) splicing factors play an important role in constitutive and alternative splicing as well as during several steps of RNA metabolism. Despite the wealth of functional information about SR proteins accumulated to date, structural knowledge about the members of this family is very limited. To gain a better insight into structure-function relationships of SR proteins, we performed extensive sequence analysis of SR protein family members and combined it with ordered/disordered structure predictions. We found that SR proteins have properties characteristic of intrinsically disordered (ID) proteins. The amino acid composition and sequence complexity of SR proteins were very similar to those of the disordered protein regions. More detailed analysis showed that the SR proteins, and their RS domains in particular, are enriched in the disorder-promoting residues and are depleted in the order-promoting residues as compared to the entire human proteome. Moreover, disorder predictions indicated that RS domains of SR proteins were completely unstructured. Two different classification methods, the charge-hydropathy measure and the cumulative distribution function (CDF) of the disorder scores, were in agreement with each other, and they both strongly predicted members of the SR protein family to be disordered. This study emphasizes the importance of the disordered structure for several functions of SR proteins, such as for spliceosome assembly and for interaction with multiple partners. In addition, it demonstrates the usefulness of order/disorder predictions for inferring protein structure from sequence.

11.
Many disordered proteins function via binding to a structured partner and undergo a disorder-to-order transition. The coupled folding and binding can confer several functional advantages such as the precise control of binding specificity without increased affinity. Additionally, the inherent flexibility allows the binding site to adopt various conformations and to bind to multiple partners. These features explain the prevalence of such binding elements in signaling and regulatory processes. In this work, we report ANCHOR, a method for the prediction of disordered binding regions. ANCHOR relies on the pairwise energy estimation approach that is the basis of IUPred, a previous general disorder prediction method. In order to predict disordered binding regions, we seek to identify segments that are in disordered regions, cannot form enough favorable intrachain interactions to fold on their own, and are likely to gain stabilizing energy by interacting with a globular protein partner. The performance of ANCHOR was found to be largely independent of the amino acid composition and adopted secondary structure. Longer binding sites generally were predicted to be segmented, in agreement with available experimentally characterized examples. Scanning several hundred proteomes showed that the occurrence of disordered binding sites increased with the complexity of the organisms even compared to disordered regions in general. Furthermore, the length distribution of binding sites was different from disordered protein regions in general and was dominated by shorter segments. These results underline the importance of disordered proteins and protein segments in establishing new binding regions. Due to their specific biophysical properties, disordered binding sites generally carry a robust sequence signal, and this signal is efficiently captured by our method. Through its generality, ANCHOR opens new ways to study the essential functional sites of disordered proteins.

12.
Absence of any regular structure is increasingly being observed in structural studies of proteins. These disordered regions or random coils, which have been observed under physiological conditions, are indicators of protein plasticity. The wide variety of interactions possible due to the flexibility of these 'natively disordered' regions confers functional advantage to the protein and the organism in general. This concept is underscored by the increasing proportion of intrinsically unstructured proteins seen with increasing complexity of the organisms. The 'natively unfolded/disordered' state of the protein can be predicted utilizing Uversky's or Dunker's algorithm. We utilized Uversky's prediction scheme and, based on the unique position of a protein in the charge-hydrophobicity plot, a derived net score was used to predict the overall disorder of the human housekeeping and non-housekeeping proteins. Substantial numbers of proteins in both classes were predicted to be unfolded. However, comparative genomic analysis of predicted unfolded Homo sapiens proteins with homologues in Caenorhabditis elegans, Drosophila melanogaster and Mus musculus revealed a significant increase in unfoldedness in non-housekeeping proteins in comparison with housekeeping proteins. Our analysis in the evolutionary context suggests addition or substitution of amino acid residues which favour unfoldedness in non-housekeeping proteins compared to housekeeping proteins.
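
The charge-hydrophobicity plot mentioned in the abstract above can be sketched in a few lines of Python. This sketch assumes the commonly cited form of Uversky's boundary, ⟨R⟩ = 2.785⟨H⟩ − 1.151, with ⟨H⟩ the mean Kyte-Doolittle hydropathy rescaled to 0..1 and ⟨R⟩ the absolute mean net charge. The function `boundary_distance` is a hypothetical stand-in for the "derived net score"; the abstract does not say how its score was computed.

```python
# Kyte-Doolittle hydropathy values.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def mean_hydropathy(seq):
    """Mean Kyte-Doolittle hydropathy, rescaled from [-4.5, 4.5] to [0, 1]."""
    return sum((KYTE_DOOLITTLE[aa] + 4.5) / 9.0 for aa in seq) / len(seq)


def mean_net_charge(seq):
    """Absolute mean net charge, counting K/R as +1 and D/E as -1."""
    positive = sum(seq.count(aa) for aa in "KR")
    negative = sum(seq.count(aa) for aa in "DE")
    return abs(positive - negative) / len(seq)


def boundary_distance(seq):
    """Hypothetical score: vertical distance above the assumed boundary line.
    Positive values fall on the natively unfolded side of the plot."""
    return mean_net_charge(seq) - (2.785 * mean_hydropathy(seq) - 1.151)
```

A strongly hydrophilic sequence such as `"KKKKEEEE"` lands on the unfolded side (positive score), while `"LLLLIIII"` lands on the folded side (negative score). Histidine is treated as neutral here, which is a simplification.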

13.
S. Miyazawa, R. L. Jernigan. Proteins, 1999, 36(3): 357-369
We consider modifications of an empirical energy potential for fold and sequence recognition to represent approximately the stabilities of proteins in various environments. A potential used here includes a secondary structure potential representing short-range interactions for secondary structures of proteins, and a tertiary structure potential consisting of a long-range, pairwise contact potential and a repulsive packing potential. This potential is devised to evaluate together the total conformational energy of a protein at the coarse-grained residue level. It was previously estimated from the observed frequencies of secondary structures, from contact frequencies between residues, and from the distributions of the number of residues in contact in known protein structures by regarding those distributions as the equilibrium distributions with the Boltzmann factor of these interaction energies. The stability of native structures is assumed as a primary requirement for proteins to fold into their native structures. A collapse energy is subtracted from the contact energies to remove the protein size dependence and to represent protein stabilities for monomeric and multimeric states. The free energy of the whole ensemble of protein conformations that is subtracted from the conformational energy to represent protein stability is approximated as the average energy expected for a typical native structure with the same amino acid composition. This term may be constant in fold recognition but essentially varies in sequence recognition. A simple test of threading sequences into structures without gaps is employed to demonstrate the importance of the present modifications that permit the same potential to be utilized for both fold and sequence recognition. Proteins 1999;36:357-369. Published 1999 Wiley-Liss, Inc.

14.
The cell cycle inhibitor p57Kip2 induces cell cycle arrest by inhibiting the activity of cyclin-dependent kinases. p57, although active as a cyclin A-CDK2 inhibitor, is largely unfolded or intrinsically disordered as shown by circular dichroism and fluorescence spectra characteristic of an unfolded protein and a hydrodynamic radius consistent with an unfolded structure. In addition, the N-terminal domain of p57 is both functionally independent as a cyclin A-CDK2 inhibitor and unstructured, as demonstrated by circular dichroism and fluorescence spectra indicative of unfolded proteins, a lack of 1H chemical shift dispersion and a hydrodynamic radius consistent with a highly unfolded structure. The amino acid compositions of full-length p57 and the excised QT domain of p57 exhibit significant deviations from the average composition of globular proteins that are consistent with the observed intrinsic disorder. However, the amino acid composition of the CDK inhibition domain of p57 does not exhibit such a striking deviation from the average values observed for proteins, implying that a general low level of hydrophobicity, rather than depletion or enrichment in specific amino acids, contributes to the intrinsic disorder of the excised p57 CDK inhibition domain.

15.
Designed armadillo repeat proteins (dArmRP) are α-helical solenoid repeat proteins with an extended peptide binding groove that were engineered to develop a generic modular technology for peptide recognition. In this context, the term "peptide" not only denotes a short unstructured chain of amino acids, but also an unstructured region of a protein, as they occur in termini, loops, or linkers between folded domains. Here we report two crystal structures of dArmRPs, in complex with peptides fused either to the N-terminus of Green Fluorescent Protein or to the C-terminus of a phage lambda protein D. These structures demonstrate that dArmRPs bind unfolded peptides in the intended conformation also when they constitute unstructured parts of folded proteins, which greatly expands possible applications of the dArmRP technology. Nonetheless, the structures do not fully reflect the binding behavior in solution, that is, some binding sites remain unoccupied in the crystal and even unexpected peptide residues appear to be bound. We show how these differences can be explained by restrictions of the crystal lattice or the composition of the crystallization solution. This illustrates that crystal structures have to be interpreted with caution when protein-peptide interactions are characterized, and should always be correlated with measurements in solution.

16.
We have identified that the collagen helix has the potential to be disruptive to analyses of intrinsically disordered proteins. The collagen helix is an extended fibrous structure that is both promiscuous and repetitive. Whilst its sequence is predicted to be disordered, this type of protein structure is not typically considered as intrinsic disorder. Here, we show that collagen-encoding proteins skew the distribution of exon lengths in genes. We find that previous results, demonstrating that exons encoding disordered regions are more likely to be symmetric, are due to the abundance of the collagen helix. Other related results, showing increased levels of alternative splicing in disorder-encoding exons, still hold after considering collagen-containing proteins. Aside from analyses of exons, we find that the set of proteins that contain collagen significantly alters the amino acid composition of regions predicted as disordered. We conclude that research in this area should be conducted in the light of the collagen helix.

17.
Fibrillar inclusions are a characteristic feature of the neuropathology found in the alpha-synucleinopathies such as Parkinson's disease, dementia with Lewy bodies, and multiple system atrophy. Familial forms of alpha-synucleinopathies have also been linked with missense mutations or gene multiplications that result in higher protein expression levels. In order to form these fibrils, the protein, alpha-synuclein (alpha-syn), must undergo a process of self-assembly in which its native state is converted from a disordered conformer into a beta-sheet-dominated form. Here, we have developed a novel polypeptide property calculator to locate and quantify relative propensities for beta-strand structure in the sequence of alpha-syn. The output of the algorithm, in the form of a simple x-y plot, was found to correlate very well with the location of the beta-sheet core in alpha-syn fibrils. In particular, the plot features three peaks, the largest of which is completely absent for the nonfibrillogenic protein, beta-syn. We also report similar significant correlations for the Alzheimer's disease-related proteins, Abeta and tau. A substantial region of alpha-syn is capable of converting from its disordered conformation into a long alpha-helical protein. We have developed the aforementioned algorithm to locate and quantify the alpha-helical hydrophobic moment in the amino acid sequence of alpha-syn. As before, the output of the algorithm, in the form of a simple x-y plot, was found to correlate very well with the location of alpha-helical structure in membrane bilayer-associated alpha-syn.
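
The alpha-helical hydrophobic moment referred to in the abstract above is commonly computed as the length of the vector sum of per-residue hydrophobicities, with successive residues rotated by 100° around the helix axis. The sketch below follows that standard definition as an illustration only. The abstract does not describe the authors' calculator, so the Kyte-Doolittle scale, the normalisation by window length, and the default 11-residue window are assumptions.

```python
import math

# Kyte-Doolittle hydropathy values (the choice of scale is an assumption).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydrophobic_moment(window, angle_deg=100.0):
    """Mean hydrophobic moment of one window; 100 degrees per residue models an alpha-helix."""
    delta = math.radians(angle_deg)
    sin_sum = sum(KYTE_DOOLITTLE[aa] * math.sin(n * delta) for n, aa in enumerate(window))
    cos_sum = sum(KYTE_DOOLITTLE[aa] * math.cos(n * delta) for n, aa in enumerate(window))
    return math.hypot(sin_sum, cos_sum) / len(window)


def moment_profile(seq, window=11):
    """Sliding-window profile: one value per window start, suitable for an x-y plot."""
    return [hydrophobic_moment(seq[i:i + window]) for i in range(len(seq) - window + 1)]
```

A window of identical residues spanning whole helical turns has a moment near zero because the contributions cancel, whereas an amphipathic stretch with hydrophobic residues on one face gives a large value. Plotting `moment_profile` against residue position yields the kind of x-y plot the abstract describes.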

18.
We have used the occluded surface algorithm to estimate the packing of both buried and exposed amino acid residues in protein structures. This method works equally well for buried residues and solvent-exposed residues in contrast to the commonly used Voronoi method that works directly only on buried residues. The atomic packing of individual globular proteins may vary significantly from the average packing of a large data set of globular proteins. Here, we demonstrate that these variations in protein packing are due to a complex combination of protein size, secondary structure composition and amino acid composition. Differences in protein packing are conserved in protein families of similar structure despite significant sequence differences. This conclusion indicates that quality assessments of packing in protein structures should include a consideration of various parameters including the packing of known homologous proteins. Also, modeling of protein structures based on homologous templates should take into account the packing of the template protein structure.

19.
20.
Intrinsically disordered proteins (IDPs) lack a well-defined three-dimensional structure under physiological conditions. Intrinsic disorder is a common phenomenon, particularly in multicellular eukaryotes, and is responsible for important protein functions including regulation and signaling. Many disease-related proteins are likely to be intrinsically disordered or to have disordered regions. In this paper, a new predictor model based on the Bayesian classification methodology is introduced to predict whether a given protein or protein region is intrinsically disordered or ordered using only its primary sequence. The method can incorporate length-dependent amino acid compositional differences of disordered regions by including separate statistical representations for short, middle and long disordered regions. The predictor was trained on a constructed data set of protein regions with known structural properties. In a jackknife test, the predictor achieved a sensitivity of 89.2% for disordered and 81.4% for ordered regions. Our method outperformed several reported predictors when evaluated on the previously published data set of Prilusky et al. [2005. FoldIndex: a simple tool to predict whether a given protein sequence is intrinsically unfolded. Bioinformatics 21 (16), 3435-3438]. A further strength of our approach is its ease of implementation.
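
The Bayesian classification idea in the abstract above can be illustrated with a minimal naive Bayes sketch over amino acid composition. Everything here is an assumption made for illustration: the abstract does not give the model's form, so the independence assumption, the add-one smoothing, and the label names (for example `"disordered_short"` or `"disordered_long"` to mimic length-dependent classes) are hypothetical.

```python
import math
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def train(examples):
    """Fit per-class residue frequencies from (sequence, label) pairs.
    Returns {label: (log_prior, {residue: log_frequency})}."""
    counts, totals = {}, Counter()
    for seq, label in examples:
        counts.setdefault(label, Counter()).update(aa for aa in seq if aa in AMINO_ACIDS)
        totals[label] += 1
    n_examples = sum(totals.values())
    model = {}
    for label, residue_counts in counts.items():
        denom = sum(residue_counts.values()) + len(AMINO_ACIDS)  # add-one smoothing
        log_freq = {aa: math.log((residue_counts[aa] + 1) / denom) for aa in AMINO_ACIDS}
        model[label] = (math.log(totals[label] / n_examples), log_freq)
    return model


def classify(model, seq):
    """Return the label with the highest log posterior for a sequence region."""
    scores = {
        label: log_prior + sum(log_freq[aa] for aa in seq if aa in log_freq)
        for label, (log_prior, log_freq) in model.items()
    }
    return max(scores, key=scores.get)
```

Training separate classes for short, middle and long disordered regions, then mapping all three to "disordered" at prediction time, is one simple way to obtain the length-dependent representation the abstract describes.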
