Similar literature
20 similar documents found (search time: 15 ms)
1.
Intrinsically disordered regions have been associated with various cellular processes and are implicated in several human diseases, but their exact roles remain unclear. We previously defined two classes of conserved disordered regions in budding yeast, referred to as “flexible” and “constrained” conserved disorder. In flexible disorder, the property of disorder has been positionally conserved during evolution, whereas in constrained disorder, both the amino acid sequence and the property of disorder have been conserved. Here, we show that flexible and constrained disorder are widespread in the human proteome, and are particularly common in proteins with regulatory functions. Both classes of disordered sequences are highly enriched in regions of proteins that undergo tissue-specific (TS) alternative splicing (AS), but not in regions of proteins that undergo general (i.e., not tissue-regulated) AS. Flexible disorder is more highly enriched in TS alternative exons, whereas constrained disorder is more highly enriched in exons that flank TS alternative exons. These latter regions are also significantly more enriched in potential phosphosites and other short linear motifs associated with cell signaling. We further show that cancer driver mutations are significantly enriched in regions of proteins associated with TS and general AS. Collectively, our results point to distinct roles for TS alternative exons and flanking exons in the dynamic regulation of protein interaction networks in response to signaling activity, and they further suggest that alternatively spliced regions of proteins are often functionally altered by mutations responsible for cancer.

2.
Although the members of the largest subfamily of the EF-hand proteins, S100 proteins, are evolutionarily young, their functional diversity is extremely broad, partly due to their ability to adapt to various targets. This feature is a hallmark of intrinsically disordered proteins (IDPs), but none of the S100 proteins are recognized as IDPs. S100 proteins are predicted to be enriched in intrinsic disorder, with 62% of them being predicted to be disordered by at least one of the predictors: 31% are recognized as 'molten globules' and 15% are shown to be in extended disordered form. The disorder level of predicted disordered S100 regions is conserved compared to that of more structured regions. The central disordered stretch corresponds to the major part of the pseudo EF-hand loop, helix II, the hinge region, and an initial part of helix III. It contains about half of the known sites of enzymatic post-translational modifications (PTMs), confirming that this region can be flexible in vivo. Most of the internal residues missing in tertiary structures belong to the hinge. Both the hinge and the pseudo EF-hand loop correspond to the local maxima of the PONDR® VSL2 score and are shown to be evolutionary hotspots, leading to gain of new functional properties. The action of PTMs is shown to be destabilizing, in contrast with the effect of metal binding or S100 dimerization. Formation of the S100 heterodimers relies on the interplay between the structural rigidity of one of the S100 monomers and the flexibility of the other monomer. The ordered regions dominate in the S100 homodimerization sites. Target-binding sites generally consist of distant regions, drastically differing in their disorder level. The disordered region comprising most of the hinge and the N-terminal half of helix III is virtually not involved in dimerization, being intended solely for target recognition. The structural flexibility of this region is essential for recognition of diverse target proteins.
At least 86% of multiple interactions of S100 proteins with binding partners are attributed to the S100 proteins predicted to be disordered. Overall, intrinsic disorder is inherent to many S100 proteins and is vital for the activity and functional diversity of the family.

3.
Conformational changes in proteins often involve secondary structure transitions. Such transitions can be divided into two types: disorder‐to‐order changes, in which a disordered segment acquires an ordered secondary structure (e.g., disorder to α‐helix, disorder to β‐strand), and order‐to‐order changes, where a segment switches from one ordered secondary structure to another (e.g., α‐helix to β‐strand, α‐helix to turn). In this study, we explore the distribution of these transitions in the proteome. Using a comprehensive, yet highly conservative method, we compared solved three‐dimensional structures of identical protein sequences, looking for differences in the secondary structures assigned to them. Protein chains in which such secondary structure transitions were detected were classified into two sets according to the type of transition involved (disorder‐to‐order or order‐to‐order), allowing us to characterize each set by examining enrichment of gene ontology terms. The results reveal that the disorder‐to‐order set is significantly enriched with nucleotide binding proteins, whereas the order‐to‐order set is more diverse. Remarkably, further examination reveals that >22% of the purine nucleotide binding proteins include segments which undergo disorder‐to‐order transitions, suggesting that such transitions play an important role in this process. Proteins 2010. © 2009 Wiley‐Liss, Inc.
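The comparison described in this abstract can be illustrated with a minimal Python sketch. It is not the authors' method: the one-letter state alphabet, the `-` symbol for disorder, the `min_len` cutoff, and the function name `transitions` are all assumptions made for illustration. Given two per-residue secondary structure strings for the same sequence, it reports each differing segment and labels it by transition type.

```python
from itertools import groupby

DISORDER = "-"  # assumed symbol for residues with no ordered assignment


def transitions(ss_a: str, ss_b: str, min_len: int = 3):
    """Return (start, end, state_a, state_b, kind) for each differing segment.

    ss_a and ss_b are per-residue secondary structure strings for two solved
    structures of the same sequence. Two structures carry no direction, so
    "disorder-to-order" here covers any segment disordered in exactly one.
    """
    if len(ss_a) != len(ss_b):
        raise ValueError("assignments must cover the same sequence")
    found = []
    pos = 0
    # Group consecutive residues that share the same (state_a, state_b) pair.
    for (a, b), group in groupby(zip(ss_a, ss_b)):
        n = len(list(group))
        if a != b and n >= min_len:
            kind = "disorder-to-order" if DISORDER in (a, b) else "order-to-order"
            found.append((pos, pos + n, a, b, kind))
        pos += n
    return found


if __name__ == "__main__":
    for hit in transitions("---HHHHEEEE", "HHHHHHHHHHH"):
        print(hit)
```

The `min_len` cutoff stands in for the "highly conservative" filtering the abstract mentions; the actual criteria are not given in the text.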

4.
We have discovered that positions of splice junctions in genes are constrained by the tolerance for disorder-promoting amino acids in the translated protein region. It is known that efficient splicing requires nucleotide bias at the splice junction; the preferred usage produces a distribution of amino acids that is disorder-promoting. We observe that efficiency of splicing, as seen in the amino-acid distribution, is not compromised to accommodate globular structure. Thus we infer that it is the positions of splice junctions in the gene that must be under constraint by the local protein environment. Examining exonic splicing enhancers found near the splice junction in the gene reveals that these short DNA motifs are more prevalent in exons that encode disordered protein regions than in exons encoding structured regions. Thus we also conclude that local protein features constrain efficient splicing more in structure than in disorder.

5.
Many protein regions have been shown to be intrinsically disordered, lacking unique structure under physiological conditions. These intrinsically disordered regions are not only very common in proteomes, but also crucial to the function of many proteins, especially those involved in signaling, recognition, and regulation. The goal of this work was to identify the prevalence, characteristics, and functions of conserved disordered regions within protein domains and families. A database was created to store the amino acid sequences of nearly one million proteins and their domain matches from the InterPro database, a resource integrating eight different protein family and domain databases. Disorder prediction was performed on these protein sequences. Regions of sequence corresponding to domains were aligned using a multiple sequence alignment tool. From this initial information, regions of conserved predicted disorder were found within the domains. The methodology for this search consisted of finding regions of consecutive positions in the multiple sequence alignments in which 90% or more of the sequences were predicted to be disordered. This procedure was constrained to find such regions of conserved disorder prediction that were at least 20 amino acids in length. The results of this work included 3,653 regions of conserved disorder prediction, found within 2,898 distinct InterPro entries. Most regions of conserved predicted disorder detected were short, with less than 10% of those found exceeding 30 residues in length.
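The search rule stated in this abstract (consecutive alignment columns in which at least 90% of sequences are predicted disordered, spanning at least 20 positions) can be sketched in Python as follows. This is an illustrative reading, not the authors' code. The input encoding (`D` for disordered, `O` for ordered, `-` for a gap), the choice to count gaps as "not disordered", and the function name are assumptions.

```python
def conserved_disorder_regions(aligned_calls, min_fraction=0.9, min_length=20):
    """Find runs of alignment columns with conserved predicted disorder.

    aligned_calls: list of equal-length strings, one per aligned sequence,
    using 'D' (predicted disordered), 'O' (ordered) or '-' (gap).
    Returns half-open (start, end) column ranges.
    Assumption: the fraction is taken over all sequences, so gaps count
    against conservation.
    """
    if not aligned_calls:
        return []
    width = len(aligned_calls[0])
    if any(len(row) != width for row in aligned_calls):
        raise ValueError("all rows must have the same alignment width")

    n_seqs = len(aligned_calls)
    regions = []
    start = None  # first column of the run currently being extended
    for col in range(width):
        frac = sum(row[col] == "D" for row in aligned_calls) / n_seqs
        if frac >= min_fraction:
            if start is None:
                start = col
        else:
            if start is not None and col - start >= min_length:
                regions.append((start, col))
            start = None
    # Close a run that extends to the last column.
    if start is not None and width - start >= min_length:
        regions.append((start, width))
    return regions


if __name__ == "__main__":
    rows = ["O" * 5 + "D" * 25 + "O" * 5] * 9 + ["O" * 35]
    print(conserved_disorder_regions(rows))
```

Treating gaps as non-disordered is the stricter of the plausible choices; computing the fraction over non-gap residues only would be an equally reasonable variant.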

6.
Identifying relationships between function, amino acid sequence, and protein structure represents a major challenge. In this study, we propose a bioinformatics approach that identifies functional keywords in the Swiss-Prot database that correlate with intrinsic disorder. A statistical evaluation is employed to rank the significance of these correlations. Protein sequence data redundancy and the relationship between protein length and protein structure were taken into consideration to ensure the quality of the statistical inferences. Over 200,000 proteins from the Swiss-Prot database were analyzed using this approach. The predictions of intrinsic disorder were carried out using PONDR VL3E predictor of long disordered regions that achieves an accuracy of above 86%. Overall, out of the 710 Swiss-Prot functional keywords that were each associated with at least 20 proteins, 238 were found to be strongly positively correlated with predicted long intrinsically disordered regions, whereas 302 were strongly negatively correlated with such regions. The remaining 170 keywords were ambiguous without strong positive or negative correlation with the disorder predictions. These functions cover a large variety of biological activities and imply that disordered regions are characterized by a wide functional repertoire. Our results agree well with literature findings, as we were able to find at least one illustrative example of functional disorder or order shown experimentally for the vast majority of keywords showing the strongest positive or negative correlation with intrinsic disorder. This work opens a series of three papers, which enriches the current view of protein structure-function relationships, especially with regards to functionalities of intrinsically disordered proteins, and provides researchers with a novel tool that could be used to improve the understanding of the relationships between protein structure and function. 
The first paper of the series describes our statistical approach, outlines the major findings, and provides illustrative examples of biological processes and functions positively and negatively correlated with intrinsic disorder.

7.

Background

Intrinsically disordered proteins (IDPs) or proteins with disordered regions (IDRs) do not have a well-defined tertiary structure, but perform a multitude of functions, often relying on their native disorder to achieve the binding flexibility through changing to alternative conformations. Intrinsic disorder is frequently found in all three kingdoms of life, and may occur in short stretches or span whole proteins. To date most studies contrasting the differences between ordered and disordered proteins focused on simple summary statistics. Here, we propose an evolutionary approach to study IDPs, and contrast patterns specific to ordered protein regions and the corresponding IDRs.

Results

Two empirical Markov models of amino acid substitutions were estimated, based on a large set of multiple sequence alignments with experimentally verified annotations of disordered regions from the DisProt database of IDPs. We applied new methods to detect differences in Markovian evolution and evolutionary rates between IDRs and the corresponding ordered protein regions. Further, we investigated the distribution of IDPs among functional categories, biochemical pathways and their preponderance to contain tandem repeats.

Conclusions

We find significant differences in evolution between ordered and disordered regions of proteins. Most importantly, we find that disorder-promoting amino acids are more conserved in IDRs, indicating that in some cases not only amino acid composition but also the specific sequence is important for function. This conjecture is reinforced by the observation that for part of our data set IDRs evolve more slowly than the ordered parts of the proteins, while we still support the common view that IDRs in general evolve more quickly. The improvement in model fit indicates a possible improvement for various types of analyses; for example, de novo disorder prediction using a phylogenetic Hidden Markov Model based on our matrices showed performance similar to that of other disorder predictors.

8.
Proteins are elaborate biopolymers balancing between contradicting intrinsic propensities to fold, aggregate, or remain disordered. Assessing their primary structural preferences observable without evolutionary optimization has been reinforced by the recent identification of de novo proteins that have emerged from previously non-coding sequences. In this paper we investigate structural preferences of hypothetical proteins translated from random DNA segments using the standard genetic code and three of its proposed evolutionary predecessor models encoding 10, 6, and 4 amino acids, respectively. Our main assumption is that the disorder, aggregation, and transmembrane helix predictions used are able to reflect the differences in the trends of the protein sets investigated. We found that the 10-residue code encodes proteins that resemble modern proteins in their predicted structural properties. All of the investigated early genetic codes give rise to proteins with enhanced disorder and diminished aggregation propensities. Our results suggest that an ancestral genetic code similar to the proposed 10-residue one is capable of encoding functionally diverse proteins but these might have existed under conditions different from today’s common physiological ones. The existence of a protein functional repertoire for the investigated earlier stages that is quite distinct from today’s can be deduced from the presented results.
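The first step of the procedure in this abstract, translating random DNA segments with the standard genetic code, can be sketched in Python as below. This is a minimal illustration under stated assumptions: uniform base frequencies, translation in a single reading frame, and stopping at the first stop codon are choices made here, not details taken from the abstract. The reduced 10-, 6-, and 4-amino-acid codes are not shown because the text does not specify their codon assignments; each would be supplied as a different `table`.

```python
import random
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def random_dna(length: int, rng: random.Random) -> str:
    """Random DNA with uniform base frequencies (an assumption)."""
    return "".join(rng.choice("ACGT") for _ in range(length))


def translate(dna: str, table=CODON_TABLE) -> str:
    """Translate reading frame 0 until the first stop codon or the end."""
    protein = []
    for i in range(0, len(dna) - 2, 3):
        aa = table[dna[i:i + 3]]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so runs are repeatable
    segment = random_dna(300, rng)
    print(translate(segment))
```

The resulting hypothetical protein sequences would then be passed to disorder, aggregation, and transmembrane helix predictors, which are outside the scope of this sketch.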

9.
Circular dichroism (CD) spectroscopy is a valuable method for defining canonical secondary structure contents of proteins based on empirically‐defined spectroscopic signatures derived from proteins with known three‐dimensional structures. Many proteins identified as being “Intrinsically Disordered Proteins” have a significant amount of their structure that is neither sheet, helix, nor turn; this type of structure is often classified by CD as “other”, “random coil”, “unordered”, or “disordered”. However, the “other” category can also include polyproline II (PPII)‐type structures, whose spectral properties have not been well‐distinguished from those of unordered structures. In this study, synchrotron radiation circular dichroism spectroscopy was used to investigate the spectral properties of collagen and polyproline, which both contain PPII‐type structures. Their native spectra were compared as representatives of PPII structures. In addition, their spectra before and after treatment with various conditions to produce unfolded or denatured structures were also compared, with the aim of defining the differences between CD spectra of PPII and disordered structures. We conclude that the spectral features of collagen are more appropriate than those of polyproline for use as the representative spectrum for PPII structures present in typical amino acid‐containing proteins, and that the single most characteristic spectroscopic feature distinguishing a PPII structure from a disordered structure is the presence of a positive peak around 220 nm in the former but not in the latter. These spectra are now available for inclusion in new reference data sets used for CD analyses of the secondary structures of soluble proteins.

10.
Type I collagen is the fundamental component of the extracellular matrix. Its α1 gene is the direct descendant of ancestral fibrillar collagen and contains 57 exons encoding the rod-like triple-helical COL domain. We trace the evolution of the COL domain from a primordial collagen 18 residues in length to its present 1014 residues, the limit of its possible length. In order to maintain and improve the essential structural features of collagen during evolution, exons can be added or extended only in permitted, non-random increments that preserve the position of spatially sensitive cross-linkage sites. Such sites cannot be maintained unless the twist of the triple helix is close to 30 amino acids per turn. Inspection of the gene structure of other long structural proteins, fibronectin and titin, suggests that their evolution might have been subject to similar constraints.

11.
The spliceosome is a molecular machine that performs the excision of introns from eukaryotic pre-mRNAs. In human cells, this macromolecular complex comprises five RNAs and over one hundred proteins. In recent years, many spliceosomal proteins have been found to exhibit intrinsic disorder, that is, to lack stable native three-dimensional structure in solution. Building on the previous body of proteomic, structural and functional data, we have carried out a systematic bioinformatics analysis of intrinsic disorder in the proteome of the human spliceosome. We discovered that almost half of the combined sequence of proteins abundant in the spliceosome is predicted to be intrinsically disordered, at least when the individual proteins are considered in isolation. The distribution of intrinsic order and disorder throughout the spliceosome is uneven, and is related to the various functions performed by the intrinsic disorder of the spliceosomal proteins in the complex. In particular, proteins involved in the secondary functions of the spliceosome, such as mRNA recognition, intron/exon definition and spliceosomal assembly and dynamics, are more disordered than proteins directly involved in assisting splicing catalysis. Conserved disordered regions in spliceosomal proteins are evolutionarily younger and less widespread than ordered domains of essential spliceosomal proteins at the core of the spliceosome, suggesting that disordered regions were added to a preexistent ordered functional core. Finally, the spliceosomal proteome contains a much higher amount of intrinsic disorder predicted to lack secondary structure than the proteome of the ribosome, another large RNP machine. This result agrees with the currently recognized different functions of proteins in these two complexes.

12.
13.
Here we study the properties and the evolution of proteins that constitute the Centrosome, the complex molecular assembly that regulates the division and differentiation of animal cells. We found that centrosomal proteins are predicted to be significantly enriched in disordered and coiled-coil regions, more phosphorylated and longer than control proteins of the same organism. Interestingly, the ratio of these properties in centrosomal and control proteins tends to increase with the number of cell types. We reconstructed indel evolution, finding that indels significantly increase disorder in both centrosomal and control proteins, at a rate that is typically larger along branches associated with a large increase in the number of cell types, and larger for centrosomal than for control proteins. Substitutions show a similar trend for coiled-coil regions, but they contribute less to the evolution of disorder. Our results suggest that the increase in the number of cell types in animal evolution is correlated with the gain of disordered and coiled-coil regions in centrosomal proteins, establishing a connection between organism and molecular complexity. We argue that the structural plasticity conferred to the Centrosome by disordered regions and phosphorylation plays an important role in its mechanical properties and its regulation in space and time.

14.
Intrinsically disordered proteins (IDPs) and proteins with long disordered regions are highly abundant in various proteomes. Despite their lack of well-defined ordered structure, these proteins and regions are frequently involved in crucial biological processes. Although in recent years these proteins have attracted the attention of many researchers, IDPs represent a significant challenge for structural characterization since these proteins can impact many of the processes in the structure determination pipeline. Here we investigate the effects of IDPs on the structure determination process and the utility of disorder prediction in selecting and improving proteins for structural characterization. Examination of the extent of intrinsic disorder in existing crystal structures found that relatively few protein crystal structures contain extensive regions of intrinsic disorder. Although intrinsic disorder is not the only cause of crystallization failures and many structured proteins cannot be crystallized, filtering out highly disordered proteins from structure-determination target lists is still likely to be cost effective. It is therefore desirable to exclude highly disordered proteins from structure-determination target lists, and we show that disorder prediction can be applied effectively to enrich structure determination pipelines with proteins more likely to yield crystal structures. For structural investigation of specific proteins, disorder prediction can be used to improve targets for structure determination. Finally, a framework for considering intrinsic disorder in the structure determination pipeline is proposed.

15.
Although most proteins conform to the classical one‐structure/one‐function paradigm, an increasing number of proteins with dual structures and functions have been discovered. In response to cellular stimuli, such proteins undergo structural changes sufficiently dramatic to remodel even their secondary structures and domain organization. This “fold‐switching” capability fosters protein multi‐functionality, enabling cells to establish tight control over various biochemical processes. Accurate predictions of fold‐switching proteins could both suggest underlying mechanisms for uncharacterized biological processes and reveal potential drug targets. Recently, we developed a prediction method for fold‐switching proteins using structure‐based thermodynamic calculations and discrepancies between predicted and experimentally determined protein secondary structure (Porter and Looger, Proc Natl Acad Sci U S A 2018; 115:5968–5973). Here we seek to leverage the negative information found in these secondary structure prediction discrepancies. To do this, we quantified secondary structure prediction accuracies of 192 known fold‐switching regions (FSRs) within solved protein structures found in the Protein Data Bank (PDB). We find that the secondary structure prediction accuracies for these FSRs vary widely. Inaccurate secondary structure predictions are strongly associated with fold‐switching proteins compared to equally long segments of non‐fold‐switching proteins selected at random. These inaccurate predictions are enriched in helix‐to‐strand and strand‐to‐coil discrepancies. Finally, we find that most proteins with inaccurate secondary structure predictions are underrepresented in the PDB compared with their alternatively folded cognates, suggesting that unequal representation of fold‐switching conformers within the PDB could be an important cause of inaccurate secondary structure predictions. 
These results demonstrate that inconsistent secondary structure predictions can serve as a useful preliminary marker of fold switching.

16.
More than just tails: intrinsic disorder in histone proteins   (Total citations: 2; self-citations: 0; citations by others: 2)
Many biologically active proteins are disordered as a whole, or contain long disordered regions. These intrinsically disordered proteins/regions are very common in nature, abundantly found in all organisms, where they carry out important biological functions. The functions of these proteins complement the functional repertoire of "normal" ordered proteins, and many protein functional classes are heavily dependent on intrinsic disorder. Among these disorder-centric functions are interactions with nucleic acids and protein complex assembly. In this study, we present the results of comprehensive bioinformatics analyses of the abundance and roles of intrinsic disorder in 2007 histones from 746 species. We show that all the members of the histone family are intrinsically disordered proteins. Furthermore, intrinsic disorder is not only abundant in histones, but is absolutely necessary for various histone functions, starting from heterodimerization to formation of higher order oligomers, to interactions with DNA and other proteins, and to posttranslational modifications.

17.
Intrinsically disordered proteins (IDPs) are proteins without fixed three-dimensional structures under physiological conditions. Although experiments suggest that the conformations of IDPs can vary from random coils, semi-compact globules, to compact globules with different contents of secondary structures, computational efforts to separate IDPs into different states have not yet been successful. Recently, we developed a neural-network-based disorder prediction technique, SPINE-D, that was ranked as one of the top performing techniques for disorder prediction in the biennial meeting of the Critical Assessment of Structure Prediction techniques (CASP 9, 2010). Here, we further analyze the results from SPINE-D prediction by defining a semi-disordered state that has about 50% predicted probability to be disordered or ordered. This semi-disordered state is partially collapsed with intermediate levels of predicted solvent accessibility and secondary structure content. The relative difference in compositions between semi-disordered and fully disordered regions is highly correlated with amyloid aggregation propensity (a correlation coefficient of 0.86 when excluding four charged residues and proline, 0.73 otherwise). In addition, we observed that some semi-disordered regions participate in induced folding, and others play key roles in protein aggregation. More specifically, a semi-disordered region is amyloidogenic in fully unstructured proteins (such as alpha-synuclein and Sup35) but prone to local unfolding that exposes the hydrophobic core to aggregation in structured globular proteins (such as SOD1 and lysozyme). A transition from full disorder to semi-disorder at about 30–40 Qs is observed in the poly-Q (poly-glutamine) tract of huntingtin. The accuracy of using semi-disorder to predict binding-induced folding and aggregation is compared with several methods trained for the purpose.
These results indicate the usefulness of three-state classification (order, semi-disorder, and full-disorder) in distinguishing nonfolding from induced-folding and aggregation-resistant from aggregation-prone IDPs and in locating weakly stable, locally unfolding, and potentially aggregating regions in structured proteins. A comparison with five representative disorder-prediction methods showed that SPINE-D is the only method with a clear separation of semi-disorder from ordered and fully disordered states.

18.
Cheng Y, LeGall T, Oldfield CJ, Dunker AK, Uversky VN. Biochemistry, 2006, 45(35): 10448-10460
Evidence that many protein regions and even entire proteins lacking stable tertiary and/or secondary structure in solution (i.e., intrinsically disordered proteins) might be involved in protein-protein interactions, regulation, recognition, and signal transduction is rapidly accumulating. These signaling proteins play a crucial role in the development of several pathological conditions, including cancer. To test a hypothesis that intrinsic disorder is also abundant in cardiovascular disease (CVD), a data set of 487 CVD-related proteins was extracted from SWISS-PROT. CVD-related proteins are depleted in major order-promoting residues (Trp, Phe, Tyr, Ile, and Val) and enriched in some disorder-promoting residues (Arg, Gln, Ser, Pro, and Glu). The application of a neural network predictor of naturally disordered regions (PONDR VL-XT) together with cumulative distribution function (CDF) analysis, charge-hydropathy plot (CH plot) analysis, and alpha-helical molecular recognition feature (alpha-MoRF) indicator revealed that CVD-related proteins are enriched in intrinsic disorder. In fact, the percentage of proteins with 30 or more consecutive residues predicted by PONDR VL-XT to be disordered was 57 ± 4% for CVD-associated proteins. This value is close to that described earlier for signaling proteins (66 ± 6%) and is significantly larger than the content of intrinsic disorder in eukaryotic proteins from SWISS-PROT (47 ± 4%) and in nonhomologous protein segments with a well-defined three-dimensional structure (13 ± 4%). Furthermore, CDF and CH-plot analyses revealed that 120 and 36 CVD-related proteins, respectively, are wholly disordered. This high level of intrinsic disorder could be important for the function of CVD-related proteins and for the control and regulation of processes associated with cardiovascular disease. In agreement with this hypothesis, 198 alpha-MoRFs were predicted in 101 proteins from the CVD data set.
A comparison of disorder predictions with the experimental structural and functional data for a subset of the CVD-associated proteins indicated good agreement between predictions and observations. Thus, our data suggest that intrinsically disordered proteins might play key roles in cardiovascular disease.
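The statistic quoted in this abstract, the percentage of proteins with 30 or more consecutive residues predicted to be disordered, can be computed from per-residue disorder scores as in the Python sketch below. It does not call PONDR VL-XT. The scores are assumed to be supplied already, and the 0.5 score cutoff, the dictionary input format, and the function names are assumptions made for illustration.

```python
def longest_disordered_run(scores, threshold=0.5):
    """Length of the longest stretch of consecutive residues whose
    disorder score is at or above the (assumed) threshold."""
    best = run = 0
    for score in scores:
        run = run + 1 if score >= threshold else 0
        best = max(best, run)
    return best


def fraction_with_long_disorder(proteins, min_run=30, threshold=0.5):
    """Fraction of proteins containing a predicted disordered region of
    at least min_run consecutive residues.

    proteins: dict mapping a protein identifier to its per-residue scores.
    """
    if not proteins:
        raise ValueError("no proteins supplied")
    hits = sum(
        longest_disordered_run(scores, threshold) >= min_run
        for scores in proteins.values()
    )
    return hits / len(proteins)


if __name__ == "__main__":
    # Toy scores with hypothetical identifiers, not real predictor output.
    toy = {
        "protA": [0.9] * 30 + [0.1] * 10,
        "protB": [0.9] * 29 + [0.1] + [0.9] * 29,
        "protC": [0.1] * 50,
        "protD": [0.6] * 40,
    }
    print(fraction_with_long_disorder(toy))
```

The "±" values in the abstract are uncertainty estimates on such a fraction; how they were obtained is not stated in the text, so no error estimate is included here.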

19.
Type VII collagen (Col7) is important for skin integrity. As a major component of the anchoring fibrils, Col7 is essential for linking different skin layers together. The central collagenous domain of Col7 contains several interruptions of the collagen triple helix. The longest interruption is 39 amino acids long and referred to as the hinge region. The hinge region is highly conserved between species. This region was predicted to adopt a coiled coil structure and to serve as the trimerization domain of Col7. To gain insight into the potential function of the hinge region, we investigated a heterologously expressed peptide by CD and NMR spectroscopy. CD spectroscopy implies that the hinge region is intrinsically disordered. Resonance assignment was performed and allowed secondary structure analysis based on the chemical shift values. Seven amino acids in the N-terminal moiety show residual α-helical conformation. Subsequent investigation of the temperature dependency of amide chemical shifts indicated participation in hydrogen bonding of amino acid residues in the C-terminal moiety of the hinge region. Therefore, the hinge region does not form a coiled coil structure under the employed experimental conditions. The intrinsic disorder of the hinge region might provide the flexibility needed to serve as a “hinge”, or the hinge region might be an important interaction site, as is typically observed for intrinsically disordered proteins.

20.
Serine/arginine-rich (SR) splicing factors play an important role in constitutive and alternative splicing as well as during several steps of RNA metabolism. Despite the wealth of functional information about SR proteins accumulated to date, structural knowledge about the members of this family is very limited. To gain a better insight into structure-function relationships of SR proteins, we performed extensive sequence analysis of SR protein family members and combined it with ordered/disordered structure predictions. We found that SR proteins have properties characteristic of intrinsically disordered (ID) proteins. The amino acid composition and sequence complexity of SR proteins were very similar to those of disordered protein regions. More detailed analysis showed that the SR proteins, and their RS domains in particular, are enriched in disorder-promoting residues and depleted in order-promoting residues as compared to the entire human proteome. Moreover, disorder predictions indicated that RS domains of SR proteins were completely unstructured. Two different classification methods, the charge-hydropathy measure and the cumulative distribution function (CDF) of the disorder scores, were in agreement with each other, and they both strongly predicted members of the SR protein family to be disordered. This study emphasizes the importance of the disordered structure for several functions of SR proteins, such as for spliceosome assembly and for interaction with multiple partners. In addition, it demonstrates the usefulness of order/disorder predictions for inferring protein structure from sequence.
