Similar literature
Found 20 similar documents (search time: 15 ms)
1.
Biologically active proteins without stable ordered structure (i.e., intrinsically disordered proteins) are attracting increased attention. Functional repertoires of ordered and disordered proteins are very different, and the ability to differentiate whether a given function is associated with intrinsic disorder or with a well-folded protein is crucial for modern protein science. However, there is a large gap between the number of proteins experimentally confirmed to be disordered and their actual number in nature. As a result, studies of functional properties of confirmed disordered proteins, while helpful in revealing the functional diversity of protein disorder, provide only a limited view. To overcome this problem, a bioinformatics approach for comprehensive study of functional roles of protein disorder was proposed in the first paper of this series (Xie, H.; Vucetic, S.; Iakoucheva, L. M.; Oldfield, C. J.; Dunker, A. K.; Obradovic, Z.; Uversky, V. N. Functional anthology of intrinsic disorder. 1. Biological processes and functions of proteins with long disordered regions. J. Proteome Res. 2007, 5, 1882-1898). Applying this novel approach to Swiss-Prot sequences and functional keywords, we found over 238 and 302 keywords to be strongly positively or negatively correlated, respectively, with long intrinsically disordered regions. This paper describes approximately 90 Swiss-Prot keywords attributed to the cellular components, domains, technical terms, developmental processes, and coding sequence diversities possessing strong positive and negative correlation with long disordered regions.

2.
3.

Background

Intrinsically disordered proteins (IDPs) or proteins with disordered regions (IDRs) do not have a well-defined tertiary structure, but perform a multitude of functions, often relying on their native disorder to achieve the binding flexibility through changing to alternative conformations. Intrinsic disorder is frequently found in all three kingdoms of life, and may occur in short stretches or span whole proteins. To date most studies contrasting the differences between ordered and disordered proteins focused on simple summary statistics. Here, we propose an evolutionary approach to study IDPs, and contrast patterns specific to ordered protein regions and the corresponding IDRs.

Results

Two empirical Markov models of amino acid substitutions were estimated, based on a large set of multiple sequence alignments with experimentally verified annotations of disordered regions from the DisProt database of IDPs. We applied new methods to detect differences in Markovian evolution and evolutionary rates between IDRs and the corresponding ordered protein regions. Further, we investigated the distribution of IDPs among functional categories, biochemical pathways and their preponderance to contain tandem repeats.

Conclusions

We find significant differences in the evolution between ordered and disordered regions of proteins. Most importantly, we find that disorder-promoting amino acids are more conserved in IDRs, indicating that in some cases not only amino acid composition but the specific sequence is important for function. This conjecture is also reinforced by the observation that for of our data set IDRs evolve more slowly than the ordered parts of the proteins, while we still support the common view that IDRs in general evolve more quickly. The improvement in model fit indicates a possible improvement for various types of analyses; for example, de novo disorder prediction using a phylogenetic Hidden Markov Model based on our matrices showed a performance similar to other disorder predictors.

4.
Identifying relationships between function, amino acid sequence, and protein structure represents a major challenge. In this study, we propose a bioinformatics approach that identifies functional keywords in the Swiss-Prot database that correlate with intrinsic disorder. A statistical evaluation is employed to rank the significance of these correlations. Protein sequence data redundancy and the relationship between protein length and protein structure were taken into consideration to ensure the quality of the statistical inferences. Over 200,000 proteins from the Swiss-Prot database were analyzed using this approach. The predictions of intrinsic disorder were carried out using PONDR VL3E predictor of long disordered regions that achieves an accuracy of above 86%. Overall, out of the 710 Swiss-Prot functional keywords that were each associated with at least 20 proteins, 238 were found to be strongly positively correlated with predicted long intrinsically disordered regions, whereas 302 were strongly negatively correlated with such regions. The remaining 170 keywords were ambiguous without strong positive or negative correlation with the disorder predictions. These functions cover a large variety of biological activities and imply that disordered regions are characterized by a wide functional repertoire. Our results agree well with literature findings, as we were able to find at least one illustrative example of functional disorder or order shown experimentally for the vast majority of keywords showing the strongest positive or negative correlation with intrinsic disorder. This work opens a series of three papers, which enriches the current view of protein structure-function relationships, especially with regards to functionalities of intrinsically disordered proteins, and provides researchers with a novel tool that could be used to improve the understanding of the relationships between protein structure and function. 
The first paper of the series describes our statistical approach, outlines the major findings, and provides illustrative examples of biological processes and functions positively and negatively correlated with intrinsic disorder.

5.
6.
Proteins of the p53 family are expressed in vertebrates and in some invertebrate species. The main function of these proteins is to control and regulate the cell cycle in response to various cellular signals, and therefore to control the organism's development. The regulatory functions of the p53 family members originate mostly from their highly conserved and well-structured DNA-binding domains. Many human diseases (including various types of cancer) are related to missense mutations within this domain. The ordered DNA-binding domains of the p53 family members are surrounded by functionally important intrinsically disordered regions. In this study, substitution rates and propensities in different regions of p53 were analyzed. The analyses revealed that the ordered DNA-binding domain is conserved, whereas disordered regions are characterized by high sequence diversity. This diversity was reflected both in the number of substitutions and in the types of substitutions to which each amino acid was prone. These results support the existence of a positive correlation between protein intrinsic disorder and sequence divergence during the evolutionary process. This higher sequence divergence provides strong support for the existence of disordered regions in p53 in vivo: if they were structured, they would evolve at rates similar to the rest of the protein.

7.
Intrinsic protein disorder is a widespread phenomenon characterised by a lack of stable three-dimensional structures and is considered to play an important role in protein-protein interactions (PPIs). This study examined the genome-wide preference of disorder in PPIs by using exhaustive disorder prediction in human PPIs. We categorised the PPIs into three types (interaction between disordered proteins, interaction between structured proteins, and interaction between a disordered protein and a structured protein) with regard to the flexibility of molecular recognition and compared these three interaction types in an existing human PPI network with those in a randomised network. Although the structured regions were expected to become the identifiers for binding recognition, this comparative analysis revealed unexpected results. Interactions between disordered proteins occurred significantly more often than in the randomised network, and interactions between a disordered protein and a structured protein occurred significantly less often. We found that this propensity was much stronger in interactions between nonhub proteins. We also analysed the interaction types from a functional standpoint by using GO, which revealed that the interaction between disordered proteins frequently occurred in cellular processes, regulation, and metabolic processes. The number of interactions, especially in metabolic processes between disordered proteins, was 1.8 times as large as that in the randomised network. Another analysis conducted by using KEGG pathways showed that several signaling pathways and disease-related pathways included many interactions between disordered proteins. All of these analyses suggest that human PPIs preferentially occur between disordered proteins and that the flexibility of the interacting protein pairs may play an important role in human PPI networks.

8.
Mularoni L, Veitia RA, Albà MM. Genomics 2007, 89(3): 316-325
Single-amino-acid tandem repeats are very common in mammalian proteins but their function and evolution are still poorly understood. Here we investigate how the variability and prevalence of amino acid repeats are related to the evolutionary constraints operating on the proteins. We find a significant positive correlation between repeat size difference and protein nonsynonymous substitution rate in human and mouse orthologous genes. This association is observed for all the common amino acid repeat types and indicates that rapid diversification of repeat structures, involving both trinucleotide slippage and nucleotide substitutions, preferentially occurs in proteins subject to low selective constraints. However, strikingly, we also observe a significant negative correlation between the number of repeats in a protein and the gene nonsynonymous substitution rate, particularly for glutamine, glycine, and alanine repeats. This implies that proteins subject to strong selective constraints tend to contain an unexpectedly high number of repeats, which tend to be well conserved between the two species. This is consistent with a role for selection in the maintenance of a significant number of repeats. Analysis of the codon structure of the sequences encoding the repeats shows that codon purity is associated with high repeat size interspecific variability. Interestingly, polyalanine and polyglutamine repeats associated with disease show very distinctive features regarding the degree of repeat conservation and the protein sequence selective constraints.

9.
Pancsa R, Tompa P. PloS one 2012, 7(4): e34687
Based on early bioinformatic studies on a handful of species, the frequency of structural disorder of proteins is generally thought to be much higher in eukaryotes than in prokaryotes. To refine this view, we present here a comparative prediction study and analysis of 194 fully described eukaryotic proteomes and 87 reference prokaryotes for structural disorder. We found that structural disorder does distinguish eukaryotes from prokaryotes, but its frequency spans a very wide range in the two superkingdoms, and these ranges largely overlap. The number of disordered binding regions and different Pfam domain types also contribute to distinguishing eukaryotes from prokaryotes. Unexpectedly, the highest levels, and highest variability, of predicted disorder are found in protists, i.e. single-celled eukaryotes, often surpassing more complex eukaryotic organisms, plants and animals. This trend contrasts with that of the number of domain types, which increases rather monotonically toward more complex organisms. The level of structural disorder appears to be strongly correlated with lifestyle, because some obligate intracellular parasites and endosymbionts have the lowest levels, whereas host-changing parasites have the highest level of predicted disorder. We conclude that protists have been the evolutionary hot-bed of experimentation with structural disorder, in a period when structural disorder was actively invented and the major functional classes of disordered proteins established.

10.
Integrating gene expression into the protein-protein interaction network (PPIN) leads to the construction of tissue-specific (TS) and housekeeping (HK) sub-networks, with distinctive TS- and HK-hubs. All such hub proteins are divided into multi-interface (MI) hubs and single-interface (SI) hubs, where MI hubs evolve slower than SI hubs. Here we explored the evolutionary rate difference between MI and SI proteins within TS- and HK-PPIN and observed that this difference is present only in the TS class, not in the HK class. Next, we explored whether a protein's own properties or its partners' properties are more influential in this evolutionary discrepancy. Statistical analyses revealed that the evolutionary rate correlates negatively with a protein's own properties, such as expression level, miRNA count, conformational diversity and functional properties, and with its partners' properties, such as protein disorder and tissue expression similarity. Moreover, partial correlation and regression analysis revealed that both a protein's own properties and its partners' properties have independent effects on protein evolutionary rate.

11.
Intrinsically disordered proteins and intrinsically disordered protein regions are highly abundant in nature. However, the quantitative and qualitative measures of protein intrinsic disorder in species with known genomes are still not available. Furthermore, although the correlation between high fraction of disordered residues and advanced species has been reported, the details of this correlation and the connection between the disorder content and proteome complexity have not yet been reported. To fill this gap, we analysed entire proteomes of 3484 species from three domains of life (archaea, bacteria and eukaryotes) and from viruses. Our analysis revealed that the evolution process is characterized by distinctive patterns of changes in the protein intrinsic disorder content. We show here that viruses are characterized by the widest spread of the proteome disorder content (the percentage of disordered residues ranges from 7.3% in human coronavirus NL63 to 77.3% in Avian carcinoma virus). For several organisms, a clear correlation is seen between their disorder contents and habitats. In multicellular eukaryotes, there is a weak correlation between the complexity of an organism (evaluated as a number of different cell types) and its overall disorder content. For both the prokaryotes and eukaryotes, the disorder content is generally independent of the proteome size. However, disorder shows a sharp increase associated with the transition from prokaryotic to eukaryotic cells. This suggests that the increased disorder content in eukaryotic proteomes might be used by nature to deal with the increased cell complexity due to the appearance of the various cellular compartments.

12.
13.
Currently, the understanding of the relationships between function, amino acid sequence, and protein structure continues to represent one of the major challenges of modern protein science. As many as 50% of eukaryotic proteins are likely to contain functionally important long disordered regions. Many proteins are wholly disordered but still possess numerous biologically important functions. However, the number of experimentally confirmed disordered proteins with known biological functions is substantially smaller than their actual number in nature. Therefore, there is a crucial need for novel bioinformatics approaches that allow projection of the current knowledge from a few experimentally verified examples to much larger groups of known and potential proteins. The elaboration of a bioinformatics tool for the analysis of functional diversity of intrinsically disordered proteins and application of this data mining tool to >200 000 proteins from the Swiss-Prot database, each annotated with at least one of the 875 functional keywords, was described in the first paper of this series (Xie, H.; Vucetic, S.; Iakoucheva, L. M.; Oldfield, C. J.; Dunker, A. K.; Obradovic, Z.; Uversky, V.N. Functional anthology of intrinsic disorder. 1. Biological processes and functions of proteins with long disordered regions. J. Proteome Res. 2007, 5, 1882-1898). Using this tool, we have found that out of the 710 Swiss-Prot functional keywords associated with at least 20 proteins, 262 were strongly positively correlated with long intrinsically disordered regions, and 302 were strongly negatively correlated. Illustrative examples of functional disorder or order were found for the vast majority of keywords showing strongest positive or negative correlation with intrinsic disorder, respectively. Some 80 Swiss-Prot keywords associated with disorder- and order-driven biological processes and protein functions were described in the first paper (see above).
The second paper of the series was devoted to the presentation of 87 Swiss-Prot keywords attributed to the cellular components, domains, technical terms, developmental processes, and coding sequence diversities possessing strong positive and negative correlation with long disordered regions (Vucetic, S.; Xie, H.; Iakoucheva, L. M.; Oldfield, C. J.; Dunker, A. K.; Obradovic, Z.; Uversky, V. N. Functional anthology of intrinsic disorder. 2. Cellular components, domains, technical terms, developmental processes, and coding sequence diversities correlated with long disordered regions. J. Proteome Res. 2007, 5, 1899-1916). Protein structure and functionality can be modulated by various post-translational modifications and/or as a result of binding of specific ligands. Numerous human diseases are associated with protein misfolding/misassembly/misfunctioning. This work concludes the series of papers dedicated to the functional anthology of intrinsic disorder and describes approximately 80 Swiss-Prot functional keywords that are related to ligands, post-translational modifications, and diseases possessing strong positive or negative correlation with the predicted long disordered regions in proteins.

14.
15.
Conformational and functional flexibility promote protein evolvability. High evolvability allows related proteins to functionally diverge and perhaps to neostructuralize. p53 is a multifunctional protein frequently referred to as the Guardian of the Genome, a hub for, e.g., incoming and outgoing signals in apoptosis and DNA repair. p53 has been found to be structurally disordered, an extreme form of conformational flexibility. Here, p53, and its paralogs p63 and p73, were studied for further insights into the evolutionary dynamics of structural disorder, secondary structure, and phosphorylation. This study is focused on the post gene duplication phase for the p53 family in vertebrates, but also visits the origin of the protein family and the early domain loss and gain events. Functional divergence, measured by rapid evolutionary dynamics of protein domains, structural properties, and phosphorylation propensity, is inferred across vertebrate p53 proteins, in p63 and p73 from fish, and between the three paralogs. In particular, structurally disordered regions are redistributed among paralogs, but within clades redistribution of structural disorder also appears to be an ongoing process. Despite its deemed importance as the Guardian of the Genome, p53 is indeed a protein with high evolvability as seen not only in rearranged structural disorder, but also in fluctuating domain sequence signatures among lineages.

16.

Background

Does a relationship exist between a protein's evolutionary rate and its number of interactions? This relationship has been put forward many times, based on a biological premise that a highly interacting protein will be more restricted in its sequence changes. However, to date several studies have voiced conflicting views on the presence or absence of such a relationship.

Results

Here we perform a large-scale study over multiple data sets in order to demonstrate that the major reason for conflict between previous studies is the use of different but overlapping datasets. We show that a lack of correlation between evolutionary rate and number of interactions in a data set is related to the error rate. We also demonstrate that the correlation is not an artifact of the underlying distributions of evolutionary distance and interactions and is therefore likely to be biologically relevant. Further to this, we consider the claim that the dependence is due to gene expression levels and find some supporting evidence. A strong and positive correlation between the number of interactions and the age of a protein is also observed and we show this relationship is independent of expression levels.

Conclusion

A correlation between number of interactions and evolutionary rate is observed but is dependent on the accuracy of the dataset being used. However, it appears that the number of interactions a protein participates in depends more on the age of the protein than the rate at which it changes.

17.
We perform a large-scale study of intrinsically disordered regions in proteins and protein complexes using a non-redundant set of hundreds of different protein complexes. In accordance with the conventional view that folding and binding are coupled, in many of our cases the disorder-to-order transition occurs upon complex formation and can be localized to binding interfaces. Moreover, analysis of disorder in protein complexes depicts a significant fraction of intrinsically disordered regions, with up to one third of all residues being disordered. We find that the disorder in homodimers, especially in symmetrical homodimers, is significantly higher than in heterodimers and offer an explanation for this interesting phenomenon. We argue that the mechanisms of regulation of binding specificity through disordered regions in complexes can be as common as for unbound monomeric proteins. The fascinating diversity of roles of disordered regions in various biological processes and protein oligomeric forms shown in our study may be a subject of future endeavors in this area.

18.
Many prokaryotic organisms have adapted to incredibly extreme habitats. The genomes of such extremophiles differ from their non-extremophile relatives. For example, some proteins in thermophiles sustain high temperatures by being more compact than homologs in non-extremophiles. Conversely, some proteins have increased volumes to compensate for freezing effects in psychrophiles that survive in the cold. Here, we revealed that some differences in organisms surviving in extreme habitats correlate with a simple single feature, namely the fraction of proteins predicted to have long disordered regions. We predicted disorder with different methods for 46 completely sequenced organisms from diverse habitats and found a correlation between protein disorder and the extremity of the environment. More specifically, the overall percentage of proteins with long disordered regions tended to be more similar between organisms of similar habitats than between organisms of similar taxonomy. For example, predictions tended to detect substantially more proteins with long disordered regions in prokaryotic halophiles (survive high salt) than in their taxonomic neighbors. Another peculiar environment is that of high radiation survived, e.g. by Deinococcus radiodurans. The relatively high fraction of disorder predicted in this extremophile might provide a shield against mutations. Although our analysis fails to establish causation, the observed correlation between such a simplistic, coarse-grained, microscopic molecular feature (disorder content) and a macroscopic variable (habitat) remains stunning.

19.
Comparative analyses between human disease and non-disease genes are of great interest in understanding human disease gene evolution. However, the progression of neurodegenerative diseases (NDD) involving amyloid formation in specific brain regions is still unknown. Therefore, in this study, we mainly focused our analysis on the evolutionary features of human NDD genes with respect to non-disease genes. Here, we observed that human NDD genes are evolutionarily conserved relative to non-disease genes. To elucidate the conserved nature of NDD genes, we incorporated evolutionary attributes such as gene expression level, number of regulatory miRNAs, protein connectivity, intrinsic disorder content and relative aggregation propensity in our analysis. Our studies demonstrate that NDD genes have higher gene expression levels, consistent with their lower evolutionary rates. Additionally, we observed that NDD genes have a higher number of different regulatory miRNA target sites and more interaction partners than non-disease genes. Moreover, miRNA-targeted genes are known to have higher disorder content. In contrast, our analysis established that NDD genes have lower disorder content. In support of our analysis, we found that NDD gene-encoded proteins are enriched in multi-interface hubs (party hubs) with lower disorder content. Since proteins with higher disorder content need to adopt special structures to reduce their aggregation propensity, NDD proteins were found to have elevated relative aggregation propensity (RAP), consistent with their lower disorder content. Finally, our categorical regression analysis confirmed the relative dominance of protein connectivity, 3′UTR length, RAP, nature of hubs (singlish/multi interface) and disorder content in explaining the variation in evolutionary rates between human NDD genes and non-disease genes.

20.

Background

Intrinsically disordered regions are enriched in short interaction motifs that play a critical role in many protein-protein interactions. Since new short interaction motifs may easily evolve, they have the potential to rapidly change protein interactions and cellular signaling. In this work we examined the dynamics of gain and loss of intrinsically disordered regions in duplicated proteins to inspect if changes after genome duplication can create functional divergence. For this purpose we used Saccharomyces cerevisiae and the outgroup species Lachancea kluyveri.

Principal Findings

We find that genes duplicated as part of a genome duplication (ohnologs) are significantly more intrinsically disordered than singletons (p<2.2e-16, Wilcoxon), reflecting a preference for retaining intrinsically disordered proteins in duplicate. In addition, there have been marked changes in the extent of intrinsic disorder following duplication. A large number of duplicated genes have more intrinsic disorder than their L. kluyveri ortholog (29% for duplicates versus 25% for singletons) and an even greater number have less intrinsic disorder than the L. kluyveri ortholog (37% for duplicates versus 25% for singletons). Finally, we show that the number of physical interactions is significantly greater in the more intrinsically disordered ohnolog of a pair (p = 0.003, Wilcoxon).

Conclusion

This work shows that intrinsic disorder gain and loss in a protein is a mechanism by which a genome can also diverge and innovate. The higher number of interactors for proteins that have gained intrinsic disorder compared with their duplicates may reflect the acquisition of new interaction partners or new functional roles.
