Similar articles
1.
Many aspects of cell signalling, trafficking, and targeting are governed by interactions between globular protein domains and short peptide segments. These domains often bind multiple peptides that share a common sequence pattern, or “linear motif” (e.g., SH3 binding to PxxP). Many domains are known, though comparatively few linear motifs have been discovered. Their short length (three to eight residues) and the fact that they often reside in disordered regions of proteins make them difficult to detect through sequence comparison or experiment. Nevertheless, each new motif provides critical molecular details of how interaction networks are constructed, and can explain how one protein is able to bind to very different partners. Here we show that binding motifs can be detected using data from genome-scale interaction studies, thus avoiding the normally slow discovery process. Our approach, based on motif over-representation in non-homologous sequences, rediscovers known motifs and predicts dozens of others. Direct binding experiments reveal that two predicted motifs are indeed protein-binding modules: a DxxDxxxD protein phosphatase 1 binding motif with a KD of 22 μM and a VxxxRxYS motif that binds Translin with a KD of 43 μM. We estimate that there are dozens or even hundreds of linear motifs yet to be discovered that will give molecular insight into protein networks and greatly illuminate cellular processes.
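The motif notation used in this abstract (PxxP, DxxDxxxD, VxxxRxYS, with "x" standing for any residue) maps directly onto regular expressions. The following is a minimal sketch of scanning a sequence for such patterns; it is not the over-representation method the abstract describes. The function names and the toy sequence are hypothetical and exist only for illustration.

```python
import re

# Motif patterns quoted in the abstract; "x" means "any residue".
MOTIFS = {
    "SH3 ligand (PxxP)": "PxxP",
    "PP1 binding (DxxDxxxD)": "DxxDxxxD",
    "Translin binding (VxxxRxYS)": "VxxxRxYS",
}


def motif_to_regex(pattern):
    """Turn 'x' wildcards into '.' and wrap the pattern in a lookahead
    so that overlapping occurrences are all reported."""
    return re.compile("(?=(" + pattern.replace("x", ".") + "))")


def scan(sequence, motifs=MOTIFS):
    """Return {motif name: [(1-based start, matched segment), ...]}."""
    hits = {}
    for name, pattern in motifs.items():
        regex = motif_to_regex(pattern)
        hits[name] = [(m.start() + 1, m.group(1)) for m in regex.finditer(sequence)]
    return hits


if __name__ == "__main__":
    # Hypothetical toy sequence, not a real protein.
    toy_sequence = "MAPPLPPRDAADLLKDG"
    for name, found in scan(toy_sequence).items():
        print(name, found)
```

A plain pattern match of this kind reports every occurrence, functional or not, which is why the abstract relies on statistical over-representation across non-homologous interaction partners rather than on matching alone.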

2.
Natively unstructured or disordered regions appear to be abundant in eukaryotic proteins. Many such regions have been found alongside small linear binding motifs. We report a Monte Carlo study that aims to elucidate the role of disordered regions adjacent to such binding motifs. The coarse-grained simulations show that small hydrophobic peptides without disordered flanks tend to aggregate under conditions where peptides embedded in unstructured peptide sequences are stable as monomers or as part of small micelle-like clusters. Surprisingly, the binding free energy of the motif is barely decreased by the presence of disordered flanking regions, although it is sensitive to the loss of entropy of the motif itself upon binding. This latter effect allows for reversible binding of the signalling motif to the substrate. The work provides insights into a mechanism that prevents the aggregation of signalling peptides, distinct from the general mechanism of protein folding, and provides a testable hypothesis to explain the abundance of disordered regions in proteins.

3.
Intrinsically disordered proteins (IDPs) lack a stable tertiary structure in isolation. These proteins are often involved in molecular recognition processes via their disordered binding regions, which can recognize partner molecules by undergoing a coupled folding and binding process. The specific properties of disordered binding regions give rise to specific, yet transient interactions that enable IDPs to play central roles in signaling pathways and act as hubs of protein interaction networks. An alternative model of protein-protein interactions with largely overlapping functional properties is offered by the concept of linear interaction motifs. This approach focuses on distilling a short consensus sequence pattern from proteins with a common interaction partner. These motifs often reside in disordered regions and are considered to mediate the interaction largely independently of the rest of the protein. Although a connection between linear motifs and disordered binding regions has been established through common examples, the complementary nature of the two concepts has yet to be fully explored. In many cases the sequence-based definition of linear motifs and the structural-context-based definition of disordered binding regions describe two aspects of the same phenomenon. To gain insight into the connection between the two models, prediction methods were utilized. We combined the regular-expression-based prediction of linear motifs with the disordered binding region prediction method ANCHOR, each specialized for one of the two models, to get the best of both worlds. The thorough analysis of the overlap of the two methods offers a bioinformatics tool for more efficient binding site prediction that can serve a wide range of practical applications. At the same time it can also shed light on the theoretical connection between the two co-existing interaction models.

4.
Virus proteins alter protein pathways of the host toward the synthesis of viral particles by breaking and making edges via binding to host proteins. In this study, we developed a computational approach to predict viral sequence hotspots for binding to host proteins based on sequences of viral and host proteins and literature-curated virus-host protein interactome data. We use a motif discovery algorithm repeatedly on collections of sequences of viral proteins and immediate binding partners of their host targets, and choose only those motifs that are conserved on viral sequences and highly statistically enriched among binding partners of the host proteins targeted by virus proteins. Our predictions match experimental data on binding sites of Nef to host proteins such as MAPK1, VAV1, LCK, HCK, HLA-A, CD4, FYN, and GNB2L1 with high statistical significance, but the method is a poor predictor of Nef binding sites on highly flexible, hoop-like regions. Predicted hotspots recapture CD8 cell epitopes of HIV Nef, highlighting their importance in modulating virus-host interactions. Host proteins potentially targeted or outcompeted by Nef appear to crowd the T cell receptor, natural killer cell mediated cytotoxicity, and neurotrophin signaling pathways. Scanning of HIV Nef motifs on multiple alignments of hepatitis C protein NS5A produces results consistent with the literature, indicating the potential value of the hotspot discovery in advancing our understanding of virus-host crosstalk.

5.
Many protein functions can be traced to linear sequence motifs of fewer than five residues, which are often found within intrinsically disordered domains. In spite of their prevalence, their role in protein evolution is only beginning to be understood. The study of papillomaviruses has provided many insights into the evolution of protein structure and function. We have chosen the papillomavirus E7 oncoprotein as a model system for the evolution of functional linear motifs. The multiple functions of E7 proteins from paradigmatic papillomavirus types can be explained to a large extent in terms of five linear motifs within the intrinsically disordered N-terminal domain and two linear motifs within the globular homodimeric C-terminal domain. We examined the motif inventory of E7 proteins from over 200 known papillomavirus types and found that the motifs reported for paradigmatic papillomavirus types are absent from many uncharacterized E7 proteins. Several motif pairs occur more often than expected, suggesting that linear motifs may evolve and function in a cooperative manner. The E7 linear motifs have appeared or disappeared multiple times during papillomavirus evolution, confirming the evolutionary plasticity of short functional sequences. Four of the motifs appeared several times during papillomavirus evolution, providing direct evidence for convergent evolution. Interestingly, the evolution pattern of a motif is independent of its location in a globular or disordered domain. The correlation between the presence of some motifs and virus host specificity and tissue tropism suggests that linear motifs play a role in the adaptive evolution of papillomaviruses.

6.

Background

Linear motifs are short modules of protein sequences that play a crucial role in mediating and regulating many protein–protein interactions. The function of linear motifs strongly depends on the context, e.g. functional instances mainly occur inside flexible regions that are accessible for interaction. Sometimes linear motifs appear as isolated islands of conservation in multiple sequence alignments. However, they also occur in larger blocks of sequence conservation, suggesting an active role for the neighbouring amino acids.

Results

The evolution of regions flanking 116 functional linear motif instances was studied. The conservation of the amino acid sequence and the order/disorder tendency of those regions were related to the presence or absence of the instance. For the majority of the analysed instances, the pairs of sequences conserving the linear motif were also observed to maintain a similar local structural tendency and/or to have higher local sequence conservation when compared to pairs of sequences where one is missing the linear motif. Furthermore, those instances have a higher chance of co-evolving with the neighbouring residues than with the distant ones. Those findings are supported by examples where the regulation of the linear motif–mediated interaction has been shown to depend on modifications (e.g. phosphorylation) at neighbouring positions or is thought to benefit from the binding versatility of disordered regions.

Conclusion

The results suggest that flanking regions are relevant for linear motif–mediated interactions, both at the structural and sequence level. More interestingly, they indicate that the prediction of linear motif instances can be enriched with contextual information by performing a sequence analysis similar to the one presented here. This can facilitate the understanding of the role of these predicted instances in determining the protein function inside the broader context of the cellular network where they arise.

7.

Background  

Many proteins contain disordered regions that lack fixed three-dimensional (3D) structure under physiological conditions but have important biological functions. Prediction of disordered regions in protein sequences is important for understanding protein function and in high-throughput determination of protein structures. Machine learning techniques, including neural networks and support vector machines, have been widely used in such predictions. Predictors designed for long disordered regions are usually less successful in predicting short disordered regions. Combining prediction of short and long disordered regions will dramatically increase the complexity of the prediction algorithm and make the predictor unsuitable for large-scale applications. Efficient batch prediction of long disordered regions alone is of greater interest in large-scale proteome studies.

8.
9.
Protein–protein interactions are thought to be mediated by domains, which are autonomous folding units of proteins. Recently, a second type of interaction has been suggested, mediated by short segments termed linear motifs, which are related to recognition elements of intrinsically disordered regions. Here, we propose a third kind of protein–protein recognition mechanism, mediated by disordered regions longer than 20–30 residues. Bioinformatics predictions and well‐characterized examples, such as the kinase‐inhibitory domain of Cdk inhibitors and the Wiskott–Aldrich syndrome protein (WASP)‐homology domain 2 of actin‐binding proteins, show that these disordered regions conform to the definition of domains rather than motifs, i.e., they represent functional, evolutionary, and structural units. Their functions are distinct from those of short motifs and ordered domains, and establish a third kind of interaction principle. With these points, we argue that these long disordered regions should be recognized as a distinct class of biologically functional protein domains.

10.
Intrinsically disordered proteins (IDPs) lack a well-defined three-dimensional structure under physiological conditions. Intrinsic disorder is a common phenomenon, particularly in multicellular eukaryotes, and is responsible for important protein functions including regulation and signaling. Many disease-related proteins are likely to be intrinsically disordered or to have disordered regions. In this paper, a new predictor model based on the Bayesian classification methodology is introduced to predict whether a given protein or protein region is intrinsically disordered or ordered using only its primary sequence. The method allows the incorporation of length-dependent amino acid compositional differences of disordered regions by including separate statistical representations for short, middle and long disordered regions. The predictor was trained on a constructed data set of protein regions with known structural properties. In a jackknife test, the predictor achieved a sensitivity of 89.2% for disordered and 81.4% for ordered regions. Our method outperformed several reported predictors when evaluated on the previously published data set of Prilusky et al. [2005. FoldIndex: a simple tool to predict whether a given protein sequence is intrinsically unfolded. Bioinformatics 21 (16), 3435-3438]. A further strength of our approach is its ease of implementation.

11.
12.
Background

Short linear protein motifs are attracting increasing attention as functionally independent sites, typically 3-10 amino acids in length, that are enriched in disordered regions of proteins. Multiple methods have recently been proposed to discover over-represented motifs within a set of proteins based on simple regular expressions. Here, we extend these approaches to profile-based methods, which provide a richer motif representation.

Results

The profile motif discovery method MEME performed relatively poorly for motifs in disordered regions of proteins. However, when we applied evolutionary weighting to account for redundancy amongst homologous proteins, and masked out poorly conserved regions of disordered proteins, the performance of MEME was equivalent to that of regular expression methods. The two approaches nevertheless returned different subsets in both a benchmark dataset and a more realistic discovery dataset.

Conclusions

Profile-based motif discovery methods complement regular expression based methods. Whilst profile-based methods are computationally more intensive, they are likely to discover motifs currently overlooked by regular expression methods.

13.
Traditionally, protein-protein interactions were thought to be mediated by large, structured domains. However, it has become clear that the interactome comprises a wide range of binding interfaces with varying degrees of flexibility, ranging from rigid globular domains to disordered regions that natively lack structure. Enrichment for disorder in highly connected hub proteins and its correlation with organism complexity hint at the functional importance of disordered regions. Nevertheless, they have not yet been extensively characterised. Shifting the attention from globular domains to disordered regions of the proteome might bring us closer to elucidating the dense and complex connectivity of the interactome. An important class of disordered interfaces are the compact mono-partite, short linear motifs (SLiMs, or eukaryotic linear motifs (ELMs)). They are evolutionarily plastic and interact with relatively low affinity due to the limited number of residues that make direct contact with the binding partner. These features confer to SLiMs the ability to evolve convergently and mediate transient interactions, which is imperative to network evolution and to maintain robust cell signalling, respectively. The ability to discriminate biologically relevant SLiMs by means of different attributes will improve our understanding of the complexity of the interactome and aid development of bioinformatics tools for motif discovery. In this paper, the curated instances currently available in the Eukaryotic Linear Motif (ELM) database are analysed to provide a clear overview of the defining attributes of SLiMs. These analyses suggest that functional SLiMs have higher levels of conservation than their surrounding residues, frequently evolve convergently, preferentially occur in disordered regions and often form a secondary structure when bound to their interaction partner. 
These results advocate searching for small groupings of residues in disordered regions with higher relative conservation and a propensity to form secondary structure. Finally, the most interesting conclusions are examined in regard to their functional consequences.

14.
Short and long disordered regions of proteins have different preferences for different amino acid residues. Different methods often have to be trained to predict them separately. In this study, we developed a single neural-network-based technique called SPINE-D that makes a three-state prediction first (ordered residues and disordered residues in short and long disordered regions) and reduces it into a two-state prediction afterwards. SPINE-D was tested on various sets composed of different combinations of DisProt annotated proteins and proteins directly from the PDB annotated for disorder by missing coordinates in X-ray determined structures. While disorder annotations are different according to DisProt and X-ray approaches, SPINE-D's prediction accuracy and ability to predict disorder are relatively independent of how the method was trained and what type of annotation was employed but strongly depend on the balance in the relative populations of ordered and disordered residues in short and long disordered regions in the test set. With greater than 85% overall specificity for detecting residues in both short and long disordered regions, the residues in long disordered regions are easier to predict at 81% sensitivity in a balanced test dataset with 56.5% ordered residues but more challenging (at 65% sensitivity) in a test dataset with 90% ordered residues. Compared to eleven other methods, SPINE-D yields the highest area under the curve (AUC), the highest Matthews correlation coefficient for residue-based prediction, and the lowest mean square error in predicting disorder contents of proteins for an independent test set with 329 proteins. In particular, SPINE-D is comparable to a meta predictor in predicting disordered residues in long disordered regions and superior in short disordered regions. SPINE-D participated in CASP9 blind prediction and is one of the top servers according to the official ranking.
In addition, SPINE-D was examined for prediction of functional molecular recognition motifs in several case studies.
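The residue-level measures quoted in this abstract (sensitivity, specificity, Matthews correlation coefficient) all follow from the four confusion-matrix counts. The sketch below shows the standard definitions on a hypothetical toy annotation in which "D" marks a disordered residue and "O" an ordered one; the function names and data are illustrative assumptions and have nothing to do with SPINE-D's own evaluation code.

```python
import math


def confusion_counts(true_labels, predicted_labels, positive="D"):
    """Count true/false positives and negatives, treating `positive` as disorder."""
    tp = fp = tn = fn = 0
    for truth, guess in zip(true_labels, predicted_labels):
        if truth == positive and guess == positive:
            tp += 1
        elif truth == positive:
            fn += 1
        elif guess == positive:
            fp += 1
        else:
            tn += 1
    return tp, fp, tn, fn


def sensitivity(tp, fp, tn, fn):
    """Fraction of truly disordered residues that were predicted disordered."""
    return tp / (tp + fn)


def specificity(tp, fp, tn, fn):
    """Fraction of truly ordered residues that were predicted ordered."""
    return tn / (tn + fp)


def matthews_cc(tp, fp, tn, fn):
    """Matthews correlation coefficient; defined as 0.0 when a marginal is empty."""
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denominator if denominator else 0.0


if __name__ == "__main__":
    # Hypothetical per-residue annotation and prediction.
    truth = "DDDDOOOOOO"
    guess = "DDDOODOOOO"
    counts = confusion_counts(truth, guess)
    print("sensitivity", sensitivity(*counts))
    print("specificity", specificity(*counts))
    print("MCC", matthews_cc(*counts))
```

Because sensitivity and specificity are each computed within one class, they do not change with class balance for a fixed classifier, whereas MCC does; this is one reason the abstract reports results separately for a balanced test set and one with 90% ordered residues.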

15.
Ellen V. Hackl. Biopolymers, 2014, 101(6): 591-602
Natively unfolded (intrinsically disordered, ID) proteins have been attracting increasing attention due to their involvement in many regulatory processes. Natively unfolded proteins can fold upon binding to their metabolic partners. Coupled folding and binding events usually involve only relatively short motifs (binding motifs). These binding motifs, which are able to fold, should have an increased propensity to form secondary structure. The aim of the present work was to probe the conformation of the intrinsically disordered protein 4E‐BP1 in the native and partly folded states by limited proteolysis and to reveal regions with a high propensity to form an ordered structure. Trifluoroethanol (TFE) in low concentrations (up to 15 vol%) was applied to increase the helical population of protein regions with a high intrinsic propensity to fold. When forming helical structures, these regions lose mobility and become more protected from proteases than random/unfolded protein regions. Limited proteolysis followed by mass spectrometry analysis allows identification of the regions with decreased mobility in TFE solutions. Trypsin and V8 proteases were used to perform limited proteolysis of the 4E‐BP1 protein in buffer and in solutions with low TFE concentrations at 37°C and at elevated temperatures (42 and 50°C). Comparison of the results obtained with the previously established 4E‐BP1 structure and the binding motif illustrates the ability of limited proteolysis in the presence of a folding assistant (TFE) to map the regions with high and low propensities to form secondary structure, revealing potential binding motifs inside the intrinsically disordered protein. © 2013 Wiley Periodicals, Inc. Biopolymers 101: 591–602, 2014.

16.
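Short linear motifs are important components through which intrinsically disordered proteins carry out their biological functions. They have flexible structures and short sequences, can mediate transient, reversible protein-protein interactions, and show promiscuity when they interact. With advances in experimental techniques and prediction methods, a growing number of short linear motifs have been discovered and redefined, for example the BH3 short linear motif. This review summarizes the characteristics of short linear motifs in terms of structure, biological function, and evolution. Research on the functions of short linear motifs will bring new ideas to the elucidation of cellular signal transduction networks, disease target validation, and drug discovery.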

17.
RNA binding proteins recognize RNA targets in a sequence specific manner. Apart from the sequence, the secondary structure context of the binding site also affects the binding affinity. Binding sites are often located in single-stranded RNA regions and it was shown that the sequestration of a binding motif in a double-strand abolishes protein binding. Thus, it is desirable to include knowledge about RNA secondary structures when searching for the binding motif of a protein. We present the approach MEMERIS for searching sequence motifs in a set of RNA sequences and simultaneously integrating information about secondary structures. To abstract from specific structural elements, we precompute position-specific values measuring the single-strandedness of all substrings of an RNA sequence. These values are used as prior knowledge about the motif starts to guide the motif search. Extensive tests with artificial and biological data demonstrate that MEMERIS is able to identify motifs in single-stranded regions even if a stronger motif located in double-strand parts exists. The discovered motif occurrences in biological datasets mostly coincide with known protein-binding sites. This algorithm can be used for finding the binding motif of single-stranded RNA-binding proteins in SELEX or other biological sequence data.
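The idea of precomputing a single-strandedness value for every substring and using it as a prior on motif start positions can be illustrated with a small sketch. The abstract does not specify the exact measure or how it is converted into a prior, so everything here is an assumption: the input is taken to be a per-nucleotide probability of being unpaired (which would come from an RNA folding program, not shown), the score of a window is the mean of those probabilities, and the prior is that score plus a pseudocount, normalized to sum to one. The function names are hypothetical.

```python
def single_strandedness(unpaired_prob, width):
    """Mean unpaired probability of every window of length `width`.

    `unpaired_prob[i]` is an assumed, externally computed probability that
    nucleotide i is unpaired. Returns one score per possible motif start.
    """
    n = len(unpaired_prob)
    if width <= 0 or width > n:
        raise ValueError("width must be between 1 and the sequence length")
    window_sum = sum(unpaired_prob[:width])
    scores = [window_sum / width]
    # Slide the window one position at a time, updating the running sum.
    for start in range(1, n - width + 1):
        window_sum += unpaired_prob[start + width - 1] - unpaired_prob[start - 1]
        scores.append(window_sum / width)
    return scores


def start_prior(scores, pseudocount=0.01):
    """Normalize window scores into a probability distribution over starts.

    The pseudocount (an illustrative choice) keeps fully paired windows
    from receiving a prior of exactly zero.
    """
    weighted = [score + pseudocount for score in scores]
    total = sum(weighted)
    return [value / total for value in weighted]


if __name__ == "__main__":
    # Hypothetical profile: first half mostly unpaired, second half mostly paired.
    profile = [0.9, 0.9, 0.9, 0.1, 0.1, 0.1]
    prior = start_prior(single_strandedness(profile, width=3))
    print(prior)
```

With a prior of this shape, a motif search that multiplies its sequence likelihood by the start prior favours occurrences in accessible regions, which matches the behaviour the abstract reports: a weaker motif in single-stranded context can win over a stronger one buried in a double strand.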

18.
Large portions of higher eukaryotic proteomes are intrinsically disordered, and abundant evidence suggests that these unstructured regions of proteins are rich in regulatory interaction interfaces. A major class of disordered interaction interfaces are the compact and degenerate modules known as short linear motifs (SLiMs). As a result of the difficulties associated with the experimental identification and validation of SLiMs, our understanding of these modules is limited, advocating the use of computational methods to focus experimental discovery. This article evaluates the use of evolutionary conservation as a discriminatory technique for motif discovery. A statistical framework is introduced to assess the significance of relatively conserved residues, quantifying the likelihood that a residue will have a particular level of conservation given the conservation of the surrounding residues. The framework is expanded to assess the significance of groupings of conserved residues, a metric that forms the basis of SLiMPrints (short linear motif fingerprints), a de novo motif discovery tool. SLiMPrints identifies relatively overconstrained proximal groupings of residues within intrinsically disordered regions, indicative of putatively functional motifs. Finally, the human proteome is analysed to create a set of highly conserved putative motif instances, including a novel site on translation initiation factor eIF2A that may regulate translation through binding of eIF4E.

19.
Intracellular juxtamembrane regions of transmembrane proteins play pivotal roles in cell signalling, mediated by protein-protein interactions. Disordered protein regions, and short conserved motifs within them, are emerging as key determinants of many such interactions. Here, we investigated whether disorder and conserved motifs are enriched in the juxtamembrane area of human single-pass transmembrane proteins. Conserved motifs were defined as short disordered regions that were much more conserved than the adjacent disordered residues. Human single-pass proteins had higher mean disorder in their cytoplasmic segments than their extracellular parts. Some, but not all, of this effect reflected the shorter length of the cytoplasmic tail. A peak of cytoplasmic disorder was seen at around 30 residues from the membrane. We noted a significant increase in the incidence of conserved motifs within the disordered regions at the same location, even after correcting for the extent of disorder. We conclude that elevated disorder within the cytoplasmic tail of many transmembrane proteins is likely to be associated with enrichment for signalling interactions mediated by conserved short motifs.

20.
RGG/RG motifs are RNA binding segments found in many proteins that can partition into membraneless organelles. They occur in the context of low-complexity disordered regions and often in multiple copies. Although short RGG/RG-containing regions can sometimes form high-affinity interactions with RNA structures, multiple RGG/RG repeats are generally required for high-affinity binding, suggestive of the dynamic, multivalent interactions that are thought to underlie phase separation in formation of cellular membraneless organelles. Arginine can interact with nucleotide bases via hydrogen bonding and π-stacking; thus, nucleotide conformers that provide access to the bases provide enhanced opportunities for RGG interactions. Methylation of RGG/RG regions, which is accomplished by protein arginine methyltransferase enzymes, occurs to different degrees in different cell types and may regulate the behavior of proteins containing these regions.
