Similar literature
 20 similar documents found
1.

Background

Intrinsically Disordered Proteins (IDPs) lack an ordered three-dimensional structure and are enriched in various biological processes. The Molecular Recognition Features (MoRFs) are functional regions within IDPs that undergo a disorder-to-order transition on binding to a partner protein. Identifying MoRFs in IDPs using computational methods is a challenging task.

Methods

In this study, we introduce hidden Markov model (HMM) profiles to accurately identify the location of MoRFs in disordered protein sequences. Using a windowing technique, HMM profiles are utilised to extract features from protein sequences, and support vector machines (SVMs) are used to calculate a propensity score for each residue. Two different SVM kernels with high noise tolerance are evaluated with a varying window size, and the scores of the SVM models are combined to generate the final propensity score used to predict MoRF residues. The SVM models are designed to extract maximal information from the differences between MoRF residues, their neighboring regions (Flanks) and the remainder of the sequence (Others).
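The windowing step described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names `window_features` and `combine_scores`, the zero-padding at the sequence ends, and the use of a plain average to combine model scores are all assumptions made for illustration, and the SVM training itself is left out.

```python
from typing import List, Sequence


def window_features(profile: Sequence[Sequence[float]],
                    center: int,
                    half_width: int) -> List[float]:
    """Flatten the profile rows in a window around `center`.

    `profile` holds one row of numbers per residue (for example one row of
    an HMM profile). Positions that fall outside the sequence are filled
    with zeros; this padding choice is an assumption.
    """
    n_cols = len(profile[0])
    feats: List[float] = []
    for pos in range(center - half_width, center + half_width + 1):
        if 0 <= pos < len(profile):
            feats.extend(profile[pos])
        else:
            feats.extend([0.0] * n_cols)
    return feats


def combine_scores(score_sets: Sequence[Sequence[float]]) -> List[float]:
    """Average per-residue scores from several models.

    A plain mean is used here only as a placeholder combination rule.
    """
    n_models = len(score_sets)
    return [sum(column) / n_models for column in zip(*score_sets)]
```

Each residue would get one feature vector per window size, and each trained model would yield one score per residue before the scores are combined.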

Results

To evaluate the proposed method, its performance was compared to that of two other MoRF predictors: MoRFpred and ANCHOR. The results show that the proposed method outperforms both predictors.

Conclusions

Using HMM profiles as the source for feature extraction, the proposed method shows an improvement in predicting MoRFs in disordered protein sequences.

2.
Viral proteins bind to numerous cellular and viral proteins throughout the infection cycle. However, the mechanisms by which viral proteins interact with such large numbers of factors remain unknown. Cellular proteins that interact with multiple, distinct partners often do so through short sequences known as molecular recognition features (MoRFs) embedded within intrinsically disordered regions (IDRs). In this study, we report the first evidence that MoRFs in viral proteins play a similar role in targeting the host cell. Using a combination of evolutionary modeling, protein–protein interaction analyses and forward genetic screening, we systematically investigated two computationally predicted MoRFs within the N-terminal IDR of the hepatitis C virus (HCV) Core protein. Sequence analysis of the MoRFs showed their conservation across all HCV genotypes and the canine and equine Hepaciviruses. Phylogenetic modeling indicated that the Core MoRFs are under stronger purifying selection than the surrounding sequence, suggesting that these modules have a biological function. Using the yeast two-hybrid assay, we identified three cellular binding partners for each HCV Core MoRF, including two previously characterized cellular targets of HCV Core (DDX3X and NPM1). Random and site-directed mutagenesis demonstrated that the predicted MoRF regions were required for binding to the cellular proteins, but that different residues within each MoRF were critical for binding to different partners. This study demonstrated that viruses may use intrinsic disorder to target multiple cellular proteins with the same amino acid sequence and provides a framework for characterizing the binding partners of other disordered regions in viral and cellular proteomes.

3.
Molecular recognition features (MoRFs) are intrinsically disordered protein regions that bind to partners via disorder-to-order transitions. In one-to-many binding, a single MoRF binds to two or more different partners individually. MoRF-based one-to-many protein–protein interaction (PPI) examples were collected from the Protein Data Bank, yielding 23 MoRFs bound to 2–9 partners, with all pairs of same-MoRF partners having less than 25% sequence identity. Of these, 8 MoRFs were bound to 2–9 partners having completely different folds, whereas 15 MoRFs were bound to 2–5 partners having the same folds but with low sequence identities. For both types of partner variation, backbone and side chain torsion angle rotations were used to bring about the conformational changes needed to enable close fits between a single MoRF and distinct partners. Alternative splicing events (ASEs) and posttranslational modifications (PTMs) were also found to contribute to distinct partner binding. Because ASEs and PTMs both commonly occur in disordered regions, and because both ASEs and PTMs are often tissue-specific, these data suggest that MoRFs, ASEs, and PTMs may collaborate to alter PPI networks in different cell types. These data enlarge the set of carefully studied MoRFs that use inherent flexibility and that also use ASE-based and/or PTM-based surface modifications to enable the same disordered segment to selectively associate with two or more partners. The small number of residues involved in MoRFs and in their modifications by ASEs or PTMs may simplify the evolvability of signaling network diversity.

4.
Viruses have compact genomes that encode a limited number of proteins in comparison to other biological entities. Interestingly, viral proteomes show a natural abundance of either completely disordered proteins, recognized as intrinsically disordered proteins (IDPs), or partially disordered segments known as intrinsically disordered protein regions (IDPRs). IDPRs are involved in interactions with multiple binding partners to accomplish signaling, regulation, and control functions in cells. Tuning of IDPs and IDPRs is mediated through post-translational modification and alternative splicing. Often, the interactions of IDPRs with their binding protein partner(s) lead to a transition from the disordered state to an ordered form. Such interaction-prone IDPRs are identified as molecular recognition features (MoRFs). Molecular recognition is an important initial step in biomolecular interactions and their downstream functions. Although previous studies have established the occurrence of IDPRs in the Zika virus proteome, which provide functional diversity and structural plasticity to viral proteins, a MoRF analysis has not yet been performed. Many computational methods have been developed for the identification of MoRFs in protein sequences, including ANCHOR, MoRFpred, DISOPRED3, and the MoRFchibi_web server. In the current study, we have investigated the presence of MoRF regions in structural and non-structural proteins of Zika virus using the aforementioned set of computational techniques. Furthermore, we have experimentally validated the intrinsic disorder of the NS2B cofactor region of the NS2B–NS3 protease. NS2B has one of the longest MoRF regions in the Zika virus proteome. In future, this study may provide valuable information for investigating virus-host protein interaction networks.

5.
Analysis of molecular recognition features (MoRFs) (total citations: 1; self-citations: 0; citations by others: 1)
Several proteomic studies in the last decade revealed that many proteins are either completely disordered or possess long structurally flexible regions. Many such regions were shown to be of functional importance, often allowing a protein to interact with a large number of diverse partners. Parallel to these findings, during the last five years structural bioinformatics has produced an explosion of results regarding protein-protein interactions and their importance for cell signaling. We studied the occurrence of relatively short (10-70 residues), loosely structured protein regions within longer, largely disordered sequences that were characterized as bound to larger proteins. We call these regions molecular recognition features (MoRFs, also known as molecular recognition elements, MoREs). Interestingly, upon binding to their partner(s), MoRFs undergo disorder-to-order transitions. Thus, in our interpretation, MoRFs represent a class of disordered region that exhibits molecular recognition and binding functions. This work extends previous research showing the importance of flexibility and disorder for molecular recognition. We describe the development of a database of MoRFs derived from the RCSB Protein Data Bank and present preliminary results of bioinformatics analyses of these sequences. Based on the structure adopted upon binding, at least three basic types of MoRFs are found: α-MoRFs, β-MoRFs, and ι-MoRFs, which form α-helices, β-strands, and irregular secondary structure when bound, respectively. Our data suggest that functionally significant residual structure can exist in MoRF regions prior to the actual binding event. The contribution of intrinsic protein disorder to the nature and function of MoRFs has also been addressed. The results of this study will advance the understanding of protein-protein interactions and help towards the future development of useful protein-protein binding site predictors.

6.
Molecular Recognition Features (MoRFs) are short, interaction-prone segments of protein disorder that undergo disorder-to-order transitions upon specific binding, representing a specific class of intrinsically disordered regions that exhibit molecular recognition and binding functions. MoRFs are common in various proteomes and occupy a unique structural and functional niche in which function is a direct consequence of intrinsic disorder. Example MoRFs collected from the Protein Data Bank (PDB) have been divided into three subtypes according to their structures in the bound state: alpha-MoRFs form alpha-helices, beta-MoRFs form beta-strands, and iota-MoRFs form structures without a regular pattern of backbone hydrogen bonds. Several criteria indicated that these example MoRFs are intrinsically disordered in the absence of their binding partners. In this study, we used several geometric and physicochemical criteria to examine the properties of 62 alpha-, 20 beta-, and 176 iota-MoRF complex structures. Interface residues were examined by calculating differences in accessible surface area between the complex and isolated monomers. The compositions and physicochemical properties of MoRF and MoRF partner interface residues were compared to the interface residues of homodimers, heterodimers, and antigen-antibody complexes. Our analysis indicates that there are significant differences in residue composition and several geometric and physicochemical properties that can be used to discriminate, with a high degree of accuracy, between various interfaces in protein interaction data sets. Implications of these findings for the development of MoRF-partner interaction predictors are discussed. In addition, structural changes upon MoRF-to-partner complex formation were examined for several illustrative examples.
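The interface definition mentioned in this abstract (a drop in accessible surface area between the isolated monomer and the complex) can be sketched as follows. This is an illustration under stated assumptions, not the study's procedure: the function name `interface_residues`, the dictionary input format, and the 1.0 Å² default cutoff are hypothetical, and the per-residue surface areas are assumed to have been computed beforehand by some external tool.

```python
from typing import Dict, List


def interface_residues(asa_isolated: Dict[int, float],
                       asa_complex: Dict[int, float],
                       min_delta: float = 1.0) -> List[int]:
    """Return residue numbers whose accessible surface area (in Å^2)
    drops by more than `min_delta` when the complex forms.

    Residues missing from `asa_complex` are treated as unchanged.
    The 1.0 Å^2 default is an illustrative cutoff, not a value from the paper.
    """
    return sorted(
        res
        for res, asa_free in asa_isolated.items()
        if asa_free - asa_complex.get(res, asa_free) > min_delta
    )
```

For example, `interface_residues({1: 50.0, 2: 30.0}, {1: 20.0, 2: 30.0})` flags only residue 1, because residue 2 keeps the same exposure in the complex.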

7.

Motivation

Intrinsically disordered regions of proteins play an essential role in the regulation of various biological processes. Key to their regulatory function is often the binding to globular protein domains via sequence elements known as molecular recognition features (MoRFs). Development of computational tools for the identification of candidate MoRF locations in amino acid sequences is an important task and an area of growing interest. Given the relative sparseness of MoRFs in protein sequences, the accuracy of the available MoRF predictors is often inadequate for practical usage, which leaves a significant need and room for improvement. In this work, we introduce MoRFCHiBi_Web, which predicts MoRF locations in protein sequences with higher accuracy compared to current MoRF predictors.

Methods

Three distinct and largely independent property scores are computed with component predictors and then combined to generate the final MoRF propensity scores. The first score reflects the likelihood of sequence windows to harbour MoRFs and is based on amino acid composition and sequence similarity information. It is generated by MoRFCHiBi using small windows of up to 40 residues in size. The second score identifies long stretches of protein disorder and is generated by ESpritz with the DisProt option. Lastly, the third score reflects residue conservation and is assembled from PSSM files generated by PSI-BLAST. These propensity scores are processed and then hierarchically combined using Bayes rule to generate the final MoRFCHiBi_Web predictions.
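One common way to combine two probability-like scores with Bayes' rule is sketched below. It assumes the two scores are conditionally independent given the class and share the same prior; the abstract does not state the exact form used by MoRFCHiBi_Web, so the function `bayes_combine` and its formula are an illustration only.

```python
def bayes_combine(p1: float, p2: float, prior: float) -> float:
    """Combine two posterior estimates that share the same prior.

    Under conditional independence, the combined posterior odds equal
    odds(p1) * odds(p2) / odds(prior). All inputs must lie strictly
    between 0 and 1.
    """
    for value in (p1, p2, prior):
        if not 0.0 < value < 1.0:
            raise ValueError("probabilities must lie strictly between 0 and 1")
    numerator = p1 * p2 * (1.0 - prior)
    denominator = numerator + (1.0 - p1) * (1.0 - p2) * prior
    return numerator / denominator
```

A hierarchical combination of three scores could then be written as `bayes_combine(bayes_combine(a, b, prior), c, prior)`. A score equal to the prior carries no information and leaves the other score unchanged.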

Results

MoRFCHiBi_Web was tested on three datasets. Results show that MoRFCHiBi_Web outperforms previously developed predictors by generating less than half the false positive rate for the same true positive rate at practical threshold values. This level of accuracy paired with its relatively high processing speed makes MoRFCHiBi_Web a practical tool for MoRF prediction.

Availability

http://morf.chibi.ubc.ca:8080/morf/.

8.
9.
The large-conductance Ca2+-activated K+ (BK) channel is broadly expressed in various mammalian cells and tissues such as neurons, skeletal and smooth muscles, exocrine cells, and sensory cells of the inner ear. Previous studies suggest that BK channels are promiscuous binders involved in a multitude of protein-protein interactions. To gain a better understanding of the potential mechanisms underlying BK interactions, we analyzed the abundance, distribution, and potential mechanisms of intrinsic disorder in 27 BK channel variants from mouse cochlea, 104 previously reported BK-associated proteins (BKAPs) from cytoplasmic and membrane/cytoskeletal regions, plus BK β- and γ-subunits. Disorder was evaluated using the MFDp algorithm, a consensus-based predictor with strong and competitive predictive quality, and PONDR, which can identify long intrinsically disordered regions (IDRs). Disorder-based binding sites, or molecular recognition features (MoRFs), were found using MoRFpred and ANCHOR. BKAP functions were categorized based on Gene Ontology (GO) terms. The analyses revealed that the BK variants contain a number of IDRs. Intrinsic disorder is also common in BKAPs, of which ∼5% are completely disordered. However, intrinsic disorder is very differently distributed within BK and its partners. Approximately 65% of the disordered segments in BK channels are long IDRs (>50 residues), whereas >60% of the disordered segments in BKAPs are short IDRs that range in length from 4 to 30 residues. Both α and γ subunits showed various amounts of disorder, as did hub proteins of the BK interactome. Our analyses suggest that intrinsic disorder is important for the function of BK and its BKAPs. Long IDRs in BK are engaged in protein-protein and protein-ligand interactions, contain multiple post-translational modification sites, and are subjected to alternative splicing. The disordered structure of BK and its BKAPs suggests one of the underlying mechanisms of their interaction.
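The length statistics quoted in this abstract (long segments of more than 50 residues versus short ones) can be derived from per-residue disorder calls roughly as sketched below. The label string format (`D` for disordered, anything else for ordered) and the function names are assumptions; the predictors named in the abstract each have their own output formats, which are not reproduced here.

```python
from itertools import groupby
from typing import List, Tuple


def disordered_segments(labels: str) -> List[Tuple[int, int]]:
    """Return (start, length) for every run of 'D' in a per-residue label string."""
    segments: List[Tuple[int, int]] = []
    pos = 0
    for label, run in groupby(labels):
        length = len(list(run))
        if label == "D":
            segments.append((pos, length))
        pos += length
    return segments


def fraction_long(segments: List[Tuple[int, int]], longer_than: int = 50) -> float:
    """Fraction of segments with more than `longer_than` residues (0.0 if none exist)."""
    if not segments:
        return 0.0
    return sum(1 for _, length in segments if length > longer_than) / len(segments)
```

Start positions are zero-based in this sketch; a residue-numbered report would add one.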

10.
11.

Background

Intrinsically disordered proteins (IDPs) or proteins with disordered regions (IDRs) do not have a well-defined tertiary structure, but perform a multitude of functions, often relying on their native disorder to achieve the binding flexibility through changing to alternative conformations. Intrinsic disorder is frequently found in all three kingdoms of life, and may occur in short stretches or span whole proteins. To date most studies contrasting the differences between ordered and disordered proteins focused on simple summary statistics. Here, we propose an evolutionary approach to study IDPs, and contrast patterns specific to ordered protein regions and the corresponding IDRs.

Results

Two empirical Markov models of amino acid substitutions were estimated, based on a large set of multiple sequence alignments with experimentally verified annotations of disordered regions from the DisProt database of IDPs. We applied new methods to detect differences in Markovian evolution and evolutionary rates between IDRs and the corresponding ordered protein regions. Further, we investigated the distribution of IDPs among functional categories and biochemical pathways, and their propensity to contain tandem repeats.

Conclusions

We find significant differences in the evolution between ordered and disordered regions of proteins. Most importantly, we find that disorder-promoting amino acids are more conserved in IDRs, indicating that in some cases not only the amino acid composition but also the specific sequence is important for function. This conjecture is also reinforced by the observation that for a portion of our data set IDRs evolve more slowly than the ordered parts of the proteins, while we still support the common view that IDRs in general evolve more quickly. The improvement in model fit indicates a possible improvement for various types of analyses; for example, de novo disorder prediction using a phylogenetic Hidden Markov Model based on our matrices showed a performance similar to other disorder predictors.

12.
All proteomes contain both proteins and polypeptide segments that do not form a defined three-dimensional structure yet are biologically active; these are called intrinsically disordered proteins and regions (IDPs and IDRs). Most of these IDPs/IDRs lack useful functional annotation, limiting our understanding of their importance for organism fitness. Here we characterized IDRs using protein sequence annotations of functional sites and regions available in the UniProt knowledgebase (“UniProt features”: active site, ligand-binding pocket, regions mediating protein-protein interactions, etc.). By measuring the statistical enrichment of twenty-five UniProt features in 981 IDRs of 561 human proteins, we identified eight features that are commonly located in IDRs. We then collected genetic variant data from general population and patient-based databases and evaluated the prevalence of population and pathogenic variations in IDPs/IDRs. We observed that some IDRs tolerate 2 to 12 times more missense (single amino acid-substituting) mutations than synonymous changes in the general population. However, we also found that 37% of all germline pathogenic mutations are located in disordered regions of 96 proteins. Based on the observed-to-expected frequency of mutations, we categorized 34 IDRs in 20 proteins (DDX3X, KIT, RB1, etc.) as intolerant to mutation. Finally, using statistical analysis and a machine learning approach, we demonstrate that mutation-intolerant IDRs carry a distinct signature of functional features. Our study presents a novel approach to assign functional importance to IDRs by leveraging the wealth of available genetic data, which will aid in a deeper understanding of the role of IDRs in biological processes and disease mechanisms.
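An observed-to-expected mutation ratio of the kind mentioned above can be sketched as follows. The abstract does not say how the expected count was obtained, so this example simply assumes that variants are spread uniformly along the protein; the function name and this uniform-rate model are hypothetical and serve only to show the arithmetic.

```python
def observed_to_expected(observed_in_region: int,
                         region_length: int,
                         total_in_protein: int,
                         protein_length: int) -> float:
    """Ratio of observed variants in a region to the count expected if the
    protein's variants were spread uniformly along its length (an assumption).
    """
    expected = total_in_protein * region_length / protein_length
    if expected == 0:
        raise ValueError("expected count is zero; the ratio is undefined")
    return observed_in_region / expected
```

With 40 variants in a 500-residue protein, a 100-residue region would be expected to carry 8; observing only 2 gives a ratio of 0.25, which under this toy model points toward intolerance.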

13.
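Intrinsically disordered proteins are a class of proteins that lack a stable three-dimensional structure under physiological conditions yet function normally, and that participate in many biological processes such as signal transduction, transcriptional regulation, and stress response. Many stress-responsive proteins in plants are intrinsically disordered proteins, which play important roles in protein-protein, protein-membrane lipid, and protein-nucleic acid interactions through their fully or partially disordered regions. This article reviews the classes, amino acid composition, and structural characteristics of intrinsically disordered proteins, as well as their molecular functions under stress, including stabilizing cell membranes, protecting nucleic acids and proteins, and regulating gene expression, in order to broaden the understanding of the molecular mechanisms of protein action under stress.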

14.
Intrinsically disordered regions (IDRs) play an important role in key biological processes and are closely related to human diseases. IDRs have great potential to serve as targets for drug discovery, most notably in disordered binding regions. Accurate prediction of IDRs is challenging because their genome-wide occurrence and a low ratio of disordered residues make them difficult targets for traditional classification techniques. Existing computational methods mostly rely on sequence profiles to improve accuracy, which is time consuming and computationally expensive. This article describes an ab initio, sequence-only prediction method, based on reduced amino acid alphabets and convolutional neural networks (CNNs), that tries to overcome the challenge of accurate prediction posed by IDRs. We experiment with six different 3-letter reduced alphabets. We argue that the dimensional reduction in the input alphabet facilitates the detection of complex patterns within the sequence by the convolutional step. Experimental results show that our proposed IDR predictor performs at the same level as or outperforms other state-of-the-art methods in the same class, achieving an accuracy of 0.76 and an AUC of 0.85 on the publicly available Critical Assessment of protein Structure Prediction dataset (CASP10). Therefore, our method is suitable for proteome-wide disorder prediction, yielding similar or better accuracy than existing approaches at a faster speed.
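The idea of a 3-letter reduced alphabet can be illustrated with a short sketch. The grouping below (hydrophobic, polar, charged) is a hypothetical example and is not one of the six alphabets tested in the article, which the abstract does not list; the function names are likewise illustrative. The one-hot output is the kind of input a convolutional layer could consume.

```python
from typing import List

# Hypothetical 3-letter grouping of the 20 standard amino acids.
GROUPS = {
    "H": "AVLIMFWC",  # hydrophobic
    "P": "STNQGPYH",  # polar
    "C": "DEKR",      # charged
}
AA_TO_GROUP = {aa: group for group, members in GROUPS.items() for aa in members}


def reduce_sequence(seq: str) -> str:
    """Map each residue to its group letter; unknown symbols become 'X'."""
    return "".join(AA_TO_GROUP.get(aa, "X") for aa in seq.upper())


def one_hot(reduced: str) -> List[List[int]]:
    """One-hot encode a reduced sequence in the column order H, P, C.

    'X' (unknown) encodes as an all-zero row.
    """
    return [[int(ch == group) for group in "HPC"] for ch in reduced]
```

Going from 20 input channels to 3 shrinks the first convolutional layer's input by the same factor, which is the dimensional reduction the abstract refers to.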

15.
The pathological process of allergies generally involves an initial activation of certain immune cells, tied to an ensuing inflammatory reaction on renewed contact with the allergen. In IgE-mediated hypersensitivity, this typically occurs in response to otherwise harmless food- or air-borne proteins. As some members of certain protein families carry special properties that make them allergenic, exploring protein allergens at the molecular level is instrumental to an improved understanding of the disease mechanisms, including the identification of relevant antigen features. For this purpose, we inspected a previously identified set of allergen representative peptides (ARPs) to scrutinize protein intrinsic disorder. The study presented here focused on the association between these ARPs and protein intrinsic disorder. In addition, the connection between the disorder-enriched ARPs and UniProt functional keywords was considered. Our analysis revealed that ~20% of the allergen peptides are highly disordered, and that ~77% of ARPs are either located within disordered regions of the corresponding allergenic proteins or show more disorder/flexibility than their neighboring regions. Furthermore, among the subset of allergenic proteins, ~70% of the predicted molecular recognition features (MoRFs, short interactive disordered regions that undergo disorder-to-order transitions on interaction with binding partners) were identified as ARPs. These results suggest that intrinsic disorder and MoRFs may play functional roles in IgE-mediated allergy.

16.
Intrinsic disorder is important for protein regulation, yet its role in regulation of ion transport proteins is essentially uninvestigated. The ubiquitous plasma membrane carrier protein Na(+)/H(+) Exchanger isoform 1 (NHE1) plays pivotal roles in cellular pH and volume homeostasis, and its dysfunction is implicated in several clinically important diseases. This study shows, for the first time for any carrier protein, that the distal part of the C-terminal intracellular tail (the cdt, residues V686-Q815) from human (h) NHE1 is intrinsically disordered. Further, we experimentally demonstrated the presence of a similar region of intrinsic disorder (ID) in NHE1 from the teleost fish Pleuronectes americanus (paNHE1), and bioinformatic analysis suggested ID to be conserved in the NHE1 family. The sequential variation in structure propensity as determined by NMR, but not the amplitude, was largely conserved between the h- and paNHE1cdt. This suggests that both proteins contain molecular recognition features (MoRFs), i.e., local, transiently formed structures within an ID region. The functional relevance of the most conserved MoRF was investigated by introducing a point mutation that significantly disrupted the putative binding feature. When this mutant NHE1 was expressed in full length NHE1 in AP1 cells, it exhibited impaired trafficking to the plasma membrane. This study demonstrated that the distal regulatory domain of NHE1 is intrinsically disordered yet contains conserved regions of transient structure. We suggest that normal NHE1 function depends on a protein recognition element within the ID region that may be linked to NHE1 trafficking via an acidic ER export motif.

17.
18.
The sequence–structure–function paradigm of proteins has been revolutionized by the discovery of intrinsically disordered proteins (IDPs) and intrinsically disordered regions (IDRs). In contrast to traditional ordered proteins, IDPs/IDRs are unstructured under physiological conditions. The absence of well-defined three-dimensional structures in the free state of IDPs/IDRs is fundamental to their function. Folding upon binding is an important mode of molecular recognition for IDPs/IDRs. While great efforts have been devoted to investigating the complex structures and the binding kinetics and affinities, our knowledge of the binding mechanisms of IDPs/IDRs remains very limited. Here, we review recent advances in understanding the binding mechanisms of IDPs/IDRs. The structures and kinetic parameters of IDPs/IDRs can vary greatly, and the binding mechanisms can be highly dependent on the structural properties of IDPs/IDRs. IDPs/IDRs can employ various combinations of conformational selection and induced fit in a binding process, which can be templated by the target and/or encoded by the IDP/IDR. Further studies should provide deeper insights into the molecular recognition of IDPs/IDRs and enable the rational design of IDP/IDR binding mechanisms in the future.

19.
Eukaryotic cells are partitioned into functionally distinct self-organizing compartments. But while the biogenesis of membrane-surrounded compartments is beginning to be understood, the organizing principles behind large membrane-less structures, such as RNA-containing granules, remain a mystery. Here, we argue that protein disorder is an essential ingredient for the formation of such macromolecular collectives. Intrinsically disordered regions (IDRs) do not fold into a well-defined structure but rather sample a range of conformational states, depending on the local conditions. In addition to being structurally versatile, IDRs promote multivalent and transient interactions. This unique combination of features turns intrinsically disordered proteins into ideal agents to orchestrate the formation of large macromolecular assemblies. The presence of conformationally flexible regions, however, comes at a cost, for many intrinsically disordered proteins are aggregation-prone and cause protein misfolding diseases. This association with disease is particularly strong for IDRs with prion-like amino acid composition. Here, we examine how disease-causing and normal conformations are linked, and discuss the possibility that the dynamic order of the cytoplasm emerges, at least in part, from the collective properties of intrinsically disordered prion-like domains. This article is part of a Special Issue entitled: The emerging dynamic view of proteins: Protein plasticity in allostery, evolution and self-assembly.

20.
Intrinsically disordered or unstructured proteins (or regions in proteins) have been found to be important in a wide range of biological functions and are implicated in many diseases. Due to the high cost and low efficiency of experimental determination of intrinsic disorder and the exponential increase of unannotated protein sequences, developing complementary computational prediction methods has been an active area of research for several decades. Here, we employed an ensemble of deep Squeeze-and-Excitation residual inception and long short-term memory (LSTM) networks for predicting protein intrinsic disorder with input from evolutionary information and predicted one-dimensional structural properties. The method, called SPOT-Disorder2, offers substantial and consistent improvement not only over our previous technique based on LSTM networks alone, but also over other state-of-the-art techniques in three independent tests with different ratios of disordered to ordered amino acid residues, and for sequences with either rich or limited evolutionary information. More importantly, semi-disordered regions predicted by SPOT-Disorder2 are more accurate in identifying molecular recognition features (MoRFs) than methods directly designed for MoRF prediction. SPOT-Disorder2 is available as a web server and as a standalone program at https://sparks-lab.org/server/spot-disorder2/.
