Similar articles
1.
2.

Background  

Many proteins contain disordered regions that lack fixed three-dimensional (3D) structure under physiological conditions but have important biological functions. Prediction of disordered regions in protein sequences is important for understanding protein function and in high-throughput determination of protein structures. Machine learning techniques, including neural networks and support vector machines, have been widely used in such predictions. Predictors designed for long disordered regions are usually less successful in predicting short disordered regions. Combining prediction of short and long disordered regions will dramatically increase the complexity of the prediction algorithm and make the predictor unsuitable for large-scale applications. Efficient batch prediction of long disordered regions alone is of greater interest in large-scale proteome studies.

3.
Numerous studies have demonstrated that the propensity of a protein to form amyloids or amorphous aggregates is encoded by its amino acid sequence. This led to the emergence of several computational programs to predict amyloidogenicity from amino acid sequences. However, a growing number of studies indicate that an accurate prediction of protein aggregation can only be achieved when also accounting for the overall structural context of the protein, and the likelihood of transition between the initial state and the aggregate. Here, we describe a computational pipeline called TAPASS, which was designed to do just that. The pipeline assigns each residue of a protein as belonging to a structured region or an intrinsically disordered region (IDR). For this purpose, TAPASS uses either several state-of-the-art programs for prediction of IDRs, of transmembrane regions and of structured domains or the artificial intelligence program AlphaFold. In the next step, this assignment is crossed with amyloidogenicity prediction. As a result, TAPASS allows the detection of Exposed Amyloidogenic Regions (EARs) located within IDRs and carrying high amyloidogenic potential. TAPASS can substantially improve the prediction of amyloids and be used in proteome-wide analysis to discover new amyloid-forming proteins. Its results, combined with clinical data, can create individual risk profiles for different amyloidoses, opening up new opportunities for personalised medicine. The architecture of the pipeline makes it easy to add new individual predictors as they become available. TAPASS can be used through the web interface (https://bioinfo.crbm.cnrs.fr/index.php?route=tools&tool=32).
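The "crossing" step described in this abstract can be pictured with a minimal sketch. It is not the TAPASS implementation: it assumes the two upstream predictions are already available as per-residue boolean lists, and the function names (`true_runs`, `exposed_amyloidogenic_regions`) and the `min_len` parameter are hypothetical. It simply reports the stretches where a residue is flagged both as disordered and as amyloidogenic.

```python
from typing import List, Tuple


def true_runs(mask: List[bool]) -> List[Tuple[int, int]]:
    """Return half-open (start, end) index ranges of consecutive True values."""
    runs = []
    start = None
    for i, flag in enumerate(mask):
        if flag and start is None:
            start = i                      # a new run begins here
        elif not flag and start is not None:
            runs.append((start, i))        # the run ended at the previous index
            start = None
    if start is not None:                  # run extends to the end of the sequence
        runs.append((start, len(mask)))
    return runs


def exposed_amyloidogenic_regions(
    disordered: List[bool],
    amyloidogenic: List[bool],
    min_len: int = 1,                      # hypothetical length filter, not from the abstract
) -> List[Tuple[int, int]]:
    """Cross two per-residue assignments and keep the overlapping stretches."""
    if len(disordered) != len(amyloidogenic):
        raise ValueError("both assignments must cover the same residues")
    both = [d and a for d, a in zip(disordered, amyloidogenic)]
    return [(s, e) for s, e in true_runs(both) if e - s >= min_len]
```

For example, with residues 2 to 5 flagged as disordered and residues 0 to 3 flagged as amyloidogenic (0-based), the sketch returns the single overlapping range `(2, 4)`.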

4.
A substantial portion of the proteome consists of intrinsically disordered regions (IDRs) that do not fold into well-defined 3D structures yet perform numerous biological functions and are associated with a broad range of diseases. It has been a long-standing enigma how different IDRs successfully execute their specific functions. Further putting a spotlight on IDRs are recent discoveries of functionally relevant biomolecular assemblies, which in some cases form through liquid-liquid phase separation. At the molecular level, the formation of biomolecular assemblies is largely driven by weak, multivalent, but selective IDR-IDR interactions. Emerging experimental and computational studies suggest that the primary amino acid sequences of IDRs encode a variety of their interaction behaviors. In this review, we focus on findings and insights that connect sequence-derived features of IDRs to their conformations, propensities to form biomolecular assemblies, selectivity of interaction partners, functions in the context of physiology and disease, and regulation of function. We also discuss directions of future research to facilitate establishing a comprehensive sequence-function paradigm that will eventually allow prediction of selective interactions and specificity of function mediated by IDRs.

5.
The protein folding problem was apparently solved recently by the advent of a deep learning method for protein structure prediction called AlphaFold. However, this program is not able to make predictions about the protein folding pathways. Moreover, it only treats about half of the human proteome, as the remaining proteins are intrinsically disordered or contain disordered regions. By definition, these proteins differ from natively folded proteins and do not adopt a properly folded structure in solution. However, these intrinsically disordered proteins (IDPs) also differ systematically in amino acid composition and, uniquely, often fold upon binding to an interaction partner. These factors preclude solving IDP structures by current machine-learning methods like AlphaFold, which also cannot solve the protein aggregation problem, since this meta-folding process can give rise to different aggregate sizes and structures. An alternative computational method is provided by molecular dynamics simulations, which have already successfully explored the energy landscapes of IDP conformational switching and protein aggregation in multiple cases. These energy landscapes are very different from those of ‘simple’ protein folding, where one energy funnel leads to a unique protein structure. Instead, the energy landscapes of IDP conformational switching and protein aggregation feature a number of minima for different competing low-energy structures. In this review, I discuss the characteristics of these multifunneled energy landscapes in detail, illustrated by molecular dynamics simulations that elucidated the underlying conformational transitions and aggregation processes.

6.
The past decade has witnessed great advances in our understanding of protein structure‐function relationships in terms of the ubiquitous existence of intrinsically disordered proteins (IDPs) and intrinsically disordered regions (IDRs). The structural disorder of IDPs/IDRs enables them to play essential functions that are complementary to those of ordered proteins. In addition, IDPs/IDRs are persistent in evolution. Therefore, they are expected to possess some advantages over ordered proteins. In this review, we summarize and survey nine possible advantages of IDPs/IDRs: economizing genome/protein resources, overcoming steric restrictions in binding, achieving high specificity with low affinity, increasing binding rate, facilitating posttranslational modifications, enabling flexible linkers, preventing aggregation, providing resistance to non‐native conditions, and allowing compatibility with more available sequences. Some potential advantages of IDPs/IDRs are not well understood and require both experimental and theoretical approaches to decipher. The connection with protein design is also briefly discussed.

7.
Arguably, 2020 was the year of high-accuracy protein structure predictions, with AlphaFold 2.0 achieving previously unseen accuracy in the Critical Assessment of Protein Structure Prediction (CASP). In 2021, DeepMind and EMBL-EBI developed the AlphaFold Protein Structure Database to make an unprecedented number of reliable protein structure predictions easily accessible to the broad scientific community. We provide a brief overview and describe the latest developments in the AlphaFold database. We highlight how the fields of data services, bioinformatics, structural biology, and drug discovery are directly affected by the influx of protein structure data. We also show examples of cutting-edge research that took advantage of the AlphaFold database. It is apparent that connections between various fields through protein structures are now possible, but the amount of data poses new challenges. Finally, we give an outlook regarding the future direction of the database, both in terms of data sets and new functionalities.

8.
The sequence–structure–function paradigm of proteins has been revolutionized by the discovery of intrinsically disordered proteins (IDPs) or intrinsically disordered regions (IDRs). In contrast to traditional ordered proteins, IDPs/IDRs are unstructured under physiological conditions. The absence of well‐defined three‐dimensional structures in the free state of IDPs/IDRs is fundamental to their function. Folding upon binding is an important mode of molecular recognition for IDPs/IDRs. While great efforts have been devoted to investigating the complex structures and binding kinetics and affinities, our knowledge of the binding mechanisms of IDPs/IDRs remains very limited. Here, we review recent advances in understanding the binding mechanisms of IDPs/IDRs. The structures and kinetic parameters of IDPs/IDRs can vary greatly, and the binding mechanisms can be highly dependent on the structural properties of IDPs/IDRs. IDPs/IDRs can employ various combinations of conformational selection and induced fit in a binding process, which can be templated by the target and/or encoded by the IDP/IDR. Further studies should provide deeper insights into the molecular recognition of IDPs/IDRs and enable the rational design of IDP/IDR binding mechanisms in the future.

9.
10.
Intrinsically disordered proteins and regions (IDPs/IDRs) are characterized by well-defined sequence-to-conformation relationships (SCRs). These relationships refer to the sequence-specific preferences for average sizes, shapes, residue-specific secondary structure propensities, and amplitudes of multiscale conformational fluctuations. SCRs are discerned from the sequence-specific conformational ensembles of IDPs. A vast majority of IDPs are actually tethered to folded domains (FDs). This raises the question of whether or not SCRs inferred for IDPs are applicable to IDRs tethered to FDs. Here, we use atomistic simulations based on a well-established forcefield paradigm and an enhanced sampling method to obtain comparative assessments of SCRs for 13 archetypal IDRs modeled as autonomous units, as C-terminal tails connected to FDs, and as linkers between pairs of FDs. Our studies uncover a set of general observations regarding context-independent versus context-dependent SCRs of IDRs. SCRs are minimally perturbed upon tethering to FDs if the IDRs are deficient in charged residues and for polyampholytic IDRs where the oppositely charged residues within the sequence of the IDR are separated into distinct blocks. In contrast, the interplay between IDRs and tethered FDs has a significant modulatory effect on SCRs if the IDRs have intermediate fractions of charged residues or if they have sequence-intrinsic conformational preferences for canonical random coils. Our findings suggest that IDRs with context-independent SCRs might be independent evolutionary modules, whereas IDRs with context-dependent SCRs might co-evolve with the FDs to which they are tethered.

11.
Intrinsically disordered proteins (IDPs) do not adopt stable three-dimensional structures in physiological conditions, yet these proteins play crucial roles in biological phenomena. In most cases, intrinsic disorder manifests itself in segments or domains of an IDP, called intrinsically disordered regions (IDRs), but fully disordered IDPs also exist. Although IDRs can be detected as missing residues in protein structures determined by X-ray crystallography, no protocol has been developed to identify IDRs from structures obtained by Nuclear Magnetic Resonance (NMR). Here, we propose a computational method to assign IDRs based on NMR structures. We compared missing residues of X-ray structures with residue-wise deviations of NMR structures for identical proteins, and derived a threshold deviation that gives the best correlation of ordered and disordered regions of both structures. The obtained threshold of 3.2 Å was applied to proteins whose structures were only determined by NMR, and the resulting IDRs were analyzed and compared to those of X-ray structures with no NMR counterpart in terms of sequence length, IDR fraction, protein function, cellular location, and amino acid composition, all of which suggest distinct characteristics. The structural knowledge of IDPs is still inadequate compared with that of structured proteins. Our method can collect and utilize IDRs from structures determined by NMR, potentially enhancing the understanding of IDPs.
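The thresholding step in this abstract can be illustrated with a short sketch. It is not the authors' code: it assumes the residue-wise deviations (in Å) have already been computed from the NMR models by some upstream step the abstract does not specify, and it assumes that a residue counts as disordered when its deviation is strictly greater than the 3.2 Å threshold. The function name `assign_idrs` is hypothetical.

```python
from typing import List, Sequence, Tuple

THRESHOLD_ANGSTROM = 3.2  # threshold reported in the abstract


def assign_idrs(
    deviations: Sequence[float],
    threshold: float = THRESHOLD_ANGSTROM,
) -> List[Tuple[int, int]]:
    """Return 1-based, inclusive residue ranges whose deviation exceeds the threshold.

    Assumption: "exceeds" means strictly greater than; the abstract does not
    say how a deviation exactly equal to the threshold is treated.
    """
    regions = []
    start = None
    for idx, dev in enumerate(deviations, start=1):
        if dev > threshold:
            if start is None:
                start = idx                    # open a new disordered region
        elif start is not None:
            regions.append((start, idx - 1))   # close the region at the previous residue
            start = None
    if start is not None:                      # region runs to the C-terminus
        regions.append((start, len(deviations)))
    return regions
```

Calling `assign_idrs([0.5, 0.8, 4.0, 5.1, 1.0, 3.2, 3.3])` marks residues 3 to 4 and residue 7 as disordered; residue 6, at exactly 3.2 Å, stays ordered under the strict-inequality assumption.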

12.
Intrinsically disordered proteins (IDPs) are crucial players in various cellular activities. Several experimental and computational analyses have been conducted to study the structural pliability and functional potential of IDPs. Despite active research over the past few decades, what induces structural disorder in IDPs, and how, remains elusive. Many studies testify that sequential and spatial neighbours often play important roles in determining the structural and functional behaviour of proteins. Considering this fact, we assessed sequence neighbours of intrinsically disordered regions (IDRs) to understand whether they play any role in inducing structural flexibility in IDPs. Our analysis includes 97% eukaryotic IDPs and 3% from bacteria and viruses. Physicochemical and structural parameters including amino acid propensity, hydrophobicity, secondary structure propensity, relative solvent accessibility, B-factor and atomic packing density are used to characterise the neighbouring residues of IDRs (NRIs). We show that NRIs exhibit a unique nature, which makes them stand out from both ordered and disordered residues. They show correlative occurrences of residue pairs like Ser-Thr and Gln-Asn, indicating their tendency to avoid strong biases of order- or disorder-promoting amino acids. We also find differential preferences of amino acids between N- and C-terminal neighbours, which might indicate a plausible directional effect on the dynamics of adjacent IDRs. We designed an efficient prediction tool using Random Forest to distinguish the NRIs from the ordered residues. Our findings will contribute to understanding the behaviour of IDPs, and may provide a potential lead in deciphering the role of IDRs in protein folding and assembly.

13.
Lim Heo, Michael Feig. Proteins, 2020, 88(5): 637-642
Protein structure prediction has long been available as an alternative to experimental structure determination, especially via homology modeling based on templates from related sequences. Recently, models based on distance restraints from coevolutionary analysis via machine learning have significantly expanded the ability to predict structures for sequences without templates. One such method, AlphaFold, also performs well on sequences where templates are available but without using such information directly. Here we show that combining machine-learning based models from AlphaFold with state-of-the-art physics-based refinement via molecular dynamics simulations further improves predictions to outperform any other prediction method tested during the latest round of CASP. The resulting models have highly accurate global and local structures, including high accuracy at functionally important interface residues, and they are highly suitable as initial models for crystal structure determination via molecular replacement.

14.
The protein structure field is experiencing a revolution, from the increased throughput of techniques to determine experimental structures, to developments such as cryo-EM that allow us to find the structures of large protein complexes, to, more recently, the development of artificial intelligence tools, such as AlphaFold, that can predict with high accuracy the folding of proteins for which the availability of homology templates is limited. Here we quantify the effect of the recently released AlphaFold database of protein structural models on our knowledge of human proteins. Our results indicate that our current baseline for structural coverage of 48%, considering experimentally-derived or template-based homology models, rises to 76% when including AlphaFold predictions. At the same time, the fraction of dark proteome is reduced from 26% to just 10% when AlphaFold models are considered. Furthermore, although the coverage of disease-associated genes and mutations was near complete before the AlphaFold release (69% of ClinVar pathogenic mutations and 88% of oncogenic mutations), AlphaFold models still provide an additional coverage of 3% to 13% of these critically important sets of biomedical genes and mutations. Finally, we show how the contribution of AlphaFold models to the structural coverage of non-human organisms, including important pathogenic bacteria, is significantly larger than that of the human proteome. Overall, our results show that the sequence-structure gap of human proteins has almost disappeared, an outstanding success with direct consequences for our knowledge of the human genome and the derived medical applications.

15.
The first genuine high-resolution single particle cryo-electron microscopy structure of a membrane protein determined was a transient receptor potential (TRP) ion channel, TRPV1, in 2013. This methodological breakthrough opened up a whole new world for structural biology and ion channel aficionados alike. TRP channels capture the imagination due to the seemingly endless number of tasks they carry out in all aspects of animal physiology. To date, structures of at least one representative member of each of the six mammalian TRP channel subfamilies as well as of a few non-mammalian families have been determined. These structures were instrumental for a better understanding of TRP channel function and regulation. However, all of the TRP channel structures solved so far are incomplete since they miss important information about highly flexible regions found mostly in the channel N- and C-termini. These intrinsically disordered regions (IDRs) can represent between a quarter and almost half of the entire protein sequence and act as important recruitment hubs for lipids and regulatory proteins. Here, we analyze the currently available TRP channel structures with regard to the extent of these “missing” regions and compare these findings to disorder predictions. We discuss select examples of intra- and intermolecular crosstalk of TRP channel IDRs with proteins and lipids as well as the effect of splicing and post-translational modifications, to illuminate their importance for channel function and to complement the prevalently discussed structural biology of these versatile and fascinating proteins with their equally relevant ‘unstructural’ biology.

16.
The announcement of the outstanding performance of AlphaFold 2 in the CASP 14 protein structure prediction competition came at the end of a long year defined by the COVID-19 pandemic. With an infectious organism dominating the world stage, the developers of AlphaFold 2 were keen to play their part, accurately predicting novel structures of two proteins from SARS-CoV-2. In their blog post of December 2020, they highlighted this contribution, writing “we’ve also seen signs that protein structure prediction could be useful in future pandemic response efforts”. So, what role does structural biology play in guiding vaccine immunogen design and what might be the contribution of AlphaFold 2?

17.
Intrinsically disordered proteins and regions (IDPs and IDRs) lack stable 3D structure under physiological conditions in vitro, are common in eukaryotes, and facilitate interactions with RNA, DNA and proteins. Current methods for prediction of IDPs and IDRs do not provide insights into their functions, except for a handful of methods that address predictions of protein-binding regions. We report DisoRDPbind, a first-of-its-kind computational method for high-throughput prediction of RNA, DNA and protein binding residues located in IDRs from protein sequences. DisoRDPbind is implemented using a runtime-efficient multi-layered design that utilizes information extracted from physicochemical properties of amino acids, sequence complexity, putative secondary structure and disorder, and sequence alignment. Empirical tests demonstrate that it provides accurate predictions that are competitive with other predictors of disorder-mediated protein binding regions and complementary to the methods that predict RNA- and DNA-binding residues annotated based on crystal structures. Application in Homo sapiens, Mus musculus, Caenorhabditis elegans and Drosophila melanogaster proteomes reveals that RNA- and DNA-binding proteins predicted by DisoRDPbind complement and overlap with the corresponding known binding proteins collected from several sources. Also, the number of the putative protein-binding regions predicted with DisoRDPbind correlates with the promiscuity of proteins in the corresponding protein–protein interaction networks. Webserver: http://biomine.ece.ualberta.ca/DisoRDPbind/

18.
In Dec 2020, the results of AlphaFold version 2 were presented at CASP14, sparking a revolution in the field of protein structure predictions. For the first time, a purely computational method could challenge experimental accuracy for structure prediction of single protein domains. The code of AlphaFold v2 was released in the summer of 2021, and since then, it has been shown that it can be used to accurately predict the structure of most ordered proteins and many protein–protein interactions. It has also sparked an explosion of development in the field, improving AI-based methods to predict protein complexes, disordered regions, and protein design. Here I will review some of the inventions sparked by the release of AlphaFold.

19.
Intrinsically disordered proteins (IDPs) constitute a broad set of proteins with few uniting and many diverging properties. IDPs—and intrinsically disordered regions (IDRs) interspersed between folded domains—are generally characterized as having no persistent tertiary structure; instead they interconvert between a large number of different and often expanded structures. IDPs and IDRs are involved in an enormously wide range of biological functions and reveal novel mechanisms of interactions, and while they defy the common structure-function paradigm of folded proteins, their structural preferences and dynamics are important for their function. We here discuss open questions in the field of IDPs and IDRs, focusing on areas where machine learning and other computational methods play a role. We discuss computational methods aimed to predict transiently formed local and long-range structure, including methods for integrative structural biology. We discuss the many different ways in which IDPs and IDRs can bind to other molecules, both via short linear motifs, as well as in the formation of larger dynamic complexes such as biomolecular condensates. We discuss how experiments are providing insight into such complexes and may enable more accurate predictions. Finally, we discuss the role of IDPs in disease and how new methods are needed to interpret the mechanistic effects of genomic variants in IDPs.

20.
Proteins in general consist not only of globular structural domains (SDs), but also of intrinsically disordered regions (IDRs), i.e. those that do not assume unique three-dimensional structures by themselves. Although IDRs are especially prevalent in eukaryotic proteins, their functions are mostly unknown. To elucidate the functions of IDRs, we first divided eukaryotic proteins into subcellular localizations, identified IDRs by the DICHOT system that accurately divides entire proteins into SDs and IDRs, and examined charge and hydropathy characteristics. On average, mitochondrial proteins have IDRs more positively charged than SDs. Comparison of mitochondrial proteins with orthologous prokaryotic proteins showed that mitochondrial proteins tend to have segments attached at both N and C termini, high fractions of which are IDRs. Segments added to the N-terminus of mitochondrial proteins contain not only signal sequences but also mature proteins and exhibit a positive charge gradient, with the magnitude increasing toward the N-terminus. This finding is consistent with the notion that positively charged residues are added to the N-terminus of proteobacterial proteins so that the extended proteins can be chromosomally encoded and efficiently transported to mitochondria after translation. By contrast, nuclear proteins generally have positively charged SDs and negatively charged IDRs. Among nuclear proteins, DNA-binding proteins have enhanced charge tendencies. We propose that SDs in nuclear proteins tend to be positively charged because of the need to bind to negatively charged nucleotides, while IDRs tend to be negatively charged to interact with other proteins or other regions of the same proteins to avoid premature proteasomal degradation.
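The charge comparison between SDs and IDRs described in this abstract can be sketched as follows. This is an illustration under stated assumptions, not the authors' procedure: it counts Lys and Arg as +1 and Asp and Glu as -1, ignores His and the chain termini, and takes the IDR boundaries as given (for instance from a tool such as DICHOT). The function names and the example sequence are hypothetical.

```python
from typing import List, Tuple

POSITIVE = set("KR")   # assumption: Lys and Arg counted as +1; His ignored
NEGATIVE = set("DE")   # assumption: Asp and Glu counted as -1


def net_charge(segment: str) -> int:
    """Net charge of a sequence segment under the simple +1/-1 counting rule."""
    seg = segment.upper()
    return sum(aa in POSITIVE for aa in seg) - sum(aa in NEGATIVE for aa in seg)


def charge_per_residue(segment: str) -> float:
    """Net charge normalised by segment length (0.0 for an empty segment)."""
    return net_charge(segment) / len(segment) if segment else 0.0


def split_charge(sequence: str, idrs: List[Tuple[int, int]]) -> Tuple[float, float]:
    """Return (charge per residue in IDRs, charge per residue in the rest).

    `idrs` holds half-open, 0-based (start, end) ranges; everything outside
    them is treated as structural domain in this sketch.
    """
    in_idr = [False] * len(sequence)
    for start, end in idrs:
        for i in range(start, min(end, len(sequence))):
            in_idr[i] = True
    idr_part = "".join(aa for aa, flag in zip(sequence, in_idr) if flag)
    sd_part = "".join(aa for aa, flag in zip(sequence, in_idr) if not flag)
    return charge_per_residue(idr_part), charge_per_residue(sd_part)
```

With the made-up sequence `"KKRAADDE"` and a single IDR covering positions 0 to 3, `split_charge` gives +0.75 per residue for the IDR and -0.75 for the remainder, the kind of contrast the abstract reports for mitochondrial proteins.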

