1.

Intrinsic disorder is a common feature of hub proteins from four eukaryotic interactomes

Haynes C Oldfield CJ Ji F Klitgord N Cusick ME Radivojac P Uversky VN Vidal M Iakoucheva LM 《PLoS computational biology》2006,2(8):e100

2.

Conservation of intrinsic disorder in protein domains and families: I. A database of conserved predicted disordered regions

Chen JW Romero P Uversky VN Dunker AK 《Journal of proteome research》2006,5(4):879-887

Many protein regions have been shown to be intrinsically disordered, lacking unique structure under physiological conditions. These intrinsically disordered regions are not only very common in proteomes, but also crucial to the function of many proteins, especially those involved in signaling, recognition, and regulation. The goal of this work was to identify the prevalence, characteristics, and functions of conserved disordered regions within protein domains and families. A database was created to store the amino acid sequences of nearly one million proteins and their domain matches from the InterPro database, a resource integrating eight different protein family and domain databases. Disorder prediction was performed on these protein sequences. Regions of sequence corresponding to domains were aligned using a multiple sequence alignment tool. From this initial information, regions of conserved predicted disorder were found within the domains. The methodology for this search consisted of finding regions of consecutive positions in the multiple sequence alignments in which 90% or more of the sequences were predicted to be disordered. This procedure was constrained to find such regions of conserved disorder prediction that were at least 20 amino acids in length. The results of this work included 3,653 regions of conserved disorder prediction, found within 2,898 distinct InterPro entries. Most regions of conserved predicted disorder detected were short, with fewer than 10% of those found exceeding 30 residues in length.
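The search rule described in this abstract (runs of consecutive alignment columns in which at least 90% of sequences are predicted disordered, with a minimum run length of 20) can be sketched as follows. This is only an illustration, not the authors' code: the function name, the boolean-matrix input and the choice to count gap positions as "not disordered" are assumptions.

```python
from typing import List, Tuple


def conserved_disorder_regions(
    disorder: List[List[bool]],
    min_fraction: float = 0.9,
    min_length: int = 20,
) -> List[Tuple[int, int]]:
    """Return half-open (start, end) column ranges of conserved predicted disorder.

    disorder[i][j] is True when sequence i is predicted disordered at
    alignment column j. Gap positions are assumed to be encoded as False.
    """
    if not disorder:
        return []
    n_seqs = len(disorder)
    n_cols = len(disorder[0])
    regions: List[Tuple[int, int]] = []
    start = None  # first column of the run currently being extended
    for col in range(n_cols):
        fraction = sum(row[col] for row in disorder) / n_seqs
        if fraction >= min_fraction:
            if start is None:
                start = col
        else:
            # The run ended at the previous column; keep it if long enough.
            if start is not None and col - start >= min_length:
                regions.append((start, col))
            start = None
    # Close a run that extends to the last alignment column.
    if start is not None and n_cols - start >= min_length:
        regions.append((start, n_cols))
    return regions
```

With ten aligned sequences of which nine are predicted disordered over columns 5 to 27, the function reports the single region `(5, 28)`; raising `min_length` above 23 makes it return an empty list.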

3.

SPOT-Disorder2: Improved Protein Intrinsic Disorder Prediction by Ensembled Deep Learning

《Genomics, Proteomics & Bioinformatics》2019,17(6):645-656

Intrinsically disordered or unstructured proteins (or regions in proteins) have been found to be important in a wide range of biological functions and implicated in many diseases. Due to the high cost and low efficiency of experimental determination of intrinsic disorder and the exponential increase of unannotated protein sequences, developing complementary computational prediction methods has been an active area of research for several decades. Here, we employed an ensemble of deep Squeeze-and-Excitation residual inception and long short-term memory (LSTM) networks for predicting protein intrinsic disorder with input from evolutionary information and predicted one-dimensional structural properties. The method, called SPOT-Disorder2, offers substantial and consistent improvement not only over our previous technique based on LSTM networks alone, but also over other state-of-the-art techniques in three independent tests with different ratios of disordered to ordered amino acid residues, and for sequences with either rich or limited evolutionary information. More importantly, semi-disordered regions predicted in SPOT-Disorder2 are more accurate in identifying molecular recognition features (MoRFs) than methods directly designed for MoRFs prediction. SPOT-Disorder2 is available as a web server and as a standalone program at https://sparks-lab.org/server/spot-disorder2/.

4.

Intrinsic disorder in cell-signaling and cancer-associated proteins

Iakoucheva LM Brown CJ Lawson JD Obradović Z Dunker AK 《Journal of molecular biology》2002,323(3):573-584

The number of intrinsically disordered proteins known to be involved in cell-signaling and regulation is growing rapidly. To test for a generalized involvement of intrinsic disorder in signaling and cancer, we applied a neural network predictor of natural disordered regions (PONDR VL-XT) to four protein datasets: human cancer-associated proteins (HCAP), signaling proteins (AfCS), eukaryotic proteins from SWISS-PROT (EU_SW) and non-homologous protein segments with well-defined (ordered) 3D structure (O_PDB_S25). PONDR VL-XT predicts ≥30 consecutive disordered residues for 79(±5)%, 66(±6)%, 47(±4)% and 13(±4)% of the proteins from HCAP, AfCS, EU_SW, and O_PDB_S25, respectively, indicating significantly more intrinsic disorder in cancer-associated and signaling proteins as compared to the two control sets. The disorder analysis was extended to 11 additional functionally diverse categories of human proteins from SWISS-PROT. The proteins involved in metabolism, biosynthesis, and degradation together with kinases, inhibitors, transport, G-protein coupled receptors, and membrane proteins are predicted to have at least twofold less disorder than regulatory, cancer-associated and cytoskeletal proteins. In contrast to 44.5% of the proteins from representative non-membrane categories, just 17.3% of the cancer-associated proteins had sequence alignments with structures in the Protein Data Bank covering at least 75% of their lengths. This relative lack of structural information correlated with the greater amount of predicted disorder in the HCAP dataset. A comparison of disorder predictions with the experimental structural data for a subset of the HCAP proteins indicated good agreement between prediction and observation. Our data suggest that intrinsically unstructured proteins play key roles in cell-signaling, regulation and cancer, where coupled folding and binding is a common mechanism.
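The headline statistic in this abstract, the share of proteins in a dataset that contain at least 30 consecutive residues predicted to be disordered, can be computed from per-residue scores along the following lines. This is a rough sketch: the function names, the dictionary input and the 0.5 score cut-off are assumptions for illustration and are not taken from the abstract.

```python
from typing import Dict, Sequence


def longest_disordered_run(scores: Sequence[float], threshold: float = 0.5) -> int:
    """Length of the longest run of residues scoring at or above threshold."""
    longest = current = 0
    for score in scores:
        if score >= threshold:
            current += 1
            longest = max(longest, current)
        else:
            current = 0
    return longest


def fraction_with_long_disorder(
    dataset: Dict[str, Sequence[float]],
    min_run: int = 30,
    threshold: float = 0.5,
) -> float:
    """Fraction of proteins with a predicted disordered run of at least min_run."""
    if not dataset:
        return 0.0
    hits = sum(
        1
        for scores in dataset.values()
        if longest_disordered_run(scores, threshold) >= min_run
    )
    return hits / len(dataset)
```

Applying `fraction_with_long_disorder` to each dataset in turn would give one percentage per dataset, comparable in form to the four values quoted above.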

5.

Interaction between Intrinsically Disordered Proteins Frequently Occurs in a Human Protein-Protein Interaction Network

Kana Shimizu Hiroyuki Toh 《Journal of molecular biology》2009,392(5):1253-1265

Intrinsic protein disorder is a widespread phenomenon characterised by a lack of stable three-dimensional structures and is considered to play an important role in protein-protein interactions (PPIs). This study examined the genome-wide preference of disorder in PPIs by using exhaustive disorder prediction in human PPIs. We categorised the PPIs into three types (interaction between disordered proteins, interaction between structured proteins, and interaction between a disordered protein and a structured protein) with regard to the flexibility of molecular recognition and compared these three interaction types in an existing human PPI network with those in a randomised network. Although the structured regions were expected to become the identifiers for binding recognition, this comparative analysis revealed unexpected results. The occurrence of interactions between disordered proteins was significantly frequent, and that between a disordered protein and a structured protein was significantly infrequent. We found that this propensity was much stronger in interactions between nonhub proteins. We also analysed the interaction types from a functional standpoint by using GO, which revealed that the interaction between disordered proteins frequently occurred in cellular processes, regulation, and metabolic processes. The number of interactions, especially in metabolic processes between disordered proteins, was 1.8 times as large as that in the randomised network. Another analysis conducted by using KEGG pathways provided results where several signaling pathways and disease-related pathways included many interactions between disordered proteins. All of these analyses suggest that human PPIs preferably occur between disordered proteins and that the flexibility of the interacting protein pairs may play an important role in human PPI networks.
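The comparison described here, counting the three interaction types in the observed network and in a randomised one, might look roughly like the sketch below. The abstract does not say how the randomised network was built, so shuffling the disorder labels among proteins while keeping the edges fixed is an assumption made purely for illustration, as are all function and variable names.

```python
import random
from collections import Counter
from typing import Dict, Iterable, Tuple

Edge = Tuple[str, str]


def interaction_type_counts(edges: Iterable[Edge], is_disordered: Dict[str, bool]) -> Counter:
    """Count disordered-disordered (DD), mixed (DS) and structured-structured (SS) edges."""
    counts = Counter({"DD": 0, "DS": 0, "SS": 0})
    for a, b in edges:
        n_disordered = int(is_disordered[a]) + int(is_disordered[b])
        counts[("SS", "DS", "DD")[n_disordered]] += 1
    return counts


def randomised_counts(
    edges: Iterable[Edge],
    is_disordered: Dict[str, bool],
    n_shuffles: int = 1000,
    seed: int = 0,
) -> Dict[str, float]:
    """Mean counts per interaction type after shuffling labels (assumed null model)."""
    edges = list(edges)
    rng = random.Random(seed)
    proteins = list(is_disordered)
    labels = [is_disordered[p] for p in proteins]
    totals: Counter = Counter()
    for _ in range(n_shuffles):
        rng.shuffle(labels)
        totals.update(interaction_type_counts(edges, dict(zip(proteins, labels))))
    return {kind: totals[kind] / n_shuffles for kind in ("DD", "DS", "SS")}
```

Dividing an observed count by the corresponding randomised mean gives an enrichment ratio of the kind the abstract reports (for example, the 1.8-fold figure for metabolic processes), though the value obtained depends on the null model chosen.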

6.

A comprehensive overview of computational protein disorder prediction methods

Deng X Eickholt J Cheng J 《Molecular bioSystems》2012,8(1):114-121

Over the past decade there has been a growing acknowledgement that a large proportion of proteins within most proteomes contain disordered regions. Disordered regions are segments of the protein chain which do not adopt a stable structure. Recognition of disordered regions in a protein is of great importance for protein structure prediction, protein structure determination and function annotation as these regions have a close relationship with protein expression and functionality. As a result, a great many protein disorder prediction methods have been developed so far. Here, we present an overview of current protein disorder prediction methods including an analysis of their advantages and shortcomings. In order to help users to select alternative tools under different circumstances, we also evaluate 23 disorder predictors on the benchmark data of the most recent round of the Critical Assessment of protein Structure Prediction (CASP) and assess their accuracy using several complementary measures.

7.

Prediction of Protein Binding Regions in Disordered Proteins

Bálint Mészáros István Simon Zsuzsanna Dosztányi 《PLoS computational biology》2009,5(5)

Many disordered proteins function via binding to a structured partner and undergo a disorder-to-order transition. The coupled folding and binding can confer several functional advantages such as the precise control of binding specificity without increased affinity. Additionally, the inherent flexibility allows the binding site to adopt various conformations and to bind to multiple partners. These features explain the prevalence of such binding elements in signaling and regulatory processes. In this work, we report ANCHOR, a method for the prediction of disordered binding regions. ANCHOR relies on the pairwise energy estimation approach that is the basis of IUPred, a previous general disorder prediction method. In order to predict disordered binding regions, we seek to identify segments that are in disordered regions, cannot form enough favorable intrachain interactions to fold on their own, and are likely to gain stabilizing energy by interacting with a globular protein partner. The performance of ANCHOR was found to be largely independent from the amino acid composition and adopted secondary structure. Longer binding sites generally were predicted to be segmented, in agreement with available experimentally characterized examples. Scanning several hundred proteomes showed that the occurrence of disordered binding sites increased with the complexity of the organisms even compared to disordered regions in general. Furthermore, the length distribution of binding sites was different from disordered protein regions in general and was dominated by shorter segments. These results underline the importance of disordered proteins and protein segments in establishing new binding regions. Due to their specific biophysical properties, disordered binding sites generally carry a robust sequence signal, and this signal is efficiently captured by our method. Through its generality, ANCHOR opens new ways to study the essential functional sites of disordered proteins.

8.

Intrinsic disorder and functional proteomics

Radivojac P Iakoucheva LM Oldfield CJ Obradovic Z Uversky VN Dunker AK 《Biophysical journal》2007,92(5):1439-1456

The recent advances in the prediction of intrinsically disordered proteins and the use of protein disorder prediction in the fields of molecular biology and bioinformatics are reviewed here, especially with regard to protein function. First, a close look is taken at intrinsically disordered proteins and then at the methods used for their experimental characterization. Next, the major statistical properties of disordered regions are summarized, and prediction models developed thus far are described, including their numerous applications in functional proteomics. The future of the prediction of protein disorder and the future uses of such predictions in functional proteomics comprise the last section of this article.

9.

Prediction of unfolded segments in a protein sequence based on amino acid composition

Coeytaux K Poupon A 《Bioinformatics (Oxford, England)》2005,21(9):1891-1900

MOTIVATION: Partially and wholly unstructured proteins have now been identified in all kingdoms of life--more commonly in eukaryotic organisms. This intrinsic disorder is related to certain critical functions. Apart from their fundamental interest, unstructured regions in proteins may prevent crystallization. Therefore, the prediction of disordered regions is an important aspect for the understanding of protein function, but may also help to devise genetic constructs. RESULTS: In this paper we present a computational tool for the detection of unstructured regions in proteins based on two properties of unfolded fragments: (1) disordered regions have a biased composition and (2) they usually contain either small or no hydrophobic clusters. In order to quantify these two facts we first calculate the amino acid distributions in structured and unstructured regions. Using this distribution, we calculate for a given sequence fragment the probability to be part of either a structured or an unstructured region. For each amino acid, the distance to the nearest hydrophobic cluster is also computed. Using these three values along a protein sequence allows us to predict unstructured regions, with very simple rules. This method requires only the primary sequence, and no multiple alignment, which makes it an adequate method for orphan proteins. AVAILABILITY: http://genomics.eu.org/

10.

Markov models of amino acid substitution to study proteins with intrinsically disordered regions

Szalkowski AM Anisimova M 《PloS one》2011,6(5):e20488

Background

Intrinsically disordered proteins (IDPs) or proteins with disordered regions (IDRs) do not have a well-defined tertiary structure, but perform a multitude of functions, often relying on their native disorder to achieve the binding flexibility through changing to alternative conformations. Intrinsic disorder is frequently found in all three kingdoms of life, and may occur in short stretches or span whole proteins. To date most studies contrasting the differences between ordered and disordered proteins focused on simple summary statistics. Here, we propose an evolutionary approach to study IDPs, and contrast patterns specific to ordered protein regions and the corresponding IDRs.

Results

Two empirical Markov models of amino acid substitutions were estimated, based on a large set of multiple sequence alignments with experimentally verified annotations of disordered regions from the DisProt database of IDPs. We applied new methods to detect differences in Markovian evolution and evolutionary rates between IDRs and the corresponding ordered protein regions. Further, we investigated the distribution of IDPs among functional categories, biochemical pathways and their preponderance to contain tandem repeats.

Conclusions

We find significant differences in the evolution between ordered and disordered regions of proteins. Most importantly we find that disorder promoting amino acids are more conserved in IDRs, indicating that in some cases not only amino acid composition but the specific sequence is important for function. This conjecture is also reinforced by the observation that for of our data set IDRs evolve more slowly than the ordered parts of the proteins, while we still support the common view that IDRs in general evolve more quickly. The improvement in model fit indicates a possible improvement for various types of analyses e.g. de novo disorder prediction using a phylogenetic Hidden Markov Model based on our matrices showed a performance similar to other disorder predictors.

11.

Prediction and functional analysis of native disorder in proteins from the three kingdoms of life

Ward JJ Sodhi JS McGuffin LJ Buxton BF Jones DT 《Journal of molecular biology》2004,337(3):635-645

12.

Conservation of intrinsic disorder in protein domains and families: II. functions of conserved disorder

Chen JW Romero P Uversky VN Dunker AK 《Journal of proteome research》2006,5(4):888-898

Regions of conserved disorder prediction (CDP) were found in protein domains from all available InterPro member databases, although with varying frequency. These CDP regions were found in proteins from all kingdoms of life, including viruses. However, eukaryotes had 1 order of magnitude more proteins containing long disordered regions than did archaea and bacteria. Sequence conservation in CDP regions varied, but was on average slightly lower than in regions of conserved order. In some cases, disordered regions evolve faster than ordered regions, in others they evolve slower, and in the rest they evolve at roughly the same rate. A variety of functions were found to be associated with domains containing conserved disorder. The most common were DNA/RNA binding, and protein binding. Many ribosomal proteins also were found to contain conserved disordered regions. Other functions identified included membrane translocation and amino acid storage for germination. Due to limitations of current knowledge as well as the methodology used for this work, it was not determined whether these functions were directly associated with the predicted disordered region. However, the functions associated with conserved disorder in this work are in agreement with the functions found in other studies to correlate to disordered regions. We have established that intrinsic disorder may be more common in bacterial and archaeal proteins than previously thought, but this disorder is likely to be used for different purposes than in eukaryotic proteins, as well as occurring in shorter stretches of protein. Regions of predicted disorder were found to be conserved within a large number of protein families and domains. Although many think of such conserved domains as being ordered, in fact a significant number of them contain regions of disorder that are likely to be crucial to their functions.

13.

Hierarchical multi-label prediction of gene function

Barutcuoglu Z Schapire RE Troyanskaya OG 《Bioinformatics (Oxford, England)》2006,22(7):830-836

MOTIVATION: Assigning functions for unknown genes based on diverse large-scale data is a key task in functional genomics. Previous work on gene function prediction has addressed this problem using independent classifiers for each function. However, such an approach ignores the structure of functional class taxonomies, such as the Gene Ontology (GO). Over a hierarchy of functional classes, a group of independent classifiers where each one predicts gene membership to a particular class can produce a hierarchically inconsistent set of predictions, where for a given gene a specific class may be predicted positive while its inclusive parent class is predicted negative. Taking the hierarchical structure into account resolves such inconsistencies and provides an opportunity for leveraging all classifiers in the hierarchy to achieve higher specificity of predictions. RESULTS: We developed a Bayesian framework for combining multiple classifiers based on the functional taxonomy constraints. Using a hierarchy of support vector machine (SVM) classifiers trained on multiple data types, we combined predictions in our Bayesian framework to obtain the most probable consistent set of predictions. Experiments show that over a 105-node subhierarchy of the GO, our Bayesian framework improves predictions for 93 nodes. As an additional benefit, our method also provides implicit calibration of SVM margin outputs to probabilities. Using this method, we make function predictions for multiple proteins, and experimentally confirm predictions for proteins involved in mitosis. SUPPLEMENTARY INFORMATION: Results for the 105 selected GO classes and predictions for 1059 unknown genes are available at: http://function.princeton.edu/genesite/ CONTACT: ogt@cs.princeton.edu.
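The inconsistency that motivates this work (a class predicted positive while one of its parent classes is predicted negative) is easy to detect mechanically. The sketch below only finds such violations for one gene; it does not implement the Bayesian framework that the authors use to resolve them. The function name and the dictionary-based hierarchy encoding are illustrative assumptions.

```python
from typing import Dict, List, Tuple


def hierarchy_violations(
    parents_of: Dict[str, List[str]],
    predicted: Dict[str, bool],
) -> List[Tuple[str, str]]:
    """Return (child, parent) pairs where child is positive but parent is negative.

    parents_of maps each class to its direct parents (a class may have
    several). Classes missing from predicted are treated as negative.
    """
    violations = []
    for child, parents in parents_of.items():
        if not predicted.get(child, False):
            continue
        for parent in parents:
            if not predicted.get(parent, False):
                violations.append((child, parent))
    return sorted(violations)
```

For a three-level chain in which the root class `A` is predicted negative but its descendants `B` and `C` are predicted positive, the function reports only `("B", "A")`, because the `C`-to-`B` edge is itself consistent.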

14.

Intrinsically Semi-disordered State and Its Role in Induced Folding and Protein Aggregation

Tuo Zhang Eshel Faraggi Zhixiu Li Yaoqi Zhou 《Cell biochemistry and biophysics》2013,67(3):1193-1205

Intrinsically disordered proteins (IDPs) refer to those proteins without fixed three-dimensional structures under physiological conditions. Although experiments suggest that the conformations of IDPs can vary from random coils, semi-compact globules, to compact globules with different contents of secondary structures, computational efforts to separate IDPs into different states are not yet successful. Recently, we developed a neural-network-based disorder prediction technique SPINE-D that was ranked as one of the top performing techniques for disorder prediction in the biennial meeting of critical assessment of structure prediction techniques (CASP 9, 2010). Here, we further analyze the results from SPINE-D prediction by defining a semi-disordered state that has about 50% predicted probability to be disordered or ordered. This semi-disordered state is partially collapsed with intermediate levels of predicted solvent accessibility and secondary structure content. The relative difference in compositions between semi-disordered and fully disordered regions is highly correlated with amyloid aggregation propensity (a correlation coefficient of 0.86 if excluding four charged residues and proline, 0.73 if not). In addition, we observed that some semi-disordered regions participate in induced folding, and others play key roles in protein aggregation. More specifically, a semi-disordered region is amyloidogenic in fully unstructured proteins (such as alpha-synuclein and Sup35) but prone to local unfolding that exposes the hydrophobic core to aggregation in structured globular proteins (such as SOD1 and lysozyme). A transition from full disorder to semi-disorder at about 30–40 Qs is observed in the poly-Q (poly-glutamine) tract of huntingtin. The accuracy of using semi-disorder to predict binding-induced folding and aggregation is compared with several methods trained for the purpose. These results indicate the usefulness of three-state classification (order, semi-disorder, and full-disorder) in distinguishing nonfolding from induced-folding and aggregation-resistant from aggregation-prone IDPs and in locating weakly stable, locally unfolding, and potentially aggregation regions in structured proteins. A comparison with five representative disorder-prediction methods showed that SPINE-D is the only method with a clear separation of semi-disorder from ordered and fully disordered states.
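A three-state labelling of this kind can be illustrated by binning per-residue disorder probabilities. The abstract says only that the semi-disordered state has "about 50%" predicted probability, so the 0.4 and 0.6 cut-offs below are hypothetical placeholders rather than the values used with SPINE-D, and the function names are likewise invented for this sketch.

```python
from itertools import groupby
from typing import List, Sequence, Tuple


def three_state(prob: float, lo: float = 0.4, hi: float = 0.6) -> str:
    """Map a disorder probability to order / semi-disorder / full-disorder.

    lo and hi are assumed cut-offs around 0.5, chosen only for illustration.
    """
    if prob < lo:
        return "order"
    if prob > hi:
        return "full-disorder"
    return "semi-disorder"


def state_segments(
    probs: Sequence[float], lo: float = 0.4, hi: float = 0.6
) -> List[Tuple[str, int, int]]:
    """Collapse per-residue states into (state, start, end) half-open segments."""
    segments = []
    position = 0
    for state, group in groupby(three_state(p, lo, hi) for p in probs):
        length = sum(1 for _ in group)
        segments.append((state, position, position + length))
        position += length
    return segments
```

Segment boundaries produced this way are where one would look for transitions such as the full-disorder to semi-disorder change the abstract describes in the poly-Q tract.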

15.

Progressive Clustering Based Method for Protein Function Prediction

Ashish Saini Jingyu Hou 《Bulletin of mathematical biology》2013,75(2):331-350

In recent years, significant effort has been given to predicting protein functions from protein interaction data generated from high throughput techniques. However, predicting protein functions correctly and reliably still remains a challenge. Recently, many computational methods have been proposed for predicting protein functions. Among these methods, clustering based methods are the most promising. The existing methods, however, mainly focus on protein relationship modeling and the prediction algorithms that statically predict functions from the clusters that are related to the unannotated proteins. In fact, the clustering itself is a dynamic process and the function prediction should take this dynamic feature of clustering into consideration. Unfortunately, this dynamic feature of clustering is ignored in the existing prediction methods. In this paper, we propose an innovative progressive clustering based prediction method to trace the functions of relevant annotated proteins across all clusters that are generated through the progressive clustering of proteins. A set of prediction criteria is proposed to predict functions of unannotated proteins from all relevant clusters and traced functions. The method was evaluated on real protein interaction datasets and the results demonstrated the effectiveness of the proposed method compared with representative existing methods.

16.

Exploiting multi-layered information to iteratively predict protein functions

Zhu W Hou J Chen YP 《Mathematical biosciences》2012,236(2):108-116

Background

Similarity-based computational methods are a useful tool for predicting protein functions from protein–protein interaction (PPI) datasets. Although various similarity-based prediction algorithms have been proposed, unsatisfactory prediction results have occurred on many occasions. The purpose of this type of algorithm is to predict functions of an unannotated protein from the functions of those proteins that are similar to the unannotated protein. Therefore, the prediction quality largely depends on how to select a set of proper proteins (i.e., a prediction domain) from which the functions of an unannotated protein are predicted, and how to measure the similarity between proteins. Another issue with existing algorithms is that they treat function prediction as a one-off procedure, ignoring the fact that interactions amongst proteins are mutual and dynamic in terms of similarity when predicting functions. How to resolve these major issues to increase prediction quality remains a challenge in computational biology.

Results

In this paper, we propose an innovative approach to predict protein functions of unannotated proteins iteratively from a PPI dataset. The iterative approach takes into account the mutual and dynamic features of protein interactions when predicting functions, and addresses the issues of protein similarity measurement and prediction domain selection by introducing into the prediction algorithm a new semantic protein similarity and a method of selecting the multi-layer prediction domain. The new protein similarity is based on the multi-layered information carried by protein functions. The evaluations conducted on real protein interaction datasets demonstrated that the proposed iterative function prediction method outperformed other similar or non-iterative methods, and provided better prediction results.

Conclusions

The new protein similarity derived from multi-layered information of protein functions more reasonably reflects the intrinsic relationships among proteins, and significant improvement to the prediction quality can occur through incorporation of mutual and dynamic features of protein interactions into the prediction algorithm.

17.

Intrinsic disorder in the Protein Data Bank

Le Gall T Romero PR Cortese MS Uversky VN Dunker AK 《Journal of biomolecular structure & dynamics》2007,24(4):325-342

The Protein Data Bank (PDB) is the preeminent source of protein structural information. PDB contains over 32,500 experimentally determined 3-D structures solved using X-ray crystallography or nuclear magnetic resonance spectroscopy. Intrinsically disordered regions fail to form a fixed 3-D structure under physiological conditions. In this study, we compare the amino-acid sequences of proteins whose structures are determined by X-ray crystallography with the corresponding sequences from the Swiss-Prot database. The analyzed dataset includes 16,370 structures, which represent 18,101 PDB chains and 5,434 different proteins from 910 different organisms (2,793 eukaryotic, 2,109 bacterial, 288 viral, and 244 archaeal). In this dataset, on average, each Swiss-Prot protein is represented by 7 PDB chains with 76% of the crystallized regions being represented by more than one structure. Intriguingly, the complete sequences of only approximately 7% of proteins are observed in the corresponding PDB structures, and only approximately 25% of the total dataset have >95% of their lengths observed in the corresponding PDB structures. This suggests that the vast majority of PDB proteins are shorter than their corresponding Swiss-Prot sequences and/or contain numerous residues that are not observed in maps of electron density. To determine the prevalence of disordered regions in PDB, the residues in the Swiss-Prot sequences were grouped into four general categories, "Observed" (which correspond to structured regions), "Not observed" (regions with missing electron density, potentially disordered), "Uncharacterized," and "Ambiguous," depending on their appearance in the corresponding PDB entries. This non-redundant set of residues can be viewed as a 'fragment' or empirical domain database that contains a set of experimentally determined structured regions or domains and a set of experimentally verified disordered regions or domains. We studied the propensities and properties of residues in these four categories and analyzed their relations to the predictions of disorder using several algorithms. "Not observed," "Ambiguous," and "Uncharacterized" regions were shown to possess the amino acid compositional biases typical of intrinsically disordered proteins. The application of four different disorder predictors (PONDR® VL-XT, VL3-BA, VSL1P, and IUPred) revealed that the vast majority of residues in the "Observed" dataset are ordered, and that the "Not observed" regions are mostly disordered. The "Uncharacterized" regions possess some tendency toward order, whereas the predictions for the short "Ambiguous" regions are really ambiguous. Long "Ambiguous" regions (>70 amino acid residues) are mostly predicted to be ordered, suggesting that they are likely to be "wobbly" domains. Overall, we showed that completely ordered proteins are not highly abundant in PDB and many PDB sequences have disordered regions. In fact, in the analyzed dataset approximately 10% of the PDB proteins contain regions of consecutive missing or ambiguous residues longer than 30 amino acids and approximately 40% of the proteins possess short regions (≥10 and <30 amino acids long) of missing and ambiguous residues.
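The closing statistics of this abstract rest on measuring runs of consecutive missing or ambiguous residues and sorting them by length. A small sketch of that bookkeeping follows; the one-letter status encoding (`O` observed, `N` not observed, `A` ambiguous, `U` uncharacterized) and the function names are invented here for illustration and are not the authors' data format.

```python
import re
from typing import Dict, List


def missing_or_ambiguous_runs(status: str) -> List[int]:
    """Lengths of consecutive runs of 'N' (not observed) or 'A' (ambiguous) residues."""
    return [match.end() - match.start() for match in re.finditer(r"[NA]+", status)]


def run_length_classes(status: str) -> Dict[str, bool]:
    """Flag whether a protein has a long (>30) or short (>=10 and <30) run.

    The two length classes follow the ranges quoted in the abstract; a run of
    exactly 30 residues falls into neither class under those ranges.
    """
    runs = missing_or_ambiguous_runs(status)
    return {
        "long": any(length > 30 for length in runs),
        "short": any(10 <= length < 30 for length in runs),
    }
```

Tallying the `long` and `short` flags over all proteins in a dataset would yield percentages of the same form as the approximately 10% and 40% figures reported above.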

18.

Environmental Pressure May Change the Composition Protein Disorder in Prokaryotes

Esmeralda Vicedo Avner Schlessinger Burkhard Rost 《PloS one》2015,10(8)

Many prokaryotic organisms have adapted to incredibly extreme habitats. The genomes of such extremophiles differ from their non-extremophile relatives. For example, some proteins in thermophiles sustain high temperatures by being more compact than homologs in non-extremophiles. Conversely, some proteins have increased volumes to compensate for freezing effects in psychrophiles that survive in the cold. Here, we revealed that some differences in organisms surviving in extreme habitats correlate with a simple single feature, namely the fraction of proteins predicted to have long disordered regions. We predicted disorder with different methods for 46 completely sequenced organisms from diverse habitats and found a correlation between protein disorder and the extremity of the environment. More specifically, the overall percentage of proteins with long disordered regions tended to be more similar between organisms of similar habitats than between organisms of similar taxonomy. For example, predictions tended to detect substantially more proteins with long disordered regions in prokaryotic halophiles (survive high salt) than in their taxonomic neighbors. Another peculiar environment is that of high radiation survived, e.g. by Deinococcus radiodurans. The relatively high fraction of disorder predicted in this extremophile might provide a shield against mutations. Although our analysis fails to establish causation, the observed correlation between such a simplistic, coarse-grained, microscopic molecular feature (disorder content) and a macroscopic variable (habitat) remains stunning.

19.

Utilization of protein intrinsic disorder knowledge in structural proteomics

Christopher J. Oldfield Bin Xue Ya-Yue Van Eldon L. Ulrich John L. Markley A. Keith Dunker Vladimir N. Uversky 《Biochimica et Biophysica Acta - Proteins and Proteomics》2013,1834(2):487-498

Intrinsically disordered proteins (IDPs) and proteins with long disordered regions are highly abundant in various proteomes. Despite their lack of well-defined ordered structure, these proteins and regions are frequently involved in crucial biological processes. Although in recent years these proteins have attracted the attention of many researchers, IDPs represent a significant challenge for structural characterization since these proteins can impact many of the processes in the structure determination pipeline. Here we investigate the effects of IDPs on the structure determination process and the utility of disorder prediction in selecting and improving proteins for structural characterization. Examination of the extent of intrinsic disorder in existing crystal structures found that relatively few protein crystal structures contain extensive regions of intrinsic disorder. Although intrinsic disorder is not the only cause of crystallization failures and many structured proteins cannot be crystallized, filtering out highly disordered proteins from structure-determination target lists is still likely to be cost effective. Therefore it is desirable to exclude highly disordered proteins from structure-determination target lists, and we show that disorder prediction can be applied effectively to enrich structure determination pipelines with proteins more likely to yield crystal structures. For structural investigation of specific proteins, disorder prediction can be used to improve targets for structure determination. Finally, a framework for considering intrinsic disorder in the structure determination pipeline is proposed.

20.

Genome-scale gene function prediction using multiple sources of high-throughput data in yeast Saccharomyces cerevisiae

Joshi T Chen Y Becker JM Alexandrov N Xu D 《Omics : a journal of integrative biology》2004,8(4):322-333

Characterizing gene function is one of the major challenging tasks in the post-genomic era. To address this challenge, we have developed GeneFAS (Gene Function Annotation System), a new integrated probabilistic method for cellular function prediction by combining information from protein-protein interactions, protein complexes, microarray gene expression profiles, and annotations of known proteins through an integrative statistical model. Our approach is based on a novel assessment for the relationship between (1) the interaction/correlation of two proteins' high-throughput data and (2) their functional relationship in terms of their Gene Ontology (GO) hierarchy. We have developed a Web server for the predictions. We have applied our method to yeast Saccharomyces cerevisiae and predicted functions for 1548 out of 2472 unannotated proteins.