Similar literature
 Found 20 similar articles; search took 15 ms
1.
The structures of DNA-protein complexes have illuminated the diversity of DNA-protein binding mechanisms shown by different protein families. This lack of generality could pose a great challenge for predicting DNA-protein interactions. To address this issue, we have developed a knowledge-based method, DNA-binding Domain Hunter (DBD-Hunter), for identifying DNA-binding proteins and associated binding sites. The method combines structural comparison and the evaluation of a statistical potential, which we derive to describe interactions between DNA base pairs and protein residues. We demonstrate that DBD-Hunter is an accurate method for predicting DNA-binding function of proteins, and that DNA-binding protein residues can be reliably inferred from the corresponding templates if identified. In benchmark tests on approximately 4000 proteins, our method achieved an accuracy of 98% and a precision of 84%, which significantly outperforms three previous methods. We further validate the method on DNA-binding protein structures determined in DNA-free (apo) state. We show that the accuracy of our method is only slightly affected on apo-structures compared to the performance on holo-structures cocrystallized with DNA. Finally, we apply the method to approximately 1700 structural genomics targets and predict that 37 targets with previously unknown function are likely to be DNA-binding proteins. DBD-Hunter is freely available at http://cssb.biology.gatech.edu/skolnick/webservice/DBD-Hunter/.

2.
Crowded intracellular environments present a challenge for proteins to form functional specific complexes while reducing non‐functional interactions with promiscuous non‐functional partners. Here we show how the need to minimize the waste of resources to non‐functional interactions limits the proteome diversity and the average concentration of co‐expressed and co‐localized proteins. Using the results of high‐throughput Yeast 2‐Hybrid experiments, we estimate the characteristic strength of non‐functional protein–protein interactions. By combining these data with the strengths of specific interactions, we assess the fraction of time proteins spend tied up in non‐functional interactions as a function of their overall concentration. This allows us to sketch the phase diagram for baker's yeast cells using the experimentally measured concentrations and subcellular localization of their proteins. The positions of yeast compartments on the phase diagram are consistent with our hypothesis that the yeast proteome has evolved to operate close to the upper limit of its size while keeping individual protein concentrations sufficiently low to reduce non‐functional interactions. These findings have implications for the conceptual understanding of intracellular compartmentalization, multicellularity and differentiation.

3.
The analysis of proteins in biological membranes forms a major challenge in proteomics. Despite continuous improvements and the development of more sensitive analytical methods, the analysis of membrane proteins has always been hampered by their hydrophobic properties and relatively low abundance. In this review, we describe recent successful strategies that have led to in-depth analyses of the membrane proteome. To facilitate membrane proteome analysis, it is essential that biochemical enrichment procedures are combined with special analytical workflows that are all optimized to cope with hydrophobic polypeptides. These include techniques for protein solubilization, and also well-matched developments in protein separation and protein digestion procedures. Finally, we discuss approaches to target membrane–protein complexes and lipid–protein interactions, as such approaches offer unique insights into function and architecture of cellular membranes.

4.
Characterizing gene function is one of the major challenging tasks in the post-genomic era. To address this challenge, we have developed GeneFAS (Gene Function Annotation System), a new integrated probabilistic method for cellular function prediction by combining information from protein-protein interactions, protein complexes, microarray gene expression profiles, and annotations of known proteins through an integrative statistical model. Our approach is based on a novel assessment for the relationship between (1) the interaction/correlation of two proteins' high-throughput data and (2) their functional relationship in terms of their Gene Ontology (GO) hierarchy. We have developed a Web server for the predictions. We have applied our method to yeast Saccharomyces cerevisiae and predicted functions for 1548 out of 2472 unannotated proteins.

5.
6.
The quest to determine the function of a protein can represent a profound challenge. Although this task is the mandate of countless research groups, a general framework for how it can be approached is conspicuously lacking. Moreover, even expectations for when the function of a protein can be considered to be ‘known’ are not well defined. In this review, we begin by introducing concepts pertinent to the challenge of protein function assignments. We then propose a framework for inferring a protein's function from four data categories: ‘inheritance’, ‘distribution’, ‘interactions’ and ‘phenotypes’ (IDIP). We document that the functions of proteins emerge at the intersection of inferences drawn from these data categories and emphasise the benefit of considering them in an evolutionary context. We then apply this approach to the cellular prion protein (PrPC), well known for its central role in prion diseases, whose function continues to be considered elusive by many investigators. We document that available data converge on the conclusion that the function of the prion protein is to control a critical post-translational modification of the neural cell adhesion molecule in the context of epithelial-to-mesenchymal transition and related plasticity programmes. Finally, we argue that this proposed function of PrPC has already passed the test of time and is concordant with the IDIP framework in a way that other functions considered for this protein fail to achieve. We anticipate that the IDIP framework and the concepts analysed herein will aid the investigation of other proteins whose primary functional assignments have thus far been intractable.

7.
Kaleel M  Torrisi M  Mooney C  Pollastri G 《Amino Acids》2019,51(9):1289-1296

Predicting the three-dimensional structure of proteins is a long-standing challenge of computational biology, as the structure (or lack of a rigid structure) is well known to determine a protein’s function. Predicting relative solvent accessibility (RSA) of amino acids within a protein is a significant step towards resolving the protein structure prediction challenge especially in cases in which structural information about a protein is not available by homology transfer. Today, arguably the core of the most powerful prediction methods for predicting RSA and other structural features of proteins is some form of deep learning, and all the state-of-the-art protein structure prediction tools rely on some machine learning algorithm. In this article we present a deep neural network architecture composed of stacks of bidirectional recurrent neural networks and convolutional layers which is capable of mining information from long-range interactions within a protein sequence and apply it to the prediction of protein RSA using a novel encoding method that we shall call “clipped”. The final system we present, PaleAle 5.0, which is available as a public server, predicts RSA into two, three and four classes at an accuracy exceeding 80% in two classes, surpassing the performances of all the other predictors we have benchmarked.


8.
Background: Similarity-based computational methods are a useful tool for predicting protein functions from protein–protein interaction (PPI) datasets. Although various similarity-based prediction algorithms have been proposed, unsatisfactory prediction results have occurred on many occasions. The purpose of this type of algorithm is to predict functions of an unannotated protein from the functions of those proteins that are similar to the unannotated protein. Therefore, the prediction quality largely depends on how to select a set of proper proteins (i.e., a prediction domain) from which the functions of an unannotated protein are predicted, and how to measure the similarity between proteins. Another issue with existing algorithms is that they treat function prediction as a one-off procedure, ignoring the fact that interactions amongst proteins are mutual and dynamic in terms of similarity when predicting functions. How to resolve these major issues to increase prediction quality remains a challenge in computational biology. Results: In this paper, we propose an innovative approach to predict protein functions of unannotated proteins iteratively from a PPI dataset. The iterative approach takes into account the mutual and dynamic features of protein interactions when predicting functions, and addresses the issues of protein similarity measurement and prediction domain selection by introducing into the prediction algorithm a new semantic protein similarity and a method of selecting the multi-layer prediction domain. The new protein similarity is based on the multi-layered information carried by protein functions. The evaluations conducted on real protein interaction datasets demonstrated that the proposed iterative function prediction method outperformed other similar or non-iterative methods, and provided better prediction results. Conclusions: The new protein similarity derived from multi-layered information of protein functions more reasonably reflects the intrinsic relationships among proteins, and significant improvement to the prediction quality can occur through incorporation of mutual and dynamic features of protein interactions into the prediction algorithm.

9.
Accurate determination of functional interactions among proteins at the genome level remains a challenge for genomic research. Here we introduce a genome-scale approach to functional protein annotation--phylogenomic mapping--that requires only sequence data, can be applied equally well to both finished and unfinished genomes, and can be extended beyond single genomes to annotate multiple genomes simultaneously. We have developed and applied it to more than 200 sequenced bacterial genomes. Proteins with similar evolutionary histories were grouped together, placed on a three dimensional map and visualized as a topographical landscape. The resulting phylogenomic maps display thousands of proteins clustered in mountains on the basis of coinheritance, a strong indicator of shared function. In addition to systematic computational validation, we have experimentally confirmed the ability of phylogenomic maps to predict both mutant phenotype and gene function in the delta proteobacterium Myxococcus xanthus.

10.
The surface of all living cells is decorated with carbohydrate molecules. Hundreds of functional proteins bind to these glycosylated ligands; such binding events subsequently modulate many aspects of protein and cell function. Identifying ligands for glycan-binding proteins (GBPs) is a defining challenge of glycoscience research. Here, we review recent advances that are allowing protein-carbohydrate interactions to be dissected with an unprecedented level of precision. We specifically highlight how cell-based glycan arrays and glyco-genomic profiling are being used to define the structural determinants of glycan-protein interactions in living cells. Going forward, these methods create exciting new opportunities for the study of glycans in physiology and disease.

11.
Experiments and molecular simulations have shown that the hydrophobic mismatch between proteins and membranes contributes significantly to lipid-mediated protein-protein interactions. In this article, we discuss the effect of cholesterol on lipid-mediated protein-protein interactions as a function of hydrophobic mismatch, protein diameter and protein cluster size, lipid tail length, and temperature. To do so, we study a mesoscopic model of a hydrated bilayer containing lipids and cholesterol in which proteins are embedded, with a hybrid dissipative particle dynamics-Monte Carlo method. We propose a mechanism by which cholesterol affects protein interactions: protein-induced, cholesterol-enriched, or cholesterol-depleted lipid shells surrounding the proteins affect the lipid-mediated protein-protein interactions. Our calculations of the potential of mean force between proteins and protein clusters show that the addition of cholesterol dramatically reduces repulsive lipid-mediated interactions between proteins (protein clusters) with positive mismatch, but does not affect attractive interactions between proteins with negative mismatch. Cholesterol has only a modest effect on the repulsive interactions between proteins with different mismatch.

12.
Convergent evolution with combinatorial peptides   Total citations: 1, self-citations: 0, citations by others: 1
Once the sequence of a genome is in hand, understanding the function of its encoded proteins becomes a task of paramount importance. Much like the biochemists who first outlined different biochemical pathways, many genomic scientists are engaged in determining which proteins interact with which proteins, thereby establishing a protein interaction network. While these interactions have evolved in regard to their specificity, affinity and cellular function over billions of years, it is possible in the laboratory to isolate peptides from combinatorial libraries that bind to the same proteins with similar specificity, affinity and primary structures, which resemble those of the natural interacting proteins. We have termed this phenomenon 'convergent evolution'. In this review, we highlight various examples of convergent evolution that have been uncovered in experiments dissecting protein-protein interactions with combinatorial peptides. Thus, a fruitful approach for mapping protein-protein interactions is to isolate peptide ligands to a target protein and identify candidate interacting proteins in a sequenced genome by computer analysis.

13.
High-throughput methods for detecting protein interactions, such as mass spectrometry and yeast two-hybrid assays, continue to produce vast amounts of data that may be exploited to infer protein function and regulation. As this article went to press, the pool of all published interaction information on Saccharomyces cerevisiae was 15,143 interactions among 4,825 proteins, and power-law scaling supports an estimate of 20,000 specific protein interactions. To investigate the biases, overlaps, and complementarities among these data, we have carried out an analysis of two high-throughput mass spectrometry (HMS)-based protein interaction data sets from budding yeast, comparing them to each other and to other interaction data sets. Our analysis reveals 198 interactions among 222 proteins common to both data sets, many of which reflect large multiprotein complexes. It also indicates that a "spoke" model that directly pairs bait proteins with associated proteins is roughly threefold more accurate than a "matrix" model that connects all proteins. In addition, we identify a large, previously unsuspected nucleolar complex of 148 proteins, including 39 proteins of unknown function. Our results indicate that existing large-scale protein interaction data sets are nonsaturating and that integrating many different experimental data sets yields a clearer biological view than any single method alone.
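
The "spoke" and "matrix" interpretations mentioned in the abstract above can be illustrated with a minimal Python sketch. It only shows how the two models turn one purification (a bait plus its co-purified proteins) into pairwise interactions; the function names and the protein identifiers are hypothetical and are not taken from the article.

```python
from itertools import combinations


def spoke_edges(bait, preys):
    """Spoke model: pair the bait directly with each associated protein."""
    return {frozenset((bait, prey)) for prey in preys if prey != bait}


def matrix_edges(bait, preys):
    """Matrix model: connect every pair of proteins seen in the purification."""
    members = sorted({bait, *preys})
    return {frozenset(pair) for pair in combinations(members, 2)}


# Hypothetical purification: one bait and three co-purified proteins.
bait, preys = "BAIT1", ["PREY_A", "PREY_B", "PREY_C"]
spoke = spoke_edges(bait, preys)
matrix = matrix_edges(bait, preys)
print(len(spoke), len(matrix))
```

For a purification with n associated proteins, the spoke model yields n candidate interactions, whereas the matrix model yields (n + 1)n/2, so the matrix model proposes many more pairs from the same data.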

14.
Expanded runs of consecutive trinucleotide CAG repeats encoding polyglutamine (polyQ) stretches are observed in the genes of a large number of patients with different genetic diseases such as Huntington's and several ataxias. Protein aggregation, which is a key feature of most of these diseases, is thought to be triggered by these expanded polyQ sequences in disease-related proteins. However, polyQ tracts are a normal feature of many human proteins, suggesting that they have an important cellular function. To clarify the potential function of polyQ repeats in biological systems, we systematically analyzed available information stored in sequence and protein interaction databases. By integrating genomic, phylogenetic, protein interaction network and functional information, we obtained evidence that polyQ tracts in proteins stabilize protein interactions. This happens most likely through structural changes whereby the polyQ sequence extends a neighboring coiled-coil region to facilitate its interaction with a coiled-coil region in another protein. Alteration of this important biological function due to polyQ expansion results in gain of abnormal interactions, leading to pathological effects like protein aggregation. Our analyses suggest that research on polyQ proteins should shift focus from expanded polyQ proteins to the characterization of the influence of the wild-type polyQ on protein interactions.

15.
Sharabi O  Dekel A  Shifman JM 《Proteins》2011,79(5):1487-1498
Computational prediction of stabilizing mutations into monomeric proteins has become an almost ordinary task. Yet, computational stabilization of protein–protein complexes remains a challenge. Design of protein–protein interactions (PPIs) is impeded by the absence of an energy function that could reliably reproduce all favorable interactions between the binding partners. In this work, we present three energy functions: one function that was trained on monomeric proteins, while the other two were optimized by different techniques to predict side-chain conformations in a dataset of PPIs. The performances of these energy functions are evaluated in three different tasks related to design of PPIs: predicting side-chain conformations in PPIs, recovering native binding-interface sequences, and predicting changes in free energy of binding due to mutations. Our findings show that both functions optimized on side-chain repacking in PPIs are more suitable for PPI design compared to the function trained on monomeric proteins. Yet, no function performs best at all three tasks. Comparison of the three energy functions and their performances revealed that (1) burial of polar atoms should not be penalized significantly in PPI design as in single-protein design and (2) contribution of electrostatic interactions should be increased several-fold when switching from single-protein to PPI design. In addition, the use of a softer van der Waals potential is beneficial in cases when backbone flexibility is important. All things considered, we define an energy function that captures most of the nuances of the binding energetics and hence, should be used in future for design of PPIs.

16.
The genomes of more than 100 species have been sequenced, and the biological functions of encoded proteins are now actively being researched. Protein function is based on interactions between proteins and other molecules. One approach to inferring protein function based on genomic sequence is to predict interactions between an encoded protein and other molecules. As a data source for such predictions, knowledge regarding known protein-small molecule interactions needs to be compiled. We have, therefore, surveyed interactions between proteins and other molecules in Protein Data Bank (PDB), the protein three-dimensional (3D) structure database. Among 20,685 entries in PDB (April, 2003), 4,189 types of small molecules were found to interact with proteins. Biologically relevant small molecules most often found in PDB were metal ions, such as calcium, zinc, and magnesium. Sugars and nucleotides were the next most common. These molecules are known to act as cofactors for enzymes and/or stabilizers of proteins. In each case of interactions between a protein and small molecule, we found preferred amino acid residues at the interaction sites. These preferences can be the basis for predicting protein function from genomic sequence and protein 3D structures. The data pertaining to these small molecules were collected in a database named Het-PDB Navi., which is freely available at http://daisy.nagahama-i-bio.ac.jp/golab/hetpdbnavi.html and linked to the official PDB home page.

17.
A protein interaction network describes a set of physical associations that can occur between proteins. However, within any particular cell or tissue only a subset of proteins is expressed and so only a subset of interactions can occur. Integrating interaction and expression data, we analyze here this interplay between protein expression and physical interactions in humans. Proteins only expressed in restricted cell types, like recently evolved proteins, make few physical interactions. Most tissue‐specific proteins do, however, bind to universally expressed proteins, and so can function by recruiting or modifying core cellular processes. Conversely, most ‘housekeeping’ proteins that are expressed in all cells also make highly tissue‐specific protein interactions. These results suggest a model for the evolution of tissue‐specific biology, and show that most, and possibly all, ‘housekeeping’ proteins actually have important tissue‐specific molecular interactions.

18.
19.
Interactions between proteins are an essential part of biology, and the desire to identify these interactions has led to the development of numerous technologies to systematically map protein–protein interactions at a large scale. As in most cellular processes, protein interactions are central to the control of cell polarity, and a full understanding of polarity will require comprehensive knowledge of the protein interactions involved. At its core, cell polarity is established through carefully regulated mutually inhibitory interactions between several groups of cortical proteins. While several interactions have been identified, the dynamics and molecular mechanisms that control these interactions are not well understood. Cell polarity also needs to be integrated with cellular processes including junction formation, cytoskeletal organization, organelle positioning, protein trafficking, and functional specialization of membrane domains. Moreover, polarized cells need to respond to external cues that coordinate polarity at the tissue level. Identifying the protein–protein interactions responsible for integrating polarity with all of these processes remains a major challenge, in part because the mechanisms of polarity control vary in different contexts and with developmental times. Because of their unbiased nature, systematic large-scale protein–protein interaction mapping approaches can be particularly helpful to identify such mechanisms. Here, we discuss methods commonly used to generate proteome-wide interactome maps, with an emphasis on advances in our understanding of cell polarity that have been achieved through application of such methods.

20.

Background

Accurate annotation of protein functions is still a big challenge for understanding life in the post-genomic era. Many computational methods based on protein-protein interaction (PPI) networks have been proposed to predict the function of proteins. However, the precision of these predictions still needs to be improved, because PPI networks are incomplete and noisy. Integrating network topology and biological information could improve the accuracy of protein function prediction and may also lead to the discovery of multiple interaction types between proteins. Current algorithms generate a single network, which is obtained as a weighted sum of all types of protein interactions.

Method

Different types of interactions do not influence the prediction of protein functions equally. To address this, we construct multilayer protein networks (MPN) by integrating PPI networks, the domains of proteins, and information on protein complexes. In the MPN, there can be more than one type of connection between a pair of proteins. Different types of connections reflect different roles and importance in protein function prediction. Based on the MPN, we propose a new protein function prediction method, named function prediction based on multilayer protein networks (FP-MPN). Given an un-annotated protein, the FP-MPN method visits each layer of the MPN in turn and generates a set of candidate neighbors with known functions. A set of predicted functions for the test protein is then formed, and all of these functions are scored and sorted. Each layer carries a different importance in the prediction of protein functions. A number of top-ranking functions are selected to annotate the unknown protein.

Conclusions

The method proposed in this paper was a better predictor when used on Saccharomyces cerevisiae protein data than other function prediction methods previously used. The proposed FP-MPN method takes different roles of connections in protein function prediction into account to reduce the artificial noise by introducing biological information.
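
The procedure outlined in the Method paragraph (visit each layer, collect annotated neighbors, score and sort candidate functions, keep the top-ranking ones) could look roughly like the Python sketch below. The abstract does not give the scoring formula, so the additive, layer-weighted vote used here is an assumption, as are the function name `predict_functions`, the layer names, the weights, and the toy data.

```python
from collections import defaultdict


def predict_functions(protein, layers, annotations, layer_weights, top_k=3):
    """Rank candidate functions for `protein` by a layer-weighted neighbor vote.

    layers: layer name -> {protein: list of neighbors in that layer}
    annotations: protein -> set of known functions
    layer_weights: layer name -> importance of that layer (assumed values)
    """
    scores = defaultdict(float)
    for name, adjacency in layers.items():          # visit each layer in turn
        weight = layer_weights.get(name, 1.0)
        for neighbor in adjacency.get(protein, ()):
            # Only neighbors with known functions contribute candidates.
            for function in annotations.get(neighbor, ()):
                scores[function] += weight
    # Sort by descending score, breaking ties alphabetically.
    ranked = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
    return [function for function, _ in ranked[:top_k]]


# Hypothetical toy network with three layers.
layers = {
    "ppi": {"P0": ["P1", "P2"]},
    "domain": {"P0": ["P2"]},
    "complex": {"P0": ["P3"]},
}
annotations = {"P1": {"F_a"}, "P2": {"F_a", "F_b"}, "P3": {"F_c"}}
layer_weights = {"ppi": 1.0, "domain": 0.5, "complex": 2.0}

top = predict_functions("P0", layers, annotations, layer_weights, top_k=2)
print(top)
```

Keeping the layers separate, rather than collapsing them into one weighted network first, makes it easy to see which layer contributed each candidate function.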
