Similar articles
1.
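Application of computational methods in protein-protein interaction research   Cited by: 3 (self-citations: 1; citations by others: 2)
Computational methods play an important role at every stage of protein-protein interaction research. This review outlines their application to the study of protein interactions and interaction networks from the following perspectives: protein interaction databases and their development; the use of data mining methods in collecting and integrating protein interaction data; validation of results from high-throughput experiments; prediction and inference of the functions of uncharacterized proteins from protein interaction networks; and prediction of protein interactions.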

2.
MOTIVATION: Short well-defined domains known as peptide recognition modules (PRMs) regulate many important protein-protein interactions involved in the formation of macromolecular complexes and biochemical pathways. Since high-throughput experiments like yeast two-hybrid and phage display are expensive and intrinsically noisy, it would be desirable to target them more specifically, or partially bypass them, with complementary in silico approaches. In the present paper, we present a probabilistic discriminative approach to predicting PRM-mediated protein-protein interactions from sequence data. The model is motivated by the discriminative model of Segal and Sharan as an alternative to the generative approach of Reiss and Schwikowski. In our evaluation, we focus on predicting the interaction network. As proposed by Williams, we overcome the problem of susceptibility to over-fitting by adopting a Bayesian a posteriori approach based on a Laplacian prior in parameter space. RESULTS: The proposed method was tested on two datasets of protein-protein interactions involving 28 SH3 domain proteins in Saccharomyces cerevisiae, where the datasets were obtained with different experimental techniques. The predictions were evaluated with out-of-sample receiver operating characteristic (ROC) curves. In both cases, Laplacian regularization turned out to be crucial for achieving a reasonable generalization performance. The Laplacian-regularized discriminative model outperformed the generative model of Reiss and Schwikowski in terms of the area under the ROC curve on both datasets. The performance was further improved with a hybrid approach, in which our model was initialized with the motifs obtained with the method of Reiss and Schwikowski. AVAILABILITY: Software and supplementary material are available from http://lehrach.com/wolfgang/dmf.
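The abstract above compares models by the area under an out-of-sample ROC curve. As a rough illustration of that metric only (not the authors' evaluation code), the following sketch computes the area from the rank-sum identity: it equals the probability that a randomly chosen interacting pair is scored above a randomly chosen non-interacting pair. The function name `roc_auc` and the example scores and labels are hypothetical.

```python
from typing import Sequence


def roc_auc(scores: Sequence[float], labels: Sequence[int]) -> float:
    """Area under the ROC curve via the rank-sum (Mann-Whitney) identity.

    Returns the fraction of (positive, negative) pairs in which the positive
    example receives the higher score; ties count one half.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical held-out predictions: one score per candidate pair,
# label 1 = experimentally observed interaction, 0 = no interaction.
example_scores = [0.9, 0.8, 0.35, 0.6, 0.2, 0.1]
example_labels = [1, 1, 1, 0, 0, 0]
print(roc_auc(example_scores, example_labels))
```

The quadratic loop is adequate for small benchmark sets such as the one described; for large sets a sort-based rank computation would be the usual choice.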

3.
The vast majority of the chores in the living cell involve protein-protein interactions. Providing details of protein interactions at the residue level and incorporating them into protein interaction networks are crucial to elucidating a dynamic picture of cells. Despite the rapid increase in the number of structurally known protein complexes, we are still far away from a complete network. Given experimental limitations, computational modeling of protein interactions is a prerequisite for progress toward complete structural networks. In this work, we focus on the question 'how do proteins interact?' rather than 'which proteins interact?' and we review structure-based protein-protein interaction prediction approaches. As a sample approach for modeling protein interactions, we describe PRISM in detail; it combines structural similarity and evolutionary conservation in protein interfaces to infer structures of complexes in the protein interaction network. This will ultimately help us to understand the role of protein interfaces in predicting bound conformations.

4.
5.
MOTIVATION: Biological processes in cells are carried out through gene regulation, signal transduction and interactions between proteins. To understand such molecular networks, we propose a statistical method to estimate gene regulatory networks and protein-protein interaction networks simultaneously from DNA microarray data, protein-protein interaction data and other genome-wide data. RESULTS: We unify Bayesian networks and Markov networks for estimating gene regulatory networks and protein-protein interaction networks according to the reliability of each biological information source. Through the simultaneous construction of gene regulatory networks and protein-protein interaction networks of the Saccharomyces cerevisiae cell cycle, we predict the role of several genes whose functions are currently unknown. By using our probabilistic model, we can detect false positives in high-throughput data, such as yeast two-hybrid data. In a genome-wide experiment, we find possible gene regulatory relationships and protein-protein interactions between large protein complexes that underlie complex regulatory mechanisms of biological processes.

6.
Experiments to probe for protein-protein interactions are the focus of functional proteomic studies, thus proteomic data repositories are increasingly likely to contain a large cross-section of such information. Here, we use the Global Proteome Machine database (GPMDB), which is the largest curated and publicly available proteomic data repository derived from tandem mass spectrometry, to develop an in silico protein interaction analysis tool. Using a human histone protein for method development, we positively identified an interaction partner from each histone protein family that forms the histone octameric complex. Moreover, this method, applied to the α subunits of the human proteasome, identified all of the subunits in the 20S core particle. Furthermore, we applied this approach to human integrin αIIb and integrin β3, a major receptor involved in the activation of platelets. We identified 28 proteins, including a protein network for integrin and platelet activation. In addition, proteins interacting with integrin β1 obtained using this method were validated by comparing them to those identified in a formaldehyde-supported coimmunoprecipitation experiment, protein-protein interaction databases and the literature. Our results demonstrate that in silico protein interaction analysis is a novel tool for identifying known/candidate protein-protein interactions and proteins with shared functions in a protein network.

7.
Predictive understanding of the myriad signal transduction pathways in a cell is an outstanding challenge of systems biology. Such pathways are primarily mediated by specific but transient protein-protein interactions, which are difficult to study experimentally. In this study, we dissect the specificity of protein-protein interactions governing two-component signaling (TCS) systems ubiquitously used in bacteria. Exploiting the large number of sequenced bacterial genomes and an operon structure which packages many pairs of interacting TCS proteins together, we developed a computational approach to extract a molecular interaction code capturing the preferences of a small but critical number of directly interacting residue pairs. This code is found to reflect physical interaction mechanisms, with the strongest signal coming from charged amino acids. It is used to predict the specificity of TCS interaction: our results compare favorably to most available experimental results, including the prediction of 7 (out of 8 known) interaction partners of orphan signaling proteins in Caulobacter crescentus. A survey of the available bacterial genomes suggests that 15-25% of the TCS proteins could participate in out-of-operon "crosstalk". Additionally, we predict clusters of crosstalking candidates, expanding from the anecdotally known examples in model organisms. The tools and results presented here can be used to guide experimental studies towards a system-level understanding of two-component signaling.

8.
Interaction between proteins is a fundamental mechanism that underlies virtually all biological processes. Many important interactions are conserved across a large variety of species. The need to maintain interaction leads to a high degree of co-evolution between residues in the interface between partner proteins. The inference of protein-protein interaction networks from the rapidly growing sequence databases is one of the most formidable tasks in systems biology today. We propose here a novel approach based on the Direct-Coupling Analysis of the co-evolution between inter-protein residue pairs. We use ribosomal and trp operon proteins as test cases: for the small and large ribosomal subunits, our approach predicts protein-interaction partners at true-positive rates of 70% and 90%, respectively, within the first 10 predictions, with areas of 0.69 and 0.81, respectively, under the ROC curves for all predictions. In the trp operon, it assigns the two largest interaction scores to the only two interactions experimentally known. On the level of residue interactions, we show that for the small and the large ribosomal subunit our approach predicts interacting residues in the system with true-positive rates of 60% and 85%, respectively, in the first 20 predictions. We use artificial data to show that the performance of our approach depends crucially on the size of the joint multiple sequence alignments, and we analyze how many sequences would be necessary for a perfect prediction if the sequences were sampled from the same model that we use for prediction. Given the performance of our approach on the test data, we speculate that it can be used to detect new interactions, especially in light of the rapid growth of available sequence data.
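The figures quoted above ("true-positive rate within the first 10 predictions") are a top-N measure over a ranked list of candidate pairs. The sketch below shows one way such a number could be computed, assuming each candidate pair carries a single interaction score; it does not implement Direct-Coupling Analysis itself. The function name and the toy protein labels are hypothetical.

```python
from typing import Iterable, List, Tuple

Pair = Tuple[str, str]


def top_n_true_positive_rate(
    scored_pairs: List[Tuple[Pair, float]],
    known_pairs: Iterable[Pair],
    n: int,
) -> float:
    """Fraction of the n highest-scoring pairs that are known interactions.

    Pairs are treated as unordered, so ("A", "B") matches ("B", "A").
    """
    known = {frozenset(p) for p in known_pairs}
    ranked = sorted(scored_pairs, key=lambda item: item[1], reverse=True)[:n]
    if not ranked:
        raise ValueError("no predictions to evaluate")
    hits = sum(1 for pair, _ in ranked if frozenset(pair) in known)
    return hits / len(ranked)


# Hypothetical scores for four candidate pairs and two known interactions.
scored = [(("B", "A"), 0.9), (("A", "C"), 0.8), (("C", "D"), 0.7), (("B", "D"), 0.1)]
known = [("A", "B"), ("C", "D")]
print(top_n_true_positive_rate(scored, known, n=2))
```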

9.
10.
Since many proteins express their functional activity by interacting with other proteins and forming protein complexes, it is very useful to identify sets of proteins that form complexes. For that purpose, many methods for predicting protein complexes from protein-protein interactions have been developed, such as MCL, MCODE, RNSC, PCP, RRW, and NWE. These methods have dealt only with complexes of size more than three because they are often based on some measure of subgraph density. However, heterodimeric protein complexes, which consist of two distinct proteins, account for a large fraction of known complexes according to several comprehensive databases. In this paper, we propose several feature space mappings from protein-protein interaction data, in which each interaction is weighted based on reliability. Furthermore, we make use of prior knowledge on protein domains to develop feature space mappings, a domain composition kernel and its combination kernel with our proposed features. We perform ten-fold cross-validation computational experiments. The results suggest that our proposed kernel considerably outperforms the naive Bayes-based method, which is the best existing method for predicting heterodimeric protein complexes.
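The abstract above reports ten-fold cross-validation experiments. A minimal sketch of how ten train/test splits can be generated is given below, assuming the examples are simply candidate protein pairs in a list; the kernel and classifier from the paper are not reproduced here, and the function name `k_fold_splits` is hypothetical.

```python
import random
from typing import Iterator, List, Sequence, Tuple, TypeVar

T = TypeVar("T")


def k_fold_splits(
    items: Sequence[T], k: int = 10, seed: int = 0
) -> Iterator[Tuple[List[T], List[T]]]:
    """Yield (train, test) lists for k-fold cross-validation.

    Items are shuffled once with a fixed seed, dealt into k folds, and each
    fold serves as the test set exactly once.
    """
    shuffled = list(items)
    if k < 2 or k > len(shuffled):
        raise ValueError("k must be between 2 and the number of items")
    random.Random(seed).shuffle(shuffled)
    folds = [shuffled[i::k] for i in range(k)]
    for i in range(k):
        train = [x for j, fold in enumerate(folds) if j != i for x in fold]
        yield train, folds[i]


# Hypothetical usage with 25 placeholder examples.
for train, test in k_fold_splits(range(25), k=10):
    print(len(train), len(test))
```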

11.
Predicting protein functions with message passing algorithms   Cited by: 2 (self-citations: 0; citations by others: 2)
MOTIVATION: In the last few years, interest in biology has been shifting toward the problem of optimal information extraction from the huge amount of data generated via large-scale and high-throughput techniques. One of the most relevant issues to have emerged recently is that of correctly and reliably predicting the functions of a given protein by exploiting information coming from the whole network of proteins physically interacting with the functionally undetermined one. In the present work, we will refer to an 'observed' protein as one present in the protein-protein interaction networks published in the literature. METHODS: The method proposed in this paper is based on a message passing algorithm known as Belief Propagation, which accepts the network of physical protein interactions and a catalog of known protein functions as input, and returns, for each unclassified protein, the probability of having a chosen function. The implementation of the algorithm allows for fast online analysis, and can easily be generalized to more complex graph topologies that take into account hypergraphs, i.e. complexes of more than two interacting proteins. RESULTS: Benchmarks of our method are the two Saccharomyces cerevisiae protein-protein interaction networks and the Database of Interacting Proteins. The validity of our approach is successfully tested against other available techniques. CONTACT: leone@isiosf.isi.it SUPPLEMENTARY INFORMATION: http://isiosf.isi.it/~pagnani

12.
Probabilistic inference of molecular networks from noisy data sources   Cited by: 1 (self-citations: 0; citations by others: 1)
Information on molecular networks, such as networks of interacting proteins, comes from diverse sources that contain remarkable differences in distribution and quantity of errors. Here, we introduce a probabilistic model useful for predicting protein interactions from heterogeneous data sources. The model describes stochastic generation of protein-protein interaction networks with real-world properties, as well as generation of two heterogeneous sources of protein-interaction information: research results automatically extracted from the literature and yeast two-hybrid experiments. Based on the domain composition of proteins, we use the model to predict protein interactions for pairs of proteins for which no experimental data are available. We further explore the prediction limits, given experimental data that cover only part of the underlying protein networks. This approach can be extended naturally to include other types of biological data sources.

13.
Controlled intra-nuclear organization of proteins is critical for sustaining correct function of the cell. Proteins and RNA are transported by passive diffusion and associate with compartments by virtue of diverse molecular interactions, which presents a challenging problem for data-driven model building. An increasing inventory of proteins with known intra-nuclear destination and the proliferation of molecular interaction data motivate an integrative method that leverages the existing evidence to build accurate models of intranuclear trafficking. Kernel canonical correlation analysis (KCCA) enables the construction of predictors based on genomic sequence data while leveraging other knowledge sources during training. The approach specifically involves the induction of protein sequence features and relations most pertinent to the recovery of nucleolar associated protein-protein interactions. With success rates of about 78%, the classification of nucleolar association from KCCA-induced features surpasses that of baseline approaches. We observe that the coalescence of protein-protein interaction data with sequence data enhances the prediction of highly interconnected, key ribosomal and RNA-related nucleolar proteins. For supplementary material, see www.itee.uq.edu.au/~pprowler/nucleoli.

14.
Physical interactions between proteins mediate a variety of biological functions, including signal transduction, physical structuring of the cell and regulation. While extensive catalogs of such interactions are known from model organisms, their evolutionary histories are difficult to study given the lack of interaction data from phylogenetic outgroups. Using phylogenomic approaches, we infer an upper bound on the time of origin for a large set of human protein-protein interactions, showing that most such interactions appear relatively ancient, dating no later than the radiation of placental mammals. By analyzing paired alignments of orthologous and putatively interacting protein-coding genes from eight mammals, we find evidence for weak but significant co-evolution, as measured by relative selective constraint, between pairs of genes with interacting proteins. However, we find no strong evidence for shared instances of directional selection within an interacting pair. Finally, we use a network approach to show that the distribution of selective constraint across the protein interaction network is non-random, with a clear tendency for interacting proteins to share similar selective constraints. Collectively, the results suggest that, on the whole, protein interactions in mammals are under selective constraint, presumably due to their functional roles.

15.
MOTIVATION: Recent screening techniques have made large amounts of protein-protein interaction data available, from which biologically important information such as the function of uncharacterized proteins, the existence of novel protein complexes, and novel signal-transduction pathways can be discovered. However, experimental data on protein interactions contain many false positives, making these discoveries difficult. Therefore computational methods of assessing the reliability of each candidate protein-protein interaction are urgently needed. RESULTS: We developed a new 'interaction generality' measure (IG2) to assess the reliability of protein-protein interactions using only the topological properties of their interaction-network structure. Using yeast protein-protein interaction data, we showed that reliable protein-protein interactions had significantly lower IG2 values than less-reliable interactions, suggesting that IG2 values can be used to evaluate and filter interaction data to enable the construction of reliable protein-protein interaction networks.

16.
The protein-protein interaction networks of even well-studied model organisms are sketchy at best, highlighting the continued need for computational methods to help direct experimentalists in the search for novel interactions. This need has prompted the development of a number of methods for predicting protein-protein interactions based on various sources of data and methodologies. The common method for choosing negative examples for training a predictor of protein-protein interactions is based on annotations of cellular localization, and the observation that pairs of proteins that have different localization patterns are unlikely to interact. While this method leads to high-quality sets of non-interacting proteins, we find that this choice can lead to biased estimates of prediction accuracy, because the constraints placed on the distribution of the negative examples make the task easier. The effects of this bias are demonstrated in the context of both sequence-based and non-sequence-based features used for predicting protein-protein interactions.
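The two ways of choosing negative examples contrasted above can be sketched as follows, assuming each protein is annotated with a set of cellular compartments. This is an illustration of the selection schemes only, not the authors' pipeline; all function names and the toy annotations are hypothetical.

```python
import itertools
import random
from typing import Dict, Iterable, List, Set, Tuple

Pair = Tuple[str, str]


def candidate_negatives(proteins: Iterable[str], positives: Iterable[Pair]) -> List[Pair]:
    """All unordered protein pairs that are not known interactions."""
    known = {frozenset(p) for p in positives}
    return [
        pair
        for pair in itertools.combinations(sorted(proteins), 2)
        if frozenset(pair) not in known
    ]


def localization_negatives(
    localization: Dict[str, Set[str]], positives: Iterable[Pair]
) -> List[Pair]:
    """Negatives restricted to pairs that share no annotated compartment."""
    return [
        (a, b)
        for a, b in candidate_negatives(localization, positives)
        if localization[a].isdisjoint(localization[b])
    ]


def random_negatives(
    proteins: Iterable[str], positives: Iterable[Pair], k: int, seed: int = 0
) -> List[Pair]:
    """Negatives sampled uniformly from all non-interacting pairs."""
    pool = candidate_negatives(proteins, positives)
    return random.Random(seed).sample(pool, min(k, len(pool)))


# Hypothetical annotations: protein -> set of compartments.
loc = {
    "P1": {"nucleus"},
    "P2": {"nucleus"},
    "P3": {"mitochondrion"},
    "P4": {"cytoplasm", "nucleus"},
}
known = [("P1", "P2")]
print(localization_negatives(loc, known))
print(random_negatives(loc, known, k=2))
```

Training and testing a predictor on both negative sets and comparing the resulting accuracy estimates is one way to expose the bias the abstract describes.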

17.
Recently, several domain-based computational models for predicting protein-protein interactions (PPIs) have been proposed. The conventional methods usually infer domain or domain combination (DC) interactions from already known interacting sets of proteins, and then predict PPIs using the information. However, the majority of these models often have limitations in providing detailed information on which domain pair (single domain interaction) or DC pair (multidomain interaction) will actually interact for the predicted protein interaction. Therefore, a more comprehensive and concrete computational model for the prediction of PPIs is needed. We developed a computational model to predict PPIs using the information of intraprotein domain cohesion and interprotein DC coupling interaction. A method of identifying the primary interacting DC pair was also incorporated into the model in order to infer actual participants in a predicted interaction. Our method made an apparent improvement in the PPI prediction accuracy, and the primary interacting DC pair identification was valid specifically in predicting multidomain protein interactions. In this paper, we demonstrate that 1) the intraprotein domain cohesion is meaningful in improving the accuracy of domain-based PPI prediction, 2) a prediction model incorporating the intradomain cohesion enables us to identify the primary interacting DC pair, and 3) a hybrid approach using the intra/interdomain interaction information can lead to a more accurate prediction.

18.
19.
Correlation of two-hybrid affinity data with in vitro measurements.   Cited by: 28 (self-citations: 8; citations by others: 20)
Since their introduction, the interaction trap and other two-hybrid systems have been used to study protein-protein interactions. Despite their general use, little is known about the extent to which the degree of protein interaction determined by two-hybrid approaches parallels the degree of interaction determined by biochemical techniques. In this study, we used a set of lexAop-LEU2 and lexAop-lacZ reporters to calibrate the interaction trap. For the calibration, we used two sets of proteins, the Myc-Max-Mxi1 helix-loop-helix proteins, and wild-type and dimerization-defective versions of the lambda cI repressor. Our results indicate that the strength of interaction as predicted by the two-hybrid approach generally correlates with that determined in vitro, permitting discrimination of high-, intermediate-, and low-affinity interactions, but there was no single reporter for which the amount of gene expression linearly reflected affinity measured in vitro. However, some reporters showed thresholds and only responded to stronger interactions. Finally, some interactions were subject to directionality, and their apparent strength depended on the reporter used. Taken together, our results provide a cautionary framework for interpreting affinities from two-hybrid experiments.

20.