Similar articles
 20 similar articles found (search time: 31 ms)
1.
Using indirect protein-protein interactions for protein complex prediction   Total citations: 1, self-citations: 0, citations by others: 1
Protein complexes are fundamental for understanding principles of cellular organization. As the sizes of protein-protein interaction (PPI) networks are increasing, accurate and fast protein complex prediction from these PPI networks can serve as a guide for biological experiments to discover novel protein complexes. However, it is not easy to predict protein complexes from PPI networks, especially in situations where the PPI network is noisy and still incomplete. Here, we study the use of indirect interactions between level-2 neighbors (level-2 interactions) for protein complex prediction. We know from previous work that proteins which do not interact but share interaction partners (level-2 neighbors) often share biological functions. We have proposed a method in which all direct and indirect interactions are first weighted using topological weight (FS-Weight), which estimates the strength of functional association. Interactions with low weight are removed from the network, while level-2 interactions with high weight are introduced into the interaction network. Existing clustering algorithms can then be applied to this modified network. We have also proposed a novel algorithm that searches for cliques in the modified network, and merges cliques to form clusters using a "partial clique merging" method. Experiments show that (1) the use of indirect interactions and topological weight to augment protein-protein interactions can be used to improve the precision of clusters predicted by various existing clustering algorithms; and (2) our complex-finding algorithm performs very well on interaction networks modified in this way. Since no other information except the original PPI network is used, our approach would be very useful for protein complex prediction, especially for prediction of novel protein complexes.
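The network modification step described in this abstract (weight direct and level-2 pairs, drop weak direct edges, add strong level-2 edges) can be sketched as follows. This is only an illustration under stated assumptions: the score used here is a plain Jaccard overlap of neighbourhoods, not the FS-Weight measure, and the function names and the threshold are hypothetical.

```python
from itertools import combinations

def neighbors(edges):
    """Build an undirected adjacency map from a list of (u, v) pairs."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj

def shared_partner_score(adj, u, v):
    """Jaccard overlap of the two neighbourhoods, each including the protein itself.

    Assumption: a stand-in for a topological weight, NOT the FS-Weight formula.
    """
    nu = adj[u] | {u}
    nv = adj[v] | {v}
    return len(nu & nv) / len(nu | nv)

def reweight(edges, threshold):
    """Keep direct or level-2 pairs whose score reaches the threshold."""
    adj = neighbors(edges)
    kept = set()
    for u, v in combinations(sorted(adj), 2):
        direct = v in adj[u]
        # Level-2 neighbours: no direct edge, but at least one shared partner.
        level2 = not direct and bool(adj[u] & adj[v])
        if (direct or level2) and shared_partner_score(adj, u, v) >= threshold:
            kept.add((u, v))
    return kept

edges = [("A", "C"), ("A", "D"), ("B", "C"), ("B", "D"), ("A", "E")]
modified = reweight(edges, threshold=0.5)
```

In this toy network the level-2 pair C-D (sharing partners A and B) is added, while the direct edges A-C and A-D fall below the threshold and are removed. Any clustering algorithm could then be run on `modified`.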

2.
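Research progress in gene logic networks   Total citations: 1, self-citations: 0, citations by others: 1
The emergence of massive biological datasets has made exploring biological mechanisms through data analysis and theoretical methods an important approach in theoretical biology. For the complex functional systems of genes in particular, the theoretical approach of constructing gene networks is especially significant. Bowers introduced higher-order logic relationships into the analysis of protein interactions, thereby establishing a systematic method, the logic analysis of phylogenetic profiles (LAPP). Unlike conventional modeling approaches, LAPP starts from the expression data of the elements (or components) of a complex network and uses logic analysis to find the logical relationships among those elements. Starting from protein expression profile data, the method uses an information-entropy-based algorithm to detect the joint effect of two proteins on a third, which is important for discovering new mechanisms of interaction between proteins. Because the set of genes involved in a function usually forms a system made up of a large group, LAPP is also a method for generating complex gene logic networks. Constructing gene logic networks makes it easier to achieve gene regulation through logical control. The method can be applied in many areas, such as species evolution and tumor diagnosis and treatment. This paper systematically describes and analyzes the LAPP method, and reviews recent progress in its methodology and applications.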
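The entropy-based detection of a joint effect of two proteins on a third can be illustrated with a small sketch. It assumes binary presence/absence profiles and scores a relationship with the uncertainty coefficient U(c|x) = [H(c) + H(x) - H(c, x)] / H(c); the profiles `a`, `b`, `c` and the function names are made up for illustration, and only a logical AND is tested.

```python
from collections import Counter
from math import log2

def entropy(values):
    """Shannon entropy (in bits) of a sequence of hashable symbols."""
    n = len(values)
    return -sum((k / n) * log2(k / n) for k in Counter(values).values())

def uncertainty(c, x):
    """U(c|x): fraction of the entropy of c explained by x.

    Undefined (division by zero) when c is constant.
    """
    joint = list(zip(c, x))
    return (entropy(c) + entropy(x) - entropy(joint)) / entropy(c)

# Hypothetical profiles: c is present exactly when both a and b are present.
a = [1, 1, 0, 0, 1, 1, 0, 0]
b = [1, 0, 1, 0, 1, 0, 1, 0]
c = [x & y for x, y in zip(a, b)]
a_and_b = [x & y for x, y in zip(a, b)]

u_single = max(uncertainty(c, a), uncertainty(c, b))  # each protein alone
u_joint = uncertainty(c, a_and_b)                     # the logical combination
```

Here neither `a` nor `b` alone explains much of `c`, while the combination `a AND b` explains it fully, which is the kind of pattern a higher-order logic analysis looks for.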

3.
Biological networks are a topic of great current interest, particularly with the publication of a number of large genome-wide interaction datasets. They are globally characterized by a variety of graph-theoretic statistics, such as the degree distribution, clustering coefficient, characteristic path length and diameter. Moreover, real protein networks are quite complex and can often be divided into many sub-networks through systematic selection of different nodes and edges. For instance, proteins can be sub-divided by expression level, length, amino-acid composition, solubility, secondary structure and function. A challenging research question is to compare the topologies of sub-networks, looking for global differences associated with different types of proteins. TopNet is an automated web tool designed to address this question, calculating and comparing topological characteristics for different sub-networks derived from any given protein network. It provides reasonable solutions to the calculation of network statistics for sub-networks embedded within a larger network and gives simplified views of a sub-network of interest, allowing one to navigate through it. After constructing TopNet, we applied it to the interaction networks and protein classes currently available for yeast. We were able to find a number of potential biological correlations. In particular, we found that soluble proteins had more interactions than membrane proteins. Moreover, amongst soluble proteins, those that were highly expressed, had many polar amino acids, and had many alpha helices, tended to have the most interaction partners. Interestingly, TopNet also turned up some systematic biases in the current yeast interaction network: on average, proteins with a known functional classification had many more interaction partners than those without. This phenomenon may reflect the incompleteness of the experimentally determined yeast interaction network.
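The graph-theoretic statistics named in this abstract can be computed on a toy network with a few lines of standard-library Python. This is a generic sketch, not TopNet's implementation; the helper names and the example graph are assumptions made for illustration.

```python
from collections import Counter, deque

def build_adj(edges):
    """Undirected adjacency map from (u, v) pairs."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj

def degree_distribution(adj):
    """Map each degree to the number of nodes having it."""
    return Counter(len(nbrs) for nbrs in adj.values())

def clustering_coefficient(adj):
    """Average local clustering; nodes with fewer than 2 neighbours count as 0."""
    total = 0.0
    for nbrs in adj.values():
        k = len(nbrs)
        if k < 2:
            continue
        links = sum(1 for a in nbrs for b in nbrs if a < b and b in adj[a])
        total += 2 * links / (k * (k - 1))
    return total / len(adj)

def bfs_distances(adj, source):
    """Shortest-path lengths (in edges) from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for nbr in adj[node]:
            if nbr not in dist:
                dist[nbr] = dist[node] + 1
                queue.append(nbr)
    return dist

def path_stats(adj):
    """(characteristic path length, diameter) over all reachable node pairs."""
    lengths = []
    for s in adj:
        lengths.extend(d for t, d in bfs_distances(adj, s).items() if t != s)
    return sum(lengths) / len(lengths), max(lengths)

# Toy network: a triangle A-B-C with a tail C-D.
adj = build_adj([("A", "B"), ("A", "C"), ("B", "C"), ("C", "D")])
degrees = degree_distribution(adj)
clustering = clustering_coefficient(adj)
mean_path, diameter = path_stats(adj)
```

Running the same functions on a sub-network (for example, only the soluble proteins) and on the whole network gives the kind of comparison the abstract describes.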

4.

Background  

Systems biology makes it possible to study larger and more intricate systems than before, so it is now possible to look at the molecular basis of several diseases in parallel. Analyzing the interaction network of proteins in the cell can be the key to understanding how complex processes lead to diseases. Novel tools in network analysis provide the possibility to quantify the key interacting proteins in large networks as well as proteins that connect them. Here we suggest a new method to study the relationships between topology and functionality of the protein-protein interaction network, by identifying key mediator proteins possibly maintaining indirect relationships among proteins causing various diseases.

5.
Protein-protein interaction (PPI) networks contain a large amount of useful information for the functional characterization of proteins and promote the understanding of the complex molecular relationships that determine the phenotype of a cell. Recently, large human interaction maps have been generated with high-throughput technologies such as the yeast two-hybrid system. However, they are static and incomplete and do not provide immediate clues about the cellular processes that convert genetic information into complex phenotypes. Refined multiple-aspect PPI screening and confirmation strategies will have to be put in place to increase the validity of interaction maps. Integration of interaction data with other qualitative and quantitative information (e.g. protein expression or localization data) will be required to construct networks of protein function that reflect dynamic processes in the cell. In this way, combined PPI networks can become valuable resources for a systems-level understanding of cellular processes and complex phenotypes.

6.

Background

Cellular interaction networks can be used to analyze the effects on cell signaling and other functional consequences of perturbations to cellular physiology. Thus, several methods have been used to reconstitute interaction networks from multiple published datasets. However, the structure and performance of these networks depend on both the quality and the unbiased nature of the original data. Due to the inherent bias against membrane proteins in protein-protein interaction (PPI) data, interaction networks can be compromised, particularly if they are to be used in conjunction with drug screening efforts, since most drug targets are membrane proteins.

Results

To overcome the experimental bias against PPIs involving membrane-associated proteins we used a probabilistic approach based on a hypergeometric distribution followed by logistic regression to simultaneously optimize the weights of different sources of interaction data. The resulting less biased genome-scale network constructed for the budding yeast Saccharomyces cerevisiae revealed that the starvation pathway is a distinct subnetwork of autophagy and retrieved a more integrated network of unfolded protein response genes. We also observed that the centrality-lethality rule depends on the content of membrane proteins in networks.

Conclusions

We show here that the bias against membrane proteins can and should be corrected in order to have a better representation of the interactions and topological properties of protein interaction networks.
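As a rough illustration of the hypergeometric step mentioned in the Results, the sketch below computes the probability of seeing at least a given number of "hits" by chance. What counts as a hit (for example, interaction partners shared by two proteins) is an assumption made here, and the logistic-regression weighting of data sources is not shown.

```python
from math import comb

def hypergeom_tail(population, successes, draws, observed):
    """P(X >= observed) for X ~ Hypergeometric(population, successes, draws).

    Assumes 0 <= observed and successes <= population.
    """
    total = comb(population, draws)
    upper = min(successes, draws)
    favourable = sum(
        comb(successes, k) * comb(population - successes, draws - k)
        for k in range(observed, upper + 1)
    )
    return favourable / total

# Hypothetical numbers: 10 proteins in total, protein X has 4 partners,
# protein Y has 3 partners, and 2 of them are shared.
p_value = hypergeom_tail(population=10, successes=4, draws=3, observed=2)
```

A small tail probability suggests the overlap is unlikely to arise by chance, so it can serve as a confidence score for the pair.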

7.
Ding C, He X, Meraz RF, Holbrook SR. Proteins, 2004, 57(1): 99-108
The protein interaction network presents one perspective for understanding cellular processes. Recent experiments employing high-throughput mass spectrometric characterizations have resulted in large data sets of physiologically relevant multiprotein complexes. We present a unified representation of such data sets based on an underlying bipartite graph model that is an advance over existing models of the network. Our unified representation allows for weighting of connections between proteins shared in more than one complex, as well as addressing the higher level organization that occurs when the network is viewed as consisting of protein complexes that share components. This representation also allows for the application of the rigorous MinMaxCut graph clustering algorithm for the determination of relevant protein modules in the networks. Statistically significant annotations of clusters in the protein-protein and complex-complex networks using terms from the Gene Ontology indicate that this method will be useful for posing hypotheses about uncharacterized components of protein complexes or uncharacterized relationships between protein complexes.

8.
MOTIVATION: The structural interaction of proteins and their domains in networks is one of the most basic molecular mechanisms for biological cells. Topological analysis of such networks can provide an understanding of and solutions for predicting properties of proteins and their evolution in terms of domains. A single paradigm for the analysis of interactions at different layers, such as domain and protein layers, is needed. RESULTS: Applying a colored vertex graph model, we integrated two basic interaction layers under a unified model: (1) structural domains and (2) their protein/complex networks. We identified four basic and distinct elements in the model that explain protein interactions at the domain level. We searched for motifs in the networks to detect their topological characteristics using a pruning strategy and a hash table for rapid detection. We obtained the following results: first, compared with a random distribution, a substantial part of the protein interactions could be explained by domain-level structural interaction information. Second, there were distinct kinds of protein interaction patterns classified by specific and distinguishable numbers of domains. The intermolecular domain interaction was the most dominant protein interaction pattern. Third, despite the coverage of the protein interaction information differing among species, the similarity of their networks indicated shared architectures of protein interaction network in living organisms. Remarkably, there were only a few basic architectures in the model (>10 for a 4-node network topology), and we propose that most biological combinations of domains into proteins and complexes can be explained by a small number of key topological motifs. CONTACT: doheon@kaist.ac.kr.

9.
Revealing organizational principles of biological networks is an important goal of systems biology. In this study, we sought to analyze the dynamic organizational principles within the protein interaction network by studying the characteristics of individual neighborhoods of proteins within the network based on their gene expression as well as protein-protein interaction patterns. By clustering proteins into distinct groups based on their neighborhood gene expression characteristics, we identify several significant trends in the dynamic organization of the protein interaction network. We show that proteins with distinct neighborhood gene expression characteristics are positioned in specific localities in the protein interaction network thereby playing specific roles in the dynamic network connectivity. Remarkably, our analysis reveals a neighborhood characteristic that corresponds to the most centrally located group of proteins within the network. Further, we show that the connectivity pattern displayed by this group is consistent with the notion of “rich club connectivity” in complex networks. Importantly, our findings are largely reproducible in networks constructed using independent and different datasets.

10.

Background

The study of biological interaction networks is a central theme of systems biology. Here, we investigate the relationships between two distinct types of interaction networks: the metabolic pathway map and the protein-protein interaction network (PIN). It has long been established that successive enzymatic steps are often catalyzed by physically interacting proteins forming permanent or transient multi-enzyme complexes. Inspecting high-throughput PIN data, it was shown recently that, indeed, enzymes involved in successive reactions are generally more likely to interact than other protein pairs. In our study, we expanded this line of research to include comparisons of the underlying respective network topologies as well as to investigate whether the spatial organization of enzyme interactions correlates with metabolic efficiency.

Results

Analyzing yeast data, we detected long-range correlations between shortest paths between proteins in both network types, suggesting a mutual correspondence of both network architectures. We discovered that the organizing principles of physical interactions between metabolic enzymes differ from the general PIN of all proteins. While physical interactions between proteins are generally disassortative, enzyme interactions were observed to be assortative. Thus, enzymes frequently interact with other enzymes of similar rather than different degree. Enzymes carrying high flux loads are more likely to physically interact than enzymes with lower metabolic throughput. In particular, enzymes associated with catabolic pathways as well as enzymes involved in the biosynthesis of complex molecules were found to exhibit high degrees of physical clustering. Single proteins were identified that connect major components of the cellular metabolism and may thus be essential for the structural integrity of several biosynthetic systems.

Conclusion

Our results reveal topological equivalences between the protein interaction network and the metabolic pathway network. Evolved protein interactions may contribute significantly towards increasing the efficiency of metabolic processes by permitting higher metabolic fluxes. Thus, our results shed further light on the unifying principles shaping the evolution of both the functional (metabolic) as well as the physical interaction network.
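The distinction between assortative and disassortative networks drawn in the Results can be made concrete with a small sketch. It assumes the usual definition of degree assortativity as the Pearson correlation between the degrees at the two ends of each edge; the function name and the toy graphs are illustrative and are not taken from the study.

```python
def degree_assortativity(edges):
    """Pearson correlation of the degrees at the two ends of each edge.

    Each undirected edge is counted in both directions. Positive values mean
    similar-degree nodes tend to be linked (assortative); negative values mean
    hubs tend to link to low-degree nodes (disassortative).
    """
    deg = {}
    for u, v in edges:
        deg[u] = deg.get(u, 0) + 1
        deg[v] = deg.get(v, 0) + 1
    xs, ys = [], []
    for u, v in edges:
        xs += [deg[u], deg[v]]
        ys += [deg[v], deg[u]]
    n = len(xs)
    mean = sum(xs) / n
    var = sum((x - mean) ** 2 for x in xs) / n
    if var == 0:
        raise ValueError("undefined when all edge ends have the same degree")
    cov = sum((x - mean) * (y - mean) for x, y in zip(xs, ys)) / n
    return cov / var

# A star: one hub linked only to degree-1 leaves.
star = [("H", "a"), ("H", "b"), ("H", "c")]
# A triangle plus a separate pair: every edge joins nodes of equal degree.
equal_degree = [("A", "B"), ("B", "C"), ("A", "C"), ("D", "E")]

r_star = degree_assortativity(star)            # -1.0 (fully disassortative)
r_equal = degree_assortativity(equal_degree)   # 1.0 (fully assortative)
```

Real interaction networks fall between these extremes; the abstract reports that the full PIN leans negative while the enzyme-only sub-network leans positive.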

11.
Proteins participate in complex sets of interactions that represent the mechanistic foundation for much of the physiology and function of the cell. These protein-protein interactions are organized into exquisitely complex networks. The architecture of protein-protein interaction networks was recently proposed to be scale-free, with most of the proteins having only one or two connections but with relatively fewer 'hubs' possessing tens, hundreds or more links. The high level of hub connectivity must somehow be reflected in protein structure. What structural quality of hub proteins enables them to interact with large numbers of diverse targets? One possibility would be to employ binding regions that have the ability to bind multiple, structurally diverse partners. This trait can be imparted by the incorporation of intrinsic disorder in one or both partners. To illustrate the value of such contributions, this review examines the roles of intrinsic disorder in protein network architecture. We show that there are three general ways that intrinsic disorder can contribute: first, intrinsic disorder can serve as the structural basis for hub protein promiscuity; second, intrinsically disordered proteins can bind to structured hub proteins; and third, intrinsic disorder can provide flexible linkers between functional domains, with the linkers enabling mechanisms that facilitate binding diversity. An important research direction will be to determine what fraction of protein-protein interactions in regulatory networks relies on intrinsic disorder.

12.
Recent analyses of biological and artificial networks have revealed a common network architecture, called scale-free topology. The origin of the scale-free topology has been explained by using growth and preferential attachment mechanisms. In a cell, proteins are the most important carriers of function, and are composed of domains as elemental units responsible for the physical interaction between protein pairs. Here, we propose a model for protein–protein interaction networks that reveals the emergence of two possible topologies. We show that depending on the number of randomly selected interacting domain pairs, the connectivity distribution follows either a scale-free distribution, even in the absence of preferential attachment, or a normal distribution. This new approach requires an evolutionary model only of the proteins (nodes), not of the interactions (edges). The edges are added by means of random interaction of domain pairs. As a result, this model offers a new mechanistic explanation for understanding complex networks with a direct biological interpretation because only protein structures and their functions evolved through genetic modifications of amino acid sequences. These findings are supported by numerical simulations as well as experimental data.

13.

Background

WD40 repeat proteins constitute one of the largest families in eukaryotes, and widely participate in various fundamental cellular processes by interacting with other molecules. Based on individual WD40 proteins, previous work has demonstrated that their structural characteristics should confer great potential of interaction and complex formation, and has speculated that they may serve as hubs in the protein-protein interaction (PPI) network. However, what roles the whole family plays in organizing the PPI network, and whether this information can be utilized in complex prediction remain unclear. To address these issues, quantitative and systematic analyses of WD40 proteins from the perspective of PPI networks are highly required.

Results

In this work, we built two human PPI networks by using data sets with different confidence levels, and studied the network properties of the whole human WD40 protein family systematically. Our analyses have quantitatively confirmed that the human WD40 protein family, as a whole, tends to be hubs with an odds ratio of about 1.8 or greater, and the network decomposition has revealed that they are prone to enrich near the global center of the whole network with a fold change of two in the median k-values. By integrating expression profiles, we have further shown that WD40 hub proteins are inclined to be intramodular, which is indicative of complex assembling. Based on this information, we have further predicted 1674 potential WD40-associated complexes by choosing a clique-based method, which is more sensitive than others, and an indirect evaluation by co-expression scores has demonstrated its reliability.

Conclusions

Working at the systems level rather than with sporadic examples, this work provides rich knowledge for better understanding the roles of WD40 proteins in organizing the PPI network. These findings and predicted complexes can offer valuable clues for prioritizing candidates for further studies.

14.
15.
What proteins interacted in a long-extinct ancestor of yeast? How have different members of a protein complex assembled together over time? Our ability to answer such questions has been limited by the unavailability of ancestral protein-protein interaction (PPI) networks. To overcome this limitation, we propose several novel algorithms to reconstruct the growth history of a present-day network. Our likelihood-based method finds a probable previous state of the graph by applying an assumed growth model backwards in time. This approach retains node identities so that the history of individual nodes can be tracked. Using this methodology, we estimate protein ages in the yeast PPI network that are in good agreement with sequence-based estimates of age and with structural features of protein complexes. Further, by comparing the quality of the inferred histories for several different growth models (duplication-mutation with complementarity, forest fire, and preferential attachment), we provide additional evidence that a duplication-based model captures many features of PPI network growth better than models designed to mimic social network growth. From the reconstructed history, we model the arrival time of extant and ancestral interactions and predict that complexes have significantly re-wired over time and that new edges tend to form within existing complexes. We also hypothesize a distribution of per-protein duplication rates, track the change of the network's clustering coefficient, and predict paralogous relationships between extant proteins that are likely to be complementary to the relationships inferred using sequence alone. Finally, we infer plausible parameters for the model, thereby predicting the relative probability of various evolutionary events. The success of these algorithms indicates that parts of the history of the yeast PPI are encoded in its present-day form.

16.
To understand the function of protein complexes and their association with biological processes, many studies have analyzed protein-protein interaction (PPI) networks. However, advances in high-throughput technology have produced a very large amount of data for analysis. Moreover, high levels of noise, sparseness, and skewness in the degree distribution of PPI networks limit the performance of many clustering algorithms and further analysis of their interactions. To address these problems, we present a novel random-walk-based algorithm that converts the incomplete, binary PPI network into a protein-protein topological similarity matrix (PP-TS matrix). We believe that if two proteins share some high-order topological similarities, they are likely to interact with each other. Using the obtained PP-TS matrix, we constructed and used weighted networks to further study and analyze the interaction among proteins. Specifically, we applied a fully automated community structure finding algorithm (Auto-HQcut) on the obtained weighted network to cluster protein complexes. We then analyzed the protein complexes for significance in biological processes. To help visualize and analyze these protein complexes, we also developed an interface that displays the resulting complexes as well as the characteristics associated with each complex. Applying our approach to a yeast protein-protein interaction network, we found that the predicted protein-protein interaction pairs with high topological similarities have more significant biological relevance than the original protein-protein interaction pairs. When we compared our PPI network reconstruction algorithm with other existing algorithms using gene ontology and gene co-expression, our algorithm produced the highest similarity scores. Also, our predicted protein complexes showed a higher accuracy measure compared to the other protein complex predictions.
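The idea of turning a binary network into a topological similarity score with a random walk can be sketched as below. This is a generic random walk with restart, offered only as an illustration; it is not the algorithm of this abstract, and the restart probability, step count, and function names are assumptions.

```python
def build_adj(edges):
    """Undirected adjacency map from (u, v) pairs."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj

def rwr_scores(adj, source, restart=0.5, steps=50):
    """Approximate stationary visiting probabilities of a walk restarting at source."""
    p = {n: 0.0 for n in adj}
    p[source] = 1.0
    for _ in range(steps):
        nxt = {n: 0.0 for n in adj}
        for n, nbrs in adj.items():
            # Spread the non-restarting mass evenly over the neighbours.
            share = (1 - restart) * p[n] / len(nbrs)
            for m in nbrs:
                nxt[m] += share
        nxt[source] += restart
        p = nxt
    return p

def similarity(adj, u, v):
    """Symmetric score: average of the two directed visiting probabilities."""
    return (rwr_scores(adj, u)[v] + rwr_scores(adj, v)[u]) / 2

# Toy network: a triangle A-B-C with a tail C-D-E.
adj = build_adj([("A", "B"), ("A", "C"), ("B", "C"), ("C", "D"), ("D", "E")])
close = similarity(adj, "A", "B")
far = similarity(adj, "A", "E")
```

Pairs inside the same dense region (A and B) score higher than distant pairs (A and E), so thresholding such a matrix yields a weighted network that a community-finding algorithm can cluster.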

17.
Goel A, Li SS, Wilkins MR. Proteomics, 2011, 11(13): 2672-2682
Protein-protein interaction networks are typically built with interactions collated from many experiments. These networks are thus composite and show all interactions that are currently known to occur in a cell. However, these representations are static and ignore the constant changes in protein-protein interactions. Here we present software for the generation and analysis of dynamic, four-dimensional (4-D) protein interaction networks. In this, time-course-derived abundance data are mapped onto three-dimensional networks to generate network movies. These networks can be navigated, manipulated and queried in real time. Two types of dynamic networks can be generated: a 4-D network that maps expression data onto protein nodes and one that employs 'real-time rendering' by which protein nodes and their interactions appear and disappear in association with temporal changes in expression data. We illustrate the utility of this software by the analysis of singlish interface date hub interactions during the yeast cell cycle. In this, we show that proteins MLC1 and YPT52 show strict temporal control of when their interaction partners are expressed. Since these proteins have one and two interaction interfaces, respectively, it suggests that temporal control of gene expression may be used to limit competition at the interaction interfaces of some hub proteins. The software and movies of the 4-D networks are available at http://www.systemsbiology.org.au/downloads_geomi.html.

18.
We introduce here the concept of Implicit networks which provide, like Bayesian networks, a graphical modelling framework that encodes the joint probability distribution for a set of random variables within a directed acyclic graph. We show that Implicit networks, when used in conjunction with appropriate statistical techniques, are very attractive for their ability to understand and analyze biological data. Particularly, we consider here the use of Implicit networks for causal inference in biomolecular pathways. In such pathways, an Implicit network encodes dependencies among variables (proteins, genes), can be trained to learn causal relationships (regulation, interaction) between them and then used to predict the biological response given the status of some key proteins or genes in the network. We show that Implicit networks offer efficient methodologies for learning from observations without prior knowledge and thus provide a good alternative to classical inference in Bayesian networks when priors are missing. We illustrate our approach by an application to simulated data for a simplified signal transduction pathway of the epidermal growth factor receptor (EGFR) protein.

19.
20.
We develop an integrated probabilistic model to combine protein physical interactions, genetic interactions, highly correlated gene expression networks, protein complex data, and domain structures of individual proteins to predict protein functions. The model is an extension of our previous model for protein function prediction based on Markovian random field theory. The model is flexible in that other protein pairwise relationship information and features of individual proteins can be easily incorporated. Two features distinguish the integrated approach from other available methods for protein function prediction. One is that the integrated approach uses all available sources of information with different weights for different sources of data. It is a global approach that takes the whole network into consideration. The second feature is that the posterior probability that a protein has the function of interest is assigned. The posterior probability indicates how confident we are about assigning the function to the protein. We apply our integrated approach to predict functions of yeast proteins based upon MIPS protein function classifications and upon the interaction networks based on MIPS physical and genetic interactions, gene expression profiles, tandem affinity purification (TAP) protein complex data, and protein domain information. We study the recall and precision of the integrated approach using different sources of information by the leave-one-out approach. In contrast to using MIPS physical interactions only, the integrated approach combining all of the information increases the recall from 57% to 87% when the precision is set at 57%, an increase of 30 percentage points.
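The recall and precision figures quoted at the end of this abstract can be unpacked with a short sketch. The protein sets below are invented for illustration; only the 57% and 87% recall values come from the abstract.

```python
def precision_recall(predicted, actual):
    """Precision and recall of a predicted set against the true set."""
    true_positives = len(predicted & actual)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(actual) if actual else 0.0
    return precision, recall

# Hypothetical example: 4 proteins predicted to have a function, 3 truly have it.
predicted = {"p1", "p2", "p3", "p4"}
actual = {"p1", "p2", "p5"}
precision, recall = precision_recall(predicted, actual)

# The reported gain in recall, expressed two ways.
recall_before, recall_after = 0.57, 0.87
absolute_gain = recall_after - recall_before                     # percentage points
relative_gain = (recall_after - recall_before) / recall_before   # relative change
```

The absolute gain is 0.30 (30 percentage points), which corresponds to a relative improvement of roughly 53% over the physical-interactions-only baseline.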
