Similar articles
20 similar articles found.
1.
An ensemble framework for clustering protein-protein interaction networks (cited 3 times in total: 0 self-citations, 3 citations by others)
MOTIVATION: Protein-Protein Interaction (PPI) networks are believed to be important sources of information related to biological processes and complex metabolic functions of the cell. The presence of biologically relevant functional modules in these networks has been theorized by many researchers. However, the application of traditional clustering algorithms for extracting these modules has not been successful, largely due to the presence of noisy false positive interactions as well as specific topological challenges in the network. RESULTS: In this article, we propose an ensemble clustering framework to address this problem. For base clustering, we introduce two topology-based distance metrics to counteract the effects of noise. We develop a PCA-based consensus clustering technique, designed to reduce the dimensionality of the consensus problem and yield informative clusters. We also develop a soft consensus clustering variant to assign multifaceted proteins to multiple functional groups. We conduct an empirical evaluation of different consensus techniques using topology-based, information theoretic and domain-specific validation metrics and show that our approaches can provide significant benefits over other state-of-the-art approaches. Our analysis of the consensus clusters obtained demonstrates that ensemble clustering can (a) produce improved biologically significant functional groupings; and (b) facilitate soft clustering by discovering multiple functional associations for proteins. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.

2.

Background  

Although protein-protein interaction networks determined with high-throughput methods are incomplete, they are commonly used to infer the topology of the complete interactome. These partial networks often show a scale-free behavior with only a few proteins having many and the majority having only a few connections. Recently, the possibility was suggested that this scale-free nature may not actually reflect the topology of the complete interactome but could also be due to the error proneness and incompleteness of large-scale experiments.

3.
The advent of the "omics" era in biology research has brought new challenges and requires the development of novel strategies to answer previously intractable questions. Molecular interaction networks provide a framework to visualize cellular processes, but their complexity often makes their interpretation an overwhelming task. The inherently artificial nature of interaction detection methods and the incompleteness of currently available interaction maps call for a careful and well-informed utilization of this valuable data. In this tutorial, we aim to give an overview of the key aspects that any researcher needs to consider when working with molecular interaction data sets and we outline an example for interactome analysis. Using the molecular interaction database IntAct, the software platform Cytoscape, and its plugins BiNGO and clusterMaker, and taking as a starting point a list of proteins identified in a mass spectrometry-based proteomics experiment, we show how to build, visualize, and analyze a protein-protein interaction network.

4.
The scale free structure $p(k) \sim k^{-\gamma}$ of protein-protein interaction networks can be reproduced by a static physical model in simulation. We inspect the model theoretically, and find the key reason for the model generating apparent scale free degree distributions. This explanation provides a generic mechanism of 'scale free' networks. Moreover, we predict the dependence of $\gamma$ on experimental protein concentrations or other sensitivity factors in detecting interactions, and find experimental evidence to support the prediction.
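
To make the notation concrete, here is a minimal sketch of how an empirical degree distribution p(k) and a rough exponent estimate could be computed from an edge list. It is not the static physical model described in the abstract; the function names and the toy star network are illustrative assumptions, and the log-log least-squares fit is only a crude estimator of gamma.

```python
import math
from collections import Counter


def degree_distribution(edges):
    """Return {k: fraction of nodes with degree k} for an undirected edge list.

    Nodes that appear in no edge are not counted, and a self-loop
    contributes 2 to the degree of its node.
    """
    degree = Counter()
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
    n = len(degree)
    hist = Counter(degree.values())
    return {k: c / n for k, c in sorted(hist.items())}


def loglog_slope_gamma(p):
    """Crude estimate of gamma in p(k) ~ k^(-gamma): the negated
    least-squares slope of log p(k) against log k."""
    if len(p) < 2:
        raise ValueError("need at least two distinct degrees")
    xs = [math.log(k) for k in p]
    ys = [math.log(v) for v in p.values()]
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    return -sxy / sxx


# Hypothetical toy network: one hub connected to four leaves.
star = [("hub", "a"), ("hub", "b"), ("hub", "c"), ("hub", "d")]
p = degree_distribution(star)
gamma = loglog_slope_gamma(p)
print(p, gamma)
```

On real interaction data, a maximum-likelihood fit over a chosen degree range would normally be preferred to a straight-line fit on the log-log histogram.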

5.
6.

Background  

In recent years, a considerable amount of research effort has been directed to the analysis of biological networks with the availability of genome-scale networks of genes and/or proteins of an increasing number of organisms. A protein-protein interaction (PPI) network is a particular biological network which represents physical interactions between pairs of proteins of an organism. Major research on PPI networks has focused on understanding the topological organization of PPI networks, evolution of PPI networks and identification of conserved subnetworks across different species, discovery of modules of interaction, use of PPI networks for functional annotation of uncharacterized proteins, and improvement of the accuracy of currently available networks.

7.
Several approaches have been presented in the literature to cluster Protein-Protein Interaction (PPI) networks. They can be grouped into two main categories: those allowing a protein to participate in different clusters and those generating only nonoverlapping clusters. In both cases, a challenging task is to find a suitable compromise between the biological relevance of the results and a comprehensive coverage of the analyzed networks. Indeed, methods returning highly accurate results are often able to cover only small parts of the input PPI network, especially when poorly characterized networks are considered. We present a coclustering-based technique able to generate both overlapping and nonoverlapping clusters. The density of the clusters to search for can also be set by the user. We tested our method on the yeast and human networks, and compared it to five other well-known techniques on the same interaction data sets. The results showed that, for all the examples considered, our approach always reaches a good compromise between accuracy and network coverage. Furthermore, the behavior of our algorithm is not influenced by the structure of the input network, unlike all the other techniques considered in the comparison, which returned very good results on the yeast network but rather poor ones on the human network.

8.
Alternative splicing plays a key role in the expansion of proteomic and regulatory complexity, yet the functions of the vast majority of differentially spliced exons are not known. In this study, we observe that brain and other tissue-regulated exons are significantly enriched in flexible regions of proteins that likely form conserved interaction surfaces. These proteins participate in significantly more interactions in protein-protein interaction (PPI) networks than other proteins. Using LUMIER, an automated PPI assay, we observe that approximately one-third of analyzed neural-regulated exons affect PPIs. Inclusion of these exons stimulated and repressed different partner interactions at comparable frequencies. This assay further revealed functions of individual exons, including a role for a neural-specific exon in promoting an interaction between Bridging Integrator 1 (Bin1)/Amphiphysin II and Dynamin 2 (Dnm2) that facilitates endocytosis. Collectively, our results provide evidence that regulated alternative exons frequently remodel interactions to establish tissue-dependent PPI networks.

9.
Proteins carry out their functions by interacting with other proteins and small molecules, forming a complex interaction network. In this review, we briefly introduce classical graph theory based protein-protein interaction networks. We also describe the commonly used experimental methods to construct these networks, and the insights that can be gained from these networks. We then discuss the recent transition from graph theory based networks to structure based protein-protein interaction networks and the advantages of the latter over the former, using two networks as examples. We further discuss the usefulness of structure based protein-protein interaction networks for drug discovery, with a special emphasis on drug repositioning.

10.
Itzhaki Z. PloS one, 2011, 6(7): e21724
Protein-domains play an important role in mediating protein-protein interactions. Furthermore, the same domain-pairs mediate different interactions in different contexts and in various organisms, and therefore domain-pairs are considered as the building blocks of interactome networks. Here we extend these principles to the host-virus interface and find the domain-pairs that potentially mediate human-herpesvirus interactions. Notably, we find that the same domain-pairs used by other organisms for mediating their interactions underlie statistically significant fractions of human-virus protein inter-interaction networks. Our analysis shows that viral domains tend to interact with human domains that are hubs in the human domain-domain interaction network. This may enable the virus to easily interfere with a variety of mechanisms and processes involving various and different human proteins carrying the relevant hub domain. Comparative genomics analysis provides hints at a molecular mechanism by which the virus acquired some of its interacting domains from its human host.

11.

Background

Numerous centrality measures have been introduced to identify “central” nodes in large networks. The availability of a wide range of measures for ranking influential nodes leaves the user to decide which measure may best suit the analysis of a given network. The choice of a suitable measure is furthermore complicated by the impact of the network topology on ranking influential nodes by centrality measures. To approach this problem systematically, we examined the centrality profile of nodes of yeast protein-protein interaction networks (PPINs) in order to detect which centrality measure is succeeding in predicting influential proteins. We studied how different topological network features are reflected in a large set of commonly used centrality measures.

Results

We used yeast PPINs to compare 27 common centrality measures. The measures characterize and assort influential nodes of the networks. We applied principal component analysis (PCA) and hierarchical clustering and found that the most informative measures depend on the network’s topology. Interestingly, some measures had a high level of contribution in comparison to others in all PPINs, namely Latora closeness, Decay, Lin, Freeman closeness, Diffusion, Residual closeness and Average distance centralities.

Conclusions

The choice of a suitable set of centrality measures is crucial for inferring important functional properties of a network. We concluded that undertaking data reduction using unsupervised machine learning methods helps to choose appropriate variables (centrality measures). Hence, we proposed identifying the contribution proportions of the centrality measures with PCA as a prerequisite step of network analysis before inferring functional consequences, e.g., essentiality of a node.
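
As an illustration of the "contribution proportions" idea, the sketch below runs PCA on a nodes-by-measures matrix and reports how much each measure contributes to each principal component. It assumes NumPy is available and uses one common definition of contribution (squared loading as a percentage); the abstract does not state which definition the authors used. The matrix values and the measure names are made up for the example.

```python
import numpy as np


def pca_contributions(X):
    """PCA on a (nodes x measures) matrix of centrality scores.

    Returns (explained, contrib):
      explained[c]  - share of total variance carried by component c
      contrib[j, c] - percentage contribution of measure j to component c
                      (squared loading * 100; each column sums to 100)
    Columns with zero variance are not handled and would divide by zero.
    """
    X = np.asarray(X, dtype=float)
    # Standardize each measure so that scale differences do not dominate.
    Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
    # Rows of vt are the unit-length principal axes (loadings).
    _, s, vt = np.linalg.svd(Z, full_matrices=False)
    explained = s ** 2 / np.sum(s ** 2)
    contrib = 100.0 * vt.T ** 2
    return explained, contrib


# Hypothetical scores for five nodes under three hypothetical measures.
measures = ["degree", "closeness", "avg_distance"]
scores = [
    [0.10, 0.30, 2.0],
    [0.40, 0.35, 1.5],
    [0.20, 0.50, 3.0],
    [0.90, 0.80, 1.0],
    [0.50, 0.60, 2.5],
]
explained, contrib = pca_contributions(scores)
for name, row in zip(measures, contrib):
    print(name, np.round(row, 1))
```

Measures with a large contribution to the first few components are the ones such an analysis would retain before drawing functional conclusions.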

12.
Goel A, Li SS, Wilkins MR. Proteomics, 2011, 11(13): 2672-2682
Protein-protein interaction networks are typically built with interactions collated from many experiments. These networks are thus composite and show all interactions that are currently known to occur in a cell. However, these representations are static and ignore the constant changes in protein-protein interactions. Here we present software for the generation and analysis of dynamic, four-dimensional (4-D) protein interaction networks. In this, time-course-derived abundance data are mapped onto three-dimensional networks to generate network movies. These networks can be navigated, manipulated and queried in real time. Two types of dynamic networks can be generated: a 4-D network that maps expression data onto protein nodes and one that employs 'real-time rendering' by which protein nodes and their interactions appear and disappear in association with temporal changes in expression data. We illustrate the utility of this software by the analysis of singlish interface date hub interactions during the yeast cell cycle. In this, we show that proteins MLC1 and YPT52 show strict temporal control of when their interaction partners are expressed. Since these proteins have one and two interaction interfaces, respectively, it suggests that temporal control of gene expression may be used to limit competition at the interaction interfaces of some hub proteins. The software and movies of the 4-D networks are available at http://www.systemsbiology.org.au/downloads_geomi.html.

13.

Background  

The sparse connectivity of protein-protein interaction data sets makes identification of functional modules challenging. The purpose of this study is to critically evaluate a novel clustering technique for clustering and detecting functional modules in protein-protein interaction networks, termed STM.

14.
Conservation planning requires knowledge of the distribution of all species in the area of interest. Surrogates for biodiversity are considered as a possible solution. The two major types are biological and environmental surrogates. Here, we evaluate four different methods of hierarchical clustering, as well as one non-hierarchical method, in the context of producing surrogates for biodiversity. Each clustering method was used to produce maps of both surrogate types. We evaluated the representativeness of each clustering method by finding the average number of species represented in a set of sites, one site of each domain, which was carried out with a Monte Carlo permutation procedure. We propose an additional measure of surrogate performance, which is the degree of evenness of the different domains, e.g., by calculating Simpson's diversity index. Surrogates with low evenness leave little flexibility in site selection since often some of the domains may be represented by a single or very few sites, and thus surrogate maps with a high Simpson's index value may be more relevant for actual decision making. We found that there is a trade-off between species representativeness and evenness. Centroid clustering represented the most species, but had very low values of evenness. Ward's method of minimum variance represented more species than a random choice, and had high evenness values. Using the typical evaluation measures, the Centroid clustering method was most efficient for surrogate production. However, when Simpson's index is also considered, Ward's method of minimum variance is more appropriate for managers.
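
The evenness measure mentioned in this abstract can be illustrated with a short sketch. It assumes the Gini-Simpson form of the index, 1 - sum(p_i^2), where p_i is the share of sites assigned to domain i; the abstract does not say which variant of Simpson's index was used, and the domain labels below are hypothetical.

```python
from collections import Counter


def simpson_index(domain_labels):
    """Gini-Simpson index 1 - sum(p_i^2) over the domains' shares of sites.

    Higher values mean sites are spread more evenly across domains.
    """
    counts = Counter(domain_labels)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("need at least one site")
    return 1.0 - sum((n / total) ** 2 for n in counts.values())


# Hypothetical surrogate maps: each entry is the domain assigned to one site.
even_map = ["A", "A", "B", "B", "C", "C"]
uneven_map = ["A", "A", "A", "A", "B", "C"]
print(simpson_index(even_map), simpson_index(uneven_map))
```

In the uneven map two domains are represented by a single site each, which is the low-flexibility situation the abstract describes, and the index is correspondingly lower.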

15.
16.
The biological mechanisms through which proteins interact with one another are best revealed by studying the structural interfaces between interacting proteins. Protein-protein interfaces can be extracted from three-dimensional (3D) structural data of protein complexes and then clustered to derive biological insights. However, conventional protein interface clustering methods lack computational scalability and statistical support. In this work, we present a new method named "PPiClust" to systematically encode, cluster, and analyze similar 3D interface patterns in protein complexes efficiently. Experimental results showed that our method is effective in discovering visually consistent and statistically significant clusters of interfaces, and at the same time sufficiently time-efficient to be performed on a single computer. The interface clusters are also useful for uncovering the structural basis of protein interactions. Analysis of the resulting interface clusters revealed groups of structurally diverse proteins having similar interface patterns. We also found, in some of the interface clusters, the presence of well-known linear binding motifs which were noncontiguous in the primary sequences. These results suggest that PPiClust can discover not only statistically significant, but also biologically significant, protein interface clusters from protein complex structural data.

17.

Background  

The abundant data available for protein interaction networks have not yet been fully understood. New types of analyses are needed to reveal organizational principles of these networks to investigate the details of functional and regulatory clusters of proteins.

18.
Global protein function prediction from protein-protein interaction networks (cited 20 times in total: 0 self-citations, 20 citations by others)
Determining protein function is one of the most challenging problems of the post-genomic era. The availability of entire genome sequences and of high-throughput capabilities to determine gene coexpression patterns has shifted the research focus from the study of single proteins or small complexes to that of the entire proteome. In this context, the search for reliable methods for assigning protein function is of primary importance. There are various approaches available for deducing the function of proteins of unknown function using information derived from sequence similarity or clustering patterns of co-regulated genes, phylogenetic profiles, protein-protein interactions (refs. 5-8 and Samanta, M.P. and Liang, S., unpublished data), and protein complexes. Here we propose the assignment of proteins to functional classes on the basis of their network of physical interactions as determined by minimizing the number of protein interactions among different functional categories. Function assignment is proteome-wide and is determined by the global connectivity pattern of the protein network. The approach results in multiple functional assignments, a consequence of the existence of multiple equivalent solutions. We apply the method to analyze the yeast Saccharomyces cerevisiae protein-protein interaction network. The robustness of the approach is tested in a system containing a high percentage of unclassified proteins and also in cases of deletion and insertion of specific protein interactions.
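
The objective described in this abstract, counting interactions that join proteins in different functional categories and choosing labels that minimize the count, can be sketched on a toy network. The abstract does not say how the minimization is performed; the exhaustive search below is a stand-in that is only feasible for a handful of unclassified proteins, and all protein and category names are hypothetical.

```python
from itertools import product


def cross_category_edges(edges, assignment):
    """Number of interactions whose two proteins have different categories."""
    return sum(1 for u, v in edges if assignment[u] != assignment[v])


def assign_functions(edges, known, unknown, categories):
    """Try every labeling of the unknown proteins and keep all labelings
    that minimize the number of cross-category interactions."""
    best_cost, best = None, []
    for labels in product(categories, repeat=len(unknown)):
        assignment = dict(known)
        assignment.update(zip(unknown, labels))
        cost = cross_category_edges(edges, assignment)
        if best_cost is None or cost < best_cost:
            best_cost, best = cost, [assignment]
        elif cost == best_cost:
            best.append(assignment)
    return best_cost, best


# Hypothetical network: x and y are unclassified proteins.
edges = [("a", "x"), ("b", "x"), ("x", "y"), ("y", "c")]
known = {"a": "transport", "b": "transport", "c": "signaling"}
cost, solutions = assign_functions(
    edges, known, ["x", "y"], ["transport", "signaling"]
)
print(cost, [(s["x"], s["y"]) for s in solutions])
```

Here two labelings tie at the minimum: x is assigned to transport in both, while y can take either category. This mirrors the abstract's remark that multiple equivalent solutions lead to multiple functional assignments.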

19.
We introduce clustering with overlapping neighborhood expansion (ClusterONE), a method for detecting potentially overlapping protein complexes from protein-protein interaction data. ClusterONE-derived complexes for several yeast data sets showed better correspondence with reference complexes in the Munich Information Center for Protein Sequence (MIPS) catalog and complexes derived from the Saccharomyces Genome Database (SGD) than the results of seven popular methods. The results also showed a high extent of functional homogeneity.

20.
Protein-protein interaction (PPI) networks contain a large amount of useful information for the functional characterization of proteins and promote the understanding of the complex molecular relationships that determine the phenotype of a cell. Recently, large human interaction maps have been generated with high throughput technologies such as the yeast two-hybrid system. However, they are static and incomplete and do not provide immediate clues about the cellular processes that convert genetic information into complex phenotypes. Refined multiple-aspect PPI screening and confirmation strategies will have to be put in place to increase the validity of interaction maps. Integration of interaction data with other qualitative and quantitative information (e.g. protein expression or localization data) will be required to construct networks of protein function that reflect dynamic processes in the cell. In this way, combined PPI networks can become valuable resources for a systems-level understanding of cellular processes and complex phenotypes.
