Background
A goal of systems biology is to analyze large-scale molecular networks including gene expressions and protein-protein interactions, revealing the relationships between network structures and their biological functions. Dividing a protein-protein interaction (PPI) network into naturally grouped parts is an essential way to investigate the relationship between topology of networks and their functions. However, clear modular decomposition is often hard due to the heterogeneous or scale-free properties of PPI networks.Methodology/Principal Findings
To address this problem, we propose a diffusion model-based spectral clustering algorithm, which analytically solves the cluster structure of PPI networks as a problem of random walks in the diffusion process in them. To cope with the heterogeneity of the networks, the power factor is introduced to adjust the diffusion matrix by weighting the transition (adjacency) matrix according to a node degree matrix. This algorithm is named adjustable diffusion matrix-based spectral clustering (ADMSC). To demonstrate the feasibility of ADMSC, we apply it to decomposition of a yeast PPI network, identifying biologically significant clusters with approximately equal size. Compared with other established algorithms, ADMSC facilitates clear and fast decomposition of PPI networks.Conclusions/Significance
ADMSC is proposed by introducing the power factor that adjusts the diffusion matrix to the heterogeneity of the PPI networks. ADMSC effectively partitions PPI networks into biologically significant clusters with almost equal sizes, while being very fast, robust and appealing simple. 相似文献Background
Numerous centrality measures have been introduced to identify “central” nodes in large networks. The availability of a wide range of measures for ranking influential nodes leaves the user to decide which measure may best suit the analysis of a given network. The choice of a suitable measure is furthermore complicated by the impact of the network topology on ranking influential nodes by centrality measures. To approach this problem systematically, we examined the centrality profile of nodes of yeast protein-protein interaction networks (PPINs) in order to detect which centrality measure is succeeding in predicting influential proteins. We studied how different topological network features are reflected in a large set of commonly used centrality measures.Results
We used yeast PPINs to compare 27 common of centrality measures. The measures characterize and assort influential nodes of the networks. We applied principal component analysis (PCA) and hierarchical clustering and found that the most informative measures depend on the network’s topology. Interestingly, some measures had a high level of contribution in comparison to others in all PPINs, namely Latora closeness, Decay, Lin, Freeman closeness, Diffusion, Residual closeness and Average distance centralities.Conclusions
The choice of a suitable set of centrality measures is crucial for inferring important functional properties of a network. We concluded that undertaking data reduction using unsupervised machine learning methods helps to choose appropriate variables (centrality measures). Hence, we proposed identifying the contribution proportions of the centrality measures with PCA as a prerequisite step of network analysis before inferring functional consequences, e.g., essentiality of a node.Background
Recently, large data sets of protein-protein interactions (PPI) which can be modeled as PPI networks are generated through high-throughput methods. And locally dense regions in PPI networks are very likely to be protein complexes. Since protein complexes play a key role in many biological processes, detecting protein complexes in PPI networks is one of important tasks in post-genomic era. However, PPI networks are often incomplete and noisy, which builds barriers to mining protein complexes.Results
We propose a new and effective algorithm based on robustness to detect overlapping clusters as protein complexes in PPI networks. And in order to improve the accuracy of resulting clusters, our algorithm tries to reduce bad effects brought by noise in PPI networks. And in our algorithm, each new cluster begins from a seed and is expanded through adding qualified nodes from the cluster's neighbourhood nodes. Besides, in our algorithm, a new distance measurement method between a cluster K and a node in the neighbours of K is proposed as well. The performance of our algorithm is evaluated by applying it on two PPI networks which are Gavin network and Database of Interacting Proteins (DIP). The results show that our algorithm is better than Markov clustering algorithm (MCL), Clique Percolation method (CPM) and core-attachment based method (CoAch) in terms of F-measure, co-localization and Gene Ontology (GO) semantic similarity.Conclusions
Our algorithm detects locally dense regions or clusters as protein complexes. The results show that protein complexes generated by our algorithm have better quality than those generated by some previous classic methods. Therefore, our algorithm is effective and useful.Background
Protein complexes can be identified from the protein interaction networks derived from experimental data sets. However, these analyses are challenging because of the presence of unreliable interactions and the complex connectivity of the network. The integration of protein-protein interactions with the data from other sources can be leveraged for improving the effectiveness of protein complexes detection algorithms.Methods
We have developed novel semantic similarity method, which use Gene Ontology (GO) annotations to measure the reliability of protein-protein interactions. The protein interaction networks can be converted into a weighted graph representation by assigning the reliability values to each interaction as a weight. Following the approach of that of the previously proposed clustering algorithm IPCA which expands clusters starting from seeded vertices, we present a clustering algorithm OIIP based on the new weighted Protein-Protein interaction networks for identifying protein complexes.Results
The algorithm OIIP is applied to the protein interaction network of Sacchromyces cerevisiae and identifies many well known complexes. Experimental results show that the algorithm OIIP has higher F-measure and accuracy compared to other competing approaches.Background
Recent advancements in experimental biotechnology have produced large amounts of protein-protein interaction (PPI) data. The topology of PPI networks is believed to have a strong link to their function. Hence, the abundance of PPI data for many organisms stimulates the development of computational techniques for the modeling, comparison, alignment, and clustering of networks. In addition, finding representative models for PPI networks will improve our understanding of the cell just as a model of gravity has helped us understand planetary motion. To decide if a model is representative, we need quantitative comparisons of model networks to real ones. However, exact network comparison is computationally intractable and therefore several heuristics have been used instead. Some of these heuristics are easily computable "network properties," such as the degree distribution, or the clustering coefficient. An important special case of network comparison is the network alignment problem. Analogous to sequence alignment, this problem asks to find the "best" mapping between regions in two networks. It is expected that network alignment might have as strong an impact on our understanding of biology as sequence alignment has had. Topology-based clustering of nodes in PPI networks is another example of an important network analysis problem that can uncover relationships between interaction patterns and phenotype. 相似文献Background
The goal of the study was to demonstrate a hierarchical structure of resting state activity in the healthy brain using a data-driven clustering algorithm.Methodology/Principal Findings
The fuzzy-c-means clustering algorithm was applied to resting state fMRI data in cortical and subcortical gray matter from two groups acquired separately, one of 17 healthy individuals and the second of 21 healthy individuals. Different numbers of clusters and different starting conditions were used. A cluster dispersion measure determined the optimal numbers of clusters. An inner product metric provided a measure of similarity between different clusters. The two cluster result found the task-negative and task-positive systems. The cluster dispersion measure was minimized with seven and eleven clusters. Each of the clusters in the seven and eleven cluster result was associated with either the task-negative or task-positive system. Applying the algorithm to find seven clusters recovered previously described resting state networks, including the default mode network, frontoparietal control network, ventral and dorsal attention networks, somatomotor, visual, and language networks. The language and ventral attention networks had significant subcortical involvement. This parcellation was consistently found in a large majority of algorithm runs under different conditions and was robust to different methods of initialization.Conclusions/Significance
The clustering of resting state activity using different optimal numbers of clusters identified resting state networks comparable to previously obtained results. This work reinforces the observation that resting state networks are hierarchically organized. 相似文献Background
Although focal epilepsies are increasingly recognized to affect multiple and remote neural systems, the underlying spatiotemporal pattern and the relationships between recurrent spontaneous seizures, global functional connectivity, and structural integrity remain largely unknown.Methodology/Principal Findings
Here we utilized serial resting-state functional MRI, graph-theoretical analysis of complex brain networks and diffusion tensor imaging to characterize the evolution of global network topology, functional connectivity and structural changes in the interictal brain in relation to focal epilepsy in a rat model. Epileptic networks exhibited a more regular functional topology than controls, indicated by a significant increase in shortest path length and clustering coefficient. Interhemispheric functional connectivity in epileptic brains decreased, while intrahemispheric functional connectivity increased. Widespread reductions of fractional anisotropy were found in white matter regions not restricted to the vicinity of the epileptic focus, including the corpus callosum.Conclusions/Significance
Our longitudinal study on the pathogenesis of network dynamics in epileptic brains reveals that, despite the locality of the epileptogenic area, epileptic brains differ in their global network topology, connectivity and structural integrity from healthy brains. 相似文献Background
A network motif is a sub-network that occurs frequently in a given network. Detection of such motifs is important since they uncover functions and local properties of the given biological network. Finding motifs is however a computationally challenging task as it requires solving the costly subgraph isomorphism problem. Moreover, the topology of biological networks change over time. These changing networks are called dynamic biological networks. As the network evolves, frequency of each motif in the network also changes. Computing the frequency of a given motif from scratch in a dynamic network as the network topology evolves is infeasible, particularly for large and fast evolving networks.Results
In this article, we design and develop a scalable method for counting the number of motifs in a dynamic biological network. Our method incrementally updates the frequency of each motif as the underlying network’s topology evolves. Our experiments demonstrate that our method can update the frequency of each motif in orders of magnitude faster than counting the motif embeddings every time the network changes. If the network evolves more frequently, the margin with which our method outperforms the existing static methods, increases.Conclusions
We evaluated our method extensively using synthetic and real datasets, and show that our method is highly accurate(≥?96%) and that it can be scaled to large dense networks. The results on real data demonstrate the utility of our method in revealing interesting insights on the evolution of biological processes.Background
Cellular behaviors are governed by interaction networks among biomolecules, for example gene regulatory and signal transduction networks. An often used dynamic modeling framework for these networks, Boolean modeling, can obtain their attractors (which correspond to cell types and behaviors) and their trajectories from an initial state (e.g. a resting state) to the attractors, for example in response to an external signal. The existing methods however do not elucidate the causal relationships between distant nodes in the network.Results
In this work, we propose a simple logic framework, based on categorizing causal relationships as sufficient or necessary, as a complement to Boolean networks. We identify and explore the properties of complex subnetworks that are distillable into a single logic relationship. We also identify cyclic subnetworks that ensure the stabilization of the state of participating nodes regardless of the rest of the network. We identify the logic backbone of biomolecular networks, consisting of external signals, self-sustaining cyclic subnetworks (stable motifs), and output nodes. Furthermore, we use the logic framework to identify crucial nodes whose override can drive the system from one steady state to another. We apply these techniques to two biological networks: the epithelial-to-mesenchymal transition network corresponding to a developmental process exploited in tumor invasion, and the network of abscisic acid induced stomatal closure in plants. We find interesting subnetworks with logical implications in these networks. Using these subgraphs and motifs, we efficiently reduce both networks to succinct backbone structures.Conclusions
The logic representation identifies the causal relationships between distant nodes and subnetworks. This knowledge can form the basis of network control or used in the reverse engineering of networks.Background
Network alignment is one of the most common biological network comparison methods. Aligning protein-protein interaction (PPI) networks of different species is of great important to detect evolutionary conserved pathways or protein complexes across species through the identification of conserved interactions, and to improve our insight into biological systems. Global network alignment (GNA) problem is NP-complete, for which only heuristic methods have been proposed so far. Generally, the current GNA methods fall into global heuristic seed-and-extend approaches. These methods can not get the best overall consistent alignment between networks for the opinionated local seed. Furthermore These methods are lost in maximizing the number of aligned edges between two networks without considering the original structures of functional modules.Methods
We present a novel seed selection strategy for global network alignment by constructing the pairs of hub nodes of networks to be aligned into multiple seeds. Beginning from every hub seed and using the membership similarity of nodes to quantify to what extent the nodes can participate in functional modules associated with current seed topologically we align the networks by modules. By this way we can maintain the functional modules are not damaged during the heuristic alignment process. And our method is efficient in resolving the fatal problem of most conventional algorithms that the initialization selected seeds have a direct influence on the alignment result. The similarity measures between network nodes (e.g., proteins) include sequence similarity, centrality similarity, and dynamic membership similarity and our algorithm can be called Multiple Hubs-based Alignment (MHA).Results
When applying our seed selection strategy to several pairs of real PPI networks, it is observed that our method is working to strike a balance, extending the conserved interactions while maintaining the functional modules unchanged. In the case study, we assess the effectiveness of MHA on the alignment of the yeast and fly PPI networks. Our method outperforms state-of-the-art algorithms at detecting conserved functional modules and retrieves in particular 86% more conserved interactions than IsoRank.Conclusions
We believe that our seed selection strategy will lead us to obtain more topologically and biologically similar alignment result. And it can be used as the reference and complement of other heuristic methods to seek more meaningful alignment results.The most basic and significant issue in complex network analysis is community detection, which is a branch of machine learning. Most current community detection approaches, only consider a network's topology structures, which lose the potential to use node attribute information. In attributed networks, both topological structure and node attributed are important features for community detection. In recent years, the spectral clustering algorithm has received much interest as one of the best performing algorithms in the subcategory of dimensionality reduction. This algorithm applies the eigenvalues of the affinity matrix to map data to low-dimensional space. In the present paper, a new version of the spectral cluster, named Attributed Spectral Clustering (ASC), is applied for attributed graphs that the identified communities have structural cohesiveness and attribute homogeneity. Since the performance of spectral clustering heavily depends on the goodness of the affinity matrix, the ASC algorithm will use the Topological and Attribute Random Walk Affinity Matrix (TARWAM) as a new affinity matrix to calculate the similarity between nodes. TARWAM utilizes the biased random walk to integrate network topology and attribute information. It can improve the similarity degree among the pairs of nodes in the same density region of the attributed network, without the need for parameter tuning. The proposed approach has been compared to other primary and new attributed graph clustering algorithms based on synthetic and real datasets. The experimental results show that the proposed approach is more effective and accurate compared to other state-of-the-art attributed graph clustering techniques.
相似文献