Inferring Meaningful Communities from Topology-Constrained Correlation Networks |
| |
Authors: | Jose Sergio Hleap, Christian Blouin |
| |
Affiliation: | 1. Department of Biochemistry and Molecular Biology, Dalhousie University, Halifax, Nova Scotia, Canada; 2. Department of Computer Science, Dalhousie University, Halifax, Nova Scotia, Canada; Technical University Darmstadt, Germany |
| |
Abstract: | Community structure detection is an important tool in graph analysis. It can be done, among other ways, by solving for the partition set that optimizes the modularity score. Here it is shown that topological constraints in correlation graphs induce over-fragmentation of community structures. A refinement step for this optimization, based on Linear Discriminant Analysis (LDA) and a statistical test for significance, is proposed. In structured simulations constrained by topology, this novel approach performs better than the optimization of modularity alone. The method was also tested with two empirical datasets: the roll-call voting in the 110th US Senate constrained by geographic adjacency, and a biological dataset of 135 protein structures constrained by inter-residue contacts. The former dataset showed sub-structures in the communities that revealed a regional bias in the votes that transcends party affiliations. This is an interesting pattern given that the 110th Legislature was assumed to be a highly polarized government. The α-amylase catalytic domain dataset (the biological dataset) was analyzed with and without topological constraints (inter-residue contacts). The results without topological constraints differed from the topology-constrained ones, but the LDA filtering did not change the outcome of the latter. This suggests that the LDA filtering is a robust way to resolve over-fragmentation when it is present, and that the method does not affect the results where there is no evidence of over-fragmentation. |
| |
Keywords: | |
|
|