Similar articles: 20 results found
1.
2.
Agglomerative hierarchical clustering becomes infeasible when applied to large datasets due to its O(N²) storage requirements. We present a multi-stage agglomerative hierarchical clustering (MAHC) approach aimed at large datasets of speech segments. The algorithm is based on an iterative divide-and-conquer strategy. The data is first split into independent subsets, each of which is clustered separately. This reduces the storage required for sequential implementations and allows concurrent computation on parallel computing hardware. The resultant clusters are merged and subsequently re-divided into subsets, which are passed to the following iteration. We show that MAHC can match and even surpass the performance of the exact implementation when applied to datasets of speech segments.
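A minimal sketch of one divide-and-conquer clustering stage in this spirit is shown below (illustrative only, not the authors' implementation): the data are split into subsets, each subset is clustered with standard agglomerative linkage, and the cluster centroids act as the representatives that are merged at the next level. The use of average linkage, Euclidean features and centroid representatives, as well as names such as `mahc_stage`, are assumptions of the sketch.

```python
# A minimal sketch of a divide-and-conquer agglomerative clustering pass,
# in the spirit of the MAHC idea described above (not the authors' code).
# Assumptions: average-linkage clustering, Euclidean features, and cluster
# centroids as the representatives that are merged between stages.
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster

def cluster_subset(X, n_clusters):
    """Cluster one subset and return a label per row."""
    Z = linkage(X, method="average")
    return fcluster(Z, t=n_clusters, criterion="maxclust")

def mahc_stage(X, n_subsets=4, clusters_per_subset=10):
    """One divide-and-conquer stage: cluster subsets, then merge centroids."""
    rng = np.random.default_rng(0)
    order = rng.permutation(len(X))
    centroids, members = [], []
    for chunk in np.array_split(order, n_subsets):
        labels = cluster_subset(X[chunk], clusters_per_subset)
        for c in np.unique(labels):
            idx = chunk[labels == c]
            centroids.append(X[idx].mean(axis=0))
            members.append(idx)
    # Second level: cluster the centroids, which is far smaller than N.
    top = cluster_subset(np.vstack(centroids), clusters_per_subset)
    final = np.empty(len(X), dtype=int)
    for grp, idx in zip(top, members):
        final[idx] = grp
    return final

X = np.random.default_rng(1).normal(size=(2000, 12))   # toy "speech segment" features
print(np.bincount(mahc_stage(X))[1:])                   # sizes of the merged clusters
```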

3.

Background

Previous studies using a hierarchical clustering approach to analyze resting-state fMRI data were limited to a few slices or regions of interest (ROIs) after substantial data reduction.

Purpose

To develop a framework that can perform voxel-wise hierarchical clustering of whole-brain resting-state fMRI data from a group of subjects.

Materials and Methods

Resting-state fMRI measurements were conducted for 86 adult subjects using a single-shot echo-planar imaging (EPI) technique. After pre-processing and co-registration to a standard template, pair-wise cross-correlation coefficients (CC) were calculated for all voxels inside the brain and translated into absolute Pearson's distances after imposing a threshold of CC ≥ 0.3. The group averages of the Pearson's distances were then used to perform hierarchical clustering with the developed framework, which entails gray matter masking and an iterative scheme to analyze the dendrogram.

Results

With the hierarchical clustering framework, we identified most of the functional connectivity networks reported previously in the literature, such as the motor, sensory, visual, memory, and the default-mode functional networks (DMN). Furthermore, the DMN and visual system were split into their corresponding hierarchical sub-networks.

Conclusion

It is feasible to use the proposed hierarchical clustering scheme for voxel-wise analysis of whole-brain resting-state fMRI data. The hierarchical clustering result not only broadly confirmed the functional connectivity networks identified previously with other data-processing techniques, such as ICA, but also directly revealed the hierarchical structure within those networks.
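The core of such a pipeline can be sketched in a few lines (a toy illustration, not the study's framework): voxel time series are correlated pair-wise, the correlations are thresholded and converted into distances, and the resulting distance matrix feeds a hierarchical clustering whose dendrogram is cut into networks. The CC ≥ 0.3 threshold follows the abstract; the distance definition 1 − |CC|, the average linkage and the eight-cluster cut are assumptions, and random numbers stand in for pre-processed fMRI signals.

```python
# Illustrative sketch of correlation-based hierarchical clustering of voxel
# time series, loosely following the pipeline in the abstract above
# (thresholded cross-correlation turned into a Pearson distance).
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster
from scipy.spatial.distance import squareform

rng = np.random.default_rng(0)
ts = rng.normal(size=(500, 200))          # 500 "voxels" x 200 time points

cc = np.corrcoef(ts)                       # pair-wise cross-correlation
cc[np.abs(cc) < 0.3] = 0.0                 # impose the CC threshold
dist = 1.0 - np.abs(cc)                    # absolute Pearson distance (assumed form)
np.fill_diagonal(dist, 0.0)

Z = linkage(squareform(dist, checks=False), method="average")
labels = fcluster(Z, t=8, criterion="maxclust")   # cut dendrogram into 8 "networks"
print(np.bincount(labels)[1:])                    # voxels per network
```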

4.
We introduce the concept of control centrality to quantify the ability of a single node to control a directed weighted network. We calculate the distribution of control centrality for several real networks and find that it is mainly determined by the network’s degree distribution. We show that in a directed network without loops the control centrality of a node is uniquely determined by its layer index or topological position in the underlying hierarchical structure of the network. Inspired by the deep relation between control centrality and hierarchical structure in a general directed network, we design an efficient attack strategy against the controllability of malicious networks.
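The "layer index" mentioned above can be illustrated concretely for a loop-free directed network. This is a hedged sketch: it only shows the hierarchical layering, not the control-centrality computation itself, which the paper derives from structural control theory.

```python
# A small sketch of the "layer index" idea: in a directed acyclic network, each
# node is assigned a layer according to its topological position (here: 1 plus
# the maximum layer of its predecessors). This illustrates the hierarchical
# structure only; it is not a computation of control centrality.
import networkx as nx

G = nx.DiGraph([(1, 3), (2, 3), (3, 4), (3, 5), (4, 6), (5, 6)])

layer = {}
for v in nx.topological_sort(G):
    preds = list(G.predecessors(v))
    layer[v] = 1 + max((layer[u] for u in preds), default=0)

print(layer)   # sources get layer 1, downstream nodes deeper layers
```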

5.
Cell biologists have developed methods to label membrane proteins with gold nanoparticles and then extract spatial point patterns of the gold particles from transmission electron microscopy images using image processing software. Previously, the resulting patterns were analyzed using the Hopkins statistic, which distinguishes nonclustered from modestly and highly clustered distributions, but is not designed to quantify the number or sizes of the clusters. Clusters were defined by a partitional clustering approach, which required the choice of a distance: two points from a pattern were put in the same cluster if they were closer than this distance. In this study, we present a new methodology based on hierarchical clustering to quantify clustering. An intrinsic distance is computed, namely the distance that produces the maximum number of clusters in the biological data, eliminating the need to choose a distance. To quantify the extent of clustering, we compare the clustering distance of the experimental data being analyzed with that of simulated random data. Results are then expressed as a dimensionless number, the clustering ratio, which facilitates the comparison of clustering between experiments. Replacing the chosen cluster distance by the intrinsic clustering distance emphasizes densely packed clusters that are likely more important to downstream signaling events.
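One way to realize the intrinsic-distance idea is sketched below (an illustration under stated assumptions, not the authors' exact procedure): the dendrogram cut-off of a single-linkage clustering is swept, the cut-off producing the most clusters of two or more points is kept as the intrinsic distance, and the same quantity for uniformly random points of equal density gives a clustering ratio.

```python
# Illustrative sketch of an "intrinsic clustering distance": sweep the cut-off
# distance of a single-linkage dendrogram and keep the distance that yields the
# maximum number of clusters (clusters of size >= 2), then compare it with the
# same quantity computed on uniformly random points of equal density.
# The single-linkage choice, the size >= 2 rule and the ratio below are
# assumptions for illustration, not necessarily the authors' definitions.
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster
from scipy.spatial.distance import pdist

def intrinsic_distance(points, candidates):
    Z = linkage(pdist(points), method="single")
    best_d, best_n = candidates[0], -1
    for d in candidates:
        labels = fcluster(Z, t=d, criterion="distance")
        sizes = np.bincount(labels)[1:]
        n_clusters = int(np.sum(sizes >= 2))
        if n_clusters > best_n:
            best_d, best_n = d, n_clusters
    return best_d

rng = np.random.default_rng(0)
data = np.vstack([rng.normal(c, 0.05, size=(30, 2)) for c in (0.2, 0.5, 0.8)])
random_pts = rng.uniform(0, 1, size=data.shape)

grid = np.linspace(0.01, 0.5, 50)
d_data = intrinsic_distance(data, grid)
d_rand = intrinsic_distance(random_pts, grid)
print("clustering ratio (random / data):", d_rand / d_data)
```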

6.
Datasets from different agencies often contain records for the same individuals. Linking these datasets to identify all the records belonging to the same individuals is a crucial and challenging problem, especially given the large volumes of data. A large number of available record-linkage algorithms are prone to either time inefficiency or low accuracy in finding matches and non-matches among the records. In this paper we propose efficient and reliable sequential and parallel algorithms for the record linkage problem employing hierarchical clustering methods; in particular, we employ complete-linkage hierarchical clustering. In addition, we use two other techniques: elimination of duplicate records and blocking. Our algorithms use sorting as a sub-routine to identify identical copies of records. We have tested our algorithms on datasets with millions of synthetic records. Experimental results show that our algorithms achieve nearly 100% accuracy and that the parallel implementations achieve almost linear speedups. The time complexities of these algorithms do not exceed those of previous best-known algorithms, and they outperform those algorithms in accuracy while consuming reasonable run times.
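A toy sketch of the blocking plus complete-linkage combination is given below. It is purely illustrative: the blocking key, the difflib string distance and the 0.25 cut-off are assumptions, and real record-linkage systems use far richer comparison functions.

```python
# Toy sketch of blocking + complete-linkage clustering for record linkage:
# records are first grouped into blocks by a cheap key (first letter of the
# surname plus birth year), and complete-linkage clustering is run only within
# each block on a simple string distance.
from difflib import SequenceMatcher
from itertools import combinations
from scipy.cluster.hierarchy import linkage, fcluster

records = [
    ("smith, john", 1980), ("smyth, jon", 1980), ("smith, j.", 1980),
    ("doe, jane", 1975), ("doe, janet", 1975), ("brown, alice", 1990),
]

def distance(a, b):
    return 1.0 - SequenceMatcher(None, a[0], b[0]).ratio()

blocks = {}
for i, rec in enumerate(records):
    blocks.setdefault((rec[0][0], rec[1]), []).append(i)   # blocking key

entity_of, next_id = {}, 0
for idx in blocks.values():
    if len(idx) == 1:                                      # singleton block
        entity_of[idx[0]] = next_id
        next_id += 1
        continue
    condensed = [distance(records[i], records[j]) for i, j in combinations(idx, 2)]
    labels = fcluster(linkage(condensed, method="complete"),
                      t=0.25, criterion="distance")
    for i, lab in zip(idx, labels):
        entity_of[i] = next_id + lab - 1
    next_id += labels.max()

print(entity_of)   # record index -> linked-entity id
```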

7.
8.
G Yadav, S Babu. PLoS ONE 2012, 7(8): e41827
Recent advances in network theory have led to considerable progress in our understanding of complex real-world systems and their behavior in response to external threats or fluctuations. Much of this research has been invigorated by demonstration of the 'robust, yet fragile' nature of cellular and large-scale systems transcending biology, sociology, and ecology, through application of network theory to diverse interactions observed in nature such as plant-pollinator, seed-dispersal agent and host-parasite relationships. In this work, we report the development of NEXCADE, an automated and interactive program for inducing disturbances into complex systems defined by networks, focusing on the changes in global network topology and connectivity as a function of the perturbation. NEXCADE uses a graph-theoretical approach to simulate perturbations in a user-defined manner: singly, in clusters, or sequentially. To demonstrate the promise it holds for broader adoption by the research community, we provide pre-simulated examples from diverse real-world networks, including eukaryotic protein-protein interaction networks, fungal biochemical networks, a variety of ecological food webs in nature as well as social networks. NEXCADE not only enables network visualization at every step of the targeted attacks, but also allows risk assessment, i.e., identification of nodes critical for the robustness of the system of interest, in order to devise and implement context-based strategies for restructuring a network, or to achieve resilience against link or node failures. Source code and license for the software, designed to work on a Linux-based operating system (OS), can be downloaded at http://www.nipgr.res.in/nexcade_download.html. In addition, we have developed NEXCADE as an OS-independent online web server freely available to the scientific community without any login requirement at http://www.nipgr.res.in/nexcade.html.
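The kind of sequential, targeted perturbation experiment NEXCADE automates can be mimicked with generic graph tooling. The sketch below (plain networkx, not NEXCADE itself) removes the highest-degree node of a toy scale-free network at each step and tracks the size of the largest connected component as a global connectivity measure.

```python
# Generic sketch of a sequential, targeted attack on a network: nodes are
# removed in order of decreasing degree and the size of the largest connected
# component is tracked as a measure of global connectivity.
import networkx as nx

G = nx.barabasi_albert_graph(200, 2, seed=42)   # toy scale-free network

def giant_component_size(H):
    return max((len(c) for c in nx.connected_components(H)), default=0)

H = G.copy()
trajectory = [giant_component_size(H)]
for _ in range(20):                                   # sequential attack, 20 steps
    target = max(H.degree, key=lambda kv: kv[1])[0]   # current highest-degree node
    H.remove_node(target)
    trajectory.append(giant_component_size(H))

print(trajectory)   # decline of the giant component under the targeted attack
```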

9.
Complex systems have attracted considerable interest because of their wide range of applications, and are often studied via a “classic” approach: study a specific system, find a complex network behind it, and analyze the corresponding properties. This simple methodology has produced a great deal of interesting results, but relies on an often implicit underlying assumption: the level of detail at which the system is observed. However, in many situations, physical or abstract, the level of detail can be one out of many, and might also depend on intrinsic limitations in viewing the data with a different level of abstraction or precision. So a fundamental question arises: do properties of a network depend on its level of observability, or are they invariant? If there is a dependence, then an apparently correct network model could in fact be just a bad approximation of the true behavior of a complex system. In order to answer this question, we propose a novel micro-macro analysis of complex systems that quantitatively describes how the structure of complex networks varies as a function of the detail level. To this end, we have developed a new telescopic algorithm that abstracts from the local properties of a system and reconstructs the original structure according to a fuzziness level. This way we can study what happens when passing from a fine level of detail (“micro”) to a coarser scale level (“macro”), and analyze the corresponding behavior in this transition, obtaining a deeper analysis across the spectrum of detail levels. The obtained results show that many important properties are not universally invariant with respect to the level of detail, but instead strongly depend on the specific level at which a network is observed. Therefore, caution should be taken in every situation where a complex network is considered, if its context allows for different levels of observability.
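The micro-to-macro transition can be illustrated with generic tools, as in the hedged sketch below: detected communities are contracted into super-nodes and a simple structural property is compared across the two levels. Community detection plus a quotient graph is used here only as a stand-in for the paper's telescopic algorithm, which instead works with an explicit fuzziness level.

```python
# Hedged illustration of comparing a network at two levels of detail: contract
# detected communities into super-nodes ("macro") and compare simple structural
# properties with the original graph ("micro"). This coarse-graining is a
# stand-in for the telescopic algorithm described in the abstract.
import networkx as nx
from networkx.algorithms import community

G = nx.powerlaw_cluster_graph(300, 3, 0.3, seed=1)           # "micro" level

blocks = community.greedy_modularity_communities(G)           # groups of nodes
G_macro = nx.quotient_graph(G, list(blocks), relabel=True)    # "macro" level

for name, H in (("micro", G), ("macro", G_macro)):
    print(f"{name}: nodes={H.number_of_nodes()} edges={H.number_of_edges()} "
          f"clustering={nx.average_clustering(H):.3f}")
```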

10.
We investigate the efficient transmission and processing of weak, subthreshold signals in a realistic neural medium in the presence of different levels of underlying noise. Assuming Hebbian weights for the maximal synaptic conductances, which naturally balance the network between excitatory and inhibitory synapses, and considering short-term synaptic plasticity affecting such conductances, we found different dynamic phases in the system: a memory phase in which populations of neurons remain synchronized, an oscillatory phase in which transitions between different synchronized populations of neurons appear, and an asynchronous or noisy phase. When a weak stimulus is applied to each neuron and the level of noise in the medium is increased, we found efficient transmission of such stimuli around the transition and critical points separating the different phases, at well-defined levels of stochasticity in the system. We proved that this intriguing phenomenon is quite robust, as it occurs in different situations, including several types of synaptic plasticity, different types and numbers of stored patterns, and diverse network topologies, namely diluted networks and complex topologies such as scale-free and small-world networks. We conclude that the robustness of the phenomenon in different realistic scenarios, including spiking neurons, short-term synaptic plasticity and complex network topologies, makes it very likely that it also occurs in actual neural systems, as recent psychophysical experiments suggest.
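The core effect, noise-assisted transmission of a signal too weak to cross threshold on its own, can be shown with a deliberately minimal model. The sketch below uses a single leaky integrate-and-fire unit rather than the full plastic network of the study, and all parameter values are arbitrary toy choices.

```python
# Deliberately simplified illustration of noise-assisted transmission of a
# weak, subthreshold signal: a leaky integrate-and-fire unit driven by a faint
# sinusoid plus Gaussian noise. The paper above studies full networks with
# short-term synaptic plasticity; here we only show that phase-locking of the
# spikes to the hidden signal is best at intermediate noise levels.
import numpy as np

def phase_locking(noise_sigma, t_max=5000.0, dt=0.1, seed=0):
    rng = np.random.default_rng(seed)
    tau, v_th, v_reset = 20.0, 1.0, 0.0
    amp, freq = 0.4, 0.005                       # subthreshold: steady state < v_th
    v, phases = 0.0, []
    for step in range(int(t_max / dt)):
        t = step * dt
        drive = amp * np.sin(2 * np.pi * freq * t)
        v += dt * (drive - v) / tau + noise_sigma * np.sqrt(dt) * rng.normal()
        if v >= v_th:
            phases.append(2 * np.pi * freq * t)  # signal phase at spike time
            v = v_reset
    if not phases:
        return 0.0, 0
    r = np.abs(np.mean(np.exp(1j * np.array(phases))))   # vector strength in [0, 1]
    return r, len(phases)

for sigma in (0.01, 0.05, 0.1, 0.2, 0.5, 1.0):
    r, n = phase_locking(sigma)
    print(f"sigma={sigma:<4}  spikes={n:<5}  phase locking={r:.2f}")
```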

11.
12.

Background

There is a rapidly expanding literature on the application of complex networks in economics, which has focused mostly on stock markets. In this paper, we discuss an application of complex networks to the study of international business cycles.

Methodology/Principal Findings

We construct complex networks based on GDP data from two data sets covering the G7 and OECD economies. Besides the well-known correlation-based networks, we also use a specific tool for representing causality in economics, Granger causality. We consider different filtering methods to derive the stationary component of the GDP series for each of the countries in the samples, and the networks were found to be sensitive to the detrending method. While the correlation networks provide information on comovement between the national economies, the Granger causality networks can better predict fluctuations in countries’ GDP. Using them, we obtain directed networks that allow us to determine the relative influence of different countries on the global economic network. The US appears as the key player in both the G7 and OECD samples.

Conclusion

The use of complex networks is valuable for understanding business-cycle comovements at the international level.
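A hedged sketch of how such a Granger-causality network can be assembled is given below: GDP series are detrended by log-differencing and a directed edge from i to j is added when i Granger-causes j at the 5% level. The synthetic series, lag order, detrending choice and significance threshold are assumptions of the sketch, not the paper's settings (note that statsmodels' grangercausalitytests prints its own test summaries when run).

```python
# Sketch of a Granger-causality network from GDP-like series: detrend by
# log-differencing, test each ordered country pair, and keep significant
# directed edges. Toy data stand in for the G7/OECD GDP series.
import numpy as np
import networkx as nx
from statsmodels.tsa.stattools import grangercausalitytests

rng = np.random.default_rng(0)
countries = ["US", "DE", "JP", "FR"]
T = 120
base = rng.normal(0, 1, T).cumsum()
gdp = {c: np.exp(0.02 * np.arange(T) + 0.05 * np.roll(base, k) + rng.normal(0, 0.02, T))
       for k, c in enumerate(countries)}        # toy series with lagged comovement

growth = {c: np.diff(np.log(s)) for c, s in gdp.items()}   # stationary component

G = nx.DiGraph()
G.add_nodes_from(countries)
for src in countries:
    for dst in countries:
        if src == dst:
            continue
        data = np.column_stack([growth[dst], growth[src]])  # column 2 "causes" column 1
        res = grangercausalitytests(data, maxlag=2)          # prints its summaries
        p = res[2][0]["ssr_ftest"][1]                        # p-value at lag 2
        if p < 0.05:
            G.add_edge(src, dst, p=round(p, 4))

print(sorted(G.edges(data=True)))
print("out-degree (influence):", dict(G.out_degree()))
```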

13.
Parametric uncertainty is a particularly challenging and relevant aspect of systems analysis in domains such as systems biology where, both for inference and for assessing prediction uncertainties, it is essential to characterize the system behavior globally in the parameter space. However, current methods based on local approximations or on Monte-Carlo sampling cope only insufficiently with the high-dimensional parameter spaces associated with complex network models. Here, we propose an alternative deterministic methodology that relies on sparse polynomial approximations: a computational interpolation scheme that adaptively identifies the most significant expansion coefficients. We present its performance on kinetic model equations from computational systems biology with several hundred parameters and state variables, leading to numerical approximations of the parametric solution over the entire parameter space. The scheme is based on adaptive Smolyak interpolation of the parametric solution at judiciously and adaptively chosen points in parameter space. Like Monte-Carlo sampling, it is “non-intrusive” and well suited for massively parallel implementation, but it affords higher convergence rates. This opens up new avenues for large-scale dynamic network analysis by enabling scaling for many applications, including parameter estimation, uncertainty quantification, and systems design.
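As a drastically simplified, non-intrusive stand-in for the adaptive Smolyak scheme (not the paper's method), the sketch below solves a one-parameter toy kinetic model at a handful of Chebyshev nodes and builds a polynomial surrogate of the solution at a fixed time; the model, node count and parameter range are illustrative assumptions.

```python
# Simplified, non-intrusive surrogate of a parametric kinetic model (NOT the
# adaptive Smolyak scheme of the paper): solve the model only at a few
# Chebyshev nodes, fit a polynomial, then check it against direct solves.
import numpy as np
from numpy.polynomial import chebyshev as C
from scipy.integrate import solve_ivp

def model_output(k, t_end=5.0):
    """Concentration of B at t_end for A -k-> B with degradation of B."""
    rhs = lambda t, y: [-k * y[0], k * y[0] - 0.5 * y[1]]
    sol = solve_ivp(rhs, (0.0, t_end), [1.0, 0.0], rtol=1e-8)
    return sol.y[1, -1]

k_lo, k_hi, n_nodes = 0.1, 2.0, 9
nodes = C.chebpts1(n_nodes)                              # Chebyshev nodes on [-1, 1]
k_nodes = 0.5 * (k_hi - k_lo) * (nodes + 1) + k_lo       # map to [k_lo, k_hi]
values = [model_output(k) for k in k_nodes]              # "non-intrusive": solver calls only

coeffs = C.chebfit(nodes, values, deg=n_nodes - 1)       # polynomial surrogate

for k in (0.25, 0.9, 1.7):
    x = 2 * (k - k_lo) / (k_hi - k_lo) - 1
    print(f"k={k}: surrogate={C.chebval(x, coeffs):.5f}  direct={model_output(k):.5f}")
```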

14.
15.
With ever-increasing available data, predicting individuals' preferences and helping them locate the most relevant information has become a pressing need. Understanding and predicting preferences is also important from a fundamental point of view, as part of what has been called a “new” computational social science. Here, we propose a novel approach based on stochastic block models, which have been developed by sociologists as plausible models of complex networks of social interactions. Our model is in the spirit of predicting individuals' preferences based on the preferences of others but, rather than fitting a particular model, we rely on a Bayesian approach that samples over the ensemble of all possible models. We show that our approach is considerably more accurate than leading recommender algorithms, with major relative improvements between 38% and 99% over industry-level algorithms. Moreover, our approach sheds light on decision-making processes by identifying groups of individuals that have consistently similar preferences, and it enables the analysis of the characteristics of those groups.
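The group-based intuition behind this approach can be conveyed with a much simpler sketch: if user and item groups were already known (the paper instead samples over all possible group assignments in a Bayesian way), an unseen rating could be predicted from the ratings observed between the corresponding pair of groups. Everything below, including the group labels, is a hypothetical toy.

```python
# Much-simplified sketch of group-based preference prediction: predict an
# unseen rating as the mean rating observed between the user's group and the
# item's group. Groups are assumed known here, unlike in the Bayesian
# block-model approach described in the abstract.
import numpy as np

# toy ratings: (user, item, rating), with hypothetical fixed group assignments
ratings = [(0, 0, 5), (0, 1, 4), (1, 0, 5), (1, 2, 1), (2, 2, 2), (3, 1, 1), (3, 3, 5)]
user_group = {0: "cinephile", 1: "cinephile", 2: "casual", 3: "casual"}
item_group = {0: "drama", 1: "drama", 2: "action", 3: "comedy"}

sums, counts = {}, {}
for u, i, r in ratings:
    key = (user_group[u], item_group[i])
    sums[key] = sums.get(key, 0) + r
    counts[key] = counts.get(key, 0) + 1

def predict(user, item):
    key = (user_group[user], item_group[item])
    if key not in counts:                      # back off to the global mean
        return np.mean([r for _, _, r in ratings])
    return sums[key] / counts[key]

print(predict(1, 1))   # cinephile x drama -> high predicted rating
print(predict(2, 0))   # casual x drama    -> predicted from group-level evidence
```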

16.
17.
A model of fractal hierarchical structures that share the property of non-homogeneous weighted networks is introduced. These networks can be completely and analytically characterized in terms of the involved parameters, i.e., the size of the original graph N_k and the non-homogeneous weight scaling factors r_1, r_2, ..., r_M. We also study the average weighted shortest path (AWSP), the average degree and the average node strength on these non-homogeneous hierarchical weighted networks; in particular, the AWSP is calculated rigorously. We show that, in the limit of infinite network order, the AWSP depends on the number of copies and on the sum of all non-homogeneous weight scaling factors.
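Measuring the AWSP on a recursively built, non-homogeneously weighted network is straightforward with generic graph tools. The sketch below uses one plausible construction (attach M scaled copies of the current graph to a new root at every step, scaling the weights of copy i by r_i), which is only a stand-in for the specific model analyzed in the paper.

```python
# Hedged sketch: build a hierarchical network with non-homogeneous weight
# scaling factors and measure its average weighted shortest path (AWSP).
# The recursive construction here is an illustrative stand-in, not the
# paper's exact model.
import networkx as nx

def hierarchical_weighted(generations, scaling=(0.5, 1.0, 2.0)):
    G = nx.Graph()
    G.add_edge(0, 1, weight=1.0)                       # seed graph
    for _ in range(generations):
        new = nx.Graph()
        new.add_node("root")
        for i, r in enumerate(scaling):
            for u, v, d in G.edges(data=True):         # copy i with scaled weights
                new.add_edge((i, u), (i, v), weight=r * d["weight"])
            new.add_edge("root", (i, 0), weight=r)     # hook each copy to the root
        G = nx.convert_node_labels_to_integers(new)
    return G

G = hierarchical_weighted(3)
print(G.number_of_nodes(),
      round(nx.average_shortest_path_length(G, weight="weight"), 3))
```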

18.
Nervous systems are information processing networks that evolved by natural selection, whereas very large scale integrated (VLSI) computer circuits have evolved by commercially driven technology development. Here we follow historic intuition that all physical information processing systems will share key organizational properties, such as modularity, that generally confer adaptivity of function. It has long been observed that modular VLSI circuits demonstrate an isometric scaling relationship between the number of processing elements and the number of connections, known as Rent's rule, which is related to the dimensionality of the circuit's interconnect topology and its logical capacity. We show that human brain structural networks, and the nervous system of the nematode C. elegans, also obey Rent's rule, and exhibit some degree of hierarchical modularity. We further show that the estimated Rent exponent of human brain networks, derived from MRI data, can explain the allometric scaling relations between gray and white matter volumes across a wide range of mammalian species, again suggesting that these principles of nervous system design are highly conserved. For each of these fractal modular networks, the dimensionality of the interconnect topology was greater than the 2 or 3 Euclidean dimensions of the space in which it was embedded. This relatively high complexity entailed extra cost in physical wiring: although all networks were economically or cost-efficiently wired they did not strictly minimize wiring costs. Artificial and biological information processing systems both may evolve to optimize a trade-off between physical cost and topological complexity, resulting in the emergence of homologous principles of economical, fractal and modular design across many different kinds of nervous and computational networks.
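A Rent-style exponent can be estimated for any network by recursive partitioning, as in the hedged sketch below: each partition contributes a (number of nodes, number of boundary edges) pair, and the exponent is read off as the log-log slope. The bisection heuristic, the toy network and the fitting range are simplifications, not the paper's MRI-based procedure.

```python
# Illustrative estimate of a Rent-style exponent: recursively bisect the graph,
# record for each partition its number of nodes N and boundary ("external")
# edges E, and fit the slope of log E versus log N.
import numpy as np
import networkx as nx
from networkx.algorithms.community import kernighan_lin_bisection

def partitions(G, nodes, depth, out):
    if depth == 0 or len(nodes) < 8:
        return
    a, b = kernighan_lin_bisection(G.subgraph(nodes), seed=0)
    for part in (a, b):
        external = sum(1 for u in part for v in G[u] if v not in part)
        out.append((len(part), max(external, 1)))
        partitions(G, part, depth - 1, out)

G = nx.watts_strogatz_graph(512, 8, 0.2, seed=1)   # toy "circuit"-like network
samples = []
partitions(G, set(G.nodes), depth=5, out=samples)

N, E = np.array(samples, dtype=float).T
p, logk = np.polyfit(np.log(N), np.log(E), 1)
print(f"estimated Rent exponent p ~ {p:.2f} (from {len(samples)} partitions)")
```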

19.

Background

There are two ways that statistical methods can learn from biomedical data. One way is to learn classifiers that identify diseases and predict outcomes, using a training dataset with an established diagnosis for each sample (supervised learning). When such a training dataset is not available, the task can instead be to mine for the presence of meaningful groups (clusters) of samples and to explore the underlying data structure (unsupervised learning).

Results

We investigated the proteomic profiles of the cytosolic fraction of human liver samples using two-dimensional electrophoresis (2DE). Samples were resected during surgical treatment of hepatic metastases of colorectal cancer. Unsupervised hierarchical clustering of the 2DE gel images (n = 18) revealed a pair of clusters containing 11 and 7 samples. Previously, we used the same specimens to measure biochemical profiles based on cytochrome P450-dependent enzymatic activities and also found that the samples were clearly divided into two well-separated groups by cluster analysis. The groups defined by enzyme activity turned out to match almost perfectly the groups identified from the proteomic data. Of the 271 reproducible spots on our 2DE gels, we selected 15 that distinguish the human liver cytosolic clusters. Using MALDI-TOF peptide mass fingerprinting, we identified 12 proteins for the selected spots, including known cancer-associated species.

Conclusions/Significance

Our results highlight the importance of hierarchical cluster analysis of proteomic data and show the concordance between the results of the biochemical and proteomic approaches. Grouping of the human liver samples and/or patients into distinct clusters may provide insights into possible molecular mechanisms of drug metabolism and creates a rationale for personalized treatment.
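The clustering step itself is standard, as the small sketch below shows: each sample is a vector of spot intensities, the samples are clustered hierarchically, and the dendrogram is cut into two groups. The synthetic intensity matrix, Ward linkage and two-cluster cut are illustrative choices, not the study's exact settings.

```python
# Sketch of unsupervised hierarchical clustering of sample profiles, analogous
# to clustering 2DE gel images: rows are samples, columns are protein-spot
# intensities. Synthetic data replace the real gel measurements.
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster

rng = np.random.default_rng(0)
group_a = rng.normal(1.0, 0.2, size=(11, 271))     # 11 samples of one phenotype
group_b = rng.normal(1.6, 0.2, size=(7, 271))      # 7 samples of the other
profiles = np.vstack([group_a, group_b])           # 18 samples x 271 spots

Z = linkage(profiles, method="ward")               # Ward linkage, Euclidean distances
clusters = fcluster(Z, t=2, criterion="maxclust")
print(clusters)                                    # expected: a clean 11 / 7 split
```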

20.
Models for genome-wide prediction and association studies usually target a single phenotypic trait. However, in animal and plant genetics it is common to record information on multiple phenotypes for each individual that will be genotyped. Modeling traits individually disregards the fact that they are most likely associated due to pleiotropy and shared biological basis, thus providing only a partial, confounded view of genetic effects and phenotypic interactions. In this article we use data from a Multiparent Advanced Generation Inter-Cross (MAGIC) winter wheat population to explore Bayesian networks as a convenient and interpretable framework for the simultaneous modeling of multiple quantitative traits. We show that they are equivalent to multivariate genetic best linear unbiased prediction (GBLUP) and that they are competitive with single-trait elastic net and single-trait GBLUP in predictive performance. Finally, we discuss their relationship with other additive-effects models and their advantages in inference and interpretation. MAGIC populations provide an ideal setting for this kind of investigation because the very low population structure and large sample size result in predictive models with good power and limited confounding due to relatedness.
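A compact numerical sketch of the single-trait GBLUP ingredient mentioned above (the multi-trait Bayesian-network machinery of the paper is not reproduced here): a genomic relationship matrix is built from centred marker genotypes and used to shrink phenotypes toward genetic values, which is equivalent to ridge regression on the markers. The marker data, variance ratio and train/test split are synthetic and purely illustrative.

```python
# Sketch of single-trait GBLUP-style prediction: genomic relationship matrix
# from centred markers, then BLUP of genetic values for unphenotyped
# individuals. Synthetic data; the variance ratio lam is an assumed value.
import numpy as np

rng = np.random.default_rng(0)
n, m = 300, 1000                                   # individuals x markers
X = rng.binomial(2, 0.4, size=(n, m)).astype(float)
true_effects = rng.normal(0, 0.05, m)
y = X @ true_effects + rng.normal(0, 1.0, n)       # phenotype = genetics + noise

Xc = X - X.mean(axis=0)                            # centre marker genotypes
G = Xc @ Xc.T / m                                  # genomic relationship matrix

train = np.arange(250)                             # phenotyped individuals
test = np.arange(250, n)
lam = 1.0                                          # ratio of residual to genetic variance

# BLUP of genetic values: g_hat = G[test, train] (G_tt + lam I)^-1 (y_train - mean)
alpha = np.linalg.solve(G[np.ix_(train, train)] + lam * np.eye(len(train)),
                        y[train] - y[train].mean())
g_hat = G[np.ix_(test, train)] @ alpha + y[train].mean()

true_g = X[test] @ true_effects
print("predictive correlation:", round(np.corrcoef(g_hat, true_g)[0, 1], 3))
```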
