Similar articles
 20 similar articles found (search time: 31 ms)
1.
A recent paper by Daubechies et al. claims that two independent component analysis (ICA) algorithms, Infomax and FastICA, which are widely used for functional magnetic resonance imaging (fMRI) analysis, select for sparsity rather than independence. The argument was supported by a series of experiments on synthetic data. We show that these experiments fall short of proving this claim and that the ICA algorithms are indeed doing what they are designed to do: identify maximally independent sources.

2.
3.
4.

Background  

The analysis of high-throughput gene expression data sets derived from microarray experiments is still a field of extensive investigation. Although new approaches and algorithms are published continuously, mostly conventional methods such as hierarchical clustering algorithms or variance analysis tools are used. Here we take a closer look at independent component analysis (ICA), which is already widely discussed as a new analysis approach. However, a deep exploration of its applicability and relevance to concrete biological problems is still missing. In this study, we investigate the relevance of ICA in gaining new insights into well-characterized regulatory mechanisms of M-CSF dependent macrophage differentiation.

5.
6.
The study aimed to analyze event-related desynchronization/synchronization (ERD/ERS) in 19-channel EEGs recorded in 329 healthy subjects in the course of a Go/NoGo task. Three methods were tested: reference, current source density (CSD), and group decomposition by independent component analysis (ICA). A comparison of the three data sets showed that the ICA method better reflects the local features of the brain responses in the θ, α, and β ranges. The functional significance of the group ICA components is discussed.

7.
Data-driven fMRI analysis techniques include independent component analysis (ICA) and different types of clustering in the temporal domain. Since each of these methods has its particular strengths, it is natural to look for an approach that unifies Kohonen's self-organizing map and ICA. This is given by topographic independent component analysis. While achieved by a slight modification of the ICA model, it can at the same time be used to define a topographic order (clusters) between the components, and thus has the usual computational advantages associated with topographic maps. In this contribution, we show that when applied to fMRI analysis it outperforms FastICA.

8.

Background

Although high-throughput microarray-based molecular diagnostic technologies show great promise in cancer diagnosis, they are still far from clinical application because of their low and unstable sensitivities and specificities in cancer molecular pattern recognition. In fact, high-dimensional and heterogeneous tumor profiles challenge current machine learning methodologies with their small number of samples and large or even huge number of variables (genes). This naturally calls for the use of effective feature selection in microarray data classification.

Methods

We propose a novel feature selection method, multi-resolution independent component analysis (MICA), for large-scale gene expression data. This method overcomes the weak points of the widely used transform-based feature selection methods such as principal component analysis (PCA), independent component analysis (ICA), and nonnegative matrix factorization (NMF) by avoiding their global feature-selection mechanism. In addition to demonstrating the effectiveness of MICA in meaningful biomarker discovery, we present MICA-based support vector machines (MICA-SVM) and MICA-based linear discriminant analysis (MICA-LDA) to attain high-performance classifications in low-dimensional spaces.

Results

We have demonstrated the superiority and stability of our algorithms by performing comprehensive experimental comparisons with nine state-of-the-art algorithms on six high-dimensional heterogeneous profiles under cross-validation. Our classification algorithms, especially MICA-SVM, not only achieve clinical or near-clinical level sensitivities and specificities, but also show stronger performance stability than their peers in classification. Software that implements the major algorithm and the data sets on which this paper focuses are freely available at https://sites.google.com/site/heyaumapbc2011/.

Conclusions

This work suggests a new direction for moving microarray technologies into clinical routine: building a high-performance classifier that attains clinical-level sensitivities and specificities by treating an input profile as a ‘profile-biomarker’. The suppression of redundant global features and the effective extraction of local features, both based on multi-resolution data analysis, also have a positive impact on large-scale ‘omics’ data mining.

9.
We propose a new method for tumor classification from gene expression data, which consists of three main steps. First, the original DNA microarray gene expression data are modeled by independent component analysis (ICA). Second, the most discriminant eigenassays extracted by ICA are selected by the sequential floating forward selection technique. Finally, a support vector machine is used to classify the modeled data. To show the validity of the proposed method, we applied it to classify three DNA microarray datasets involving various human normal and tumor tissue samples. The experimental results show that the method is efficient and feasible.

10.

Background  

Clustering is a popular data exploration technique widely used in microarray data analysis. Most conventional clustering algorithms, however, generate only one set of clusters independent of the biological context of the analysis. This is often inadequate to explore data from different biological perspectives and gain new insights. We propose a new clustering model that can generate multiple versions of different clusters from a single dataset, each of which highlights a different aspect of the given dataset.

11.
Existing DTI studies have suggested that white matter damage constitutes an important part of the neurodegenerative changes in Alzheimer’s disease (AD). The present study aimed to identify the regional covariance patterns of microstructural white matter changes associated with AD. In this study, we applied a multivariate analysis approach, independent component analysis (ICA), to identify covariance patterns of microstructural white matter damage based on fractional anisotropy (FA) skeletonised images from DTI data in 39 AD patients and 41 healthy controls (HCs) from the Alzheimer’s Disease Neuroimaging Initiative database. The multivariate ICA decomposed the subject-dimension concatenated FA data into a mixing coefficient matrix and a source matrix. Twenty-eight independent components (ICs) were extracted, and a two-sample t-test on each column of the corresponding mixing coefficient matrix revealed significant AD/HC differences in ICA weights for 7 ICs. The covariant FA changes primarily involved the bilateral corona radiata, the superior longitudinal fasciculus, the cingulum, the hippocampal commissure, and the corpus callosum in AD patients compared to HCs. Our findings identified covariant white matter damage associated with AD based on DTI in combination with multivariate ICA, potentially expanding our understanding of the neuropathological mechanisms of AD.
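
The group comparison described in this abstract (a two-sample t-test on each column of the mixing coefficient matrix) can be illustrated with a minimal sketch. It is not the authors' code. It assumes the mixing matrix has one row per subject and one column per IC, and it assumes the pooled-variance (Student) form of the test, which the abstract does not specify. The names `pooled_t_statistic` and `t_per_component` are hypothetical, and only the t statistic is computed, not a p-value.

```python
import math

def mean(xs):
    return sum(xs) / len(xs)

def sample_variance(xs):
    # Unbiased sample variance (divides by n - 1).
    m = mean(xs)
    return sum((x - m) ** 2 for x in xs) / (len(xs) - 1)

def pooled_t_statistic(a, b):
    # Student's two-sample t statistic with pooled variance.
    # Assumption: equal variances in both groups; the abstract does not
    # say which variant of the two-sample test was used.
    # Raises ZeroDivisionError if both groups have zero variance.
    na, nb = len(a), len(b)
    pooled = ((na - 1) * sample_variance(a) + (nb - 1) * sample_variance(b)) / (na + nb - 2)
    return (mean(a) - mean(b)) / math.sqrt(pooled * (1 / na + 1 / nb))

def t_per_component(mixing, is_patient):
    # mixing: list of rows, one row per subject, one column per IC (assumed layout).
    # is_patient: list of booleans, True for patients, False for controls.
    patients = [row for row, flag in zip(mixing, is_patient) if flag]
    controls = [row for row, flag in zip(mixing, is_patient) if not flag]
    n_components = len(mixing[0])
    return [
        pooled_t_statistic([row[j] for row in patients], [row[j] for row in controls])
        for j in range(n_components)
    ]
```

With 28 components, as in the abstract, `t_per_component` would return 28 statistics, one per IC. Converting them to p-values and correcting for multiple comparisons is left out of this sketch.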

12.
DNA microarray gene expression and microarray-based comparative genomic hybridization (aCGH) have been widely used for biomedical discovery. Because of the large number of genes and the complex nature of biological networks, various analysis methods have been proposed. One such method is "gene shaving," a procedure which identifies subsets of the genes with coherent expression patterns and large variation across samples. Since combining genomic information from multiple sources can improve classification and prediction of diseases, in this paper we propose a new method, "ICA gene shaving" (ICA, independent component analysis), for jointly analyzing gene expression and copy number data. First we used ICA to analyze joint measurements, gene expression and copy number, of a biological system and project the data onto statistically independent biological processes. Next, we used these results to identify patterns of variation in the data and then applied an iterative shaving method. We investigated the properties of our proposed method by analyzing both simulated and real data. We demonstrated the robustness of our method to noise using simulated data. Using breast cancer data, we showed that our method is superior to the Generalized Singular Value Decomposition (GSVD) gene shaving method for identifying genes associated with breast cancer.

13.
Microarrays are commonly used in biology because of their ability to simultaneously measure thousands of genes under different conditions. Because these data typically contain a large number of variables but far fewer samples, scalable network analysis techniques are often employed. In particular, consensus approaches have recently been used that combine multiple microarray studies in order to find networks that are more robust. The purpose of this paper, however, is to combine multiple microarray studies to automatically identify subnetworks that are distinctive to specific experimental conditions rather than common to them all. To better understand key regulatory mechanisms and how they change under different conditions, we derive unique networks from multiple independent networks built using glasso, which goes beyond standard correlations. This involves calculating cluster prediction accuracies to detect the most predictive genes for a specific set of conditions. We differentiate between accuracies calculated using cross-validation within a selected cluster of studies (the intra prediction accuracy) and those calculated on a set of independent studies belonging to different study clusters (the inter prediction accuracy). Finally, we compare our method's results to related state-of-the-art techniques. We explore how the proposed pipeline performs on both synthetic data and real data (wheat and Fusarium). Our results show that subnetworks specific to subsets of studies can be identified reliably and that these networks reflect key mechanisms that are fundamental to the experimental conditions in each of those subsets.

14.
15.
Traditional k-means and most k-means variants are still computationally expensive for large datasets such as microarray data, which combine many data points with a large dimension d. In k-means clustering, we are given a set of n data points in d-dimensional space R^d and an integer k. The problem is to determine a set of k points in R^d, called centers, so as to minimize the mean squared distance from each data point to its nearest center. In this work, we develop a novel k-means algorithm, which is simple but more efficient than the traditional k-means and the recent enhanced k-means. Our new algorithm is based on the recently established relationship between principal component analysis and k-means clustering. We provide a correctness proof for this algorithm. Results obtained from testing the algorithm on three biological datasets and six non-biological datasets (three of these datasets are real, while the other three are simulated) also indicate that our algorithm is empirically faster than other known k-means algorithms. We assessed the quality of our algorithm's clusters against the clusters of a known structure using the Hubert-Arabie Adjusted Rand index (ARI_HA). We found that when k is close to d, the quality is good (ARI_HA > 0.8), and when k is not close to d, the quality of our new k-means algorithm is excellent (ARI_HA > 0.9). In this paper, the emphasis is on reducing the time requirement of the k-means algorithm and on its application to microarray data, motivated by the desire to create a tool for clustering and malaria research. However, the new clustering algorithm can be used for other clustering needs as long as an appropriate measure of distance between the centroids and the members is used. This has been demonstrated in this work on six non-biological datasets.
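
The k-means problem stated in this abstract (choose k centers in R^d that minimize the mean squared distance from each point to its nearest center) can be made concrete with a minimal sketch of the traditional Lloyd iteration. This is the baseline the abstract compares against, not the PCA-based algorithm the paper proposes. All function names are hypothetical, and the initial centers are passed in by the caller rather than chosen by any particular seeding scheme.

```python
def squared_distance(p, q):
    # Squared Euclidean distance between two points of equal dimension.
    return sum((a - b) ** 2 for a, b in zip(p, q))

def assign(points, centers):
    # Index of the nearest center for every point.
    return [
        min(range(len(centers)), key=lambda j: squared_distance(p, centers[j]))
        for p in points
    ]

def update(points, labels, centers):
    # Move each center to the mean of its assigned points.
    new_centers = []
    for j, old in enumerate(centers):
        members = [p for p, lab in zip(points, labels) if lab == j]
        if not members:
            new_centers.append(old)  # an empty cluster keeps its old center
            continue
        new_centers.append(
            tuple(sum(m[i] for m in members) / len(members) for i in range(len(old)))
        )
    return new_centers

def mean_squared_distance(points, centers):
    # The k-means objective: mean squared distance to the nearest center.
    return sum(min(squared_distance(p, c) for c in centers) for p in points) / len(points)

def lloyd_kmeans(points, initial_centers, max_iter=100):
    # Alternate assignment and update steps until the centers stop moving.
    centers = [tuple(c) for c in initial_centers]
    labels = assign(points, centers)
    for _ in range(max_iter):
        new_centers = update(points, labels, centers)
        if new_centers == centers:
            break
        centers = new_centers
        labels = assign(points, centers)
    return centers, labels
```

Each assignment step costs on the order of n·k·d operations, which is the cost that grows quickly for data with many points and a large dimension d. Lloyd's iteration only reaches a local minimum of the objective, so the result depends on the initial centers.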

16.
17.
MOTIVATION: Modern machine learning methods based on matrix decomposition techniques, like independent component analysis (ICA) or non-negative matrix factorization (NMF), provide new and efficient analysis tools which are currently being explored to analyze gene expression profiles. These exploratory feature extraction techniques yield expression modes (ICA) or metagenes (NMF). These extracted features are considered indicative of underlying regulatory processes. They can also be applied to the classification of gene expression datasets by grouping samples into different categories for diagnostic purposes or by grouping genes into functional categories for further investigation of related metabolic pathways and regulatory networks. RESULTS: In this study we focus on unsupervised matrix factorization techniques and apply ICA and sparse NMF to microarray datasets. The latter monitor the gene expression levels of human peripheral blood cells during differentiation from monocytes to macrophages. We show that these tools are able to identify relevant signatures in the deduced component matrices and extract informative sets of marker genes from these gene expression profiles. The methods rely on the joint discriminative power of a set of marker genes rather than on single marker genes. With these sets of marker genes, corroborated by leave-one-out or random forest cross-validation, the datasets could easily be classified into related diagnostic categories. The latter correspond to either monocytes versus macrophages or healthy individuals versus Niemann Pick C disease patients.

18.
Previous research has explored the changes in functional connectivity caused by smoking with the aid of fMRI. This study considers not only functional connectivity but also effective connectivity regarding both brain networks and brain regions by using a novel analysis framework that combines independent component analysis (ICA) and Granger causality analysis (GCA). We conducted a resting-state fMRI experiment in which twenty-one heavy smokers were scanned in two sessions under different conditions: smoking abstinence followed by smoking satiety. In our framework, group ICA was first adopted to obtain the spatial patterns of the default-mode network (DMN), executive-control network (ECN), and salience network (SN). Their associated time courses were analyzed using GCA, showing that the effective connectivity from SN to DMN was reduced and that from ECN/DMN to SN was enhanced after smoking replenishment. A paired t-test on ICA spatial patterns revealed functional connectivity variation in regions such as the insula, parahippocampus, precuneus, anterior cingulate cortex, supplementary motor area, and ventromedial/dorsolateral prefrontal cortex. These regions were then selected as the regions of interest (ROIs), and their effective connectivity was subsequently investigated using GCA. In smoking abstinence, the insula showed increased effective connectivity with the other ROIs, while in smoking satiety, the parahippocampus had enhanced inter-area effective connectivity. These results support our hypothesis that for deprived heavy smokers, smoking replenishment affects both functional and effective connectivity. Moreover, our analysis framework could be applied in a range of neuroscience studies.

19.
20.

Background  

Modeling of gene expression data from time course experiments often involves the use of linear models such as those obtained from principal component analysis (PCA), independent component analysis (ICA), or other methods. Such methods do not generally yield factors with a clear biological interpretation. Moreover, implicit assumptions about the measurement errors often limit the application of these methods to log-transformed data, destroying linear structure in the untransformed expression data.
