Similar articles
 20 similar articles found (search time: 62 ms)
1.
This paper presents Fuzzy-Adaptive-Subspace-Iteration-based Two-way Clustering (FASIC) of microarray data for finding differentially expressed genes (DEGs) from two-sample microarray experiments. The concept of fuzzy membership is introduced to transform the hard adaptive subspace iteration (ASI) algorithm into a fuzzy-ASI algorithm to perform two-way clustering. The proposed approach follows a progressive framework to assign a relevance value to genes associated with each cluster. Subsequently, each gene cluster is scored and ranked based on its potential to provide a correct classification of the sample classes. These ranks are converted into P values using the R-test, and the significance of each gene is determined. A fivefold validation is performed on the DEGs selected using the proposed approach. Empirical analyses on a number of simulated microarray data sets are conducted to quantify the results obtained using the proposed approach. To exemplify the efficacy of the proposed approach, further analyses on different real microarray data sets are also performed.

2.

Background  

To identify differentially expressed genes (DEGs) from microarray data, users of the Affymetrix GeneChip system need to select both a preprocessing algorithm to obtain expression-level measurements and a way of ranking genes to obtain the most plausible candidates. We recently recommended suitable combinations of a preprocessing algorithm and gene ranking method that can be used to identify DEGs with a higher level of sensitivity and specificity. However, in addition to these recommendations, researchers also want to know which combinations enhance reproducibility.

3.

Background

Existing microarray studies of bone mineral density (BMD) have been critical for understanding the pathophysiology of osteoporosis, and have identified a number of candidate genes. However, these studies were limited by their relatively small sample sizes and were usually analyzed individually. Here, we propose a novel network-based meta-analysis approach that combines data across six microarray studies to identify functional modules from human protein-protein interaction (PPI) data, and highlight several differentially expressed genes (DEGs) and a functional module that may play an important role in BMD regulation in women.

Methods

Expression profiling studies were identified by searching PubMed, Gene Expression Omnibus (GEO) and ArrayExpress. Two meta-analysis methods were applied across different gene expression profiling studies. The first, a nonparametric Fisher’s method, combined p-values from individual experiments to identify genes with large effect sizes. The second method combined effect sizes from individual datasets into a meta-effect size to gain a higher precision of effect size estimation across all datasets. Genes with Q test’s p-values < 0.05 or I² values > 50% were assessed by a random effects model and the remainder by a fixed effects model. Using Fisher’s combined p-values, functional modules were identified through an integrated analysis of microarray data in the context of large protein–protein interaction (PPI) networks. Two previously published meta-analysis studies of genome-wide association (GWA) datasets were used to determine whether these module genes were genetically associated with BMD. Pathway enrichment analysis was performed with a hypergeometric test.
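
The two meta-analysis ingredients named in these Methods can be sketched as follows. This is a minimal illustration and not the authors' code: the function names `fisher_combined_p` and `heterogeneity` are hypothetical, the p-values are assumed to be independent, and the Q-test p-value itself is left out (only Cochran's Q and I² are computed).

```python
import math

def fisher_combined_p(p_values):
    """Combine independent p-values with Fisher's method (hypothetical helper)."""
    k = len(p_values)
    # Test statistic X = -2 * sum(ln p_i); chi-square with 2k degrees of freedom under H0.
    x = -2.0 * sum(math.log(p) for p in p_values)
    # For even degrees of freedom the chi-square upper tail has a closed form:
    # P(X >= x) = exp(-x/2) * sum_{i=0}^{k-1} (x/2)^i / i!
    half = x / 2.0
    term, total = 1.0, 1.0
    for i in range(1, k):
        term *= half / i
        total += term
    return math.exp(-half) * total

def heterogeneity(effects, variances):
    """Return Cochran's Q and I^2 (in percent) for per-study effect sizes."""
    weights = [1.0 / v for v in variances]  # inverse-variance weights
    pooled = sum(w * e for w, e in zip(weights, effects)) / sum(weights)
    q = sum(w * (e - pooled) ** 2 for w, e in zip(weights, effects))
    df = len(effects) - 1
    # I^2 = max(0, (Q - df) / Q), reported as a percentage.
    i2 = max(0.0, (q - df) / q) * 100.0 if q > 0 else 0.0
    return q, i2
```

Under the rule quoted above, a gene whose I² from `heterogeneity` exceeds 50% would be passed to a random effects model, and the rest to a fixed effects model.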

Results

Six gene expression datasets were identified, which included a total of 249 (129 high BMD and 120 low BMD) female subjects. Using a network-based meta-analysis, a consensus module containing 58 genes (nodes) and 83 edges was detected. Pathway enrichment analysis of the 58 module genes revealed that these genes were enriched in several important KEGG pathways including Osteoclast differentiation, B cell receptor signaling pathway, MAPK signaling pathway, Chemokine signaling pathway and Insulin signaling pathway. The importance of module genes was replicated by demonstrating that most module genes were genetically associated with BMD in the GWAS data sets. Meta-analyses were performed at the individual gene level by combining p-values and effect sizes. Five candidate genes (ESR1, MAP3K3, PYGM, RAC1 and SYK) were identified based on gene expression meta-analysis, and their associations with BMD were also replicated by two BMD meta-analysis studies.
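
The hypergeometric enrichment test behind results like these can be sketched as a one-sided tail probability. This is an illustrative sketch only: the function name `hypergeom_enrichment_p` is hypothetical, and apart from the 58 module genes, every number in the usage note is made up for demonstration.

```python
from math import comb

def hypergeom_enrichment_p(population, pathway_size, sample_size, overlap):
    """P(X >= overlap) when sample_size genes are drawn without replacement
    from a population containing pathway_size pathway genes."""
    max_k = min(pathway_size, sample_size)
    total = comb(population, sample_size)
    # Sum the hypergeometric probabilities for every overlap at least as large
    # as the observed one; comb() returns 0 for impossible combinations.
    tail = sum(
        comb(pathway_size, k) * comb(population - pathway_size, sample_size - k)
        for k in range(overlap, max_k + 1)
    )
    return tail / total
```

For example, `hypergeom_enrichment_p(20000, 130, 58, 6)` would ask how surprising it is to find 6 genes of a hypothetical 130-gene pathway among 58 module genes, given a hypothetical background of 20000 genes.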

Conclusions

In summary, our network-based meta-analysis not only identified important differentially expressed genes but also discovered biologically meaningful functional modules for BMD determination. Our study may provide novel therapeutic targets for osteoporosis in women.

4.
Significance of gene ranking for classification of microarray samples   Cited by: 1 (self-citations: 0, citations by others: 1)
Many methods for classification and gene selection with microarray data have been developed. These methods usually give a ranking of genes. Evaluating the statistical significance of the gene ranking is important for understanding the results and for further biological investigations, but this question has not been well addressed for machine learning methods in existing works. Here, we address this problem by formulating it in the framework of hypothesis testing and propose a solution based on resampling. The proposed r-test methods convert gene ranking results into position p-values to evaluate the significance of genes. The methods are tested on three real microarray data sets and three simulation data sets with support vector machines as the method of classification and gene selection. The obtained position p-values help to determine the number of genes to be selected and enable scientists to analyze selection results by sophisticated multivariate methods under the same statistical inference paradigm as for simple hypothesis testing methods.

5.

Background

Biclustering is an important analysis procedure to understand the biological mechanisms from microarray gene expression data. Several algorithms have been proposed to identify biclusters, but very little effort was made to compare the performance of different algorithms on real datasets and combine the resultant biclusters into one unified ranking.

Results

In this paper we propose a differential co-expression framework and a differential co-expression scoring function to objectively quantify the quality or goodness of a bicluster of genes, based on the observation that genes in a bicluster are co-expressed in the conditions belonging to the bicluster and not co-expressed in the other conditions. Furthermore, we propose a scoring function to stratify biclusters into three types of co-expression. We used the proposed scoring functions to understand the performance and behavior of four well-established biclustering algorithms on six real datasets from different domains by combining their output into one unified ranking.

Conclusions

The differential co-expression framework provides a quantitative and objective assessment of the goodness of biclusters of co-expressed genes and of the performance of biclustering algorithms in identifying co-expression biclusters. It also helps to combine the biclusters output by different algorithms into one unified ranking, i.e. meta-biclustering.

6.
7.
8.

Background  

Genes work coordinately as gene modules or gene networks. Various computational approaches have been proposed to find gene modules based on gene expression data; for example, gene clustering is a popular method for grouping genes with similar gene expression patterns. However, traditional gene clustering often yields unsatisfactory results for regulatory module identification because the resulting gene clusters are co-expressed but not necessarily co-regulated.

9.

Background

Chromophobe renal cell carcinoma (ChRCC) is the second most common subtype of non-clear cell renal cell carcinoma (nccRCC), accounting for 4–5% of renal cell carcinoma (RCC). However, there is no effective biomarker to predict clinical outcomes of this malignant disease. Bioinformatic methods may offer a feasible way to address this problem.

Methods

In this study, differentially expressed genes (DEGs) of ChRCC samples in The Cancer Genome Atlas database were screened to construct co-expression modules by weighted gene co-expression network analysis, and the key module was identified by calculating module-trait correlations. Functional analysis was performed on the key module, and candidate hub genes were screened out by co-expression and MCODE analysis. Afterwards, real hub genes were filtered out in an independent dataset, GSE15641, and validated by survival analysis.
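
The two WGCNA steps mentioned in these Methods, building a weighted co-expression network and correlating a module with a trait, can be sketched roughly as below. This is a simplified illustration and not the WGCNA package: all function names are hypothetical, `beta=6` is an assumed soft-thresholding power, and the per-sample mean of z-scored genes stands in for the module eigengene (the first principal component) that WGCNA actually uses.

```python
import math

def pearson(x, y):
    """Pearson correlation of two equal-length, non-constant sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def soft_threshold_adjacency(expr, beta=6):
    """Unsigned weighted adjacency a_ij = |cor(gene_i, gene_j)| ** beta.
    expr holds one expression profile (list of samples) per gene."""
    n = len(expr)
    adj = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            adj[i][j] = adj[j][i] = abs(pearson(expr[i], expr[j])) ** beta
    return adj

def zscore(x):
    """Standardize a non-constant profile to mean 0 and unit variance."""
    n = len(x)
    mean = sum(x) / n
    sd = math.sqrt(sum((a - mean) ** 2 for a in x) / n)
    return [(a - mean) / sd for a in x]

def module_trait_correlation(module_expr, trait):
    """Correlate a module summary profile with a per-sample trait.
    Simplification: the summary is the mean of z-scored genes, not an eigengene."""
    scaled = [zscore(g) for g in module_expr]
    summary = [sum(g[s] for g in scaled) / len(scaled) for s in range(len(trait))]
    return pearson(summary, trait)
```

Raising the absolute correlation to a power suppresses weak correlations while keeping strong ones, which is why the network is called weighted. The mean-based summary only behaves like an eigengene when the module's genes are positively correlated with each other.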

Results

Overall, 2215 DEGs were screened out to construct eight co-expression modules. The brown module was identified as the key module because it had the highest correlations with pathologic stage, neoplasm status and survival status. Twenty-nine candidate hub genes were identified. GO and KEGG analysis demonstrated that most candidate genes were enriched in the mitotic cell cycle. Three real hub genes (SKA1, ERCC6L, GTSE-1) were selected after mapping candidate genes to GSE15641, and two of them (SKA1, ERCC6L) were significantly related to the overall survival of ChRCC patients.

Conclusions

In summary, our findings identified molecular markers correlated with progression and prognosis of ChRCC, which might provide new implications for improving risk evaluation, therapeutic intervention, and prognosis prediction in ChRCC patients.

10.
11.

Background  

Feature selection is an important pre-processing task in the analysis of complex data. Selecting an appropriate subset of features can improve classification or clustering and lead to better understanding of the data. An important example is that of finding an informative group of genes out of thousands that appear in gene-expression analysis. Numerous supervised methods have been suggested but only a few unsupervised ones exist. Unsupervised Feature Filtering (UFF) is such a method, based on an entropy measure of Singular Value Decomposition (SVD), ranking features and selecting a group of preferred ones.

12.
13.
14.
Renal cell carcinoma (RCC) is the most common type of renal tumor, and clear cell renal cell carcinoma (ccRCC) is the most frequent subtype. In this study, our aim is to identify potential biomarkers that could effectively predict the prognosis and progression of ccRCC. First, we used The Cancer Genome Atlas (TCGA) RNA-sequencing (RNA-seq) data of ccRCC to identify 2370 differentially expressed genes (DEGs). Second, the DEGs were used to construct a coexpression network by weighted gene coexpression network analysis (WGCNA). Moreover, we identified the yellow module, which was strongly related to the histologic grade and pathological stage of ccRCC. Then, functional annotation of the yellow module and single-sample gene-set enrichment analysis of the DEGs were performed, and the genes were mainly enriched in the cell cycle. Subsequently, 18 candidate hub genes were screened through WGCNA and protein–protein interaction (PPI) network analysis. After verification with TCGA’s ccRCC data set, a Gene Expression Omnibus (GEO) data set (GSE73731) and tissue validation, we finally identified 15 hub genes that can actually predict the progression of ccRCC. In addition, by using survival analysis, we found that ccRCC patients with high expression of each hub gene were more likely to have a poor prognosis than those with low expression. The receiver operating characteristic curve showed that each hub gene could effectively distinguish between localized and advanced ccRCC. In summary, our study indicates that 15 hub genes have great predictive value for the prognosis and progression of ccRCC, and may contribute to the exploration of the pathogenesis of ccRCC.

15.

Background  

Biological processes are carried out by coordinated modules of interacting molecules. As clustering methods demonstrate that genes with similar expression display increased likelihood of being associated with a common functional module, networks of coexpressed genes provide one framework for assigning gene function. This has informed the guilt-by-association (GBA) heuristic, widely invoked in functional genomics. Yet although the idea of GBA is accepted, the breadth of GBA applicability is uncertain.

16.
17.

Background  

Normalization of gene expression data refers to the comparison of expression values using reference standards that are consistent across all conditions of an experiment. In PCR studies, genes designated as "housekeeping genes" have been used as internal reference genes under the assumption that their expression is stable and independent of experimental conditions. However, verification of this assumption is rarely performed. Here we assess the use of gene microarray analysis to facilitate selection of internal reference sequences with higher expression stability across experimental conditions than can be expected using traditional selection methods.

18.

Background  

A common clustering method in the analysis of gene expression data has been hierarchical clustering. Usually the analysis involves selection of clusters by cutting the tree at a suitable level and/or analysis of a sorted gene list that is obtained with the tree. Cutting the hierarchical tree requires the selection of a suitable level, and it results in the loss of information on the other levels. Sorted gene lists depend on the sorting method of the joined clusters. The author proposes that the clusters should be selected using the gene classifications.

19.
Low temperature has become a major abiotic stress factor that can reduce maize yield and cause considerable economic losses. This study was designed to identify key genes and pathways associated with cold resistance in maize. The gene expression profile GSE46704, comprising 4 plants treated at control temperature and 4 plants treated at low temperature, was downloaded from the Gene Expression Omnibus database. Differentially expressed genes (DEGs) were identified with the limma package. Then, a protein-protein interaction (PPI) network was constructed and modules were selected using Cytoscape. Moreover, the DEGs were re-matched based on the Zea mays L. gene ID and symbol data from PlantRegMap. Finally, functional and pathway enrichment analyses of the re-matched DEGs were performed with the DAVID online tool. A total of 750 DEGs were screened (387 up-regulated and 363 down-regulated genes). In the PPI network, GRMZM2G070837_P01 and GRMZM2G114578_P01 had higher degrees. In addition, carbohydrate metabolic process, starch and sucrose metabolism, and biosynthesis of secondary metabolites were significantly enriched in the functional and pathway enrichment analyses. GRMZM2G070837_P01 and GRMZM2G114578_P01 might play a critical role in the cold resistance of maize. Meanwhile, carbohydrate metabolic process, starch and sucrose metabolism, and biosynthesis of secondary metabolites might function in the cold resistance of maize.

20.

Background  

Gene clustering has been widely used to group genes with similar expression patterns in microarray data analysis. Subsequent enrichment analysis using predefined gene sets can provide clues on which functional themes or regulatory sequence motifs are associated with individual gene clusters. Despite their potential utility, gene clustering and enrichment analysis have been used on separate platforms; thus, the development of an integrative algorithm linking both methods is highly challenging.
