Similar articles
 Found 20 similar articles (search time: 15 ms)
1.

Background  

Gene set enrichment analysis (GSEA) is a microarray data analysis method that uses predefined gene sets and ranks of genes to identify significant biological changes in microarray data sets. GSEA is especially useful when gene expression changes in a given microarray data set are minimal or moderate.
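As an illustration of the rank-based idea described above, the following is a minimal sketch of a weighted running-sum enrichment score of the kind commonly associated with GSEA. It is not taken from the abstracted paper: the function name `enrichment_score`, the weighting exponent `p`, and the toy gene names are assumptions made for this example, and the permutation step used to assess significance is omitted.

```python
def enrichment_score(ranked_genes, scores, gene_set, p=1.0):
    """Return the signed maximum deviation of a running sum over a ranked list.

    ranked_genes: gene identifiers, ordered from most up- to most down-regulated.
    scores: ranking statistic for each gene, in the same order.
    gene_set: collection of gene identifiers forming the predefined set.
    p: exponent applied to |score| for genes in the set (hypothetical default).
    """
    in_set = [g in gene_set for g in ranked_genes]
    n_miss = len(ranked_genes) - sum(in_set)
    weights = [abs(s) ** p if hit else 0.0 for s, hit in zip(scores, in_set)]
    total_hit = sum(weights)
    if total_hit == 0 or n_miss == 0:
        raise ValueError("gene set must overlap the list without covering it")

    running = 0.0
    best = 0.0
    for w, hit in zip(weights, in_set):
        # Step up (weighted) at a set member, step down (uniform) otherwise.
        running += w / total_hit if hit else -1.0 / n_miss
        if abs(running) > abs(best):
            best = running
    return best


# Toy data: all names and values are made up for illustration.
genes = ["A", "B", "C", "D", "E", "F"]
stats = [3.0, 2.0, 1.0, -1.0, -2.0, -3.0]
top_score = enrichment_score(genes, stats, {"A", "B"})
bottom_score = enrichment_score(genes, stats, {"E", "F"})
```

A set concentrated at the top of the list yields a positive score and a set concentrated at the bottom yields a negative one, which is why modest but coordinated changes can still be detected.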

2.

Background  

Gene-set analysis evaluates the expression of biological pathways, or a priori defined gene sets, rather than that of individual genes, in association with a binary phenotype, and is of great biologic interest in many DNA microarray studies. Gene Set Enrichment Analysis (GSEA) has been applied widely as a tool for gene-set analyses. We describe here some critical problems with GSEA and propose an alternative method by extending the individual-gene analysis method, Significance Analysis of Microarray (SAM), to gene-set analyses (SAM-GS).

3.

Background  

Recently, microarray data analyses using functional pathway information, e.g., gene set enrichment analysis (GSEA) and significance analysis of function and expression (SAFE), have gained recognition as a way to identify biological pathways/processes associated with a phenotypic endpoint. In these analyses, a local statistic is used to assess the association between the expression level of a gene and the value of a phenotypic endpoint. Then these gene-specific local statistics are combined to evaluate association for pre-selected sets of genes. Commonly used local statistics include t-statistics for binary phenotypes and correlation coefficients that assume a linear or monotone relationship between a continuous phenotype and gene expression level. Methods applicable to continuous non-monotone relationships are needed. Furthermore, for multiple experimental categories, methods that combine multiple GSEA/SAFE analyses are needed.

4.

Background  

This paper presents a unified framework for finding differentially expressed genes (DEGs) from microarray data. The proposed framework has three interrelated modules: (i) gene ranking, (ii) significance analysis of genes and (iii) validation. The first module uses two gene selection algorithms, namely, a) two-way clustering and b) combined adaptive ranking, to rank the genes. The second module converts the gene ranks into p-values using an R-test and fuses the two sets of p-values using Fisher's omnibus criterion. The DEGs are selected using FDR analysis. The third module performs a three-fold validation of the obtained DEGs. The robustness of the proposed unified framework in gene selection is first illustrated using false discovery rate analysis. In addition, the clustering-based validation of the DEGs is performed by employing an adaptive subspace-based clustering algorithm on the training and the test datasets. Finally, a projection-based visualization is performed to validate the DEGs obtained using the unified framework.
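The fusion step in the second module can be illustrated with a short sketch of Fisher's omnibus (combined probability) test, which turns k independent p-values into the statistic X = -2 Σ ln p_i and compares it with a chi-square distribution on 2k degrees of freedom. The function name `fisher_combined_pvalue` and the example p-values are assumptions for illustration; how the abstracted paper derives its per-gene p-values (the R-test) is not reproduced here.

```python
import math


def fisher_combined_pvalue(pvalues):
    """Combine independent p-values with Fisher's omnibus method.

    Uses the closed-form chi-square survival function for even degrees
    of freedom, so no external statistics library is needed.
    """
    if not pvalues or any(not (0.0 < p <= 1.0) for p in pvalues):
        raise ValueError("p-values must lie in (0, 1]")
    k = len(pvalues)
    half_stat = -sum(math.log(p) for p in pvalues)  # equals X / 2
    # P(chi2_{2k} > X) = exp(-X/2) * sum_{j=0}^{k-1} (X/2)^j / j!
    term = 1.0
    total = 1.0
    for j in range(1, k):
        term *= half_stat / j
        total += term
    return math.exp(-half_stat) * total


# Hypothetical per-gene p-values from two ranking algorithms.
combined = fisher_combined_pvalue([0.05, 0.05])
```

Two moderately small p-values combine into a smaller one, which is the property that makes the criterion useful for fusing evidence from two ranking algorithms, provided the p-values are close to independent.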

5.

Background

Multiple microarray analyses of multiple sclerosis (MS) and its experimental models have been published in recent years.

Objective

Meta-analyses integrate the information from multiple studies and are suggested to be a powerful approach in detecting highly relevant and commonly affected pathways.

Data sources

ArrayExpress, Gene Expression Omnibus and PubMed databases were screened for microarray gene expression profiling studies of MS and its experimental animal models.

Study eligibility criteria

Studies comparing central nervous system (CNS) samples of diseased versus healthy individuals with n > 1 per group and publicly available raw data were selected.

Material and Methods

Included conditions for re-analysis of differentially expressed genes (DEGs) were MS, myelin oligodendrocyte glycoprotein-induced experimental autoimmune encephalomyelitis (EAE) in rats, proteolipid protein-induced EAE in mice, Theiler’s murine encephalomyelitis virus-induced demyelinating disease (TMEV-IDD), and a transgenic tumor necrosis factor-overexpressing mouse model (TNFtg). Since only a single MS raw data set fulfilled the inclusion criteria, a merged list containing the DEGs from two MS studies was additionally included. Cross-study analysis was performed employing list comparisons of DEGs and, alternatively, Gene Set Enrichment Analysis (GSEA).

Results

The intersection of DEGs in MS, EAE, TMEV-IDD, and TNFtg contained 12 genes related to macrophage functions. The intersection of EAE, TMEV-IDD and TNFtg comprised 40 DEGs, functionally related to positive regulation of immune response. In addition, GSEA identified substantially more differentially regulated pathways, including coagulation and JAK/STAT signaling.

Conclusion

A meta-analysis based on a simple comparison of DEGs is over-conservative. In contrast, the more experimental GSEA approach identified both a priori anticipated and promising new candidate pathways.

6.

Background  

Gene set enrichment testing has helped bridge the gap from an individual gene to a systems biology interpretation of microarray data. Although gene sets are defined a priori based on biological knowledge, current methods for gene set enrichment testing treat all genes equally. It is well known that some genes, such as those responsible for housekeeping functions, appear in many pathways, whereas other genes are more specialized and play a unique role in a single pathway. Drawing inspiration from the field of information retrieval, we have developed and present here an approach to incorporate gene appearance frequency (in KEGG pathways) into two current methods, Gene Set Enrichment Analysis (GSEA) and the logistic regression-based LRpath framework, to generate more reproducible and biologically meaningful results.
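One way to picture the information-retrieval analogy is an inverse-document-frequency style weight, where pathways play the role of documents and genes the role of terms. The sketch below is an assumption-laden illustration only: the abstract does not state the weighting formula the authors used, and the function name `appearance_weights`, the log(N / count) form, and the toy pathway contents are all hypothetical.

```python
import math
from collections import Counter


def appearance_weights(pathways):
    """Down-weight genes that appear in many pathways.

    pathways: dict mapping a pathway name to an iterable of gene identifiers.
    Returns a dict gene -> log(number of pathways / pathways containing gene),
    so a gene present in every pathway receives weight 0.
    """
    n_pathways = len(pathways)
    counts = Counter(g for genes in pathways.values() for g in set(genes))
    return {g: math.log(n_pathways / c) for g, c in counts.items()}


# Toy annotation: gene and pathway names are made up.
toy_pathways = {
    "pathway_1": ["housekeeper", "gene_a"],
    "pathway_2": ["housekeeper", "gene_b"],
    "pathway_3": ["housekeeper", "gene_a"],
}
weights = appearance_weights(toy_pathways)
```

Weights of this kind could then multiply each gene's contribution to an enrichment statistic, so that specialized genes count for more than ubiquitous ones.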

7.

Objective

Colorectal cancer (CRC) development involves underlying modifications at the genetic/epigenetic level. This study evaluated the role of Kras gene mutation and RASSF1A, FHIT and MGMT gene promoter hypermethylation, together and independently, in sporadic CRC in an Indian population, and their correlation with clinicopathological variables of the disease.

Methods

One hundred and twenty-four consecutive surgically resected tissues (62 tumor and an equal number of normal adjacent controls) of primary sporadic CRC were included, and patient details including demographic characteristics, lifestyle/food or drinking habits, and clinical and histopathological profiles were recorded. Polymerase chain reaction-restriction fragment length polymorphism and direct sequencing for Kras gene mutation and methylation-specific PCR for the RASSF1A, FHIT and MGMT genes were performed.

Results

Kras gene mutations at codons 12 and 13 and methylated RASSF1A, FHIT and MGMT genes were observed in 47%, 19%, 47%, 37% and 47% of cases, respectively. Alcohol intake and smoking were significantly associated with the presence of Kras mutation (codon 12) and MGMT methylation (p-value <0.049). Tumor stage and metastasis correlated with the presence of mutant Kras codon 12 (p-values 0.018, 0.044) and methylated RASSF1A (p-values 0.034, 0.044), FHIT (p-values 0.001, 0.047) and MGMT (p-values 0.018, 0.044) genes. A combinatorial effect of gene mutation/methylation was also observed (p-value <0.025). Overall, tumor stage 3, moderately differentiated tumors, presence of lymphatic invasion and absence of metastasis were more frequently observed in tumors with mutated Kras and/or methylated RASSF1A, FHIT and MGMT genes.

Conclusion

The synergistic interrelationship between these genes in sporadic CRC may serve as a diagnostic/prognostic marker in assessing the overall pathological status of CRC.

8.
PLoS ONE, 2015, 10(12)

Background

Fatigue is a debilitating condition with a significant impact on patients’ quality of life. Fatigue is frequently reported by patients suffering from primary Sjögren’s Syndrome (pSS), a chronic autoimmune condition characterised by dryness of the eyes and the mouth. However, although fatigue is common in pSS, it does not manifest in all sufferers, providing an excellent model with which to explore the potential underpinning biological mechanisms.

Methods

Whole blood samples from 133 fully phenotyped pSS patients stratified for the presence of fatigue, collected by the UK primary Sjögren’s Syndrome Registry, were used for whole genome microarray. The resulting data were analysed both on a gene-by-gene basis and using pre-defined groups of genes. Finally, gene set enrichment analysis (GSEA) was used as a feature selection technique for input into a support vector machine (SVM) classifier. Classification was assessed using the area under the curve (AUC) of the receiver operating characteristic and the standard error of the Wilcoxon statistic, SE(W).

Results

Although no genes were individually found to be associated with fatigue, 19 metabolic pathways were enriched in the high fatigue patient group using GSEA. Analysis revealed that these enrichments arose from the presence of a subset of 55 genes. A radial kernel SVM classifier with this subset of genes as input displayed significantly improved performance over classifiers using all pathway genes as input. The classifiers had AUCs of 0.866 (SE(W) 0.002) and 0.525 (SE(W) 0.006), respectively.

Conclusions

Systematic analysis of gene expression data from pSS patients discordant for fatigue identified 55 genes which are predictive of fatigue level using SVM classification. This list represents the first step in understanding the underlying pathophysiological mechanisms of fatigue in patients with pSS.
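The evaluation metric named in the Methods above, AUC together with a standard error of the Wilcoxon statistic, can be sketched as follows. The AUC is computed as the Wilcoxon/Mann-Whitney proportion of correctly ordered positive-negative pairs, and the standard error uses the Hanley and McNeil approximation. Whether the authors used this exact SE formula is not stated in the abstract, so treat that choice, the function name `auc_and_se`, and the toy scores as assumptions.

```python
import math


def auc_and_se(pos_scores, neg_scores):
    """Return (AUC, approximate standard error) for classifier scores.

    pos_scores: decision values for the positive class (e.g. high fatigue).
    neg_scores: decision values for the negative class.
    """
    n_pos, n_neg = len(pos_scores), len(neg_scores)
    if n_pos == 0 or n_neg == 0:
        raise ValueError("both classes need at least one score")
    # Count pairs ranked correctly; ties contribute one half.
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos_scores
        for n in neg_scores
    )
    auc = wins / (n_pos * n_neg)
    # Hanley-McNeil approximation (assumed choice of SE formula).
    q1 = auc / (2.0 - auc)
    q2 = 2.0 * auc ** 2 / (1.0 + auc)
    var = (
        auc * (1.0 - auc)
        + (n_pos - 1) * (q1 - auc ** 2)
        + (n_neg - 1) * (q2 - auc ** 2)
    ) / (n_pos * n_neg)
    return auc, math.sqrt(max(var, 0.0))


# Made-up decision values for a perfectly separating classifier.
perfect_auc, perfect_se = auc_and_se([0.9, 0.8], [0.1, 0.2])
```

An AUC near 0.5 indicates chance-level ranking, which is how the reported contrast between the two classifiers (0.866 versus 0.525) should be read.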

9.

Background  

Once specific genes are identified through high-throughput genomics technologies, there is a need to reduce the final gene list to a manageable size for validation studies. The triaging and sorting of genes often relies on the use of supplemental information related to gene structure, metabolic pathways, and chromosomal location. Yet in disease states where the genes may lack identifiable structural elements, or have poorly defined metabolic pathways or limited chromosomal data, flexible systems for obtaining additional data are necessary. In these situations, a tool for searching the biomedical literature using the list of identified genes while simultaneously defining additional search terms would be useful.

10.

Background

Gender differences in gene expression were estimated in liver samples from 9 males and 9 females. The study tested 31,110 genes for a gender difference using a design that adjusted for sources of variation associated with cDNA arrays, normalization, hybridizations and processing conditions.

Results

The genes were split into 2,800 that were clearly expressed (expressed genes) and 28,310 that had expression levels in the background range (not expressed genes). The distribution of p-values from the 'not expressed' group was consistent with no gender differences. The distribution of p-values from the 'expressed' group suggested that 8% of these genes differed by gender, but the estimated fold-changes (expression in males / expression in females) were small. The largest observed fold-change was 1.55. The 95% confidence bounds on the estimated fold-changes were less than 1.4-fold for 79.3%, and few (1.1%) exceeded 2-fold.

Conclusion

Observed gender differences in gene expression were small. When selecting genes with gender differences based upon their p-values, false discovery rates exceeded 80% for any set of genes, essentially making it impossible to identify any specific genes with a gender difference.

11.
12.

Background  

A common clustering method in the analysis of gene expression data has been hierarchical clustering. Usually the analysis involves selection of clusters by cutting the tree at a suitable level and/or analysis of a sorted gene list that is obtained with the tree. Cutting the hierarchical tree requires the selection of a suitable level and results in the loss of information from the other levels. Sorted gene lists depend on the sorting method of the joined clusters. The author proposes that the clusters should be selected using the gene classifications.

13.

Background  

Ranked gene lists from microarray experiments are usually analysed by assigning significance to predefined gene categories, e.g., based on functional annotations. Tools performing such analyses are often restricted to a category score based on a cutoff in the ranked list and a significance calculation based on random gene permutations as null hypothesis.

14.

Background  

Microarray experiments measure changes in the expression of thousands of genes. The resulting lists of genes with changes in expression are then searched for biologically related sets using several divergent methods such as the Fisher Exact Test (as used in multiple GO enrichment tools), Parametric Analysis of Gene Expression (PAGE), Gene Set Enrichment Analysis (GSEA), and the connectivity map.
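Of the methods listed above, the one-sided Fisher exact test for over-representation is the simplest to write down: it is the upper tail of a hypergeometric distribution. The sketch below is a generic illustration rather than the implementation of any particular GO enrichment tool; the function name `overrepresentation_pvalue`, its parameter names, and the example counts are assumptions.

```python
from math import comb


def overrepresentation_pvalue(n_universe, n_in_set, n_selected, n_overlap):
    """One-sided hypergeometric p-value for gene-set over-representation.

    n_universe: number of genes measured on the array.
    n_in_set:   number of those genes annotated to the gene set.
    n_selected: number of genes in the list of changed genes.
    n_overlap:  number of changed genes that belong to the gene set.
    Returns P(overlap >= n_overlap) under random selection.
    """
    max_overlap = min(n_in_set, n_selected)
    denominator = comb(n_universe, n_selected)
    numerator = sum(
        comb(n_in_set, k) * comb(n_universe - n_in_set, n_selected - k)
        for k in range(n_overlap, max_overlap + 1)
    )
    return numerator / denominator


# Made-up counts: 10 genes measured, 3 in the set, 3 changed, all 3 overlap.
p_small = overrepresentation_pvalue(10, 3, 3, 3)
```

Unlike rank-based methods such as GSEA, this test needs a hard cutoff to define the list of changed genes, which is one reason the methods can give divergent results.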

15.
16.

Background

Existing microarray studies of bone mineral density (BMD) have been critical for understanding the pathophysiology of osteoporosis, and have identified a number of candidate genes. However, these studies were limited by their relatively small sample sizes and were usually analyzed individually. Here, we propose a novel network-based meta-analysis approach that combines data across six microarray studies to identify functional modules from human protein-protein interaction (PPI) data, and highlight several differentially expressed genes (DEGs) and a functional module that may play an important role in BMD regulation in women.

Methods

Expression profiling studies were identified by searching PubMed, Gene Expression Omnibus (GEO) and ArrayExpress. Two meta-analysis methods were applied across different gene expression profiling studies. The first, a nonparametric Fisher’s method, combined p-values from individual experiments to identify genes with large effect sizes. The second method combined effect sizes from individual datasets into a meta-effect size to gain a higher precision of effect size estimation across all datasets. Genes with Q-test p-values < 0.05 or I² values > 50% were assessed by a random effects model and the remainder by a fixed effects model. Using Fisher’s combined p-values, functional modules were identified through an integrated analysis of microarray data in the context of large protein–protein interaction (PPI) networks. Two previously published meta-analysis studies of genome-wide association (GWA) datasets were used to determine whether these module genes were genetically associated with BMD. Pathway enrichment analysis was performed with a hypergeometric test.

Results

Six gene expression datasets were identified, which included a total of 249 (129 high BMD and 120 low BMD) female subjects. Using a network-based meta-analysis, a consensus module containing 58 genes (nodes) and 83 edges was detected. Pathway enrichment analysis of the 58 module genes revealed that these genes were enriched in several important KEGG pathways including Osteoclast differentiation, B cell receptor signaling pathway, MAPK signaling pathway, Chemokine signaling pathway and Insulin signaling pathway. The importance of module genes was replicated by demonstrating that most module genes were genetically associated with BMD in the GWAS data sets. Meta-analyses were performed at the individual gene level by combining p-values and effect sizes. Five candidate genes (ESR1, MAP3K3, PYGM, RAC1 and SYK) were identified based on gene expression meta-analysis, and their associations with BMD were also replicated by two BMD meta-analysis studies.

Conclusions

In summary, our network-based meta-analysis not only identified important differentially expressed genes but also discovered biologically meaningful functional modules for BMD determination. Our study may provide novel therapeutic targets for osteoporosis in women.
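The effect-size branch of the Methods above (a heterogeneity check followed by a fixed or random effects model) can be sketched for a single gene as follows. This is a simplified illustration under stated assumptions: it uses inverse-variance weighting, Cochran's Q, the I² statistic, and the DerSimonian-Laird estimate of between-study variance, and it switches models on the I² > 50% rule only. The Q-test p-value criterion is omitted, and the function name `meta_effect` and the example inputs are hypothetical.

```python
def meta_effect(effects, variances, i2_threshold=50.0):
    """Pool per-study effect sizes for one gene.

    effects:   effect size of the gene in each study.
    variances: sampling variance of each effect size.
    Returns (pooled_effect, i2_percent, model_name).
    """
    weights = [1.0 / v for v in variances]
    w_sum = sum(weights)
    fixed = sum(w * e for w, e in zip(weights, effects)) / w_sum

    # Cochran's Q and the I^2 heterogeneity statistic.
    q = sum(w * (e - fixed) ** 2 for w, e in zip(weights, effects))
    df = len(effects) - 1
    i2 = max(0.0, (q - df) / q) * 100.0 if q > 0 else 0.0
    if i2 <= i2_threshold:
        return fixed, i2, "fixed"

    # DerSimonian-Laird between-study variance (assumed estimator).
    c = w_sum - sum(w ** 2 for w in weights) / w_sum
    tau2 = max(0.0, (q - df) / c)
    re_weights = [1.0 / (v + tau2) for v in variances]
    pooled = sum(w * e for w, e in zip(re_weights, effects)) / sum(re_weights)
    return pooled, i2, "random"


# Made-up inputs: three concordant studies, then two discordant ones.
homogeneous = meta_effect([0.5, 0.5, 0.5], [0.1, 0.2, 0.1])
heterogeneous = meta_effect([0.0, 2.0], [0.01, 0.01])
```

When studies disagree strongly, the random effects weights become more even across studies, so a single large study no longer dominates the pooled estimate.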

17.
18.

Background  

With the current technological advances in high-throughput biology, the need to develop tools that help analyse the massive amount of data being generated is evident. Gene set enrichment analysis (GSEA) is a powerful method for inspecting large-scale data sets, and investigation of protein structural features can help determine the function of individual genes. However, a convenient tool that combines these two features to aid in high-throughput data analysis has not yet been developed. To fill this niche, we developed the user-friendly, web-based application PhenoFam.

19.
20.

Background  

A central task in contemporary biosciences is the identification of biological processes showing a response in genome-wide differential gene expression experiments. Two types of analysis are common. In the first, one generates an ordered list based on the differential expression values of the probed genes and examines the tail areas of the list for over-representation of various functional classes. In the second, one monitors the average differential expression level of genes belonging to a given functional class. So far these two types of method have not been combined.
