Similar articles
 Found 20 similar articles; search took 15 ms
1.
Competitive gene set tests are commonly used in molecular pathway analysis to test for enrichment of a particular gene annotation category amongst the differential expression results from a microarray experiment. Existing gene set tests that rely on gene permutation are shown here to be extremely sensitive to inter-gene correlation. Several data sets are analyzed to show that inter-gene correlation is non-ignorable even for experiments on homogeneous cell populations using genetically identical model organisms. A new gene set test procedure (CAMERA) is proposed based on the idea of estimating the inter-gene correlation from the data, and using it to adjust the gene set test statistic. An efficient procedure is developed for estimating the inter-gene correlation and characterizing its precision. CAMERA is shown to control the type I error rate correctly regardless of inter-gene correlations, yet retains excellent power for detecting genuine differential expression. Analysis of breast cancer data shows that CAMERA recovers known relationships between tumor subtypes in very convincing terms. CAMERA can be used to analyze specified sets or as a pathway analysis tool using a database of molecular signatures.
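
Why inter-gene correlation matters can be illustrated with the variance of a set-level mean. For m gene-level statistics with common variance σ² and average pairwise correlation ρ, the variance of their mean is σ²/m multiplied by 1 + (m − 1)ρ. The sketch below computes that inflation factor. It is a minimal illustration of the idea, not the CAMERA implementation, and the function names are hypothetical.

```python
def variance_inflation_factor(set_size: int, mean_correlation: float) -> float:
    """Factor by which correlation inflates the variance of a set mean.

    Assumes every gene-level statistic has the same variance and that
    mean_correlation is the average pairwise correlation within the set.
    """
    if set_size < 1:
        raise ValueError("set_size must be at least 1")
    return 1.0 + (set_size - 1) * mean_correlation


def variance_of_set_mean(set_size: int, gene_variance: float,
                         mean_correlation: float) -> float:
    """Variance of the mean of set_size correlated gene-level statistics."""
    independent_variance = gene_variance / set_size
    return independent_variance * variance_inflation_factor(
        set_size, mean_correlation
    )
```

Even a small average correlation becomes important for large sets, because the factor grows linearly with the set size. A test that assumes independence would understate the variance by exactly this factor.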

2.

Background  

Gene set enrichment testing has helped bridge the gap from an individual gene to a systems biology interpretation of microarray data. Although gene sets are defined a priori based on biological knowledge, current methods for gene set enrichment testing treat all genes equally. It is well known that some genes, such as those responsible for housekeeping functions, appear in many pathways, whereas other genes are more specialized and play a unique role in a single pathway. Drawing inspiration from the field of information retrieval, we have developed and present here an approach to incorporate gene appearance frequency (in KEGG pathways) into two current methods, Gene Set Enrichment Analysis (GSEA) and the logistic regression-based LRpath framework, to generate more reproducible and biologically meaningful results.
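
The abstract does not state the weighting formula, so the sketch below assumes an inverse-document-frequency style weight borrowed from information retrieval: a gene that appears in many pathways receives a lower weight than a gene that appears in one. The function name, the `1 + log(N / count)` form, and the toy pathway names are all assumptions made for illustration.

```python
import math
from collections import Counter
from typing import Dict, Mapping, Set


def gene_frequency_weights(pathways: Mapping[str, Set[str]]) -> Dict[str, float]:
    """Weight each gene by how rarely it appears across pathways.

    Uses 1 + log(N / count), where N is the number of pathways and count is
    the number of pathways containing the gene. This specific form is an
    assumption, not the formula used by the authors.
    """
    n_pathways = len(pathways)
    counts = Counter(gene for genes in pathways.values() for gene in genes)
    return {
        gene: 1.0 + math.log(n_pathways / count)
        for gene, count in counts.items()
    }


# Hypothetical toy data: g1 behaves like a housekeeping gene.
toy_pathways = {
    "pathway_A": {"g1", "g2"},
    "pathway_B": {"g1", "g3"},
    "pathway_C": {"g1"},
}
weights = gene_frequency_weights(toy_pathways)
```

With weights of this kind, a gene present in every pathway contributes the minimum weight of 1, while pathway-specific genes contribute more to the enrichment statistic.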

3.
Identifying differential features between conditions is a popular approach to understanding molecular features and their mechanisms underlying a biological process of particular interest. Although many tests for identifying differential expression of genes or gene sets have been proposed, there has been limited success in developing methods for differential interactions of genes between conditions because of the computational complexity involved. We present a method for Evaluation of Dependency DifferentialitY (EDDY), which is a statistical test for differential dependencies of a set of genes between two conditions. Unlike previous methods focused on differential expression of individual genes or correlation changes of individual gene–gene interactions, EDDY compares two conditions by evaluating the probability distributions of dependency networks from genes. The method has been evaluated and compared with other methods through simulation studies, and application to glioblastoma multiforme data resulted in informative cancer and glioblastoma multiforme subtype-related findings. The comparison with Gene Set Enrichment Analysis, a differential expression-based method, revealed that EDDY identifies gene sets that are complementary to those identified by Gene Set Enrichment Analysis. EDDY also produced far fewer false positives than Gene Set Co-expression Analysis, a method based on correlation changes of individual gene–gene interactions, thus providing more informative results. The Java implementation of the algorithm is freely available to noncommercial users. Download from: http://biocomputing.tgen.org/software/EDDY.

4.

Background  

Gene-set analysis evaluates the expression of biological pathways, or a priori defined gene sets, rather than that of individual genes, in association with a binary phenotype, and is of great biologic interest in many DNA microarray studies. Gene Set Enrichment Analysis (GSEA) has been applied widely as a tool for gene-set analyses. We describe here some critical problems with GSEA and propose an alternative method by extending the individual-gene analysis method, Significance Analysis of Microarray (SAM), to gene-set analyses (SAM-GS).

5.

Background  

The analysis of high-throughput gene expression data with respect to sets of genes rather than individual genes has many advantages. A variety of methods have been developed for assessing the enrichment of sets of genes with respect to differential expression. In this paper we provide a comparative study of four of these methods: Fisher's exact test, Gene Set Enrichment Analysis (GSEA), Random-Sets (RS), and Gene List Analysis with Prediction Accuracy (GLAPA). The first three methods use associative statistics, while the fourth uses predictive statistics. We first compare all four methods on simulated data sets to verify that Fisher's exact test is markedly worse than the other three approaches. We then validate the other three methods on seven real data sets with known genetic perturbations and then compare the methods on two cancer data sets where our a priori knowledge is limited.
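
For reference, the simplest of the four methods can be written in a few lines. The sketch below computes a one-sided Fisher's exact test p-value for over-representation as the upper tail of a hypergeometric distribution. It is a generic illustration and not the code used in the study; the function and parameter names are hypothetical.

```python
from math import comb


def enrichment_p_value(n_genes: int, n_de: int, set_size: int,
                       n_de_in_set: int) -> float:
    """One-sided Fisher's exact test for over-representation.

    n_genes      total genes measured
    n_de         genes called differentially expressed
    set_size     genes in the gene set
    n_de_in_set  differentially expressed genes that fall in the set

    Returns P(X >= n_de_in_set) when X is hypergeometric.
    """
    max_overlap = min(n_de, set_size)
    total = comb(n_genes, set_size)
    tail = sum(
        comb(n_de, k) * comb(n_genes - n_de, set_size - k)
        for k in range(n_de_in_set, max_overlap + 1)
    )
    return tail / total
```

Note that this test needs a hard cutoff to decide which genes count as differentially expressed, and it treats genes as independent. Both are common criticisms of the approach.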

6.
Extensions to gene set enrichment   Total citations: 2; self-citations: 0; citations by others: 2
MOTIVATION: Gene Set Enrichment Analysis (GSEA) has been developed recently to capture changes in the expression of pre-defined sets of genes. We propose a number of extensions to GSEA, including the use of different statistics to describe the association between genes and phenotypes of interest. We make use of dimension reduction procedures, such as principal component analysis, to identify gene sets with correlated expression. We also address issues that arise when gene sets overlap. RESULTS: Our proposals extend the range of applicability of GSEA and allow for adjustments based on other covariates. We have provided a well-defined procedure to address interpretation issues that can arise when gene sets have substantial overlap. We have shown how standard dimension reduction methods, such as PCA, can be used to help further interpret GSEA. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.

7.
Autologous cell-based therapies promise important developments for reconstructive surgery. In vitro expansion as well as differentiation strategies could provide a substantial benefit to cellular therapies. Human dermal fibroblasts, considered ubiquitous connective tissue cells, can be coaxed towards different cellular fates, are readily available and may altogether be a suitable cell source for tissue engineering strategies. Global gene expression analysis was performed to investigate the changes of the fibroblast phenotype after four-week inductions toward adipocytic, osteoblastic and chondrocytic lineages. Differential gene regulation, interpreted through Gene Set Enrichment Analysis, highlights important similarities and differences of induced fibroblasts compared to control cultures of human fibroblasts, adipocytes, osteoblasts and articular chondrocytes. Fibroblasts show an inherent degree of phenotype plasticity that can be controlled to obtain cells supportive of multiple tissue types.

8.
Interpretation of microarray data remains a challenge, and most methods fail to consider the complex, nonlinear regulation of gene expression. To address that limitation, we introduce Learner of Functional Enrichment (LeFE), a statistical/machine learning algorithm based on Random Forest, and demonstrate it on several diverse datasets: smoker/never smoker, breast cancer classification, and cancer drug sensitivity. We also compare it with previously published algorithms, including Gene Set Enrichment Analysis. LeFE regularly identifies statistically significant functional themes consistent with known biology.

9.
Chronic obstructive pulmonary disease (COPD) is a smoking-related disease that lacks effective therapies due partly to the poor understanding of disease pathogenesis. The aim of this study was to identify molecular pathways that could be responsible for the damaging consequences of smoking. To do this, we employed Gene Set Enrichment Analysis to analyze differences in global gene expression, which we then related to the pathological changes induced by cigarette smoke (CS). Sprague-Dawley rats were exposed to whole body CS for 1 day and for various periods up to 8 mo. Gene Set Enrichment Analysis of microarray data identified that metabolic processes were most significantly increased early in the response to CS. Gene sets involved in stress response and inflammation were also upregulated. CS exposure increased neutrophil chemokines, cytokines, and proteases (MMP-12) linked to the pathogenesis of COPD. After a transient acute response, the CS-exposed rats developed a distinct molecular signature after 2 wk, which was followed by the chronic phase of the response. During this phase, gene sets related to immunity and defense progressively increased and predominated at the later time points in smoke-exposed rats. Chronic CS inhalation recapitulated many of the phenotypic changes observed in COPD patients including oxidative damage to macrophages, a slowly resolving inflammation, epithelial damage, mucus hypersecretion, airway fibrosis, and emphysema. As such, it appears that metabolic pathways are central to dealing with the stress of CS exposure; however, over time, inflammation and stress response gene sets become the most significantly affected in the chronic response to CS.

10.
Absolute enrichment: gene set enrichment analysis for homeostatic systems   Total citations: 1; self-citations: 0; citations by others: 1
Gene Set Enrichment Analysis (GSEA) identifies sets of genes that are differentially regulated in one direction. Many homeostatic systems will include one limb that is upregulated in response to a downregulation of another limb and vice versa. Such patterns are poorly captured by the standard formulation of GSEA. We describe a technique to identify groups of genes (which sometimes can be pathways) that include both up- and down-regulated components. This approach lends insight into the feedback mechanisms that may operate, especially when integrated with protein interaction databases.
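
The problem with one-directional scoring can be shown with a toy example. The sketch below uses an unweighted running-sum enrichment score and ranks genes either by their signed statistic or by its absolute value. This is an illustration of the general idea under simplifying assumptions, not the authors' exact formulation; the gene names and scores are made up.

```python
from typing import Dict, List, Set


def rank_genes(scores: Dict[str, float], absolute: bool = False) -> List[str]:
    """Order genes from highest to lowest score, optionally by magnitude."""
    if absolute:
        return sorted(scores, key=lambda g: abs(scores[g]), reverse=True)
    return sorted(scores, key=lambda g: scores[g], reverse=True)


def enrichment_score(ranked_genes: List[str], gene_set: Set[str]) -> float:
    """Unweighted running-sum score: the largest deviation from zero."""
    hits = [gene in gene_set for gene in ranked_genes]
    n_hits = sum(hits)
    n_miss = len(ranked_genes) - n_hits
    if n_hits == 0 or n_miss == 0:
        raise ValueError("need at least one gene inside and outside the set")
    running, best = 0.0, 0.0
    for is_hit in hits:
        # Step up on a set member, step down on any other gene.
        running += 1.0 / n_hits if is_hit else -1.0 / n_miss
        if abs(running) > abs(best):
            best = running
    return best


# Hypothetical set with two up-regulated and two down-regulated members.
toy_scores = {
    "u1": 3.0, "u2": 2.5, "n1": 0.4, "n2": 0.1,
    "n3": -0.2, "n4": -0.3, "d1": -2.6, "d2": -3.1,
}
toy_set = {"u1", "u2", "d1", "d2"}

signed_es = enrichment_score(rank_genes(toy_scores), toy_set)
absolute_es = enrichment_score(rank_genes(toy_scores, absolute=True), toy_set)
```

In this toy case the set members sit at both ends of the signed ranking, so the running sum only reaches half of its maximum. Ranking by magnitude places all four members at the top and the score reaches its maximum of 1.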

11.
Analyses of gene set differential coexpression may shed light on molecular mechanisms underlying phenotypes and diseases. However, differential coexpression analyses of conceptually similar individual studies are often inconsistent and underpowered to provide definitive results. Researchers can greatly benefit from an open-source application facilitating the aggregation of evidence of differential coexpression across studies and the estimation of more robust common effects. We developed Meta Gene Set Coexpression Analysis (MetaGSCA), an analytical tool to systematically assess differential coexpression of an a priori defined gene set by aggregating evidence across studies to provide a definitive result. In the kernel, a nonparametric approach that accounts for the gene-gene correlation structure is used to test whether the gene set is differentially coexpressed between two comparative conditions, from which a permutation test p-statistic is computed for each individual study. A meta-analysis is then performed to combine individual study results with one of two options: a random-intercept logistic regression model or the inverse variance method. We demonstrated MetaGSCA in case studies investigating two human diseases and identified pathways highly relevant to each disease across studies. We further applied MetaGSCA in a pan-cancer analysis with hundreds of major cellular pathways in 11 cancer types. The results indicated that a majority of the pathways identified were dysregulated in the pan-cancer scenario, many of which have been previously reported in the cancer literature. Our analysis with randomly generated gene sets showed excellent specificity, indicating that the significant pathways/gene sets identified by MetaGSCA are unlikely to be false positives. MetaGSCA is a user-friendly tool implemented both as a Web-based application and as an R package, “MetaGSCA”. It enables comprehensive meta-analyses of gene set differential coexpression data, with an optional module of post hoc pathway crosstalk network analysis to identify and visualize pathways having similar coexpression profiles.
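
Of the two pooling options named above, the inverse variance method is the simpler one. The sketch below shows the generic fixed-effect version: each study's estimate is weighted by the reciprocal of its variance. The abstract does not say which effect measure or transformation MetaGSCA pools, so the inputs here are assumed to be per-study estimates with known variances, and the function name is hypothetical.

```python
import math
from typing import Sequence, Tuple


def inverse_variance_pool(estimates: Sequence[float],
                          variances: Sequence[float]) -> Tuple[float, float]:
    """Fixed-effect inverse-variance pooling.

    Returns the pooled estimate and its standard error.
    """
    if len(estimates) != len(variances) or len(estimates) == 0:
        raise ValueError("need one variance per estimate, and at least one study")
    if any(v <= 0 for v in variances):
        raise ValueError("variances must be positive")
    weights = [1.0 / v for v in variances]
    total_weight = sum(weights)
    pooled = sum(w * e for w, e in zip(weights, estimates)) / total_weight
    standard_error = math.sqrt(1.0 / total_weight)
    return pooled, standard_error
```

Studies with smaller variance, typically the larger ones, dominate the pooled estimate, and the pooled standard error is never larger than that of the most precise single study.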

12.

Background  

Microarray experiments measure changes in the expression of thousands of genes. The resulting lists of genes with changes in expression are then searched for biologically related sets using several divergent methods such as the Fisher Exact Test (as used in multiple GO enrichment tools), Parametric Analysis of Gene Expression (PAGE), Gene Set Enrichment Analysis (GSEA), and the connectivity map.

13.
14.
Gene Set Context Analysis (GSCA) is an open source software package to help researchers use massive amounts of publicly available gene expression data (PED) to make discoveries. Users can interactively visualize and explore gene and gene set activities in 25,000+ consistently normalized human and mouse gene expression samples representing diverse biological contexts (e.g. different cells, tissues and disease types). By providing one or multiple genes or gene sets as input and specifying a gene set activity pattern of interest, users can query the expression compendium to systematically identify biological contexts associated with the specified gene set activity pattern. In this way, researchers with new gene sets from their own experiments may discover previously unknown contexts of gene set functions and hence increase the value of their experiments. GSCA has a graphical user interface (GUI). The GUI makes the analysis convenient and customizable. Analysis results can be conveniently exported as publication quality figures and tables. GSCA is available at https://github.com/zji90/GSCA. This software significantly lowers the bar for biomedical investigators to use PED in their daily research for generating and screening hypotheses, which was previously difficult because of the complexity, heterogeneity and size of the data.

15.
Zhang F, Guo X, Wang W, Yan H, Li C. PLoS ONE, 2011, 6(7): e22983
Kashin-Beck Disease (KBD) is an endemic osteochondropathy, the pathogenesis of which remains unclear. In this study, we compared gene expression profiles of articular cartilage derived respectively from KBD patients and normal controls. Total RNA was isolated, amplified, labeled and hybridized to Agilent human 1A 22 k whole genome microarray chip. qRT-PCR was conducted to validate our microarray data. We detected 57 up-regulated genes (ratios ≥2.0) and 24 down-regulated genes (ratios ≤0.5) in KBD cartilage. To further identify the key genes involved in the pathogenesis of KBD, Bayesian analysis of variance for microarrays (BAM) software was applied and identified 12 potential key genes with an average ratio of 6.64, involved in apoptosis, metabolism, cytokine & growth factor and cytoskeleton & cell movement. Gene Set Enrichment Analysis (GSEA) software was used to identify differentially expressed gene ontology categories and pathways. GSEA found that a set of apoptosis, hypoxia and mitochondrial function related gene ontology categories and pathways were significantly up-regulated in KBD compared to normal controls. Based on the results of this study, we suggest that chronic hypoxia-induced mitochondrial damage and apoptosis might play an important role in the pathogenesis of KBD. Our efforts may help to understand the pathogenesis of KBD as well as other forms of osteoarthrosis with similar articular cartilage lesions.
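
The ratio cutoffs quoted above (≥2.0 for up-regulation, ≤0.5 for down-regulation) amount to a simple threshold rule. The sketch below applies that rule to a mapping of gene names to expression ratios. The function name and the example gene labels are hypothetical, and the study's actual pipeline may have included further filtering that the abstract does not describe.

```python
from typing import Dict


def classify_by_ratio(ratios: Dict[str, float],
                      up_cutoff: float = 2.0,
                      down_cutoff: float = 0.5) -> Dict[str, str]:
    """Label each gene as 'up', 'down' or 'unchanged' from its expression ratio."""
    calls = {}
    for gene, ratio in ratios.items():
        if ratio >= up_cutoff:
            calls[gene] = "up"
        elif ratio <= down_cutoff:
            calls[gene] = "down"
        else:
            calls[gene] = "unchanged"
    return calls


# Hypothetical ratios (case / control) for three placeholder genes.
example_calls = classify_by_ratio({"geneA": 2.0, "geneB": 0.5, "geneC": 1.2})
```

The two cutoffs are symmetric on a log scale: a ratio of 2.0 and a ratio of 0.5 both correspond to a two-fold change.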

16.
17.
Genomics, 2020, 112(3): 2541-2549
Chromosome segregation defects lead to aneuploidy which is a major feature of solid tumors. How diploid cells face chromosome mis-segregation and how aneuploidy is tolerated in tumor cells are not completely defined yet. Thus, an important goal of cancer genetics is to identify gene networks that underlie aneuploidy and are involved in its tolerance. To this aim, we induced aneuploidy in IMR90 human primary cells by depleting pRB, DNMT1 and MAD2 and analyzed their gene expression profiles by microarray analysis. Bioinformatic analysis revealed a common gene expression profile of IMR90 cells that became aneuploid. Gene Set Enrichment Analysis (GSEA) also revealed gene-sets/pathways that are shared by aneuploid IMR90 cells that may be exploited for novel therapeutic approaches in cancer. Furthermore, Protein-Protein Interaction (PPI) network analysis identified TOP2A and KIF4A as hub genes that may be important for aneuploidy establishment.

18.
19.
20.