Similar articles
 Found 20 similar articles (search time: 15 ms)
1.
ABSTRACT: BACKGROUND: Gene-set enrichment analyses (GEA or GSEA) are commonly used for biological characterization of an experimental gene-set. This is done by finding known functional categories, such as pathways or Gene Ontology terms, that are over-represented in the experimental set; the assessment is based on an overlap statistic. Rich biological information in the form of gene interaction networks is now widely available, but this topological information is not used by GEA, so there is a need for methods that exploit this type of information in high-throughput data analysis. RESULTS: We developed a method of network enrichment analysis (NEA) that extends the overlap statistic in GEA to network links between genes in the experimental set and those in the functional categories. For the crucial step in statistical inference, we developed a fast network randomization algorithm in order to obtain the distribution of any network statistic under the null hypothesis of no association between an experimental gene-set and a functional category. We illustrate the NEA method using gene and protein expression data from a lung cancer study. CONCLUSIONS: The results indicate that the NEA method is more powerful than the traditional GEA, primarily because the relationships between gene sets were more strongly captured by network connectivity than by simple overlaps.
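
As a rough illustration of the two statistics contrasted in this abstract, the sketch below computes a hypergeometric tail probability for a simple gene-set overlap and counts network links between an experimental set and a functional category. It is a minimal sketch, not the authors' implementation: the function names `overlap_pvalue` and `count_links` are hypothetical, the hypergeometric null is one common choice for an overlap statistic, and the network randomization step described in the abstract is not reproduced.

```python
from math import comb


def overlap_pvalue(universe, experimental, category):
    """P(overlap >= observed) under a hypergeometric null (an assumed null model)."""
    universe = set(universe)
    exp = set(experimental) & universe
    cat = set(category) & universe
    n_total, n_exp, n_cat = len(universe), len(exp), len(cat)
    observed = len(exp & cat)
    upper = min(n_exp, n_cat)
    # Sum the hypergeometric probabilities of every overlap at least as large.
    tail = sum(
        comb(n_cat, k) * comb(n_total - n_cat, n_exp - k)
        for k in range(observed, upper + 1)
    )
    return tail / comb(n_total, n_exp)


def count_links(edges, experimental, category):
    """Count undirected edges joining an experimental gene to a category gene."""
    exp, cat = set(experimental), set(category)
    return sum(
        1
        for a, b in edges
        if (a in exp and b in cat) or (b in exp and a in cat)
    )
```

In this sketch the link count plays the role of the network statistic; its null distribution would still have to come from some randomization of the network, which is the part the abstract attributes to the authors' own algorithm.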

2.

Background  

Recently, microarray data analyses using functional pathway information, e.g., gene set enrichment analysis (GSEA) and significance analysis of function and expression (SAFE), have gained recognition as a way to identify biological pathways/processes associated with a phenotypic endpoint. In these analyses, a local statistic is used to assess the association between the expression level of a gene and the value of a phenotypic endpoint. Then these gene-specific local statistics are combined to evaluate association for pre-selected sets of genes. Commonly used local statistics include t-statistics for binary phenotypes and correlation coefficients that assume a linear or monotone relationship between a continuous phenotype and gene expression level. Methods applicable to continuous non-monotone relationships are needed. Furthermore, for multiple experimental categories, methods that combine multiple GSEA/SAFE analyses are needed.

3.
Head and neck squamous cell carcinoma (HNSCC) is a common malignancy with high mortality and poor prognosis due to a lack of predictive markers. Increasing evidence has demonstrated that small nucleolar RNAs (snoRNAs) play an important role in tumorigenesis. The aim of this study was to identify a prognostic snoRNA signature of HNSCC. Survival-related snoRNAs were screened by Cox regression analysis (univariate, least absolute shrinkage and selection operator, and multivariate). The predictive value was validated in different subgroups. The biological functions were explored by coexpression analysis and gene set enrichment analysis (GSEA). One hundred and thirteen survival-related snoRNAs were identified, and a five-snoRNA signature predicted prognosis with high sensitivity and specificity. Furthermore, the signature was applicable to patients of different sexes, ages, stages, grades, and anatomic subdivisions. Coexpression analysis and GSEA revealed that the five snoRNAs are involved in regulating malignant phenotype and DNA/RNA editing. This five-snoRNA signature is not only a promising predictor of prognosis and survival but also a potential biomarker for patient stratification management.

4.

Background  

Gene set enrichment analysis (GSEA) is a microarray data analysis method that uses predefined gene sets and ranks of genes to identify significant biological changes in microarray data sets. GSEA is especially useful when gene expression changes in a given microarray data set are minimal or moderate.

5.
Pathway analysis has been proposed as a complement to single SNP analyses in GWAS. This study compared pathway analysis methods using two lung cancer GWAS data sets based on four studies: one a combined data set from Central Europe and Toronto (CETO); the other a combined data set from Germany and MD Anderson (GRMD). We searched the literature for pathway analysis methods that were widely used, representative of other methods, and had available software for performing analysis. We selected the programs EASE, which uses a modified Fisher's exact calculation to test for pathway associations, GenGen (a version of Gene Set Enrichment Analysis (GSEA)), which uses a Kolmogorov-Smirnov-like running sum statistic as the test statistic, and SLAT, which uses a p-value combination approach. We also included a modified version of the SUMSTAT method (mSUMSTAT), which tests for association by averaging χ² statistics from genotype association tests. There were nearly 18,000 genes available for analysis, following mapping of more than 300,000 SNPs from each data set. These were mapped to 421 GO level 4 gene sets for pathway analysis. Among the methods designed to be robust to biases related to gene size and pathway SNP correlation (GenGen, mSUMSTAT and SLAT), the mSUMSTAT approach identified the most significant pathways (8 in CETO and 1 in GRMD). This included a highly plausible association for the acetylcholine receptor activity pathway in both CETO (FDR ≤ 0.001) and GRMD (FDR = 0.009), although two strong association signals at a single gene cluster (CHRNA3-CHRNA5-CHRNB4) drive this result, complicating its interpretation. Few other replicated associations were found using any of these methods. Difficulty in replicating associations hindered our comparison, but results suggest mSUMSTAT has advantages over the other approaches, and may be a useful pathway analysis tool to use alongside other methods such as the commonly used GSEA (GenGen) approach.

6.
7.
GSEA-P: a desktop application for Gene Set Enrichment Analysis   Cited by: 4 (self-citations: 0; citations by others: 4)
Gene Set Enrichment Analysis (GSEA) is a computational method that assesses whether an a priori defined set of genes shows statistically significant, concordant differences between two biological states. We report the availability of a new version of the Java-based software (GSEA-P 2.0) that represents a major improvement on the previous release through the addition of a leading edge analysis component, seamless integration with the Molecular Signature Database (MSigDB) and an embedded browser that allows users to search for gene sets and map them to a variety of microarray platform formats. This functionality makes it possible for users to directly import gene sets from MSigDB for analysis with GSEA. We have also improved the visualizations in GSEA-P 2.0 and added links to a new form of concise gene set annotations called Gene Set Cards. These additions, as well as other improvements suggested by over 3500 users who have downloaded the software over the past year, have been incorporated into this new release of the GSEA-P Java desktop program. AVAILABILITY: GSEA-P 2.0 is freely available for academic and commercial users and can be downloaded from http://www.broad.mit.edu/GSEA
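
To make the idea of "concordant differences" in a ranked gene list concrete, the sketch below computes a simplified, unweighted Kolmogorov-Smirnov-style running-sum enrichment score. This is an illustrative simplification, not the GSEA-P software: the function name `enrichment_score` is hypothetical, every gene-set member contributes an equal step, the ranked list is assumed to contain each gene once, and the permutation step needed to judge significance is omitted.

```python
def enrichment_score(ranked_genes, gene_set):
    """Signed maximum deviation of an unweighted running sum (simplified sketch)."""
    members = set(gene_set) & set(ranked_genes)
    n_hit = len(members)
    n_miss = len(ranked_genes) - n_hit
    if n_hit == 0 or n_miss == 0:
        raise ValueError("gene set must cover some, but not all, ranked genes")

    running = 0.0
    best = 0.0
    for gene in ranked_genes:
        # Step up on a gene-set member, step down otherwise;
        # the running sum returns to zero at the end of the list.
        if gene in members:
            running += 1.0 / n_hit
        else:
            running -= 1.0 / n_miss
        if abs(running) > abs(best):
            best = running
    return best
```

A score near +1 means the set's genes cluster at the top of the ranking, and a score near -1 means they cluster at the bottom.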

8.
Circumventing the cut-off for enrichment analysis   Cited by: 1 (self-citations: 0; citations by others: 1)

9.
Background: The present study investigated the independent prognostic value of glycolysis-related long noncoding RNAs (lncRNAs) in clear cell renal cell carcinoma (ccRCC). Methods: A coexpression analysis of glycolysis-related mRNAs and lncRNAs in ccRCC from The Cancer Genome Atlas (TCGA) was carried out. Clinical samples were randomly divided into training and validation sets. Univariate Cox regression and least absolute shrinkage and selection operator (LASSO) regression analyses were performed to establish a glycolysis risk model with prognostic value for ccRCC, which was validated in the training and validation sets and in the whole cohort by Kaplan–Meier, univariate and multivariate Cox regression, and receiver operating characteristic (ROC) curve analyses. Principal component analysis (PCA) and functional annotation by gene set enrichment analysis (GSEA) were performed to evaluate the risk model. Results: We identified 297 glycolysis-associated lncRNAs in ccRCC; of these, 7 were found to have prognostic value in ccRCC patients by Kaplan–Meier, univariate and multivariate Cox regression, and ROC curve analyses. The results of the GSEA suggested a close association between the 7-lncRNA signature and glycolysis-related biological processes and pathways. Conclusion: The seven identified glycolysis-related lncRNAs constitute an lncRNA signature with prognostic value for ccRCC and provide potential therapeutic targets for the treatment of ccRCC patients.

10.
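GSEA is a freely downloadable tool for analyzing genome-wide expression profiling microarray data. Drawing on existing knowledge of gene location, properties, function, and biological significance, it first builds a molecular signature database containing multiple functional gene sets. It then analyzes hybridization-based expression profile data for a group of genes in two biological states, examining how those genes are expressed within specific functional gene sets and whether that expression pattern is statistically significant. GSEA interprets biological information from a different angle and can further improve our understanding of the relevant biological events.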

11.
12.
Lipid metabolism reprogramming plays an important role in cell growth, proliferation, angiogenesis and invasion in cancers. However, the diverse lipid metabolism programmes and their prognostic value during glioma progression remain unclear. Here, the lipid metabolism-related genes were profiled using RNA sequencing data from The Cancer Genome Atlas (TCGA) and Chinese Glioma Genome Atlas (CGGA) databases. Gene ontology (GO) and gene set enrichment analysis (GSEA) found that glioblastoma (GBM) mainly exhibited enrichment of the glycosphingolipid metabolic process, whereas lower grade gliomas (LGGs) showed enrichment of the phosphatidylinositol metabolic process. According to the differential genes of lipid metabolism between LGG and GBM, we developed a nine-gene set using a Cox proportional hazards model with elastic net penalty, and the CGGA cohort was used as the validation data set. Survival analysis revealed that the obtained gene set could differentiate the outcome of low- and high-risk patients in both cohorts. Meanwhile, multivariate Cox regression analysis indicated that this signature was a significantly independent prognostic factor in diffuse gliomas. Gene ontology and GSEA showed that high-risk cases were associated with phenotypes of cell division and immune response. Collectively, our findings provide new insight into lipid metabolism in diffuse gliomas.

13.
14.
Fistulifera sp. strain JPCC DA0580 is a newly sequenced pennate diatom that is capable of simultaneously growing and accumulating lipids. This is a unique trait, not found in other related microalgae so far. It is able to accumulate between 40 and 60% of its cell weight in lipids, making it a strong candidate for the production of biofuel. To investigate this characteristic, we used RNA-Seq data gathered at four different times while Fistulifera sp. strain JPCC DA0580 was grown in oil-accumulating and non-oil-accumulating conditions. We then adapted gene set enrichment analysis (GSEA) to investigate the relationship between the difference in gene expression of 7,822 genes and metabolic functions in our data. We utilized information in the KEGG pathway database to create the gene sets and changed GSEA to use re-sampling so that data from the different time points could be included in the analysis. Our GSEA method identified photosynthesis, lipid synthesis and amino acid synthesis related pathways as processes that play a significant role in oil production and growth in Fistulifera sp. strain JPCC DA0580. In addition to GSEA, we visualized the results by creating a network of compounds and reactions, and plotted the expression data on top of the network. This made existing graph algorithms available to us, which we then used to calculate a path that metabolizes glucose into triacylglycerol (TAG) in the smallest number of steps. By visualizing the data this way, we observed a separate up-regulation of genes at different times instead of a concerted response. We also identified two metabolic paths that used fewer reactions than the one shown in KEGG and showed that the reactions were up-regulated during the experiment. The combination of analysis and visualization methods successfully analyzed time-course data, identified important metabolic pathways and provided new hypotheses for further research.
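
The step of finding a path from glucose to TAG "in the smallest number of steps" can be pictured as a shortest-path search on an unweighted graph. The sketch below uses breadth-first search, which is one standard way to do this; the abstract does not say which graph algorithm the authors used. The function name `shortest_path` and the toy network are hypothetical and do not reflect real KEGG reactions.

```python
from collections import deque


def shortest_path(graph, source, target):
    """Breadth-first search; returns a fewest-steps path or None if unreachable."""
    previous = {source: None}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        if node == target:
            # Walk the predecessor links back to the source.
            path = []
            while node is not None:
                path.append(node)
                node = previous[node]
            return path[::-1]
        for neighbour in graph.get(node, ()):
            if neighbour not in previous:
                previous[neighbour] = node
                queue.append(neighbour)
    return None


# Hypothetical toy network: keys are compounds, values are compounds
# reachable in one reaction. Intermediate names are placeholders.
toy_network = {
    "glucose": ["intermediate_A", "intermediate_B"],
    "intermediate_A": ["intermediate_C"],
    "intermediate_B": ["TAG"],
    "intermediate_C": ["TAG"],
}
route = shortest_path(toy_network, "glucose", "TAG")
```

Overlaying expression values on the nodes of such a route is then a plotting concern, separate from the search itself.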

15.
A test based on ordered observations for selection of consistent judges for sensory evaluation has been given. The test statistic depends on the sum of ranks of the different products under consideration. The probability distribution of the test statistic has been worked out for small samples, and it turns out to be a chi-square distribution for large samples. The analytical procedure has been explained by a numerical example of a taste-testing experiment.

16.
17.
PLoS ONE, 2015, 10(12)

Background

Fatigue is a debilitating condition with a significant impact on patients’ quality of life. Fatigue is frequently reported by patients suffering from primary Sjögren’s Syndrome (pSS), a chronic autoimmune condition characterised by dryness of the eyes and the mouth. However, although fatigue is common in pSS, it does not manifest in all sufferers, providing an excellent model with which to explore the potential underpinning biological mechanisms.

Methods

Whole blood samples from 133 fully-phenotyped pSS patients stratified for the presence of fatigue, collected by the UK primary Sjögren’s Syndrome Registry, were used for whole genome microarray. The resulting data were analysed both on a gene by gene basis and using pre-defined groups of genes. Finally, gene set enrichment analysis (GSEA) was used as a feature selection technique for input into a support vector machine (SVM) classifier. Classification was assessed using area under curve (AUC) of receiver operator characteristic and standard error of Wilcoxon statistic, SE(W).

Results

Although no genes were individually found to be associated with fatigue, 19 metabolic pathways were enriched in the high fatigue patient group using GSEA. Analysis revealed that these enrichments arose from the presence of a subset of 55 genes. A radial kernel SVM classifier with this subset of genes as input displayed significantly improved performance over classifiers using all pathway genes as input. The classifiers had AUCs of 0.866 (SE(W) 0.002) and 0.525 (SE(W) 0.006), respectively.

Conclusions

Systematic analysis of gene expression data from pSS patients discordant for fatigue identified 55 genes which are predictive of fatigue level using SVM classification. This list represents the first step in understanding the underlying pathophysiological mechanisms of fatigue in patients with pSS.
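
The classifier assessment described in the Methods above (AUC with the standard error of the Wilcoxon statistic) can be sketched as follows. This is a minimal sketch under an explicit assumption: the standard error uses the Hanley and McNeil approximation, which may or may not be the exact formula used in the study. The function name `auc_with_se` and the example scores are hypothetical.

```python
from math import sqrt


def auc_with_se(pos_scores, neg_scores):
    """AUC as the Wilcoxon/Mann-Whitney statistic, with an approximate SE."""
    n_pos, n_neg = len(pos_scores), len(neg_scores)
    # Fraction of (positive, negative) pairs ranked correctly; ties count half.
    wins = sum(
        1.0 if pos > neg else 0.5 if pos == neg else 0.0
        for pos in pos_scores
        for neg in neg_scores
    )
    auc = wins / (n_pos * n_neg)

    # Hanley-McNeil approximation (an assumption, see the lead-in).
    q1 = auc / (2 - auc)
    q2 = 2 * auc ** 2 / (1 + auc)
    variance = (
        auc * (1 - auc)
        + (n_pos - 1) * (q1 - auc ** 2)
        + (n_neg - 1) * (q2 - auc ** 2)
    ) / (n_pos * n_neg)
    return auc, sqrt(max(variance, 0.0))


# Hypothetical decision scores for high-fatigue (positive) and
# low-fatigue (negative) samples.
auc, se = auc_with_se([0.9, 0.4], [0.5, 0.1])
```

Comparing two classifiers, as in the Results, would then amount to comparing their AUCs in light of these standard errors.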

18.
19.
MOTIVATION: The development of gene expression microarray technology has allowed the identification of differentially expressed genes between different clinical phenotypic classes of cancer from a large pool of candidate genes. Although many class comparisons concerned only a single phenotype, simultaneous assessment of the relationship between gene expression and multiple phenotypes would be warranted to better understand the underlying biological structure. RESULTS: We develop a method to select genes related to multiple clinical phenotypes based on a set of multivariate linear regression models. For each gene, we perform model selection based on the doubly-adjusted R-square statistic and use the maximum of this statistic for gene selection. The method can substantially improve the power in gene selection, compared with a conventional method that uses a single model exclusively for gene selection. Application to a bladder cancer study to correlate pre-treatment gene expressions with pathological stage and grade is given. The methods would be useful for screening for genes related to multiple clinical phenotypes. AVAILABILITY: SAS and MATLAB codes are available from the author upon request.

20.
Translational Oncology, 2020, 13(12): 100861
Neurotransmitters are reported to be involved in tumor initiation and progression. This study aimed to elucidate the prognostic value of γ-aminobutyric acid type A receptor δ subunit (GABRD) in colon adenocarcinoma (COAD) using the data from The Cancer Genome Atlas (TCGA) database. The GABRD mRNA expression levels in the COAD and normal tissues were compared using the Wilcoxon rank-sum test. The correlation between clinicopathologic characteristics and GABRD expression was analyzed by Wilcoxon rank-sum test or Kruskal-Wallis test and logistic regression. The prognostic value of GABRD mRNA expression in patients with COAD was determined using the Kaplan-Meier curve and Cox regression analysis. Finally, the molecular mechanisms of GABRD in COAD were predicted by gene set enrichment analysis (GSEA). The COAD tissues exhibited higher GABRD mRNA expression levels than the normal tissues. The logistic regression analysis revealed that GABRD mRNA expression was correlated with TNM stage, N stage, M stage, and microsatellite instability (MSI) status. The Kaplan-Meier survival curve and log-rank test revealed that patients with COAD exhibiting high GABRD mRNA expression were associated with poor overall survival (OS). The multivariate analysis indicated that increased GABRD mRNA expression was an independent prognostic factor and was correlated with a poor OS. The GSEA revealed that GABRD was involved in signaling pathways, including cell adhesion molecules, gap junction, melanogenesis, and mTOR signaling pathway, as well as the signaling pathways associated with basal cell carcinoma or bladder cancer development. In summary, enhanced GABRD mRNA expression may be a potential independent prognostic biomarker for COAD.
