Similar literature
 Found 20 similar articles; search took 15 ms
1.
Prostate cancer is one of the most common male malignant neoplasms; however, its causes are not completely understood. A few recent studies have used gene expression profiling of prostate cancer to identify differentially expressed genes and possibly relevant pathways. However, few studies have examined the genetic mechanisms of prostate cancer at the pathway level. We used gene set enrichment analysis and a meta-analysis of six independent studies after standardized microarray preprocessing, which increased concordance between these gene datasets. Based on gene set enrichment analysis, there were 12 down- and 25 up-regulated mixing pathways in more than two tissue datasets, while there were two down- and two up-regulated mixing pathways in three cell datasets. Based on the meta-analysis, there were 46 and nine common pathways in the tissue and cell datasets, respectively. Three up- and 10 down-regulated crossing pathways were detected with combined gene set enrichment analysis and meta-analysis. We found that genes with small changes are difficult to detect by classic univariate statistics; they can more easily be identified by pathway analysis. After standardized microarray preprocessing, we applied gene set enrichment analysis and a meta-analysis to increase the concordance in identifying biological mechanisms involved in prostate cancer. The gene pathways that we identified could provide insight into the development of prostate cancer.

2.

Background  

Analysis of microarray and other high-throughput data on the basis of gene sets, rather than individual genes, is becoming more important in genomic studies. Correspondingly, a large number of statistical approaches for detecting gene set enrichment have been proposed, but both the interrelations and the relative performance of the various methods are still very much unclear.

3.
The test statistic typically employed in gene set enrichment analysis (GSEA) prevents this method from being genuinely multivariate. In particular, this statistic is insensitive to changes in the correlation structure of the gene sets of interest. The present paper considers the utility of an alternative test statistic in designing the confirmatory component of GSEA. This statistic is based on a pertinent distance between the joint distributions of expression levels of the genes included in the set of interest. The null distribution of the proposed test statistic, known as the multivariate N-statistic, is obtained by permuting group labels. Our simulation studies and analysis of biological data confirm the conjecture that the N-statistic is a much better choice for multivariate significance testing within the framework of GSEA. We also discuss some other aspects of the GSEA paradigm and suggest new avenues for future research.
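The idea of a distance between joint distributions with a label-permutation null can be sketched as follows. This is only an illustration under stated assumptions: the statistic is a generic energy-type distance with a Euclidean kernel, which is not necessarily the paper's N-statistic, and the function names (`distance_stat`, `permutation_p_value`) are hypothetical.

```python
import math
import random

def euclid(a, b):
    """Euclidean distance between two expression vectors of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def mean_pairwise(xs, ys):
    """Average distance over all pairs (x, y) with x in xs and y in ys."""
    return sum(euclid(x, y) for x in xs for y in ys) / (len(xs) * len(ys))

def distance_stat(xs, ys):
    # Assumed energy-type distance: 2*E|X-Y| - E|X-X'| - E|Y-Y'|.
    # Each sample is the vector of expression levels of the genes in one set,
    # so changes in the joint (multivariate) distribution can affect the value.
    return 2 * mean_pairwise(xs, ys) - mean_pairwise(xs, xs) - mean_pairwise(ys, ys)

def permutation_p_value(xs, ys, n_perm=1000, seed=0):
    """Null distribution obtained by permuting group labels."""
    rng = random.Random(seed)
    observed = distance_stat(xs, ys)
    pooled = list(xs) + list(ys)
    n1 = len(xs)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)  # reassign samples to the two groups at random
        if distance_stat(pooled[:n1], pooled[n1:]) >= observed:
            hits += 1
    # Add-one correction keeps the estimated p-value strictly positive.
    return observed, (hits + 1) / (n_perm + 1)
```

Each element of `xs` and `ys` is one sample's expression vector for the gene set; the quadratic cost in the number of samples is acceptable for typical microarray group sizes.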

4.
Absolute enrichment: gene set enrichment analysis for homeostatic systems   Cited by: 1 (self-citations: 0, citations by others: 1)
Gene Set Enrichment Analysis (GSEA) identifies sets of genes that are differentially regulated in one direction. Many homeostatic systems include one limb that is upregulated in response to the downregulation of another limb, and vice versa. Such patterns are poorly captured by the standard formulation of GSEA. We describe a technique to identify groups of genes (which sometimes can be pathways) that include both up- and down-regulated components. This approach lends insights into the feedback mechanisms that may operate, especially when integrated with protein interaction databases.
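One plausible reading of the approach above is to rank genes by the absolute value of their differential-expression score before computing a GSEA-style running sum. The sketch below assumes that reading, which the abstract does not spell out. It uses a simplified enrichment score (weight exponent 1, no permutation-based significance), and the function names are hypothetical.

```python
def enrichment_score(scores, gene_set):
    """Largest deviation from zero of a GSEA-style weighted running sum.

    `scores` maps gene -> differential-expression score. Assumes the set
    contains at least one gene with a nonzero score and that at least one
    gene lies outside the set.
    """
    ranked = sorted(scores, key=scores.get, reverse=True)
    hit_total = sum(abs(scores[g]) for g in ranked if g in gene_set)
    n_miss = sum(1 for g in ranked if g not in gene_set)
    running, best = 0.0, 0.0
    for g in ranked:
        if g in gene_set:
            running += abs(scores[g]) / hit_total  # step up on a set member
        else:
            running -= 1.0 / n_miss                # step down otherwise
        if abs(running) > abs(best):
            best = running
    return best

def absolute_enrichment_score(scores, gene_set):
    """Same running sum, but genes are ranked by |score| (assumed variant)."""
    abs_scores = {g: abs(s) for g, s in scores.items()}
    return enrichment_score(abs_scores, gene_set)
```

With a set whose members sit at opposite ends of the signed ranking, the signed running sum partly cancels, whereas the absolute ranking places all members at the top.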

5.
Current demand for understanding the behavior of groups of related genes, combined with the greater availability of data, has led to an increased focus on statistical methods in gene set analysis. In this paper, we aim to perform a critical appraisal of the methodology based on graphical models developed in Massa et al. (2010), which uses pathway signaling networks as a starting point to develop statistically sound procedures for gene set analysis. We pay attention to the potential of the methodology with respect to the organizational aspects of dealing with such complex but highly informative starting structures, that is, pathways. We focus on three themes: the translation of a biological pathway into a graph suitable for modeling, the role of shrinkage when there are more genes than samples, and the evaluation of the correspondence of the statistical models to the biological expectations. To study the impact of shrinkage, two simulation studies will be run. To evaluate the biological expectation, we will use data from a network with known behavior, which offers the possibility of carrying out a realistic check of the correspondence of the model to changes in the experimental conditions.

6.
7.
Extensions to gene set enrichment   Cited by: 2 (self-citations: 0, citations by others: 2)
MOTIVATION: Gene Set Enrichment Analysis (GSEA) has been developed recently to capture changes in the expression of pre-defined sets of genes. We propose a number of extensions to GSEA, including the use of different statistics to describe the association between genes and phenotypes of interest. We make use of dimension reduction procedures, such as principal component analysis, to identify gene sets with correlated expression. We also address issues that arise when gene sets overlap. RESULTS: Our proposals extend the range of applicability of GSEA and allow for adjustments based on other covariates. We have provided a well-defined procedure to address interpretation issues that can arise when gene sets have substantial overlap. We have shown how standard dimension reduction methods, such as PCA, can be used to help further interpret GSEA. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.

8.

Background  

Gene set analysis (GSA) is a widely used strategy for gene expression data analysis based on pathway knowledge. GSA focuses on sets of related genes and has established major advantages over individual gene analyses, including greater robustness, sensitivity and biological relevance. However, previous GSA methods have limited usage as they cannot handle datasets of different sample sizes or experimental designs.

9.
10.
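[Objective] To analyze, using bioinformatics methods, whole-blood transcriptome expression profiles of patients with bacterial sepsis obtained from public databases, and to explore the key differentially expressed host genes associated with bacterial sepsis and their significance. [Methods] Based on the whole-blood transcriptome datasets GSE80496 and GSE72829 from the GEO database, GEO2R and gene set enrichment analysis (GSEA) were combined with weighted gene co-expression network analysis (WGCNA) to screen for genes significantly altered in patients with bacterial sepsis compared with healthy individuals, and GO functional analysis and KEGG enrichment analysis of the intersecting genes were performed in R. In addition, hub genes were analyzed with String 11.0 and Cytoscape, their expression was validated in whole-blood samples from dataset GSE72809 (Health group, 52 cases; Defined sepsis group, 52 cases), and the relationships between the target gene expression profiles and factors such as infant sex, age in months (gestational age), birth weight, and antibiotic exposure were explored. [Results] Analysis of the GSE80496 and GSE72829 datasets yielded 932 and 319 genes, respectively; intersecting these with the WGCNA hub modules gave 10 hub genes associated with the onset of bacterial sepsis (MMP9, ITGAM, CSTD, GAPDH, PGLYRP1, FOLR3, OSCAR, TLR5, IL1RN, and TIMP1). GSEA identified key pathways (amino sugar and ribose metabolism, PPAR signaling pathway, glycan biosynthesis pathway, regulation of autophagy pathway, complement and coagulation cascades, nicotine and nicotinamide metabolism, biosynthesis of unsaturated fatty acids, and Alzheimer's disease pathway) and biological processes (steroid hormone secretion, activation of adenylate cyclase, extracellular matrix degradation, and metal ion transport). [Conclusion] By combining GEO2R and GSEA with WGCNA, this study identified 2 hub modules, 10 hub genes, and several key signaling pathways and biological processes associated with the onset of bacterial sepsis, which may provide a theoretical basis for further in-depth study of the pathogenic mechanisms of bacterial sepsis.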

11.
12.
13.

Background

Sets of genes that are known to be associated with each other can be used to interpret microarray data. This gene set approach to microarray data analysis can illustrate patterns of gene expression which may be more informative than analyzing the expression of individual genes. Various statistical approaches exist for the analysis of gene sets. There are three main classes of these methods: over-representation analysis, functional class scoring, and pathway topology based methods.

Methods

We propose weighted hypergeometric and weighted chi-squared methods in order to assign a rank to the degree to which each gene participates in the enrichment. Each gene is assigned a weight determined by the absolute value of its log fold change, which is then raised to a certain power. The power value can be adjusted as needed. Datasets from the Gene Expression Omnibus are used to test the method. The significantly enriched pathways are validated through searching the literature in order to determine their relevance to the dataset.
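The gene weighting described above (absolute log fold change raised to an adjustable power) can be sketched in a few lines. How the weights enter the weighted hypergeometric and weighted chi-squared tests is not given in this abstract, so the second function is only the ordinary, unweighted hypergeometric tail probability, shown as a baseline. All names and parameter choices here are hypothetical.

```python
from math import comb

def gene_weights(log_fold_changes, power=1.0):
    """Weight each gene by |log fold change| ** power, as the abstract describes."""
    return {g: abs(lfc) ** power for g, lfc in log_fold_changes.items()}

def hypergeom_upper_tail(k, N, K, n):
    """Unweighted baseline: P(X >= k) when n genes are drawn from N,
    of which K belong to the pathway (standard over-representation test)."""
    upper = min(K, n)
    total = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return total / comb(N, n)
```

A larger `power` concentrates weight on the genes with the strongest changes, while `power=0` makes all weights equal to 1 and so ignores effect size.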

Results

Although these methods detect fewer significantly enriched pathways, they can potentially produce more relevant results. Furthermore, we compare the results of different enrichment methods on a set of microarray studies all containing data from various rodent neuropathic pain models.

Discussion

Our method is able to produce more consistent results than other methods when evaluated on similar datasets. It can also potentially detect relevant pathways that are not identified by the standard methods. However, the lack of biological ground truth makes validating the method difficult.

14.

Background  

Gene set enrichment testing has helped bridge the gap from an individual gene to a systems biology interpretation of microarray data. Although gene sets are defined a priori based on biological knowledge, current methods for gene set enrichment testing treat all genes equally. It is well known that some genes, such as those responsible for housekeeping functions, appear in many pathways, whereas other genes are more specialized and play a unique role in a single pathway. Drawing inspiration from the field of information retrieval, we have developed and present here an approach to incorporate gene appearance frequency (in KEGG pathways) into two current methods, Gene Set Enrichment Analysis (GSEA) and the logistic regression-based LRpath framework, to generate more reproducible and biologically meaningful results.
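The information-retrieval analogy above suggests something like an inverse document frequency over pathways. The sketch below assumes a plain `log(n_pathways / n_appearances)` weight. The abstract does not state the actual weighting formula or how it enters GSEA or LRpath, and the function name is hypothetical.

```python
import math

def appearance_weights(pathways):
    """IDF-style weight per gene from pathway membership (assumed form).

    `pathways` maps pathway name -> iterable of gene identifiers.
    Genes present in every pathway get weight 0; rarer genes get more.
    """
    n_pathways = len(pathways)
    counts = {}
    for genes in pathways.values():
        for g in set(genes):  # count each gene at most once per pathway
            counts[g] = counts.get(g, 0) + 1
    return {g: math.log(n_pathways / c) for g, c in counts.items()}
```

Under this form a housekeeping-like gene that appears in all pathways contributes nothing, which may be too aggressive in practice; a smoothed variant would be an equally reasonable choice.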

15.

Background  

The analysis of high-throughput gene expression data with respect to sets of genes rather than individual genes has many advantages. A variety of methods have been developed for assessing the enrichment of sets of genes with respect to differential expression. In this paper we provide a comparative study of four of these methods: Fisher's exact test, Gene Set Enrichment Analysis (GSEA), Random-Sets (RS), and Gene List Analysis with Prediction Accuracy (GLAPA). The first three methods use associative statistics, while the fourth uses predictive statistics. We first compare all four methods on simulated data sets to verify that Fisher's exact test is markedly worse than the other three approaches. We then validate the other three methods on seven real data sets with known genetic perturbations and then compare the methods on two cancer data sets where our a priori knowledge is limited.

16.
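Data integration is increasingly becoming an important direction in biological data analysis. With the widespread use of high-throughput technologies, genomic sequence data are being generated in large quantities, and integrating these sequence data fully and efficiently has become a challenge. Taking a sequence-based perspective, this paper designs a new data integration method, Sequence Set Enrichment Analysis (SSEA), and explores its potential applications using experimental data. The results show that SSEA can better mine the complex biological information in sequence data.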

17.
18.
19.
The power of genome-wide SNP association studies is limited, among other things, by the large number of false positive test results. To provide a remedy, we combined SNP association analysis with the pathway-driven gene set enrichment analysis (GSEA), recently developed to facilitate handling of genome-wide gene expression data. The resulting GSEA-SNP method rests on the assumption that SNPs underlying a disease phenotype are enriched in genes constituting a signaling pathway or those with a common regulation. Besides improving power for association mapping, GSEA-SNP may facilitate the identification of disease-associated SNPs and pathways, as well as the understanding of the underlying biological mechanisms. GSEA-SNP may also help to identify markers with weak effects, undetectable in association studies without pathway consideration. The program is freely available and can be downloaded from our website.

20.
Multiple sclerosis (MS) is a chronic, demyelinating disease that affects the central nervous system and is characterized by a complex pathogenesis and difficult management. The identification of new biomarkers would be clinically useful for more accurate diagnoses and disease monitoring. Metabolomics, the identification of small endogenous molecules, offers an instantaneous molecular snapshot of the MS phenotype. Here the metabolomic profiles (utilizing plasma from patients with MS) were characterized with a gas chromatography-mass spectrometry-based platform followed by a multivariate statistical analysis and comparison with a healthy control (HC) population. The obtained partial least squares discriminant analysis (PLS-DA) model identified and validated significant metabolic differences between individuals with MS and HC (R2X = 0.223, R2Y = 0.82, Q2 = 0.562; p < 0.001). Among the discriminant metabolites, phosphate, fructose, myo-inositol, pyroglutamate, threonate, l-leucine, l-asparagine, l-ornithine, l-glutamine, and l-glutamate were identified, while some remained unknown. A receiver operating characteristic (ROC) curve with AUC 0.84 (p = 0.01; CI: 0.75–1) generated with the concentrations of the discriminant metabolites supported the strength of the model. Pathway analysis indicated asparagine and citrulline biosynthesis as the main canonical pathways involved in MS. Changes in the citrulline biosynthesis pathway suggest the involvement of oxidative stress during neuronal damage. The results confirmed metabolomics as a useful approach to better understand the pathogenesis of MS and to provide new biomarkers for the disease to be used together with clinical data.
