Similar articles
Found 20 similar articles (search time: 578 ms)
1.
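Objective: To evaluate the statistical power of the mixed effects model in the analysis of tumor gene expression microarray data and to assess its analytical performance. Methods: A mixed effects model was applied to a tumor microarray dataset, with gene set enrichment analysis (GSEA) as the reference method for comparing the validity and soundness of the results. Results: Both methods identified a shared set of differentially expressed pathways. In addition, the mixed effects model identified 8 differentially expressed pathways that GSEA failed to detect, all of them supported by the literature. Of the top 10 differentially expressed pathways selected by the mixed effects model, 6 had prior biological evidence, compared with only 4 of the top 10 selected by GSEA. Conclusion: As a typical top-down method, the mixed effects model reduces dimensionality by constructing latent variables, which effectively reduces multiple complex sources of variation and thereby supports the accuracy and soundness of the results. Its statistical power exceeded that of GSEA, making it an effective method for screening tumor microarray data.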

2.
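[Objective] To analyze whole-blood transcriptome expression profiles of patients with bacterial sepsis from public databases using bioinformatics methods, and to explore the key host differentially expressed genes associated with bacterial sepsis and their significance. [Methods] Based on the whole-blood transcriptome datasets GSE80496 and GSE72829 in the GEO database, GEO2R and gene set enrichment analysis (GSEA) were combined with weighted gene co-expression network analysis (WGCNA) to screen for genes significantly altered in patients with bacterial sepsis compared with healthy controls. GO functional analysis and KEGG enrichment analysis of the intersecting genes were performed in R. Hub genes were analyzed with String 11.0 and Cytoscape, and their expression was validated in whole-blood samples from dataset GSE72809 (Health group, n = 52; Defined sepsis group, n = 52). The relationship between the expression profiles of the target genes and infant sex, age in months (gestational age), birth weight, and antibiotic exposure was also examined. [Results] Analysis of GSE80496 and GSE72829 yielded 932 and 319 genes, respectively. Intersection with the WGCNA hub modules gave 10 hub genes associated with the onset of bacterial sepsis (MMP9, ITGAM, CSTD, GAPDH, PGLYRP1, FOLR3, OSCAR, TLR5, IL1RN, and TIMP1). GSEA identified key pathways (amino sugar and nucleotide sugar metabolism, PPAR signaling, glycan biosynthesis, regulation of autophagy, complement and coagulation cascades, nicotinate and nicotinamide metabolism, biosynthesis of unsaturated fatty acids, and the Alzheimer's disease pathway) and biological processes (steroid hormone secretion, activation of adenylate cyclase, extracellular matrix degradation, and metal ion transport). [Conclusion] Using GEO2R and GSEA combined with WGCNA, this study identified 2 hub modules, 10 hub genes, and several key signaling pathways and biological processes associated with the onset of bacterial sepsis, providing a theoretical basis for further study of its pathogenic mechanisms.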

3.
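Objective: To compare the strengths and accuracy of methods based on the self-contained null hypothesis and the competitive null hypothesis in microarray data analysis, using a representative of each class, GAGE (Generally Applicable Gene-set Enrichment) and GSEA (Gene Set Enrichment Analysis), and to assess how effectively each identifies enriched gene sets. Methods: Both methods were applied to real gene expression profile data, and the accuracy and soundness of the resulting gene sets were compared. Results: Applied to known gene expression profile data, GAGE outperformed GSEA in both statistical power and the biological relevance of the selected gene sets. Conclusion: GAGE, a gene set analysis method based on the self-contained null hypothesis, makes full use of the expression data, analyzes experimental sets and pathway sets separately, and accounts for both up-regulation and down-regulation of gene sets. Its statistical power exceeded that of GSEA, which is based on the competitive null hypothesis, and it yielded more accurate and sound results.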

4.
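This study used public microarray databases to screen for prognostic genes in breast cancer and to predict and explore their possible mechanisms and clinical value in breast cancer progression. First, we identified the overlapping differentially expressed genes between dataset GSE22820 from the Gene Expression Omnibus (GEO) and the breast cancer dataset of The Cancer Genome Atlas (TCGA), using R to analyze genes differentially expressed between breast cancer tissue and adjacent normal tissue. Second, we built a protein-protein interaction network with the STRING database and Cytoscape, and identified the hub genes and the top 3 modules. We then performed further functional analyses, including Gene Ontology (GO) analysis, Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway analysis, and gene set enrichment analysis (GSEA), to study the roles and potential mechanisms of these genes. Finally, Kaplan-Meier analysis and Cox proportional hazards analysis were used to clarify the diagnostic and prognostic value of these genes. The analyses showed that the expression levels of 15 genes were associated with survival: patients with high expression had shorter overall survival than those with low expression (P<0.05). Cox proportional hazards analysis showed that UBE2T, ERCC6L, and RAD51 were independent prognostic factors for survival (P<0.05). GSEA showed that cell cycle, basal transcription factors, and oocyte meiosis were significantly enriched for UBE2T, ERCC6L, and RAD51. We conclude that high expression of these 3 gene markers is an adverse prognostic factor in breast cancer and that they may serve as effective biomarkers for predicting metastasis and prognosis in breast cancer patients.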
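The Kaplan-Meier step mentioned in this abstract can be illustrated with a minimal sketch of the product-limit estimator. The function name `kaplan_meier` and the follow-up data are hypothetical and are not taken from the study; the abstract does not say which software was used.

```python
def kaplan_meier(times, events):
    """Product-limit survival estimate.

    times  -- follow-up time for each patient
    events -- 1 if the event (death) was observed, 0 if censored
    Returns a list of (time, survival_probability) at each event time.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        removed = 0
        # Group all patients who share the same follow-up time
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths:
            survival *= 1.0 - deaths / at_risk
            curve.append((t, survival))
        # Both deaths and censored patients leave the risk set
        at_risk -= removed
    return curve


# Hypothetical follow-up data (months); 0 marks a censored patient
curve = kaplan_meier([2, 3, 3, 5, 8], [1, 1, 0, 1, 0])
for t, s in curve:
    print(f"t={t}: S(t)={s:.2f}")
```

Comparing high-expression and low-expression groups, as the abstract describes, would mean computing one such curve per group and then applying a test such as the log-rank test, which is not shown here.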

5.
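Objective: To screen for genes associated with poor prognosis in hepatocellular carcinoma (HCC) and to explore their clinical significance. Methods: Genome-wide HCC expression profile data meeting the analysis criteria were obtained from the Gene Expression Omnibus (GEO) and analyzed to identify differentially expressed genes (DEGs). The Database for Annotation, Visualization and Integrated Discovery (DAVID) and the protein interaction database String were then used for functional enrichment analysis and construction of the protein-protein interaction network, respectively. The Cancer Genome Atlas (TCGA) and the Cox proportional hazards regression model were used for prognostic analysis of the relevant DEGs. Results: One eligible human HCC dataset (GSE84402) was found, from which 1141 DEGs were identified: 720 up-regulated and 421 down-regulated. Functional enrichment and protein-protein interaction analyses indicated that CDK1, CDC6, CCNA2, CHEK1, CENPE, PIK3R1, RACGAP1, BIRC5, KIF11, and CYP2B6 are key genes for HCC prognosis. TCGA and Cox regression analysis showed that increased expression of CDC6, PIK3R1, RACGAP1, and KIF11 and decreased expression of CENPE were closely associated with poor HCC prognosis. Conclusion: CDC6, CENPE, PIK3R1, RACGAP1, and KIF11 may be associated with poor prognosis in HCC and could serve as reference markers for future studies of HCC prognosis.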

6.
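Liu Jie, Xu Kailong, Ma Lixin, Wang Yang. Chinese Journal of Biotechnology (《生物工程学报》), 2022, 38(10): 3790-3808
Glioma is the most common intrinsic tumor of the central nervous system and is characterized by high incidence and poor prognosis. This study aimed to identify differentially expressed genes (DEGs) between glioblastoma multiforme (GBM) and lower-grade gliomas (LGG) in order to explore prognostic factors for gliomas of different grades. Single-cell transcriptome sequencing data for glioma were collected from the NCBI Gene Expression Omnibus, comprising 29,097 cells from 3 datasets. Human gliomas of different grades were analyzed, and 21,071 cells remained after filtering. Gene Ontology analysis and Kyoto Encyclopedia of Genes and Genomes pathway analysis were used to select 70 genes from the DEGs, and a literature review led us to focus on delta like canonical Notch ligand 3 (DLL3). The TCGA-based Gene Expression Profiling Interactive Analysis (GEPIA) database was used to explore differences in DLL3 expression between LGG and GBM. GEPIA and the Tumor Immune Estimation Resource (TIMER) database were used to study the expression of key genes in gliomas of different grades and to predict biomarkers closely related to immunotherapy. The cBioPortal database was used to explore the relationship between DLL3 expression and 25 immune checkpoints. Gene set enrichment analysis (GSEA) further identified pathways associated with the hub gene. Finally, the prognostic and predictive value of the biomarker was validated in the Chinese Glioma Genome Atlas (CGGA). The results indicate that the prognostic genes are related to tumor proliferation and progression. Bioinformatics and survival analyses suggest that these genes may serve as promising prognostic biomarkers and as new targets for selecting treatment strategies.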

7.
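Objective: To analyze the potential molecular functions of pleiotrophin (PTN) using bioinformatics methods. Methods: Using bioinformatics databases provided by the Arizona Cancer Center (USA), we analyzed Ptn-related genes previously detected with a mouse whole-genome expression microarray. GO Terms was used to analyze the functional groups to which these genes belong, and Pathway Miner was used to analyze the signaling pathways they help regulate. Results: Of the 370 Ptn-related genes detected by the microarray, 231 were found in the GO Terms database: 31.83% were involved in cellular components, 35.34% had molecular functions, and 32.83% were involved in biological processes. In the Pathway Miner database, 105 genes were found. These genes were associated with 230 signaling pathways, belonging to cellular and regulatory process pathways and metabolic pathways. Conclusion: PTN is an important cytokine that may be involved in immune and defense responses, inflammatory responses, and the regulation of cell proliferation and apoptosis.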

8.
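Using microarray technology, we performed a genome-wide differential expression analysis of testis tissue from rats aged 6 d and 10 d with the Roche-NimbleGen rat 12×135K whole-genome expression microarray. A total of 4298 genes were differentially expressed by 2-fold or more, of which 1878 were up-regulated and 2420 were down-regulated. Of these differentially expressed genes, 3154 had Gene Ontology annotations and were involved in 154 biological pathways. Further analysis showed that 13 genes were differentially expressed by 8-fold or more; these genes fell into the Gene Ontology categories of biological process, cellular component, and molecular function. Three differentially expressed genes, LOC686076, Cxcl6, and Trib3, were selected for real-time quantitative RT-PCR, and the trends agreed with the microarray data. We therefore tentatively conclude that a large number of genes already participate in the generation and proliferation of spermatogonial stem cells during early rat development, and that this is a process of coordinated multi-gene expression.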
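The "2-fold or more" criterion used in this abstract amounts to thresholding the log2 ratio of expression between the two conditions. The sketch below is a minimal illustration, assuming already normalized, strictly positive intensities; the gene names and values are hypothetical, and the study's actual normalization and filtering pipeline is not described in the abstract.

```python
import math


def classify_fold_changes(expr_a, expr_b, threshold=2.0):
    """Split genes into up- and down-regulated lists by fold change of b relative to a."""
    cutoff = math.log2(threshold)
    up, down = [], []
    for gene in expr_a:
        # The log2 ratio is symmetric: +1 means doubled, -1 means halved
        log_fc = math.log2(expr_b[gene] / expr_a[gene])
        if log_fc >= cutoff:
            up.append(gene)
        elif log_fc <= -cutoff:
            down.append(gene)
    return up, down


# Hypothetical normalized intensities at day 6 and day 10
day6 = {"geneA": 100.0, "geneB": 400.0, "geneC": 250.0, "geneD": 80.0}
day10 = {"geneA": 450.0, "geneB": 90.0, "geneC": 300.0, "geneD": 160.0}

up, down = classify_fold_changes(day6, day10)
print("up:", up)
print("down:", down)
```

Raising `threshold` to 8.0 gives the stricter 8-fold filter mentioned in the same abstract.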

9.
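Effect of microRNA-122 on the gene expression profile of hepatoma cells   Cited by: 1 (self-citations: 0, citations by others: 1)
To study the effect of the microRNA miR-122 on the gene expression profile of the hepatoma cell line Hep3B and to explore its possible role in the development of liver cancer, we constructed Hep3B cells with stable high expression of miR-122 and used expression microarrays to identify genes differentially expressed relative to control cells. A total of 490 genes changed by 2-fold or more, of which 345 were up-regulated and 145 were down-regulated. Sixteen of these genes are associated with tumorigenesis, and the others are involved in many biological processes, including the cell cycle, signal transduction, apoptosis, and cell proliferation and differentiation. These results suggest that miR-122 may play a role in the development of liver cancer and may be closely linked to these differentially expressed genes. In addition, bioinformatics methods were used to predict possible direct targets of miR-122 among the down-regulated genes. This study provides a preliminary look at the biological function of miR-122 in hepatoma cells, lays a foundation for further study of its role in the development of liver cancer, and offers a reference for research on the biological functions and mechanisms of miRNAs.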

10.
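Journal of Snake (《蛇志》), 2020, (1)
Objective: To identify differentially expressed genes in patients with ankylosing spondylitis (AS) and, based on these genes, to explore biological processes and signaling pathways that may be related to the onset of AS. Methods: The Gene Expression Omnibus (GEO) was searched to select AS-related gene expression datasets. GEO2R, the online analysis tool of GEO, was used to identify genes differentially expressed between the AS group and the normal control group. Gene Ontology and Kyoto Encyclopedia of Genes and Genomes analyses were performed with the clueGO plugin of Cytoscape. The String protein-protein interaction (PPI) database was used to analyze interactions among the proteins encoded by the differentially expressed genes. Cytoscape was used to draw the protein interaction network and to screen for key genes in the signaling pathways. Results: The whole-blood expression dataset GSE25101 from AS patients was selected, and 72 differentially expressed genes were obtained. In terms of molecular function, these 72 genes were mainly involved in high mobility group box 1 (HMGB1) transduction mechanisms. The enriched biological processes were mainly macrophage migration, myeloid cell apoptotic process, mitochondrial respiratory chain complex assembly, ATP synthesis coupled electron transport, and mitochondrial ATP synthesis coupled electron transport. The enriched cellular components were mainly the respiratory chain complex and the mitochondrial respirasome. The enriched signaling pathways were the oxidative phosphorylation pathway and Parkinson's disease-related pathways. Screening of the PPI network with the cytohubba plugin selected ATP5J, NDUFS4, UQCRB, UQCRH, NDUFB3, COX7B, LSM3, ATP5EP2, ENY2, and PSMA4 as the core genes of the network. Conclusion: Bioinformatics methods were used to predict potential mechanisms of AS and to identify 10 potentially important AS-related molecules; oxidative phosphorylation may play an important role in the pathogenesis of AS.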

11.

Background  

Gene set enrichment analysis (GSEA) is a microarray data analysis method that uses predefined gene sets and ranks of genes to identify significant biological changes in microarray data sets. GSEA is especially useful when gene expression changes in a given microarray data set are minimal or moderate.
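To make "predefined gene sets and ranks of genes" concrete, here is a minimal sketch of the unweighted running-sum enrichment score: walking down the ranked list, the sum rises at each gene in the set and falls at each gene outside it, and the score is the largest deviation from zero. The gene names are hypothetical, and the weighting by correlation and the permutation-based significance step of the full method are omitted.

```python
def enrichment_score(ranked_genes, gene_set):
    """Unweighted running-sum enrichment score for one gene set.

    ranked_genes -- genes ordered from most up- to most down-regulated
    gene_set     -- collection of genes in the predefined set
    """
    hits = [g in gene_set for g in ranked_genes]
    n_hits = sum(hits)
    n_miss = len(ranked_genes) - n_hits
    if n_hits == 0 or n_miss == 0:
        raise ValueError("gene set must cover some, but not all, ranked genes")

    running, best = 0.0, 0.0
    for is_hit in hits:
        # Step up for a set member, down otherwise; the steps sum to zero overall
        running += 1.0 / n_hits if is_hit else -1.0 / n_miss
        if abs(running) > abs(best):
            best = running
    return best


# Hypothetical ranked list of eight genes
ranked = ["g1", "g2", "g3", "g4", "g5", "g6", "g7", "g8"]
print(enrichment_score(ranked, {"g1", "g2", "g4"}))  # set concentrated at the top
print(enrichment_score(ranked, {"g7", "g8"}))        # set concentrated at the bottom
```

Because the score depends on where set members fall in the whole ranking rather than on a per-gene cutoff, many small coordinated shifts can produce a large score, which is why the method suits data with minimal or moderate changes.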

12.

Background  

Gene set enrichment testing has helped bridge the gap from an individual gene to a systems biology interpretation of microarray data. Although gene sets are defined a priori based on biological knowledge, current methods for gene set enrichment testing treat all genes equally. It is well known that some genes, such as those responsible for housekeeping functions, appear in many pathways, whereas other genes are more specialized and play a unique role in a single pathway. Drawing inspiration from the field of information retrieval, we have developed and present here an approach to incorporate gene appearance frequency (in KEGG pathways) into two current methods, Gene Set Enrichment Analysis (GSEA) and the logistic regression-based LRpath framework, to generate more reproducible and biologically meaningful results.
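One way to picture weighting by gene appearance frequency is an inverse-document-frequency style weight borrowed from information retrieval. The formula below, log(N / n_g) with N pathways in total and n_g pathways containing gene g, is an assumption made for illustration; the abstract does not state the authors' actual weighting scheme, and the pathway and gene names are hypothetical.

```python
import math
from collections import Counter


def appearance_weights(pathways):
    """IDF-style weight per gene: log(total pathways / pathways containing the gene).

    Assumed formula for illustration only, not the published method.
    """
    n_pathways = len(pathways)
    counts = Counter(g for genes in pathways.values() for g in set(genes))
    return {g: math.log(n_pathways / c) for g, c in counts.items()}


# Hypothetical pathways: one gene appears everywhere, the others are specific
pathways = {
    "pathway_1": {"housekeeping_gene", "geneX"},
    "pathway_2": {"housekeeping_gene", "geneY"},
    "pathway_3": {"housekeeping_gene", "geneZ"},
}

weights = appearance_weights(pathways)
for gene in sorted(weights):
    print(gene, round(weights[gene], 3))
```

Under this assumed scheme a gene present in every pathway gets weight zero, so it contributes nothing to any one pathway's score, while a pathway-specific gene gets the largest weight.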

13.
Circumventing the cut-off for enrichment analysis   Cited by: 1 (self-citations: 0, citations by others: 1)

14.

Background

Large amounts of microarray expression data have been generated for the Apicomplexan parasite Toxoplasma gondii in an effort to identify genes critical for virulence or developmental transitions. However, researchers’ ability to analyze this data is limited by the large number of unannotated genes, including many that appear to be conserved hypothetical proteins restricted to Apicomplexa. Further, differential expression of individual genes is not always informative and often relies on investigators to draw big-picture inferences without the benefit of context. We hypothesized that customization of gene set enrichment analysis (GSEA) to T. gondii would enable us to rigorously test whether groups of genes serving a common biological function are co-regulated during the developmental transition to the latent bradyzoite form.

Results

Using publicly available T. gondii expression microarray data, we created Toxoplasma gene sets related to bradyzoite differentiation, oocyst sporulation, and the cell cycle. We supplemented these with lists of genes derived from community annotation efforts that identified contents of the parasite-specific organelles, rhoptries, micronemes, dense granules, and the apicoplast. Finally, we created gene sets based on metabolic pathways annotated in the KEGG database and Gene Ontology terms associated with gene annotations available at http://www.toxodb.org. These gene sets were used to perform GSEA analysis using two sets of published T. gondii expression data that characterized T. gondii stress response and differentiation to the latent bradyzoite form.

Conclusions

GSEA provides evidence that cell cycle regulation and bradyzoite differentiation are coupled. Δgcn5A mutants unable to induce bradyzoite-associated genes in response to alkaline stress have different patterns of cell cycle and bradyzoite gene expression from stressed wild-type parasites. Extracellular tachyzoites resemble a transitional state that differs in gene expression from both replicating intracellular tachyzoites and in vitro bradyzoites by expressing genes that are enriched in bradyzoites as well as genes that are associated with the G1 phase of the cell cycle. The gene sets we have created are readily modified to reflect ongoing research and will aid researchers’ ability to use a knowledge-based approach to data analysis facilitating the development of new insights into the intricate biology of Toxoplasma gondii.

Electronic supplementary material

The online version of this article (doi:10.1186/1471-2164-15-515) contains supplementary material, which is available to authorized users.

15.
A test-statistic typically employed in the gene set enrichment analysis (GSEA) prevents this method from being genuinely multivariate. In particular, this statistic is insensitive to changes in the correlation structure of the gene sets of interest. The present paper considers the utility of an alternative test-statistic in designing the confirmatory component of the GSEA. This statistic is based on a pertinent distance between joint distributions of expression levels of genes included in the set of interest. The null distribution of the proposed test-statistic, known as the multivariate N-statistic, is obtained by permuting group labels. Our simulation studies and analysis of biological data confirm the conjecture that the N-statistic is a much better choice for multivariate significance testing within the framework of the GSEA. We also discuss some other aspects of the GSEA paradigm and suggest new avenues for future research.
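The idea of a distance between joint distributions, with a null distribution obtained by permuting group labels, can be sketched as follows. The statistic here uses the Euclidean distance between sample vectors, in the form sqrt(2·mean d(x, y) − mean d(x, x') − mean d(y, y')); this specific form is an assumption for illustration, since the abstract does not give the formula, and the data are hypothetical.

```python
import math
import random


def mean_pairwise_distance(a, b):
    """Average Euclidean distance over all pairs (x in a, y in b)."""
    total = sum(math.dist(x, y) for x in a for y in b)
    return total / (len(a) * len(b))


def n_statistic(group1, group2):
    """Distance between two multivariate samples (assumed Euclidean-kernel form)."""
    value = (2.0 * mean_pairwise_distance(group1, group2)
             - mean_pairwise_distance(group1, group1)
             - mean_pairwise_distance(group2, group2))
    return math.sqrt(max(value, 0.0))  # guard against tiny negative rounding error


def permutation_p_value(group1, group2, n_perm=500, seed=0):
    """Estimate the p-value by shuffling samples between the two groups."""
    rng = random.Random(seed)
    observed = n_statistic(group1, group2)
    pooled = list(group1) + list(group2)
    n1 = len(group1)
    exceed = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        if n_statistic(pooled[:n1], pooled[n1:]) >= observed:
            exceed += 1
    # Add-one correction keeps the estimate strictly positive
    return (exceed + 1) / (n_perm + 1)


# Hypothetical expression vectors for a two-gene set, four samples per group
group1 = [(0.0, 0.0), (0.1, 0.2), (0.2, 0.1), (-0.1, 0.1)]
group2 = [(5.0, 5.0), (5.1, 5.2), (5.2, 5.1), (4.9, 5.1)]

p_value = permutation_p_value(group1, group2)
print(n_statistic(group1, group2), p_value)
```

Each sample enters as a whole vector, so the statistic responds to changes in the joint distribution of the set, including its correlation structure, and not only to shifts in individual genes.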

16.
17.

Background  

Recently, microarray data analyses using functional pathway information, e.g., gene set enrichment analysis (GSEA) and significance analysis of function and expression (SAFE), have gained recognition as a way to identify biological pathways/processes associated with a phenotypic endpoint. In these analyses, a local statistic is used to assess the association between the expression level of a gene and the value of a phenotypic endpoint. Then these gene-specific local statistics are combined to evaluate association for pre-selected sets of genes. Commonly used local statistics include t-statistics for binary phenotypes and correlation coefficients that assume a linear or monotone relationship between a continuous phenotype and gene expression level. Methods applicable to continuous non-monotone relationships are needed. Furthermore, for multiple experimental categories, methods that combine multiple GSEA/SAFE analyses are needed.
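The two-stage scheme described here, a local statistic per gene followed by a combined statistic per gene set, can be sketched for a binary phenotype. The local statistic below is Welch's t; the combining rule (the mean of the local statistics in the set) is a simple stand-in chosen for illustration and is not the rule used by GSEA or SAFE. Gene names and values are hypothetical.

```python
import math
from statistics import mean, variance


def welch_t(x, y):
    """Welch's t-statistic for two samples (requires non-zero sample variance)."""
    se = math.sqrt(variance(x) / len(x) + variance(y) / len(y))
    return (mean(x) - mean(y)) / se


def set_statistic(expression, labels, gene_set):
    """Compute local t-statistics for every gene, then average them over the set.

    expression -- dict mapping gene -> list of expression values, one per sample
    labels     -- list of 1 (case) or 0 (control), one per sample
    """
    local = {}
    for gene, values in expression.items():
        case = [v for v, lab in zip(values, labels) if lab == 1]
        ctrl = [v for v, lab in zip(values, labels) if lab == 0]
        local[gene] = welch_t(case, ctrl)
    # Illustrative combining rule: mean of the local statistics in the set
    return mean(local[g] for g in gene_set), local


# Hypothetical data: three cases followed by three controls
labels = [1, 1, 1, 0, 0, 0]
expression = {
    "g1": [5.0, 6.0, 7.0, 1.0, 2.0, 3.0],
    "g2": [4.0, 5.0, 6.0, 2.0, 3.0, 4.0],
    "g3": [1.0, 2.0, 3.0, 1.0, 2.0, 3.0],
}

score, local = set_statistic(expression, labels, {"g1", "g2"})
print(score, local)
```

For a continuous phenotype the same structure applies with `welch_t` replaced by a correlation coefficient, which is where the limitation to linear or monotone relationships noted in the abstract comes from.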

18.

Background  

With the current technological advances in high-throughput biology, the necessity to develop tools that help to analyse the massive amount of data being generated is evident. A powerful method of inspecting large-scale data sets is gene set enrichment analysis (GSEA), and investigation of protein structural features can help determine the function of individual genes. However, a convenient tool that combines these two features to aid in high-throughput data analysis has not been developed yet. In order to fill this niche, we developed the user-friendly, web-based application, PhenoFam.

19.
20.
Extensions to gene set enrichment   Cited by: 2 (self-citations: 0, citations by others: 2)
MOTIVATION: Gene Set Enrichment Analysis (GSEA) has been developed recently to capture changes in the expression of pre-defined sets of genes. We propose a number of extensions to GSEA, including the use of different statistics to describe the association between genes and phenotypes of interest. We make use of dimension reduction procedures, such as principal component analysis, to identify gene sets with correlated expression. We also address issues that arise when gene sets overlap. RESULTS: Our proposals extend the range of applicability of GSEA and allow for adjustments based on other covariates. We have provided a well-defined procedure to address interpretation issues that can arise when gene sets have substantial overlap. We have shown how standard dimension reduction methods, such as PCA, can be used to help further interpret GSEA. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
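A first step in handling overlapping gene sets is simply to measure the overlap. The sketch below flags pairs of sets by Jaccard index; this is a generic measure chosen for illustration and is not the authors' procedure, which the abstract does not describe. Set names, gene names, and the 0.5 threshold are hypothetical.

```python
def jaccard(set_a, set_b):
    """Jaccard index: shared genes divided by all genes in either set."""
    union = set_a | set_b
    return len(set_a & set_b) / len(union) if union else 0.0


def overlapping_pairs(gene_sets, threshold=0.5):
    """Return (name_a, name_b, jaccard) for every pair at or above the threshold."""
    names = sorted(gene_sets)
    pairs = []
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            score = jaccard(gene_sets[a], gene_sets[b])
            if score >= threshold:
                pairs.append((a, b, score))
    return pairs


# Hypothetical gene sets: the first two share three of five genes
gene_sets = {
    "set_1": {"g1", "g2", "g3", "g4"},
    "set_2": {"g2", "g3", "g4", "g5"},
    "set_3": {"g8", "g9"},
}

pairs = overlapping_pairs(gene_sets)
print(pairs)
```

When two flagged sets are both significant, inspecting the shared genes separately from the genes unique to each set is one way to judge whether the two signals are independent.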
