Similar Literature
 Found 18 similar articles.
1.
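Rapid advances in high-throughput microarray hybridization and sequencing technologies have generated vast amounts of gene data, and biological information has swelled into an ocean of data. To accommodate the continuing growth of high-throughput gene expression data and the need to share it, a variety of databases have emerged. Among them, the Gene Expression Omnibus (GEO) of the NCBI (National Center for Biotechnology Information) is the world's largest public database for high-throughput molecular abundance data; users can submit, store, and retrieve many forms of data and use them free of charge. To date, GEO holds data for 300,000 samples, comprising 1.6 billion gene expression abundance measurements from more than 500 organisms and spanning a wide range of biological topics. Its ease of use, comprehensive data, and free sharing make GEO a good platform for downstream data mining and dissemination. This article outlines the structure of GEO, data submission and retrieval, and its application prospects in molecular biology. GEO is accessible at http://www.ncbi.nlm.nih.gov/geo.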

2.
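GEO (Gene Expression Omnibus): a high-throughput gene expression database   Total citations: 2 (self-citations: 0, citations by others: 2)
 The GEO (Gene Expression Omnibus) database covers a broad range of high-throughput experimental data, including single-channel and dual-channel microarray-based measurements of mRNA abundance as well as experimental data on genomic DNA and protein molecules. Data from non-array-based high-throughput functional genomics and proteomics technologies, such as serial analysis of gene expression (SAGE) and protein identification technologies, are also archived. To date, GEO contains data covering 10,000 hybridization experiments and SAGE libraries from 30 different organisms. This article outlines querying and browsing of GEO, data download and formats, data analysis, and storage and updating, with particular attention to the use of controlled vocabulary in the GEO data browser. It also discusses data mining of GEO and its application prospects in molecular biology. GEO is publicly accessible at http://www.ncbi.nlm.nih.gov/projects/geo/.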

3.
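Differential analysis of gene expression and alternative splicing across Arabidopsis tissues   Total citations: 1 (self-citations: 0, citations by others: 1)
Alternative splicing is an important post-transcriptional mode of gene expression regulation and a major source of transcriptome and proteome diversity. In recent years, with the completion of transcriptome sequencing in plants such as Arabidopsis, rice, and maize, researchers have found that alternative splicing of plant pre-mRNA is closely tied to biological processes such as tissue differentiation and development. Using RNA-seq data from the GEO database and tools commonly used in high-throughput sequencing analysis (Trimmomatic, Salmon, DESeq2, and SUPPA2), this work identified expressed genes and alternative splicing events in seven Arabidopsis tissues (seed, root, leaf, flower, pedicel, internode, and silique), as well as differentially expressed genes and differential alternative splicing events among the seven tissues, and presented the corresponding biological function analysis using leaf and flower as an example. The work systematically examines the tissue specificity of gene expression and alternative splicing in Arabidopsis and helps further clarify the mechanisms that regulate gene expression in plant genomes.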

4.
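Objective: To build a comparative transcriptomics database of animal models of coronavirus infection, to compare at the gene expression level the similarities and differences between coronavirus infection in humans and in animal models, and to provide data support for animal experiments and clinical research. Methods: Transcriptome data from animal models and humans infected with coronaviruses (mainly SARS-CoV, SARS-CoV-2, and MERS-CoV) were downloaded from GEO, ArrayExpress, and other databases; the sequencing data underwent quality control, normalization...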

5.
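Tetrahymena thermophila is an excellent eukaryotic model organism, and analysis of its functional genomics will offer new opportunities to study fundamental questions in cell and molecular biology. The Tetrahymena Gene Expression Database (TGED) is the first gene expression database for ciliated protozoa. It contains genome-wide gene expression data from 20 time points across three major physiological and developmental stages of Tetrahymena and provides a user-friendly web interface (http://tged.ihb.ac.cn) for accessing these data. Using a gene identifier or description, users can retrieve gene expression profiles and co-expressed genes. TGED also provides the sample preparation protocol for Tetrahymena gene expression microarrays. Researchers are welcome to upload and share all of their microarray data, with the aim of providing an important functional genomics resource for Tetrahymena and ciliate research.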

6.
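Oncomine is currently one of the world's largest oncogene microarray databases and integrated data-mining platforms. It integrates RNA and DNA-seq data from GEO, TCGA, and the published literature. The database currently contains 715 gene expression datasets and information on 86,733 human tumor and normal tissue samples, and new data are added continually. Oncomine covers 19 tumor types: bladder cancer, brain/central nervous system tumors, breast cancer, cervical cancer, colorectal cancer, esophageal cancer, gastric cancer, head and neck tumors, kidney cancer, leukemia, liver cancer, lung cancer, lymphoma, melanoma, myeloma, ovarian cancer, pancreatic cancer, prostate cancer, and sarcoma. This article describes how to use Oncomine for differential expression analysis of oncogenes in tumor tissue, gene co-expression analysis, analysis of oncogene expression and copy number in tumor tissue, meta-analysis across multiple study datasets, and analysis of the relationship between oncogene expression and patient survival. The database can be used to screen oncogenes before a study begins, which helps identify new tumor biomarkers or therapeutic targets and provides a theoretical basis for clinical research.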

7.
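High-throughput RNA sequencing (RNA-seq) provides researchers with massive amounts of data. How to analyze these data quickly and effectively, and so support downstream research on transcriptomes and gene expression, is an active topic in bioinformatics. This article discusses the current state of RNA-seq data analysis and the commonly used software and algorithms, and designs a series of data-processing modules and analysis workflows. To give users a better working environment, we also designed BioCloud, a biological cloud platform built on an elastic resource management system. The platform integrates a rich set of software and uses a highly flexible, highly extensible architecture. It offers users low-cost, high-performance computing services along with customized workflow design.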

8.
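Wang Qi, Xu Jie, Guo Zheng, Li Xia. 《生物信息学》 (Bioinformatics), 2003, 1(1): 33-36
Gene chips allow high-throughput, rapid, parallel measurement of gene expression levels and are a powerful tool for functional genomics research. To meet the routine informatics needs of gene chip analysis, we designed and developed a preliminary informatics platform for gene expression profiles. It comprises the standalone software IDKA (Information Digger for Experiments of microArray) and the web application WebGEA (WEB GeneChip Expression Analysis), which let users carry out data-mining analyses either by running a standalone program or by submitting data over the Internet to a server-side program. The platform has been applied successfully and is a convenient tool for routine gene chip informatics analysis.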

9.
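Gene expression profile microarray databases are online databases that support storage, querying, download, and analysis, and they supply a large volume of data for cancer-related research. Because microarray analysis remains difficult for researchers without a background in bioinformatics or medical informatics, these databases are not yet widely used. This article gives an overview of commonly used gene expression profile microarray databases in terms of data querying, download and analysis, and usage, and summarizes current strategies for applying them. It aims to help newcomers to the field learn the basics of these databases and to promote their use in research.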

10.
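Serial analysis of gene expression and its applications   Total citations: 3 (self-citations: 0, citations by others: 3)
Serial analysis of gene expression (SAGE) is a high-throughput technique for studying gene expression patterns that can quantify large numbers of transcripts in a given cell type or tissue simultaneously. This article reviews the basic principles and experimental workflow of SAGE and recent methodological improvements, and introduces some examples of its application as well as SAGE database resources available on the Internet.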
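The quantification step that SAGE relies on, counting short sequence tags per transcript, can be sketched as follows. This is a minimal illustration only: it assumes the tags have already been extracted from sequencing reads, and the function name `tags_per_million` and the tags-per-million scaling are illustrative choices, not something the review specifies.

```python
from collections import Counter
from typing import Dict, Iterable


def tags_per_million(tags: Iterable[str]) -> Dict[str, float]:
    """Count each tag and scale the counts to tags per million (hypothetical helper)."""
    counts = Counter(tags)
    total = sum(counts.values())
    if total == 0:
        # No tags observed: nothing to normalize.
        return {}
    return {tag: n * 1_000_000 / total for tag, n in counts.items()}


# Toy tag list; real SAGE tags are longer and come from a sequencing run.
toy_tags = ["AAAC", "AAAC", "GGTT", "AAAC"]
abundance = tags_per_million(toy_tags)
print(abundance)
```

Scaling to a fixed library size makes libraries of different sequencing depth roughly comparable; other normalizations are equally possible.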

11.
The Gene Expression Omnibus (GEO) project was initiated in response to the growing demand for a public repository for high-throughput gene expression data. GEO provides a flexible and open design that facilitates submission, storage and retrieval of heterogeneous data sets from high-throughput gene expression and genomic hybridization experiments. GEO is not intended to replace in-house gene expression databases that benefit from coherent data sets and are constructed to facilitate a particular analytic method, but rather to complement them by acting as a tertiary, central data distribution hub. The three central data entities of GEO are platforms, samples and series, which were designed with gene expression and genomic hybridization experiments in mind. A platform is, essentially, a list of probes that define what set of molecules may be detected. A sample describes the set of molecules that are being probed and references a single platform used to generate its molecular abundance data. A series organizes samples into the meaningful data sets which make up an experiment. The GEO repository is publicly accessible through the World Wide Web at http://www.ncbi.nlm.nih.gov/geo.
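The relationship between the three entities in this abstract (platform, sample, series) can be sketched as a small data model. This is an illustration under stated assumptions, not GEO's actual schema: the class and field names and the placeholder accessions are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Platform:
    """A list of probes defining which molecules may be detected."""
    accession: str
    probe_ids: List[str]


@dataclass
class Sample:
    """Abundance measurements for one sample, tied to exactly one platform."""
    accession: str
    platform: Platform
    abundance: Dict[str, float]  # probe ID -> measured value

    def __post_init__(self) -> None:
        # Every measured probe must exist on the referenced platform.
        unknown = set(self.abundance) - set(self.platform.probe_ids)
        if unknown:
            raise ValueError(f"probes not on platform: {sorted(unknown)}")


@dataclass
class Series:
    """A group of samples that together make up one experiment."""
    accession: str
    samples: List[Sample] = field(default_factory=list)

    def platforms(self) -> List[str]:
        # Distinct platform accessions used by the samples, sorted.
        return sorted({s.platform.accession for s in self.samples})


# Placeholder accessions, not real GEO records.
platform = Platform("PLATFORM_1", ["probe_a", "probe_b"])
sample_1 = Sample("SAMPLE_1", platform, {"probe_a": 1.5, "probe_b": 0.2})
sample_2 = Sample("SAMPLE_2", platform, {"probe_a": 2.0})
series = Series("SERIES_1", [sample_1, sample_2])
print(series.platforms())  # ['PLATFORM_1']
```

Keeping the platform as a separate object that samples reference, rather than copying probe lists into each sample, mirrors the abstract's statement that a sample "references a single platform".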

12.
Bioinformatics support for high-throughput proteomics   Total citations: 2 (self-citations: 0, citations by others: 2)
In the "post-genome" era, mass spectrometry (MS) has become an important method for the analysis of proteome data. The rapid advancement of this technique in combination with other methods used in proteomics results in an increasing number of high-throughput projects. This leads to an increasing amount of data that needs to be archived and analyzed. To cope with the need for automated data conversion, storage, and analysis in the field of proteomics, the open source system ProDB was developed. The system handles data conversion from different mass spectrometer software, automates data analysis, and allows the annotation of MS spectra (e.g. assigning gene names, storing data on protein modifications). The system is based on an extensible relational database to store the mass spectra together with the experimental setup. It also provides a graphical user interface (GUI) for managing the experimental steps which led to the MS data. Furthermore, it allows the integration of genome and proteome data. Data from an ongoing experiment were used to compare manual and automated analysis. First tests showed that the automation resulted in a significant saving of time. Furthermore, the quality and interpretability of the results were improved in all cases.

13.
Integrated analysis of DNA methylation and gene expression can reveal specific epigenetic patterns that are important during carcinogenesis. We built an integrated database of DNA methylation and gene expression termed MENT (Methylation and Expression database of Normal and Tumor tissues) to provide researchers with information on both DNA methylation and gene expression in diverse cancers. It contains integrated data of DNA methylation, gene expression, correlation of DNA methylation and gene expression in paired samples, and clinicopathological conditions gathered from the GEO (Gene Expression Omnibus) and TCGA (The Cancer Genome Atlas). A user-friendly interface allows users to search for differential DNA methylation by either ‘gene search’ or ‘dataset search’. The ‘gene search’ returns which conditions are differentially methylated in a gene of interest, while ‘dataset search’ returns which genes are differentially methylated in a condition of interest based on filtering options such as direction, DM (differential methylation value), and p-value. MENT is the first database that provides both DNA methylation and gene expression information in diverse normal and tumor tissues. Its user-friendly interface allows users to easily search and view both DNA methylation and gene expression patterns. MENT is freely available at http://mgrc.kribb.re.kr:8080/MENT/.

14.
Microarray technology has become a standard molecular biology tool. Experimental data have been generated on a huge number of organisms, tissue types, treatment conditions and disease states. The Gene Expression Omnibus (Barrett et al., 2005), developed by the National Center for Biotechnology Information (NCBI) at the National Institutes of Health, is a repository of nearly 140,000 gene expression experiments. The BioConductor project (Gentleman et al., 2004) is an open-source and open-development software project built in the R statistical programming environment (R Development Core Team, 2005) for the analysis and comprehension of genomic data. The tools contained in the BioConductor project represent many state-of-the-art methods for the analysis of microarray and genomics data. We have developed a software tool that allows access to the wealth of information within GEO directly from BioConductor, eliminating many of the formatting and parsing problems that have made such analyses labor-intensive in the past. The software, called GEOquery, effectively establishes a bridge between GEO and BioConductor. Easy access to GEO data from BioConductor will likely lead to new analyses of GEO data using novel and rigorous statistical and bioinformatic tools. Facilitating analyses and meta-analyses of microarray data will increase the efficiency with which biologically important conclusions can be drawn from published genomic data. Availability: GEOquery is available as part of the BioConductor project.

15.
16.
Colorectal cancer (CRC) ranks as one of the most common malignant tumors worldwide. Its mortality rate has remained high in recent years. Therefore, the aim of this study was to identify significant differentially expressed genes (DEGs) involved in its pathogenesis, which may be used as novel biomarkers or potential therapeutic targets for CRC. The gene expression profiles of GSE21510, GSE32323, GSE89076, and GSE113513 were downloaded from the Gene Expression Omnibus (GEO) database. After screening DEGs in each GEO data set, we further used the robust rank aggregation method to identify 494 significant DEGs including 212 upregulated and 282 downregulated genes. Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway enrichment analyses were performed by DAVID and the KOBAS online database, respectively. These DEGs were shown to be significantly enriched in different cancer-related functions and pathways. Then, the STRING database was used to construct the protein–protein interaction network. The module analysis was performed by the MCODE plug-in of Cytoscape based on the whole network. We finally filtered out seven hub genes by the cytoHubba plug-in, including PPBP, CCL28, CXCL12, INSL5, CXCL3, CXCL10, and CXCL11. Expression validation and survival analysis of these hub genes were performed based on The Cancer Genome Atlas database. In conclusion, the robust DEGs associated with the carcinogenesis of CRC were screened through the GEO database, and integrated bioinformatics analysis was conducted. Our study provides reliable molecular biomarkers for screening, diagnosis, and prognosis, as well as novel therapeutic targets for CRC.

17.
The NCBI Gene Expression Omnibus (GEO) represents the largest public repository of microarray data. However, finding data in GEO can be challenging. We have developed GEOmetadb in an attempt to make querying the GEO metadata both easier and more powerful. All GEO metadata records as well as the relationships between them are parsed and stored in a local MySQL database. A powerful, flexible web search interface with several convenient utilities provides query capabilities not available via NCBI tools. In addition, a Bioconductor package, GEOmetadb, that utilizes a SQLite export of the entire GEOmetadb database is also available, rendering the entire GEO database accessible with the full power of SQL-based queries from within R. AVAILABILITY: The web interface and SQLite databases are available at http://gbnci.abcc.ncifcrf.gov/geo/. The Bioconductor package is available via the Bioconductor project. The corresponding MATLAB implementation is also available at the same website.
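The idea of querying GEO metadata and the relationships between records with plain SQL can be sketched with Python's built-in `sqlite3` module. This is only an illustration: the tables `series`, `sample`, and `series_sample`, their columns, and the toy rows are assumptions made for the example and do not reproduce the actual GEOmetadb schema or any real GEO records.

```python
import sqlite3

# In-memory database with a simplified, hypothetical metadata schema.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE series (series_id TEXT PRIMARY KEY, title TEXT);
CREATE TABLE sample (sample_id TEXT PRIMARY KEY, organism TEXT);
CREATE TABLE series_sample (series_id TEXT, sample_id TEXT);
""")

# Toy rows with placeholder identifiers.
conn.executemany("INSERT INTO series VALUES (?, ?)", [
    ("SERIES_1", "toy human study"),
    ("SERIES_2", "toy plant study"),
])
conn.executemany("INSERT INTO sample VALUES (?, ?)", [
    ("SAMPLE_1", "Homo sapiens"),
    ("SAMPLE_2", "Homo sapiens"),
    ("SAMPLE_3", "Arabidopsis thaliana"),
])
conn.executemany("INSERT INTO series_sample VALUES (?, ?)", [
    ("SERIES_1", "SAMPLE_1"),
    ("SERIES_1", "SAMPLE_2"),
    ("SERIES_2", "SAMPLE_3"),
])


def series_for_organism(db: sqlite3.Connection, organism: str):
    """Return (series_id, title, sample_count) for series containing the organism."""
    query = """
        SELECT s.series_id, s.title, COUNT(*) AS n_samples
        FROM series AS s
        JOIN series_sample AS link ON link.series_id = s.series_id
        JOIN sample AS m ON m.sample_id = link.sample_id
        WHERE m.organism = ?
        GROUP BY s.series_id, s.title
        ORDER BY s.series_id
    """
    # A bound parameter avoids building SQL by string concatenation.
    return db.execute(query, (organism,)).fetchall()


print(series_for_organism(conn, "Homo sapiens"))
```

A join across the link table is the kind of relationship query the abstract has in mind when it mentions storing "the relationships between" metadata records.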

18.
Leukemias are exceptionally well studied at the molecular level and a wealth of high-throughput data has been published. However, further use of these data by researchers is severely hampered by the lack of accessible integrative tools for viewing and analysis. We developed the Leukemia Gene Atlas (LGA) as a public platform designed to support research and analysis of diverse genomic data published in the field of leukemia. With respect to leukemia research, the LGA is a unique resource with comprehensive search and browse functions. It provides extensive analysis and visualization tools for various types of molecular data. Currently, its database contains data from more than 5,800 leukemia and hematopoiesis samples generated by microarray gene expression, DNA methylation, SNP and next generation sequencing analyses. The LGA allows easy retrieval of large published data sets and thus helps to avoid redundant investigations. It is accessible at www.leukemia-gene-atlas.org.
