Similar articles
 20 similar articles found
1.
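GEO (Gene Expression Omnibus): a high-throughput gene expression database   Cited by: 2 (self-citations: 0, citations by others: 2)
 The GEO (Gene Expression Omnibus) database covers a broad range of high-throughput experimental data, including single-channel and dual-channel microarray-based measurements of mRNA abundance as well as experimental data on genomic DNA and protein molecules. Data from non-array-based high-throughput functional genomics and proteomics technologies, such as serial analysis of gene expression (SAGE) and protein identification technologies, are also archived. To date, GEO contains data from about 10 000 hybridization experiments and SAGE libraries from 30 different organisms. This article outlines querying and browsing of GEO, data download and formats, data analysis, and data storage and updating. It focuses on the use of controlled vocabularies in the GEO data browser, and discusses data mining of GEO and the prospects for applying GEO in molecular biology. GEO is publicly accessible at http://www.ncbi.nlm.nih.gov/projects/geo/.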

2.
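Using high-throughput methods to measure gene expression has become very common in recent years. Microarray technology can quantify thousands of gene transcripts simultaneously. The Gene Expression Omnibus (GEO) is currently the largest fully public repository of high-throughput molecular abundance data, and it mainly stores gene expression data. With a flexible and open design, the database allows users and researchers to submit, store, and retrieve many different types of data. This article reviews recent applications of the database in gene expression data mining. It also introduces tools that, through user-friendly web interfaces, make it possible to effectively explore, query, and visualize hundreds of experiments and millions of gene expression profiles, thereby facilitating data mining and visualization. GEO is publicly accessible at http://www.ncbi.nlm.nih.gov/geo.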

3.
A digital anatomy construction (DANCER) program was developed for gene expression data. DANCER can be used to reconstruct anatomical images from in situ hybridization images, microarray or other gene expression data. The program fills regions of a drawn figure with the corresponding values from a gene expression data set. The output of the program presents the expression levels of a particular gene in a particular region relative to other regions. The program was tested with values from experimental in situ hybridization autoradiographs and from a microarray experiment. Reconstruction of in situ hybridization data from adult rat brain made by DANCER corresponded well with the original autoradiograph. Reconstruction of microarray data from adult mouse brains provided images that reflect actual expression levels. This program should help to provide visualization and interpretation of data derived from gene expression experiments. DANCER may be freely downloaded.

4.
Massive amounts of image data have been collected and continue to be generated for representing cellular gene expression throughout the mouse brain. Critical to exploiting this key effort of the post-genomic era is the ability to place these data into a common spatial reference that enables rapid interactive queries, analysis, data sharing, and visualization. In this paper, we present a set of automated protocols for generating and annotating gene expression patterns suitable for the establishment of a database. The steps include imaging tissue slices, detecting cellular gene expression levels, spatial registration with an atlas, and textual annotation. Using high-throughput in situ hybridization to generate serial sets of tissues displaying gene expression, this process was applied toward the establishment of a database representing over 200 genes in the postnatal day 7 mouse brain. The data generated with this protocol are now well suited for interactive comparisons, analysis, queries, and visualization.

5.
Curated gene sets from databases such as KEGG Pathway and Gene Ontology are often used to systematically organize lists of genes or proteins derived from high-throughput data. However, the information content inherent to some relationships between the interrogated gene sets, such as pathway crosstalk, is often underutilized. A gene set network, where nodes representing individual gene sets such as KEGG pathways are connected to indicate a functional dependency, is well suited to visualize and analyze global gene set relationships. Here we introduce a novel gene set network construction algorithm that integrates gene lists derived from high-throughput experiments with curated gene sets to construct co-enrichment gene set networks. Along with previously described co-membership and linkage algorithms, we apply the co-enrichment algorithm to eight gene set collections to construct integrated multi-evidence gene set networks with multiple edge types connecting gene sets. We demonstrate the utility of this approach through examples of novel gene set networks such as the chromosome map co-differential expression gene set network. A total of twenty-four gene set networks are exposed via a web tool called MetaNet, where context-specific multi-edge gene set networks are constructed from enriched gene sets within user-defined gene lists. MetaNet is freely available at http://blaispathways.dfci.harvard.edu/metanet/.

6.
7.
The ability to correlate chromosome conformation and gene expression gives a great deal of information regarding the strategies used by a cell to properly regulate gene activity. 4C-Seq is a relatively new and increasingly popular technology where the set of genomic interactions generated by a single point in the genome can be determined. 4C-Seq experiments generate large, complicated data sets and it is imperative that signal is properly distinguished from noise. Currently, there are a limited number of methods for analyzing 4C-Seq data. Here, we present a new method, fourSig, which in addition to being precise and simple to use also includes a new feature that prioritizes detected interactions. Our results demonstrate the efficacy of fourSig with previously published and novel 4C-Seq data sets and show that our significance prioritization correlates with the ability to reproducibly detect interactions among replicates.

8.

Background

In the last decade, a large amount of microarray gene expression data has been accumulated in public repositories. Integrating and analyzing high-throughput gene expression data have become key activities for exploring gene functions, gene networks and biological pathways. Effectively utilizing these invaluable microarray data remains challenging due to a lack of powerful tools to integrate large-scale gene-expression information across diverse experiments and to search and visualize a large number of gene-expression data points.

Results

Gene Expression Browser is a microarray data integration, management and processing system with web-based search and visualization functions. An innovative method has been developed to define a treatment over a control for every microarray experiment to standardize and make microarray data from different experiments homogeneous. In the browser, data are pre-processed offline and the resulting data points are visualized online with a 2-layer dynamic web display. Users can view all treatments over control that affect the expression of a selected gene via Gene View, and view all genes that change in a selected treatment over control via treatment over control View. Users can also check the changes of expression profiles of a set of either the treatments over control or genes via Slide View. In addition, the relationships between genes and treatments over control are computed according to gene expression ratio and are shown as co-responsive genes and co-regulation treatments over control.

Conclusion

Gene Expression Browser is composed of a set of software tools, including a data extraction tool, a microarray data-management system, a data-annotation tool, a microarray data-processing pipeline, and a data search & visualization tool. The browser is deployed as a free public web service (http://www.ExpressionBrowser.com) that integrates 301 ATH1 gene microarray experiments from public data repositories (viz. the Gene Expression Omnibus repository at the National Center for Biotechnology Information and Nottingham Arabidopsis Stock Center). The set of Gene Expression Browser software tools can be easily applied to the large-scale expression data generated by other platforms and in other species.
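
The "treatment over control" standardization described in the Results can be illustrated with a minimal sketch. The abstract does not say how the expression ratio is computed, so the log2 ratio of replicate means used here is an assumption, and the function name and intensity values are hypothetical.

```python
import math
from statistics import mean


def log2_treatment_over_control(treatment, control):
    """Return log2(mean treatment signal / mean control signal) for one gene.

    Assumption: replicates are summarized by their arithmetic mean on the
    linear scale. The abstract does not specify the actual procedure.
    """
    return math.log2(mean(treatment) / mean(control))


# Hypothetical intensities for one gene (three replicates per condition).
ratio = log2_treatment_over_control([200, 220, 180], [50, 50, 50])
```

A positive value means the gene is higher in the treatment than in the control. Putting every experiment on the same ratio scale is what would make results from different experiments comparable.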

9.
10.
Microarray technology has become a standard molecular biology tool. Experimental data have been generated on a huge number of organisms, tissue types, treatment conditions and disease states. The Gene Expression Omnibus (Barrett et al., 2005), developed by the National Center for Biotechnology Information (NCBI) at the National Institutes of Health, is a repository of nearly 140,000 gene expression experiments. The BioConductor project (Gentleman et al., 2004) is an open-source and open-development software project built in the R statistical programming environment (R Development Core Team, 2005) for the analysis and comprehension of genomic data. The tools contained in the BioConductor project represent many state-of-the-art methods for the analysis of microarray and genomics data. We have developed a software tool that allows access to the wealth of information within GEO directly from BioConductor, eliminating many of the formatting and parsing problems that have made such analyses labor-intensive in the past. The software, called GEOquery, effectively establishes a bridge between GEO and BioConductor. Easy access to GEO data from BioConductor will likely lead to new analyses of GEO data using novel and rigorous statistical and bioinformatic tools. Facilitating analyses and meta-analyses of microarray data will increase the efficiency with which biologically important conclusions can be drawn from published genomic data. Availability: GEOquery is available as part of the BioConductor project.

11.

Background  

Analysis of microarray and other high-throughput data on the basis of gene sets, rather than individual genes, is becoming more important in genomic studies. Correspondingly, a large number of statistical approaches for detecting gene set enrichment have been proposed, but both the interrelations and the relative performance of the various methods are still very much unclear.

12.
Predictome: a database of putative functional links between proteins   Cited by: 11 (self-citations: 2, citations by others: 9)
The current deluge of genomic sequences has spawned the creation of tools capable of making sense of the data. Computational and high-throughput experimental methods for generating links between proteins have recently been emerging. These methods effectively act as hypothesis machines, allowing researchers to screen large sets of data to detect interesting patterns that can then be studied in greater detail. Although the potential use of these putative links in predicting gene function has been demonstrated, a central repository for all such links for many genomes would maximize their usefulness. Here we present Predictome, a database of predicted links between the proteins of 44 genomes based on the implementation of three computational methods (chromosomal proximity, phylogenetic profiling and domain fusion) and large-scale experimental screenings of protein–protein interaction data. The combination of data from various predictive methods in one database allows for their comparison with each other, as well as visualization of their correlation with known pathway information. As a repository for such data, Predictome is an ongoing resource for the community, providing functional relationships among proteins as new genomic data emerges. Predictome is available at http://predictome.bu.edu.

13.
Understanding the function and evolution of developmental regulatory networks requires the characterisation and quantification of spatio-temporal gene expression patterns across a range of systems and species. However, most high-throughput methods to measure the dynamics of gene expression do not preserve the detailed spatial information needed in this context. For this reason, quantification methods based on image bioinformatics have become increasingly important over the past few years. Most available approaches in this field either focus on the detailed and accurate quantification of a small set of gene expression patterns, or attempt high-throughput analysis of spatial expression through binary pattern extraction and large-scale analysis of the resulting datasets. Here we present a robust, “medium-throughput” pipeline to process in situ hybridisation patterns from embryos of different species of flies. It bridges the gap between high-resolution and high-throughput image processing methods, enabling us to quantify graded expression patterns along the antero-posterior axis of the embryo in an efficient and straightforward manner. Our method is based on a robust enzymatic (colorimetric) in situ hybridisation protocol and rapid data acquisition through wide-field microscopy. Data processing consists of image segmentation, profile extraction, and determination of expression domain boundary positions using a spline approximation. It results in sets of measured boundaries sorted by gene and developmental time point, which are analysed in terms of expression variability or spatio-temporal dynamics. Our method yields integrated time series of spatial gene expression, which can be used to reverse-engineer developmental gene regulatory networks across species. It is easily adaptable to other processes and species, enabling the in silico reconstitution of gene regulatory networks in a wide range of developmental contexts.

14.
Bi Zhao, Aqeela Erwin, Bin Xue. Genomics, 2018, 110(1): 67-73
Identifying differentially expressed genes is critical in microarray data analysis. Many methods have been developed by combining p-value, fold-change, and various statistical models to determine these genes. When using these methods, it is necessary to set up various pre-determined cutoff values. However, many of these cutoff values are somewhat arbitrary and may not have clear connections to biology. In this study, a genetic distance method based on gene expression level was developed to analyze eight sets of microarray data extracted from the GEO database. Since the genes used in the distance calculation have been ranked by fold-change, the genetic distance becomes more stable as more genes are added to the calculation, indicating that there is an optimal set of genes sufficient to characterize the stable difference between samples. This set consists of differentially expressed genes that represent both the genotypic and phenotypic differences between samples.

15.
Drosophila melanogaster is a leading model in population genetics and genomics, and a growing number of whole-genome data sets from natural populations of this species have been published over the last years. A major challenge is the integration of disparate data sets, often generated using different sequencing technologies and bioinformatic pipelines, which hampers our ability to address questions about the evolution of this species. Here we address these issues by developing a bioinformatics pipeline that maps pooled sequencing (Pool-Seq) reads from D. melanogaster to a hologenome consisting of fly and symbiont genomes and estimates allele frequencies using either a heuristic (PoolSNP) or a probabilistic variant caller (SNAPE-pooled). We use this pipeline to generate the largest data repository of genomic data available for D. melanogaster to date, encompassing 271 previously published and unpublished population samples from over 100 locations in >20 countries on four continents. Several of these locations have been sampled at different seasons across multiple years. This data set, which we call Drosophila Evolution over Space and Time (DEST), is coupled with sampling and environmental metadata. A web-based genome browser and web portal provide easy access to the SNP data set. We further provide guidelines on how to use Pool-Seq data for model-based demographic inference. Our aim is to provide this scalable platform as a community resource which can be easily extended via future efforts for an even more extensive cosmopolitan data set. Our resource will enable population geneticists to analyze spatiotemporal genetic patterns and evolutionary dynamics of D. melanogaster populations in unprecedented detail.

16.
Quantitative real-time PCR (qPCR)-based techniques have become essential for gene expression studies and high-throughput molecular characterization of transgenic events. Normalizing to a reference gene in relative quantification makes qPCR results more reliable than absolute quantification, but requires robust reference genes. Since an ideal reference gene should be species specific, no single internal control gene is universal for use as a reference gene across various plant developmental stages and diverse growth conditions. Here, we present validation studies of multiple stably expressed reference genes in cultivated peanut with minimal variations in temporal and spatial expression when subjected to various biotic and abiotic stresses. Stability in the expression of eight candidate reference genes, ADH3, ACT11, ATPsyn, CYP2, ELF1B, G6PD, LEC and UBC1, was compared in diverse peanut plant samples. The samples were categorized into distinct experimental sets to check the suitability of candidate genes for accurate and reliable normalization of gene expression using qPCR. Stability in expression of the reference genes in eight sets of samples was determined by the geNorm and NormFinder methods. While three candidate reference genes, ADH3, G6PD and ELF1B, were identified as stably expressed across experiments, LEC was observed to be the least stable and hence should be avoided for gene expression studies in peanut. Inclusion of the former two genes gave sufficiently reliable results; nonetheless, the addition of the third reference gene ELF1B may be potentially better in a diverse set of tissue samples of peanut.
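
For readers unfamiliar with geNorm, its stability measure M can be sketched as follows. This is a minimal illustration, not the authors' pipeline: for each candidate gene, M is the average, over all other candidates, of the standard deviation of the pairwise log2 expression ratios across samples, and a lower M indicates more stable expression. The gene labels and expression values below are hypothetical toy data, not results from the study.

```python
import math
import statistics


def genorm_m(expr):
    """Compute a geNorm-style stability value M for each candidate gene.

    expr maps a gene name to a list of relative expression quantities
    (linear scale, all > 0), one value per sample, same sample order.
    """
    genes = list(expr)
    m_values = {}
    for j in genes:
        pairwise_sd = []
        for k in genes:
            if k == j:
                continue
            # log2 ratio of gene j to gene k in every sample
            log_ratios = [math.log2(a / b) for a, b in zip(expr[j], expr[k])]
            # variation of that ratio across samples (sample standard deviation)
            pairwise_sd.append(statistics.stdev(log_ratios))
        m_values[j] = sum(pairwise_sd) / len(pairwise_sd)
    return m_values


# Hypothetical toy data: ref_a and ref_b keep a constant ratio, ref_c does not.
toy = {
    "ref_a": [1.0, 2.0, 4.0, 8.0],
    "ref_b": [2.0, 4.0, 8.0, 16.0],
    "ref_c": [1.0, 8.0, 2.0, 32.0],
}
m = genorm_m(toy)
least_stable = max(m, key=m.get)
```

In this toy set the gene whose ratio to the others fluctuates most gets the largest M. The full geNorm procedure additionally removes the least stable gene step by step and recomputes M, which is omitted here.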

17.
We describe the current status of the gene expression database CIBEX (Center for Information Biology gene EXpression database, http://cibex.nig.ac.jp), with a data retrieval system in compliance with MIAME, a standard that the MGED Society has developed for comparing data produced in microarray experiments at different laboratories worldwide. CIBEX serves as a public repository for a wide range of high-throughput experimental data in gene expression research, including microarray-based experiments measuring mRNA, serial analysis of gene expression (SAGE tags), and mass spectrometry proteomic data.

18.
An important and ongoing focus of biomedical and agricultural avian research is to understand gene function, which for a significant fraction of genes remains unknown. A first step is to determine when and where genes are expressed during development and in the adult. Whole mount in situ hybridization gives precise spatial and temporal resolution of gene expression throughout an embryo, and a comprehensive analysis and centralized repository of in situ hybridization information would provide a valuable research tool. The GEISHA project (gallus expression in situ hybridization analysis) was initiated to explore the utility of using high-throughput in situ hybridization as a means for gene discovery and annotation in chicken embryos, and to provide a unified repository for in situ hybridization information. This report describes the design and implementation of a new GEISHA database and user interface (www.geisha.arizona.edu), and illustrates its utility for researchers in the biomedical and poultry science communities. Results obtained from a high throughput screen of microRNA expression in chicken embryos are also presented.

19.
The MGOS (Magnaporthe grisea Oryza sativa) web-based database contains data from Oryza sativa and Magnaporthe grisea interaction experiments in which M. grisea is the fungal pathogen that causes the rice blast disease. In order to study the interactions, a consortium of fungal and rice geneticists was formed to construct a comprehensive set of experiments that would elucidate information about the gene expression of both rice and M. grisea during the infection cycle. These experiments included constructing and sequencing cDNA and robust long-serial analysis gene expression libraries from both host and pathogen during different stages of infection in both resistant and susceptible interactions, generating >50,000 M. grisea mutants and applying them to susceptible rice strains to test for pathogenicity, and constructing a dual O. sativa-M. grisea microarray. MGOS was developed as a central web-based repository for all the experimental data along with the rice and M. grisea genomic sequence. Community-based annotation is available for the M. grisea genes to aid in the study of the interactions.

20.
Purpose: The expression and clinical value of zinc finger protein 2 gene (ZIC2) in hepatocellular carcinoma (HCC) were analyzed by mining gene information from The Cancer Genome Atlas (TCGA) and Gene Expression Omnibus (GEO) databases. Methods: Gene chip data sets were retrieved from GEO and TCGA and screened for differentially expressed genes in HCC. Gene Expression Profiling Interactive Analysis (GEPIA) and Kaplan–Meier curves were used to analyze the relationship between differentially expressed genes (DEGs) and survival and prognosis in patients with HCC. Moreover, the Genecards database was used to extract ZIC2-related proteins and to analyze the physiological process of protein enrichment. Furthermore, the relationship between the ZIC2 gene and tumor cell immune invasion and that between immune cell infiltration and the 5-year survival rate were studied using the Tumor Immune Estimation Resource (TIMER) database. Results: Datasets from GEO and TCGA revealed that ZIC2 was differentially expressed in HCC tissues and normal tissues (P<0.05). High ZIC2 expression was associated with overall survival (OS) and progression-free survival in HCC patients. Overall, 25 ZIC2-related proteins, including Gli3, PRKDC, and rnf180, were identified, and protein enrichment analysis indicated these were associated with four types of cell components, six types of cell functions, and eight types of biological processes. ZIC2 was positively correlated with immune infiltration cells in patients with HCC, and higher expression of ZIC2 mRNA CD4+ T cells is associated with a better 5-year survival. Conclusion: The ZIC2 gene may be used as an immune response marker in liver cancer to predict the prognosis of HCC.
