Similar literature
1.
2.
With rising interest in the regulatory functions of long non-coding RNAs (lncRNAs) in complex human diseases such as cardiovascular diseases, there is an increasing need for public databases offering comprehensive and integrative data on all aspects of these versatile molecules. Recently, a variety of public data repositories specializing in lncRNAs have been developed, which make use of large volumes of high-throughput data, particularly from next-generation sequencing (NGS) approaches. Here, we provide an overview of current lncRNA databases covering basic and functional annotation, lncRNA expression and regulation, interactions with other biomolecules, and genomic variants influencing the structure and function of lncRNAs. The prominent lncRNA antisense noncoding RNA in the INK4 locus (ANRIL), which has been unequivocally associated with coronary artery disease through genome-wide association studies (GWAS), serves as an example to demonstrate the features of each individual database.

3.
Web Tools for Rice Transcriptome Analyses (cited 1 time: 0 self-citations, 1 by others)
Gene expression databases provide profiling data for the expression of thousands of genes to researchers worldwide. Oligonucleotide microarray technology is a useful tool that has been employed to produce gene expression profiles in most species. In rice, there are five genome-wide DNA microarray platforms: NSF 45K, BGI/Yale 60K, Affymetrix, Agilent Rice 44K, and NimbleGen 390K. Presently, more than 1,700 hybridizations of microarray gene expression data are available from public microarray repositories such as the NCBI Gene Expression Omnibus and ArrayExpress at EBI. Further processing or reformatting of public gene expression data is required for downstream applications or analyses. Web-based databases for expression meta-analyses are useful for guiding researchers in designing relevant research schemes. In this review, we summarize various databases for expression meta-analyses of rice genes and web tools for further applications, such as the development of co-expression networks or functional gene networks.

4.
Microarrays enable high-throughput parallel gene expression analysis, and their use has grown exponentially during the past decade. We are now in a position where individual experiments could benefit from using the swelling public data repositories to allow microarrays to progress from being a hypothesis-generating tool to a powerful resource that can be used to test hypotheses about biology. Comparative microarray analysis could better distinguish phenotypes from associated phenotypes; identify valid differentially expressed genes by combining many studies; test new hypotheses; and discover fundamental patterns of gene regulation. This review aims to describe the additional methodology needed for such comparative microarray analysis, and we identify and discuss a number of problems, such as loss of published data, lack of annotations, and variable array quality, which need to be solved before comparative microarray analysis can be used in a more systematic and powerful manner.

5.
Rapidly growing public gene expression databases contain a wealth of data for building an unprecedentedly detailed picture of human biology and disease. These data come from many diverse measurement platforms, which makes integrating them difficult. Although RNA-sequencing (RNA-seq) is attracting the most attention, at present the rate of new microarray studies submitted to public databases far exceeds the rate of new RNA-seq studies. There is clearly a need for methods that make it easier to combine data from different technologies. In this paper, we propose a new method for processing RNA-seq data that yields gene expression estimates that are much more similar to corresponding estimates from microarray data, hence greatly improving cross-platform comparability. The method, which we call PREBS, is based on estimating the expression from RNA-seq reads overlapping the microarray probe regions, and processing these estimates with standard microarray summarisation algorithms. Using paired microarray and RNA-seq samples from the TCGA LAML data set, we show that PREBS expression estimates derived from RNA-seq are more similar to microarray-based expression estimates than those from other RNA-seq processing methods. In an experiment to retrieve paired microarray samples from a database using an RNA-seq query sample, gene signatures defined based on PREBS expression estimates were found to be much more accurate than those from other methods. PREBS also allows new ways of using RNA-seq data, such as expression estimation for microarray probe sets. An implementation of the proposed method is available in the Bioconductor package “prebs.”
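The first step described in this abstract, counting RNA-seq reads that overlap microarray probe regions, can be illustrated with a minimal sketch. This is not the API of the "prebs" package: the function name, the tuple layout, and the half-open coordinate convention are all assumptions made for illustration, and the subsequent microarray summarisation step is not shown.

```python
def count_probe_overlaps(reads, probes):
    """Count the reads that overlap each probe region.

    reads:  iterable of (chrom, start, end) half-open intervals (assumed layout)
    probes: dict mapping a probe id to a (chrom, start, end) half-open interval
    """
    counts = {probe_id: 0 for probe_id in probes}
    for r_chrom, r_start, r_end in reads:
        for probe_id, (p_chrom, p_start, p_end) in probes.items():
            # Two half-open intervals overlap when each starts before the other ends.
            if r_chrom == p_chrom and r_start < p_end and p_start < r_end:
                counts[probe_id] += 1
    return counts


# Hypothetical toy data: three reads and two probe regions.
reads = [("chr1", 100, 150), ("chr1", 140, 190), ("chr2", 100, 150)]
probes = {"p1": ("chr1", 145, 170), "p2": ("chr2", 200, 225)}
counts = count_probe_overlaps(reads, probes)
```

The nested loop is quadratic and only suits small examples; a real pipeline would more likely use an interval index. The per-probe counts would then play the role of probe intensities in whichever summarisation algorithm is applied.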

6.
7.
A huge amount of important biomedical information is hidden in the bulk of research articles in biomedical fields. At the same time, the publication of databases of biological information and of experimental datasets generated by high-throughput methods is expanding rapidly, and a wealth of annotated gene databases and chemical, genomic (including microarray datasets), clinical and other types of data repositories are now available on the Web. Thus a current challenge of bioinformatics is to develop targeted methods and tools that integrate scientific literature, biological databases and experimental data to reduce the time needed for database curation and to give access to evidence, either in the literature or in the datasets, useful for the analysis at hand. Against this background, this article reviews the knowledge discovery systems that fuse information from the literature, gathered by text mining, with microarray data to enrich the lists of down- and upregulated genes with elements for biological understanding and to generate and validate new biological hypotheses. Finally, an easy-to-use and freely accessible tool, GeneWizard, which exploits text mining and microarray data fusion to support researchers in discovering gene-disease relationships, is described.

8.
9.
An initiative to increase biopharmaceutical research productivity by capturing, sharing and computationally integrating proprietary scientific discoveries with public knowledge is described. This initiative involves both organisational process change and multiple interoperating software systems. The software components rely on mutually supporting integration techniques. These include a richly structured ontology, statistical analysis of experimental data against stored conclusions, natural language processing of public literature, secure document repositories with lightweight metadata, web services integration, enterprise web portals and relational databases. This approach has already begun to increase scientific productivity in our enterprise by creating an organisational memory (OM) of internal research findings, accessible on the web. By bringing these components together, it has also been possible to construct a very large and expanding repository of biological pathway information linked to this repository of findings, which is extremely useful in the analysis of DNA microarray data. This repository, in turn, enables our research paradigm to be shifted towards more comprehensive systems-based understandings of drug action.

10.
Mead JA, Shadforth IP, Bessant C. Proteomics, 2007, 7(16): 2769-2786
As proteomic MS has increased in throughput, so has the demand to catalogue the increasing number of peptides and proteins observed by the community using this technique. As in other 'omics' fields, this brings obvious scientific benefits such as sharing of results and prevention of unnecessary repetition, but also provides technical insights, such as the ability to compare proteome coverage between different laboratories, or between different proteomic platforms. Journals are also moving towards mandating that proteomics data be submitted to public repositories upon publication. In response to these demands, several web-based repositories have been established to store protein and peptide identifications derived from MS data, and a similar number of peptide identification software pipelines have emerged to deliver identifications to these repositories. This paper reviews the latest developments in public domain peptide and protein identification databases and describes the analysis pipelines that feed them. Recent applications of the tools to pertinent biological problems are examined, and through comparing and contrasting the capabilities of each system, the issues facing research users of web-based repositories are explored. Future developments and mechanisms to enhance system functionality and user-interfacing opportunities are also suggested.

11.
12.
An enormous amount of microarray data has been collected and accumulated in public repositories. Although some of the depositions include both raw and processed data, a significant proportion include processed data only. If multiple datasets are to be combined for a specific purpose, the data should be adjusted prior to use to remove bias between the datasets. We focused on the GeneChip platform and one pre-processing method, RMA, and examined simple quantile correction as the post-processing method for integration. Integration of the data pre-processed by RMA was evaluated using artificial spike-in datasets and real microarray datasets of atopic dermatitis and lung cancer. Studies using the spike-in datasets show that quantile correction for data integration reduces data quality to some extent, but to an acceptable level. Studies using the real datasets show that quantile correction significantly reduces the bias. These results show that quantile correction is useful for integrating multiple datasets processed by RMA, and they encourage effective use of public microarray data.
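The abstract does not spell out its quantile correction procedure, so the following is only a sketch of ordinary quantile normalization, given as one plausible reading. It assumes every sample is a list of already pre-processed expression values in the same probe-set order; the function name and the tie-breaking rule (ties broken by position) are assumptions.

```python
from statistics import mean


def quantile_correct(samples):
    """Quantile-normalize a list of equal-length expression vectors.

    Each inner list holds one sample's expression values in the same
    probe-set order. Returns new lists that all share one distribution.
    """
    n = len(samples[0])
    if any(len(s) != n for s in samples):
        raise ValueError("all samples must cover the same probe sets")

    # Reference distribution: mean of the k-th smallest value across samples.
    sorted_samples = [sorted(s) for s in samples]
    reference = [mean(s[k] for s in sorted_samples) for k in range(n)]

    corrected = []
    for s in samples:
        # Indices of this sample's values in ascending order (stable for ties).
        order = sorted(range(n), key=lambda i: s[i])
        out = [0.0] * n
        for rank, idx in enumerate(order):
            out[idx] = reference[rank]
        corrected.append(out)
    return corrected


# Hypothetical toy input: two samples measured on three probe sets.
corrected = quantile_correct([[5.0, 2.0, 3.0], [4.0, 1.0, 6.0]])
```

After correction every sample contains the same set of values, so between-dataset differences in overall distribution are removed while the within-sample ranking of probe sets is preserved.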

13.
14.
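The rapid development of high-throughput microarray hybridization and sequencing technologies has generated large volumes of gene data, and biological information has swelled into an ocean of data. To accommodate the continuing growth of high-throughput gene expression data and the need to share it, a variety of databases have emerged. Among them, the Gene Expression Omnibus (GEO) of the National Center for Biotechnology Information (NCBI) is the world's largest public database for storing high-throughput molecular abundance data; users can submit, store, and retrieve many forms of data and use them free of charge. To date, GEO holds data for 300,000 samples, comprising 1.6 billion gene expression abundance measurements and covering more than 500 organisms across a broad range of biological topics. Its ease of use, comprehensive data, and free sharing make GEO a good platform for subsequent data mining and information dissemination. This article outlines the structure of the GEO database, data submission and retrieval, and its application prospects in molecular biology. The GEO database is available at http://www.ncbi.nlm.nih.gov/geo.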

15.
The upcoming availability of public microarray repositories and of large compendia of gene expression information opens up a new realm of possibilities for microarray data analysis. An essential challenge is the efficient integration of microarray data generated by different research groups on different array platforms. This review focuses on the problems associated with this integration, which are: (1) the efficient access to and exchange of microarray data; (2) the validation and comparison of data from different platforms (cDNA and short and long oligonucleotides); and (3) the integrated statistical analysis of multiple data sets.

16.
The current study investigates existing infrastructure, its technical solutions and implemented standards for data repositories related to integrative biodiversity research. The storage and reuse of complex biodiversity data in central databases are becoming increasingly important, particularly in attempts to cope with the impacts of environmental change on biodiversity and ecosystems. From the data side, the main challenge for biodiversity repositories is to deal with the highly interdisciplinary and heterogeneous character of standardized and unstandardized data and metadata covering information from genes to ecosystems. Furthermore, technical improvements in data acquisition produce ever larger data volumes, which represent a challenge for database structure and proper data exchange. The current study is based on comprehensive in-depth interviews and an online survey addressing IT specialists involved in database development and operation. The results show that metadata are already well established, but that non-metadata are still largely unstandardized across various scientific communities. For example, only a third of all repositories in our investigation use internationally unified semantic standard checklists for taxonomy. The study also showed that database developers are mostly occupied with implementing state-of-the-art technology and solving operational problems, leaving no time to implement users' requirements. One of the main reasons for this unsatisfactory situation is the insufficient and unreliable funding of most repositories, as reflected by the very small number of permanent IT staff members. We conclude that a sustainable data management system that fosters the future use and reuse of these valuable data resources requires the development of fewer, but more permanent, data repositories using commonly accepted standards for their long-term data. This can only be accomplished through the consolidation of hitherto widely scattered small and non-permanent repositories.

17.

Background  

One of the important challenges in microarray analysis is to take full advantage of previously accumulated data, both from one's own laboratory and from public repositories. Through a comparative analysis on a variety of datasets, a more comprehensive view of the underlying mechanism or structure can be obtained. However, as we discover in this work, continual changes in genomic sequence annotations and probe design criteria make it difficult to compare gene expression data even from different generations of the same microarray platform.

18.
Large volumes of genomic data have been generated for several plant species over the past decade, including structural sequence data and functional annotation at the genome level. Various technologies such as expressed sequence tags (ESTs), massively parallel signature sequencing (MPSS) and microarrays have been used to study gene expression and to provide functional data for many genes simultaneously. This review focuses on recent advances in the application of microarrays in plant genomic research and in gene expression databases available for plants. Large sets of Arabidopsis microarray data are publicly available. Recently developed array platforms are currently being used to generate genome-wide expression profiles for several crop species. Coupled to these platforms are public databases that provide access to these large-scale expression data, which can be used to aid the discovery of gene function.

19.
20.

Background

In the last decade, a large amount of microarray gene expression data has been accumulated in public repositories. Integrating and analyzing high-throughput gene expression data have become key activities for exploring gene functions, gene networks and biological pathways. Effectively utilizing these invaluable microarray data remains challenging due to a lack of powerful tools to integrate large-scale gene-expression information across diverse experiments and to search and visualize a large number of gene-expression data points.

Results

Gene Expression Browser is a microarray data integration, management and processing system with web-based search and visualization functions. An innovative method has been developed to define a treatment over a control for every microarray experiment, which standardizes microarray data from different experiments and makes them homogeneous. In the browser, data are pre-processed offline and the resulting data points are visualized online with a 2-layer dynamic web display. Users can view all treatments over control that affect the expression of a selected gene via Gene View, and view all genes that change in a selected treatment over control via Treatment over Control View. Users can also check the changes in expression profiles of a set of either treatments over control or genes via Slide View. In addition, the relationships between genes and treatments over control are computed according to gene expression ratio and are shown as co-responsive genes and co-regulation treatments over control.
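The abstract does not give the formula behind its "treatment over control" ratio. As a hedged illustration only, the sketch below assumes the ratio is the log2 of mean treatment expression over mean control expression for each gene; the function name and the input layout are hypothetical and are not taken from Gene Expression Browser.

```python
from math import log2
from statistics import mean


def log_ratios(treatment, control):
    """Map each gene to log2(mean treatment / mean control).

    treatment, control: dicts mapping a gene id to a list of positive
    expression values on the linear scale (assumed layout).
    """
    return {
        gene: log2(mean(treatment[gene]) / mean(control[gene]))
        for gene in treatment
    }


# Hypothetical toy data: g1 is 4-fold up, g2 is 4-fold down.
ratios = log_ratios(
    {"g1": [8.0, 8.0], "g2": [1.0, 1.0]},
    {"g1": [2.0, 2.0], "g2": [4.0, 4.0]},
)
```

Expressing every experiment as a ratio against its own control is one way to make values from different experiments comparable, which is the standardization role the abstract assigns to its treatment over control definition.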

Conclusion

Gene Expression Browser is composed of a set of software tools, including a data extraction tool, a microarray data-management system, a data-annotation tool, a microarray data-processing pipeline, and a data search & visualization tool. The browser is deployed as a free public web service (http://www.ExpressionBrowser.com) that integrates 301 ATH1 gene microarray experiments from public data repositories (viz. the Gene Expression Omnibus repository at the National Center for Biotechnology Information and Nottingham Arabidopsis Stock Center). The set of Gene Expression Browser software tools can be easily applied to the large-scale expression data generated by other platforms and in other species.
