1.
2.
3.
4.
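Multi-omics data on life and health are an essential foundation for life science research and the development of biomedical technology. China, however, has lacked a platform for managing and sharing biological data, which not only fails to meet the growing research needs of biomedicine and related disciplines in China but also seriously constrains the integration, sharing, and translational use of the country's biological big data. In response, the Beijing Institute of Genomics, Chinese Academy of Sciences, established the BIG Data Center (BIGD) in early 2016 to build a biological big data management platform and a multi-omics data resource system centered on national population health and important strategic biological resources. This article focuses on BIGD's life and health big data resource system, which mainly comprises a raw omics data archive, a genome database, a genome variation database, a gene expression database, a methylation database, a bioinformatics tool library, and a life science wiki knowledge base. These resources provide submission, integration, and sharing services for biological big data and lay an important foundation for advancing life science data management in China and for building a national bioinformatics center.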
5.
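Sequence information for Chaetomium globosum ESTs and contigs was annotated using standardized Gene Ontology (GO) terms together with BLAST analysis results, and the GO semantic model was used to build semantic links between databases of different species. On this basis, an EST bioinformatics analysis database for C. globosum was established. It effectively addresses the integration of biological information from different species at the level of concepts and relationships, and supports intelligent multiple, compound, and cross retrieval of C. globosum bioinformatics data, laying a solid foundation for further bioinformatics research on this fungus. The paper describes in detail the research background, construction process, query functions, and maintenance of the GO-based C. globosum EST bioinformatics database.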
6.
7.
8.
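Establishment of a proteome map database    Total citations: 12, self-citations: 0, citations by others: 12
The Bioinformation Center and the Research Center for Proteomics of the Shanghai Institutes for Biological Sciences, Chinese Academy of Sciences, will release China's first proteome map database. The database has a three-layer architecture consisting of a physical layer, a link layer, and an interaction layer. It was developed entirely in Java and is fully platform independent. Users can conveniently browse the maps in the database and can use the search tools it provides to obtain detailed information on proteins of interest.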
9.
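Phyloinformatics is a discipline that has taken shape in recent years and is a new growth point in systematics research. It is an interdisciplinary field concerned with storing, managing, annotating, developing, and processing phylogenetic trees and their associated biological information. Its methods rely on computer and network technology and include building large databases of phylogenetic trees and related biological data, designing networks of tree databases, visualizing trees, combining small trees and constructing supertrees, and supporting user queries, searches, and downloads. The ultimate goal is to build a database containing the phylogenetic tree of all organisms on Earth together with their associated information, to place each organism precisely on the tree, and then, by querying, searching, combining, and analyzing phylogenetic information, to gain knowledge about the evolution of life and make biological predictions. Currently available online phylogenetic resources are mainly CIPRes and the Phylogeny Programs website; established phyloinformatics databases include TreeBASE, Tree of Life, Species 2000, and the NCBI Taxonomy database.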
10.
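Baseline biodiversity information is the foundation of biodiversity assessment. Existing biodiversity data are scattered, held by different departments, and poorly integrated. In addition, because spatial information on species distribution is lacking, biodiversity data can hardly meet the needs of biodiversity assessment within environmental impact assessment, which is generally limited to simple qualitative analysis. To enable more in-depth biodiversity assessment and to protect biodiversity resources effectively, Sichuan Province (with a focus on Ganzi Prefecture) was taken as a case study. Existing biodiversity data resources were integrated, mainly including biodiversity data for nature reserves in Sichuan Province, aquatic and terrestrial biodiversity data for Ganzi Prefecture, data on environmentally sensitive areas in Sichuan Province, and data from field biodiversity surveys. A basic biodiversity database for Sichuan Province was established and, through database and web development, a database information system was built that supports query and retrieval as well as data display for both the spatial and the attribute biodiversity databases.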
11.
Paquola AC Nishyiama MY Reis EM da Silva AM Verjovski-Almeida S 《Bioinformatics (Oxford, England)》2003,19(12):1587-1588
ESTWeb is an internet-based software package designed for uniform data processing and storage for large-scale EST sequencing projects. The package provides for: (a) reception of sequencing chromatograms; (b) sequence processing such as base-calling, vector screening, and comparison with public databases; (c) storage of data and analysis in a relational database; (d) generation of a graphical report of individual sequence quality; and (e) issuing of reports with statistics of productivity and redundancy. The software facilitates real-time monitoring and evaluation of EST sequence acquisition progress throughout an EST sequencing project.
12.
Love CG Batley J Lim G Robinson AJ Savage D Singh D Spangenberg GC Edwards D 《Comparative and Functional Genomics》2004,5(3):276-280
With the increasing quantities of Brassica genomic data being entered into the public domain and in preparation for the complete Brassica genome sequencing effort, there is a growing requirement for the structuring and detailed bioinformatic analysis of Brassica genomic information within a user-friendly database. At the Plant Biotechnology Centre, Melbourne, Australia, we have developed a series of tools and computational pipelines to assist in the processing and structuring of genomic data, to aid its application to agricultural biotechnology research. These tools include a sequence database, ASTRA, a sequence processing pipeline incorporating annotation against GenBank, SwissProt and Arabidopsis Gene Ontology (GO) data and tools for molecular marker discovery and comparative genome analysis. All sequences are mined for simple sequence repeat (SSR) molecular markers using 'SSR primer' and mapped onto the complete Arabidopsis thaliana genome by sequence comparison. The database may be queried using a text-based search of sequence annotation or GO terms, BLAST comparison against resident sequences, or by the position of candidate orthologues within the Arabidopsis genome. Tools have also been developed and applied to the discovery of single nucleotide polymorphism (SNP) molecular markers and the in silico mapping of Brassica BAC end sequences onto the Arabidopsis genome. Planned extensions to this resource include the integration of gene expression data and the development of an EnsEMBL-based genome viewer.
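As an illustration of what mining sequences for simple sequence repeats can involve, here is a minimal regular-expression sketch. It is not the 'SSR primer' tool named in the abstract and makes no claim about how that tool works; the function name `find_ssrs`, the motif length range (2 to 6 bases), and the default repeat threshold are assumptions chosen for the example.

```python
import re

def find_ssrs(sequence, min_repeats=4):
    """Return (start, motif, repeat_count) for tandem repeats in a DNA string.

    Hypothetical helper: looks for motifs of 2-6 bases repeated at least
    `min_repeats` times in a row. Real SSR finders apply further filters
    (for example, motif-specific thresholds), which are omitted here.
    """
    sequence = sequence.upper()
    # Group 1 is the candidate motif; the backreference requires it to
    # repeat at least (min_repeats - 1) more times immediately afterwards.
    pattern = re.compile(r"([ACGT]{2,6}?)\1{%d,}" % (min_repeats - 1))
    hits = []
    for match in pattern.finditer(sequence):
        motif = match.group(1)
        repeats = len(match.group(0)) // len(motif)
        hits.append((match.start(), motif, repeats))
    return hits
```

Calling `find_ssrs` on a sequence returns zero-based start positions, so the coordinates can be used directly as Python slice indices.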
13.
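Design and implementation of a sample-set database for protein secondary structure prediction    Total citations: 1, self-citations: 0, citations by others: 1
Database technology was applied to the processing and analysis of sample sets for protein secondary structure prediction, and a sample-set database for secondary structure prediction was established. The construction of the database is described using the CB513 sample set as an example. Building a sample database not only makes it easier to store, manage, and retrieve data, but also allows some simple sequence analyses to be carried out, replacing much of the programming that was previously required. This greatly improves efficiency and reduces errors.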
14.
A novel database and a modified alignment program are described which provide a fast and accurate procedure for assigning nucleotide sequences to allele types for multi-locus sequence analysis (MLSA). The database has between 40 and 160 alleles per organism, including Neisseria meningitidis, Streptococcus pneumoniae, Staphylococcus aureus and Haemophilus influenzae. The database directly compares the query nucleotide sequence against all alleles within the database, and this system reduces the time taken for the analysis of nucleotide sequence data and the assignment of alleles for subsequent sequence analysis.
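The direct comparison of a query against every stored allele can be sketched as follows. This is an illustrative simplification that uses exact string matching rather than the modified alignment program the abstract mentions; the allele table, its sequences, and the function names are hypothetical.

```python
from typing import Dict, Optional

# Hypothetical allele table: locus name -> {allele number -> sequence}.
# The sequences are invented for illustration and are not real alleles.
ALLELES: Dict[str, Dict[int, str]] = {
    "abcZ": {1: "ATGCCGTA", 2: "ATGCCGTT"},
    "adk": {1: "GGCATTAC", 2: "GGCATTAA"},
}

def assign_allele(locus: str, query: str) -> Optional[int]:
    """Compare the query against every allele of the locus.

    Returns the allele number on an exact match, otherwise None.
    """
    query = query.strip().upper()
    for number, allele_sequence in ALLELES.get(locus, {}).items():
        if allele_sequence == query:
            return number
    return None

def allelic_profile(sample: Dict[str, str]) -> Dict[str, Optional[int]]:
    """Assign an allele number to each locus sequence of one isolate."""
    return {locus: assign_allele(locus, seq) for locus, seq in sample.items()}
```

A `None` in the returned profile marks a sequence with no exact match, which in practice would be a candidate new allele or a sequencing error to inspect.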
15.
16.
Cordonnier-Pratt MM Liang C Wang H Kolychev DS Sun F Freeman R Sullivan R Pratt LH 《Comparative and Functional Genomics》2004,5(3):268-275
The rapidly increasing rate at which biological data is being produced requires a corresponding growth in relational databases and associated tools that can help laboratories contend with that data. With this need in mind, we describe here a Modular Approach to a Genomic, Integrated and Comprehensive (MAGIC) Database. This Oracle 9i database derives from an initial focus in our laboratory on gene discovery via production and analysis of expressed sequence tags (ESTs), and subsequently on gene expression as assessed by both EST clustering and microarrays. The MAGIC Gene Discovery portion of the database focuses on information derived from DNA sequences and on its biological relevance. In addition to MAGIC SEQ-LIMS, which is designed to support activities in the laboratory, it contains several additional subschemas. The latter include MAGIC Admin for database administration, MAGIC Sequence for sequence processing as well as sequence and clone attributes, MAGIC Cluster for the results of EST clustering, MAGIC Polymorphism in support of microsatellite and single-nucleotide-polymorphism discovery, and MAGIC Annotation for electronic annotation by BLAST and BLAT. The MAGIC Microarray portion is a MIAME-compliant database with two components at present. These are MAGIC Array-LIMS, which makes possible remote entry of all information into the database, and MAGIC Array Analysis, which provides data mining and visualization. Because all aspects of interaction with the MAGIC Database are via a web browser, it is ideally suited not only for individual research laboratories but also for core facilities that serve clients at any distance.
17.
Varani AM Monteiro-Vitorello CB de Almeida LG Souza RC Cunha OL Lima WC Civerolo E Van Sluys MA Vasconcelos AT 《Genetics and molecular biology》2012,35(1):149-152
The Xylella fastidiosa comparative genomic database is a scientific resource that aims to provide a user-friendly interface for accessing high-quality, manually curated genomic annotation and comparative sequence analysis, as well as for identifying and mapping prophage-like elements, a marked feature of Xylella genomes. Here we describe a database and tools for exploring the biology of this important plant pathogen. The hallmarks of this database are the high-quality genomic annotation, the functional and comparative genomic analysis, and the identification and mapping of prophage-like elements. It is available at http://www.xylella.lncc.br.
18.
SUMMARY: Microarray data management and processing (MAD) is a set of integrated Windows software for microarray analysis. It consists of a relational database for data storage with many user interfaces for data manipulation, several text file parsers and Microsoft Excel macros for automating data processing, and a generator that produces text files ready for cluster analysis. AVAILABILITY: The executable is available free of charge at http://pompous.swmed.edu. The source code is also available upon request.
19.
Manas Ranjan Dikhit Sindhu Prava Rana Pradeep Das Ganesh Chandra Sahoo 《Bioinformation》2009,3(7):299-302
Databases containing proteomic information have become indispensable for virology studies. As the gap between the amount of sequence information and functional characterization widens, increasing efforts are being directed to the development of databases. For virologists, it is therefore desirable to have a single data collection point which integrates research-related data from different domains. CHPVDB is our effort to provide virologists with such a one-step information center. We describe herein the creation of CHPVDB, a new database that integrates information on different proteins into a single resource. For basic curation of protein information, the database relies on features from other selected databases, servers and published reports. The database helps relate molecular analysis, cleavage sites, and the possible protein functional families assigned to different proteins of Chandipura virus (CHPV) by SVMProt and related tools.
20.
A strategy has been developed for the construction of a validated, comprehensive composite protein sequence database. Entries are amalgamated from primary source databases by a largely automated set of processes in which redundant and trivially different entries are eliminated. A modular approach has been adopted to allow scientific judgement to be used at each stage of database processing and amalgamation. Source databases are assigned a priority depending on the quality of sequence validation and commenting. Rejection of entries from the lower priority database, in each pairwise comparison of databases, is carried out according to optionally defined redundancy criteria based on sequence segment mismatches. Efficient algorithms for this methodology are embodied in the COMPO software system. COMPO has been applied for over 2 years in construction and regular updating of the OWL composite protein sequence database from the source databases NBRF-PIR, SWISS-PROT, a GenBank translation retrieved from the feature tables, NBRF-NEW, NEWAT86, PSD-KYOTO and the sequences contained in the Brookhaven protein structure databank. OWL is part of the ISIS integrated data resource of protein sequence and structure [Akrigg et al. (1988) Nature, 335, 745-746]. The modular nature of the integration process greatly facilitates the frequent updating of OWL following releases of the source databases. The extent of redundancy in these sources is revealed by the comparison process. The advantages of a robust composite database for sequence similarity searching and information retrieval are discussed.
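The priority-based amalgamation described above can be sketched roughly as follows. This is not the COMPO algorithm: the redundancy criterion here (equal length and at most `max_mismatches` differing positions over the whole sequence) is an assumed stand-in for the segment-mismatch criteria the abstract mentions, and all function names and example data are hypothetical.

```python
def mismatches(a, b):
    """Count differing positions between two equal-length sequences."""
    return sum(x != y for x, y in zip(a, b))

def is_redundant(candidate, kept, max_mismatches):
    """True if the candidate is identical or trivially different from a kept entry."""
    return any(
        len(candidate) == len(seq) and mismatches(candidate, seq) <= max_mismatches
        for _, seq in kept
    )

def build_composite(sources, max_mismatches=0):
    """Merge source databases given in descending priority order.

    `sources` is a list of (source_name, [sequences]). An entry is rejected
    when a matching entry has already been kept, so the higher-priority
    source wins; duplicates inside one source are dropped as well.
    """
    composite = []
    for name, sequences in sources:
        for seq in sequences:
            if not is_redundant(seq, composite, max_mismatches):
                composite.append((name, seq))
    return composite
```

Keeping the source name with each retained sequence makes it possible to count how many entries each lower-priority source lost, which is one way to see the extent of redundancy between sources.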