Similar articles
Found 20 similar articles; search took 93 ms.
1.
Our team developed a metadata editing and management system employing state-of-the-art XML technologies, initially aimed at the environmental sciences but with the potential to be useful across multiple domains. We chose a modular and distributed design for scalability, flexibility, options for customization, and the possibility of adding more functionality at a later stage. The system consists of a desktop design tool that generates code for the actual online editor, a native XML database, and an online user access management application. The design tool, a Java Swing application that reads an XML schema, provides the designer with options to combine input fields into online forms with user-friendly tags and to determine the flow of input forms. Based on design decisions, the tool generates XForms code for the online metadata editor, which is based on the Orbeon XForms engine. The design tool fulfills two requirements: first, data entry forms based on a schema are customized at design time, and second, the tool can generate data entry applications for any valid XML schema without relying on custom information in the schema. A configuration file in the design tool saves custom information generated at design time. Future developments will add functionality to the design tool to integrate help text, tool tips, project-specific keyword lists, and thesaurus services. Cascading style sheets customize the look and feel of the finished editor. The editor produces XML files in compliance with the original schema; however, a user may save the input into a native XML database at any time, independent of validity. The system uses the open source XML database eXist for storage, and a MySQL relational database and a simple JavaServer Faces user interface for file and access management. We chose three levels to distribute administrative responsibilities and to handle the common situation of an information manager entering the bulk of the metadata but leaving specifics to the actual data provider.
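
The idea of deriving form fields from a schema while keeping custom labels outside it can be sketched as follows. This is a minimal illustration, not the system's actual design tool (which is a Java Swing application generating XForms code); the schema, the `LABELS` mapping, and the `form_fields` helper are all hypothetical.

```python
import xml.etree.ElementTree as ET

XS = "{http://www.w3.org/2001/XMLSchema}"

# Hypothetical schema, invented for illustration only.
SCHEMA = """
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="dataset">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="title" type="xs:string"/>
        <xs:element name="creator" type="xs:string"/>
        <xs:element name="pubDate" type="xs:date" minOccurs="0"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>
"""

# Design-time customization kept outside the schema (hypothetical labels).
LABELS = {"title": "Dataset title", "pubDate": "Publication date"}


def form_fields(schema_text, labels):
    """List typed element declarations as candidate form fields."""
    root = ET.fromstring(schema_text)
    fields = []
    for el in root.iter(XS + "element"):
        xsd_type = el.get("type")
        if xsd_type is None:
            continue  # skip container elements that declare an inline complex type
        name = el.get("name")
        fields.append({
            "name": name,
            "label": labels.get(name, name),  # fall back to the schema name
            "type": xsd_type,
            "required": el.get("minOccurs", "1") != "0",
        })
    return fields


FIELDS = form_fields(SCHEMA, LABELS)
for field in FIELDS:
    print(field)
```

Because the labels live in a separate mapping, the same function works on any schema with no custom annotations, which mirrors the second requirement stated in the abstract.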

2.
The Biology of Addictive Diseases-Database (BiolAD-DB) system is a research bioinformatics system for archiving, analyzing, and processing complex clinical and genetic data. The database schema employs design principles for handling complex clinical information, such as response items in genetic questionnaires. Data access and validation are provided by the BiolAD-DB client application, which features a data validation engine tightly coupled to a graphical user interface. Data integrity is provided by the password-protected, SQL-compliant BiolAD-DB server and database. BiolAD-DB tools further provide functionality for generating customized reports and views. The BiolAD-DB system schema, client, and installation instructions are freely available at http://www.rockefeller.edu/biolad-db/.

3.
The MiSink Plugin converts Cytoscape, an open-source bioinformatics platform for network visualization, to a graphical interface for the database of interacting proteins (DIP: http://dip.doe-mbi.ucla.edu). Seamless integration is possible by providing bi-directional communication between Cytoscape and any Web site supplying data in XML or tab-delimited format. Availability: MiSink is freely available for download at http://dip.doe-mbi.ucla.edu/Software.cgi.

4.
Storing biological sequence databases in relational form   Total citations: 2, self-citations: 0, citations by others: 2
SUMMARY: We have created a set of applications using Perl and Java in combination with XML technology to install biological sequence databases into an Oracle RDBMS. An easy-to-use interface using Java has been created for database query, and other tools have been developed to integrate with our in-house bioinformatics applications. AVAILABILITY: The database schema, DTD file, and source code are available from the authors via email. CONTACT: guochun_xie@merck.com

5.
Pedro is a Java application that dynamically generates data entry forms for data models expressed in XML Schema, producing XML data files that validate against this schema. The software uses an intuitive tree-based navigation system, can supply context-sensitive help to users, and features a sophisticated interface for populating data fields with terms from controlled vocabularies. The software can also import records from tab-delimited text files and features various validation routines. AVAILABILITY: The application, source code, example models from several domains, and tutorials can be downloaded from http://pedro.man.ac.uk/.

6.
MOTIVATION: Biological data come in very different shapes. Databanks are maintained and used by distinct organizations. Text is the de facto standard exchange format. The SRS system can integrate heterogeneous textual databanks, but it lacked a way to structure the extracted data. RESULTS: This paper presents a CORBA interface to the SRS system, which manages databanks in a flat file format. SRS Object Servers are CORBA wrappers for SRS. They allow client applications (visualisation tools, data mining tools, etc.) to access and query SRS servers remotely through an Object Request Broker (ORB). They provide loader objects that contain the information extracted from the databanks by SRS. Loader objects are not hard-coded but generated in a flexible way by using loader specifications, which allow SRS administrators to package data coming from distinct databanks. AVAILABILITY: The prototype may be available for beta-testing. Please contact the SRS group (http://srs.ebi.ac.uk).

7.
8.
hmChIP is a database of genome-wide chromatin immunoprecipitation (ChIP) data in human and mouse. Currently, the database contains 2016 samples from 492 ChIP-seq and ChIP-chip experiments, representing a total of 170 proteins and 11 069 914 protein-DNA interactions. A web server provides an interface for database query. Protein-DNA binding intensities can be retrieved from individual samples for user-provided genomic regions. The retrieved intensities can be used to cluster samples and genomic regions to facilitate exploration of combinatorial patterns, cell-type dependencies, and cross-sample variability of protein-DNA interactions. AVAILABILITY: http://jilab.biostat.jhsph.edu/database/cgi-bin/hmChIP.pl.

9.
The HuGeMap database stores the major genetic and physical maps of the human genome. HuGeMap is accessible on the Web at http://www.infobiogen.fr/services/Hugemap and through a CORBA server. A standard genome map data format for the interconnection of genome map databases was defined in collaboration with the EBI. The HuGeMap CORBA server provides this interconnection using the interface definition language IDL. Two graphical user interfaces were developed for the visualization of the HuGeMap data: ZoomMap (http://www.infobiogen.fr/services/zomit/ZoomMap.html) for navigation by zooming and data transformation via magic lenses, and MappetShow (http://www.infobiogen.fr/services/Mappet) for visualizing and comparing maps.

10.
SUMMARY: Class-responsibility-collaboration (CRC) cards have been used extensively in the software industry for defining complex object-oriented software requirements. We have adapted this tool to capture information about biological components, collaborators, and responsibilities within these collaborations, which is not captured by current annotation tools. CRC cards should provide a common ground that will facilitate communication between biologists and computer scientists. AVAILABILITY: A CRC card template, XML representation, and XML schema are freely available at http://people.musc.edu/~zhengw/CRCCard/CRC_Card_Index.html. SUPPLEMENTARY INFORMATION: Supplemental Figures 1-4.

11.
12.
A motif is a short DNA or protein sequence that contributes to the biological function of the sequence in which it resides. Over the past several decades, many computational methods have been described for identifying, characterizing and searching with sequence motifs. Critical to nearly any motif-based sequence analysis pipeline is the ability to scan a sequence database for occurrences of a given motif described by a position-specific frequency matrix. RESULTS: We describe Find Individual Motif Occurrences (FIMO), a software tool for scanning DNA or protein sequences with motifs described as position-specific scoring matrices. The program computes a log-likelihood ratio score for each position in a given sequence database, uses established dynamic programming methods to convert this score to a P-value and then applies false discovery rate analysis to estimate a q-value for each position in the given sequence. FIMO provides output in a variety of formats, including HTML, XML and several Santa Cruz Genome Browser formats. The program is efficient, allowing for the scanning of DNA sequences at a rate of 3.5 Mb/s on a single CPU. Availability and Implementation: FIMO is part of the MEME Suite software toolkit. A web server and source code are available at http://meme.sdsc.edu.
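
The per-position log-likelihood ratio score mentioned in this abstract can be illustrated with a small sketch. It is not FIMO's implementation: the motif matrix, the uniform background, and the `scan` function are assumptions made for illustration, and the P-value and q-value steps are omitted.

```python
import math

# Hypothetical 3-column motif: base probabilities per position (each column sums to 1).
MOTIF = [
    {"A": 0.7, "C": 0.1, "G": 0.1, "T": 0.1},
    {"A": 0.1, "C": 0.1, "G": 0.7, "T": 0.1},
    {"A": 0.1, "C": 0.7, "G": 0.1, "T": 0.1},
]
# Assumed uniform background model.
BACKGROUND = {"A": 0.25, "C": 0.25, "G": 0.25, "T": 0.25}


def scan(sequence, motif, background):
    """Return (start, score) per window; score is the log2 likelihood ratio
    of the window under the motif model versus the background model."""
    width = len(motif)
    hits = []
    for start in range(len(sequence) - width + 1):
        window = sequence[start:start + width]
        score = sum(
            math.log2(column[base] / background[base])
            for column, base in zip(motif, window)
        )
        hits.append((start, score))
    return hits


HITS = scan("TTAGCAT", MOTIF, BACKGROUND)
BEST = max(HITS, key=lambda hit: hit[1])
print(BEST)  # the window "AGC" at start 2 scores highest
```

A positive score means the window is more likely under the motif than under the background; converting such scores into P-values, as the abstract describes, requires the score distribution and is outside this sketch.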

13.
The effectiveness of any proteomics database search depends on the theoretical candidate information contained in the protein database. Unfortunately, candidate entries from protein databases such as UniProt rarely contain all the post-translational modifications (PTMs), disulfide bonds, or endogenous cleavages of interest to researchers. These omissions can limit discovery of novel and biologically important proteoforms. Conversely, searching for a specific proteoform becomes a computationally difficult task for heavily modified proteins. Both situations require updates to the database through user-annotated entries. Unfortunately, manually creating properly formatted UniProt Extensible Markup Language (XML) files is tedious and prone to errors. ProSight Annotator solves these issues by providing a graphical interface for adding user-defined features to UniProt-formatted XML files for better informed proteoform searches. It can be downloaded from http://prosightannotator.northwestern.edu.

14.
MOTIVATION: The human genome project and the development of new high-throughput technologies have created unparalleled opportunities to study the mechanisms of diseases, monitor disease progression, and evaluate effective therapies. Gene expression profiling is a critical tool to accomplish these goals. The use of nucleic acid microarrays to assess the expression of thousands of genes simultaneously has seen phenomenal growth over the past five years. Although commercial sources of microarrays exist, investigators wanting more flexibility in the genes represented on the array will turn to in-house production. The creation and use of cDNA microarrays is a complicated process that generates an enormous amount of information. Effective management of this information is essential to efficiently access, analyze, troubleshoot, and evaluate the microarray experiments. RESULTS: We have developed a distributable software package designed to track and store the various pieces of data generated by a cDNA microarray facility. This includes the clone collection storage data, annotation data, workflow queues, microarray data, data repositories, sample submission information, and project/investigator information. This application was designed using a 3-tier client-server model. The data access layer (1st tier) contains the relational database system tuned to support a large number of transactions. The data services layer (2nd tier) is a distributed COM server with full database transaction support. The application layer (3rd tier) is an Internet-based user interface that contains both client- and server-side code for dynamic interactions with the user. AVAILABILITY: This software is freely available to academic institutions and non-profit organizations at http://www.genomics.mcg.edu/niddkbtc.

15.
Protein identification using MS is an important technique in proteomics as well as a major generator of proteomics data. We have designed the protein identification data object model (PDOM) and developed a parser based on this model to facilitate the analysis and storage of these data. The parser works with HTML or XML files saved or exported from MASCOT MS/MS ions search in peptide summary report or MASCOT PMF search in protein summary report. The program creates PDOM objects, eliminates redundancy in the input file, and has the capability to output any PDOM object to a relational database. This program facilitates additional analysis of MASCOT search results and aids the storage of protein identification information. The implementation is extensible and can serve as a template to develop parsers for other search engines. The parser can be used as a stand-alone application or can be driven by other Java programs. It is currently being used as the front end for a system that loads HTML and XML result files of MASCOT searches into a relational database. The source code is freely available at http://www.ccbm.jhu.edu and the program uses only free and open-source Java libraries.

16.
Data management systems are fast becoming required components in many biology laboratories as the role of computer-based information grows. Although the need for data management systems is on the rise, their inherent complexities can deter the full and routine use of their computational capabilities. The significant undertaking to implement a capable production system can be reduced in part by adapting an established data management system. In such a way, we are leveraging the Genomics Unified Schema (GUS) developed at the Computational Biology and Informatics Laboratory at the University of Pennsylvania as a foundation for managing and analysing DNA sequence data in centromere research projects around Arabidopsis thaliana and related species. Because GUS provides a core schema that includes support for genome sequences, mRNA and its expression, and annotated chromosomes, it is ideal for synthesising a variety of parameters to analyse these repetitive and highly dynamic portions of the genome. Despite this, production-strength data management frameworks are complex, requiring dedicated efforts to adapt and maintain. The work reported in this article addresses one component of such an effort, namely the pivotal task of marshalling data from various sources into GUS. In order to harness GUS for our project, and motivated by efficiency needs, we developed a structured framework for transferring data into GUS from outside sources. This technology is embodied in a GUS object-layer processor, XMLGUS. XMLGUS facilitates incorporating data into GUS by (i) formulating an XML interface that includes relational database key constraint definitions, (ii) regularising traversal through that XML, (iii) realising automatic processing of the XML with database key constraints and (iv) allowing for special processing of input data within the framework for automated processing. The application of XMLGUS to production pipeline processing for a sequencing project and inputting the Arabidopsis genome into GUS is discussed. XMLGUS is available from the Flora website (http://flora.ittc.ku.edu/).
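
Loading XML into a relational store while honouring key constraints can be sketched as below. This is not the XMLGUS format or the GUS schema: the element names, table layout, and `load` function are hypothetical, and an in-memory SQLite database stands in for the real system.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Hypothetical input; element and attribute names are invented for illustration.
DOC = """
<load>
  <taxon id="t1" name="Arabidopsis thaliana"/>
  <sequence id="s1" taxon="t1" residues="ACGT"/>
  <sequence id="s2" taxon="t1" residues="GGCC"/>
</load>
"""

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled
conn.execute("CREATE TABLE taxon (id TEXT PRIMARY KEY, name TEXT NOT NULL)")
conn.execute(
    "CREATE TABLE sequence (id TEXT PRIMARY KEY, "
    "taxon TEXT NOT NULL REFERENCES taxon(id), residues TEXT NOT NULL)"
)


def load(doc, connection):
    """Insert parent rows before child rows so foreign keys resolve."""
    root = ET.fromstring(doc)
    with connection:  # commits on success, rolls back if a constraint fails
        for taxon in root.iter("taxon"):
            connection.execute(
                "INSERT INTO taxon VALUES (?, ?)",
                (taxon.get("id"), taxon.get("name")),
            )
        for seq in root.iter("sequence"):
            connection.execute(
                "INSERT INTO sequence VALUES (?, ?, ?)",
                (seq.get("id"), seq.get("taxon"), seq.get("residues")),
            )


load(DOC, conn)
ROWS = conn.execute(
    "SELECT s.id, t.name FROM sequence s JOIN taxon t ON s.taxon = t.id ORDER BY s.id"
).fetchall()
print(ROWS)
```

A sequence that referenced an unknown taxon would raise `sqlite3.IntegrityError` and roll back the whole load, which is the kind of automatic constraint handling the abstract attributes to the framework.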

17.
Database interconnection requires the development of links between related objects from different databases. We built a database of links, called Virgil, to manage and distribute rich (documented) links between GDB genes and GenBank human sequences. Virgil contains 18 667 unique links. In addition to a simple Web form for ad-hoc queries, we propose a generic Web interface and a prototype CORBA server for link distribution. Materials described in this paper are available from http://www.infobiogen.fr/services/virgil/home.html

18.
SUMMARY: Tracker is a web-based email alert system for monitoring protein database searches using HMMER and Blast-P, nucleotide searches using Blast-N and literature searches of the PubMed database. Users submit searches via a web-based interface. Searches are saved and run against updated databases to alert users about new information. If there are new results from the saved searches, users will be notified by email and will then be able to access results and link to additional information on the NCBI website. Tracker supports Boolean AND/OR operations on HMMER and BLASTP result sets to allow users to broaden or narrow protein searches. AVAILABILITY: The server is located at http://jay.bioinformatics.ku.edu/tracker/index.html. A distribution package including detailed installation procedure is freely available from http://jay.bioinformatics.ku.edu/download/tracker/.
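
The Boolean combination of result sets and the detection of new hits described here map naturally onto set operations. The following is a minimal sketch under assumed data: the hit identifiers and the `combine` helper are hypothetical and do not reflect Tracker's actual code.

```python
# Hypothetical hit identifiers, invented for illustration.
hmmer_hits = {"P1", "P2", "P3"}
blastp_hits = {"P2", "P3", "P4"}
previously_reported = {"P2"}  # hits already sent to the user in an earlier run


def combine(a, b, op):
    """Combine two result sets: AND narrows the search, OR broadens it."""
    if op == "AND":
        return a & b
    if op == "OR":
        return a | b
    raise ValueError(f"unsupported operator: {op}")


narrowed = combine(hmmer_hits, blastp_hits, "AND")
broadened = combine(hmmer_hits, blastp_hits, "OR")
new_hits = narrowed - previously_reported  # only these would trigger an email alert
print(sorted(narrowed), sorted(broadened), sorted(new_hits))
```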

19.
HuGeMap: a distributed and integrated Human Genome Map database.   Total citations: 1, self-citations: 0, citations by others: 1
The HuGeMap database stores the major genetic and physical maps of the human genome. It is also interconnected with the gene radiation hybrid mapping database RHdb. HuGeMap is accessible through a Web server for interactive browsing at http://www.infobiogen.fr/services/Hugemap, as well as through a CORBA server for effective programming. HuGeMap is intended as an attempt to build open, interconnected databases, that is, databases that distribute their objects worldwide in compliance with a recognized standard of distribution. Maps can be displayed and compared with a Java applet (http://babbage.infobiogen.fr:15000/Mappet/Show.html) that queries the HuGeMap ORB server as well as the RHdb ORB server at the EBI.

20.
