Similar literature
Found 20 similar articles; search took 46 ms
1.
MOTIVATION: To improve the ability of biologists (both researchers and students) to ask biologically interesting questions of the Gene Ontology (GO) database and to explore the ontologies by seeing large portions of the ontology graphs in context, along with details of individual terms in the ontologies. RESULTS: GoGet and GoView are two new tools built as part of an extensible web application system based on Java 2 Enterprise Edition technology. GoGet has a user interface that enables users to ask biologically interesting questions, such as (1) What are the DNA binding proteins involved in DNA repair, but not in DNA replication? and (2) Of the terms containing the word triphosphatase, which have associated gene products from mouse, but not fruit fly? The results of such queries can be viewed in a collapsed tabular format that eases the burden of getting through large tables of data. GoView enables users to explore the large directed acyclic graph structure of the ontologies in the GO database. The two tools are coordinated, so that results from queries in GoGet can be visualized in GoView in the ontology in which they appear, and explorations started from GoView can request details of gene product associations to appear in a result table in GoGet. AVAILABILITY: Free access to the GoGet query tool and free download of the GoView ontology viewer are provided to all users at http://db.math.macalester.edu/goproject. In addition, source code for the GoView tool is also available from this site, along with a user manual for both tools.
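
The first example question above ("DNA binding proteins involved in DNA repair, but not in DNA replication") is essentially a set query over term annotations. A minimal sketch in Python, assuming a toy annotation table: the protein identifiers and the `query` helper are invented for illustration and are not part of GoGet or the GO database.

```python
# Hypothetical toy annotations: GO term name -> set of gene product IDs.
# The identifiers are invented for illustration; they are not real GO data.
annotations = {
    "DNA binding": {"P1", "P2", "P3", "P4"},
    "DNA repair": {"P2", "P3", "P5"},
    "DNA replication": {"P3", "P6"},
}


def query(include, exclude, table):
    """Return products annotated to every term in `include` and to none in `exclude`."""
    # Products must carry all of the included terms ...
    result = set.intersection(*(table[term] for term in include))
    # ... and none of the excluded ones.
    for term in exclude:
        result -= table[term]
    return result


hits = query(["DNA binding", "DNA repair"], ["DNA replication"], annotations)
print(sorted(hits))
```

A real query would also have to follow the ontology graph, so that products annotated to a child term count for its ancestors; this sketch leaves that out.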

2.
M. Ba, G. Diallo. IRBM, 2013, 34(1): 56-59
The proliferation of biomedical applications that rely on different knowledge organization systems, such as ontologies and thesauri, raises the issue of automatically identifying the correspondences between these models, in particular for data integration. Significant effort has gone into tackling this issue of ontology alignment. However, few systems are able to deal with ontologies containing tens of thousands of entities, as may be the case in the biomedical domain, where resources such as SNOMED-CT, the FMA or the NCI thesaurus are commonly used. We present in this paper ServOMap, an efficient system for large-scale ontology alignment. It relies on an Ontology Server (ServO) and uses Information Retrieval techniques for computing similarity between entities. The system participated with two configurations in the 2012 Ontology Alignment Evaluation Initiative campaign. We report the very promising results obtained by the system for the alignment of large biomedical ontologies. ServOMap is freely available for download at http://code.google.com/p/servo/.

3.
The use of structured knowledge representations (ontologies and terminologies) has become standard in biomedicine. Definitions of ontologies vary widely, as do the values and philosophies that underlie them. In seeking to make these views explicit, we conducted and summarized interviews with a dozen leading ontologists. Their views clustered into three broad perspectives that we summarize as mathematics, computer code, and Esperanto. Ontology as mathematics puts the ultimate premium on rigor and logic, symmetry and consistency of representation across scientific subfields, and the inclusion of only established, non-contradictory knowledge. Ontology as computer code focuses on utility and cultivates diversity, fitting ontologies to their purpose. As with the computer languages C++, Prolog, and HTML, the code perspective holds that diverse applications warrant custom-designed ontologies. Ontology as Esperanto focuses on facilitating cross-disciplinary communication, knowledge cross-referencing, and computation across datasets from diverse communities. We show how these views align with classical divides in science and suggest how a synthesis of their concerns could strengthen the next generation of biomedical ontologies.

4.
5.
MOTIVATION: The formal representation of mereological aspects of canonical anatomy (parthood relations) is relatively well understood. The formal representation of other aspects of canonical anatomy, such as connectedness and adjacency relations between anatomical parts, their shape and size, as well as the spatial arrangement of anatomical parts within larger anatomical structures, is, however, much less well understood and is only insufficiently represented in existing computational anatomical and bio-medical ontologies. RESULTS: In this article, we provide a methodology for incorporating this kind of information into anatomical and bio-medical ontologies by applying techniques for representing qualitative spatial information from Artificial Intelligence. In particular, we focus on how to explicitly take into account the qualitative and time-dependent character of these relations. As a running example, we use the human temporomandibular joint (TMJ). AVAILABILITY: Using the presented methodology, a formal ontology was developed which is accessible at http://www.ifomis.org/bfo/fol. This ontology may help to improve the logical and ontological rigor of bio-medical ontologies such as the OBO relation ontology.

6.
The mzQuantML standard from the HUPO Proteomics Standards Initiative has recently been released, capturing quantitative data about peptides and proteins, following analysis of MS data. We present a Java application programming interface (API) for mzQuantML called jmzQuantML. The API provides robust bridges between Java classes and elements in mzQuantML files and allows random access to any part of the file. The API provides read and write capabilities, and is designed to be embedded in other software packages, enabling mzQuantML support to be added to proteomics software tools (http://code.google.com/p/jmzquantml/). The mzQuantML standard is designed around a multilevel validation system to ensure that files are structurally and semantically correct for different proteomics quantitative techniques. In this article, we also describe a Java software tool (http://code.google.com/p/mzquantml-validator/) for validating mzQuantML files, which is a formal part of the data standard.

7.
8.
A great deal of data in functional genomics studies needs to be annotated with low-resolution anatomical terms. For example, gene expression assays based on manually dissected samples (microarray, SAGE, etc.) need high-level anatomical terms to describe sample origin. First-pass annotation in high-throughput assays (e.g. large-scale in situ gene expression screens or phenotype screens) and bibliographic applications, such as selection of keywords, would also benefit from a minimum set of standard anatomical terms. Although only simple terms are required, the researcher faces serious practical problems of inconsistency and confusion, given the different aims and the range of complexity of existing anatomy ontologies. A Standards and Ontologies for Functional Genomics (SOFG) group therefore initiated discussions between several of the major anatomical ontologies for higher vertebrates. As we report here, one result of these discussions is a simple, accessible, controlled vocabulary of gross anatomical terms, the SOFG Anatomy Entry List (SAEL). The SAEL is available from http://www.sofg.org and is intended as a resource for biologists, curators, bioinformaticians and developers of software supporting functional genomics. It can be used directly for annotation in the contexts described above. Importantly, each term is linked to the corresponding term in each of the major anatomy ontologies. Where the simple list does not provide enough detail or sophistication, therefore, the researcher can use the SAEL to choose the appropriate ontology and move directly to the relevant term as an entry point. The SAEL links will also be used to support computational access to the respective ontologies.

9.
A software package, IndexToolkit, aimed at overcoming the disadvantage of FASTA-format databases for frequent searching, has been developed; it uses an indexing strategy to substantially accelerate sequence queries. IndexToolkit includes user-friendly tools and an Application Programming Interface (API) to facilitate indexing, storage and retrieval of protein sequence databases. As open source software, it provides a sequence-retrieval development framework, which is easily extensible for high-speed-request proteomic applications, such as database searching or modification discovery. We applied IndexToolkit to the database search engine pFind to demonstrate its effect. Experimental studies show that IndexToolkit is able to support significantly faster searches of protein databases. AVAILABILITY: The IndexToolkit is free to use under the open source GNU GPL license. The source code and the compiled binary can be freely accessed through the website http://pfind.jdl.ac.cn/IndexToolkit. More detailed information, including screenshots and documentation for users and developers, is also available on this website.
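
The general idea of indexing a FASTA file so that single sequences can be fetched without rescanning the whole database can be sketched as follows. This is a minimal byte-offset index written for illustration only; it is not IndexToolkit's API or index format, and the function names and the two-record sample are assumptions.

```python
import io


def build_index(handle):
    """Map each sequence ID to the byte offset of its FASTA header line."""
    index = {}
    while True:
        offset = handle.tell()
        line = handle.readline()
        if not line:
            break
        if line.startswith(b">"):
            # The ID is taken to be the first whitespace-delimited token.
            seq_id = line[1:].split()[0].decode()
            index[seq_id] = offset
    return index


def fetch(handle, index, seq_id):
    """Seek straight to one record and return its sequence as a string."""
    handle.seek(index[seq_id])
    handle.readline()  # skip the header line
    parts = []
    for line in handle:
        if line.startswith(b">"):
            break  # reached the next record
        parts.append(line.strip())
    return b"".join(parts).decode()


# Invented two-record database held in memory for the demonstration.
fasta = io.BytesIO(b">sp1 demo protein\nMKT\nAYI\n>sp2 another\nGGSW\n")
idx = build_index(fasta)
print(fetch(fasta, idx, "sp2"))
```

The index is built in one pass; each later lookup costs one seek plus the length of the record, rather than a scan of the whole file.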

10.
YODA: selecting signature oligonucleotides   Cited by: 3 (self-citations: 0, citations by others: 3)
MOTIVATION: Selecting oligonucleotide probes for use in microarray design, and other applications requiring signature sequences, involves identifying sequences which will bind strongly to their intended target, while binding only weakly (or preferably, not at all) to non-target sequences which may be present in the hybridization reaction. While many tools to assist in selection of such sequences exist, all the ones we examined lack important oligo design and software features. RESULTS: YODA is an application for assisting biological researchers in selecting signature sequences. It incorporates a custom sequence similarity search to find potential cross-hybridizing non-target sequences. For this task, most oligo design tools rely on BLAST, which is ill suited for it due to an unacceptable risk of false negatives. YODA supports multiple probe design goals including single-genome, multiple-genome, pathogen-host and species/strain-identification. A graphical interface is provided as well as a command-line interface, both of which support many user-controlled parameters. YODA is easy to install and use and runs on Windows, Mac OS X and Linux platforms. AVAILABILITY: Freely available (LGPL) along with source code and additional documentation at http://pathport.vbi.vt.edu/YODA. CONTACT: enordber@vbi.vt.edu.

11.
Metagenomic sequencing has produced significant amounts of data in recent years. For example, as of summer 2013, MG-RAST has been used to annotate over 110,000 data sets totaling over 43 Terabases. With metagenomic sequencing finding even wider adoption in the scientific community, the existing web-based analysis tools and infrastructure in MG-RAST provide limited capability for data retrieval and analysis, such as comparative analysis between multiple data sets. Moreover, although the system provides many analysis tools, it is not comprehensive. By opening MG-RAST up via a web services API (application programming interface) we have greatly expanded access to MG-RAST data, as well as provided a mechanism for the use of third-party analysis tools with MG-RAST data. This RESTful API makes all data and data objects created by the MG-RAST pipeline accessible as JSON objects. As part of the DOE Systems Biology Knowledgebase project (KBase, http://kbase.us) we have implemented a web services API for MG-RAST. This API complements the existing MG-RAST web interface and constitutes the basis of KBase's microbial community capabilities. In addition, the API exposes a comprehensive collection of data to programmers. This API, which uses a RESTful (Representational State Transfer) implementation, is compatible with most programming environments and should be easy to use for end users and third parties. It provides comprehensive access to sequence data, quality control results, annotations, and many other data types. Where feasible, we have used standards to expose data and metadata. Code examples are provided in a number of languages both to show the versatility of the API and to provide a starting point for users. We present an API that exposes the data in MG-RAST for consumption by our users, greatly enhancing the utility of the MG-RAST service.

12.
As Semantic Web technologies mature and new releases of key elements, such as SPARQL 1.1 and OWL 2.0, become available, the Life Sciences continue to push the boundaries of these technologies with ever more sophisticated tools and applications. Unsurprisingly, therefore, interest in the SWAT4LS (Semantic Web Applications and Tools for the Life Sciences) activities has remained high, as was evident during the third international SWAT4LS workshop held in Berlin in December 2010. Contributors to this workshop were invited to submit extended versions of their papers, the best of which are now made available in the special supplement of BMC Bioinformatics. The papers reflect the wide range of work in this area, covering the storage and querying of Life Sciences data in RDF triple stores, tools for the development of biomedical ontologies and the semantics-based integration of Life Sciences as well as clinical data.

13.
MOTIVATION: Primary immunodeficiency diseases (PIDs) are Mendelian conditions of high phenotypic complexity and low incidence. They usually manifest in toddlers and infants, although they can also occur much later in life. Information about PIDs is often widely scattered throughout the clinical as well as the research literature and hard to find for generalists as well as experienced clinicians. Semantic Web technologies coupled to clinical information systems can go some way toward addressing this problem. Ontologies are a central component of such a system, containing and centralizing knowledge about primary immunodeficiencies in both a human- and computer-comprehensible form. The development of an ontology of PIDs is therefore a central step toward developing informatics tools, which can support the clinician in the diagnosis and treatment of these diseases. RESULTS: We present PIDO, the primary immunodeficiency disease ontology. PIDO characterizes PIDs in terms of the phenotypes commonly observed by clinicians during a diagnosis process. Phenotype terms in PIDO are formally defined using complex definitions based on qualities, functions, processes and structures. We provide mappings to biomedical reference ontologies to ensure interoperability with ontologies in other domains. Based on PIDO, we developed the PIDFinder, an ontology-driven software prototype that can facilitate clinical decision support. PIDO connects immunological knowledge across resources within a common framework and thereby enables translational research and the development of medical applications for the domain of immunology and primary immunodeficiency diseases.

14.
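Porting bioinformatics software from Unix/Linux to Windows using Cygwin   Cited by: 2 (self-citations: 0, citations by others: 2)
Cygwin emulates and supports a Unix/Linux environment on Windows and provides a fairly complete Unix/Linux toolkit and programming environment. We used Cygwin to recompile commonly used bioinformatics data analysis software, including Sim4, FASTA, Phred/Phrap/RepeatMasker, EMBOSS, HMMER and ClustalW, and found that this approach yields executables that run on Windows. The result offers a useful reference for developing cross-platform bioinformatics data analysis platforms that also take advantage of the Windows environment.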

15.
BACKGROUND: Ontologies are being developed for the life sciences to standardise the way we describe and interpret the wealth of data currently being generated. As more ontology-based applications begin to emerge, tools are required that enable domain experts to contribute their knowledge to the growing pool of ontologies. There are many barriers that prevent domain experts engaging in the ontology development process, and novel tools are needed to break down these barriers to engage a wider community of scientists. RESULTS: We present Populous, a tool for gathering content with which to construct an ontology. Domain experts need to add content that is often repetitive in form, without having to tackle the underlying ontological representation. Populous presents users with a table-based form in which columns are constrained to take values from particular ontologies. Populated tables are mapped to patterns that can then be used to automatically generate the ontology's content. These forms can be exported as spreadsheets, providing an interface that is much more familiar to many biologists. CONCLUSIONS: Populous's contribution is in the knowledge gathering stage of ontology development; it separates knowledge gathering from the conceptualisation and axiomatisation, as well as separating the user from the standard ontology authoring environments. Populous is by no means a replacement for standard ontology editing tools, but instead provides a useful platform for engaging a wider community of scientists in the mass production of ontology content.

16.
The integration of proteomics data with biological knowledge is a recent trend in bioinformatics. A lot of biological information is available, spread across different sources and encoded in different ontologies (e.g. Gene Ontology). Annotating existing protein data with biological information may enable the use (and the development) of algorithms that use biological ontologies as a framework to mine annotated data. Recently, many methodologies and algorithms that use ontologies to extract knowledge from data, as well as to analyse ontologies themselves, have been proposed and applied to other fields. By contrast, the use of such annotations for the analysis of protein data is a relatively novel research area that is becoming increasingly central. Existing approaches range from the definition of the similarity among genes and proteins on the basis of the annotating terms, to the definition of novel algorithms that use such similarities for mining protein data on a proteome-wide scale. This work, after defining the main concepts of such analysis, presents a systematic discussion and comparison of the main approaches. Finally, remaining challenges, as well as possible future directions of research, are presented.

17.
The Human Proteome Organization's Proteomics Standards Initiative (PSI) promotes the development of exchange standards to improve data integration and interoperability. PSI specifies the suitable level of detail required when reporting a proteomics experiment (via the Minimum Information About a Proteomics Experiment), and provides extensible markup language (XML) exchange formats and dedicated controlled vocabularies (CVs) that must be combined to generate a standard compliant document. The framework presented here tackles the issue of checking that experimental data reported using a specific format, CVs and public bio-ontologies (e.g. Gene Ontology, NCBI taxonomy) are compliant with the Minimum Information About a Proteomics Experiment recommendations. The semantic validator not only checks the XML syntax but it also enforces rules regarding the use of an ontology class or CV terms by checking that the terms exist in the resource and that they are used in the correct location of a document. Moreover, this framework is extremely fast, even on sizable data files, and flexible, as it can be adapted to any standard by customizing the parameters it requires: an XML Schema Definition, one or more CVs or ontologies, and a mapping file describing in a formal way how the semantic resources and the format are interrelated. As such, the validator provides a general solution to the common problem in data exchange: how to validate the correct usage of a data standard beyond simple XML Schema Definition validation. The framework source code and its various applications can be found at http://psidev.info/validator.
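
The two semantic checks described above (a term must exist in the vocabulary, and it must appear in a permitted location) can be illustrated with a small sketch. Everything here is an assumption made for illustration: the `cvParam` element layout, the `CV:` accessions and the mapping table are invented and do not reproduce the PSI validator's code, mapping-file format or any real controlled vocabulary.

```python
import xml.etree.ElementTree as ET

# Hypothetical controlled vocabulary: accession -> term name (invented).
CV_TERMS = {"CV:0001": "instrument model", "CV:0002": "software name"}

# Hypothetical mapping rules: element tag -> accessions allowed inside it.
MAPPING = {"instrument": {"CV:0001"}, "software": {"CV:0002"}}


def validate(xml_text):
    """Return a list of semantic errors found in the document."""
    errors = []
    root = ET.fromstring(xml_text)
    for parent in root.iter():
        # Only direct <cvParam> children of each element are checked here.
        for param in parent.findall("cvParam"):
            accession = param.get("accession")
            if accession not in CV_TERMS:
                errors.append(f"{accession}: unknown term")
            elif accession not in MAPPING.get(parent.tag, set()):
                errors.append(f"{accession}: not allowed in <{parent.tag}>")
    return errors


doc = """<experiment>
  <instrument><cvParam accession="CV:0001"/></instrument>
  <software><cvParam accession="CV:0001"/><cvParam accession="CV:9999"/></software>
</experiment>"""
print(validate(doc))
```

Keeping the vocabulary and the mapping as data rather than code mirrors the abstract's point that the same checker can be pointed at a different standard by swapping its inputs.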

18.
Non-circular plots of whole genomes are natural representations of genomic data aligned along all chromosomes. Currently, there is no specialized graphical user interface (GUI) designed to produce non-circular whole genome diagrams, and the use of existing tools requires considerable coding effort from users. Moreover, such tools also require improvement, including the addition of new functionalities. To address these issues, we developed a new R/Shiny application, named shinyChromosome, as a GUI for the interactive creation of non-circular whole genome diagrams. shinyChromosome can be easily installed on personal computers for own use as well as on local or public servers for community use. Publication-quality images can be readily generated and annotated from user input using diverse widgets. shinyChromosome is deployed at http://150.109.59.144:3838/shinyChromosome/, http://shinyChromosome.ncpgr.cn, and https://yimingyu.shinyapps.io/shinyChromosome for online use. The source code and manual of shinyChromosome are freely available at https://github.com/venyao/shinyChromosome.

19.
Data from the electronic medical record comprise numerous structured but uncoded elements, which are not linked to standard terminologies. Reuse of such data for secondary research purposes has gained in importance recently. However, the identification of relevant data elements and the creation of database jobs for extraction, transformation and loading (ETL) are challenging: With current methods such as data warehousing, it is not feasible to efficiently maintain and reuse semantically complex data extraction and transformation routines. We present an ontology-supported approach to overcome this challenge by making use of abstraction: Instead of defining ETL procedures at the database level, we use ontologies to organize and describe the medical concepts of both the source system and the target system. Instead of using unique, specifically developed SQL statements or ETL jobs, we define declarative transformation rules within ontologies and illustrate how these constructs can then be used to automatically generate SQL code to perform the desired ETL procedures. This demonstrates how a suitable level of abstraction may not only aid the interpretation of clinical data, but can also foster the reutilization of methods for unlocking it.
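
The step from declarative transformation rules to generated SQL can be sketched in a few lines. This is only an illustration of the principle: the rule dictionary stands in for rules that the authors keep inside ontologies, and the table, column and concept names (`anamnesis`, `smoking_status`, `smoker`) are invented.

```python
import sqlite3

# Hypothetical declarative rule: maps a source column to a target concept.
RULES = [
    {
        "target": "smoker",
        "source_table": "anamnesis",
        "source_column": "smoking_status",
        "mapping": {"yes": 1, "no": 0},
    },
]


def rule_to_sql(rule):
    """Generate a SELECT statement from one rule (rules are trusted config)."""
    cases = " ".join(
        f"WHEN '{src}' THEN {dst}" for src, dst in rule["mapping"].items()
    )
    return (
        f"SELECT patient_id, CASE {rule['source_column']} {cases} ELSE NULL END "
        f"AS {rule['target']} FROM {rule['source_table']} ORDER BY patient_id"
    )


# Invented source data in an in-memory database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE anamnesis (patient_id INTEGER, smoking_status TEXT)")
conn.executemany(
    "INSERT INTO anamnesis VALUES (?, ?)",
    [(1, "yes"), (2, "no"), (3, "unknown")],
)
rows = conn.execute(rule_to_sql(RULES[0])).fetchall()
print(rows)
```

Because the SQL is derived from the rule, changing the mapping or pointing it at another source column requires editing only the rule, which is the reuse benefit the abstract argues for.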

20.
MOTIVATION: In the Life Sciences, guidelines, checklists and ontologies describing what metadata is required for the interpretation and reuse of experimental data are emerging. Data producers, however, may have little experience in the use of such standards and require tools to support this form of data annotation. RESULTS: RightField is an open source application that provides a mechanism for embedding ontology annotation support for Life Science data in Excel spreadsheets. Individual cells, columns or rows can be restricted to particular ranges of allowed classes or instances from chosen ontologies. The RightField-enabled spreadsheet presents selected ontology terms to the users as a simple drop-down list, enabling scientists to consistently annotate their data. The result is 'semantic annotation by stealth', with an annotation process that is less error-prone, more efficient, and more consistent with community standards. Availability and implementation: RightField is open source under a BSD license and freely available from http://www.rightfield.org.uk
