Similar literature
Found 20 similar articles (search time: 15 ms)
1.
Although various ontologies and knowledge sources have been developed in recent years to facilitate biomedical research, it is difficult to assimilate information from multiple knowledge sources. To enable researchers to easily gain understanding of a biomedical concept, a biomedical Semantic Web that seamlessly integrates knowledge from biomedical ontologies, publications and patents would be very helpful. In this paper, current research efforts in representing biomedical knowledge in Semantic Web languages are surveyed. Techniques are presented for information retrieval and knowledge discovery from the Semantic Web that extend traditional keyword search and database querying techniques. Finally, some of the challenges that have to be addressed to make the vision of a biomedical Semantic Web a reality are discussed.

2.
Biological data integration using Semantic Web technologies   Total citations: 2, self-citations: 0, citations by others: 2
Pasquier C. Biochimie, 2008, 90(4): 584-594
Current research in biology depends heavily on the availability and efficient use of information. To build new knowledge, various sources of biological data must often be combined. Semantic Web technologies, which provide a common framework allowing data to be shared and reused between applications, can be applied to the management of disseminated biological data. However, because of certain characteristics specific to biological data, applying these technologies to the life sciences is a real challenge. Through a use case of biological data integration, we show in this paper that current Semantic Web technologies are becoming mature and can be applied to the development of large applications. However, to get the best from these technologies, improvements are needed in both tool performance and knowledge modeling.

3.

Background

Network-based approaches for the analysis of large-scale genomics data have become well established. Biological networks provide a knowledge scaffold against which the patterns and dynamics of ‘omics’ data can be interpreted. The background information required for the construction of such networks is often dispersed across a multitude of knowledge bases in a variety of formats. The seamless integration of this information is one of the main challenges in bioinformatics. The Semantic Web offers powerful technologies for the assembly of integrated knowledge bases that are computationally comprehensible, thereby providing a potentially powerful resource for constructing biological networks and network-based analysis.

Results

We have developed the Gene eXpression Knowledge Base (GeXKB), a semantic web technology based resource that contains integrated knowledge about gene expression regulation. To affirm the utility of GeXKB we demonstrate how this resource can be exploited for the identification of candidate regulatory network proteins. We present four use cases that were designed from a biological perspective in order to find candidate members relevant for the gastrin hormone signaling network model. We show how a combination of specific query definitions and additional selection criteria derived from gene expression data and prior knowledge concerning candidate proteins can be used to retrieve a set of proteins that constitute valid candidates for regulatory network extensions.

Conclusions

Semantic web technologies provide the means for processing and integrating various heterogeneous information sources. The GeXKB offers biologists such an integrated knowledge resource, allowing them to address complex biological questions pertaining to gene expression. This work illustrates how GeXKB can be used in combination with gene expression results and literature information to identify new potential candidates that may be considered for extending a gene regulatory network.

Electronic supplementary material

The online version of this article (doi:10.1186/s12859-014-0386-y) contains supplementary material, which is available to authorized users.

4.
The World Wide Web has revolutionized how researchers from various disciplines collaborate over long distances. This is nowhere more important than in the Life Sciences, where interdisciplinary approaches are becoming increasingly powerful as a driver of both integration and discovery. Data access, data quality, identity, and provenance are all critical ingredients for facilitating and accelerating these collaborative enterprises, and it is here that Semantic Web technologies promise to have a profound impact. This paper reviews the need for these novel Internet information tools and explores both their advantages and their challenges, illustrated with examples from the biomedical community.

5.
6.
This article presents the design goals and features of the open-source Boca RDF server in the context of a community of cancer-tumor modeling investigators. Boca supplements the desirable data features of the Semantic Web with important enterprise and application features to power a new generation of Semantic-Web-based applications. The data features enable the integration and retrieval of tremendous quantities of diverse data. The enterprise features promote data integrity, fidelity, provenance and robustness. The application features provide for collaborative applications and dynamic user interfaces.

7.
The Semantic Web for the Life Sciences (SWLS), when realized, will dramatically improve our ability to conduct bioinformatics analyses using the vast and growing stores of web-accessible resources. This ability will be achieved through the widespread acceptance and application of standards for naming, representing, describing and accessing biological information. The W3C-led Semantic Web initiative has established most, if not all, of the standards and technologies needed to achieve a unified, global SWLS. Unfortunately, the bioinformatics community has, thus far, appeared reluctant to fully adopt them. Rather, we are seeing what could be described as 'semantic creep': timid, piecemeal and ad hoc adoption of parts of standards by groups that should be stridently taking a leadership role for the community. We suggest that, at this point, the primary hindrances to the creation of the SWLS may be social rather than technological in nature, and that, like the original Web, the establishment of the SWLS will depend primarily on the will and participation of its consumers.

8.
9.
Industrial synergies join two or more organizations that initially functioned as independent economic actors, and that may originate from different sectors, so that they can share resources and exchange by-products for the mutual environmental, financial, and social benefit of the participants. Industrial symbioses (ISs) are networks of industrial synergies that can be initiated and created over time in various manners. In practice, the initiation of an industrial synergy, and particularly the identification of by-product compatibilities, relies on direct or facilitated knowledge and information sharing, which is essential for discovering industrial synergy opportunities. Beyond its potential to facilitate knowledge and information sharing among organizations, the Social Semantic Web (SSW) can also facilitate the initiation of industrial synergies by systematically and automatically identifying by-product exchange compatibilities and recommending them to potential partners. This paper proposes the Social Semantic Web for Industrial Synergies Initiation (SSWISI) framework, which is based on the Social Semantic Web. The framework exploits the ability of the semantic web to search for analogies between potential partners within a region or district and existing industrial synergies around the world. It also adopts the concept of Linked Open Data (LOD), which enables the sharing and exchange of information with external systems. This feature distinguishes the proposed framework from existing approaches to the initiation of industrial synergies.

10.
Most librarians, and at least some lecturers in institutions of higher education, recognise the need for students to receive instruction in the use of information sources. In the life sciences, the continuing growth in the number of these sources creates problems for biologists. To help students cope with the situation, a compulsory course in Communication and Information Retrieval has been introduced into the BSc degree in biology at Paisley College of Technology. The course, taught by the College library staff, covers the structure of scientific literature, the language barrier, techniques of literature searching and scientific writing, and the information network in the life sciences.

11.
12.
13.
The Web has become the major medium for various communities to share their knowledge. To this end, it provides an optimal environment for knowledge networks. The Web offers global connectivity that is virtually instantaneous, and its resources and documents can readily be indexed for easy searching. In the coupled realms of biomedical research and healthcare, this has become especially important, as many thousands of communities already exist today that connect across academia, hospitals and industry. These communities also rely on several forms of knowledge assets, including publications, experimental data, domain-specific vocabularies and policies. Web-based communities will be among the earlier beneficiaries of the emerging Semantic Web. With the new standards and technologies of the Semantic Web, effective utilization of knowledge networks will expand profoundly, fostering new levels of innovation and knowledge.

14.

Background

Semantic Web has established itself as a framework for using and sharing data across applications and database boundaries. Here, we present a web-based platform for querying biological Semantic Web databases in a graphical way.

Results

SPARQLGraph offers an intuitive drag-and-drop query builder, which converts the visual graph into a query and executes it on a public endpoint. The tool integrates several publicly available Semantic Web databases, including the databases of the recently released EBI RDF platform. Furthermore, it provides several predefined template queries for answering biological questions. Users can easily create and save new query graphs, which can also be shared with other researchers.

Conclusions

This new graphical way of creating queries for biological Semantic Web databases considerably facilitates usability as it removes the requirement of knowing specific query languages and database structures. The system is freely available at http://sparqlgraph.i-med.ac.at.
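
As an illustration of the graph-to-query idea described in this entry, the following is a minimal sketch, not SPARQLGraph's actual implementation. It assumes a visual query graph can be represented as a list of (subject, predicate, object) edges in which terms beginning with `?` are variables. The function name `build_sparql`, the `ex:` prefix, and the example edges are all hypothetical, and the sketch only builds the query text without contacting any endpoint.

```python
def build_sparql(edges, prefixes=None, limit=None):
    """Turn (subject, predicate, object) edges into a SPARQL SELECT query string."""
    prefixes = prefixes or {}
    # Collect variables in order of first appearance so the SELECT clause is stable.
    variables = []
    for triple in edges:
        for term in triple:
            if term.startswith("?") and term not in variables:
                variables.append(term)
    lines = [f"PREFIX {name}: <{iri}>" for name, iri in prefixes.items()]
    lines.append("SELECT " + (" ".join(variables) if variables else "*"))
    lines.append("WHERE {")
    # Each edge of the visual graph becomes one triple pattern.
    lines.extend(f"  {s} {p} {o} ." for s, p, o in edges)
    lines.append("}")
    if limit is not None:
        lines.append(f"LIMIT {limit}")
    return "\n".join(lines)


# Hypothetical two-edge query graph: proteins and the genes that encode them.
example_edges = [
    ("?protein", "rdf:type", "ex:Protein"),
    ("?protein", "ex:encodedBy", "?gene"),
]
example_prefixes = {
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "ex": "http://example.org/",
}
query = build_sparql(example_edges, example_prefixes, limit=10)
print(query)
```

A real tool would additionally validate terms and send the resulting string to a SPARQL endpoint; those steps are omitted here.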

15.
16.
An upper-level ontology for the biomedical domain   Total citations: 1, self-citations: 0, citations by others: 1
At the US National Library of Medicine we have developed the Unified Medical Language System (UMLS), whose goal is to provide integrated access to a large number of biomedical resources by unifying the vocabularies that are used to access those resources. The UMLS currently interrelates some 60 controlled vocabularies in the biomedical domain. The UMLS coverage is quite extensive, including not only many concepts in clinical medicine, but also a large number of concepts applicable to the broad domain of the life sciences. In order to provide an overarching conceptual framework for all UMLS concepts, we developed an upper-level ontology, called the UMLS semantic network. The semantic network, through its 134 semantic types, provides a consistent categorization of all concepts represented in the UMLS. The 54 links between the semantic types provide the structure for the network and represent important relationships in the biomedical domain. Because of the growing number of information resources that contain genetic information, the UMLS coverage in this area is being expanded. We recently integrated the taxonomy of organisms developed by the NLM's National Center for Biotechnology Information, and we are currently working together with the developers of the Gene Ontology to integrate this resource as well. As additional standard ontologies become publicly available, we expect to integrate these into the UMLS construct.

17.
18.

Background

The ability to query many independent biological databases using a common ontology-based semantic model would facilitate deeper integration and more effective utilization of these diverse and rapidly growing resources. Despite ongoing work moving toward shared data formats and linked identifiers, significant problems persist in the semantic data integration needed to establish shared identity and shared meaning across heterogeneous biomedical data sources.

Results

We present five processes for semantic data integration that, when applied collectively, solve seven key problems. These processes include making explicit the differences between biomedical concepts and database records, aggregating sets of identifiers denoting the same biomedical concepts across data sources, and using declaratively represented forward-chaining rules to take information that is variably represented in source databases and integrate it into a consistent biomedical representation. We demonstrate these processes and solutions by presenting KaBOB (the Knowledge Base Of Biomedicine), a knowledge base of semantically integrated data from 18 prominent biomedical databases using common representations grounded in Open Biomedical Ontologies. An instance of KaBOB with data about humans and seven major model organisms can be built using on the order of 500 million RDF triples. All source code for building KaBOB is available under an open-source license.

Conclusions

KaBOB is an integrated knowledge base of biomedical data representationally based in prominent, actively maintained Open Biomedical Ontologies, thus enabling queries of the underlying data in terms of biomedical concepts (e.g., genes and gene products, interactions and processes) rather than features of source-specific data schemas or file formats. KaBOB resolves many of the issues that routinely plague biomedical researchers intending to work with data from multiple data sources and provides a platform for ongoing data integration and development and for formal reasoning over a wealth of integrated biomedical data.

Electronic supplementary material

The online version of this article (doi:10.1186/s12859-015-0559-3) contains supplementary material, which is available to authorized users.
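
One of the processes named in this entry, aggregating sets of identifiers that denote the same biomedical concept across data sources, can be sketched with a union-find structure. This is a hedged illustration only: the abstract does not say how KaBOB performs the aggregation, and the function `aggregate_identifiers` and the `srcA:`/`srcB:`/`srcC:` identifiers below are hypothetical.

```python
def aggregate_identifiers(equivalences):
    """Group identifiers into sets that denote the same concept.

    `equivalences` is an iterable of (id_a, id_b) pairs asserting that two
    identifiers from (possibly different) sources refer to the same concept.
    """
    parent = {}

    def find(x):
        # Register unseen identifiers as their own representative.
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving keeps trees shallow
            x = parent[x]
        return x

    for a, b in equivalences:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    groups = {}
    for ident in list(parent):
        groups.setdefault(find(ident), set()).add(ident)
    # Sort for a deterministic result.
    return sorted(sorted(group) for group in groups.values())


# Hypothetical cross-references between three made-up sources.
pairs = [
    ("srcA:1", "srcB:g7"),
    ("srcB:g7", "srcC:x9"),
    ("srcA:2", "srcB:g8"),
]
concept_sets = aggregate_identifiers(pairs)
print(concept_sets)
```

Each resulting set could then be attached to a single concept node, keeping the concept distinct from the individual database records that mention it.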

19.

Motivation

Weighted semantic networks built from text-mined literature can be used to retrieve known protein-protein or gene-disease associations, and have been shown to anticipate associations years before they are explicitly stated in the literature. Our text-mining system recognizes over 640,000 biomedical concepts: some are specific (i.e., names of genes or proteins), others generic (e.g., ‘Homo sapiens’). Generic concepts may play important roles in automated information retrieval, extraction, and inference, but they may also result in concept overload and confound retrieval and reasoning with low-relevance or even spurious links. Here, we attempted to optimize the retrieval performance for protein-protein interactions (PPI) by filtering generic concepts (node filtering) or links to generic concepts (edge filtering) from a weighted semantic network. First, we defined metrics based on network properties that quantify the specificity of concepts. Then, using these metrics, we systematically filtered generic information from the network while monitoring retrieval performance of known protein-protein interactions. We also systematically filtered specific information from the network (inverse filtering), and assessed the retrieval performance of networks composed of generic information alone.

Results

Filtering generic or specific information induced a two-phase response in retrieval performance: initially the effects of filtering were minimal, but beyond a critical threshold network performance dropped suddenly. Contrary to expectations, networks composed exclusively of generic information demonstrated retrieval performance comparable to unfiltered networks that also contain specific concepts. Furthermore, an analysis using individual generic concepts demonstrated that they can effectively support the retrieval of known protein-protein interactions. For instance, the concept “binding” is indicative for PPI retrieval and the concept “mutation abnormality” is indicative for gene-disease associations.

Conclusion

Generic concepts are important for information retrieval and cannot be removed from semantic networks without negative impact on retrieval performance.
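
The node-filtering step described in this entry can be sketched as follows. The abstract says only that the specificity metrics are "based on network properties"; the use of weighted degree as a stand-in for genericity, the threshold value, and the toy network are assumptions made for illustration, not the authors' metrics or data.

```python
from collections import defaultdict


def weighted_degree(edges):
    """Sum of incident edge weights per node; `edges` maps (u, v) -> weight."""
    degree = defaultdict(float)
    for (u, v), weight in edges.items():
        degree[u] += weight
        degree[v] += weight
    return dict(degree)


def filter_generic_nodes(edges, max_degree):
    """Node filtering: drop nodes whose weighted degree exceeds `max_degree`.

    Weighted degree is an assumed proxy for how generic a concept is.
    Returns the remaining edges and the set of removed nodes.
    """
    degree = weighted_degree(edges)
    generic = {node for node, d in degree.items() if d > max_degree}
    kept = {
        (u, v): weight
        for (u, v), weight in edges.items()
        if u not in generic and v not in generic
    }
    return kept, generic


# Toy network: "binding" links to many concepts and so has a high weighted degree.
toy_edges = {
    ("geneA", "binding"): 0.9,
    ("geneB", "binding"): 0.8,
    ("geneC", "binding"): 0.7,
    ("geneA", "geneB"): 0.4,
}
kept, generic = filter_generic_nodes(toy_edges, max_degree=2.0)
print(generic, kept)
```

In this toy case, removing the generic node also removes the only paths that connect geneC to the other genes, which mirrors the entry's conclusion that generic concepts can carry retrieval-relevant information.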

20.
To reduce the increasing amount of time spent on literature search in the life sciences, several methods for automated knowledge extraction have been developed. Co-occurrence-based approaches can deal with large text corpora like MEDLINE in an acceptable time but are not able to extract any specific type of semantic relation. Semantic relation extraction methods based on syntax trees, on the other hand, are computationally expensive, and the interpretation of the generated trees is difficult. Several natural language processing (NLP) approaches for the biomedical domain exist that focus specifically on the detection of a limited set of relation types. For systems biology, generic approaches are needed that detect a multitude of relation types and can also process large text corpora, but the number of systems meeting both requirements is very limited. We introduce the use of SENNA (“Semantic Extraction using a Neural Network Architecture”), a fast and accurate neural-network-based Semantic Role Labeling (SRL) program, for the large-scale extraction of semantic relations from the biomedical literature. A comparison of processing times of SENNA and other SRL systems or syntactical parsers used in the biomedical domain revealed that SENNA is the fastest Proposition Bank (PropBank) conforming SRL program currently available. A total of 89 million biomedical sentences were tagged with SENNA on a 100-node cluster within three days. The accuracy of the presented relation extraction approach was evaluated on two test sets of annotated sentences, resulting in precision/recall values of 0.71/0.43. We show that the accuracy as well as the processing speed of the proposed semantic relation extraction approach is sufficient for its large-scale application to biomedical text. The proposed approach is highly generalizable regarding the supported relation types and appears to be especially suited for general-purpose, broad-scale text mining systems. The presented approach bridges the gap between fast, co-occurrence-based approaches lacking semantic relations and highly specialized and computationally demanding NLP approaches.
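
For orientation, the reported precision and recall can be combined into a single F1 score. The abstract does not report this value; it is derived here from the stated figures of 0.71 and 0.43 using the standard harmonic-mean definition.

```latex
F_1 = \frac{2PR}{P + R}
    = \frac{2 \times 0.71 \times 0.43}{0.71 + 0.43}
    = \frac{0.6106}{1.14}
    \approx 0.54
```

Because F1 is a harmonic mean, it is pulled toward the lower of the two values, here the recall of 0.43.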
