Similar literature
 Found 20 similar articles (search time: 15 ms)
1.
Biological data integration using Semantic Web technologies   Cited by: 2 (self-citations: 0, citations by others: 2)
Pasquier C. Biochimie, 2008, 90(4): 584-594
Current research in biology heavily depends on the availability and efficient use of information. In order to build new knowledge, various sources of biological data must often be combined. Semantic Web technologies, which provide a common framework allowing data to be shared and reused between applications, can be applied to the management of disseminated biological data. However, due to some specificities of biological data, the application of these technologies to life science constitutes a real challenge. Through a use case of biological data integration, we show in this paper that current Semantic Web technologies are beginning to mature and can be applied to the development of large applications. However, in order to get the best from these technologies, improvements are needed both at the level of tool performance and knowledge modeling.

2.
As a result of the enormous amount of information that has been collected with E. coli over the past half century (e.g. genome sequence, mutant phenotypes, metabolic and regulatory networks, etc.), we now have detailed knowledge about gene regulation, protein activity, several hundred enzyme reactions, metabolic pathways, macromolecular machines, and regulatory interactions for this model organism. However, understanding how all these processes interact to form a living cell will require further characterization, quantification, data integration, and mathematical modeling, that is, systems biology. No organism can rival E. coli with respect to the amount of available basic information and experimental tractability for the technologies needed for this undertaking. A focused, systematic effort to understand the E. coli cell will accelerate the development of new post-genomic technologies, including both experimental and computational tools. It will also lead to new technologies that will be applicable to other organisms, from microbes to plants, animals, and humans. E. coli is not only the best studied free-living model organism, but is also an extensively used microbe for industrial applications, especially for the production of small molecules of interest. It is an excellent representative of Gram-negative commensal bacteria. E. coli may represent a perfect model organism for systems biology that is aimed at elucidating both its free-living and commensal life-styles, which should open the door to whole-cell modeling and simulation.

3.
Arguably, the richest source of knowledge (as opposed to fact and data collections) about biology and biotechnology is captured in natural-language documents such as technical reports, conference proceedings and research articles. The automatic exploitation of this rich knowledge base for decision making, hypothesis management (generation and testing) and knowledge discovery constitutes a formidable challenge. Recently, a set of technologies collectively referred to as knowledge discovery in text (KDT) has been advocated as a promising approach to tackle this challenge. KDT comprises three main tasks: information retrieval, information extraction and text mining. These tasks are the focus of much recent scientific research and many algorithms have been developed and applied to documents and text in biology and biotechnology. This article introduces the basic concepts of KDT, provides an overview of some of these efforts in the field of bioscience and biotechnology, and presents a framework of commonly used techniques for evaluating KDT methods, tools and systems.

4.
5.
Multiple myeloma (MM) is a malignancy of terminally differentiated B-lymphocytes that accounts for ~13% of all hematologic cancers. Despite a wealth of knowledge describing the molecular biology of MM as well as significant advances in therapeutics, this disease remains incurable. Since proteins govern the cellular structure and biological function, a wide selection of proteomic approaches holds great promise for increasing our understanding of this disease, such as by investigating the dynamic nature of protein expression, cellular and subcellular distribution, post-translational modifications, and interactions at both the cellular and subcellular levels. The aims of this review are to introduce the available and emerging proteomic technologies that have potential applications in the study of MM and to highlight the current status of proteomic studies of MM. To date, although there have been a limited number of proteomic studies in MM, those performed have provided valuable information with regard to MM diagnosis and therapy. The potential future application of proteomic technologies is expected to provide new avenues in MM diagnostics, individualized therapy design and therapy response surveillance for the clinician.

6.
Increasing data from a few sites demonstrate that information technologies can improve physician decision making and clinical effectiveness. For example, computer-based physician order entry systems, automated laboratory alert systems, and artificial neural networks have demonstrated significant reductions in medical errors. In addition, Internet services to disseminate new knowledge and safety alerts to physicians more rationally and effectively are rapidly developing, and telemedicine to improve rural access to specialty services is undergoing substantial growth. However, even technologies demonstrated to yield beneficial effects have not yet achieved widespread adoption, though the pace of change appears to be increasing as the Internet takes hold. Scientific evaluation of many technologies is also lacking, and the dangers of some of these technologies may be underappreciated. Research on the effects of specific technologies should be a priority. Policies should be developed to press information technology companies, such as pharmaceutical and medical device manufacturers, to recognize the importance of clinical evaluation. Research could also analyze the characteristics of effective technologies and of physicians and organizations who implement these technologies effectively.

7.
Little is known about the fundamental biology of parasitic nematodes (= roundworms) that cause serious diseases, affecting literally billions of animals and humans worldwide. Unlocking the biology of these neglected pathogens using modern technologies will yield crucial and profound knowledge of their molecular biology, and could lead to new treatment and control strategies. Supported by studies in the free-living nematode, Caenorhabditis elegans, some recent investigations have provided improved insights into selected protein phosphatases (PPs) of economically important parasitic nematodes (Strongylida). In the present article, we review this progress and assess the potential of serine/threonine phosphatase (STP) genes and/or their products as targets for new nematocidal drugs. Current information indicates that some small molecules, known to specifically inhibit PPs, might be developed as nematocides. For instance, some cantharidin analogues are known to display exquisite PP-inhibitor activity, which indicates that some of them could be designed and tailored to specifically inhibit selected STPs of nematodes. This information provides prospects for the discovery of an entirely novel class of nematocides, which is of paramount importance, given the serious problems linked to anthelmintic resistance in parasitic nematode populations of livestock, and has the potential to lead to significant biotechnological outcomes.

8.
The flood of data acquired from the increasing number of publicly available genomes has led to new demands for bioinformatics software. With the growing amount of information resulting from high-throughput experiments, new questions arise that often focus on the comparison of genes, genomes, and their expression profiles. Inferring new knowledge by combining different kinds of "post-genomics" data necessitates the development of new approaches that allow the integration of variable data sources into a flexible framework. In this paper, we describe our concept for the integration of heterogeneous data into a platform for systems biology. We have implemented a Bioinformatics Resource for the Integration of heterogeneous Data from Genomic Explorations (BRIDGE) and illustrate the usability of our approach as a platform for systems biology with two sample applications.

9.
Search and discovery strategies for biotechnology: the paradigm shift.   Cited by: 21 (self-citations: 0, citations by others: 21)
Profound changes are occurring in the strategies that biotechnology-based industries are deploying in the search for exploitable biology and to discover new products and develop new or improved processes. The advances that have been made in the past decade in areas such as combinatorial chemistry, combinatorial biosynthesis, metabolic pathway engineering, gene shuffling, and directed evolution of proteins have caused some companies to consider withdrawing from natural product screening. In this review we examine the paradigm shift from traditional biology to bioinformatics that is revolutionizing exploitable biology. We conclude that the reinvigorated means of detecting novel organisms, novel chemical structures, and novel biocatalytic activities will ensure that natural products will continue to be a primary resource for biotechnology. The paradigm shift has been driven by a convergence of complementary technologies, exemplified by DNA sequencing and amplification, genome sequencing and annotation, proteome analysis, and phenotypic inventorying, resulting in the establishment of huge databases that can be mined in order to generate useful knowledge such as the identity and characterization of organisms and the identity of biotechnology targets. Concurrently there have been major advances in understanding the extent of microbial diversity, how uncultured organisms might be grown, and how expression of the metabolic potential of microorganisms can be maximized. The integration of information from complementary databases presents a significant challenge. Such integration should facilitate answers to complex questions involving sequence, biochemical, physiological, taxonomic, and ecological information of the sort posed in exploitable biology. 
The paradigm shift which we discuss is not absolute in the sense that it will replace established microbiology; rather, it reinforces our view that innovative microbiology is essential for releasing the potential of microbial diversity for biotechnology penetration throughout industry. Several of these issues are considered with reference to deep-sea microbiology and biotechnology.

10.
Urological malignancies, including prostate cancer, bladder cancer and kidney cancer, are major causes of morbidity and mortality worldwide. Because of the high incidence, diversity in biology, and especially direct interaction with urine, urological cancers are an important resource for both scientists and clinicians for novel diagnostic and therapeutic discovery. Extracellular vesicles (EVs) are lipid bilayer encapsulated particles released by cells into the extracellular space. Since EVs work as a safe way to transport important biological information through the whole body, they are now recognized as an important mechanism of cell–cell communication and have opened a new window for us to gain a better understanding of cancer biology, novel diagnostics, and therapeutic options. In recent years, numerous advances in EV technologies and novel biological and clinical findings have been reported in the research field of urological cancers. This comprehensive review aims to give an update of recent advances in EV technologies and summarize the state-of-the-art knowledge of EVs related to prostate cancer, bladder cancer and kidney cancer, particularly focusing on the potential of EVs as biomarkers and their biological roles in promoting cancer and metastasis.

11.
12.
The 1000 Genomes Project: data management and community access   Cited by: 1 (self-citations: 0, citations by others: 1)
The 1000 Genomes Project was launched as one of the largest distributed data collection and analysis projects ever undertaken in biology. In addition to the primary scientific goals of creating both a deep catalog of human genetic variation and extensive methods to accurately discover and characterize variation using new sequencing technologies, the project makes all of its data publicly available. Members of the project data coordination center have developed and deployed several tools to enable widespread data access.

13.
Although theoretical systems analysis has been available for over half a century, the recent advent of omic high-throughput analytical platforms along with the integration of individual tools and technologies has given rise to the field of modern systems biology. Coupled with information technology, bioinformatics, knowledge management and powerful mathematical models, systems biology has opened up new vistas in our understanding of complex biological systems. Currently there are two distinct approaches: the inductively driven computational systems biology (bottom-up approach) and the deductive data-driven top-down analysis. Such approaches offer enormous potential in the elucidation of disease as well as defining key pathways and networks involved in optimal human health and nutrition. The tools and technologies now available in systems biology analyses offer exciting opportunities to develop the emerging areas of personalized medicine and individual nutritional profiling.

14.
Proteomic tools for biomedicine   Cited by: 4 (self-citations: 0, citations by others: 4)
Proteomic tools measure gene expression, protein activity and interactions of biological events at the protein level. Proteins are the major catalysts of biological functions and contain several dimensions of information that collectively indicate the actual rather than the potential functional state as indicated by mRNA analysis. Measurements can be made in terms of protein quantity, location, and time-point. For the future we see a further integration of existing and new technologies for proteomics from a wide range of areas of biochemistry, chemistry, physics, computing science and molecular biology. This will further advance our knowledge of how biological systems are built up and what mechanisms control these systems. However, the potential of proteomics to comprehensively answer all biological questions is limited as only protein activity is measured. A unification of genomics, proteomics, and other technologies is needed if we are to start to understand the complexity of biological function in the context of disease and health.

15.
MOTIVATION: Natural language processing (NLP) techniques are increasingly being used in biology to automate the capture of new biological discoveries in text, which are being reported at a rapid rate. Yet, information represented in NLP data structures is classically very different from information organized with ontologies as found in model organisms or genetic databases. To facilitate the computational reuse and integration of information buried in unstructured text with that of genetic databases, we propose and evaluate a translational schema that represents a comprehensive set of phenotypic and genetic entities, as well as their closely related biomedical entities and relations as expressed in natural language. In addition, the schema connects different scales of biological information, and provides mappings from the textual information to existing ontologies, which are essential in biology for integration, organization, dissemination and knowledge management of heterogeneous phenotypic information. A common comprehensive representation for otherwise heterogeneous phenotypic and genetic datasets, such as the one proposed, is critical for advancing systems biology because it enables acquisition and reuse of unprecedented volumes of diverse types of knowledge and information from text. RESULTS: A novel representational schema, PGschema, was developed that enables translation of phenotypic, genetic and their closely related information found in textual narratives to a well-defined data structure comprising phenotypic and genetic concepts from established ontologies along with modifiers and relationships. Evaluation for coverage of a selected set of entities showed that 90% of the information could be represented (95% confidence interval: 86-93%; n = 268). Moreover, PGschema can be expressed automatically in an XML format using natural language techniques to process the text. 
To our knowledge, we are providing the first evaluation of a translational schema for NLP that contains declarative knowledge about genes and their associated biomedical data (e.g. phenotypes). AVAILABILITY: http://zellig.cpmc.columbia.edu/PGschema
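
The coverage figure quoted in the PGschema abstract above (90%, 95% confidence interval 86-93%, n = 268) can be checked approximately with a standard interval for a binomial proportion. The sketch below is illustrative only. The abstract states neither the underlying count nor the interval method, so the count of 241 out of 268 and the use of the Wilson score interval are assumptions made here, and `wilson_interval` is a hypothetical helper name.

```python
import math


def wilson_interval(successes, n, z=1.96):
    """Return the Wilson score interval for a binomial proportion.

    z = 1.96 corresponds to a two-sided 95% confidence level.
    """
    p = successes / n
    denom = 1 + z**2 / n
    center = (p + z**2 / (2 * n)) / denom
    half_width = z * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return center - half_width, center + half_width


# Assumption: 241 of the 268 evaluated entities were representable,
# which rounds to the 90% reported in the abstract.
successes, n = 241, 268
low, high = wilson_interval(successes, n)
print(f"{successes / n:.0%} (95% CI: {low:.0%}-{high:.0%})")
```

Under these assumptions the interval rounds to the same 86-93% range quoted in the abstract, although other interval methods may give a similar result.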

16.
The last decade has witnessed extensive and widespread changes in scientific technologies that have impacted significantly upon the study of the life sciences. Arguably, the biggest advances in our comprehension of simple and complex biological processes have come as a consequence of obtaining the complete DNA sequence of organisms. It is likely that we will become accustomed to hearing of quantum leaps in the study and understanding of the biology of higher eukaryotes in the coming years, now that (near) complete genome sequences are available for man, mouse and rat. In this review, we will discuss the impact of genome sequence data, and the use of new scientific technologies that have emerged largely as a consequence of the availability of this information, on the study of the master regulator of sporulation, Spo0A, in low G+C Gram-positive endospore-forming bacteria.

17.

18.
The immense growth of MEDLINE, coupled with the realization that a vast amount of biomedical knowledge is recorded in free-text format, has led to the appearance of a large number of literature mining techniques aiming to extract biomedical terms and their inter-relations from the scientific literature. Ontologies have been extensively utilized in the biomedical domain either as controlled vocabularies or to provide the framework for mapping relations between concepts in biology and medicine. Literature-based approaches and ontologies have been used in the past for the purpose of hypothesis generation in connection with drug discovery. Here, we review the application of literature mining and ontology modeling and traversal to the area of drug repurposing (DR). In recent years, DR has emerged as a noteworthy alternative to the traditional drug development process, in response to the decreased productivity of the biopharmaceutical industry. Thus, systematic approaches to DR have been developed, involving a variety of in silico, genomic and high-throughput screening technologies. Attempts to integrate literature mining with other types of data arising from the use of these technologies as well as visualization tools assisting in the discovery of novel associations between existing drugs and new indications will also be presented.

19.
20.
An understanding of heart development is critical in any systems biology approach to cardiovascular disease. The interpretation of data generated from high-throughput technologies (such as microarray and proteomics) is also essential to this approach. However, characterizing the role of genes in the processes underlying heart development and cardiovascular disease involves the non-trivial task of data analysis and integration of previous knowledge. The Gene Ontology (GO) Consortium provides structured controlled biological vocabularies that are used to summarize previous functional knowledge for gene products across all species. One aspect of GO describes biological processes, such as development and signaling. In order to support high-throughput cardiovascular research, we have initiated an effort to fully describe heart development in GO; expanding the number of GO terms describing heart development from 12 to over 280. This new ontology describes heart morphogenesis, the differentiation of specific cardiac cell types, and the involvement of signaling pathways in heart development. This work also aligns GO with the current views of the heart development research community and its representation in the literature. This extension of GO allows gene product annotators to comprehensively capture the genetic program leading to the developmental progression of the heart. This will enable users to integrate heart development data across species, resulting in the comprehensive retrieval of information about this subject. The revised GO structure, combined with gene product annotations, should improve the interpretation of data from high-throughput methods in a variety of cardiovascular research areas, including heart development, congenital cardiac disease, and cardiac stem cell research. Additionally, we invite the heart development community to contribute to the expansion of this important dataset for the benefit of future research in this area.
