Similar literature
1.
2.

Background  

Prediction of protein subcellular localization generally involves many complex factors, and using only one or two kinds of data may not tell the whole story. For this reason, some recent predictive models are deliberately designed to integrate multiple heterogeneous data sources in order to exploit multi-aspect protein feature information. Gene ontology, hereinafter referred to as GO, uses a controlled vocabulary to describe biological molecules or gene products in terms of biological process, molecular function and cellular component. With the rapid expansion of annotated protein sequences, gene ontology has become a general protein feature that can be used to construct predictive models in computational biology. Existing models generally either concatenate the GO terms into a flat binary vector or apply majority-vote-based ensemble learning for protein subcellular localization; neither approach can estimate the individual discriminative abilities of the three aspects of gene ontology.
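The flat binary encoding mentioned above can be sketched as follows. This is a minimal illustration, not the method of any cited model: the vocabulary, the GO identifiers chosen, and the function names `encode_per_aspect` and `encode_flat` are all assumptions made for the example. It shows why concatenation hides the per-aspect structure: once the three blocks are joined, a downstream model no longer sees which bits came from which GO aspect.

```python
from typing import Dict, List, Set

# Hypothetical, tiny GO vocabulary grouped by the three GO aspects.
# The identifiers are illustrative placeholders for this sketch.
VOCAB: Dict[str, List[str]] = {
    "biological_process": ["GO:0006412", "GO:0006810"],
    "molecular_function": ["GO:0003677", "GO:0005524"],
    "cellular_component": ["GO:0005634", "GO:0005737"],
}


def encode_per_aspect(terms: Set[str]) -> Dict[str, List[int]]:
    """Return one binary vector per GO aspect (1 = term annotated)."""
    return {
        aspect: [1 if go_id in terms else 0 for go_id in go_ids]
        for aspect, go_ids in VOCAB.items()
    }


def encode_flat(terms: Set[str]) -> List[int]:
    """Concatenate the three per-aspect vectors into one flat vector."""
    per_aspect = encode_per_aspect(terms)
    return [bit for aspect in VOCAB for bit in per_aspect[aspect]]


if __name__ == "__main__":
    annotations = {"GO:0003677", "GO:0005634"}
    print(encode_per_aspect(annotations))
    print(encode_flat(annotations))
```

Keeping the per-aspect vectors separate is one way a model could weight each aspect individually; how the cited work actually does this is not stated in the abstract.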

3.

Background  

The exponential growth of research in molecular biology has brought a concomitant proliferation of databases for storing its findings. A variety of protein sequence databases exist. While all of these strive for completeness, the range of user interests is often beyond their scope. Large databases covering a broad range of domains tend to offer less detailed information than smaller, more specialized resources, often creating a need to combine data from many sources in order to obtain a complete picture. Scientific researchers are continually developing new specific databases to enhance their understanding of biological processes.

4.
In the business and healthcare sectors data warehousing has provided effective solutions for information usage and knowledge discovery from databases. However, data warehousing applications in the biological research and development (R&D) sector are lagging far behind. The fuzziness and complexity of biological data represent a major challenge in data warehousing for molecular biology. By combining experiences in other domains with our findings from building a model database, we have defined the requirements for data warehousing in molecular biology.

5.
Recent advances in genomics and structural biology have resulted in an unprecedented increase in biological data available from Internet-accessible databases. In order to help students effectively use this vast repository of information, undergraduate biology students at Drake University were introduced to bioinformatics software and databases in three courses, beginning with an introductory course in cell biology. The exercises and projects that were used to help students develop literacy in bioinformatics are described. In a recently offered course in bioinformatics, students developed their own simple sequence analysis tool using the Perl programming language. These experiences are described from the point of view of the instructor as well as the students. A preliminary assessment has been made of the degree to which students had developed a working knowledge of bioinformatics concepts and methods. Finally, some conclusions have been drawn from these courses that may be helpful to instructors wishing to introduce bioinformatics within the undergraduate biology curriculum.

6.
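With the development of modern biology, a large number of biological data-sharing centers have been established worldwide; at the same time, driven by advances in biology, plant genetic resource data have become more complex, heterogeneous, and massive. Building on an analysis of several well-known data integration and sharing centers in China and abroad, this paper briefly introduces the concept of ontology and the state of its research in biology, proposes using bio-ontologies to integrate plant genetic data, data mining tools, scientific literature, and scientific exchange, and discusses several issues that data integration must take into account.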

7.
8.
9.
10.
11.
The nematode Caenorhabditis elegans is used extensively by scientists to study a wide variety of biological processes and is one of the most thoroughly characterized animals. Over the years, the community of C. elegans researchers has generated a wealth of information on the genetics, development, behaviour, and cellular and molecular biology of the worm. This body of data has grown even larger with the recent application of high throughput screening methodology to study gene function, expression and interactions. WormBase (http://www.wormbase.org) is the primary online source of biological data on C. elegans and related nematodes. Equipped with an assortment of powerful search tools, WormBase allows users to quickly extract a variety of information, including data on individual genes, DNA sequence, cell lineage and literature citations. As the database is well maintained and the functionalities constantly modified in response to evolving researcher needs, WormBase has become a vital component of the laboratories studying the worm and a model for other biological databases.

12.

Background

Modeling in systems biology is vital for understanding the complexity of biological systems across scales and predicting system-level behaviors. To obtain high-quality pathway databases, it is essential to improve the efficiency of model validation and model update based on appropriate feedback.

Results

We have developed a new method to guide the creation of novel high-quality biological pathways using rule-based validation. Rules are defined to correct models with respect to biological semantics and to improve models for dynamic simulation. In this work, we have defined 40 rules that constrain event-specific participants and their related features and that add missing processes based on biological events. This approach is applied to data in Cell System Ontology, a comprehensive ontology that represents complex biological pathways with dynamics and visualization. The experimental results show that these relatively simple rules can efficiently detect errors made during curation, such as misassignment and misuse of ontology concepts and terms in curated models.

Conclusions

A new rule-based approach has been developed to facilitate model validation and model complementation. Our rule-based validation, which embeds biological semantics, enables us to provide high-quality curated biological pathways. This approach can serve as a preprocessing step for model integration, data exchange and extraction, and simulation.
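The idea of rules that constrain event-specific participants can be sketched as below. This is a hedged illustration only: the `Event` structure, the role names, and the single example rule are hypothetical and are not taken from the 40 rules or from Cell System Ontology, whose actual representation the abstract does not describe.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional


@dataclass
class Event:
    """Hypothetical curated pathway event with the roles of its participants."""
    name: str
    kind: str
    roles: List[str] = field(default_factory=list)


# A rule inspects one event and returns an error message, or None if it passes.
Rule = Callable[[Event], Optional[str]]


def require_roles(kind: str, required: List[str]) -> Rule:
    """Build a rule: events of the given kind must include every required role."""
    def rule(event: Event) -> Optional[str]:
        if event.kind != kind:
            return None  # the rule does not apply to this kind of event
        missing = [role for role in required if role not in event.roles]
        if missing:
            return f"{event.name}: missing {', '.join(missing)}"
        return None
    return rule


def validate(events: List[Event], rules: List[Rule]) -> List[str]:
    """Apply every rule to every event and collect the error messages."""
    errors: List[str] = []
    for event in events:
        for rule in rules:
            message = rule(event)
            if message is not None:
                errors.append(message)
    return errors


if __name__ == "__main__":
    # Assumed example rule: a phosphorylation needs an enzyme and a substrate.
    rules = [require_roles("phosphorylation", ["enzyme", "substrate"])]
    events = [
        Event("e1", "phosphorylation", ["enzyme"]),
        Event("e2", "binding", ["ligand", "receptor"]),
    ]
    for error in validate(events, rules):
        print(error)
```

Expressing each rule as a small function keeps the rule set easy to extend; a rule that adds missing processes, as the abstract mentions, would need to return a proposed fix rather than only a message.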

13.
The information explosion in biology makes it difficult for researchers to stay abreast of current biomedical knowledge and to make sense of the massive amounts of online information. Ontologies (specifications of the entities, their attributes, and the relationships among the entities in a domain of discourse) are increasingly enabling biomedical researchers to accomplish these tasks. In fact, bio-ontologies are beginning to proliferate in step with accruing biological data. The myriad ontologies being created not only enable researchers to solve some of the problems in handling the data explosion but also introduce new challenges. One of the key difficulties in realizing the full potential of ontologies in biomedical research is the isolation of the various communities involved: some workers spend their careers developing ontologies and ontology-related tools, while few researchers (biologists and physicians) know how ontologies can accelerate their research. The objective of this review is to give an overview of biomedical ontology in practical terms by providing a functional perspective, describing how bio-ontologies can be and are being used. As biomedical scientists begin to recognize the many different ways ontologies enable biomedical research, they will drive the emergence of new computer applications that will help them exploit the wealth of research data now at their fingertips.

14.
EBI databases and services   Cited by: 2 (self-citations: 0; citations by others: 2)
The EMBL Outstation-European Bioinformatics Institute (EBI) is a center for research and services in bioinformatics. It serves researchers in molecular biology, genetics, medicine, and agriculture from academia, and the agricultural, biotechnology, chemical, and pharmaceutical industries. The Institute manages and makes available databases of biological data including nucleic acid, protein sequences, and macromolecular structures. It provides to this community bioinformatics services relevant to molecular biology free of charge over the Internet. Some of these databases and services are described in this review. For more information, visit the EBI Web server at http://www.ebi.ac.uk/.

15.
In the wake of the numerous now-fruitful genome projects, we have witnessed a 'tsunami' of sequence data and with it the birth of the field of bioinformatics. Bioinformatics involves the application of information technology to the management and analysis of biological data. For many of us, this means that databases and their search tools have become an essential part of the research environment. However, the rate of sequence generation and the haphazard proliferation of databases have made it difficult to keep pace with developments, even for the cognoscenti. Moreover, increasing amounts of sequence information do not necessarily equate with an increase in knowledge, and in the panic to automate the route from raw data to biological insight, we may be generating and propagating innumerable errors in our precious databases. In the genome era upon us, researchers want rapid, easy-to-use, reliable tools for functional characterisation of newly determined sequences. For the pharmaceutical industry in particular, the Pandora's box of bioinformatics harbours an information-rich nugget, ripe with potential drug targets and possible new avenues for the development of therapeutic agents. This review outlines the current status of the major pattern databases now used routinely in the analysis of protein sequences. The review is divided into three main sections. In the first, commonly used terms are defined and the methods behind the databases are briefly described; in the second, the structure and content of the principal pattern databases are discussed; and in the final part, several alignment databases, which are frequently confused with pattern databases, are mentioned. For the new-comer, the array of resources, the range of methods behind them and the different tools required to search them can be confusing. 
The review therefore also briefly mentions a current international endeavour to integrate the diverse databases, which effort should facilitate sequence analysis in the future. This is particularly important for target-discovery programmes, where the challenge is to rationalise the enormous numbers of potential targets generated by sequence database searches. This problem may be addressed, at least in part, by reducing search outputs to the more focused and manageable subsets suggested by searches of integrated groups of family-specific pattern databases.

16.
Lars Vogt. Zoomorphology, 2009, 128(3): 201-217
Due to the lack of common data standards, the communicability and comparability of biological data across various levels of organization and taxonomic groups are continuously decreasing. However, the interdependence between molecular and higher levels of organization is of growing interest and calls for cooperation between biologists from different methodological and theoretical backgrounds. A general data standard in biology would greatly facilitate such cooperation. This article examines the role that defined and formalized vocabularies (i.e., ontologies) could have in developing such a data standard. I suggest basic criteria for developing data standards by distinguishing content, concept, nomenclatural, and format standards, and discuss the role of databases and their use of bio-ontologies in current activities for data standardization in biology. General principles of ontology development are introduced, including foundational ontology properties (e.g. class–subclass, parthood) and how concepts are defined. After addressing problems that are specific to morphological data, I introduce the notion of a general structure concept for morphology and explain why it is required for developing a morphological ontology. I then discuss why a general morphological ontology needs to be taxon-independent and free of homology assumptions, and how it can solve the problems of morphology. The article concludes with an outlook on how the use of ontologies will likely establish some sort of general data standard in biology, and why the development of a set of commonly used foundational ontology properties and the use of globally unique identifiers for all classes defined in ontologies are crucial for its success.

17.
In biology, ontology applications involve large amounts of genetic information and chemical information on molecular structure, so each ontology concept conveys a great deal of information. Consequently, in mathematical terms, the vector corresponding to an ontology concept often has very high dimension, which places greater demands on ontology algorithms. Against this background, we consider the design of an ontology sparse vector algorithm and its application in biology. In this paper, using marginal likelihood and marginal distribution, we present an optimization strategy for a marginal-based ontology sparse vector learning algorithm. Finally, the new algorithm is applied to the gene ontology and the plant ontology to verify its efficiency.

18.
The increasing availability of data related to genes, proteins and their modulation by small molecules has provided a vast amount of biological information leading to the emergence of systems biology and the broad use of simulation tools for data analysis. However, there is a critical need to develop cheminformatics tools that can integrate chemical knowledge with these biological databases and simulation approaches, with the goal of creating systems chemical biology.

19.
Toward understanding the origin and evolution of cellular organisms   Cited by: 1 (self-citations: 0; citations by others: 1)
In this era of high‐throughput biology, bioinformatics has become a major discipline for making sense of large‐scale datasets. Bioinformatics is usually considered a practical field that develops databases and software tools to support other fields, rather than a fundamental scientific discipline for uncovering principles of biology. The KEGG resource that we have been developing is a reference knowledge base for biological interpretation of genome sequences and other high‐throughput data. It is now one of the most utilized biological databases because of its practical value. For me personally, KEGG is a step toward understanding the origin and evolution of cellular organisms.

20.

Background  

The goal of information integration in systems biology is to combine information from a number of databases and data sets, which are obtained from both high and low throughput experiments, under one data management scheme such that the cumulative information provides greater biological insight than is possible with individual information sources considered separately.
