Similar articles
Found 20 similar articles (search time: 31 ms)
1.
2.
3.
Pal D. Bioinformation, 2006, 1(3): 97-98
Function annotation involves more than associating a gene with a structured vocabulary that describes its action. Rather, the details of the actions, their components, and their larger context are of direct relevance, because they help us understand the biological system to which the gene or protein belongs. Currently, the Gene Ontology (GO) Consortium offers the most comprehensive set of relationships for describing gene/protein activity. However, its choice to segregate the gene ontology into the subdomains of molecular function, biological process and cellular component creates significant limitations on its future scope of use. If we are to understand biology in its total complexity, comprehensive ontologies covering larger biological domains are essential. A vigorous discussion of this topic is necessary for the larger benefit of the biological community. I highlight this point because larger-bio-domain ontologies cannot be created simply by integrating subdomain ontologies. Relationships in larger-bio-domain ontologies are more complex owing to the larger size of the system and are therefore more labor-intensive to create. The current limitations of GO will be a handicap in deriving more complex relationships from high-throughput biology data.

4.
An understanding of heart development is critical in any systems biology approach to cardiovascular disease. The interpretation of data generated from high-throughput technologies (such as microarray and proteomics) is also essential to this approach. However, characterizing the role of genes in the processes underlying heart development and cardiovascular disease involves the non-trivial task of data analysis and integration of previous knowledge. The Gene Ontology (GO) Consortium provides structured controlled biological vocabularies that are used to summarize previous functional knowledge for gene products across all species. One aspect of GO describes biological processes, such as development and signaling. In order to support high-throughput cardiovascular research, we have initiated an effort to fully describe heart development in GO, expanding the number of GO terms describing heart development from 12 to over 280. This new ontology describes heart morphogenesis, the differentiation of specific cardiac cell types, and the involvement of signaling pathways in heart development. This work also aligns GO with the current views of the heart development research community and its representation in the literature. This extension of GO allows gene product annotators to comprehensively capture the genetic program leading to the developmental progression of the heart. This will enable users to integrate heart development data across species, resulting in the comprehensive retrieval of information about this subject. The revised GO structure, combined with gene product annotations, should improve the interpretation of data from high-throughput methods in a variety of cardiovascular research areas, including heart development, congenital cardiac disease, and cardiac stem cell research. Additionally, we invite the heart development community to contribute to the expansion of this important dataset for the benefit of future research in this area.

5.

Background  

Prediction of protein subcellular localization generally involves many complex factors, and using only one or two aspects of the data may not tell the true story. For this reason, some recent predictive models are deliberately designed to integrate multiple heterogeneous data sources in order to exploit multi-aspect protein feature information. Gene ontology, hereinafter referred to as GO, uses a controlled vocabulary to describe biological molecules or gene products in terms of biological process, molecular function and cellular component. With the rapid expansion of annotated protein sequences, gene ontology has become a general protein feature that can be used to construct predictive models in computational biology. Existing models for protein subcellular localization generally either concatenate the GO terms into a flat binary vector or apply majority-vote-based ensemble learning, neither of which can estimate the individual discriminative abilities of the three aspects of gene ontology.
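
To make the limitation concrete, the following is a minimal sketch of the flat binary encoding mentioned above, next to a per-aspect encoding that keeps the three GO aspects separate. The function names, the tiny vocabulary and the example annotation set are illustrative assumptions, not taken from any of the models the abstract refers to.

```python
def encode_flat(annotations, vocabulary):
    """Binary vector: 1 where the protein carries the GO term, else 0."""
    return [1 if term in annotations else 0 for term in vocabulary]


def encode_by_aspect(annotations, vocab_by_aspect):
    """One binary vector per GO aspect instead of a single flat vector."""
    return {
        aspect: encode_flat(annotations, vocab)
        for aspect, vocab in vocab_by_aspect.items()
    }


# Toy vocabulary (assumption): a handful of GO terms grouped by aspect.
vocab_by_aspect = {
    "biological_process": ["GO:0006355"],
    "molecular_function": ["GO:0003677"],
    "cellular_component": ["GO:0005634", "GO:0005739"],
}
# Hypothetical annotations for one protein.
annotations = {"GO:0003677", "GO:0005634"}

# Flat encoding: aspect boundaries are lost after concatenation.
flat_vocabulary = [t for vocab in vocab_by_aspect.values() for t in vocab]
flat = encode_flat(annotations, flat_vocabulary)

# Per-aspect encoding: each aspect can be scored or weighted separately.
by_aspect = encode_by_aspect(annotations, vocab_by_aspect)
```

Keeping one vector per aspect is what would allow a downstream model to assess how much each aspect contributes on its own, which a single concatenated vector does not expose directly.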

6.
7.
Ecological Complexity, 2008, 5(3): 272-279
As ecological data increase in breadth, depth, and complexity, the discipline of ecology is increasingly influenced by information science. While this influence provides many opportunities for ecologists, it also necessitates a change in how we manage and share data and, perhaps more fundamentally, in how we define concepts in ecology. Specifically, the information technology process of automated data integration depends entirely upon consistent concept definition. An ontology is a common tool in computer science and engineering for specifying meanings; it is novel to ecology and offers the field significant potential. An ontology is a formal representation of knowledge in which concepts are described by their meaning and their relationship to each other. Ontologies can be used to ‘explicitly specify a concept’ (Gruber, 1993), an approach that is uncommon in ecology. In this paper, we develop an ontology for the concept of ‘landscape’ that captures the most general definitions and usages of this term. We selected the concept of landscape because investigators often use it in very different ways, which generates linguistic uncertainty. A graphic theoretic (i.e., visual) model is provided that describes the set of structuring rules we used to define the relationships between ‘landscape’ and appropriately related terms. Based upon these rules, a landscape necessarily contains a spatial component (i.e., area), structure and function (i.e., ecosystems), and is scale independent. This approach provides the set of necessary conditions for landscape studies to reduce linguistic uncertainty and to facilitate interoperability of data, that is, in a manner that promotes data linkages and quantitative synthesis, particularly by the automatic data synthesis programs that are likely to become an important part of ecology in the future.
Simply put, we use an ontology, a technique novel to ecology but not to other disciplines, to define ‘landscape,’ thereby clearly delineating one subset of its potential general usage. As such, this ontology can serve as both a checklist for landscape studies and a blueprint for additional ecological ontologies.

8.
We develop a new approach to weighting gene ontology (GO) terms for predicting protein subcellular localization. The weights of individual GO terms, corresponding to their contribution to the prediction algorithm, are determined by the term-weighting methods used in text categorization. We evaluate several term-weighting methods, which are based on inverse document frequency, information gain, gain ratio, odds ratio, and chi-square and its variants. Additionally, we propose a new term-weighting method based on the logarithmic transformation of chi-square. The proposed term-weighting method performs better than the other term-weighting methods and also outperforms state-of-the-art subcellular prediction methods. Our proposed method achieves overall accuracies of 98.1%, 99.3%, 98.1%, 98.1%, and 95.9% for the animal BaCelLo independent dataset (IDS), fungal BaCelLo IDS, animal Höglund IDS, fungal Höglund IDS, and PLOC dataset, respectively. Furthermore, the close correlation between highly weighted GO terms and subcellular localizations suggests that our proposed method appropriately weights GO terms according to their relevance to the localizations.
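
As a rough illustration of chi-square-based term weighting, the sketch below scores each GO term against one target localization using the standard 2x2 chi-square statistic from text categorization and then applies a logarithm. The abstract does not give the exact transformation, so the form log(1 + chi-square) is an assumption, as are the function names and the toy annotations.

```python
import math
from collections import Counter


def chi_square(a, b, c, d):
    """Chi-square statistic for a 2x2 contingency table.

    a: proteins in the target class that carry the term
    b: proteins outside the class that carry the term
    c: proteins in the class without the term
    d: proteins outside the class without the term
    """
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        return 0.0
    return n * (a * d - b * c) ** 2 / denom


def log_chi_square_weights(proteins, target):
    """Weight each GO term by log(1 + chi-square) for one localization.

    proteins: list of (set_of_go_terms, localization) pairs.
    The log(1 + x) form is an assumed stand-in for the unspecified
    logarithmic transformation.
    """
    n_in = sum(1 for _, loc in proteins if loc == target)
    n_out = len(proteins) - n_in
    in_counts, out_counts = Counter(), Counter()
    for terms, loc in proteins:
        (in_counts if loc == target else out_counts).update(terms)
    weights = {}
    for term in set(in_counts) | set(out_counts):
        a, b = in_counts[term], out_counts[term]
        weights[term] = math.log1p(chi_square(a, b, n_in - a, n_out - b))
    return weights


# Toy training set (assumption): four proteins, two localizations.
proteins = [
    ({"GO:0005634", "GO:0003677"}, "nucleus"),
    ({"GO:0005634"}, "nucleus"),
    ({"GO:0005739", "GO:0003677"}, "mitochondrion"),
    ({"GO:0005739"}, "mitochondrion"),
]
weights = log_chi_square_weights(proteins, "nucleus")
```

In this toy set, a term that appears equally often inside and outside the target class receives a weight of zero. Chi-square is symmetric, so a term that occurs only outside the class scores as highly as one that occurs only inside it; a classifier using these weights would need the sign of the association from elsewhere.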

9.
Standard microbial evolutionary ontology is organized according to a nested hierarchy of entities at various levels of biological organization. It typically detects and defines these entities in relation to the most stable aspects of evolutionary processes, by identifying lineages evolving by a process of vertical inheritance from an ancestral entity. However, recent advances in microbiology indicate that such an ontology has important limitations. The various dynamics detected within microbiological systems reveal that a focus on the most stable entities (or features of entities) over time inevitably underestimates the extent and nature of microbial diversity. These dynamics are not the outcome of the process of vertical descent alone. Other processes, often involving causal interactions between entities from distinct levels of biological organisation, or operating at different time scales, are responsible not only for the destabilisation of pre-existing entities, but also for the emergence and stabilisation of novel entities in the microbial world. In this article we consider microbial entities as more or less stabilised functional wholes, and sketch a network-based ontology that can represent a diverse set of processes including, for example, not only phylogenetic relations but also interactions that stabilise or destabilise the interacting entities, spatial relations, ecological connections, and genetic exchanges. We use this pluralistic framework (i) for evaluating the existing ontological assumptions in evolution (e.g. whether currently recognized entities are adequate for understanding the causes of change and stabilisation in the microbial world), and (ii) for identifying hidden ontological kinds, essentially invisible from within a more limited perspective.
We propose to recognize additional classes of entities that provide new insights into the structure of the microbial world, namely “processually equivalent” entities, “processually versatile” entities, and “stabilized” entities.

10.
In the process of cell division, a large number of proteins are assembled into three distinct organelles, namely the midbody, centrosome and kinetochore. Knowing the localization of microkit (midbody, centrosome and kinetochore) proteins will facilitate drug target discovery and provide novel insights into their functions. In this study, a support vector machine (SVM) model, MicekiPred, is presented to predict the localization of microkit proteins based on gene ontology (GO) information. A total accuracy of 77.51% was achieved using jackknife cross-validation. This result shows that the model will be an effective complementary tool for future experimental study. The prediction model and dataset used in this article can be freely downloaded from http://cobi.uestc.edu.cn/people/hlin/tools/MicekiPred/.

11.
The Gene Ontology Categorizer, developed jointly by the Los Alamos National Laboratory and Procter & Gamble Corp., provides a capability for the categorization task in the Gene Ontology (GO): given a list of genes of interest, what are the best nodes of the GO to summarize or categorize that list? The motivating question comes from the drug discovery process, where, after a gene expression analysis experiment, we wish to understand the overall effect of some cell treatment or condition by identifying 'where' in the GO the differentially expressed genes fall: are they 'clustered' together in one place, in two places, or spread uniformly throughout the GO? Are they 'high' or 'low'? In order to address this need, we view bio-ontologies more as combinatorially structured databases than as facilities for logical inference, and draw on the discrete mathematics of finite partially ordered sets (posets) to develop data representations and algorithms appropriate for the GO. In doing so, we have laid the foundations for a general set of methods to address not just the categorization task, but also other tasks (e.g. distances in ontologies and ontology merger and exchange) in both the GO and other bio-ontologies (such as the Enzyme Commission database or the Medical Subject Headings) cast as hierarchically structured taxonomic knowledge systems.

12.
MOTIVATION: Anatomy ontologies have a growing role in bioinformatics, for example, in indexing gene expression data in model organisms. To relate or draw conclusions from data so indexed, anatomy ontologies must be equipped with a formal vocabulary that allows statements about meronomy to be qualified by constraints such as 'part of the male' or 'part at the embryonic stage'. Lacking such a vocabulary, anatomists have built this information into the structure of the ontology or into anatomical terms. For example, in the FlyBase anatomy for Drosophila, the term 'larval abdominal segment' encodes the stage in the term, while the terms 'male genital disc' and 'female genital disc' encode the sex. It remains implicit that a fly has one and only one of these parts during its larval stage. Such indicators of context can and should be represented explicitly in the ontology. RESULTS: The framework we have defined for anatomical ontologies allows the canonical anatomy structures of a given species to be those common to all sexes, and to have either male, female or hermaphrodite parts, but not combinations of the latter. Temporal aspects of development are addressed by associating a stage with organism parts and requiring a connected anatomy to have parts that exist at a common stage. Both sex and anatomical stage are represented by attributes. This formalization clarifies ontological structure and meaning and increases the capacity for formal reasoning about anatomy. The framework also supports generalizations such as vertebrate and invertebrate, thereby allowing the representation of anatomical structures that are common across a sub-phylum.

13.
14.
Small molecules play a crucial role in the modulation of biological functions by interacting with specific macromolecules. Hence, small-molecule interactions are captured by a variety of experimental methods in order to estimate and propose correlations between molecular structures and their biological activities. The tremendous expansion in publicly available small molecules is also driving new efforts to better understand interactions involving small molecules, particularly in the areas of drug docking and pharmacogenomics. We have studied and designed a functional group identification system with an associated ontology. The functional group identification system can detect the functional group components of a given ligand structure with specific coordinate information. The functional group ontology (FGO) that we propose is a structured classification of chemical functional groups that acts as an important source of prior knowledge and may be automatically integrated to support identification, categorization and predictive data analysis tasks. We have used a new annotation method that can be used to reconstruct the original structure from a given ontological expression using exact coordinate information. Here, we also discuss an ontology-driven similarity measure for functional groups and uses of this novel ontology for pharmacophore searching and de novo ligand design.

15.
Structuring an event ontology for disease outbreak detection   Cited by: 1 (self-citations: 0; citations by others: 1)
BACKGROUND: This paper describes the design of an event ontology being developed for application in the machine understanding of infectious disease-related events reported in natural language text. This event ontology is designed to support timely detection of disease outbreaks and rapid judgment of their alerting status by 1) bridging the gap between the layman's language used in disease outbreak reports and public health experts' deep knowledge, and 2) making multi-lingual information available. CONSTRUCTION AND CONTENT: This event ontology integrates a model of experts' knowledge for disease surveillance together with sets of linguistic expressions that denote disease-related events and formal definitions of events. In this ontology, rather general event classes, which are suitable for application to language-oriented tasks such as recognition of event expressions, are placed at the upper level, and more specific events of interest to the experts are at the lower level. Each class is related to other classes that represent participants in events, and is linked with multi-lingual synonym sets and axioms. CONCLUSIONS: We consider that the design of the event ontology and the methodology introduced in this paper are applicable to other domains that require integration of natural language information and machine support for experts to assess it. The first version of the ontology, with about 40 concepts, will be available in March 2008.

16.
The Microarray Gene Expression Data (MGED) society was formed with an initial focus on experiments involving microarray technology. Despite the diversity of applications, there are common concepts used and a common need to capture experimental information in a standardized manner. In building the MGED ontology, it was recognized that it would be impractical to cover all the different types of experiments on all the different types of organisms by listing and defining all the types of organisms and their properties. Our solution was to create a framework for describing microarray experiments with an initial focus on the biological sample and its manipulation. For concepts that are common to many species, we could provide a manageable listing of controlled terms. For concepts that are species-specific or whose values cannot be readily listed, we created an 'OntologyEntry' concept that references an external resource. The MGED ontology is a work in progress that needs additional instances and particularly needs constraints to be added. The ontology currently covers the experimental sample and design, and we have begun capturing aspects of the microarrays themselves as well. The primary application of the ontology will be to develop forms for entering information into databases and, consequently, to allow queries that take advantage of the structure provided by the ontology. The application of an ontology of experimental conditions extends beyond microarray experiments and, as the scope of MGED includes other aspects of functional genomics, so too will the MGED ontology.

17.
Yang JO, Charny P, Lee B, Kim S, Bhak J, Woo HG. Bioinformation, 2007, 2(5): 194-196
GS2PATH is a Web-based pipeline tool for the functional enrichment of a given gene set from prior knowledge databases, including the gene ontology (GO) database and biological pathway databases. The tool also provides an estimation of gene set enrichment, in GO terms, from the databases of the KEGG and BioCarta pathways, which may allow users to compute and compare functional over-representations. This is especially useful from the perspective of biological pathways such as metabolic, signal transduction, genetic information processing, environmental information processing, cellular process, disease, and drug development. It provides relevant images of biochemical pathways in which the gene set is highlighted in customized colors, which can directly assist in the visualization of functional alteration.

Availability


18.
MOTIVATION: Primary immunodeficiency diseases (PIDs) are Mendelian conditions of high phenotypic complexity and low incidence. They usually manifest in toddlers and infants, although they can also occur much later in life. Information about PIDs is often widely scattered throughout the clinical and research literature and is hard to find for both generalists and experienced clinicians. Semantic Web technologies coupled to clinical information systems can go some way toward addressing this problem. Ontologies are a central component of such a system, containing and centralizing knowledge about primary immunodeficiencies in both a human- and computer-comprehensible form. The development of an ontology of PIDs is therefore a central step toward developing informatics tools that can support the clinician in the diagnosis and treatment of these diseases. RESULTS: We present PIDO, the primary immunodeficiency disease ontology. PIDO characterizes PIDs in terms of the phenotypes commonly observed by clinicians during the diagnostic process. Phenotype terms in PIDO are formally defined using complex definitions based on qualities, functions, processes and structures. We provide mappings to biomedical reference ontologies to ensure interoperability with ontologies in other domains. Based on PIDO, we developed the PIDFinder, an ontology-driven software prototype that can facilitate clinical decision support. PIDO connects immunological knowledge across resources within a common framework and thereby enables translational research and the development of medical applications for the domain of immunology and primary immunodeficiency diseases.

19.
We have recently mapped the Gene Ontology (GO), developed by the Gene Ontology Consortium, into the National Library of Medicine's Unified Medical Language System (UMLS). GO has been developed for the purpose of annotating gene products in genome databases, and the UMLS has been developed as a framework for integrating large numbers of disparate terminologies, primarily for the purpose of providing better access to biomedical information sources. The mapping of GO to UMLS highlighted issues in both terminology systems. After some initial explorations and discussions between the UMLS and GO teams, the GO was integrated with the UMLS. Overall, a total of 23% of the GO terms either matched directly (3%) or linked (20%) to existing UMLS concepts. All GO terms now have a corresponding, official UMLS concept, and the entire vocabulary is available through the web-based UMLS Knowledge Source Server. The mapping of the Gene Ontology, with its focus on structures, processes and functions at the molecular level, to the existing broad-coverage UMLS should contribute to linking the language and practices of clinical medicine to the language and practices of genomics.

20.

Background  

The use of ontologies to control vocabulary and structure annotation has added value to genome-scale data, and contributed to the capture and re-use of knowledge across research domains. Gene Ontology (GO) is widely used to capture detailed expert knowledge in genomic-scale datasets and as a consequence has grown to contain many terms, making it unwieldy for many applications. To increase its ease of manipulation and efficiency of use, subsets called GO slims are often created by collapsing terms upward into more general, high-level terms relevant to a particular context. Creation of a GO slim currently requires manipulation and editing of GO by an expert (or community) familiar with both the ontology and the biological context. Decisions about which terms to include are necessarily subjective, and the creation process itself and subsequent curation are time-consuming and largely manual.
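
The idea of collapsing terms upward can be sketched as a walk from a specific term toward the root that stops at the first slim term met on each path. The hierarchy below is a made-up toy with descriptive labels rather than real GO identifiers or real GO structure, and `map_to_slim` is a hypothetical helper, not part of any GO tool.

```python
def map_to_slim(term, parents, slim):
    """Collect the first slim term met on each upward path from `term`.

    parents: dict mapping a term to a list of its direct parents.
    slim: set of terms chosen for the slim.
    """
    found, seen, stack = set(), set(), [term]
    while stack:
        current = stack.pop()
        if current in seen:
            continue
        seen.add(current)
        if current in slim:
            found.add(current)  # stop climbing along this path
            continue
        stack.extend(parents.get(current, ()))
    return found


# Toy hierarchy (assumption): each term lists its direct parents.
parents = {
    "heart morphogenesis": ["heart development", "organ morphogenesis"],
    "heart development": ["organ development"],
    "organ morphogenesis": ["organ development",
                            "anatomical structure morphogenesis"],
    "organ development": ["developmental process"],
    "anatomical structure morphogenesis": ["developmental process"],
    "developmental process": ["biological process"],
}
# Hypothetical slim: the high-level terms chosen for some context.
slim = {
    "organ development",
    "anatomical structure morphogenesis",
    "biological process",
}

slim_terms = map_to_slim("heart morphogenesis", parents, slim)
```

Because a term can have several parents, one specific term may map to more than one slim term. Choosing which terms belong in `slim` is the subjective, manual step that the abstract describes; the traversal itself is mechanical.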
