Found 20 similar articles; search took 31 ms
1.
Rob Jelier Guido Jenster Lambert CJ Dorssers Bas J Wouters Peter JM Hendriksen Barend Mons Ruud Delwel Jan A Kors 《BMC bioinformatics》2007,8(1):14
Background
High-throughput experiments, such as with DNA microarrays, typically result in hundreds of genes potentially relevant to the process under study, rendering the interpretation of these experiments problematic. Here, we propose and evaluate an approach to find functional associations between large numbers of genes and other biomedical concepts from free-text literature. For each gene, a profile of related concepts is constructed that summarizes the context in which the gene is mentioned in literature. We assign a weight to each concept in the profile based on a likelihood ratio measure. Gene concept profiles can then be clustered to find related genes and other concepts.
2.
Background
Once specific genes are identified through high throughput genomics technologies there is a need to sort the final gene list to a manageable size for validation studies. The triaging and sorting of genes often relies on the use of supplemental information related to gene structure, metabolic pathways, and chromosomal location. Yet in disease states where the genes may not have identifiable structural elements, poorly defined metabolic pathways, or limited chromosomal data, flexible systems for obtaining additional data are necessary. In these situations having a tool for searching the biomedical literature using the list of identified genes while simultaneously defining additional search terms would be useful.
3.
Background
Microarray techniques survey gene expressions on a global scale. Extensive biomedical studies have been designed to discover subsets of genes that are associated with survival risks for diseases such as lymphoma and construct predictive models using those selected genes. In this article, we investigate simultaneous estimation and gene selection with right censored survival data and high dimensional gene expression measurements.
4.
Philip Stegmaier Mathias Krull Nico Voss Alexander E Kel Edgar Wingender 《BMC systems biology》2010,4(1):124
Background
The study of relationships between human diseases provides new possibilities for biomedical research. Recent achievements on human genetic diseases have stimulated interest to derive methods to identify disease associations in order to gain further insight into the network of human diseases and to predict disease genes.
5.
Background
As the use of microarray technology becomes more prevalent, it is not unusual to find several laboratories employing the same microarray technology to identify genes related to the same condition in the same species. Although the experimental specifics are similar, typically a different list of statistically significant genes results from each data analysis.
6.
Sampo Pyysalo Filip Ginter Juho Heimonen Jari Björne Jorma Boberg Jouni Järvinen Tapio Salakoski 《BMC bioinformatics》2007,8(1):50
Background
Lately, there has been a great interest in the application of information extraction methods to the biomedical domain, in particular, to the extraction of relationships of genes, proteins, and RNA from scientific publications. The development and evaluation of such methods requires annotated domain corpora.
7.
Background
Lysozymes are important model enzymes in biomedical research with a ubiquitous taxonomic distribution ranging from phages up to plants and animals. Their main function appears to be defence against pathogens, although some of them have also been implicated in digestion. Whereas most organisms have only few lysozyme genes, nematodes of the genus Caenorhabditis possess a surprisingly large repertoire of up to 15 genes.
8.
Kung Ahn Jae-Won Huh Sang-Je Park Dae-Soo Kim Hong-Seok Ha Yun-Ji Kim Ja-Rang Lee Kyu-Tae Chang Heui-Soo Kim 《BMC molecular biology》2008,9(1):78
Background
The rhesus monkey (Macaca mulatta) is a valuable and widely used model animal for biomedical research. However, quantitative analyses of rhesus gene expression profiles under diverse experimental conditions are limited by a shortage of suitable internal controls for the normalization of mRNA levels. In this study, we used a systematic approach for the selection of potential reference genes in the rhesus monkey and compared their suitability to that of the corresponding genes in humans.
9.
Background
With the biomedical literature continually expanding, searching PubMed for information about specific genes becomes increasingly difficult. Not only can thousands of results be returned, but gene name ambiguity leads to many irrelevant hits. As a result, it is difficult for life scientists and gene curators to rapidly get an overall picture about a specific gene from documents that mention its names and synonyms.
10.
Background
The task of recognizing and identifying species names in biomedical literature has recently been regarded as critical for a number of applications in text and data mining, including gene name recognition, species-specific document retrieval, and semantic enrichment of biomedical articles.
11.
Yang Jin Ryan T McDonald Kevin Lerman Mark A Mandel Steven Carroll Mark Y Liberman Fernando C Pereira Raymond S Winters Peter S White 《BMC bioinformatics》2006,7(1):492
Background
The rapid proliferation of biomedical text makes it increasingly difficult for researchers to identify, synthesize, and utilize developed knowledge in their fields of interest. Automated information extraction procedures can assist in the acquisition and management of this knowledge. Previous efforts in biomedical text mining have focused primarily upon named entity recognition of well-defined molecular objects such as genes, but less work has been performed to identify disease-related objects and concepts. Furthermore, promise has been tempered by an inability to efficiently scale approaches in ways that minimize manual efforts and still perform with high accuracy. Here, we have applied a machine-learning approach previously successful for identifying molecular entities to a disease concept to determine if the underlying probabilistic model effectively generalizes to unrelated concepts with minimal manual intervention for model retraining.
12.
Ramón Diaz-Uriarte 《BMC bioinformatics》2007,8(1):328
Background
Microarray data are often used for patient classification and gene selection. An appropriate tool for end users and biomedical researchers should combine user friendliness with statistical rigor, including carefully avoiding selection biases and allowing analysis of multiple solutions, together with access to additional functional information of selected genes. Methodologically, such a tool would be of greater use if it incorporates state-of-the-art computational approaches and makes source code available.
13.
Background
In recent years, mammalian protein-protein interaction network databases have been developed. The interactions in these databases are either extracted manually from low-throughput experimental biomedical research literature, extracted automatically from literature using techniques such as natural language processing (NLP), generated experimentally using high-throughput methods such as yeast-2-hybrid screens, or predicted using an assortment of computational approaches. Genes or proteins identified as significantly changing in proteomic experiments, or identified as susceptibility disease genes in genomic studies, can be placed in the context of protein interaction networks in order to assign these genes and proteins to pathways and protein complexes.
14.
Peter Hof Claudia Ortmeier Kirstin Pape Birgit Reitmaier Johannes Regenbogen Andreas Goppelt Joern-Peter Halle 《BMC genomics》2002,3(1):7-11
Background
Gene expression profiling among different tissues is of paramount interest in various areas of biomedical research. We have developed a novel method (DADA, Digital Analysis of cDNA Abundance), that calculates the relative abundance of genes in cDNA libraries.
15.
Background
Cardio-vascular diseases are the first cause of death worldwide, particularly in the developed countries; the identification of genes specifically expressed in the cardiac muscle is thus of major biomedical interest. In this study, we performed a comprehensive analysis of the expression profiles to identify genes over-expressed in the human adult heart using the public Expressed Sequence Tags (ESTs) database. The initial set of genes expressed in the heart was constructed by clustering and assembling ESTs from human adult heart cDNA libraries. Expression profiles were then generated for each of these genes by counting their cognate ESTs in all libraries. Differential expression was assessed by applying to these profiles a previously published statistical procedure.
16.
Background
Providing for long-term and consistent public access to scientific data is a growing concern in biomedical research. One aspect of this problem can be demonstrated by evaluating the persistence of supplementary data associated with published biomedical papers.
17.
18.
Hua Xu Marianthi Markatou Rositsa Dimova Hongfang Liu Carol Friedman 《BMC bioinformatics》2006,7(1):334-16
Background
Word sense disambiguation (WSD) is critical in the biomedical domain for improving the precision of natural language processing (NLP), text mining, and information retrieval systems because ambiguous words negatively impact accurate access to literature containing biomolecular entities, such as genes, proteins, cells, diseases, and other important entities. Automated techniques have been developed that address the WSD problem for a number of text processing situations, but the problem is still a challenging one. Supervised WSD machine learning (ML) methods have been applied in the biomedical domain and have shown promising results, but the results typically incorporate a number of confounding factors, and it is problematic to truly understand the effectiveness and generalizability of the methods because these factors interact with each other and affect the final results. Thus, there is a need to explicitly address the factors and to systematically quantify their effects on performance.
19.
Kevin M Livingston Michael Bada William A Baumgartner Jr Lawrence E Hunter 《BMC bioinformatics》2015,16(1)
Background
The ability to query many independent biological databases using a common ontology-based semantic model would facilitate deeper integration and more effective utilization of these diverse and rapidly growing resources. Despite ongoing work moving toward shared data formats and linked identifiers, significant problems persist in semantic data integration in order to establish shared identity and shared meaning across heterogeneous biomedical data sources.
Results
We present five processes for semantic data integration that, when applied collectively, solve seven key problems. These processes include making explicit the differences between biomedical concepts and database records, aggregating sets of identifiers denoting the same biomedical concepts across data sources, and using declaratively represented forward-chaining rules to take information that is variably represented in source databases and integrating it into a consistent biomedical representation. We demonstrate these processes and solutions by presenting KaBOB (the Knowledge Base Of Biomedicine), a knowledge base of semantically integrated data from 18 prominent biomedical databases using common representations grounded in Open Biomedical Ontologies. An instance of KaBOB with data about humans and seven major model organisms can be built using on the order of 500 million RDF triples. All source code for building KaBOB is available under an open-source license.
Conclusions
KaBOB is an integrated knowledge base of biomedical data representationally based in prominent, actively maintained Open Biomedical Ontologies, thus enabling queries of the underlying data in terms of biomedical concepts (e.g., genes and gene products, interactions and processes) rather than features of source-specific data schemas or file formats. KaBOB resolves many of the issues that routinely plague biomedical researchers intending to work with data from multiple data sources and provides a platform for ongoing data integration and development and for formal reasoning over a wealth of integrated biomedical data.
Electronic supplementary material
The online version of this article (doi:10.1186/s12859-015-0559-3) contains supplementary material, which is available to authorized users.
20.
Richard Tzong-Han Tsai Wen-Chi Chou Ying-Shan Su Yu-Chun Lin Cheng-Lung Sung Hong-Jie Dai Irene Tzu-Hsuan Yeh Wei Ku Ting-Yi Sung Wen-Lian Hsu 《BMC bioinformatics》2007,8(1):325