Identifying gene-specific variations in biomedical text |
| |
Authors: | Klinger, Roman; Friedrich, Christoph M.; Mevissen, Heinz Theodor; Fluck, Juliane; Hofmann-Apitius, Martin; Furlong, Laura I.; Sanz, Ferran |
| |
Institution: | Department of Bioinformatics, Fraunhofer Institute for Algorithms and Scientific Computing, Schloss Birlinghoven, 53754 Sankt Augustin, Germany. roman.klinger@scai.fhg.de |
| |
Abstract: | The influence of genetic variations on diseases and cellular processes is the focus of many investigations, and the results of biomedical studies are often accessible only through scientific publications. Automatic extraction of this information requires recognizing gene names together with the accompanying allelic variant information. In previous work, the OSIRIS system was presented, which detects allelic variation in text using a query expansion approach. The main limitations of that system are its relatively low recall for variation mentions and for gene name recognition. To address these limitations, we integrate the ProMiner system, developed for the recognition and normalization of gene and protein names, with conditional random field (CRF)-based recognition of variation terms in biomedical text. Using a newly developed normalization of variation entities, we link textual entities to entries in the Single Nucleotide Polymorphism database (dbSNP). We evaluate the performance of this approach and report improved results compared with state-of-the-art systems. |
| |
Keywords: | |
This article is indexed in PubMed and other databases. |
|
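The abstract mentions normalizing variation entities so that they can be linked to dbSNP entries, but gives no detail of how. As a rough illustration only, the sketch below shows one way a protein-level substitution mention could be mapped to a canonical form. It is not the authors' method: the regular expression, the function names `normalize_substitution` and `link_to_dbsnp`, the gene label `GENE1`, and the lookup table with its placeholder identifier are all assumptions made for this example.

```python
import re

# Standard three-letter to one-letter amino acid codes.
THREE_TO_ONE = {
    "Ala": "A", "Arg": "R", "Asn": "N", "Asp": "D", "Cys": "C",
    "Gln": "Q", "Glu": "E", "Gly": "G", "His": "H", "Ile": "I",
    "Leu": "L", "Lys": "K", "Met": "M", "Phe": "F", "Pro": "P",
    "Ser": "S", "Thr": "T", "Trp": "W", "Tyr": "Y", "Val": "V",
}

# A residue is either a three-letter token (e.g. "Ala") or a one-letter code.
_RESIDUE = r"(?:[A-Z][a-z]{2}|[ACDEFGHIKLMNPQRSTVWY])"
# Matches mentions such as "Ala123Thr", "A123T" or "p.A123T".
_SUBSTITUTION = re.compile(rf"^(?:p\.)?({_RESIDUE})(\d+)({_RESIDUE})$")


def _to_one_letter(residue):
    """Return the one-letter code for a residue, or None if unknown."""
    if len(residue) == 1:
        return residue
    return THREE_TO_ONE.get(residue)


def normalize_substitution(mention):
    """Map a substitution mention to (wild_type, position, mutant), or None."""
    match = _SUBSTITUTION.match(mention.strip())
    if match is None:
        return None
    wild_type = _to_one_letter(match.group(1))
    mutant = _to_one_letter(match.group(3))
    if wild_type is None or mutant is None:
        return None
    return (wild_type, int(match.group(2)), mutant)


# Hypothetical lookup table: the gene label and the identifier are made-up
# placeholders, not real dbSNP data.
HYPOTHETICAL_INDEX = {("GENE1", "A", 123, "T"): "rs0000001"}


def link_to_dbsnp(gene, mention, index=HYPOTHETICAL_INDEX):
    """Look up a normalized (gene, variation) pair in a local index."""
    normalized = normalize_substitution(mention)
    if normalized is None:
        return None
    return index.get((gene, *normalized))
```

With this sketch, the mentions `Ala123Thr`, `A123T` and `p.A123T` all normalize to the same tuple, so different surface forms of one variation resolve to a single lookup key once the gene has been identified.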