A text-mining perspective on the requirements for electronically annotated abstracts
Authors: Florian Leitner, Alfonso Valencia
Institution: Structural Computational Biology Group, Spanish National Cancer Research Centre (CNIO), Madrid, Spain.
Abstract: We propose that the combination of human expertise and automatic text-mining systems can be used to create a first generation of electronically annotated information (EAI) that can be added to journal abstracts and that is directly related to the information in the corresponding text. The first experiments have concentrated on the annotation of gene/protein names and those of organisms, as these are the best-resolved problems. A second generation of systems could then attempt to address the problems of annotating protein interactions and protein/gene functions, a more difficult task for text-mining systems. EAI will permit easier categorization of this information, help in the evaluation of papers for curation in databases, and be invaluable for maintaining the links between the information in databases and the facts described in text. Additionally, it will contribute to the efforts towards completing database information and creating collections of annotated text that can be used to train new generations of text-mining systems. The recent introduction of the first meta-server for the annotation of biological text, with the possibility of collecting annotations from available text-mining systems, adds credibility to the technical feasibility of this proposal.
Keywords: BCMS (BioCreative MetaServer); EAI (electronically annotated information); IE (information extraction); NER (named entity recognition); NLP (natural language processing); NLU (natural language understanding)
This article is indexed in ScienceDirect, PubMed, and other databases.