Molecular profiling of thyroid cancer subtypes using large-scale text mining
Authors: Chengkun Wu, Jean-Marc Schwartz, Georg Brabant, Goran Nenadic
Affiliations: 1. Faculty of Life Sciences, University of Manchester, Manchester, UK; 2. Doctoral Training Centre in Integrative Systems Biology, University of Manchester, Manchester, UK; 3. Manchester Institute of Biotechnology, Manchester, UK; 4. Department of Endocrinology, Christie Hospital, University of Manchester, Manchester, UK; 5. Experimental and Clinical Endocrinology, Med Clinic I, University of Luebeck, Ratzeburger Allee 160, Lübeck, Germany; 6. School of Computer Science, University of Manchester, Manchester, UK; 7. Health e-Research Centre (HeRC), Manchester, UK
Abstract:

Background

Thyroid cancer is the most common endocrine tumor, and its incidence is rising steadily. It is classified into multiple histopathological subtypes with potentially distinct molecular mechanisms. Identifying the most relevant genes and biological pathways reported in the thyroid cancer literature is vital for understanding the disease and developing targeted therapeutics.

Results

We developed a large-scale text mining system to generate molecular profiles of thyroid cancer subtypes. The system first applies a subtype classification method to the thyroid cancer literature, using a scoring scheme to assign subtypes to articles. We evaluated the classification method on a gold standard derived from the PubMed Supplementary Concept annotations, achieving a micro-average F1-score of 85.9% for primary subtypes. We then used the subtype classification results to extract genes and pathways associated with different thyroid cancer subtypes. This identified important genes and pathways, including some that are missing from current manually annotated databases and from the most recent review articles.
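
To make the reported metric concrete, the following is a minimal sketch of how a micro-averaged F1-score can be computed when each article may carry one or more subtype labels. It is not taken from the study's code: the function name `micro_f1`, the article identifiers, and the toy label sets are all hypothetical and serve only to illustrate the pooling of true positives, false positives, and false negatives across articles before precision and recall are computed.

```python
def micro_f1(gold, predicted):
    """Micro-averaged F1 over multi-label assignments.

    gold, predicted: dict mapping an article id to a set of subtype labels.
    Counts are pooled over all articles and labels before computing
    precision and recall (this pooling is what "micro" refers to).
    """
    tp = fp = fn = 0
    for article_id in gold.keys() | predicted.keys():
        g = gold.get(article_id, set())
        p = predicted.get(article_id, set())
        tp += len(g & p)   # labels assigned and correct
        fp += len(p - g)   # labels assigned but not in the gold standard
        fn += len(g - p)   # gold labels that were missed
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Toy data (hypothetical article ids and assignments, for illustration only).
gold = {
    "a1": {"papillary"},
    "a2": {"follicular"},
    "a3": {"medullary", "papillary"},
}
predicted = {
    "a1": {"papillary"},
    "a2": {"papillary"},
    "a3": {"medullary"},
}

score = micro_f1(gold, predicted)
print(f"micro-F1 = {score:.4f}")
```

Because counts are pooled, frequent subtypes weigh more heavily in a micro-average than rare ones; a macro-average would instead compute F1 per subtype and then average those values.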

Conclusions

Identifying key genes and pathways plays a central role in understanding the molecular biology of thyroid cancer. Integrating subtype context can enable prioritized screening for diagnostic biomarkers and novel molecularly targeted therapeutics. The source code used for this study is freely available online at https://github.com/chengkun-wu/GenesThyCan.
This article is indexed in SpringerLink and other databases.