
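Predicting tumor drug sensitivity with multi-omics data (indexed in: Peking University Core Journals, CSCD)
Cite this article:YANG Chenyu, LIU Zhenhao, DAI Peibin, ZHANG Yu, HUANG Pengjie, LIN Yong, XIE Lu. Predicting tumor drug sensitivity with multi-omics data[J]. Chinese Journal of Biotechnology, 2022, 38(6): 2201-2212.
Authors:YANG Chenyu  LIU Zhenhao  DAI Peibin  ZHANG Yu  HUANG Pengjie  LIN Yong  XIE Lu
Institution:School of Health Science and Engineering, University of Shanghai for Science and Technology, Shanghai 200093, China; Institute for Genome and Bioinformatics, Shanghai Institute for Biomedical and Pharmaceutical Technologies, Shanghai 201203, China; Institute for Genome and Bioinformatics, Shanghai Institute for Biomedical and Pharmaceutical Technologies, Shanghai 201203, China; Xiangya Hospital, Central South University, Changsha 410008, Hunan, China; School of Medicine, Tongji University, Shanghai 200092, China
Funding:National Natural Science Foundation of China (31301092, 31800700); Shanghai Municipal Health Commission Collaborative Innovation Cluster Project (2019CXJQ02)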
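Abstract:Predicting tumor drug sensitivity is important for guiding clinical medication. Using cell-line IC50 sensitivity data for 198 drugs from the Genomics of Drug Sensitivity in Cancer (GDSC) database, this study built a multi-omics cancer drug sensitivity prediction model by Stacking ensemble learning, incorporating gene expression, gene mutation, and copy number variation data. Several feature selection methods were used to reduce the dimensionality of the gene features, six primary learners and one secondary learner were integrated by Stacking, and the model was validated with 5-fold cross-validation. Among the prediction results, 36.4% had an AUC greater than 0.9, 49.0% had an AUC between 0.8 and 0.9, and the lowest AUC was 0.682. The Stacking-based multi-omics model showed advantages in accuracy and stability over existing single-omics and multi-omics models, and multi-omics integration predicted drug sensitivity better than any single omics type. Functional annotation and enrichment analysis of the feature genes revealed potential mechanisms of tumor resistance to sorafenib, providing model interpretability from a biological perspective and indicating the model's value for guiding clinical medication.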

Keywords:ensemble learning  Stacking  feature selection  multi-omics  tumor resistance mechanism  sorafenib
Received:2021-09-04

Predicting tumor drug sensitivity with multi-omics data
YANG Chenyu, LIU Zhenhao, DAI Peibin, ZHANG Yu, HUANG Pengjie, LIN Yong, XIE Lu. Predicting tumor drug sensitivity with multi-omics data[J]. Chinese Journal of Biotechnology, 2022, 38(6): 2201-2212.
Authors:YANG Chenyu  LIU Zhenhao  DAI Peibin  ZHANG Yu  HUANG Pengjie  LIN Yong  XIE Lu
Institution:School of Health Science and Engineering, University of Shanghai for Science and Technology, Shanghai 200093, China;Institute for Genome and Bioinformatics, Shanghai Institute for Biomedical and Pharmaceutical Technologies, Shanghai 201203, China;Institute for Genome and Bioinformatics, Shanghai Institute for Biomedical and Pharmaceutical Technologies, Shanghai 201203, China;Xiangya Hospital, Central South University, Changsha 410008, Hunan, China;School of Medicine, Tongji University, Shanghai 200092, China
Abstract:Predicting tumor drug sensitivity plays an important role in guiding clinical medication. In this study, a multi-omics cancer drug sensitivity prediction model was constructed using the Stacking ensemble learning method. Gene expression, mutation, and copy number variation data, together with drug sensitivity values (IC50) for 198 drugs, were downloaded from the Genomics of Drug Sensitivity in Cancer (GDSC) database. Multiple feature selection methods were applied for dimensionality reduction. Six primary learners and one secondary learner were integrated by the Stacking method, and the model was validated with 5-fold cross-validation. Of the per-drug models, 36.4% had an AUC greater than 0.9, 49.0% had an AUC between 0.8 and 0.9, and the lowest AUC was 0.682. The Stacking-based multi-omics model outperformed known single-omics and multi-omics models in accuracy and stability, and the multi-omics model predicted drug sensitivity better than models built on single-omics data. Functional annotation and enrichment analysis of the feature genes revealed potential mechanisms of tumor resistance to sorafenib, providing biological interpretability for the model and demonstrating its potential applicability in guiding clinical medication.
Keywords:ensemble learning  Stacking  feature selection  multi-omics  tumor resistance mechanism  sorafenib
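
The modeling workflow summarized in the abstract (feature selection, Stacking of six primary learners under one secondary learner, 5-fold cross-validation scored by AUC) can be illustrated with a minimal scikit-learn sketch. Everything specific here is an assumption made for illustration, not the authors' configuration: the abstract does not name the learners or the feature selection methods, so the six classifiers, the logistic-regression secondary learner, the `SelectKBest` filter, the simple column-wise concatenation of the omics blocks, and the synthetic data are all hypothetical stand-ins. The helper names `build_stacking_model` and `evaluate_auc` are likewise invented for this example.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import (ExtraTreesClassifier, RandomForestClassifier,
                              StackingClassifier)
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC


def build_stacking_model(n_features_kept=50, seed=0):
    """Feature selection followed by a stacked ensemble (all choices hypothetical)."""
    # Six primary learners; the paper's actual learners are not named in the abstract.
    primary = [
        ("lr", LogisticRegression(max_iter=1000)),
        ("svm", SVC(probability=True, random_state=seed)),
        ("rf", RandomForestClassifier(n_estimators=50, random_state=seed)),
        ("et", ExtraTreesClassifier(n_estimators=50, random_state=seed)),
        ("knn", KNeighborsClassifier(n_neighbors=5)),
        ("nb", GaussianNB()),
    ]
    # One secondary learner trained on out-of-fold predictions of the primaries.
    stack = StackingClassifier(
        estimators=primary,
        final_estimator=LogisticRegression(max_iter=1000),
        cv=5,
    )
    # Selection sits inside the pipeline so it is refit on each training fold,
    # which keeps test-fold labels out of the feature ranking.
    return make_pipeline(
        StandardScaler(),
        SelectKBest(f_classif, k=n_features_kept),
        stack,
    )


def evaluate_auc(X, y, seed=0):
    """Return the ROC AUC of each of 5 stratified cross-validation folds."""
    folds = StratifiedKFold(n_splits=5, shuffle=True, random_state=seed)
    model = build_stacking_model(seed=seed)
    return cross_val_score(model, X, y, cv=folds, scoring="roc_auc")


# Synthetic stand-in for one drug: rows are cell lines, label is sensitive/resistant.
raw, y = make_classification(n_samples=200, n_features=300,
                             n_informative=20, random_state=0)
expression = raw[:, :200]                       # continuous expression-like block
mutation = (raw[:, 200:250] > 0).astype(float)  # binary mutation-like block
copy_number = np.round(raw[:, 250:])            # integer copy-number-like block
X = np.hstack([expression, mutation, copy_number])  # early (column-wise) integration

aucs = evaluate_auc(X, y)
print("per-fold AUC:", np.round(aucs, 3), "mean:", round(float(aucs.mean()), 3))
```

In a per-drug setting such as the one the abstract describes, a loop would call `evaluate_auc` once for each drug's label vector and collect the mean AUCs. How IC50 values are converted to sensitive/resistant labels is not stated in the abstract, so that step is left out of the sketch.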
This article is indexed in VIP (维普) and other databases.