
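Characteristics of gut microbiota of obese people and machine learning model
Cite this article: WU Tong, WANG Hong-Chao, LU Wen-Wei, ZHAO Jian-Xin, ZHANG Hao, CHEN Wei. Characteristics of gut microbiota of obese people and machine learning model[J]. 微生物学通报, 2020, 47(12): 4328-4337.
Authors: WU Tong  WANG Hong-Chao  LU Wen-Wei  ZHAO Jian-Xin  ZHANG Hao  CHEN Wei
Affiliation: School of Food Science and Technology, Jiangnan University, Wuxi, Jiangsu 214122
Funding: National Key Research and Development Program of China (2019YFF0217601)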
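Abstract: [Background] The relationship between gut microbiota and human health has attracted growing attention and become an active research topic. [Objective] Using the public database of the American Gut Project, to compare the gut microbiota of obese and healthy people, characterize the gut microbiota of obese people, and build machine learning models based on gut microbiota to predict obesity status, providing a theoretical basis for gut-microbiota-based intervention in obesity. [Methods] Gut microbiota data from the American Gut Project were obtained from the public database and, after screening, yielded gut microbiota data of 1 655 healthy (18.530) adults. For α-diversity, the Wilcoxon rank-sum test was performed and logistic regression was used to assess the relationship between α-diversity and obesity. Principal component analysis (PCA) was performed on three β-diversity distances (Unweighted UniFrac, Weighted UniFrac and Bray-Curtis) to explore differences in gut microbiota composition between obese and healthy people. For taxonomic differences, the Wilcoxon rank-sum test was used to identify differential genera. PICRUSt analysis was used to predict possible metabolic pathways, which were then tested for correlation with the gut microbiota. The Scikit-Learn package was used to build machine learning models for obesity classification from genus-level gut microbiota data, and grid search was used to determine the best model parameters. [Results] The Wilcoxon rank-sum test showed that α-diversity was significantly lower in obese people than in healthy people, and logistic regression indicated that α-diversity was correlated with obesity status. PCA based on the Weighted UniFrac, Unweighted UniFrac and Bray-Curtis distances showed no obvious difference in gut microbiota structure between obese and healthy people. At the phylum level, the ratio of Firmicutes to Bacteroidetes was lower in obese people. At the genus level, 57 genera differed significantly between the two groups: the relative abundance of Ruminococcus was higher in obese people, whereas the relative abundances of Prevotella, Akkermansia and Methanobacteriales were lower. Of the metabolic pathways predicted by PICRUSt, 63 differed significantly between the two groups. The gradient boosted regression tree gave the best performance in predicting obesity from gut microbiota, with an area under the receiver operating characteristic curve (area under curve, AUC) of 0.769 and a test-set precision of 0.725. [Conclusion] Based on large-scale gut microbiota data, this study revealed the characteristics of the gut microbiota of obese people and applied machine learning to obesity prediction, providing new research ideas and a theoretical basis for precision diet and precision medicine.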

Keywords: Gut microbiota, Obesity, Metabolic pathway, Machine learning
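The group comparison of α-diversity described in the abstract can be illustrated with a short sketch. It is not the authors' code: it assumes SciPy's `scipy.stats.ranksums` as the Wilcoxon rank-sum implementation, and the Shannon index values, group sizes and the helper name `compare_alpha_diversity` are synthetic or hypothetical, chosen only for illustration.

```python
import numpy as np
from scipy.stats import ranksums


def compare_alpha_diversity(healthy, obese):
    """Two-sided Wilcoxon rank-sum test between two groups of alpha-diversity values."""
    statistic, p_value = ranksums(healthy, obese)
    return statistic, p_value


# Synthetic Shannon index values (assumption: not data from the American Gut Project).
rng = np.random.default_rng(0)
healthy_shannon = rng.normal(loc=5.0, scale=0.8, size=200)
obese_shannon = rng.normal(loc=4.2, scale=0.8, size=120)

statistic, p_value = compare_alpha_diversity(healthy_shannon, obese_shannon)
print(f"z = {statistic:.2f}, p = {p_value:.3g}")
```

A positive statistic here means the first group (healthy) tends to have higher values than the second (obese). The same call can be repeated per genus or per predicted pathway, in which case a multiple-testing correction would normally be considered.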

Characteristics of gut microbiota of obese people and machine learning model
WU Tong, WANG Hong-Chao, LU Wen-Wei, ZHAO Jian-Xin, ZHANG Hao, CHEN Wei. Characteristics of gut microbiota of obese people and machine learning model[J]. Microbiology, 2020, 47(12): 4328-4337.
Authors:WU Tong  WANG Hong-Chao  LU Wen-Wei  ZHAO Jian-Xin  ZHANG Hao  CHEN Wei
Institution:School of Food Science and Technology, Jiangnan University, Wuxi, Jiangsu 214122, China
Abstract:[Background] The relationship between gut microbiota and human health has attracted much attention and become a popular research area. [Objective] To characterize the gut microbiota of obese people using data from the American Gut Project, and to provide a theoretical basis for gut-microbiota-based intervention in obesity by constructing machine learning models that predict obesity status. [Methods] A total of 1 665 normal samples (18.530) obese samples were downloaded from the website of the American Gut Project (AGP). The Wilcoxon rank-sum test was performed to explore differences in alpha-diversity between the obese and normal groups. In addition, logistic regression was performed to explore the correlation between the alpha-diversity of gut microbiota and obesity. For beta-diversity, we performed principal component analysis (PCA) to explore differences in gut microbiota structure between the obese and normal groups. For the taxonomic profiles, we performed the Wilcoxon rank-sum test to detect significantly different taxa between the two groups. PICRUSt analysis was used to predict pathways from the 16S rRNA gene sequences, and the Wilcoxon rank-sum test was then used to detect significantly different pathways between the two groups. To relate these significantly different pathways to genera, we performed a correlation analysis. Finally, we used the Scikit-Learn package in Python to construct machine learning models and used the AUC value as the criterion for judging the performance of each model. [Results] The Wilcoxon rank-sum test showed decreased alpha-diversity in the obese population compared with the healthy population. In addition, logistic regression confirmed the correlation between alpha-diversity and obesity status. As for beta-diversity, PCA based on three beta-diversity distance matrices (Weighted UniFrac, Unweighted UniFrac and Bray-Curtis) showed no significant difference in gut microbiota structure. At the phylum level, a high relative abundance of Bacteroidetes and a low relative abundance of Firmicutes were observed in the obese group. In addition, a total of 57 genera differed significantly between the two groups according to the Wilcoxon rank-sum test. Ruminococcus increased in the obese group, whereas Prevotella, Akkermansia and Methanobacteriales decreased. All pathways predicted by PICRUSt were compared between the two groups with the Wilcoxon rank-sum test, and a total of 63 significantly different pathways were observed. Among the models, the gradient boosted regression tree (GBDT) performed best, with an AUC of 0.769 and a test precision of 0.725. [Conclusion] This study revealed the characteristics of the gut microbiota of the obese population based on a large-scale data set. It also constructed machine learning models based on gut microbiota to predict obesity status, which provides new ideas and a theoretical basis for personalized medicine and diet.
Keywords:Gut microbiota  Obesity  Metabolic pathway  Machine learning
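The modelling step summarized in the abstract (a Scikit-Learn classifier tuned by parameter search and judged by AUC) might look roughly like the sketch below. It is an illustration under stated assumptions, not the authors' pipeline: the feature matrix is synthetic (`make_classification` stands in for genus-level abundances), the parameter grid, the 75/25 split, the 3-fold cross-validation and the helper name `fit_gbdt` are all hypothetical choices.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import GridSearchCV, train_test_split


def fit_gbdt(X, y, seed=0):
    """Tune a gradient boosting classifier by grid search and report held-out AUC."""
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.25, stratify=y, random_state=seed
    )
    # Hypothetical grid; the abstract does not list the parameters searched.
    param_grid = {
        "n_estimators": [50, 100],
        "max_depth": [2, 3],
        "learning_rate": [0.05, 0.1],
    }
    search = GridSearchCV(
        GradientBoostingClassifier(random_state=seed),
        param_grid,
        scoring="roc_auc",  # select parameters by cross-validated AUC
        cv=3,
    )
    search.fit(X_train, y_train)
    # Probability of the positive class (label 1) on the held-out test set.
    test_scores = search.predict_proba(X_test)[:, 1]
    return {
        "best_params": search.best_params_,
        "test_auc": roc_auc_score(y_test, test_scores),
    }


# Synthetic stand-in for a samples-by-genera abundance table (assumption).
X, y = make_classification(
    n_samples=400, n_features=20, n_informative=5, random_state=0
)
result = fit_gbdt(X, y)
print(result["best_params"], round(result["test_auc"], 3))
```

Keeping the test split out of the grid search means the reported AUC is not used to pick parameters, which is the usual reason for separating cross-validated selection from final evaluation.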