Modeling Longitudinal Microbiome Compositional Data: A Two-Part Linear Mixed Model with Shared Random Effects
Authors: Han, Yongli; Baker, Courtney; Vogtmann, Emily; Hua, Xing; Shi, Jianxin; Liu, Danping
Affiliation: 1. Biostatistics Branch, Division of Cancer Epidemiology and Genetics, National Cancer Institute, Rockville, USA;
2. Department of Biostatistics, University of North Carolina, Chapel Hill, USA;
3. Metabolic Epidemiology Branch, Division of Cancer Epidemiology and Genetics, National Cancer Institute, Rockville, USA;
4. Fred Hutchinson Cancer Research Center, Seattle, USA
Abstract:

Longitudinal microbiome studies are widely used to reveal the dynamics of complex host-microbial ecosystems. Modeling longitudinal microbiome compositional data, which are semi-continuous in nature, is challenging in several respects: the overabundance of zeros, the heavy skewness of the non-zero values, which are bounded in (0, 1), and the dependence between the binary and non-zero parts. To address these challenges, we first extended the work of Chen and Li [1] and proposed a two-part zero-inflated Beta regression model with shared random effects (ZIBR-SRE), in which the shared random effects characterize the dependence between the binary and continuous parts. In addition, microbiome compositional data are subject to a unit-sum constraint, which implies negative correlations among taxa. Because ZIBR-SRE models each taxon separately, it does not satisfy the sum-to-one constraint. We therefore proposed a two-part linear mixed model (TPLMM) with shared random effects that models the log-transformed standardized relative abundances rather than the original ones. This transformation, known as the “additive logistic transformation”, was initially developed for cross-sectional compositional data. We extended it to longitudinal microbiome compositions and showed that the unit-sum constraint is automatically satisfied under the TPLMM framework. The performance of TPLMM and ZIBR-SRE was compared with that of existing methods in simulation studies. Under settings adopted from real data, TPLMM performed best and is recommended for practical use. An application to oral microbiome data further showed that both TPLMM and ZIBR-SRE estimated a strong correlation between the binary and continuous parts, suggesting that models that do not account for this dependence would lead to biased inferences.
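
As a rough illustration of why the unit-sum constraint comes for free on the transformed scale, the sketch below implements the classic additive log-ratio transform and its inverse, the additive logistic transformation. It is not the authors' TPLMM: the function names `alr` and `alr_inverse`, the choice of the last taxon as the reference, and the example numbers are all assumptions made for illustration, and zeros (which the two-part model handles through its binary part) are simply rejected here.

```python
import numpy as np


def alr(p):
    """Additive log-ratio: log(p_j / p_D) for j = 1..D-1.

    Assumption: the last taxon is the reference and all proportions are
    strictly positive (zeros are outside the scope of this sketch).
    """
    p = np.asarray(p, dtype=float)
    if np.any(p <= 0):
        raise ValueError("alr requires strictly positive proportions")
    return np.log(p[..., :-1] / p[..., -1:])


def alr_inverse(y):
    """Additive logistic transformation: map R^(D-1) back to the simplex."""
    y = np.asarray(y, dtype=float)
    exp_y = np.exp(y)
    denom = 1.0 + exp_y.sum(axis=-1, keepdims=True)
    # The first D-1 parts are exp(y_j) / denom; the reference part is 1 / denom.
    return np.concatenate([exp_y / denom, 1.0 / denom], axis=-1)


# Hypothetical three-taxon composition (illustrative values only).
composition = np.array([0.5, 0.3, 0.2])
y = alr(composition)
recovered = alr_inverse(y)

# Any real-valued vector on the transformed scale maps to a valid composition.
arbitrary = alr_inverse(np.array([2.0, -1.0, 0.3]))
```

Because the inverse map divides every part by the same denominator, any unconstrained value on the log-ratio scale, such as a fitted value from a linear mixed model, corresponds to relative abundances that sum to one.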

Keywords:
This article is indexed in SpringerLink and other databases.