Integrated knowledge mining, genome-scale modeling, and machine learning for predicting Yarrowia lipolytica bioproduction
Affiliation: 1. Department of Chemical and Biomolecular Engineering, National University of Singapore, 117585, Singapore;2. Department of Civil and Environmental Engineering, University of Illinois at Urbana-Champaign, Urbana, IL 61801, USA;3. DOE Center for Advanced Bioenergy and Bioproducts Innovation (CABBI), University of Illinois at Urbana-Champaign, Urbana, IL 61801, USA;1. Department of Energy, Environmental and Chemical Engineering, Washington University in St. Louis, MO 63130, USA;2. Donald Danforth Plant Science Center, St. Louis, MO 63132, USA;3. United States Department of Agriculture, Agricultural Research Service, St. Louis, MO 63132, USA;4. Department of Biochemistry, University of Colorado Boulder, Boulder, CO 80309, USA;5. Renewable and Sustainable Energy Institute, University of Colorado, Boulder, CO 80309, USA;6. National Bioenergy Center, National Renewable Energy Laboratory, Golden, CO 80401, USA;1. Key Laboratory of Systems Microbial Biotechnology, Tianjin Institute of Industrial Biotechnology, Chinese Academy of Sciences, Tianjin 300308, China;2. TIB-VIB Joint Center of Synthetic Biology, National Center of Technology Innovation for Synthetic Biology, Tianjin 300308, China;3. Infection Program and Department of Microbiology, Biomedicine Discovery Institute, Monash University, Clayton, Victoria 3800, Australia;1. Systems Biology and Medicine Laboratory, Department of Chemical and Biomolecular Engineering (BK21 Plus Program), Korea Advanced Institute of Science and Technology (KAIST), Daejeon 34141, Republic of Korea;2. Systems Metabolic Engineering and Systems Healthcare Cross-Generation Collaborative Laboratory, KAIST, Daejeon 34141, Republic of Korea;3. Metabolic and Biomolecular Engineering National Research Laboratory, Department of Chemical and Biomolecular Engineering (BK21 Plus Program), KAIST Institute for BioCentury, KAIST, Daejeon 34141, Republic of Korea;4. KAIST Institute for Artificial Intelligence, BioProcess Engineering Research Center and BioInformatics Research Center, KAIST, Daejeon 34141, Republic of Korea
Abstract: Predicting bioproduction titers from microbial hosts has been challenging because of complex interactions among microbial regulatory networks, stress responses, and suboptimal cultivation conditions. This study integrated knowledge mining, feature extraction, genome-scale modeling (GSM), and machine learning (ML) to develop a model for predicting Yarrowia lipolytica chemical titers (e.g., organic acids and terpenoids). First, Y. lipolytica production data, including cultivation conditions, genetic engineering strategies, and product information, were manually collected from the literature (~100 papers) and stored as either numerical (e.g., substrate concentrations) or categorical (e.g., bioreactor modes) variables. For each recorded case, central pathway fluxes were estimated using GSMs and flux balance analysis (FBA) to provide metabolic features. Second, an ML ensemble learner was trained to predict strain production titers. Accurate predictions on the test data were obtained for instances with production titers >1 g/L (R² = 0.87). However, the model was less predictive for low-performing strains (0.01–1 g/L, R² = 0.29), potentially because of biosynthesis bottlenecks not captured in the features. Feature ranking indicated that the FBA fluxes, the number of enzyme steps, the substrate inputs, and thermodynamic barriers (i.e., Gibbs free energy of reaction) were the most influential factors. Third, the model was evaluated on other oleaginous yeasts, and the results indicated that some hosts share conserved features that could potentially be exploited by transfer learning. The platform was also designed to assist computational strain design tools (such as OptKnock) in screening genetic targets for improved microbial production in light of experimental conditions.
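The abstract reports test-set accuracy separately for high-titer (>1 g/L) and low-titer (0.01–1 g/L) instances. As a minimal sketch of how such a stratified evaluation could be computed, the Python below splits measured/predicted pairs by the measured titer and calculates the coefficient of determination in each range. The function names, the choice to compute R² on raw (not log-transformed) titers, and the numbers in the usage example are illustrative assumptions; they are not taken from the study.

```python
from typing import Dict, Sequence


def r_squared(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    if ss_tot == 0:
        raise ValueError("R^2 is undefined when all measured values are equal")
    return 1.0 - ss_res / ss_tot


def stratified_r2(
    measured: Sequence[float],
    predicted: Sequence[float],
    low_min: float = 0.01,   # lower edge of the low-titer range (g/L)
    threshold: float = 1.0,  # boundary between low and high titers (g/L)
) -> Dict[str, float]:
    """Compute R^2 separately for low-titer and high-titer instances.

    Instances are assigned to a range by their measured titer:
    low  = low_min <= titer <= threshold, high = titer > threshold.
    """
    pairs = list(zip(measured, predicted))
    low = [(t, p) for t, p in pairs if low_min <= t <= threshold]
    high = [(t, p) for t, p in pairs if t > threshold]
    return {
        "low": r_squared([t for t, _ in low], [p for _, p in low]),
        "high": r_squared([t for t, _ in high], [p for _, p in high]),
    }


if __name__ == "__main__":
    # Hypothetical titers in g/L, made up purely to exercise the functions.
    measured = [0.05, 0.2, 0.6, 0.9, 2.0, 5.0, 12.0, 30.0]
    predicted = [0.4, 0.1, 0.3, 0.7, 2.5, 4.0, 13.0, 29.0]
    for name, score in stratified_r2(measured, predicted).items():
        print(f"{name}-titer range: R^2 = {score:.2f}")
```

Reporting the two ranges separately makes it visible when errors that are small in absolute terms still dominate the variance among low-titer strains, which a single pooled R² could hide.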
Keywords: FBA; Computational strain design; Machine learning; Pathway bottlenecks
This article is indexed in ScienceDirect and other databases.