Machine learning for accelerating process-based computation of land biogeochemical cycles |
| |
Authors: | Yan Sun, Daniel S. Goll, Yuanyuan Huang, Philippe Ciais, Ying-Ping Wang, Vladislav Bastrikov, Yilong Wang |
| |
Affiliation: | 1. College of Marine Life Sciences, Ocean University of China, Qingdao, China; 2. Laboratoire des Sciences du Climat et de l'Environnement, CEA-CNRS-UVSQ, Gif sur Yvette, France; 3. CSIRO Environment, Aspendale, Australia; 4. Science Partners, Paris, France; 5. State Key Laboratory of Tibetan Plateau Earth System, Resources and Environment (TPESRE), Institute of Tibetan Plateau Research, Chinese Academy of Sciences, Beijing, China |
| |
Abstract: | Global change ecology now embraces ever-larger observational datasets (big-data) and complex mathematical models that track hundreds of ecological processes (big-model). The rapid advancement of this big-data-big-model approach has reached a bottleneck: high computational requirements prevent further development of models that need to be integrated over long time-scales to simulate the distribution of ecosystem carbon and nutrient pools and fluxes. Here, we introduce a machine-learning acceleration (MLA) tool to tackle this grand challenge. We focus on the most resource-consuming step in terrestrial biosphere models (TBMs): the equilibration of biogeochemical cycles (spin-up), a prerequisite that can take up to 98% of the computational time. Using three members of the ORCHIDEE TBM family, which is part of the IPSL Earth System Model, including versions that describe the complex interactions between nitrogen, phosphorus and carbon and have no analytical solution for the spin-up, we show that an unoptimized MLA reduces the computational demand of global studies by 77%–80%. It does so by running the TBM to equilibrium for only a subset of model pixels and interpolating the equilibrated state of biogeochemical variables to the remaining pixels. Despite small biases in the MLA-derived equilibrium, the resulting impact on the predicted regional carbon balance over recent decades is minor. We expect an order-of-magnitude reduction in computational demand from optimizing the choice of machine learning algorithms and their settings, and from balancing the trade-off between the quality of MLA predictions and the need for TBM simulations for training data generation and bias reduction. Our tool is applicable to any gridded model (not only TBMs), is compatible with existing spin-up acceleration procedures, and opens the door to a wide variety of future applications, with complex non-linear models benefiting most from the gain in computational efficiency. |
| |
Keywords: | biogeochemical cycles; computational demand; hybrid modeling; machine learning; terrestrial biosphere model |
|
|