Inferring species interaction networks from species abundance data: A comparative evaluation of various statistical and machine learning methods
Authors: Ali Faisal, Frank Dondelinger, Dirk Husmeier, Colin M. Beale
Affiliation: 1. Helsinki Institute for Information Technology, Adaptive Informatics Research Centre, Department of Information and Computer Science, Aalto University, P.O. Box 15400, FI-02015 Aalto, Finland;2. Biomathematics and Statistics Scotland, JCMB, The King's Buildings, Edinburgh, EH9 3JZ, United Kingdom;3. Institute for Adaptive Neural Computation, School of Informatics, University of Edinburgh, Informatics Forum, 10 Crichton Street, Edinburgh, EH8 9AB, United Kingdom;4. Macaulay Land Use Research Institute, Craigiebuckler, Aberdeen, AB15 8QH, United Kingdom;1. Cavanilles Institute of Biodiversity and Evolutionary Biology, University of Valencia, Catedrático José Beltrán, 2, Paterna, E-46980, Spain;2. Department of Biology and Geology, IES Violant de Casalduch, Avinguda Castelló s/n, Benicàssim, E-12560, Spain;1. Environmental Studies Program, 397 UCB University of Colorado, Boulder, CO 80309-0397, USA;2. Mpala Research Centre, PO Box 555, Nanyuki, Kenya;3. Biodiversity Research Centre, University of British Columbia, 2212 Main Mall, Vancouver, British Columbia, V6T 1Z4, Canada;1. Conservation Science Group, Department of Zoology, University of Cambridge, Cambridge, UK;2. Integrative Ecology Group, Department of Integrative Ecology, Estación Biológica de Doñana, Consejo Superior de Investigaciones Científicas (CSIC), Sevilla, Spain;3. Mediterranean Institute of Advanced Studies (IMEDEA), CSIC–Universitat de les Illes Balears (IUB), Esporles, Mallorca, Spain;1. Agroécologie, AgroSup Dijon, INRA, University of Bourgogne Franche-Comté, F-21000 Dijon, France;2. BIOGECO, INRA, University of Bordeaux, 33615 Pessac, France;3. Computational Bioinformatics Laboratory, Department of Computing, Imperial College London, London, SW7 2AZ, UK;4. Syngenta Crop Protection AG, PO Box 4002, Basel, Switzerland;5. School of Biological Sciences, University of Essex, Colchester, Essex, CO4 3SQ, UK;6. Department of Life Sciences, Imperial College London, Silwood Park Campus, Berkshire, SL5 7PY, UK
Abstract: The complexity of ecosystems is staggering, with hundreds or thousands of species interacting in a number of ways from competition and predation to facilitation and mutualism. Understanding the networks that form the systems is of growing importance, e.g. to understand how species will respond to climate change, or to predict potential knock-on effects of a biological control agent. In recent years, a variety of summary statistics for characterising the global and local properties of such networks have been derived, which provide a measure for gauging the accuracy of a mathematical model for network formation processes. However, the critical underlying assumption is that the true network is known. This is not a straightforward task to accomplish, and typically requires minute observations and detailed field work. More importantly, knowledge about species interactions is restricted to specific kinds of interactions. For instance, while the interactions between pollinators and their host plants are amenable to direct observation, other types of species interactions, like those mentioned above, are not, and might not even be clearly defined from the outset. To discover information about complex ecological systems efficiently, new tools for inferring the structure of networks from field data are needed. In the present study, we investigate the viability of various statistical and machine learning methods recently applied in molecular systems biology: graphical Gaussian models, L1-regularised regression with least absolute shrinkage and selection operator (LASSO), sparse Bayesian regression and Bayesian networks. We have assessed the performance of these methods on data simulated from food webs of known structure, where we combined a niche model with a stochastic population model in a 2-dimensional lattice.
We assessed the network reconstruction accuracy in terms of the area under the receiver operating characteristic (ROC) curve, which was typically in the range between 0.75 and 0.9, corresponding to the recovery of about 60% of the true species interactions at a false prediction rate of 5%. We also applied the models to presence/absence data for 39 European warblers, and found that the inferred species interactions showed a weak yet significant correlation with phylogenetic similarity scores, which tended to weakly increase when including bio-climate covariates and allowing for spatial autocorrelation. Our findings demonstrate that relevant patterns in ecological networks can be identified from large-scale spatial data sets with machine learning methods, and that these methods have the potential to contribute novel important tools for gaining deeper insight into the structure and stability of ecosystems.
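As an illustration of the evaluation idea described in the abstract, the following is a minimal sketch, not the authors' code, of scoring a network reconstruction by the area under the ROC curve. It assumes toy Gaussian "abundance" data drawn from a known sparse precision matrix (a stand-in for the niche model and lattice population model used in the paper), uses absolute partial correlations as edge scores in the spirit of a graphical Gaussian model, and computes the AUROC from ranks. All function names and parameter values here are hypothetical.

```python
import numpy as np


def simulate_abundances(n_species=10, n_sites=2000, edge_prob=0.2, seed=0):
    """Toy data: Gaussian samples whose precision matrix encodes a known network.

    Assumption: this replaces the paper's food-web simulation for illustration only.
    """
    rng = np.random.default_rng(seed)
    upper = np.triu(rng.random((n_species, n_species)) < edge_prob, k=1)
    adj = upper | upper.T  # symmetric adjacency, no self-loops
    prec = np.where(adj, 0.3, 0.0)
    # Strict diagonal dominance guarantees a positive definite precision matrix.
    np.fill_diagonal(prec, np.abs(prec).sum(axis=1) + 1.0)
    cov = np.linalg.inv(prec)
    data = rng.multivariate_normal(np.zeros(n_species), cov, size=n_sites)
    return data, adj


def partial_correlations(data):
    """Partial correlation matrix obtained from the inverse sample covariance."""
    cov = np.cov(data, rowvar=False)
    # A small ridge term keeps the inversion numerically stable.
    prec = np.linalg.inv(cov + 1e-6 * np.eye(cov.shape[0]))
    d = np.sqrt(np.diag(prec))
    pcor = -prec / np.outer(d, d)
    np.fill_diagonal(pcor, 1.0)
    return pcor


def auroc(scores, labels):
    """Area under the ROC curve via the rank-sum (Mann-Whitney) identity."""
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels, dtype=bool)
    order = np.argsort(scores)
    ranks = np.empty(len(scores))
    ranks[order] = np.arange(1, len(scores) + 1)
    for value in np.unique(scores):  # tied scores share their average rank
        tied = scores == value
        ranks[tied] = ranks[tied].mean()
    n_pos = labels.sum()
    n_neg = len(labels) - n_pos
    return (ranks[labels].sum() - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)


def reconstruction_auroc(data, adj):
    """Score every species pair by |partial correlation| and compare to the truth."""
    iu = np.triu_indices(adj.shape[0], k=1)
    scores = np.abs(partial_correlations(data)[iu])
    return auroc(scores, adj[iu])
```

Calling `reconstruction_auroc(*simulate_abundances())` returns a value between 0 and 1, where 0.5 corresponds to random guessing. The toy setting is far easier than the simulated food webs in the study, so the resulting score should not be compared with the 0.75 to 0.9 range reported above.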
This article is indexed in ScienceDirect and other databases.