首页 | 本学科首页   官方微博 | 高级检索  
   检索      


Correcting for bias in distribution modelling for rare species using citizen science data
Authors:Orin J Robinson  Viviana Ruiz‐Gutierrez  Daniel Fink
Institution:Cornell Laboratory of Ornithology, Ithaca, NY, USA
Abstract:

Aim

To improve the accuracy of inferences on habitat associations and distribution patterns of rare species by combining machine‐learning, spatial filtering and resampling to address class imbalance and spatial bias of large volumes of citizen science data.

Innovation

Modelling rare species’ distributions is a pressing challenge for conservation and applied research. Often, a large number of surveys are required before enough detections occur to model distributions of rare species accurately, resulting in a data set with a high proportion of non‐detections (i.e. class imbalance). Citizen science data can provide a cost‐effective source of surveys but likely suffer from class imbalance. Citizen science data also suffer from spatial bias, likely from preferential sampling. To correct for class imbalance and spatial bias, we used spatial filtering to under‐sample the majority class (non‐detection) while maintaining all of the limited information from the minority class (detection). We investigated the use of spatial under‐sampling with randomForest models and compared it to common approaches used for imbalanced data, the synthetic minority oversampling technique (SMOTE), weighted random forest and balanced random forest models. Model accuracy was assessed using kappa, Brier score and AUC. We demonstrate the method by evaluating habitat associations and seasonal distribution patterns using citizen science data for a rare species, the tricoloured blackbird (Agelaius tricolor).

Main Conclusions

Spatial under‐sampling increased the accuracy of each model and outperformed the approach typically used to direct under‐sampling in the SMOTE algorithm. Our approach is the first to characterize winter distribution and movement of tricoloured blackbirds. Our results show that tricoloured blackbirds are positively associated with grassland, pasture and wetland habitats, and negatively associated with high elevations or evergreen forests during both winter and breeding seasons. The seasonal differences in distribution indicate that individuals move to the coast during the winter, as suggested by historical accounts.
Keywords:citizen science  class imbalance  random forest  spatial bias  species distribution model  tricoloured blackbird
设为首页 | 免责声明 | 关于勤云 | 加入收藏

Copyright©北京勤云科技发展有限公司  京ICP备09084417号