Abstract: | Invasive species threaten global biodiversity, food security and ecosystem function. Such incursions present challenges to agriculture, where invasive species cause significant crop damage and require major economic investment to control production losses. Pest risk analysis (PRA) is key to prioritizing agricultural biosecurity efforts, but is hampered by incomplete knowledge of current crop pest and pathogen distributions. Here, we develop predictive models of current pest distributions and test these models using new observations at subnational resolution. We apply generalized linear models (GLMs) to estimate presence probabilities for 1,739 crop pests in the CABI pest distribution database. We test model predictions for 100 unobserved pest occurrences in the People's Republic of China (PRC) against observations of these pests abstracted from the Chinese literature. This resource has hitherto been omitted from databases on global pest distributions. Finally, we predict occurrences of all unobserved pests globally. Presence probability increases with host presence, presence in neighbouring regions, per capita GDP and global prevalence. Presence probability decreases with mean distance from coast and known host number per pest. The models are good predictors of pest presence in provinces of the PRC, with area under the ROC curve (AUC) values of 0.75–0.76. Large numbers of currently unobserved but probably present pests (defined here as unreported pests with a predicted presence probability >0.75) are predicted in China, India, southern Brazil and some countries of the former USSR. We show that GLMs can predict presences of pseudoabsent pests at subnational resolution. The Chinese literature has been largely inaccessible to Western academia but contains important information that can support PRA. Prior studies have often assumed that unreported pests in a global distribution database represent true absences.
Our analysis provides a method for quantifying pseudoabsences to enable improved PRA and species distribution modelling. |