Abstract: | The era of big biodiversity data has led to rapid, exciting advances in the theoretical and applied biological, ecological and conservation sciences. While large genetic, geographic and trait databases are available, these are neither complete nor random samples of the globe. Gaps and biases in these databases reduce our inferential and predictive power, and this incompleteness is even more worrisome because we are ignorant of both its kind and magnitude. We performed a comprehensive examination of the taxonomic and spatial sampling in the most complete databases currently available for plant genes, locations and functional traits. To do this, we downloaded data from The Plant List (taxonomy), the Global Biodiversity Information Facility (locations), TRY (traits) and GenBank (genes). Only 17.7% of the world's described and accepted land plant species feature in all three of the location, trait and genetic databases, meaning that more than 82% of known plant biodiversity lacks representation in at least one of them. Species coverage is highest for location data and lowest for genetic data. Taxonomically, bryophytes and orchids stand out as poorly represented in all databases; spatially, the equatorial region does. We highlight a number of clades and regions about which we know little functionally, spatially and genetically, and which should be set as research targets. The scientific community should recognize and reward the significant value, both for biodiversity science and conservation, of filling in these gaps in our knowledge of the plant tree of life. |