Similar literature
1.
Aim To design and apply statistical tests for measuring sampling bias in the raw data used to determine priority areas for conservation, and to discuss their impact on conservation analyses for the region. Location Sub‐Saharan Africa. Methods An extensive data set comprising 78,083 vouchered locality records for 1068 passerine birds in sub‐Saharan Africa has been assembled. Using geographical information systems, we designed and applied two tests to determine if sampling of these taxa was biased. First, we detected possible biases because of accessibility by measuring the proximity of each record to cities, rivers and roads. Second, we quantified the intensity of sampling of each species inside and surrounding proposed conservation priority areas and compared it with sampling intensity in non‐priority areas. We applied statistical tests to determine if the distribution of these sampling records deviated significantly from random distributions. Results The analyses show that the location and intensity of collecting have historically been heavily influenced by accessibility. Sampling localities show dense, significant aggregation around city limits, and along rivers and roads. When examining the collecting sites of each individual species, the pattern of sampling has been significantly concentrated within and immediately surrounding areas now designated as conservation priorities. Main conclusions Assessment of patterns of species richness and endemicity at the scale useful for establishing conservation priorities, below the continental level, undoubtedly reflects biases in taxonomic sampling. This is especially problematic for priorities established using the criterion of complementarity because the estimated spatial costs of this approach are highly sensitive to sampling artefacts. Hence such conservation priorities should be interpreted with caution proportional to the bias found.
We argue that conservation priority setting analyses require (1) statistical tests to detect these biases, and (2) data treatment to reflect species distribution rather than patterns of collecting effort.
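
The accessibility test in the Methods above (proximity of records to cities, rivers and roads, compared against a random distribution) could be sketched roughly as follows. This is only a minimal illustration, not the authors' GIS workflow: the function names, the planar coordinates, the rectangular study area and the Monte Carlo design are all assumptions made for the example.

```python
import math
import random


def nearest_distance(point, features):
    """Euclidean distance from a point to the closest feature point."""
    return min(math.dist(point, f) for f in features)


def proximity_bias_test(records, features, bounds, n_sim=999, seed=0):
    """Monte Carlo test: are records closer to features than random points?

    records, features: lists of (x, y) tuples in planar coordinates (assumed).
    bounds: ((xmin, xmax), (ymin, ymax)) of a rectangular study area (assumed).
    Returns the observed mean nearest-feature distance and a one-sided
    p-value (share of simulations whose mean distance is at least as small).
    """
    rng = random.Random(seed)
    (xmin, xmax), (ymin, ymax) = bounds
    observed = sum(nearest_distance(r, features) for r in records) / len(records)
    as_small = 0
    for _ in range(n_sim):
        # Draw as many uniformly random points as there are records.
        sim = [(rng.uniform(xmin, xmax), rng.uniform(ymin, ymax)) for _ in records]
        sim_mean = sum(nearest_distance(p, features) for p in sim) / len(sim)
        if sim_mean <= observed:
            as_small += 1
    p_value = (as_small + 1) / (n_sim + 1)
    return observed, p_value
```

A small p-value here would suggest that records cluster near the features more tightly than random sampling would produce; real analyses would use geodesic distances and the actual study-area outline.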

2.
Quality conservation planning requires quality input data. However, the broad scale sampling strategies typically employed to obtain primary species distribution data are prone to geographic bias in the form of errors of omission. This study provides a quantitative measure of sampling bias to inform accuracy assessment of conservation plans based on the South African Frog Atlas Project. Significantly higher sampling intensity near to cities and roads is likely to result in overstated conservation priority and heightened conservation conflicts in urban areas. Particularly well sampled protected areas will also erroneously appear to contribute highly to amphibian biodiversity targets. Conversely, targeted sampling in the arid northwest and along mountain ranges is needed to ensure that these under-sampled regions are not excluded from conservation plans. The South African Frog Atlas Project offers a reasonably accurate picture of the broad scale west-to-east increase in amphibian richness and abundance, but geographic bias may limit its applicability for fine scale conservation planning. The Global Amphibian Assessment species distribution data offered a less biased alternative, but only at the cost of inflated commission error.

3.
Complementarity-based reserve selection algorithms efficiently prioritize sites for biodiversity conservation, but they are data-intensive and most regions lack accurate distribution maps for the majority of species. We explored implications of basing conservation planning decisions on incomplete and biased data using occurrence records of the plant family Proteaceae in South Africa. Treating this high-quality database as 'complete', we introduced three realistic sampling biases characteristic of biodiversity databases: a detectability sampling bias and two forms of roads sampling bias. We then compared reserve networks constructed using complete, biased, and randomly sampled data. All forms of biased sampling performed worse than both the complete data set and equal-effort random sampling. Biased sampling failed to detect a median of 1-5% of species, and resulted in reserve networks that were 9-17% larger than those designed with complete data. Spatial congruence and the correlation of irreplaceability scores between reserve networks selected with biased and complete data were low. Thus, reserve networks based on biased data require more area to protect fewer species and identify different locations than those selected with randomly sampled or complete data.
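
To make the idea of complementarity-based selection concrete, here is a minimal greedy sketch. It is a generic textbook heuristic, not the algorithm used in the study, and the function name and input layout (a dict mapping site names to sets of recorded species) are assumptions for illustration.

```python
def greedy_complementarity(site_species):
    """Greedy reserve selection by complementarity.

    site_species: dict mapping a site name to the set of species recorded
    there (hypothetical input format). At each step the site that adds the
    most not-yet-represented species is chosen; ties go to the site whose
    name sorts first.
    """
    remaining = set().union(*site_species.values())
    selected = []
    while remaining:
        best = max(sorted(site_species),
                   key=lambda s: len(site_species[s] & remaining))
        gain = site_species[best] & remaining
        if not gain:
            break  # no site can add anything further
        selected.append(best)
        remaining -= gain
    return selected
```

Because each step depends on which species are still unrepresented, records that are missing from a site can change both the order and the number of sites chosen, which is the kind of sensitivity to sampling bias the abstract reports.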

4.
Perhaps the most important recent advance in species delimitation has been the development of model‐based approaches to objectively diagnose species diversity from genetic data. Additionally, the growing accessibility of next‐generation sequence data sets provides powerful insights into genome‐wide patterns of divergence during speciation. However, applying complex models to large data sets is time‐consuming and computationally costly, requiring careful consideration of the influence of both individual and population sampling, as well as the number and informativeness of loci on species delimitation conclusions. Here, we investigated how locus number and information content affect species delimitation results for an endangered Mexican salamander species, Ambystoma ordinarium. We compared results for an eight‐locus, 137‐individual data set and an 89‐locus, seven‐individual data set. For both data sets, we used species discovery methods to define delimitation models and species validation methods to rigorously test these hypotheses. We also used integrated demographic model selection tools to choose among delimitation models, while accounting for gene flow. Our results indicate that while cryptic lineages may be delimited with relatively few loci, sampling larger numbers of loci may be required to ensure that enough informative loci are available to accurately identify and validate shallow‐scale divergences. These analyses highlight the importance of striking a balance between dense sampling of loci and individuals, particularly in shallowly diverged lineages. They also suggest the presence of a currently unrecognized, endangered species in the western part of A. ordinarium's range.

5.

Aim

Citizen science is a cost-effective potential source of invasive species occurrence data. However, data quality issues due to unstructured sampling approaches may discourage the use of these observations by science and conservation professionals. This study explored the utility of low-structure iNaturalist citizen science data in invasive plant monitoring. We first examined the prevalence of invasive taxa in iNaturalist plant observations and sampling biases associated with these data. Using four invasive species as examples, we then compared iNaturalist and professional agency observations and used the two datasets to model suitable habitat for each species.

Location

Hawai'i, USA.

Methods

To estimate the prevalence of invasive plant data, we compared the number of species and observations recorded in iNaturalist to botanical checklists for Hawai'i. Sampling bias was quantified along gradients of site accessibility, protective status and vegetation disturbance using a bias index. Habitat suitability for four invasive species was modelled in Maxent, using observations from iNaturalist, professional agencies and stratified subsets of iNaturalist data.

Results

iNaturalist plant observations were biased towards invasive species, which were frequently recorded in areas with higher road/trail density and vegetation disturbance. Professional observations of four example invasive species tended to occur in less accessible, native-dominated sites. Habitat suitability models based on iNaturalist versus professional data showed moderate overlap and different distributions of suitable habitat across vegetation disturbance classes. Stratifying iNaturalist observations had little effect on how suitable habitat was distributed for the species modelled in this study.

Main Conclusions

Opportunistic iNaturalist observations have the potential to complement and expand professional invasive plant monitoring, which we found was often affected by inverse sampling biases. Invasive species represented a high proportion of iNaturalist plant observations, and were recorded in environments that were not captured by professional surveys. Combining the datasets thus led to more comprehensive estimates of suitable habitat.

6.
7.

Aim

Taxon co‐occurrence analysis is commonly used in ecology, but it has not been applied to range‐wide distribution data of partly allopatric taxa because existing methods cannot differentiate between distribution‐related effects and taxon interactions. Our first aim was to develop a taxon co‐occurrence analysis method that is also capable of taking into account the effect of species ranges and can handle faunistic records from museum databases or biodiversity inventories. Our second aim was to test the independence of taxon co‐occurrences of rock‐dwelling gastropods at different taxonomic levels, with a special focus on the Clausiliidae subfamily Alopiinae, and in particular the genus Montenegrina.

Location

Balkan Peninsula in south‐eastern Europe (46N–36N, 13.5E–28E).

Methods

We introduced a taxon‐specific metric that characterizes the occurrence probability at a given location. This probability was calculated as a distance‐weighted mean of the taxon's presence and absence records at all sites. We applied corrections to account for the biases introduced by varying sampling intensity in our dataset. Then we used probabilistic null‐models to simulate taxon distributions under the null hypothesis of no taxon interactions and calculated pairwise and cumulated co‐occurrences. Independence of taxon occurrences was tested by comparing observed co‐occurrences to simulated values.
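
The occurrence metric described above is a distance-weighted mean of presence and absence records. A minimal sketch follows; the abstract does not state the weighting function, so the Gaussian kernel, the `bandwidth` parameter and the function name are assumptions for illustration only, and the sampling-intensity corrections are not included.

```python
import math


def occurrence_probability(location, sites, bandwidth=10.0):
    """Distance-weighted mean of presence (1) and absence (0) records.

    location: (x, y) of the point being evaluated.
    sites: list of ((x, y), present) pairs, with present equal to 0 or 1.
    bandwidth: scale of the assumed Gaussian distance weighting.
    """
    weighted_presence = 0.0
    total_weight = 0.0
    for coords, present in sites:
        weight = math.exp(-(math.dist(location, coords) / bandwidth) ** 2)
        weighted_presence += weight * present
        total_weight += weight
    # If every site is too far away to carry any weight, report 0.
    return weighted_presence / total_weight if total_weight > 0 else 0.0
```

Values near 1 indicate that nearby sites mostly record the taxon as present; such probabilities could then feed a null model that simulates distributions without taxon interactions.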

Results

We observed significantly fewer co‐occurrences among species and intra‐generic lineages of Montenegrina than expected under the assumption of no taxon interaction.

Main conclusions

Fewer than expected co‐occurrences among species and intra‐generic clades indicate that species divergence preceded niche partitioning. This suggests a primary role of non‐adaptive processes in the speciation of rock‐dwelling gastropods. The method can account for the effects of distributional constraints in range‐wide datasets, making it suitable for testing ecological, biogeographical, or evolutionary hypotheses where interactions of partly allopatric taxa are in question.

8.
Fekete R., Haszonits Gy., Schmidt D., Bak H., Vincze O., Süveges K., Molnár V. A. Biological Invasions, 2021, 23(8): 2661-2674

The spread of alien species with the expansion of road networks and increasing traffic is a well-known phenomenon globally. Besides their corridor effects, road maintenance practices, such as the use of de-icing salts during winter, facilitate the spread of halophyte (salt-tolerant) species along roads. A good example is Plantago coronopus, a mainly coastal halophyte which has started spreading inland from the Atlantic and Mediterranean coastal habitats, recently reaching even Central European countries (e.g. Hungary). Here we studied the spread of this halophyte and tried to identify factors explaining its successful dispersion along roads, while also comparing native and non-native roadside occurrences with regard to altitude of the localities, size of roadside populations and frequency of roadside occurrences. We completed a comprehensive literature review and collected more than 200 reports of occurrence from roadsides spanning a total of 38 years. During systematic sampling, the frequency of the species along roads was significantly higher in the Mediterranean (native area) than along Hungarian (non-native area) roads; however, the average number of individuals at the sampling localities was very similar, and no significant difference could be detected. Using a germination experiment, we demonstrate that although the species is able to germinate even at high salt concentrations, salt is not required for germination. Indeed, salt significantly decreases germination probability of the seeds. The successful spread of the species could most likely be explained by its remarkably high seed production, or some special characteristics (e.g. seed dimorphism) and its ability to adapt to a wide range of environmental conditions. Considering the recent and rapid eastward spread of P. coronopus, occurrences in other countries where it has not been reported yet can be predicted in coming years.

9.
Species distribution modelling (SDM) has become an essential method in ecology and conservation. In the absence of survey data, the majority of SDMs are calibrated with opportunistic presence‐only data, incurring substantial sampling bias. We address the challenge of correcting for sampling bias in data‐sparse situations. We modelled the relative intensity of bat records in their entire range using three modelling algorithms under the point‐process modelling framework (GLMs with subset selection, GLMs fitted with an elastic‐net penalty, and Maxent). To correct for sampling bias, we applied model‐based bias correction by incorporating spatial information on site accessibility or sampling efforts. We evaluated the effect of bias correction on the models’ predictive performance (AUC and TSS), calculated on spatial‐block cross‐validation and a holdout data set. When evaluated with independent, but also sampling‐biased test data, correction for sampling bias led to improved predictions. The predictive performance of the three modelling algorithms was very similar. Elastic‐net models had intermediate performance, with a slight advantage for GLMs on cross‐validation and Maxent on hold‐out evaluation. Model‐based bias correction is very useful in data‐sparse situations, where detailed data are not available to apply other bias correction methods. However, bias correction success depends on how well the selected bias variables describe the sources of bias. In this study, accessibility covariates described bias in our data better than the effort covariate, and their use led to larger changes in predictive performance. Objectively evaluating bias correction requires bias‐free presence–absence test data, and without them the real improvement for describing a species’ environmental niche cannot be assessed.

10.
Historical biodiversity occurrence records are often discarded in spatial modeling analyses because of a lack of a method to quantify their sampling bias. Here we propose a new approach for predicting sampling bias in historical written records of occurrence, using a South African example as proof of concept. We modelled and mapped accessibility of the study area as the mean of proximity to freshwater and European settlements. We tested the model's ability to predict the location of historical biodiversity records from a dataset of 2612 large mammal occurrence records collected from historical written sources in South Africa in the period 1497–1920. We investigated temporal, spatial and environmental biases in these historical records and examined if the model prediction and occurrence dataset share similar environmental bias. We find a good agreement between the accessibility map and the distribution of sampling effort in the early historical period in South Africa. Environmental biases in the empirical data are identified, showing a preference for lower maximum temperature of the warmest month, higher mean monthly precipitation, higher net primary productivity and less arid biomes than expected by a uniform use of the study area. We find that the model prediction shares a similar environmental bias with the empirical data. Accessibility maps, built with very simple statistical rules and in the absence of empirical data, can thus predict the spatial and environmental biases observed in historical biodiversity occurrence records. We recommend that this approach be used as a tool to estimate sampling bias in small datasets of occurrence and to improve the use of these data in spatial analyses in ecological and conservation studies.

11.
  • Persistent seed banks are a key plant regeneration strategy, buffering environmental variation to allow population and species persistence. Understanding seed bank functioning within herb layer dynamics is therefore important. However, rather than assessing emergence from the seed bank in herb layer gaps, most studies evaluate the seed bank functioning via a greenhouse census. We hypothesise that greenhouse data may not reflect seed bank‐driven emergence in disturbance gaps due to methodological differences. Failure in detecting (specialist) species may then introduce methodological bias into the ecological interpretation of seed bank functions using greenhouse data.
  • The persistent seed bank was surveyed in 40 semi‐natural grassland plots across a fragmented landscape, quantifying seedling emergence in both the greenhouse and in disturbance gaps. Given the suspected interpretational bias, we tested whether each census uncovers similar seed bank responses to fragmentation.
  • Seed bank characteristics were similar between censuses. Census type affected seed bank composition, with >25% of species retrieved better by either census type, dependent on functional traits including seed longevity, production and size. Habitat specialists emerged more in disturbance gaps than in the greenhouse, while the opposite was true for ruderal species. Both censuses uncovered fragmentation‐induced seed bank patterns.
  • Low surface area sampling, larger depth of sampling and germination conditions cause underrepresentation of the habitat‐specialised part of the persistent seed bank flora during greenhouse censuses. Methodological bias introduced in the recorded seed bank data may consequently have significant implications for the ecological interpretation of seed bank community functions based on greenhouse data.

12.
The use of willingness-to-pay approaches in mammal conservation   Cited by: 4 (self-citations: 0, citations by others: 4)
With limited monetary resources available for nature conservation, policy‐makers need to be able to prioritize conservation objectives. This has traditionally been done using qualitative ecological criteria. However, since declines in species and habitats are largely the result of socio‐economic and political forces, human preferences and values should also be taken into account. An environmental economics technique, contingent valuation, provides one way of doing this by quantifying public willingness‐to‐pay towards specific conservation objectives. In this paper, the use of this approach for quantifying public preferences towards the UK Biodiversity Action Plans for four different British mammal species is considered. The species included are the Red Squirrel Sciurus vulgaris, the Brown Hare Lepus europaeus, the Otter Lutra lutra and the Water Vole Arvicola terrestris. Willingness‐to‐pay for conservation was increased by the inclusion of the Otter among the species, membership of an environmental organization and awareness of the general and species‐specific threats facing British mammals. It was reduced by the presence of the Brown Hare among the species being considered. These findings for British mammals are compared with other willingness‐to‐pay studies for mammal conservation worldwide. Willingness‐to‐pay tends to be greater for marine mammals than terrestrial ones, and recreational users of species (tourists or hunters) are generally more willing than residents to pay towards species conservation. The choice of technique for eliciting willingness‐to‐pay from respondents is also shown to be highly significant. Willingness‐to‐pay values for British mammals derived from contingent valuation are sensitive to the species included rather than merely symbolic. This indicates that, with care, such measures can be used as a reliable means of quantifying public preferences for conservation, and therefore contributing to the decision‐making process. 
However, irrespective of the internal consistency of contingent valuation, the validity of the approach, especially for use in nature conservation, is disputed. Willingness‐to‐pay is likely to reflect many interrelated factors such as ethical and moral values, knowledge and tradition, and monetary values may not be an adequate representation of these broader considerations. Willingness‐to‐pay approaches should therefore be used in addition to, rather than in place of, expert judgements and more deliberative approaches towards policy decision‐making for conservation.

13.
The urgency of conservation concerns in the tropics, linked with the limitations imposed on research efforts by the tropical environment has resulted in the development of methods for rapid assessment of biological communities. One such method, the MacKinnon list technique, has been increasingly applied in avifaunal surveys worldwide. Using paired tropical bird data sets from Ecuadorian cloud forest and Madagascan littoral forest, we compare the performance of the MacKinnon list with that of the more standard method of point counts in indicating when a site has been adequately surveyed, estimating the magnitude of species richness, quantifying relative species abundance, and providing an α‐index of diversity. In species‐rich Ecuadorian cloud forest, neither method produced data indicating adequate survey effort, despite extensive sampling, whereas in the relatively species‐poor Madagascan littoral forests, data collected by both methods indicated that the area had been sufficiently surveyed with comparable sampling effort. Species richness estimates generated from MacKinnon list data provided a more accurate estimate of the magnitude of the species richness for the Ecuadorian avifauna, whereas estimates for the Madagascan avifauna stabilised with relatively few samples using either method. Data collected by each method reflected different patterns of relative abundance among the five most abundant species, with MacKinnon list data showing a bias towards solitary and territorial species and against monospecific flocking species relative to the point count data. As a consequence of this bias, MacKinnon list data also fail to reflect accurately the structure of communities as quantified by an index of community evenness. Point counts, on the other hand, failed to capture the full species complement of the species‐rich Ecuadorian study area. 
As techniques for the rapid assessment of unsurveyed areas, both methods are subject to biases that limit their value, if used alone, in collecting data of scientific and management value. We propose a hybrid rapid assessment methodology that capitalises on the strengths of both techniques while compensating for their weaknesses.

14.
15.
The recent increase in corn ethanol production has drawn attention to the environmental sustainability of biofuel production. Environmental assessments of second‐generation biofuel crops (SGBC) have focused primarily on greenhouse gas emissions and water quality. However, expanding the production of cellulosic biomass resources, especially those that require dedicated agricultural land, is also likely to have impacts on biodiversity. We developed an optimization framework for projecting the spatial pattern of SGBC expansion in the United States and intersected these predictions with occurrence data for at‐risk species. In particular, we focused on two candidate perennial grass feedstocks, Panicum virgatum (switchgrass), and Miscanthus × giganteus (Miscanthus). Tradeoffs between biodiversity and economic profitability are assessed using county level data sets of SGBC yield, agricultural land availability, land rents, and at‐risk species occurrences. Results show that future SGBC expansion is likely to occur outside of the Corn Belt, where conventional biofuel feedstocks are currently grown. The set of at‐risk species that could potentially be impacted is therefore likely to be different from the at‐risk species prevalent in the agroecological landscapes of the Upper Midwest that are dominated by corn and soy production. The total number and type of potentially impacted taxa are influenced by several factors, including the total demand for cellulosic biomass, the type of agricultural land used for production, and the method for defining at‐risk species. SGBC production is also concentrated in fewer counties when a national species conservation constraint is combined with a biofuel production mandate. This analysis provides a foundation for future research on species conservation in bioenergy production landscapes and highlights the importance of incorporating biodiversity into broader environmental assessments of biofuel sustainability.

16.
Due to socioeconomic differences, the accuracy and extent of reporting on the occurrence of native species differs among countries, which can impact the performance of species distribution models. We assessed the importance of geographical biases in occurrence data on model performance using Hydrilla verticillata as a case study. We used Maxent to predict potential North American distribution of the aquatic invasive macrophyte based upon training data from its native range. We produced a model using all available native range occurrence data, then explored the change in model performance produced by omitting subsets of training data based on political boundaries. We also compared those results with models trained on data from which a random sample of occurrence data was omitted from across the native range. Although most models accurately predicted the occurrence of H. verticillata in North America (AUC > 0.7600), data omissions influenced model predictions. Omitting data based on political boundaries resulted in larger shifts in model accuracy than omitting randomly selected occurrence data. For well‐documented species like H. verticillata, missing records from single countries or ecoregions may minimally influence model predictions, but for species with fewer documented occurrences or poorly understood ranges, geographic biases could misguide predictions. Regardless of focal species, we recommend that future species distribution modeling efforts begin with a reflection on potential spatial biases of available occurrence data. Improved biodiversity surveillance and reporting will provide benefit not only in invaded ranges but also within under‐reported and unexplored native ranges.

17.
Leveraging existing presence records and geospatial datasets, species distribution modeling has been widely applied to informing species conservation and restoration efforts. Maxent is one of the most popular modeling algorithms, yet recent research has demonstrated Maxent models are vulnerable to prediction errors related to spatial sampling bias and model complexity. Despite elevated rates of biodiversity imperilment in stream ecosystems, the application of Maxent models to stream networks has lagged, as has the availability of tools to address potential sources of error and calculate model evaluation metrics when modeling in nonraster environments (such as stream networks). Herein, we use Maxent and customized R code to estimate the potential distribution of paddlefish (Polyodon spathula) at a stream‐segment level within the Arkansas River basin, USA, while accounting for potential spatial sampling bias and model complexity. Filtering the presence data appeared to adequately remove an eastward, large‐river sampling bias that was evident within the unfiltered presence dataset. In particular, our novel riverscape filter provided a repeatable means of obtaining a relatively even coverage of presence data among watersheds and streams of varying sizes. The greatest differences in estimated distributions were observed among models constructed with default versus AICc‐selected parameterization. Although all models had similarly high performance and evaluation metrics, the AICc‐selected models were more inclusive of westward‐situated and smaller, headwater streams. Overall, our results solidified the importance of accounting for model complexity and spatial sampling bias in SDMs constructed within stream networks and provided a roadmap for future paddlefish restoration efforts in the study area.
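
As a rough illustration of what filtering presence data to reduce spatial sampling bias can look like, the sketch below keeps at most one record per square grid cell. This is a generic grid filter in Python, not the riverscape filter or the R code the authors describe; the function name and the planar coordinate input are assumptions.

```python
def grid_thin(records, cell_size):
    """Keep the first presence record that falls in each square grid cell.

    records: iterable of (x, y) coordinates in planar units (assumed).
    cell_size: side length of the grid cells, in the same units.
    """
    kept = {}
    for x, y in records:
        cell = (int(x // cell_size), int(y // cell_size))
        # setdefault stores a record only if the cell is still empty.
        kept.setdefault(cell, (x, y))
    return list(kept.values())
```

Heavily surveyed areas contribute many records to the same cells, so thinning of this kind evens out coverage at the cost of discarding data.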

18.
In bacteria, synonymous codon usage can be considerably affected by base composition at neighboring sites. Such context-dependent biases may be caused by either selection against specific nucleotide motifs or context-dependent mutation biases. Here we consider the evolutionary conservation of context-dependent codon bias across 11 completely sequenced bacterial genomes. In particular, we focus on two contextual biases previously identified in Escherichia coli: the avoidance of out-of-frame stop codons and AGG motifs. By identifying homologues of E. coli genes, we also investigate the effect of gene expression level in Haemophilus influenzae and Mycoplasma genitalium. We find that while context-dependent codon biases are widespread in bacteria, few are conserved across all species considered. Avoidance of out-of-frame stop codons does not apply to all stop codons or amino acids in E. coli, does not hold for different species, does not increase with gene expression level, and is not relaxed in Mycoplasma spp., in which the canonical stop codon, TGA, is recognized as tryptophan. Avoidance of AGG motifs shows some evolutionary conservation and increases with gene expression level in E. coli, suggestive of the action of selection, but the cause of the bias differs between species. These results demonstrate that strong context-dependent forces, both selective and mutational, operate on synonymous codon usage but that these differ considerably between genomes. Received: 6 May 1999 / Accepted: 29 October 1999
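
For readers unfamiliar with the term, an out-of-frame stop codon is a stop triplet that appears when a coding sequence is read one or two bases off its true reading frame. The sketch below counts them; it is a simple illustration rather than the authors' analysis, and the function and parameter names are assumptions.

```python
STANDARD_STOPS = frozenset({"TAA", "TAG", "TGA"})


def out_of_frame_stops(cds, stops=STANDARD_STOPS):
    """Count stop codons in the +1 and +2 frames of a coding sequence.

    cds: DNA string whose first base starts the true reading frame.
    stops: set of stop codons; pass {"TAA", "TAG"} for organisms such as
    Mycoplasma, where TGA is read as tryptophan.
    """
    cds = cds.upper()
    counts = {1: 0, 2: 0}
    for frame in (1, 2):
        # Step through non-overlapping triplets of the shifted frame.
        for i in range(frame, len(cds) - 2, 3):
            if cds[i:i + 3] in stops:
                counts[frame] += 1
    return counts
```

Comparing such counts with those expected from shuffled synonymous codons is one way to ask whether a genome avoids these motifs.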

19.
Much biodiversity data is collected worldwide, but it remains challenging to assemble the scattered knowledge for assessing biodiversity status and trends. The concept of Essential Biodiversity Variables (EBVs) was introduced to structure biodiversity monitoring globally, and to harmonize and standardize biodiversity data from disparate sources to capture a minimum set of critical variables required to study, report and manage biodiversity change. Here, we assess the challenges of a ‘Big Data’ approach to building global EBV data products across taxa and spatiotemporal scales, focusing on species distribution and abundance. The majority of currently available data on species distributions derives from incidentally reported observations or from surveys where presence‐only or presence–absence data are sampled repeatedly with standardized protocols. Most abundance data come from opportunistic population counts or from population time series using standardized protocols (e.g. repeated surveys of the same population from single or multiple sites). Enormous complexity exists in integrating these heterogeneous, multi‐source data sets across space, time, taxa and different sampling methods. Integration of such data into global EBV data products requires correcting biases introduced by imperfect detection and varying sampling effort, dealing with different spatial resolution and extents, harmonizing measurement units from different data sources or sampling methods, applying statistical tools and models for spatial inter‐ or extrapolation, and quantifying sources of uncertainty and errors in data and models. To support the development of EBVs by the Group on Earth Observations Biodiversity Observation Network (GEO BON), we identify 11 key workflow steps that will operationalize the process of building EBV data products within and across research infrastructures worldwide. 
These workflow steps take multiple sequential activities into account, including identification and aggregation of various raw data sources, data quality control, taxonomic name matching and statistical modelling of integrated data. We illustrate these steps with concrete examples from existing citizen science and professional monitoring projects, including eBird, the Tropical Ecology Assessment and Monitoring network, the Living Planet Index and the Baltic Sea zooplankton monitoring. The identified workflow steps are applicable to both terrestrial and aquatic systems and a broad range of spatial, temporal and taxonomic scales. They depend on clear, findable and accessible metadata, and we provide an overview of current data and metadata standards. Several challenges remain to be solved for building global EBV data products: (i) developing tools and models for combining heterogeneous, multi‐source data sets and filling data gaps in geographic, temporal and taxonomic coverage, (ii) integrating emerging methods and technologies for data collection such as citizen science, sensor networks, DNA‐based techniques and satellite remote sensing, (iii) solving major technical issues related to data product structure, data storage, execution of workflows and the production process/cycle as well as approaching technical interoperability among research infrastructures, (iv) allowing semantic interoperability by developing and adopting standards and tools for capturing consistent data and metadata, and (v) ensuring legal interoperability by endorsing open data or data that are free from restrictions on use, modification and sharing. Addressing these challenges is critical for biodiversity research and for assessing progress towards conservation policy targets and sustainable development goals.

20.

Aim

To improve the accuracy of inferences on habitat associations and distribution patterns of rare species by combining machine learning, spatial filtering and resampling to address class imbalance and spatial bias in large volumes of citizen science data.

Innovation

Modelling rare species’ distributions is a pressing challenge for conservation and applied research. Often, a large number of surveys are required before enough detections occur to model distributions of rare species accurately, resulting in a data set with a high proportion of non‐detections (i.e. class imbalance). Citizen science data can provide a cost‐effective source of surveys but likely suffer from class imbalance. Citizen science data also suffer from spatial bias, likely from preferential sampling. To correct for class imbalance and spatial bias, we used spatial filtering to under‐sample the majority class (non‐detection) while maintaining all of the limited information from the minority class (detection). We investigated the use of spatial under‐sampling with randomForest models and compared it to common approaches used for imbalanced data, the synthetic minority oversampling technique (SMOTE), weighted random forest and balanced random forest models. Model accuracy was assessed using kappa, Brier score and AUC. We demonstrate the method by evaluating habitat associations and seasonal distribution patterns using citizen science data for a rare species, the tricoloured blackbird (Agelaius tricolor).
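The idea of spatial under-sampling described above can be sketched in a few lines of Python. This is a hedged illustration, not the authors' implementation: the function name `spatial_undersample`, the square grid, the `cell_size` parameter and the rule of keeping one randomly chosen non-detection per grid cell are all assumptions made for the example. What it does take from the text is that every detection (the minority class) is retained while non-detections (the majority class) are thinned spatially.

```python
import random
from collections import defaultdict


def spatial_undersample(records, cell_size, seed=0):
    """Thin non-detections spatially while keeping every detection.

    `records` is a list of (x, y, detected) tuples, with coordinates in
    any projected unit and `detected` a bool. Non-detections are binned
    into square grid cells of side `cell_size`, and one record per cell
    is kept at random (an assumed rule, chosen for illustration).
    """
    rng = random.Random(seed)

    # Minority class: keep all detections untouched.
    detections = [r for r in records if r[2]]

    # Majority class: group non-detections by the grid cell they fall in.
    cells = defaultdict(list)
    for x, y, detected in records:
        if not detected:
            cell = (int(x // cell_size), int(y // cell_size))
            cells[cell].append((x, y, detected))

    # Keep one randomly chosen non-detection per occupied cell.
    thinned = [rng.choice(group) for _, group in sorted(cells.items())]
    return detections + thinned


# Hypothetical toy data: four non-detections in two cells, two detections.
surveys = [
    (0.1, 0.1, False), (0.2, 0.3, False), (0.4, 0.4, False),
    (1.5, 0.2, False), (0.3, 0.3, True), (0.35, 0.3, True),
]
balanced = spatial_undersample(surveys, cell_size=1.0)
print(len(balanced))
```

The thinned records could then be passed to any classifier; because densely surveyed areas lose more non-detections than sparsely surveyed ones, the step reduces class imbalance and spatial clustering at the same time.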

Main Conclusions

Spatial under‐sampling increased the accuracy of each model and outperformed the approach typically used to direct under‐sampling in the SMOTE algorithm. Our approach is the first to characterize winter distribution and movement of tricoloured blackbirds. Our results show that tricoloured blackbirds are positively associated with grassland, pasture and wetland habitats, and negatively associated with high elevations or evergreen forests during both winter and breeding seasons. The seasonal differences in distribution indicate that individuals move to the coast during the winter, as suggested by historical accounts.

