Similar literature
 20 similar articles found (search time: 0 ms)
1.
Effects of sample size on the performance of species distribution models   Total citations: 8 (self-citations: 0; citations by others: 8)
A wide range of modelling algorithms is used by ecologists, conservation practitioners, and others to predict species ranges from point locality data. Unfortunately, the amount of data available is limited for many taxa and regions, making it essential to quantify the sensitivity of these algorithms to sample size. This is the first study to address this need by rigorously evaluating a broad suite of algorithms with independent presence–absence data from multiple species and regions. We evaluated predictions from 12 algorithms for 46 species (from six different regions of the world) at three sample sizes (100, 30, and 10 records). We used data from natural history collections to run the models, and evaluated the quality of model predictions with area under the receiver operating characteristic curve (AUC). With decreasing sample size, model accuracy decreased and variability increased across species and between models. Novel modelling methods that incorporate both interactions between predictor variables and complex response shapes (i.e. GBM, MARS-INT, BRUTO) performed better than most methods at large sample sizes but not at the smallest sample sizes. Other algorithms were much less sensitive to sample size, including an algorithm based on maximum entropy (MAXENT) that had among the best predictive power across all sample sizes. Relative to other algorithms, a distance metric algorithm (DOMAIN) and a genetic algorithm (OM-GARP) had intermediate performance at the largest sample size and among the best performance at the lowest sample size. No algorithm predicted consistently well with small sample size (n < 30) and this should encourage highly conservative use of predictions based on small sample size and restrict their use to exploratory modelling.

2.
Species distribution models (SDMs) have been widely used in ecology, biogeography, and conservation. Although ecological theory predicts that species occupancy is dynamic, the outputs of SDMs are generally converted into a single occurrence map, and model performance is evaluated in terms of success in predicting presences and absences. The aim of this study was to characterize the effects of a gradual response in species occupancy to environmental gradients on the performance of SDMs. First, we outline guidelines for the appropriate simulation of artificial species that allows controlling for gradualism and prevalence in the occupancy patterns over an environmental gradient. Second, we derive theoretical expected values for success measures based on presence‐absence predictions (AUC, Kappa, sensitivity and specificity). Finally, we used artificial species to exemplify and test the effect of a gradual probabilistic occupancy response to environmental gradients on SDM performance. Our results show that when a species responds gradually to an environmental gradient, conventional measures of SDM predictive success based on presence‐absence cannot be expected to attain the performance values currently accepted as good, even for a model that perfectly recovers the true probability of occurrence. A gradual response imposes a theoretical expected value for these measures of performance that can be calculated from the species properties. However, irrespective of the statistical modeling strategy used and of how gradual the species response is, one can recover the true probability of occurrence as a function of environmental variables provided that species and sample prevalence are similar. Therefore, model performance based on presence‐absence should be judged against the theoretical expected value rather than against absolute values currently in use, such as AUC > 0.8. 
Overall, we advocate for a wider use of the probability of occurrence and emphasize the need for further technical developments in this sense.
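The central claim of this abstract, that a model recovering the true probability of occurrence can still score a modest AUC when the species responds gradually, can be illustrated with a small simulation. The sketch below is not the authors' procedure: the logistic response, the `steepness` parameter, the uniform gradient, and the function names are all assumptions chosen for illustration.

```python
import math
import random


def auc(scores, labels):
    """AUC as the share of presence/absence pairs ranked correctly (ties count 0.5)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def auc_of_perfect_model(steepness, n_sites=1000, seed=1):
    """Simulate an artificial species and score the TRUE occurrence probability.

    Assumption: occupancy follows a logistic response to one environmental
    gradient; a larger `steepness` means a more abrupt (less gradual) response.
    """
    rng = random.Random(seed)
    gradient = [rng.uniform(-1.0, 1.0) for _ in range(n_sites)]
    true_prob = [1.0 / (1.0 + math.exp(-steepness * x)) for x in gradient]
    # Each site is occupied with its true probability (probabilistic occupancy).
    occupied = [1 if rng.random() < p else 0 for p in true_prob]
    # The "model" here is perfect: it predicts exactly the true probability.
    return auc(true_prob, occupied)
```

Under these assumptions, `auc_of_perfect_model(2.0)` (a gradual response) comes out well below `auc_of_perfect_model(200.0)` (a near-threshold response), even though both score the exact generating probabilities.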

3.
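Species distribution models (SDMs) quantify the relationship between species distributions and environmental variables and extrapolate it to unsurveyed landscape units in order to simulate and predict the potential distribution of organisms in geographic space. They are an important tool in ecology, biogeography, conservation biology, and related fields. However, current SDMs mainly use abiotic factors as predictor variables. Because they are difficult to quantify and to represent in models, biotic factors, especially interspecific interactions, are often neglected in SDMs. Interspecific interactions...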

4.
Scale is a vital component to consider in ecological research, and spatial resolution or grain size is one of its key facets. Species distribution models (SDMs) are prime examples of ecological research in which grain size is an important component. Despite this, SDMs rarely explicitly examine the effects of varying the grain size of the predictors for species with different niche breadths. To investigate the effect of grain size and niche breadth on SDMs, we simulated four virtual species with different grain sizes/niche breadths using three environmental predictors (elevation, aspect, and percent forest) across two real landscapes of differing heterogeneity in predictor values. We aggregated these predictors to seven different grain sizes and modeled the distribution of each of our simulated species using MaxEnt and GLM techniques at each grain size. We examined model accuracy using the AUC statistic, Pearson's correlations of predicted suitability with the true suitability, and the binary area of presence determined from suitability above the maximum true skill statistic (TSS) threshold. Habitat specialists were more accurately modeled than generalist species, and the models constructed at the grain size from which a species was derived generally performed the best. The accuracy of models in the homogeneous landscape deteriorated with increasing grain size to a greater degree than models in the heterogeneous landscape. Variable effects on the model varied with grain size, with elevation increasing in importance as grain size increased while aspect lost importance. The area of predicted presence was drastically affected by grain size, with larger grain sizes overpredicting this value by up to a factor of 14. Our results have implications for species distribution modeling and conservation planning, and we suggest more studies include analysis of grain size as part of their protocol.

5.
Species distribution models are used for a range of ecological and evolutionary questions, but often are constructed from few and/or biased species occurrence records. Recent work has shown that the presence‐only model Maxent performs well with small sample sizes. While the apparent accuracy of such models with small samples has been studied, less emphasis has been placed on the effect of small or biased species records on the secondary modeling steps, specifically accuracy assessment and threshold selection, particularly with profile (presence‐only) modeling techniques. When testing the effects of small sample sizes on distribution models, accuracy assessment has generally been conducted with complete species occurrence data, rather than similarly limited (e.g. few or biased) test data. Likewise, selection of a probability threshold – a selection of probability that classifies a model into discrete areas of presences and absences – has also generally been conducted with complete data. In this study we subsampled distribution data for an endangered rodent across multiple years to assess the effects of different sample sizes and types of bias on threshold selection, and examine the differences between apparent and actual accuracy of the models. Although some previously recommended threshold selection techniques showed little difference in threshold selection, the most commonly used methods performed poorly. Apparent model accuracy calculated from limited data was much higher than true model accuracy, but the true model accuracy was lower than it could have been with a more optimal threshold. That is, models with thresholds and accuracy calculated from biased and limited data had inflated reported accuracy, but were less accurate than they could have been if better data on species distribution were available and an optimal threshold were used.

6.
Nowadays a great deal of research in the physiological field is conducted on experimental animals, and the methods used draw considerable public criticism. Therefore, much recent effort has focused on the welfare of the animals. The main aim of this study was to determine the effect of an experimental sample collection method on selected parameters of stress. In the experiment, rabbit blood was collected from the marginal ear vein in two ways. The first was the standard method, with one person gently restraining the animal and another collecting the blood. The second was an experimental method in which the animal was placed in a sack and the blood was collected in the dark. During the experiment, the levels of cortisol (the main stress indicator in the organism) and other health parameters of the animals, including the mineral profile and haematological parameters, were observed. Our results show no significant changes in cortisol levels, although there was a decreasing tendency in the samples from the second (dark) collection. Haematological parameters were generally within the reference values, and no significant changes were found except in lymphocyte levels and lymphocyte percentage, which showed a significant increase in the second collection period. In addition, mean corpuscular haemoglobin and the percentage of neutrophils showed a significant decrease. Mineral profile parameters showed no significant changes except for phosphorus levels. Based on these results, we can state that the experimental sample collection method has no effect on the blood parameters of the animals, but we observed a statistically insignificant decrease in cortisol levels, which may suggest that collection in the dark is less stressful for the animals.

7.
In species distribution analyses, environmental predictors and distribution data for large spatial extents are often available in long‐lat format, such as degree raster grids. Long‐lat projections suffer from unequal cell sizes, as a degree of longitude decreases in length from approximately 110 km at the equator to 0 km at the poles. Here we investigate whether long‐lat and equal‐area projections yield similar model parameter estimates, or result in a consistent bias. We analyzed the environmental effects on the distribution of 12 ungulate species with a northern distribution, as models for these species should display the strongest effect of projectional distortion. Additionally, we chose four species with entirely continental distributions to investigate the effect of incomplete cell coverage at the coast. We expected that including model weights proportional to the actual cell area should compensate for the observed bias in model coefficients, and similarly that using land coverage of a cell should decrease bias in species with coastal distribution. As anticipated, model coefficients were different between long‐lat and equal‐area projections. Having progressively smaller and a higher number of cells with increasing latitude influenced the importance of parameters in models, increased the sample size for the northernmost parts of species ranges, and reduced the subcell variability of those areas. However, this bias could be largely removed by weighting long‐lat cells by the area they cover, and marginally by correcting for land coverage. Overall we found little effect of using long‐lat rather than equal‐area projections in our analysis. The fitted relationship between environmental parameters and occurrence probability differed only very little between the two projection types. We still recommend using equal‐area projections to avoid possible bias. 
More importantly, our results suggest that the cell area and the proportion of a cell covered by land should be used as a weight when analyzing distribution of terrestrial species.
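The weighting recommended in this abstract can be sketched as follows. This is a minimal illustration, not the authors' code: it assumes a spherical Earth with a mean radius of 6371 km and a regular degree grid, and the function names and the `land_fraction` argument are hypothetical.

```python
import math

EARTH_RADIUS_KM = 6371.0  # assumption: spherical Earth, mean radius


def cell_area_km2(lat_south_deg, cell_size_deg=1.0):
    """Area of a long-lat grid cell whose southern edge lies at lat_south_deg.

    On a sphere, the area between two parallels over a longitude span dlon is
    R^2 * dlon * (sin(lat_north) - sin(lat_south)).
    """
    lat_south = math.radians(lat_south_deg)
    lat_north = math.radians(lat_south_deg + cell_size_deg)
    dlon = math.radians(cell_size_deg)
    return EARTH_RADIUS_KM ** 2 * dlon * (math.sin(lat_north) - math.sin(lat_south))


def cell_weight(lat_south_deg, land_fraction=1.0, cell_size_deg=1.0):
    """Model weight for a cell: area relative to an equatorial cell, times land cover."""
    relative_area = (cell_area_km2(lat_south_deg, cell_size_deg)
                     / cell_area_km2(0.0, cell_size_deg))
    return relative_area * land_fraction
```

For example, `cell_weight(60.0)` is roughly half of `cell_weight(0.0)`, so a one-degree cell at 60°N would count about half as much as an equatorial cell in a weighted regression, and a half-land coastal cell at that latitude about a quarter as much.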

8.
Most high‐performing species distribution modelling techniques require both presences, and either absences or pseudo‐absences or background points. In this paper, we explore the effect of sample size, towards developing improved strategies for modelling. We generated 1800 virtual species with three levels of prevalence using ten modelling techniques, while varying the number of training presences (NTP) and the number of random points (NRP representing pseudo‐absences or background sites). For five of the ten modelling techniques we built two versions of models: one with an equal total weight (ETW) setting where the total weight for pseudo‐absence is equivalent to the total weight for presence, and another with an unequal total weight (UTW) setting where the total weight for pseudo‐absence is not required to be equal to the total weight for presence. We compared two strategies for NRP: a small multiplier strategy (i.e. setting NRP at a few times as large as NTP), and a large number strategy (i.e. using numerous random points). We produced ensemble models (by averaging the predictions from 30 models built with the same set of training presences and different sets of random points in equivalent numbers) for three NTP magnitudes and two NRP strategies. We found that model accuracy altered as NRP increased with four distinct patterns of performance: increasing, decreasing, arch‐shaped and horizontal. In most cases ETW improved model performance. Ensemble models had higher accuracy than the corresponding single models, and this improvement was pronounced when NTP was low. We conclude that a large NRP is not always an appropriate strategy. The best choice for NRP will depend on the modelling techniques used, species prevalence and NTP. We recommend building ensemble models instead of single models, using the small multiplier strategy for NRP with ETW, especially when only a small number of species presence records are available.
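The equal total weight (ETW) setting described above can be sketched in a few lines. The abstract does not say how the weights were implemented in each modelling technique, so the scheme below (weight 1 per presence, NTP/NRP per random point) is an assumption, and the function name is hypothetical.

```python
def equal_total_weights(n_presences, n_random_points):
    """Case weights under an ETW-style setting.

    Assumption: each presence gets weight 1 and each random point gets
    n_presences / n_random_points, so both classes carry the same total weight.
    Returns (presence_weights, random_point_weights).
    """
    if n_presences <= 0 or n_random_points <= 0:
        raise ValueError("both counts must be positive")
    random_point_weight = n_presences / n_random_points
    return [1.0] * n_presences, [random_point_weight] * n_random_points
```

With 30 presences and 10,000 random points, each random point would carry a weight of 0.003, so the many background points cannot swamp the few presences. Under an unequal total weight (UTW) setting, every record would simply keep weight 1.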

9.
1. Fifteen species richness estimators (three asymptotic based on species accumulation curves, 11 nonparametric, and one based on the species-area relationship) were compared by examining their performance in estimating the total species richness of epigean arthropods in the Azorean Laurisilva forests. Data obtained with standardized sampling of 78 transects in natural forest remnants of five islands were aggregated in seven different grains (i.e. ways of defining a single sample): islands, natural areas, transects, pairs of traps, traps, database records and individuals to assess the effect of using different sampling units on species richness estimations. 2. Estimated species richness scores depended both on the estimator considered and on the grain size used to aggregate data. However, several estimators (ACE, Chao 1, Jackknife 1 and 2 and Bootstrap) were precise in spite of grain variations. Weibull and several recent estimators [proposed by Rosenzweig et al. (Conservation Biology, 2003, 17, 864-874), and Ugland et al. (Journal of Animal Ecology, 2003, 72, 888-897)] performed poorly. 3. Estimations developed using the smaller grain sizes (pair of traps, traps, records and individuals) presented similar scores in a number of estimators (the above-mentioned plus ICE, Chao2, Michaelis-Menten, Negative Exponential and Clench). The estimations from those four sample sizes were also highly correlated. 4. Contrary to other studies, we conclude that most species richness estimators may be useful in biodiversity studies. Owing to their inherent formulas, several nonparametric and asymptotic estimators present insensitivity to differences in the way the samples are aggregated. Thus, they could be used to compare species richness scores obtained from different sampling strategies. Our results also point out that species richness estimations coming from small grain sizes can be directly compared and other estimators could give more precise results in those cases. 
We propose a decision framework based on our results and on the literature to assess which estimator should be used to compare species richness scores of different sites, depending on the grain size of the original data, and of the kind of data available (species occurrence or abundance data).

10.
Abundances of two harpacticoid copepod species, Enhydrosoma littorale Wells and Zausodes c.f. arenicolus Wilson, were significantly higher in one of two adjacent subtidal, soft-bottom habitats in St. George Sound, Florida (29°54′N : 84°37′48′′W). For Enhydrosoma littorale, a laboratory-preference experiment indicated that sediment-related factors caused the observed distribution. In a series of preference experiments, differences between the sediments of the two habitats in granulometry and organic matter were shown not to account for the preference. Rather, the preference results from differences in the microbes attached to the sediment particles in the two areas. In contrast, Zausodes c.f. arenicolus did not prefer sediments from its area of high field abundance in laboratory preference experiments, indicating that factors external to the sediment, i.e. hydrographic conditions or biological interactions, were responsible for this species' distribution.

11.
Ecological niche models and species distribution models are used in many fields of science. Despite their popularity, only recently have important aspects of the modeling process like model selection been developed. Choosing environmental variables with which to create these models is another critical part of the process, but methods currently in use are not consistent in their results and no comprehensive approach exists by which to perform this step. Here, we compared seven heuristic methods of variable selection against a novel approach that proposes to select best sets of variables by evaluating performance of models created with all combinations of variables and distinct parameter settings of the algorithm in concert. Our results were that, except for the jackknife method for one of the 12 species and fluctuation index for two of the 12 species, none of the heuristic methods for variable selection coincided with the exhaustive one. Performance decreased in models created using variables selected with heuristic methods and both underfitting and overfitting were detected when comparing their geographic projections with the ones of models created with variables selected with the exhaustive method. Using the exhaustive approach could be time consuming, so a two-step exercise may be necessary. However, using this method identifies adequate variable sets and parameter settings in concert that are associated with increased model performance.

12.
Prediction maps produced by species distribution models (SDMs) influence decision‐making in resource management or designation of land in conservation planning. Many studies have compared the prediction accuracy of different SDM modeling methods, but few have quantified the similarity among prediction maps. There has also been little systematic exploration of how the relative importance of different predictor variables varies among model types and affects map similarity. Our objective was to expand the evaluation of SDM performance for 45 plant species in southern California to better understand how map predictions vary among model types, and to explain what factors may affect spatial correspondence, including the selection and relative importance of different environmental variables. Four types of models were tested. Correlation among maps was highest between generalized linear models (GLMs) and generalized additive models (GAMs) and lowest between classification trees and GAMs or GLMs. Correlation between Random Forests (RFs) and GAMs was the same as between RFs and classification trees. Spatial correspondence among maps was influenced the most by model prediction accuracy (AUC) and species prevalence; map correspondence was highest when accuracy was high and prevalence was intermediate (average prevalence for all species was 0.124). Species functional type and the selection of climate variables also influenced map correspondence. For most (but not all) species, climate variables were more important than terrain or soil in predicting their distributions. Environmental variable selection varied according to modeling method, but the largest differences were between RFs and GLMs or GAMs. Although prediction accuracy was equal for GLMs, GAMs, and RFs, the differences in spatial predictions suggest that it may be important to evaluate the results of more than one model to estimate the range of spatial uncertainty before making planning decisions based on map outputs. 
This may be particularly important if models have low accuracy or if species prevalence is not intermediate.

13.
As globalization continues, the spread of invasive species is accelerating, posing a severe threat to native biodiversity. To manage such species, reduce their negative impact on native biota and utilize management costs efficiently, a profound understanding of their geographical distribution pattern is mandatory. In this study, the species distribution model Maxent was used to predict the potential spatial distribution of U. europaeus. To account for sampling bias, three bias correction methods were applied, including a novel approach to increase the number of presence points by sampling occurrences based on satellite images. Furthermore, a decision structured process was used to evaluate and select optimal Maxent parameterization and account for limitations of single evaluation criteria. The currently suitable area of U. europaeus is primarily distributed in the coastal and central regions of Chilean natural region Zona Sur in south-central Chile. Annual mean temperature (bio1), annual precipitation (bio12), and precipitation seasonality (bio15) were the most important environmental variables that affected the distribution of U. europaeus. The sampling of additional presence points could effectively correct for sampling bias in species occurrence data. The use of a decision structured process for model evaluation proved to be useful in determining optimal model parameterization for decreased model complexity. This study highlights the importance of optimized Maxent calibrations to yield results that are as accurate as possible. The predicted suitable habitats can inform nature conservation planners and landscape managers to guide and prioritize conservation measures.

14.
Aim: The method used to generate hypotheses about species distributions, in addition to spatial scale, may affect the biodiversity patterns that are then observed. We compared the performance of range maps and MaxEnt species distribution models at different spatial resolutions by examining the degree of similarity between predicted species richness and composition against observed values from well‐surveyed cells (WSCs). Location: Mexico. Methods: We estimated amphibian richness distributions at five spatial resolutions (from 0.083° to 2°) by overlaying 370 individual range maps or MaxEnt predictions, comparing the similarity of the spatial patterns and correlating predicted values with the observed values for WSCs. Additionally, we looked at species composition and assessed commission and omission errors associated with each method. Results: MaxEnt predictions reveal greater geographic differences in richness between species rich and species poor regions than the range maps did at the five resolutions assessed. Correlations between species richness values estimated by either of the two procedures and the observed values from the WSCs increased with decreasing resolution. The slopes of the regressions between the predicted and observed values indicate that MaxEnt overpredicts observed species richness at all of the resolutions used, while range maps underpredict them, except at the finest resolution. Prediction errors did not vary significantly between methods at any resolution and tended to decrease with decreasing resolution. The accuracy of both procedures was clearly different when commission and omission errors were examined separately. Main conclusions: Despite the congruent increase in the geographic richness patterns obtained from both procedures as resolution decreases, the maps created with these methods cannot be used interchangeably because of notable differences in the species compositions they report.

15.
D Sampson, G P Murphy. Cryobiology, 1971, 8(6): 594-598
Canine livers were stored in 4 different ways prior to an evaluation by a normothermic dilute blood test perfusion. The storage modalities were hypothermia alone, hypothermic low-flow perfusion with Tis-U-Sol, hypothermic low-flow perfusion with cryoprecipitated plasma and normothermic perfusion with dilute blood.

16.

Objectives

Validation studies in juvenile dental age estimation primarily focus on point estimates while interval performance for reference samples of different ancestry group compositions has received minimal attention. We tested the effect of reference sample size and composition by sex and ancestry group on age interval estimates.

Materials and Methods

The dataset consisted of Moorrees et al. dental scores from panoramic radiographs of 3334 London children of Bangladeshi and European ancestry and 2–23 years of age. Model stability was assessed using standard error of mean age-at-transition for univariate cumulative probit and sample size, group mixing (sex or ancestry), and staging system as factors. Age estimation performance was tested using molar reference samples of four sizes, stratified by year of age, sex, and ancestry. Age estimates were performed using Bayesian multivariate cumulative probit with 5-fold cross-validation.

Results

Standard error increased with decreasing sample size but showed no effect from mixing by sex or ancestry. Estimating ages using a reference and target sample of different sex reduced success rate significantly. The same test by ancestry groups had a lesser effect. Small sample size (n < 20/year of age) negatively affected most performance metrics.

Discussion

We found that reference sample size, followed by sex, primarily drove age estimation performance. Combining reference samples by ancestry produced equivalent or better estimates of age by all metrics than using a single-demographic reference of smaller size. We further proposed that population specificity is an alternative hypothesis of intergroup difference that has been erroneously treated as a null.

17.
18.
Sufficient sample sizes are needed in breeding programs to be confident, with a specified probability, of obtaining a specified number of plants of a desired genotype in segregating populations. We develop a method of determining the minimum sample size needed to produce, with specified probability, at least m individuals of a desired genotype. This method takes into consideration factors affecting differential selection of gametes, segregation at a single locus, and linkage among the loci of interest. We first consider the effects in the gametophyte (haploid level) of fitness and linkage on the frequencies of alleles at two linked loci, then at three or more linked loci. The probability of obtaining at least m successes, or occurrences of the desired allele, among n gametes is given by a formula based on the binomial distribution. This probability is affected by fitness and linkage through their impact on the probability that a single randomly chosen gamete is of the desired type. Using an extension of this approach, we examine the effects of the altered allelic frequencies on the likelihood of obtaining the desired genotype from a randomly chosen pair of gametes in the sporophyte (diploid level). A table and a figure show the sample size required to produce, with probability 0.95, m individuals of the desired genotype or phenotype, as a function of m and the probability that a randomly selected individual is of the desired type. BU-1031-MC in the Technical Report Series of the Biometrics Unit, Cornell University, Ithaca, New York 14853
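The binomial calculation this abstract describes can be sketched as a search for the smallest n such that P(X ≥ m) reaches the required probability, with X ~ Binomial(n, p). The sketch takes p, the probability that one randomly chosen individual is of the desired type, as given; the fitness and linkage adjustments that the paper applies to p are not reproduced here, and the function names are hypothetical.

```python
from math import comb


def prob_at_least(m, n, p):
    """P(X >= m) for X ~ Binomial(n, p), computed as 1 - P(X <= m - 1)."""
    return 1.0 - sum(comb(n, k) * p ** k * (1.0 - p) ** (n - k) for k in range(m))


def min_sample_size(m, p, probability=0.95):
    """Smallest n giving at least m individuals of the desired type.

    Assumption: individuals are independent and each is of the desired type
    with the same probability p.
    """
    if not 0.0 < p <= 1.0:
        raise ValueError("p must be in (0, 1]")
    if not 0.0 < probability < 1.0:
        raise ValueError("probability must be in (0, 1)")
    n = m  # at least m individuals are needed to observe m successes
    while prob_at_least(m, n, p) < probability:
        n += 1
    return n
```

For instance, with p = 0.25 (a recessive homozygote in an F2 population, say), `min_sample_size(1, 0.25)` returns 11, because 0.75^10 ≈ 0.056 still exceeds 0.05 while 0.75^11 ≈ 0.042 does not.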

19.
Inventory data for trees ≥ 10 cm DBH from a hectare plot are compared to data obtained by the Point-Centered Quarter Method along a line transect from the same locality in Anangu, Amazonian Ecuador. The one-hectare quadrat plot of 100 × 100 m had 734 individuals, 153 species, 46 families, a total basal area of 22.2 m², and an estimated above ground tree volume of 240.5 m³. The line transect had a calculated density of 728 individuals per hectare, which included 239 species, 51 families, a total basal area of 34.1 m², and an estimated above ground tree volume of 409.6 m³. Of the 20 species with the highest IVI, only four were shared by the two samples. The most important species were Quararibea ochrocalyx on the hectare plot and Iriartea deltoidea on the line transect, constituting 26.6 and 13.3% of the individuals, respectively. The five families with the highest FIV on the hectare plot (Bombacaceae, Arecaceae, Moraceae, Caesalpinaceae, and Lauraceae) and on the line transect (Arecaceae, Moraceae, Meliaceae, Mimosaceae and Caesalpinaceae) constitute 40.4% and 35.4% of the Family Importance Values of the samples, respectively. The Point-Centered Quarter Method used along a line transect reflects maximum diversity and provides average values of density and tree size in the area. The quadrat plot reflects the local structure and composition of the forest within the plot.

20.
The major role played by environmental factors in determining the geographical range sizes of species raises the possibility of describing their long-term dynamics in relatively simple terms, a goal which has hitherto proved elusive. Here we develop a stochastic differential equation to describe the dynamics of the range size of an individual species based on the relationship between abundance and range size, derive a limiting stationary probability model to quantify the stochastic nature of the range size for that species at steady state, and then generalize this model to the species-range size distribution for an assemblage. The model fits well to several empirical datasets of the geographical range sizes of species in taxonomic assemblages, and provides the simplest explanation of species-range size distributions to date.
