Similar literature
20 similar articles found (search time: 781 ms)
1.
Volker Bahn, Brian J. McGill. Oikos, 2013, 122(3): 321-331
Distribution models are used to predict the likelihood of occurrence or abundance of a species at locations where census data are not available. An integral part of modelling is the testing of model performance. We compared different schemes and measures for testing model performance using 79 species from the North American Breeding Bird Survey. The four testing schemes we compared featured increasing independence between test and training data: resubstitution, random data hold‐out and two spatially segregated data hold‐out designs. The different testing measures also addressed different levels of information content in the dependent variable: regression R² for absolute abundance, squared correlation coefficient r² for relative abundance and AUC/Somers' D for presence/absence. We found that higher levels of independence between test and training data lead to lower assessments of prediction accuracy. Even for data collected independently, spatial autocorrelation leads to dependence between random hold‐out test data and training data, and thus to inflated measures of model performance. While there is a general awareness of the importance of autocorrelation to model building and hypothesis testing, its consequences via violation of independence between training and testing data have not been addressed systematically and comprehensively before. Furthermore, increasing information content (from correctly classifying presence/absence, to predicting relative abundance, to predicting absolute abundance) leads to decreasing predictive performance. The current tests for presence/absence distribution models are typically overly optimistic because a) the test and training data are not independent and b) the correct classification of presence/absence has a relatively low information content and thus capability to address ecological and conservation questions compared to a prediction of abundance.
Meaningful evaluation of model performance requires testing on spatially independent data, if the intended application of the model is to predict into new geographic or climatic space, which arguably is the case for most applications of distribution models.
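The contrast between random and spatially segregated hold-out can be sketched in a few lines. This is only an illustration under stated assumptions, not the authors' procedure: the function names, the single cut along the x coordinate and the simulated site coordinates are all hypothetical choices made here.

```python
import math
import random

def random_holdout(sites, test_fraction=0.25, seed=0):
    """Random hold-out: test sites stay interspersed among training sites."""
    rng = random.Random(seed)
    shuffled = sites[:]
    rng.shuffle(shuffled)
    n_test = int(round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]  # (train, test)

def spatial_holdout(sites, test_fraction=0.25):
    """Spatially segregated hold-out: the easternmost sites form the test set.
    Cutting along x is an arbitrary, illustrative way to segregate the data."""
    ordered = sorted(sites, key=lambda s: s[0])
    n_test = int(round(len(ordered) * test_fraction))
    split = len(ordered) - n_test
    return ordered[:split], ordered[split:]  # (train, test)

def mean_nearest_train(train, test):
    """Mean distance from each test site to its nearest training site.
    Small values mean test sites sit next to training sites."""
    return sum(min(math.dist(t, s) for s in train) for t in test) / len(test)

# Hypothetical survey: 200 sites scattered over a 100 x 100 region.
rng = random.Random(1)
sites = [(rng.uniform(0, 100), rng.uniform(0, 100)) for _ in range(200)]

train_r, test_r = random_holdout(sites)
train_s, test_s = spatial_holdout(sites)
gap_random = mean_nearest_train(train_r, test_r)
gap_spatial = mean_nearest_train(train_s, test_s)
```

In this toy setting `gap_spatial` exceeds `gap_random`, which is the sense in which a segregated design reduces the spatial dependence between test and training data.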

2.
In spite of the widespread use of statistics in plant ecology, some misunderstandings are common. Lájer’s warning against non-random sampling in the field is well taken, but non-randomization is probably more common than we realize in experimental work too, and a frequent cause of inexplicable “significant” results. However, in the placement of quadrats/samples, restricted randomization is always preferable to plain random. The main purpose of randomization, as R.A. Fisher made clear, is to obtain a valid estimate of the error. Random placement does not, as Fisher realized, ensure independence of samples because of spatial autocorrelation, which is present in all ecological work. If we forget this, we can end up concluding that elephants carefully select moss cushions to tread on. Although a normal distribution is often formally required, tests such as the Analysis of Variance are fairly robust against departures. Obsession with normality leads to the use of inappropriate transformations, for example a log transformation when the author had no intention of a multiplicative model. Even worse is the use of a log (x + 1) transformation, which gives answers in neither additive nor multiplicative terms, and in a way unrelated to the means presented. There are several solutions to this, including randomization tests. After all this, we should not take the arbitrary value of 0.05 too seriously. Many statisticians do not.
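As a rough illustration of the randomization tests mentioned above, the following sketch compares two group means on the original (untransformed) scale by repeatedly shuffling group labels. The function name, the counts and the number of permutations are assumptions made for this example, not taken from the paper.

```python
import random

def permutation_test(group_a, group_b, n_perm=2000, seed=0):
    """Two-sided randomization test for a difference in means.
    Works on raw values, so no log(x + 1) transformation is needed."""
    rng = random.Random(seed)
    n_a = len(group_a)
    pooled = list(group_a) + list(group_b)
    n_b = len(pooled) - n_a
    observed = abs(sum(group_a) / n_a - sum(group_b) / n_b)
    extreme = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)  # reassign observations to groups at random
        diff = abs(sum(pooled[:n_a]) / n_a - sum(pooled[n_a:]) / n_b)
        if diff >= observed - 1e-12:  # tolerance for floating-point ties
            extreme += 1
    # Count the observed arrangement itself so the p-value is never zero.
    return (extreme + 1) / (n_perm + 1)

# Hypothetical quadrat counts, including zeros that would complicate a log transform.
control = [0, 1, 0, 2, 1, 0]
treated = [5, 9, 4, 12, 7, 30]
p_value = permutation_test(control, treated)
```

The p-value refers directly to the difference in the means that would be reported, which avoids the mismatch between transformed tests and untransformed means noted in the abstract.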

3.
The currently dominating hypothetico-deductive research paradigm for ecology has statistical hypothesis testing as a basic element. Classic statistical hypothesis testing does, however, present the ecologist with two fundamental dilemmas when field data are to be analyzed: (1) that the statistically motivated demand for a random and representative sample and the ecologically motivated demand for representation of variation in the study area cannot be fully met at the same time; and (2) that the statistically motivated demand for independence of errors calls for sampling distances that exceed the scales of relevant pattern-generating processes, so that samples with statistically desirable properties will be ecologically irrelevant. Reasons for these dilemmas are explained by consideration of the classic statistical Neyman-Pearson test procedure, properties of ecological variables, properties of sampling designs, interactions between properties of the ecological variables and properties of sampling designs, and specific assumptions of the statistical methods. Analytic solutions to problems underlying the dilemmas are briefly reviewed. I conclude that several important research objectives cannot be approached without subjective elements in sampling designs. I argue that a research strategy entirely based on rigorous statistical testing of hypotheses is insufficient for field ecological data and that inductive and deductive approaches are complementary in the process of building ecological knowledge. I recommend that great care is taken when statistical tests are applied to ecological field data. Use of less formal modelling approaches is recommended for cases when formal testing is not strictly needed. Sets of recommendations, “Guidelines for wise use of statistical tools”, are proposed both for testing and for modelling. 
Important elements of wise-use guidelines are parallel use of methods that preferably belong to different methodologies, selection of methods with few and less rigorous assumptions, conservative interpretation of results, and abandonment of definitive decisions based on a predefined significance level.

4.
In ecological field surveys, observations are gathered at different spatial locations. The purpose may be to relate biological response variables (e.g., species abundances) to explanatory environmental variables (e.g., soil characteristics). In the absence of prior knowledge, ecologists have been taught to rely on systematic or random sampling designs. If there is prior knowledge about the spatial patterning of the explanatory variables, obtained from either previous surveys or a pilot study, can we use this information to optimize the sampling design in order to maximize our ability to detect the relationships between the response and explanatory variables?
The specific questions addressed in this paper are: a) What is the effect (type I error) of spatial autocorrelation on the statistical tests commonly used by ecologists to analyse field survey data? b) Can we eliminate, or at least minimize, the effect of spatial autocorrelation by the design of the survey? Are there designs that provide greater power for surveys, at least under certain circumstances? c) Can we eliminate or control for the effect of spatial autocorrelation during the analysis? To answer the last question, we compared regular regression analysis to a modified t‐test developed by Dutilleul for correlation coefficients in the presence of spatial autocorrelation.
Replicated surfaces (typically, 1000 of them) were simulated using different spatial parameters, and these surfaces were subjected to different sampling designs and methods of statistical analysis. The simulated surfaces may represent, for example, vegetation response to underlying environmental variation. This allowed us 1) to measure the frequency of type I error (the rejection of the null hypothesis when in fact there is no effect of the environment on the response variable) and 2) to estimate the power of the different combinations of sampling designs and methods of statistical analysis (power is measured by the rate of rejection of the null hypothesis when an effect of the environment on the response variable has been created).
Our results indicate that: 1) Spatial autocorrelation in both the response and environmental variables affects the classical tests of significance of correlation or regression coefficients. Spatial autocorrelation in only one of the two variables does not affect the test of significance. 2) A broad‐scale spatial structure present in data has the same effect on the tests as spatial autocorrelation. When such a structure is present in one of the variables and autocorrelation is found in the other, or in both, the tests of significance have inflated rates of type I error. 3) Dutilleul's modified t‐test for the correlation coefficient, corrected for spatial autocorrelation, effectively corrects for spatial autocorrelation in the data. It also effectively corrects for the presence of deterministic structures, with or without spatial autocorrelation.
The presence of a broad‐scale deterministic structure may, in some cases, reduce the power of the modified t‐test.
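The first result above can be reproduced in miniature with a Monte Carlo experiment. The sketch below is not the authors' simulation: it substitutes one-dimensional AR(1) transects for simulated surfaces and a Fisher z approximation for the classical test of a correlation coefficient, and all names and parameter values are assumptions made for illustration.

```python
import math
import random
from statistics import NormalDist

def ar1_series(n, rho, rng):
    """Stationary AR(1) transect with lag-1 autocorrelation rho and unit variance."""
    x = rng.gauss(0, 1)
    out = []
    for _ in range(n):
        out.append(x)
        x = rho * x + math.sqrt(1 - rho ** 2) * rng.gauss(0, 1)
    return out

def pearson_r(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def rejects_null(x, y, alpha=0.05):
    """Naive test that treats all n observations as independent."""
    z = math.atanh(pearson_r(x, y)) * math.sqrt(len(x) - 3)
    return abs(z) > NormalDist().inv_cdf(1 - alpha / 2)

def type_i_rate(rho_x, rho_y, n=100, n_sim=500, seed=0):
    """Rejection rate when x and y are generated independently (null is true)."""
    rng = random.Random(seed)
    hits = sum(
        rejects_null(ar1_series(n, rho_x, rng), ar1_series(n, rho_y, rng))
        for _ in range(n_sim)
    )
    return hits / n_sim

rate_both = type_i_rate(0.9, 0.9)  # autocorrelation in both variables
rate_one = type_i_rate(0.9, 0.0)   # autocorrelation in one variable only
rate_none = type_i_rate(0.0, 0.0)  # no autocorrelation
```

Under these assumptions the rejection rate stays close to the nominal 5% unless both variables are autocorrelated, in which case it is strongly inflated.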

5.
Species diversity may be additively partitioned within and among samples (alpha and beta diversity) from hierarchically scaled studies to assess the proportion of the total diversity (gamma) found in different habitats, landscapes, or regions. We developed a statistical approach for testing null hypotheses that observed partitions of species richness or diversity indices differed from those expected by chance, and we illustrate these tests using data from a hierarchical study of forest-canopy beetles. Two null hypotheses were implemented using individual- and sample-based randomization tests to generate null distributions for alpha and beta components of diversity at multiple sampling scales. The two tests differed in their null distributions and power to detect statistically significant diversity components. Individual-based randomization was more powerful at all hierarchical levels and was sensitive to departures between observed and null partitions due to intraspecific aggregation of individuals. Sample-based randomization had less power but still may be useful for determining whether different habitats show a higher degree of differentiation in species diversity compared with random samples from the landscape. Null hypothesis tests provide a basis for inferences on partitions of species richness or diversity indices at multiple sampling levels, thereby increasing our understanding of how alpha and beta diversity change across spatial scales.
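A single-level version of the additive partition (gamma = alpha + beta) and of an individual-based randomization can be sketched as follows. This is a simplified illustration, not the authors' hierarchical implementation; the function names and the toy beetle samples are hypothetical.

```python
import random

def richness_partition(samples):
    """samples: list of lists, one species label per individual.
    Returns (alpha, beta, gamma) for species richness, with beta = gamma - alpha."""
    gamma = len({sp for s in samples for sp in s})
    alpha = sum(len(set(s)) for s in samples) / len(samples)
    return alpha, gamma - alpha, gamma

def individual_based_p(samples, n_rand=999, seed=0):
    """Shuffle individuals among samples (sample sizes fixed) and return the
    proportion of randomizations whose alpha is as low as the observed alpha."""
    rng = random.Random(seed)
    observed_alpha = richness_partition(samples)[0]
    pooled = [sp for s in samples for sp in s]
    sizes = [len(s) for s in samples]
    as_low = 0
    for _ in range(n_rand):
        rng.shuffle(pooled)
        start, null_samples = 0, []
        for size in sizes:
            null_samples.append(pooled[start:start + size])
            start += size
        if richness_partition(null_samples)[0] <= observed_alpha + 1e-12:
            as_low += 1
    return (as_low + 1) / (n_rand + 1)

# Hypothetical samples with strong intraspecific aggregation.
samples = [["a"] * 5 + ["b"], ["c"] * 5 + ["d"], ["e"] * 6]
alpha, beta, gamma = richness_partition(samples)
p_low_alpha = individual_based_p(samples)
```

Because individuals of each species are clumped within samples, observed alpha is lower (and beta higher) than in almost every randomization, which is the kind of departure the individual-based test is described as detecting.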

6.
Aim: Environmental niche models that utilize presence‐only data have been increasingly employed to model species distributions and test ecological and evolutionary predictions. The ideal method for evaluating the accuracy of a niche model is to train a model with one dataset and then test model predictions against an independent dataset. However, a truly independent dataset is often not available, and instead random subsets of the total data are used for ‘training’ and ‘testing’ purposes. The goal of this study was to determine how spatially autocorrelated sampling affects measures of niche model accuracy when using subsets of a larger dataset for accuracy evaluation.
Location: The distribution of Centaurea maculosa (spotted knapweed; Asteraceae) was modelled in six states in the western United States: California, Oregon, Washington, Idaho, Wyoming and Montana.
Methods: Two types of niche modelling algorithms – the genetic algorithm for rule‐set prediction (GARP) and maximum entropy modelling (as implemented with Maxent) – were used to model the potential distribution of C. maculosa across the region. The effect of spatially autocorrelated sampling was examined by applying a spatial filter to the presence‐only data (to reduce autocorrelation) and then comparing predictions made using the spatial filter with those using a random subset of the data, equal in sample size to the filtered data.
Results: The accuracy of predictions from both algorithms was sensitive to the spatial autocorrelation of sampling effort in the occurrence data. Spatial filtering led to lower values of the area under the receiver operating characteristic curve plot but higher similarity statistic (I) values when compared with predictions from models built with random subsets of the total data, meaning that spatial autocorrelation of sampling effort between training and test data led to inflated measures of accuracy.
Main conclusions: The findings indicate that care should be taken when interpreting the results from presence‐only niche models when training and test data have been randomly partitioned but occurrence data were non‐randomly sampled (in a spatially autocorrelated manner). The higher accuracies obtained without the spatial filter are a result of spatial autocorrelation of sampling effort between training and test data inflating measures of prediction accuracy. If independently surveyed data for testing predictions are unavailable, then it may be necessary to explicitly account for the spatial autocorrelation of sampling effort between randomly partitioned training and test subsets when evaluating niche model predictions.

7.
This review identifies several important challenges in null model testing in ecology: 1) developing randomization algorithms that generate appropriate patterns for a specified null hypothesis; these randomization algorithms stake out a middle ground between formal Pearson–Neyman tests (which require a fully‐specified null distribution) and specific process‐based models (which require parameter values that cannot be easily and independently estimated); 2) developing metrics that specify a particular pattern in a matrix, but ideally exclude other, related patterns; 3) avoiding classification schemes based on idealized matrix patterns that may prove to be inconsistent or contradictory when tested with empirical matrices that do not have the idealized pattern; 4) testing the performance of proposed null models and metrics with artificial test matrices that contain specified levels of pattern and randomness; 5) moving beyond simple presence–absence matrices to incorporate species‐level traits (such as abundance) and site‐level traits (such as habitat suitability) into null model analysis; 6) creating null models that perform well with many sites, many species pairs, and varying degrees of spatial autocorrelation in species occurrence data. In spite of these challenges, the development and application of null models has continued to provide valuable insights in ecology, evolution, and biogeography for over 80 years.

8.

Background

Independence between observations is a standard prerequisite of traditional statistical tests of association. This condition is, however, violated when autocorrelation is present within the data. In the case of variables that are regularly sampled in space (i.e. lattice data or images), such as those provided by remote-sensing or geographical databases, this problem is particularly acute. Because analytic derivation of the null probability distribution of the test statistic (e.g. Pearson's r) is not always possible when autocorrelation is present, we propose instead the use of a Monte Carlo simulation with surrogate data.

Methodology/Principal Findings

The null hypothesis that two observed mapped variables are the result of independent pattern generating processes is tested here by generating sets of random image data while preserving the autocorrelation function of the original images. Surrogates are generated by matching the dual-tree complex wavelet spectra (and hence the autocorrelation functions) of white noise images with the spectra of the original images. The generated images can then be used to build the probability distribution function of any statistic of association under the null hypothesis. We demonstrate the validity of a statistical test of association based on these surrogates with both actual and synthetic data and compare it with a corrected parametric test and three existing methods that generate surrogates (randomization, random rotations and shifts, and iterative amplitude adjusted Fourier transform). Type I error control was excellent, even with strong and long-range autocorrelation, which is not the case for alternative methods.

Conclusions/Significance

The wavelet-based surrogates are particularly appropriate in cases where autocorrelation appears at all scales or is direction-dependent (anisotropy). We explore the potential of the method for association tests involving a lattice of binary data and discuss its potential for validation of species distribution models. An implementation of the method in Java for the generation of wavelet-based surrogates is available online as supporting material.

9.
Local spatial autocorrelation in biological variables   (total citations: 2; self-citations: 0; citations by others: 2)
Spatial autocorrelation (SA) methods have recently been extended to include the detection of local spatial autocorrelation at individual sampling stations. We review the formulas for these statistics and report on the results of an extensive population-genetic simulation study we have published elsewhere to test the applicability of these methods in spatially distributed biological data. We find that most biological variables exhibit global SA, and that in such cases the methods proposed for testing the significance of local SA coefficients reject the null hypothesis excessively. When global SA is absent, permutational methods for testing significance yield reliable results. Although standard errors have been published for the local SA coefficients, their employment using an asymptotically normal approach leads to unreliable results; permutational methods are preferred. In addition to significance tests of suspected non-stationary localities, we can use these methods in an exploratory manner to find and identify hotspots (places with positive local SA) and coldspots (negative local SA) in a dataset. We illustrate the application of these methods in three biological examples from plant population biology, ecology and population genetics. The examples range from the study of single variables to the joint analysis of several variables and can lead to successful demographic and evolutionary inferences about the populations studied.

10.
Statistical tests for non-random associations with components of habitat or different kinds of prey require information about the availability of sub-habitats or types of prey. The data are obtained from sampling (Stage 1 samples). Tests are then constructed using this information to predict what will be the occupancy of habitats or composition of diet under the null hypothesis of random association. Estimates of actual occupancy of habitats or composition of diet are then obtained from Stage 2 sampling and tests are done to compare the observed data from Stage 2 with what was predicted from Stage 1. Estimates from each stage of sampling are subject to sampling error, particularly where small samples are involved. The errors involved in Stage 1 sampling are often ignored, resulting in biases in tests and excessive rejection of null hypotheses (i.e. non-random patterns are claimed when they are not present). Here, accurate tests are developed which take into account both types of error. For animals in patchy habitats, with two or more types of patch, the data from Stages 1 and 2 are used to derive maximum likelihood estimators for the proportions of area occupied by the sub-habitats and the proportions of animals in each sub-habitat. These are then used in χ² tests. For composition of diets, data are more complex, because the consumption of food of each type (on its own) must be estimated in separate experiments or sampling. So, Stage 1 sampling is more difficult and the maximum likelihood estimators described here are more complex. The accurate tests described here give much more realistic answers in that they properly control rates of Type I error, particularly with small samples. The effects of errors in Stage 1 sampling are, however, shown to be important, even for quite large samples. The tests can and should be used in any analyses of non-random association or preference among sub-habitats or types of prey.

11.
Many authors apply statistical tests to sets of relevés obtained using non-random methods to investigate phytosociological and ecological relationships. Frequently applied tests include the t-test, ANOVA, Mann-Whitney test, Kruskal-Wallis test, chi-square test (of independence, goodness-of-fit, and homogeneity), Kolmogorov-Smirnov test, concentration analysis, tests of linear correlation and Spearman rank correlation coefficient, computer intensive methods (such as randomization and re-sampling) and others. I examined the extent of reliability of the results of such tests applied to non-random data by examining the tests' requirements according to statistical theory. I conclude that when used for such data, the statistical tests do not provide reliable support for the inferences made because non-randomness of samples violated the demand for observations to be independent, and different parts of the investigated communities did not have equal chance to be represented in the sample. Additional requirements, e.g. of normality and homoscedasticity, were also neglected in several cases. The importance of data satisfying the basic requirements set by statistical tests is stressed.

12.
A primary focus of wildlife ecology is studying how the arrangement, quality, and distribution of habitat influence wildlife populations at multiple spatial scales. A practical limitation of conducting wildlife–habitat investigations in the field, however, is that sampling points tend to be close to one another, resulting in spatial clustering. Consequently, when ecologists seek to quantify the effects of environmental predictors surrounding their sampling points, they encounter the issue of using landscapes that are partially or completely overlapping. A presumed problem of overlapping landscapes is that data generated from these landscapes, when used as predictors in statistical modeling, might violate the assumption of independence. However, the independence of error is the critical assumption, not the independence of predictor variables. Nonetheless, many researchers strive to avoid such overlaps through sampling design or alternative analytical procedures and specialized software programs have been created to assist with this. We present theoretical arguments and empirical evidence showing that changing the amount of overlap does not alter the degree of spatial autocorrelation. Using data derived from 2 broad-scaled avian monitoring programs, we quantified the relationship between forest cover and bird abundance and occurrence at multiple landscapes ranging from 100 m to 24 km across. We found no clear evidence that increasing overlap of landscapes increased spatial autocorrelation in model residuals. Our results demonstrate that the concern of overlapping landscapes as a potential cause of violation of spatial independency among sampling units is misdirected and represents an oversimplification of the statistical and ecological issues surrounding spatial autocorrelation. 
Overlapping landscapes and spatial autocorrelation are separate issues in the modeling of wildlife populations and their habitats; non-overlapping landscapes do not ensure spatial independency and overlapping landscapes do not necessarily lead to greater spatial autocorrelation in model errors. © 2011 The Wildlife Society.

13.
Moore JE, Swihart RK. Oecologia, 2007, 152(4): 763-777
A community is "nested" when species assemblages in less rich sites form nonrandom subsets of those at richer sites. Conventional null models used to test for statistically nonrandom nestedness are under- or over-restrictive because they do not sufficiently isolate ecological processes of interest, which hinders ecological inference. We propose a class of null models that are ecologically explicit and interpretable. Expected values of species richness and incidence, rather than observed values, are used to create random presence-absence matrices for hypothesis testing. In our examples, based on six datasets, expected values were derived either by using an individually based random placement model or by fitting empirical models to richness data as a function of environmental covariates. We describe an algorithm for constructing unbiased null matrices, which permitted valid testing of our null models. Our approach avoids the problem of building too much structure into the null model, and enabled us to explicitly test whether observed communities were more nested than would be expected for a system structured solely by species-abundance and species-area or similar relationships. We argue that this test or similar tests are better determinants of whether a system is truly nested; a nested system should contain unique pattern not already predicted by more fundamental ecological principles such as species-area relationships. Most species assemblages we studied were not nested under these null models. Our results suggest that nestedness, beyond that which is explained by passive sampling processes, may not be as widespread as currently believed. These findings may help to improve the utility of nestedness as an ecological concept and conservation tool.

14.
Research frontiers in null model analysis   (total citations: 4; self-citations: 0; citations by others: 4)
Null models are pattern‐generating models that deliberately exclude a mechanism of interest, and allow for randomization tests of ecological and biogeographic data. Although they have had a controversial history, null models are widely used as statistical tools by ecologists and biogeographers. Three active research fronts in null model analysis include biodiversity measures, species co‐occurrence patterns, and macroecology. In the analysis of biodiversity, ecologists have used random sampling procedures such as rarefaction to adjust for differences in abundance and sampling effort. In the analysis of species co‐occurrence and assembly rules, null models have been used to detect the signature of species interactions. However, controversy persists over the details of computer algorithms used for randomizing presence–absence matrices. Finally, in the newly emerging discipline of macroecology, null models can be used to identify constraining boundaries in bivariate scatterplots of variables such as body size, range size, and population density. Null models provide specificity and flexibility in data analysis that is often not possible with conventional statistical tests.

15.
The size of a sampling unit has a critical effect on our perception of ecological phenomena; it influences the variance and correlation structure estimates of the data. Classical statistical theory works well to predict the changes in variance when there is no autocorrelation structure, but it is not applicable when the data are spatially autocorrelated. Geostatistical theory, on the other hand, uses analytical relationships to predict the variance and autocorrelation structure that would be observed if a survey was conducted using sampling units of a different size. To test the geostatistical predictions, we used information about individual tree locations in the tropical rain forest of the Pasoh Reserve, Malaysia. This allowed us to simulate and compare various sampling designs. The original data were reorganised into three artificial data sets, computing tree densities (number of trees per square meter in each quadrat) corresponding to three quadrat sizes (5×5, 10×10 and 20×20 m²). Based upon the 5×5 m² data set, the spatial structure was modelled using a random component (nugget effect) plus an exponential model for the spatially structured component. Using the within-quadrat variances inferred from the variogram model, the change of support relationships predicted the spatial autocorrelation structure and new variances corresponding to 10×10 m² and 20×20 m² quadrats. The theoretical and empirical results agreed closely, while the classical approach would have largely underestimated the variance. As quadrat size increases, the range of the autocorrelation model increases, while the variance and proportion of noise in the data decrease. Large quadrats filter out the spatial variation occurring at scales smaller than the size of their sampling units, thus increasing the proportion of spatially structured component with range larger than the size of the sampling units.

16.
17.
Spatial point pattern analysis of available and exploited resources   (total citations: 7; self-citations: 0; citations by others: 7)
A patchy spatial distribution of resources underpins many models of population regulation and species coexistence, so ecologists require methods to analyse spatially‐explicit data of resource distribution and use. We describe a method for analysing maps of resources and testing hypotheses about species' distributions and selectivity. The method uses point pattern analysis based on the L‐function, the linearised form of Ripley's K‐function. Monte Carlo permutations are used for statistical tests. We estimate the difference between observed and expected values of L(t), an approach with several advantages: 1) The results are easy to interpret ecologically. 2) It obviates the need for edge correction, which has largely precluded the use of L‐functions where plot boundaries are “real”. Including edge corrections may lead to erroneous conclusions if the underlying assumptions are invalid. 3) The null expectation can take many forms; we illustrate two models: complete spatial randomness (to describe the spatial pattern of resources in the landscape) and the underlying pattern of resource patches in the landscape (akin to a neutral landscape approach). The second null is particularly useful to test whether spatial patterns of exploited resource points simply reflect the spatial patterns of all resource points. We tested this method using over 100 simulated point patterns encompassing a range of patterns that might occur in ecological systems, and some very extreme patterns. The approach is generally robust, but Type II decision errors might arise where spatial patterns are weak and when trying to detect a clumped pattern of exploited points against a clumped pattern of all points. An empirical example of an intertidal lichen growing on barnacle shells illustrates how this technique might be used to test hypotheses about dispersal mechanisms.
This approach can increase the value of survey data, by permitting quantification of natural resource patch distribution in the landscape as well as patterns of resource use by species.
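The idea of comparing observed and expected L(t) without edge correction can be sketched as below. This is an assumption-laden illustration, not the authors' code: it covers only the complete-spatial-randomness null on a unit-square plot, evaluates a single distance t, and uses hypothetical names and simulated points.

```python
import math
import random

def l_function(points, t, area=1.0):
    """Uncorrected estimate of L(t) = sqrt(K(t) / pi), where K(t) is the
    area-scaled proportion of ordered point pairs no more than t apart."""
    n = len(points)
    close = sum(
        1
        for i in range(n)
        for j in range(n)
        if i != j and math.dist(points[i], points[j]) <= t
    )
    return math.sqrt(area * close / (n * (n - 1)) / math.pi)

def l_difference(points, t, n_sim=199, seed=0):
    """Observed minus expected L(t) under complete spatial randomness in the
    unit square, plus a one-sided Monte Carlo p-value for clustering.
    Observed and simulated values share the same edge effects, so no edge
    correction is applied to either."""
    rng = random.Random(seed)
    n = len(points)
    observed = l_function(points, t)
    null = [
        l_function([(rng.random(), rng.random()) for _ in range(n)], t)
        for _ in range(n_sim)
    ]
    expected = sum(null) / n_sim
    p_clustered = (sum(v >= observed for v in null) + 1) / (n_sim + 1)
    return observed - expected, p_clustered

# Hypothetical clumped pattern: 10 points scattered tightly around each of 5 centres.
rng = random.Random(42)
centres = [(0.3, 0.3), (0.7, 0.7), (0.3, 0.7), (0.7, 0.3), (0.5, 0.5)]
clumped = [
    (cx + rng.gauss(0, 0.02), cy + rng.gauss(0, 0.02))
    for cx, cy in centres
    for _ in range(10)
]
diff, p_clustered = l_difference(clumped, t=0.1)
```

A positive difference indicates more neighbours within distance t than expected under the null. Replacing the random points with random subsets of a fixed set of resource locations would give a sketch of the second null model described in the abstract.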

18.
Most of the historical phytosociological data on vegetation composition have been sampled preferentially and thus belong to those ecological data that do not fulfill the statistical assumption of independence of observations, necessary for valid statistical testing and inference. Nevertheless, phytosociological data have been recently used for various ecological meta-analyses, especially in studies of large-scale vegetation patterns. For this reason, we focus on the comparison of preferential sampling with other sampling designs that have been recommended as more convenient alternatives from the point of view of statistical theory. We argue that while simple random sampling, systematic sampling and stratified random sampling better meet some of the statistical assumptions, preferential sampling yields data sets that cover a broader range of vegetation variability. Moreover, today’s large phytosociological databases provide huge amounts of vegetation data with unrivalled geographic extent and density. We conclude that in the near future ecologists will not be able to replace the preferentially sampled phytosociological data in large-scale studies. At the same time, phytosociological databases have to be complemented with relevés of vegetation composed mostly of common and generalist species, which are under-represented in historical data. Stratified random sampling seems to be a suitable tool for doing this. Nevertheless, a methodology and input data for stratification have to be developed to make stratified random sampling an ecologically more relevant and practical method.

19.
It has long been known that insufficient consideration of spatial autocorrelation leads to unreliable hypothesis‐tests and inaccurate parameter estimates. Yet, ecologists are confronted with a confusing array of methods to account for spatial autocorrelation. Although Beale et al. (2010) provided guidance for continuous data on regular grids, researchers still need advice for other types of data in more flexible spatial contexts. In this paper, we extend the work of Beale et al. (2010) to count data on both regularly‐ and irregularly‐spaced plots, the latter being commonly encountered in ecological studies. Through a simulation‐based approach, we assessed the accuracy and the type I errors of two frequentist and two Bayesian ready‐to‐use methods in the family of generalized mixed models, with distance‐based or neighbourhood‐based correlated random effects. In addition, we tested whether the methods are robust to spatial non‐stationarity, and over‐ and under‐dispersion – both typical features of species distribution count data which violate standard regression assumptions. In the simplest of our simulated datasets, the two frequentist methods gave inflated type I errors, while the two Bayesian methods provided satisfying results. When facing real‐world complexities, the distance‐based Bayesian method (MCMC with Langevin–Hastings updates) performed best of all. We hope that, in the light of our results, ecological researchers will feel more comfortable including spatial autocorrelation in their analyses of count data.

20.
Species dispersal studies provide valuable information in biological research. Restricted dispersal may give rise to a non-random distribution of genotypes in space. Detection of spatial genetic structure may therefore provide valuable insight into dispersal. Spatial structure has been treated via autocorrelation analysis with several univariate statistics for which results could depend on sampling designs. New geostatistical approaches (variogram-based analysis) have been proposed to overcome this problem. However, modelling parametric variograms could be difficult in practice. We introduce a non-parametric variogram-based method for autocorrelation analysis between DNA samples that have been genotyped by means of multilocus-multiallele molecular markers. The method addresses two important aspects of fine-scale spatial genetic analyses: the identification of a non-random distribution of genotypes in space, and the estimation of the magnitude of any non-random structure. The method uses a plot of the squared Euclidean genetic distances vs. spatial distances between pairs of DNA samples as an empirical variogram. The underlying spatial trend in the plot is fitted by non-parametric smoothing (LOESS, local regression). Finally, the predicted LOESS values are explained by segmented regressions (SR) to obtain classical spatial values such as the extent of autocorrelation. For illustration we use multivariate and single-locus genetic distances calculated from a microsatellite data set for which autocorrelation was previously reported. The LOESS/SR method produced a good fit, providing a value similar to the published autocorrelation for these data. The fit by LOESS/SR was simpler to obtain than the parametric analysis since initial parameter values are not required during the trend estimation process. The LOESS/SR method offers a new alternative for spatial analysis.
