Similar literature
 Found 20 similar articles (search time: 203 ms)
1.
The assessment of population trends is a key point in wildlife conservation. Survey data collected over long periods may not be comparable because of environmental bias (i.e. inadequate representation of the variability of environmental covariates in the study area). Moreover, count data may be affected by both overdispersion (i.e. the variance is larger than the mean) and an excess of zero counts (potentially leading to zero inflation). The aim of this study was to define a modelling procedure for assessing long-term population trends that addresses these three issues, and to shed light on the effects of environmental bias, overdispersion, and zero inflation on trend estimates. To test our procedure, we used six bird species whose data were collected in northern Italy from 1992 to 2019. We designed a multi-step approach. First, using generalised additive models (GAMs), we implemented a full factorial design of models (eight models per species) that did or did not account for environmental bias (by including or omitting environmental covariates), overdispersion (by using a negative binomial or a Poisson distribution), and zero inflation (by using or not using zero-inflated models). Models were ranked according to the Akaike Information Criterion. Second, annual population indices (median and 95% confidence interval of the number of breeding pairs per point count) were predicted through a parametric bootstrap procedure. Third, long-term population trends were assessed and tested for significance by fitting weighted least squares linear regression models to the predicted annual indices. To evaluate the effect of environmental bias, overdispersion, and zero inflation on trend estimates, an average discrepancy index was calculated for each model group.
The results showed that environmental bias was the most important driver of differences in trend estimates, although overlooking overdispersion and zero inflation could lead to misleading results. For five species, zero-inflated GAMs were the best models for predicting annual population indices. Our findings suggested a mutual interaction between zero inflation and overdispersion, with overdispersion arising in non-zero-inflated models. Moreover, for species with flocking foraging and/or colonial breeding behaviours, overdispersed and zero-inflated models may be more adequate. In conclusion, properly handling environmental bias, which may affect many data sets from long-term monitoring programs, is crucial to obtaining reliable estimates of population trends. Furthermore, the extent to which overdispersion and zero inflation affect trend estimates should be assessed by comparing different models rather than presumed from statistical assumptions.
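
The two count-data issues named in this abstract, overdispersion and excess zeros, can be screened for with simple descriptive statistics before any model is fitted. The following is a minimal sketch in Python, not the study's GAM procedure; the function names and the point-count values are hypothetical and chosen only for illustration.

```python
import math
from statistics import mean, variance

def dispersion_index(counts):
    """Sample variance divided by sample mean (about 1 for Poisson data)."""
    return variance(counts) / mean(counts)

def excess_zero_fraction(counts):
    """Observed share of zeros minus the share expected under a Poisson
    distribution with the same mean, which is exp(-mean)."""
    observed = sum(1 for c in counts if c == 0) / len(counts)
    return observed - math.exp(-mean(counts))

# Hypothetical breeding-pair counts from 16 point counts (not real data).
counts = [0, 0, 0, 0, 0, 0, 1, 0, 7, 0, 0, 12, 0, 3, 0, 0]

print("variance/mean:", dispersion_index(counts))
print("excess zeros :", excess_zero_fraction(counts))
```

A ratio well above 1 hints at overdispersion and a positive excess-zero fraction hints at zero inflation, but, as the abstract stresses, the choice between Poisson, negative binomial, and zero-inflated models is better settled by comparing fitted models than by such summaries alone.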

2.
The case-crossover design was introduced in epidemiology 15 years ago as a method for studying the effects of a risk factor on a health event using only cases. The idea is to compare a case's exposure immediately prior to or during the case-defining event with that same person's exposure at otherwise similar "reference" times. An alternative approach to the analysis of daily exposure and case-only data is time series analysis. Here, log-linear regression models express the expected total number of events on each day as a function of the exposure level and potential confounding variables. In time series analyses of air pollution, smooth functions of time and weather are the main confounders. Time series and case-crossover methods are often viewed as competing methods. In this paper, we show that case-crossover using conditional logistic regression is a special case of time series analysis when there is a common exposure such as in air pollution studies. This equivalence provides computational convenience for case-crossover analyses and a better understanding of time series models. Time series log-linear regression accounts for overdispersion of the Poisson variance, while case-crossover analyses typically do not. This equivalence also permits model checking for case-crossover data using standard log-linear model diagnostics.

3.
Count data are very common in health services research, and the basic Poisson regression model often has to be extended to accommodate several sources of heterogeneity: (i) an excess number of zeros relative to a Poisson distribution, (ii) hierarchical structures and correlated data, and (iii) remaining "unexplained" sources of overdispersion. In this paper, we propose hierarchical zero-inflated and overdispersed models with independent, correlated, and shared random effects for both components of the mixture model. We show that all these extensions of the Poisson model can be based on the concept of mixture models, and that they can be combined to account for all these sources of heterogeneity. Expressions for the first two moments are derived and discussed. The models are applied to data on maternal deaths and related risk factors within health facilities in Mozambique. The final model shows that the maternal mortality rate mainly depends on the geographical location of the health facility, the percentage of women admitted with HIV and the percentage of referrals from the health facility.

4.
5.
Background: Statistical models are regularly used in the forecasting and surveillance of infectious diseases to guide public health. Variable selection assists in determining factors associated with disease transmission; however, the evaluation and suitability of the statistical model used to forecast disease transmission and outbreaks is often overlooked in this process. Here we aim to evaluate several modelling methods to optimise predictive modelling of Ross River virus (RRV) disease notifications and outbreaks in epidemiologically important regions of Victoria and Western Australia.
Methodology/Principal findings: We developed several statistical methods using meteorological and RRV surveillance data from July 2000 until June 2018 in Victoria and from July 1991 until June 2018 in Western Australia. Models were developed for 11 Local Government Areas (LGAs) in Victoria and seven LGAs in Western Australia. We found generalised additive models and generalised boosted regression models to be the best-fit models when predicting RRV outbreaks, and generalised additive models and negative binomial models to be the best-fit models when predicting RRV notifications. We found no association between a model's ability to predict RRV notifications and the level of RRV activity in an LGA, nor were outbreak predictions more accurate in LGAs with more RRV notifications. Moreover, we assessed the use of factor analysis to generate independent variables used in predictive modelling. In the majority of LGAs, this method did not result in better model predictive performance.
Conclusions/Significance: We demonstrate that models which are developed and used for predicting disease notifications may not be suitable for predicting disease outbreaks, or vice versa. Furthermore, poor predictive performance in modelling disease transmission may be the result of inappropriate model selection methods. Our findings provide approaches and methods to facilitate the selection of the best-fit statistical model for predicting mosquito-borne disease notifications and outbreaks used for disease surveillance.

6.
We investigated whether signals of known dispersal processes and habitat patch turnover could be detected in a snapshot of the distribution of the tansy leaf beetle Chrysolina graminis among patches of its host plant tansy Tanacetum vulgare. Beetle occupancy in 1305 patches was analysed using autologistic generalised additive models (GAMs). These model spatial autocorrelation with an autocovariate calculated as the distance-weighted rate of occupancy among neighbouring patches. The autocovariate that best explained beetle occupancy was one which represented the active search for patches during beetle dispersal, included a distance weight that closely matched a previously fitted dispersal kernel and had neighbourhood sizes encompassing ∼95% of known dispersal distances. Autocovariates distinguishing between neighbours on the same and opposite riverbanks outperformed those that did not, revealing the river as a barrier to dispersal. Differentiating between up and downstream autocorrelation did not improve model fit, as is consistent with the beetle's lack of directional bias in dispersal. Habitat connectivity (the extent to which it was surrounded by other patches) did not appear to affect beetle occupancy in the field, while positive effects were found for distributions simulated from the GAM. We argue that this reflects a non-equilibrium distribution driven by slow responses to high rates of habitat patch turnover due to limited dispersal ability. Our findings suggest that presence/absence snapshots can reveal patterns of dispersal and be used to test whether species' ranges are at equilibrium. Such information is important for effective conservation so the possibility of inferring these patterns from distribution data is an appealing one.

7.
Overdispersion is a common phenomenon in Poisson modeling, and the negative binomial (NB) model is frequently used to account for it. Testing approaches (Wald test, likelihood ratio test (LRT), and score test) for overdispersion in the Poisson regression versus the NB model are available. Because the generalized Poisson (GP) model is similar to the NB model, we consider the former as an alternate model for overdispersed count data. The score test has an advantage over the LRT and the Wald test in that it only requires that the parameter of interest be estimated under the null hypothesis. This paper proposes a score test for overdispersion based on the GP model and compares the power of the test with the LRT and Wald tests. A simulation study indicates that the score test based on the asymptotic standard normal distribution is more appropriate in practical applications because of its higher empirical power; however, it underestimates the nominal significance level, especially in small samples. Examples illustrate the results of comparing the candidate tests between the Poisson and GP models. A bootstrap test is also proposed to adjust for the underestimation of the nominal level in the score statistic when the sample size is small. The simulation study indicates that the bootstrap test has a significance level closer to the nominal size and has uniformly greater power than the score test based on the asymptotic standard normal distribution. From a practical perspective, we suggest that, if the score test gives even a weak indication that the Poisson model is inappropriate, say at the 0.10 significance level, the more accurate bootstrap procedure be used to test whether the GP model is more appropriate than the Poisson model. Finally, the Vuong test is illustrated to choose between GP and NB2 models for the same dataset.

8.
There have been numerous claims in the ecological literature that spatial autocorrelation in the residuals of ordinary least squares (OLS) regression models results in shifts in the partial coefficients, which bias the interpretation of factors influencing geographical patterns. We evaluate the validity of these claims using gridded species richness data for the birds of North America, South America, Europe, Africa, the ex‐USSR, and Australia. We used richness in 110×110 km cells and environmental predictor variables to generate OLS and simultaneous autoregressive (SAR) multiple regression models for each region. Spatial correlograms of the residuals from each OLS model were then used to identify the minimum distance between cells necessary to avoid short‐distance residual spatial autocorrelation in each data set. This distance was used to subsample cells to generate spatially independent data. The partial OLS coefficients estimated with the full dataset were then compared to the distributions of coefficients created with the subsamples. We found that OLS coefficients generated from data containing residual spatial autocorrelation were statistically indistinguishable from coefficients generated from the same data sets in which short‐distance spatial autocorrelation was not present in all 22 coefficients tested. Consistent with the statistical literature on this subject, we conclude that coefficients estimated from OLS regression are not seriously affected by the presence of spatial autocorrelation in gridded geographical data. Further, shifts in coefficients that occurred when using SAR tended to be correlated with levels of uncertainty in the OLS coefficients. Thus, shifts in the relative importance of the predictors between OLS and SAR models are expected when small‐scale patterns for these predictors create weaker and more unstable broad‐scale coefficients. 
Our results indicate both that OLS regression is unbiased and that differences between spatial and nonspatial regression models should be interpreted with an explicit awareness of spatial scale.

9.
In this paper, we consider selection based on the best predictor of animal additive genetic values in Gaussian linear mixed models, threshold models, Poisson mixed models, and log-normal frailty models for survival data (including models with time-dependent covariates with associated fixed or random effects). In the different models, expressions are given (when these can be found; otherwise unbiased estimates are given) for prediction error variance, accuracy of selection and expected response to selection on the additive genetic scale and on the observed scale. The expressions given for non-Gaussian traits are generalisations of the well-known formulas for Gaussian traits and reflect, for Poisson mixed models and frailty models for survival data, the hierarchical structure of the models. In general, the ratio of the additive genetic variance to the total variance in the Gaussian part of the model (heritability on the normally distributed level of the model) or a generalised version of heritability plays a central role in these formulas.

10.
Despite a growing interest in species distribution modelling, relatively little attention has been paid to spatial autocorrelation and non-stationarity. Both spatial autocorrelation (the tendency for adjacent locations to be more similar than distant ones) and non-stationarity (the variation in modelled relationships over space) are likely to be common properties of ecological systems. This paper focuses on non-stationarity and uses two local techniques, geographically weighted regression (GWR) and varying coefficient modelling (VCM), to assess its impact on model predictions. We extend two published studies, one on the presence–absence of calandra larks in Spain and the other on bird species richness in Britain, to compare GWR and VCM with the more usual global generalized linear modelling (GLM) and generalized additive modelling (GAM). For the calandra lark data, GWR and VCM produced better-fitting models than GLM or GAM. VCM in particular gave significantly reduced spatial autocorrelation in the model residuals. GWR showed that individual predictors became stationary at different spatial scales, indicating that distributions are influenced by ecological processes operating over multiple scales. VCM was able to predict occurrence accurately on independent data from the same geographical area as the training data but not beyond, whereas the GAM produced good results on all areas. Individual predictions from the local methods often differed substantially from the global models. For the species richness data, VCM and GWR produced far better predictions than ordinary regression. Our analyses suggest that modellers interpolating data to produce maps for practical actions (e.g. conservation) should consider local methods, whereas they should not be used for extrapolation to new areas. We argue that local methods are complementary to global methods, revealing details of habitat associations and data properties which global methods average out and miss.

11.
Aim: To analyse the effects of simultaneously using spatial and phylogenetic information in removing spatial autocorrelation of residuals within a multiple regression framework of trait analysis.
Location: Switzerland, Europe.
Methods: We used an eigenvector filtering approach to analyse the relationship between spatial distribution of a trait (flowering phenology) and environmental covariates in a multiple regression framework. Eigenvector filters were calculated from ordinations of distance matrices. Distance matrices were either based on pure spatial information, pure phylogenetic information or spatially structured phylogenetic information. In the multiple regression, those filters were selected which best reduced Moran's I coefficient of residual autocorrelation. These were added as covariates to a regression model of environmental variables explaining trait distribution.
Results: The simultaneous provision of spatial and phylogenetic information was effectively able to remove residual autocorrelation in the analysis. Adding phylogenetic information was superior to adding purely spatial information. Applying filters showed altered results, i.e. different environmental predictors were seen to be significant. Nevertheless, mean annual temperature and calcareous substrate remained the most important predictors to explain the onset of flowering in Switzerland; namely, the warmer the temperature and the more calcareous the substrate, the earlier the onset of flowering. A sequential approach, i.e. first removing the phylogenetic signal from traits and then applying a spatial analysis, did not provide more information or yield less autocorrelation than simple or purely spatial models.
Main conclusions: The combination of spatial and spatio-phylogenetic information is recommended in the analysis of trait distribution data in a multiple regression framework. This approach is an efficient means for reducing residual autocorrelation and for testing the robustness of results, including the indication of incomplete parameterizations, and can facilitate ecological interpretation.
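
The filter selection described in this abstract relies on Moran's I of the regression residuals. As a minimal sketch, assuming binary neighbour weights and made-up residual values (neither is taken from the paper), Moran's I can be computed as I = (n / W) · Σᵢⱼ wᵢⱼ (xᵢ − x̄)(xⱼ − x̄) / Σᵢ (xᵢ − x̄)², where W is the sum of all weights. The helper `chain_weights` is a hypothetical convenience for sites arranged along a line.

```python
def morans_i(values, weights):
    """Moran's I for a list of values and a square spatial weight matrix."""
    n = len(values)
    centre = sum(values) / n
    dev = [v - centre for v in values]
    total_weight = sum(sum(row) for row in weights)
    cross = sum(weights[i][j] * dev[i] * dev[j]
                for i in range(n) for j in range(n))
    return (n / total_weight) * cross / sum(d * d for d in dev)

def chain_weights(n):
    """Binary weights for n sites on a line: each site neighbours the next."""
    return [[1 if abs(i - j) == 1 else 0 for j in range(n)] for i in range(n)]

# Hypothetical residuals: a smooth gradient versus an alternating pattern.
smooth = [1.0, 2.0, 3.0, 4.0, 5.0]
alternating = [1.0, -1.0, 1.0, -1.0]

print(morans_i(smooth, chain_weights(5)))       # positive autocorrelation
print(morans_i(alternating, chain_weights(4)))  # negative autocorrelation
```

Values near zero suggest little residual spatial structure; in a filtering workflow like the one described, candidate eigenvector filters would be compared by how far they pull this statistic toward zero.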

12.
We analyze a real data set of reindeer fecal pellet-group counts obtained from a survey conducted in a forest area in northern Sweden. In the data set, over 70% of counts are zeros, and there is high spatial correlation. We use conditionally autoregressive random effects to model spatial correlation in a Poisson generalized linear mixed model (GLMM), a quasi-Poisson hierarchical generalized linear model (HGLM), a zero-inflated Poisson (ZIP) model, and hurdle models. The quasi-Poisson HGLM allows for both under- and overdispersion with excessive zeros, while the ZIP and hurdle models allow only for overdispersion. In analyzing the real data set, we see that the quasi-Poisson HGLMs can perform better than the other commonly used models, for example, ordinary Poisson HGLMs, spatial ZIP, and spatial hurdle models, and that the underdispersed Poisson HGLMs with spatial correlation fit the reindeer data best. We develop R code for fitting these models using a unified algorithm for the HGLMs. Spatial count responses with an extremely high proportion of zeros and underdispersion can be successfully modeled using the quasi-Poisson HGLM with spatial random effects.
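
The statement that ZIP models allow only for overdispersion follows from the moments of the zero-inflated Poisson distribution. The sketch below uses the standard ZIP definition without covariates or spatial random effects, which is a simplification of the models in the abstract; the parameter names `lam` (Poisson mean) and `pi` (probability of a structural zero) and their values are illustrative assumptions.

```python
import math

def zip_pmf(k, lam, pi):
    """P(Y = k) for a zero-inflated Poisson: a structural zero with
    probability pi, otherwise a Poisson(lam) draw."""
    poisson = math.exp(-lam) * lam ** k / math.factorial(k)
    return pi + (1 - pi) * poisson if k == 0 else (1 - pi) * poisson

def zip_mean_var(lam, pi):
    """Closed-form mean and variance of the zero-inflated Poisson."""
    mean = (1 - pi) * lam
    var = mean * (1 + pi * lam)
    return mean, var

lam, pi = 2.0, 0.7  # hypothetical values giving roughly 74% zeros
mean, var = zip_mean_var(lam, pi)
print("P(Y = 0)     :", zip_pmf(0, lam, pi))
print("variance/mean:", var / mean)
```

Because the variance-to-mean ratio equals 1 + pi · lam, it can never fall below 1, so a plain ZIP model cannot represent underdispersed counts, which is consistent with the abstract's motivation for the quasi-Poisson HGLM.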

13.
The summer of 2003 was exceptionally hot, leading to excess mortality in Europe. Here, we assess the short-term effects of extremely hot summer temperatures on total daily mortality in Barcelona (Spain). Daily mortality from burial records, maximum temperature, relative humidity and photochemical pollutants were collected for the period 1999–2003. Data were analysed using Poisson regression with generalised additive models. Mortality shows a considerable increase when maximum temperatures exceed a threshold of 30.5°C. The risk of death associated with an increase of 1°C above the threshold was 6%, 7% and 5% after 1, 2 and 3 days, respectively. Exposure to extremely hot temperatures leads to a significant increase in mortality.

14.
Planning actions for species conservation involves working at both an ecologically meaningful spatial scale and a scale suitable for implementing management or conservation plans. Animal populations and conservation policies often operate across wide areas. Large-extent spatial datasets are thus often used, but their analyses rarely deal with problems inherent to spatial datasets such as residual spatial autocorrelation, which can bias or even reverse results. Here we propose a procedure for analysing a large-scale count dataset that integrates residual spatial autocorrelation in a Generalized Linear Model framework by combining and extending previously published methods. The first step is the selection of the environmental variables by a modified cross-validation procedure allowing for residual spatial autocorrelation. The second step consists of evaluating the spatial effect of the model using a spatial filtering approach based on the variogram parameters. We apply this method to the Black kite (Milvus migrans) to estimate the distribution and population size of this species in France. We found some divergence in estimated population size between spatial and non-spatial models, as well as in the distribution map. We also found that the uncertainty of the model was underestimated when residual spatial autocorrelation was ignored. Our analysis confirms previous results that residual spatial autocorrelation should always be accounted for, especially in conservation, where false results may lead to poor management decisions.

15.
Simple models of molecular evolution assume that sequences evolve by a Poisson process in which nucleotide or amino acid substitutions occur as rare independent events. In these models, the expected ratio of the variance to the mean of substitution counts equals 1, and substitution processes with a ratio greater than 1 are called overdispersed. Comparing the genomes of 10 closely related species of Drosophila, we extend earlier evidence for overdispersion in amino acid replacements as well as in four-fold synonymous substitutions. The observed deviation from the Poisson expectation can be described as a linear function of the rate at which substitutions occur on a phylogeny, which implies that deviations from the Poisson expectation arise from gene-specific temporal variation in substitution rates. Amino acid sequences show greater temporal variation in substitution rates than do four-fold synonymous sequences. Our findings provide a general phenomenological framework for understanding overdispersion in the molecular clock. Also, the presence of substantial variation in gene-specific substitution rates has broad implications for work in phylogeny reconstruction and evolutionary rate estimation.

16.
We evaluate the potential influence of disturbance on the predictability of alpine plant species distribution from equilibrium-based habitat distribution models. Firstly, abundance data of 71 plant species were correlated with a comprehensive set of environmental variables using ordinal regression models. Subsequently, the residual spatial autocorrelation (at distances of 40 to 320 m) in these models was explored. The additional amount of variance explained by spatial structuring was compared with a set of functional traits assumed to confer advantages in disturbed or undisturbed habitats. We found significant residual spatial autocorrelation in the habitat models of most of the species that were analysed. The amount of this autocorrelation was positively correlated with the dispersal capacity of the species, levelling off with increasing spatial scale. Both trends indicate that dispersal and colonization processes, whose frequency is enhanced by disturbance, influence the distribution of many alpine plant species. Since habitat distribution models commonly ignore such spatial processes they miss an important driver of local- to landscape-scale plant distribution.

17.
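Spatial distribution of carbon storage in natural secondary forests based on extended geographically weighted regression models   Cited by: 1 (self-citations: 0; citations by others: 1)
To accurately estimate the carbon storage of natural secondary forests at the regional scale and its spatial distribution pattern, we studied the natural secondary forests of Langxi Forest Farm, Wangqing Forestry Bureau, Jilin Province. Using 165 bureau-level permanent sample plots, with stand, topographic, and soil factors as explanatory variables, we took the ordinary geographically weighted regression (GWR) model as the baseline and improved it in three respects (spatial dimension, parameter heterogeneity, and residual spatial autocorrelation) to build seven types of extended models, namely the geographically and altitudinally weighted regression model (G...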

18.
We consider models for hierarchical count data, subject to overdispersion and/or excess zeros. Molenberghs et al. (2007) and Molenberghs et al. (2010) extend the Poisson-normal generalized linear mixed model by including gamma random effects to accommodate overdispersion. Excess zeros are handled using either a zero-inflation or a hurdle component. These models were studied by Kassahun et al. (2014). While flexible, they are quite elaborate in parametric specification and therefore model assessment is imperative. We derive local influence measures to detect and examine influential subjects, that is, subjects who have undue influence on either the fit of the model as a whole, or on specific important sub-vectors of the parameter vector. The latter include the fixed effects for the Poisson and for the excess-zeros components, the variance components for the normal random effects, and the parameters describing gamma random effects, included to accommodate overdispersion. Interpretable influence components are derived. The method is applied to data from a longitudinal clinical trial involving patients with epileptic seizures. Even though the data were extensively analyzed in earlier work, the insight gained from the proposed diagnostics, statistically and clinically, is considerable. Possibly, a small but important subgroup of patients has been identified.

19.
The malaria burden in Viet Nam has been in decline in recent decades, but localised areas of high transmission remain. We used spatiotemporal analytical tools to determine the social and environmental drivers of malaria risk and to identify residual high-risk areas where control and surveillance resources can be targeted. Counts of reported Plasmodium falciparum and Plasmodium vivax malaria cases by month (January 2007-December 2008) and by district were assembled. Zero-inflated Poisson regression models were developed in a Bayesian framework. Models included the percentage of the district's population living below the poverty line, the percentage of the district covered by forest, median elevation, median long-term average precipitation, and minimum temperature as fixed effects, together with terms for temporal trend and residual district-level spatial autocorrelation. Strong temporal and spatial heterogeneity in counts of malaria cases was apparent. Poverty and forest cover were significantly associated with an increased count of malaria cases, but the magnitude and direction of associations between climate and malaria varied by socio-ecological zone. There was a declining trend in counts of malaria cases during the study period. After accounting for the social and environmental fixed effects, substantial spatial heterogeneity was still evident. Unmeasured factors which may contribute to this residual variation include malaria control activities, population migration and accessibility to health care. Forest-related activities and factors encompassed by poverty indicators are major drivers of malaria incidence in Viet Nam.

20.
Summary: Doubling time has been widely used to represent the growth pattern of cells. A traditional method for finding the doubling time is to apply gray-scaled cells, where the logarithmic transformed scale is used. As an alternative statistical method, the log-linear model was recently proposed, for which actual cell numbers are used instead of the transformed gray-scaled cells. In this paper, I extend the log-linear model and propose the extended log-linear model. This model is designed for extra-Poisson variation, where the log-linear model produces a less appropriate estimate of the doubling time. Moreover, I compare statistical properties of the gray-scaled method, the log-linear model, and the extended log-linear model. For this purpose, I perform a Monte Carlo simulation study with three data-generating models: the additive error model, the multiplicative error model, and the overdispersed Poisson model. From the simulation study, I found that the gray-scaled method depends heavily on the normality assumption of the gray-scaled cells; hence, this method is appropriate when the error model is multiplicative with log-normally distributed errors. However, it is less efficient for other types of error distributions, especially when the error model is additive or the errors follow the Poisson distribution. The estimated standard error for the doubling time is not accurate in this case. The log-linear model was found to be efficient when the errors follow the Poisson distribution or a nearly Poisson distribution. The efficiency of the log-linear model decreased as the overdispersion increased, compared to the extended log-linear model. When the error model is additive or multiplicative with Gamma-distributed errors, the log-linear model is more efficient than the gray-scaled method. The extended log-linear model performs well overall for all three data-generating models.
The loss of efficiency of the extended log-linear model is observed only when the error model is multiplicative with log-normally distributed errors, where the gray-scaled method is appropriate. However, the extended log-linear model is more efficient than the log-linear model in this case.
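
All three methods compared in this abstract estimate an exponential growth rate b, from which the doubling time follows as ln 2 / b. As a minimal sketch, the code below uses ordinary least squares on log-transformed counts, which corresponds in spirit to the log-transform route and not to the log-linear or extended log-linear models the paper proposes. The function names and the cell counts are hypothetical.

```python
import math

def ols_slope(xs, ys):
    """Least-squares slope of ys regressed on xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx

def doubling_time(times, counts):
    """Doubling time ln(2) / b, with b the slope of log(count) on time."""
    rate = ols_slope(times, [math.log(c) for c in counts])
    return math.log(2) / rate

# Hypothetical noise-free counts that double every 3 time units.
times = list(range(7))
counts = [100 * 2 ** (t / 3) for t in times]
print(doubling_time(times, counts))
```

With noisy data the choice of estimator matters: according to the abstract, a least-squares fit on the log scale is appropriate mainly when errors are multiplicative and log-normal, while Poisson-type or overdispersed counts favour the log-linear and extended log-linear models.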
