Similar Literature
20 similar documents found (search time: 31 ms)
1.
This paper presents the zero-truncated negative binomial regression model to estimate the population size in the presence of a single registration file. The model is an alternative to the zero-truncated Poisson regression model and it may be useful if the data are overdispersed due to unobserved heterogeneity. Horvitz–Thompson point and interval estimates for the population size are derived, and the performance of these estimators is evaluated in a simulation study. To illustrate the model, the size of the population of opiate users in the city of Rotterdam is estimated. In comparison to the Poisson model, the zero-truncated negative binomial regression model fits these data better and yields a substantially higher population size estimate.
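As a reference sketch of the Horvitz–Thompson step described above, assuming the common NB2 parameterization (the abstract does not state which parameterization the paper uses): each observed case is weighted by the inverse of its probability of appearing in the register at all.

```latex
% Sketch: Horvitz--Thompson population-size estimate under a zero-truncated
% NB2 regression; the NB2 parameterization is an assumption of this sketch.
\mu_i = \exp(x_i^{\top}\beta), \qquad
p_{0i} = \Pr(Y_i = 0 \mid x_i) = (1 + \alpha\mu_i)^{-1/\alpha}, \qquad
\hat{N}_{\mathrm{HT}} = \sum_{i=1}^{n} \frac{1}{1 - p_{0i}} .
```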

2.
The one-inflated positive Poisson mixture model (OIPPMM) is presented, for use as the truncated count model in Horvitz–Thompson estimation of an unknown population size. The OIPPMM offers a way to address two important features of some capture–recapture data: one-inflation and unobserved heterogeneity. The OIPPMM provides markedly different results from those of some other popular estimators, and these other estimators can appear quite biased, or fail utterly due to the boundary problem, when the OIPPMM is the true data-generating process. In addition, the OIPPMM provides a solution to the boundary problem, by instead labelling any mixture components on the boundary as one-inflation.

3.
This study shows that the one-inflated zero-truncated negative binomial (OIZTNB) regression model can be implemented easily in R via built-in functions when the mean parameterization of the negative binomial distribution is used to build the model. From the practitioners' point of view, this approach offers a computationally convenient way to implement the OIZTNB regression model.
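The paper's implementation is in R; as a rough from-scratch analogue, here is a minimal Python sketch of OIZTNB maximum likelihood under the mean parameterization, with an intercept-only model. The toy data and all names are illustrative, not the authors' code.

```python
# Minimal sketch of OIZTNB maximum likelihood (intercept-only, toy data);
# a from-scratch analogue, not the R implementation described in the paper.
import numpy as np
from scipy.optimize import minimize
from scipy.stats import nbinom

def oiztnb_negloglik(params, y):
    """params = (log mu, log alpha, logit omega): NB mean, dispersion,
    and one-inflation weight, all on unconstrained scales."""
    mu, alpha = np.exp(params[0]), np.exp(params[1])
    omega = 1.0 / (1.0 + np.exp(-params[2]))
    n, p = 1.0 / alpha, 1.0 / (1.0 + alpha * mu)   # mean-parameterized NB
    # Zero-truncated NB pmf over the observed counts y >= 1.
    ztnb = nbinom.pmf(y, n, p) / (1.0 - nbinom.pmf(0, n, p))
    lik = omega * (y == 1) + (1.0 - omega) * ztnb   # one-inflated mixture
    return -np.sum(np.log(lik))

y = np.array([1, 1, 1, 2, 1, 3, 1, 2, 5, 1, 4, 2, 1, 1, 2])  # toy counts
fit = minimize(oiztnb_negloglik, x0=np.zeros(3), args=(y,), method="BFGS")
print(fit.x)  # fitted (log mu, log alpha, logit omega)
```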

4.
The purpose of the study is to estimate the population size under a homogeneous truncated count model and under model contaminations via the Horvitz–Thompson approach on the basis of a count capture–recapture experiment. The proposed estimator is based on a mixture of zero-truncated Poisson distributions. The benefits of the proposed model are valid statistical inference for long-tailed or skewed distributions and the concavity of the likelihood function, with strong results available on the nonparametric maximum likelihood estimator (NPMLE). A simulation study comparing McKendrick's, Mantel–Haenszel's, Zelterman's, Chao's, the maximum likelihood, and the proposed estimators reveals that under model contaminations the proposed estimator is the best choice, with the smallest bias and smallest mean square error for sufficiently large population sizes; further results show that the proposed estimator performs well even in the homogeneous situation. The empirical examples, the cholera epidemic in India (homogeneity) and the heroin user data from Bangkok in 2002 (heterogeneity), are fitted with excellent goodness-of-fit, and the confidence interval estimates may also be of considerable interest.
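Two of the comparison estimators have simple closed forms worth recalling; their standard versions, with n the number of observed units and f1, f2 the singleton and doubleton frequencies, are:

```latex
% Standard closed forms of two of the comparison estimators, built from
% the singleton and doubleton frequencies f_1 and f_2 of n observed units.
\hat{N}_{\mathrm{Chao}} = n + \frac{f_1^{2}}{2 f_2}, \qquad
\hat{N}_{\mathrm{Zelterman}} = \frac{n}{1 - e^{-\hat{\lambda}}},
\quad \hat{\lambda} = \frac{2 f_2}{f_1} .
```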

5.
Zero-truncated data arise in various disciplines where counts are observed but the zero count category cannot be observed during sampling. Maximum likelihood estimation can be used to model these data; however, the truncated likelihood has a nonstandard form that cannot be easily handled by well-known software packages, so additional programming is often required. Motivated by the Rao–Blackwell theorem, we develop a weighted partial likelihood approach to estimate model parameters for zero-truncated binomial and Poisson data. The resulting estimating function is equivalent to a weighted score function for standard count data models, and allows readily available software to be applied. We evaluate the efficiency of this new approach and show that it performs almost as well as maximum likelihood estimation. The weighted partial likelihood approach is then extended to regression modelling and variable selection. We examine the performance of the proposed methods through simulation and present two case studies using real data.
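For context, the "nonstandard" likelihood in the zero-truncated Poisson case can be maximized directly; below is a minimal sketch of plain maximum likelihood, not the authors' weighted partial likelihood (whose weights are not given in the abstract), with toy data.

```python
# Direct ML for the zero-truncated Poisson: the "nonstandard" likelihood
# that motivates the paper's weighted partial likelihood. Toy data only.
import numpy as np
from scipy.optimize import minimize_scalar
from scipy.special import gammaln

def ztp_negloglik(log_lam, y):
    lam = np.exp(log_lam)
    # log pmf of a Poisson truncated at zero (y >= 1 only)
    logpmf = -lam + y * np.log(lam) - gammaln(y + 1) - np.log1p(-np.exp(-lam))
    return -np.sum(logpmf)

y = np.array([1, 2, 1, 1, 3, 2, 1, 4, 1, 2])  # toy truncated counts
res = minimize_scalar(ztp_negloglik, args=(y,))
print(np.exp(res.x))  # MLE of the Poisson rate lambda
```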

6.
Little attention has been paid to the use of multi-sample batch-marking studies, as it is generally assumed that an individual's capture history is necessary for fully efficient estimates. Recently, however, Huggins et al. (2010) presented a pseudo-likelihood for a multi-sample batch-marking study in which they used estimating equations to solve for survival and capture probabilities and then derived abundance estimates using a Horvitz–Thompson-type estimator. We have developed and maximized the likelihood for batch-marking studies. We use data simulated from a Jolly–Seber-type study and convert this to what would have been obtained from an extended batch-marking study. We compare our abundance estimates obtained from the Crosbie–Manly–Arnason–Schwarz (CMAS) model with those of the extended batch-marking model to determine the efficiency of collecting and analyzing batch-marking data. We found that estimates of abundance were similar for all three estimators: CMAS, Huggins, and our likelihood. Using unique identifiers with the CMAS model yields gains in precision; however, the likelihood typically had lower mean square error than the pseudo-likelihood method of Huggins et al. (2010). When faced with designing a batch-marking study, researchers can be confident of obtaining unbiased abundance estimators. Furthermore, they can design studies to reduce mean square error by manipulating capture probabilities and sample size.

7.
Estimation of a population size by means of capture-recapture techniques is an important problem occurring in many areas of life and social sciences. We consider the frequencies-of-frequencies situation, where a count variable is used to summarize how often a unit has been identified in the target population of interest. The distribution of this count variable is zero-truncated, since zero identifications do not occur in the sample. As an application we consider the surveillance of scrapie in Great Britain. In this case study, holdings with scrapie that are not identified (zero counts) do not enter the surveillance database. The count variable of interest is the number of scrapie cases per holding. A common model for count distributions is the Poisson distribution and, to adjust for potential heterogeneity, a discrete mixture of Poisson distributions is used. Mixtures of Poissons usually provide an excellent fit, as will be demonstrated in the application of interest. However, as has recently been demonstrated, mixtures also suffer from the so-called boundary problem, resulting in overestimation of population size. It is suggested here to select the mixture model on the basis of the Bayesian Information Criterion. This strategy is further refined by employing a bagging procedure leading to a series of estimates of population size. Using the median of this series, highly influential size estimates are avoided. Limited simulation studies show that the procedure leads to estimates with remarkably small bias.
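The bagging-median step can be sketched generically. In the toy sketch below, Chao's bias-corrected lower bound stands in for the paper's BIC-selected Poisson-mixture estimator purely to keep the code short; the data are simulated.

```python
# Bagging sketch: resample the observed frequency data, re-estimate the
# population size per replicate, report the median. Chao's bias-corrected
# lower bound is a stand-in for the paper's BIC-selected mixture NPMLE.
import numpy as np

rng = np.random.default_rng(1)

def chao_estimate(counts):
    n = counts.size
    f1, f2 = np.sum(counts == 1), np.sum(counts == 2)
    return n + f1 * (f1 - 1) / (2.0 * (f2 + 1))   # bias-corrected variant

counts = rng.poisson(1.2, size=400)
counts = counts[counts > 0]                        # zero-truncated sample

estimates = [chao_estimate(rng.choice(counts, counts.size, replace=True))
             for _ in range(500)]
print(np.median(estimates))                        # bagged size estimate
```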

8.
The assessment of population trends is a key point in wildlife conservation. Survey data collected over a long period may not be comparable due to the presence of environmental biases (i.e. inadequate representation of the variability of environmental covariates in the study area). Moreover, count data may be affected both by overdispersion (i.e. the variance is larger than the mean) and by an excess of zero counts (potentially leading to zero inflation). The aim of this study was to define a modelling procedure to assess long-term population trends that addresses these three issues, and to shed light on the effects of environmental bias, overdispersion, and zero inflation on trend estimates. To test our procedure, we used six bird species whose data were collected in northern Italy from 1992 to 2019. We designed a multi-step approach. First, using generalised additive models (GAMs), we implemented a full factorial design of models (eight models per species) that did or did not account for environmental bias (including or excluding environmental covariates), overdispersion (using a negative binomial or a Poisson distribution), and zero inflation (using zero-inflated or non-zero-inflated models). Models were ranked according to the Akaike Information Criterion. Second, annual population indices (median and 95% confidence interval of the number of breeding pairs per point count) were predicted through a parametric bootstrap procedure. Third, long-term population trends were assessed and tested for significance by fitting weighted least squares linear regression models to the predicted annual indices. To evaluate the effect of environmental bias, overdispersion, and zero inflation on trend estimates, an average discrepancy index was calculated for each model group. The results showed that environmental bias was the most important driver of differences in trend estimates, although overlooking overdispersion and zero inflation could also lead to misleading results. For five species, zero-inflated GAMs proved the best models for predicting annual population indices. Our findings suggest a mutual interaction between zero inflation and overdispersion, with overdispersion arising in non-zero-inflated models. Moreover, for species with flocking foraging and/or colonial breeding behaviour, overdispersed and zero-inflated models may be more adequate. In conclusion, properly handling environmental bias, which may affect many data sets from long-term monitoring programmes, is crucial for obtaining reliable estimates of population trends. Furthermore, the extent to which overdispersion and zero inflation may affect trend estimates should be assessed by comparing different models, rather than presumed on the basis of statistical assumptions.
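The AIC-ranking step might look like the following sketch, which fits plain Poisson, negative binomial, and zero-inflated Poisson regressions to the same counts. The paper fits GAMs with environmental covariates; the statsmodels classes (exported from statsmodels.api in recent versions) and the toy data are assumptions of this sketch.

```python
# Sketch of the model-ranking step: fit competing count models to the same
# data and rank by AIC. Plain regressions stand in for the paper's GAMs.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(2)
X = sm.add_constant(rng.normal(size=(300, 1)))   # stand-in covariate
y = rng.poisson(np.exp(0.3 * X[:, 1]))           # toy counts
y[rng.random(300) < 0.2] = 0                     # inject extra zeros

models = {
    "Poisson": sm.Poisson(y, X),
    "NegBin": sm.NegativeBinomial(y, X),
    "ZIP": sm.ZeroInflatedPoisson(y, X),         # constant inflation part
}
fits = {name: m.fit(disp=0) for name, m in models.items()}
for name, f in sorted(fits.items(), key=lambda kv: kv[1].aic):
    print(name, round(f.aic, 1))                 # lowest AIC ranks first
```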

9.
10.
Estimation of abundance is important in both open and closed population capture–recapture analysis, but unmodeled heterogeneity of capture probability leads to negative bias in abundance estimates. This article defines and develops a suite of open population capture–recapture models using finite mixtures to model heterogeneity of capture and survival probabilities. Model comparisons and parameter estimation use likelihood-based methods. A real example is analyzed, and simulations are used to check the main features of the heterogeneous models, especially the quality of estimation of abundance, survival, recruitment, and turnover. The two major advances in this article are the provision of realistic abundance estimates that take account of heterogeneity of capture, and an appraisal of the amount of overestimation of survival arising from conditioning on the first capture when heterogeneity of survival is present.

11.
Estimating population density as precisely as possible is a key premise for managing wild animal species. This can be a challenging task if the species in question is elusive or, owing to sheer numbers, hard to count. We present a new, mathematically derived estimator for population size, where the estimation is based solely on the frequency of genetically assigned parent–offspring pairs within a subsample of an ungulate population. Using molecular markers such as microsatellites, the number of these parent–offspring pairs can be determined. The study's aim was to clarify whether a classical capture–mark–recapture (CMR) method can be adapted or extended by this genetic element into a genetic-based capture–mark–recapture (g-CMR) method. We numerically validate the presented estimator (and corresponding variance estimates) and provide the R code for computing estimates of population size, including confidence intervals. The presented method provides a new framework to precisely estimate population size based on the genetic analysis of a one-time subsample. This is especially valuable where traditional CMR methods or other DNA-based (fecal or hair) capture–recapture methods fail or are too difficult to apply. The DNA source used is largely irrelevant, but in the present case the sampling of an annual hunting bag serves as the data basis. In addition to the high quality of muscle tissue samples, hunting bags provide additional and essential information for wildlife management practices, such as age, weight, or sex. Where a g-CMR method is appropriate both ecologically and in terms of hunting practice, its species-independent design enables wide applicability.

12.
Obtaining inferences on disease dynamics (e.g., host population size, pathogen prevalence, transmission rate, host survival probability) typically requires marking and tracking individuals over time. While multistate mark–recapture models can produce high-quality inference, these techniques are difficult to employ at large spatial and long temporal scales or in small remnant host populations decimated by virulent pathogens, where low recapture rates may preclude the use of mark–recapture techniques. Recently developed N-mixture models offer a statistical framework for estimating wildlife disease dynamics from count data. N-mixture models are a type of state-space model in which observation error is attributed to failing to detect some individuals when they are present (i.e., false negatives). The analysis approach uses repeated surveys of sites over a period of population closure to estimate detection probability. We review the challenges of modeling disease dynamics and describe how N-mixture models can be used to estimate common metrics, including pathogen prevalence, transmission, and recovery rates while accounting for imperfect host and pathogen detection. We also offer a perspective on future research directions at the intersection of quantitative and disease ecology, including the estimation of false positives in pathogen presence, spatially explicit disease-structured N-mixture models, and the integration of other data types with count data to inform disease dynamics. Managers rely on accurate and precise estimates of disease dynamics to develop strategies to mitigate pathogen impacts on host populations. At a time when pathogens pose one of the greatest threats to biodiversity, statistical methods that lead to robust inferences on host populations are critically needed for rapid, rather than incremental, assessments of the impacts of emerging infectious diseases.
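The core N-mixture likelihood (Royle's binomial–Poisson form) is compact enough to sketch: at each site the latent abundance N is Poisson, repeated counts are binomial given N, and N is summed out. The site counts, parameter values, and truncation bound K below are illustrative.

```python
# Minimal binomial-Poisson N-mixture likelihood for repeated counts at
# independent sites under closure; toy data and parameters.
import numpy as np
from scipy.stats import binom, poisson

def nmix_loglik(lam, p, y, K=200):
    """y: (sites, visits) count matrix; latent N summed out up to K."""
    N = np.arange(0, K + 1)
    ll = 0.0
    for site in y:
        # P(site counts | N) across all candidate N (zero where N < count)
        cond = np.prod(binom.pmf(site[:, None], N[None, :], p), axis=0)
        ll += np.log(np.sum(cond * poisson.pmf(N, lam)))
    return ll

y = np.array([[3, 2, 4], [0, 1, 0], [5, 6, 4]])   # toy repeated counts
print(nmix_loglik(lam=4.0, p=0.5, y=y))
```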

13.
We estimate the parameters of a stochastic process model for a macroparasite population within a host using approximate Bayesian computation (ABC). The immunity of the host is an unobserved model variable and only mature macroparasites at sacrifice of the host are counted. With very limited data, process rates are inferred reasonably precisely. Modeling involves a three-variable Markov process for which the observed data likelihood is computationally intractable. ABC methods are particularly useful when the likelihood is analytically or computationally intractable. The ABC algorithm we present is based on sequential Monte Carlo, is adaptive in nature, and overcomes some drawbacks of previous approaches to ABC. The algorithm is validated on a test example involving simulated data from an autologistic model before being used to infer parameters of the Markov process model for experimental data. The fitted model explains the observed extra-binomial variation in terms of a zero-one immunity variable, which has a short-lived presence in the host.
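The basic ABC accept/reject step that sequential Monte Carlo refines can be sketched with a toy model; the Poisson stand-in below is illustrative, not the authors' three-variable Markov process.

```python
# Generic ABC rejection sampler: draw from the prior, simulate data, keep
# draws whose simulated summary lies within a tolerance of the observed one.
import numpy as np

rng = np.random.default_rng(0)
y_obs = rng.poisson(3.0, size=50)                  # pretend field data
s_obs = y_obs.mean()                               # summary statistic

accepted = []
for _ in range(20000):
    theta = rng.uniform(0.0, 10.0)                 # draw from the prior
    s_sim = rng.poisson(theta, size=50).mean()     # simulate and summarize
    if abs(s_sim - s_obs) < 0.1:                   # tolerance epsilon
        accepted.append(theta)
print(np.mean(accepted))                           # approx. posterior mean
```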

14.
Although advances have recently been made in modeling multivariate count data, existing models have several limitations: (i) the multivariate Poisson log-normal model (Aitchison and Ho, 1989) cannot be used to fit multivariate count data with excess zero-vectors; (ii) the multivariate zero-inflated Poisson (ZIP) distribution (Li et al., 1999) cannot be used to model zero-truncated/deflated count data and is difficult to apply in high-dimensional cases; (iii) the Type I multivariate zero-adjusted Poisson (ZAP) distribution (Tian et al., 2017) can only model multivariate count data with a special correlation structure in which the random components are all positively or all negatively correlated. In this paper, we first introduce a new multivariate ZAP distribution, based on a multivariate Poisson distribution, which allows a more flexible dependency structure between components: some of the correlation coefficients can be positive while others are negative. We then derive its important distributional properties and provide efficient statistical inference methods for the multivariate ZAP model with or without covariates. Two real data examples in biomedicine are used to illustrate the proposed methods.

15.
Elephants living in dense woodlands are difficult to count, and many elephant populations in Africa occur in such conditions. Estimates of these populations based on total counts, aerial counts and dung counts often lack information on precision and accuracy. We use standard mark–recapture field methods to obtain estimates of population size with associated confidence limits. We apply this approach to a closed elephant population in the Tembe Elephant Park (300 km²), South Africa. A registration count completed in 4 months gives a known population size. We evaluate mark–recapture models against this known population size. Individual identification profiles obtained for elephants during the registration count and mark–recapture events indicate that at least 167 elephants live in the park; we consider this value an estimate of the minimum number alive. We include 189 sightings of bulls and 37 sightings of breeding herds in the mark–recapture modelling. Of the models we test (Petersen, Schnabel, Schumacher, Jolly–Seber, Bowden's, Poisson and negative binomial), Bowden's gives the estimate closest to the registration count, and the assumptions of the model are not violated. For all models except the negative binomial, our estimates improve with increased sampling intensity. Confidence intervals do not improve with increased effort, except for the Schnabel model. Mark–recapture methods should be considered reliable estimators of population size for elephants occurring in dense woodlands and forests when other methods cannot be relied on.
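Of the models listed, the Petersen estimator has the simplest closed form; its standard version and Chapman's small-sample correction are, for reference:

```latex
% Lincoln--Petersen estimator and Chapman's bias-corrected variant for a
% two-sample closed population: n_1, n_2 are the sample sizes and m_2 the
% number of marked animals recaptured in the second sample.
\hat{N}_{\mathrm{Petersen}} = \frac{n_1 n_2}{m_2}, \qquad
\hat{N}_{\mathrm{Chapman}} = \frac{(n_1 + 1)(n_2 + 1)}{m_2 + 1} - 1 .
```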

16.
  1. In capture–recapture studies, recycled individuals occur when individuals lose all of their tags and are recaptured as though they were new individuals. Typically, the effect of these recycled individuals is assumed negligible.
  2. Through a simulation‐based study of double‐tagging experiments, we examined the effect of recycled individuals on parameter estimates in the Jolly–Seber model with tag loss (Cowen & Schwarz, 2006). We validated the simulation framework using long‐term census data of elephant seals.
  3. Including recycled individuals did not affect estimates of capture, survival, and tag‐retention probabilities. However, with low tag‐retention rates, high capture rates, and high survival rates, recycled individuals produced overestimates of population size. For the elephant seal case study, we found population size estimates to be between 8% and 53% larger when recycled individuals were ignored.
  4. Ignoring the effects of recycled individuals can cause large biases in population size estimates. These results are particularly noticeable in longer studies.

17.
Count data often exhibit more zeros than predicted by common count distributions such as the Poisson or negative binomial. In recent years, there has been considerable interest in methods for analyzing zero-inflated count data in longitudinal or other correlated data settings. A common approach has been to extend zero-inflated Poisson models to include random effects that account for correlation among observations. However, these models have been shown to have drawbacks, including difficult-to-interpret regression coefficients and numerically unstable fitting algorithms even when the data arise from the assumed model. To address these issues, we propose a model that parameterizes the marginal associations between the count outcome and the covariates as easily interpretable log relative rates, while including random effects to account for correlation among observations. One of the main advantages of this marginal model is that it provides a basis upon which we can directly compare the performance of standard methods that ignore zero inflation with that of a method that explicitly takes zero inflation into account. We present simulations of these various model formulations in terms of bias and variance estimation. Finally, we apply the proposed approach to analyze toxicological data on the effect of emissions on cardiac arrhythmias.
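The interpretability issue can be seen in the zero-inflated Poisson probability mass function: covariates typically enter through the rate λ, yet the marginal mean also involves the inflation probability π, so coefficients on λ alone are not marginal log relative rates.

```latex
% Zero-inflated Poisson pmf and its marginal mean: lambda alone does not
% determine E[Y], which is why a marginal parameterization aids
% interpretation.
\Pr(Y = 0) = \pi + (1 - \pi)e^{-\lambda}, \qquad
\Pr(Y = y) = (1 - \pi)\,\frac{e^{-\lambda}\lambda^{y}}{y!} \;\; (y \ge 1),
\qquad \mathbb{E}[Y] = (1 - \pi)\lambda .
```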

18.
Count data sets are traditionally analyzed using the ordinary Poisson distribution. However, such a model has limited applicability, as it can be too restrictive to handle specific data structures. In this case, the need arises for alternative models that accommodate, for example, (a) zero-modification (inflation or deflation of the frequency of zeros), (b) overdispersion, and (c) individual heterogeneity arising from clustering or repeated (correlated) measurements made on the same subject. Cases (a)–(b) and (b)–(c) are often treated together in the statistical literature, with several practical applications, but models supporting all three at once are less common. Hence, this paper's primary goal is to address these issues jointly by deriving a mixed-effects regression model based on the hurdle version of the Poisson–Lindley distribution. In this framework, zero-modification is incorporated by assuming that a binary probability model determines which outcomes are zero-valued, and a zero-truncated process is responsible for generating positive observations. Approximate posterior inferences for the model parameters were obtained from a fully Bayesian approach based on the Adaptive Metropolis algorithm. Intensive Monte Carlo simulation studies were performed to assess the empirical properties of the Bayesian estimators. The proposed model was used to analyze a real data set, and its competitiveness with some well-established mixed-effects models for count data was evaluated. A sensitivity analysis to detect observations that may impact parameter estimates was performed based on standard divergence measures. The Bayesian p-value and the randomized quantile residuals were considered for model diagnostics.
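The hurdle construction described here has a generic form: a binary model governs the zeros and a zero-truncated count density generates the positives. Below, f denotes the Poisson–Lindley pmf used in the paper; its specific form is omitted.

```latex
% Generic hurdle pmf: a binary part yields zeros with probability pi_0;
% positives come from the count pmf f (here Poisson--Lindley), truncated
% at zero.
\Pr(Y = 0) = \pi_0, \qquad
\Pr(Y = y) = (1 - \pi_0)\,\frac{f(y)}{1 - f(0)} \;\; (y \ge 1) .
```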

19.
Reversible jump Markov chain Monte Carlo (RJMCMC) methods are used to fit Bayesian capture–recapture models incorporating heterogeneity in individuals and samples. Heterogeneity in capture probabilities comes from finite mixtures and/or fixed sample effects allowing for interactions. Estimation by RJMCMC allows automatic model selection and/or model averaging. Priors on the parameters stabilize the estimates and produce realistic credible intervals for population size in overparameterized models, in contrast to likelihood-based methods. To demonstrate the approach we analyze the standard snowshoe hare and cottontail rabbit data sets from ecology, and a reliability testing data set.

20.
  1. Reliable estimates of abundance are critical in effectively managing threatened species, but the feasibility of integrating data from wildlife surveys completed using advanced technologies such as remotely piloted aircraft systems (RPAS) and machine learning into abundance estimation methods such as N‐mixture modeling is largely unknown due to the unique sources of detection errors associated with these technologies.
  2. We evaluated two modeling approaches for estimating the abundance of koalas detected automatically in RPAS imagery: (a) a generalized N‐mixture model and (b) a modified Horvitz–Thompson (H‐T) estimator method combining generalized linear models and generalized additive models for overall probability of detection, false detection, and duplicate detection. The final estimates from each model were compared to the true number of koalas present as determined by telemetry‐assisted ground surveys.
  3. The modified H-T estimator approach performed best, with the true count of koalas captured within the 95% confidence intervals around the abundance estimates in all four surveys in the testing dataset (n = 138 detected objects), a particularly strong result given the difficulty previous methods have had in attaining accuracy.
  4. The results suggested that N‐mixture models in their current form may not be the most appropriate approach to estimating the abundance of wildlife detected in RPAS surveys with automated detection, and accurate estimates could be made with approaches that account for spurious detections.

