Similar articles
1.
We present the one‐inflated zero‐truncated negative binomial (OIZTNB) model, and propose its use as the truncated count distribution in Horvitz–Thompson estimation of an unknown population size. In the presence of unobserved heterogeneity, the zero‐truncated negative binomial (ZTNB) model is a natural choice over the positive Poisson (PP) model; however, when one‐inflation is present the ZTNB model either suffers from a boundary problem, or provides extremely biased population size estimates. Monte Carlo evidence suggests that in the presence of one‐inflation, the Horvitz–Thompson estimator under the ZTNB model can converge in probability to infinity. The OIZTNB model gives markedly different population size estimates compared to some existing truncated count distributions, when applied to several capture–recapture data sets that exhibit both one‐inflation and unobserved heterogeneity.
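
As a point of reference for the abstract above, the following is a minimal sketch of Horvitz–Thompson population size estimation under the positive Poisson (PP) baseline only; it does not implement the ZTNB or OIZTNB models. It assumes a homogeneous Poisson capture rate, and the function names and the example counts are hypothetical, chosen purely for illustration.

```python
import math


def ztp_mle_rate(counts):
    """Maximum likelihood estimate of the Poisson rate from zero-truncated counts.

    The estimate solves mean(counts) = lam / (1 - exp(-lam)); the left-hand
    side is increasing in lam, so bisection on (0, mean) finds the root.
    """
    mean = sum(counts) / len(counts)
    if mean <= 1.0:
        raise ValueError("sample mean must exceed 1 for a positive rate estimate")
    lo, hi = 1e-12, mean
    for _ in range(200):
        mid = (lo + hi) / 2.0
        if mid / (1.0 - math.exp(-mid)) < mean:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


def horvitz_thompson_size(counts):
    """Return (N_hat, lam_hat), where N_hat = n / P(count > 0)."""
    lam = ztp_mle_rate(counts)
    p_observed = 1.0 - math.exp(-lam)  # probability a unit is seen at least once
    return len(counts) / p_observed, lam


# Hypothetical capture counts: 50 units seen once, 25 twice, 10 three times, 5 four times.
counts = [1] * 50 + [2] * 25 + [3] * 10 + [4] * 5
n_hat, lam_hat = horvitz_thompson_size(counts)
print(f"observed n = {len(counts)}, lambda_hat = {lam_hat:.4f}, N_hat = {n_hat:.1f}")
```

Under one‐inflation or unobserved heterogeneity, which are the situations the abstract addresses, this homogeneous baseline would be expected to be misspecified.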

2.
Zero‐truncated data arise in various disciplines where counts are observed but the zero count category cannot be observed during sampling. Maximum likelihood estimation can be used to model these data; however, because the likelihood has a nonstandard form, it cannot be easily implemented using well‐known software packages, and additional programming is often required. Motivated by the Rao–Blackwell theorem, we develop a weighted partial likelihood approach to estimate model parameters for zero‐truncated binomial and Poisson data. The resulting estimating function is equivalent to a weighted score function for standard count data models, and allows for applying readily available software. We evaluate the efficiency of this new approach and show that it performs almost as well as maximum likelihood estimation. The weighted partial likelihood approach is then extended to regression modelling and variable selection. We examine the performance of the proposed methods through simulation and present two case studies using real data.

3.
In this study, we show that the one‐inflated zero‐truncated negative binomial (OIZTNB) regression model can be easily implemented in R via built‐in functions when the mean parameterization of the negative binomial distribution is used to build the model. From the practitioner's point of view, this approach offers a computationally convenient way to implement the OIZTNB regression model.

4.
In many biometrical applications, the count data encountered often contain extra zeros relative to the Poisson distribution. Zero‐inflated Poisson regression models are useful for analyzing such data, but parameter estimates may be seriously biased if the nonzero observations are over‐dispersed and simultaneously correlated due to the sampling design or the data collection procedure. In this paper, a zero‐inflated negative binomial mixed regression model is presented to analyze a set of pancreas disorder length of stay (LOS) data that comprised mainly same‐day separations. Random effects are introduced to account for inter‐hospital variations and the dependency of clustered LOS observations. Parameter estimation is achieved by maximizing an appropriate log‐likelihood function using an EM algorithm. Alternative modeling strategies, namely the finite mixture of Poisson distributions and the non‐parametric maximum likelihood approach, are also considered. The determination of pertinent covariates would help hospital administrators and clinicians manage LOS and expenditures efficiently.

5.
Ridout M, Hinde J, Demétrio CG. Biometrics 2001, 57(1): 219-223
Count data often show a higher incidence of zero counts than would be expected if the data were Poisson distributed. Zero-inflated Poisson regression models are a useful class of models for such data, but parameter estimates may be seriously biased if the nonzero counts are overdispersed in relation to the Poisson distribution. We therefore provide a score test for testing zero-inflated Poisson regression models against zero-inflated negative binomial alternatives.
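
To make the zero-inflated Poisson (ZIP) distribution referred to above concrete, here is a small sketch of its probability mass function in the standard textbook parameterization, without covariates. It is not the score test from the article; the function name and the parameter values are hypothetical.

```python
import math


def zip_pmf(k, lam, pi):
    """Zero-inflated Poisson pmf: extra point mass pi at zero, Poisson(lam) otherwise."""
    poisson = math.exp(-lam) * lam**k / math.factorial(k)
    extra_zero = pi if k == 0 else 0.0
    return extra_zero + (1.0 - pi) * poisson


lam, pi = 2.0, 0.3  # hypothetical rate and zero-inflation probability
probs = [zip_pmf(k, lam, pi) for k in range(60)]  # tail beyond 60 is negligible here

mean = sum(k * p for k, p in enumerate(probs))
var = sum(k * k * p for k, p in enumerate(probs)) - mean**2

print(f"P(0) under ZIP     = {probs[0]:.4f}")
print(f"P(0) under Poisson = {math.exp(-lam):.4f}")
print(f"mean = {mean:.4f}, variance = {var:.4f}")
```

The mean is (1 - pi) * lam and the variance is (1 - pi) * lam * (1 + pi * lam), so a ZIP variable is already overdispersed relative to a Poisson with the same mean. The abstract concerns additional overdispersion in the nonzero counts, which this sketch does not model.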

6.
Binomial regression models are commonly applied to proportion data such as those relating to the mortality and infection rates of diseases. However, it is often the case that the responses may exhibit excessive zeros; in such cases a zero‐inflated binomial (ZIB) regression model can be applied instead. In practice, it is essential to test if there are excessive zeros in the outcome to help choose an appropriate model. The binomial models can yield biased inference if there are excessive zeros, while ZIB models may be unnecessarily complex and hard to interpret, and even face convergence issues, if there are no excessive zeros. In this paper, we develop a new test for testing zero inflation in binomial regression models by directly comparing the number of observed zeros with what would be expected under the binomial regression model. A closed form of the test statistic, as well as the asymptotic properties of the test, is derived based on estimating equations. Our systematic simulation studies show that the new test performs very well in most cases, and outperforms the classical Wald, likelihood ratio, and score tests, especially in controlling type I errors. Two real data examples are also included for illustrative purposes.

7.
When analyzing Poisson count data, a high frequency of extra zeros is sometimes observed. The Zero‐Inflated Poisson (ZIP) model is a popular approach to handle zero‐inflation. In this paper we generalize the ZIP model and its regression counterpart to accommodate the extent of individual exposure. Empirical evidence drawn from an occupational injury data set confirms that the incorporation of exposure information can exert a substantial impact on the model fit. Tests for zero‐inflation are also considered. Their finite sample properties are examined in a Monte Carlo study.

8.
We prove that the generalized Poisson distribution GP(θ, η) (η ≥ 0) is a mixture of Poisson distributions; this is a new property for a distribution which is the topic of the book by Consul (1989). Because we find that the fits to count data of the generalized Poisson and negative binomial distributions are often similar, to understand their differences, we compare the probability mass functions and skewnesses of the generalized Poisson and negative binomial distributions with the first two moments fixed. They have slight differences in many situations, but their zero-inflated distributions, with masses at zero, means and variances fixed, can differ more. These probabilistic comparisons are helpful in selecting a better fitting distribution for modelling count data with long right tails. Through a real example of count data with a large zero fraction, we illustrate how the generalized Poisson and negative binomial distributions as well as their zero-inflated distributions can be discriminated.

9.
The purpose of the study is to estimate the population size under a homogeneous truncated count model and under model contaminations via the Horvitz‐Thompson approach on the basis of a count capture‐recapture experiment. The proposed estimator is based on a mixture of zero‐truncated Poisson distributions. The benefits of the proposed model are that it supports statistical inference for long‐tailed or skewed distributions and that the likelihood function is concave, with strong results available on the nonparametric maximum likelihood estimator (NPMLE). A simulation study compares McKendrick's, Mantel‐Haenszel's, Zelterman's, Chao's, the maximum likelihood, and the proposed estimators. Under model contaminations and for sufficiently large population sizes, the proposed estimator is the best choice, having the smallest bias and the smallest mean square error; further results show that it also performs well in the homogeneous situation. The empirical examples, the cholera epidemic in India (homogeneous case) and the heroin user data in Bangkok 2002 (heterogeneous case), are fitted with excellent goodness‐of‐fit, and the confidence interval estimates may also be of considerable interest. (© 2008 WILEY‐VCH Verlag GmbH & Co. KGaA, Weinheim)

10.
This paper reviews the generalized Poisson regression model, the restricted generalized Poisson regression model and the mixed Poisson regression (negative binomial regression and Poisson inverse Gaussian regression) models which can be used for regression analysis of counts. The aim of this study is to demonstrate that the quasi‐likelihood/moment method, which is used to estimate the parameters of mixed Poisson regression models, is also applicable to estimating the parameters of the generalized Poisson regression and the restricted generalized Poisson regression models. Finally, an application of this method to zoological data is given.

11.
Matrix population models are a standard tool for studying stage‐structured populations, but they are not flexible in describing stage duration distributions. This study describes a method for modeling various such distributions in matrix models. The method uses a mixture of two negative binomial distributions (parametrized using a maximum likelihood method) to approximate a target (true) distribution. To examine the performance of the method, populations consisting of two life stages (juvenile and adult) were considered. The juvenile duration distribution followed a gamma distribution, lognormal distribution, or zero‐truncated (over‐dispersed) Poisson distribution, each of which represents a target distribution to be approximated by a mixture distribution. The true population growth rate based on a target distribution was obtained using an individual‐based model, and the extent to which matrix models can approximate the target dynamics was examined. The results show that the method generally works well for the examined target distributions, but is prone to biased predictions under some conditions. In addition, the method works uniformly better than an existing method whose performance was also examined for comparison. Other details regarding parameter estimation and model development are also discussed.

12.
Although advances have recently been made in modeling multivariate count data, existing models still have several limitations: (i) The multivariate Poisson log‐normal model (Aitchison and Ho, 1989) cannot be used to fit multivariate count data with excess zero‐vectors; (ii) The multivariate zero‐inflated Poisson (ZIP) distribution (Li et al., 1999) cannot be used to model zero‐truncated/deflated count data and it is difficult to apply to high‐dimensional cases; (iii) The Type I multivariate zero‐adjusted Poisson (ZAP) distribution (Tian et al., 2017) can only model multivariate count data with a special correlation structure for random components that are all positive or negative. In this paper, we first introduce a new multivariate ZAP distribution, based on a multivariate Poisson distribution, which allows a more flexible dependency structure between components; that is, some of the correlation coefficients can be positive while others can be negative. We then develop its important distributional properties, and provide efficient statistical inference methods for the multivariate ZAP model with or without covariates. Two real data examples in biomedicine are used to illustrate the proposed methods.

13.
Estimation of a population size by means of capture‐recapture techniques is an important problem occurring in many areas of life and social sciences. We consider the frequencies of frequencies situation, where a count variable is used to summarize how often a unit has been identified in the target population of interest. The distribution of this count variable is zero‐truncated since zero identifications do not occur in the sample. As an application we consider the surveillance of scrapie in Great Britain. In this case study holdings with scrapie that are not identified (zero counts) do not enter the surveillance database. The count variable of interest is the number of scrapie cases per holding. For count distributions a common model is the Poisson distribution and, to adjust for potential heterogeneity, a discrete mixture of Poisson distributions is used. Mixtures of Poissons usually provide an excellent fit as will be demonstrated in the application of interest. However, as has recently been demonstrated, mixtures also suffer from the so‐called boundary problem, resulting in overestimation of population size. It is suggested here to select the mixture model on the basis of the Bayesian Information Criterion. This strategy is further refined by employing a bagging procedure leading to a series of estimates of population size. Using the median of this series, highly influential size estimates are avoided. In limited simulation studies it is shown that the procedure leads to estimates with remarkably small bias. (© 2008 WILEY‐VCH Verlag GmbH & Co. KGaA, Weinheim)

14.
Böhning D, Kuhnert R. Biometrics 2006, 62(4): 1207-1215
This article is about modeling count data with zero truncation. A parametric count density family is considered. The truncated mixture of densities from this family is different from the mixture of truncated densities from the same family. Whereas the former model is more natural to formulate and to interpret, the latter model is theoretically easier to treat. It is shown that for any mixing distribution leading to a truncated mixture, a (usually different) mixing distribution can be found so that the associated mixture of truncated densities equals the truncated mixture, and vice versa. This implies that the likelihood surfaces for both situations agree, and in this sense both models are equivalent. Zero-truncated count data models are used frequently in the capture-recapture setting to estimate population size, and it can be shown that the two Horvitz-Thompson estimators, associated with the two models, agree. In particular, it is possible to achieve strong results for mixtures of truncated Poisson densities, including reliable, global construction of the unique NPMLE (nonparametric maximum likelihood estimator) of the mixing distribution, implying a unique estimator for the population size. The benefit of these results lies in the fact that it is valid to work with the mixture of truncated count densities, which is less appealing for the practitioner but theoretically easier. Mixtures of truncated count densities form a convex linear model, for which a developed theory exists, including global maximum likelihood theory as well as algorithmic approaches. Once the problem has been solved in this class, it might readily be transformed back to the original problem by means of an explicitly given mapping. Applications of these ideas are given, particularly in the case of the truncated Poisson family.
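
The equivalence described above can be checked numerically for a small Poisson example. The sketch below is an illustration under stated assumptions, not a transcription of the article: the reweighting p_j ∝ q_j (1 − f(0; λ_j)) is one mapping consistent with the abstract's claim, and all function names, weights, and rates are hypothetical.

```python
import math


def poisson_pmf(x, lam):
    return math.exp(-lam) * lam**x / math.factorial(x)


def truncated_mixture(x, weights, rates):
    """Mix untruncated Poisson densities first, then truncate at zero."""
    numerator = sum(q * poisson_pmf(x, lam) for q, lam in zip(weights, rates))
    p_zero = sum(q * poisson_pmf(0, lam) for q, lam in zip(weights, rates))
    return numerator / (1.0 - p_zero)


def mixture_of_truncated(x, weights, rates):
    """Truncate each Poisson density at zero first, then mix."""
    return sum(
        p * poisson_pmf(x, lam) / (1.0 - poisson_pmf(0, lam))
        for p, lam in zip(weights, rates)
    )


def map_weights(weights, rates):
    """Assumed mapping: reweight each component by its probability of a nonzero count."""
    raw = [q * (1.0 - poisson_pmf(0, lam)) for q, lam in zip(weights, rates)]
    total = sum(raw)
    return [r / total for r in raw]


q = [0.6, 0.4]       # hypothetical mixing weights for the truncated mixture
rates = [0.5, 3.0]   # hypothetical Poisson rates
p = map_weights(q, rates)

for x in range(1, 6):
    print(x, truncated_mixture(x, q, rates), mixture_of_truncated(x, p, rates))
```

The two columns agree even though the weights p differ from q, which matches the abstract's remark that the mixing distribution is usually different under the two formulations.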

15.
A common design for a falls prevention trial is to assess falling at baseline, randomize participants into an intervention or control group, and ask them to record the number of falls they experience during a follow‐up period of time. This paper addresses how best to include the baseline count in the analysis of the follow‐up count of falls in negative binomial (NB) regression. We examine the performance of various approaches in simulated datasets where both counts are generated from a mixed Poisson distribution with shared random subject effect. Including the baseline count after log‐transformation as a regressor in NB regression (NB‐logged) or as an offset (NB‐offset) resulted in greater power than including the untransformed baseline count (NB‐unlogged). Cook and Wei's conditional negative binomial (CNB) model replicates the underlying process generating the data. In our motivating dataset, a statistically significant intervention effect resulted from the NB‐logged, NB‐offset, and CNB models, but not from NB‐unlogged, and large, outlying baseline counts were overly influential in NB‐unlogged but not in NB‐logged. We conclude that there is little to lose by including the log‐transformed baseline count in standard NB regression compared to CNB for moderate to large datasets.

16.
Count data sets are traditionally analyzed using the ordinary Poisson distribution. However, such a model has limited applicability, as it can be too restrictive to handle specific data structures. The need therefore arises for alternative models that accommodate, for example, (a) zero‐modification (inflation or deflation at the frequency of zeros), (b) overdispersion, and (c) individual heterogeneity arising from clustering or repeated (correlated) measurements made on the same subject. Cases (a)–(b) and (b)–(c) are often treated together in the statistical literature with several practical applications, but models supporting all at once are less common. Hence, this paper's primary goal was to jointly address these issues by deriving a mixed‐effects regression model based on the hurdle version of the Poisson–Lindley distribution. In this framework, the zero‐modification is incorporated by assuming that a binary probability model determines which outcomes are zero‐valued, and a zero‐truncated process is responsible for generating positive observations. Approximate posterior inferences for the model parameters were obtained from a fully Bayesian approach based on the Adaptive Metropolis algorithm. Intensive Monte Carlo simulation studies were performed to assess the empirical properties of the Bayesian estimators. The proposed model was considered for the analysis of a real data set, and its competitiveness regarding some well‐established mixed‐effects models for count data was evaluated. A sensitivity analysis to detect observations that may impact parameter estimates was performed based on standard divergence measures. The Bayesian p‐value and the randomized quantile residuals were considered for model diagnostics.

17.
This paper discusses a two‐state hidden Markov Poisson regression (MPR) model for analyzing longitudinal data of epileptic seizure counts, which allows for the rate of the Poisson process to depend on covariates through an exponential link function and to change according to the states of a two‐state Markov chain with its transition probabilities associated with covariates through a logit link function. This paper also considers a two‐state hidden Markov negative binomial regression (MNBR) model, as an alternative, by using the negative binomial instead of Poisson distribution in the proposed MPR model when there exists extra‐Poisson variation conditional on the states of the Markov chain. The two proposed models in this paper relax the stationarity requirement of the Markov chain, and allow for overdispersion relative to the usual Poisson regression model and for correlation between repeated observations. The proposed methodology provides a plausible analysis for the longitudinal data of epileptic seizure counts, and the MNBR model fits the data much better than the MPR model. Maximum likelihood estimation using the EM and quasi‐Newton algorithms is discussed. A Monte Carlo study for the proposed MPR model investigates the reliability of the estimation method, the choice of probabilities for the initial states of the Markov chain, and some finite sample behaviors of the maximum likelihood estimates, suggesting that (1) the estimation method is accurate and reliable as long as the total number of observations is reasonably large, and (2) the choice of probabilities for the initial states of the Markov process has little impact on the parameter estimates.

18.
Elephants living in dense woodlands are difficult to count. Many elephant populations in Africa occur in such conditions. Estimates of these populations based on total counts, aerial counts and dung counts often lack information on precision and accuracy. We use standard mark–recapture field methods to obtain estimates of population size with associated confidence limits. We apply this approach to a closed elephant population in the Tembe Elephant Park (300 km²), South Africa. A registration count completed in 4 months gives a known population size. We evaluate mark–recapture models against the known population size. Individual identification profiles obtained for elephants during the registration count and mark–recapture events indicate that at least 167 elephants live in the park. We consider this value as an estimate of the minimum number alive. We include 189 sightings of bulls and 37 sightings of breeding herds in the mark–recapture modelling. Of the models we test (Petersen, Schnabel, Schumacher, Jolly–Seber, Bowden's, Poisson and negative binomial), Bowden's gives an estimate closest to the registration count. Assumptions of the model are not violated. For all models except one (negative binomial), our estimates improve with increased sampling intensity. Confidence intervals do not improve with increased effort except for the Schnabel model. Mark–recapture methods should be considered as reliable estimators of population size for elephants occurring in dense woodlands and forests when other methods cannot be relied on.

19.
When analyzing mortality data due to rare diseases in small areas, it is common to find several health zones with no mortality cases. In these circumstances, the classical homogeneous model based on the Poisson distribution used to estimate the relative risks within each area may encounter lack of fit due to a disproportionately large frequency of zeros. To cope with these zeros, the zero inflated Poisson model can be used. In this paper, we propose a test for detecting zero inflation in the context of disease mapping which is based on bootstrap techniques. The test is illustrated using male mortality data due to brain cancer in Navarra, Spain. In addition, comparisons with other tests for Poisson zero inflation such as the score test and the likelihood ratio test are carried out in terms of empirical power and size using the brain cancer scenario. The proposed bootstrap test has good power and size and works well when detecting the excess of zeros in small area data sets. (© 2004 WILEY‐VCH Verlag GmbH & Co. KGaA, Weinheim)

20.
Using the binomial law we modelled field data to estimate the probability (p̂) of detecting pairs of breeding White-throated Dippers, and the population size (N̂ ± confidence limits). The model was divided into two parts according to whether the actual size of the population under study was known or not; in the latter case the truncated binomial model was used. Dipper abundance data were collected from three 4-km-long river tracts in the Pyrénées (France) during the breeding seasons of different years. Goodness-of-fit tests indicated that the binomial model fitted the data well. For a given visit during the survey, the estimated probability of detecting any pair of Dippers if they were present was always high (0.63–0.94) and constant from year to year but not between sites. Estimates (N̂) of the size of the population provided by the binomial model were very close to those derived from mapping techniques. This study provides the first ever quantification of the number of visits required to detect birds on linear territories: three visits were necessary to detect the whole breeding population.
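
The link between the per-visit detection probabilities quoted above (0.63–0.94) and the number of visits can be illustrated with a short calculation. This is a sketch that assumes visits are independent and that the detection probability is constant across visits; the function name is hypothetical and the calculation is not taken from the article.

```python
def cumulative_detection(p, visits):
    """Probability that a pair present on site is detected at least once in `visits` visits."""
    return 1.0 - (1.0 - p) ** visits


# Per-visit detection probabilities at the two ends of the range reported in the abstract.
for p in (0.63, 0.94):
    for visits in (1, 2, 3):
        prob = cumulative_detection(p, visits)
        print(f"p = {p:.2f}, visits = {visits}: detected with probability {prob:.4f}")
```

Under these assumptions, three visits give a cumulative detection probability of roughly 0.95 at the lower end of the range and above 0.999 at the upper end, which is broadly in line with the abstract's conclusion about three visits.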
