1.

Local influence diagnostics for hierarchical count data models with overdispersion and excess zeros

Trias Wahyuni Rakhmawati Geert Molenberghs Geert Verbeke Christel Faes 《Biometrical journal. Biometrische Zeitschrift》2016,58(6):1390-1408

We consider models for hierarchical count data, subject to overdispersion and/or excess zeros. Molenberghs et al. (2007) and Molenberghs et al. (2010) extend the Poisson‐normal generalized linear mixed model by including gamma random effects to accommodate overdispersion. Excess zeros are handled using either a zero‐inflation or a hurdle component. These models were studied by Kassahun et al. (2014). While flexible, they are quite elaborate in parametric specification and therefore model assessment is imperative. We derive local influence measures to detect and examine influential subjects, that is, subjects who have undue influence on either the fit of the model as a whole, or on specific important sub‐vectors of the parameter vector. The latter include the fixed effects for the Poisson and for the excess‐zeros components, the variance components for the normal random effects, and the parameters describing gamma random effects, included to accommodate overdispersion. Interpretable influence components are derived. The method is applied to data from a longitudinal clinical trial involving patients with epileptic seizures. Even though the data were extensively analyzed in earlier work, the insight gained from the proposed diagnostics, statistically and clinically, is considerable. Possibly, a small but important subgroup of patients has been identified.

2.

The k-ZIG: Flexible Modeling for Zero-Inflated Counts

Ghosh S Gelfand AE Zhu K Clark JS 《Biometrics》2012,68(3):878-885

Summary: Many applications involve count data from a process that yields an excess number of zeros. Zero-inflated count models, in particular, zero-inflated Poisson (ZIP) and zero-inflated negative binomial (ZINB) models, along with Poisson hurdle models, are commonly used to address this problem. However, these models struggle to explain extreme incidence of zeros (say more than 80%), especially to find important covariates. In fact, the ZIP may struggle even when the proportion is not extreme. To redress this problem we propose the class of k-ZIG models. These models allow more flexible modeling of both the zero-inflation and the nonzero counts, allowing interplay between these two components. We develop the properties of this new class of models, including reparameterization to a natural link function. The models are straightforwardly fitted within a Bayesian framework. The methodology is illustrated with simulated data examples as well as a forest seedling dataset obtained from the USDA Forest Service's Forest Inventory and Analysis program.

3.

Zero-inflated Poisson and binomial regression with random effects: a case study (cited by: 17 in total; self-citations: 0; citations by others: 17)

Hall DB 《Biometrics》2000,56(4):1030-1039

In a 1992 Technometrics paper, Lambert (1992, 34, 1-14) described zero-inflated Poisson (ZIP) regression, a class of models for count data with excess zeros. In a ZIP model, a count response variable is assumed to be distributed as a mixture of a Poisson(lambda) distribution and a distribution with point mass of one at zero, with mixing probability p. Both p and lambda are allowed to depend on covariates through canonical link generalized linear models. In this paper, we adapt Lambert's methodology to an upper bounded count situation, thereby obtaining a zero-inflated binomial (ZIB) model. In addition, we add to the flexibility of these fixed effects models by incorporating random effects so that, e.g., the within-subject correlation and between-subject heterogeneity typical of repeated measures data can be accommodated. We motivate, develop, and illustrate the methods described here with an example from horticulture, where both upper bounded count (binomial-type) and unbounded count (Poisson-type) data with excess zeros were collected in a repeated measures designed experiment.
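
The ZIP mixture described in this abstract can be illustrated with a minimal sketch. It is not code from the paper: the function names (`zip_pmf`, `zip_params`, `zip_loglik`) and the coefficient vectors `beta` and `gamma` are hypothetical, and the sketch covers only the fixed-effects Poisson case with a log link for lambda and a logit link for p, leaving out the random effects and the binomial variant.

```python
import math


def zip_pmf(y, lam, p):
    """P(Y = y) under a ZIP model: point mass at zero with probability p,
    Poisson(lam) with probability 1 - p."""
    poisson = math.exp(-lam) * lam ** y / math.factorial(y)
    point_mass = p if y == 0 else 0.0
    return point_mass + (1.0 - p) * poisson


def zip_params(x, beta, gamma):
    """Map one covariate vector x to (lam, p) through the canonical links:
    log for the Poisson mean, logit for the mixing probability."""
    eta_lam = sum(b * xi for b, xi in zip(beta, x))
    eta_p = sum(g * xi for g, xi in zip(gamma, x))
    lam = math.exp(eta_lam)
    p = 1.0 / (1.0 + math.exp(-eta_p))
    return lam, p


def zip_loglik(ys, xs, beta, gamma):
    """Log-likelihood of independent ZIP observations (no random effects)."""
    total = 0.0
    for y, x in zip(ys, xs):
        lam, p = zip_params(x, beta, gamma)
        total += math.log(zip_pmf(y, lam, p))
    return total
```

Under this sketch the probability of a zero is p + (1 - p) exp(-lambda), which is how the mixture produces more zeros than a plain Poisson with the same lambda.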

4.

Modeling heterogeneity for count data: A study of maternal mortality in health facilities in Mozambique

Osvaldo Loquiha Niel Hens Leonardo Chavane Marleen Temmerman Marc Aerts 《Biometrical journal. Biometrische Zeitschrift》2013,55(5):647-660

Count data are very common in health services research, and very commonly the basic Poisson regression model has to be extended in several ways to accommodate several sources of heterogeneity: (i) an excess number of zeros relative to a Poisson distribution, (ii) hierarchical structures, and correlated data, (iii) remaining “unexplained” sources of overdispersion. In this paper, we propose hierarchical zero‐inflated and overdispersed models with independent, correlated, and shared random effects for both components of the mixture model. We show that all different extensions of the Poisson model can be based on the concept of mixture models, and that they can be combined to account for all different sources of heterogeneity. Expressions for the first two moments are derived and discussed. The models are applied to data on maternal deaths and related risk factors within health facilities in Mozambique. The final model shows that the maternal mortality rate mainly depends on the geographical location of the health facility, the percentage of women admitted with HIV and the percentage of referrals from the health facility.

5.

A novel method for quantifying overdispersion in count data and its application to farmland birds

Barry J. McMahon Gordon Purvis Helen Sheridan Gavin M. Siriwardena Andrew C. Parnell 《Ibis》2017,159(2):406-414

The statistical modelling of count data permeates the discipline of ecology. Such data often exhibit overdispersion compared with a standard Poisson distribution, so that the variance of the counts is greater than their mean. Whereas modelling to reveal the effects of explanatory variables on the mean is commonplace, overdispersion is generally regarded as a nuisance parameter to be accounted for and subsequently ignored. Instead, we propose a method that models the overdispersion as a biologically interesting property of a data set and show how novel inference is provided as a result. We adapted the double hierarchical generalized linear model approach to create an easily extendible model structure that quantifies the influence of explanatory variables on the overdispersion of count data, and apply it to farmland birds. These data were from a study within Irish agricultural ecosystems, in which total bird species abundance and the abundance of farmland indicator species were compared on dairy and non‐dairy farms in the winter and breeding seasons. In general, overdispersion in bird counts was greater on dairy farms than on non‐dairy farms, and for total bird numbers, overdispersion was greatest on dairy farms in winter. Our model is fitted using the Bayesian package Rstan, and we make all code and data available in a GitHub repository. Within a Bayesian framework, this approach facilitates a meaningful quantification of the effects of categorical explanatory variables on any response variable with a tendency to overdispersion that has a meaningful biological or ecological explanation.
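
As a rough illustration of what overdispersion relative to the Poisson means, the sketch below computes the sample variance-to-mean ratio of a set of counts. This is only a descriptive check, not the double hierarchical model the abstract describes; the function name `dispersion_index` is hypothetical.

```python
from statistics import mean, variance


def dispersion_index(counts):
    """Sample variance divided by sample mean.

    A Poisson distribution has variance equal to its mean, so values well
    above 1 suggest overdispersion relative to the Poisson.
    """
    m = mean(counts)
    if m == 0:
        raise ValueError("dispersion index is undefined when the mean is zero")
    return variance(counts) / m
```

A ratio near 1 is consistent with a Poisson model, while the approach in the abstract goes further and lets explanatory variables act on the overdispersion itself.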

6.

Selecting the right statistical model for analysis of insect count data by using information theoretic measures

Sileshi G 《Bulletin of entomological research》2006,96(5):479-488

7.

Analysis of Zero‐Inflated Poisson Data Incorporating Extent of Exposure

Andy H. Lee Kui Wang Kelvin K.W. Yau 《Biometrical journal. Biometrische Zeitschrift》2001,43(8):963-975

When analyzing Poisson count data, a high frequency of extra zeros is sometimes observed. The Zero‐Inflated Poisson (ZIP) model is a popular approach to handle zero‐inflation. In this paper we generalize the ZIP model and its regression counterpart to accommodate the extent of individual exposure. Empirical evidence drawn from an occupational injury data set confirms that the incorporation of exposure information can exert a substantial impact on the model fit. Tests for zero‐inflation are also considered. Their finite sample properties are examined in a Monte Carlo study.

8.

Quasi Likelihood/Moment Method for Generalized and Restricted Generalized Poisson Regression Models and Its Application

İlknur Özmen 《Biometrical journal. Biometrische Zeitschrift》2000,42(3):303-314

This paper reviews the generalized Poisson regression model, the restricted generalized Poisson regression model and the mixed Poisson regression (negative binomial regression and Poisson inverse Gaussian regression) models, which can be used for regression analysis of counts. The aim of this study is to demonstrate that the quasi likelihood/moment method, which is used to estimate the parameters of mixed Poisson regression models, is also applicable to estimating the parameters of the generalized Poisson regression and the restricted generalized Poisson regression models. An application of this method to zoological data is given at the end of the study.

9.

Zero-inflated Poisson regression models for QTL mapping applied to tick-resistance in a Gyr × Holstein F2 population

Silva FF Tunin KP Rosa GJ da Silva MV Azevedo AL da Silva Verneque R Machado MA Packer IU 《Genetics and molecular biology》2011,34(4):575-581

10.

A two-part mixed-effects pattern-mixture model to handle zero-inflation and incompleteness in a longitudinal setting

Maruotti A 《Biometrical journal. Biometrische Zeitschrift》2011,53(5):716-734

Two-part regression models are frequently used to analyze longitudinal count data with excess zeros, where the same set of subjects is repeatedly observed over time. In this context, several sources of heterogeneity may arise at individual level that affect the observed process. Further, longitudinal studies often suffer from missing values: individuals drop out of the study before its completion, and thus present incomplete data records. In this paper, we propose a finite mixture of hurdle models to face the heterogeneity problem, which is handled by introducing random effects with a discrete distribution; a pattern-mixture approach is specified to deal with non-ignorable missing values. This approach helps us to consider overdispersed counts, while allowing for association between the two parts of the model, and for non-ignorable dropouts. The effectiveness of the proposal is tested through a simulation study. Finally, an application to real data on skin cancer is provided.

11.

A comparison between Poisson and zero-inflated Poisson regression models with an application to number of black spots in Corriedale sheep

Hugo Naya Jorge I Urioste Yu-Mei Chang Mariana Rodrigues-Motta Roberto Kremer Daniel Gianola 《Genetics Selection Evolution》2008,40(4):379-394

Dark spots in the fleece area are often associated with dark fibres in wool, which limits its competitiveness with other textile fibres. Field data from a sheep experiment in Uruguay revealed an excess number of zeros for dark spots. We compared the performance of four Poisson and zero-inflated Poisson (ZIP) models under four simulation scenarios. All models performed reasonably well under the same scenario for which the data were simulated. The deviance information criterion favoured a Poisson model with residual, while the ZIP model with a residual gave estimates closer to their true values under all simulation scenarios. Both Poisson and ZIP models with an error term at the regression level performed better than their counterparts without such an error. Field data from Corriedale sheep were analysed with Poisson and ZIP models with residuals. Parameter estimates were similar for both models. Although the posterior distribution of the sire variance was skewed due to a small number of rams in the dataset, the median of this variance suggested a scope for genetic selection. The main environmental factor was the age of the sheep at shearing. In summary, age related processes seem to drive the number of dark spots in this breed of sheep.

12.

A new mixed‐effects regression model for the analysis of zero‐modified hierarchical count data

Wesley Bertoli Katiane S. Conceição Marinho G. Andrade Francisco Louzada 《Biometrical journal. Biometrische Zeitschrift》2021,63(1):81-104

Count data sets are traditionally analyzed using the ordinary Poisson distribution. However, the applicability of such a model is limited, as it can be too restrictive to handle specific data structures. Hence the need arises for alternative models that accommodate, for example, (a) zero‐modification (inflation or deflation at the frequency of zeros), (b) overdispersion, and (c) individual heterogeneity arising from clustering or repeated (correlated) measurements made on the same subject. Cases (a)–(b) and (b)–(c) are often treated together in the statistical literature with several practical applications, but models supporting all at once are less common. Hence, this paper's primary goal was to jointly address these issues by deriving a mixed‐effects regression model based on the hurdle version of the Poisson–Lindley distribution. In this framework, the zero‐modification is incorporated by assuming that a binary probability model determines which outcomes are zero‐valued, and a zero‐truncated process is responsible for generating positive observations. Approximate posterior inferences for the model parameters were obtained from a fully Bayesian approach based on the Adaptive Metropolis algorithm. Intensive Monte Carlo simulation studies were performed to assess the empirical properties of the Bayesian estimators. The proposed model was considered for the analysis of a real data set, and its competitiveness regarding some well‐established mixed‐effects models for count data was evaluated. A sensitivity analysis to detect observations that may impact parameter estimates was performed based on standard divergence measures. The Bayesian p‐value and the randomized quantile residuals were considered for model diagnostics.
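
The hurdle construction described in this abstract (a binary model for zeros plus a zero‐truncated process for positive counts) can be sketched as follows. To keep the example short, the sketch swaps in an ordinary Poisson for the positive part instead of the Poisson–Lindley distribution used in the paper, and it omits the random effects; the name `hurdle_poisson_pmf` and its parameters are hypothetical.

```python
import math


def hurdle_poisson_pmf(y, pi_zero, lam):
    """P(Y = y) under a hurdle model.

    pi_zero is the probability of a zero from the binary part; positive
    counts come from a Poisson(lam) truncated at zero.
    """
    if y == 0:
        return pi_zero
    poisson = math.exp(-lam) * lam ** y / math.factorial(y)
    # Renormalise the Poisson over y >= 1 by removing its mass at zero.
    truncated = poisson / (1.0 - math.exp(-lam))
    return (1.0 - pi_zero) * truncated
```

Because `pi_zero` is free, the same form covers both zero‐inflation (`pi_zero` above exp(-lam)) and zero‐deflation (`pi_zero` below exp(-lam)); setting `pi_zero` equal to exp(-lam) recovers the plain Poisson.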

13.

Analysis of Longitudinal Data of Epileptic Seizure Counts – A Two‐State Hidden Markov Regression Approach

Peiming Wang Martin L. Puterman 《Biometrical journal. Biometrische Zeitschrift》2001,43(8):941-962

This paper discusses a two‐state hidden Markov Poisson regression (MPR) model for analyzing longitudinal data of epileptic seizure counts, which allows for the rate of the Poisson process to depend on covariates through an exponential link function and to change according to the states of a two‐state Markov chain with its transition probabilities associated with covariates through a logit link function. This paper also considers a two‐state hidden Markov negative binomial regression (MNBR) model, as an alternative, by using the negative binomial instead of Poisson distribution in the proposed MPR model when there exists extra‐Poisson variation conditional on the states of the Markov chain. The two proposed models in this paper relax the stationary requirement of the Markov chain, allow for overdispersion relative to the usual Poisson regression model and for correlation between repeated observations. The proposed methodology provides a plausible analysis for the longitudinal data of epileptic seizure counts, and the MNBR model fits the data much better than the MPR model. Maximum likelihood estimation using the EM and quasi‐Newton algorithms is discussed. A Monte Carlo study for the proposed MPR model investigates the reliability of the estimation method, the choice of probabilities for the initial states of the Markov chain, and some finite sample behaviors of the maximum likelihood estimates, suggesting that (1) the estimation method is accurate and reliable as long as the total number of observations is reasonably large, and (2) the choice of probabilities for the initial states of the Markov process has little impact on the parameter estimates.
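
To make the two‐state hidden Markov Poisson idea concrete, the sketch below evaluates the log‐likelihood of a count series with the scaled forward algorithm. It is a simplification under stated assumptions and not the paper's estimation procedure: the state‐specific rates and the transition matrix are held fixed instead of depending on covariates through exponential and logit links, and the names `poisson_pmf` and `hmm_poisson_loglik` are hypothetical.

```python
import math


def poisson_pmf(y, lam):
    """Poisson probability of count y with rate lam."""
    return math.exp(-lam) * lam ** y / math.factorial(y)


def hmm_poisson_loglik(ys, rates, trans, init):
    """Log-likelihood of counts ys under a two-state Poisson hidden Markov model.

    rates[s]    : Poisson rate in hidden state s (fixed here, no covariates)
    trans[r][s] : probability of moving from state r to state s
    init[s]     : probability of starting in state s
    """
    # Forward probabilities for the first observation.
    alpha = [init[s] * poisson_pmf(ys[0], rates[s]) for s in range(2)]
    loglik = 0.0
    for t in range(len(ys)):
        if t > 0:
            # Propagate through the Markov chain, then weight by the emission.
            alpha = [
                sum(alpha[r] * trans[r][s] for r in range(2))
                * poisson_pmf(ys[t], rates[s])
                for s in range(2)
            ]
        # Rescale at every step to avoid numerical underflow on long series.
        scale = sum(alpha)
        loglik += math.log(scale)
        alpha = [a / scale for a in alpha]
    return loglik
```

When both states share the same rate the hidden chain is irrelevant and the result reduces to an ordinary independent Poisson log‐likelihood, which gives a simple sanity check for the sketch.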

14.

A New Approach for Handling Longitudinal Count Data with Zero‐Inflation and Overdispersion: Poisson Geometric Process Model

Wai‐Yin Wan Jennifer S. K. Chan 《Biometrical journal. Biometrische Zeitschrift》2009,51(4):556-570

For time series of count data, correlated measurements, clustering as well as excessive zeros occur simultaneously in biomedical applications. Ignoring such effects might contribute to misleading treatment outcomes. A generalized mixture Poisson geometric process (GMPGP) model and a zero‐altered mixture Poisson geometric process (ZMPGP) model are developed from the geometric process model, which was originally developed for modelling positive continuous data and was extended to handle count data. These models are motivated by evaluating the trend development of new tumour counts for bladder cancer patients as well as by identifying useful covariates which affect the count level. The models are implemented using a Bayesian method with Markov chain Monte Carlo (MCMC) algorithms and are assessed using the deviance information criterion (DIC).

15.

Pattern‐Mixture Zero‐Inflated Mixed Models for Longitudinal Unbalanced Count Data with Excessive Zeros

M. Tariqul Hasan Gary Sneddon Renjun Ma 《Biometrical journal. Biometrische Zeitschrift》2009,51(6):946-960

Analysis of longitudinal data with excessive zeros has gained increasing attention in recent years; however, current approaches to the analysis of longitudinal data with excessive zeros have primarily focused on balanced data. Dropouts are common in longitudinal studies; therefore, the analysis of the resulting unbalanced data is complicated by the missing mechanism. Our study is motivated by the analysis of longitudinal skin cancer count data presented by Greenberg, Baron, Stukel, Stevens, Mandel, Spencer, Elias, Lowe, Nierenberg, Bayrd, Vance, Freeman, Clendenning, Kwan, and the Skin Cancer Prevention Study Group [New England Journal of Medicine 323, 789–795]. The data consist of a large number of zero responses (83% of the observations) as well as a substantial amount of dropout (about 52% of the observations). To account for both excessive zeros and dropout patterns, we propose a pattern‐mixture zero‐inflated model with compound Poisson random effects for the unbalanced longitudinal skin cancer data. We also incorporate a first‐order autoregressive correlation structure in the model to capture longitudinal correlation of the count responses. A quasi‐likelihood approach has been developed for the estimation of our model. We illustrate the method with an analysis of the longitudinal skin cancer data.

16.

Zero-Altered and other Regression Models for Count Data with Added Zeros

David C. Heilbron 《Biometrical journal. Biometrische Zeitschrift》1994,36(5):531-547

On occasion, generalized linear models for counts based on Poisson or overdispersed count distributions may encounter lack of fit due to disproportionately large frequencies of zeros. Three alternative types of regression models that utilize all the information and explicitly account for excess zeros are examined and given general formulations. A simple mechanism for added zeros is assumed that directly motivates one type of model, here called the added-zero type, particular forms of which have been proposed independently by D. LAMBERT (1992) and in unpublished work by the author. An original regression formulation (the zero-altered model) is presented as a reduced form of the two-part model for count data, which is also discussed. It is suggested that two-part models be used to aid in development of an added-zero model when the latter is thought to be appropriate.

17.

Comparison of Bayesian methods for flexible modeling of spatial risk surfaces in disease mapping

Sibylle Sturtz Katja Ickstadt 《Biometrical journal. Biometrische Zeitschrift》2014,56(1):5-22

Bayesian hierarchical models usually model the risk surface on the same arbitrary geographical units for all data sources. Poisson/gamma random field models overcome this restriction as the underlying risk surface can be specified independently of the resolution of the data. Moreover, covariates may be considered as either excess or relative risk factors. We compare the performance of the Poisson/gamma random field model to the Markov random field (MRF)‐based ecologic regression model and the Bayesian Detection of Clusters and Discontinuities (BDCD) model, in both a simulation study and a real data example. We find the BDCD model to have advantages in situations dominated by abruptly changing risk, while the Poisson/gamma random field model is convincing in its flexibility, both in estimating random field structures and in incorporating covariates. The MRF‐based ecologic regression model is inferior. WinBUGS code for Poisson/gamma random field models is provided.

18.

Modelling bivariate count series with excess zeros

Lee AH Wang K Yau KK Carrivick PJ Stevenson MR 《Mathematical biosciences》2005,196(2):226-237

Bivariate time series of counts with excess zeros relative to the Poisson process are common in many bioscience applications. Failure to account for the extra zeros in the analysis may result in biased parameter estimates and misleading inferences. A class of bivariate zero-inflated Poisson autoregression models is presented to accommodate the zero-inflation and the inherent serial dependency between successive observations. An autoregressive correlation structure is assumed in the random component of the compound regression model. Parameter estimation is achieved via an EM algorithm, by maximizing an appropriate log-likelihood function to obtain residual maximum likelihood estimates. The proposed method is applied to analyze a bivariate series from an occupational health study, in which the zero-inflated injury count events are classified as either musculoskeletal or non-musculoskeletal in nature. The approach enables the evaluation of the effectiveness of a participatory ergonomics intervention at the population level, in terms of reducing the overall incidence of lost-time injury and a simultaneous decline in the two mean injury rates.

19.

Zero‐inflated spatio‐temporal models for disease mapping

Mahmoud Torabi 《Biometrical journal. Biometrische Zeitschrift》2017,59(3):430-444

In this paper, our aim is to analyze geographical and temporal variability of disease incidence when spatio‐temporal count data have excess zeros. To that end, we consider random effects in zero‐inflated Poisson models to investigate geographical and temporal patterns of disease incidence. Spatio‐temporal models that employ conditionally autoregressive smoothing across the spatial dimension and B‐spline smoothing over the temporal dimension are proposed. The analysis of these complex models is computationally difficult from the frequentist perspective. On the other hand, the advent of the Markov chain Monte Carlo algorithm has made the Bayesian analysis of complex models computationally convenient. The recently developed data cloning method provides a frequentist approach to mixed models that is also computationally convenient. We propose to use data cloning, which yields maximum likelihood estimates, to conduct frequentist analysis of zero‐inflated spatio‐temporal modeling of disease incidence. One of the advantages of the data cloning approach is that predictions of smoothed disease incidence over space and time, and the corresponding standard errors (or prediction intervals), are easily obtained. We illustrate our approach using a real dataset of monthly children's asthma visits to hospital in the province of Manitoba, Canada, during the period April 2006 to March 2010. Performance of our approach is also evaluated through a simulation study.

20.

A new multivariate zero‐adjusted Poisson model with applications to biomedicine

Yin Liu Guo‐Liang Tian Man‐Lai Tang Kam Chuen Yuen 《Biometrical journal. Biometrische Zeitschrift》2019,61(6):1340-1370

Although advances have recently been made in modeling multivariate count data, existing models have several limitations: (i) The multivariate Poisson log‐normal model (Aitchison and Ho, 1989) cannot be used to fit multivariate count data with excess zero‐vectors; (ii) The multivariate zero‐inflated Poisson (ZIP) distribution (Li et al., 1999) cannot be used to model zero‐truncated/deflated count data and it is difficult to apply to high‐dimensional cases; (iii) The Type I multivariate zero‐adjusted Poisson (ZAP) distribution (Tian et al., 2017) could only model multivariate count data with a special correlation structure for random components that are all positive or negative. In this paper, we first introduce a new multivariate ZAP distribution, based on a multivariate Poisson distribution, which allows a more flexible dependency structure for the correlations between components, that is, some of the correlation coefficients could be positive while others could be negative. We then develop its important distributional properties, and provide efficient statistical inference methods for the multivariate ZAP model with or without covariates. Two real data examples in biomedicine are used to illustrate the proposed methods.