Similar articles
 20 similar articles found (search time: 15 ms)
1.
  Cited by: 3 (self-citations: 0; citations by others: 3)
Ridout M  Hinde J  Demétrio CG 《Biometrics》2001,57(1):219-223
Count data often show a higher incidence of zero counts than would be expected if the data were Poisson distributed. Zero-inflated Poisson regression models are a useful class of models for such data, but parameter estimates may be seriously biased if the nonzero counts are overdispersed in relation to the Poisson distribution. We therefore provide a score test for testing zero-inflated Poisson regression models against zero-inflated negative binomial alternatives.
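The paper's score statistic tests the ZIP null against a ZINB alternative; deriving it takes some algebra, but the flavor of such overdispersion score tests can be seen in the simpler Dean–Lawless-type statistic for Poisson versus negative binomial in an intercept-only model. A minimal sketch (the function name and simulation settings are illustrative, not from the paper):

```python
import numpy as np

def dispersion_score(y):
    """Dean-Lawless-type score statistic for Poisson vs. negative binomial
    in an intercept-only model (mu_hat = mean(y)); ~N(0, 1) under the Poisson null."""
    y = np.asarray(y, dtype=float)
    mu = y.mean()
    return ((y - mu) ** 2 - y).sum() / np.sqrt(2 * y.size * mu ** 2)

rng = np.random.default_rng(0)
y_pois = rng.poisson(2.0, size=500)               # equidispersed null data
y_nb = rng.negative_binomial(1, 1 / 3, size=500)  # overdispersed: mean 2, variance 6
```

Under the Poisson null the statistic hovers near zero; strong overdispersion pushes it far into the upper tail.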

2.
  Cited by: 1 (self-citations: 0; citations by others: 1)
Jung BC  Jhun M  Lee JW 《Biometrics》2005,61(2):626-628
Ridout, Hinde, and Demétrio (2001, Biometrics 57, 219-223) derived a score test for testing a zero-inflated Poisson (ZIP) regression model against zero-inflated negative binomial (ZINB) alternatives. They noted that the score test based on the normal approximation may fall below the nominal significance level, particularly in small samples. To remedy this problem, a parametric bootstrap method is proposed. It is shown that the bootstrap method keeps the significance level close to the nominal one and has uniformly greater power than the existing normal approximation for testing the hypothesis.
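The parametric bootstrap recipe is generic: fit the null model, simulate replicate datasets from the fit, and recompute the test statistic on each replicate. A minimal sketch using a simple variance-to-mean dispersion statistic in place of the paper's score statistic (all names and settings are illustrative):

```python
import numpy as np

def dispersion_stat(y):
    # variance-to-mean ratio; approximately 1 under a Poisson model
    return y.var(ddof=1) / y.mean()

def parametric_bootstrap_pvalue(y, stat, n_boot=999, seed=1):
    """Parametric bootstrap p-value: refit the null (here Poisson with
    lambda = sample mean), simulate replicates, compare to the observed statistic."""
    rng = np.random.default_rng(seed)
    lam_hat = y.mean()
    t_obs = stat(y)
    t_boot = np.array([stat(rng.poisson(lam_hat, size=y.size))
                       for _ in range(n_boot)])
    return (1 + (t_boot >= t_obs).sum()) / (n_boot + 1)

rng = np.random.default_rng(2)
y = rng.negative_binomial(2, 0.4, size=100)   # clearly overdispersed data
p_value = parametric_bootstrap_pvalue(y, dispersion_stat)
```

Because the null distribution is simulated rather than approximated, the test's level stays close to nominal even in small samples.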

3.
When analyzing Poisson count data, a high frequency of extra zeros is sometimes observed. The Zero-Inflated Poisson (ZIP) model is a popular approach to handle zero-inflation. In this paper we generalize the ZIP model and its regression counterpart to accommodate the extent of individual exposure. Empirical evidence drawn from an occupational injury data set confirms that the incorporation of exposure information can exert a substantial impact on the model fit. Tests for zero-inflation are also considered. Their finite sample properties are examined in a Monte Carlo study.
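Exposure typically enters the Poisson mean multiplicatively, mu_i = exposure_i × lambda. A sketch of a ZIP log-likelihood with exposure, fitted by direct optimization (the parameterization, names, and settings are assumptions for illustration, not the paper's model):

```python
import numpy as np
from scipy.optimize import minimize
from scipy.special import gammaln

def zip_negloglik(params, y, exposure):
    """Negative log-likelihood of a ZIP model with exposure: mu_i = exposure_i * lam.
    params = (logit_p, log_lam); illustrative parameterization."""
    p = 1 / (1 + np.exp(-params[0]))   # zero-inflation probability
    lam = np.exp(params[1])            # event rate per unit exposure
    mu = exposure * lam
    ll_zero = np.log(p + (1 - p) * np.exp(-mu))
    ll_pos = np.log(1 - p) - mu + y * np.log(mu) - gammaln(y + 1)
    return -np.where(y == 0, ll_zero, ll_pos).sum()

rng = np.random.default_rng(3)
n, p_true, lam_true = 2000, 0.3, 1.5
exposure = rng.uniform(0.5, 2.0, n)
y = rng.poisson(exposure * lam_true) * (rng.random(n) > p_true)
fit = minimize(zip_negloglik, x0=[0.0, 0.0], args=(y, exposure))
p_hat = 1 / (1 + np.exp(-fit.x[0]))
lam_hat = np.exp(fit.x[1])
```

With exposure in the mean, the fitted rate is interpretable per unit of exposure rather than per observation.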

4.
We propose a likelihood-based model for correlated count data that display under- or overdispersion within units (e.g. subjects). The model is capable of handling correlation due to clustering and/or serial correlation, in the presence of unbalanced, missing or unequally spaced data. A family of distributions based on birth-event processes is used to model within-subject underdispersion. A computational approach is given to overcome a parameterization difficulty with this family, and this allows use of common Markov Chain Monte Carlo software (e.g. WinBUGS) for estimation. Application of the model to daily counts of asthma inhaler use by children shows substantial within-subject underdispersion, between-subject heterogeneity and correlation due to both clustering of measurements within subjects and serial correlation of longitudinal measurements. The model provides a major improvement over Poisson longitudinal models, and diagnostics show that the model fits well.

5.
When replicate count data are overdispersed, it is common practice to incorporate this extra-Poisson variability by including latent parameters at the observation level. For example, the negative binomial and Poisson-lognormal (PLN) models are obtained by using gamma and lognormal latent parameters, respectively. Several recent publications have employed the deviance information criterion (DIC) to choose between these two models, with the deviance defined using the Poisson likelihood that is obtained from conditioning on these latent parameters. The results herein show that this use of DIC is inappropriate. Instead, DIC was seen to perform well if calculated using likelihood that was marginalized at the group level by integrating out the observation-level latent parameters. This group-level marginalization is explicit in the case of the negative binomial, but requires numerical integration for the PLN model. Similarly, DIC performed well to judge whether zero inflation was required when calculated using the group-marginalized form of the zero-inflated likelihood. In the context of comparing multilevel hierarchical models, the top-level DIC was obtained using likelihood that was further marginalized by additional integration over the group-level latent parameters, and the marginal densities of the models were calculated for the purpose of providing Bayes' factors. The computational viability and interpretability of these different measures is considered.
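The marginalization that is "explicit" for the negative binomial is the classical identity that a Gamma mixture of Poissons is negative binomial; the PLN case needs the same integral with a lognormal mixing density instead. A numerical check of the explicit case (illustrative only):

```python
import numpy as np
from scipy import stats
from scipy.integrate import quad

def marginal_pmf(y, shape, scale):
    """Marginal P(Y = y) after integrating a Gamma(shape, scale) latent rate
    out of the Poisson likelihood (numerical integration)."""
    integrand = lambda lam: stats.poisson.pmf(y, lam) * stats.gamma.pdf(lam, a=shape, scale=scale)
    return quad(integrand, 0, np.inf)[0]

# The Gamma-Poisson mixture equals a negative binomial with
# size = shape and success probability p = 1 / (1 + scale).
shape, scale = 3.0, 2.0
marg = [marginal_pmf(y, shape, scale) for y in range(6)]
exact = [stats.nbinom.pmf(y, shape, 1 / (1 + scale)) for y in range(6)]
```

The two columns agree to numerical precision, which is why the negative binomial marginal likelihood is available in closed form.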

6.
Analysis of longitudinal data with excessive zeros has gained increasing attention in recent years; however, current approaches have primarily focused on balanced data. Dropouts are common in longitudinal studies, so the analysis of the resulting unbalanced data is complicated by the missing-data mechanism. Our study is motivated by the analysis of longitudinal skin cancer count data presented by Greenberg, Baron, Stukel, Stevens, Mandel, Spencer, Elias, Lowe, Nierenberg, Bayrd, Vance, Freeman, Clendenning, Kwan, and the Skin Cancer Prevention Study Group (New England Journal of Medicine 323, 789-795). The data consist of a large number of zero responses (83% of the observations) as well as a substantial amount of dropout (about 52% of the observations). To account for both excessive zeros and dropout patterns, we propose a pattern-mixture zero-inflated model with compound Poisson random effects for the unbalanced longitudinal skin cancer data. We also incorporate an autoregressive of order one (AR(1)) correlation structure in the model to capture longitudinal correlation of the count responses. A quasi-likelihood approach is developed for the estimation of our model. We illustrate the method with an analysis of the longitudinal skin cancer data.
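The AR(1) structure referred to here sets corr(y_s, y_t) = rho^|s - t| across time points within a subject. A minimal construction of such a working correlation matrix (illustrative):

```python
import numpy as np

def ar1_corr(n_times, rho):
    """AR(1) working correlation matrix: corr(y_s, y_t) = rho ** |s - t|."""
    idx = np.arange(n_times)
    return rho ** np.abs(idx[:, None] - idx[None, :])

R = ar1_corr(4, 0.5)   # 4 time points, lag-1 correlation 0.5
```

The matrix is symmetric with unit diagonal, and correlation decays geometrically with the time lag.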

7.
A randomised controlled trial to evaluate a training programme for physician-patient communication required the analysis of paired count data. The impact of departures from the Poisson assumption when paired count data are analysed through use of a conditional likelihood is illustrated. A simple approach to providing robust inference is outlined and illustrated.
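For independent Poisson counts, conditioning on the pair totals turns rate comparison into a binomial problem, which is the usual conditional-likelihood device for paired counts. A sketch (function name and settings are illustrative; the naive version below relies on exactly the Poisson assumption whose failure the abstract warns about):

```python
import numpy as np
from scipy import stats

def paired_poisson_conditional_test(x1, x2):
    """Conditional test of equal rates for paired Poisson counts: given the
    total, sum(x1) | (sum(x1) + sum(x2)) ~ Binomial(total, 1/2) under H0."""
    total = int(np.sum(x1) + np.sum(x2))
    return stats.binomtest(int(np.sum(x1)), total, p=0.5).pvalue

rng = np.random.default_rng(4)
x1 = rng.poisson(5.0, 100)
x2 = rng.poisson(5.0, 100)   # same rate as x1
x3 = rng.poisson(9.0, 100)   # clearly higher rate
p_null = paired_poisson_conditional_test(x1, x2)
p_alt = paired_poisson_conditional_test(x1, x3)
```

Overdispersion relative to Poisson inflates the type I error of this conditional test, which motivates the robust adjustment the paper outlines.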

8.
This paper presents the zero-truncated negative binomial regression model to estimate the population size in the presence of a single registration file. The model is an alternative to the zero-truncated Poisson regression model and it may be useful if the data are overdispersed due to unobserved heterogeneity. Horvitz-Thompson point and interval estimates for the population size are derived, and the performance of these estimators is evaluated in a simulation study. To illustrate the model, the size of the population of opiate users in the city of Rotterdam is estimated. In comparison to the Poisson model, the zero-truncated negative binomial regression model fits these data better and yields a substantially higher population size estimate. (© 2008 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim)
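The Horvitz-Thompson idea here is N_hat = n_observed / P(observed), with P(observed) estimated from the zero-truncated model. A sketch using the simpler zero-truncated Poisson in place of the paper's zero-truncated negative binomial (names and settings are illustrative):

```python
import numpy as np
from scipy.optimize import brentq

def fit_truncated_poisson(y_pos):
    """MLE of lambda from zero-truncated Poisson counts: solves
    lambda / (1 - exp(-lambda)) = mean(y_pos)."""
    ybar = np.mean(y_pos)
    return brentq(lambda lam: lam / (1 - np.exp(-lam)) - ybar, 1e-6, 100.0)

def horvitz_thompson_size(y_pos):
    """Population size estimate N_hat = n_observed / P(observed)."""
    lam = fit_truncated_poisson(y_pos)
    return y_pos.size / (1 - np.exp(-lam))

rng = np.random.default_rng(5)
N_true = 5000
y = rng.poisson(1.2, N_true)      # full population; zeros are never registered
N_hat = horvitz_thompson_size(y[y > 0])
```

Only the positive counts enter the estimator, yet it recovers the full population size, including the unregistered zero-count individuals.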

9.
The classical concordance correlation coefficient (CCC) for measuring agreement among a set of observers assumes normally distributed data and a linear relationship between the mean and the subject and observer effects. Here, the CCC is generalized to accommodate any distribution from the exponential family by means of generalized linear mixed model (GLMM) theory and applied to the case of overdispersed count data. An example of CD34+ cell count data is provided to show the applicability of the procedure. In that case, different CCCs are defined and applied to the data by changing the GLMM that fits the data. A simulation study is carried out to explore the behavior of the procedure with small and moderate sample sizes.
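The classical CCC being generalized here is Lin's coefficient, 2·cov(x, y) / (var(x) + var(y) + (mean(x) - mean(y))²). A minimal implementation (illustrative; the paper's GLMM-based versions are more general):

```python
import numpy as np

def lin_ccc(x, y):
    """Lin's classical concordance correlation coefficient:
    2*cov(x, y) / (var(x) + var(y) + (mean(x) - mean(y))**2)."""
    x, y = np.asarray(x, float), np.asarray(y, float)
    sxy = np.cov(x, y, ddof=1)[0, 1]
    return 2 * sxy / (x.var(ddof=1) + y.var(ddof=1) + (x.mean() - y.mean()) ** 2)

x = np.array([10.0, 12.0, 14.0, 16.0, 18.0])
y = x + 3.0   # perfectly correlated, but shifted: agreement is penalized
```

Unlike Pearson correlation, the CCC penalizes location and scale shifts: here corr(x, y) = 1 but the CCC is only 20/29.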

10.
In this paper, a Bayesian method for inference is developed for the zero-modified Poisson (ZMP) regression model. This model is very flexible for analyzing count data without requiring any information about inflation or deflation of zeros in the sample. A general class of prior densities based on an information matrix is considered for the model parameters. A sensitivity study to detect influential cases that can change the results is performed based on the Kullback-Leibler divergence. Simulation studies are presented in order to illustrate the performance of the developed methodology. Two real datasets on leptospirosis notification in Bahia State (Brazil) are analyzed using the proposed methodology for the ZMP model.
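The zero-modified Poisson pmf is P(0) = pi + (1 - pi)e^(-lambda), where pi > 0 inflates zeros and a (bounded) negative pi deflates them, so no prior knowledge of which regime holds is needed. A sketch verifying that both regimes give a proper distribution (illustrative):

```python
import numpy as np
from scipy import stats

def zmp_pmf(y, pi, lam):
    """Zero-modified Poisson pmf: P(0) = pi + (1 - pi) * exp(-lam);
    positive counts are scaled by (1 - pi). Negative pi must satisfy
    pi >= -exp(-lam) / (1 - exp(-lam)) to keep P(0) >= 0."""
    base = stats.poisson.pmf(y, lam)
    return np.where(y == 0, pi + (1 - pi) * base, (1 - pi) * base)

ys = np.arange(0, 200)
pmf_inflated = zmp_pmf(ys, 0.3, 2.0)    # extra zeros
pmf_deflated = zmp_pmf(ys, -0.1, 2.0)   # fewer zeros than Poisson
```

Both choices of pi yield nonnegative probabilities summing to one, which is the flexibility the abstract emphasizes.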

11.
This article presents two-component hierarchical Bayesian models which incorporate both overdispersion and excess zeros. The components may result from some intervention (treatment) that changes the rare-event generating process. The models are also expanded to take into account any heterogeneity that may exist in the data. Details of model fitting, checking, and selection among alternative models from a Bayesian perspective are also presented. The proposed methods are applied to count data on the assessment of the efficacy of pesticides in controlling the reproduction of whitefly. (© 2004 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim)

12.
Two-part regression models are frequently used to analyze longitudinal count data with excess zeros, where the same set of subjects is repeatedly observed over time. In this context, several sources of heterogeneity may arise at individual level that affect the observed process. Further, longitudinal studies often suffer from missing values: individuals dropout of the study before its completion, and thus present incomplete data records. In this paper, we propose a finite mixture of hurdle models to face the heterogeneity problem, which is handled by introducing random effects with a discrete distribution; a pattern-mixture approach is specified to deal with non-ignorable missing values. This approach helps us to consider overdispersed counts, while allowing for association between the two parts of the model, and for non-ignorable dropouts. The effectiveness of the proposal is tested through a simulation study. Finally, an application to real data on skin cancer is provided.
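A hurdle model splits the likelihood into a binary zero/positive part and a zero-truncated count part, and without shared random effects the two parts can be fitted separately. A minimal sketch without covariates, mixtures, or dropout (all settings illustrative, far simpler than the paper's model):

```python
import numpy as np
from scipy.optimize import brentq

rng = np.random.default_rng(6)
n, pi0_true, lam_true = 3000, 0.4, 2.5

def rtrunc_poisson(lam, size, rng):
    """Zero-truncated Poisson draws by rejection sampling."""
    out = np.empty(0, dtype=int)
    while out.size < size:
        draw = rng.poisson(lam, size)
        out = np.concatenate([out, draw[draw > 0]])
    return out[:size]

# Simulate hurdle-Poisson data: zero with probability pi0, else truncated Poisson.
is_zero = rng.random(n) < pi0_true
y = np.zeros(n, dtype=int)
y[~is_zero] = rtrunc_poisson(lam_true, int((~is_zero).sum()), rng)

# The hurdle likelihood separates, so each part is fitted on its own.
pi0_hat = (y == 0).mean()
ybar_pos = y[y > 0].mean()
lam_hat = brentq(lambda L: L / (1 - np.exp(-L)) - ybar_pos, 1e-6, 50.0)
```

The separability is what the finite-mixture extension deliberately breaks: shared discrete random effects induce association between the two parts.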

13.
  Cited by: 1 (self-citations: 0; citations by others: 1)
Overdispersion is a common phenomenon in Poisson modeling, and the negative binomial (NB) model is frequently used to account for it. Testing approaches (the Wald test, likelihood ratio test (LRT), and score test) for overdispersion in the Poisson regression versus the NB model are available. Because the generalized Poisson (GP) model is similar to the NB model, we consider the former as an alternative model for overdispersed count data. The score test has an advantage over the LRT and the Wald test in that it only requires the parameter of interest to be estimated under the null hypothesis. This paper proposes a score test for overdispersion based on the GP model and compares its power with that of the LRT and Wald tests. A simulation study indicates that the score test based on the asymptotic standard normal distribution is more appropriate in practical applications because of its higher empirical power; however, it underestimates the nominal significance level, especially in small samples. Examples illustrate the results of comparing the candidate tests between the Poisson and GP models. A bootstrap test is also proposed to correct the underestimation of the nominal level of the score statistic when the sample size is small. The simulation study indicates that the bootstrap test has a significance level closer to the nominal one and uniformly greater power than the score test based on the asymptotic standard normal distribution. From a practical perspective, we suggest that if the score test gives even a weak indication that the Poisson model is inappropriate, say at the 0.10 significance level, the more accurate bootstrap procedure should be used to test whether the GP model is more appropriate than the Poisson model. Finally, the Vuong test is illustrated as a way to choose between the GP and NB2 models for the same dataset.
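The generalized Poisson pmf is P(y) = theta(theta + delta·y)^(y-1) e^(-(theta + delta·y)) / y!, with mean theta/(1 - delta) and variance theta/(1 - delta)^3, so any delta > 0 produces overdispersion. A numerical check of those moment formulas (illustrative):

```python
import numpy as np
from scipy.special import gammaln

def gp_pmf(y, theta, delta):
    """Generalized Poisson pmf:
    P(y) = theta * (theta + delta*y)**(y - 1) * exp(-(theta + delta*y)) / y!."""
    y = np.asarray(y, dtype=float)
    logp = (np.log(theta) + (y - 1) * np.log(theta + delta * y)
            - (theta + delta * y) - gammaln(y + 1))
    return np.exp(logp)

theta, delta = 2.0, 0.3
ys = np.arange(0, 400)
pmf = gp_pmf(ys, theta, delta)
mean = (ys * pmf).sum()                 # should be theta / (1 - delta)
var = ((ys - mean) ** 2 * pmf).sum()    # should be theta / (1 - delta)**3
```

With delta = 0.3 the variance is (1 - delta)^(-2) ≈ 2 times the mean, the kind of overdispersion the GP-based score test is designed to detect.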

14.
Huang X 《Biometrics》2009,65(2):361-368
Generalized linear mixed models (GLMMs) are widely used in the analysis of clustered data. However, the validity of likelihood-based inference in such analyses can be greatly affected by the assumed model for the random effects. We propose a diagnostic method for random-effect model misspecification in GLMMs for clustered binary response. We provide a theoretical justification of the proposed method and investigate its finite sample performance via simulation. The proposed method is applied to data from a longitudinal respiratory infection study.

15.
16.
Behavioural research often produces data that have a complicated structure. For instance, data can represent repeated observations of the same individual and suffer from heteroscedasticity as well as other technical snags. The regression analysis of such data is often complicated by the fact that the observations (response variables) are mutually correlated. The correlation structure can be quite complex and might or might not be of direct interest to the user. In any case, one needs to take correlations into account (e.g. by means of random-effect specification) in order to arrive at correct statistical inference (e.g. for construction of the appropriate test or confidence intervals). Over the last decade, such data have been more and more frequently analysed using repeated-measures ANOVA and mixed-effects models. Some researchers invoke the heavy machinery of mixed-effects modelling to obtain the desired population-level (marginal) inference, which can be achieved by using simpler tools - namely marginal models. This paper highlights marginal modelling (using generalized least squares [GLS] regression) as an alternative method. In various concrete situations, such marginal models can be based on fewer assumptions and directly generate estimates (population-level parameters) which are of immediate interest to the behavioural researcher (such as population mean). Sometimes, they might be not only easier to interpret but also easier to specify than their competitors (e.g. mixed-effects models). Using five examples from behavioural research, we demonstrate the use, advantages, limits and pitfalls of marginal and mixed-effects models implemented within the functions of the 'nlme' package in R.
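The marginal (GLS) estimator underlying this approach is beta_hat = (X'V⁻¹X)⁻¹X'V⁻¹y for a working covariance matrix V. A minimal sketch in Python rather than the paper's R/'nlme' setting (illustrative):

```python
import numpy as np

def gls(X, y, V):
    """Generalized least squares with known (working) covariance V:
    beta_hat = (X' V^-1 X)^-1 X' V^-1 y."""
    Vi = np.linalg.inv(V)
    return np.linalg.solve(X.T @ Vi @ X, X.T @ Vi @ y)

# Sanity check: with V = I, GLS reduces to ordinary least squares.
X = np.column_stack([np.ones(5), np.arange(5.0)])
y = np.array([1.0, 2.1, 2.9, 4.2, 5.0])
beta_gls = gls(X, y, np.eye(5))
beta_ols = np.linalg.lstsq(X, y, rcond=None)[0]
```

In practice V encodes the within-subject correlation (e.g. the AR(1) or compound-symmetry structures that 'nlme' offers), and the marginal coefficients are directly the population-level parameters of interest.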

17.
18.
We discuss the problem of estimating the number of nests of different species of seabirds on North East Herald Cay based on the data from a 1996 survey of quadrats along transects and data from similar past surveys. We consider three approaches based on different plausible models, namely a conditional negative binomial model that allows for additional zeroes in the data, a weighting approach (based on a heteroscedastic regression model), and a transform-both-sides regression approach. We find that the conditional negative binomial approach and a linear regression approach work well but that the transform-both-sides approach should not be used. We apply the conditional negative binomial and linear regression approaches with poststratification based on data quality and availability to estimate the number of frigatebird nests on North East Herald Cay.

19.
  Cited by: 6 (self-citations: 0; citations by others: 6)
Tao H  Palta M  Yandell BS  Newton MA 《Biometrics》1999,55(1):102-110
A semiparametric mixed effects regression model is proposed for the analysis of clustered or longitudinal data with continuous, ordinal, or binary outcome. The common assumption of Gaussian random effects is relaxed by using a predictive recursion method (Newton and Zhang, 1999) to provide a nonparametric smooth density estimate. A new strategy is introduced to accelerate the algorithm. Parameter estimates are obtained by maximizing the marginal profile likelihood by Powell's conjugate direction search method. Monte Carlo results are presented to show that the method can improve the mean squared error of the fixed effects estimators when the random effects distribution is not Gaussian. The usefulness of visualizing the random effects density itself is illustrated in the analysis of data from the Wisconsin Sleep Survey. The proposed estimation procedure is computationally feasible for quite large data sets.
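Newton's predictive recursion updates a mixing-density estimate one observation at a time: f_i ∝ (1 - w_i) f_{i-1} + w_i · L(x_i | theta) f_{i-1} / m(x_i), where m(x_i) is the current predictive density. A grid-based sketch with a normal kernel (the kernel, weight sequence, and per-step renormalization are assumptions for illustration):

```python
import numpy as np
from scipy import stats

def predictive_recursion(x, grid):
    """Newton's predictive recursion for a nonparametric mixing density on a
    grid, with a N(theta, 1) kernel; renormalized numerically at each step."""
    dx = grid[1] - grid[0]
    f = np.full(grid.size, 1.0 / (grid[-1] - grid[0]))   # uniform starting guess
    for i, xi in enumerate(x):
        w = 1.0 / (i + 2)                                # decaying weight sequence
        lik = stats.norm.pdf(xi, loc=grid, scale=1.0)    # kernel likelihood L(x_i | theta)
        m = np.sum(lik * f) * dx                         # predictive density of x_i
        f = (1 - w) * f + w * lik * f / m
        f /= np.sum(f) * dx                              # keep f a proper density
    return f

rng = np.random.default_rng(8)
x = np.concatenate([rng.normal(-2, 1, 200), rng.normal(2, 1, 200)])
rng.shuffle(x)
grid = np.linspace(-8.0, 8.0, 401)
f_hat = predictive_recursion(x, grid)
```

A single pass over the data yields a smooth density estimate, which is what makes the approach feasible for large data sets.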

20.
The potency of antiretroviral agents in AIDS clinical trials can be assessed on the basis of an early viral response such as viral decay rate or change in viral load (number of copies of HIV RNA) of the plasma. Linear, parametric nonlinear, and semiparametric nonlinear mixed-effects models have been proposed to estimate viral decay rates in viral dynamic models. However, before applying these models to clinical data, a critical question that remains to be addressed is whether these models produce coherent estimates of viral decay rates, and if not, which model is appropriate and should be used in practice. In this paper, we applied these models to data from an AIDS clinical trial of potent antiviral treatments and found significant incongruity in the estimated rates of reduction in viral load. Simulation studies indicated that reliable estimates of viral decay rate were obtained by using the parametric and semiparametric nonlinear mixed-effects models. Our analysis also indicated that the decay rates estimated by using linear mixed-effects models should be interpreted differently from those estimated by using nonlinear mixed-effects models. The semiparametric nonlinear mixed-effects model is preferred to other models because arbitrary data truncation is not needed. Based on real data analysis and simulation studies, we provide guidelines for estimating viral decay rates from clinical data. (© 2004 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim)
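The parametric nonlinear form commonly used for viral decay is biexponential, V(t) = P1·e^(-d1·t) + P2·e^(-d2·t), with d1 and d2 the fast and slow decay rates. A sketch of a single-subject fit on the log scale (no mixed effects; all parameter names and values are illustrative, not from the trial analyzed in the paper):

```python
import numpy as np
from scipy.optimize import curve_fit

def log_decay(t, lp1, d1, lp2, d2):
    """Log of a biexponential viral decay: V(t) = exp(lp1 - d1*t) + exp(lp2 - d2*t);
    lp1 and lp2 are log initial amplitudes of the fast and slow compartments."""
    return np.log(np.exp(lp1 - d1 * t) + np.exp(lp2 - d2 * t))

rng = np.random.default_rng(7)
t = np.linspace(0.0, 28.0, 15)                 # days since treatment start
true = (np.log(1e5), 0.5, np.log(1e3), 0.04)   # assumed true parameters
log_v = log_decay(t, *true) + rng.normal(0, 0.05, t.size)   # log-scale noise
popt, _ = curve_fit(log_decay, t, log_v,
                    p0=(np.log(5e4), 0.3, np.log(5e2), 0.02), maxfev=20000)
```

Fitting on the log scale weights the small late measurements sensibly; a linear mixed model fitted to log viral load would instead estimate a single average slope, which is why the paper cautions that its decay rates carry a different interpretation.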


Copyright © Beijing Qinyun Technology Development Co., Ltd. 京ICP备09084417号