首页 | 本学科首页   官方微博 | 高级检索  
相似文献
 共查询到20条相似文献,搜索用时 203 毫秒
1.
Bivariate time series of counts with excess zeros relative to the Poisson process are common in many bioscience applications. Failure to account for the extra zeros in the analysis may result in biased parameter estimates and misleading inferences. A class of bivariate zero-inflated Poisson autoregression models is presented to accommodate the zero-inflation and the inherent serial dependency between successive observations. An autoregressive correlation structure is assumed in the random component of the compound regression model. Parameter estimation is achieved via an EM algorithm, by maximizing an appropriate log-likelihood function to obtain residual maximum likelihood estimates. The proposed method is applied to analyze a bivariate series from an occupational health study, in which the zero-inflated injury count events are classified as either musculoskeletal or non-musculoskeletal in nature. The approach enables the evaluation of the effectiveness of a participatory ergonomics intervention at the population level, in terms of reducing the overall incidence of lost-time injury and a simultaneous decline in the two mean injury rates.  相似文献   

2.
Recurrent events data are commonly encountered in medical studies. In many applications, only the number of events during the follow‐up period rather than the recurrent event times is available. Two important challenges arise in such studies: (a) a substantial portion of subjects may not experience the event, and (b) we may not observe the event count for the entire study period due to informative dropout. To address the first challenge, we assume that underlying population consists of two subpopulations: a subpopulation nonsusceptible to the event of interest and a subpopulation susceptible to the event of interest. In the susceptible subpopulation, the event count is assumed to follow a Poisson distribution given the follow‐up time and the subject‐specific characteristics. We then introduce a frailty to account for informative dropout. The proposed semiparametric frailty models consist of three submodels: (a) a logistic regression model for the probability such that a subject belongs to the nonsusceptible subpopulation; (b) a nonhomogeneous Poisson process model with an unspecified baseline rate function; and (c) a Cox model for the informative dropout time. We develop likelihood‐based estimation and inference procedures. The maximum likelihood estimators are shown to be consistent. Additionally, the proposed estimators of the finite‐dimensional parameters are asymptotically normal and the covariance matrix attains the semiparametric efficiency bound. Simulation studies demonstrate that the proposed methodologies perform well in practical situations. We apply the proposed methods to a clinical trial on patients with myelodysplastic syndromes.  相似文献   

3.
Analysis of count data is required in many areas of biometric interest. Often the simple Poisson distribution is not appropriate, since an extra-number of zero counts occur in the count data. Some current approaches for the problem at hand are reviewed. It will be argued that these situations can often be easily modeled using the zero-inflated Poisson distribution. A variety of applications are considered in which this occurs. Possibilities are outlined on how the validity of the zero-inflated Poisson can be validated including a comparison with the nonparametric Poisson mixture maximum likelihood estimator.  相似文献   

4.
1. A quantile regression model for counts of breeding Cape Sable seaside sparrows Ammodramus maritimus mirabilis (L.) as a function of water depth and previous year abundance was developed based on extensive surveys, 1992-2005, in the Florida Everglades. The quantile count model extends linear quantile regression methods to discrete response variables, providing a flexible alternative to discrete parametric distributional models, e.g. Poisson, negative binomial and their zero-inflated counterparts. 2. Estimates from our multiplicative model demonstrated that negative effects of increasing water depth in breeding habitat on sparrow numbers were dependent on recent occupation history. Upper 10th percentiles of counts (one to three sparrows) decreased with increasing water depth from 0 to 30 cm when sites were not occupied in previous years. However, upper 40th percentiles of counts (one to six sparrows) decreased with increasing water depth for sites occupied in previous years. 3. Greatest decreases (-50% to -83%) in upper quantiles of sparrow counts occurred as water depths increased from 0 to 15 cm when previous year counts were 1, but a small proportion of sites (5-10%) held at least one sparrow even as water depths increased to 20 or 30 cm. 4. A zero-inflated Poisson regression model provided estimates of conditional means that also decreased with increasing water depth but rates of change were lower and decreased with increasing previous year counts compared to the quantile count model. Quantiles computed for the zero-inflated Poisson model enhanced interpretation of this model but had greater lack-of-fit for water depths > 0 cm and previous year counts 1, conditions where the negative effect of water depths were readily apparent and fitted better with the quantile count model.  相似文献   

5.
Phenotypes measured in counts are commonly observed in nature. Statistical methods for mapping quantitative trait loci (QTL) underlying count traits are documented in the literature. The majority of them assume that the count phenotype follows a Poisson distribution with appropriate techniques being applied to handle data dispersion. When a count trait has a genetic basis, “naturally occurring” zero status also reflects the underlying gene effects. Simply ignoring or miss-handling the zero data may lead to wrong QTL inference. In this article, we propose an interval mapping approach for mapping QTL underlying count phenotypes containing many zeros. The effects of QTLs on the zero-inflated count trait are modelled through the zero-inflated generalized Poisson regression mixture model, which can handle the zero inflation and Poisson dispersion in the same distribution. We implement the approach using the EM algorithm with the Newton-Raphson algorithm embedded in the M-step, and provide a genome-wide scan for testing and estimating the QTL effects. The performance of the proposed method is evaluated through extensive simulation studies. Extensions to composite and multiple interval mapping are discussed. The utility of the developed approach is illustrated through a mouse F2 intercross data set. Significant QTLs are detected to control mouse cholesterol gallstone formation.  相似文献   

6.
Qihuang Zhang  Grace Y. Yi 《Biometrics》2023,79(2):1089-1102
Zero-inflated count data arise frequently from genomics studies. Analysis of such data is often based on a mixture model which facilitates excess zeros in combination with a Poisson distribution, and various inference methods have been proposed under such a model. Those analysis procedures, however, are challenged by the presence of measurement error in count responses. In this article, we propose a new measurement error model to describe error-contaminated count data. We show that ignoring the measurement error effects in the analysis may generally lead to invalid inference results, and meanwhile, we identify situations where ignoring measurement error can still yield consistent estimators. Furthermore, we propose a Bayesian method to address the effects of measurement error under the zero-inflated Poisson model and discuss the identifiability issues. We develop a data-augmentation algorithm that is easy to implement. Simulation studies are conducted to evaluate the performance of the proposed method. We apply our method to analyze the data arising from a prostate adenocarcinoma genomic study.  相似文献   

7.
8.
Ridout M  Hinde J  Demétrio CG 《Biometrics》2001,57(1):219-223
Count data often show a higher incidence of zero counts than would be expected if the data were Poisson distributed. Zero-inflated Poisson regression models are a useful class of models for such data, but parameter estimates may be seriously biased if the nonzero counts are overdispersed in relation to the Poisson distribution. We therefore provide a score test for testing zero-inflated Poisson regression models against zero-inflated negative binomial alternatives.  相似文献   

9.
This paper presents new methods, using a Bayesian approach, for analyzing longitudinal count data with excess zeros and nonlinear effects of continuously valued covariates. In longitudinal count data there are many problems that can make the use of a zero-inflated Poisson (ZIP) model ineffective. These problems are unobserved heterogeneity and nonlinear effects of continuously valued covariates. Our proposed semiparametric model can simultaneously handle these problems in a unified framework. The framework accounts for heterogeneity by incorporating random effects and has two components. The parametric component of the model which deals with the linear effects of time invariant covariates and the non-parametric component which gives an arbitrary smooth function to model the effect of time or time-varying covariates on the logarithm of mean count. The proposed methods are illustrated by analyzing longitudinal count data on the assessment of an efficacy of pesticides in controlling the reproduction of whitefly.  相似文献   

10.
Clustered interval-censored failure time data occur when the failure times of interest are clustered into small groups and known only to lie in certain intervals. A number of methods have been proposed for regression analysis of clustered failure time data, but most of them apply only to clustered right-censored data. In this paper, a sieve estimation procedure is proposed for fitting a Cox frailty model to clustered interval-censored failure time data. In particular, a two-step algorithm for parameter estimation is developed and the asymptotic properties of the resulting sieve maximum likelihood estimators are established. The finite sample properties of the proposed estimators are investigated through a simulation study and the method is illustrated by the data arising from a lymphatic filariasis study.  相似文献   

11.
Daniel R. Kowal  Bohan Wu 《Biometrics》2023,79(2):1520-1533
‘‘For how many days during the past 30 days was your mental health not good?” The responses to this question measure self-reported mental health and can be linked to important covariates in the National Health and Nutrition Examination Survey (NHANES). However, these count variables present major distributional challenges: The data are overdispersed, zero-inflated, bounded by 30, and heaped in 5- and 7-day increments. To address these challenges—which are especially common for health questionnaire data—we design a semiparametric estimation and inference framework for count data regression. The data-generating process is defined by simultaneously transforming and rounding (star ) a latent Gaussian regression model. The transformation is estimated nonparametrically and the rounding operator ensures the correct support for the discrete and bounded data. Maximum likelihood estimators are computed using an expectation-maximization (EM) algorithm that is compatible with any continuous data model estimable by least squares. star regression includes asymptotic hypothesis testing and confidence intervals, variable selection via information criteria, and customized diagnostics. Simulation studies validate the utility of this framework. Using star regression, we identify key factors associated with self-reported mental health and demonstrate substantial improvements in goodness-of-fit compared to existing count data regression models.  相似文献   

12.
Clinical trials with Poisson distributed count data as the primary outcome are common in various medical areas such as relapse counts in multiple sclerosis trials or the number of attacks in trials for the treatment of migraine. In this article, we present approximate sample size formulae for testing noninferiority using asymptotic tests which are based on restricted or unrestricted maximum likelihood estimators of the Poisson rates. The Poisson outcomes are allowed to be observed for unequal follow‐up schemes, and both the situations that the noninferiority margin is expressed in terms of the difference and the ratio are considered. The exact type I error rates and powers of these tests are evaluated and the accuracy of the approximate sample size formulae is examined. The test statistic using the restricted maximum likelihood estimators (for the difference test problem) and the test statistic that is based on the logarithmic transformation and employs the maximum likelihood estimators (for the ratio test problem) show favorable type I error control and can be recommended for practical application. The approximate sample size formulae show high accuracy even for small sample sizes and provide power values identical or close to the aspired ones. The methods are illustrated by a clinical trial example from anesthesia.  相似文献   

13.
Hall DB 《Biometrics》2000,56(4):1030-1039
In a 1992 Technometrics paper, Lambert (1992, 34, 1-14) described zero-inflated Poisson (ZIP) regression, a class of models for count data with excess zeros. In a ZIP model, a count response variable is assumed to be distributed as a mixture of a Poisson(lambda) distribution and a distribution with point mass of one at zero, with mixing probability p. Both p and lambda are allowed to depend on covariates through canonical link generalized linear models. In this paper, we adapt Lambert's methodology to an upper bounded count situation, thereby obtaining a zero-inflated binomial (ZIB) model. In addition, we add to the flexibility of these fixed effects models by incorporating random effects so that, e.g., the within-subject correlation and between-subject heterogeneity typical of repeated measures data can be accommodated. We motivate, develop, and illustrate the methods described here with an example from horticulture, where both upper bounded count (binomial-type) and unbounded count (Poisson-type) data with excess zeros were collected in a repeated measures designed experiment.  相似文献   

14.
This paper reviews the generalized Poisson regression model, the restricted generalized Poisson regression model and the mixed Poisson regression (negative binomial regression and Poisson inverse Gaussian regression) models which can be used for regression analysis of counts. The aim of this study is to demonstrate the quasi likelihood/moment method, which is used for estimation of the parameters of mixed Poisson regression models, also applicable to obtain the estimates of the parameters of the generalized Poisson regression and the restricted generalized Poisson regression models. Besides, at the end of this study an application related to this method for zoological data is given.  相似文献   

15.
Ghosh S  Gelfand AE  Zhu K  Clark JS 《Biometrics》2012,68(3):878-885
Summary Many applications involve count data from a process that yields an excess number of zeros. Zero-inflated count models, in particular, zero-inflated Poisson (ZIP) and zero-inflated negative binomial (ZINB) models, along with Poisson hurdle models, are commonly used to address this problem. However, these models struggle to explain extreme incidence of zeros (say more than 80%), especially to find important covariates. In fact, the ZIP may struggle even when the proportion is not extreme. To redress this problem we propose the class of k-ZIG models. These models allow more flexible modeling of both the zero-inflation and the nonzero counts, allowing interplay between these two components. We develop the properties of this new class of models, including reparameterization to a natural link function. The models are straightforwardly fitted within a Bayesian framework. The methodology is illustrated with simulated data examples as well as a forest seedling dataset obtained from the USDA Forest Service's Forest Inventory and Analysis program.  相似文献   

16.
This paper discusses a two‐state hidden Markov Poisson regression (MPR) model for analyzing longitudinal data of epileptic seizure counts, which allows for the rate of the Poisson process to depend on covariates through an exponential link function and to change according to the states of a two‐state Markov chain with its transition probabilities associated with covariates through a logit link function. This paper also considers a two‐state hidden Markov negative binomial regression (MNBR) model, as an alternative, by using the negative binomial instead of Poisson distribution in the proposed MPR model when there exists extra‐Poisson variation conditional on the states of the Markov chain. The two proposed models in this paper relax the stationary requirement of the Markov chain, allow for overdispersion relative to the usual Poisson regression model and for correlation between repeated observations. The proposed methodology provides a plausible analysis for the longitudinal data of epileptic seizure counts, and the MNBR model fits the data much better than the MPR model. Maximum likelihood estimation using the EM and quasi‐Newton algorithms is discussed. A Monte Carlo study for the proposed MPR model investigates the reliability of the estimation method, the choice of probabilities for the initial states of the Markov chain, and some finite sample behaviors of the maximum likelihood estimates, suggesting that (1) the estimation method is accurate and reliable as long as the total number of observations is reasonably large, and (2) the choice of probabilities for the initial states of the Markov process has little impact on the parameter estimates.  相似文献   

17.
Zero‐truncated data arises in various disciplines where counts are observed but the zero count category cannot be observed during sampling. Maximum likelihood estimation can be used to model these data; however, due to its nonstandard form it cannot be easily implemented using well‐known software packages, and additional programming is often required. Motivated by the Rao–Blackwell theorem, we develop a weighted partial likelihood approach to estimate model parameters for zero‐truncated binomial and Poisson data. The resulting estimating function is equivalent to a weighted score function for standard count data models, and allows for applying readily available software. We evaluate the efficiency for this new approach and show that it performs almost as well as maximum likelihood estimation. The weighted partial likelihood approach is then extended to regression modelling and variable selection. We examine the performance of the proposed methods through simulation and present two case studies using real data.  相似文献   

18.
Multivariate spatial count data are often segmented by unobserved space-varying factors that vary across space. In this setting, regression models that assume space-constant covariate effects could be too restrictive. Motivated by the analysis of cause-specific mortality data, we propose to estimate space-varying effects by exploiting a multivariate hidden Markov field. It models the data by a battery of Poisson regressions with spatially correlated regression coefficients, which are driven by an unobserved spatial multinomial process. It parsimoniously describes multivariate count data by means of a finite number of latent classes. Parameter estimation is carried out by composite likelihood methods, that we specifically develop for the proposed model. In a case study of cause-specific mortality data in Italy, the model was capable to capture the spatial variation of gender differences and age effects.  相似文献   

19.
Count phenotypes with excessive zeros are often observed in the biological world. Researchers have studied many statistical methods for mapping the quantitative trait loci (QTLs) of zero-inflated count phenotypes. However, most of the existing methods consist of finding the approximate positions of the QTLs on the chromosome by genome-wide scanning. Additionally, most of the existing methods use the EM algorithm for parameter estimation. In this paper, we propose a Bayesian interval mapping scheme of QTLs for zero-inflated count data. The method takes advantage of a zero-inflated generalized Poisson (ZIGP) regression model to study the influence of QTLs on the zero-inflated count phenotype. The MCMC algorithm is used to estimate the effects and position parameters of QTLs. We use the Haldane map function to realize the conversion between recombination rate and map distance. Monte Carlo simulations are conducted to test the applicability and advantage of the proposed method. The effects of QTLs on the formation of mouse cholesterol gallstones were demonstrated by analyzing an mouse data set.  相似文献   

20.
Cui Y  Kim DY  Zhu J 《Genetics》2006,174(4):2159-2172
Statistical methods for mapping quantitative trait loci (QTL) have been extensively studied. While most existing methods assume normal distribution of the phenotype, the normality assumption could be easily violated when phenotypes are measured in counts. One natural choice to deal with count traits is to apply the classical Poisson regression model. However, conditional on covariates, the Poisson assumption of mean-variance equality may not be valid when data are potentially under- or overdispersed. In this article, we propose an interval-mapping approach for phenotypes measured in counts. We model the effects of QTL through a generalized Poisson regression model and develop efficient likelihood-based inference procedures. This approach, implemented with the EM algorithm, allows for a genomewide scan for the existence of QTL throughout the entire genome. The performance of the proposed method is evaluated through extensive simulation studies along with comparisons with existing approaches such as the Poisson regression and the generalized estimating equation approach. An application to a rice tiller number data set is given. Our approach provides a standard procedure for mapping QTL involved in the genetic control of complex traits measured in counts.  相似文献   

设为首页 | 免责声明 | 关于勤云 | 加入收藏

Copyright©北京勤云科技发展有限公司  京ICP备09084417号