首页 | 本学科首页   官方微博 | 高级检索  
相似文献
 共查询到20条相似文献,搜索用时 437 毫秒
1.
Lu Z  Hui YV  Lee AH 《Biometrics》2003,59(4):1016-1026
Minimum Hellinger distance estimation (MHDE) has been shown to discount anomalous data points in a smooth manner with first-order efficiency for a correctly specified model. An estimation approach is proposed for finite mixtures of Poisson regression models based on MHDE. Evidence from Monte Carlo simulations suggests that MHDE is a viable alternative to the maximum likelihood estimator when the mixture components are not well separated or the model parameters are near zero. Biometrical applications also illustrate the practical usefulness of the MHDE method.  相似文献   

2.
Roy J  Daniels MJ 《Biometrics》2008,64(2):538-545
Summary .   In this article we consider the problem of fitting pattern mixture models to longitudinal data when there are many unique dropout times. We propose a marginally specified latent class pattern mixture model. The marginal mean is assumed to follow a generalized linear model, whereas the mean conditional on the latent class and random effects is specified separately. Because the dimension of the parameter vector of interest (the marginal regression coefficients) does not depend on the assumed number of latent classes, we propose to treat the number of latent classes as a random variable. We specify a prior distribution for the number of classes, and calculate (approximate) posterior model probabilities. In order to avoid the complications with implementing a fully Bayesian model, we propose a simple approximation to these posterior probabilities. The ideas are illustrated using data from a longitudinal study of depression in HIV-infected women.  相似文献   

3.
In many biometrical applications, the count data encountered often contain extra zeros relative to the Poisson distribution. Zero‐inflated Poisson regression models are useful for analyzing such data, but parameter estimates may be seriously biased if the nonzero observations are over‐dispersed and simultaneously correlated due to the sampling design or the data collection procedure. In this paper, a zero‐inflated negative binomial mixed regression model is presented to analyze a set of pancreas disorder length of stay (LOS) data that comprised mainly same‐day separations. Random effects are introduced to account for inter‐hospital variations and the dependency of clustered LOS observations. Parameter estimation is achieved by maximizing an appropriate log‐likelihood function using an EM algorithm. Alternative modeling strategies, namely the finite mixture of Poisson distributions and the non‐parametric maximum likelihood approach, are also considered. The determination of pertinent covariates would assist hospital administrators and clinicians to manage LOS and expenditures efficiently.  相似文献   

4.
Nielsen JD  Dean CB 《Biometrics》2008,64(3):751-761
Summary .   A flexible semiparametric model for analyzing longitudinal panel count data arising from mixtures is presented. Panel count data refers here to count data on recurrent events collected as the number of events that have occurred within specific follow-up periods. The model assumes that the counts for each subject are generated by mixtures of nonhomogeneous Poisson processes with smooth intensity functions modeled with penalized splines. Time-dependent covariate effects are also incorporated into the process intensity using splines. Discrete mixtures of these nonhomogeneous Poisson process spline models extract functional information from underlying clusters representing hidden subpopulations. The motivating application is an experiment to test the effectiveness of pheromones in disrupting the mating pattern of the cherry bark tortrix moth. Mature moths arise from hidden, but distinct, subpopulations and monitoring the subpopulation responses was of interest. Within-cluster random effects are used to account for correlation structures and heterogeneity common to this type of data. An estimating equation approach to inference requiring only low moment assumptions is developed and the finite sample properties of the proposed estimating functions are investigated empirically by simulation.  相似文献   

5.
Yau KK 《Biometrics》2001,57(1):96-102
A method for modeling survival data with multilevel clustering is described. The Cox partial likelihood is incorporated into the generalized linear mixed model (GLMM) methodology. Parameter estimation is achieved by maximizing a log likelihood analogous to the likelihood associated with the best linear unbiased prediction (BLUP) at the initial step of estimation and is extended to obtain residual maximum likelihood (REML) estimators of the variance component. Estimating equations for a three-level hierarchical survival model are developed in detail, and such a model is applied to analyze a set of chronic granulomatous disease (CGD) data on recurrent infections as an illustration with both hospital and patient effects being considered as random. Only the latter gives a significant contribution. A simulation study is carried out to evaluate the performance of the REML estimators. Further extension of the estimation procedure to models with an arbitrary number of levels is also discussed.  相似文献   

6.
Shrinkage Estimators for Covariance Matrices   总被引:1,自引:0,他引:1  
Estimation of covariance matrices in small samples has been studied by many authors. Standard estimators, like the unstructured maximum likelihood estimator (ML) or restricted maximum likelihood (REML) estimator, can be very unstable with the smallest estimated eigenvalues being too small and the largest too big. A standard approach to more stably estimating the matrix in small samples is to compute the ML or REML estimator under some simple structure that involves estimation of fewer parameters, such as compound symmetry or independence. However, these estimators will not be consistent unless the hypothesized structure is correct. If interest focuses on estimation of regression coefficients with correlated (or longitudinal) data, a sandwich estimator of the covariance matrix may be used to provide standard errors for the estimated coefficients that are robust in the sense that they remain consistent under misspecification of the covariance structure. With large matrices, however, the inefficiency of the sandwich estimator becomes worrisome. We consider here two general shrinkage approaches to estimating the covariance matrix and regression coefficients. The first involves shrinking the eigenvalues of the unstructured ML or REML estimator. The second involves shrinking an unstructured estimator toward a structured estimator. For both cases, the data determine the amount of shrinkage. These estimators are consistent and give consistent and asymptotically efficient estimates for regression coefficients. Simulations show the improved operating characteristics of the shrinkage estimators of the covariance matrix and the regression coefficients in finite samples. The final estimator chosen includes a combination of both shrinkage approaches, i.e., shrinking the eigenvalues and then shrinking toward structure. We illustrate our approach on a sleep EEG study that requires estimation of a 24 x 24 covariance matrix and for which inferences on mean parameters critically depend on the covariance estimator chosen. We recommend making inference using a particular shrinkage estimator that provides a reasonable compromise between structured and unstructured estimators.  相似文献   

7.
Unlike zero‐inflated Poisson regression, marginalized zero‐inflated Poisson (MZIP) models for counts with excess zeros provide estimates with direct interpretations for the overall effects of covariates on the marginal mean. In the presence of missing covariates, MZIP and many other count data models are ordinarily fitted using complete case analysis methods due to lack of appropriate statistical methods and software. This article presents an estimation method for MZIP models with missing covariates. The method, which is applicable to other missing data problems, is illustrated and compared with complete case analysis by using simulations and dental data on the caries preventive effects of a school‐based fluoride mouthrinse program.  相似文献   

8.
Association Models for Clustered Data with Binary and Continuous Responses   总被引:1,自引:0,他引:1  
Summary .  We consider analysis of clustered data with mixed bivariate responses, i.e., where each member of the cluster has a binary and a continuous outcome. We propose a new bivariate random effects model that induces associations among the binary outcomes within a cluster, among the continuous outcomes within a cluster, between a binary outcome and a continuous outcome from different subjects within a cluster, as well as the direct association between the binary and continuous outcomes within the same subject. For the ease of interpretations of the regression effects, the marginal model of the binary response probability integrated over the random effects preserves the logistic form and the marginal expectation of the continuous response preserves the linear form. We implement maximum likelihood estimation of our model parameters using standard software such as PROC NLMIXED of SAS . Our simulation study demonstrates the robustness of our method with respect to the misspecification of the regression model as well as the random effects model. We illustrate our methodology by analyzing a developmental toxicity study of ethylene glycol in mice.  相似文献   

9.
Bayesian hierarchical models usually model the risk surface on the same arbitrary geographical units for all data sources. Poisson/gamma random field models overcome this restriction as the underlying risk surface can be specified independently to the resolution of the data. Moreover, covariates may be considered as either excess or relative risk factors. We compare the performance of the Poisson/gamma random field model to the Markov random field (MRF)‐based ecologic regression model and the Bayesian Detection of Clusters and Discontinuities (BDCD) model, in both a simulation study and a real data example. We find the BDCD model to have advantages in situations dominated by abruptly changing risk while the Poisson/gamma random field model convinces by its flexibility in the estimation of random field structures and by its flexibility incorporating covariates. The MRF‐based ecologic regression model is inferior. WinBUGS code for Poisson/gamma random field models is provided.  相似文献   

10.
Summary .   Motivated by the spatial modeling of aberrant crypt foci (ACF) in colon carcinogenesis, we consider binary data with probabilities modeled as the sum of a nonparametric mean plus a latent Gaussian spatial process that accounts for short-range dependencies. The mean is modeled in a general way using regression splines. The mean function can be viewed as a fixed effect and is estimated with a penalty for regularization. With the latent process viewed as another random effect, the model becomes a generalized linear mixed model. In our motivating data set and other applications, the sample size is too large to easily accommodate maximum likelihood or restricted maximum likelihood estimation (REML), so pairwise likelihood, a special case of composite likelihood, is used instead. We develop an asymptotic theory for models that are sufficiently general to be used in a wide variety of applications, including, but not limited to, the problem that motivated this work. The splines have penalty parameters that must converge to zero asymptotically: we derive theory for this along with a data-driven method for selecting the penalty parameter, a method that is shown in simulations to improve greatly upon standard devices, such as likelihood crossvalidation. Finally, we apply the methods to the data from our experiment ACF. We discover an unexpected location for peak formation of ACF.  相似文献   

11.
A mixed-model procedure for analysis of censored data assuming a multivariate normal distribution is described. A Bayesian framework is adopted which allows for estimation of fixed effects and variance components and prediction of random effects when records are left-censored. The procedure can be extended to right- and two-tailed censoring. The model employed is a generalized linear model, and the estimation equations resemble those arising in analysis of multivariate normal or categorical data with threshold models. Estimates of variance components are obtained using expressions similar to those employed in the EM algorithm for restricted maximum likelihood (REML) estimation under normality.  相似文献   

12.
Lou XY  Yang MC 《Genetica》2006,128(1-3):471-484
A genetic model is developed with additive and dominance effects of a single gene and polygenes as well as general and specific reciprocal effects for the progeny from a diallel mating design. The methods of ANOVA, minimum norm quadratic unbiased estimation (MINQUE), restricted maximum likelihood estimation (REML), and maximum likelihood estimation (ML) are suggested for estimating variance components, and the methods of generalized least squares (GLS) and ordinary least squares (OLS) for fixed effects, while best linear unbiased prediction, linear unbiased prediction (LUP), and adjusted unbiased prediction are suggested for analyzing random effects. Monte Carlo simulations were conducted to evaluate the unbiasedness and efficiency of statistical methods involving two diallel designs with commonly used sample sizes, 6 and 8 parents, with no and missing crosses, respectively. Simulation results show that GLS and OLS are almost equally efficient for estimation of fixed effects, while MINQUE (1) and REML are better estimators of the variance components and LUP is most practical method for prediction of random effects. Data from a Drosophila melanogaster experiment (Gilbert 1985a, Theor appl Genet 69:625–629) were used as a working example to demonstrate the statistical analysis. The new methodology is also applicable to screening candidate gene(s) and to other mating designs with multiple parents, such as nested (NC Design I) and factorial (NC Design II) designs. Moreover, this methodology can serve as a guide to develop new methods for detecting indiscernible major genes and mapping quantitative trait loci based on mixture distribution theory. The computer program for the methods suggested in this article is freely available from the authors.  相似文献   

13.
This paper extends the multilevel survival model by allowing the existence of cured fraction in the model. Random effects induced by the multilevel clustering structure are specified in the linear predictors in both hazard function and cured probability parts. Adopting the generalized linear mixed model (GLMM) approach to formulate the problem, parameter estimation is achieved by maximizing a best linear unbiased prediction (BLUP) type log‐likelihood at the initial step of estimation, and is then extended to obtain residual maximum likelihood (REML) estimators of the variance component. The proposed multilevel mixture cure model is applied to analyze the (i) child survival study data with multilevel clustering and (ii) chronic granulomatous disease (CGD) data on recurrent infections as illustrations. A simulation study is carried out to evaluate the performance of the REML estimators and assess the accuracy of the standard error estimates.  相似文献   

14.
Clegg LX  Cai J  Sen PK 《Biometrics》1999,55(3):805-812
In multivariate failure time data analysis, a marginal regression modeling approach is often preferred to avoid assumptions on the dependence structure among correlated failure times. In this paper, a marginal mixed baseline hazards model is introduced. Estimating equations are proposed for the estimation of the marginal hazard ratio parameters. The proposed estimators are shown to be consistent and asymptotically Gaussian with a robust covariance matrix that can be consistently estimated. Simulation studies indicate the adequacy of the proposed methodology for practical sample sizes. The methodology is illustrated with a data set from the Framingham Heart Study.  相似文献   

15.
Maternity length of stay (LOS) is an important measure of hospital activity, but its empirical distribution is often positively skewed. A two-component gamma mixture regression model has been proposed to analyze the heterogeneous maternity LOS. The problem is that observations collected from the same hospital are often correlated, which can lead to spurious associations and misleading inferences. To account for the inherent correlation, random effects are incorporated within the linear predictors of the two-component gamma mixture regression model. An EM algorithm is developed for the residual maximum quasi-likelihood estimation of the regression coefficients and variance component parameters. The approach enables the correct identification and assessment of risk factors affecting the short-stay and long-stay patient subgroups. In addition, the predicted random effects can provide information on the inter-hospital variations after adjustment for patient characteristics and health provision factors. A simulation study shows that the estimators obtained via the EM algorithm perform well in all the settings considered. Application to a set of maternity LOS data for women having obstetrical delivery with multiple complicating diagnoses is illustrated.  相似文献   

16.
Count data often exhibit more zeros than predicted by common count distributions like the Poisson or negative binomial. In recent years, there has been considerable interest in methods for analyzing zero-inflated count data in longitudinal or other correlated data settings. A common approach has been to extend zero-inflated Poisson models to include random effects that account for correlation among observations. However, these models have been shown to have a few drawbacks, including interpretability of regression coefficients and numerical instability of fitting algorithms even when the data arise from the assumed model. To address these issues, we propose a model that parameterizes the marginal associations between the count outcome and the covariates as easily interpretable log relative rates, while including random effects to account for correlation among observations. One of the main advantages of this marginal model is that it allows a basis upon which we can directly compare the performance of standard methods that ignore zero inflation with that of a method that explicitly takes zero inflation into account. We present simulations of these various model formulations in terms of bias and variance estimation. Finally, we apply the proposed approach to analyze toxicological data of the effect of emissions on cardiac arrhythmias.  相似文献   

17.
Semiparametric analysis of zero-inflated count data   总被引:1,自引:0,他引:1  
Lam KF  Xue H  Cheung YB 《Biometrics》2006,62(4):996-1003
Medical and public health research often involve the analysis of count data that exhibit a substantially large proportion of zeros, such as the number of heart attacks and the number of days of missed primary activities in a given period. A zero-inflated Poisson regression model, which hypothesizes a two-point heterogeneity in the population characterized by a binary random effect, is generally used to model such data. Subjects are broadly categorized into the low-risk group leading to structural zero counts and high-risk (or normal) group so that the counts can be modeled by a Poisson regression model. The main aim is to identify the explanatory variables that have significant effects on (i) the probability that the subject is from the low-risk group by means of a logistic regression formulation; and (ii) the magnitude of the counts, given that the subject is from the high-risk group by means of a Poisson regression where the effects of the covariates are assumed to be linearly related to the natural logarithm of the mean of the counts. In this article we consider a semiparametric zero-inflated Poisson regression model that postulates a possibly nonlinear relationship between the natural logarithm of the mean of the counts and a particular covariate. A sieve maximum likelihood estimation method is proposed. Asymptotic properties of the proposed sieve maximum likelihood estimators are discussed. Under some mild conditions, the estimators are shown to be asymptotically efficient and normally distributed. Simulation studies were carried out to investigate the performance of the proposed method. For illustration purpose, the method is applied to a data set from a public health survey conducted in Indonesia where the variable of interest is the number of days of missed primary activities due to illness in a 4-week period.  相似文献   

18.
Hogan JW  Lin X  Herman B 《Biometrics》2004,60(4):854-864
The analysis of longitudinal repeated measures data is frequently complicated by missing data due to informative dropout. We describe a mixture model for joint distribution for longitudinal repeated measures, where the dropout distribution may be continuous and the dependence between response and dropout is semiparametric. Specifically, we assume that responses follow a varying coefficient random effects model conditional on dropout time, where the regression coefficients depend on dropout time through unspecified nonparametric functions that are estimated using step functions when dropout time is discrete (e.g., for panel data) and using smoothing splines when dropout time is continuous. Inference under the proposed semiparametric model is hence more robust than the parametric conditional linear model. The unconditional distribution of the repeated measures is a mixture over the dropout distribution. We show that estimation in the semiparametric varying coefficient mixture model can proceed by fitting a parametric mixed effects model and can be carried out on standard software platforms such as SAS. The model is used to analyze data from a recent AIDS clinical trial and its performance is evaluated using simulations.  相似文献   

19.
Diallel analysis for sex-linked and maternal effects   总被引:40,自引:0,他引:40  
Genetic models including sex-linked and maternal effects as well as autosomal gene effects are described. Monte Carlo simulations were conducted to compare efficiencies of estimation by minimum norm quadratic unbiased estimation (MINQUE) and restricted maximum likelihood (REML) methods. MINQUE(1), which has 1 for all prior values, has a similar efficiency to MINQUE(), which requires prior estimates of parameter values. MINQUE(1) has the advantage over REML of unbiased estimation and convenient computation. An adjusted unbiased prediction (AUP) method is developed for predicting random genetic effects. AUP is desirable for its easy computation and unbiasedness of both mean and variance of predictors. The jackknife procedure is appropriate for estimating the sampling variances of estimated variances (or covariances) and of predicted genetic effects. A t-test based on jackknife variances is applicable for detecting significance of variation. Worked examples from mice and silkworm data are given in order to demonstrate variance and covariance estimation and genetic effect prediction.  相似文献   

20.
Ying Yuan  Guosheng Yin 《Biometrics》2010,66(1):105-114
Summary .  We study quantile regression (QR) for longitudinal measurements with nonignorable intermittent missing data and dropout. Compared to conventional mean regression, quantile regression can characterize the entire conditional distribution of the outcome variable, and is more robust to outliers and misspecification of the error distribution. We account for the within-subject correlation by introducing a   ℓ2   penalty in the usual QR check function to shrink the subject-specific intercepts and slopes toward the common population values. The informative missing data are assumed to be related to the longitudinal outcome process through the shared latent random effects. We assess the performance of the proposed method using simulation studies, and illustrate it with data from a pediatric AIDS clinical trial.  相似文献   

设为首页 | 免责声明 | 关于勤云 | 加入收藏

Copyright©北京勤云科技发展有限公司  京ICP备09084417号