Similar literature
1.
2.
3.
4.
Chen B  Zhou XH 《Biometrics》2011,67(3):830-842
Longitudinal studies often feature incomplete response and covariate data. Likelihood-based methods such as the expectation-maximization algorithm give consistent estimators for model parameters when data are missing at random (MAR) provided that the response model and the missing covariate model are correctly specified; however, we do not need to specify the missing data mechanism. An alternative method is the weighted estimating equation, which gives consistent estimators if the missing data and response models are correctly specified; however, we do not need to specify the distribution of the covariates that have missing values. In this article, we develop a doubly robust estimation method for longitudinal data with missing response and missing covariate when data are MAR. This method is appealing in that it can provide consistent estimators if either the missing data model or the missing covariate model is correctly specified. Simulation studies demonstrate that this method performs well in a variety of situations.
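
The double robustness idea in this abstract can be illustrated with a much simpler problem than the longitudinal one the article treats. The sketch below is an assumption-laden toy, not the authors' estimator: it estimates a single mean E[Y] with some responses missing, using the standard augmented inverse-probability-weighted (AIPW) form. The function name `aipw_mean`, the data, and the "right" and "wrong" working models are all hypothetical.

```python
def aipw_mean(y, r, pi, m):
    """Augmented inverse-probability-weighted estimate of E[Y].

    y  : responses (any placeholder where r == 0)
    r  : 1 if y is observed, 0 if missing
    pi : modelled probability that y is observed
    m  : modelled conditional mean of y given the covariates
    """
    total = 0.0
    for yi, ri, pi_i, mi in zip(y, r, pi, m):
        if ri:
            # Observed case: outcome-model prediction plus weighted residual.
            total += mi + (yi - mi) / pi_i
        else:
            # Missing case: only the outcome-model prediction contributes.
            total += mi
    return total / len(y)


# Toy data: the first three units are in stratum "a", the last four in "b".
# None marks a missing response.
y = [1.0, 3.0, None, 2.0, None, None, 6.0]
r = [1, 1, 0, 1, 0, 0, 1]

pi_right = [2 / 3, 2 / 3, 2 / 3, 0.5, 0.5, 0.5, 0.5]  # observed fraction per stratum
m_right = [2.0, 2.0, 2.0, 4.0, 4.0, 4.0, 4.0]         # observed mean per stratum
pi_wrong = [0.9] * 7                                   # deliberately misspecified
m_wrong = [0.0] * 7                                    # deliberately misspecified

both_right = aipw_mean(y, r, pi_right, m_right)
only_m_right = aipw_mean(y, r, pi_wrong, m_right)
only_pi_right = aipw_mean(y, r, pi_right, m_wrong)
both_wrong = aipw_mean(y, r, pi_wrong, m_wrong)
```

In this toy, the estimate is unchanged when either one of the two working models is replaced by a wrong one, and it moves only when both are wrong. The exact agreement is a consequence of the saturated stratum-level models used here; in general, double robustness is a statement about consistency, not finite-sample equality.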

5.
Inference and missing data   Cited by: 85 (self-citations: 0, citations by others: 85)
RUBIN  DONALD B. 《Biometrika》1976,63(3):581-592

6.
Imputation, weighting, direct likelihood, and direct Bayesian inference (Rubin, 1976) are important approaches for missing data regression. Many useful semiparametric estimators have been developed for regression analysis of data with missing covariates or outcomes. It has been established that some semiparametric estimators are asymptotically equivalent, but it has not been shown that many are numerically the same. We applied some existing methods to a bladder cancer case-control study and noted that they were the same numerically when the observed covariates and outcomes are categorical. To understand the analytical background of this finding, we further show that when observed covariates and outcomes are categorical, some estimators are not only asymptotically equivalent but also actually numerically identical. That is, although their estimating equations are different, they lead numerically to exactly the same root. This includes a simple weighted estimator, an augmented weighted estimator, and a mean-score estimator. The numerical equivalence may elucidate the relationship between imputing scores and weighted estimation procedures.
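
The kind of numerical identity described here can be seen in a simplified setting. The sketch below is an illustration under stated assumptions, not the article's regression setting: it estimates a mean E[X] when X is observed only for some units and an always-observed categorical variable Z is available. The function name `weighted_and_mean_score` and the data are hypothetical, and every Z stratum is assumed to contain at least one observed X.

```python
from collections import defaultdict


def weighted_and_mean_score(z, x, r):
    """Return (weighted, mean_score) estimates of E[X].

    z : always-observed categorical variable
    x : value of interest (ignored where r == 0)
    r : 1 if x is observed, 0 if missing
    """
    n = len(z)
    n_all = defaultdict(int)      # units per stratum
    n_obs = defaultdict(int)      # units with x observed per stratum
    sum_obs = defaultdict(float)  # sum of observed x per stratum
    for zi, xi, ri in zip(z, x, r):
        n_all[zi] += 1
        if ri:
            n_obs[zi] += 1
            sum_obs[zi] += xi

    # Simple weighted estimator: each observed x is weighted by the inverse
    # of the empirical selection probability n_obs / n_all in its stratum.
    weighted = sum(
        xi * n_all[zi] / n_obs[zi] for zi, xi, ri in zip(z, x, r) if ri
    ) / n

    # Mean-score-style estimator: keep observed x and replace each missing x
    # by the mean of the observed x in the same stratum.
    mean_score = sum(
        xi if ri else sum_obs[zi] / n_obs[zi] for zi, xi, ri in zip(z, x, r)
    ) / n
    return weighted, mean_score


z = ["a", "a", "a", "b", "b", "b", "b"]
x = [1.0, 3.0, None, 2.0, None, None, 6.0]
r = [1, 1, 0, 1, 0, 0, 1]
w, m = weighted_and_mean_score(z, x, r)
```

Both expressions reduce to the same stratum-size-weighted average of the observed stratum means, so `w` and `m` agree up to floating-point rounding even though they are computed by different formulas.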

7.
On using the Cox proportional hazards model with missing covariates   Cited by: 1 (self-citations: 0, citations by others: 1)

8.
9.
Methods in the literature for missing covariate data in survival models have relied on the missing at random (MAR) assumption to render regression parameters identifiable. MAR means that missingness can depend on the observed exit time, and whether or not that exit is a failure or a censoring event. By considering ways in which missingness of covariate X could depend on the true but possibly censored failure time T and the true censoring time C, we attempt to identify missingness mechanisms which would yield MAR data. We find that, under various reasonable assumptions about how missingness might depend on T and/or C, additional strong assumptions are needed to obtain MAR. We conclude that MAR is difficult to justify in practical applications. One exception arises when missingness is independent of T, and C is independent of the value of the missing X. As alternatives to MAR, we propose two new missingness assumptions. In one, the missingness depends on T but not on C; in the other, the situation is reversed. For each, we show that the failure time model is identifiable. When missingness is independent of T, we show that the naive complete record analysis will yield a consistent estimator of the failure time distribution. When missingness is independent of C, we develop a complete record likelihood function and a corresponding estimator for parametric failure time models. We propose analyses to evaluate the plausibility of either assumption in a particular data set, and illustrate the ideas using data from the literature on this problem.

10.
Wang CY  Huang WT 《Biometrics》2000,56(1):98-105
We consider estimation in logistic regression where some covariate variables may be missing at random. Satten and Kupper (1993, Journal of the American Statistical Association 88, 200-208) proposed estimating odds ratio parameters using methods based on the probability of exposure. By approximating a partial likelihood, we extend their idea and propose a method that estimates the cumulant-generating function of the missing covariate given observed covariates and surrogates in the controls. Our proposed method first estimates some lower order cumulants of the conditional distribution of the unobserved data and then solves a resulting estimating equation for the logistic regression parameter. A simple version of the proposed method is to replace a missing covariate by the summation of its conditional mean and conditional variance given observed data in the controls. We note that one important property of the proposed method is that, when the validation is only on controls, a class of inverse selection probability weighted semiparametric estimators cannot be applied because selection probabilities on cases are zeroes. The proposed estimator performs well unless the relative risk parameters are large, even though it is technically inconsistent. Small-sample simulations are conducted. We illustrate the method by an example of real data analysis.

11.
Ibrahim JG  Chen MH  Lipsitz SR 《Biometrics》1999,55(2):591-596
We propose a method for estimating parameters for general parametric regression models with an arbitrary number of missing covariates. We allow any pattern of missing data and assume that the missing data mechanism is ignorable throughout. When the missing covariates are categorical, a useful technique for obtaining parameter estimates is the EM algorithm by the method of weights proposed in Ibrahim (1990, Journal of the American Statistical Association 85, 765-769). We extend this method to continuous or mixed categorical and continuous covariates, and for arbitrary parametric regression models, by adapting a Monte Carlo version of the EM algorithm as discussed by Wei and Tanner (1990, Journal of the American Statistical Association 85, 699-704). In addition, we discuss the Gibbs sampler for sampling from the conditional distribution of the missing covariates given the observed data and show that the appropriate complete conditionals are log-concave. The log-concavity property of the conditional distributions will facilitate a straightforward implementation of the Gibbs sampler via the adaptive rejection algorithm of Gilks and Wild (1992, Applied Statistics 41, 337-348). We assume the model for the response given the covariates is an arbitrary parametric regression model, such as a generalized linear model, a parametric survival model, or a nonlinear model. We model the marginal distribution of the covariates as a product of one-dimensional conditional distributions. This allows us a great deal of flexibility in modeling the distribution of the covariates and reduces the number of nuisance parameters that are introduced in the E-step. We present examples involving both simulated and real data.
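
For the categorical case mentioned above, the weighting idea can be sketched in a deliberately small model. The code below is a toy under stated assumptions, not the article's general method: one binary covariate X with P(X = 1) = q, a binary response with P(Y = 1 | X = x) = p[x], and an ignorable missingness mechanism. The function name `em_method_of_weights`, the starting values, and the data are hypothetical. Each record with a missing X is treated as two weighted pseudo-records, one per possible value of X, with weights equal to the current conditional probabilities of X given Y.

```python
import math


def em_method_of_weights(data, n_iter=50):
    """EM for a toy model with a binary, sometimes-missing covariate.

    data : list of (x, y) pairs, x in {0, 1, None}, y in {0, 1}
    Returns (q, p, history), where history holds the observed-data
    log-likelihood evaluated at the parameters entering each iteration.
    """
    q, p = 0.5, [0.4, 0.6]  # arbitrary starting values (assumption)
    history = []
    for _ in range(n_iter):
        prior = [1.0 - q, q]
        loglik = 0.0
        n_x = [0.0, 0.0]   # weighted count of records with X = k
        n_xy = [0.0, 0.0]  # weighted count of records with X = k and Y = 1
        for x, y in data:
            # Joint probability of (X = k, Y = y) under current parameters.
            joint = [
                prior[k] * (p[k] if y == 1 else 1.0 - p[k]) for k in (0, 1)
            ]
            if x is None:
                # E-step: weights are P(X = k | Y = y) for the missing X.
                total = joint[0] + joint[1]
                w = [joint[0] / total, joint[1] / total]
                loglik += math.log(total)
            else:
                # Complete record: all weight on the observed value.
                w = [0.0, 0.0]
                w[x] = 1.0
                loglik += math.log(joint[x])
            for k in (0, 1):
                n_x[k] += w[k]
                n_xy[k] += w[k] * y
        history.append(loglik)
        # M-step: weighted maximum likelihood updates.
        q = n_x[1] / len(data)
        p = [n_xy[0] / n_x[0], n_xy[1] / n_x[1]]
    return q, p, history


data = [
    (1, 1), (1, 1), (1, 0), (0, 0), (0, 0), (0, 1),
    (None, 1), (None, 0), (None, 1),
]
q_hat, p_hat, history = em_method_of_weights(data)
```

The recorded log-likelihood values should never decrease from one iteration to the next, which is the usual sanity check for an EM implementation. Extending this to continuous covariates, as the abstract describes, requires replacing the exact weights with Monte Carlo draws, which this sketch does not attempt.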

12.
13.
Unlike zero‐inflated Poisson regression, marginalized zero‐inflated Poisson (MZIP) models for counts with excess zeros provide estimates with direct interpretations for the overall effects of covariates on the marginal mean. In the presence of missing covariates, MZIP and many other count data models are ordinarily fitted using complete case analysis methods due to lack of appropriate statistical methods and software. This article presents an estimation method for MZIP models with missing covariates. The method, which is applicable to other missing data problems, is illustrated and compared with complete case analysis by using simulations and dental data on the caries preventive effects of a school‐based fluoride mouthrinse program.

14.
We present a method for estimating the parameters in random effects models for survival data when covariates are subject to missingness. Our method is more general than the usual frailty model as it accommodates a wide range of distributions for the random effects, which are included as an offset in the linear predictor in a manner analogous to that used in generalized linear mixed models. We propose using a Monte Carlo EM algorithm along with the Gibbs sampler to obtain parameter estimates. This method is useful in reducing the bias that may be incurred using complete-case methods in this setting. The methodology is applied to data from Eastern Cooperative Oncology Group melanoma clinical trials in which observations were believed to be clustered and several tumor characteristics were not always observed.

15.
This work develops a joint model selection criterion for simultaneously selecting the marginal mean regression and the correlation/covariance structure in longitudinal data analysis where both the outcome and the covariate variables may be subject to general intermittent patterns of missingness under the missing at random mechanism. The new proposal, termed “joint longitudinal information criterion” (JLIC), is based on the expected quadratic error for assessing model adequacy, and the second‐order weighted generalized estimating equation (WGEE) estimation for mean and covariance models. Simulation results reveal that JLIC outperforms existing methods performing model selection for the mean regression and the correlation structure in a two stage and hence separate manner. We apply the proposal to a longitudinal study to identify factors associated with life satisfaction in the elderly of Taiwan.

16.
17.
18.
Maximum likelihood methods for cure rate models with missing covariates   Cited by: 1 (self-citations: 0, citations by others: 1)
Chen MH  Ibrahim JG 《Biometrics》2001,57(1):43-52
We propose maximum likelihood methods for parameter estimation for a novel class of semiparametric survival models with a cure fraction, in which the covariates are allowed to be missing. We allow the covariates to be either categorical or continuous and specify a parametric distribution for the covariates that is written as a sequence of one-dimensional conditional distributions. We propose a novel EM algorithm for maximum likelihood estimation and derive standard errors by using Louis's formula (Louis, 1982, Journal of the Royal Statistical Society, Series B 44, 226-233). Computational techniques using the Monte Carlo EM algorithm are discussed and implemented. A real data set involving a melanoma cancer clinical trial is examined in detail to demonstrate the methodology.

19.
Huang L  Chen MH  Ibrahim JG 《Biometrics》2005,61(3):767-780
We propose Bayesian methods for estimating parameters in generalized linear models (GLMs) with nonignorably missing covariate data. We show that when improper uniform priors are used for the regression coefficients, phi, of the multinomial selection model for the missing data mechanism, the resulting joint posterior will always be improper if (i) all missing covariates are discrete and an intercept is included in the selection model for the missing data mechanism, or (ii) at least one of the covariates is continuous and unbounded. This impropriety will result regardless of whether proper or improper priors are specified for the regression parameters, beta, of the GLM or the parameters, alpha, of the covariate distribution. To overcome this problem, we propose a novel class of proper priors for the regression coefficients, phi, in the selection model for the missing data mechanism. These priors are robust and computationally attractive in the sense that inferences about beta are not sensitive to the choice of the hyperparameters of the prior for phi and they facilitate a Gibbs sampling scheme that leads to accelerated convergence. In addition, we extend the model assessment criterion of Chen, Dey, and Ibrahim (2004a, Biometrika 91, 45-63), called the weighted L measure, to GLMs and missing data problems as well as extend the deviance information criterion (DIC) of Spiegelhalter et al. (2002, Journal of the Royal Statistical Society B 64, 583-639) for assessing whether the missing data mechanism is ignorable or nonignorable. A novel Markov chain Monte Carlo sampling algorithm is also developed for carrying out posterior computation. Several simulations are given to investigate the performance of the proposed Bayesian criteria as well as the sensitivity of the prior specification. Real datasets from a melanoma cancer clinical trial and a liver cancer study are presented to further illustrate the proposed methods.

20.
Lee SM  Gee MJ  Hsieh SH 《Biometrics》2011,67(3):788-798
We consider the estimation problem of a proportional odds model with missing covariates. Based on the validation and nonvalidation data sets, we propose a joint conditional method that is an extension of Wang et al. (2002, Statistica Sinica 12, 555–574). The proposed method is semiparametric since it requires neither an additional model for the missingness mechanism, nor the specification of the conditional distribution of missing covariates given observed variables. Under the assumption that the observed covariates and the surrogate variable are categorical, we derived the large sample property. The simulation studies show that in various situations, the joint conditional method is more efficient than the conditional estimation method and weighted method. We also use a real data set that came from a survey of cable TV satisfaction to illustrate the approaches.
