Similar literature
20 similar articles found.
1.
Stubbendick AL  Ibrahim JG 《Biometrics》2003,59(4):1140-1150
This article analyzes quality of life (QOL) data from an Eastern Cooperative Oncology Group (ECOG) melanoma trial that compared treatment with ganglioside vaccination to treatment with high-dose interferon. The analysis of this data set is challenging due to several difficulties, namely, nonignorable missing longitudinal responses and baseline covariates. Hence, we propose a selection model for estimating parameters in the normal random effects model with nonignorable missing responses and covariates. Parameters are estimated via maximum likelihood using the Gibbs sampler and a Monte Carlo expectation maximization (EM) algorithm. Standard errors are calculated using the bootstrap. The method allows for nonmonotone patterns of missing data in both the response variable and the covariates. We model the missing data mechanism and the missing covariate distribution via a sequence of one-dimensional conditional distributions, allowing the missing covariates to be either categorical or continuous, as well as time-varying. We apply the proposed approach to the ECOG quality-of-life data and conduct a small simulation study evaluating the performance of the maximum likelihood estimates. Our results indicate that a patient treated with the vaccine has a higher QOL score on average at a given time point than a patient treated with high-dose interferon.

2.
Ibrahim JG  Chen MH  Lipsitz SR 《Biometrics》1999,55(2):591-596
We propose a method for estimating parameters for general parametric regression models with an arbitrary number of missing covariates. We allow any pattern of missing data and assume that the missing data mechanism is ignorable throughout. When the missing covariates are categorical, a useful technique for obtaining parameter estimates is the EM algorithm by the method of weights proposed in Ibrahim (1990, Journal of the American Statistical Association 85, 765-769). We extend this method to continuous or mixed categorical and continuous covariates, and for arbitrary parametric regression models, by adapting a Monte Carlo version of the EM algorithm as discussed by Wei and Tanner (1990, Journal of the American Statistical Association 85, 699-704). In addition, we discuss the Gibbs sampler for sampling from the conditional distribution of the missing covariates given the observed data and show that the appropriate complete conditionals are log-concave. The log-concavity property of the conditional distributions will facilitate a straightforward implementation of the Gibbs sampler via the adaptive rejection algorithm of Gilks and Wild (1992, Applied Statistics 41, 337-348). We assume the model for the response given the covariates is an arbitrary parametric regression model, such as a generalized linear model, a parametric survival model, or a nonlinear model. We model the marginal distribution of the covariates as a product of one-dimensional conditional distributions. This allows us a great deal of flexibility in modeling the distribution of the covariates and reduces the number of nuisance parameters that are introduced in the E-step. We present examples involving both simulated and real data.
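
The categorical case mentioned in this abstract (EM by the method of weights) can be sketched in a deliberately simplified setting. The sketch below is an illustration under stated assumptions, not the authors' implementation: it assumes a logistic regression with a single binary covariate, a Bernoulli marginal for that covariate, and an ignorable missingness mechanism. The function names `fit_weighted_logistic` and `em_method_of_weights` are hypothetical. Each subject with a missing covariate is expanded into one row per possible value, and the E-step weights are the conditional probabilities of those values given the response.

```python
import numpy as np

def fit_weighted_logistic(X, y, w, n_iter=25):
    """Weighted logistic regression fitted by Newton-Raphson."""
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        mu = 1.0 / (1.0 + np.exp(-X @ beta))
        grad = X.T @ (w * (y - mu))
        hess = (X * (w * mu * (1.0 - mu))[:, None]).T @ X
        beta = beta + np.linalg.solve(hess, grad)
    return beta

def em_method_of_weights(y, x, n_em=50):
    """EM for logit P(y=1|x) = b0 + b1*x with x ~ Bernoulli(p).

    x is a binary covariate coded 0/1, with np.nan marking missing values.
    Returns the estimated (b0, b1) and p.
    """
    miss = np.isnan(x)
    n_obs, n_mis = int((~miss).sum()), int(miss.sum())
    y_mis = y[miss]
    # Augmented data: observed rows once, missing rows twice (x=0, then x=1).
    y_aug = np.concatenate([y[~miss], y_mis, y_mis])
    x_aug = np.concatenate([x[~miss], np.zeros(n_mis), np.ones(n_mis)])
    X_aug = np.column_stack([np.ones_like(x_aug), x_aug])

    beta = np.zeros(2)
    p = x[~miss].mean()
    for _ in range(n_em):
        # E-step: weight of x=1 for each missing row, proportional to
        # P(y | x, beta) * P(x | p).
        mu0 = 1.0 / (1.0 + np.exp(-beta[0]))
        mu1 = 1.0 / (1.0 + np.exp(-(beta[0] + beta[1])))
        lik0 = np.where(y_mis == 1, mu0, 1.0 - mu0) * (1.0 - p)
        lik1 = np.where(y_mis == 1, mu1, 1.0 - mu1) * p
        w1 = lik1 / (lik0 + lik1)
        w = np.concatenate([np.ones(n_obs), 1.0 - w1, w1])
        # M-step: weighted fits for the regression and covariate models.
        beta = fit_weighted_logistic(X_aug, y_aug, w)
        p = (x[~miss].sum() + w1.sum()) / len(y)
    return beta, p
```

The continuous-covariate extension described in the abstract replaces the finite sum over covariate values with Monte Carlo draws (for example from a Gibbs sampler); that step is not shown here.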

3.
Horton NJ  Laird NM 《Biometrics》2001,57(1):34-42
This article presents a new method for maximum likelihood estimation of logistic regression models with incomplete covariate data where auxiliary information is available. This auxiliary information is extraneous to the regression model of interest but predictive of the covariate with missing data. Ibrahim (1990, Journal of the American Statistical Association 85, 765-769) provides a general method for estimating generalized linear regression models with missing covariates using the EM algorithm that is easily implemented when there is no auxiliary data. Vach (1997, Statistics in Medicine 16, 57-72) describes how the method can be extended when the outcome and auxiliary data are conditionally independent given the covariates in the model. The method allows the incorporation of auxiliary data without making the conditional independence assumption. We suggest tests of conditional independence and compare the performance of several estimators in an example concerning mental health service utilization in children. Using an artificial dataset, we compare the performance of several estimators when auxiliary data are available.

4.
Satten GA  Carroll RJ 《Biometrics》2000,56(2):384-388
We consider methods for analyzing categorical regression models when some covariates (Z) are completely observed but other covariates (X) are missing for some subjects. When data on X are missing at random (i.e., when the probability that X is observed does not depend on the value of X itself), we present a likelihood approach for the observed data that allows the same nuisance parameters to be eliminated in a conditional analysis as when data are complete. An example of a matched case-control study is used to demonstrate our approach.

5.
Chen Q  Ibrahim JG 《Biometrics》2006,62(1):177-184
We consider a class of semiparametric models for the covariate distribution and missing data mechanism for missing covariate and/or response data for general classes of regression models including generalized linear models and generalized linear mixed models. Ignorable and nonignorable missing covariate and/or response data are considered. The proposed semiparametric model can be viewed as a sensitivity analysis for model misspecification of the missing covariate distribution and/or missing data mechanism. The semiparametric model consists of a generalized additive model (GAM) for the covariate distribution and/or missing data mechanism. Penalized regression splines are used to express the GAMs as a generalized linear mixed effects model, in which the variance of the corresponding random effects provides an intuitive index for choosing between the semiparametric and parametric model. Maximum likelihood estimates are then obtained via the EM algorithm. Simulations are given to demonstrate the methodology, and a real data set from a melanoma cancer clinical trial is analyzed using the proposed methods.

6.
Maximum likelihood methods for cure rate models with missing covariates
Chen MH  Ibrahim JG 《Biometrics》2001,57(1):43-52
We propose maximum likelihood methods for parameter estimation for a novel class of semiparametric survival models with a cure fraction, in which the covariates are allowed to be missing. We allow the covariates to be either categorical or continuous and specify a parametric distribution for the covariates that is written as a sequence of one-dimensional conditional distributions. We propose a novel EM algorithm for maximum likelihood estimation and derive standard errors by using Louis's formula (Louis, 1982, Journal of the Royal Statistical Society, Series B 44, 226-233). Computational techniques using the Monte Carlo EM algorithm are discussed and implemented. A real data set involving a melanoma cancer clinical trial is examined in detail to demonstrate the methodology.

7.
Pan W  Lin X  Zeng D 《Biometrics》2006,62(2):402-412
We propose a new class of models, transition measurement error models, to study the effects of covariates and the past responses on the current response in longitudinal studies when one of the covariates is measured with error. We show that the response variable conditional on the error-prone covariate follows a complex transition mixed effects model. The naive model obtained by ignoring the measurement error correctly specifies the transition part of the model, but misspecifies the covariate effect structure and ignores the random effects. We next study the asymptotic bias in the naive estimator obtained by ignoring the measurement error for both continuous and discrete outcomes. We show that the naive estimator of the regression coefficient of the error-prone covariate is attenuated, while the naive estimators of the regression coefficients of the past responses are generally inflated. We then develop a structural modeling approach for parameter estimation using the maximum likelihood estimation method. In view of the multidimensional integration required by full maximum likelihood estimation, an EM algorithm is developed to calculate maximum likelihood estimators, in which Monte Carlo simulations are used to evaluate the conditional expectations in the E-step. We evaluate the performance of the proposed method through a simulation study and apply it to a longitudinal social support study for elderly women with heart disease. An additional simulation study shows that the Bayesian information criterion (BIC) performs well in choosing the correct transition orders of the models.
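
The attenuation of a naive estimator under covariate measurement error can be illustrated in a much simpler setting than the transition models of this abstract. The sketch below is only an illustration and assumes a simple linear regression with classical additive measurement error (the observed covariate equals the true covariate plus independent noise). In that setting the naive slope is known to shrink towards zero by the reliability ratio, the true-covariate variance divided by the observed-covariate variance. All variable names and parameter values are hypothetical choices for the simulation.

```python
import numpy as np

def ols_slope(w, y):
    """Slope of the least-squares line of y on w (intercept included)."""
    w_c = w - w.mean()
    return float(w_c @ (y - y.mean()) / (w_c @ w_c))

rng = np.random.default_rng(1)
n, beta = 100_000, 2.0          # sample size and true slope (assumed values)
sigma_x, sigma_u = 1.0, 0.5     # SD of true covariate and of measurement error

x = rng.normal(0.0, sigma_x, n)         # true covariate (unobserved in practice)
w = x + rng.normal(0.0, sigma_u, n)     # error-prone measurement of x
y = beta * x + rng.normal(0.0, 1.0, n)  # response depends on the true x

# Reliability ratio: share of the variance of w that is due to x.
reliability = sigma_x**2 / (sigma_x**2 + sigma_u**2)

slope_true = ols_slope(x, y)    # regression on the true covariate
slope_naive = ols_slope(w, y)   # naive regression ignoring the error

print(f"slope using true x:     {slope_true:.3f}")
print(f"naive slope using w:    {slope_naive:.3f}")
print(f"beta * reliability:     {beta * reliability:.3f}")
```

The inflation of the coefficients of past responses reported in the abstract is specific to the transition structure and is not reproduced by this sketch.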

8.
Incomplete covariate data are a common occurrence in studies in which the outcome is survival time. Further, studies in the health sciences often give rise to correlated, possibly censored, survival data. With no missing covariate data, if the marginal distributions of the correlated survival times follow a given parametric model, then the estimates using the maximum likelihood estimating equations, naively treating the correlated survival times as independent, give consistent estimates of the relative risk parameters (Lipsitz et al., 1994, 50, 842-846). Now, suppose that some observations within a cluster have some missing covariates. We show in this paper that if one naively treats observations within a cluster as independent, one can still use the maximum likelihood estimating equations to obtain consistent estimates of the relative risk parameters. This method requires the estimation of the parameters of the distribution of the covariates. We present results from a clinical trial (Lipsitz and Ibrahim, 1996b, 2, 5-14) with five covariates, four of which have some missing values. In the trial, the clusters are the hospitals in which the patients were treated.

9.
Maps depicting cancer incidence rates have become useful tools in public health research, giving valuable information about the spatial variation in rates of disease. Typically, these maps are generated using count data aggregated over areas such as counties or census blocks. However, with the proliferation of geographic information systems and related databases, it is becoming easier to obtain exact spatial locations for the cancer cases and suitable control subjects. The use of such point data allows us to adjust for individual-level covariates, such as age and smoking status, when estimating the spatial variation in disease risk. Unfortunately, such covariate information is often subject to missingness. We propose a method for mapping cancer risk when covariates are not completely observed. We model these data using a logistic generalized additive model. Estimates of the linear and non-linear effects are obtained using a mixed effects model representation. We develop an EM algorithm to account for missing data and the random effects. Since the expectation step involves an intractable integral, we estimate the E-step with a Laplace approximation. This framework provides a general method for handling missing covariate values when fitting generalized additive models. We illustrate our method through an analysis of cancer incidence data from Cape Cod, Massachusetts. These analyses demonstrate that standard complete-case methods can yield biased estimates of the spatial variation of cancer risk.

10.
Multiple imputation (MI) is increasingly popular for handling multivariate missing data. Two general approaches are available in standard computer packages: MI based on the posterior distribution of incomplete variables under a multivariate (joint) model, and fully conditional specification (FCS), which imputes missing values using univariate conditional distributions for each incomplete variable given all the others, cycling iteratively through the univariate imputation models. In the context of longitudinal or clustered data, it is not clear whether these approaches result in consistent estimates of regression coefficient and variance component parameters when the analysis model of interest is a linear mixed effects model (LMM) that includes both random intercepts and slopes and either the covariates alone or both the covariates and the outcome contain missing information. In the current paper, we compared the performance of seven different MI methods for handling missing values in longitudinal and clustered data in the context of fitting LMMs with both random intercepts and slopes. We study the theoretical compatibility between specific imputation models fitted under each of these approaches and the LMM, and also conduct simulation studies in both the longitudinal and clustered data settings. Simulations were motivated by analyses of the association between body mass index (BMI) and quality of life (QoL) in the Longitudinal Study of Australian Children (LSAC). Our findings showed that the relative performance of MI methods varies according to whether the incomplete covariate has fixed or random effects and whether there is missingness in the outcome variable. We showed via simulation that compatible imputation and analysis models resulted in consistent estimation of both regression parameters and variance components. We illustrate our findings with the analysis of LSAC data.
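
Whichever imputation approach is used, the analyses of the imputed data sets are usually combined with Rubin's rules. The abstract does not describe this step, so the sketch below is offered only as background for a single scalar parameter. The function name `pool_rubin` and the example numbers are hypothetical. The pooled estimate is the mean of the per-imputation estimates, and the total variance is the average within-imputation variance plus (1 + 1/m) times the between-imputation variance.

```python
import numpy as np

def pool_rubin(estimates, variances):
    """Pool m point estimates and their squared standard errors.

    Returns the pooled estimate and its total variance under Rubin's rules.
    """
    q = np.asarray(estimates, dtype=float)
    u = np.asarray(variances, dtype=float)
    m = len(q)
    q_bar = q.mean()            # pooled point estimate
    within = u.mean()           # average within-imputation variance
    between = q.var(ddof=1)     # between-imputation variance
    total = within + (1.0 + 1.0 / m) * between
    return q_bar, total

# Hypothetical estimates of one regression coefficient from m = 3 imputations.
pooled_estimate, pooled_variance = pool_rubin([1.0, 1.2, 0.8], [0.04, 0.05, 0.03])
```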

11.
Wang CY  Huang WT 《Biometrics》2000,56(1):98-105
We consider estimation in logistic regression where some covariate variables may be missing at random. Satten and Kupper (1993, Journal of the American Statistical Association 88, 200-208) proposed estimating odds ratio parameters using methods based on the probability of exposure. By approximating a partial likelihood, we extend their idea and propose a method that estimates the cumulant-generating function of the missing covariate given observed covariates and surrogates in the controls. Our proposed method first estimates some lower order cumulants of the conditional distribution of the unobserved data and then solves a resulting estimating equation for the logistic regression parameter. A simple version of the proposed method is to replace a missing covariate by the summation of its conditional mean and conditional variance given observed data in the controls. We note that one important property of the proposed method is that, when the validation is only on controls, a class of inverse selection probability weighted semiparametric estimators cannot be applied because selection probabilities on cases are zero. The proposed estimator performs well unless the relative risk parameters are large, even though it is technically inconsistent. Small-sample simulations are conducted. We illustrate the method by an example of real data analysis.

12.
Chen B  Zhou XH 《Biometrics》2011,67(3):830-842
Longitudinal studies often feature incomplete response and covariate data. Likelihood-based methods such as the expectation-maximization algorithm give consistent estimators for model parameters when data are missing at random (MAR) provided that the response model and the missing covariate model are correctly specified; however, we do not need to specify the missing data mechanism. An alternative method is the weighted estimating equation, which gives consistent estimators if the missing data and response models are correctly specified; however, we do not need to specify the distribution of the covariates that have missing values. In this article, we develop a doubly robust estimation method for longitudinal data with missing response and missing covariate when data are MAR. This method is appealing in that it can provide consistent estimators if either the missing data model or the missing covariate model is correctly specified. Simulation studies demonstrate that this method performs well in a variety of situations.

13.
S G Baker 《Biometrics》1990,46(4):1193-7, Discussion 1198-200
A simple EM algorithm is proposed for obtaining maximum likelihood estimates when fitting a loglinear model to data from k capture-recapture samples with categorical covariates. The method is used to analyze data on screening for the early detection of breast cancer.

14.
Huang Y  Dagne G 《Biometrics》2012,68(3):943-953
Summary: It is a common practice to analyze complex longitudinal data using semiparametric nonlinear mixed-effects (SNLME) models with a normal distribution. The normality assumption for model errors may unrealistically obscure important features of subject variations. To partially explain between- and within-subject variations, covariates are usually introduced in such models, but some covariates may often be measured with substantial errors. Moreover, the responses may be missing and the missingness may be nonignorable. Inferential procedures can be complicated dramatically when data with skewness, missing values, and measurement error are observed. In the literature, there has been considerable interest in accommodating either skewness, incompleteness or covariate measurement error in such models, but there has been relatively little study concerning all three features simultaneously. In this article, our objective is to address the simultaneous impact of skewness, missingness, and covariate measurement error by jointly modeling the response and covariate processes based on a flexible Bayesian SNLME model. The method is illustrated using a real AIDS data set to compare potential models with various scenarios and different distribution specifications.

15.
Liu W  Wu L 《Biometrics》2007,63(2):342-350
Semiparametric nonlinear mixed-effects (NLME) models are flexible for modeling complex longitudinal data. Covariates are usually introduced in the models to partially explain interindividual variations. Some covariates, however, may be measured with substantial errors. Moreover, the responses may be missing and the missingness may be nonignorable. We propose two approximate likelihood methods for semiparametric NLME models with covariate measurement errors and nonignorable missing responses. The methods are illustrated in a real data example. Simulation results show that both methods perform well and are much better than the commonly used naive method.

16.
In cohort studies the outcome is often time to a particular event, and subjects are followed at regular intervals. Periodic visits may also monitor a secondary irreversible event influencing the event of primary interest, and a significant proportion of subjects develop the secondary event over the period of follow-up. The status of the secondary event serves as a time-varying covariate, but is recorded only at the times of the scheduled visits, generating incomplete time-varying covariates. While information on a typical time-varying covariate is missing for the entire follow-up period except the visiting times, the status of the secondary event is unavailable only between visits where the status has changed, and is thus interval-censored. One may view the interval-censored covariate of the secondary event status as a missing time-varying covariate, yet the missingness is partial since partial information is provided throughout the follow-up period. The current practice of using the latest observed status produces biased estimators, and the existing missing covariate techniques cannot accommodate the special feature of missingness due to interval censoring. To handle interval-censored covariates in the Cox proportional hazards model, we propose an available-data estimator, a doubly robust-type estimator as well as the maximum likelihood estimator via EM algorithm and present their asymptotic properties. We also present practical approaches that are valid. We demonstrate the proposed methods using our motivating example from the Northern Manhattan Study.

17.
Yip PS  Lin HZ  Xi L 《Biometrics》2005,61(4):1085-1092
A semiparametric estimation procedure is proposed to model capture-recapture data with the aim of estimating the population size for a closed population. Individuals' covariates are possibly time dependent and missing at noncaptured times and may be measured with error. A set of estimating equations (EEs) based on covariate process and capture-recapture data is constructed to estimate the relevant parameters and the population size. These EEs can be solved by an algorithm similar to an EM algorithm. Simulation results show that the proposed procedures work better than the naive estimate. In some cases they are even better than "ideal" estimates, for which the true values of covariates are available for all captured subjects over the entire experimental period. We apply the method to a capture-recapture experiment on the bird species Prinia flaviventris in Hong Kong.

18.
19.
Roy J  Lin X 《Biometrics》2005,61(3):837-846
We consider estimation in generalized linear mixed models (GLMM) for longitudinal data with informative dropouts. At the time a unit drops out, time-varying covariates are often unobserved in addition to the missing outcome. However, existing informative dropout models typically require covariates to be completely observed. This assumption is not realistic in the presence of time-varying covariates. In this article, we first study the asymptotic bias that would result from applying existing methods, where missing time-varying covariates are handled using naive approaches, which include: (1) using only baseline values; (2) carrying forward the last observation; and (3) assuming the missing data are ignorable. Our asymptotic bias analysis shows that these naive approaches yield inconsistent estimators of model parameters. We next propose a selection/transition model that allows covariates to be missing in addition to the outcome variable at the time of dropout. The EM algorithm is used for inference in the proposed model. Data from a longitudinal study of human immunodeficiency virus (HIV)-infected women are used to illustrate the methodology.
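
To make naive approach (2) concrete, the sketch below shows what carrying the last observation forward does to a time-varying covariate stored as a subjects-by-visits array. It is only an illustration of the naive approach that the abstract reports to be inconsistent, not of the proposed selection/transition model. The function name `locf` and the example values are hypothetical, and missing entries are assumed to be coded as `np.nan`.

```python
import numpy as np

def locf(values):
    """Carry the last observed value forward along each row.

    values: 2-D array-like (subjects x visits) with np.nan for missing entries.
    Leading missing values have nothing to carry forward and stay missing.
    """
    out = np.array(values, dtype=float)   # copy, so the input is not modified
    for j in range(1, out.shape[1]):
        gap = np.isnan(out[:, j])
        out[gap, j] = out[gap, j - 1]
    return out

# Hypothetical covariate for three subjects over three visits.
covariate = np.array([
    [1.0, np.nan, np.nan],    # drops out after the first visit
    [2.0, 3.0, np.nan],       # drops out after the second visit
    [np.nan, 4.0, np.nan],    # baseline value missing
])
filled = locf(covariate)
```

The filled array treats the covariate as frozen at its last observed value after dropout, which is the assumption the asymptotic bias analysis in the abstract calls into question.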

20.
Larsen K 《Biometrics》2004,60(1):85-92
Multiple categorical variables are commonly used in medical and epidemiological research to measure specific aspects of human health and functioning. To analyze such data, models have been developed considering these categorical variables as imperfect indicators of an individual's "true" status of health or functioning. In this article, the latent class regression model is used to model the relationship between covariates, a latent class variable (the unobserved status of health or functioning), and the observed indicators (e.g., variables from a questionnaire). The Cox model is extended to encompass a latent class variable as predictor of time-to-event, while using information about latent class membership available from multiple categorical indicators. The expectation-maximization (EM) algorithm is employed to obtain maximum likelihood estimates, and standard errors are calculated based on the profile likelihood, treating the nonparametric baseline hazard as a nuisance parameter. A sampling-based method for model checking is proposed. It allows for graphical investigation of the assumption of proportional hazards across latent classes. It may also be used for checking other model assumptions, such as no additional effect of the observed indicators given latent class. The usefulness of the model framework and the proposed techniques is illustrated in an analysis of data from the Women's Health and Aging Study concerning the effect of severe mobility disability on time-to-death for elderly women.
