Similar articles
Found 20 similar articles (search time: 15 ms)
1.
2.
Liu M, Taylor JM, Belin TR. Biometrics 2000, 56(4): 1157-1163
This paper outlines a multiple imputation method for handling missing data in designed longitudinal studies. A random coefficients model is developed to accommodate incomplete multivariate continuous longitudinal data. Multivariate repeated measures are jointly modeled; specifically, an i.i.d. normal model is assumed for time-independent variables and a hierarchical random coefficients model is assumed for time-dependent variables in a regression model conditional on the time-independent variables and time, with heterogeneous error variances across variables and time points. Gibbs sampling is used to draw model parameters and for imputations of missing observations. An application to data from a study of startle reactions illustrates the model. A simulation study compares the multiple imputation procedure to the weighting approach of Robins, Rotnitzky, and Zhao (1995, Journal of the American Statistical Association 90, 106-121) that can be used to address similar data structures.

3.
Longitudinal data are often subject to missingness with monotone and/or intermittent missing patterns. Multiple imputation (MI) has been popularly employed for analysis of missing longitudinal data. In particular, the MI‐GEE method has been proposed for inference of generalized estimating equations (GEE) when missing data are imputed via MI. However, little is known about how to perform model selection with multiply imputed longitudinal data. In this work, we extend the existing GEE model selection criteria, including the “quasi‐likelihood under the independence model criterion” (QIC) and the “missing longitudinal information criterion” (MLIC), to accommodate multiply imputed datasets for selection of the MI‐GEE mean model. According to real data analyses from a schizophrenia study and an AIDS study, as well as simulations under nonmonotone missingness with a moderate proportion of missing observations, we conclude that: (i) more than a few imputed datasets are required for stable and reliable model selection in MI‐GEE analysis; (ii) the MI‐based GEE model selection methods with a suitable number of imputations generally perform well, while the naive application of existing model selection methods by simply ignoring missing observations may lead to very poor performance; (iii) the model selection criteria based on improper (frequentist) multiple imputation generally perform better than their analogues based on proper (Bayesian) multiple imputation.
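As background for the MI-based analyses mentioned above, the following is a minimal sketch of Rubin's combining rules for a single scalar parameter estimated on each imputed dataset. It shows only the standard pooling step, not the model selection criteria the abstract proposes; the function name `pool_rubin` and its arguments are hypothetical.

```python
from statistics import mean, variance
from typing import Sequence, Tuple


def pool_rubin(estimates: Sequence[float],
               variances: Sequence[float]) -> Tuple[float, float]:
    """Pool one scalar estimate across m imputed datasets (Rubin's rules).

    estimates: point estimate from each imputed dataset
    variances: squared standard error from each imputed dataset
    Returns (pooled estimate, total variance).
    """
    m = len(estimates)
    if m < 2 or m != len(variances):
        raise ValueError("need at least two imputations and matching lengths")
    q_bar = mean(estimates)          # pooled point estimate
    within = mean(variances)         # average within-imputation variance
    between = variance(estimates)    # between-imputation variance (m - 1 divisor)
    total = within + (1 + 1 / m) * between
    return q_bar, total
```

The between-imputation term is what separates this from simply averaging the per-dataset variances: with few imputations it is itself estimated noisily, which is consistent with the abstract's remark that more than a few imputed datasets are needed for stable results.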

4.
Reiter, Jerome P. Biometrika 2007, 94(2): 502-508
When performing multi-component significance tests with multiply-imputed datasets, analysts can use a Wald-like test statistic and a reference F-distribution. The currently employed degrees of freedom in the denominator of this F-distribution are derived assuming an infinite sample size. For modest complete-data sample sizes, this degrees of freedom can be unrealistic; for example, it may exceed the complete-data degrees of freedom. This paper presents an alternative denominator degrees of freedom that is always less than or equal to the complete-data denominator degrees of freedom, and equals the currently employed denominator degrees of freedom for infinite sample sizes. Its advantages over the currently employed degrees of freedom are illustrated with a simulation.

5.
Albert PS. Biometrics 2000, 56(2): 602-608
Binary longitudinal data are often collected in clinical trials when interest is in assessing the effect of a treatment over time. Our application is a recent study of opiate addiction that examined the effect of a new treatment on repeated urine tests to assess opiate use over an extended follow-up. Drug addiction is episodic, and a new treatment may affect various features of the opiate-use process such as the proportion of positive urine tests over follow-up and the time to the first occurrence of a positive test. Complications in this trial were the large amounts of dropout and intermittent missing data and the large number of observations on each subject. We develop a transitional model for longitudinal binary data subject to nonignorable missing data and propose an EM algorithm for parameter estimation. We use the transitional model to derive summary measures of the opiate-use process that can be compared across treatment groups to assess treatment effect. Through analyses and simulations, we show the importance of properly accounting for the missing data mechanism when assessing the treatment effect in our example.

6.
Chen B, Zhou XH. Biometrics 2011, 67(3): 830-842
Longitudinal studies often feature incomplete response and covariate data. Likelihood-based methods such as the expectation-maximization algorithm give consistent estimators for model parameters when data are missing at random (MAR) provided that the response model and the missing covariate model are correctly specified; however, we do not need to specify the missing data mechanism. An alternative method is the weighted estimating equation, which gives consistent estimators if the missing data and response models are correctly specified; however, we do not need to specify the distribution of the covariates that have missing values. In this article, we develop a doubly robust estimation method for longitudinal data with missing response and missing covariate when data are MAR. This method is appealing in that it can provide consistent estimators if either the missing data model or the missing covariate model is correctly specified. Simulation studies demonstrate that this method performs well in a variety of situations.

7.
This paper proposes a method for modeling longitudinal binary data when nonresponse depends on unobserved responses. The proposed method presumes that the target of inference is the marginal distribution of the response at each occasion and its dependence on covariates, and can accommodate both monotone and non-monotone missingness. The approach involves a marginally specified pattern-mixture model that directly parameterizes both the marginal means at each occasion and the dependence of each response on indicators of nonresponse pattern. This formulation readily incorporates a variety of nonresponse processes assumed within a sensitivity analysis. Once identifying restrictions have been made, estimation of model parameters proceeds via solution to a set of modified generalized estimating equations. The proposed method provides an alternative to standard selection and pattern-mixture modeling frameworks, while featuring certain advantages of each. The paper concludes with application of the method to data from a contraceptive clinical trial with substantial dropout.

8.
This article presents a likelihood-based method for handling nonignorable dropout in longitudinal studies with binary responses. The methodology developed is appropriate when the target of inference is the marginal distribution of the response at each occasion and its dependence on covariates. A "hybrid" model is formulated, which is designed to retain advantageous features of the selection and pattern-mixture model approaches. This formulation accommodates a variety of assumed forms of nonignorable dropout, while maintaining transparency of the constraints required for identifying the overall model. Once appropriate identifying constraints have been imposed, likelihood-based estimation is conducted via the EM algorithm. The article concludes by applying the approach to data from a randomized clinical trial comparing two doses of a contraceptive.

9.
Most models for incomplete data are formulated within the selection model framework. This paper studies similarities and differences of modeling incomplete data within both selection and pattern-mixture settings. The focus is on missing at random mechanisms and on categorical data. Point and interval estimation are discussed. The two approaches are compared using data on side effects in a psychiatric study.

10.
GEE with Gaussian estimation of the correlations when data are incomplete (total citations: 4; self-citations: 0; citations by others: 4)
This paper considers a modification of generalized estimating equations (GEE) for handling missing binary response data. The proposed method uses Gaussian estimation of the correlation parameters, i.e., the estimating function that yields an estimate of the correlation parameters is obtained from the multivariate normal likelihood. The proposed method yields consistent estimates of the regression parameters when data are missing completely at random (MCAR). However, when data are missing at random (MAR), consistency may not hold. In a simulation study with repeated binary outcomes that are missing at random, the magnitude of the potential bias that can arise is examined. The results of the simulation study indicate that, when the working correlation matrix is correctly specified, the bias is almost negligible for the modified GEE. In the simulation study, the proposed modification of GEE is also compared to the standard GEE, multiple imputation, and weighted estimating equations approaches. Finally, the proposed method is illustrated using data from a longitudinal clinical trial comparing two therapeutic treatments, zidovudine (AZT) and didanosine (ddI), in patients with HIV.

11.
Roy J, Lin X. Biometrics 2005, 61(3): 837-846
We consider estimation in generalized linear mixed models (GLMM) for longitudinal data with informative dropouts. At the time a unit drops out, time-varying covariates are often unobserved in addition to the missing outcome. However, existing informative dropout models typically require covariates to be completely observed. This assumption is not realistic in the presence of time-varying covariates. In this article, we first study the asymptotic bias that would result from applying existing methods, where missing time-varying covariates are handled using naive approaches, which include: (1) using only baseline values; (2) carrying forward the last observation; and (3) assuming the missing data are ignorable. Our asymptotic bias analysis shows that these naive approaches yield inconsistent estimators of model parameters. We next propose a selection/transition model that allows covariates to be missing in addition to the outcome variable at the time of dropout. The EM algorithm is used for inference in the proposed model. Data from a longitudinal study of human immunodeficiency virus (HIV)-infected women are used to illustrate the methodology.

12.
When novel scientific questions arise after longitudinal binary data have been collected, the subsequent selection of subjects from the cohort for whom further detailed assessment will be undertaken is often necessary to efficiently collect new information. Key examples of additional data collection include retrospective questionnaire data, novel data linkage, or evaluation of stored biological specimens. In such cases, all data required for the new analyses are available except for the new target predictor or exposure. We propose a class of longitudinal outcome-dependent sampling schemes and detail a design-corrected conditional maximum likelihood analysis for highly efficient estimation of time-varying and time-invariant covariate coefficients when resource limitations prohibit exposure ascertainment on all participants. Additionally, we detail an important study planning phase that exploits available cohort data to proactively examine the feasibility of any proposed substudy as well as to inform decisions regarding the most desirable study design. The proposed designs and associated analyses are discussed in the context of a study that seeks to examine the modifying effect of an interleukin-10 cytokine single nucleotide polymorphism on asthma symptom regression in adolescents participating in the Childhood Asthma Management Program Continuation Study. Using this example we assume that all data necessary to conduct the study are available except subject-specific genotype data. We also assume that these data would be ascertained by analyzing stored blood samples, the cost of which limits the sample size.

13.
Marginalized models (Heagerty, 1999, Biometrics 55, 688-698) permit likelihood-based inference when interest lies in marginal regression models for longitudinal binary response data. Two such models are the marginalized transition and marginalized latent variable models. The former captures within-subject serial dependence among repeated measurements with transition model terms while the latter assumes exchangeable or nondiminishing response dependence using random intercepts. In this article, we extend the class of marginalized models by proposing a single unifying model that describes both serial and long-range dependence. This model will be particularly useful in longitudinal analyses with a moderate to large number of repeated measurements per subject, where both serial and exchangeable forms of response correlation can be identified. We describe maximum likelihood and Bayesian approaches toward parameter estimation and inference, and we study the large sample operating characteristics under two types of dependence model misspecification. Data from the Madras Longitudinal Schizophrenia Study (Thara et al., 1994, Acta Psychiatrica Scandinavica 90, 329-336) are analyzed.

14.
For analyzing longitudinal binary data with nonignorable and nonmonotone missing responses, a full likelihood method is complicated algebraically, and often requires intensive computation, especially when there are many follow-up times. As an alternative, a pseudolikelihood approach has been proposed in the literature under minimal parametric assumptions. This formulation only requires specification of the marginal distributions of the responses and missing data mechanism, and uses an independence working assumption. However, this estimator can be inefficient for estimating both time-varying and time-stationary effects under moderate to strong within-subject associations among repeated responses. In this article, we propose an alternative estimator, based on a bivariate pseudolikelihood, and demonstrate in simulations that the proposed method can be much more efficient than the previous pseudolikelihood obtained under the assumption of independence. We illustrate the method using longitudinal data on CD4 counts from two clinical trials of HIV-infected patients.

15.
Summary. Multiple outcomes are often used to properly characterize an effect of interest. This article discusses model-based statistical methods for the classification of units into one of two or more groups where, for each unit, repeated measurements over time are obtained on each outcome. We relate the observed outcomes using multivariate nonlinear mixed-effects models to describe evolutions in different groups. Due to its flexibility, the random-effects approach for the joint modeling of multiple outcomes can be used to estimate population parameters for a discriminant model that classifies units into distinct predefined groups or populations. Parameter estimation is done via the expectation-maximization algorithm with a linear approximation step. We conduct a simulation study that sheds light on the effect that the linear approximation has on classification results. We present an example using data from a study of 161 pregnant women in Santiago, Chile, where the main interest is to predict normal versus abnormal pregnancy outcomes.

16.
17.
Ding J, Wang JL. Biometrics 2008, 64(2): 546-556
Summary. In clinical studies, longitudinal biomarkers are often used to monitor disease progression and failure time. Joint modeling of longitudinal and survival data has certain advantages and has emerged as an effective way to mutually enhance information. Typically, a parametric longitudinal model is assumed to facilitate the likelihood approach. However, the choice of a proper parametric model turns out to be more elusive than models for standard longitudinal studies in which no survival endpoint occurs. In this article, we propose a nonparametric multiplicative random effects model for the longitudinal process, which has many applications and leads to a flexible yet parsimonious nonparametric random effects model. A proportional hazards model is then used to link the biomarkers and event time. We use B-splines to represent the nonparametric longitudinal process, and select the number of knots and degrees based on a version of the Akaike information criterion (AIC). Unknown model parameters are estimated through maximizing the observed joint likelihood, which is iteratively maximized by the Monte Carlo Expectation Maximization (MCEM) algorithm. Due to the simplicity of the model structure, the proposed approach has good numerical stability and compares well with the competing parametric longitudinal approaches. The new approach is illustrated with primary biliary cirrhosis (PBC) data, aiming to capture nonlinear patterns of serum bilirubin time courses and their relationship with survival time of PBC patients.

18.
Cook RJ, Zeng L, Yi GY. Biometrics 2004, 60(3): 820-828
In recent years there has been considerable research devoted to the development of methods for the analysis of incomplete data in longitudinal studies. Despite these advances, the methods used in practice have changed relatively little, particularly in the reporting of pharmaceutical trials. In this setting, perhaps the most widely adopted strategy for dealing with incomplete longitudinal data is imputation by the "last observation carried forward" (LOCF) approach, in which values for missing responses are imputed using observations from the most recently completed assessment. We examine the asymptotic and empirical bias, the empirical type I error rate, and the empirical coverage probability associated with estimators and tests of treatment effect based on the LOCF imputation strategy. We consider a setting involving longitudinal binary data with longitudinal analyses based on generalized estimating equations, and an analysis based simply on the response at the end of the scheduled follow-up. We find that for both of these approaches, imputation by LOCF can lead to substantial biases in estimators of treatment effects, the type I error rates of associated tests can be greatly inflated, and the coverage probability can be far from the nominal level. Alternative analyses based on all available data lead to estimators with comparatively small bias, and inverse probability weighted analyses yield consistent estimators subject to correct specification of the missing data process. We illustrate the differences between various methods of dealing with drop-outs using data from a study of smoking behavior.
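For readers unfamiliar with the mechanics, the LOCF rule described above can be sketched for a single subject as follows. This is only an illustration of the imputation step the abstract evaluates (and finds can be badly biased), not a recommended analysis; the function name `locf` and the use of `None` for a missed assessment are assumptions made for the example.

```python
from typing import List, Optional, Sequence


def locf(responses: Sequence[Optional[int]]) -> List[Optional[int]]:
    """Impute missing visits with the last observed response.

    responses: one subject's responses in visit order, None where missing.
    Visits before the first observed response stay None, because there is
    nothing to carry forward.
    """
    filled: List[Optional[int]] = []
    last: Optional[int] = None   # most recent observed response so far
    for value in responses:
        if value is not None:
            last = value
        filled.append(last)
    return filled
```

For a subject observed as smoking (1) at the first visit who then drops out, every later visit is filled with 1, so the imputed record cannot show any change after dropout.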

19.
Longitudinal studies frequently incur outcome-related nonresponse. In this article, we discuss a likelihood-based method for analyzing repeated binary responses when the mechanism leading to missing response data depends on unobserved responses. We describe a pattern-mixture model for the joint distribution of the vector of binary responses and the indicators of nonresponse patterns. Specifically, we propose an extension of the multivariate logistic model to handle nonignorable nonresponse. This method yields estimates of the mean parameters under a variety of assumptions regarding the distribution of the unobserved responses. Because these models make unverifiable identifying assumptions, we recommend conducting sensitivity analyses that provide a range of inferences, each of which is valid under different assumptions for nonresponse. The methodology is illustrated using data from a longitudinal study of obesity in children.

20.