Analyzing incomplete longitudinal clinical trial data   Cited by: 1 (self-citations: 0, citations by others: 1)
Using standard missing data taxonomy, due to Rubin and co-workers, and simple algebraic derivations, it is argued that some simple but commonly used methods to handle incomplete longitudinal clinical trial data, such as complete case analyses and methods based on last observation carried forward, require restrictive assumptions and stand on a weaker theoretical foundation than likelihood-based methods developed under the missing at random (MAR) framework. Given the availability of flexible software for analyzing longitudinal sequences of unequal length, implementation of likelihood-based MAR analyses is not limited by computational considerations. While such analyses are valid under the comparatively weak assumption of MAR, the possibility of data missing not at random (MNAR) is difficult to rule out. It is argued, however, that MNAR analyses are, themselves, surrounded with problems and therefore, rather than ignoring MNAR analyses altogether or blindly shifting to them, their optimal place is within sensitivity analysis. The concepts developed here are illustrated using data from three clinical trials, where it is shown that the analysis method may have an impact on the conclusions of the study.
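
As a hedged illustration of two of the simple methods named in this abstract, the sketch below applies last observation carried forward (LOCF) and a complete case filter to toy longitudinal sequences. The function names `locf` and `complete_cases` and all data values are hypothetical and are not taken from the paper; `None` marks a missed visit.

```python
from typing import List, Optional

Sequence = List[Optional[float]]


def locf(seq: Sequence) -> Sequence:
    """Fill each missing value with the most recent observed value.

    Leading missing values stay missing because nothing precedes them.
    """
    filled: Sequence = []
    last: Optional[float] = None
    for value in seq:
        if value is not None:
            last = value
        filled.append(last)
    return filled


def complete_cases(data: List[Sequence]) -> List[Sequence]:
    """Keep only subjects with no missing visits."""
    return [seq for seq in data if all(v is not None for v in seq)]


# Toy data (hypothetical): three subjects, four visits each.
trial = [
    [10.0, 9.0, 8.0, 7.0],      # completer
    [12.0, 11.0, None, None],   # drops out after visit 2
    [11.0, None, None, None],   # drops out after visit 1
]

locf_filled = [locf(seq) for seq in trial]
completers = complete_cases(trial)
```

LOCF freezes each dropout at the last recorded value, and the complete case filter discards two of the three subjects here, which is one way to see why both approaches rest on restrictive assumptions.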

Chen B, Zhou XH. Biometrics, 2011, 67(3): 830-842
Longitudinal studies often feature incomplete response and covariate data. Likelihood-based methods such as the expectation-maximization algorithm give consistent estimators for model parameters when data are missing at random (MAR) provided that the response model and the missing covariate model are correctly specified; however, we do not need to specify the missing data mechanism. An alternative method is the weighted estimating equation, which gives consistent estimators if the missing data and response models are correctly specified; however, we do not need to specify the distribution of the covariates that have missing values. In this article, we develop a doubly robust estimation method for longitudinal data with missing response and missing covariate when data are MAR. This method is appealing in that it can provide consistent estimators if either the missing data model or the missing covariate model is correctly specified. Simulation studies demonstrate that this method performs well in a variety of situations.

On using the Cox proportional hazards model with missing covariates   Cited by: 1 (self-citations: 0, citations by others: 1)

Summary. Little and An (2004, Statistica Sinica 14, 949–968) proposed a penalized spline of propensity prediction (PSPP) method of imputation of missing values that yields robust model-based inference under the missing at random assumption. The propensity score for a missing variable is estimated and a regression model is fitted that includes the spline of the estimated logit propensity score as a covariate. The predicted unconditional mean of the missing variable has a double robustness (DR) property under misspecification of the imputation model. We show that a simplified version of PSPP, which does not center other regressors prior to including them in the prediction model, also has the DR property. We also propose two extensions of PSPP, namely, stratified PSPP and bivariate PSPP, that extend the DR property to inferences about conditional means. These extended PSPP methods are compared with the PSPP method and simple alternatives in a simulation study and applied to an online weight loss study conducted by Kaiser Permanente.

Inference and missing data   Cited by: 85 (self-citations: 0, citations by others: 85)
Rubin, Donald B. Biometrika, 1976, 63(3): 581-592

Summary. A routine challenge is that of making inference on parameters in a statistical model of interest from longitudinal data subject to dropout, which are a special case of the more general setting of monotonely coarsened data. Considerable recent attention has focused on doubly robust (DR) estimators, which in this context involve positing models for both the missingness (more generally, coarsening) mechanism and aspects of the distribution of the full data, that have the appealing property of yielding consistent inferences if only one of these models is correctly specified. DR estimators have been criticized for potentially disastrous performance when both of these models are even only mildly misspecified. We propose a DR estimator applicable in general monotone coarsening problems that achieves comparable or improved performance relative to existing DR methods, which we demonstrate via simulation studies and by application to data from an AIDS clinical trial.
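
To make the double robustness idea concrete, here is a minimal sketch of the textbook augmented inverse probability weighted (AIPW) estimator of a mean with missing outcomes. It is not the estimator proposed in this article, which addresses general monotone coarsening; the function name `aipw_mean` and the toy data are assumptions made for illustration. For subject i with response indicator R_i, assumed response probability p_i and assumed outcome prediction m_i, each contribution is R_i*Y_i/p_i - (R_i - p_i)/p_i * m_i.

```python
from typing import List, Optional


def aipw_mean(
    y: List[Optional[float]],
    prob_observed: List[float],
    outcome_pred: List[float],
) -> float:
    """AIPW estimate of the mean of Y; None marks a missing outcome."""
    total = 0.0
    for yi, pi, mi in zip(y, prob_observed, outcome_pred):
        ri = 0 if yi is None else 1
        ipw_term = yi / pi if ri else 0.0      # inverse probability weighted part
        augmentation = (ri - pi) / pi * mi     # correction using the outcome model
        total += ipw_term - augmentation
    return total / len(y)


# Toy data (hypothetical): stratum A has Y = 1 and is observed with
# probability 0.5; stratum B has Y = 3 and is always observed.
# The full-data mean is therefore 2.0 by construction.
y = [1.0, 1.0, None, None, 3.0, 3.0, 3.0, 3.0]
true_prob = [0.5] * 4 + [1.0] * 4
true_pred = [1.0] * 4 + [3.0] * 4
wrong_prob = [0.8] * 8   # misspecified missingness model
wrong_pred = [0.0] * 8   # misspecified outcome model

right_prob_only = aipw_mean(y, true_prob, wrong_pred)
right_pred_only = aipw_mean(y, wrong_prob, true_pred)
both_wrong = aipw_mean(y, wrong_prob, wrong_pred)
```

In this toy setting the estimate recovers the full-data mean when either working model is correct, and drifts away when both are wrong, which mirrors the criticism mentioned in the abstract.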

We propose a method to estimate the regression coefficients in a competing risks model where the cause-specific hazard for the cause of interest is related to covariates through a proportional hazards relationship and when cause of failure is missing for some individuals. We use multiple imputation procedures to impute missing cause of failure, where the probability that a missing cause is the cause of interest may depend on auxiliary covariates, and combine the maximum partial likelihood estimators computed from several imputed data sets into an estimator that is consistent and asymptotically normal. A consistent estimator for the asymptotic variance is also derived. Simulation results suggest the relevance of the theory in finite samples. Results are also illustrated with data from a breast cancer study.
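
The combining step can be sketched with Rubin's rules, the standard way to pool estimates across imputed data sets. This is an assumption for illustration: the abstract does not say that the authors' variance estimator coincides with Rubin's formula, and the function name `pool_rubin` and the numbers are hypothetical.

```python
from statistics import mean, variance
from typing import List, Tuple


def pool_rubin(estimates: List[float], variances: List[float]) -> Tuple[float, float]:
    """Pool m per-imputation estimates and their variances.

    Returns (pooled estimate, total variance), where the total variance is
    within-imputation variance + (1 + 1/m) * between-imputation variance.
    """
    m = len(estimates)
    if m < 2 or len(variances) != m:
        raise ValueError("need at least two imputations with matching variances")
    pooled = mean(estimates)
    within = mean(variances)
    between = variance(estimates)  # sample variance, divisor m - 1
    total = within + (1 + 1 / m) * between
    return pooled, total


# Hypothetical coefficient estimates from three imputed data sets.
pooled_estimate, total_variance = pool_rubin([1.0, 2.0, 3.0], [0.5, 0.5, 0.5])
```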

It is very common in regression analysis to encounter incompletely observed covariate information. A recent approach to analyse such data is weighted estimating equations (Robins, J. M., Rotnitzky, A. and Zhao, L. P. (1994), JASA, 89, 846-866, and Zhao, L. P., Lipsitz, S. R. and Lew, D. (1996), Biometrics, 52, 1165-1182). With weighted estimating equations, the contribution to the estimating equation from a complete observation is weighted by the inverse of the probability of being observed. We propose a test statistic to assess if the weighted estimating equations produce biased estimates. Our test statistic is similar to the test statistic proposed by DuMouchel and Duncan (1983) for weighted least squares estimates for sample survey data. The method is illustrated using data from a randomized clinical trial on chemotherapy for multiple myeloma.
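
The weighting described here can be illustrated with the simplest estimating equation, the one for a mean: solving sum_i R_i/p_i * (Y_i - mu) = 0 gives a weighted average of the complete observations. The sketch below is a toy under stated assumptions, not the regression setting or the test statistic of the paper; the function name `ipw_mean`, the observation probabilities and the data are hypothetical.

```python
from typing import List, Optional


def ipw_mean(y: List[Optional[float]], prob_observed: List[float]) -> float:
    """Solve the inverse probability weighted estimating equation for a mean.

    Each complete observation is weighted by 1 / P(observed); None marks
    an incomplete observation, which contributes nothing.
    """
    numerator = 0.0
    denominator = 0.0
    for yi, pi in zip(y, prob_observed):
        if yi is None:
            continue
        if not 0.0 < pi <= 1.0:
            raise ValueError("observation probabilities must lie in (0, 1]")
        numerator += yi / pi
        denominator += 1.0 / pi
    return numerator / denominator


# Toy data (hypothetical): the first four subjects have Y = 1 and are
# observed with probability 0.5; the last four have Y = 3 and are always
# observed. The full-data mean is 2.0 by construction.
y = [1.0, 1.0, None, None, 3.0, 3.0, 3.0, 3.0]
prob = [0.5] * 4 + [1.0] * 4

observed = [v for v in y if v is not None]
complete_case_mean = sum(observed) / len(observed)
weighted_mean = ipw_mean(y, prob)
```

The unweighted complete case mean over-represents the always-observed group, while the weighted version restores the balance in this toy example.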

Wang C, Daniels MJ. Biometrics, 2011, 67(3): 810-818
Summary. Pattern mixture modeling is a popular approach for handling incomplete longitudinal data. Such models are not identifiable by construction. Identifying restrictions is one approach to mixture model identification (Little, 1995, Journal of the American Statistical Association 90, 1112–1121; Little and Wang, 1996, Biometrics 52, 98–111; Thijs et al., 2002, Biostatistics 3, 245–265; Kenward, Molenberghs, and Thijs, 2003, Biometrika 90, 53–71; Daniels and Hogan, 2008, in Missing Data in Longitudinal Studies: Strategies for Bayesian Modeling and Sensitivity Analysis) and is a natural starting point for missing not at random sensitivity analysis (Thijs et al., 2002, Biostatistics 3, 245–265; Daniels and Hogan, 2008, in Missing Data in Longitudinal Studies: Strategies for Bayesian Modeling and Sensitivity Analysis). However, when the pattern specific models are multivariate normal, identifying restrictions corresponding to missing at random (MAR) may not exist. Furthermore, identification strategies can be problematic in models with covariates (e.g., baseline covariates with time‐invariant coefficients). In this article, we explore conditions necessary for identifying restrictions that result in MAR to exist under a multivariate normality assumption and strategies for identifying sensitivity parameters for sensitivity analysis or for a fully Bayesian analysis with informative priors. In addition, we propose alternative modeling and sensitivity analysis strategies under a less restrictive assumption for the distribution of the observed response data. We adopt the deviance information criterion for model comparison and perform a simulation study to evaluate the performances of the different modeling approaches. We also apply the methods to a longitudinal clinical trial. Problems caused by baseline covariates with time‐invariant coefficients are investigated and an alternative identifying restriction based on residuals is proposed as a solution.

In this paper we develop pseudo-likelihood methods for the estimation of parameters in a model that is specified in terms of both selection modelling and pattern-mixture modelling quantities. Two cases are considered: (1) the model is specified directly from a joint model for the measurement and dropout processes; (2) conditional models for the measurement process given dropout and vice versa are specified directly. In the latter case, compatibility constraints to ensure the existence of a joint density are derived. The method is applied to data from a psychiatric study, where a bivariate therapeutic outcome is supplemented with covariate information.

Methods in the literature for missing covariate data in survival models have relied on the missing at random (MAR) assumption to render regression parameters identifiable. MAR means that missingness can depend on the observed exit time, and whether or not that exit is a failure or a censoring event. By considering ways in which missingness of covariate X could depend on the true but possibly censored failure time T and the true censoring time C, we attempt to identify missingness mechanisms which would yield MAR data. We find that, under various reasonable assumptions about how missingness might depend on T and/or C, additional strong assumptions are needed to obtain MAR. We conclude that MAR is difficult to justify in practical applications. One exception arises when missingness is independent of T, and C is independent of the value of the missing X. As alternatives to MAR, we propose two new missingness assumptions. In one, the missingness depends on T but not on C; in the other, the situation is reversed. For each, we show that the failure time model is identifiable. When missingness is independent of T, we show that the naive complete record analysis will yield a consistent estimator of the failure time distribution. When missingness is independent of C, we develop a complete record likelihood function and a corresponding estimator for parametric failure time models. We propose analyses to evaluate the plausibility of either assumption in a particular data set, and illustrate the ideas using data from the literature on this problem.

Summary. In medical research, the receiver operating characteristic (ROC) curves can be used to evaluate the performance of biomarkers for diagnosing diseases or predicting the risk of developing a disease in the future. The area under the ROC curve (ROC AUC), as a summary measure of ROC curves, is widely utilized, especially when comparing multiple ROC curves. In observational studies, the estimation of the AUC is often complicated by the presence of missing biomarker values, which means that the existing estimators of the AUC are potentially biased. In this article, we develop robust statistical methods for estimating the ROC AUC and the proposed methods use information from auxiliary variables that are potentially predictive of the missingness of the biomarkers or the missing biomarker values. We are particularly interested in auxiliary variables that are predictive of the missing biomarker values. In the case of missing at random (MAR), that is, missingness of biomarker values only depends on the observed data, our estimators have the attractive feature of being consistent if one correctly specifies, conditional on auxiliary variables and disease status, either the model for the probabilities of being missing or the model for the biomarker values. In the case of missing not at random (MNAR), that is, missingness may depend on the unobserved biomarker values, we propose a sensitivity analysis to assess the impact of MNAR on the estimation of the ROC AUC. The asymptotic properties of the proposed estimators are studied and their finite‐sample behaviors are evaluated in simulation studies. The methods are further illustrated using data from a study of maternal depression during pregnancy.
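
For orientation, the empirical AUC is the proportion of (diseased, healthy) pairs in which the diseased subject has the larger biomarker value, with ties counted as one half. The sketch below computes it with optional per-subject weights, for example inverse probabilities of being observed; this weighted form is one simple possibility and is not the doubly robust estimator developed in the article. The function name `weighted_auc`, the weights and the data are hypothetical.

```python
from typing import List, Optional


def weighted_auc(
    diseased: List[float],
    healthy: List[float],
    w_diseased: Optional[List[float]] = None,
    w_healthy: Optional[List[float]] = None,
) -> float:
    """Weighted proportion of concordant (diseased, healthy) pairs.

    With all weights equal to 1 this is the usual Mann-Whitney form of
    the empirical AUC.
    """
    if w_diseased is None:
        w_diseased = [1.0] * len(diseased)
    if w_healthy is None:
        w_healthy = [1.0] * len(healthy)
    numerator = 0.0
    denominator = 0.0
    for x, wx in zip(diseased, w_diseased):
        for y, wy in zip(healthy, w_healthy):
            if x > y:
                score = 1.0
            elif x == y:
                score = 0.5   # ties count as half a concordant pair
            else:
                score = 0.0
            numerator += wx * wy * score
            denominator += wx * wy
    return numerator / denominator


# Hypothetical observed biomarker values.
auc_unweighted = weighted_auc([3.0, 4.0, 5.0], [1.0, 2.0, 3.0])
# Up-weight the first diseased subject, e.g. observed with probability 0.5.
auc_weighted = weighted_auc([3.0, 4.0, 5.0], [1.0, 2.0, 3.0], w_diseased=[2.0, 1.0, 1.0])
```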

Zimmerman DL. Biometrics, 2008, 64(1): 262-270
Summary. The estimation of spatial intensity is an important inference problem in spatial epidemiologic studies. A standard data assimilation component of these studies is the assignment of a geocode, that is, point-level spatial coordinates, to the address of each subject in the study population. Unfortunately, when geocoding is performed by the standard automated method of street-segment matching to a georeferenced road file and subsequent interpolation, it is rarely completely successful. Typically, 10–30% of the addresses in the study population, and even higher percentages in particular subgroups, fail to geocode, potentially leading to a selection bias, called geographic bias, and an inefficient analysis. Missing-data methods could be considered for analyzing such data; however, because there is almost always some geographic information coarser than a point (e.g., a Zip code) observed for the addresses that fail to geocode, a coarsened-data analysis is more appropriate. This article develops methodology for estimating spatial intensity from coarsened geocoded data. Both nonparametric (kernel smoothing) and likelihood-based estimation procedures are considered. Substantial improvements in the estimation quality of coarsened-data analyses relative to analyses of only the observations that geocode are demonstrated via simulation and an example from a rural health study in Iowa.

Pattern-mixture models with proper time dependence   Cited by: 1 (self-citations: 0, citations by others: 1)

This article investigates an augmented inverse selection probability weighted estimator for Cox regression parameter estimation when covariate variables are incomplete. This estimator extends the Horvitz and Thompson (1952, Journal of the American Statistical Association 47, 663-685) weighted estimator. This estimator is doubly robust because it is consistent as long as either the selection probability model or the joint distribution of covariates is correctly specified. The augmentation term of the estimating equation depends on the baseline cumulative hazard and on a conditional distribution that can be implemented by using an EM-type algorithm. This method is compared with some previously proposed estimators via simulation studies. The method is applied to a real example.

Dewanji A, Sengupta D. Biometrics, 2003, 59(4): 1063-1070
In competing risks data, missing failure types (causes) is a very common phenomenon. In this work, we consider a general missing pattern in which, if a failure type is not observed, one observes a set of possible types containing the true type, along with the failure time. We first consider maximum likelihood estimation with missing-at-random assumption via the expectation maximization (EM) algorithm. We then propose a Nelson-Aalen type estimator for situations when certain information on the conditional probability of the true type given a set of possible failure types is available from the experimentalists. This is based on a least-squares type method using the relationships between hazards for different types and hazards for different combinations of missing types. We conduct a simulation study to investigate the performance of this method, which indicates that bias may be small, even for high proportion of missing data, for sufficiently large number of observations. The estimates are somewhat sensitive to misspecification of the conditional probabilities of the true types when the missing proportion is high. We also consider an example from an animal experiment to illustrate our methodology.

The problem of dropout is a common one in longitudinal studies. One usually assumes for the analysis that dropout is at random. There are some tests to investigate this assumption. But these tests depend on normally distributed data or lack power, cf. Listing and Schlittgen (1998). We here propose an overall test which combines several Wilcoxon rank sum tests. The alternative hypothesis states that there is a tendency for larger (smaller) values of the target variable the last time the probands show up. The test is applicable with many ties also. It proves to perform well, compared to the test developed for normally distributed data, as well as to a test for completely missing at random which is proposed by Little (1988). An application to real data is given too.
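
The building block of such an overall test is a single Wilcoxon rank sum comparison, here between the last observed values of subjects who then drop out and the values of subjects who stay at the same visit. The sketch below uses midranks and the tie-corrected normal approximation so that it remains usable with ties. How the paper combines several such tests is not described in the abstract and is not attempted here; the function names and data are hypothetical.

```python
from collections import Counter
from math import sqrt
from typing import List, Tuple


def midranks(values: List[float]) -> List[float]:
    """Rank values from 1..N, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = average
        i = j + 1
    return ranks


def rank_sum_test(group1: List[float], group2: List[float]) -> Tuple[float, float]:
    """Return (rank sum of group1, tie-corrected normal approximation z)."""
    n1, n2 = len(group1), len(group2)
    n = n1 + n2
    pooled = group1 + group2
    w = sum(midranks(pooled)[:n1])
    expected = n1 * (n + 1) / 2
    tie_term = sum(t**3 - t for t in Counter(pooled).values())
    var = n1 * n2 / 12 * ((n + 1) - tie_term / (n * (n - 1)))
    if var == 0:
        raise ValueError("all values are tied; the statistic is undefined")
    return w, (w - expected) / sqrt(var)


# Hypothetical values at one visit: subjects about to drop out vs. stayers.
dropouts_last = [5.0, 7.0, 7.0]
stayers = [1.0, 2.0, 7.0, 3.0]
w_stat, z_stat = rank_sum_test(dropouts_last, stayers)
```

A large positive `z_stat` would point toward larger values at the last visit before dropout, which is the direction of the alternative hypothesis described in the abstract.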
