Found 20 similar articles (search time: 0 ms)
1.
Longitudinal clinical trials often collect long sequences of binary data. Our application is a recent clinical trial in opiate addicts that examined the effect of a new treatment on repeated binary urine tests to assess opiate use over an extended follow-up. The dataset had two sources of missingness: dropout and intermittent missing observations. The primary endpoint of the study was comparing the marginal probability of a positive urine test over follow-up across treatment arms. We present a latent autoregressive model for longitudinal binary data subject to informative missingness. In this model, a Gaussian autoregressive process is shared between the binary response and missing-data processes, thereby inducing informative missingness. Our approach extends the work of others who have developed models that link the various processes through a shared random effect but do not allow for autocorrelation. We discuss parameter estimation using Monte Carlo EM and demonstrate through simulations that incorporating within-subject autocorrelation through a latent autoregressive process can be very important when longitudinal binary data is subject to informative missingness. We illustrate our new methodology using the opiate clinical trial data.
2.
One of the objectives in the Northern Manhattan Stroke Study is to investigate the impact of stroke subtype on the functional status 2 years after the first ischemic stroke. A challenge in this analysis is that the functional status at 2 years after stroke is not completely observed. In this paper, we propose a method to handle nonignorably missing binary functional status when the baseline value and the covariates are completely observed. The proposed method consists of fitting four separate binary regression models: for the baseline outcome, the outcome 2 years after the stroke, the product of the previous two, and finally, the missingness indicator. We then conduct a sensitivity analysis by varying the assumptions about the third and the fourth binary regression models. Our method belongs to an imputation paradigm and can be an alternative to the weighting method of Rotnitzky and Robins (1997, Statistics in Medicine 16, 81-102). A jackknife variance estimate is proposed for the variance of the resulting estimate. The proposed analysis can be implemented using statistical software such as SAS.
3.
Cho Paik M. Biometrics, 2004, 60(2): 306-314
Matched case-control data analysis is often challenged by a missing covariate problem, the mishandling of which could cause bias or inefficiency. Satten and Carroll (2000, Biometrics 56, 384-388) and other authors have proposed methods to handle missing covariates when the probability of missingness depends on the observed data, i.e., when data are missing at random. In this article, we propose a conditional likelihood method to handle the case when the probability of missingness depends on the unobserved covariate, i.e., when data are nonignorably missing. When the missing covariate is binary, the proposed method can be implemented using standard software. Using the Northern Manhattan Stroke Study data, we illustrate the method and discuss how sensitivity analysis can be conducted.
4.
Longitudinal studies frequently incur outcome-related nonresponse. In this article, we discuss a likelihood-based method for analyzing repeated binary responses when the mechanism leading to missing response data depends on unobserved responses. We describe a pattern-mixture model for the joint distribution of the vector of binary responses and the indicators of nonresponse patterns. Specifically, we propose an extension of the multivariate logistic model to handle nonignorable nonresponse. This method yields estimates of the mean parameters under a variety of assumptions regarding the distribution of the unobserved responses. Because these models make unverifiable identifying assumptions, we recommend conducting sensitivity analyses that provide a range of inferences, each of which is valid under different assumptions for nonresponse. The methodology is illustrated using data from a longitudinal study of obesity in children.
5.
Summary: With advances in modern medicine and clinical diagnosis, case–control data with characterization of finer subtypes of cases are often available. In matched case–control studies, missingness in exposure values often leads to deletion of the entire stratum, and thus entails a significant loss in information. When subtypes of cases are treated as categorical outcomes, the data are further stratified and deletion of observations becomes even more expensive in terms of precision of the category-specific odds-ratio parameters, especially using the multinomial logit model. The stereotype regression model for categorical responses lies intermediate between the proportional odds and the multinomial or baseline category logit model. The use of this class of models has been limited as the structure of the model implies certain inferential challenges with nonidentifiability and nonlinearity in the parameters. We illustrate how to handle missing data in matched case–control studies with finer disease subclassification within the cases under a stereotype regression model. We present both a Monte Carlo based full Bayesian approach and an expectation/conditional maximization algorithm for the estimation of model parameters in the presence of a completely general missingness mechanism. We illustrate our methods by using data from an ongoing matched case–control study of colorectal cancer. Simulation results are presented under various missing data mechanisms and departures from modeling assumptions.
6.
In certain diseases, outcome is the number of morbid events over the course of follow-up. In epilepsy, e.g., daily seizure counts are often used to reflect disease severity. Follow-up of patients in clinical trials of such diseases is often subject to censoring due to patients dying or dropping out. If the sicker patients tend to be censored in such trials, estimates of the treatment effect that do not incorporate the censoring process may be misleading. We extend the shared random effects approach of Wu and Carroll (1988, Biometrics 44, 175-188) to the setting of repeated counts of events. Three strategies are developed. The first is a likelihood-based approach for jointly modeling the count and censoring processes. A shared random effect is incorporated to introduce dependence between the two processes. The second is a likelihood-based approach that conditions on the dropout times in adjusting for informative dropout. The third is a generalized estimating equations (GEE) approach, which also conditions on the dropout times but makes fewer assumptions about the distribution of the count process. Estimation procedures for each of the approaches are discussed, and the approaches are applied to data from an epilepsy clinical trial. A simulation study is also conducted to compare the various approaches. Through analyses and simulations, we demonstrate the flexibility of the likelihood-based conditional model for analyzing data from the epilepsy trial.
7.
We study publication bias in meta-analysis by supposing there is a population (y, sigma) of studies which give treatment effect estimates y approximately N(theta, sigma^2). A selection function describes the probability that each study is selected for review. The overall estimate of theta depends on the studies selected, and hence on the (unknown) selection function. Our previous paper, Copas and Jackson (2004, Biometrics 60, 146-153), studied the maximum bias over all possible selection functions which satisfy the weak condition that large studies (small sigma) are as likely, or more likely, to be selected than small studies (large sigma). This led to a worst-case sensitivity analysis, controlling for the overall fraction of studies selected. However, no account was taken of the effect of selection on the uncertainty in estimation. This article extends the previous work by finding corresponding confidence intervals and P-values, and hence a new sensitivity analysis for publication bias. Two examples are discussed.
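The setting in the abstract above can be illustrated with a small simulation. This is only a sketch under stated assumptions, not the authors' method: the selection rule (always keep "significant" studies, keep the rest with probability `p_nonsig`), the range of sigma, and all function and parameter names are hypothetical choices made for illustration. For a positive theta, this rule makes large studies (small sigma) at least as likely to be selected as small ones.

```python
import random


def pooled(studies):
    """Inverse-variance weighted mean of (y, sigma) pairs."""
    weights = [1.0 / sigma ** 2 for _, sigma in studies]
    total = sum(w * y for w, (y, _) in zip(weights, studies))
    return total / sum(weights)


def simulate_selection_bias(theta=0.2, n_studies=20000, p_nonsig=0.3, seed=0):
    """Compare pooled estimates before and after a hypothetical selection step.

    Each study reports y ~ N(theta, sigma^2). The selection function is an
    assumption for illustration: studies with y / sigma > 1.96 are always
    selected, all others are selected with probability p_nonsig.
    """
    rng = random.Random(seed)
    all_studies, selected = [], []
    for _ in range(n_studies):
        sigma = rng.uniform(0.05, 0.5)   # assumed range of standard errors
        y = rng.gauss(theta, sigma)      # study-level effect estimate
        all_studies.append((y, sigma))
        significant = y / sigma > 1.96
        if significant or rng.random() < p_nonsig:
            selected.append((y, sigma))
    fraction_selected = len(selected) / n_studies
    return pooled(all_studies), pooled(selected), fraction_selected


if __name__ == "__main__":
    est_all, est_selected, fraction = simulate_selection_bias()
    print(f"pooled estimate, all studies:      {est_all:.4f}")
    print(f"pooled estimate, selected studies: {est_selected:.4f}")
    print(f"fraction of studies selected:      {fraction:.3f}")
```

Under these assumptions the pooled estimate from the selected studies sits above the estimate from all studies. The gap depends on the overall fraction selected, which is the quantity the worst-case sensitivity analysis in the abstract controls for.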
8.
9.
Thijs H, Molenberghs G, Michiels B, Verbeke G, Curran D. Biostatistics (Oxford, England), 2002, 3(2): 245-265
Whereas most models for incomplete longitudinal data are formulated within the selection model framework, pattern-mixture models have gained considerable interest in recent years (Little, 1993, 1994). In this paper, we outline several strategies to fit pattern-mixture models, including the so-called identifying restrictions strategy. Multiple imputation is used to apply this strategy to realistic settings, such as quality-of-life data from a longitudinal study on metastatic breast cancer patients.
10.
Recently, a lot of concern has been raised about assumptions needed in order to fit statistical models to incomplete multivariate and longitudinal data. In response, research efforts are being devoted to the development of tools that assess the sensitivity of such models to often strong but always, at least in part, unverifiable assumptions. Many efforts have been devoted to longitudinal data, primarily in the selection model context, although some researchers have expressed interest in the pattern-mixture setting as well. A promising tool, proposed by Verbeke et al. (2001, Biometrics 57, 43-50), is based on local influence (Cook, 1986, Journal of the Royal Statistical Society, Series B 48, 133-169). These authors considered the Diggle and Kenward (1994, Applied Statistics 43, 49-93) model, which is based on a selection model, integrating a linear mixed model for continuous outcomes with logistic regression for dropout. In this article, we show that a similar idea can be developed for multivariate and longitudinal binary data, subject to nonmonotone missingness. We focus on the model proposed by Baker, Rosenberger, and DerSimonian (1992, Statistics in Medicine 11, 643-657). The original model is first extended to allow for (possibly continuous) covariates, whereafter a local influence strategy is developed to support the model-building process. The model is able to deal with nonmonotone missingness but has some limitations as well, stemming from the conditional nature of the model parameters. Some analytical insight is provided into the behavior of the local influence graphs.
11.
This article analyzes quality of life (QOL) data from an Eastern Cooperative Oncology Group (ECOG) melanoma trial that compared treatment with ganglioside vaccination to treatment with high-dose interferon. The analysis of this data set is challenging due to several difficulties, namely, nonignorable missing longitudinal responses and baseline covariates. Hence, we propose a selection model for estimating parameters in the normal random effects model with nonignorable missing responses and covariates. Parameters are estimated via maximum likelihood using the Gibbs sampler and a Monte Carlo expectation maximization (EM) algorithm. Standard errors are calculated using the bootstrap. The method allows for nonmonotone patterns of missing data in both the response variable and the covariates. We model the missing data mechanism and the missing covariate distribution via a sequence of one-dimensional conditional distributions, allowing the missing covariates to be either categorical or continuous, as well as time-varying. We apply the proposed approach to the ECOG quality-of-life data and conduct a small simulation study evaluating the performance of the maximum likelihood estimates. Our results indicate that a patient treated with the vaccine has a higher QOL score on average at a given time point than a patient treated with high-dose interferon.
12.
In randomized studies with missing outcomes, non-identifiable assumptions are required to hold for valid data analysis. As a result, statisticians have been advocating the use of sensitivity analysis to evaluate the effect of varying assumptions on study conclusions. While this approach may be useful in assessing the sensitivity of treatment comparisons to missing data assumptions, it may be dissatisfying to some researchers/decision makers because a single summary is not provided. In this paper, we present a fully Bayesian methodology that allows the investigator to draw a 'single' conclusion by formally incorporating prior beliefs about non-identifiable, yet interpretable, selection bias parameters. Our Bayesian model provides robustness to prior specification of the distributional form of the continuous outcomes.
13.
Summary. We study quantile regression (QR) for longitudinal measurements with nonignorable intermittent missing data and dropout. Compared to conventional mean regression, quantile regression can characterize the entire conditional distribution of the outcome variable, and is more robust to outliers and misspecification of the error distribution. We account for the within-subject correlation by introducing an ℓ2 penalty in the usual QR check function to shrink the subject-specific intercepts and slopes toward the common population values. The informative missing data are assumed to be related to the longitudinal outcome process through the shared latent random effects. We assess the performance of the proposed method using simulation studies, and illustrate it with data from a pediatric AIDS clinical trial.
14.
Multiple imputation has become a widely accepted technique to deal with the problem of incomplete data. Typically, imputation of missing values and the statistical analysis are performed separately. Therefore, the imputation model has to be consistent with the analysis model. If the data are analyzed with a mixture model, the parameter estimates are usually obtained iteratively. Thus, if the data are missing not at random, parameter estimation and treatment of missingness should be combined. We solve both problems by simultaneously imputing values using the data augmentation method and estimating parameters using the EM algorithm. This iterative procedure ensures that the missing values are properly imputed given the current parameter estimates. Properties of the parameter estimates were investigated in a simulation study. The results are illustrated using data from the National Health and Nutrition Examination Survey.
15.
Using data from 145,007 adults in the Disability Supplement to the National Health Interview Survey, we investigated the effect of balance difficulties on frequent depression after controlling for age, gender, race, and other baseline health status information. There were two major complications: (i) 80% of subjects were missing data on depression and the missing-data mechanism was likely related to depression, and (ii) the data arose from a complex sample survey. To adjust for (i) we investigated three classes of models: missingness in depression, missingness in depression and balance, and missingness in depression with an auxiliary variable. To adjust for (ii) we developed the first linearization variance formula for nonignorable missing-data models. Our sensitivity analysis was based on fitting a range of ignorable missing-data models along with nonignorable missing-data models that added one or two parameters. All nonignorable missing-data models that we considered fit the data substantially better than their ignorable missing-data counterparts. Under an ignorable missing-data mechanism, the odds ratio for the association between balance and depression was 2.0 with a 95% CI of (1.8, 2.2). Under 29 of the 30 selected nonignorable missing-data models, the odds ratios ranged from 2.7 with 95% CI of (2.3, 3.1) to 4.2 with 95% CI of (3.9, 4.6). Under one nonignorable missing-data model, the odds ratio was 7.4 with 95% CI of (6.3, 8.6). This is the first analysis to find a strong association between balance difficulties and frequent depression.
16.
Summary. The present article deals with informative missing (IM) exposure data in matched case–control studies. When the missingness mechanism depends on the unobserved exposure values, modeling the missing data mechanism is inevitable. Therefore, a full likelihood-based approach for handling IM data has been proposed by positing a model for selection probability, and a parametric model for the partially missing exposure variable among the control population along with a disease risk model. We develop an EM algorithm to estimate the model parameters. Three special cases: (a) binary exposure variable, (b) normally distributed exposure variable, and (c) lognormally distributed exposure variable are discussed in detail. The method is illustrated by analyzing a real matched case–control data set with a missing exposure variable. The performance of the proposed method is evaluated through simulation studies, and the robustness of the proposed method for violation of different types of model assumptions has been considered.
17.
Pattern-mixture models with proper time dependence (total citations: 1; self-citations: 0; citations by others: 1)
18.
Paul S. Albert, Lisa M. McShane, Joanna H. Shih, The U.S. National Cancer Institute Bladder Tumor Marker Network. Biometrics, 2001, 57(2): 610-619
Improved characterization of tumors for purposes of guiding treatment decisions for cancer patients will require that accurate and reproducible assays be developed for a variety of tumor markers. No gold standards exist for most tumor marker assays. Therefore, estimates of assay sensitivity and specificity cannot be obtained unless a latent class model-based approach is used. Our goal in this article is to estimate sensitivity and specificity for p53 immunohistochemical assays of bladder tumors using data from a reproducibility study conducted by the National Cancer Institute Bladder Tumor Marker Network. We review latent class modeling approaches proposed by previous authors, and we find that many of these approaches impose assumptions about specimen heterogeneity that are not consistent with the biology of bladder tumors. We present flexible mixture model alternatives that are biologically plausible for our example, and we use them to estimate sensitivity and specificity for our p53 assay example. These mixture models are shown to offer an improvement over other methods in a variety of settings, but we caution that, in general, care must be taken in applying latent class models.
19.
Pattern mixture models are frequently used to analyze longitudinal data where missingness is induced by dropout. For measured responses, it is typical to model the complete data as a mixture of multivariate normal distributions, where mixing is done over the dropout distribution. Fully parameterized pattern mixture models are not identified by incomplete data; Little (1993, Journal of the American Statistical Association 88, 125-134) has characterized several identifying restrictions that can be used for model fitting. We propose a reparameterization of the pattern mixture model that allows investigation of sensitivity to assumptions about nonidentified parameters in both the mean and variance, allows consideration of a wide range of nonignorable missing-data mechanisms, and has intuitive appeal for eliciting plausible missing-data mechanisms. The parameterization makes clear an advantage of pattern mixture models over parametric selection models, namely that the missing-data mechanism can be varied without affecting the marginal distribution of the observed data. To illustrate the utility of the new parameterization, we analyze data from a recent clinical trial of growth hormone for maintaining muscle strength in the elderly. Dropout occurs at a high rate and is potentially informative. We undertake a detailed sensitivity analysis to understand the impact of missing-data assumptions on the inference about the effects of growth hormone on muscle strength.
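The idea that a pattern mixture analysis varies a nonidentified parameter while leaving the observed-data fit untouched can be sketched with a generic mean-shift ("delta") adjustment. This is a deliberately simplified illustration, not the reparameterization proposed in the abstract above: it assumes a single end-of-study mean, two patterns (completers and dropouts), and a hypothetical offset `delta` for the unobserved dropout mean. All names and numbers are invented for the example.

```python
def pattern_mixture_mean(completer_mean, dropout_fraction, delta):
    """Marginal mean as a mixture over two dropout patterns.

    completer_mean and dropout_fraction are estimable from observed data.
    delta is NOT identified by the data: it is the assumed difference
    between the dropouts' unobserved mean and the completers' mean.
    """
    dropout_mean = completer_mean + delta
    return (1.0 - dropout_fraction) * completer_mean + dropout_fraction * dropout_mean


def sensitivity_table(mean_treat, mean_ctrl, drop_treat, drop_ctrl, deltas):
    """Treatment effect on the marginal mean for each assumed delta."""
    rows = []
    for delta in deltas:
        effect = (pattern_mixture_mean(mean_treat, drop_treat, delta)
                  - pattern_mixture_mean(mean_ctrl, drop_ctrl, delta))
        rows.append((delta, effect))
    return rows


if __name__ == "__main__":
    # Hypothetical inputs: completer means and dropout rates per arm.
    for delta, effect in sensitivity_table(12.0, 10.0, 0.4, 0.2,
                                           deltas=[-2.0, -1.0, 0.0, 1.0]):
        print(f"delta = {delta:+.1f}  ->  treatment effect = {effect:.2f}")
```

Setting `delta = 0` corresponds to assuming dropouts resemble completers. Moving `delta` away from zero changes the inferred treatment effect whenever the dropout rates differ between arms, while the observed quantities passed in stay fixed.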
20.
The analysis of longitudinal repeated measures data is frequently complicated by missing data due to informative dropout. We describe a mixture model for the joint distribution for longitudinal repeated measures, where the dropout distribution may be continuous and the dependence between response and dropout is semiparametric. Specifically, we assume that responses follow a varying coefficient random effects model conditional on dropout time, where the regression coefficients depend on dropout time through unspecified nonparametric functions that are estimated using step functions when dropout time is discrete (e.g., for panel data) and using smoothing splines when dropout time is continuous. Inference under the proposed semiparametric model is hence more robust than the parametric conditional linear model. The unconditional distribution of the repeated measures is a mixture over the dropout distribution. We show that estimation in the semiparametric varying coefficient mixture model can proceed by fitting a parametric mixed effects model and can be carried out on standard software platforms such as SAS. The model is used to analyze data from a recent AIDS clinical trial and its performance is evaluated using simulations.