Similar articles
20 similar articles found (search time: 406 ms)
1.
Noncompliance is a common problem in experiments involving randomized assignment of treatments, and standard analyses based on intention-to-treat or treatment received have limitations. An attractive alternative is to estimate the Complier-Average Causal Effect (CACE), which is the average treatment effect for the subpopulation of subjects who would comply under either treatment (Angrist, Imbens, and Rubin, 1996, Journal of the American Statistical Association 91, 444-472). We propose an extended general location model to estimate the CACE from data with noncompliance and missing data in the outcome and in baseline covariates. Models for both continuous and categorical outcomes and ignorable and latent ignorable (Frangakis and Rubin, 1999, Biometrika 86, 365-379) missing-data mechanisms are developed. Inferences for the models are based on the EM algorithm and Bayesian MCMC methods. We present results from simulations that investigate sensitivity to model assumptions and the influence of missing-data mechanism. We also apply the method to the data from a job search intervention for unemployed workers.
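
To make the CACE concrete, here is a minimal sketch of the standard Wald (instrumental-variable) estimator: the intention-to-treat effect on the outcome divided by the difference in treatment uptake between arms. This is not the extended general location model of the abstract; it assumes fully observed data, randomization, the exclusion restriction, and monotonicity. The function name `cace_wald` and the toy data are hypothetical.

```python
from statistics import mean

def cace_wald(z, d, y):
    """Wald estimate of the complier-average causal effect.

    z: randomized assignment (0/1)
    d: treatment actually received (0/1)
    y: outcome
    Assumes no missing data, exclusion restriction, and monotonicity.
    """
    y1 = [yi for zi, yi in zip(z, y) if zi == 1]
    y0 = [yi for zi, yi in zip(z, y) if zi == 0]
    d1 = [di for zi, di in zip(z, d) if zi == 1]
    d0 = [di for zi, di in zip(z, d) if zi == 0]
    itt_y = mean(y1) - mean(y0)  # intention-to-treat effect on the outcome
    itt_d = mean(d1) - mean(d0)  # estimated proportion of compliers
    if itt_d == 0:
        raise ValueError("treatment uptake does not differ between arms")
    return itt_y / itt_d

# Hypothetical toy data: half of the assigned arm actually takes treatment.
z = [1, 1, 1, 1, 0, 0, 0, 0]
d = [1, 1, 0, 0, 0, 0, 0, 0]
y = [5.0, 7.0, 2.0, 2.0, 2.0, 2.0, 3.0, 1.0]
estimate = cace_wald(z, d, y)
```

In the toy data the intention-to-treat difference in mean outcome is 2 and the uptake difference is 0.5, so the ratio attributes the whole effect to the compliers.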

2.
Chen H, Geng Z, Zhou XH. Biometrics, 2009, 65(3): 675-682
Summary. In this article, we first study parameter identifiability in randomized clinical trials with noncompliance and missing outcomes. We show that under certain conditions the parameters of interest are identifiable even under different types of completely nonignorable missing data: that is, the missing mechanism depends on the outcome. We then derive their maximum likelihood and moment estimators and evaluate their finite-sample properties in simulation studies in terms of bias, efficiency, and robustness. Our sensitivity analysis shows that the assumed nonignorable missing-data model has an important impact on the estimated complier average causal effect (CACE) parameter. Our new method provides some new and useful alternative nonignorable missing-data models over the existing latent ignorable model, which guarantees parameter identifiability, for estimating the CACE in a randomized clinical trial with noncompliance and missing data.

3.
This paper deals with a Cox proportional hazards regression model, where some covariates of interest are randomly right‐censored. While methods for censored outcomes have become ubiquitous in the literature, methods for censored covariates have thus far received little attention and, for the most part, dealt with the issue of limit‐of‐detection. For randomly censored covariates, an often‐used method is the inefficient complete‐case analysis (CCA), which consists of deleting censored observations in the data analysis. When censoring is not completely independent, the CCA leads to biased and spurious results. Methods for missing covariate data, including type I and type II covariate censoring as well as limit‐of‐detection, do not readily apply due to the fundamentally different nature of randomly censored covariates. We develop a novel method for censored covariates using a conditional mean imputation based on either Kaplan–Meier estimates or a Cox proportional hazards model to estimate the effects of these covariates on a time‐to‐event outcome. We evaluate the performance of the proposed method through simulation studies and show that it provides good bias reduction and statistical efficiency. Finally, we illustrate the method using data from the Framingham Heart Study to assess the relationship between offspring and parental age of onset of cardiovascular events.

4.
Roy J, Lin X. Biometrics, 2005, 61(3): 837-846
We consider estimation in generalized linear mixed models (GLMM) for longitudinal data with informative dropouts. At the time a unit drops out, time-varying covariates are often unobserved in addition to the missing outcome. However, existing informative dropout models typically require covariates to be completely observed. This assumption is not realistic in the presence of time-varying covariates. In this article, we first study the asymptotic bias that would result from applying existing methods, where missing time-varying covariates are handled using naive approaches, which include: (1) using only baseline values; (2) carrying forward the last observation; and (3) assuming the missing data are ignorable. Our asymptotic bias analysis shows that these naive approaches yield inconsistent estimators of model parameters. We next propose a selection/transition model that allows covariates to be missing in addition to the outcome variable at the time of dropout. The EM algorithm is used for inference in the proposed model. Data from a longitudinal study of human immunodeficiency virus (HIV)-infected women are used to illustrate the methodology.

5.
Incomplete covariate data are a common occurrence in studies in which the outcome is survival time. Further, studies in the health sciences often give rise to correlated, possibly censored, survival data. With no missing covariate data, if the marginal distributions of the correlated survival times follow a given parametric model, then the estimates using the maximum likelihood estimating equations, naively treating the correlated survival times as independent, give consistent estimates of the relative risk parameters (Lipsitz et al., 1994, 50, 842-846). Now, suppose that some observations within a cluster have some missing covariates. We show in this paper that if one naively treats observations within a cluster as independent, one can still use the maximum likelihood estimating equations to obtain consistent estimates of the relative risk parameters. This method requires the estimation of the parameters of the distribution of the covariates. We present results from a clinical trial (Lipsitz and Ibrahim, 1996b, 2, 5-14) with five covariates, four of which have some missing values. In the trial, the clusters are the hospitals in which the patients were treated.

6.
Randomized trials with dropouts or censored data and discrete time-to-event type outcomes are frequently analyzed using the Kaplan-Meier or product limit (PL) estimation method. However, the PL method assumes that the censoring mechanism is noninformative, and when this assumption is violated, the inferences may not be valid. We propose an expanded PL method using a Bayesian framework to incorporate an informative censoring mechanism and perform sensitivity analysis on estimates of the cumulative incidence curves. The expanded method uses a model, which can be viewed as a pattern mixture model, where the odds of having an event during the follow-up interval $(t_{k-1}, t_k]$, conditional on being at risk at $t_{k-1}$, differ across the patterns of missing data. The sensitivity parameters relate the odds of an event for subjects from a missing-data pattern to the odds for the observed subjects in each interval. The large number of sensitivity parameters is reduced by treating them as random and assuming that they follow a log-normal distribution with prespecified mean and variance. We then vary the mean and variance to explore the sensitivity of inferences. The missing at random (MAR) mechanism is a special case of the expanded model, thus allowing exploration of the sensitivity of inferences to departures from the MAR assumption. The proposed approach is applied to data from the TRial Of Preventing HYpertension.
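
For reference, the standard product-limit estimator that the abstract takes as its starting point can be sketched as follows. This is the ordinary Kaplan-Meier estimator under noninformative censoring, not the expanded Bayesian method; the function name `kaplan_meier` and the toy data are hypothetical.

```python
def kaplan_meier(times, events):
    """Product-limit estimate of the survival function.

    times:  follow-up time for each subject
    events: 1 if the event was observed, 0 if the subject was censored
    Returns a list of (t, S(t)) pairs, one per distinct event time.
    Subjects censored at time t are counted as at risk at t.
    """
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    surv = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        n_events = 0
        n_removed = 0
        # Collect every subject whose follow-up ends exactly at time t.
        while i < len(data) and data[i][0] == t:
            n_events += data[i][1]
            n_removed += 1
            i += 1
        if n_events > 0:
            # Multiply by the conditional probability of surviving past t.
            surv *= 1.0 - n_events / n_at_risk
            curve.append((t, surv))
        n_at_risk -= n_removed
    return curve

# Hypothetical toy data: five subjects, two of them censored.
curve = kaplan_meier([1, 2, 2, 3, 4], [1, 1, 0, 1, 0])
```

Each factor is one minus the observed hazard in that interval; the sensitivity analysis described above amounts to letting the odds behind those factors differ for subjects whose outcomes are missing.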

7.

Longitudinal studies with binary outcomes characterized by informative right censoring are commonly encountered in clinical, basic, behavioral, and health sciences. Approaches developed to analyze data with binary outcomes were mainly tailored to clustered or longitudinal data that are missing completely at random or missing at random. Studies that focused on informative right censoring with binary outcomes are characterized by their inherent computational complexity and difficulty of implementation. Here we present a new maximum likelihood-based approach with repeated binary measures modeled in a generalized linear mixed model as a function of time and other covariates. The longitudinal binary outcome and the censoring process, determined by the number of times a subject is observed, share latent random variables (random intercept and slope), where these subject-specific random effects are common to both models. A simulation study and sensitivity analysis were conducted to test the model under different assumptions and censoring settings. Our results showed accuracy of the estimates generated under this model when censoring was fully informative or partially informative with dependence on the slopes. The model was successfully applied to a cohort of renal transplant patients, with blood urea nitrogen as a binary outcome measured over time to indicate normal and abnormal kidney function until the onset of graft rejection, which resulted in informative right censoring. In addition to its novelty and accuracy, a key advantage of the proposed model is its ease of implementation with available analytical tools and its applicability to other longitudinal datasets with informative censoring.

8.
In cohort studies the outcome is often time to a particular event, and subjects are followed at regular intervals. Periodic visits may also monitor a secondary irreversible event influencing the event of primary interest, and a significant proportion of subjects develop the secondary event over the period of follow‐up. The status of the secondary event serves as a time‐varying covariate, but is recorded only at the times of the scheduled visits, generating incomplete time‐varying covariates. While information on a typical time‐varying covariate is missing for the entire follow‐up period except the visiting times, the status of the secondary event is unavailable only between visits where the status has changed, and is thus interval‐censored. One may view the interval‐censored covariate of the secondary event status as a missing time‐varying covariate, yet missingness is partial since partial information is provided throughout the follow‐up period. The current practice of using the latest observed status produces biased estimators, and the existing missing covariate techniques cannot accommodate the special feature of missingness due to interval censoring. To handle interval‐censored covariates in the Cox proportional hazards model, we propose an available‐data estimator, a doubly robust‐type estimator, as well as the maximum likelihood estimator via the EM algorithm, and present their asymptotic properties. We also present practical approaches that are valid. We demonstrate the proposed methods using our motivating example from the Northern Manhattan Study.

9.
Missing data is a common issue in research using observational studies to investigate the effect of treatments on health outcomes. When missingness occurs only in the covariates, a simple approach is to use missing indicators to handle the partially observed covariates. The missing indicator approach has been criticized for giving biased results in outcome regression. However, recent papers have suggested that the missing indicator approach can provide unbiased results in propensity score analysis under certain assumptions. We consider assumptions under which the missing indicator approach can provide valid inferences, namely, (1) no unmeasured confounding within missingness patterns; either (2a) covariate values of patients with missing data were conditionally independent of treatment or (2b) these values were conditionally independent of outcome; and (3) the outcome model is correctly specified: specifically, the true outcome model does not include interactions between missing indicators and fully observed covariates. We prove that, under the assumptions above, the missing indicator approach with outcome regression can provide unbiased estimates of the average treatment effect. We use a simulation study to investigate the extent of bias in estimates of the treatment effect when the assumptions are violated and we illustrate our findings using data from electronic health records. In conclusion, the missing indicator approach can provide valid inferences for outcome regression, but the plausibility of its assumptions must first be considered carefully.
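
The data preparation behind the missing indicator approach can be sketched as follows: each partially observed covariate is replaced by a filled-in column plus a 0/1 indicator of missingness, and both columns then enter the regression or propensity model. The helper name `add_missing_indicator`, the use of `None` for missing values, and the fill value of 0 are assumptions for illustration, not details taken from the abstract.

```python
def add_missing_indicator(values, fill=0.0):
    """Split a partially observed covariate into two columns.

    values: covariate values, with None marking a missing entry
    fill:   constant substituted for missing entries (assumed 0 here)
    Returns (filled, indicator), where indicator[i] is 1 if values[i]
    was missing and 0 otherwise.
    """
    filled = [fill if v is None else v for v in values]
    indicator = [1 if v is None else 0 for v in values]
    return filled, indicator

# Hypothetical covariate with one missing entry.
bmi = [22.5, None, 31.0]
bmi_filled, bmi_missing = add_missing_indicator(bmi)
```

When the indicator is included as its own regressor, the choice of fill constant only shifts the indicator's coefficient; whether the resulting treatment-effect estimate is valid depends on the assumptions listed in the abstract.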

10.
We consider longitudinal studies in which the outcome observed over time is binary and the covariates of interest are categorical. With no missing responses or covariates, one specifies a multinomial model for the responses given the covariates and uses maximum likelihood to estimate the parameters. Unfortunately, incomplete data in the responses and covariates are a common occurrence in longitudinal studies. Here we assume the missing data are missing at random (Rubin, 1976, Biometrika 63, 581-592). Since all of the missing data (responses and covariates) are categorical, a useful technique for obtaining maximum likelihood parameter estimates is the EM algorithm by the method of weights proposed in Ibrahim (1990, Journal of the American Statistical Association 85, 765-769). In using the EM algorithm with missing responses and covariates, one specifies the joint distribution of the responses and covariates. Here we consider the parameters of the covariate distribution as a nuisance. In data sets where the percentage of missing data is high, the estimates of the nuisance parameters can lead to highly unstable estimates of the parameters of interest. We propose a conditional model for the covariate distribution that has several modeling advantages for the EM algorithm and provides a reduction in the number of nuisance parameters, thus providing more stable estimates in finite samples.

11.
We investigate the possible bias due to an erroneous missing at random assumption if adjusted odds ratios are estimated from incomplete covariate data using the maximum likelihood principle. A relation between complete case estimates and maximum likelihood estimates allows us to identify situations where the bias vanishes. Numerical computations demonstrate that the bias is most serious if the degree of the violation of the missing at random assumption depends on the value of the outcome variable or of the observed covariate. Implications for the analysis of prospective and retrospective studies are given.

12.
Using data from 145,007 adults in the Disability Supplement to the National Health Interview Survey, we investigated the effect of balance difficulties on frequent depression after controlling for age, gender, race, and other baseline health status information. There were two major complications: (i) 80% of subjects were missing data on depression and the missing-data mechanism was likely related to depression, and (ii) the data arose from a complex sample survey. To adjust for (i) we investigated three classes of models: missingness in depression, missingness in depression and balance, and missingness in depression with an auxiliary variable. To adjust for (ii) we developed the first linearization variance formula for nonignorable missing-data models. Our sensitivity analysis was based on fitting a range of ignorable missing-data models along with nonignorable missing-data models that added one or two parameters. All nonignorable missing-data models that we considered fit the data substantially better than their ignorable missing-data counterparts. Under an ignorable missing-data mechanism, the odds ratio for the association between balance and depression was 2.0 with a 95% CI of (1.8, 2.2). Under 29 of the 30 selected nonignorable missing-data models, the odds ratios ranged from 2.7 with 95% CI of (2.3, 3.1) to 4.2 with 95% CI of (3.9, 4.6). Under one nonignorable missing-data model, the odds ratio was 7.4 with 95% CI of (6.3, 8.6). This is the first analysis to find a strong association between balance difficulties and frequent depression.

13.
Yan W, Hu Y, Geng Z. Biometrics, 2012, 68(1): 121-128
We discuss identifiability and estimation of causal effects of a treatment in subgroups defined by a covariate that is sometimes missing due to death, which is different from a problem with outcomes censored by death. Frangakis et al. (2007, Biometrics 63, 641-662) proposed an approach for estimating the causal effects under a strong monotonicity (SM) assumption. In this article, we focus on identifiability of the joint distribution of the covariate, treatment and potential outcomes, show sufficient conditions for identifiability, and relax the SM assumption to monotonicity (M) and no-interaction (NI) assumptions. We derive expectation-maximization algorithms for finding the maximum likelihood estimates of parameters of the joint distribution under different assumptions. Further, we remove the M and NI assumptions, and prove that signs of the causal effects of a treatment in the subgroups are identifiable, which means that their bounds do not cover zero. We perform simulations and a sensitivity analysis to evaluate our approaches. Finally, we apply the approaches to the National Study on the Costs and Outcomes of Trauma Centers data, which are also analyzed by Frangakis et al. (2007) and Xie and Murphy (2007, Biometrics 63, 655-658).

14.
Cho M, Schenker N. Biometrics, 1999, 55(3): 826-833
Data obtained from studies in the health sciences often have incompletely observed covariates as well as censored outcomes. In this paper, we present methods for fitting the log-F accelerated failure time model with incomplete continuous and/or categorical time-independent covariates using the Gibbs sampler. A general location model that allows different covariance structures across cells is specified for the covariates, and ignorable missingness of the covariates is assumed. Techniques that accommodate standard assumptions of ignorable censoring as well as certain types of nonignorable censoring are developed. We compare our approach to traditional complete-case analysis in an application to data obtained from a study of melanoma. The comparison indicates that substantial gains in efficiency are possible with our approach.

15.
Zhang N, Little RJ. Biometrics, 2012, 68(3): 933-942
Summary. We consider the linear regression of outcome Y on regressors W and Z with some values of W missing, when our main interest is the effect of Z on Y, controlling for W. Three common approaches to regression with missing covariates are (i) complete‐case analysis (CC), which discards the incomplete cases; (ii) ignorable likelihood methods, which base inference on the likelihood based on the observed data, assuming the missing data are missing at random (Rubin, 1976b); and (iii) nonignorable modeling, which posits a joint distribution of the variables and missing data indicators. Another simple practical approach that has not received much theoretical attention is to drop the regressor variables containing missing values from the regression modeling (DV, for drop variables). DV does not lead to bias when either (i) the regression coefficient of W is zero or (ii) W and Z are uncorrelated. We propose a pseudo‐Bayesian approach for regression with missing covariates that compromises between the CC and DV estimates, exploiting information in the incomplete cases when the data support DV assumptions. We illustrate favorable properties of the method by simulation, and apply the proposed method to a liver cancer study. Extension of the method to more than one missing covariate is also discussed.

16.
Analysis with time-to-event data in clinical and epidemiological studies often encounters missing covariate values, and the missing at random assumption is commonly adopted, which assumes that missingness depends on the observed data, including the observed outcome, which is the minimum of survival and censoring time. However, it is conceivable that in certain settings, missingness of covariate values is related to the survival time but not to the censoring time. This is especially so when covariate missingness is related to an unmeasured variable affected by the patient's illness and prognosis factors at baseline. If this is the case, then the covariate missingness is not at random as the survival time is censored, and it creates a challenge in data analysis. In this article, we propose an approach to deal with such survival-time-dependent covariate missingness based on the well-known Cox proportional hazards model. Our method is based on inverse propensity weighting with the propensity estimated by nonparametric kernel regression. Our estimators are consistent and asymptotically normal, and their finite-sample performance is examined through simulation. An application to a real-data example is included for illustration.

17.
Spatial models for disease mapping should ideally account for covariates measured both at individual and area levels. The newly available “indiCAR” model fits the popular conditional autoregressive (CAR) model by accommodating both individual and group level covariates while adjusting for spatial correlation in the disease rates. This algorithm has been shown to be effective but assumes log‐linear associations between individual level covariates and outcome. In many studies, the relationship between individual level covariates and the outcome may be non‐log‐linear, and methods to track such nonlinearity between individual level covariate and outcome in spatial regression modeling are not well developed. In this paper, we propose a new algorithm, smooth‐indiCAR, to fit an extension to the popular conditional autoregressive model that can accommodate both linear and nonlinear individual level covariate effects while adjusting for group level covariates and spatial correlation in the disease rates. In this formulation, the effect of a continuous individual level covariate is accommodated via penalized splines. We describe a two‐step estimation procedure to obtain reliable estimates of individual and group level covariate effects, in which the individual and group level covariate effects are estimated separately. This distributed computing framework enhances its application in the Big Data domain with a large number of individual/group level covariates. We evaluate the performance of smooth‐indiCAR through simulation. Our results indicate that the smooth‐indiCAR method provides reliable estimates of all regression and random effect parameters. We illustrate our proposed methodology with an analysis of data on neutropenia admissions in New South Wales (NSW), Australia.

18.
Recurrent events data are common in experimental and observational studies. It is often of interest to estimate the effect of an intervention on the incidence rate of the recurrent events. The incidence rate difference is a useful measure of intervention effect. A weighted least squares estimator of the incidence rate difference for recurrent events was recently proposed for an additive rate model in which both the baseline incidence rate and the covariate effects were constant over time. In this article, we relax this model assumption and examine the properties of the estimator under additive and multiplicative rate models in which the baseline incidence rate and covariate effects may vary over time. We show analytically and numerically that the estimator gives an appropriate summary measure of the time‐varying covariate effects. In particular, when the underlying covariate effects are additive and time‐varying, the estimator consistently estimates the weighted average of the covariate effects over time. When the underlying covariate effects are multiplicative and time‐varying, and if there is only one binary covariate indicating the intervention status, the estimator consistently estimates the weighted average of the underlying incidence rate difference between the intervention and control groups over time. We illustrate the method with data from a randomized vaccine trial.
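
As a point of reference for the quantity being estimated, the crude incidence rate difference for a single binary intervention indicator is simply events per unit of follow-up time in one arm minus the same rate in the other. The sketch below computes that crude difference only; it is not the weighted least squares estimator discussed in the abstract, and the function names and toy data are hypothetical.

```python
def incidence_rate(event_counts, follow_up):
    """Total recurrent events divided by total follow-up time."""
    return sum(event_counts) / sum(follow_up)

def incidence_rate_difference(ev_trt, fu_trt, ev_ctl, fu_ctl):
    """Crude incidence rate difference: intervention minus control."""
    return incidence_rate(ev_trt, fu_trt) - incidence_rate(ev_ctl, fu_ctl)

# Hypothetical toy data: per-subject event counts and years of follow-up.
ird = incidence_rate_difference(
    ev_trt=[0, 1, 1], fu_trt=[2.0, 3.0, 5.0],   # 2 events over 10 years
    ev_ctl=[2, 1, 3], fu_ctl=[4.0, 2.0, 4.0],   # 6 events over 10 years
)
```

A negative value means fewer events per unit time in the intervention arm. When rates vary over time, this single number is a follow-up-weighted summary, which is the interpretation the abstract makes precise.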

19.
Wen CC, Lin CT. Biometrics, 2011, 67(3): 760-769
Statistical inference based on right-censored data for the proportional hazards (PH) model with missing covariates has received considerable attention, but interval-censored or current status data with missing covariates have not yet been investigated. Our study is partly motivated by the analysis of fracture data from the 2005 National Health Interview Survey Original Database in Taiwan, where the occurrence of fractures was interval censored and the covariate osteoporosis was not reported for all residents. We assume that the data are realized from a PH model. A semiparametric maximum likelihood estimate implemented by a hybrid algorithm is proposed to analyze current status data with missing covariates. A comparison of the performance of our method with full-cohort analysis, complete-case analysis, and surrogate analysis is made via simulation with moderate sample sizes. The fracture data are then analyzed.

20.
Horton NJ, Laird NM. Biometrics, 2001, 57(1): 34-42
This article presents a new method for maximum likelihood estimation of logistic regression models with incomplete covariate data where auxiliary information is available. This auxiliary information is extraneous to the regression model of interest but predictive of the covariate with missing data. Ibrahim (1990, Journal of the American Statistical Association 85, 765-769) provides a general method for estimating generalized linear regression models with missing covariates using the EM algorithm that is easily implemented when there is no auxiliary data. Vach (1997, Statistics in Medicine 16, 57-72) describes how the method can be extended when the outcome and auxiliary data are conditionally independent given the covariates in the model. The method allows the incorporation of auxiliary data without making the conditional independence assumption. We suggest tests of conditional independence and compare the performance of several estimators in an example concerning mental health service utilization in children. Using an artificial dataset, we compare the performance of several estimators when auxiliary data are available.
