Found 20 similar articles (search time: 15 ms)
Methods in the literature for missing covariate data in survival models have relied on the missing at random (MAR) assumption to render regression parameters identifiable. MAR means that missingness can depend on the observed exit time and on whether that exit is a failure or a censoring event. By considering ways in which missingness of covariate X could depend on the true but possibly censored failure time T and the true censoring time C, we attempt to identify missingness mechanisms which would yield MAR data. We find that, under various reasonable assumptions about how missingness might depend on T and/or C, additional strong assumptions are needed to obtain MAR. We conclude that MAR is difficult to justify in practical applications. One exception arises when missingness is independent of T, and C is independent of the value of the missing X. As alternatives to MAR, we propose two new missingness assumptions. In one, the missingness depends on T but not on C; in the other, the situation is reversed. For each, we show that the failure time model is identifiable. When missingness is independent of T, we show that the naive complete record analysis will yield a consistent estimator of the failure time distribution. When missingness is independent of C, we develop a complete record likelihood function and a corresponding estimator for parametric failure time models. We propose analyses to evaluate the plausibility of either assumption in a particular data set, and illustrate the ideas using data from the literature on this problem.
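The claim that a complete record analysis stays consistent when missingness is independent of T can be checked with a small simulation. The sketch below is only an illustration under assumptions that are not taken from the abstract: an exponential failure time model with a single binary covariate, a true log hazard ratio of 0.7, independent exponential censoring, and a missingness probability that depends on X and C but not on T. The function names `simulate` and `log_hazard_ratio` are hypothetical helpers, not the authors' method.

```python
import math
import random

TRUE_BETA = 0.7  # assumed log hazard ratio, chosen only for illustration


def simulate(n, beta, seed=0):
    """Simulate right-censored exponential data with a binary covariate X.

    The indicator that X is observed depends on X and on the censoring
    time C, but not on the failure time T (an assumed mechanism).
    """
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        x = 1 if rng.random() < 0.5 else 0
        t = rng.expovariate(math.exp(beta * x))  # failure time, baseline hazard 1
        c = rng.expovariate(0.5)                 # censoring time, independent of T and X
        time, event = min(t, c), t <= c
        p_obs = 0.8 if x == 0 else 0.4           # missingness depends on X ...
        if c > 1.0:
            p_obs *= 0.5                         # ... and on C, never on T
        observed = rng.random() < p_obs
        rows.append((time, event, x, observed))
    return rows


def log_hazard_ratio(rows):
    """Closed-form exponential MLE: log ratio of events per unit of exposure."""
    events = [0, 0]
    exposure = [0.0, 0.0]
    for time, event, x, _ in rows:
        events[x] += event
        exposure[x] += time
    return math.log((events[1] / exposure[1]) / (events[0] / exposure[0]))


rows = simulate(100_000, TRUE_BETA)
complete = [row for row in rows if row[3]]  # records with X observed

beta_full = log_hazard_ratio(rows)          # benchmark using every record
beta_cc = log_hazard_ratio(complete)        # naive complete record analysis
print(f"full data: {beta_full:.3f}  complete records: {beta_cc:.3f}")
```

In this setup both estimates should land close to the assumed value of 0.7. Making `p_obs` depend on `t` instead would be one way to explore how the complete record estimate behaves when the assumption fails.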

Analyzing incomplete longitudinal clinical trial data (cited by: 1; self-citations: 0; by others: 1)
Using standard missing data taxonomy, due to Rubin and co-workers, and simple algebraic derivations, it is argued that some simple but commonly used methods to handle incomplete longitudinal clinical trial data, such as complete case analyses and methods based on last observation carried forward, require restrictive assumptions and stand on a weaker theoretical foundation than likelihood-based methods developed under the missing at random (MAR) framework. Given the availability of flexible software for analyzing longitudinal sequences of unequal length, implementation of likelihood-based MAR analyses is not limited by computational considerations. While such analyses are valid under the comparatively weak assumption of MAR, the possibility of data missing not at random (MNAR) is difficult to rule out. It is argued, however, that MNAR analyses are themselves surrounded by problems and that, rather than ignoring MNAR analyses altogether or blindly shifting to them, their optimal place is within sensitivity analysis. The concepts developed here are illustrated using data from three clinical trials, where it is shown that the analysis method may have an impact on the conclusions of the study.

On using the Cox proportional hazards model with missing covariates (cited by: 1; self-citations: 0; by others: 1)

Inference about means from incomplete multivariate data (cited by: 1; self-citations: 0; by others: 1)
Little, R. J. A. Biometrika, 1976, 63(3): 593-604

Satten, G. A.; Carroll, R. J. Biometrics, 2000, 56(2): 384-388
We consider methods for analyzing categorical regression models when some covariates (Z) are completely observed but other covariates (X) are missing for some subjects. When data on X are missing at random (i.e., when the probability that X is observed does not depend on the value of X itself), we present a likelihood approach for the observed data that allows the same nuisance parameters to be eliminated in a conditional analysis as when data are complete. An example of a matched case-control study is used to demonstrate our approach.

Imputation, weighting, direct likelihood, and direct Bayesian inference (Rubin, 1976) are important approaches for missing data regression. Many useful semiparametric estimators have been developed for regression analysis of data with missing covariates or outcomes. It has been established that some semiparametric estimators are asymptotically equivalent, but it has not been shown that many are numerically the same. We applied some existing methods to a bladder cancer case-control study and noted that they were the same numerically when the observed covariates and outcomes are categorical. To understand the analytical background of this finding, we further show that when observed covariates and outcomes are categorical, some estimators are not only asymptotically equivalent but also actually numerically identical. That is, although their estimating equations are different, they lead numerically to exactly the same root. This includes a simple weighted estimator, an augmented weighted estimator, and a mean-score estimator. The numerical equivalence may elucidate the relationship between imputing scores and weighted estimation procedures.
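The numerical identity described above can be seen in a deliberately simplified setting. The sketch below does not reproduce the regression estimators of the abstract. It assumes the simpler problem of estimating the mean of a binary outcome Y that is missing at random given a fully observed categorical covariate Z, with observation probabilities estimated by cell frequencies. All variable names and simulated probabilities are hypothetical choices for illustration.

```python
import random
from collections import defaultdict

rng = random.Random(1)
n = 2000

# Simulated records (z, y, observed); y is used below only when observed.
data = []
for _ in range(n):
    z = rng.randrange(3)                          # categorical covariate, always observed
    y = 1 if rng.random() < 0.2 + 0.2 * z else 0  # binary outcome
    observed = rng.random() < 0.9 - 0.25 * z      # missingness depends on Z only
    data.append((z, y, observed))

n_z = defaultdict(int)    # subjects per stratum of Z
m_z = defaultdict(int)    # subjects with Y observed per stratum
sum_y = defaultdict(int)  # sum of observed Y per stratum
for z, y, observed in data:
    n_z[z] += 1
    if observed:
        m_z[z] += 1
        sum_y[z] += y

# Weighted estimator: inverse of the estimated observation probability m_z / n_z.
ipw = sum(y / (m_z[z] / n_z[z]) for z, y, observed in data if observed) / n

# Imputation-style estimator: fill each missing Y with its stratum's observed mean.
cell_mean = {z: sum_y[z] / m_z[z] for z in n_z}
imputed = sum(y if observed else cell_mean[z] for z, y, observed in data) / n

print(ipw, imputed)
```

Both expressions reduce algebraically to the stratum-size-weighted average of the observed stratum means, so they should agree up to floating-point rounding even though they are computed from different formulas.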

Inference using surrogate outcome data and a validation sample (cited by: 7; self-citations: 0; by others: 7)
Pepe, Margaret Sullivan. Biometrika, 1992, 79(2): 355-365

Premature terminations or dropouts occur often in repeated measurement experiments. A number of methods have been proposed to analyze such data but most of them assume that the censoring mechanism is, within each group, unaffected by the mechanism generating the response variables. In this paper, we propose a model for the censoring mechanism that generates dropouts. We then show how this model can be used to check whether the censoring mechanism is affected by the response variables and other covariates. Finally, the methods of the paper are applied to the “Halothane” data set.

Multivariate Student-t regression models: Pitfalls and inference (cited by: 1; self-citations: 0; by others: 1)
Fernandez, C.; Steel, M. F. J. Biometrika, 1999, 86(1): 153-167

Inference and sequential design (cited by: 1; self-citations: 0; by others: 1)

Using data from 145,007 adults in the Disability Supplement to the National Health Interview Survey, we investigated the effect of balance difficulties on frequent depression after controlling for age, gender, race, and other baseline health status information. There were two major complications: (i) 80% of subjects were missing data on depression and the missing-data mechanism was likely related to depression, and (ii) the data arose from a complex sample survey. To adjust for (i) we investigated three classes of models: missingness in depression, missingness in depression and balance, and missingness in depression with an auxiliary variable. To adjust for (ii) we developed the first linearization variance formula for nonignorable missing-data models. Our sensitivity analysis was based on fitting a range of ignorable missing-data models along with nonignorable missing-data models that added one or two parameters. All nonignorable missing-data models that we considered fit the data substantially better than their ignorable missing-data counterparts. Under an ignorable missing-data mechanism, the odds ratio for the association between balance and depression was 2.0 with a 95% CI of (1.8, 2.2). Under 29 of the 30 selected nonignorable missing-data models, the odds ratios ranged from 2.7 with 95% CI of (2.3, 3.1) to 4.2 with 95% CI of (3.9, 4.6). Under one nonignorable missing-data model, the odds ratio was 7.4 with 95% CI of (6.3, 8.6). This is the first analysis to find a strong association between balance difficulties and frequent depression.

For the predominantly southern hemisphere plant group Styphelioideae (Ericaceae), published sequence datasets of five markers are now available for all except one of the 38 recognised genera. However, several markers are highly incomplete, so missing data is problematic for producing a genus-level phylogeny. We explore the relative utility of supertree and supermatrix approaches for addressing this challenge, and examine the effects of missing data on tree topology and resolution. Although the supertree approach returned a more conservative hypothesis, overall, both supermatrix and supertree analyses concurred in the topologies they returned. Using multiple genes and a dataset of variably complete taxa, we found improved support for the monophyly and position of the tribes and for genus-level relationships. However, there was mixed support for the Richeeae tribe appearing one node basal to the Cosmelieae tribe or vice versa. It is probable that this will only be resolved through further sequencing. Our study supports previous findings that the amount of data is more critical than the completeness of the dataset in estimating well-resolved trees. Our results suggest that a “serendipitous” scaffolding approach that includes a mixture of well and poorly sequenced taxa can lead to robust phylogenetic hypotheses.
