Wang T, Wu L. Biometrics 2011, 67(4): 1452-1460
Multivariate one-sided hypothesis testing problems arise frequently in practice, and various tests have been developed for them. Multivariate data, however, often contain missing values. In this case, standard testing procedures based on complete data may not be applicable, or may perform poorly if the incomplete records are discarded. In this article, we propose several multiple imputation methods for the multivariate one-sided testing problem with missing data. Some theoretical results are presented. The proposed methods are evaluated using simulations. A real data example is presented to illustrate the methods.

Background: Population-based net survival by tumour stage at diagnosis is a key measure in cancer surveillance. Unfortunately, data on tumour stage are often missing for a non-negligible proportion of patients and the mechanism giving rise to the missingness is usually anything but completely at random. In this setting, restricting analysis to the subset of complete records typically gives biased results. Multiple imputation is a promising practical approach to the issues raised by the missing data, but its use in conjunction with the Pohar-Perme method for estimating net survival has not been formally evaluated. Methods: We performed a resampling study using colorectal cancer population-based registry data to evaluate the ability of multiple imputation, used along with the Pohar-Perme method, to deliver unbiased estimates of stage-specific net survival and recover missing stage information. We created 1000 independent data sets, each containing 5000 patients. Stage data were then made missing at random under two scenarios (30% and 50% missingness). Results: Complete records analysis showed substantial bias and poor confidence interval coverage. Across both scenarios our multiple imputation strategy virtually eliminated the bias and greatly improved confidence interval coverage. Conclusions: In the presence of missing stage data, complete records analysis often gives severely biased results. We showed that combining multiple imputation with the Pohar-Perme estimator provides a valid practical approach for the estimation of stage-specific colorectal cancer net survival. As usual, when the percentage of missing data is high the results should be interpreted cautiously and sensitivity analyses are recommended.

In clinical and epidemiological studies, information on the primary outcome of interest, that is, the disease status, is usually collected at a limited number of follow-up visits. The disease status can often only be retrieved retrospectively in individuals who are alive at follow-up, but will be missing for those who died before. Right-censoring the death cases at the last visit (ad-hoc analysis) yields biased hazard ratio estimates of a potential risk factor, and the bias can be substantial and occur in either direction. In this work, we investigate three different approaches that use the same likelihood contributions derived from an illness-death multistate model in order to more adequately estimate the hazard ratio by including the death cases in the analysis: a parametric approach, a penalized likelihood approach, and an imputation-based approach. We investigate to what extent these approaches allow for an unbiased regression analysis by evaluating their performance in simulation studies and on a real data example. In doing so, we use the full cohort with complete illness-death data as reference and artificially induce missing information due to death by setting discrete follow-up visits. Compared to an ad-hoc analysis, all considered approaches provide less biased or even unbiased results, depending on the situation studied. In the real data example, the parametric approach is seen to be too restrictive, whereas the imputation-based approach could almost reconstruct the original event history information.

In cluster randomized trials (CRTs), identifiable clusters rather than individuals are randomized to study groups. Resulting data often consist of a small number of clusters with correlated observations within a treatment group. Missing data often present a problem in the analysis of such trials, and multiple imputation (MI) has been used to create complete data sets, enabling subsequent analysis with well-established analysis methods for CRTs. We discuss strategies for accounting for clustering when multiply imputing a missing continuous outcome, focusing on estimation of the variance of group means as used in an adjusted t-test or ANOVA. These analysis procedures are congenial to (can be derived from) a mixed effects imputation model; however, this imputation procedure is not yet available in commercial statistical software. An alternative approach that is readily available and has been used in recent studies is to include fixed effects for cluster, but the impact of using this convenient method has not been studied. We show that under this imputation model the MI variance estimator is positively biased and that smaller intraclass correlations (ICCs) lead to larger overestimation of the MI variance. Analytical expressions for the bias of the variance estimator are derived in the case of data missing completely at random, and cases in which data are missing at random are illustrated through simulation. Finally, various imputation methods are applied to data from the Detroit Middle School Asthma Project, a recent school-based CRT, and differences in inference are compared.
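
As background on how an MI variance estimate is usually assembled, the following minimal sketch applies the standard Rubin's rules combination: the pooled estimate is the mean of the per-imputation estimates, and the total variance is the within-imputation variance plus an inflated between-imputation variance. The abstract does not state its formulas, so treating this as the estimator under study is an assumption; the function name `pool_rubin` and all numbers are hypothetical.

```python
from statistics import mean, variance


def pool_rubin(estimates, variances):
    """Combine m per-imputation estimates and variances with Rubin's rules."""
    m = len(estimates)
    if m < 2 or m != len(variances):
        raise ValueError("need at least two imputations with matching variances")
    q_bar = mean(estimates)      # pooled point estimate
    w = mean(variances)          # within-imputation variance
    b = variance(estimates)      # between-imputation variance (m - 1 denominator)
    t = w + (1 + 1 / m) * b      # total variance of the pooled estimate
    return q_bar, t


# Hypothetical group-mean estimates and their variances from m = 5 imputed data sets
q, t = pool_rubin([1.0, 1.2, 0.9, 1.1, 0.8], [0.04, 0.05, 0.04, 0.06, 0.05])
print(q, t)
```

An imputation model that understates the similarity of observations within a cluster tends to inflate the between-imputation term `b`, which is one way to read the positive bias the abstract reports for fixed-effects imputation.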

Liu M, Taylor JM, Belin TR. Biometrics 2000, 56(4): 1157-1163
This paper outlines a multiple imputation method for handling missing data in designed longitudinal studies. A random coefficients model is developed to accommodate incomplete multivariate continuous longitudinal data. Multivariate repeated measures are jointly modeled; specifically, an i.i.d. normal model is assumed for time-independent variables and a hierarchical random coefficients model is assumed for time-dependent variables in a regression model conditional on the time-independent variables and time, with heterogeneous error variances across variables and time points. Gibbs sampling is used to draw model parameters and for imputations of missing observations. An application to data from a study of startle reactions illustrates the model. A simulation study compares the multiple imputation procedure to the weighting approach of Robins, Rotnitzky, and Zhao (1995, Journal of the American Statistical Association 90, 106-121) that can be used to address similar data structures.

We present a method to fit a mixed effects Cox model with interval-censored data. Our proposal is based on a multiple imputation approach that uses the truncated Weibull distribution to replace the interval-censored data by imputed survival times and then uses established mixed effects Cox methods for right-censored data. Interval-censored data were encountered in a database corresponding to a recompilation of retrospective data from eight analytical treatment interruption (ATI) studies in 158 human immunodeficiency virus (HIV) positive combination antiretroviral treatment (cART) suppressed individuals. The main variable of interest is the time to viral rebound, which is defined as the increase of serum viral load (VL) to detectable levels in a patient with previously undetectable VL, as a consequence of the interruption of cART. Another aspect of interest of the analysis is to consider the fact that the data come from different studies based on different grounds and that we have several assessments on the same patient. In order to handle this extra variability, we frame the problem into a mixed effects Cox model that considers a random intercept per subject as well as correlated random intercept and slope for pre-cART VL per study. Our procedure has been implemented in R using two packages: truncdist and coxme, and can be applied to any data set that presents both interval-censored survival times and a grouped data structure that could be treated as a random effect in a regression model. The properties of the parameter estimators obtained with our proposed method are addressed through a simulation study.
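
The imputation step described above can be illustrated with a small sketch: given an interval (left, right] known to contain the event, draw a survival time from a Weibull distribution truncated to that interval by inverse-CDF sampling. This is written in Python rather than the authors' R code and does not use truncdist; the shape and scale values are hypothetical and held fixed here, whereas in practice they would be estimated from the data.

```python
import math
import random


def weibull_cdf(t, shape, scale):
    """CDF of a Weibull distribution with the given shape and scale."""
    return 1.0 - math.exp(-((t / scale) ** shape))


def draw_truncated_weibull(left, right, shape, scale, rng):
    """Draw one event time from Weibull(shape, scale) restricted to (left, right]."""
    if not 0 <= left < right or not math.isfinite(right):
        raise ValueError("need a finite interval with 0 <= left < right")
    lo = weibull_cdf(left, shape, scale)
    hi = weibull_cdf(right, shape, scale)
    u = lo + (hi - lo) * rng.random()                 # uniform on [lo, hi)
    t = scale * (-math.log1p(-u)) ** (1.0 / shape)    # invert the Weibull CDF
    return min(max(t, left), right)                   # guard against rounding


# Five imputed times for one subject whose event lies between visits at t = 2 and t = 5
rng = random.Random(2024)
imputed = [draw_truncated_weibull(2.0, 5.0, shape=1.5, scale=4.0, rng=rng)
           for _ in range(5)]
print(imputed)
```

Each imputed time would then be treated as an exactly observed event in one completed data set, so that software for right-censored data can be applied to it.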

In problems with missing or latent data, a standard approach is to first impute the unobserved data, then perform all statistical analyses on the completed dataset (the observed data together with the imputed unobserved data) using standard procedures for complete-data inference. Here, we extend this approach to model checking by demonstrating the advantages of the use of completed-data model diagnostics on imputed completed datasets. The approach is set in the theoretical framework of Bayesian posterior predictive checks (but, as with missing-data imputation, our methods of missing-data model checking can also be interpreted as "predictive inference" in a non-Bayesian context). We consider the graphical diagnostics within this framework. Advantages of the completed-data approach include: (1) One can often check model fit in terms of quantities that are of key substantive interest in a natural way, which is not always possible using observed data alone. (2) In problems with missing data, checks may be devised that do not require modeling the missingness or inclusion mechanism; the latter is useful for the analysis of ignorable but unknown data collection mechanisms, such as are often assumed in the analysis of sample surveys and observational studies. (3) In many problems with latent data, it is possible to check qualitative features of the model (for example, independence of two variables) that can be naturally formalized with the help of the latent data. We illustrate with several applied examples.

Missing data are ubiquitous in clinical and social research, and multiple imputation (MI) is increasingly the methodology of choice for practitioners. Two principal strategies for imputation have been proposed in the literature: joint modelling multiple imputation (JM-MI) and full conditional specification multiple imputation (FCS-MI). While JM-MI is arguably a preferable approach, because it involves specification of an explicit imputation model, FCS-MI is pragmatically appealing, because of its flexibility in handling different types of variables. JM-MI has developed from the multivariate normal model, and latent normal variables have been proposed as a natural way to extend this model to handle categorical variables. In this article, we evaluate the latent normal model through an extensive simulation study and an application on data from the German Breast Cancer Study Group, comparing the results with FCS-MI. We divide our investigation into four sections, focusing on (i) binary, (ii) categorical, (iii) ordinal, and (iv) count data. Using data simulated from both the latent normal model and the general location model, we find that in all but one extreme general location model setting JM-MI works very well, and sometimes outperforms FCS-MI. We conclude that the latent normal model, implemented in the R package jomo, can be used with confidence by researchers, both for single and multilevel multiple imputation.

Evidence synthesis, both qualitatively and quantitatively through meta-analysis, is central to the development of evidence-based medicine. Unfortunately, meta-analysis is often complicated by the suspicion that the available studies represent a biased subset of the evidence, possibly due to publication bias or other systematically different effects in small studies. A number of statistical methods have been proposed to address this, among which the trim-and-fill method and the Copas selection model are two of the most widely discussed. However, both methods have drawbacks: the trim-and-fill method is based on strong assumptions about the symmetry of the funnel plot; the Copas selection model is less accessible to systematic reviewers, and sometimes encounters estimation problems. In this article, we adopt a logistic selection model, and show how treatment effects can be rapidly estimated via multiple imputation. Specifically, we impute studies under a missing at random assumption, and then reweight to obtain estimates under nonrandom selection. Our proposal is computationally straightforward. It allows users to increase selection while monitoring the extent of remaining funnel plot asymmetry, and also visualize the results using the funnel plot. We illustrate our approach using a small meta-analysis of benign prostatic hyperplasia.

Taylor L, Zhou XH. Biometrics 2009, 65(1): 88-95
Summary. Randomized clinical trials are a powerful tool for investigating causal treatment effects, but in human trials there are oftentimes problems of noncompliance which standard analyses, such as the intention-to-treat or as-treated analysis, either ignore or incorporate in such a way that the resulting estimand is no longer a causal effect. One alternative to these analyses is the complier average causal effect (CACE) which estimates the average causal treatment effect among a subpopulation that would comply under any treatment assigned. We focus on the setting of a randomized clinical trial with crossover treatment noncompliance (e.g., control subjects could receive the intervention and intervention subjects could receive the control) and outcome nonresponse. In this article, we develop estimators for the CACE using multiple imputation methods, which have been successfully applied to a wide variety of missing data problems, but have not yet been applied to the potential outcomes setting of causal inference. Using simulated data we investigate the finite sample properties of these estimators as well as of competing procedures in a simple setting. Finally we illustrate our methods using a real randomized encouragement design study on the effectiveness of the influenza vaccine.
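
To make the CACE estimand concrete, the sketch below computes the familiar moment (instrumental-variable) estimator: the intention-to-treat effect on the outcome divided by the intention-to-treat effect on treatment receipt. This is not the multiple imputation estimator developed in the article; it is a simpler comparator that assumes complete outcomes, no defiers, and no effect of assignment on outcome except through treatment. The function name `cace_wald` and the data are hypothetical.

```python
def cace_wald(assigned, received, outcome):
    """Ratio of the ITT effect on the outcome to the ITT effect on treatment receipt."""

    def mean_in_arm(values, arm):
        picked = [v for v, z in zip(values, assigned) if z == arm]
        return sum(picked) / len(picked)

    itt_outcome = mean_in_arm(outcome, 1) - mean_in_arm(outcome, 0)
    itt_receipt = mean_in_arm(received, 1) - mean_in_arm(received, 0)
    if itt_receipt == 0:
        raise ValueError("assignment does not change treatment receipt")
    return itt_outcome / itt_receipt


# Hypothetical trial with crossover noncompliance in both arms
assigned = [1, 1, 1, 1, 0, 0, 0, 0]   # randomized arm
received = [1, 1, 1, 0, 1, 0, 0, 0]   # treatment actually taken
outcome = [5, 6, 7, 2, 4, 2, 3, 3]
cace = cace_wald(assigned, received, outcome)
print(cace)
```

With outcome nonresponse, as in the setting the abstract describes, the arm means above cannot be computed directly from all subjects, which is where imputation of the missing outcomes comes in.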

We derive the nonparametric maximum likelihood estimate (NPMLE) of the cumulative incidence functions for competing risks survival data subject to interval censoring and truncation. Since the cumulative incidence function NPMLEs give rise to an estimate of the survival distribution which can be undefined over a potentially larger set of regions than the NPMLE of the survival function obtained ignoring failure type, we consider an alternative pseudolikelihood estimator. The methods are then applied to data from a cohort of injecting drug users in Thailand susceptible to infection from HIV-1 subtypes B and E.

There is a growing interest in the analysis of survival data with a cured proportion, particularly in tumor recurrence studies. Biologically, it is reasonable to assume that the recurrence time is mainly affected by the overall health condition of the patient that depends on some covariates such as age, sex, or treatment type received. We propose a semiparametric frailty-Cox cure model to quantify the overall health condition of the patient by a covariate-dependent frailty that has a discrete mass at zero to characterize the cured patients, and a positive continuous part to characterize the heterogeneous health conditions among the uncured patients. A multiple imputation estimation method is proposed for the right-censored case, which is further extended to accommodate interval-censored data. Simulation studies show that the performance of the proposed method is highly satisfactory. For illustration, the model is fitted to a set of right-censored melanoma incidence data and a set of interval-censored breast cosmesis data. Our analysis suggests that patients receiving radiotherapy with adjuvant chemotherapy have a significantly higher probability of breast retraction, but also a lower hazard rate of breast retraction among those patients with similar health conditions who will eventually experience the event. This interpretation differs markedly from that of models without a cure component, which indicate that radiotherapy with adjuvant chemotherapy significantly increases the risk of breast retraction.

Shen Y, Cheng SC. Biometrics 1999, 55(4): 1093-1100
In the context of competing risks, the cumulative incidence function is often used to summarize the cause-specific failure-time data. As an alternative to the proportional hazards model, the additive risk model is used to investigate covariate effects by specifying that the subject-specific hazard function is the sum of a baseline hazard function and a regression function of covariates. Based on such a formulation, we present an approach to constructing simultaneous confidence intervals for the cause-specific cumulative incidence function of patients with given risk factors. A melanoma data set is used for the purpose of illustration.
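
The formulation described above can be written out as follows. The notation is an assumption, since the abstract gives none: $k$ indexes the cause of failure, $Z$ is the covariate vector, $\lambda_{0k}$ is the baseline hazard for cause $k$, and the regression function is taken to be linear in $Z$, which is the usual form of the additive risk model.

```latex
% Additive risk model for the cause-specific hazard (linear form assumed)
\lambda_k(t \mid Z) = \lambda_{0k}(t) + \beta_k^{\top} Z

% Overall survival, combining all causes j
S(t \mid Z) = \exp\Bigl\{ -\sum_{j} \int_0^t \lambda_j(s \mid Z)\, ds \Bigr\}

% Cumulative incidence function for cause k
F_k(t \mid Z) = \int_0^t S(u \mid Z)\, \lambda_k(u \mid Z)\, du
```

The simultaneous confidence intervals mentioned in the abstract are bands for $F_k(t \mid Z)$ over a range of $t$ at fixed covariate values $Z$.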

In this paper, we derive score test statistics to discriminate between proportional hazards and proportional odds models for grouped survival data. These models are embedded within a power family transformation in order to obtain the score tests. In simple cases, some small-sample results are obtained for the score statistics using Monte Carlo simulations. Score statistics have distributions well approximated by the chi-squared distribution. Real examples illustrate the proposed tests.
