Similar Documents
20 similar documents found (search time: 46 ms)
1.
In cluster randomized trials (CRTs), identifiable clusters rather than individuals are randomized to study groups. Resulting data often consist of a small number of clusters with correlated observations within a treatment group. Missing data often present a problem in the analysis of such trials, and multiple imputation (MI) has been used to create complete data sets, enabling subsequent analysis with well-established analysis methods for CRTs. We discuss strategies for accounting for clustering when multiply imputing a missing continuous outcome, focusing on estimation of the variance of group means as used in an adjusted t-test or ANOVA. These analysis procedures are congenial to (can be derived from) a mixed effects imputation model; however, this imputation procedure is not yet available in commercial statistical software. An alternative approach that is readily available and has been used in recent studies is to include fixed effects for cluster, but the impact of using this convenient method has not been studied. We show that under this imputation model the MI variance estimator is positively biased and that smaller intraclass correlations (ICCs) lead to larger overestimation of the MI variance. Analytical expressions for the bias of the variance estimator are derived in the case of data missing completely at random, and cases in which data are missing at random are illustrated through simulation. Finally, various imputation methods are applied to data from the Detroit Middle School Asthma Project, a recent school-based CRT, and differences in inference are compared.
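The cluster-level analysis this abstract targets can be sketched generically: simulate one arm under an exchangeable random effects model and run the t-test on cluster means whose variance the MI procedure must reproduce. This is a minimal illustration, not the paper's method; all names and parameter values are illustrative.

```python
import numpy as np

rng = np.random.default_rng(42)

def simulate_crt(n_clusters, m, icc, sigma2=1.0, delta=0.0):
    """One arm of a CRT: a cluster random effect induces the ICC."""
    tau2 = icc * sigma2                    # between-cluster variance
    sig2e = sigma2 - tau2                  # within-cluster variance
    b = rng.normal(0.0, np.sqrt(tau2), n_clusters)
    return delta + b[:, None] + rng.normal(0.0, np.sqrt(sig2e), (n_clusters, m))

# Cluster-level analysis: a t-test on cluster means. Each cluster mean
# has variance tau2 + sig2e/m; the abstract shows the MI variance
# estimator under a fixed-cluster-effects imputation model overstates it.
ctrl = simulate_crt(10, 20, icc=0.05)
trt = simulate_crt(10, 20, icc=0.05, delta=0.3)
mc, mt = ctrl.mean(axis=1), trt.mean(axis=1)
diff = mt.mean() - mc.mean()
se = np.sqrt(mc.var(ddof=1) / len(mc) + mt.var(ddof=1) / len(mt))
print(diff, se)
```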

2.
Latent class regression (LCR) is a popular method for analyzing multiple categorical outcomes. While nonresponse to the manifest items is a common complication, inferences of LCR can be evaluated using maximum likelihood, multiple imputation, and two-stage multiple imputation. Under similar missing data assumptions, the estimates and variances from all three procedures are quite close. However, multiple imputation and two-stage multiple imputation can provide additional information: estimates for the rates of missing information. The methodology is illustrated using an example from a study on racial and ethnic disparities in breast cancer severity.

3.
Taylor L, Zhou XH. Biometrics 2009, 65(1): 88-95
Summary: Randomized clinical trials are a powerful tool for investigating causal treatment effects, but in human trials noncompliance is common, and standard analyses, such as the intention-to-treat or as-treated analysis, either ignore it or incorporate it in such a way that the resulting estimand is no longer a causal effect. One alternative to these analyses is the complier average causal effect (CACE), which estimates the average causal treatment effect among the subpopulation that would comply under any treatment assigned. We focus on the setting of a randomized clinical trial with crossover treatment noncompliance (e.g., control subjects could receive the intervention and intervention subjects could receive the control) and outcome nonresponse. In this article, we develop estimators for the CACE using multiple imputation methods, which have been successfully applied to a wide variety of missing data problems but have not yet been applied to the potential outcomes setting of causal inference. Using simulated data, we investigate the finite sample properties of these estimators, as well as those of competing procedures, in a simple setting. Finally, we illustrate our methods using a real randomized encouragement design study on the effectiveness of the influenza vaccine.
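As a point of reference for the CACE estimand, the standard moment (instrumental variable) estimator is the ITT effect on the outcome divided by the ITT effect on treatment receipt. The sketch below shows that estimator on simulated data; it is not the paper's multiple imputation method, and all data-generating values are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)

def cace_iv(y, z, d):
    """Moment (Wald/IV) estimate of the complier average causal effect:
    ITT effect on the outcome over ITT effect on treatment receipt."""
    itt_y = y[z == 1].mean() - y[z == 0].mean()
    itt_d = d[z == 1].mean() - d[z == 0].mean()
    return itt_y / itt_d

# toy data: 70% compliers, true complier treatment effect = 1.0
n = 100_000
z = rng.integers(0, 2, n)                          # random assignment
complier = rng.random(n) < 0.7
d = np.where(complier, z, rng.integers(0, 2, n))   # crossover noncompliance
y = rng.normal(0, 1, n) + 1.0 * d * complier       # effect only in compliers
print(cace_iv(y, z, d))                            # close to 1.0
```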

4.
The stepped wedge cluster randomized trial (SW-CRT) is an increasingly popular design for evaluating health service delivery or policy interventions. An essential consideration of this design is the need to account for both within-period and between-period correlations in sample size calculations. Especially when embedded in health care delivery systems, many SW-CRTs may have subclusters nested in clusters, within which outcomes are collected longitudinally. However, existing sample size methods that account for between-period correlations have not allowed for multiple levels of clustering. We present computationally efficient sample size procedures that properly differentiate within-period and between-period intracluster correlation coefficients in SW-CRTs in the presence of subclusters. We introduce an extended block exchangeable correlation matrix to characterize the complex dependencies of outcomes within clusters. For Gaussian outcomes, we derive a closed-form sample size expression that depends on the correlation structure only through two eigenvalues of the extended block exchangeable correlation structure. For non-Gaussian outcomes, we present a generic sample size algorithm based on linearization and elucidate simplifications under canonical link functions. For example, we show that the approximate sample size formula under a logistic linear mixed model depends on three eigenvalues of the extended block exchangeable correlation matrix. We provide an extension to accommodate unequal cluster sizes and validate the proposed methods via simulations. Finally, we illustrate our methods in two real SW-CRTs with subclusters.
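The idea that a sample size formula depends on the correlation matrix only through a few eigenvalues can be illustrated with a simplified nested exchangeable matrix, a stand-in for the paper's extended block exchangeable structure (which additionally distinguishes within- and between-period correlations). The sizes and correlations below are illustrative.

```python
import numpy as np

def nested_exchangeable(s, m, rho0, rho1):
    """Correlation for s subclusters of size m: rho1 within a
    subcluster, rho0 between subclusters of the same cluster."""
    I = np.eye(s * m)
    within = np.kron(np.eye(s), np.ones((m, m)))   # same-subcluster blocks
    total = np.ones((s * m, s * m))
    return (1 - rho1) * I + (rho1 - rho0) * within + rho0 * total

R = nested_exchangeable(s=3, m=4, rho0=0.02, rho1=0.1)
# Despite being 12 x 12, R has only three distinct eigenvalues,
# which is what makes a closed-form sample size expression possible.
eig = np.unique(np.round(np.linalg.eigvalsh(R), 8))
print(eig)
```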

5.
Summary: In this article, we propose a positive stable shared frailty Cox model for clustered failure time data where the frailty distribution varies with cluster-level covariates. The proposed model accounts for covariate-dependent intracluster correlation and permits both conditional and marginal inferences. We obtain marginal inference directly from a marginal model, then use a stratified Cox-type pseudo-partial likelihood approach to estimate the regression coefficient for the frailty parameter. The proposed estimators are consistent and asymptotically normal, and a consistent estimator of the covariance matrix is provided. Simulation studies show that the proposed estimation procedure is appropriate for practical use with a realistic number of clusters. Finally, we present an application of the proposed method to kidney transplantation data from the Scientific Registry of Transplant Recipients.

6.
Cluster randomized studies are common in community trials. The standard method for estimating sample size in cluster randomized studies assumes a common cluster size; in practice, however, cluster sizes often vary. In this paper, we derive sample size estimates for continuous outcomes in cluster randomized studies while accounting for the variability due to cluster size. We show that the proposed formula for the total sample size can be obtained by adding a correction term to the traditional formula based on the average cluster size. Application of these results to the design of a health promotion educational intervention study is discussed.
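A hedged sketch of a correction of this kind, written in design-effect form using the common adjustment deff = 1 + ((CV^2 + 1) * m_bar - 1) * rho; this is an assumption standing in for the paper's exact correction term, and all inputs are illustrative.

```python
import math

def crt_n_clusters(delta, sigma, m_bar, icc, cv=0.0,
                   z_alpha=1.96, z_beta=0.84):
    """Clusters per arm for a continuous outcome. With cv=0 this reduces
    to the traditional formula using the average cluster size m_bar;
    cv>0 inflates the design effect to correct for variable sizes."""
    deff = 1 + ((cv**2 + 1) * m_bar - 1) * icc
    n_ind = 2 * (z_alpha + z_beta) ** 2 * sigma**2 / delta**2
    return math.ceil(n_ind * deff / m_bar)

print(crt_n_clusters(delta=0.3, sigma=1.0, m_bar=20, icc=0.05))          # equal sizes
print(crt_n_clusters(delta=0.3, sigma=1.0, m_bar=20, icc=0.05, cv=0.6))  # variable sizes
```

The second call always needs at least as many clusters as the first: ignoring cluster size variation under-powers the study.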

7.
In livestock, many studies have reported the results of imputation to 50k single nucleotide polymorphism (SNP) genotypes for animals that are genotyped with low-density SNP panels. The objective of this paper is to review different measures of correctness of imputation and to evaluate their utility depending on the purpose of the imputed genotypes. Across studies, imputation accuracy, computed as the correlation between true and imputed genotypes, and imputation error rate, which counts the number of incorrectly imputed alleles, are commonly used measures of imputation correctness. Based on the nature of both measures and results reported in the literature, imputation accuracy appears to be a more useful measure of the correctness of imputation than imputation error rate, because imputation accuracy does not depend on minor allele frequency (MAF), whereas imputation error rate does. Imputation accuracy can therefore be better compared across loci with different MAF. Imputation accuracy depends on the ability to identify the correct haplotype of a SNP, but many other factors have been identified as well, including the number of genotyped immediate ancestors, the number of animals genotyped on the high-density panel, the SNP density on the low- and high-density panels, the MAF of the imputed SNP, and whether the imputed SNP is located at the end of a chromosome. Some of these factors directly contribute to the linkage disequilibrium between imputed SNP and SNP on the low-density panel. When imputation accuracy is assessed as a predictor of the accuracy of subsequent genomic prediction, we recommend that: (1) individual-specific imputation accuracies should be used, computed after centring and scaling both true and imputed genotypes; and (2) imputation of gene dosage is preferred over imputation of the most likely genotype, as this increases accuracy and reduces bias of the imputed genotypes and the subsequent genomic predictions.
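The review's point about MAF dependence is easy to demonstrate: at a low-MAF SNP, always imputing the major genotype yields a low error rate even though the imputation has no predictive value, while the correlation-based accuracy measure is not fooled. A minimal sketch with illustrative values:

```python
import numpy as np

rng = np.random.default_rng(1)

def imputation_accuracy(g_true, g_imp):
    """Correlation between true and imputed genotypes, computed after
    centring and scaling both, as the review recommends."""
    t = (g_true - g_true.mean()) / g_true.std()
    i = (g_imp - g_imp.mean()) / g_imp.std()
    return float(np.mean(t * i))

def error_rate(g_true, g_imp):
    """Fraction of incorrectly imputed genotype calls (0/1/2 coding)."""
    return float(np.mean(g_true != np.round(g_imp)))

maf = 0.05
g_true = rng.binomial(2, maf, 1000)            # allele-dosage genotypes
naive = np.zeros_like(g_true)                  # always the major genotype
dosage = g_true + rng.normal(0, 0.5, 1000)     # a noisy dosage imputation
print(error_rate(g_true, naive))               # small despite useless imputation
print(imputation_accuracy(g_true, dosage))     # reflects real predictive value
```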

8.
Reiter JP. Biometrika 2008, 95(4): 933-946
When some of the records used to estimate the imputation models in multiple imputation are not used or available for analysis, the usual multiple imputation variance estimator has positive bias. We present an alternative approach that enables unbiased estimation of variances and, hence, calibrated inferences in such contexts. First, using all records, the imputer samples m values of the parameters of the imputation model. Second, for each parameter draw, the imputer simulates the missing values for all records n times. From these mn completed datasets, the imputer can analyse or disseminate the appropriate subset of records. We develop methods for interval estimation and significance testing for this approach. Methods are presented in the context of multiple imputation for measurement error.

9.
Wang T, Wu L. Biometrics 2011, 67(4): 1452-1460
Multivariate one-sided hypothesis testing problems arise frequently in practice, and various tests have been developed. Multivariate data often contain missing values, in which case standard testing procedures based on complete data may not be applicable or may perform poorly if the missing data are discarded. In this article, we propose several multiple imputation methods for the multivariate one-sided testing problem with missing data. Some theoretical results are presented, and the proposed methods are evaluated using simulations. A real data example illustrates the methods.

10.
Standard sample size calculation formulas for stepped wedge cluster randomized trials (SW-CRTs) assume that cluster sizes are equal. When cluster sizes vary substantially, ignoring this variation may lead to an under-powered study. We investigate the relative efficiency of a SW-CRT with varying cluster sizes to equal cluster sizes, and derive variance estimators for the intervention effect that account for this variation under a mixed effects model, a commonly used approach for analyzing data from cluster randomized trials. When cluster sizes vary, the power of a SW-CRT depends on the order in which clusters receive the intervention, which is determined through randomization. We first derive a variance formula that corresponds to any particular realization of the randomized sequence and propose efficient algorithms to identify upper and lower bounds of the power. We then obtain an "expected" power based on a first-order approximation to the variance formula, where the expectation is taken with respect to all possible randomization sequences. Finally, we provide a variance formula for more general settings where only the cluster size arithmetic mean and coefficient of variation, instead of exact cluster sizes, are known in the design stage. We evaluate our methods through simulations and illustrate that the average power of a SW-CRT decreases as the variation in cluster sizes increases, and the impact is largest when the number of clusters is small.

11.
Liu M, Taylor JM, Belin TR. Biometrics 2000, 56(4): 1157-1163
This paper outlines a multiple imputation method for handling missing data in designed longitudinal studies. A random coefficients model is developed to accommodate incomplete multivariate continuous longitudinal data. Multivariate repeated measures are jointly modeled; specifically, an i.i.d. normal model is assumed for time-independent variables and a hierarchical random coefficients model is assumed for time-dependent variables in a regression model conditional on the time-independent variables and time, with heterogeneous error variances across variables and time points. Gibbs sampling is used to draw model parameters and for imputations of missing observations. An application to data from a study of startle reactions illustrates the model. A simulation study compares the multiple imputation procedure to the weighting approach of Robins, Rotnitzky, and Zhao (1995, Journal of the American Statistical Association 90, 106-121) that can be used to address similar data structures.

12.
Stepped wedge cluster randomised trials introduce interventions to groups of clusters in a random order and have been used to evaluate interventions for health and wellbeing. Standardised guidance for reporting stepped wedge trials is currently absent, and a range of potential analytic approaches have been described. We systematically identified and reviewed recently published (2010 to 2014) analyses of stepped wedge trials. We extracted data and described the range of reporting and analysis approaches taken across all studies. We critically appraised the strategy described by three trials chosen to reflect a range of design characteristics. Ten reports of completed analyses were identified. Reporting varied: seven of the studies included a CONSORT diagram, and only five also included a diagram of the intervention rollout. Seven assessed the balance achieved by randomisation, and there was considerable heterogeneity among the approaches used. Only six reported the trend in the outcome over time. All used both ‘horizontal’ and ‘vertical’ information to estimate the intervention effect: eight adjusted for time with a fixed effect, one used time as a condition using a Cox proportional hazards model, and one did not account for time trends. The majority used simple random effects to account for clustering and repeat measures, assuming a common intervention effect across clusters. Outcome data from before and after the rollout period were often included in the primary analysis. Potential lags in the outcome response to the intervention were rarely investigated. We use three case studies to illustrate different approaches to analysis and reporting. There is considerable heterogeneity in the reporting of stepped wedge cluster randomised trials. Correct specification of the time-trend underlies the validity of the analytical approaches. The possibility that intervention effects vary by cluster or over time should be considered. 
Further work should be done to standardise the reporting of the design, attrition, balance, and time-trends in stepped wedge trials.

13.
It is not uncommon for biological anthropologists to analyze incomplete bioarcheological or forensic skeleton specimens. As many quantitative multivariate analyses cannot handle incomplete data, missing data imputation or estimation is a common preprocessing practice for such data. Using William W. Howells' Craniometric Data Set and the Goldman Osteometric Data Set, we evaluated the performance of multiple popular statistical methods for imputing missing metric measurements. Results indicated that multiple imputation methods outperformed single imputation methods, such as Bayesian principal component analysis (BPCA). Multiple imputation with Bayesian linear regression implemented in the R package norm2, the Expectation-Maximization (EM) with Bootstrapping algorithm implemented in Amelia, and the Predictive Mean Matching (PMM) method and several of the derivative linear regression models implemented in mice, perform well regarding accuracy, robustness, and speed. Based on the findings of this study, we suggest a practical procedure for choosing appropriate imputation methods.
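To make the PMM idea concrete, here is a minimal single-variable sketch in Python; mice itself implements a more elaborate multivariate chained-equations version, and the donor count and data below are illustrative.

```python
import numpy as np

rng = np.random.default_rng(7)

def pmm_impute(x, y, n_donors=5):
    """Single-draw predictive mean matching: regress y on x over
    complete cases, then impute each missing y with an observed value
    whose prediction is among the n_donors closest."""
    obs = ~np.isnan(y)
    X = np.column_stack([np.ones_like(x), x])
    beta, *_ = np.linalg.lstsq(X[obs], y[obs], rcond=None)
    pred = X @ beta
    y_imp = y.copy()
    for i in np.where(~obs)[0]:
        d = np.abs(pred[obs] - pred[i])
        donors = np.argsort(d)[:n_donors]        # closest predicted values
        y_imp[i] = y[obs][rng.choice(donors)]    # draw an observed donor value
    return y_imp

# demo: y depends linearly on x; ~20% of y is missing completely at random
x = rng.normal(size=200)
y = 2 * x + rng.normal(scale=0.5, size=200)
y[rng.random(200) < 0.2] = np.nan
y_full = pmm_impute(x, y)
print(np.isnan(y_full).sum())   # 0
```

Because every imputed value is an actually observed value, PMM preserves the marginal distribution of y, one reason it is a popular default.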

14.
A generalized variance component model is proposed for the analysis of a categorical response variable with extra-multinomial variation. Categorical data obtained from research designs such as randomized multicenter clinical trials or complex sample surveys with clustering frequently exhibit extra-variation resulting from intracluster correlation. General correlation patterns are accounted for by utilizing a mixed-effects modelling approach, estimating the cluster variance components through the method of moments and modelling functions of the observed proportions through the use of estimating equations. A flexible set of assumptions characterizing the underlying covariance structure for the proportions can be accommodated. The importance of accounting for extra-variation when performing hypothesis tests is highlighted with an application to data from a multi-investigator clinical trial.

15.
A common assumption of data analysis in clinical trials is that the patient population, as well as treatment effects, do not vary during the course of the study. However, when trials enroll patients over several years, this assumption may be violated. Ignoring variations of the outcome distributions over time, under the control and experimental treatments, can lead to biased treatment effect estimates and poor control of false positive results. We propose and compare two procedures that account for possible variations of the outcome distributions over time, to correct treatment effect estimates, and to control type-I error rates. The first procedure models trends of patient outcomes with splines. The second leverages conditional inference principles, which have been introduced to analyze randomized trials when patient prognostic profiles are unbalanced across arms. These two procedures are applicable in response-adaptive clinical trials. We illustrate the consequences of trends in the outcome distributions in response-adaptive designs and in platform trials, and investigate the proposed methods in the analysis of a glioblastoma study.

16.
17.
Summary: Cluster randomized trials in health care may involve three instead of two levels, for instance, in trials where different interventions to improve quality of care are compared. In such trials, the intervention is implemented in health care units ("clusters") and aims at changing the behavior of health care professionals working in this unit ("subjects"), while the effects are measured at the patient level ("evaluations"). Within the generalized estimating equations approach, we derive a sample size formula that accounts for two levels of clustering: that of subjects within clusters and that of evaluations within subjects. The formula reveals that sample size is inflated, relative to a design with completely independent evaluations, by a multiplicative term that can be expressed as a product of two variance inflation factors, one that quantifies the impact of within-subject correlation of evaluations on the variance of subject-level means and the other that quantifies the impact of the correlation between subject-level means on the variance of the cluster means. Power levels as predicted by the sample size formula agreed well with the simulated power for more than 10 clusters in total, when data were analyzed using bias-corrected estimating equations for the correlation parameters in combination with the model-based covariance estimator or the sandwich estimator with a finite sample correction.

18.
Heo M, Leon AC. Biometrics 2008, 64(4): 1256-1262
Summary: Cluster randomized clinical trials (cluster-RCTs), in which community entities serve as clusters, often yield data with three hierarchy levels. For example, interventions are randomly assigned to the clusters (level-three units); health care professionals (level-two units) within the same cluster are trained in the randomly assigned intervention to provide care to subjects (level-one units). In this study, we derive a closed-form power function and formulae for the sample size required to detect an intervention effect on outcomes at the subject level. In doing so, we use a test statistic based on maximum likelihood estimates from a mixed-effects linear regression model for three-level data. A simulation study verifies that theoretical power estimates based on the derived formulae are nearly identical to empirical estimates based on simulated data. Recommendations for the design stage of a cluster-RCT are discussed.

19.
Ten Have TR, Localio AR. Biometrics 1999, 55(4): 1022-1029
We extend an approach for estimating random effects parameters under a random intercept and slope logistic regression model to include standard errors, thereby including confidence intervals. The procedure entails numerical integration to yield posterior empirical Bayes (EB) estimates of random effects parameters and their corresponding posterior standard errors. We incorporate an adjustment of the standard error due to Kass and Steffey (KS; 1989, Journal of the American Statistical Association 84, 717-726) to account for the variability in estimating the variance component of the random effects distribution. In assessing health care providers with respect to adult pneumonia mortality, comparisons are made with the penalized quasi-likelihood (PQL) approximation approach of Breslow and Clayton (1993, Journal of the American Statistical Association 88, 9-25) and a Bayesian approach. To make comparisons with an EB method previously reported in the literature, we apply these approaches to crossover trials data previously analyzed with the estimating equations EB approach of Waclawiw and Liang (1994, Statistics in Medicine 13, 541-551). We also perform simulations to compare the proposed KS and PQL approaches. These two approaches lead to EB estimates of random effects parameters with similar asymptotic bias. However, for many clusters with small cluster size, the proposed KS approach does better than the PQL procedures in terms of coverage of nominal 95% confidence intervals for random effects estimates. For large cluster sizes and a few clusters, the PQL approach performs better than the KS adjustment. These simulation results agree somewhat with those of the data analyses.

20.
Summary: Often a binary variable is generated by dichotomizing an underlying continuous variable measured at a specific time point according to a prespecified threshold value. When the underlying continuous measurements come from a longitudinal study, one can use a repeated-measures model to impute missing responder status resulting from subject dropout and apply a logistic regression model to the observed or imputed responder status. Standard Bayesian multiple imputation techniques (Rubin, 1987, Multiple Imputation for Nonresponse in Surveys), which draw the parameters of the imputation model from the posterior distribution and construct the variance of parameter estimates for the analysis model as a combination of within- and between-imputation variances, are found to be conservative. The frequentist multiple imputation approach, which fixes the parameters of the imputation model at the maximum likelihood estimates and constructs the variance of parameter estimates using the results of Robins and Wang (2000, Biometrika 87, 113-124), is shown to be more efficient. We propose applying Kenward and Roger (1997, Biometrics 53, 983-997) degrees of freedom to account for the uncertainty associated with variance-covariance parameter estimates for the repeated-measures model.
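For reference, the standard Bayesian MI combining rules that the abstract describes as conservative (Rubin, 1987) are easily coded: pool the m estimates, then combine within- and between-imputation variance as T = W + (1 + 1/m)B, with the classical degrees of freedom. The inputs below are illustrative.

```python
import numpy as np

def rubin_combine(estimates, variances):
    """Rubin's (1987) rules for m multiply imputed analyses: pooled
    estimate, total variance T = W + (1 + 1/m) B, and degrees of freedom."""
    q = np.asarray(estimates, float)
    u = np.asarray(variances, float)
    m = len(q)
    qbar = q.mean()
    w = u.mean()                       # within-imputation variance
    b = q.var(ddof=1)                  # between-imputation variance
    t = w + (1 + 1 / m) * b
    df = (m - 1) * (1 + w / ((1 + 1 / m) * b)) ** 2
    return qbar, t, df

est, tvar, df = rubin_combine([1.1, 0.9, 1.0, 1.2, 0.8], [0.04] * 5)
print(est, tvar, df)
```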


Copyright © 北京勤云科技发展有限公司 (Beijing Qinyun Technology Development Co., Ltd.)  京ICP备09084417号