Similar Literature (20 results)
1.
Propensity score matching is a method to reduce bias in non-randomized and observational studies. It is mainly applied to two treatment groups rather than multiple treatment groups, because some key issues affecting its application to multiple treatment groups remain unsolved, such as the matching distance, the assessment of balance in baseline variables, and the choice of optimal caliper width. The primary objective of this study was to compare propensity score matching methods using different calipers and to choose the optimal caliper width for use with three treatment groups. The authors used caliper widths from 0.1 to 0.8 of the pooled standard deviation of the logit of the propensity score, in increments of 0.1. Balance in baseline variables was assessed by the standardized difference. The matching ratio, relative bias, and mean squared error (MSE) of the between-group estimates in the different propensity score-matched samples were also reported. The results of Monte Carlo simulations indicate that matching with a caliper width of 0.2 of the pooled standard deviation of the logit of the propensity score affords superior performance in the estimation of treatment effects. This study provides practical solutions for the application of propensity score matching to three treatment groups.
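The caliper rule above translates directly into code. Below is an illustrative greedy 1:1 matching sketch (not the authors' implementation; the function name and the greedy matching order are my own choices), matching on the logit of the propensity score with the caliper expressed as a fraction of the pooled standard deviation:

```python
import numpy as np

def caliper_match(logit_ps_treated, logit_ps_control, caliper_sd=0.2):
    """Greedy 1:1 nearest-neighbor matching on the logit of the propensity
    score.  A control is accepted only if it lies within
    caliper_sd * pooled SD of the treated subject's logit score."""
    logit_ps_treated = np.asarray(logit_ps_treated, float)
    logit_ps_control = np.asarray(logit_ps_control, float)
    pooled_sd = np.sqrt((logit_ps_treated.var(ddof=1)
                         + logit_ps_control.var(ddof=1)) / 2.0)
    caliper = caliper_sd * pooled_sd
    available = list(range(len(logit_ps_control)))
    pairs = []  # (treated index, control index)
    for i, lp in enumerate(logit_ps_treated):
        if not available:
            break
        j = min(available, key=lambda k: abs(logit_ps_control[k] - lp))
        if abs(logit_ps_control[j] - lp) <= caliper:
            pairs.append((i, j))
            available.remove(j)  # match without replacement
    return pairs
```

Treated subjects with no control inside the caliper are simply discarded, which is why the matching ratio varies with the caliper width.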

2.
Propensity-score matching is frequently used in the medical literature to reduce or eliminate the effect of treatment selection bias when estimating the effect of treatments or exposures on outcomes using observational data. In propensity-score matching, pairs of treated and untreated subjects with similar propensity scores are formed. Recent systematic reviews of the use of propensity-score matching found that the large majority of researchers ignore the matched nature of the propensity-score matched sample when estimating the statistical significance of the treatment effect. We conducted a series of Monte Carlo simulations to examine the impact of ignoring the matched nature of the propensity-score matched sample on type I error rates, coverage of confidence intervals, and variance estimation of the treatment effect. We examined estimation of differences in means, relative risks, odds ratios, rate ratios from Poisson models, and hazard ratios from Cox regression models. We demonstrated that accounting for the matched nature of the propensity-score matched sample tended to result in type I error rates that were closer to the advertised level compared to when matching was not incorporated into the analyses. Similarly, accounting for the matched nature of the sample tended to result in confidence intervals with coverage rates that were closer to the nominal level, compared to when matching was not taken into account. Finally, accounting for the matched nature of the sample resulted in estimates of standard error that more closely reflected the sampling variability of the treatment effect compared to when matching was not taken into account.
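For a difference in means, the contrast these simulations examine can be sketched as follows (a hypothetical helper, not the paper's code): the paired standard error is computed from the within-pair differences, while the naive version incorrectly treats the two matched groups as independent samples:

```python
import numpy as np

def matched_pair_inference(y_treated, y_control):
    """Treatment effect from a 1:1 matched sample, with a standard error
    that respects the pairing (se_paired) and one that ignores it
    (se_independent)."""
    y_treated = np.asarray(y_treated, float)
    y_control = np.asarray(y_control, float)
    n = len(y_treated)
    diff = y_treated - y_control          # within-pair differences
    effect = diff.mean()
    se_paired = diff.std(ddof=1) / np.sqrt(n)
    se_independent = np.sqrt(y_treated.var(ddof=1) / n
                             + y_control.var(ddof=1) / n)
    return effect, se_paired, se_independent
```

When outcomes are positively correlated within pairs, the independent-samples standard error overstates the sampling variability, which is one mechanism behind the coverage distortions the abstract describes.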

3.
Lu B. Biometrics 2005, 61(3): 721–728.
In observational studies with a time-dependent treatment and time-dependent covariates, it is desirable to balance the distribution of the covariates at every time point. A time-dependent propensity score based on the Cox proportional hazards model is proposed and used in risk set matching. Matching on this propensity score is shown to achieve a balanced distribution of the covariates in both treated and control groups. Optimal matching with various designs is conducted and compared in a study of a surgical treatment, cystoscopy and hydrodistention, given in response to a chronic bladder disease, interstitial cystitis. Simulation studies also suggest that the statistical analysis after matching outperforms the analysis without matching in terms of both point and interval estimations.

4.
Cluster randomization trials with relatively few clusters have been widely used in recent years for the evaluation of health-care strategies. On average, randomized treatment assignment achieves balance in both known and unknown confounding factors between treatment groups; in practice, however, investigators can only introduce a small amount of stratification and cannot balance on all the important variables simultaneously. This limitation arises especially when there are many confounding variables in small studies. Such is the case in the INSTINCT trial, designed to investigate the effectiveness of an education program in enhancing tPA use in stroke patients. In this article, we introduce a new randomization design, the balance match weighted (BMW) design, which applies the optimal matching with constraints technique to a prospective randomized design and aims to minimize the mean squared error (MSE) of the treatment effect estimator. A simulation study shows that, under various confounding scenarios, the BMW design can yield substantial reductions in the MSE of the treatment effect estimator compared to a completely randomized or matched-pair design. The BMW design is also compared with a model-based approach adjusting for the estimated propensity score and with the Robins-Mark-Newey E-estimation procedure in terms of efficiency and robustness of the treatment effect estimator. These investigations suggest that the BMW design is more robust and usually, although not always, more efficient than either of these approaches. The design is also seen to be robust against heterogeneous error. We illustrate these methods by proposing a design for the INSTINCT trial.

5.
Little and An (2004, Statistica Sinica 14, 949–968) proposed a penalized spline of propensity prediction (PSPP) method for the imputation of missing values that yields robust model-based inference under the missing at random assumption. The propensity score for a missing variable is estimated, and a regression model is fitted that includes the spline of the estimated logit propensity score as a covariate. The predicted unconditional mean of the missing variable has a double robustness (DR) property under misspecification of the imputation model. We show that a simplified version of PSPP, which does not center other regressors prior to including them in the prediction model, also has the DR property. We also propose two extensions of PSPP, namely stratified PSPP and bivariate PSPP, that extend the DR property to inferences about conditional means. These extended PSPP methods are compared with the PSPP method and simple alternatives in a simulation study and applied to an online weight loss study conducted by Kaiser Permanente.

6.
Propensity score methods are used to estimate a treatment effect with observational data. This paper considers the formation of propensity score subclasses by investigating different methods for determining subclass boundaries and the number of subclasses used. We compare several methods, including equal-frequency subclasses and subclasses formed to balance a summary of the observed information matrix. Subclasses that balance the inverse variance of the treatment effect reduce the mean squared error of the estimates and maximize the number of usable subclasses.
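Equal-frequency subclassification, the baseline method here, is easy to sketch (illustrative code, not the paper's): stratify on quantiles of the estimated propensity score and combine the within-subclass mean differences, weighting by subclass size:

```python
import numpy as np

def subclassify_effect(ps, y, treated, n_subclasses=5):
    """Treatment-effect estimate from equal-frequency propensity score
    subclasses: size-weighted average of within-subclass mean differences."""
    ps, y = np.asarray(ps, float), np.asarray(y, float)
    treated = np.asarray(treated)
    # quantile-based boundaries give (roughly) equal-frequency subclasses
    edges = np.quantile(ps, np.linspace(0.0, 1.0, n_subclasses + 1))
    idx = np.clip(np.searchsorted(edges, ps, side="right") - 1,
                  0, n_subclasses - 1)
    estimates, weights = [], []
    for s in range(n_subclasses):
        in_s = idx == s
        t, c = in_s & (treated == 1), in_s & (treated == 0)
        if t.any() and c.any():  # skip subclasses missing one arm
            estimates.append(y[t].mean() - y[c].mean())
            weights.append(in_s.sum())
    return float(np.average(estimates, weights=weights))
```

Subclasses that lack treated or control subjects are dropped, which is why the number of *usable* subclasses matters when boundaries are chosen.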

7.
With advances in modern medicine and clinical diagnosis, case-control data with characterization of finer subtypes of cases are often available. In matched case-control studies, missingness in exposure values often leads to deletion of the entire stratum, and thus entails a significant loss of information. When subtypes of cases are treated as categorical outcomes, the data are further stratified and deletion of observations becomes even more expensive in terms of the precision of the category-specific odds-ratio parameters, especially under the multinomial logit model. The stereotype regression model for categorical responses lies intermediate between the proportional odds model and the multinomial or baseline category logit model. The use of this class of models has been limited, as the structure of the model implies certain inferential challenges with nonidentifiability and nonlinearity in the parameters. We illustrate how to handle missing data in matched case-control studies with finer disease subclassification within the cases under a stereotype regression model. We present both a Monte Carlo based full Bayesian approach and an expectation/conditional maximization algorithm for the estimation of model parameters in the presence of a completely general missingness mechanism. We illustrate our methods using data from an ongoing matched case-control study of colorectal cancer. Simulation results are presented under various missing data mechanisms and departures from modeling assumptions.

8.
The power of the Mantel-Haenszel test for no treatment effect in the case of binary exposure and response variates was examined through simulation studies when subclasses were formed on the basis of the true and estimated propensity scores and by direct stratification on two continuous covariates. The power of these tests was also compared to the score test in a misspecified logistic regression model. In general, adjustment by the true propensity score was most likely to reject a false null hypothesis, and the score test was more likely to reject a false null hypothesis than the Mantel-Haenszel test when adjustment was by the estimated propensity score or by subclassification on the covariates. There was little difference in the observed powers of the Mantel-Haenszel tests between adjustment by the estimated propensity score and subclassification on the covariates.

9.
We study bias-reduced estimators of exponentially transformed parameters in generalized linear models (GLMs) and show how they can be used to obtain bias-reduced conditional (or unconditional) odds ratios in matched case-control studies. Two options are considered and compared: the explicit approach and the implicit approach. The implicit approach is based on the modified score function, where bias-reduced estimates are obtained by using iterative procedures to solve the modified score equations. The explicit approach is shown to be a one-step approximation of this iterative procedure. To apply these approaches to the conditional analysis of matched case-control studies, with potentially unmatched confounding and with several exposures, we utilize the relation between the conditional likelihood and the likelihood of the unconditional logit binomial GLM for matched pairs, and the Cox partial likelihood for matched sets with appropriately set up data. The properties of the estimators are evaluated using a large Monte Carlo simulation study, and an illustration on a real dataset is shown. Researchers reporting results on the exponentiated scale should use bias-reduced estimators, since otherwise the effects can be under- or overestimated, and the magnitude of the bias is especially large in studies with smaller sample sizes.

10.
Evaluation of impact of potential uncontrolled confounding is an important component for causal inference based on observational studies. In this article, we introduce a general framework of sensitivity analysis that is based on inverse probability weighting. We propose a general methodology that allows both non-parametric and parametric analyses, which are driven by two parameters that govern the magnitude of the variation of the multiplicative errors of the propensity score and their correlations with the potential outcomes. We also introduce a specific parametric model that offers a mechanistic view on how the uncontrolled confounding may bias the inference through these parameters. Our method can be readily applied to both binary and continuous outcomes and depends on the covariates only through the propensity score that can be estimated by any parametric or non-parametric method. We illustrate our method with two medical data sets.
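As a toy version of the weighting side of this framework (a simplification, not the paper's method: a single scalar `error` multiplier stands in for the paper's multiplicative error process), a Hajek-type IPW estimate of the treated-arm mean with a perturbed propensity score:

```python
import numpy as np

def ipw_mean(y, treated, ps, error=1.0):
    """Hajek-type IPW estimate of E[Y(1)].  `error` multiplies the
    propensity score to probe sensitivity to uncontrolled confounding;
    error = 1 recovers the usual estimator."""
    y, treated, ps = (np.asarray(a, float) for a in (y, treated, ps))
    # keep perturbed scores inside (0, 1) so weights stay finite
    ps_perturbed = np.clip(ps * error, 1e-6, 1 - 1e-6)
    w = treated / ps_perturbed
    return float(np.sum(w * y) / np.sum(w))
```

Sweeping `error` over a plausible range and re-estimating gives a crude picture of how sensitive the inference is to misspecified propensity scores.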

11.
In observational studies, subjects are often nested within clusters. In medical studies, patients are often treated by doctors and therefore patients are regarded as nested or clustered within doctors. A concern that arises with clustered data is that cluster-level characteristics (e.g., characteristics of the doctor) are associated with both treatment selection and patient outcomes, resulting in cluster-level confounding. Measuring and modeling cluster attributes can be difficult and statistical methods exist to control for all unmeasured cluster characteristics. An assumption of these methods however is that characteristics of the cluster and the effects of those characteristics on the outcome (as well as probability of treatment assignment when using covariate balancing methods) are constant over time. In this paper, we consider methods that relax this assumption and allow for estimation of treatment effects in the presence of unmeasured time-dependent cluster confounding. The methods are based on matching with the propensity score and incorporate unmeasured time-specific cluster effects by performing matching within clusters or using fixed- or random-cluster effects in the propensity score model. The methods are illustrated using data to compare the effectiveness of two total hip devices with respect to survival of the device and a simulation study is performed that compares the proposed methods. One method that was found to perform well is matching within surgeon clusters partitioned by time. Considerations in implementing the proposed methods are discussed.

12.
Multivariable model building for propensity score modeling approaches is challenging. A common propensity score approach is exposure-driven propensity score matching, where the best model selection strategy is still unclear. In particular, the situation may require variable selection, while it is still unclear if variables included in the propensity score should be associated with the exposure and the outcome, with either the exposure or the outcome, with at least the exposure or with at least the outcome. Unmeasured confounders, complex correlation structures, and non-normal covariate distributions further complicate matters. We consider the performance of different modeling strategies in a simulation design with a complex but realistic structure and effects on a binary outcome. We compare the strategies in terms of bias and variance in estimated marginal exposure effects. Considering the bias in estimated marginal exposure effects, the most reliable results for estimating the propensity score are obtained by selecting variables related to the exposure. On average this results in the least bias and does not greatly increase variances. Although our results cannot be generalized, this provides a counterexample to existing recommendations in the literature based on simple simulation settings. This highlights that recommendations obtained in simple simulation settings cannot always be generalized to more complex, but realistic settings and that more complex simulation studies are needed.

13.
A method for reducing bias in observational studies proposed by Rosenbaum and Rubin (1983, 1984) is discussed with a view to applications in studies designed to compare two treatments. The data are stratified on a function of the covariates, called the propensity score. The propensity score is the conditional probability of receiving a specific treatment given a set of observed covariates. Some insight into how this kind of stratification works in theory is given. Within strata, the treatment groups are comparable with respect to the distribution of the covariates incorporated into the score; hence a corresponding stratified analysis can be considered. The method differs from other strategies in that the subclasses are not intended to comprise patients with similar prognosis. In practice, estimated grouped scores are used. Problems concerning the interpretation of the proposed stratified approach are illustrated by an application in oncology, and the results are compared to those from an analysis in a standard regression model.

14.
Huang Y, Leroux B. Biometrics 2011, 67(3): 843–851.
Williamson, Datta, and Satten's (2003, Biometrics 59, 36–42) cluster-weighted generalized estimating equations (CWGEEs) are effective in adjusting for bias due to informative cluster sizes for cluster-level covariates. We show that CWGEE may not perform well, however, for covariates that can take different values within a cluster if the numbers of observations at each covariate level are informative. On the other hand, inverse probability of treatment weighting accounts for informative treatment propensity but not for informative cluster size. Motivated by evaluating the effect of a binary exposure in the presence of such types of informativeness, we propose several weighted GEE estimators, with weights related to the size of a cluster as well as the distribution of the binary exposure within the cluster. The choice of weights depends on the population of interest and the nature of the exposure. Through simulation studies, we demonstrate the superior performance of the new estimators compared to existing estimators such as those from GEE, CWGEE, and inverse probability of treatment-weighted GEE. We demonstrate the use of our method with an example examining covariate effects on the risk of dental caries among small children.

15.

Background

Long-acting beta-agonists are one of the first-choice bronchodilator agents for stable chronic obstructive pulmonary disease, but their impact on mortality has not been well investigated.

Methods

Data were provided by the National Emphysema Treatment Trial. Patients with severe and very severe stable chronic obstructive pulmonary disease who were eligible for volume reduction surgery were recruited at 17 clinical centers in the United States during 1988–2002. We used the 6–10 year follow-up data of patients randomized to non-surgery treatment. Hazard ratios for death associated with long-acting beta-agonists were estimated with three models using Cox proportional hazards analysis and propensity score matching.

Results

The pre-matching cohort comprised 591 patients (50.6% were administered long-acting beta-agonists; age 66.6 ± 5.3 years; female 35.4%; forced expiratory volume in one second 26.7 ± 7.1% predicted; mortality during follow-up 70.2%). The hazard ratio from a multivariate Cox model in the pre-matching cohort was 0.77 (P = 0.010). Propensity score matching was conducted (C-statistic 0.62; no parameter differed between cohorts). The propensity-matched cohort comprised 492 patients (50.0% were administered long-acting beta-agonists; age 66.8 ± 5.1 years; female 34.8%; forced expiratory volume in one second 26.5 ± 6.8% predicted; mortality during follow-up 69.1%). The hazard ratio from a univariate Cox model in the propensity-matched cohort was 0.77 (P = 0.017), and from a multivariate Cox model it was 0.76 (P = 0.011).

Conclusions

Long-acting beta-agonists reduced mortality in patients with severe and very severe chronic obstructive pulmonary disease.

16.
This article develops semiparametric approaches for the estimation of propensity scores and causal survival functions from prevalent survival data. The analytical problem arises when prevalent sampling is adopted for collecting failure times and, as a result, the covariates are incompletely observed due to their association with failure time. The proposed procedure for estimating propensity scores shares interesting features with the likelihood formulation in case-control studies, but in our case it requires additional consideration of the intercept term. The result shows that corrected propensity scores in the logistic regression setting can be obtained through the standard estimation procedure with specific adjustments to the intercept term. For causal estimation, two different types of missingness are encountered in our model: one can be explained by the potential outcome framework; the other is caused by the prevalent sampling scheme. Statistical analysis that does not adjust for bias from both sources of missingness will lead to biased causal inference. The proposed methods were partly motivated by, and are applied to, the Surveillance, Epidemiology, and End Results (SEER)-Medicare linked data for women diagnosed with breast cancer.

17.
Analysts often estimate treatment effects in observational studies using propensity score matching techniques. When there are missing covariate values, analysts can multiply impute the missing data to create m completed data sets. Analysts can then estimate propensity scores on each of the completed data sets and use these to estimate treatment effects. However, relatively little attention has been paid to developing imputation models that deal with the additional problem of missing treatment indicators, perhaps because of the consequences of generating implausible imputations. Yet simply ignoring the missing treatment values, akin to a complete case analysis, can also lead to problems when estimating treatment effects. We propose a latent class model to multiply impute missing treatment indicators. We illustrate its performance through simulations and with data taken from a study on determinants of children's cognitive development. This approach obtains treatment effect estimates closer to the true treatment effect than conventional imputation procedures or a complete case analysis.

18.

Background

Quasi-experimental studies of menu labeling have found mixed results for improving diet. Differences between experimental groups can hinder interpretation. Propensity scores are an increasingly common method to improve covariate balance, but multiple methods exist and the improvements associated with each method have rarely been compared. In this re-analysis of the impact of menu labeling, we compare multiple propensity score methods to determine which methods optimize balance between experimental groups.

Methods

Study participants included adult customers who visited full-service restaurants with menu labeling (treatment) and without (control). We compared the balance between treatment groups obtained by four propensity score methods: 1) 1:1 nearest neighbor matching (NN), 2) augmented 1:1 NN (using a caliper of 0.2 and an exact match on an imbalanced covariate), 3) full matching, and 4) inverse probability weighting (IPW). We then evaluated the treatment effect on differences in nutrients purchased across the different methods.
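Of the four methods, IPW is the simplest to write down. As an illustrative sketch (ATE-style weights; this is not the study's analysis code), treated subjects receive weight 1/ps and controls 1/(1 − ps):

```python
import numpy as np

def ipt_weights(ps, treated):
    """ATE-style inverse probability of treatment weights:
    1/ps for treated subjects, 1/(1 - ps) for controls."""
    ps = np.asarray(ps, float)
    treated = np.asarray(treated)
    return np.where(treated == 1, 1.0 / ps, 1.0 / (1.0 - ps))
```

Unlike 1:1 matching, weighting keeps every subject in the analysis, which is why full matching and IPW avoided the sample-size loss reported below.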

Results

1:1 NN resulted in worse balance than the original unmatched sample (average standardized absolute mean distance [ASAM]: 0.185 compared to 0.171). Augmented 1:1 NN improved balance (ASAM: 0.038) but resulted in a large reduction in sample size. Full matching and IPW improved balance over the unmatched sample without a reduction in sample size (ASAM: 0.049 and 0.031, respectively). Menu labeling was associated with decreased calories, fat, sodium and carbohydrates in the unmatched analysis. Results were qualitatively similar in the propensity score matched/weighted models.

Conclusions

While propensity scores offer an increasingly popular tool to improve causal inference, choosing the correct method can be challenging. Our results emphasize the benefit of examining multiple methods to ensure results are consistent, and considering approaches beyond the most popular method of 1:1 NN matching.

19.
Propensity score matching (PSM) and propensity score weighting (PSW) are popular tools to estimate causal effects in observational studies. We address two open issues: how to estimate propensity scores and how to assess covariate balance. Using simulations, we compare the performance of PSM and PSW based on logistic regression and machine learning algorithms (CART, bagging, boosting, random forests, neural networks, naive Bayes). Additionally, we consider several measures of covariate balance (the absolute standardized average mean difference (ASAM), with and without interactions; measures based on quantile-quantile plots; the ratio between the variances of the propensity scores; the area under the curve (AUC)) and assess their ability to predict the bias of the PSM and PSW estimators. We also investigate the importance of tuning the machine learning parameters in the context of propensity score methods. Two simulation designs are employed. In the first, the generating processes are inspired by birth register data used to assess the effect of labor induction on the occurrence of caesarean section. The second exploits more general generating mechanisms. Overall, among the different techniques, random forests performed the best, especially in PSW. Logistic regression and neural networks also showed excellent performance, similar to that of random forests. As for covariate balance, the simplest and most commonly used metric, the ASAM, showed a strong correlation with the bias of the causal effect estimators. Our findings suggest that researchers should aim at obtaining an ASAM lower than 10% for as many variables as possible. In the empirical study we found that labor induction had a small and not statistically significant impact on caesarean section.
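The ASAM balance metric is straightforward to compute. A small sketch (assuming the conventional standardized-mean-difference form with unweighted pooled variances; not the authors' code):

```python
import numpy as np

def asam(X, treated, weights=None):
    """Average standardized absolute mean difference (ASAM) over the
    columns of X, optionally under propensity-score weights (e.g. IPW)."""
    X = np.atleast_2d(np.asarray(X, float))
    treated = np.asarray(treated)
    t = treated == 1
    w = np.ones(len(treated)) if weights is None else np.asarray(weights, float)
    smds = []
    for j in range(X.shape[1]):
        x = X[:, j]
        m1 = np.average(x[t], weights=w[t])    # (weighted) treated mean
        m0 = np.average(x[~t], weights=w[~t])  # (weighted) control mean
        # balance checks conventionally pool the unweighted group variances
        sd = np.sqrt((x[t].var(ddof=1) + x[~t].var(ddof=1)) / 2.0)
        smds.append(abs(m1 - m0) / sd)
    return float(np.mean(smds))
```

On this scale, the abstract's "ASAM lower than 10%" rule of thumb corresponds to a returned value below 0.10.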

20.

Background

The role of uric acid (UA) in the progression of chronic kidney disease (CKD) remains controversial because cause and effect are difficult to disentangle. This study aimed to clarify the independent impact of UA on the subsequent risk of end-stage renal disease (ESRD) using a propensity score analysis.

Methods

A retrospective CKD cohort (n = 803) was used. Twenty-three baseline covariates were entered into a multivariate binary logistic regression with the targeted time-averaged UA of 6.0, 6.5, or 7.0 mg/dL. After trimming 2.5 percentiles from each extreme end of the cohort, the participants underwent propensity score analyses consisting of matching, stratification on quintiles, and covariate adjustment. Covariate balance after 1:1 matching without replacement was tested by paired analysis and standardized differences. A stratified Cox regression and a Cox regression adjusted for the logit of the propensity score were examined.

Results

After propensity score matching, higher UA showed elevated hazard ratios (HRs) by Kaplan-Meier analysis (≥6.0 mg/dL: HR 4.53, 95%CI 1.79–11.43; ≥6.5 mg/dL: HR 3.39, 95%CI 1.55–7.42; ≥7.0 mg/dL: HR 2.19, 95%CI 1.28–3.75). The number needed to treat was 8 to 9 over 5 years. A stratified Cox regression likewise showed significant crude HRs (≥6.0 mg/dL: HR 3.63, 95%CI 1.25–10.58; ≥6.5 mg/dL: HR 3.46, 95%CI 1.56–7.68; ≥7.0 mg/dL: HR 2.05, 95%CI 1.21–3.48). The adjusted HR lost its significance at 6.0 mg/dL. Adjustment for the logit of the propensity scores showed similar results but with worse model fit than the stratification method. Upon further adjustment for other covariates, significance was attained at 6.5 mg/dL.

Conclusions

Three different methods of propensity score analysis showed consistent results: higher UA accelerates progression to ESRD. A stratified Cox regression outperformed the other methods in generalizability and in adjusting for residual bias. Serum UA should be targeted to less than 6.5 mg/dL.
