Similar Articles
20 similar articles found.
1.
VanderWeele TJ. Biometrics 2008, 64(3): 702–706.
Summary. Unmeasured confounding variables are a common problem in drawing causal inferences in observational studies. A theorem is given which in certain circumstances allows the researcher to draw conclusions about the sign of the bias of unmeasured confounding. Specifically, it is possible to determine the sign of the bias when monotonicity relationships hold between the unmeasured confounding variable and the treatment, and between the unmeasured confounding variable and the outcome. Some discussion is given to the conditions under which the theorem applies and the strengths and limitations of using the theorem to assess the sign of the bias of unmeasured confounding.
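As a quick illustration of the sign result, the following minimal Python sketch (all parameters illustrative, not from the paper) simulates a binary unmeasured confounder that monotonically increases both the probability of treatment and the mean of the outcome; the crude effect estimate is then biased upward, as the theorem predicts.

```python
# Minimal sketch: an unmeasured binary confounder U that monotonically
# increases both P(treatment) and E[outcome] biases the crude estimate
# upward. All numbers are illustrative; the true effect of A is 1.0.
import numpy as np

rng = np.random.default_rng(0)
n = 200_000
U = rng.binomial(1, 0.5, n)                  # unmeasured confounder
A = rng.binomial(1, 0.3 + 0.4 * U)           # P(A=1 | U) increasing in U
Y = 1.0 * A + 2.0 * U + rng.normal(0, 1, n)  # E[Y | A, U] increasing in U

crude = Y[A == 1].mean() - Y[A == 0].mean()
adjusted = np.mean([Y[(A == 1) & (U == u)].mean() - Y[(A == 0) & (U == u)].mean()
                    for u in (0, 1)])
print(f"crude = {crude:.3f}, U-adjusted = {adjusted:.3f}")  # crude > adjusted
```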

2.
Data-driven methods for personalizing treatment assignment have garnered much attention from clinicians and researchers. Dynamic treatment regimes formalize this through a sequence of decision rules that map individual patient characteristics to a recommended treatment. Observational studies are commonly used for estimating dynamic treatment regimes due to the potentially prohibitive costs of conducting sequential multiple assignment randomized trials. However, estimating a dynamic treatment regime from observational data can lead to bias in the estimated regime due to unmeasured confounding. Sensitivity analyses are useful for assessing how robust the conclusions of the study are to a potential unmeasured confounder. A Monte Carlo sensitivity analysis is a probabilistic approach that involves positing and sampling from distributions for the parameters governing the bias. We propose a method for performing a Monte Carlo sensitivity analysis of the bias due to unmeasured confounding in the estimation of dynamic treatment regimes. We demonstrate the performance of the proposed procedure with a simulation study and apply it to an observational study, using data from Kaiser Permanente Washington, that examines tailoring antidepressant medication to reduce symptoms of depression.
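The core Monte Carlo idea can be sketched generically. The toy below is a single-decision-point version, not the authors' dynamic-treatment-regime procedure; the naive estimate and all bias-parameter priors are hypothetical.

```python
# Generic Monte Carlo sensitivity analysis sketch: posit distributions for
# the bias parameters of a binary unmeasured confounder U, sample them, and
# propagate each draw into a bias-corrected effect estimate.
import numpy as np

rng = np.random.default_rng(1)
naive_effect = 0.40                      # observed (possibly confounded) effect
corrected = []
for _ in range(5_000):
    p_u1 = rng.uniform(0.2, 0.8)         # P(U = 1 | treated)
    p_u0 = rng.uniform(0.2, 0.8)         # P(U = 1 | control)
    gamma = rng.normal(0.5, 0.2)         # effect of U on the outcome
    bias = gamma * (p_u1 - p_u0)         # simple additive bias formula
    corrected.append(naive_effect - bias)

lo, med, hi = np.percentile(corrected, [2.5, 50, 97.5])
print(f"bias-corrected effect: median {med:.2f}, 95% interval ({lo:.2f}, {hi:.2f})")
```

Each draw yields one corrected estimate; the spread across draws summarizes how sensitive the conclusion is to the posited confounding.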

3.
The establishment of cause and effect relationships is a fundamental objective of scientific research. Many lines of evidence can be used to make cause–effect inferences. When statistical data are involved, alternative explanations for the statistical relationship need to be ruled out. These include chance (apparent patterns due to random factors), confounding effects (a relationship between two variables because they are each associated with an unmeasured third variable), and sampling bias (effects due to preexisting properties of compared groups). The gold standard for managing these issues is a controlled randomized experiment. In disciplines such as biological anthropology, where controlled experiments are not possible for many research questions, causal inferences are made from observational data. Methods that statisticians recommend for this difficult objective have not been widely adopted in the biological anthropology literature. Issues involved in using statistics to make valid causal inferences from observational data are discussed.

4.
Multiple regression of observational data is frequently used to infer causal effects. Partial regression coefficients are biased estimates of causal effects if unmeasured confounders are not in the regression model. The sensitivity of partial regression coefficients to omitted confounders is investigated with a Monte Carlo simulation. A subset of causal traits is “measured” and their effects are estimated using ordinary least squares regression and compared to their expected values. Three major results are: (1) the error due to confounding is much larger than that due to sampling, especially with large samples, (2) confounding error shrinks trivially with sample size, and (3) small true effects are frequently estimated as large effects. Consequently, confidence intervals from regression are poor guides to the true intervals, especially with large sample sizes. The addition of a confounder to the model improves estimates only 55% of the time. Results are improved with complete knowledge of the rank order of causal effects but even with this omniscience, measured intervals are poor proxies for true intervals if there are many unmeasured confounders. The results suggest that only under very limited conditions can we have much confidence in the magnitude of partial regression coefficients as estimates of causal effects.
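A simulation in this spirit takes only a few lines. The sketch below (illustrative dimensions and effect sizes, not the paper's design) shows the partial regression coefficients of the measured traits inflated by omitted, correlated causes.

```python
# Sketch: six correlated causal traits, only three "measured". OLS partial
# coefficients for the measured traits absorb the effects of the omitted,
# correlated traits and come out inflated.
import numpy as np

rng = np.random.default_rng(2)
n, k = 10_000, 6
beta = np.array([0.1, 0.2, 0.3, 0.4, 0.5, 0.6])   # true causal effects
latent = rng.normal(size=(n, 1))                  # shared factor -> correlation
X = 0.7 * latent + rng.normal(size=(n, k))
y = X @ beta + rng.normal(size=n)

measured = [0, 1, 2]                              # traits 3-5 are omitted
Xm = np.column_stack([X[:, measured], np.ones(n)])
coef, *_ = np.linalg.lstsq(Xm, y, rcond=None)
for j, b_hat in zip(measured, coef[:-1]):
    print(f"trait {j}: true {beta[j]:.2f}, estimated {b_hat:.2f}")  # inflated
```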

5.
Over the last decade the availability of SNP-trait associations from genome-wide association studies has led to an array of methods for performing Mendelian randomization studies using only summary statistics. A common feature of these methods, besides their intuitive simplicity, is the ability to combine data from several sources, incorporate multiple variants and account for biases due to weak instruments and pleiotropy. With the advent of large and accessible fully-genotyped cohorts such as UK Biobank, there is now increasing interest in understanding how best to apply these well-developed summary data methods to individual level data, and to explore the use of more sophisticated causal methods allowing for non-linearity and effect modification. In this paper we describe a general procedure for optimally applying any two sample summary data method using one sample data. Our procedure first performs a meta-analysis of summary data estimates that are intentionally contaminated by collider bias between the genetic instruments and unmeasured confounders, due to conditioning on the observed exposure. These estimates are then used to correct the standard observational association between an exposure and outcome. Simulations are conducted to demonstrate the method’s performance against naive applications of two sample summary data MR. We apply the approach to the UK Biobank cohort to investigate the causal role of sleep disturbance on HbA1c levels, an important determinant of diabetes. Our approach can be viewed as a generalization of Dudbridge et al. (Nat. Comm. 10: 1561), who developed a technique to adjust for index event bias when uncovering genetic predictors of disease progression based on case-only data. Our work serves to clarify that in any one sample MR analysis, it can be advantageous to estimate causal relationships by artificially inducing and then correcting for collider bias.

6.
In an observational study, the treatment received and the outcome exhibited may be associated in the absence of an effect caused by the treatment, even after controlling for observed covariates. Two tactics are common: (i) a test for unmeasured bias may be obtained using a secondary outcome for which the effect is known and (ii) a sensitivity analysis may explore the magnitude of unmeasured bias that would need to be present to explain the observed association as something other than an effect caused by the treatment. Can such a test for unmeasured bias inform the sensitivity analysis? If the test for bias does not discover evidence of unmeasured bias, then ask: Are conclusions therefore insensitive to larger unmeasured biases? Conversely, if the test for bias does find evidence of bias, then ask: What does that imply about sensitivity to biases? This problem is formulated in a new way as a convex quadratically constrained quadratic program and solved on a large scale using interior point methods by a modern solver. That is, a convex quadratic function of N variables is minimized subject to constraints on linear and convex quadratic functions of these variables. The quadratic function that is minimized is a statistic for the primary outcome that is a function of the unknown treatment assignment probabilities. The quadratic function that constrains this minimization is a statistic for the subsidiary outcome that is also a function of these same unknown treatment assignment probabilities. In effect, the first statistic is minimized over a confidence set for the unknown treatment assignment probabilities supplied by the unaffected outcome. This process avoids the mistake of interpreting the failure to reject a hypothesis as support for the truth of that hypothesis. The method is illustrated by a study of the effects of light daily alcohol consumption on high-density lipoprotein (HDL) cholesterol levels. In this study, the method quickly optimizes a nonlinear function of N = 800 variables subject to linear and quadratic constraints. In the example, strong evidence of unmeasured bias is found using the subsidiary outcome, but, perhaps surprisingly, this finding makes the primary comparison insensitive to larger biases.
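The optimization structure can be conveyed with a toy problem. The sketch below uses illustrative quadratics rather than the paper's matched-pair statistics, and SLSQP rather than a large-scale interior-point solver: a convex quadratic is minimized subject to a convex quadratic confidence-set constraint and box constraints.

```python
# Toy convex QCQP sketch: minimize a quadratic in the unknown assignment
# probabilities p, subject to a quadratic constraint from the subsidiary
# outcome (p'p <= r) and box constraints (0 <= p <= 1). Illustrative only.
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(3)
n = 20                                   # toy size (the application has N = 800)
b = rng.normal(size=n)

def primary(p):                          # convex quadratic statistic to minimize
    return p @ p + b @ p

confidence_set = {"type": "ineq",        # feasible iff r - p'p >= 0
                  "fun": lambda p: 5.0 - p @ p}

res = minimize(primary, np.full(n, 0.5), method="SLSQP",
               constraints=[confidence_set],
               bounds=[(0.0, 1.0)] * n)  # probabilities lie in [0, 1]
print(f"optimum {res.fun:.3f}, solved: {res.success}")
```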

7.
VanderWeele TJ. Biometrics 2008, 64(2): 645–649.
Summary. In a presentation of various methods for assessing the sensitivity of regression results to unmeasured confounding, Lin, Psaty, and Kronmal (1998, Biometrics 54, 948–963) use a conditional independence assumption to derive algebraic relationships between the true exposure effect and the apparent exposure effect in a reduced model that does not control for the unmeasured confounding variable. However, Hernán and Robins (1999, Biometrics 55, 1316–1317) have noted that if the measured covariates and the unmeasured confounder both affect the exposure of interest then the principal conditional independence assumption that is used to derive these algebraic relationships cannot hold. One particular result of Lin et al. does not rely on the conditional independence assumption but only on assumptions concerning additivity. It can be shown that this additivity assumption is satisfied for an entire family of distributions even if both the measured covariates and the unmeasured confounder affect the exposure of interest. These considerations clarify the appropriate contexts in which relevant sensitivity analysis techniques can be applied.

8.
Performing causal inference in observational studies requires assuming that confounding variables are correctly adjusted for. In settings with few discrete-valued confounders, standard models can be employed. However, as the number of confounders increases, these models become less feasible because there are fewer observations available for each unique combination of confounding variables. In this paper, we propose a new model for estimating treatment effects in observational studies that incorporates both parametric and nonparametric outcome models. By conceptually splitting the data, we can combine these models while maintaining a conjugate framework, allowing us to avoid the use of Markov chain Monte Carlo (MCMC) methods. Approximations using the central limit theorem and random sampling allow our method to be scaled to high-dimensional confounders. Through simulation studies we show our method can be competitive with benchmark models while maintaining efficient computation, and we illustrate the method on a large epidemiological health survey.

9.
Studies of vaccine efficacy often record both the incidence of vaccine-targeted virus strains (primary outcome) and the incidence of nontargeted strains (secondary outcome). However, standard estimates of vaccine efficacy on targeted strains ignore the data on nontargeted strains. Assuming nontargeted strains are unaffected by vaccination, we regard the secondary outcome as a negative control outcome and show how using such data can (i) increase the precision of the estimated vaccine efficacy against targeted strains in randomized trials and (ii) reduce confounding bias of that same estimate in observational studies. For objective (i), we augment the primary outcome estimating equation with a function of the secondary outcome that is unbiased for zero. For objective (ii), we jointly estimate the treatment effects on the primary and secondary outcomes. If the bias induced by the unmeasured confounders is similar for both types of outcomes, as is plausible for factors that influence the general risk of infection, then we can use the estimated efficacy against the secondary outcomes to remove the bias from estimated efficacy against the primary outcome. We demonstrate the utility of these approaches in studies of HPV vaccines that only target a few highly carcinogenic strains. In this example, using nontargeted strains increased precision in randomized trials modestly but reduced bias in observational studies substantially.
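In the simplest multiplicative setting, objective (ii) amounts to subtracting the estimated log risk ratio for the nontargeted strains from that for the targeted strains. A minimal simulated sketch (all rates illustrative; this is not the authors' estimating-equation machinery):

```python
# Negative-control sketch: vaccination V is confounded by a general
# infection-risk factor U that affects both outcomes equally, so the
# (should-be-null) nontargeted estimate recovers the shared bias.
import numpy as np

rng = np.random.default_rng(4)
n = 200_000
U = rng.normal(size=n)                        # general infection-risk factor
V = rng.binomial(1, 1 / (1 + np.exp(-U)))     # vaccination, confounded by U
p1 = np.clip(0.05 * np.exp(-0.7 * V + 0.5 * U), 0, 1)  # targeted: log-RR -0.7
p0 = np.clip(0.05 * np.exp(0.0 * V + 0.5 * U), 0, 1)   # nontargeted: true null
y1, y0 = rng.binomial(1, p1), rng.binomial(1, p0)

def log_rr(y, v):
    return np.log(y[v == 1].mean() / y[v == 0].mean())

naive = log_rr(y1, V)             # confounded estimate for targeted strains
bias = log_rr(y0, V)              # nonzero only through shared confounding
print(f"naive {naive:.2f}, corrected {naive - bias:.2f}  (truth: -0.70)")
```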

10.

Summary. Omission of relevant covariates can lead to bias when estimating treatment or exposure effects from survival data in both randomized controlled trials and observational studies. This paper presents a general approach to assessing bias when covariates are omitted from the Cox model. The proposed method is applicable to both randomized and non-randomized studies. We distinguish between the effects of three possible sources of bias: omission of a balanced covariate, data censoring and unmeasured confounding. Asymptotic formulae for determining the bias are derived from the large sample properties of the maximum likelihood estimator. A simulation study is used to demonstrate the validity of the bias formulae and to characterize the influence of the different sources of bias. It is shown that the bias converges to fixed limits as the effect of the omitted covariate increases, irrespective of the degree of confounding. The bias formulae are used as the basis for developing a new method of sensitivity analysis to assess the impact of omitted covariates on estimates of treatment or exposure effects. In simulation studies, the proposed method gave unbiased treatment estimates and confidence intervals with good coverage when the true sensitivity parameters were known. We describe application of the method to a randomized controlled trial and a non-randomized study.
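The "omitted balanced covariate" source of bias is easy to reproduce by simulation. The sketch below (exponential event times and illustrative effect sizes; assumes the lifelines package) shows the treatment log-hazard ratio attenuated when a balanced covariate is left out of the Cox model.

```python
# Sketch: even in a randomized design where U is balanced across arms,
# omitting U from the Cox model attenuates the estimated treatment
# log-hazard ratio toward zero. True treatment log-HR is 0.7.
import numpy as np
import pandas as pd
from lifelines import CoxPHFitter

rng = np.random.default_rng(8)
n = 20_000
z = rng.binomial(1, 0.5, n)              # randomized treatment
u = rng.normal(size=n)                   # relevant covariate, to be omitted
hazard = np.exp(0.7 * z + 1.0 * u)
t = rng.exponential(1 / hazard)          # event times
c = rng.exponential(2.0, n)              # independent censoring
df = pd.DataFrame({"T": np.minimum(t, c), "E": (t <= c).astype(int), "z": z})

fit = CoxPHFitter().fit(df, duration_col="T", event_col="E")  # U omitted
print(fit.params_)                       # coefficient on z falls below 0.7
```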

11.
Rosenbaum PR. Biometrics 2011, 67(3): 1017–1027.
Summary. In an observational or nonrandomized study of treatment effects, a sensitivity analysis indicates the magnitude of bias from unmeasured covariates that would need to be present to alter the conclusions of a naïve analysis that presumes adjustments for observed covariates suffice to remove all bias. The power of a sensitivity analysis is the probability that it will reject a false hypothesis about treatment effects while allowing for a departure from random assignment of a specified magnitude; in particular, if this specified magnitude is “no departure” then this is the same as the power of a randomization test in a randomized experiment. A new family of u-statistics is proposed that includes Wilcoxon's signed rank statistic but also includes other statistics with substantially higher power when a sensitivity analysis is performed in an observational study. Wilcoxon's statistic has high power to detect small effects in large randomized experiments—that is, it often has good Pitman efficiency—but small effects are invariably sensitive to small unobserved biases. Members of this family of u-statistics that emphasize medium to large effects can have substantially higher power in a sensitivity analysis. For example, in one situation with 250 pair differences that are Normal with expectation 1/2 and variance 1, the power of a sensitivity analysis that uses Wilcoxon's statistic is 0.08 while the power of another member of the family of u-statistics is 0.66. The topic is examined by performing a sensitivity analysis in three observational studies, using an asymptotic measure called the design sensitivity, and by simulating power in finite samples. The three examples are drawn from epidemiology, clinical medicine, and genetic toxicology.
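The power of a sensitivity analysis in the paper's example setting can be approximated by simulation. The sketch below uses the standard worst-case bounds for sign-score statistics under bias at most Gamma; Gamma = 2 and the normal approximation are illustrative choices, not values from the paper.

```python
# Power of a Gamma-level sensitivity analysis for Wilcoxon's signed-rank
# statistic with 250 pair differences drawn from Normal(1/2, 1). Under bias
# Gamma, each sign is at worst Bernoulli(Gamma/(1+Gamma)), giving the
# worst-case null mean and variance used for the critical value.
import numpy as np
from scipy.stats import rankdata, norm

rng = np.random.default_rng(5)
n, gamma, alpha, reps = 250, 2.0, 0.05, 2_000
kappa = gamma / (1 + gamma)                       # worst-case P(sign = +)
null_mean = kappa * n * (n + 1) / 2
null_var = kappa * (1 - kappa) * n * (n + 1) * (2 * n + 1) / 6
crit = null_mean + norm.ppf(1 - alpha) * np.sqrt(null_var)

reject = 0
for _ in range(reps):
    d = rng.normal(0.5, 1.0, n)
    T = rankdata(np.abs(d))[d > 0].sum()          # Wilcoxon signed-rank
    reject += T >= crit
print(f"power of the Gamma={gamma} sensitivity analysis: {reject / reps:.2f}")
```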

12.
Sir Ronald Aylmer Fisher was the most famous and most productive statistician of the 20th century. Throughout his life, however, Fisher doubted the causal relationship between tobacco smoking and lung cancer. Instead, he invoked a genetic confounder to explain the statistical association between the two factors, i.e., he believed in the existence of a gene that plays a role in both cancer etiology and smoking behavior. There have been many attempts to explain Fisher’s stubbornness regarding this matter. In addition to nonscientific reasons (Fisher was himself a keen smoker), worries about the future importance of valid statistical methodology in medical research may also have played an important role. Interestingly, recent genome-wide association studies (GWAS) of smoking behavior as well as lung cancer have revealed that there may have been a grain of truth in Fisher’s idea and that his confounder may coincide with the gene encoding nicotine receptor subunit α5 on chromosome 15q25.

13.
The evolutionary potential of organisms depends on how their parts are structured into a cohesive whole. A major obstacle for empirical studies of phenotypic organization is that observed associations among characters usually confound different causal pathways such as pleiotropic modules, interphenotypic causal relationships and environmental effects. The present article proposes causal search algorithms as a new tool to distinguish these different modes of phenotypic integration. Without assuming an a priori structure, the algorithms seek a class of causal hypotheses consistent with independence relationships holding in observational data. The technique can be applied to discover causal relationships among a set of measured traits and to distinguish genuine selection from spurious correlations. The former application is illustrated with a biological data set of rat morphological measurements previously analysed by Cheverud et al. (Evolution 1983, 37, 895).
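The primitive operation of such constraint-based search is a conditional-independence test. A minimal sketch (data simulated from an illustrative chain X -> Z -> Y; Fisher z-test of the partial correlation, not a full search algorithm):

```python
# One conditional-independence test of the kind a constraint-based causal
# search repeats many times: in the chain X -> Z -> Y, the partial
# correlation of X and Y given Z should be near zero.
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(9)
n = 5_000
x = rng.normal(size=n)
z = 0.8 * x + rng.normal(size=n)
y = 0.8 * z + rng.normal(size=n)             # X independent of Y given Z

def residualize(a, c):
    slope, intercept = np.polyfit(c, a, 1)   # regress a on c, keep residual
    return a - (slope * c + intercept)

r = np.corrcoef(residualize(x, z), residualize(y, z))[0, 1]
zstat = np.sqrt(n - 1 - 3) * np.arctanh(r)   # n - |conditioning set| - 3
pval = 2 * norm.sf(abs(zstat))
print(f"partial corr(X, Y | Z) = {r:.3f}, p = {pval:.3f}")  # fail to reject
```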

14.
In this paper, we discuss the identifiability and estimation of causal effects of a continuous treatment on a binary response when the treatment is measured with errors and there exists a latent categorical confounder associated with both treatment and response. Under some widely used parametric models, we first discuss the identifiability of the causal effects and then propose an approach for estimation and inference. Our approach can eliminate the biases induced by latent confounding and measurement errors by using only a single instrumental variable. Based on the identification results, we give guidelines for determining the existence of a latent categorical confounder and for selecting the number of levels of the latent confounder. We apply the proposed approach to a data set from the Framingham Heart Study to evaluate the effect of systolic blood pressure on coronary heart disease.

15.
Causal inference has been increasingly reliant on observational studies with rich covariate information. To build tractable causal procedures, such as the doubly robust estimators, it is imperative to first extract important features from high or even ultra-high dimensional data. In this paper, we propose causal ball screening for confounder selection from modern ultra-high dimensional data sets. Unlike the familiar task of variable selection for prediction modeling, our confounder selection procedure aims to control for confounding while improving efficiency in the resulting causal effect estimate. Previous empirical and theoretical studies suggest excluding causes of the treatment that are not confounders. Motivated by these results, our goal is to keep all the predictors of the outcome in both the propensity score and outcome regression models. A distinctive feature of our proposal is that we use an outcome model-free procedure for propensity score model selection, thereby maintaining double robustness in the resulting causal effect estimator. Our theoretical analyses show that the proposed procedure enjoys a number of properties, including model selection consistency and pointwise normality. Synthetic and real data analyses show that our proposal compares favorably with existing methods in a range of realistic settings. Data used in preparation of this paper were obtained from the Alzheimer's Disease Neuroimaging Initiative (ADNI) database.

16.
Propensity score matching (PSM) and propensity score weighting (PSW) are popular tools to estimate causal effects in observational studies. We address two open issues: how to estimate propensity scores and assess covariate balance. Using simulations, we compare the performance of PSM and PSW based on logistic regression and machine learning algorithms (CART; Bagging; Boosting; Random Forest; Neural Networks; naive Bayes). Additionally, we consider several measures of covariate balance (Absolute Standardized Average Mean difference (ASAM) with and without interactions; measures based on the quantile-quantile plots; ratio between variances of propensity scores; area under the curve (AUC)) and assess their ability to predict the bias of PSM and PSW estimators. We also investigate the importance of tuning of machine learning parameters in the context of propensity score methods. Two simulation designs are employed. In the first, the generating processes are inspired by birth register data used to assess the effect of labor induction on the occurrence of caesarean section. The second exploits more general generating mechanisms. Overall, among the different techniques, random forests performed the best, especially in PSW. Logistic regression and neural networks also showed an excellent performance similar to that of random forests. As for covariate balance, the simplest and commonly used metric, the ASAM, showed a strong correlation with the bias of causal effects estimators. Our findings suggest that researchers should aim at obtaining an ASAM lower than 10% for as many variables as possible. In the empirical study we found that labor induction had a small and not statistically significant impact on caesarean section.
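The ASAM diagnostic itself is straightforward to compute. A minimal sketch with a logistic propensity model and simulated data (the 10% target is from the abstract; everything else is illustrative):

```python
# ASAM sketch: absolute standardized difference in covariate means between
# treatment groups, as a percent of the pooled SD, before and after
# inverse-propensity weighting.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(6)
n = 5_000
X = rng.normal(size=(n, 4))
t = rng.binomial(1, 1 / (1 + np.exp(-X @ np.array([0.5, -0.3, 0.2, 0.0]))))

ps = LogisticRegression().fit(X, t).predict_proba(X)[:, 1]
w = np.where(t == 1, 1 / ps, 1 / (1 - ps))        # inverse-probability weights

def asam(X, t, w):
    vals = []
    for j in range(X.shape[1]):
        m1 = np.average(X[t == 1, j], weights=w[t == 1])
        m0 = np.average(X[t == 0, j], weights=w[t == 0])
        sd = np.sqrt((X[t == 1, j].var() + X[t == 0, j].var()) / 2)
        vals.append(100 * abs(m1 - m0) / sd)      # percent of pooled SD
    return np.round(vals, 1)

print("ASAM before weighting:", asam(X, t, np.ones(n)))
print("ASAM after weighting: ", asam(X, t, w))    # aim for < 10% throughout
```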

17.
18.
We assume that multivariate observational data are generated from a distribution whose conditional independencies are encoded in a Directed Acyclic Graph (DAG). For any given DAG, the causal effect of a variable onto another one can be evaluated through intervention calculus. A DAG is typically not identifiable from observational data alone. However, its Markov equivalence class (a collection of DAGs) can be estimated from the data. As a consequence, for the same intervention a set of causal effects, one for each DAG in the equivalence class, can be evaluated. In this paper, we propose a fully Bayesian methodology to make inference on the causal effects of any intervention in the system. Main features of our method are: (a) both uncertainty on the equivalence class and the causal effects are jointly modeled; (b) priors on the parameters of the modified Cholesky decomposition of the precision matrices across all DAG models are constructively assigned starting from a unique prior on the complete (unrestricted) DAG; (c) an efficient algorithm to sample from the posterior distribution on graph space is adopted; (d) an objective Bayes approach, requiring virtually no user specification, is used throughout. We demonstrate the merits of our methodology in simulation studies, wherein comparisons with current state-of-the-art procedures turn out to be highly satisfactory. Finally, we examine a real data set of gene expressions for Arabidopsis thaliana.

19.
Guanglei Hong, Fan Yang, Xu Qin. Biometrics 2023, 79(2): 1042–1056.
In causal mediation studies that decompose an average treatment effect into indirect and direct effects, examples of posttreatment confounding are abundant. In the presence of treatment-by-mediator interactions, past research has generally considered it infeasible to adjust for a posttreatment confounder of the mediator–outcome relationship due to incomplete information: for any given individual, a posttreatment confounder is observed under the actual treatment condition while missing under the counterfactual treatment condition. This paper proposes a new sensitivity analysis strategy for handling posttreatment confounding and incorporates it into weighting-based causal mediation analysis. The key is to obtain the conditional distribution of the posttreatment confounder under the counterfactual treatment as a function of not only pretreatment covariates but also its counterpart under the actual treatment. The sensitivity analysis then generates a bound for the natural indirect effect and that for the natural direct effect over a plausible range of the conditional correlation between the posttreatment confounder under the actual and that under the counterfactual conditions. Implemented through either imputation or integration, the strategy is suitable for binary as well as continuous measures of posttreatment confounders. Simulation results demonstrate major strengths and potential limitations of this new solution. A reanalysis of the National Evaluation of Welfare-to-Work Strategies (NEWWS) Riverside data reveals that the initial analytic results are sensitive to omitted posttreatment confounding.

20.
Accommodating general patterns of confounding in sample size/power calculations for observational studies is extremely challenging, both technically and scientifically. While employing previously implemented sample size/power tools is appealing, they typically ignore important aspects of the design/data structure. In this paper, we show that sample size/power calculations that ignore confounding can be much more unreliable than is conventionally thought; using real data from the US state of North Carolina, naive calculations yield sample size estimates that are half those obtained when confounding is appropriately acknowledged. Unfortunately, eliciting realistic design parameters for confounding mechanisms is difficult. To overcome this, we propose a novel two-stage strategy for observational study design that can accommodate arbitrary patterns of confounding. At the first stage, researchers establish bounds for power that facilitate the decision of whether or not to initiate the study. At the second stage, internal pilot data are used to estimate key scientific inputs that can be used to obtain realistic sample size/power. Our results indicate that the strategy is effective at replicating gold standard calculations based on knowing the true confounding mechanism. Finally, we show that consideration of the nature of confounding is a crucial aspect of the elicitation process; depending on whether the confounder is positively or negatively associated with the exposure of interest and outcome, naive power calculations can either under- or overestimate the required sample size. Throughout, simulation is advocated as the only general means to obtain realistic estimates of statistical power; we describe, and provide in an R package, a simple algorithm for estimating power for a case-control study.
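Simulation-based power estimation of the kind advocated here is compact to write. The authors provide their algorithm in an R package; the Python sketch below is a hypothetical cohort-style stand-in with illustrative parameters, not the authors' case-control procedure: simulate confounder, exposure, and outcome, fit the logistic model, and take power as the rejection rate.

```python
# Simulation-based power sketch with a binary confounder U that is
# associated with the exposure X: power for the exposure coefficient is
# estimated as the rejection rate across simulated data sets.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(7)

def one_rep(n, beta_x=0.4, beta_u=0.8):
    u = rng.binomial(1, 0.3, n)                    # confounder
    x = rng.binomial(1, 0.2 + 0.3 * u, n)          # exposure, associated with U
    logit = -1.0 + beta_x * x + beta_u * u
    y = rng.binomial(1, 1 / (1 + np.exp(-logit)))  # binary outcome
    fit = sm.Logit(y, sm.add_constant(np.column_stack([x, u]))).fit(disp=0)
    return fit.pvalues[1] < 0.05                   # Wald test on the exposure

power = np.mean([one_rep(1_000) for _ in range(500)])
print(f"estimated power at n = 1000: {power:.2f}")
```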
