Similar articles
 Found 20 similar articles (search time: 250 ms)
1.
Mendelian randomization (MR) analysis uses genotypes as instruments to estimate the causal effect of an exposure in the presence of unobserved confounders. The existing MR methods focus on the data generated from prospective cohort studies. We develop a procedure for studying binary outcomes under a case-control design. The proposed procedure is built upon two working models commonly used for MR analyses and adopts a quasi-empirical likelihood framework to address the ascertainment bias from case-control sampling. We derive various approaches for estimating the causal effect and hypothesis testing under the empirical likelihood framework. We conduct extensive simulation studies to evaluate the proposed methods. We find that the proposed empirical likelihood estimate is less biased than the existing estimates. Among all the approaches considered, the Lagrange multiplier (LM) test has the highest power, and the confidence intervals derived from the LM test have the most accurate coverage. We illustrate the use of our method in MR analysis of prostate cancer case-control data with vitamin D level as exposure and three single nucleotide polymorphisms as instruments.

2.
Mendelian randomization utilizes genetic variants as instrumental variables (IVs) to estimate the causal effect of an exposure variable on an outcome of interest even in the presence of unmeasured confounders. However, the popular inverse-variance weighted (IVW) estimator could be biased in the presence of weak IVs, a common challenge in MR studies. In this article, we develop a novel penalized inverse-variance weighted (pIVW) estimator, which adjusts the original IVW estimator to account for the weak IV issue by using a penalization approach to prevent the denominator of the pIVW estimator from being close to zero. Moreover, we adjust the variance estimation of the pIVW estimator to account for the presence of balanced horizontal pleiotropy. We show that the recently proposed debiased IVW (dIVW) estimator is a special case of our proposed pIVW estimator. We further prove that the pIVW estimator has smaller bias and variance than the dIVW estimator under some regularity conditions. We also conduct extensive simulation studies to demonstrate the performance of the proposed pIVW estimator. Furthermore, we apply the pIVW estimator to estimate the causal effects of five obesity-related exposures on three coronavirus disease 2019 (COVID-19) outcomes. Notably, we find that hypertensive disease is associated with an increased risk of hospitalized COVID-19; and peripheral vascular disease and higher body mass index are associated with increased risks of COVID-19 infection, hospitalized COVID-19, and critically ill COVID-19.
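
To make the role of the denominator concrete, here is a minimal sketch of the standard IVW estimator and the debiased IVW (dIVW) estimator as they are usually written for two-sample summary data. The function names, argument names, and the toy numbers are assumptions made for illustration; the penalized pIVW estimator itself is not reproduced here because the abstract does not give its formula.

```python
def ivw_estimate(beta_x, beta_y, se_y):
    """Inverse-variance weighted estimate from per-variant summary statistics.

    beta_x: variant-exposure associations
    beta_y: variant-outcome associations
    se_y:   standard errors of the variant-outcome associations
    """
    num = sum(bx * by / sy ** 2 for bx, by, sy in zip(beta_x, beta_y, se_y))
    den = sum(bx ** 2 / sy ** 2 for bx, sy in zip(beta_x, se_y))
    return num / den


def divw_estimate(beta_x, beta_y, se_x, se_y):
    """Debiased IVW: subtract se_x^2 from beta_x^2 in the denominator.

    With weak instruments beta_x^2 overstates the true squared effect by
    roughly se_x^2, which pulls the plain IVW estimate toward zero.
    The corrected denominator can get close to zero, which is the
    instability the penalized estimator is described as guarding against.
    """
    num = sum(bx * by / sy ** 2 for bx, by, sy in zip(beta_x, beta_y, se_y))
    den = sum((bx ** 2 - sx ** 2) / sy ** 2
              for bx, sx, sy in zip(beta_x, se_x, se_y))
    return num / den


# Hypothetical summary statistics for three variants (illustrative only).
beta_x = [0.10, 0.08, 0.12]
se_x = [0.02, 0.02, 0.03]
beta_y = [0.05, 0.04, 0.06]
se_y = [0.01, 0.01, 0.02]

ivw = ivw_estimate(beta_x, beta_y, se_y)
divw = divw_estimate(beta_x, beta_y, se_x, se_y)
print(f"IVW = {ivw:.4f}, dIVW = {divw:.4f}")
```

In this toy input the outcome associations are exactly half the exposure associations, so IVW returns 0.5 and dIVW returns a slightly larger value because its denominator is smaller.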

3.
Mendelian randomization methods, which use genetic variants as instrumental variables for exposures of interest to overcome problems of confounding and reverse causality, are becoming widespread for assessing causal relationships in epidemiological studies. The main purpose of this paper is to demonstrate how results can be biased if researchers select genetic variants on the basis of their association with the exposure in their own dataset, as often happens in candidate gene analyses. This can lead to estimates that indicate apparent “causal” relationships, despite there being no true effect of the exposure. In addition, we discuss the potential bias in estimates of magnitudes of effect from Mendelian randomization analyses when the measured exposure is a poor proxy for the true underlying exposure. We illustrate these points with specific reference to tobacco research.

4.
Nonrandom selection in one-sample Mendelian Randomization (MR) results in biased estimates and inflated type I error rates only when the selection effects are sufficiently large. In two-sample MR, the different selection mechanisms in two samples may more seriously affect the causal effect estimation. First, we propose sufficient conditions for causal effect invariance under different selection mechanisms using two-sample MR methods. In the simulation study, we consider 49 possible selection mechanisms in two-sample MR, which depend on genetic variants (G), exposures (X), outcomes (Y) and their combinations. We further compare eight pleiotropy-robust methods under different selection mechanisms. Simulation results reveal that nonrandom selection in sample II has a larger influence on biases and type I error rates than that in sample I. Furthermore, selections depending on X+Y, G+Y, or G+X+Y in sample II lead to larger biases than other selection mechanisms. Notably, when selection depends on Y, the bias of causal estimation for a non-zero causal effect is larger than that for a null causal effect. In particular, the mode-based estimate has the largest standard errors among the eight methods. In the absence of pleiotropy, selections depending on Y or G in sample II show nearly unbiased causal effect estimations when the causal effect is null. In the scenarios of balanced pleiotropy, all eight MR methods, especially MR-Egger, demonstrate large biases because the nonrandom selections result in the violation of the Instrument Strength Independent of Direct Effect (InSIDE) assumption. When directional pleiotropy exists, nonrandom selections have a severe impact on the eight MR methods. Application demonstrates that the nonrandom selection in sample II (coronary heart disease patients) can magnify the causal effect estimation of obesity on HbA1c levels. In conclusion, nonrandom selection in two-sample MR exacerbates the bias of causal effect estimation for pleiotropy-robust MR methods.

5.
Studying time-dependent exposure mixtures has gained increasing attention in environmental health research. When a scalar outcome is of interest, distributed lag (DL) models have been employed to characterize the exposure effects distributed over time on the mean of the final outcome. However, there is a methodological gap in investigating time-dependent exposure mixtures with different quantiles of the outcome. In this paper, we introduce semiparametric partial-linear single-index (PLSI) DL quantile regression, which can describe the DL effects of time-dependent exposure mixtures on different quantiles of the outcome and identify susceptible periods of exposures. We consider two time-dependent exposure settings: discrete and functional, when exposures are measured in a small number of time points and at dense time grids, respectively. Spline techniques are used to approximate the nonparametric DL function and single-index link function, and a profile estimation algorithm is proposed. Through extensive simulations, we demonstrate the performance and value of our proposed models and inference procedures. We further apply the proposed methods to study the effects of maternal exposures to ambient air pollutants of fine particulate and nitrogen dioxide on birth weight in the New York University Children's Health and Environment Study (NYU CHES).

6.
Over the last decade the availability of SNP-trait associations from genome-wide association studies has led to an array of methods for performing Mendelian randomization studies using only summary statistics. A common feature of these methods, besides their intuitive simplicity, is the ability to combine data from several sources, incorporate multiple variants and account for biases due to weak instruments and pleiotropy. With the advent of large and accessible fully-genotyped cohorts such as UK Biobank, there is now increasing interest in understanding how best to apply these well developed summary data methods to individual level data, and to explore the use of more sophisticated causal methods allowing for non-linearity and effect modification. In this paper we describe a general procedure for optimally applying any two sample summary data method using one sample data. Our procedure first performs a meta-analysis of summary data estimates that are intentionally contaminated by collider bias between the genetic instruments and unmeasured confounders, due to conditioning on the observed exposure. These estimates are then used to correct the standard observational association between an exposure and outcome. Simulations are conducted to demonstrate the method’s performance against naive applications of two sample summary data MR. We apply the approach to the UK Biobank cohort to investigate the causal role of sleep disturbance on HbA1c levels, an important determinant of diabetes. Our approach can be viewed as a generalization of Dudbridge et al. (Nat. Comm. 10: 1561), who developed a technique to adjust for index event bias when uncovering genetic predictors of disease progression based on case-only data. Our work serves to clarify that in any one sample MR analysis, it can be advantageous to estimate causal relationships by artificially inducing and then correcting for collider bias.

7.
In this paper we review the methodological underpinnings of the general pharmacogenetic approach for uncovering genetically-driven treatment effect heterogeneity. This typically utilises only individuals who are treated and relies on fairly strong baseline assumptions to estimate what we term the ‘genetically moderated treatment effect’ (GMTE). When these assumptions are seriously violated, we show that a robust but less efficient estimate of the GMTE that incorporates information on the population of untreated individuals can instead be used. In cases of partial violation, we clarify when Mendelian randomization and a modified confounder adjustment method can also yield consistent estimates for the GMTE. A decision framework is then described to decide when a particular estimation strategy is most appropriate and how specific estimators can be combined to further improve efficiency. Triangulation of evidence from different data sources, each with their inherent biases and limitations, is becoming a well established principle for strengthening causal analysis. We call our framework ‘Triangulation WIthin a STudy’ (TWIST) in order to emphasise that an analysis in this spirit is also possible within a single data set, using causal estimates that are approximately uncorrelated, but reliant on different sets of assumptions. We illustrate these approaches by re-analysing primary-care-linked UK Biobank data relating to CYP2C19 genetic variants, Clopidogrel use and stroke risk, and data relating to APOE genetic variants, statin use and Coronary Artery Disease.

8.
J M Robins  S D Mark  W K Newey 《Biometrics》1992,48(2):479-495
In order to estimate the causal effects of one or more exposures or treatments on an outcome of interest, one has to account for the effect of "confounding factors" which both covary with the exposures or treatments and are independent predictors of the outcome. In this paper we present regression methods which, in contrast to standard methods, adjust for the confounding effect of multiple continuous or discrete covariates by modelling the conditional expectation of the exposures or treatments given the confounders. In the special case of a univariate dichotomous exposure or treatment, this conditional expectation is identical to what Rosenbaum and Rubin have called the propensity score. They have also proposed methods to estimate causal effects by modelling the propensity score. Our methods generalize those of Rosenbaum and Rubin in several ways. First, our approach straightforwardly allows for multivariate exposures or treatments, each of which may be continuous, ordinal, or discrete. Second, even in the case of a single dichotomous exposure, our approach does not require subclassification or matching on the propensity score so that the potential for "residual confounding," i.e., bias, due to incomplete matching is avoided. Third, our approach allows a rather general formalization of the idea that it is better to use the "estimated propensity score" than the true propensity score even when the true score is known. The additional power of our approach derives from the fact that we assume the causal effects of the exposures or treatments can be described by the parametric component of a semiparametric regression model. To illustrate our methods, we reanalyze the effect of current cigarette smoking on the level of forced expiratory volume in one second in a cohort of 2,713 adult white males. We compare the results with those obtained using standard methods.

9.
Lyles RH  MacFarlane G 《Biometrics》2000,56(2):634-639
When repeated measures of an exposure variable are obtained on individuals, it can be of epidemiologic interest to relate the slope of this variable over time to a subsequent response. Subject-specific estimates of this slope are measured with error, as are corresponding estimates of the level of exposure, i.e., the intercept of a linear regression over time. Because the intercept is often correlated with the slope and may also be associated with the outcome, each error-prone covariate (intercept and slope) is a potential confounder, thereby tending to accentuate potential biases due to measurement error. Under a familiar mixed linear model for the exposure measurements, we present closed-form estimators for the true parameters of interest in the case of a continuous outcome with complete and equally timed follow-up for all subjects. Generalizations to handle incomplete follow-up, other types of outcome variables, and additional fixed covariates are illustrated via maximum likelihood. We provide examples using data from the Multicenter AIDS Cohort Study. In these examples, substantial adjustments are made to uncorrected parameter estimates corresponding to the health-related effects of exposure variable slopes over time. We illustrate the potential impact of such adjustments on the interpretation of an epidemiologic analysis.

10.
Monte Carlo risk assessments commonly take as input empirical or parametric exposure distributions from specially designed exposure studies. The exposure studies typically have limited duration, since their design is based on statistical and practical factors (such as cost and respondent burden). For these reasons, the exposure period studied rarely corresponds to the biologic exposure period, which we define as the time at risk that is relevant for quantifying exposure that may result in health effects. Both the exposure period studied and the biologic exposure period will often differ from the exposure interval used in a Monte Carlo analysis. Such time period differences, which are often not accounted for, can have dramatic effects on the ultimate risk assessment. When exposure distributions are right skewed and/or follow a lognormal distribution, exposure will usually be overestimated for percentiles above the median by direct use of exposure study empirical data, since biologic exposure periods are generally longer than the exposure periods in exposure assessment studies. We illustrate the effect that biologic exposure time period and response error can have on exposure distributions, using soil ingestion as an example. Beginning with variance components from lognormally distributed soil ingestion estimates, we illustrate the effect of different modeling assumptions, and the sensitivity of the resulting analyses to these assumptions. We develop a strategy for determining appropriate exposure input distributions for soil ingestion, and illustrate this using data on soil ingestion in children.
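
The claim that upper percentiles are overestimated when a short study period stands in for a longer biologic exposure period can be illustrated with a deliberately simplified variance-components sketch. It assumes log-exposure splits into a between-person variance and an independent day-to-day within-person variance, and it averages on the log scale (a geometric mean over days), so the within-person part shrinks as 1/n_days. The function name and all numeric values are hypothetical; this is not the strategy developed in the paper.

```python
from math import exp, sqrt
from statistics import NormalDist


def upper_percentile(mu, var_between, var_within, n_days, p=0.95):
    """Percentile of the n_days log-scale average exposure.

    mu:          mean of log exposure
    var_between: between-person variance of log exposure
    var_within:  day-to-day within-person variance of log exposure
    n_days:      length of the averaging period, in days
    Assumes independent days and a normal distribution on the log scale.
    """
    z = NormalDist().inv_cdf(p)
    sd = sqrt(var_between + var_within / n_days)
    return exp(mu + z * sd)


# Hypothetical variance components (illustrative only).
mu, var_between, var_within = 3.0, 0.3, 1.2

p95_one_day = upper_percentile(mu, var_between, var_within, n_days=1)
p95_one_year = upper_percentile(mu, var_between, var_within, n_days=365)
median_one_day = upper_percentile(mu, var_between, var_within, n_days=1, p=0.5)

print(f"95th percentile, 1-day period:   {p95_one_day:.1f}")
print(f"95th percentile, 365-day period: {p95_one_year:.1f}")
```

Under these assumptions the median is unchanged by the averaging period, while every percentile above it is larger for the one-day period than for the one-year period.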

11.
Expression quantitative trait loci (eQTL) studies are used to understand the regulatory function of non-coding genome-wide association study (GWAS) risk loci, but colocalization alone does not demonstrate a causal relationship of gene expression affecting a trait. Evidence for mediation, that perturbation of gene expression in a given tissue or developmental context will induce a change in the downstream GWAS trait, can be provided by two-sample Mendelian Randomization (MR). Here, we introduce a new statistical method, MRLocus, for Bayesian estimation of the gene-to-trait effect from eQTL and GWAS summary data for loci with evidence of allelic heterogeneity, that is, containing multiple causal variants. MRLocus makes use of a colocalization step applied to each nearly-LD-independent eQTL, followed by an MR analysis step across eQTLs. Additionally, our method involves estimation of the extent of allelic heterogeneity through a dispersion parameter, indicating variable mediation effects from each individual eQTL on the downstream trait. Our method is evaluated against other state-of-the-art methods for estimation of the gene-to-trait mediation effect, using an existing simulation framework. In simulation, MRLocus often has the highest accuracy among competing methods, and in each case provides more accurate estimation of uncertainty as assessed through interval coverage. MRLocus is then applied to five candidate causal genes for mediation of particular GWAS traits, where gene-to-trait effects are concordant with those previously reported. We find that MRLocus’s estimation of the causal effect across eQTLs within a locus provides useful information for determining how perturbation of gene expression or individual regulatory elements will affect downstream traits. The MRLocus method is implemented as an R package available at https://mikelove.github.io/mrlocus.

12.
Generalized causal mediation analysis   Cited by: 1 (self-citations: 0, citations by others: 1)
Albert JM  Nelson S 《Biometrics》2011,67(3):1028-1038
The goal of mediation analysis is to assess direct and indirect effects of a treatment or exposure on an outcome. More generally, we may be interested in the context of a causal model as characterized by a directed acyclic graph (DAG), where mediation via a specific path from exposure to outcome may involve an arbitrary number of links (or "stages"). Methods for estimating mediation (or pathway) effects are available for a continuous outcome and a continuous mediator related via a linear model, while for a categorical outcome or categorical mediator, methods are usually limited to two-stage mediation. We present a method applicable to multiple stages of mediation and mixed variable types using generalized linear models. We define pathway effects using a potential outcomes framework and present a general formula that provides the effect of exposure through any specified pathway. Some pathway effects are nonidentifiable and their estimation requires an assumption regarding the correlation between counterfactuals. We provide a sensitivity analysis to assess the impact of this assumption. Confidence intervals for pathway effect estimates are obtained via a bootstrap method. The method is applied to a cohort study of dental caries in very low birth weight adolescents. A simulation study demonstrates low bias of pathway effect estimators and close-to-nominal coverage rates of confidence intervals. We also find low sensitivity to the counterfactual correlation in most scenarios.

13.
14.

Background

Previous Mendelian randomization studies have suggested that, while low-density lipoprotein cholesterol (LDL-c) and triglycerides are causally implicated in coronary artery disease (CAD) risk, high-density lipoprotein cholesterol (HDL-c) may not be, with causal effect estimates compatible with the null.

Principal Findings

The causal effects of these three lipid fractions can be better identified using the extended methods of ‘multivariable Mendelian randomization’. We employ this approach using published data on 185 lipid-related genetic variants and their associations with lipid fractions in 188,578 participants, and with CAD risk in 22,233 cases and 64,762 controls. Our results suggest that HDL-c may be causally protective of CAD risk, independently of the effects of LDL-c and triglycerides. Estimated causal odds ratios per standard deviation increase, based on 162 variants not having pleiotropic associations with either blood pressure or body mass index, are 1.57 (95% credible interval 1.45 to 1.70) for LDL-c, 0.91 (0.83 to 0.99, p-value = 0.028) for HDL-c, and 1.29 (1.16 to 1.43) for triglycerides.

Significance

Some interventions on HDL-c concentrations may influence risk of CAD, but to a lesser extent than interventions on LDL-c. A causal interpretation of these estimates relies on the assumption that the genetic variants do not have pleiotropic associations with risk factors on other pathways to CAD. If they do, a weaker conclusion is that genetic predictors of LDL-c, HDL-c and triglycerides each have independent associations with CAD risk.
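
One common way to picture multivariable Mendelian randomization is as a weighted regression, through the origin, of variant-outcome associations on the variant-exposure associations for all exposures at once. The sketch below implements that regression form in plain Python. It is an illustration under assumptions, not the analysis reported above, which gives credible intervals and so presumably used a different fitting procedure; the function names, the four hypothetical variants, and the "true" effects used to build the toy data are all invented for the example.

```python
from math import exp


def solve_linear(a, b):
    """Solve the square system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def mvmr_wls(beta_x, beta_y, se_y):
    """Weighted least squares of beta_y on the columns of beta_x, no intercept.

    beta_x: one row per variant, one column per exposure
    beta_y: variant-outcome associations (log odds ratios for a binary outcome)
    se_y:   their standard errors; weights are 1 / se_y^2
    """
    k = len(beta_x[0])
    w = [1.0 / s ** 2 for s in se_y]
    xtwx = [[sum(wj * row[a] * row[b] for wj, row in zip(w, beta_x))
             for b in range(k)] for a in range(k)]
    xtwy = [sum(wj * row[a] * by for wj, row, by in zip(w, beta_x, beta_y))
            for a in range(k)]
    return solve_linear(xtwx, xtwy)


# Hypothetical data: 4 variants, 3 exposures (think LDL-c, HDL-c, triglycerides).
true_effects = [0.40, -0.10, 0.25]  # invented log odds ratios per SD
beta_x = [[0.10, 0.02, 0.01],
          [0.01, 0.09, 0.03],
          [0.02, 0.01, 0.08],
          [0.05, 0.04, 0.02]]
beta_y = [sum(t * b for t, b in zip(true_effects, row)) for row in beta_x]
se_y = [0.010, 0.012, 0.011, 0.009]

estimates = mvmr_wls(beta_x, beta_y, se_y)
odds_ratios = [exp(e) for e in estimates]
print([round(e, 3) for e in estimates])
```

Because the toy outcome associations are generated without noise, the regression recovers the invented effects exactly; with real summary statistics the estimates would carry sampling error, and the pleiotropy caveat stated above would still apply.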

15.
We are interested in the estimation of average treatment effects based on right-censored data of an observational study. We focus on causal inference of differences between t-year absolute event risks in a situation with competing risks. We derive doubly robust estimation equations and implement estimators for the nuisance parameters based on working regression models for the outcome, censoring, and treatment distribution conditional on auxiliary baseline covariates. We use the functional delta method to show that these estimators are regular asymptotically linear estimators and estimate their variances based on estimates of their influence functions. In empirical studies, we assess the robustness of the estimators and the coverage of confidence intervals. The methods are further illustrated using data from a Danish registry study.

16.
Establishing causal relationships between environmental exposures and common diseases is beset with problems of unresolved confounding, reverse causation and selection bias that may result in spurious inferences. Mendelian randomization, in which a functional genetic variant acts as a proxy for an environmental exposure, provides a means of overcoming these problems because the inheritance of genetic variants is independent of (that is, randomized with respect to) the inheritance of other traits, according to Mendel’s law of independent assortment. Examples drawn from exposures and outcomes as diverse as milk and osteoporosis, alcohol and coronary heart disease, sheep dip and farm workers’ compensation neurosis, folate and neural tube defects are used to illustrate the applications of Mendelian randomization approaches in assessing potential environmental causes of disease. As with all genetic epidemiology studies there are problems associated with the need for large sample sizes, the non-replication of findings, and the lack of relevant functional genetic variants. In addition to these problems, Mendelian randomization findings may be confounded by other genetic variants in linkage disequilibrium with the variant under study, or by population stratification. Furthermore, pleiotropy of effect of a genetic variant may result in null associations, as may canalisation of genetic effects. If correctly conducted and carefully interpreted, Mendelian randomization studies can provide useful evidence to support or reject causal hypotheses linking environmental exposures to common diseases.

17.
Personal genome tests are now offered direct-to-consumer (DTC) via genetic variants identified by genome-wide association studies (GWAS) for common diseases. Tests report risk estimates (age-specific and lifetime) for various diseases based on genotypes at multiple loci. However, uncertainty surrounding such risk estimates has not been systematically investigated. With breast cancer as an example, we examined the combined effect of uncertainties in population incidence rates, genotype frequency, effect sizes, and models of joint effects among genetic variants on lifetime risk estimates. We performed simulations to estimate lifetime breast cancer risk for carriers and noncarriers of genetic variants. We derived population-based cancer incidence rates from Surveillance, Epidemiology, and End Results (SEER) Program and comparative international data. We used data for non-Hispanic white women from 2003 to 2005. We derived genotype frequencies and effect sizes from published GWAS and meta-analyses. For a single genetic variant in FGFR2 gene (rs2981582), combination of uncertainty in these parameters produced risk estimates where upper and lower 95% simulation intervals differed by more than 3-fold. Difference in population incidence rates was the largest contributor to variation in risk estimates. For a panel of five genetic variants, estimated lifetime risk of developing breast cancer before age 80 for a woman that carried all risk variants ranged from 6.1% to 21%, depending on assumptions of additive or multiplicative joint effects and breast cancer incidence rates. Epidemiologic parameters involved in computation of disease risk have substantial uncertainty, and cumulative uncertainty should be properly recognized. Reliance on point estimates alone could be seriously misleading.

18.
The case-crossover design of Maclure is widely used in epidemiology and other fields to study causal effects of transient treatments on acute outcomes. However, its validity and causal interpretation have only been justified under informal conditions. Here, we place the design in a formal counterfactual framework for the first time. Doing so helps to clarify its assumptions and interpretation. In particular, when the treatment effect is nonnull, we identify a previously unnoticed bias arising from strong common causes of the outcome at different person-times. We analyze this bias and demonstrate its potential importance with simulations. We also use our derivation of the limit of the case-crossover estimator to analyze its sensitivity to treatment effect heterogeneity, a violation of one of the informal criteria for validity. The upshot of this work for practitioners is that, while the case-crossover design can be useful for testing the causal null hypothesis in the presence of baseline confounders, extra caution is warranted when using the case-crossover design for point estimation of causal effects.

19.
Diet is considered one of the most important modifiable factors influencing human health, but efforts to identify foods or dietary patterns associated with health outcomes often suffer from biases, confounding, and reverse causation. Applying Mendelian randomization in this context may provide evidence to strengthen causality in nutrition research. To this end, we first identified 283 genetic markers associated with dietary intake in 445,779 UK Biobank participants. We then converted these associations into direct genetic effects on food exposures by adjusting them for effects mediated via other traits. The SNPs which did not show evidence of mediation were then used for MR, assessing the association between genetically predicted food choices and other risk factors and health outcomes. We show that using all associated SNPs, without omitting those which show evidence of mediation, leads to biases in downstream analyses (genetic correlations, causal inference), similar to those present in observational studies. However, MR analyses using SNPs which have only a direct effect on food exposures provided unequivocal evidence of causal associations between specific eating patterns and obesity, blood lipid status, and several other risk factors and health outcomes.

20.
Maternal exposure to environmental chemicals during pregnancy can alter birth and children's health outcomes. Research seeks to identify critical windows, time periods when exposures can change future health outcomes, and estimate the exposure–response relationship. Existing statistical approaches focus on estimation of the association between maternal exposure to a single environmental chemical observed at high temporal resolution (e.g., weekly throughout pregnancy) and children's health outcomes. Extending to multiple chemicals observed at high temporal resolution poses a dimensionality problem and statistical methods are lacking. We propose a regression tree–based model for mixtures of exposures observed at high temporal resolution. The proposed approach uses an additive ensemble of tree pairs that defines structured main effects and interactions between time-resolved predictors and performs variable selection to remove from the model predictors not correlated with the outcome. In simulation, we show that the tree-based approach performs better than existing methods for a single exposure and can accurately estimate critical windows in the exposure–response relation for mixtures. We apply our method to estimate the relationship between five exposures measured weekly throughout pregnancy and birth weight in a Denver, Colorado, birth cohort. We identified critical windows during which fine particulate matter, sulfur dioxide, and temperature are negatively associated with birth weight and an interaction between fine particulate matter and temperature. Software is made available in the R package dlmtree.

Copyright © 北京勤云科技发展有限公司 (Beijing Qinyun Technology Development Co., Ltd.)  ICP registration no. 京ICP备09084417号