Similar literature
1.
To infer a causal relationship between two traits, several correlation-based causal direction (CD) methods have been proposed that use SNPs as instrumental variables (IVs) based on GWAS summary data for the two traits; however, none of the existing CD methods can deal with SNPs with correlated pleiotropy. Alternatively, reciprocal Mendelian randomization (MR) can be applied, but it may perform poorly in the presence of (unknown) invalid IVs, especially for bi-directional causal relationships. In this paper, first, we propose a CD method that performs better than existing CD methods regardless of the presence of correlated pleiotropy. Second, along with a simple yet effective IV screening rule, we propose applying a closely related, state-of-the-art MR method in reciprocal MR, showing that its performance is almost identical to that of the new CD method when their model assumptions hold; however, if the modeling assumptions are violated, the new CD method is expected to better control type I errors. Notably, bi-directional causal relationships impose some unique challenges beyond those for uni-directional ones, and thus require special treatment. For example, we point out for the first time several scenarios where a bi-directional relationship, but not a uni-directional one, can unexpectedly cause the violation of some weak modeling assumptions commonly required by many robust MR methods. We also offer some numerical support and a modeling justification for the application of our new methods (and more generally MR) to binary traits. Finally, we applied the proposed methods to 12 risk factors and 4 common diseases, confirming mostly well-known uni-directional causal relationships, while identifying some novel and plausible bi-directional ones, such as between body mass index and type 2 diabetes (T2D), and between diastolic blood pressure and stroke.

2.
Mendelian randomization (MR) is an instrumental variable (IV) method using genetic variants such as single nucleotide polymorphisms (SNPs) as IVs to disentangle the causal relationship between an exposure and an outcome. Since any causal conclusion critically depends on the three valid IV assumptions, which will likely be violated in practice, MR methods robust to violations of the IV assumptions are greatly needed. Among such methods, Egger regression stands out as one of the most widely used due to its ease of use and perceived robustness. Although Egger regression is claimed to be robust to directional pleiotropy under the instrument strength independent of direct effect (InSIDE) assumption, it is known to depend on the orientations/coding schemes of SNPs (i.e. which allele of an SNP is selected as the reference group). The current practice, recommended as the default setting in some popular MR software packages, is to orient the SNPs so that all are positively associated with the exposure; to our knowledge, however, the robustness and potential impact of this practice have not been fully studied. We use both numerical examples (with both real data and simulated data) and analytical results to demonstrate the practical problem of Egger regression with respect to its heavy dependence on the SNP orientations. Under the assumption that InSIDE holds for some specific (and unknown) coding scheme of the SNPs, we analytically show that other coding schemes would in general lead to the violation of InSIDE. Other related MR and IV regression methods may suffer from the same problem. Caution should be taken when applying Egger regression (and related MR and IV regression methods) in practice.
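
The orientation dependence described in this abstract can be illustrated with a small sketch. The summary statistics below are made-up toy numbers, not data from the paper, and the helper names (`weighted_fit`, `ivw_slope`, `orient_positive`) are hypothetical. The sketch assumes the usual summary-data form of Egger regression (weighted regression of SNP-outcome on SNP-exposure associations with an intercept) and treats recoding the reference allele of a SNP as flipping the sign of both of its associations.

```python
def weighted_fit(x, y, w):
    """Weighted least squares of y on x with an intercept.

    Returns (intercept, slope). With x = SNP-exposure and y = SNP-outcome
    associations, the slope is the Egger-type causal estimate.
    """
    sw = sum(w)
    mx = sum(wi * xi for wi, xi in zip(w, x)) / sw
    my = sum(wi * yi for wi, yi in zip(w, y)) / sw
    sxy = sum(wi * (xi - mx) * (yi - my) for wi, xi, yi in zip(w, x, y))
    sxx = sum(wi * (xi - mx) ** 2 for wi, xi in zip(w, x))
    slope = sxy / sxx
    return my - slope * mx, slope


def ivw_slope(x, y, w):
    """Weighted regression through the origin (fixed-effect IVW estimate)."""
    num = sum(wi * xi * yi for wi, xi, yi in zip(w, x, y))
    den = sum(wi * xi * xi for wi, xi in zip(w, x))
    return num / den


def orient_positive(beta_x, beta_y):
    """Recode each SNP so that its exposure association is positive.

    Switching the reference allele flips the sign of both associations.
    """
    pairs = [(-bx, -by) if bx < 0 else (bx, by) for bx, by in zip(beta_x, beta_y)]
    return [p[0] for p in pairs], [p[1] for p in pairs]


# Toy summary statistics (assumed values, for illustration only).
beta_x = [0.10, -0.20, 0.30, -0.15]   # SNP-exposure associations
beta_y = [0.06, -0.08, 0.17, -0.05]   # SNP-outcome associations
weights = [1.0 / 0.02 ** 2] * 4       # inverse variance of each beta_y

pos_x, pos_y = orient_positive(beta_x, beta_y)

_, egger_original = weighted_fit(beta_x, beta_y, weights)
_, egger_positive = weighted_fit(pos_x, pos_y, weights)
ivw_original = ivw_slope(beta_x, beta_y, weights)
ivw_positive = ivw_slope(pos_x, pos_y, weights)

print("Egger slope, original coding:   ", round(egger_original, 4))
print("Egger slope, positive coding:   ", round(egger_positive, 4))
print("IVW slope, original vs positive:", round(ivw_original, 4), round(ivw_positive, 4))
```

In this toy example the no-intercept IVW estimate is unchanged by the recoding, because each product of associations keeps its sign, whereas the Egger slope moves once the intercept is fitted to differently signed points.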

3.
With the increasing availability of large-scale GWAS summary data on various traits, Mendelian randomization (MR) has become commonly used to infer causality between a pair of traits, an exposure and an outcome. It depends on using genetic variants, typically SNPs, as instrumental variables (IVs). The inverse-variance weighted (IVW) method (with a fixed-effect meta-analysis model) is most powerful when all IVs are valid; however, when horizontal pleiotropy is present, it may lead to biased inference. On the other hand, Egger regression is one of the most widely used methods robust to (uncorrelated) pleiotropy, but it suffers from loss of power. We propose a two-component mixture of regressions to combine and thus take advantage of both IVW and Egger regression; by accounting for valid and invalid IVs respectively, it is often both more efficient (i.e. higher powered) and more robust to pleiotropy (i.e. controlling type I error) than either IVW or Egger regression alone. We propose a model averaging approach and a novel data perturbation scheme to account for uncertainties in model/IV selection, leading to more robust statistical inference for finite samples. Through extensive simulations and applications to the GWAS summary data of 48 risk factor-disease pairs and 63 genetically uncorrelated trait pairs, we show that our proposed methods often control type I error better while achieving much higher power than IVW and Egger regression (and sometimes than several other new/popular MR methods). We expect that our proposed methods will be a useful addition to the toolbox of Mendelian randomization for causal inference.
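
As a rough illustration of the two building blocks named in this abstract, the sketch below computes a fixed-effect IVW estimate and an Egger regression estimate from summary statistics. It does not implement the proposed two-component mixture, model averaging, or data perturbation scheme. The function names and the toy numbers are assumptions chosen so that every SNP carries the same direct (pleiotropic) effect of 0.1 on the outcome and the true causal slope is 0.5.

```python
def ivw_estimate(beta_x, beta_y, se_y):
    """Fixed-effect IVW: weighted regression of beta_y on beta_x through the origin."""
    w = [1.0 / s ** 2 for s in se_y]
    num = sum(wi * bx * by for wi, bx, by in zip(w, beta_x, beta_y))
    den = sum(wi * bx * bx for wi, bx in zip(w, beta_x))
    return num / den


def egger_estimate(beta_x, beta_y, se_y):
    """Egger regression: the same weighted regression, but with an intercept.

    Returns (intercept, slope); the intercept absorbs average directional pleiotropy.
    """
    w = [1.0 / s ** 2 for s in se_y]
    sw = sum(w)
    mx = sum(wi * bx for wi, bx in zip(w, beta_x)) / sw
    my = sum(wi * by for wi, by in zip(w, beta_y)) / sw
    sxy = sum(wi * (bx - mx) * (by - my) for wi, bx, by in zip(w, beta_x, beta_y))
    sxx = sum(wi * (bx - mx) ** 2 for wi, bx in zip(w, beta_x))
    slope = sxy / sxx
    return my - slope * mx, slope


# Toy data (assumed): causal effect 0.5 plus a constant direct effect of 0.1.
beta_x = [0.1, 0.2, 0.3, 0.4]
beta_y = [0.5 * bx + 0.1 for bx in beta_x]
se_y = [0.05] * 4

ivw = ivw_estimate(beta_x, beta_y, se_y)
egger_intercept, egger_slope = egger_estimate(beta_x, beta_y, se_y)

print("IVW estimate:   ", round(ivw, 4))
print("Egger intercept:", round(egger_intercept, 4))
print("Egger slope:    ", round(egger_slope, 4))
```

With noise-free toy data the Egger slope recovers 0.5 and its intercept recovers 0.1, while the IVW estimate is pulled upward because it forces the line through the origin. With real, noisy data the extra intercept parameter costs precision, which is the loss of power the abstract refers to.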

4.
Standard Mendelian randomization (MR) analysis can produce biased results if the genetic variant defining an instrumental variable (IV) is confounded and/or has a horizontal pleiotropic effect on the outcome of interest not mediated by the treatment variable. We provide novel identification conditions for the causal effect of a treatment in the presence of unmeasured confounding by leveraging a possibly invalid IV for which both the IV independence and exclusion restriction assumptions may be violated. The proposed Mendelian randomization mixed-scale treatment effect robust identification (MR MiSTERI) approach relies on (i) an assumption that the treatment effect does not vary with the possibly invalid IV on the additive scale; (ii) that the confounding bias does not vary with the possibly invalid IV on the odds ratio scale; and (iii) that the residual variance for the outcome is heteroskedastic with respect to the possibly invalid IV. Although assumptions (i) and (ii) have, respectively, appeared in the IV literature, assumption (iii) has not; we formally establish that their conjunction can identify a causal effect even with an invalid IV. MR MiSTERI is shown to be particularly advantageous in the presence of pervasive heterogeneity of pleiotropic effects on the additive scale. We propose a simple and consistent three-stage estimator that can be used as a preliminary estimator to a carefully constructed efficient one-step-update estimator. In order to incorporate multiple, possibly correlated, and weak invalid IVs, a common challenge in MR studies, we develop a MAny Weak Invalid Instruments (MR MaWII MiSTERI) approach for strengthened identification and improved estimation accuracy. Both simulation studies and UK Biobank data analysis results demonstrate the robustness of the proposed methods.

5.
6.
Lu Deng  Han Zhang  Lei Song  Kai Yu 《Biometrics》2020,76(2):369-379
Mendelian randomization (MR) is a type of instrumental variable (IV) analysis that uses genetic variants as IVs for a risk factor to study its causal effect on an outcome. Extensive investigations on the performance of IV analysis procedures, such as the one based on the two-stage least squares (2SLS) procedure, have been conducted under the one-sample scenario, where measures on IVs, the risk factor, and the outcome are assumed to be available for each study participant. Recent MR analyses are usually performed with data from two independent or partially overlapping genetic association studies (two-sample setting), with one providing information on the association between the IVs and the outcome, and the other on the association between the IVs and the risk factor. We investigate the performance of 2SLS in two-sample-based MR when the IVs are weakly associated with the risk factor. We derive closed-form formulas for the bias and mean squared error of the 2SLS estimate and verify them with numerical simulations under realistic circumstances. Using these analytic formulas, we can study the pros and cons of conducting MR analysis under one-sample and two-sample settings and assess the impact of having overlapping samples. We also propose and validate a bias-corrected estimator for the causal effect.

7.

Background

In fledgling areas of research, evidence supporting causal assumptions is often scarce due to the small number of empirical studies conducted. In many studies it remains unclear what impact explicit and implicit causal assumptions have on the research findings; only the primary assumptions of the researchers are often presented. This is particularly true for research on the effect of faculty’s teaching performance on their role modeling. Therefore, there is a need for robust frameworks and methods for transparent formal presentation of the underlying causal assumptions used in assessing the causal effects of teaching performance on role modeling. This study explores the effects of different (plausible) causal assumptions on research outcomes.

Methods

This study revisits a previously published study about the influence of faculty’s teaching performance on their role modeling (as teacher-supervisor, physician and person). We drew eight directed acyclic graphs (DAGs) to visually represent different plausible causal relationships between the variables under study. These DAGs were subsequently translated into corresponding statistical models, and regression analyses were performed to estimate the associations between teaching performance and role modeling.

Results

The different causal models were compatible with major differences in the magnitude of the relationship between faculty’s teaching performance and their role modeling. Odds ratios for the associations between teaching performance and the three role model types ranged from 31.1 to 73.6 for the teacher-supervisor role, from 3.7 to 15.5 for the physician role, and from 2.8 to 13.8 for the person role.

Conclusions

Different sets of assumptions about causal relationships in role modeling research can be visually depicted using DAGs, which are then used to guide both statistical analysis and interpretation of results. Since study conclusions can be sensitive to different causal assumptions, results should be interpreted in the light of causal assumptions made in each study.

8.
Mendelian Randomisation (MR) is a powerful tool in epidemiology that can be used to estimate the causal effect of an exposure on an outcome in the presence of unobserved confounding, by utilising genetic variants as instrumental variables (IVs) for the exposure. The effect estimates obtained from MR studies are often interpreted as the lifetime effect of the exposure in question. However, the causal effects of some exposures are thought to vary throughout an individual’s lifetime, with periods during which an exposure has a greater effect on a particular outcome. Multivariable MR (MVMR) is an extension of MR that allows for multiple, potentially highly related, exposures to be included in an MR estimation. MVMR estimates the direct effect of each exposure on the outcome conditional on all the other exposures included in the estimation. We explore the use of MVMR to estimate the direct effect of a single exposure at different time points in an individual’s lifetime on an outcome. We use simulations to illustrate the interpretation of the results from such analyses and the key assumptions required. We show that causal effects at different time periods can be estimated through MVMR when the association between the genetic variants used as instruments and the exposure measured at those time periods varies. However, this estimation will not necessarily identify exact time periods over which an exposure has the most effect on the outcome. Prior knowledge regarding the biological basis of exposure trajectories can help interpretation. We illustrate the method through estimation of the causal effects of childhood and adult BMI on C-reactive protein and smoking behaviour.

9.
10.
Extensive genetic studies have identified a large number of causal genetic variations in many human phenotypes; however, these could not completely explain heritability in complex diseases. Some researchers have proposed that the “missing heritability” may be attributable to gene–gene and gene–environment interactions. Because there are billions of potential interaction combinations, the statistical power of a single study is often insufficient to detect these interactions. Meta-analysis is a common method of increasing detection power; however, accessing individual data could be difficult. This study presents a simple method that employs aggregated summary values from a “case” group to detect these specific interactions, based on rare disease and independence assumptions. However, these assumptions, particularly the rare disease assumption, may be violated in real situations; therefore, this study further investigated the robustness of the proposed method when the assumptions are violated. In conclusion, we observed that the rare disease assumption is relatively nonessential, whereas the independence assumption is an essential component. Because single nucleotide polymorphisms (SNPs) are often unrelated to environmental factors and SNPs on other chromosomes, researchers should use this method to investigate gene–gene and gene–environment interactions when they are unable to obtain detailed individual patient data.

11.
12.
Tao Sun  Yu Cheng  Ying Ding 《Biometrics》2023,79(3):1713-1725
Copula is a popular method for modeling the dependence among marginal distributions in multivariate censored data. As many copula models are available, it is essential to check if the chosen copula model fits the data well for analysis. Existing approaches to testing the fitness of copula models are mainly for complete or right-censored data. No formal goodness-of-fit (GOF) test exists for interval-censored or recurrent events data. We develop a general GOF test for copula-based survival models using the information ratio (IR) to address this research gap. It can be applied to any copula family with a parametric form, such as the frequently used Archimedean, Gaussian, and D-vine families. The test statistic is easy to calculate, and the test procedure is straightforward to implement. We establish the asymptotic properties of the test statistic. The simulation results show that the proposed test controls the type-I error well and achieves adequate power when the dependence strength is moderate to high. Finally, we apply our method to test various copula models in analyzing multiple real datasets. Our method consistently separates different copula models for all these datasets in terms of model fitness.

13.
14.
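Abstract. Objective: To investigate the causal association between genetically predicted circulating linoleic acid levels and atherosclerosis at different sites. Methods: A two-sample Mendelian randomization (MR) approach was used, with single nucleotide polymorphisms (SNPs) associated with linoleic acid selected as instrumental variables (IVs), to assess the causal association between genetically predicted circulating linoleic acid levels and atherosclerosis at different sites. Results: The inverse-variance weighted (IVW) analysis showed that genetically predicted circulating linoleic acid levels were significantly positively associated with the risk of coronary atherosclerosis (OR=1.32, 95% CI: 1.09-1.61, P=0.005); no causal association was found between circulating linoleic acid levels and the risk of cerebral atherosclerosis (OR=1.18, 95% CI: 0.63-2.23, P=0.602). Circulating linoleic acid levels were significantly negatively associated with the risk of peripheral atherosclerosis (OR=0.55, 95% CI: 0.39-0.77, P=0.001). No significant causal association was found between circulating linoleic acid levels and other atherosclerosis (excluding cerebral, coronary, and peripheral arteries) (OR=0.99, 95% CI: 0.81-1.21, P=0.916). Conclusion: Genetically predicted circulating linoleic acid levels are causally associated with coronary atherosclerosis and peripheral atherosclerosis; the role of linoleic acid in the prevention and treatment of atherosclerosis deserves attention and further study.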

15.
Mendelian randomization utilizes genetic variants as instrumental variables (IVs) to estimate the causal effect of an exposure variable on an outcome of interest even in the presence of unmeasured confounders. However, the popular inverse-variance weighted (IVW) estimator could be biased in the presence of weak IVs, a common challenge in MR studies. In this article, we develop a novel penalized inverse-variance weighted (pIVW) estimator, which adjusts the original IVW estimator to account for the weak IV issue by using a penalization approach to prevent the denominator of the pIVW estimator from being close to zero. Moreover, we adjust the variance estimation of the pIVW estimator to account for the presence of balanced horizontal pleiotropy. We show that the recently proposed debiased IVW (dIVW) estimator is a special case of our proposed pIVW estimator. We further prove that the pIVW estimator has smaller bias and variance than the dIVW estimator under some regularity conditions. We also conduct extensive simulation studies to demonstrate the performance of the proposed pIVW estimator. Furthermore, we apply the pIVW estimator to estimate the causal effects of five obesity-related exposures on three coronavirus disease 2019 (COVID-19) outcomes. Notably, we find that hypertensive disease is associated with an increased risk of hospitalized COVID-19; and peripheral vascular disease and higher body mass index are associated with increased risks of COVID-19 infection, hospitalized COVID-19, and critically ill COVID-19.
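
The weak-IV issue and the role of the denominator can be sketched as follows. This is not the pIVW estimator itself, whose penalty term is not given in the abstract. It compares the plain IVW estimator with the debiased IVW (dIVW) estimator in the form it is commonly written, which subtracts the squared standard error of each SNP-exposure estimate from the denominator. That form, the function names, and the toy numbers are all assumptions made for illustration.

```python
def ivw(beta_x, beta_y, se_y):
    """Plain IVW estimate: sum(bx*by/se_y^2) / sum(bx^2/se_y^2)."""
    num = sum(bx * by / sy ** 2 for bx, by, sy in zip(beta_x, beta_y, se_y))
    den = sum(bx ** 2 / sy ** 2 for bx, sy in zip(beta_x, se_y))
    return num / den


def debiased_ivw(beta_x, se_x, beta_y, se_y):
    """dIVW estimate (assumed form): the denominator uses bx^2 - se_x^2.

    E[bx_hat^2] exceeds the true bx^2 by se_x^2, so the plain IVW denominator
    is inflated when instruments are weak; subtracting se_x^2 removes that
    inflation. The corrected denominator can be close to zero (or negative)
    with very weak instruments, which is the instability a penalized
    denominator is meant to guard against.
    """
    num = sum(bx * by / sy ** 2 for bx, by, sy in zip(beta_x, beta_y, se_y))
    den = sum((bx ** 2 - sx ** 2) / sy ** 2 for bx, sx, sy in zip(beta_x, se_x, se_y))
    if den <= 0:
        raise ValueError("corrected denominator is not positive; instruments too weak")
    return num / den


# Toy summary statistics (assumed values, for illustration only).
beta_x = [0.05, 0.08, 0.03, 0.06]   # estimated SNP-exposure associations
se_x = [0.02] * 4                   # their standard errors (weak instruments)
beta_y = [0.02, 0.035, 0.01, 0.03]  # estimated SNP-outcome associations
se_y = [0.01] * 4

ivw_hat = ivw(beta_x, beta_y, se_y)
divw_hat = debiased_ivw(beta_x, se_x, beta_y, se_y)

print("IVW estimate: ", round(ivw_hat, 4))
print("dIVW estimate:", round(divw_hat, 4))
```

In this toy example the plain IVW estimate is shrunk toward zero relative to the dIVW estimate, which is the direction of weak-instrument bias expected when the two samples are independent.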

16.
Recent theoretical work in quantitative genetics has fueled interest in measuring natural selection in the wild. We discuss statistical and biological issues that may arise in applications of Lande and Arnold's (1983) multiple-regression approach to measuring selection. We review assumptions involved in estimation and hypothesis testing in regression problems, and we note difficulties that frequently arise as a result of violation of these assumptions. In particular, multicollinearity (extreme intercorrelation of characters) and extrinsic, unmeasured factors affecting fitness may seriously complicate inference regarding selection. Further, violation of the assumption that residuals are normally distributed vitiates tests of significance. For this situation, we suggest applications of recently developed jackknife tests of significance. While fitness regression permits direct assessment of selection in a form suitable for predicting selection response, we suggest that the aim of inferring causal relationships about the effects of phenotypic characters on fitness is greatly facilitated by manipulative experiments. Finally, we discuss alternative definitions of stabilizing and disruptive selection.

17.
GWAS has greatly facilitated the discovery of risk SNPs associated with complex diseases. Traditional methods analyze SNPs individually and are limited by low power and reproducibility since correction for multiple comparisons is necessary. Several methods have been proposed based on grouping SNPs into SNP sets using biological knowledge and/or genomic features. In this article, we compare the linear kernel machine based test (LKM) and principal components analysis based approach (PCA) using simulated datasets under the scenarios of 0 to 3 causal SNPs, as well as simple and complex linkage disequilibrium (LD) structures of the simulated regions. Our simulation study demonstrates that both LKM and PCA can control the type I error at the significance level of 0.05. If the causal SNP is in strong LD with the genotyped SNPs, both the PCA with a small number of principal components (PCs) and the LKM with a linear or identity-by-state kernel are valid tests. However, if the LD structure is complex, such as several LD blocks in the SNP set, or when the causal SNP is not in the LD block in which most of the genotyped SNPs reside, more PCs should be included to capture the information of the causal SNP. Simulation studies also demonstrate the ability of LKM and PCA to combine information from multiple causal SNPs and to provide increased power over individual SNP analysis. We also apply LKM and PCA to analyze two SNP sets extracted from an actual GWAS dataset on non-small cell lung cancer.

18.
In this paper we review the methodological underpinnings of the general pharmacogenetic approach for uncovering genetically-driven treatment effect heterogeneity. This typically utilises only individuals who are treated and relies on fairly strong baseline assumptions to estimate what we term the ‘genetically moderated treatment effect’ (GMTE). When these assumptions are seriously violated, we show that a robust but less efficient estimate of the GMTE that incorporates information on the population of untreated individuals can instead be used. In cases of partial violation, we clarify when Mendelian randomization and a modified confounder adjustment method can also yield consistent estimates for the GMTE. A decision framework is then described to decide when a particular estimation strategy is most appropriate and how specific estimators can be combined to further improve efficiency. Triangulation of evidence from different data sources, each with their inherent biases and limitations, is becoming a well-established principle for strengthening causal analysis. We call our framework ‘Triangulation WIthin a STudy’ (TWIST) in order to emphasise that an analysis in this spirit is also possible within a single data set, using causal estimates that are approximately uncorrelated, but reliant on different sets of assumptions. We illustrate these approaches by re-analysing primary-care-linked UK Biobank data relating to CYP2C19 genetic variants, Clopidogrel use and stroke risk, and data relating to APOE genetic variants, statin use and Coronary Artery Disease.

19.
20.
Informing missing heritability for complex disease will likely require leveraging information across multiple SNPs within a gene region simultaneously to characterize gene and locus-level contributions to disease phenotypes. To this aim, we introduce a novel strategy, termed Mixed modeling of Meta-Analysis P-values (MixMAP), that draws on a principled statistical modeling framework and the vast array of summary data now available from genetic association studies, to test formally for locus-level association. The primary inputs to this approach are: (a) single SNP level p-values for tests of association; and (b) the mapping of SNPs to genomic regions. The output of MixMAP comprises locus-level estimates and tests of association. In application of MixMAP to summary data from the Global Lipids Gene Consortium, we suggest twelve new loci (PKN, FN1, UGT1A1, PPARG, DMDGH, PPARD, CDK6, VPS13B, GAD2, GAB2, APOH and NPC1) for low-density lipoprotein cholesterol (LDL-C), a causal risk factor for cardiovascular disease, and we also demonstrate the potential utility of MixMAP in small data settings. Overall, MixMAP offers novel and complementary information as compared to SNP-based analysis approaches and is straightforward to implement with existing open-source statistical software tools.
