Similar articles
 20 similar articles found (search time: 31 ms)
1.
MOTIVATION: The identification and characterization of susceptibility genes that influence the risk of common and complex diseases remains a statistical and computational challenge in genetic association studies. This is partly because the effect of any single genetic variant for a common and complex disease may depend on other genetic variants (gene-gene interaction) and environmental factors (gene-environment interaction). To address this problem, the multifactor dimensionality reduction (MDR) method was proposed by Ritchie et al. to detect gene-gene or gene-environment interactions. The MDR method identifies polymorphism combinations associated with common and complex multifactorial diseases by collapsing high-dimensional genetic factors into a single dimension. That is, the MDR method classifies each combination of multilocus genotypes into a high-risk or low-risk group based on a comparison of the ratios of the numbers of cases and controls. When a high-order interaction model is considered with multi-dimensional factors, however, there may be many sparse or empty cells in the contingency tables. The MDR method cannot classify an empty cell as high risk or low risk and leaves it undetermined. RESULTS: In this article, we propose the log-linear model-based multifactor dimensionality reduction (LM MDR) method to improve the MDR in classifying sparse or empty cells. The LM MDR method estimates frequencies for empty cells from a parsimonious log-linear model so that they can be assigned to high- and low-risk groups. In addition, LM MDR includes MDR as a special case when the saturated log-linear model is fitted. Simulation studies show that the LM MDR method has greater power and smaller error rates than the MDR method. The LM MDR method is also compared with the MDR method using sporadic Alzheimer's disease as an example.
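
The high-risk/low-risk labelling described in this abstract can be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' implementation: the function name `classify_cells`, the toy data, and the choice of the overall case:control ratio as the threshold are all hypothetical. Genotype combinations that never occur in the data stay "undetermined", which is the sparse-cell problem LM MDR is meant to address.

```python
from collections import Counter
from itertools import product


def classify_cells(genotypes, status, levels=(0, 1, 2)):
    """Label each multilocus genotype cell 'high', 'low', or 'undetermined'.

    genotypes: list of tuples, one tuple of genotype codes per subject
    status:    list of 0/1 flags (1 = case, 0 = control)
    """
    cases = Counter(g for g, s in zip(genotypes, status) if s == 1)
    controls = Counter(g for g, s in zip(genotypes, status) if s == 0)
    # Assumed threshold: the case:control ratio of the whole sample
    # (requires at least one control in the data).
    threshold = sum(cases.values()) / sum(controls.values())

    labels = {}
    n_loci = len(genotypes[0])
    for cell in product(levels, repeat=n_loci):
        a, b = cases[cell], controls[cell]
        if a + b == 0:
            labels[cell] = "undetermined"  # empty cell: no ratio to compare
        elif a >= threshold * b:           # cross-multiplied, so b == 0 is safe
            labels[cell] = "high"
        else:
            labels[cell] = "low"
    return labels


# Hypothetical toy data: two loci, eight subjects.
genotypes = [(0, 0), (0, 0), (0, 1), (0, 1), (1, 1), (1, 1), (1, 1), (2, 2)]
status = [1, 0, 0, 0, 1, 1, 0, 1]
labels = classify_cells(genotypes, status)
```

With two three-level loci there are nine cells, and in this toy sample five of them are empty, which shows how quickly sparsity appears as the interaction order grows.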

2.
Hsieh AR  Hsiao CL  Chang SW  Wang HM  Fann CS 《Genomics》2011,97(2):77-85
Haplotype-based approaches may have greater power than single-locus analyses when the SNPs are in strong linkage disequilibrium with the risk locus. To overcome potential complexities owing to large numbers of haplotypes in genetic studies, we evaluated two data mining approaches, multifactor dimensionality reduction (MDR) and classification and regression tree (CART), applied to haplotypes while accounting for haplotype uncertainty, to detect haplotype-haplotype (HH) interactions. In the evaluation of performance for detecting HH interactions, MDR had higher power than CART, but MDR gave a slightly higher type I error. Additionally, we performed an HH interaction analysis with a publicly available dataset of Parkinson's disease and confirmed previous findings that the RET proto-oncogene is associated with the disease. In this study, we showed that HH interaction analysis can help researchers gain more insight into genetic risk factors for complex diseases.

3.
Complex diseases, by definition, involve multiple factors, including gene-gene interactions and gene-environment interactions. Researchers commonly rely on simulated data to evaluate their approaches for detecting high-order interactions in disease gene mapping. A publicly available simulation program to generate samples involving complex genetic and environmental interactions is of great interest to the community. We have developed a software package named gs1.0, which has been widely used since its publication. In this article, we present an upgraded version gs2.0, which not only inherits its capacity to generate realistic genotype data but also provides great functionality and flexibility to simulate various interaction models. In addition to a standalone version, a user-friendly web server (http://cbc.case.edu/gs) has been set up to help users to build complex interaction models. Furthermore, by utilizing three three-locus models as an example, we have shown how realistic model parameters can be chosen in generating simulated data.

4.
X-Y Lou 《Heredity》2015,114(3):255-261
Biological outcomes are governed by multiple genetic and environmental factors that act in concert. Determining multifactor interactions is the primary topic of interest in recent genetics studies but presents enormous statistical and mathematical challenges. The computationally efficient multifactor dimensionality reduction (MDR) approach has emerged as a promising tool for meeting these challenges. On the other hand, complex traits are expressed in various forms and have different data generation mechanisms that cannot be appropriately modeled by a dichotomous model; moreover, the subjects in a study are recruited according to the study's own analytical goals, research strategies and available resources, and do not necessarily consist only of homogeneous unrelated individuals. Although several modifications and extensions of MDR have in part addressed the practical problems, they are still limited in statistical analyses of diverse phenotypes, multivariate phenotypes and correlated observations, in correcting for potential population stratification, and in unifying both unrelated and family samples into a more powerful analysis. I propose a comprehensive statistical framework, referred to as unified generalized MDR (UGMDR), for systematic extension of MDR. The proposed approach is quite versatile: it allows for covariate adjustment, is suitable for analyzing almost any trait type (for example, binary, count, continuous, polytomous, ordinal, time-to-onset, multivariate and others, as well as combinations of those), and is applicable to various study designs, including homogeneous and admixed unrelated-subject and family designs as well as mixtures of them. The proposed UGMDR offers an important addition to the arsenal of analytical tools for identifying nonlinear multifactor interactions and unraveling the genetic architecture of complex traits.

5.

BACKGROUND:

Idiopathic pulmonary arterial hypertension (IPAH) is a poorly understood complex disorder, which results in progressive remodeling of the pulmonary artery that ultimately leads to right ventricular failure. A two-hit hypothesis has been implicated in pathogenesis of IPAH, according to which the vascular abnormalities characteristic of PAH are triggered by the accumulation of genetic and/or environmental insults in an already existing genetic background. The multifactor dimensionality reduction (MDR) analysis is a statistical method used to identify gene–gene interaction or epistasis and gene–environment interactions that are associated with a particular disease. The MDR method collapses high-dimensional genetic data into a single dimension, thus permitting interactions to be detected in relatively small sample sizes.

AIM:

To identify and characterize polymorphisms/genes that increase the susceptibility to IPAH using MDR analysis.

MATERIALS AND METHODS:

A total of 77 IPAH patients and 100 controls were genotyped for eight polymorphisms of five genes (5HTT, EDN1, NOS3, ALK-1, and PPAR-γ2). The MDR method was adopted to determine gene–gene interactions that increase the risk of IPAH.

RESULTS:

With the MDR method, the single-locus model of the 5HTT (L/S) polymorphism and the three-locus model combining the 5HTT (L/S), EDN1 (K198N), and NOS3 (G894T) polymorphisms were identified as the best models for predicting susceptibility to IPAH, with a P value of 0.05.

CONCLUSION:

The MDR method can be useful in understanding the role of epistatic and gene–environment interactions in the pathogenesis of IPAH.

6.
Widespread multifactor interactions present a significant challenge in determining risk factors of complex diseases. Several combinatorial approaches, such as the multifactor dimensionality reduction (MDR) method, have emerged as a promising tool for better detecting gene-gene (G x G) and gene-environment (G x E) interactions. We recently developed a general combinatorial approach, namely the generalized multifactor dimensionality reduction (GMDR) method, which can entertain both qualitative and quantitative phenotypes and allows for both discrete and continuous covariates to detect G x G and G x E interactions in a sample of unrelated individuals. In this article, we report the development of an algorithm that can be used to study G x G and G x E interactions for family-based designs, called pedigree-based GMDR (PGMDR). Compared to the available method, our proposed method has several major improvements, including allowing for covariate adjustments and being applicable to arbitrary phenotypes, arbitrary pedigree structures, and arbitrary patterns of missing marker genotypes. Our Monte Carlo simulations provide evidence that the PGMDR method is superior in performance to identify epistatic loci compared to the MDR-pedigree disequilibrium test (PDT). Finally, we applied our proposed approach to a genetic data set on tobacco dependence and found a significant interaction between two taste receptor genes (i.e., TAS2R16 and TAS2R38) in affecting nicotine dependence.

7.

Background

With the rapid advancement of array-based genotyping techniques, genome-wide association studies (GWAS) have successfully identified common genetic variants associated with common complex diseases. However, it has been shown that only a small proportion of the genetic etiology of complex diseases could be explained by the genetic factors identified from GWAS. This missing heritability could possibly be explained by gene-gene interaction (epistasis) and rare variants. There has been an exponential growth of gene-gene interaction analysis for common variants in terms of methodological developments and practical applications. Also, the recent advancement of high-throughput sequencing technologies makes it possible to conduct rare variant analysis. However, little progress has been made in gene-gene interaction analysis for rare variants.

Results

Here, we propose GxGrare, a new gene-gene interaction method for rare variants in the framework of multifactor dimensionality reduction (MDR) analysis. The proposed method consists of three steps: 1) collapsing the rare variants, 2) MDR analysis of the collapsed rare variants, and 3) detection of the top candidate interaction pairs. GxGrare can be used to detect not only gene-gene interactions but also interactions within a single gene. The proposed method is illustrated with whole-exome sequencing data from 1080 samples of the Korean population in order to identify causal gene-gene interactions among rare variants for type 2 diabetes.

Conclusion

The proposed GxGrare performs well for gene-gene interaction detection with collapsing of rare variants. GxGrare is available at http://bibs.snu.ac.kr/software/gxgrare which contains simulation data and documentation. Supported operating systems include Linux and OS X.
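
The first of the three steps above (collapsing the rare variants of a gene) might look like the sketch below. The abstract does not say which collapsing rule GxGrare uses, so this assumes a simple carrier indicator: a subject is coded 1 if they carry a minor allele at any rare variant in the gene. The function name `collapse_rare_variants`, the MAF threshold, and the toy matrix are all hypothetical.

```python
def collapse_rare_variants(genotype_matrix, maf_threshold=0.01):
    """Collapse the rare variants of one gene into a 0/1 carrier indicator.

    genotype_matrix: one row per subject; each row holds minor-allele
                     counts (0, 1 or 2) for the variants of a single gene.
    """
    n_subjects = len(genotype_matrix)
    n_variants = len(genotype_matrix[0])
    # Minor allele frequency of each variant in this sample.
    maf = [
        sum(row[j] for row in genotype_matrix) / (2 * n_subjects)
        for j in range(n_variants)
    ]
    # Keep only variants that are observed and rarer than the threshold.
    rare = [j for j in range(n_variants) if 0 < maf[j] < maf_threshold]
    # Assumed collapsing rule: 1 if the subject carries any rare allele.
    return [int(any(row[j] > 0 for j in rare)) for row in genotype_matrix]


# Hypothetical toy gene: 5 subjects, 3 variants. The threshold is raised
# to 0.2 only because the toy sample is tiny.
gene = [
    [1, 1, 0],
    [0, 1, 0],
    [0, 2, 0],
    [0, 1, 1],
    [0, 0, 0],
]
collapsed = collapse_rare_variants(gene, maf_threshold=0.2)
```

The collapsed per-gene indicators could then be fed into an MDR analysis in place of individual SNP genotypes, which is the role step 2 plays in the abstract.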

8.
Detecting, characterizing, and interpreting gene-gene interactions or epistasis in studies of human disease susceptibility is both a mathematical and a computational challenge. To address this problem, we have previously developed a multifactor dimensionality reduction (MDR) method for collapsing high-dimensional genetic data into a single dimension (i.e. constructive induction) thus permitting interactions to be detected in relatively small sample sizes. In this paper, we describe a comprehensive and flexible framework for detecting and interpreting gene-gene interactions that utilizes advances in information theory for selecting interesting single-nucleotide polymorphisms (SNPs), MDR for constructive induction, machine learning methods for classification, and finally graphical models for interpretation. We illustrate the usefulness of this strategy using artificial datasets simulated from several different two-locus and three-locus epistasis models. We show that the accuracy, sensitivity, specificity, and precision of a naïve Bayes classifier are significantly improved when SNPs are selected based on their information gain (i.e. class entropy removed) and reduced to a single attribute using MDR. We then apply this strategy to detecting, characterizing, and interpreting epistatic models in a genetic study (n = 500) of atrial fibrillation and show that both classification and model interpretation are significantly improved.
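
The information gain mentioned in this abstract ("class entropy removed") is the standard quantity H(class) − H(class | SNP). A minimal sketch follows; the function names and the toy vectors are hypothetical and are not taken from the paper's software.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a list of discrete labels."""
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())


def information_gain(attribute, labels):
    """Class entropy removed by knowing the attribute: H(Y) - H(Y | X)."""
    n = len(labels)
    groups = {}
    for x, y in zip(attribute, labels):
        groups.setdefault(x, []).append(y)
    conditional = sum(len(ys) / n * entropy(ys) for ys in groups.values())
    return entropy(labels) - conditional


# Hypothetical toy data: snp_a separates the classes perfectly,
# snp_b carries no information about them.
status = [1, 1, 0, 0]
snp_a = [0, 0, 1, 1]
snp_b = [0, 1, 0, 1]
gain_a = information_gain(snp_a, status)
gain_b = information_gain(snp_b, status)
```

Ranking SNPs by this score and keeping the top ones is one plausible reading of the selection step; the abstract does not state the exact filtering rule.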

9.
Multilocus analysis of hypertension: a hierarchical approach   Cited by: 11 (self-citations: 0; citations by others: 11)
While hypertension is a complex disease with a well-documented genetic component, genetic studies often fail to replicate findings. One possibility for such inconsistency is that the underlying genetics of hypertension is not based on single genes of major effect, but on interactions among genes. To test this hypothesis, we studied both single locus and multilocus effects, using a case-control design of subjects from Ghana. Thirteen polymorphisms in eight candidate genes were studied. Each candidate gene has been shown to play a physiological role in blood pressure regulation and affects one of four pathways that modulate blood pressure: vasoconstriction (angiotensinogen, angiotensin converting enzyme - ACE, angiotensin II receptor), nitric oxide (NO) dependent and NO independent vasodilation pathways and sodium balance (G protein-coupled receptor kinase, GRK4). We evaluated single site allelic and genotypic associations, multilocus genotype equilibrium and multilocus genotype associations, using multifactor dimensionality reduction (MDR). For MDR, we performed systematic reanalysis of the data to address the role of various physiological pathways. We found no significant single site associations, but the hypertensive class deviated significantly from genotype equilibrium in more than 25% of all multilocus comparisons (2,162 of 8,178), whereas the normotensive class rarely did (11 of 8,178). The MDR analysis identified a two-locus model including ACE and GRK4 that successfully predicted blood pressure phenotype 70.5% of the time. Thus, our data indicate epistatic interactions play a major role in hypertension susceptibility. Our data also support a model where multiple pathways need to be affected in order to predispose to hypertension.

10.
Complex diseases such as cardiovascular disease are likely due to the effects of high-order interactions among multiple genes and demographic factors. Therefore, in order to understand their underlying biological mechanisms, we need to consider simultaneously the effects of genotypes across multiple loci. Statistical methods such as multifactor dimensionality reduction (MDR), the combinatorial partitioning method (CPM), recursive partitioning (RP), and patterning and recursive partitioning (PRP) are designed to uncover complex relationships without relying on a specific model for the interaction, and are therefore well-suited to this data setting. However, the theoretical overlap among these methods and their relative merits have not been well characterized. In this paper we demonstrate mathematically that MDR is a special case of RP in which (1) patterns are used as predictors (PRP), (2) tree growth is restricted to a single split, and (3) misclassification error is used as the measure of impurity. Both approaches are applied to a case-control study assessing the effect of eleven single nucleotide polymorphisms on coronary artery calcification in people at risk for cardiovascular disease.

11.
Gene-gene interactions may play an important role in the genetics of a complex disease. Detection and characterization of gene-gene interactions is a challenging issue that has stimulated the development of various statistical methods to address it. In this study, we introduce a method to measure gene interactions using entropy-based statistics from a contingency table of trait and genotype combinations. We also developed an exploration procedure by using graphs. We propose a standardized relative information gain (RIG) measure to evaluate the interactions between single nucleotide polymorphism (SNP) combinations. To identify the kth-order interactions, contingency tables of trait and genotype combinations of k SNPs are constructed, from which RIGs are calculated. The RIGs are standardized using the mean and standard deviation from the permuted datasets. SNP combinations yielding high standardized RIG are chosen for gene-gene interactions. Detection of high-order interactions and comparison of interaction strengths between different orders are made possible by using standardized RIG. We have applied the proposed standardized entropy-based method to two types of data sets from a simulation study and a real genetic association study. We have compared our method and the multifactor dimensionality reduction (MDR) method through power analysis of eight different genetic models with varying penetrance rates, number of SNPs, and sample sizes. Our method shows successful identification of genetic associations and gene-gene interactions both in simulation and real genetic data. Simulation results suggest that the proposed entropy-based method is better able to detect high-order interactions and is superior to the MDR method in most cases. The proposed method is well suited for detecting interactions without main effects as well as for models including main effects.
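
The standardization step in this abstract (centering and scaling an RIG value by the mean and standard deviation obtained from permuted datasets) can be sketched as below. The exact RIG formula is not given in the abstract, so the code assumes RIG = [H(trait) − H(trait | genotype combination)] / H(trait); the function names, the number of permutations, and the XOR-style toy data are hypothetical.

```python
import random
from collections import Counter
from math import log2
from statistics import mean, stdev


def entropy(values):
    n = len(values)
    return -sum((c / n) * log2(c / n) for c in Counter(values).values())


def relative_information_gain(combos, trait):
    """Assumed definition: [H(trait) - H(trait | combo)] / H(trait)."""
    n = len(trait)
    cells = {}
    for g, y in zip(combos, trait):
        cells.setdefault(g, []).append(y)
    h_cond = sum(len(ys) / n * entropy(ys) for ys in cells.values())
    h_trait = entropy(trait)
    return (h_trait - h_cond) / h_trait


def standardized_rig(combos, trait, n_perm=200, seed=0):
    """Standardize the observed RIG against trait-permuted datasets."""
    rng = random.Random(seed)
    observed = relative_information_gain(combos, trait)
    null = []
    for _ in range(n_perm):
        shuffled = list(trait)
        rng.shuffle(shuffled)  # break any genotype-trait association
        null.append(relative_information_gain(combos, shuffled))
    return (observed - mean(null)) / stdev(null)


# Hypothetical toy data: the trait is the XOR of two SNPs, so neither SNP
# has a main effect but the pair determines the trait completely.
snp1 = [0, 0, 1, 1] * 10
snp2 = [0, 1, 0, 1] * 10
trait = [a ^ b for a, b in zip(snp1, snp2)]
pair = list(zip(snp1, snp2))
z_pair = standardized_rig(pair, trait)
```

In this toy case each SNP alone has an RIG of zero while the pair has an RIG of one, which is the "interaction without main effects" situation the abstract refers to.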

12.
MOTIVATION: The identification and characterization of genes that increase the susceptibility to common complex multifactorial diseases is a challenging task in genetic association studies. The multifactor dimensionality reduction (MDR) method has been proposed and implemented by Ritchie et al. (2001) to identify the combinations of multilocus genotypes and discrete environmental factors that are associated with a particular disease. However, the original MDR method classifies the combination of multilocus genotypes into high-risk and low-risk groups in an ad hoc manner based on a simple comparison of the ratios of the number of cases and controls. Hence, the MDR approach is prone to false positive and negative errors when the ratio of the number of cases and controls in a combination of genotypes is similar to that in the entire data, or when the numbers of both cases and controls are small. We therefore propose the odds ratio based multifactor dimensionality reduction (OR MDR) method that uses the odds ratio as a new quantitative measure of disease risk. RESULTS: While the original MDR method provides a simple binary measure of risk, the OR MDR method provides not only the odds ratio as a quantitative measure of risk but also the ordering of the multilocus combinations from the highest risk to lowest risk groups. Furthermore, the OR MDR method provides a confidence interval for the odds ratio for each multilocus combination, which is extremely informative in judging its importance as a risk factor. The proposed OR MDR method is illustrated using the dataset obtained from the CDC Chronic Fatigue Syndrome Research Group. AVAILABILITY: The program written in R is available.
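
As a rough illustration of the idea in this abstract, the sketch below computes an odds ratio and an interval for each multilocus cell (that cell versus all other cells) and orders the cells from highest to lowest risk. The abstract does not say how the confidence interval is obtained; the Wald interval on the log scale and the 0.5 continuity correction used here are assumptions, as are the function name `cell_odds_ratio` and the toy counts.

```python
from math import exp, log, sqrt


def cell_odds_ratio(cell_cases, cell_controls, total_cases, total_controls, z=1.96):
    """Odds ratio of one genotype cell versus all other cells, with an
    assumed Wald-type 95% interval. Adding 0.5 to every count (Haldane
    correction) keeps the estimate finite for sparse cells."""
    a = cell_cases + 0.5
    b = cell_controls + 0.5
    c = total_cases - cell_cases + 0.5
    d = total_controls - cell_controls + 0.5
    odds_ratio = (a * d) / (b * c)
    se = sqrt(1 / a + 1 / b + 1 / c + 1 / d)  # standard error of log(OR)
    lower = exp(log(odds_ratio) - z * se)
    upper = exp(log(odds_ratio) + z * se)
    return odds_ratio, lower, upper


# Hypothetical toy counts: cell name -> (cases, controls).
cells = {"AA/BB": (30, 10), "Aa/Bb": (20, 20), "aa/bb": (10, 30)}
total_cases = sum(a for a, _ in cells.values())
total_controls = sum(b for _, b in cells.values())

results = {
    name: cell_odds_ratio(a, b, total_cases, total_controls)
    for name, (a, b) in cells.items()
}
# Order the multilocus combinations from highest to lowest estimated risk.
ranked = sorted(results, key=lambda name: results[name][0], reverse=True)
```

A cell whose interval contains 1 would be hard to call high or low risk, which is the kind of information a binary MDR label discards.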

13.
Identifying susceptibility genes that influence complex diseases is extremely difficult because loci often influence the disease state through genetic interactions. Numerous approaches to detect disease-associated SNP-SNP interactions have been developed, but none consistently generates high-quality results under different disease scenarios. Using summarizing techniques to combine a number of existing methods may provide a solution to this problem. Here we used three popular non-parametric methods, namely Gini, absolute probability difference (APD), and entropy, to develop two novel summary scores, the principal component score (PCS) and the Z-sum score (ZSS), with which to predict disease-associated genetic interactions. We used a simulation study to compare performance of the non-parametric scores, the summary scores, the scaled-sum score (SSS; used in polymorphism interaction analysis (PIA)), and the multifactor dimensionality reduction (MDR). The non-parametric methods achieved high power, but no non-parametric method outperformed all others under a variety of epistatic scenarios. PCS and ZSS, however, outperformed MDR. PCS, ZSS and SSS displayed controlled type-I-errors (< 0.05) compared to GS, APDS, ES (> 0.05). A real data study using the genetic-analysis-workshop 16 (GAW 16) rheumatoid arthritis dataset identified a number of interesting SNP-SNP interactions.

14.
We developed a computationally efficient algorithm, AMBIENCE, for identifying the informative variables involved in gene-gene (GGI) and gene-environment interactions (GEI) that are associated with disease phenotypes. The AMBIENCE algorithm uses a novel information theoretic metric called phenotype-associated information (PAI) to search for combinations of genetic variants and environmental variables associated with the disease phenotype. The PAI-based AMBIENCE algorithm effectively and efficiently detected GEI in simulated data sets of varying size and complexity, including the 10K simulated rheumatoid arthritis data set from Genetic Analysis Workshop 15. The method was also successfully used to detect GGI in a Crohn's disease data set. The performance of the AMBIENCE algorithm was compared to the multifactor dimensionality reduction (MDR), generalized MDR (GMDR), and pedigree disequilibrium test (PDT) methods. Furthermore, we assessed the computational speed of AMBIENCE for detecting GGI and GEI for data sets varying in size from 100 to 10^5 variables. Our results demonstrate that the AMBIENCE information theoretic algorithm is useful for analyzing a diverse range of epidemiologic data sets containing evidence for GGI and GEI.

15.
We present an extension of the two-class multifactor dimensionality reduction (MDR) algorithm that enables detection and characterization of epistatic SNP-SNP interactions in the context of a quantitative trait. The proposed Quantitative MDR (QMDR) method handles continuous data by modifying MDR’s constructive induction algorithm to use a t-test. QMDR replaces the balanced accuracy metric with a t-test statistic as the score to determine the best interaction model. We used a simulation to identify the empirical distribution of QMDR’s testing score. We then applied QMDR to genetic data from the ongoing prospective Prevention of Renal and Vascular End-Stage Disease (PREVEND) study.
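
One way to read the scoring described in this abstract is sketched below. The abstract does not spell out the constructive induction rule, so the code assumes that a genotype cell is labelled "high" when its mean trait value exceeds the overall mean and "low" otherwise, and that the model score is a Welch t statistic comparing the pooled high and low groups. The function name `qmdr_score`, the choice of Welch's form, and the toy data are hypothetical.

```python
from math import sqrt
from statistics import mean, variance


def qmdr_score(combos, trait):
    """Score one SNP combination on a quantitative trait.

    combos: one genotype tuple per subject
    trait:  one continuous trait value per subject
    """
    overall = mean(trait)
    cells = {}
    for g, y in zip(combos, trait):
        cells.setdefault(g, []).append(y)

    # Assumed labelling rule: compare each cell mean with the overall mean.
    high = [y for ys in cells.values() if mean(ys) > overall for y in ys]
    low = [y for ys in cells.values() if mean(ys) <= overall for y in ys]

    # Welch t statistic between the pooled groups (needs >= 2 values per
    # group and non-zero variance).
    se = sqrt(variance(high) / len(high) + variance(low) / len(low))
    return (mean(high) - mean(low)) / se


# Hypothetical toy data: two binary loci, two subjects per genotype cell.
combos = [(0, 0), (0, 0), (0, 1), (0, 1), (1, 0), (1, 0), (1, 1), (1, 1)]
trait = [1.0, 1.2, 2.0, 2.2, 2.1, 1.9, 0.9, 1.1]
score = qmdr_score(combos, trait)
```

Candidate SNP combinations would then be compared by this score rather than by balanced accuracy, which is the substitution the abstract describes.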

16.
Parallel multifactor dimensionality reduction is a tool for large-scale analysis of gene-gene and gene-environment interactions. The MDR algorithm was redesigned to allow an unlimited number of study subjects, total variables and variable states, and to remove restrictions on the order of interactions being analyzed. In addition, the algorithm is markedly more efficient, with approximately 150-fold decrease in runtime for equivalent analyses. To facilitate the processing of large datasets, the algorithm was made parallel. AVAILABILITY: Parallel MDR is freely available for non-commercial research institutions. For full details see http://chgr.mc.vanderbilt.edu/ritchielab/pMDR. An open-source version of MDR software is available at http://www.epistasis.org.

17.
Chen GB  Xu Y  Xu HM  Li MD  Zhu J  Lou XY 《PloS one》2011,6(2):e16981
Detection of interacting risk factors for complex traits is challenging. The choice of an appropriate method, sample size, and allocation of cases and controls are serious concerns. To provide empirical guidelines for planning such studies and data analyses, we investigated the performance of the multifactor dimensionality reduction (MDR) and generalized MDR (GMDR) methods under various experimental scenarios. We developed the mathematical expectation of accuracy and used it as an indicator parameter to perform a gene-gene interaction study. We then examined the statistical power of GMDR and MDR within the plausible range of accuracy (0.50~0.65) reported in the literature. The GMDR with covariate adjustment had a power of >80% in a case-control design with a sample size of ≥2000, with theoretical accuracy ranging from 0.56 to 0.62. However, when the accuracy was <0.56, a sample size of ≥4000 was required to have sufficient power. In our simulations, the GMDR outperformed the MDR under all models with accuracy ranging from 0.56~0.62 for a sample size of 1000-2000. However, the two methods performed similarly when the accuracy was outside this range or the sample was significantly larger. We conclude that with adjustment of a covariate, GMDR performs better than MDR and a sample size of 1000~2000 is reasonably large for detecting gene-gene interactions in the range of effect size reported by the current literature; whereas larger sample size is required for more subtle interactions with accuracy <0.56.

18.

Background

Graves’ disease (GD) is a complex disease in which genetic predisposition is modified by environmental factors. Each gene exerts limited effects on the development of autoimmune disease (OR = 1.2–1.5). An epidemiological study revealed that nearly 70% of the risk of developing inherited autoimmunological thyroid diseases (AITD) is the result of gene interactions. In the present study, we analyzed the effects of the interactions of multiple loci on the genetic predisposition to GD. The aim of our analyses was to identify pairs of genes that exhibit a multiplicative interaction effect.

Material and Methods

A total of 709 patients with GD were included in the study. The patients were stratified into more homogeneous groups depending on the age at time of GD onset: younger patients less than 30 years of age and older patients greater than 30 years of age. Association analyses were performed for genes that influence the development of GD: HLADRB1, PTPN22, CTLA4 and TSHR. The interactions among polymorphisms were analyzed using the multiple logistic regression and multifactor dimensionality reduction (MDR) methods.

Results

GD patients stratified by the age of onset differed in the allele frequencies of the HLADRB1*03 and 1858T polymorphisms of the PTPN22 gene (OR = 1.7, p = 0.003; OR = 1.49, p = 0.01, respectively). We evaluated the genetic interactions of four SNPs in a pairwise fashion with regard to disease risk. The coexistence of HLADRB1 with CTLA4 or HLADRB1 with PTPN22 exhibited interactions on more than additive levels (OR = 3.64, p = 0.002; OR = 4.20, p < 0.001, respectively). These results suggest that interactions between these pairs of genes contribute to the development of GD. MDR analysis confirmed these interactions.

Conclusion

In contrast to a single gene effect, we observed that interactions between the HLADRB1/PTPN22 and HLADRB1/CTLA4 genes more closely predicted the risk of GD onset in young patients.

19.
The widespread use of high-throughput methods of single nucleotide polymorphism (SNP) genotyping has created a number of computational and statistical challenges. The problem of identifying SNP–SNP interactions in case–control studies has been studied extensively, and a number of new techniques have been developed. Little progress has been made, however, in the analysis of SNP–SNP interactions in relation to time-to-event data, such as patient survival time or time to cancer relapse. We present an extension of the two-class multifactor dimensionality reduction (MDR) algorithm that enables detection and characterization of epistatic SNP–SNP interactions in the context of survival analysis. The proposed Survival MDR (Surv-MDR) method handles survival data by modifying MDR’s constructive induction algorithm to use the log-rank test. Surv-MDR replaces balanced accuracy with log-rank test statistics as the score to determine the best models. We simulated datasets with a survival outcome related to two loci in the absence of any marginal effects. We compared Surv-MDR with Cox-regression for their ability to identify the true predictive loci in these simulated data. We also used this simulation to construct the empirical distribution of Surv-MDR’s testing score. We then applied Surv-MDR to genetic data from a population-based epidemiologic study to find prognostic markers of survival time following a bladder cancer diagnosis. We identified several two-loci SNP combinations that have strong associations with patients’ survival outcome. Surv-MDR is capable of detecting interaction models with weak main effects. These epistatic models tend to be dropped by traditional Cox regression approaches to evaluating interactions. With improved efficiency to handle genome wide datasets, Surv-MDR will play an important role in a research strategy that embraces the complexity of the genotype–phenotype mapping relationship since epistatic interactions are an important component of the genetic basis of disease.

20.
Interactions among genes and the environment are a common source of phenotypic variation. To characterize the interplay between genetics and the environment at single nucleotide resolution, we quantified the genetic and environmental interactions of four quantitative trait nucleotides (QTN) that govern yeast sporulation efficiency. We first constructed a panel of strains that together carry all 32 possible combinations of the 4 QTN genotypes in 2 distinct genetic backgrounds. We then measured the sporulation efficiencies of these 32 strains across 8 controlled environments. This dataset shows that variation in sporulation efficiency is shaped largely by genetic and environmental interactions. We find clear examples of QTN:environment, QTN:background, and environment:background interactions. However, we find no QTN:QTN interactions that occur consistently across the entire dataset. Instead, interactions between QTN only occur under specific combinations of environment and genetic background. Thus, what might appear to be a QTN:QTN interaction in one background and environment becomes a more complex QTN:QTN:environment:background interaction when we consider the entire dataset as a whole. As a result, the phenotypic impact of a set of QTN alleles cannot be predicted from genotype alone. Our results instead demonstrate that the effects of QTN and their interactions are inextricably linked both to genetic background and to environmental variation.
