Similar literature
1.
The determination of gene-by-gene and gene-by-environment interactions has long been one of the greatest challenges in genetics. The traditional methods are typically inadequate because of the problem referred to as the "curse of dimensionality." Recent combinatorial approaches, such as the multifactor dimensionality reduction (MDR) method, the combinatorial partitioning method, and the restricted partition method, have a straightforward correspondence to the concept of the phenotypic landscape that unifies biological, statistical genetics, and evolutionary theories. However, the existing approaches have several limitations, such as not allowing for covariates, that restrict their practical use. In this study, we report a generalized MDR (GMDR) method that permits adjustment for discrete and quantitative covariates and is applicable to both dichotomous and continuous phenotypes in various population-based study designs. Computer simulations indicated that the GMDR method has superior performance in its ability to identify epistatic loci, compared with current methods in the literature. We applied our proposed method to a genetics study of four genes that were reported to be associated with nicotine dependence and found significant joint action between CHRNB4 and NTRK2. Moreover, our example illustrates that the newly proposed GMDR approach can increase prediction ability, suggesting that its use is justified in practice. In summary, GMDR serves the purpose of identifying contributors to population variation better than do the other existing methods.

2.
The elusive but ubiquitous multifactor interactions represent a stumbling block that urgently needs to be removed in searching for determinants involved in human complex diseases. The dimensionality reduction approaches are a promising tool for this task. Many complex diseases exhibit composite syndromes that must be measured in a cluster of clinical traits with varying correlations and/or are inherently longitudinal in nature (changing over time and measured dynamically at multiple time points). A multivariate approach for detecting interactions is thus greatly needed for the purposes of handling a multifaceted phenotype and longitudinal data, as well as improving statistical power for multiple significance testing via a two-stage testing procedure that involves a multivariate analysis for grouped phenotypes followed by univariate analysis for the phenotypes in the significant group(s). In this article, we propose a multivariate extension of generalized multifactor dimensionality reduction (GMDR) based on multivariate generalized linear, multivariate quasi-likelihood and generalized estimating equations models. Simulations and real data analysis for the cohort from the Study of Addiction: Genetics and Environment are performed to investigate the properties and performance of the proposed method, as compared with the univariate method. The results suggest that the proposed multivariate GMDR substantially boosts statistical power.

3.
We propose a novel multifactor dimensionality reduction method for epistasis detection in small or extended pedigrees, FAM-MDR. It combines features of the Genome-wide Rapid Association using Mixed Model And Regression approach (GRAMMAR) with Model-Based MDR (MB-MDR). We focus on continuous traits, although the method is general and can be used for outcomes of any type, including binary and censored traits. When comparing FAM-MDR with Pedigree-based Generalized MDR (PGMDR), which is a generalization of Multifactor Dimensionality Reduction (MDR) to continuous traits and related individuals, FAM-MDR was found to outperform PGMDR in terms of power, in most of the considered simulated scenarios. Additional simulations revealed that PGMDR does not appropriately deal with multiple testing and consequently gives rise to overly optimistic results. FAM-MDR adequately deals with multiple testing in epistasis screens and is in contrast rather conservative, by construction. Furthermore, simulations show that correcting for lower order (main) effects is of utmost importance when claiming epistasis. As Type 2 Diabetes Mellitus (T2DM) is a complex phenotype likely influenced by gene-gene interactions, we applied FAM-MDR to examine data on glucose area-under-the-curve (GAUC), an endophenotype of T2DM for which multiple independent genetic associations have been observed, in the Amish Family Diabetes Study (AFDS). This application reveals that FAM-MDR makes more efficient use of the available data than PGMDR and can deal with multi-generational pedigrees more easily. In conclusion, we have validated FAM-MDR and compared it to PGMDR, the current state-of-the-art MDR method for family data, using both simulations and a practical dataset. FAM-MDR is found to outperform PGMDR in that it handles the multiple testing issue more correctly, has increased power, and efficiently uses all available information.

4.
We developed a computationally efficient algorithm, AMBIENCE, for identifying the informative variables involved in gene-gene (GGI) and gene-environment interactions (GEI) that are associated with disease phenotypes. The AMBIENCE algorithm uses a novel information theoretic metric called phenotype-associated information (PAI) to search for combinations of genetic variants and environmental variables associated with the disease phenotype. The PAI-based AMBIENCE algorithm effectively and efficiently detected GEI in simulated data sets of varying size and complexity, including the 10K simulated rheumatoid arthritis data set from Genetic Analysis Workshop 15. The method was also successfully used to detect GGI in a Crohn's disease data set. The performance of the AMBIENCE algorithm was compared to the multifactor dimensionality reduction (MDR), generalized MDR (GMDR), and pedigree disequilibrium test (PDT) methods. Furthermore, we assessed the computational speed of AMBIENCE for detecting GGI and GEI for data sets varying in size from 100 to 10^5 variables. Our results demonstrate that the AMBIENCE information theoretic algorithm is useful for analyzing a diverse range of epidemiologic data sets containing evidence for GGI and GEI.

5.
Chen GB, Xu Y, Xu HM, Li MD, Zhu J, Lou XY. PLoS ONE 2011, 6(2): e16981
Detection of interacting risk factors for complex traits is challenging. The choice of an appropriate method, sample size, and allocation of cases and controls are serious concerns. To provide empirical guidelines for planning such studies and data analyses, we investigated the performance of the multifactor dimensionality reduction (MDR) and generalized MDR (GMDR) methods under various experimental scenarios. We developed the mathematical expectation of accuracy and used it as an indicator parameter to perform a gene-gene interaction study. We then examined the statistical power of GMDR and MDR within the plausible range of accuracy (0.50 to 0.65) reported in the literature. The GMDR with covariate adjustment had a power of >80% in a case-control design with a sample size of ≥2000, with theoretical accuracy ranging from 0.56 to 0.62. However, when the accuracy was <0.56, a sample size of ≥4000 was required to have sufficient power. In our simulations, the GMDR outperformed the MDR under all models with accuracy ranging from 0.56 to 0.62 for a sample size of 1000 to 2000. However, the two methods performed similarly when the accuracy was outside this range or the sample was significantly larger. We conclude that, with adjustment for a covariate, GMDR performs better than MDR and a sample size of 1000 to 2000 is reasonably large for detecting gene-gene interactions in the range of effect size reported by the current literature, whereas a larger sample size is required for more subtle interactions with accuracy <0.56.

6.
Genome-wide association studies (GWAS) have successfully discovered hundreds of associations between genetic variants and complex traits. Most GWAS have focused on the identification of single variants. It has been shown that most of the variants that were discovered by GWAS could only partially explain disease heritability. The explanation for this missing heritability is generally believed to be gene-gene (GG) or gene-environment (GE) interactions and other structural variants. Generalized multifactor dimensionality reduction (GMDR) has been proven to be reasonably powerful in detecting GG and GE interactions; however, its performance has been found to decline when outlying quantitative traits are present. This paper proposes a robust GMDR estimation method (based on the L-estimator and M-estimator estimation methods) in an attempt to reduce the effects caused by outlying traits. A comparison of robust GMDR with the original MDR based on simulation studies showed the former method to outperform the latter. The performance of robust GMDR is illustrated through a real GWA example consisting of 8,577 samples from the Korean population using the Homeostasis Model Assessment of Insulin Resistance (HOMA-IR) level as a phenotype. Robust GMDR identified the KCNH1 gene to have strong interaction effects with other genes on the function of insulin secretion.

7.
We investigated whether variants in major candidate genes for food intake and body weight regulation contribute to obesity-related traits under a multilocus perspective. We studied 375 Brazilian subjects from partially isolated African-derived populations (quilombos). Seven variants displaying conflicting results in previous reports and supposedly implicated in the susceptibility of obesity-related phenotypes were investigated: β2-adrenergic receptor (ADRB2) (Arg16Gly), insulin induced gene 2 (INSIG2) (rs7566605), leptin (LEP) (A19G), LEP receptor (LEPR) (Gln223Arg), perilipin (PLIN) (6209T > C), peroxisome proliferator-activated receptor-γ (PPARG) (Pro12Ala), and resistin (RETN) (-420 C > G). Regression models as well as generalized multifactor dimensionality reduction (GMDR) were employed to test the contribution of individual effects and higher-order interactions to BMI and waist-hip ratio (WHR) variation and risk of overweight/obesity. The best multilocus association signal identified in the quilombos was further examined in an independent sample of 334 Brazilian subjects of European ancestry. In quilombos, only the PPARG polymorphism displayed significant individual effects (WHR variation, P = 0.028). No association was observed either with the risk of overweight/obesity (BMI ≥ 25 kg/m2), risk of obesity alone (BMI ≥ 30 kg/m2) or BMI variation. However, GMDR analyses revealed an interaction between the LEPR and ADRB2 polymorphisms (P = 0.009) as well as a third-order effect involving the latter two variants plus INSIG2 (P = 0.034) with overweight/obesity. Assessment of the LEPR-ADRB2 interaction in the second sample indicated a marginally significant association (P = 0.0724), which was further verified to be limited to men (P = 0.0118). 
Together, our findings suggest evidence for a two-locus interaction between the LEPR Gln223Arg and ADRB2 Arg16Gly variants in the risk of overweight/obesity, and highlight further the importance of multilocus effects in the genetic component of obesity.

8.
Lou X-Y. Heredity 2015, 114(3): 255-261
Biological outcomes are governed by multiple genetic and environmental factors that act in concert. Determining multifactor interactions is the primary topic of interest in recent genetics studies but presents enormous statistical and mathematical challenges. The computationally efficient multifactor dimensionality reduction (MDR) approach has emerged as a promising tool for meeting these challenges. On the other hand, complex traits are expressed in various forms and have different data generation mechanisms that cannot be appropriately modeled by a dichotomous model; the subjects in a study may be recruited according to its own analytical goals, research strategies and available resources, and do not consist only of homogeneous unrelated individuals. Although several modifications and extensions of MDR have in part addressed the practical problems, they are still limited in statistical analyses of diverse phenotypes, multivariate phenotypes and correlated observations, correcting for potential population stratification and unifying both unrelated and family samples into a more powerful analysis. I propose a comprehensive statistical framework, referred to as unified generalized MDR (UGMDR), for systematic extension of MDR. The proposed approach is quite versatile: it allows for covariate adjustment; it is suitable for analyzing almost any trait type (for example, binary, count, continuous, polytomous, ordinal, time-to-onset, multivariate and others, as well as combinations of those); and it is applicable to various study designs, including homogeneous and admixed unrelated-subject and family designs as well as mixtures of them. The proposed UGMDR offers an important addition to the arsenal of analytical tools for identifying nonlinear multifactor interactions and unraveling the genetic architecture of complex traits.

9.
Chanda P, Zhang A, Ramanathan M. Heredity 2011, 107(4): 320-327
The aim was to develop a model synthesis method for parsimoniously modeling gene-environmental interactions (GEI) associated with clinical outcomes and phenotypes. The AMBROSIA model synthesis approach utilizes the k-way interaction information (KWII), an information-theoretic metric capable of identifying variable combinations associated with GEI. For model synthesis, AMBROSIA considers the relevance of combinations to the phenotype, precludes entry of combinations with redundant information, and penalizes unjustifiable complexity; each step is KWII based. The performance and power of AMBROSIA were evaluated with simulations and Genetic Analysis Workshop 15 (GAW15) data sets of rheumatoid arthritis (RA). AMBROSIA identified parsimonious models in data sets containing multiple interactions with linkage disequilibrium present. For the GAW15 data set containing 9187 single-nucleotide polymorphisms, the parsimonious AMBROSIA model identified nine RA-associated combinations with power >90%. AMBROSIA was compared with multifactor dimensionality reduction across several diverse models and had satisfactory power. Software source code is available from http://www.cse.buffalo.edu/DBGROUP/bioinformatics/resources.html. AMBROSIA is a promising method for GEI model synthesis.

10.
MOTIVATION: The identification and characterization of genes that increase the susceptibility to common complex multifactorial diseases is a challenging task in genetic association studies. The multifactor dimensionality reduction (MDR) method has been proposed and implemented by Ritchie et al. (2001) to identify the combinations of multilocus genotypes and discrete environmental factors that are associated with a particular disease. However, the original MDR method classifies the combination of multilocus genotypes into high-risk and low-risk groups in an ad hoc manner based on a simple comparison of the ratios of the number of cases and controls. Hence, the MDR approach is prone to false positive and negative errors when the ratio of the number of cases and controls in a combination of genotypes is similar to that in the entire data, or when the numbers of both cases and controls are small. We therefore propose the odds ratio based multifactor dimensionality reduction (OR MDR) method, which uses the odds ratio as a new quantitative measure of disease risk. RESULTS: While the original MDR method provides a simple binary measure of risk, the OR MDR method provides not only the odds ratio as a quantitative measure of risk but also the ordering of the multilocus combinations from the highest risk to lowest risk groups. Furthermore, the OR MDR method provides a confidence interval for the odds ratio for each multilocus combination, which is extremely informative in judging its importance as a risk factor. The proposed OR MDR method is illustrated using the dataset obtained from the CDC Chronic Fatigue Syndrome Research Group. AVAILABILITY: The program written in R is available.
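The idea of ranking multilocus genotype combinations by odds ratio, each with a confidence interval, can be illustrated with a minimal Python sketch. This is not the authors' R program. The function name `cell_odds_ratios`, the choice to compare each cell against all other cells pooled, the 0.5 continuity correction, and the Wald-type interval are all assumptions made for illustration.

```python
import math


def cell_odds_ratios(cells, z=1.96):
    """Rank genotype cells by odds ratio (hypothetical helper, not from the paper).

    cells maps a multilocus genotype label to a (cases, controls) pair.
    Returns (genotype, odds_ratio, ci_low, ci_high) tuples, highest risk first.
    """
    total_cases = sum(cases for cases, _ in cells.values())
    total_controls = sum(controls for _, controls in cells.values())
    results = []
    for genotype, (a, b) in cells.items():
        # 2x2 table: this cell versus all remaining cells pooled (assumption).
        c = total_cases - a
        d = total_controls - b
        # Adding 0.5 to every count (assumption) keeps sparse cells finite.
        a2, b2, c2, d2 = (x + 0.5 for x in (a, b, c, d))
        odds_ratio = (a2 * d2) / (b2 * c2)
        # Wald standard error of log(OR), then back-transform the interval.
        se = math.sqrt(1 / a2 + 1 / b2 + 1 / c2 + 1 / d2)
        ci_low = math.exp(math.log(odds_ratio) - z * se)
        ci_high = math.exp(math.log(odds_ratio) + z * se)
        results.append((genotype, odds_ratio, ci_low, ci_high))
    return sorted(results, key=lambda r: r[1], reverse=True)


if __name__ == "__main__":
    toy = {"AA/BB": (30, 10), "Aa/Bb": (10, 30)}
    for row in cell_odds_ratios(toy):
        print(row)
```

A cell whose interval excludes 1 would be read as a clearer risk (or protective) combination than one whose interval straddles 1, which is the kind of judgment a binary high/low label cannot support.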

11.
The purpose of our work was to develop heuristics for visualizing and interpreting gene-environment interactions (GEIs) and to assess the dependence of candidate visualization metrics on biological and study-design factors. Two information-theoretic metrics, the k-way interaction information (KWII) and the total correlation information (TCI), were investigated. The effectiveness of the KWII and TCI to detect GEIs in a diverse range of simulated data sets and a Crohn disease data set was assessed. The sensitivity of the KWII and TCI spectra to biological and study-design variables was determined. Head-to-head comparisons with the relevance-chain, multifactor dimensionality reduction, and the pedigree disequilibrium test (PDT) methods were obtained. The KWII and TCI spectra, which are graphical summaries of the KWII and TCI for each subset of environmental and genotype variables, were found to detect each known GEI in the simulated data sets. The patterns in the KWII and TCI spectra were informative for factors such as case-control misassignment, locus heterogeneity, allele frequencies, and linkage disequilibrium. The KWII and TCI spectra were found to have excellent sensitivity for identifying the key disease-associated genetic variations in the Crohn disease data set. In head-to-head comparisons with the relevance-chain, multifactor dimensionality reduction, and PDT methods, the results from visual interpretation of the KWII and TCI spectra performed satisfactorily. The KWII and TCI are promising metrics for visualizing GEIs. They are capable of detecting interactions among numerous single-nucleotide polymorphisms and environmental variables for a diverse range of GEI models.
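For readers unfamiliar with the two metrics, the following Python sketch computes them from empirical frequencies. It assumes the usual textbook definitions: the KWII as the negated alternating sum of joint entropies over all non-empty subsets of the variables, and the TCI as the sum of marginal entropies minus the joint entropy. The function names and the toy XOR data are illustrative and do not come from the abstract.

```python
import math
from collections import Counter
from itertools import combinations


def entropy(columns):
    """Shannon entropy in bits of the joint distribution of the given columns."""
    rows = list(zip(*columns))
    n = len(rows)
    return -sum((c / n) * math.log2(c / n) for c in Counter(rows).values())


def kwii(columns):
    """k-way interaction information: -sum over subsets T of (-1)^(k-|T|) H(T)."""
    k = len(columns)
    total = 0.0
    for size in range(1, k + 1):
        for subset in combinations(columns, size):
            total += (-1) ** (k - size) * entropy(subset)
    return -total


def tci(columns):
    """Total correlation: sum of single-variable entropies minus joint entropy."""
    return sum(entropy([c]) for c in columns) - entropy(columns)


if __name__ == "__main__":
    # Toy XOR model: the phenotype depends on both SNPs jointly, on neither alone.
    snp1 = [0, 0, 1, 1]
    snp2 = [0, 1, 0, 1]
    phenotype = [0, 1, 1, 0]
    print(kwii([snp1, phenotype]))        # pairwise term for one SNP alone
    print(kwii([snp1, snp2, phenotype]))  # three-way term for the combination
    print(tci([snp1, snp2, phenotype]))
```

In the toy data each SNP carries no information about the phenotype on its own, while the three-way KWII is positive, which is the signature of a pure interaction that such spectra are meant to expose.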

12.
MOTIVATION: The identification and characterization of susceptibility genes that influence the risk of common and complex diseases remains a statistical and computational challenge in genetic association studies. This is partly because the effect of any single genetic variant for a common and complex disease may be dependent on other genetic variants (gene-gene interaction) and environmental factors (gene-environment interaction). To address this problem, the multifactor dimensionality reduction (MDR) method has been proposed by Ritchie et al. to detect gene-gene interactions or gene-environment interactions. The MDR method identifies polymorphism combinations associated with the common and complex multifactorial diseases by collapsing high-dimensional genetic factors into a single dimension. That is, the MDR method classifies the combination of multilocus genotypes into high-risk and low-risk groups based on a comparison of the ratios of the numbers of cases and controls. When a high-order interaction model is considered with multi-dimensional factors, however, there may be many sparse or empty cells in the contingency tables. The MDR method cannot classify an empty cell as high risk or low risk and leaves it as undetermined. RESULTS: In this article, we propose the log-linear model-based multifactor dimensionality reduction (LM MDR) method to improve the MDR in classifying sparse or empty cells. The LM MDR method estimates frequencies for empty cells from a parsimonious log-linear model so that they can be assigned to high- and low-risk groups. In addition, LM MDR includes MDR as a special case when the saturated log-linear model is fitted. Simulation studies show that the LM MDR method has greater power and smaller error rates than the MDR method. The LM MDR method is also compared with the MDR method using sporadic Alzheimer's disease as an example.
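The cell-labelling step of plain MDR described above, including the empty-cell problem that motivates LM MDR, can be sketched in a few lines of Python. This is an illustrative sketch only: the function name `label_cells`, the use of the overall case:control ratio as the threshold, and the treatment of ties as low risk are assumptions, and the log-linear smoothing of LM MDR itself is not implemented here.

```python
from collections import Counter


def label_cells(genotypes, is_case, all_cells):
    """Label each multilocus genotype cell as high, low, or undetermined risk.

    genotypes: one hashable multilocus genotype per subject.
    is_case:   one boolean per subject (True for a case).
    all_cells: every genotype combination that could occur.
    Requires at least one control in the sample.
    """
    cases = Counter(g for g, y in zip(genotypes, is_case) if y)
    controls = Counter(g for g, y in zip(genotypes, is_case) if not y)
    # Threshold (assumption): the overall case:control ratio of the sample.
    threshold = sum(cases.values()) / sum(controls.values())
    labels = {}
    for cell in all_cells:
        a, b = cases[cell], controls[cell]
        if a + b == 0:
            # Plain MDR has no data to classify an empty cell.
            labels[cell] = "undetermined"
        elif a > threshold * b:
            labels[cell] = "high"
        else:
            labels[cell] = "low"
    return labels


if __name__ == "__main__":
    subjects = [("AA", "BB")] * 3 + [("Aa", "Bb")] * 3
    status = [True, True, False, False, False, True]
    cells = [("AA", "BB"), ("Aa", "Bb"), ("aa", "bb")]
    print(label_cells(subjects, status, cells))
```

Writing the comparison as `a > threshold * b` rather than as a ratio avoids dividing by zero in cells that contain cases but no controls.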

13.
MOTIVATION: Polymorphisms in human genes are being described in remarkable numbers. Determining which polymorphisms and which environmental factors are associated with common, complex diseases has become a daunting task. This is partly because the effect of any single genetic variation will likely be dependent on other genetic variations (gene-gene interaction or epistasis) and environmental factors (gene-environment interaction). Detecting and characterizing interactions among multiple factors is both a statistical and a computational challenge. To address this problem, we have developed a multifactor dimensionality reduction (MDR) method for collapsing high-dimensional genetic data into a single dimension thus permitting interactions to be detected in relatively small sample sizes. In this paper, we describe the MDR approach and an MDR software package. RESULTS: We developed a program that integrates MDR with a cross-validation strategy for estimating the classification and prediction error of multifactor models. The software can be used to analyze interactions among 2-15 genetic and/or environmental factors. The dataset may contain up to 500 total variables and a maximum of 4000 study subjects. AVAILABILITY: Information on obtaining the executable code, example data, example analysis, and documentation is available upon request. SUPPLEMENTARY INFORMATION: All supplementary information can be found at http://phg.mc.vanderbilt.edu/Software/MDR.

14.
Complex diseases such as cardiovascular disease are likely due to the effects of high-order interactions among multiple genes and demographic factors. Therefore, in order to understand their underlying biological mechanisms, we need to consider simultaneously the effects of genotypes across multiple loci. Statistical methods such as multifactor dimensionality reduction (MDR), the combinatorial partitioning method (CPM), recursive partitioning (RP), and patterning and recursive partitioning (PRP) are designed to uncover complex relationships without relying on a specific model for the interaction, and are therefore well-suited to this data setting. However, the theoretical overlap among these methods and their relative merits have not been well characterized. In this paper we demonstrate mathematically that MDR is a special case of RP in which (1) patterns are used as predictors (PRP), (2) tree growth is restricted to a single split, and (3) misclassification error is used as the measure of impurity. Both approaches are applied to a case-control study assessing the effect of eleven single nucleotide polymorphisms on coronary artery calcification in people at risk for cardiovascular disease.

15.
Resistant hypertension, a complex multifactorial hypertensive disease, is triggered by genetic and environmental factors and involves multiple physiological pathways. Single genetic variants may not reveal significant associations with resistant hypertension because their effects may be dependent on gene-gene or gene-environment interactions. We examined the interaction of angiotensin I-converting enzyme (ACE), angiotensinogen (AGT), and endothelial nitric oxide synthase (NOS3) polymorphisms with environmental factors (gender, age, body mass index, glycemia, total cholesterol, low-density lipoprotein cholesterol, high-density lipoprotein cholesterol, triglycerides, estimated glomerular filtration rate, and urinary sodium excretion) in 70 resistant hypertensive patients, 80 well-controlled hypertensive patients, and 70 normotensive controls. All subjects were genotyped for ACE insertion/deletion (rs1799752), AGT M235T (rs699), and NOS3 Glu298Asp (rs1799983). Multifactorial associations were tested using two statistical methods: the traditional parametric method (adjusted logistic regression analysis) and gene-gene and gene-environment interactions evaluated by multifactor dimensionality reduction analyses. While adjusted logistic regression found no significant association between the studied polymorphisms and controlled or resistant hypertension, the multifactor dimensionality reduction analyses showed that carriers of the AGT 235T allele were at increased risk for resistant hypertension, especially if they were older than 50 years. The AGT 235T allele constituted an independent risk factor for resistant hypertension.

16.

Background  

We examined interactions among the angiotensin converting enzyme (ACE) insertion/deletion, plasminogen activator inhibitor-1 (PAI-1) 4G/5G, and tissue plasminogen activator (t-PA) insertion/deletion gene polymorphisms on risk of myocardial infarction using data from 343 matched case-control pairs from the Physicians' Health Study. We examined the data using both conditional logistic regression and the multifactor dimensionality reduction (MDR) method. One advantage of the MDR method is that it provides an internal prediction error for validation. We summarize our use of this internal prediction error for model validation.

17.
Community genetics examines how genotypic variation within a species influences the associated ecological community. The inclusion of additional environmental and genotypic factors is a natural extension of the current community genetics framework. However, the extent to which the presence of and genetic variation in associated species influences interspecific interactions (i.e., genotype x genotype x environment [G x G x E] interactions) has been largely ignored. We used a community genetics approach to study the interaction of barley and aphids in the absence and presence of rhizosphere bacteria. We designed a matrix of aphid genotype and barley genotype combinations and found a significant G x G x E interaction, indicating that the barley-aphid interaction is dependent on the genotypes of the interacting species as well as the biotic environment. We discuss the consequences of the strong G x G x E interaction found in our study in relation to its impact on the study of species interactions in a community context.

18.
It is becoming clearly evident that a single gene or single environmental factor cannot explain susceptibility to diseases with complex etiology such as head and neck cancer. In this study, we applied the multifactor dimensionality reduction method to explore potential gene-environment and gene-gene interactions that may contribute to predisposition to head and neck cancer in the North Indian population. We genotyped 203 patients with head and neck cancer and 201 healthy controls for 13 functional polymorphisms in genes coding for tobacco metabolizing enzymes (CYP1A1, CYP2A13, GSTM1, and UGT1A7) using the polymerase chain reaction-restriction fragment length polymorphism method, a real-time polymerase chain reaction quantitative assay, and denaturing high-performance liquid chromatography followed by direct sequencing. We found that GSTM1 copy number variations were the most influential factor for head and neck cancer. We also observed significant gene-gene interactions among GSTM1 copy number variants, CYP1A1 T3801C and UGT1A7 T622C variants among smokers. The multifactor dimensionality reduction approach showed that the three-factor model, including smoking status, CYP1A1 T3801C, and GSTM1 copy number variants, conferred more than fourfold increased risk of head and neck cancer (odds ratio 4.89; 95% confidence interval: 3.15-7.32, p?

19.
We present an extension of the two-class multifactor dimensionality reduction (MDR) algorithm that enables detection and characterization of epistatic SNP-SNP interactions in the context of a quantitative trait. The proposed Quantitative MDR (QMDR) method handles continuous data by modifying MDR’s constructive induction algorithm to use a T-test. QMDR replaces the balanced accuracy metric with a T-test statistic as the score to determine the best interaction model. We used a simulation to identify the empirical distribution of QMDR’s testing score. We then applied QMDR to genetic data from the ongoing prospective Prevention of Renal and Vascular End-Stage Disease (PREVEND) study.
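A rough Python sketch of how a t statistic can stand in for balanced accuracy with a quantitative trait is given below. The abstract does not spell out the construction, so several details are assumptions: each genotype cell is labelled high if its mean trait value exceeds the overall mean, the two pooled groups are then compared with a Welch t statistic, and the function name `qmdr_score` is hypothetical.

```python
import math
from collections import defaultdict
from statistics import mean, variance


def qmdr_score(genotypes, trait):
    """Return (high_cells, t_statistic) for one candidate SNP combination.

    genotypes: one hashable multilocus genotype per subject.
    trait:     one quantitative trait value per subject.
    Each pooled group needs at least two subjects and non-zero spread.
    """
    overall = mean(trait)
    by_cell = defaultdict(list)
    for g, y in zip(genotypes, trait):
        by_cell[g].append(y)
    # Constructive induction (assumption): a cell is "high" when its mean
    # trait value lies above the overall mean, otherwise "low".
    high_cells = {g for g, ys in by_cell.items() if mean(ys) > overall}
    high = [y for g, y in zip(genotypes, trait) if g in high_cells]
    low = [y for g, y in zip(genotypes, trait) if g not in high_cells]
    # Welch t statistic between the pooled high and low groups (assumption).
    se = math.sqrt(variance(high) / len(high) + variance(low) / len(low))
    return high_cells, (mean(high) - mean(low)) / se


if __name__ == "__main__":
    g = ["AA", "AA", "Aa", "Aa", "aa", "aa"]
    y = [5.0, 6.0, 5.5, 6.5, 1.0, 2.0]
    print(qmdr_score(g, y))
```

In a full analysis this score would be computed for every candidate SNP combination, and the combination with the largest statistic would be carried forward for validation.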
