Similar literature
1.
MOTIVATION: The identification and characterization of susceptibility genes that influence the risk of common and complex diseases remains a statistical and computational challenge in genetic association studies. This is partly because the effect of any single genetic variant for a common and complex disease may be dependent on other genetic variants (gene-gene interaction) and environmental factors (gene-environment interaction). To address this problem, the multifactor dimensionality reduction (MDR) method has been proposed by Ritchie et al. to detect gene-gene interactions or gene-environment interactions. The MDR method identifies polymorphism combinations associated with the common and complex multifactorial diseases by collapsing high-dimensional genetic factors into a single dimension. That is, the MDR method classifies the combination of multilocus genotypes into high-risk and low-risk groups based on a comparison of the ratios of the numbers of cases and controls. When a high-order interaction model is considered with multi-dimensional factors, however, there may be many sparse or empty cells in the contingency tables. The MDR method cannot classify an empty cell as high risk or low risk and leaves it as undetermined. RESULTS: In this article, we propose the log-linear model-based multifactor dimensionality reduction (LM MDR) method to improve the MDR in classifying sparse or empty cells. The LM MDR method estimates frequencies for empty cells from a parsimonious log-linear model so that they can be assigned to high- and low-risk groups. In addition, LM MDR includes MDR as a special case when the saturated log-linear model is fitted. Simulation studies show that the LM MDR method has greater power and smaller error rates than the MDR method. The LM MDR method is also compared with the MDR method using sporadic Alzheimer's disease as an example.
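The cell-labelling step described in this abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' software: the function name `mdr_label_cells`, the 0/1/2 genotype coding, the default threshold of 1.0, and the rule that ties count as high risk are all assumptions made for the example. It shows how a multilocus genotype cell with no observations ends up undetermined, which is the gap LM MDR is said to address.

```python
from collections import Counter
from itertools import product


def mdr_label_cells(genotypes, status, n_loci, threshold=1.0):
    """Label every multilocus genotype cell as high, low, or undetermined.

    genotypes: list of tuples, one genotype code (0, 1, or 2) per locus.
    status:    list of 1 (case) or 0 (control), aligned with genotypes.
    threshold: assumed case:control ratio at or above which a cell is high risk.
    """
    cases = Counter(g for g, s in zip(genotypes, status) if s == 1)
    controls = Counter(g for g, s in zip(genotypes, status) if s == 0)
    labels = {}
    # Enumerate all 3**n_loci possible cells, including unobserved ones.
    for cell in product((0, 1, 2), repeat=n_loci):
        n_case, n_ctrl = cases[cell], controls[cell]
        if n_case + n_ctrl == 0:
            labels[cell] = "undetermined"  # empty cell: no ratio can be formed
        elif n_case >= threshold * n_ctrl:  # avoids dividing by zero controls
            labels[cell] = "high"
        else:
            labels[cell] = "low"
    return labels


# Toy data for two loci (hypothetical values).
genotypes = [(0, 0), (0, 0), (0, 1), (0, 1), (0, 1), (1, 1)]
status = [1, 1, 1, 0, 0, 0]
labels = mdr_label_cells(genotypes, status, n_loci=2)
```

With only six subjects spread over nine possible two-locus cells, most cells stay undetermined, which illustrates why sparsity grows quickly as the interaction order increases.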

2.
Interactions of single nucleotide polymorphisms (SNPs) are assumed to be responsible for complex diseases such as sporadic breast cancer. Important goals of studies concerned with such genetic data are thus to identify combinations of SNPs that lead to a higher risk of developing a disease and to measure the importance of these interactions. There are many approaches based on classification methods such as CART and random forests that allow measuring the importance of single variables. But none of these methods enable the importance of combinations of variables to be quantified directly. In this paper, we show how logic regression can be employed to identify SNP interactions explanatory for the disease status in a case-control study and propose two measures for quantifying the importance of these interactions for classification. These approaches are then applied on the one hand to simulated data sets and on the other hand to the SNP data of the GENICA study, a study dedicated to the identification of genetic and gene-environment interactions associated with sporadic breast cancer.

3.

Objective

Cholesterol gallstone disease (CGD) is a multifactorial and multistep disease. Apart from female gender and increasing age, the documented non-modifiable risk factors for gallstones, the pathobiological mechanisms underlying the phenotypic expression of CGD appear to be rather complex, and one or more gene variations could play critical roles in the diverse pathways that lead to cholesterol crystal formation. In the present study we used a genotyping score, multifactor dimensionality reduction (MDR), and classification and regression tree (CART) analysis to identify combinations of alleles among the hormonal, hepatocanalicular transporter, and adipogenesis differentiation pathway genes that modify the risk for CGD.

Design

The present case-control study recruited a total of 450 subjects, including 230 CGD patients and 220 controls. We analyzed common ESR1, ESR2, PGR, ADRB3, ADRA2A, ABCG8, SLCO1B1, PPARγ2, and SREBP2 gene polymorphisms to identify combinations of genetic variants contributing to CGD risk, using multi-analytical approaches (G-score, MDR, and CART).

Results

Single-locus analysis by logistic regression showed association of ESR1 IVS1-397C>T (rs2234693), IVS1-351A>G (rs9340799), PGR ins/del (rs1042838), ADRB3-190 T>C (rs4994), ABCG8 D19H (rs11887534), SLCO1B1 Exon4 C>A (rs11045819), and SREBP2 1784G>C (rs2228314) with CGD risk. However, the MDR and CART analyses revealed the ESR1 IVS1-397C>T (rs2234693), ADRB3-190 T>C (rs4994), and ABCG8 D19H (rs11887534) polymorphisms as the best polymorphic signature for discriminating between cases and controls. The overall odds ratio for the applied multi-analytical approaches ranged from 4.33 to 10.05, showing an incremental risk for cholesterol crystal formation. In conclusion, our multi-analytical approach suggests that ESR1 and ADRB3 genetic variants, in addition to ABCG8 variants, confer significant risk for cholesterol gallstone disease.

4.
Epistasis or gene-gene interaction is a fundamental component of the genetic architecture of complex traits such as disease susceptibility. Multifactor dimensionality reduction (MDR) was developed as a nonparametric and model-free method to detect epistasis when there are no significant marginal genetic effects. However, in many studies of complex disease, other covariates like age of onset and smoking status could have a strong main effect and may potentially interfere with MDR's ability to achieve its goal. In this paper, we present a simple and computationally efficient sampling method to adjust for covariate effects in MDR. We use simulation to show that after adjustment, MDR has sufficient power to detect true gene-gene interactions. We also compare our method with the state-of-the-art technique in covariate adjustment. The results suggest that our proposed method performs similarly, but is more computationally efficient. We then apply this new method to an analysis of a population-based bladder cancer study in New Hampshire.

5.
Complex human diseases do not have a clear inheritance pattern, and it is expected that risk involves multiple genes with modest effects acting independently or interacting. Major challenges for the identification of genetic effects are genetic heterogeneity and difficulty in analyzing high-order interactions. To address these challenges, we present MDR-Phenomics, a novel approach based on the multifactor dimensionality reduction (MDR) method, to detect genetic effects in pedigree data by integration of phenotypic covariates (PCs) that may reflect genetic heterogeneity. The P value of the test is calculated using a permutation test adjusted for multiple tests. To validate MDR-Phenomics, we compared it with two MDR-based methods: (1) traditional MDR pedigree disequilibrium test (PDT) without consideration of PCs (MDR-PDT) and (2) stratified phenotype (SP) analysis based on PCs, with use of MDR-PDT with a Bonferroni adjustment (SP-MDR). Using computer simulations, we examined the statistical power and type I error of the different approaches under several genetic models and sampling scenarios. We conclude that MDR-Phenomics is more powerful than MDR-PDT and SP-MDR when there is genetic heterogeneity, and the statistical power is affected by sample size and the number of PC levels. We further compared MDR-Phenomics with conditional logistic regression (CLR) for testing interactions across single or multiple loci with consideration of PC. The results show that CLR with PC has only slightly smaller power than does MDR-Phenomics for single-locus analysis but has considerably smaller power for multiple loci. Finally, by applying MDR-Phenomics to autism, a complex disease in which multiple genes are believed to confer risk, we attempted to identify multiple gene effects in two candidate genes of interest—the serotonin transporter gene (SLC6A4) and the integrin beta 3 gene (ITGB3) on chromosome 17. 
Analyzing four markers in SLC6A4 and four markers in ITGB3 in 117 white family triads with autism and using sex of the proband as a PC, we found significant interaction between two markers, rs1042173 in SLC6A4 and rs3809865 in ITGB3.

6.
Identifying susceptibility genes that influence complex diseases is extremely difficult because loci often influence the disease state through genetic interactions. Numerous approaches to detect disease-associated SNP-SNP interactions have been developed, but none consistently generates high-quality results under different disease scenarios. Using summarizing techniques to combine a number of existing methods may provide a solution to this problem. Here we used three popular non-parametric methods, namely Gini, absolute probability difference (APD), and entropy, to develop two novel summary scores, the principal component score (PCS) and the Z-sum score (ZSS), with which to predict disease-associated genetic interactions. We used a simulation study to compare performance of the non-parametric scores, the summary scores, the scaled-sum score (SSS; used in polymorphism interaction analysis (PIA)), and the multifactor dimensionality reduction (MDR). The non-parametric methods achieved high power, but no non-parametric method outperformed all others under a variety of epistatic scenarios. PCS and ZSS, however, outperformed MDR. PCS, ZSS and SSS displayed controlled type I error rates (< 0.05) compared to the Gini, APD, and entropy scores (GS, APDS, ES; > 0.05). A real data study using the Genetic Analysis Workshop 16 (GAW 16) rheumatoid arthritis dataset identified a number of interesting SNP-SNP interactions.

7.

Background  

It is hypothesized that common, complex diseases may be due to complex interactions between genetic and environmental factors, which are difficult to detect in high-dimensional data using traditional statistical approaches. Multifactor Dimensionality Reduction (MDR) is the most commonly used data-mining method to detect epistatic interactions. In all data-mining methods, it is important to consider internal validation procedures to obtain prediction estimates to prevent model over-fitting and reduce potential false positive findings. Currently, MDR utilizes cross-validation for internal validation. In this study, we incorporate the use of a three-way split (3WS) of the data in combination with a post-hoc pruning procedure as an alternative to cross-validation for internal model validation to reduce computation time without impairing performance. We compare the power to detect true disease-causing loci using MDR with both 5- and 10-fold cross-validation to MDR with 3WS for a range of single-locus and epistatic disease models. Additionally, we analyze a dataset in HIV immunogenetics to demonstrate the results of the two strategies on real data.

8.
The candidate-gene approach in association studies of polygenic diseases has often yielded conflicting results. In this hospital-based case-control study with 696 white patients newly diagnosed with bladder cancer and 629 unaffected white controls, we applied a multigenic approach to examine the associations with bladder cancer risk of a comprehensive panel of 44 selected polymorphisms in two pathways, DNA repair and cell-cycle control, and to evaluate higher-order gene-gene interactions, using classification and regression tree (CART) analysis. Individually, only XPD Asp312Asn, RAG1 Lys820Arg, and a p53 intronic SNP exhibited statistically significant main effects. However, we found a significant gene-dosage effect for increasing numbers of potential high-risk alleles in DNA-repair and cell-cycle pathways separately and combined. For the nucleotide-excision repair pathway, compared with the referent group (fewer than four adverse alleles), individuals with four (odds ratio [OR] = 1.52, 95% CI 1.05-2.20), five to six (OR = 1.81, 95% CI 1.31-2.50), and seven or more adverse alleles (OR = 2.50, 95% CI 1.69-3.70) had increasingly elevated risks of bladder cancer (P for trend <.001). Each additional adverse allele was associated with a 1.21-fold increase in risk (95% CI 1.12-1.29). For the combined analysis of DNA-repair and cell-cycle SNPs, compared with the referent group (<13 adverse alleles), the ORs for individuals with 13-15, 16-17, and ≥18 adverse alleles were 1.22 (95% CI 0.84-1.76), 1.57 (95% CI 1.05-2.35), and 1.77 (95% CI 1.19-2.63), respectively (P for trend = .002). Each additional high-risk allele was associated with a 1.07-fold significant increase in risk. In addition, we found that smoking had a significant multiplicative interaction with SNPs in the combined DNA-repair and cell-cycle-control pathways (P<.01). All genetic effects were evident only in "ever smokers" (persons who had smoked ≥100 cigarettes) and not in "never smokers." 
A cross-validation statistical method developed in this study confirmed the above observations. CART analysis revealed potential higher-order gene-gene and gene-smoking interactions and categorized a few higher-risk subgroups for bladder cancer. Moreover, subgroups identified with higher cancer risk also exhibited higher levels of induced genetic damage than did subgroups with lower risk. There was a significant trend of higher numbers of bleomycin- and benzo[a]pyrene diol-epoxide (BPDE)-induced chromatid breaks (by mutagen-sensitivity assay) and DNA damage (by comet assay) for individuals in higher-risk subgroups among cases of bladder cancer in smokers. The P for the trend was .0348 for bleomycin-induced chromosome breaks, .0036 for BPDE-induced chromosome breaks, and .0397 for BPDE-induced DNA damage, indicating that these higher-order gene-gene and gene-smoking interactions included SNPs that modulated repair and resulted in diminished DNA-repair capacity. Thus, genotype/phenotype analyses support findings from CART analyses. This is the first comprehensive study to use a multigenic analysis for bladder cancer, and the data suggest that individuals with a higher number of genetic variations in DNA-repair and cell-cycle-control genes are at an increased risk for bladder cancer, confirming the importance of taking a multigenic pathway-based approach to risk assessment.

9.
MOTIVATION: Polymorphisms in human genes are being described in remarkable numbers. Determining which polymorphisms and which environmental factors are associated with common, complex diseases has become a daunting task. This is partly because the effect of any single genetic variation will likely be dependent on other genetic variations (gene-gene interaction or epistasis) and environmental factors (gene-environment interaction). Detecting and characterizing interactions among multiple factors is both a statistical and a computational challenge. To address this problem, we have developed a multifactor dimensionality reduction (MDR) method for collapsing high-dimensional genetic data into a single dimension thus permitting interactions to be detected in relatively small sample sizes. In this paper, we describe the MDR approach and an MDR software package. RESULTS: We developed a program that integrates MDR with a cross-validation strategy for estimating the classification and prediction error of multifactor models. The software can be used to analyze interactions among 2-15 genetic and/or environmental factors. The dataset may contain up to 500 total variables and a maximum of 4000 study subjects. AVAILABILITY: Information on obtaining the executable code, example data, example analysis, and documentation is available upon request. SUPPLEMENTARY INFORMATION: All supplementary information can be found at http://phg.mc.vanderbilt.edu/Software/MDR.

10.

Background

Molecular and epidemiological evidence demonstrate that altered gene expression and single nucleotide polymorphisms in the apoptotic pathway are linked to many cancers. Yet, few studies emphasize the interaction of variant apoptotic genes and their joint modifying effects on prostate cancer (PCA) outcomes. An exhaustive assessment of all the possible two-, three- and four-way gene-gene interactions is computationally burdensome. This statistical conundrum stems from the prohibitive amount of data needed to account for multiple hypothesis testing.

Methods

To address this issue, we systematically prioritized and evaluated individual effects and complex interactions among 172 apoptotic SNPs in relation to PCA risk and aggressive disease (i.e., Gleason score ≥ 7 and tumor stages III/IV). Single and joint modifying effects on PCA outcomes among European-American men were analyzed using statistical epistasis networks coupled with multi-factor dimensionality reduction (SEN-guided MDR). The case-control study design included 1,175 incident PCA cases and 1,111 controls from the prostate, lung, colo-rectal, and ovarian (PLCO) cancer screening trial. Moreover, a subset analysis of PCA cases consisted of 688 aggressive and 488 non-aggressive PCA cases. SNP profiles were obtained using the NCI Cancer Genetic Markers of Susceptibility (CGEMS) data portal. Main effects were assessed using logistic regression (LR) models. Prior to modeling interactions, SEN was used to pre-process our genetic data. SEN used network science to reduce our analysis from > 36 million to < 13,000 SNP interactions. Interactions were visualized, evaluated, and validated using entropy-based MDR. All parametric and non-parametric models were adjusted for age, family history of PCA, and multiple hypothesis testing.

Results

Following LR modeling, eleven and thirteen sequence variants were associated with PCA risk and aggressive disease, respectively. However, none of these markers remained significant after we adjusted for multiple comparisons. Nevertheless, we detected a modest synergistic interaction between AKT3 rs2125230-PRKCQ rs571715 and disease aggressiveness using SEN-guided MDR (p = 0.011).

Conclusions

In summary, entropy-based SEN-guided MDR facilitated the logical prioritization and evaluation of apoptotic SNPs in relation to aggressive PCA. The suggestive interaction between AKT3-PRKCQ and aggressive PCA requires further validation using independent observational studies.
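The "> 36 million" search space mentioned in the Methods of this abstract is consistent with counting every unordered two-, three-, and four-way combination of the 172 SNPs. The abstract does not spell out that calculation, so the short check below is only one plausible reading of how the figure arises.

```python
from math import comb

n_snps = 172  # number of apoptotic SNPs reported in the abstract

# Unordered k-way SNP combinations for k = 2, 3, 4.
counts = {k: comb(n_snps, k) for k in (2, 3, 4)}
total = sum(counts.values())

for k, n in counts.items():
    print(f"{k}-way combinations: {n:,}")
print(f"total: {total:,}")
```

Four-way combinations dominate the total, which is why exhaustive testing becomes burdensome so quickly as the interaction order rises.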

11.
One of the greatest challenges facing human geneticists is the identification and characterization of susceptibility genes for common complex multifactorial human diseases. This challenge is partly due to the limitations of parametric-statistical methods for detection of gene effects that are dependent solely or partially on interactions with other genes and with environmental exposures. We introduce multifactor-dimensionality reduction (MDR) as a method for reducing the dimensionality of multilocus information, to improve the identification of polymorphism combinations associated with disease risk. The MDR method is nonparametric (i.e., no hypothesis about the value of a statistical parameter is made), is model-free (i.e., it assumes no particular inheritance model), and is directly applicable to case-control and discordant-sib-pair studies. Using simulated case-control data, we demonstrate that MDR has reasonable power to identify interactions among two or more loci in relatively small samples. When it was applied to a sporadic breast cancer case-control data set, in the absence of any statistically significant independent main effects, MDR identified a statistically significant high-order interaction among four polymorphisms from three different estrogen-metabolism genes. To our knowledge, this is the first report of a four-locus interaction associated with a common complex multifactorial disease.

12.
Despite the growing consensus on the importance of testing gene-gene interactions in genetic studies of complex diseases, the effect of gene-gene interactions has often been defined as a deviance from genetic additive effects, which is essentially treated as a residual term in genetic analysis and leads to low power in detecting the presence of interacting effects. To what extent the definition of gene-gene interaction at population level reflects the genes' biochemical or physiological interaction remains a mystery. In this article, we introduce a novel definition and a new measure of gene-gene interaction between two unlinked loci (or genes). We developed a general theory for studying linkage disequilibrium (LD) patterns in disease population under two-locus disease models. The properties of using the LD measure in a disease population as a function of the measure of gene-gene interaction between two unlinked loci were also investigated. We examined how interaction between two loci creates LD in a disease population and showed that the mathematical formulation of the new definition for gene-gene interaction between two loci was similar to that of the LD between two loci. This finding motivated us to develop an LD-based statistic to detect gene-gene interaction between two unlinked loci. The null distribution and type I error rates of the LD-based statistic for testing gene-gene interaction were validated using extensive simulation studies. We found that the new test statistic was more powerful than the traditional logistic regression under three two-locus disease models and demonstrated that the power of the test statistic depends on the measure of gene-gene interaction. We also investigated the impact of using tagging SNPs for testing interaction on the power to detect interaction between two unlinked loci. Finally, to evaluate the performance of our new method, we applied the LD-based statistic to two published data sets. 
Our results showed that the P values of the LD-based statistic were smaller than those obtained by other approaches, including logistic regression models.

13.
Chen GB, Xu Y, Xu HM, Li MD, Zhu J, Lou XY. PLoS ONE 2011, 6(2): e16981
Detection of interacting risk factors for complex traits is challenging. The choice of an appropriate method, sample size, and allocation of cases and controls are serious concerns. To provide empirical guidelines for planning such studies and data analyses, we investigated the performance of the multifactor dimensionality reduction (MDR) and generalized MDR (GMDR) methods under various experimental scenarios. We developed the mathematical expectation of accuracy and used it as an indicator parameter to perform a gene-gene interaction study. We then examined the statistical power of GMDR and MDR within the plausible range of accuracy (0.50-0.65) reported in the literature. The GMDR with covariate adjustment had a power of >80% in a case-control design with a sample size of ≥2000, with theoretical accuracy ranging from 0.56 to 0.62. However, when the accuracy was <0.56, a sample size of ≥4000 was required to have sufficient power. In our simulations, the GMDR outperformed the MDR under all models with accuracy ranging from 0.56 to 0.62 for a sample size of 1000-2000. However, the two methods performed similarly when the accuracy was outside this range or the sample was significantly larger. We conclude that with adjustment of a covariate, GMDR performs better than MDR and a sample size of 1000-2000 is reasonably large for detecting gene-gene interactions in the range of effect size reported by the current literature, whereas a larger sample size is required for more subtle interactions with accuracy <0.56.

14.
Hirschsprung disease (HSCR) is a severe multifactorial genetic disorder. Microarray studies indicated that GAL, GAP43 and NRSN1 might contribute to the altered risk in HSCR. Thus, we focused on genetic variations in GAL, GAP43 and NRSN1, and the gene-gene interactions involved in HSCR susceptibility. We adopted a strategy combining a case-control study and the MassArray system with interaction network analysis. For GAL, GAP43 and NRSN1, a total of 18 polymorphisms were assessed in 104 subjects with sporadic HSCR and 151 controls of Han Chinese origin. We found statistically significant differences between HSCR and control groups at 5 genetic variants. For each gene, the haplotypes combining all polymorphisms were the most significant. Based on SNPsyn, MDR and GeneMANIA analyses, we observed significant gene-gene interactions among GAL, GAP43, NRSN1 and our previously identified RELN, GABRG2 and PTCH1. Our study indicates for the first time that genetic variants within GAL, GAP43 and NRSN1 and related gene-gene interaction networks might be involved in the altered susceptibility to HSCR in the Han Chinese population, which might shed more light on HSCR pathogenesis.

15.
Gene-gene interactions may play an important role in the genetics of a complex disease. Detection and characterization of gene-gene interactions is a challenging issue that has stimulated the development of various statistical methods to address it. In this study, we introduce a method to measure gene interactions using entropy-based statistics from a contingency table of trait and genotype combinations. We also developed an exploration procedure by using graphs. We propose a standardized relative information gain (RIG) measure to evaluate the interactions between single nucleotide polymorphism (SNP) combinations. To identify the kth-order interactions, contingency tables of trait and genotype combinations of k SNPs are constructed, with which RIGs are calculated. The RIGs are standardized using the mean and standard deviation from the permuted datasets. SNP combinations yielding high standardized RIG are chosen for gene-gene interactions. Detection of high-order interactions and comparison of interaction strengths between different orders are made possible by using standardized RIG. We have applied the proposed standardized entropy-based method to two types of data sets from a simulation study and a real genetic association study. We have compared our method and the multifactor dimensionality reduction (MDR) method through power analysis of eight different genetic models with varying penetrance rates, number of SNPs, and sample sizes. Our method shows successful identification of genetic associations and gene-gene interactions both in simulation and real genetic data. Simulation results suggest that the proposed entropy-based method is better able to detect high-order interactions and is superior to the MDR method in most cases. The proposed method is well suited for detecting interactions without main effects as well as for models including main effects.
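The abstract does not give the formula for its relative information gain, so the sketch below substitutes a common textbook definition as an assumption: RIG = (H(T) - H(T | G)) / H(T), where T is the trait and G the multilocus genotype, standardized against permuted trait labels. The function names, the permutation count, and the toy data are all hypothetical.

```python
import math
import random
from collections import Counter


def entropy(labels):
    """Shannon entropy (bits) of a list of discrete labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def relative_information_gain(genotypes, trait):
    """Assumed definition: (H(T) - H(T | G)) / H(T)."""
    h_trait = entropy(trait)
    if h_trait == 0:
        return 0.0
    groups = {}
    for g, t in zip(genotypes, trait):
        groups.setdefault(g, []).append(t)
    n = len(trait)
    h_cond = sum(len(ts) / n * entropy(ts) for ts in groups.values())
    return (h_trait - h_cond) / h_trait


def standardized_rig(genotypes, trait, n_perm=200, seed=0):
    """Z-score of the observed RIG against trait-permuted datasets."""
    rng = random.Random(seed)
    observed = relative_information_gain(genotypes, trait)
    shuffled = list(trait)
    null = []
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        null.append(relative_information_gain(genotypes, shuffled))
    mean = sum(null) / n_perm
    sd = math.sqrt(sum((x - mean) ** 2 for x in null) / (n_perm - 1))
    return (observed - mean) / sd if sd > 0 else float("nan")


# Toy XOR-like model: the trait depends on the pair of loci, not on either alone.
pairs = [(0, 0), (0, 1), (1, 0), (1, 1)] * 10
trait = [0, 1, 1, 0] * 10
rig_pair = relative_information_gain(pairs, trait)
rig_single = relative_information_gain([g[0] for g in pairs], trait)
z_pair = standardized_rig(pairs, trait)
```

In this toy model the first locus alone carries no information about the trait while the pair determines it completely, which is the "interaction without main effects" situation the abstract refers to.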

16.
Complex diseases such as cancer result from interactions of multiple genetic and environmental factors. Studying these factors singly cannot explain the underlying pathogenetic mechanism of the disease. A multi-analytical approach, including logistic regression (LR), classification and regression tree (CART) and multifactor dimensionality reduction (MDR), was applied in 188 lung cancer cases and 290 controls to explore high-order interactions among xenobiotic metabolizing genes and environmental risk factors. Smoking was identified as the predominant risk factor by all three analytical approaches. Individually, CYP1A1*2A polymorphism was significantly associated with increased lung cancer risk (OR = 1.69; 95% CI = 1.11–2.59, p = 0.01), whereas EPHX1 Tyr113His and SULT1A1 Arg213His conferred reduced risk (OR = 0.40; 95% CI = 0.25–0.65, p < 0.001 and OR = 0.51; 95% CI = 0.33–0.78, p = 0.002, respectively). In smokers, EPHX1 Tyr113His and SULT1A1 Arg213His polymorphisms reduced the risk of lung cancer, whereas CYP1A1*2A, CYP1A1*2C and GSTP1 Ile105Val imparted increased risk in non-smokers only. While exploring non-linear interactions through CART analysis, smokers carrying the combination of EPHX1 113TC (Tyr/His), SULT1A1 213GG (Arg/Arg) or AA (His/His) and GSTM1 null genotypes showed the highest risk for lung cancer (OR = 3.73; 95% CI = 1.33–10.55, p = 0.006), whereas the combined effect of CYP1A1*2A 6235CC or TC, SULT1A1 213GG (Arg/Arg) and betel quid chewing showed maximum risk in non-smokers (OR = 2.93; 95% CI = 1.15–7.51, p = 0.01). MDR analysis identified two distinct predictor models for the risk of lung cancer in smokers (tobacco chewing, EPHX1 Tyr113His, and SULT1A1 Arg213His) and non-smokers (CYP1A1*2A, GSTP1 Ile105Val and SULT1A1 Arg213His) with testing balanced accuracy (TBA) of 0.6436 and 0.6677, respectively. 
Interaction entropy interpretations of MDR results showed non-additive interactions of tobacco chewing with SULT1A1 Arg213His and EPHX1 Tyr113His in smokers and SULT1A1 Arg213His with GSTP1 Ile105Val and CYP1A1*2C in non-smokers. These results identified distinct gene-gene and gene-environment interactions in smokers and non-smokers, which confirms the importance of multifactorial interaction in risk assessment of lung cancer.

17.
Recent technological developments in genetic screening approaches have offered the means to start exploring quantitative genotype-phenotype relationships on a large-scale. What remains unclear is the extent to which the quantitative genetic interaction datasets can distinguish the broad spectrum of interaction classes, as compared to existing information on mutation pairs associated with both positive and negative interactions, and whether the scoring of varying degrees of such epistatic effects could be improved by computational means. To address these questions, we introduce here a computational approach for improving the quantitative discrimination power encoded in the genetic interaction screening data. Our matrix approximation model decomposes the original double-mutant fitness matrix into separate components, representing variability across the array and query mutants, which can be utilized for estimating and correcting the single-mutant fitness effects, respectively. When applied to three large-scale quantitative interaction datasets in yeast, we could improve the accuracy of scoring various interaction classes beyond that obtained with the original fitness data, especially in synthetic genetic array (SGA) and in genetic interaction mapping (GIM) datasets. In addition to the known pairs of interactions used in the evaluation of the computational approach, a number of novel interaction pairs were also predicted, along with underlying biological mechanisms, which remained undetected by the original datasets. It was shown that the optimal choice of the scoring function depends heavily on the screening approach and on the interaction class under analysis. Moreover, a simple preprocessing of the fitness matrix could further enhance the discrimination power of the epistatic miniarray profiling (E-MAP) dataset. These systematic evaluation results provide in-depth information on the optimal analysis of the future, large-scale screening experiments. 
In general, the modeling framework, enabling accurate identification and classification of genetic interactions, provides a solid basis for completing and mining the genetic interaction networks in yeast and other organisms.

18.
Multifactor Dimensionality Reduction (MDR) is a method for the classification and prediction of discrete clinical endpoints using attributes constructed from multilocus genotype data. Empirical studies with both real and simulated data suggest that MDR has good power for detecting gene-gene interactions in the absence of independent main effects. The purpose of this study is to develop an objective, theory-driven approach to evaluate the strengths and limitations of MDR. To accomplish this goal, we borrow concepts from ideal observer analysis used in visual perception to evaluate the theoretical limits of classifying and predicting discrete clinical endpoints using multilocus genotype data. We conclude that MDR ideally discriminates between low risk and high risk subjects using attributes constructed from multilocus genotype data. We also show that the classification approach used once a multilocus attribute is constructed is similar to that of a naive Bayes classifier. This study provides a theoretical foundation for the continued development, evaluation, and application of the MDR as a data mining tool in the domain of statistical genetics and genetic epidemiology.

19.

Background

Graves’ disease (GD) is a complex disease in which genetic predisposition is modified by environmental factors. Each gene exerts limited effects on the development of autoimmune disease (OR = 1.2–1.5). An epidemiological study revealed that nearly 70% of the risk of developing inherited autoimmunological thyroid diseases (AITD) is the result of gene interactions. In the present study, we analyzed the effects of the interactions of multiple loci on the genetic predisposition to GD. The aim of our analyses was to identify pairs of genes that exhibit a multiplicative interaction effect.

Material and Methods

A total of 709 patients with GD were included in the study. The patients were stratified into more homogeneous groups depending on the age at time of GD onset: younger patients less than 30 years of age and older patients greater than 30 years of age. Association analyses were performed for genes that influence the development of GD: HLADRB1, PTPN22, CTLA4 and TSHR. The interactions among polymorphisms were analyzed using the multiple logistic regression and multifactor dimensionality reduction (MDR) methods.

Results

GD patients stratified by the age of onset differed in the allele frequencies of the HLADRB1*03 and 1858T polymorphisms of the PTPN22 gene (OR = 1.7, p = 0.003; OR = 1.49, p = 0.01, respectively). We evaluated the genetic interactions of four SNPs in a pairwise fashion with regard to disease risk. The coexistence of HLADRB1 with CTLA4 or HLADRB1 with PTPN22 exhibited interactions on more than additive levels (OR = 3.64, p = 0.002; OR = 4.20, p < 0.001, respectively). These results suggest that interactions between these pairs of genes contribute to the development of GD. MDR analysis confirmed these interactions.

Conclusion

In contrast to a single gene effect, we observed that interactions between the HLADRB1/PTPN22 and HLADRB1/CTLA4 genes more closely predicted the risk of GD onset in young patients.

20.