Summarizing techniques that combine three non-parametric scores to detect disease-associated 2-way SNP-SNP interactions |
| |
Authors: | Amrita Sengupta Chattopadhyay, Ching-Lin Hsiao, Chien Ching Chang, Ie-Bin Lian, Cathy SJ Fann |
| |
Institution: | 1. Bioinformatics Program, Taiwan International Graduate Program, Academia Sinica, Taipei, Taiwan; 2. Institute of Biomedical Informatics, National Yang-Ming University, Taipei, Taiwan; 3. Institute of Public Health, National Yang-Ming University, Taipei, Taiwan; 4. Institute of Information Science, Academia Sinica, Taipei, Taiwan; 5. Institute of Biomedical Sciences, Academia Sinica, Taipei, Taiwan; 6. Department of Mathematics, National Changhua University of Education, Changhua, Taiwan |
| |
Abstract: | Identifying susceptibility genes that influence complex diseases is extremely difficult because loci often influence the disease state through genetic interactions. Numerous approaches to detecting disease-associated SNP-SNP interactions have been developed, but none consistently generates high-quality results under different disease scenarios. Using summarizing techniques to combine a number of existing methods may provide a solution to this problem. Here we used three popular non-parametric methods (Gini, absolute probability difference (APD), and entropy) to develop two novel summary scores, namely the principal component score (PCS) and the Z-sum score (ZSS), with which to predict disease-associated genetic interactions. We used a simulation study to compare the performance of the non-parametric scores, the summary scores, the scaled-sum score (SSS; used in polymorphism interaction analysis (PIA)), and multifactor dimensionality reduction (MDR). The non-parametric methods achieved high power, but no single non-parametric method outperformed all others across the variety of epistatic scenarios. PCS and ZSS, however, outperformed MDR. PCS, ZSS, and SSS kept type I error rates controlled (< 0.05), whereas the Gini score (GS), APD score (APDS), and entropy score (ES) did not (> 0.05). A real-data study using the Genetic Analysis Workshop 16 (GAW16) rheumatoid arthritis dataset identified a number of interesting SNP-SNP interactions. |
| |
Keywords: | APD, absolute probability difference; APDS, APD score; BDNF, brain-derived neurotrophic factor; C5, complement component 5; CART, classification and regression trees; CASP9, caspase 9; CCP, cyclic citrullinated peptide; CV, cross-validation; ES, entropy score; GAW16, Genetic Analysis Workshop 16; GS, Gini score; GWAS, genome-wide association study; HLA, human leukocyte antigens; HLA-DQB1, major histocompatibility complex class II DQ beta 1; HLA-DRB1, major histocompatibility complex class II DR beta 1; KEGG, Kyoto Encyclopedia of Genes and Genomes; LD, linkage disequilibrium; MAF, minor allele frequency; Max, maximum; MDR, multifactor dimensionality reduction; NARAC, North American Rheumatoid Arthritis Consortium; NN, neural networks; NTRK2, neurotrophic tyrosine kinase receptor type 2; PC1, principal component 1; PCS, principal component score; PIA, polymorphism interaction analysis; PTPN22, protein tyrosine phosphatase non-receptor type 22 (lymphoid); QC, quality control; RA, rheumatoid arthritis; RASSUN, RAnked Summarized Scores Using Non-parametric-methods; SNP, single-nucleotide polymorphism; SS, scaled score; SSS, sum of scaled scores; Std Dev, standard deviation; TRAF1, TNF-receptor-associated factor 1; ZS, Z-score; ZSS, Z-sum score |
This article is indexed in ScienceDirect and other databases. |
|
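The abstract describes combining three non-parametric scores per SNP pair into two summary scores, ZSS and PCS. The record does not give the exact formulas, so the following is only a minimal sketch under stated assumptions: each score is standardized across SNP pairs, ZSS is taken as the sum of the three Z-scores, and PCS as the projection onto the first principal component (PC1) of the standardized scores. The function names, the toy values, and the assumption that larger scores mean stronger association are all hypothetical and are not taken from the paper.

```python
import numpy as np


def standardize(scores):
    """Column-wise Z-scores (assumed standardization; sample std, ddof=1)."""
    scores = np.asarray(scores, dtype=float)
    return (scores - scores.mean(axis=0)) / scores.std(axis=0, ddof=1)


def z_sum_score(scores):
    """Assumed ZSS: sum of the per-method Z-scores for each SNP pair."""
    return standardize(scores).sum(axis=1)


def pc1_score(scores):
    """Assumed PCS: projection of standardized scores onto PC1."""
    z = standardize(scores)
    # Rows of vt are the principal axes of the standardized score matrix.
    _, _, vt = np.linalg.svd(z, full_matrices=False)
    loadings = vt[0]
    # The sign of a principal axis is arbitrary; orient it so that
    # larger input scores give a larger summary score.
    if loadings.sum() < 0:
        loadings = -loadings
    return z @ loadings


# Hypothetical toy data: rows are SNP pairs, columns are (GS, APDS, ES),
# all assumed to be oriented so that larger means stronger association.
toy_scores = np.array([
    [0.10, 0.05, 0.02],
    [0.40, 0.30, 0.25],
    [0.20, 0.15, 0.08],
    [0.05, 0.02, 0.01],
])

zss = z_sum_score(toy_scores)
pcs = pc1_score(toy_scores)
ranking = np.argsort(-zss)  # SNP-pair indices, strongest ZSS first
```

Sorting SNP pairs by either summary score gives a ranked candidate list. How significance thresholds are set (for example, by permutation) is not stated in this record and is left out of the sketch.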