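Analysis of High-dimensional SNPs Data Based on Latent Class Model of Bayesian Network
Cite this article: MA Jing, ZHANG Shao-kai, ZHANG Yan-bo. Analysis of High-dimensional SNPs Data Based on Latent Class Model of Bayesian Network[J]. 生物信息学 (China Journal of Bioinformation), 2012, 10(2): 120-124.
Authors: MA Jing, ZHANG Shao-kai, ZHANG Yan-bo
Affiliation: Department of Health Statistics, Shanxi Medical University, Taiyuan, Shanxi 030001
Funding: Supported by the National Natural Science Foundation of China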
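Abstract: A latent class model based on a Bayesian network was used to analyze the high-dimensional SNP data of GAW17, in order to offer new methodological support for research on the genetics of complex-trait diseases and on gene mapping. GAW17 provides more than ten thousand SNPs on the 22 autosomes of 697 individuals; from these, 29 SNPs in 12 genes on chromosome 1 were randomly chosen as the study object. Following the principle that the cumulative information contribution rate should reach 95%, the Bayesian network latent variable model selected 15 SNP loci with high mutual information with X0, including C1S11408, C1S3201 and C1S1786, and these were used to classify and interpret the study population. The 697 individuals were divided into 2 latent classes, with class probabilities of 0.68 and 0.32. The disease distribution differed between the two classes: the prevalence in the second class (38.64%) was clearly higher than in the first class (25.99%) (χ² = 11.46, P = 0.001). The authors conclude that the difference in prevalence between the two classes is caused by the 15 selected SNPs, so there is reason to regard these SNPs as suspected disease loci, which gives a clear direction for further research.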
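
The selection rule mentioned in the abstract (rank SNPs by mutual information with the latent variable, then keep the top-ranked ones until their cumulative share reaches 95%) can be illustrated with a minimal sketch. The abstract does not describe the Bayesian network procedure itself, so everything here is an assumption for illustration: the function names, the use of empirical mutual information in nats, and the toy genotype vectors `snp_a`, `snp_b`, `snp_c`, which are made up and are not GAW17 data.

```python
import math
from collections import Counter


def mutual_information(xs, ys):
    """Empirical mutual information (in nats) between two discrete sequences."""
    n = len(xs)
    px, py, pxy = Counter(xs), Counter(ys), Counter(zip(xs, ys))
    # Sum p(x, y) * log(p(x, y) / (p(x) * p(y))) over observed pairs.
    return sum((c / n) * math.log(c * n / (px[x] * py[y]))
               for (x, y), c in pxy.items())


def select_by_cumulative_mi(snps, latent, threshold=0.95):
    """Keep the highest-MI SNPs until their share of total MI reaches threshold."""
    scores = {name: mutual_information(g, latent) for name, g in snps.items()}
    total = sum(scores.values())
    selected, running = [], 0.0
    if total == 0:
        return selected, scores  # nothing is informative
    for name in sorted(scores, key=scores.get, reverse=True):
        selected.append(name)
        running += scores[name]
        if running / total >= threshold:
            break
    return selected, scores


# Hypothetical toy data: 8 individuals, latent class 0 or 1 (NOT from the paper).
toy_latent = [0, 0, 0, 0, 1, 1, 1, 1]
toy_snps = {
    "snp_a": [0, 0, 0, 0, 1, 1, 1, 1],  # perfectly aligned with the class
    "snp_b": [0, 0, 0, 1, 1, 1, 1, 0],  # partly aligned
    "snp_c": [0, 1, 0, 1, 0, 1, 0, 1],  # independent of the class
}

selected, scores = select_by_cumulative_mi(toy_snps, toy_latent)
print(selected)
```

In this toy case the uninformative `snp_c` contributes nothing to the total, so it is the one left out once the threshold is met.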

Keywords: Bayesian network; latent variable model; single-nucleotide polymorphism; latent class analysis

Analysis of High-dimensional SNPs Data Based on Latent Class Model of Bayesian Network
MA Jing, ZHANG Shao-kai, ZHANG Yan-bo. Analysis of High-dimensional SNPs Data Based on Latent Class Model of Bayesian Network[J]. China Journal of Bioinformation, 2012, 10(2): 120-124.
Authors: MA Jing, ZHANG Shao-kai, ZHANG Yan-bo
Institution: Department of Health Statistics, School of Public Health, Shanxi Medical University, Taiyuan 030001
Abstract: A latent class model based on a Bayesian network was used to analyze the high-dimensional SNP data of GAW17, with the aim of providing a new method for studying the heredity of complex diseases and for gene mapping. The GAW17 data set covers 697 individuals and includes tens of thousands of SNPs on the 22 autosomes. From these, 29 SNPs located in 12 genes on chromosome 1 were randomly chosen as the study object. Following the principle that the cumulative information contribution rate should reach 95%, the model selected 15 SNPs with high mutual information with X0, including C1S11408, C1S3201 and C1S1786, and these were used to classify the study population and interpret the classes. The 697 individuals were divided into 2 latent classes, with probabilities of 0.68 and 0.32, respectively. The disease distribution of the two classes was not the same: the prevalence in the second class (38.64%) was higher than in the first class (25.99%), and the difference was statistically significant (χ² = 11.46, P = 0.001). We conclude that this difference is caused by the 15 selected SNPs, and we therefore have reason to regard these SNPs as suspected disease loci, which provides a clear direction for further research.
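
The prevalence comparison can be checked with a Pearson chi-square test on a 2×2 table. The abstract reports only percentages, so the counts below (124 cases among 477 individuals in class 1, 85 among 220 in class 2) are back-calculated assumptions, chosen to be roughly consistent with 697 individuals, class shares of about 0.68 and 0.32, and the stated prevalences; they are not taken from the paper. The sketch also assumes no continuity correction was applied.

```python
import math


def pearson_chi2_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for the table [[a, b], [c, d]]."""
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # With 1 degree of freedom, the upper-tail probability is erfc(sqrt(chi2 / 2)).
    p = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p


# ASSUMED counts, inferred from the reported percentages (not given in the abstract).
class1_cases, class1_total = 124, 477
class2_cases, class2_total = 85, 220

chi2, p = pearson_chi2_2x2(
    class1_cases, class1_total - class1_cases,
    class2_cases, class2_total - class2_cases,
)
print(f"chi2 = {chi2:.2f}, p = {p:.4f}")
```

Under these assumed counts the statistic comes out at about 11.46, matching the reported value, which suggests the published test was an uncorrected Pearson chi-square with one degree of freedom.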
Keywords:Bayesian network  Latent variable model  Single-Nucleotide Polymorphisms  Latent class analysis
This article is indexed in CNKI, VIP, Wanfang Data and other databases.