
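Research on a Data Mining Algorithm Based on DNA Sequences
Cite this article: YUE Xiao-ning, JING Yuan-wei. Research Based on the Algorithm of DNA Sequences Data Mining[J]. Journal of Biomathematics (生物数学学报), 2009, 24(2): 363-368
Authors: YUE Xiao-ning  JING Yuan-wei
Affiliations: YUE Xiao-ning (School of Science, Shenyang University, Shenyang, Liaoning 110044); JING Yuan-wei (School of Information Science and Engineering, Northeastern University, Shenyang, Liaoning 110004)
Abstract: Data mining techniques are introduced to study the intrinsic regularities of DNA sequence data, and an algorithm for the DNA sequence classification problem is given. Taking into account both the occurrence probability of base triplets and the relationship between adjacent amino acids, features of DNA sequence fragments of known class are extracted from the codon distribution density within each fragment. Stepwise discriminant analysis is then applied to remove variables without significant discriminating power, yielding discriminant functions for DNA sequence classification. Simulation results show that the algorithm has the advantages of simple classification formulas and high classification accuracy.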

Keywords: DNA sequence  codon  discriminant function  data mining  frequency

Research Based on the Algorithm of DNA Sequences Data Mining
Affiliation: YUE Xiao-ning (1), JING Yuan-wei (2) (1 School of Science, Shenyang University, Shenyang, Liaoning 110044, China; 2 School of Information Science and Engineering, Northeastern University, Shenyang, Liaoning 110004, China)
Abstract: Using data mining technology, the inherent regularity of DNA sequence data was investigated and an algorithm for DNA sequence classification was given. Based on the occurrence probability of tri-base forms and the relationship between adjacent amino acids, and from the viewpoint of codon distribution density within each DNA sequence segment, the features of DNA sequence segments of known category were extracted. Using stepwise discriminant analysis, variables with insignificant discriminating power were removed from the mathematical model, and discriminant functions for DNA sequence classification were established. The simulation results show that this algorithm is simple in structure and gives a precise classification result.
Keywords:DNA sequence  Codon  discriminant function  Data Mining  Frequency
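
The codon distribution density mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not give the paper's exact feature definition, so the following assumes non-overlapping triplets read from a fixed starting position and a simple relative frequency over the 64 possible codons. The names `CODONS` and `codon_frequencies` are hypothetical, and the stepwise discriminant analysis step is not shown.

```python
from collections import Counter
from itertools import product

# All 64 possible codons over the DNA alphabet, in a fixed order.
CODONS = ["".join(p) for p in product("ACGT", repeat=3)]


def codon_frequencies(seq, frame=0):
    """Return the relative frequency of each codon in a DNA fragment.

    Assumption (not from the paper): codons are non-overlapping triplets
    read starting at offset `frame`; triplets containing characters other
    than A, C, G, T are skipped.
    """
    seq = seq.upper()
    # Split into complete, non-overlapping triplets.
    triplets = [seq[i:i + 3] for i in range(frame, len(seq) - 2, 3)]
    triplets = [t for t in triplets if set(t) <= set("ACGT")]
    total = len(triplets)
    if total == 0:
        return {c: 0.0 for c in CODONS}
    counts = Counter(triplets)
    return {c: counts[c] / total for c in CODONS}


if __name__ == "__main__":
    freqs = codon_frequencies("ATGATGCCC")
    # Show only the codons that actually occur in the fragment.
    print({c: f for c, f in freqs.items() if f > 0})
```

A 64-dimensional vector of this kind, computed for fragments of known class, is one plausible input to a discriminant analysis that then discards the codons with little discriminating power.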
This article is indexed in VIP (维普), Wanfang Data (万方数据), and other databases.