

Using Markov model to improve word normalization algorithm for biological sequence comparison
Authors: Qi Dai, Xiaoqing Liu, Yuhua Yao, Fukun Zhao
Institution: (1) College of Life Sciences, Zhejiang Sci-Tech University, Hangzhou, 310018, People’s Republic of China; (2) School of Science, Hangzhou Dianzi University, Hangzhou, 310018, People’s Republic of China
Abstract: Statistical measures for sequence comparison face two crucial problems: the overlapping structure of words and the background information of words in biological sequences. The word normalization used in the improved composition vector method takes both problems into account and achieves better performance in evolutionary analysis. This word normalization is desirable but not sufficient, because it assumes that the four bases A, C, T, and G occur randomly with equal probability. This paper proposes an improved word normalization that uses a Markov model to estimate the exact k-word distribution from the observed biological sequence, and can therefore adjust for the background information of the k-word frequencies. The improved word normalization was tested in three experiments and compared with the existing word normalization. The experimental results confirm that the improved word normalization, which uses a Markov model to estimate the exact k-word distribution in biological sequences, is more efficient.
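The contrast between an equal-chance background and a Markov background can be illustrated with a short Python sketch. It is not the authors' implementation: the abstract does not state the order of the Markov model or the exact normalization formula, so the first-order model and the function names below (`fit_first_order_markov`, `markov_word_prob`, `uniform_word_prob`) are assumptions made for illustration only.

```python
from collections import Counter

ALPHABET = "ACGT"


def fit_first_order_markov(seq):
    """Estimate base frequencies and first-order transition probabilities.

    Assumption: a first-order model; the abstract does not specify the order.
    Characters outside A, C, G, T are ignored.
    """
    base_counts = Counter(seq)
    total = sum(base_counts[b] for b in ALPHABET)
    init = {b: base_counts[b] / total for b in ALPHABET}

    pair_counts = Counter(zip(seq, seq[1:]))
    trans = {}
    for a in ALPHABET:
        row_total = sum(pair_counts[(a, b)] for b in ALPHABET)
        # If base `a` is never followed by another base, fall back to the
        # overall base frequencies so that every row still sums to 1.
        trans[a] = {
            b: (pair_counts[(a, b)] / row_total if row_total else init[b])
            for b in ALPHABET
        }
    return init, trans


def markov_word_prob(word, init, trans):
    """Probability of a k-word under the fitted first-order Markov model."""
    p = init[word[0]]
    for a, b in zip(word, word[1:]):
        p *= trans[a][b]
    return p


def uniform_word_prob(word):
    """Probability of a k-word when all four bases are equally likely."""
    return 0.25 ** len(word)


# Toy sequence, chosen only to make the difference visible.
seq = "ACGTACGTACGT"
init, trans = fit_first_order_markov(seq)
print(markov_word_prob("ACG", init, trans), uniform_word_prob("ACG"))
```

On this highly repetitive toy sequence the Markov estimate for the word ACG is far larger than the equal-chance estimate, which shows how the expected background of a k-word can depend on the observed sequence rather than on a fixed 4^-k value.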
This article is indexed in PubMed, SpringerLink, and other databases.