Using Markov model to improve word normalization algorithm for biological sequence comparison

Authors:	Qi Dai, Xiaoqing Liu, Yuhua Yao, Fukun Zhao

Institution:	(1) College of Life Sciences, Zhejiang Sci-Tech University, Hangzhou, 310018, People’s Republic of China; (2) School of Science, Hangzhou Dianzi University, Hangzhou, 310018, People’s Republic of China

Abstract:	Statistical measures for sequence comparison face two crucial problems: the overlapping structure of words and the background information of words in biological sequences. The word normalization in the improved composition vector method takes these problems into account and achieves better performance in evolutionary analysis. This word normalization is desirable but not sufficient, because it assumes that the four bases A, C, T, and G occur randomly with equal probability. This paper proposes an improved word normalization that uses a Markov model to estimate the exact k-word distribution from the observed biological sequence, and can therefore adjust for the background information in the k-word frequencies. The improved word normalization was tested in three experiments and compared with the existing word normalization. The experimental results confirm that the improved word normalization, which uses a Markov model to estimate the exact k-word distribution in biological sequences, is more efficient.
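
The contrast drawn in the abstract, between a background that treats the four bases as equally likely and one estimated from the sequence itself, can be illustrated with a small sketch. The abstract does not state the order of the Markov model or the normalization formula, so the first-order chain and the function names below (`kword_counts`, `markov1_word_prob`, `uniform_word_prob`) are assumptions made purely for illustration and should not be read as the authors' procedure.

```python
from collections import Counter


def kword_counts(seq, k):
    """Count overlapping k-words (substrings of length k) in seq."""
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def uniform_word_prob(word):
    """Background probability when A, C, G, T are assumed equally likely."""
    return 0.25 ** len(word)


def markov1_word_prob(seq, word):
    """Probability of `word` under a first-order Markov chain fitted to `seq`.

    Assumption: a first-order chain is used only as an example; the
    source abstract does not specify the model order.
    """
    base = kword_counts(seq, 1)
    pair = kword_counts(seq, 2)
    # Start with the observed frequency of the first base.
    prob = base[word[0]] / len(seq)
    for a, b in zip(word, word[1:]):
        # Transition estimate P(b | a) = N(ab) / sum over c of N(ac).
        out = sum(pair[a + c] for c in "ACGT")
        prob *= pair[a + b] / out if out else 0.0
    return prob


if __name__ == "__main__":
    seq = "ACGTACGTACGT"
    for word in ("ACG", "AAA"):
        print(word, uniform_word_prob(word), markov1_word_prob(seq, word))
```

On a strongly periodic toy sequence such as `ACGTACGTACGT`, the uniform background gives every 3-word the same probability, whereas the fitted chain assigns a high probability to `ACG` and zero to `AAA`. That is the kind of sequence-specific background adjustment the abstract refers to.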

This article is indexed in PubMed, SpringerLink, and other databases.