Statistical distance between texts and filtration methods in sequence comparison

Authors:	Pevzner Pavel A

Institution:	Department of Mathematics, University of Southern California, Los Angeles, CA 90089-1113, USA, and Laboratory of Mathematical Methods, Institute of Genetics of Microorganisms, Moscow 113545, USSR

Abstract:	Upon searching local similarities in long sequences, the necessity of a 'rapid' similarity search becomes acute. Quadratic complexity of dynamic programming algorithms forces the employment of filtration methods that allow elimination of the sequences with a low similarity level. The paper is devoted to the theoretical substantiations of the filtration method based on the statistical distance between texts. The notion of the filtration efficiency is introduced and the efficiency of several filters is estimated. It is shown that the efficiency of the statistical l-tuple filtration upon DNA database search is associated with a potential extension of the original four-letter alphabet and grows exponentially with increasing l. The formula that allows one to estimate the filtration parameters is presented.
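
The abstract does not give the formula for the statistical distance, so the following is only a minimal sketch of how an l-tuple filter of this kind might look. It assumes the distance is the sum of absolute differences between the overlapping l-tuple counts of two texts. The function names and the `threshold` parameter are hypothetical and are not taken from the paper.

```python
from collections import Counter


def ltuple_counts(seq: str, l: int) -> Counter:
    """Count every overlapping l-tuple (substring of length l) in seq."""
    return Counter(seq[i:i + l] for i in range(len(seq) - l + 1))


def statistical_distance(a: str, b: str, l: int) -> int:
    """Assumed definition: sum over all l-tuples of |count in a - count in b|."""
    counts_a = ltuple_counts(a, l)
    counts_b = ltuple_counts(b, l)
    # Counter returns 0 for a tuple that is missing from one of the texts.
    return sum(abs(counts_a[t] - counts_b[t])
               for t in counts_a.keys() | counts_b.keys())


def passes_filter(query: str, window: str, l: int, threshold: int) -> bool:
    """Keep a window for the slower dynamic programming step only if it is close enough."""
    return statistical_distance(query, window, l) <= threshold
```

In such a scheme, only database windows for which `passes_filter` returns `True` would be handed to the quadratic dynamic programming comparison. Over the four-letter DNA alphabet there are `4 ** l` possible l-tuples, so the count vectors being compared become rapidly more discriminating as l grows.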

This article is indexed in Oxford and other databases.