Using sequence compression to speedup probabilistic profile matching |
| |
Authors: | Freschi Valerio Bogliolo Alessandro |
| |
Institution: | Information Science and Technology Institute, University of Urbino, 61029 Urbino, Italy. |
| |
Abstract: | MOTIVATION: Matching a biological sequence against a probabilistic pattern (or profile) is a common task in computational biology. A probabilistic profile, represented as a scoring matrix, is more suitable than a deterministic pattern to retain the peculiarities of a given segment of a family of biological sequences. Brute-force algorithms take O(NP) to match a sequence of N characters against a profile of length P < N. RESULTS: In this work, we exploit string compression techniques to speedup brute-force profile matching. We present two algorithms, based on run-length and LZ78 encodings, that reduce computational complexity by the compression factor of the encoding. |
| |
Keywords: | |
本文献已被 PubMed Oxford 等数据库收录! |
|