Pattern recognition in DNA sequences and its application to consensus foot-printing |
| |
Authors: | Lefevre Christophe; Ikeda Joh-E |
| |
Institution: | Genosphere Project, ERATO, JRDC, Tokai University School of Medicine, Isehara, Kanagawa 259-11, Japan |
| |
Abstract: | We consider the problem of comparing several nucleic acid sequences to identify words occurring imperfectly (patterns with no gap) with unusual frequency. Methods for computing, representing, and interactively inspecting the structure of such repeating motifs in nucleic acids, and more generally in any text, are described. Multiple sequences are treated as one large concatenate. In a preprocessing step, a lexical index is created to provide rapid string matching for the enumeration of the words matching a pattern. For given word features (word length, minimal frequency), a sequence profile is displayed. The profile can be inspected interactively with on-line algorithms. Applications to the identification of regulatory elements in DNA regions involved in the control of gene expression are presented. Our program (DNA-Lexemics) runs on the Macintosh. |
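
The indexing and enumeration steps summarized in the abstract can be illustrated with a minimal Python sketch. It is not the DNA-Lexemics implementation, whose data structures the abstract does not specify. It assumes a plain dictionary of fixed-length words as the "lexical index", substitutions only (no gaps) as the meaning of "imperfect" occurrence, and a `#` separator between concatenated sequences. The function names `build_index`, `imperfect_matches`, and `frequent_patterns` are hypothetical.

```python
from collections import defaultdict


def build_index(sequences, k, sep="#"):
    """Concatenate the sequences and map each k-letter word to its start positions.

    Assumption: `sep` does not occur in the sequences; words spanning a
    sequence boundary (containing `sep`) are skipped.
    """
    text = sep.join(sequences)
    index = defaultdict(list)
    for i in range(len(text) - k + 1):
        word = text[i:i + k]
        if sep not in word:
            index[word].append(i)
    return text, index


def hamming(a, b):
    """Number of mismatching positions between two equal-length words."""
    return sum(x != y for x, y in zip(a, b))


def imperfect_matches(index, pattern, max_mismatch):
    """Positions of all indexed words within `max_mismatch` substitutions of `pattern`."""
    hits = []
    for word, positions in index.items():
        if hamming(word, pattern) <= max_mismatch:
            hits.extend(positions)
    return sorted(hits)


def frequent_patterns(index, max_mismatch, min_freq):
    """Words whose imperfect occurrence count reaches `min_freq`."""
    result = {}
    for word in index:
        count = len(imperfect_matches(index, word, max_mismatch))
        if count >= min_freq:
            result[word] = count
    return result


text, index = build_index(["ACGTACGA", "TTACGTCC"], k=4)
print(imperfect_matches(index, "ACGT", max_mismatch=1))  # [0, 4, 11]
print(frequent_patterns(index, max_mismatch=1, min_freq=3))
```

This sketch compares every pattern against every distinct word, which is quadratic in the number of distinct words. A real tool would likely use a more refined index, but the abstract gives no detail on that point.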
| |
Keywords: | |
This article has been indexed by Oxford and other databases. |
|