Permutation pattern discovery in biosequences.
Authors:Revital Eres  Gad M Landau  Laxmi Parida
Affiliation:Department of Computer Science, University of Haifa, Mount Carmel, Haifa 31905, Israel. revitale@cslx.haifa.ac.il
Abstract:Functionally related genes often appear in each other's neighborhood on the genome; however, the order of the genes may not be the same. These groups or clusters of genes may have an ancient evolutionary origin or may signify some other critical phenomenon, and they may also aid in function prediction of genes. Such gene clusters also help in solving the problem of local alignment of genes. Similarly, clusters of protein domains, albeit appearing in different orders in the protein sequence, suggest common functionality in spite of being nonhomologous. In this paper, we address the problem of automatically discovering clusters of entities, be they genes or domains: we formalize the abstract problem as a discovery problem called the πpattern problem and give an algorithm that automatically discovers the clusters of patterns in multiple data sequences. We take a model-less approach and introduce a notation for maximal patterns that drastically reduces the number of valid cluster patterns without any loss of information. We demonstrate the automatic pattern discovery tool on motifs in E. coli protein sequences.
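To make the problem setting concrete, the following is a minimal brute-force sketch of permutation pattern discovery, not the authors' algorithm. It assumes each input sequence is a list of gene or domain labels and treats a pattern as a group of `size` adjacent labels that recurs in at least `quorum` sequences regardless of order. The function name `find_permutation_patterns` and both parameters are hypothetical, and the sketch does not implement the maximal-pattern notation described in the abstract.

```python
from collections import defaultdict


def find_permutation_patterns(sequences, size, quorum):
    """Naive sketch: map each order-independent pattern to its occurrences.

    A pattern is the sorted tuple of `size` adjacent symbols, so windows that
    are permutations of one another share the same key. Only patterns that
    occur in at least `quorum` distinct sequences are returned.
    """
    occurrences = defaultdict(list)
    for seq_index, seq in enumerate(sequences):
        # Slide a fixed-size window over the sequence.
        for start in range(len(seq) - size + 1):
            key = tuple(sorted(seq[start:start + size]))
            occurrences[key].append((seq_index, start))
    return {
        key: locs
        for key, locs in occurrences.items()
        if len({seq_index for seq_index, _ in locs}) >= quorum
    }


# Toy data (invented for illustration): three "genomes" over gene labels.
genomes = [
    ["a", "b", "c", "d", "e"],
    ["x", "c", "a", "b", "y"],
    ["b", "a", "c", "z", "d"],
]
patterns = find_permutation_patterns(genomes, size=3, quorum=3)
print(patterns)
```

In this toy input the genes a, b, and c are adjacent in all three sequences but in a different order each time, so they are reported as a single cluster. A fixed window size is a simplification chosen for brevity; reporting every window size separately is exactly the kind of redundancy that a maximal-pattern notation is meant to avoid.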