Finding keywords for intergenic and gene regions for human genome |
| |
Authors: | Qiao Y H, Liu J L, Zhang C G, Zeng Yanjun |
| |
Affiliation: | Biomechanics and Medical Information Institute, Beijing University of Technology, Beijing, China. |
| |
Abstract: | The analysis of functionally related sequences for conserved patterns is important for further research on different functional regions. This paper presents an analysis of gene and intergenic sequences from the point of view of linguistic analysis, where gene and intergenic regions are regarded as two different subjects written in the four-letter alphabet [A, C, G, T] and high-frequency simple sequences are taken as keywords. A measurement alpha[l(tau)] was introduced to describe the relative repeat ratio of simple sequences. Cutoff values were found for keyword selection. After eliminating "noise," 87 short sequences were selected as keywords for intergenic regions and 76 for gene regions. |
| |
Keywords: | |
This article is indexed in PubMed and other databases. |
|
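The keyword idea described in the abstract (treating high-frequency short sequences as keywords and keeping those above a cutoff) can be illustrated with a minimal sketch. This is not the authors' method: the record does not define alpha[l(tau)] or give the cutoff values, so the sketch below substitutes a plain relative frequency of overlapping length-k substrings, and the function names `kmer_frequencies` and `select_keywords`, the toy sequence, and the cutoff of 0.3 are all hypothetical.

```python
from collections import Counter


def kmer_frequencies(sequence, k):
    """Relative frequency of each overlapping length-k substring.

    Assumption: plain relative frequency stands in for the paper's
    alpha[l(tau)] measure, which is not defined in this record.
    Substrings containing letters outside A, C, G, T are skipped.
    """
    sequence = sequence.upper()
    alphabet = set("ACGT")
    counts = Counter(
        sequence[i:i + k]
        for i in range(len(sequence) - k + 1)
        if set(sequence[i:i + k]) <= alphabet
    )
    total = sum(counts.values())
    if total == 0:
        return {}
    return {kmer: n / total for kmer, n in counts.items()}


def select_keywords(sequence, k, cutoff):
    """Return the k-mers whose relative frequency is at least `cutoff`.

    Assumption: the cutoff is an arbitrary illustrative value, not one
    of the cutoff values reported by the authors.
    """
    freqs = kmer_frequencies(sequence, k)
    return sorted(kmer for kmer, f in freqs.items() if f >= cutoff)


# Toy input, not real genomic data.
toy_sequence = "ATATATATGCGCATAT"
print(select_keywords(toy_sequence, k=2, cutoff=0.3))
```

Running the same selection separately on gene and intergenic sequences would give two keyword lists that can be compared, which is the kind of contrast the abstract describes.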