A stochastic context free grammar based framework for analysis of protein sequences |
| |
Authors: | Witold Dyrka, Jean-Christophe Nebel |
| |
Institution: | (1) Institute of Biomedical Engineering and Instrumentation, Wroclaw University of Technology, Wroclaw, Poland; (2) Faculty of Computing, Information Systems & Mathematics, Kingston University, London, UK |
| |
Abstract: | Background: In the last decade, formal language theory has found many applications in bioinformatics, such as RNA structure prediction
and the detection of patterns in DNA. In proteomics, however, the size of the protein alphabet and the complexity
of the relationships between amino acids have mainly limited the application of formal language theory to the production of grammars
whose expressive power is no higher than that of stochastic regular grammars. These grammars, like other state-of-the-art
methods, cannot cover higher-order dependencies such as the nested and crossing relationships that are common in proteins.
To overcome some of these limitations, we propose a Stochastic Context Free Grammar based framework for the analysis
of protein sequences in which grammars are induced using a genetic algorithm. |
| |
Keywords: | |
This article is indexed in SpringerLink and other databases. |
|