JUZBOX: A web server for extracting biomedical words from the protein sequence |
| |
Authors: | Paul Bobby Seetharaman Balaji Variath Sathyanath Santhosh J Eapen |
| |
Institution: | Indian Institute of Spices Research, Calicut, Kerala, India |
| |
Abstract: | The recognition of gene/protein names in the literature is one of the pivotal steps in processing biological literature for information extraction
or data mining. We have compiled a lexicon of biomedical words (conserved patterns/potential motifs) that are spelled using only the 20
letters that denote amino acids. The remaining 6 letters of the English alphabet (B, J, O, U, X, Z) are treated as invalid amino acid characters (in our
context). For convenience, we have jumbled these 6 letters into the term ‘JUZBOX’, and words containing any of these characters were filtered out of the
biomedical lexicon. The generation of biomedical words from protein sequences using JUZBOX has applications in
functional annotation. Availability: JUZBOX is freely available at http://www.spices.res.in/juzbox |
| |
Keywords: | JUZBOX, biomedical words, lexicon |
|
|
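The filtering idea in the abstract can be illustrated with a minimal Python sketch. It is not the JUZBOX server's actual implementation, which the abstract does not describe. The function names `filter_lexicon` and `find_words`, the `min_length` parameter, the toy word list, and the example sequence are all hypothetical and chosen only for illustration. The sketch assumes that a lexicon word qualifies when it contains none of the letters B, J, O, U, X, Z, and that word extraction is a plain substring search over the protein sequence.

```python
# Letters that are not among the 20 standard one-letter amino acid codes.
INVALID_LETTERS = set("JUZBOX")


def filter_lexicon(words):
    """Keep only alphabetic words that contain none of the six invalid letters."""
    kept = []
    for word in words:
        upper = word.upper()
        if upper.isascii() and upper.isalpha() and not (set(upper) & INVALID_LETTERS):
            kept.append(upper)
    return kept


def find_words(sequence, lexicon, min_length=3):
    """Return sorted (position, word) pairs for lexicon words found in the sequence.

    min_length is an assumed cutoff to skip very short, uninformative words.
    """
    sequence = sequence.upper()
    hits = []
    for word in lexicon:
        if len(word) < min_length:
            continue
        start = sequence.find(word)
        while start != -1:
            hits.append((start, word))
            start = sequence.find(word, start + 1)
    return sorted(hits)


# Toy data (hypothetical): "receptor" and "box" are dropped because they contain O, B or X.
toy_lexicon = filter_lexicon(["kinase", "actin", "receptor", "ligand", "box"])
toy_hits = find_words("MKINASEGGACTINLLIGAND", toy_lexicon)
print(toy_lexicon)
print(toy_hits)
```

Positions in the returned pairs are zero-based offsets into the sequence. A real lexicon would be far larger, so a production version would likely use an indexed multi-pattern search rather than repeated calls to `str.find`.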