Automated Alphabet Reduction for Protein Datasets

Authors:	Jaume Bacardit, Michael Stout, Jonathan D. Hirst, Alfonso Valencia, Robert E. Smith and Natalio Krasnogor

Institution:	(1) ASAP research group, School of Computer Science, University of Nottingham, Jubilee Campus, Wollaton Road, Nottingham, NG8 1BB, UK; (2) MYCIB, School of Biosciences, University of Nottingham, Sutton Bonington, LE12 5RD, UK; (3) School of Chemistry, University of Nottingham, University Park, Nottingham, NG7 2RD, UK; (4) Spanish National Cancer Research Centre, Melchor Fdez Almagro, 3., 28029 Madrid, Spain; (5) Dept. of Computer Science, University College London, Gower Street, London, WC1E 6BT, UK

Abstract:	Background: We investigate automated and generic alphabet reduction techniques for protein structure prediction datasets. Reducing alphabet cardinality without losing key biochemical information opens the door to potentially faster machine learning, data mining and optimization applications in structural bioinformatics. Furthermore, reduced but informative alphabets often result in, e.g., more compact and human-friendly classification/clustering rules. In this paper we propose a robust and sophisticated alphabet reduction protocol based on mutual information and state-of-the-art optimization techniques.
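
The abstract gives no details of the protocol, so the following is only a minimal sketch of the general idea of scoring a candidate reduced alphabet by mutual information, not the authors' method. It assumes a plain empirical estimate of I(X;Y) between per-residue letters and a class label. The function names, the toy residues, the binary labels and the two-letter grouping are all hypothetical, chosen for illustration only.

```python
from collections import Counter
from math import log2


def mutual_information(xs, ys):
    """Empirical mutual information I(X;Y), in bits, of two paired sequences."""
    n = len(xs)
    joint = Counter(zip(xs, ys))  # counts of (x, y) pairs
    px = Counter(xs)              # marginal counts of x
    py = Counter(ys)              # marginal counts of y
    # p(x,y) * log2( p(x,y) / (p(x) p(y)) ), written with raw counts
    return sum(
        (c / n) * log2(c * n / (px[x] * py[y]))
        for (x, y), c in joint.items()
    )


def reduce_alphabet(residues, grouping):
    """Replace each amino-acid letter by the label of its group."""
    return [grouping[r] for r in residues]


# Hypothetical toy data: residues paired with a made-up binary class label.
residues = list("AVLIKRDEAVKD")
labels = [1, 1, 1, 1, 0, 0, 0, 0, 1, 1, 0, 0]

# Hypothetical two-letter grouping: hydrophobic ("h") vs. charged ("c").
grouping = {"A": "h", "V": "h", "L": "h", "I": "h",
            "K": "c", "R": "c", "D": "c", "E": "c"}

mi_full = mutual_information(residues, labels)
mi_reduced = mutual_information(reduce_alphabet(residues, grouping), labels)
print(f"full alphabet: {mi_full:.3f} bits, reduced alphabet: {mi_reduced:.3f} bits")
```

In this toy example the label is fully determined by the group, so the reduced alphabet keeps all of the information the full alphabet carries about the label. An optimization procedure of the kind the abstract mentions would search over candidate groupings for one that keeps this score high while using few letters.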

Keywords:
This article is indexed in SpringerLink and other databases.