

Automated Alphabet Reduction for Protein Datasets
Authors: Jaume Bacardit, Michael Stout, Jonathan D. Hirst, Alfonso Valencia, Robert E. Smith and Natalio Krasnogor
Institution: (1) ASAP research group, School of Computer Science, University of Nottingham, Jubilee Campus, Wollaton Road, Nottingham, NG8 1BB, UK; (2) MYCIB, School of Biosciences, University of Nottingham, Sutton Bonington, LE12 5RD, UK; (3) School of Chemistry, University of Nottingham, University Park, Nottingham, NG7 2RD, UK; (4) Spanish National Cancer Research Centre, Melchor Fdez Almagro, 3., 28029 Madrid, Spain; (5) Dept. of Computer Science, University College London, Gower Street, London, WC1E 6BT, UK
Abstract:

Background  

We investigate automated and generic alphabet reduction techniques for protein structure prediction datasets. Reducing alphabet cardinality without losing key biochemical information opens the door to potentially faster machine learning, data mining and optimization applications in structural bioinformatics. Furthermore, reduced but informative alphabets often result in, e.g., more compact and human-friendly classification/clustering rules. In this paper we propose a robust and sophisticated alphabet reduction protocol based on mutual information and state-of-the-art optimization techniques.
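As a rough illustration of what scoring a reduced alphabet with mutual information could look like, the Python sketch below maps amino acid letters to group indices and estimates the mutual information between the reduced symbols and a per-residue class label. The abstract does not specify the protocol's details, so the function names (`reduce_sequence`, `mutual_information`), the grouping, and the toy labels are all hypothetical, and the optimization step that searches over groupings is not shown.

```python
from collections import Counter
from math import log2


def reduce_sequence(seq, groups):
    """Map each residue letter to the index of the group that contains it.

    `groups` is a hypothetical reduced alphabet, e.g. ["AVL", "KDE"].
    """
    letter_to_group = {aa: i for i, group in enumerate(groups) for aa in group}
    return [letter_to_group[aa] for aa in seq]


def mutual_information(xs, ys):
    """Empirical mutual information (in bits) between two equal-length sequences."""
    n = len(xs)
    count_x = Counter(xs)
    count_y = Counter(ys)
    count_xy = Counter(zip(xs, ys))
    # I(X;Y) = sum over (x, y) of p(x,y) * log2( p(x,y) / (p(x) * p(y)) )
    return sum(
        (c / n) * log2(c * n / (count_x[x] * count_y[y]))
        for (x, y), c in count_xy.items()
    )


# Toy data (invented for illustration): residues with a binary structural label.
residues = "AVLKDEAVKD"
labels = [1, 1, 1, 0, 0, 0, 1, 1, 0, 0]

full_score = mutual_information(list(residues), labels)
reduced = reduce_sequence(residues, ["AVL", "KDE"])
reduced_score = mutual_information(reduced, labels)
```

A grouping that merges letters can never raise this score above that of the full alphabet, so a search over groupings would typically look for the smallest alphabet whose score stays close to the original.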
Keywords:
This article is indexed in SpringerLink and other databases.