Protein sequences classification by means of feature extraction with substitution matrices
Authors:Rabie Saidi  Engelbert Mephu Nguifo
Institution:1.LIMOS - Blaise Pascal University,Clermont University,Clermont-Ferrand,France;2.LIMOS,CNRS UMR,Aubière,France;3.Department of Computer Science - FSJ,University of Jendouba,Jendouba,Tunisia;4.URPAH - FST,University of Tunis El Manar,Tunis,Tunisia;5.Department of Computer Science - FSG,University of Gafsa,Gafsa,Tunisia
Abstract:

Background  

This paper deals with the preprocessing of protein sequences for supervised classification. Motif extraction is one way to address this task: it has been widely used to encode biological sequences as feature vectors, the format required by well-known machine-learning classifiers. However, designing a suitable feature space for a set of proteins is not trivial. We therefore propose a novel encoding method that uses amino-acid substitution matrices to define similarity between motifs during the extraction step.
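The idea of merging similar motifs before building feature vectors can be illustrated with a minimal sketch. This is not the authors' algorithm: the score table `TOY_SCORES`, the threshold rule in `similar`, and the greedy choice of representative motifs in `encode` are all assumptions made for illustration, and the scores are invented values rather than entries from a real BLOSUM or PAM matrix.

```python
# Toy substitution scores: hypothetical values for illustration only,
# NOT taken from a real BLOSUM or PAM matrix.
TOY_SCORES = {
    ("A", "A"): 4, ("L", "L"): 4, ("I", "I"): 4, ("K", "K"): 5,
    ("L", "I"): 2, ("A", "L"): -1, ("A", "I"): -1,
    ("A", "K"): -1, ("L", "K"): -2, ("I", "K"): -3,
}


def score(a, b):
    """Symmetric lookup of the substitution score for two residues."""
    return TOY_SCORES.get((a, b), TOY_SCORES.get((b, a)))


def ngrams(seq, n):
    """All overlapping length-n substrings (candidate motifs) of seq."""
    return [seq[i:i + n] for i in range(len(seq) - n + 1)]


def motif_score(m1, m2):
    """Position-wise sum of substitution scores between two motifs."""
    return sum(score(a, b) for a, b in zip(m1, m2))


def similar(rep, motif, threshold):
    """Assumed rule: motif may stand in for rep if its score against rep
    reaches a given fraction of rep's self-score."""
    return motif_score(rep, motif) >= threshold * motif_score(rep, rep)


def encode(sequences, n=3, threshold=0.8):
    """Return (representative motifs, binary feature vectors)."""
    motifs = sorted({m for s in sequences for m in ngrams(s, n)})
    reps = []
    for m in motifs:
        # Greedy merge: keep m only if no earlier representative covers it.
        if not any(similar(r, m, threshold) for r in reps):
            reps.append(m)
    vectors = []
    for s in sequences:
        grams = set(ngrams(s, n))
        vectors.append(
            [int(any(similar(r, g, threshold) for g in grams)) for r in reps]
        )
    return reps, vectors


reps, vectors = encode(["ALK", "AIK"], n=3, threshold=0.8)
```

With these toy scores, `ALK` and `AIK` collapse onto a single representative motif at `threshold=0.8`, whereas `threshold=1.0` keeps them as two separate features. The sketch is meant to show how a substitution-based similarity can shrink the feature space relative to exact-match motif counting.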
This article is indexed in SpringerLink and other databases.