Di-codon usage for classification of genes |
| |
Authors: | Minh N Nguyen, Jianmin Ma, Gary B Fogel, Jagath C Rajapakse |
| |
Institution: | 1. BioInformatics Institute, Singapore; 2. Natural Selection Inc., San Diego, USA; 3. BioInformatics Research Centre, Nanyang Technological University, Singapore; 4. Singapore-MIT Alliance, Singapore; 5. Department of Biological Engineering, Massachusetts Institute of Technology, USA |
| |
Abstract: | Genes are often classified into biologically related groups so that inferences about their functions can be made. This paper demonstrates that di-codon usage is a useful feature for gene classification and gives better classification accuracy than codon usage. Our experiments with different classifiers show that support vector machines perform better than other classifiers when genes are classified using di-codon usage as features. The method is illustrated on 1841 HLA sequences, which are classified into two major classes, HLA-I and HLA-II, and then further into the subclasses of each major class. By using both codon and di-codon features, we show near-perfect accuracies in the classification of HLA molecules into major classes and their subclasses. |
| |
Keywords: | Di-codon usage; Gene classification; Human leukocyte antigen (HLA); Major histocompatibility complex (MHC); Support vector machines (SVM) |
This article is indexed in ScienceDirect and other databases. |
|
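The abstract names di-codon usage as the classification feature but does not spell out how it is computed. The sketch below is one plausible reading, not the authors' implementation: it assumes a di-codon is a pair of adjacent in-frame codons (6 nucleotides, 4096 possible pairs), that the window advances one codon at a time, and that counts are normalised to relative frequencies. The function name `dicodon_usage` and these conventions are illustrative assumptions.

```python
from collections import Counter
from itertools import product

BASES = "ACGT"
# All 64 codons and all 64 * 64 = 4096 di-codons, in a fixed order.
CODONS = ["".join(p) for p in product(BASES, repeat=3)]
DICODONS = [a + b for a in CODONS for b in CODONS]


def dicodon_usage(seq):
    """Return a 4096-element relative-frequency vector of di-codons.

    Assumptions (not taken from the paper): the sequence is read in
    frame from position 0, the window slides by one codon, and pairs
    containing characters other than A, C, G, T are skipped.
    """
    seq = seq.upper()
    usable = len(seq) - len(seq) % 3  # drop a trailing partial codon
    codons = [seq[i:i + 3] for i in range(0, usable, 3)]
    pairs = [a + b for a, b in zip(codons, codons[1:])]
    counts = Counter(p for p in pairs if set(p) <= set(BASES))
    total = sum(counts.values())
    return [counts[d] / total if total else 0.0 for d in DICODONS]


features = dicodon_usage("ATGGCCATGGCC")
```

A vector like `features` could then be passed, one row per gene, to any standard classifier such as a support vector machine; codon usage can be computed the same way over single codons and concatenated if both feature sets are wanted.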