Predicting Viral Protein Subcellular Localization with Chou's Pseudo Amino Acid Composition and Imbalance-Weighted Multi-Label K-Nearest Neighbor Algorithm |
| |
Authors: | Cao Jun-Zhe, Liu Wen-Qi, Gu Hong |
| |
Affiliation: | School of Control Science and Engineering, Dalian University of Technology, No.2 Ling-gong Road, Dalian, Liaoning, China. guhong@dlut.edu.cn. |
| |
Abstract: | Machine learning is a reliable technology for automated subcellular localization of viral proteins within a host cell or virus-infected cell. One challenge is that viral protein samples not only have multiple location sites but are also class-imbalanced. An imbalanced dataset often degrades prediction performance. To address this challenge, this paper proposes a novel approach, named imbalance-weighted multi-label K-nearest neighbor, to predict the subcellular locations of viral proteins with multiple sites. Experimental results from the jackknife test indicate that the presented algorithm achieves better performance than existing methods and has great potential in protein science. |
| |
Keywords: | |
This article is indexed in PubMed and other databases. |
|