Predicting subcellular localization of multi-label proteins by incorporating the sequence features into Chou's PseAAC |
| |
Authors: | Faisal Javed, Maqsood Hayat |
| |
Affiliation: | Department of Computer Science, Abdul Wali Khan University Mardan, Pakistan |
| |
Abstract: | The emergence of numerous genome projects has made experimental classification of protein localization almost impossible because of the exponential increase in the number of protein samples. However, most existing applications were developed only for single-plex proteins and completely ignore the presence of one protein at two or more locations in a cell. Only a few attempts have been made to target multi-label protein localization, and the accuracies they achieve are unsatisfactory. This paper presents a novel approach in which a discrete feature extraction method is fused with physicochemical properties of amino acids by using Chou's general form of Pseudo Amino Acid Composition. The technique is tested on two benchmark datasets, namely Gpos-mPLoc and Virus-mPLoc. The empirical results demonstrate that the proposed method yields better results with the two examined classifiers, i.e., ML-KNN and Rank-SVM. The proposed model improves on all performance measures considered in the comparison. |
| |
Keywords: | Split Amino Acid Composition; Pseudo Amino Acid Composition; ML-KNN; Rank-SVM; SMOTE |
This article is indexed in ScienceDirect and other databases. |