Identifying anticancer peptides by using a generalized chaos game representation |
| |
Authors: | Ge Li, Liu Jiaguo, Zhang Yusen, Dehmer Matthias |
| |
Affiliation: | 1. School of Mathematics and Statistics, Shandong University at Weihai, Weihai, 264209, China; 2. Department of Mechatronics and Biomedical Computer Science, UMIT, Hall in Tyrol, Austria |
| |
Abstract: |
We generalize the chaos game representation (CGR) to higher-dimensional spaces while preserving its bijectivity, which keeps the method sufficiently representative and more mathematically rigorous than previous attempts. We first state and prove an asymptotic property of CGR and of our generalized chaos game representation (GCGR). It follows that the dissimilarity between sequences that share an identical subsequence at different positions decreases exponentially with the length of that subsequence; this effect has been present in earlier applications without being noticed. We show that this effect fundamentally supports (G)CGR as a similarity measure or feature extraction technique. We develop two feature extraction techniques, GCGR-Centroid and GCGR-Variance. We use GCGR-Centroid to analyze the similarity between protein sequences on three datasets: 9 ND5, 24 TF, and 50 beta-globin proteins. The results are consistent with previous studies, which supports the validity of the approach. Finally, using support vector machines, we train anticancer peptide prediction models with both GCGR-Centroid and GCGR-Variance features and achieve significantly higher prediction performance on three well-studied anticancer peptide datasets. |
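The exponential effect described in the abstract can be illustrated with the classical two-dimensional CGR for nucleotide sequences. The following is only a minimal sketch under stated assumptions: it is not the paper's GCGR, the corner assignment in `CORNERS` is one common convention chosen here, and the helper names `cgr_points` and `endpoint_distance` are hypothetical. Each step moves halfway toward the corner of the current symbol, so two sequences that end in the same subsequence of length k have final points whose distance has been halved k times, regardless of where that subsequence starts in each sequence.

```python
import math

# Assumed corner convention for classical 2-D CGR on the unit square.
CORNERS = {"A": (0.0, 0.0), "C": (0.0, 1.0), "G": (1.0, 1.0), "T": (1.0, 0.0)}


def cgr_points(seq, start=(0.5, 0.5)):
    """Return the CGR trajectory of seq: each step moves halfway to a corner."""
    x, y = start
    points = []
    for symbol in seq:
        cx, cy = CORNERS[symbol]
        x, y = (x + cx) / 2.0, (y + cy) / 2.0
        points.append((x, y))
    return points


def endpoint_distance(seq_a, seq_b):
    """Euclidean distance between the final CGR points of two sequences."""
    ax, ay = cgr_points(seq_a)[-1]
    bx, by = cgr_points(seq_b)[-1]
    return math.hypot(ax - bx, ay - by)


if __name__ == "__main__":
    # Prefixes of different lengths, so the shared part sits at different positions.
    prefix_a, prefix_b = "AAAA", "GGGGGGGG"
    shared = "GATTACA"
    base = endpoint_distance(prefix_a, prefix_b)
    for k in range(1, len(shared) + 1):
        tail = shared[:k]
        d = endpoint_distance(prefix_a + tail, prefix_b + tail)
        # The ratio d / base equals 2 ** -k for a shared tail of length k.
        print(k, d, d / base)
```

Because every step is the same contraction by one half, the distance after a shared tail depends only on the tail length and on the distance between the two points reached before it.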
| |
Keywords: | |
This article is indexed in SpringerLink and other databases. |
|