Predicting protein solubility with a hybrid approach by pseudo amino acid composition |
| |
Authors: | Xiaohui Niu Nana Li Feng Shi Xuehai Hu Jingbo Xia Huijuan Xiong |
| |
Institution: | College of Science, Huazhong Agricultural University, Wuhan, P.R. of China. |
| |
Abstract: | Protein solubility plays a major role for understanding the crystal growth and crystallization process of protein. How to predict the propensity of a protein to be soluble or to form inclusion body is a long but not fairly resolved problem. After choosing almost 10,000 protein sequences from NCBI database and eliminating the sequences with 90% homologous similarity by CD-HIT, 5692 sequences remained. By using Chou's pseudo amino acid composition features, we predict the soluble protein with the three methods: support vector machine (SVM), back propagation neural network (BP Neural Network) and hybrid method based on SVM and BP Neural Network, respectively. Each method is evaluated by re-substitution test and 10-fold cross-validation test. In the re-substitution test, the BP Neural Network performs with the best results, in which the accuracy achieves 0.9288 and Matthews Correlation Coefficient (MCC) achieves 0.8513. Meanwhile, the other two methods are better than BP Neural Network in 10-fold cross-validation test. The hybrid method based on SVM and BP Neural Network is the best. The average accuracy is 0.8678 and average MCC is 0.7233. Although all of the three methods achieve considerable evaluations, the hybrid method is deemed to be the best, according to the performance comparison. |
| |
Keywords: | |
本文献已被 PubMed 等数据库收录! |
|