Support vector machine for discrimination of thermophilic and mesophilic proteins based on amino acid composition |
| |
Authors: | Zhang Guangya, Fang Baishan |
| |
Affiliation: | Key Laboratory of Industrial Biotechnology, Hua Qiao University, Fujian Province University, Quanzhou, Fujian 362021, PR China. |
| |
Abstract: | Identifying thermostability from amino acid sequence information would be helpful in computational screening for thermostable proteins. We have developed a method based on support vector machines (SVMs) to discriminate thermophilic from mesophilic proteins. Under self-consistency validation, 5-fold cross-validation, and independent testing on other datasets, this module achieved overall accuracies of 94.2%, 90.5%, and 92.4%, respectively. Under all three validation methods, the SVM-based module outperformed classifiers built with alternative machine learning and statistical algorithms, including artificial neural networks, Bayesian statistics, and decision trees. The influence of protein size on prediction accuracy was also addressed. |
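The abstract does not spell out how the amino acid composition features are computed. As a minimal sketch, assuming the common definition (the fraction of each of the 20 standard residues in a sequence), the feature vector could be built as follows. The function name `amino_acid_composition`, the residue ordering, the handling of non-standard characters, and the toy sequence are all illustrative assumptions, not details taken from the paper.

```python
from collections import Counter

# The 20 standard amino acids in one-letter code (ordering is an assumption).
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def amino_acid_composition(sequence):
    """Return a 20-element list with the fraction of each standard residue.

    Characters outside the 20 standard residues (e.g. 'X') are ignored;
    this is an illustrative choice, not something the abstract specifies.
    """
    counts = Counter(ch for ch in sequence.upper() if ch in AMINO_ACIDS)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence contains no standard amino acids")
    return [counts[aa] / total for aa in AMINO_ACIDS]


# Hypothetical toy sequence, used only to show the shape of the output.
features = amino_acid_composition("MKKLEEAAK")
print(len(features))  # one value per standard amino acid
```

A vector like `features`, computed per protein and paired with a thermophilic or mesophilic label, is the kind of input an SVM classifier would typically be trained and cross-validated on.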
| |
Keywords: | |
This article is indexed in PubMed and other databases. |
|