Using pseudo amino acid composition to predict protease families by incorporating a series of protein biological features

Authors:	Hu Lele, Zheng Lulu, Wang Zhiwen, Li Bing, Liu Lei

Affiliation:	Institute of Systems Biology, Shanghai University, Shanghai 200444, China. lele_hu@yahoo.cn

Abstract:	Proteases are essential to most biological processes, yet they themselves remain intact during those processes. In this study, a computational approach was developed for predicting the families of proteases from their sequences. Following the concept of pseudo amino acid composition, each protein sample was formulated by a series of its biological features in order to capture the essential patterns in protease sequences. A total of 132 biological features were used, derived from various biochemical and physicochemical properties of the constituent amino acids. The importance of these features to the prediction was ranked by the Maximum Relevance Minimum Redundancy (mRMR) algorithm, and Incremental Feature Selection (IFS) was then applied to select an optimal feature set, which was used to construct a predictor based on the nearest neighbor algorithm. As a demonstration, the overall success rate in the jackknife test for identifying proteases among their seven families was 92.74%. Further analysis of the optimal feature set revealed that secondary structure and amino acid composition play the key roles in the classification, which is consistent with some previous findings. These promising results suggest that the predictor presented in this paper may become a useful tool for studying proteases.
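
The selection and evaluation steps described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names, the toy data, and the use of Euclidean distance are assumptions made for illustration, and the feature ranking is taken as given (in the paper it would come from the Maximum Relevance Minimum Redundancy step, which is not implemented here). The sketch adds ranked features one at a time, scores each candidate set with a nearest neighbor classifier under the jackknife (leave-one-out) test, and keeps the best-scoring set.

```python
def squared_distance(a, b):
    """Squared Euclidean distance (an assumed metric, not taken from the paper)."""
    return sum((p - q) ** 2 for p, q in zip(a, b))


def nearest_neighbor_predict(train_X, train_y, query):
    """Return the label of the training sample closest to `query`."""
    best_label, best_dist = None, float("inf")
    for x, label in zip(train_X, train_y):
        d = squared_distance(x, query)
        if d < best_dist:
            best_label, best_dist = label, d
    return best_label


def jackknife_accuracy(X, y):
    """Jackknife test: predict each sample from all remaining samples."""
    correct = 0
    for i in range(len(X)):
        rest_X = X[:i] + X[i + 1:]
        rest_y = y[:i] + y[i + 1:]
        if nearest_neighbor_predict(rest_X, rest_y, X[i]) == y[i]:
            correct += 1
    return correct / len(X)


def incremental_feature_selection(X, y, ranked_features):
    """Grow the feature set in ranked order; keep the first best-scoring prefix."""
    best_k, best_acc = 0, -1.0
    for k in range(1, len(ranked_features) + 1):
        cols = ranked_features[:k]
        X_sub = [[row[c] for c in cols] for row in X]
        acc = jackknife_accuracy(X_sub, y)
        if acc > best_acc:  # strict '>' prefers the smaller set on ties
            best_k, best_acc = k, acc
    return ranked_features[:best_k], best_acc


# Hypothetical toy data: 6 samples, 3 features, 2 classes.
# Feature 2 is noise that misleads the nearest neighbor search.
X = [
    [0.0, 0.1, 5.0],
    [0.1, 0.0, 0.0],
    [0.2, 0.2, 9.0],
    [1.0, 0.9, 0.2],
    [1.1, 1.0, 5.1],
    [1.2, 1.1, 9.1],
]
y = ["A", "A", "A", "B", "B", "B"]
ranked = [0, 1, 2]  # assumed output of a prior ranking step

selected, acc = incremental_feature_selection(X, y, ranked)
print(selected, acc)
```

On real data, `X` would hold the 132-dimensional feature vectors and `y` the seven protease family labels. Preferring the smaller set when accuracies tie is a design choice of this sketch, not something stated in the abstract.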

Keywords:
This article is indexed in PubMed and other databases.
