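Predicting lipase types based on pseudo-amino acid composition with different scales
Cite this article:Zhang Guangya, Li Hongchun, Gao Jiaqiang, Fang Baishan. Predicting lipase types based on pseudo-amino acid composition with different scales[J]. Chinese Journal of Biotechnology (生物工程学报), 2008, 24(11): 1968-1974
Authors:Zhang Guangya  Li Hongchun  Gao Jiaqiang  Fang Baishan
Affiliation:Department of Bioengineering and Technology, Huaqiao University, Xiamen 361021
Funding:Specialized Research Fund for the Doctoral Program of Higher Education; Natural Science Foundation of Fujian Province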
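Abstract:Predicting from sequence alone whether a protein is a lipase, and which type of lipase it belongs to, is of considerable theoretical and practical value. Pseudo-amino acid composition methods based on Z-scales and T-scales are proposed to extract sequence features, and the k-nearest neighbor algorithm is used to answer these two questions. After parameter selection, with each of the three methods run at its own optimal parameters, the 10-fold cross-validation results were as follows: prediction accuracy for lipases versus non-lipases was 92.8%, 91.4% and 91.3%, respectively; prediction accuracy for lipase types was 92.3%, 90.3% and 89.7%, respectively. The Z-scale based pseudo-amino acid composition performed best, followed by the T-scale based one, but all were clearly better than 6 other common feature extraction methods. Possible reasons for this are discussed.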

Keywords:lipase  Z-scales  T-scales  pseudo-amino acid composition  k-nearest neighbor
Received:2008-03-11

Prediction of Lipases Types by Different Scale Pseudo-amino Acid Composition
Guangya Zhang, Hongchun Li, Jiaqiang Gao and Baishan Fang. Prediction of Lipases Types by Different Scale Pseudo-amino Acid Composition[J]. Chinese Journal of Biotechnology, 2008, 24(11): 1968-1974
Authors:Guangya Zhang  Hongchun Li  Jiaqiang Gao  Baishan Fang
Affiliation:Institute of Industrial Biotechnology, Huaqiao University, Quanzhou 362021, China
Abstract:Lipases are widely used enzymes in biotechnology. Although they catalyze the same reaction, their sequences vary. A fast and reliable method is therefore highly desirable for identifying the type of a lipase from its sequence, or even just for confirming whether a protein is a lipase at all. Two scale-based pseudo-amino acid composition approaches were proposed to extract sequence features, and a predictor based on the k-nearest neighbor algorithm was introduced to address these problems. The overall success rates obtained by the 10-fold cross-validation test were as follows: for discriminating lipases from non-lipases, the success rates were 92.8%, 91.4% and 91.3%, respectively; for predicting lipase types, they were 92.3%, 90.3% and 89.7%, respectively. Among the methods, the Z-scales based pseudo-amino acid composition performed best, followed by the T-scales based one. They significantly outperformed 6 other frequently used sequence feature extraction methods. The high success rates obtained on such a stringent dataset indicate that predicting lipase types is feasible, and that pseudo-amino acid composition based on different scales might be a useful tool for extracting features from protein sequences, or can at least play a complementary role to many existing approaches.
Keywords:Lipase   Z-scales   T-scales   pseudo-amino acid composition   k-nearest neighbor
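
The pipeline summarized in the abstract (scale-based pseudo-amino acid composition features, a k-nearest neighbor classifier, and 10-fold cross-validation) might be sketched roughly as below. This is a minimal illustration under stated assumptions, not the authors' implementation: it assumes the commonly used Chou-style formulation of pseudo-amino acid composition, the descriptor table is a random placeholder rather than the published Z- or T-scale values, and the names `placeholder_scales`, `pse_aac`, `make_toy_data`, the parameters `lam`, `weight`, `k`, and the synthetic sequences are all hypothetical.

```python
import math
import random
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def placeholder_scales(n_dims=3, seed=0):
    """Hypothetical descriptor table; NOT the published Z- or T-scale values."""
    rng = random.Random(seed)
    return {aa: [rng.uniform(-1.0, 1.0) for _ in range(n_dims)] for aa in AMINO_ACIDS}


def pse_aac(seq, scales, lam=2, weight=0.1):
    """Return 20 composition terms plus `lam` sequence-order terms (sums to 1).

    Assumes `seq` contains only the 20 standard one-letter residue codes.
    """
    if len(seq) <= lam:
        raise ValueError("sequence must be longer than lam")
    counts = Counter(seq)
    freqs = [counts[aa] / len(seq) for aa in AMINO_ACIDS]
    thetas = []
    for k in range(1, lam + 1):
        # Mean squared descriptor difference between residues k positions apart.
        total = 0.0
        for i in range(len(seq) - k):
            a, b = scales[seq[i]], scales[seq[i + k]]
            total += sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)
        thetas.append(total / (len(seq) - k))
    denom = sum(freqs) + weight * sum(thetas)
    return [f / denom for f in freqs] + [weight * t / denom for t in thetas]


def knn_predict(train_x, train_y, query, k=3):
    """Majority vote among the k training vectors closest in Euclidean distance."""
    nearest = sorted(zip(train_x, train_y), key=lambda p: math.dist(p[0], query))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]


def cross_validate(xs, ys, folds=10, k=3, seed=0):
    """Fraction of samples predicted correctly when each fold is held out once."""
    idx = list(range(len(xs)))
    random.Random(seed).shuffle(idx)
    correct = 0
    for f in range(folds):
        test = idx[f::folds]
        held_out = set(test)
        train = [i for i in idx if i not in held_out]
        train_x = [xs[i] for i in train]
        train_y = [ys[i] for i in train]
        for i in test:
            correct += knn_predict(train_x, train_y, xs[i], k) == ys[i]
    return correct / len(xs)


def make_toy_data(n_per_class=20, length=50, seed=1):
    """Synthetic two-class sequences with disjoint alphabets; not lipase data."""
    rng = random.Random(seed)
    scales = placeholder_scales()
    xs, ys = [], []
    for label, alphabet in (("class_A", AMINO_ACIDS[:10]), ("class_B", AMINO_ACIDS[10:])):
        for _ in range(n_per_class):
            seq = "".join(rng.choice(alphabet) for _ in range(length))
            xs.append(pse_aac(seq, scales))
            ys.append(label)
    return xs, ys


if __name__ == "__main__":
    features, labels = make_toy_data()
    print("10-fold accuracy on toy data:", cross_validate(features, labels))
```

To approximate the setting described in the abstract, the placeholder table would be replaced by real Z-scale or T-scale descriptors and the toy sequences by labeled lipase and non-lipase sequences; `lam`, `weight` and `k` would then be tuned, which corresponds to the parameter selection step the abstract mentions.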
This article is indexed in databases including VIP (维普) and Wanfang Data (万方数据).