pLoc_bal-mGpos: Predict subcellular localization of Gram-positive bacterial proteins by quasi-balancing training dataset and PseAAC
Authors: Xuan Xiao, Xiang Cheng, Genqiang Chen, Qi Mao, Kuo-Chen Chou
Institution: 1. Computer Department, Jingdezhen Ceramic Institute, Jingdezhen, China; 2. Center for Informational Biology, University of Electronic Science and Technology of China, Chengdu 610054, China; 3. College of Chemistry, Chemical Engineering and Biotechnology, Donghua University, Shanghai 201620, China; 4. College of Information Science and Technology, Donghua University, Shanghai, China; 5. Gordon Life Science Institute, Boston, MA 02478, USA
Abstract: Knowledge of protein subcellular localization is vitally important for both basic research and drug development. With the avalanche of protein sequences emerging in the post-genomic age, it is highly desirable to develop computational tools that identify their subcellular localization in a timely and effective manner from sequence information alone. Recently, a predictor called “pLoc-mGpos” was developed for identifying the subcellular localization of Gram-positive bacterial proteins. Its performance is overwhelmingly better than that of the other predictors for the same purpose, particularly in dealing with multi-label systems in which some proteins, called “multiplex proteins”, may simultaneously occur in two or more subcellular locations. Although it is a very powerful predictor, further improvement is needed, because pLoc-mGpos was trained on an extremely skewed dataset in which some subset (subcellular location) was over 11 times the size of the other subsets. Accordingly, it cannot avoid the bias caused by such an uneven training dataset. To alleviate this bias, we have developed a new, bias-reducing predictor called pLoc_bal-mGpos by quasi-balancing the training dataset. Rigorous target jackknife tests on exactly the same experiment-confirmed dataset indicate that the proposed predictor is remarkably superior to pLoc-mGpos, the existing state-of-the-art predictor for identifying the subcellular localization of Gram-positive bacterial proteins. To maximize convenience for most experimental scientists, a user-friendly web-server for the new predictor has been established at http://www.jci-bioinfo.cn/pLoc_bal-mGpos/, through which users can easily obtain their desired results without needing to go through the detailed mathematics.
Correspondence to: Xuan Xiao, Computer Department, Jingdezhen Ceramic Institute, Jingdezhen, China
Keywords: Multi-label system; Gram-positive bacterial proteins; IHTS treatment; Five-step rules; ML-GKR; Chou's intuitive metrics
This article is indexed in ScienceDirect and other databases.