共查询到4条相似文献,搜索用时 0 毫秒
1.
Information of the proteins' subcellular localization is crucially important for revealing their biological functions in a cell, the basic unit of life. With the avalanche of protein sequences generated in the postgenomic age, it is highly desired to develop computational tools for timely identifying their subcellular locations based on the sequence information alone. The current study is focused on the Gram-negative bacterial proteins. Although considerable efforts have been made in protein subcellular prediction, the problem is far from being solved yet. This is because mounting evidences have indicated that many Gram-negative bacterial proteins exist in two or more location sites. Unfortunately, most existing methods can be used to deal with single-location proteins only. Actually, proteins with multi-locations may have some special biological functions important for both basic research and drug design. In this study, by using the multi-label theory, we developed a new predictor called “pLoc-mGneg” for predicting the subcellular localization of Gram-negative bacterial proteins with both single and multiple locations. Rigorous cross-validation on a high quality benchmark dataset indicated that the proposed predictor is remarkably superior to “iLoc-Gneg”, the state-of-the-art predictor for the same purpose. For the convenience of most experimental scientists, a user-friendly web-server for the novel predictor has been established at http://www.jci-bioinfo.cn/pLoc-mGneg/, by which users can easily get their desired results without the need to go through the complicated mathematics involved. 相似文献
2.
Succinylation is a posttranslational modification (PTM) where a succinyl group is added to a Lys (K) residue of a protein molecule. Lysine succinylation plays an important role in orchestrating various biological processes, but it is also associated with some diseases. Therefore, we are challenged by the following problem from both basic research and drug development: given an uncharacterized protein sequence containing many Lys residues, which one of them can be succinylated, and which one cannot? With the avalanche of protein sequences generated in the postgenomic age, the answer to the problem has become even more urgent. Fortunately, the statistical significance experimental data for succinylated sites in proteins have become available very recently, an indispensable prerequisite for developing a computational method to address this problem. By incorporating the sequence-coupling effects into the general pseudo amino acid composition and using KNNC (K-nearest neighbors cleaning) treatment and IHTS (inserting hypothetical training samples) treatment to optimize the training dataset, a predictor called iSuc-PseOpt has been developed. Rigorous cross-validations indicated that it remarkably outperformed the existing method. A user-friendly web-server for iSuc-PseOpt has been established at http://www.jci-bioinfo.cn/iSuc-PseOpt, where users can easily get their desired results without needing to go through the complicated mathematical equations involved. 相似文献
3.
4.
Knowledge of protein subcellular localization is vitally important for both basic research and drug development. With the avalanche of protein sequences emerging in the post-genomic age, it is highly desired to develop computational tools for timely and effectively identifying their subcellular localization purely based on the sequence information alone. Recently, a predictor called “pLoc-mGpos” was developed for identifying the subcellular localization of Gram-positive bacterial proteins. Its performance is overwhelmingly better than that of the other predictors for the same purpose, particularly in dealing with multi-label systems in which some proteins, called “multiplex proteins”, may simultaneously occur in two or more subcellular locations. Although it is indeed a very powerful predictor, more efforts are definitely needed to further improve it. This is because pLoc-mGpos was trained by an extremely skewed dataset in which some subset (subcellular location) was over 11 times the size of the other subsets. Accordingly, it cannot avoid the bias consequence caused by such an uneven training dataset. To alleviate such bias consequence, we have developed a new and bias-reducing predictor called pLoc_bal-mGpos by quasi-balancing the training dataset. Rigorous target jackknife tests on exactly the same experiment-confirmed dataset have indicated that the proposed new predictor is remarkably superior to pLoc-mGpos, the existing state-of-the-art predictor in identifying the subcellular localization of Gram-positive bacterial proteins. To maximize the convenience for most experimental scientists, a user-friendly web-server for the new predictor has been established at http://www.jci-bioinfo.cn/pLoc_bal-mGpos/, by which users can easily get their desired results without the need to go through the detailed mathematics. 相似文献