PluriPred: A Web server for predicting proteins involved in pluripotent network |
| |
Authors: | Sukhen Das Mandal, Sudipto Saha |
| |
Institution: | 1. Bioinformatics Centre, Bose Institute, Kolkata, India; 2. Department of Biological Sciences, Indian Institute of Science Education and Research, Kolkata, Mohanpur, India |
| |
Abstract: | Pluripotency is a unique property of stem cells that allows them to differentiate into all types of adult cells or maintain the self-renewal property. PluriPred predicts whether a protein is involved in pluripotency from its primary sequence, using manually curated pluripotent proteins as training datasets. Machine learning techniques (MLTs) such as Support Vector Machine (SVM), Naïve Bayes (NB), and Random Forest (RF), together with the sequence alignment technique BLAST, were used in our study. The combination of SVM and PSI-BLAST was our proposed best model, which obtained a sensitivity of 77.40%, a specificity of 79.72%, an accuracy of 79.2%, and an area under the ROC curve of 0.82 using 5-fold cross-validation. Furthermore, PluriPred reports the confidence of each prediction, based on the SVM score distribution of the training dataset and the p-value from BLAST. We validated our proposed model against existing high-throughput studies using blind/independent datasets. Using PluriPred, 233 novel core and 323 novel extended core pluripotent proteins from the mouse proteome, and 167 novel core and 385 extended core pluripotent proteins from the human proteome, were predicted with high confidence. The Web application of PluriPred is available from bicresources.jcbose.ac.in/ssaha4/pluripred/. Many pluripotent genes/proteins take part in protein-protein networks associated with stem cell, cancer, and developmental biology, and we believe that PluriPred will help research in these areas. |
| |
Keywords: | |
This article is indexed in SpringerLink and other databases. |
|
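The abstract reports sensitivity, specificity, and accuracy for the SVM and PSI-BLAST model. As a minimal sketch of how such figures follow from a binary confusion matrix, the snippet below computes them from raw counts. The function name `binary_metrics` and the counts are hypothetical, chosen only for illustration; they are not taken from the paper or from PluriPred.

```python
def binary_metrics(tp, fn, tn, fp):
    """Return (sensitivity, specificity, accuracy) as percentages.

    tp/fn: pluripotent proteins predicted correctly / missed
    tn/fp: non-pluripotent proteins predicted correctly / wrongly flagged
    """
    sensitivity = 100.0 * tp / (tp + fn)   # true positive rate
    specificity = 100.0 * tn / (tn + fp)   # true negative rate
    accuracy = 100.0 * (tp + tn) / (tp + fn + tn + fp)
    return sensitivity, specificity, accuracy


# Hypothetical counts for illustration only (not data from the study).
sens, spec, acc = binary_metrics(tp=80, fn=20, tn=150, fp=50)
print(f"sensitivity={sens:.2f}% specificity={spec:.2f}% accuracy={acc:.2f}%")
```

In a 5-fold cross-validation setting, metrics like these would typically be computed on each held-out fold and then averaged, although the abstract does not state exactly how the reported values were aggregated.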