Prediction of lysine ubiquitination with mRMR feature selection and analysis |
| |
Authors: | Cai Yudong, Huang Tao, Hu Lele, Shi Xiaohe, Xie Lu, Li Yixue |
| |
Institution: | (1) Institute of Systems Biology, Shanghai University, Shanghai, 200444, People’s Republic of China; (2) Key Laboratory of Systems Biology, Shanghai Institutes for Biological Sciences, Chinese Academy of Sciences, Shanghai, 200031, People’s Republic of China; (3) Shanghai Center for Bioinformation Technology, Shanghai, 200235, People’s Republic of China; (4) Centre for Computational Systems Biology, Fudan University, Shanghai, 200433, People’s Republic of China; (5) Singapore Bioimaging Consortium, Agency for Science, Technology and Research, Singapore, 138667, Singapore |
| |
Abstract: | Ubiquitination, one of the most important post-translational modifications of proteins, occurs when ubiquitin (a small 76-amino
acid protein) is attached to lysine on a target protein. It often commits the labeled protein to degradation and plays important
roles in regulating many cellular processes implicated in a variety of diseases. Since ubiquitination is rapid and reversible,
it is time-consuming and labor-intensive to identify ubiquitination sites using conventional experimental approaches. To discover
lysine ubiquitination sites efficiently, we developed a sequence-based ubiquitination site predictor based on the nearest neighbor
algorithm. We used the maximum relevance and minimum redundancy (mRMR) principle to identify the key features and the incremental
feature selection procedure to optimize the prediction engine. PSSM conservation scores, amino acid factors and disorder scores
of the surrounding sequence formed the optimized set of 456 features. Our ubiquitination site predictor achieved a Matthews correlation
coefficient (MCC) of 0.142 in a jackknife cross-validation test on a large benchmark dataset. In an independent test, the MCC
of our method was 0.139, higher than those of the existing ubiquitination site predictors UbiPred and UbPred. The MCCs of UbiPred and
UbPred on the same test set were 0.135 and 0.117, respectively. Our analysis shows that the conservation of amino acids at
and around lysine plays an important role in ubiquitination site prediction. Moreover, disorder is strongly associated with
ubiquitination. These findings might provide useful insights for studying the mechanisms of ubiquitination and modulating
the ubiquitination pathway, potentially leading to therapeutic strategies in the future. |
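The abstract reports all results as Matthews correlation coefficients. For readers unfamiliar with the metric, the following is a minimal sketch of how an MCC is computed from confusion-matrix counts. The function name `mcc` and the convention of returning 0.0 when the denominator is zero are assumptions made for illustration; they are not taken from the paper, and the counts in the usage lines are made up rather than the paper's data.

```python
import math


def mcc(tp: int, tn: int, fp: int, fn: int) -> float:
    """Matthews correlation coefficient from confusion-matrix counts.

    tp/tn/fp/fn are true positives, true negatives, false positives
    and false negatives. Returns a value in [-1, 1].
    """
    numerator = tp * tn - fp * fn
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # Assumed convention: if any marginal total is zero the MCC is
    # undefined, so report 0.0 (no better than chance).
    if denominator == 0:
        return 0.0
    return numerator / denominator


# Illustrative (invented) counts, not results from the paper.
perfect = mcc(tp=50, tn=50, fp=0, fn=0)      # every site classified correctly
chance = mcc(tp=25, tn=25, fp=25, fn=25)     # predictions unrelated to labels
```

Because the MCC uses all four cells of the confusion matrix, it remains informative on imbalanced data, such as site prediction where non-ubiquitinated lysines typically outnumber ubiquitinated ones.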
| |
Keywords: | |
This article is indexed in PubMed, SpringerLink, and other databases. |
|