RAPID: Fast and accurate sequence-based prediction of intrinsic disorder content on proteomic scale
Authors: Jing Yan, Marcin J. Mizianty, Paul L. Filipow, Vladimir N. Uversky, Lukasz Kurgan
Affiliation: 1. Department of Electrical and Computer Engineering, University of Alberta, Edmonton, Canada; 2. Department of Molecular Medicine, Byrd Alzheimer's Research Institute, Morsani College of Medicine, University of South Florida, Tampa, FL 33612, USA; 3. Institute for Biological Instrumentation, Russian Academy of Sciences, 142290 Pushchino, Moscow Region, Russia
Abstract: Recent research on protein intrinsic disorder has been stimulated by the availability of accurate computational predictors. However, most of these methods are relatively slow, especially for proteome-scale applications, and have been shown to produce relatively large errors when estimating disorder at the protein level (in contrast to the residue level), where it is quantified as the fraction, or content, of disordered residues. To address this, we propose a novel support vector Regression-based Accurate Predictor of Intrinsic Disorder (RAPID). Key advantages of RAPID are speed (prediction of an average-size eukaryotic proteome takes < 1 h on a modern desktop computer); sophisticated design (multiple, complementary information sources that are aggregated over an input chain are combined using feature selection); and high-quality, robust predictive performance. Empirical tests on two diverse benchmark datasets reveal that RAPID's predictive performance compares favorably to a comprehensive set of state-of-the-art disorder and disorder content predictors. Drawing on its high speed and good predictive quality, RAPID was used to perform a large-scale characterization of disorder in 200+ fully sequenced eukaryotic proteomes. Our analysis reveals interesting relations of disorder with structural coverage and chain length, and an unusual distribution of fully disordered chains. We also performed a comprehensive investigation (using 56000+ annotated chains, which doubles the scope of previous studies) of cellular functions and localizations that are enriched in disorder in the human proteome. RAPID, which allows for batch (proteome-wide) predictions, is available as a web server at http://biomine.ece.ualberta.ca/RAPID/.
Keywords: Intrinsic disorder; Disorder content; Disorder prediction; Eukaryotes; Structural coverage
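
The abstract defines protein-level disorder as the fraction (content) of disordered residues in a chain. As a minimal sketch of that definition only, and not of RAPID itself, the snippet below computes disorder content from a per-residue annotation string. The function name `disorder_content` and the "D" (disordered) / "O" (ordered) encoding are assumptions made for illustration; they do not come from the paper or the web server.

```python
def disorder_content(annotation: str, disordered_symbol: str = "D") -> float:
    """Return the fraction of residues marked as disordered.

    `annotation` holds one character per residue; the 'D'/'O' encoding
    is a hypothetical convention chosen for this example.
    """
    if not annotation:
        raise ValueError("annotation must contain at least one residue")
    return annotation.count(disordered_symbol) / len(annotation)


# Hypothetical 10-residue chain with 4 disordered residues.
example = "DDDOOOOOOD"
print(disorder_content(example))  # 0.4
```

A content predictor such as the one described in the abstract estimates this value directly from the sequence, rather than deriving it from per-residue predictions as done here.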
This article is indexed in ScienceDirect and other databases.