PRIESSTESS: interpretable, high-performing models of the sequence and structure preferences of RNA-binding proteins
Authors: Kaitlin U Laverty, Arttu Jolma, Sara E Pour, Hong Zheng, Debashish Ray, Quaid Morris, Timothy R Hughes
Affiliation: Department of Molecular Genetics, University of Toronto, Toronto, Canada; Donnelly Centre, University of Toronto, Toronto, Canada; Computational and Systems Biology, Memorial Sloan Kettering Cancer Center, New York, USA
Abstract: Modelling both the primary sequence and the secondary structure preferences of RNA-binding proteins (RBPs) remains an ongoing challenge. Current models use varied RNA structure representations and can be difficult to interpret and evaluate. To address these issues, we present a universal RNA motif-finding/scanning strategy, termed PRIESSTESS (Predictive RBP-RNA InterpretablE Sequence-Structure moTif regrESSion), that can be applied to diverse RNA-binding datasets. PRIESSTESS identifies dozens of enriched RNA sequence and/or structure motifs, which are subsequently reduced to a set of core motifs by logistic regression with LASSO regularization. Importantly, these core motifs are easily visualized and interpreted, and they provide a measure of RBP secondary structure specificity. We used PRIESSTESS to interrogate new HTR-SELEX data for 23 RBPs with diverse RNA-binding modes and captured the known primary sequence and secondary structure preferences of each. Moreover, when PRIESSTESS was applied to 144 RBPs across 202 RNA-binding datasets, 75% showed an RNA secondary structure preference, but only 10% had a preference for anything other than unpaired bases, suggesting that most RBPs simply recognize the accessibility of primary sequences.
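The motif-reduction step named in the abstract (logistic regression with LASSO regularization) can be illustrated with a minimal sketch. This is not the PRIESSTESS implementation: the motif names, the toy ±1 feature coding, the penalty strength `lam`, the step size `lr`, and the proximal-gradient solver are all assumptions chosen for illustration. The sketch fits an L1-penalized logistic model on a small balanced design and keeps only the candidate motifs whose weights stay nonzero.

```python
import math
from itertools import product


def sigmoid(z):
    """Logistic function mapping a score to a binding probability."""
    return 1.0 / (1.0 + math.exp(-z))


def soft_threshold(v, t):
    """Proximal operator of the L1 penalty: shrink v toward zero by t."""
    if v > t:
        return v - t
    if v < -t:
        return v + t
    return 0.0


def fit_lasso_logistic(X, y, lam=0.05, lr=0.5, n_iter=500):
    """L1-penalized logistic regression via proximal gradient descent.

    The bias term is left unpenalized; lam, lr and n_iter are
    illustrative values, not parameters taken from the paper.
    """
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(n_iter):
        # Residuals: predicted probability minus observed label.
        residuals = [
            sigmoid(b + sum(wj * xj for wj, xj in zip(w, row))) - yi
            for row, yi in zip(X, y)
        ]
        grad_b = sum(residuals) / n
        grad_w = [
            sum(r * row[j] for r, row in zip(residuals, X)) / n
            for j in range(d)
        ]
        b -= lr * grad_b
        # Gradient step followed by shrinkage, which sets weak weights to 0.
        w = [soft_threshold(wj - lr * gj, lr * lam) for wj, gj in zip(w, grad_w)]
    return w, b


# Hypothetical candidate motifs, scored as +1 (present) or -1 (absent).
motif_names = ["motif_A", "motif_B", "motif_C", "motif_D"]

# Toy balanced design: every presence/absence combination occurs once.
X = [list(row) for row in product([-1.0, 1.0], repeat=len(motif_names))]

# Assumed ground truth: a sequence is bound if motif_A or motif_B is present.
y = [1.0 if row[0] > 0 or row[1] > 0 else 0.0 for row in X]

weights, bias = fit_lasso_logistic(X, y)

# "Core" motifs are those whose weights survive the L1 shrinkage.
core_motifs = [name for name, wj in zip(motif_names, weights) if wj != 0.0]
print(core_motifs)
```

In this toy design the two uninformative motifs are exactly balanced against the labels, so their weights never leave zero and only the informative motifs are retained. Real binding data would be noisier, and the penalty strength would normally be chosen by cross-validation rather than fixed by hand.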