Association pattern mining of intron retention events in human based on hybrid learning machine
Authors: Hu Hae-Jin, Goh Sung-Ho, Lee Yeon-Su
Institution: Functional Genomics Branch, Research Institute, National Cancer Center, Gyeonggi-do, Republic of Korea.
Abstract: Alternative splicing is a major contributor to protein diversity, and aberrant splicing is known to be one of the main causes of genetic disorders such as cancer. Many statistical and computational approaches have identified several major factors that determine the splicing event, such as exon/intron length, splice site strength, and density of splicing enhancers or silencers. These factors may be correlated with one another and thus result in a specific type of splicing, but there has not been a systematic approach to extracting comprehensible association patterns. Here, we attempted to understand the decision-making process of a learning machine for intron retention events. We adopted a hybrid learning machine approach, combining a random forest with an association rule mining algorithm, to determine the governing factors of intron retention events and their combined effect on the decision-making process. By discretizing all candidate features into five categorical levels, we made the generated rules easier to interpret. Notably, the random forest algorithm identified only adenine- and thymine-based triplets such as ATA, TTA, and ATT as significant features, and not the known intronic splicing enhancer triplet GGG. The rules generated by the association rule mining algorithm also show that constitutive introns are generally characterized by high levels (level 3 and above) of adenine- and thymine-based triplet frequency, 3' and 5' splice site scores, exonic splicing silencer scores, and intron length, whereas retained introns are characterized by low levels of the same features.
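The two interpretable steps described in the abstract, binning each feature into five levels and then extracting class association rules, can be illustrated with a minimal sketch. This is not the authors' pipeline: the abstract does not state how the five levels were defined or which rule mining algorithm was used, so equal-frequency binning and a brute-force support/confidence search are assumptions here, and the random forest feature selection step is omitted. The function names (`to_levels`, `mine_rules`), the feature names, and the toy values are all hypothetical.

```python
from collections import Counter
from itertools import combinations


def to_levels(values, n_levels=5):
    """Map each numeric value to a level 1..n_levels by rank (equal-frequency bins)."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    levels = [0] * len(values)
    for rank, idx in enumerate(order):
        levels[idx] = rank * n_levels // len(values) + 1
    return levels


def mine_rules(rows, labels, min_support=0.2, min_confidence=0.8, max_len=2):
    """Brute-force class association rules of the form {feature=level, ...} -> label.

    rows is a list of dicts mapping feature name to level; labels is a list of
    class names. Returns (antecedent, label, support, confidence) tuples.
    """
    n = len(rows)
    antecedent_counts = Counter()
    joint_counts = Counter()
    for row, label in zip(rows, labels):
        items = sorted(row.items())
        for k in range(1, max_len + 1):
            for antecedent in combinations(items, k):
                antecedent_counts[antecedent] += 1
                joint_counts[(antecedent, label)] += 1
    rules = []
    for (antecedent, label), joint in joint_counts.items():
        support = joint / n                               # P(antecedent and label)
        confidence = joint / antecedent_counts[antecedent]  # P(label | antecedent)
        if support >= min_support and confidence >= min_confidence:
            rules.append((antecedent, label, support, confidence))
    return sorted(rules, key=lambda r: (-r[3], -r[2]))


# Synthetic toy values for illustration only; these are not data from the study.
at_triplet_freq = [0.9, 0.8, 0.85, 0.7, 0.75, 0.2, 0.1, 0.3, 0.25, 0.15]
intron_length = [900, 1200, 1500, 1100, 1000, 150, 200, 120, 300, 250]
labels = ["constitutive"] * 5 + ["retained"] * 5

rows = [
    {"at_triplet_freq": a, "intron_length": b}
    for a, b in zip(to_levels(at_triplet_freq), to_levels(intron_length))
]

for antecedent, label, support, confidence in mine_rules(rows, labels, 0.2, 1.0):
    conditions = " AND ".join(f"{name}=level{level}" for name, level in antecedent)
    print(f"IF {conditions} THEN {label} (support={support:.2f}, confidence={confidence:.2f})")
```

Working with levels rather than raw scores is what keeps each rule readable as a short statement such as "feature at level 4 or 5 implies constitutive". A real analysis would more likely use an established miner such as Apriori, since the brute-force enumeration above grows quickly with the number of features.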
This article is indexed in PubMed and other databases.