
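Genome-wide Prediction of Human microRNA Precursors
Cite this article: YING XiaoMin, ZHU JuanJuan, WANG XiaoLei, ZHAO DongSheng, FU HanJiang, ZHENG XiaoFei, LI WuJu. Genome-wide Prediction of Human microRNA Precursors[J]. 中国科学:生命科学 (Scientia Sinica Vitae), 2011, 41(10): 958-964.
Authors: YING XiaoMin  ZHU JuanJuan  WANG XiaoLei  ZHAO DongSheng  FU HanJiang  ZHENG XiaoFei  LI WuJu
Affiliations: Center of Computational Biology, Institute of Basic Medical Sciences, Academy of Military Medical Sciences, Beijing 100850;
Institute of Radiation Medicine, Academy of Military Medical Sciences, Beijing 100850;
Institute of Health Administration and Medicine Information, Academy of Military Medical Sciences, Beijing 100850
Funding: Supported by the National Natural Science Foundation of China (Grant Nos. 30500105 and 31071157), the National Major Scientific Research Program (Grant No. 2010CB912801), and the National Science and Technology Major Project "Prevention and Treatment of AIDS, Viral Hepatitis and Other Major Infectious Diseases" (Grant No. 2008ZX10002-016)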
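Abstract: microRNAs (miRNAs) are a class of small regulatory RNAs that do not encode proteins and that carry out broad and important regulatory functions in eukaryotes. Because miRNA expression is spatially and temporally specific, computational prediction followed by targeted experimental validation is an important route to miRNA discovery. Reducing the false positive rate is a major challenge for miRNA prediction methods. This study used ensemble learning to build SVMbagging, a classifier for predicting miRNA precursors. Results on the training set, the test set and an independent test set show that the method is robust, has a low false positive rate and generalizes well. In particular, at a threshold of 0.9 the specificity reaches 99.90% while the sensitivity stays above 26%, which makes the method suitable for genome-wide prediction. Applying SVMbagging to the whole human genome at a threshold of 0.9 yielded 14933 candidate miRNA precursors. Comparison with high-throughput small RNA sequencing data showed that 4481 of these precursors have perfectly matching small RNA sequences, a number very close to the theoretically estimated number of true positives. Finally, 32 candidate miRNAs were tested experimentally, and 2 of them were confirmed as genuine miRNAs.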

Keywords: miRNA  prediction  machine learning  ensemble learning
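
The trade-off the abstract describes (very high specificity at the cost of modest sensitivity when the threshold is raised to 0.9) can be illustrated with a small sketch. It assumes, purely for illustration, that the ensemble score of a candidate is the fraction of base classifiers voting "precursor"; the abstract does not say how SVMbagging actually combines its base SVMs, and the vote data below is invented toy data, not results from the paper.

```python
def ensemble_score(votes):
    """Fraction of base classifiers voting 1 ('precursor') for one candidate.

    Assumption: a simple vote fraction stands in for the ensemble score.
    """
    return sum(votes) / len(votes)


def sensitivity_specificity(scores, labels, threshold):
    """Call a candidate positive when score >= threshold, then compare to labels."""
    tp = sum(1 for s, y in zip(scores, labels) if y == 1 and s >= threshold)
    fn = sum(1 for s, y in zip(scores, labels) if y == 1 and s < threshold)
    tn = sum(1 for s, y in zip(scores, labels) if y == 0 and s < threshold)
    fp = sum(1 for s, y in zip(scores, labels) if y == 0 and s >= threshold)
    return tp / (tp + fn), tn / (tn + fp)


# Toy votes from 10 hypothetical base classifiers for 8 candidates.
votes_per_candidate = [
    [1] * 10,            # true precursor, score 1.0
    [1] * 9 + [0],       # true precursor, score 0.9
    [1] * 6 + [0] * 4,   # true precursor, score 0.6
    [1] * 3 + [0] * 7,   # true precursor, score 0.3
    [0] * 10,            # negative, score 0.0
    [1] * 2 + [0] * 8,   # negative, score 0.2
    [1] * 7 + [0] * 3,   # negative, score 0.7
    [1] * 1 + [0] * 9,   # negative, score 0.1
]
labels = [1, 1, 1, 1, 0, 0, 0, 0]
scores = [ensemble_score(v) for v in votes_per_candidate]

loose = sensitivity_specificity(scores, labels, 0.5)
strict = sensitivity_specificity(scores, labels, 0.9)
print("threshold 0.5:", loose)
print("threshold 0.9:", strict)
```

On this toy data the stricter threshold removes the one false positive while also discarding a true precursor, which is the same direction of trade-off the abstract reports at genome scale.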

Genome-wide Prediction of Human microRNA Precursors
YING XiaoMin,ZHU JuanJuan,WANG XiaoLei,ZHAO DongSheng,FU HanJiang,ZHENG XiaoFei,LI WuJu.Genome-wide Prediction of Human microRNA Precursors[J].Scientia Sinica Vitae,2011,41(10):958-964.
Authors:YING XiaoMin  ZHU JuanJuan  WANG XiaoLei  ZHAO DongSheng  FU HanJiang  ZHENG XiaoFei  LI WuJu
Institution:1 Center of Computational Biology, Institute of Basic Medical Sciences, Academy of Military Medical Sciences, Beijing 100850, China;
2 Institute of Radiation Medicine, Academy of Military Medical Sciences, Beijing 100850, China;
3 Institute of Health Administration and Medicine Information, Academy of Military Medical Sciences, Beijing 100850, China
Abstract: microRNAs (miRNAs) are a class of small regulatory non-coding RNAs. They are involved in diverse pathways and play important roles in gene regulation in eukaryotes. Because miRNA expression is spatially and temporally specific, computational prediction followed by experimental validation remains an important approach to miRNA discovery. Reducing the false positive rate is a major challenge in miRNA prediction. Here we used ensemble learning to construct the classifier SVMbagging to predict miRNA precursors. Results on the training, test and independent test datasets demonstrate that SVMbagging is robust, has a low false positive rate and generalizes well. In particular, at a threshold of 0.9 the specificity of SVMbagging is as high as 99.90% while the sensitivity remains above 26.00%, so SVMbagging is suitable for genome-wide prediction. We applied SVMbagging to genome-wide prediction of miRNA precursors in the human genome and obtained 14933 candidate miRNA precursors at a threshold of 0.9. When aligned against three groups of small RNA datasets from high-throughput sequencing, 4481 of these precursors have perfect matches with small RNA reads. This number is very close to the number of true positives estimated theoretically from the performance of SVMbagging. Finally, we experimentally tested 32 candidate miRNAs, and two of them were confirmed to be true miRNAs.
Keywords:miRNA  prediction  machine learning  ensemble learning
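
The read-support check mentioned in the abstract (counting candidate precursors that have a perfect match with at least one small RNA read) can be sketched as an exact substring lookup. This is a minimal illustration under stated assumptions: the function name, candidate names and sequences are hypothetical, and the abstract does not describe the alignment tool or parameters the authors used.

```python
def precursors_with_perfect_match(precursors, reads):
    """Return names of precursors containing at least one read as an exact substring.

    precursors: dict mapping a (hypothetical) candidate name to its RNA sequence.
    reads: iterable of small RNA read sequences.
    """
    read_set = set(reads)
    read_lengths = sorted({len(r) for r in read_set})
    supported = []
    for name, seq in precursors.items():
        # Slide a window of each observed read length along the precursor.
        hit = any(
            seq[i:i + k] in read_set
            for k in read_lengths
            for i in range(len(seq) - k + 1)
        )
        if hit:
            supported.append(name)
    return supported


# Invented toy sequences, not data from the paper.
read_a = "UGAGGUAGUAGGUUGUAUAGUU"
read_b = "ACGUACGUACGUACGUACGUAA"
precursors = {
    "cand_1": "CC" + read_a + "GGAUCCAAGG",                     # exact match to read_a
    "cand_2": "GGGCCCAAAUUUGGGCCCAAAUUUGGGCCCAAAUUU",           # no matching read
    "cand_3": "CC" + "UGAGGUAGUAGCUUGUAUAGUU" + "GGAUCCAAGG",  # one mismatch to read_a
}

supported = precursors_with_perfect_match(precursors, [read_a, read_b])
print(supported)
```

Only `cand_1` is reported here, because a single mismatch (as in `cand_3`) disqualifies a candidate under a perfect-match criterion. A real pipeline would normally use a dedicated short-read aligner rather than this brute-force scan.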