TWARIT: an extremely rapid and efficient approach for phylogenetic classification of metagenomic sequences
Authors: Reddy Rachamalla Maheedhar; Mohammed Monzoorul Haque; Mande Sharmila S
Affiliation: Bio-sciences R&D Division, TCS Innovation Labs, Tata Research Development & Design Centre, 54-B, Hadapsar Industrial Estate, Pune, 411013, India. maheedhar@atc.tcs.com
Abstract: Phylogenetic assignment of individual sequence reads to their respective taxa, referred to as 'taxonomic binning', constitutes a key step of metagenomic analysis. Existing binning methods are limited either in speed or in the accuracy and specificity of binning. Given these limitations, a method that can bin vast amounts of metagenomic sequence data rapidly, efficiently and at low computational cost could profoundly influence metagenomic analysis in settings where computational resources are scarce. We introduce TWARIT, a hybrid binning algorithm that combines short-read alignment with composition-based signature sorting to achieve rapid binning rates without compromising binning accuracy or specificity. TWARIT is validated with simulated and real-world metagenomes, and the results demonstrate significantly lower overall binning times than those of existing methods. Furthermore, the binning accuracy and specificity of TWARIT are observed to be comparable or superior to those of existing methods. A web server implementing the TWARIT algorithm is available at http://metagenomics.atc.tcs.com/Twarit/
Keywords: BLAST, basic local alignment search tool; HPBA, hit-pair based assignment; SSBA, signature sorting based assignment; BWA, Burrows–Wheeler alignment; NCBI, National Center for Biotechnology Information; LCA, least common ancestor; bp, base pair(s); 256D, 256-dimensional; RC-centroids, reference cluster centroids; SS-score, signature similarity score; TA-flag, taxonomic assignment flag; MCT, most common taxon; FAMeS, fidelity of analysis of metagenomic samples; simHC, high-complexity simulated metagenome; simMC, medium-complexity simulated metagenome; simLC, low-complexity simulated metagenome; MEGAN, metagenome analyzer; SOrt-ITEMS, sequence orthology based approach for improved taxonomic estimation of metagenomic sequences; AMD, acid mine drainage; WGS scaffolds, whole genome shotgun scaffolds; GB, gigabyte(s); RAM, random access memory; min, minute(s)
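
The abstract and keyword list mention composition-based signatures, 256-dimensional (256D) vectors and reference cluster centroids, but do not give the details of the method. The following is a minimal Python sketch of one common way such a composition signature can be built and compared, not TWARIT's actual implementation. It assumes that "256D" refers to tetranucleotide frequencies (4^4 = 256 possible 4-mers) and uses cosine similarity as a stand-in for the SS-score; the function names `signature`, `cosine` and `nearest_centroid` are hypothetical.

```python
from itertools import product
import math

BASES = "ACGT"
# All 4**4 = 256 tetranucleotides in a fixed order (assumed meaning of "256D").
TETRAMERS = ["".join(p) for p in product(BASES, repeat=4)]
INDEX = {kmer: i for i, kmer in enumerate(TETRAMERS)}


def signature(seq):
    """Return a 256-dimensional normalized tetranucleotide frequency vector."""
    counts = [0] * len(TETRAMERS)
    seq = seq.upper()
    for i in range(len(seq) - 3):
        idx = INDEX.get(seq[i:i + 4])
        if idx is not None:  # skip windows containing N or other symbols
            counts[idx] += 1
    total = sum(counts)
    if total == 0:
        return [0.0] * len(TETRAMERS)
    return [c / total for c in counts]


def cosine(u, v):
    """Cosine similarity; a stand-in here, not the paper's SS-score."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def nearest_centroid(read, centroids):
    """Return the name of the centroid whose signature is most similar to the read."""
    sig = signature(read)
    return max(centroids, key=lambda name: cosine(sig, centroids[name]))


# Toy centroids built from artificial sequences, for illustration only.
toy_centroids = {
    "at_rich": signature("ATATATATATAT"),
    "gc_rich": signature("GCGCGCGCGCGC"),
}
best = nearest_centroid("ATATATAT", toy_centroids)
```

In a hybrid scheme of the kind the abstract describes, a composition step like this would complement a short-read alignment step; how TWARIT combines the two is not specified here.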
This article is indexed in ScienceDirect, PubMed and other databases.