Binpairs: Utilization of Illumina Paired-End Information for Improving Efficiency of Taxonomic Binning of Metagenomic Sequences |
| |
Authors: | Anirban Dutta, Disha Tandon, Mohammed MH, Tungadri Bose, Sharmila S. Mande |
| |
Affiliation: | Bio-Sciences R&D Division, TCS Innovation Labs, Tata Research Development & Design Centre, Tata Consultancy Services Ltd., 54-B Hadapsar Industrial Estate, Pune 411013, Maharashtra, India; Oklahoma State University, United States of America |
| |
Abstract: | Motivation: Paired-end sequencing protocols, offered by next-generation sequencing (NGS) platforms like Illumina, generate a pair of reads for every DNA fragment in a sample. Although this protocol has been utilized in several metagenomics studies, most taxonomic binning approaches classify the two reads of a pair independently. The present work explores some simple but effective strategies for utilizing the pairing information of Illumina short reads to improve the accuracy of taxonomic binning of metagenomic datasets. The proposed strategies can be used in conjunction with all genres of existing binning methods. Results: Validation results suggest that employing these “Binpairs” strategies can provide significant improvements in the binning outcome. The quality of the taxonomic assignments thus obtained is often comparable to that achievable only with the relatively longer reads obtained using other NGS platforms (such as Roche). Availability: An implementation of the proposed strategies for utilizing pairing information is freely available for academic users at https://metagenomics.atc.tcs.com/binning/binpairs. |
| |
Keywords: | |
|
|