BatAlign: an incremental method for accurate alignment of sequencing reads |
| |
Authors: | Jing-Quan Lim, Chandana Tennakoon, Peiyong Guan, Wing-Kin Sung |
| |
Affiliation: | 1. Department of Computer Science, National University of Singapore, Singapore 117417; 2. Laboratory of Cancer Epigenome, Division of Medical Sciences, National Cancer Centre Singapore, Singapore 169610; 3. NUS Graduate School for Integrative Sciences and Engineering, (CeLS), #05-01, 28 Medical Drive, Singapore 117456; 4. Department of Computational and Systems Biology, Genome Institute of Singapore, Singapore 138672; 5. UAE University, PO Box 17551, Al Ain, UAE |
| |
Abstract: | Structural variations (SVs) play a crucial role in genetic diversity. However, alignments of reads near or across SVs are made inaccurate by the presence of polymorphisms. BatAlign is an algorithm that integrates two strategies, called ‘Reverse-Alignment’ and ‘Deep-Scan’, to improve the accuracy of read alignment. In our experiments, BatAlign obtained the highest F-measures for read alignment on mismatch-aberrant, indel-aberrant, concordantly/discordantly paired and SV-spanning data sets. On real data, the alignments of BatAlign recovered 4.3% more PCR-validated SVs with 73.3% fewer calls. These results suggest that BatAlign is effective in accurately detecting SVs and other polymorphic variants from high-throughput data. BatAlign is publicly available at https://goo.gl/a6phxB. |
| |
Keywords: | |