Identification of indels in next-generation sequencing data
Authors: Aakrosh Ratan, Thomas L Olson, Thomas P Loughran Jr, Webb Miller
Affiliation: Center for Comparative Genomics and Bioinformatics, Pennsylvania State University, 506 Wartik Laboratory, University Park, PA 16802, USA; Department of Medicine, Hematology and Oncology, and the University of Virginia Cancer Center, University of Virginia, Charlottesville, VA 22908, USA; Department of Public Health Sciences and Center for Public Health Genomics, University of Virginia, Charlottesville, VA 22908, USA
Abstract:

Background

The discovery and mapping of genomic variants is an essential step in most analyses that use sequencing reads. A number of mature software packages and associated pipelines can identify single nucleotide polymorphisms (SNPs) with a high degree of concordance. However, the same cannot be said for tools that identify other types of variants. Indels are the second most frequent class of variants in the human genome, after SNPs. Reliable detection of indels remains a challenging problem, especially for variants longer than a few bases.

Results

We have developed a set of algorithms and heuristics, collectively called indelMINER, to identify indels from whole-genome resequencing datasets using paired-end reads. indelMINER uses a split-read approach to identify the precise breakpoints of indels smaller than a user-specified threshold, and supplements it with a paired-end approach to identify larger variants that the split-read approach frequently misses. We use simulated and real datasets to show that an implementation of the algorithm performs favorably when compared with several existing tools.
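
The two kinds of evidence mentioned above can be illustrated with a toy sketch. This is not indelMINER's implementation: the function names, the 1-based coordinate convention, and the z-score cutoff are all assumptions chosen for illustration. A split read gives exact deletion breakpoints from the gap between its two aligned parts, while a read pair whose mapped span is much larger than the library's typical span suggests a larger deletion of approximate size.

```python
from statistics import mean, stdev


def split_read_deletion(left_end, right_start):
    """Infer a deletion from a read aligned in two parts (hypothetical helper).

    left_end:    1-based reference position of the last base matched by the
                 left part of the read.
    right_start: 1-based reference position of the first base matched by the
                 right part of the read.
    Returns (first_deleted, last_deleted, length), or None if there is no gap.
    """
    length = right_start - left_end - 1
    if length <= 0:
        return None
    return (left_end + 1, right_start - 1, length)


def paired_end_deletion_size(observed_span, library_spans, z_cutoff=3.0):
    """Estimate a deletion size from an unusually large mapped pair span.

    library_spans is a sample of spans from normally mapped pairs; z_cutoff
    is an assumed threshold, not a value taken from the paper.
    Returns the approximate deletion size, or None if the span looks normal.
    """
    mu = mean(library_spans)
    sigma = stdev(library_spans)
    if observed_span > mu + z_cutoff * sigma:
        return round(observed_span - mu)
    return None


if __name__ == "__main__":
    # The read matches through position 100 and resumes at 106,
    # so bases 101..105 are missing from the sample.
    print(split_read_deletion(100, 106))
    # The pair maps 800 bp apart in a library with spans near 300 bp.
    print(paired_end_deletion_size(800, [290, 300, 310, 295, 305]))
```

The split-read estimate is exact to the base, whereas the paired-end estimate only bounds the event, which is why the two are complementary.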

Conclusions

indelMINER can be used effectively to identify indels in whole-genome resequencing projects. The output is provided in the VCF format along with additional information about the variant, including information about its presence or absence in another sample. The source code and documentation for indelMINER can be freely downloaded from www.bx.psu.edu/miller_lab/indelMINER.tar.gz.
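
For readers unfamiliar with how a deletion is written in VCF, the following is a minimal sketch of one data line under the general VCF convention that REF and ALT for an indel include the base preceding the event. The function name and inputs are hypothetical, and the additional annotations that indelMINER reports are not reproduced here; the ID, QUAL, FILTER, and INFO columns are left as the missing-value marker `.`.

```python
def vcf_deletion_record(chrom, pos, ref_seq, deleted_len):
    """Format a tab-separated VCF data line for a deletion (illustrative only).

    pos:         1-based position of the base immediately before the
                 deleted bases (the anchor base).
    ref_seq:     reference bases starting at pos, at least
                 deleted_len + 1 characters long.
    deleted_len: number of deleted bases.
    """
    if len(ref_seq) < deleted_len + 1:
        raise ValueError("ref_seq must cover the anchor base and the deletion")
    ref = ref_seq[:deleted_len + 1]  # anchor base plus the deleted bases
    alt = ref_seq[0]                 # anchor base only
    # Columns: CHROM, POS, ID, REF, ALT, QUAL, FILTER, INFO
    fields = [chrom, str(pos), ".", ref, alt, ".", ".", "."]
    return "\t".join(fields)


if __name__ == "__main__":
    # Deleting "CGT" after the anchor base "A" at position 100.
    print(vcf_deletion_record("chr1", 100, "ACGTT", 3))
```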

Electronic supplementary material

The online version of this article (doi:10.1186/s12859-015-0483-6) contains supplementary material, which is available to authorized users.
Keywords: Indels; Variants; Sequencing analysis