NoDe: a fast error-correction algorithm for pyrosequencing amplicon reads
Authors: Mohamed Mysara, Natalie Leys, Jeroen Raes, Pieter Monsieurs
Affiliation: Unit of Microbiology, Belgian Nuclear Research Centre (SCK CEN), Mol, Belgium; Department of Bioscience Engineering, Vrije Universiteit Brussel, Brussels, Belgium; VIB Center for the Biology of Disease, VIB, Leuven, Belgium; Department of Microbiology and Immunology, REGA institute, KU, Leuven, Belgium
Abstract:

Background

The popularity of new sequencing technologies has led to an explosion of possible applications, including new approaches in biodiversity studies. However, each of these sequencing technologies suffers from sequencing errors originating from different factors. For 16S rRNA metagenomics studies, the 454 pyrosequencing technology is one of the most frequently used platforms, but sequencing errors still lead to important data analysis issues (e.g. in clustering into taxonomic units and in biodiversity estimation). Moreover, retaining a higher portion of the sequencing data by preserving as much of the read length as possible, while maintaining the error rate within an acceptable range, will have important consequences at the level of taxonomic precision.

Results

The new error-correction algorithm proposed in this work, NoDe (Noise Detector), is trained to identify those positions in 454 sequencing reads that are likely to contain an error, and subsequently clusters those error-prone reads with correct reads, resulting in an error-free representative read. A benchmarking study against other denoising algorithms shows that NoDe can detect up to 75% more errors in a large-scale mock community dataset, at a low computational cost compared to the second-best algorithm considered in this study. The positive effect of NoDe in 16S rRNA studies was confirmed by its beneficial effect on the precision of clustering pyrosequencing reads into operational taxonomic units.
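
The two-step idea described above (flag error-prone positions, then cluster error-prone reads with correct ones) can be illustrated with a minimal sketch. This is not the NoDe implementation: the abstract does not specify the classifier or the clustering rule, so the function name `merge_error_prone_reads`, the `flagged` input (standing in for the trained classifier's output), and the rule "merge a read into a more abundant read if they differ only at flagged positions" are all assumptions made for illustration. Reads are assumed to be pre-aligned to equal length.

```python
from collections import Counter


def merge_error_prone_reads(reads, flagged):
    """Toy illustration only, not the published NoDe algorithm.

    reads:   list of equal-length read strings (assumed pre-aligned).
    flagged: dict mapping a read to the set of positions a (hypothetical)
             trained classifier marked as likely erroneous.
    Returns a dict mapping each representative read to its total count.
    """
    counts = Counter(reads)
    # Visit unique reads from most to least abundant, so that rare,
    # error-prone reads are absorbed by abundant ones.
    ordered = sorted(counts, key=lambda r: (-counts[r], r))
    representatives = {}
    for read in ordered:
        suspect = flagged.get(read, set())
        target = None
        for rep in representatives:
            if len(rep) != len(read):
                continue
            mismatches = [i for i, (a, b) in enumerate(zip(rep, read)) if a != b]
            # Assumed rule: merge only if every mismatch is at a flagged position.
            if all(i in suspect for i in mismatches):
                target = rep
                break
        if target is None:
            representatives[read] = counts[read]
        else:
            representatives[target] += counts[read]
    return representatives


reads = ["ACGT", "ACGT", "ACGT", "ACGA", "TCGT"]
flagged = {"ACGA": {3}}  # hypothetical classifier output
result = merge_error_prone_reads(reads, flagged)
```

In this toy input, `ACGA` differs from the abundant `ACGT` only at a flagged position and is therefore merged into it, whereas `TCGT` has no flagged positions and is kept as a separate representative.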

Conclusions

NoDe was shown to be a computationally efficient denoising algorithm for pyrosequencing reads, producing the lowest error rates in an extensive benchmarking study against other denoising algorithms.

Electronic supplementary material

The online version of this article (doi:10.1186/s12859-015-0520-5) contains supplementary material, which is available to authorized users.
Keywords: Error correction; Denoising; 16S rRNA amplicon sequencing; 454 pyrosequencing; Metagenomics