Next-Generation Sequence Assembly: Four Stages of Data Processing and Computational Challenges
Authors: Sara El-Metwally, Taher Hamza, Magdi Zakaria, Mohamed Helmy
Institution: 1. Computer Science Department, Faculty of Computers and Information, Mansoura University, Mansoura, Egypt; 2. Botany Department, Faculty of Agriculture, Al-Azhar University, Cairo, Egypt; 3. Biotechnology Department, Faculty of Agriculture, Al-Azhar University, Cairo, Egypt; Accelrys, United States of America
Abstract: Decoding DNA symbols using next-generation sequencers was a major breakthrough in genomic research. Despite the many advantages of next-generation sequencers, e.g., the high-throughput sequencing rate and relatively low cost of sequencing, assembling the reads they produce remains a major challenge. In this review, we address the basic framework of next-generation genome sequence assemblers, which comprises four basic stages: preprocessing filtering, graph construction, graph simplification, and postprocessing filtering. We discuss these as a framework of four stages for data analysis and processing, and survey a variety of techniques, algorithms, and software tools used during each stage. We also discuss the challenges that current assemblers face in the next-generation environment in order to determine the current state of the art. We recommend a layered architecture approach for constructing a general assembler that can handle the sequences generated by different sequencing platforms.
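The four stages named in the abstract can be illustrated with a toy pipeline. The sketch below is only an illustration under stated assumptions, not the method of any assembler surveyed in the review: it assumes a de Bruijn graph as the graph model, a simple k-mer frequency filter for preprocessing, merging of unbranched paths for simplification, and a minimum contig length for postprocessing. All function names and parameters (`preprocess`, `build_graph`, `simplify`, `postprocess`, `assemble`, `min_count`, `min_len`) are hypothetical.

```python
from collections import Counter, defaultdict


def kmers(read, k):
    """Return all overlapping substrings of length k in a read."""
    return [read[i:i + k] for i in range(len(read) - k + 1)]


def preprocess(reads, k, min_count=2):
    """Stage 1 (assumed filter): drop reads containing a rare k-mer."""
    counts = Counter(km for r in reads for km in kmers(r, k))
    return [r for r in reads
            if len(r) >= k and all(counts[km] >= min_count for km in kmers(r, k))]


def build_graph(reads, k):
    """Stage 2: de Bruijn graph; nodes are (k-1)-mers, edges are k-mers."""
    graph = defaultdict(set)
    for r in reads:
        for km in kmers(r, k):
            graph[km[:-1]].add(km[1:])
    return graph


def simplify(graph):
    """Stage 3: merge unbranched paths into contigs.

    Isolated cycles (every node has in-degree 1 and out-degree 1)
    are ignored in this toy version.
    """
    indeg = Counter(v for succs in graph.values() for v in succs)
    nodes = set(graph) | set(indeg)

    def is_inner(n):
        return indeg[n] == 1 and len(graph.get(n, ())) == 1

    contigs = []
    for n in sorted(nodes):
        if is_inner(n):
            continue  # contigs start only at path ends or branch points
        for succ in sorted(graph.get(n, ())):
            contig = n + succ[-1]
            while is_inner(succ):
                succ = next(iter(graph[succ]))
                contig += succ[-1]
            contigs.append(contig)
    return contigs


def postprocess(contigs, min_len):
    """Stage 4 (assumed filter): discard contigs shorter than min_len."""
    return [c for c in contigs if len(c) >= min_len]


def assemble(reads, k, min_count=2, min_len=5):
    """Run the four stages in order."""
    clean = preprocess(reads, k, min_count)
    return postprocess(simplify(build_graph(clean, k)), min_len)


# Two error-free reads seen twice each, plus one read with a substitution.
reads = ["ATGGCGT", "GGCGTGCA", "ATGGCGT", "GGCGTGCA", "ATGGCTT"]
print(assemble(reads, k=4))
```

In this example the erroneous read is removed in the first stage because one of its 4-mers occurs only once, and the remaining reads collapse into a single unbranched path. Real assemblers use far more elaborate error correction, graph cleaning (tips, bubbles, repeats), and scaffolding than this sketch.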