An Experimentally Derived Data Set Constructed for Testing Large-Scale DNA Sequence Assembly Algorithms期刊界 All Journals 搜尽天下杂志传播学术成果专业期刊搜索期刊信息化学术搜索

An Experimentally Derived Data Set Constructed for Testing Large-Scale DNA Sequence Assembly Algorithms

Authors:	Donald Seto Ben F. Koop Leroy Hood

Abstract:	A data set consisting of DNA sequences from a large-scale shotgun DNA cloning and sequencing project has been collected and posted for public release. The purpose is to propose a standard genomic DNA sequencing data set by which various algorithms and implementations can be tested. This set of data is divided into two subsets, one containing raw DNA sequence data (1023 clones) and the other consisting of the corresponding partially refined or edited DNA sequence data (820 clones). Suggested criteria or guidelines for this data refinement are presented so that algorithms for preprocessing and screening raw sequences may be developed. Development of such preprocessing, screening, aligning, and assembling algorithms will expedite large-scale DNA sequencing projects so that the complete unambiguous consensus DNA sequences will be made available to the general research community in a quicker manner. Smaller scale routine DNA sequencing projects will also be greatly aided by such computational efforts.

Keywords:
本文献已被 ScienceDirect 等数据库收录！