DDBJ Read Annotation Pipeline: A Cloud Computing-Based Pipeline for High-Throughput Analysis of Next-Generation Sequencing Data

Authors:	Hideki Nagasaki, Takako Mochizuki, Yuichi Kodama, Satoshi Saruhashi, Shota Morizaki, Hideaki Sugawara, Hajime Ohyanagi, Nori Kurata, Kousaku Okubo, Toshihisa Takagi, Eli Kaminuma, Yasukazu Nakamura

Institution:	1. Center for Information Biology and DNA Data Bank of Japan, National Institute of Genetics, 1111 Yata, Mishima, Shizuoka 411-8510, Japan; 2. Fujisoft Incorporated, 3 Kanda-neribeicho, Chiyoda-ku, Tokyo 101-0022, Japan; 3. Plant Genetics Laboratory, National Institute of Genetics, 1111 Yata, Mishima, Shizuoka 411-8510, Japan; 4. Database Center for Life Science, 2-11-16 Yayoi, Bunkyo, Tokyo 113-0032, Japan

Abstract:	High-performance next-generation sequencing (NGS) technologies are advancing genomics and molecular biological research. However, the immense volume of sequence data demands computational skills and hardware resources that pose a challenge to molecular biologists. The DNA Data Bank of Japan (DDBJ) of the National Institute of Genetics (NIG) has launched a cloud computing-based analytical pipeline, the DDBJ Read Annotation Pipeline (DDBJ Pipeline), for high-throughput annotation of NGS reads. The DDBJ Pipeline offers a user-friendly graphical web interface and processes massive NGS datasets through decentralized processing on NIG supercomputers, currently free of charge. The pipeline consists of two analysis components: basic analysis, covering reference genome mapping and de novo assembly, and subsequent high-level analysis, covering structural and functional annotation. Users can switch smoothly between the two components, which makes it practical to run high-throughput data analysis on a supercomputer through web-based operations. Moreover, public NGS reads in the DDBJ Sequence Read Archive, which is hosted on the same supercomputer, can be imported into the pipeline by entering only an accession number. The pipeline is intended to facilitate research by applying unified analytical workflows to NGS data. The DDBJ Pipeline is accessible at http://p.ddbj.nig.ac.jp/.

Keywords:	next-generation sequencing; sequence read archive; cloud computing; analytical pipeline; genome analysis
