MOCAT: A Metagenomics Assembly and Gene Prediction Toolkit
Authors: Jens Roat Kultima, Shinichi Sunagawa, Junhua Li, Weineng Chen, Hua Chen, Daniel R Mende, Manimozhiyan Arumugam, Qi Pan, Binghang Liu, Junjie Qin, Jun Wang, Peer Bork
Institution: 1. Structural and Computational Biology Unit, European Molecular Biology Laboratory, Heidelberg, Germany; 2. Department of Science and Technology, BGI-Shenzhen, Shenzhen, Guangdong, China; 3. School of Bioscience and Biotechnology, South China University of Technology, Guangzhou, Guangdong, China; 4. Max-Delbrück-Centre for Molecular Medicine, Berlin-Buch, Germany; Argonne National Laboratory, United States of America
Abstract: MOCAT is a highly configurable, modular pipeline for fast, standardized processing of single or paired-end sequencing data generated by the Illumina platform. The pipeline uses state-of-the-art programs to quality control, map, and assemble reads from metagenomic samples sequenced at a depth of several billion base pairs, and predict protein-coding genes on assembled metagenomes. Mapping against reference databases allows for read extraction or removal, as well as abundance calculations. Relevant statistics for each processing step can be summarized into multi-sheet Excel documents and queryable SQL databases. MOCAT runs on UNIX machines and integrates seamlessly with the SGE and PBS queuing systems, commonly used to process large datasets. The open source code and modular architecture allow users to modify or exchange the programs that are utilized in the various processing steps. Individual processing steps and parameters were benchmarked and tested on artificial, real, and simulated metagenomes, resulting in an improvement of selected quality metrics. MOCAT can be freely downloaded at http://www.bork.embl.de/mocat/.