Sequencing of high G+C microbial genomes using the ultrafast pyrosequencing technology
Authors: Patrick Schwientek, Rafael Szczepanowski, Christian Rückert, Jens Stoye, Alfred Pühler
Institution:a Senior Research Group in Genome Research of Industrial Microorganisms, Center for Biotechnology, Bielefeld University, Universitätsstraße 27, 33615 Bielefeld, Germany
b Institute for Genome Research and Systems Biology, Center for Biotechnology, Bielefeld University, Universitätsstraße 27, 33615 Bielefeld, Germany
c Genome Informatics Research Group, Faculty of Technology, Bielefeld University, Universitätsstraße 25, 33615 Bielefeld, Germany
Abstract: Next-generation pyrosequencing of high G + C content genomes still poses problems for automated sequencing and assembly processes, necessitating cost- and time-intensive manual work to finish such genomes completely. The high G + C actinomycete Actinoplanes sp. SE50/110 was sequenced with standard pyrosequencing technology (454 Life Sciences), which revealed a high number of gaps. The reasons for these gaps were analyzed using a previously known 41 kb DNA reference sequence from Actinoplanes sp. SE50/110 that hosts the acarbose biosynthesis gene cluster. Mapping the sequencing results onto the reference gene cluster sequence revealed a fragmentation into 30 contiguous sequences of different lengths. The gaps between these sequences were characterized by extremely low read coverage, which correlated strongly and negatively with the G + C content of the gap regions. Furthermore, the gap sequences contained strong stem-loop structures that hindered their amplification during emulsion PCR. Because these sequences are significantly underrepresented or absent in the subsequent sequencing process, they lead to weakly covered or uncovered genomic regions, which force the assembly algorithm to output multiple contiguous sequences instead of one finished genome. However, by applying a different pyrosequencing protocol, it was possible to sequence the complete acarbose biosynthesis gene cluster. The changes to the protocol include longer read lengths and the addition of chemicals to the amplification chemistry, which reduces the self-annealing of DNA fragments during amplification and enables the complete reconstruction of high G + C content genomes without manual intervention.
Keywords: High G + C content; Sequencing bias; Secondary structures; High-throughput sequencing
This article is indexed in ScienceDirect, PubMed, and other databases.
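The kind of analysis summarized in the abstract (relating read coverage to G + C content and locating the uncovered regions that split an assembly) can be illustrated with a minimal Python sketch. This is not the authors' pipeline: the function names, the window size, and the coverage threshold are all hypothetical, and the per-base coverage list is assumed to come from some prior read-mapping step.

```python
from math import sqrt


def gc_fraction(seq):
    """Fraction of G and C bases in a sequence (case-insensitive)."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return (seq.count("G") + seq.count("C")) / len(seq)


def window_stats(reference, coverage, window=100):
    """Return (G+C fraction, mean coverage) for non-overlapping windows.

    `coverage` is an assumed per-base read depth list aligned to
    `reference`; a trailing partial window is ignored.
    """
    stats = []
    for start in range(0, len(reference) - window + 1, window):
        end = start + window
        gc = gc_fraction(reference[start:end])
        mean_cov = sum(coverage[start:end]) / window
        stats.append((gc, mean_cov))
    return stats


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sd_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    sd_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    if sd_x == 0 or sd_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / (sd_x * sd_y)


def covered_segments(coverage, min_cov=1):
    """Half-open (start, end) intervals with coverage >= min_cov.

    The stretches between consecutive intervals are the gaps that
    would break an assembly into separate contiguous sequences.
    """
    segments = []
    start = None
    for i, depth in enumerate(coverage):
        if depth >= min_cov:
            if start is None:
                start = i
        elif start is not None:
            segments.append((start, i))
            start = None
    if start is not None:
        segments.append((start, len(coverage)))
    return segments
```

On real data, `window_stats` would be run over the 41 kb reference and the two columns passed to `pearson`; a clearly negative coefficient would match the negative coverage versus G + C relationship reported in the abstract, while `covered_segments` gives a rough count of the fragments separated by uncovered regions.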