Next Generation Sequencing (NGS) is a disruptive technology that has found widespread acceptance in the life sciences research community. The high throughput and low cost of sequencing have encouraged researchers to undertake ambitious genomic projects, especially in de novo genome sequencing. Currently, NGS systems generate sequence data as short reads, and de novo genome assembly using these short reads is computationally very intensive. Due to the lower cost of sequencing and higher throughput, NGS systems now provide the ability to sequence genomes at high depth. However, no report is currently available highlighting the impact of high sequence depth on genome assembly using real data sets and multiple assembly algorithms. Recently, some studies have evaluated the impact of sequence coverage, error rate and average read length on genome assembly using multiple assembly algorithms; however, these evaluations were performed using simulated datasets. One limitation of using simulated datasets is that variables such as error rates, read length and coverage, which are known to impact genome assembly, are carefully controlled. Hence, this study was undertaken to identify the minimum depth of sequencing required for de novo assembly of different sized genomes using graph based assembly algorithms and real datasets. Illumina reads for E. coli (4.6 MB), S. kudriavzevii (11.18 MB) and C. elegans (100 MB) were assembled using SOAPdenovo, Velvet, ABySS, Meraculous and IDBA-UD. Our analysis shows that 50X is the optimum read depth for assembling these genomes using all assemblers except Meraculous, which requires 100X read depth. Moreover, our analysis shows that de novo assembly from 50X read data requires only 6–40 GB RAM, depending on the genome size and assembly algorithm used. We believe that this information can be extremely valuable for researchers in designing experiments and multiplexing, which will enable optimum utilization of sequencing as well as analysis resources.
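To make the depth figures in this abstract concrete, here is a minimal sketch of the standard relation depth = (number of reads × read length) / genome size, solved for the number of reads. The genome sizes come from the abstract, reading its "MB" as megabases. The 100 bp read length and the function name `reads_for_depth` are assumptions for illustration only and are not stated in the abstract.

```python
import math


def reads_for_depth(genome_size_bp: int, depth: float, read_length_bp: int) -> int:
    """Return the number of reads needed to reach a target average depth.

    Uses depth = reads * read_length / genome_size, rounded up to a whole read.
    """
    return math.ceil(depth * genome_size_bp / read_length_bp)


# Genome sizes as quoted in the abstract (interpreted as megabases).
GENOMES_BP = {
    "E. coli": 4_600_000,
    "S. kudriavzevii": 11_180_000,
    "C. elegans": 100_000_000,
}

READ_LENGTH_BP = 100  # assumption: the abstract does not give a read length
TARGET_DEPTH = 50     # the 50X depth reported as optimal for most assemblers

if __name__ == "__main__":
    for name, size_bp in GENOMES_BP.items():
        n_reads = reads_for_depth(size_bp, TARGET_DEPTH, READ_LENGTH_BP)
        print(f"{name}: about {n_reads:,} reads for {TARGET_DEPTH}X")
```

Under these assumptions the same function can be used with a depth of 100 to estimate the read count for the 100X that the abstract reports for Meraculous.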

9.

De novo assembly of bacterial transcriptomes from RNA-seq data

Brian Tjaden 《Genome biology》2015,16(1)

10.

Comparing de novo and reference-based transcriptome assembly strategies by applying them to the blood-sucking bug Rhodnius prolixus

《Insect biochemistry and molecular biology》2016

11.

Comparison of De Novo Transcriptome Assemblers and k-mer Strategies Using the Killifish, Fundulus heteroclitus

Satshil B. Rana Frank J. Zadlock IV Ziping Zhang Wyatt R. Murphy Carolyn S. Bentivegna 《PloS one》2016,11(4)

12.

Evaluating de Bruijn Graph Assemblers on 454 Transcriptomic Data

Xianwen Ren Tao Liu Jie Dong Lilian Sun Jian Yang Yafang Zhu Qi Jin 《PloS one》2012,7(12)

13.

Predicting the functional repertoire of an organism from unassembled RNA–seq data

Manuel Landesfeind Peter Meinicke 《BMC genomics》2014,15(1)

14.

Optimization of De Novo Short Read Assembly of Seabuckthorn (Hippophae rhamnoides L.) Transcriptome

Rajesh Ghangal Saurabh Chaudhary Mukesh Jain Ram Singh Purty Prakash Chand Sharma 《PloS one》2013,8(8)

15.

Identifying wrong assemblies in de novo short read primary sequence assembly contigs

Vandna Chawla Rajnish Kumar Ravi Shankar 《Journal of biosciences》2016,41(3):455-474

With the advent of short-read-based genome sequencing approaches, a large number of organisms are being sequenced all over the world. Most of these assemblies are produced with de novo short read assemblers and related approaches. However, the contigs produced this way are prone to mis-assembly. So far, there is a conspicuous dearth of reliable tools to identify mis-assembled contigs. Mis-assemblies can result from incorrectly deleted or wrongly arranged genomic sequences. In the present work, various factors related to sequence, sequencing and assembly have been assessed for their role in causing mis-assembly, using different genome sequencing data. Some mis-assembly detection tools have also been evaluated for their ability to detect wrongly assembled primary contigs, and the results suggest considerable scope for improvement in this area. The present work also proposes a simple, novel approach based on unsupervised learning to identify mis-assemblies in contigs, which was found to perform reasonably well compared with existing tools for reporting mis-assembled contigs. It was observed that the proposed methodology may work as a complementary system to the existing tools to enhance their accuracy.

16.

Effects of GC Bias in Next-Generation-Sequencing Data on De Novo Genome Assembly

Yen-Chun Chen Tsunglin Liu Chun-Hui Yu Tzen-Yuh Chiang Chi-Chuan Hwang 《PloS one》2013,8(4)

Next-generation sequencing (NGS) has revolutionized the field of genome assembly because of its much higher data throughput and much lower cost compared with traditional Sanger sequencing. However, NGS poses new computational challenges to de novo genome assembly. Among these challenges, GC bias in NGS data is known to complicate genome assembly, but it is not clear to what extent GC bias affects genome assembly in general. In this work, we conduct a systematic analysis of the effects of GC bias on genome assembly. Our analyses reveal that GC bias lowers assembly completeness only when the degree of GC bias is above a threshold. At a strong GC bias, the assembly fragmentation due to GC bias can be explained by the low coverage of reads in the GC-poor or GC-rich regions of a genome. This effect is observed for all the assemblers under study. Increasing the total amount of NGS data thus rescues the assembly fragmentation caused by GC bias. However, the amount of data needed for a full rescue depends on the distribution of GC contents. Both low and high coverage depths due to GC bias lower the accuracy of assembly. These pieces of information provide guidance toward a better de novo genome assembly in the presence of GC bias.
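As a small illustration of the quantity this abstract revolves around, the sketch below computes GC content for a sequence and for consecutive fixed-size windows, which is the usual first step before relating read coverage to GC-poor or GC-rich regions. It is a generic sketch, not the method used in the paper: the function names, the non-overlapping window scheme and the toy sequence are all assumptions.

```python
from typing import List


def gc_fraction(seq: str) -> float:
    """Return the fraction of G and C among the A/C/G/T bases of seq.

    Ambiguous bases such as N are ignored; an empty or all-N sequence gives 0.0.
    """
    seq = seq.upper()
    called = sum(seq.count(base) for base in "ACGT")
    if called == 0:
        return 0.0
    return (seq.count("G") + seq.count("C")) / called


def windowed_gc(seq: str, window: int) -> List[float]:
    """Return GC fractions for consecutive non-overlapping windows of seq.

    A trailing partial window shorter than `window` is dropped.
    """
    if window <= 0:
        raise ValueError("window must be positive")
    return [
        gc_fraction(seq[start:start + window])
        for start in range(0, len(seq) - window + 1, window)
    ]


if __name__ == "__main__":
    toy_sequence = "AAAAGGGGATGC"  # hypothetical toy input, not real data
    print(windowed_gc(toy_sequence, 4))
```

Pairing each window's GC fraction with its mean read depth would then show whether coverage falls off at the GC extremes, which is the effect the abstract describes.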

17.

Critical assessment of assembly strategies for non-model species mRNA-Seq data and application of next-generation sequencing to the comparison of C3 and C4 species

Bräutigam A Mullick T Schliesky S Weber AP 《Journal of experimental botany》2011,62(9):3093-3102

18.

Semantic Assembly and Annotation of Draft RNAseq Transcripts without a Reference Genome

Andrey Ptitsyn Ramzi Temanni Christelle Bouchard Peter A. V. Anderson 《PloS one》2015,10(9)

19.

Optimizing de novo common wheat transcriptome assembly using short-read RNA-Seq data

J Duan C Xia G Zhao J Jia X Kong 《BMC genomics》2012,13(1):392

20.

Bridger: a new framework for de novo transcriptome assembly using RNA-seq data

Zheng Chang Guojun Li Juntao Liu Yu Zhang Cody Ashby Deli Liu Carole L Cramer Xiuzhen Huang 《Genome biology》2015,16(1)
