
1.

Impact of Library Preparation on Downstream Analysis and Interpretation of RNA-Seq Data: Comparison between Illumina PolyA and NuGEN Ovation Protocol

Zhifu Sun Yan W. Asmann Asha Nair Yuji Zhang Liguo Wang Krishna R. Kalari Aditya V. Bhagwate Tiffany R. Baker Jennifer M. Carr Jean-Pierre A. Kocher Edith A. Perez E. Aubrey Thompson 《PloS one》2013,8(8)

Objectives

PolyA selection is the most common approach to library preparation for RNA sequencing. For limited or degraded RNA, alternative protocols such as NuGEN have been developed. However, it is not yet clear how the different library preparations affect downstream analyses across the broad applications of RNA sequencing.

Methods and Materials

Eight human mammary epithelial cell (HMEC) lines with high quality RNA were sequenced by Illumina’s mRNA-Seq PolyA selection and NuGEN ENCORE library preparation. The following analyses and comparisons were conducted: 1) the numbers of genes captured by each protocol; 2) the impact of protocols on differentially expressed gene detection between biological replicates; 3) expressed single nucleotide variant (SNV) detection; 4) non-coding RNAs, particularly lincRNA detection; and 5) intragenic gene expression.

Results

Sequences from the NuGEN protocol had a lower alignment rate (75%) than those from the PolyA protocol (over 90%). The NuGEN protocol detected fewer genes (12–20% fewer), with a significant portion of reads mapped to non-coding regions. A large number of genes were differentially detected between the two protocols. About 17–20% of the differentially expressed genes between biological replicates were commonly detected between the two protocols. Significantly higher numbers of SNVs (5–6 times) were detected in the NuGEN samples, which were largely from intragenic and intergenic regions. The NuGEN protocol captured fewer exons (25% fewer) and had higher base-level coverage variance. While 6.3% of reads were mapped to intragenic regions in the PolyA samples, the percentages were much higher (20–25%) for the NuGEN samples. The NuGEN protocol did not detect more known non-coding RNAs such as lincRNAs, but targeted small and “novel” lincRNAs.

Conclusion

Different library preparations can have significant impacts on downstream analysis and interpretation of RNA-seq data. The NuGEN provides an alternative for limited or degraded RNA but it has limitations for some RNA-seq applications.

2.

A simple strand-specific RNA-Seq library preparation protocol combining the Illumina TruSeq RNA and the dUTP methods

Sultan M Dökel S Amstislavskiy V Wuttig D Sültmann H Lehrach H Yaspo ML 《Biochemical and biophysical research communications》2012,422(4):643-646


3.

Evaluating bias-reducing protocols for RNA sequencing library preparation

Thomas J Jackson Ruth V Spriggs Nicholas J Burgoyne Carolyn Jones Anne E Willis 《BMC genomics》2014,15(1)

Background

Next-generation sequencing does not yield fully unbiased estimates for read abundance, which may impact on the conclusions that can be drawn from sequencing data. The ligation step in RNA sequencing library generation is a known source of bias, motivating developments in enzyme technology and library construction protocols. We present the first comparison of the standard duplex adaptor protocol supplied by Life Technologies for use on the Ion Torrent PGM with an alternate single adaptor approach involving CircLigase (CircLig protocol). A correlation between over-representation in sequenced libraries and degree of secondary structure has been reported previously; we therefore also investigated whether bias could be reduced by ligation with an enzyme that functions at a temperature not permissive for such structure.

Results

A pool of small RNA fragments of known composition was converted into a sequencing library using one of three protocols and sequenced on an Ion Torrent PGM. The CircLig protocol resulted in less over-representation of specific sequences than the standard protocol. Over-represented sequences are more likely to be predicted to have secondary structure and to co-fold with adaptor sequences. However, use of the thermostable ligase Methanobacterium thermoautotrophicum RNA ligase K97A (Mth K97A) was not sufficient to reduce bias.

Conclusions

The single adaptor CircLigase-based approach significantly reduces, but does not eliminate, bias in Ion Torrent data. Ligases that function at temperatures to remove the possible influence of secondary structure on library generation may be of value, although Mth K97A is not effective in this case.

Electronic supplementary material

The online version of this article (doi:10.1186/1471-2164-15-569) contains supplementary material, which is available to authorized users.

4.

Illumina mate-paired DNA sequencing-library preparation using Cre-Lox recombination

Van Nieuwerburgh F Thompson RC Ledesma J Deforce D Gaasterland T Ordoukhanian P Head SR 《Nucleic acids research》2012,40(3):e24

Standard Illumina mate-paired libraries are constructed from 3- to 5-kb DNA fragments by a blunt-end circularization. Sequencing reads that pass through the junction of the two joined ends of a 3- to 5-kb DNA fragment are not easy to identify and pose problems during mapping and de novo assembly. Longer read lengths increase the possibility that a read will cross the junction. To solve this problem, we developed a mate-paired protocol for use with Illumina sequencing technology that uses Cre-Lox recombination instead of blunt-end circularization. In this method, a LoxP sequence is incorporated at the junction site. This sequence allows screening reads for junctions without using a reference genome. Junction reads can be trimmed or split at the junction. Moreover, the location of the LoxP sequence in the reads distinguishes mate-paired reads from spurious paired-end reads. We tested this new method by preparing and sequencing a mate-paired library with an insert size of 3 kb from Saccharomyces cerevisiae. We present an analysis of the library quality statistics and a new bio-informatics tool called DeLoxer that can be used to analyze an Illumina Cre-Lox mate-paired data set. We also demonstrate how the resulting data significantly improve a de novo assembly of the S. cerevisiae genome.
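The idea of screening reads for a known junction sequence and splitting them there can be illustrated with a minimal Python sketch. This is not DeLoxer's implementation: the function name `split_at_junction` is hypothetical, matching is exact (a real tool would presumably tolerate sequencing errors and partial junctions at read ends), and the `LOXP` constant is the wild-type loxP site as commonly written, which may differ in orientation or flanking bases from the junction sequence the protocol actually produces.

```python
from typing import Optional, Tuple

# Wild-type loxP site as commonly written (assumption: the junction
# sequence produced by the actual protocol may differ).
LOXP = "ATAACTTCGTATAATGTATGCTATACGAAGTTAT"


def split_at_junction(read: str, junction: str = LOXP) -> Optional[Tuple[str, str]]:
    """Split a read at the first exact occurrence of the junction sequence.

    Returns (left, right) with the junction itself removed, or None when
    the read does not contain the junction.
    """
    idx = read.find(junction)
    if idx == -1:
        return None  # no junction found: read does not cross the join
    left = read[:idx]
    right = read[idx + len(junction):]
    return left, right


if __name__ == "__main__":
    read = "ACGT" + LOXP + "TTGA"
    print(split_at_junction(read))
```

Because the junction is recognized by sequence alone, no reference genome is needed for this step, which is the property the abstract highlights.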

5.

De novo transcriptome sequencing in Anopheles funestus using Illumina RNA-seq technology

Crawford JE Guelbeogo WM Sanou A Traoré A Vernick KD Sagnon N Lazzaro BP 《PloS one》2010,5(12):e14202


6.

An improved method for sequencing double stranded plasmid DNA from minipreps using DMSO and modified template preparation. (Total citations: 11; self-citations: 0; citations by others: 11)


D Seto 《Nucleic acids research》1990,18(19):5905-5906


7.

leeHom: adaptor trimming and merging for Illumina sequencing reads

Gabriel Renaud Udo Stenzel Janet Kelso 《Nucleic acids research》2014,42(18):e141

The sequencing of libraries containing molecules shorter than the read length, such as in ancient or forensic applications, may result in the production of reads that include the adaptor, and in paired reads that overlap one another. Challenges for the processing of such reads are the accurate identification of the adaptor sequence and accurate reconstruction of the original sequence most likely to have given rise to the observed read(s). We introduce an algorithm that removes the adaptors and reconstructs the original DNA sequences using a Bayesian maximum a posteriori probability approach. Our algorithm is faster, and provides a more accurate reconstruction of the original sequence for both simulated and ancient DNA data sets, than other approaches. leeHom is released under the GPLv3 and is freely available from: https://bioinf.eva.mpg.de/leehom/
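To make the adaptor problem concrete, here is a deliberately naive Python sketch of 3' adaptor trimming. It is not leeHom's Bayesian maximum a posteriori method: it uses exact string matching only, ignores base qualities, and does not merge paired reads. The function name `trim_adaptor`, the `min_overlap` parameter, and the example adaptor string are illustrative assumptions.

```python
def trim_adaptor(read: str, adaptor: str, min_overlap: int = 3) -> str:
    """Remove a 3' adaptor from a read using exact matching.

    Two cases are handled:
    1. the full adaptor occurs inside the read: cut at its start;
    2. the read ends with a prefix of the adaptor of at least
       `min_overlap` bases: cut that prefix off.
    """
    idx = read.find(adaptor)
    if idx != -1:
        return read[:idx]

    # Try the longest possible partial adaptor first.
    longest = min(len(read), len(adaptor) - 1)
    for k in range(longest, min_overlap - 1, -1):
        if read.endswith(adaptor[:k]):
            return read[:-k]
    return read  # no adaptor detected


if __name__ == "__main__":
    adaptor = "AGATCGGAAGAGC"  # illustrative adaptor string
    print(trim_adaptor("ACGTACGTAGATCGGAAGAGCTTT", adaptor))
    print(trim_adaptor("ACGTACGTAGATC", adaptor))
```

A probabilistic approach such as the one the abstract describes is better suited to short or damaged molecules, where exact matching misses adaptors that contain sequencing errors.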

8.

Insight into biases and sequencing errors for amplicon sequencing with the Illumina MiSeq platform

Melanie Schirmer Umer Z. Ijaz Rosalinda D'Amore Neil Hall William T. Sloan Christopher Quince 《Nucleic acids research》2015,43(6):e37

With read lengths of currently up to 2 × 300 bp, high throughput and low sequencing costs, Illumina's MiSeq is becoming one of the most utilized sequencing platforms worldwide. The platform is manageable and affordable even for smaller labs. This enables quick turnaround on a broad range of applications such as targeted gene sequencing, metagenomics, small genome sequencing and clinical molecular diagnostics. However, Illumina error profiles are still poorly understood and programs are therefore not designed for the idiosyncrasies of Illumina data. A better knowledge of the error patterns is essential for sequence analysis and vital if we are to draw valid conclusions. Studying true genetic variation in a population sample is fundamental for understanding diseases, evolution and origin. We conducted a large study on the error patterns for the MiSeq based on 16S rRNA amplicon sequencing data. We tested state-of-the-art library preparation methods for amplicon sequencing and showed that the library preparation method and the choice of primers are the most significant sources of bias and cause distinct error patterns. Furthermore, we tested the efficiency of various error correction strategies and identified quality trimming (Sickle) combined with error correction (BayesHammer) followed by read overlapping (PANDAseq) as the most successful approach, reducing substitution error rates on average by 93%.
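The quality-trimming step named in the abstract can be illustrated with a simplified sliding-window trimmer in Python. This is a sketch of the general technique, not Sickle's actual algorithm or defaults: it trims only the 3' end, and the function name `sliding_window_trim`, the window size, and the threshold are assumptions chosen for the example.

```python
from typing import List, Tuple


def sliding_window_trim(
    seq: str, quals: List[int], window: int = 4, threshold: float = 20.0
) -> Tuple[str, List[int]]:
    """Trim the 3' end of a read where the windowed mean quality drops.

    The read is cut at the start of the first window whose mean Phred
    quality is strictly below `threshold`.
    """
    if len(seq) != len(quals):
        raise ValueError("sequence and quality lists must have equal length")
    n = len(seq)
    if n == 0:
        return seq, quals

    w = min(window, n)  # reads shorter than the window use one window
    for start in range(n - w + 1):
        mean_q = sum(quals[start:start + w]) / w
        if mean_q < threshold:
            return seq[:start], quals[:start]
    return seq, quals


if __name__ == "__main__":
    seq = "ACGTACGTAC"
    quals = [30] * 6 + [10] * 4
    print(sliding_window_trim(seq, quals))
```

Cutting at the window start is a conservative choice: it may discard one or two acceptable bases just before the quality drop in exchange for a simpler rule.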

9.

Elucidating carbon uptake from vinyl chloride using stable isotope probing and Illumina sequencing

Fernanda Paes Xikun Liu Timothy E. Mattes Alison M. Cupples 《Applied microbiology and biotechnology》2015,99(18):7735-7743


10.

Transcriptome profiling of testis during sexual maturation stages in Eriocheir sinensis using Illumina sequencing (Total citations: 1; self-citations: 0; citations by others: 1)

He L Wang Q Jin X Jin X Wang Y Chen L Liu L Wang Y 《PloS one》2012,7(3):e33735


11.

High-performance single-chip exon capture allows accurate whole exome sequencing using the Illumina Genome Analyzer

Jiang T Yang L Jiang H Tian G Zhang X 《中国科学：生命科学英文版》2011,54(10):945-952

Here we present an adaptation of NimbleGen 2.1M-probe array sequence capture for whole exome sequencing using the Illumina Genome Analyzer (GA) platform. The protocol involves two-stage library construction. The specificity of exome enrichment was approximately 80% with 95.6% even coverage of the 34 Mb target region at an average sequencing depth of 33-fold. Comparison of our results with whole genome shot-gun resequencing results showed that the exome SNP calls gave only 0.97% false positive and 6.27% false negative variants. Our protocol is also well suited for use with whole genome amplified DNA. The results presented here indicate that there is a promising future for large-scale population genomics and medical studies using a whole exome sequencing approach.

12.

Transcriptome sequencing of a highly salt tolerant mangrove species Sonneratia alba using Illumina platform

Chen S Zhou R Huang Y Zhang M Yang G Zhong C Shi S 《Marine Genomics》2011,4(2):129-136


13.

Small RNA library preparation for next-generation sequencing by single ligation, extension and circularization technology

Kwon YS 《Biotechnology letters》2011,33(8):1633-1641

The discovery of novel small RNA classes and species has accelerated since the implementation of high-throughput sequencing technologies for the identification of small RNAs. However, as the sequence coverage increases in a cell, the expectation of finding novel small RNAs from a batch of sequencing gradually decreases. To improve the finding of novel small RNAs, an alternative small RNA library preparation method, the single ligation, extension and circularization (SLEC) method, has been developed which is adequate for high throughput sequencing. The procedure is faster and simpler than the more widely used procedures, and the constructed libraries are compatible with high-level multiplex analysis. The analysis of human small RNA libraries prepared by the SLEC method reported known small RNAs and novel small RNAs including 25 mirtron candidates. This study demonstrates that the method is effective in identifying known and novel small RNAs.

14.

MT-Toolbox: improved amplicon sequencing using molecule tags

Scott M Yourstone Derek S Lundberg Jeffery L Dangl Corbin D Jones 《BMC bioinformatics》2014,15(1)

Background

Short oligonucleotides can be used as markers to tag and track DNA sequences. For example, barcoding techniques (i.e. Multiplex Identifiers or Indexing) use short oligonucleotides to distinguish between reads from different DNA samples pooled for high-throughput sequencing. A similar technique called molecule tagging uses the same principles but is applied to individual DNA template molecules. Each template molecule is tagged with a unique oligonucleotide prior to polymerase chain reaction. The resulting amplicon sequences can be traced back to their original templates by their oligonucleotide tag. Consensus building from sequences sharing the same tag enables inference of original template molecules, thereby reducing effects of sequencing error and polymerase chain reaction bias. Several independent groups have developed similar protocols for molecule tagging; however, user-friendly software for building consensus sequences from molecule tagged reads is not readily available or is highly specific for a particular protocol.
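Consensus building from reads that share a molecule tag can be sketched in a few lines of Python. This is an illustration of the general idea only, not MT-Toolbox's algorithm: it assumes the tag has already been extracted from each read, that reads with the same tag are aligned position by position, and it uses a simple per-position majority vote with no quality weighting. The function name `consensus_by_tag` is hypothetical.

```python
from collections import Counter, defaultdict
from typing import Dict, Iterable, List, Tuple


def consensus_by_tag(tagged_reads: Iterable[Tuple[str, str]]) -> Dict[str, str]:
    """Build one consensus sequence per molecule tag by majority vote.

    `tagged_reads` yields (tag, sequence) pairs. Reads sharing a tag are
    compared column by column, truncated to the shortest read in the
    group; ties go to the base seen first.
    """
    groups: Dict[str, List[str]] = defaultdict(list)
    for tag, seq in tagged_reads:
        groups[tag].append(seq)

    consensus: Dict[str, str] = {}
    for tag, seqs in groups.items():
        length = min(len(s) for s in seqs)
        consensus[tag] = "".join(
            Counter(s[i] for s in seqs).most_common(1)[0][0]
            for i in range(length)
        )
    return consensus


if __name__ == "__main__":
    reads = [
        ("AAC", "ACGT"),
        ("AAC", "ACGA"),  # one read carries an error in the last base
        ("AAC", "ACGT"),
        ("GGT", "TTTT"),
    ]
    print(consensus_by_tag(reads))
```

An error that appears in only a minority of the reads for a given tag is voted out, which is how tagging reduces the effect of sequencing error and PCR bias on the inferred template.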

Results

MT-Toolbox recognizes oligonucleotide tags in amplicons and infers the correct template sequence. On a set of molecule tagged test reads, MT-Toolbox generates sequences having on average 0.00047 errors per base. MT-Toolbox includes a graphical user interface, command line interface, and options for speed and accuracy maximization. It can be run in serial on a standard personal computer or in parallel on a Load Sharing Facility based cluster system. An optional plugin provides features for common 16S metagenome profiling analysis such as chimera filtering, building operational taxonomic units, contaminant removal, and taxonomy assignments.

Conclusions

MT-Toolbox provides an accessible, user-friendly environment for analysis of molecule tagged reads thereby reducing technical errors and polymerase chain reaction bias. These improvements reduce noise and allow for greater precision in single amplicon sequencing experiments.

Electronic supplementary material

The online version of this article (doi:10.1186/1471-2105-15-284) contains supplementary material, which is available to authorized users.

15.

Profile of bacterial communities in South African mine-water samples using Illumina next-generation sequencing platform (Total citations: 1; self-citations: 0; citations by others: 1)

Jitendra Keshri Boitumelo B. J. Mankazana Maggy N. B. Momba 《Applied microbiology and biotechnology》2015,99(7):3233-3242


16.

De novo transcriptome assembly of Ipomoea nil using Illumina sequencing for gene discovery and SSR marker identification

Changhe Wei Xiang Tao Ming Li Bin He Lang Yan Xuemei Tan Yizheng Zhang 《Molecular genetics and genomics : MGG》2015,290(5):1873-1884


17.

Transcriptome Sequencing and de novo Analysis for Oviductus Ranae of Rana chensinensis Using Illumina RNA-Seq Technology

Mei Zhang Yuntong Li Baojin Yao Minying Sun Zhiwu Wang Yu Zhao 《遗传学报》2013,40(3):137-140

Oviductus Ranae is the dried oviduct of female Rana temporaria chensinensis (David), distributed mainly in North-eastern China. Oviductus Ranae is one of the best-known and highly valued oriental foods and medicines. Traditional Chinese medicine holds that Oviductus Ranae can nourish yin, moisten lung and replenish the kidney essence. Meanwhile, activities of Oviductus Ranae such as anti-aging, anti-lipemic, anti-oxidation

18.

Sonication-based isolation and enrichment of Chlorella protothecoides chloroplasts for Illumina genome sequencing

Angelina Angelova Sang-Hycuk Park John Kyndt Kevin Fitzsimmons Judith K. Brown 《Journal of applied phycology》2014,26(1):209-218

With the increasing world demand for biofuel, a number of oleaginous algal species are being considered as renewable sources of oil. Chlorella protothecoides Krüger synthesizes triacylglycerols (TAGs) as storage compounds that can be converted into renewable fuel utilizing an anabolic pathway that is poorly understood. The paucity of algal chloroplast genome sequences has been an important constraint to chloroplast transformation and for studying gene expression in TAGs pathways. In this study, the intact chloroplasts were released from algal cells using sonication followed by sucrose gradient centrifugation, resulting in a 2.36-fold enrichment of chloroplasts from C. protothecoides, based on qPCR analysis. The C. protothecoides chloroplast genome (cpDNA) was determined using the Illumina HiSeq 2000 sequencing platform and found to be 84,576 Kb (8.57 Kb) in size, with a GC content of 30.8 %. This is the first report of an optimized protocol that uses a sonication step, followed by sucrose gradient centrifugation, to release and enrich intact chloroplasts from a microalga (C. protothecoides) of sufficient quality to permit chloroplast genome sequencing with high coverage, while minimizing nuclear genome contamination. The approach is expected to guide chloroplast isolation from other oleaginous algal species for a variety of uses that benefit from enrichment of chloroplasts, ranging from biochemical analysis to genomics studies.

19.

Assessing the bacterial diversity and functional profiles of the River Yamuna using Illumina MiSeq sequencing

Sodhi Kushneet Kaur Kumar Mohit Singh Dileep Kumar 《Archives of microbiology》2021,203(1):367-375

A small percentage of the total freshwater on Earth is represented by river water. Microbes have an essential role to play in the biogeochemical cycles, mineralization of...

20.

De novo sequencing of hazelnut bacterial artificial chromosomes (BACs) using multiplex Illumina sequencing and targeted marker development for eastern filbert blight resistance

Vidyasagar R. Sathuvalli Shawn A. Mehlenbacher 《Tree Genetics & Genomes》2013,9(4):1109-1118

Bacterial artificial chromosome (BAC) libraries are widely used in map-based cloning of plant genes. Eastern filbert blight (EFB), caused by the pyrenomycete Anisogramma anomala (Peck) E. Müller, is a devastating disease of European hazelnut (Corylus avellana L.) in the Pacific Northwest. A dominant allele at a single locus from the obsolete pollenizer “Gasaway” confers complete resistance. Our map-based cloning efforts use a BAC library for “Jefferson” hazelnut, which is heterozygous for resistance. Screening the library with primer pairs designed from RAPD markers closely linked to the EFB resistance locus identified 38 BACs. We sequenced 28 of these BACs using Illumina technology, by multiplexing with barcoded adapters. De novo sequence assembly using the programs Velvet and SOPRA and further alignment using CodonCode Aligner generated contigs whose length ranged from 393 to 108,194 bp. The number of contigs per BAC ranged from 1 to 19, and estimated coverage of assembled BACs ranged from 64 % to 100 %. Preliminary analysis of the sequences identified 779 simple sequence repeats (SSRs), from which we developed 23 markers. Of these, 17 were assigned to linkage group 6 adjacent to the disease resistance locus, five were placed on other linkage groups, and one could not be assigned to a linkage group. The BAC sequences and new SSR markers will be useful for our efforts at map-based cloning of the disease resistance gene.
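Identifying simple sequence repeats in assembled contigs can be sketched with a regular expression in Python. The abstract does not say which SSR-finding tool or thresholds the authors used, so the function name `find_ssrs` and the parameters below (unit length of 2 to 6 bases, at least 4 tandem copies) are assumptions chosen for illustration.

```python
import re
from typing import List, Tuple


def find_ssrs(
    seq: str, min_unit: int = 2, max_unit: int = 6, min_repeats: int = 4
) -> List[Tuple[int, str, int]]:
    """Find perfect tandem repeats (SSRs) in an uppercase DNA sequence.

    Returns a list of (start, unit, copies) tuples. The unit is matched
    lazily so the shortest repeating unit is reported (AG rather than
    AGAG). Note that a homopolymer run such as AAAAAAAA is reported
    with a two-base unit (AA) because `min_unit` defaults to 2.
    """
    pattern = re.compile(
        r"([ACGT]{%d,%d}?)\1{%d,}" % (min_unit, max_unit, min_repeats - 1)
    )
    hits = []
    for m in pattern.finditer(seq):
        unit = m.group(1)
        copies = len(m.group(0)) // len(unit)
        hits.append((m.start(), unit, copies))
    return hits


if __name__ == "__main__":
    contig = "TTC" + "AG" * 5 + "CCT"
    print(find_ssrs(contig))
```

Each hit gives a candidate locus; turning candidates into markers, as the abstract describes, additionally requires designing flanking primers and testing them for polymorphism.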