Similar articles
 20 similar articles found (search time: 56 ms)
1.
Linear amplification for deep sequencing (LADS) is an amplification method that produces representative libraries for Illumina next-generation sequencing within 2 d. The method relies on attaching two different sequencing adapters to blunt-end repaired and A-tailed DNA fragments, wherein one of the adapters is extended with the sequence for the T7 RNA polymerase promoter. Ligated and size-selected DNA fragments are transcribed in vitro with high RNA yields. Subsequent cDNA synthesis is initiated from a primer complementary to the first adapter, ensuring that the library will only contain full-length fragments with two distinct adapters. Contrary to the severely biased representation of AT- or GC-rich fragments in standard PCR-amplified libraries, the sequence coverage in T7-amplified libraries is indistinguishable from that of nonamplified libraries. Moreover, in contrast to amplification-free methods, LADS can generate sequencing libraries from a few nanograms of DNA, which is essential for all applications in which the starting material is limited.  相似文献   

2.
Standard Illumina mate-paired libraries are constructed from 3- to 5-kb DNA fragments by blunt-end circularization. Sequencing reads that pass through the junction of the two joined ends of a 3- to 5-kb DNA fragment are not easy to identify and pose problems during mapping and de novo assembly. Longer read lengths increase the probability that a read will cross the junction. To solve this problem, we developed a mate-paired protocol for use with Illumina sequencing technology that uses Cre-Lox recombination instead of blunt-end circularization. In this method, a LoxP sequence is incorporated at the junction site. This sequence allows reads to be screened for junctions without using a reference genome. Junction reads can be trimmed or split at the junction. Moreover, the location of the LoxP sequence in the reads distinguishes mate-paired reads from spurious paired-end reads. We tested this new method by preparing and sequencing a mate-paired library with an insert size of 3 kb from Saccharomyces cerevisiae. We present an analysis of the library quality statistics and a new bioinformatics tool called DeLoxer that can be used to analyze an Illumina Cre-Lox mate-paired data set. We also demonstrate how the resulting data significantly improve a de novo assembly of the S. cerevisiae genome.
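The reference-free junction screening described in this abstract can be illustrated with a minimal sketch. It is not DeLoxer's actual algorithm, which the abstract does not describe. It assumes exact matches to the canonical 34-bp loxP site in either orientation, whereas real reads would need error-tolerant matching. The function names are hypothetical.

```python
# Canonical loxP site: 13-bp inverted repeat, 8-bp spacer, 13-bp inverted repeat.
LOXP = "ATAACTTCGTATA" + "GCATACAT" + "TATACGAAGTTAT"

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq):
    """Reverse complement of an uppercase DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def split_at_loxp(read):
    """Return (left, right) flanks if the read contains loxP, else None.

    Only exact matches are considered (an assumption of this sketch).
    """
    for site in (LOXP, revcomp(LOXP)):
        pos = read.find(site)
        if pos != -1:
            return read[:pos], read[pos + len(site):]
    return None


if __name__ == "__main__":
    junction_read = "ACGT" * 5 + LOXP + "TTGCA"
    print(split_at_loxp(junction_read))
    print(split_at_loxp("ACGT" * 10))
```

A read for which `split_at_loxp` returns `None` carries no junction and can be mapped as is; a read that is split yields two flanks that can be handled as separate mate sequences.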

3.
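Library preparation methods for next-generation sequencing (NGS) play an important role in genome assembly. However, ordinary DNA libraries prepared for NGS contain fragments of only about 500 bp, which is insufficient for de novo assembly of complex genomes. Third-generation sequencing can reach read lengths of 20 kb, but its high error rate and high cost have hindered wide adoption. Mate-paired library preparation for second-generation sequencing has therefore long played a very important role in de novo genome assembly. Mate-paired libraries prepared for Illumina, currently the mainstream NGS platform, span only 2 to 5 kb. To obtain longer mate-paired libraries that can be sequenced on the Illumina platform, this study for the first time integrated and optimized the mate-paired library preparation techniques of the Illumina and Roche/454 sequencing platforms, using an inducible circularization enzyme to improve the circularization efficiency of long genomic DNA fragments. We successfully established a 20-kb mate-paired library preparation technique and have applied it to prepare 20-kb mate-paired libraries from the human genome. This technique provides methodological guidance for preparing long-fragment mate-paired libraries for the Illumina platform.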

4.
DNA-encoded chemical libraries are increasingly being employed for the identification of binding molecules to protein targets of pharmaceutical relevance. Here, we describe the synthesis and characterization of a DNA-encoded chemical library, consisting of 4000 compounds generated by Diels-Alder cycloaddition reactions. The compounds were encoded with unique DNA fragments which were generated through a stepwise assembly process and serve as amplifiable bar codes for the identification and relative quantification of library members.  相似文献   

5.
Massively parallel DNA sequencing is capable of sequencing tens of millions of DNA fragments at the same time. However, sequence bias in the initial cycles, which are used to determine the coordinates of individual clusters, causes a loss of fidelity in cluster identification on Illumina Genome Analysers. This can result in a significant reduction in the numbers of clusters that can be analysed. Such low sample diversity is an intrinsic problem of sequencing libraries that are generated by restriction enzyme digestion, such as e4C-seq or reduced-representation libraries. Similarly, this problem can also arise through the combined sequencing of barcoded, multiplexed libraries. We describe a procedure to defer the mapping of cluster coordinates until low-diversity sequences have been passed. This simple procedure can recover substantial amounts of next generation sequencing data that would otherwise be lost.  相似文献   

6.
DNA-encoded libraries of small organic molecules facilitate the construction of large, encoded self-assembling chemical libraries for the identification of high-affinity binders to protein targets. We have constructed a library of 477 chemical compounds, coupled to 48mer-oligonucleotides, each containing a unique six-base sequence serving as "bar-code" for the identification of the chemical moiety. The functionality of the library was confirmed by selection and amplification of both high- and low-affinity binding molecules specific to streptavidin.  相似文献   

7.
Cyclic peptides are of great interest as therapeutic compounds due to their potential for specificity and intracellular activity, but specific compounds can be difficult to identify from large libraries without resorting to molecular encoding techniques. Large libraries of cyclic peptides are often DNA-encoded or linearized before sequencing, but both of those deconvolution strategies constrain the chemistry, assays, and quantification methods which can be used. We developed an automated sequencing program, CycLS, to identify cyclic peptides contained within large synthetic libraries. CycLS facilitates quick and easy identification of all library-members via tandem mass spectrometry data without requiring any specific chemical moieties or modifications within the library. Validation of CycLS against a library of 400 cyclic hexapeptide peptoid hybrids (peptomers) of unique mass yielded a result of 95% accuracy when compared against a simulated library size of 234,256 compounds. CycLS was also evaluated by resynthesizing pure compounds from a separate 1800-member library of cyclic hexapeptides and hexapeptomers with high mass redundancy. Of 22 peptides resynthesized, 17 recapitulated the retention times and fragmentation patterns assigned to them from the whole-library bulk assay results. Implementing a database-matching approach, CycLS is fast and provides a robust method for sequencing cyclic peptides that is particularly applicable to the deconvolution of synthetic libraries.  相似文献   

8.
Cytosine methylation is the quintessential epigenetic mark. Two well-established methods, bisulfite sequencing and methyl-DNA immunoprecipitation (MeDIP), lend themselves to the genome-wide analysis of DNA methylation by high-throughput sequencing. Here we provide an overview and brief review of these methods. We summarize our experience with MeDIP followed by high-throughput Illumina/Solexa sequencing, exemplified by the analysis of the methylated fraction of the Neurospora crassa genome ("methylome"). We provide detailed methods for DNA isolation, processing and the generation of in vitro libraries for Illumina/Solexa sequencing. We discuss potential problems in the generation of sequencing libraries. Finally, we provide an overview of software that is appropriate for the analysis of high-throughput sequencing data generated by Illumina/Solexa-type sequencing by synthesis, with a special emphasis on approaches and applications that can generate more accurate depictions of sequence reads that fall in repeated regions of a chosen reference genome.

9.
Next-generation sequencing of environmental samples can be challenging because of the variable DNA quantity and quality in these samples. High quality DNA libraries are needed for optimal results from next-generation sequencing. Environmental samples such as water may have low quality and quantities of DNA as well as contaminants that co-precipitate with DNA. The mechanical and enzymatic processes involved in extraction and library preparation may further damage the DNA. Gel size selection enables purification and recovery of DNA fragments of a defined size for sequencing applications. Nevertheless, this task is one of the most time-consuming steps in the DNA library preparation workflow. The protocol described here enables complete automation of agarose gel loading, electrophoretic analysis, and recovery of targeted DNA fragments. In this study, we describe a high-throughput approach to prepare high quality DNA libraries from freshwater samples that can be applied also to other environmental samples. We used an indirect approach to concentrate bacterial cells from environmental freshwater samples; DNA was extracted using a commercially available DNA extraction kit, and DNA libraries were prepared using a commercial transposon-based protocol. DNA fragments of 500 to 800 bp were gel size selected using Ranger Technology, an automated electrophoresis workstation. Sequencing of the size-selected DNA libraries demonstrated significant improvements to read length and quality of the sequencing reads.  相似文献   

10.

The reduced representation bisulfite sequencing (RRBS) method has been developed for the high-throughput analysis of DNA methylation based on next-generation sequencing of genomic libraries treated with sodium bisulfite. In contrast to whole-genome sequencing, the RRBS approach uses specific endonucleases during library preparation to produce pools of CpG-rich DNA fragments. The original RRBS technology, based on MspI libraries, increases the relative number of CpG islands in the pools of genomic fragments compared to whole-genome bisulfite sequencing. Nevertheless, this technology is rarely used, owing to its high cost compared with bisulfite methylation analysis on hybridization microarrays and to the significant residual fraction of data consisting of genomic repeat sequences, which complicate alignment and are of little interest for developing DNA methylation markers, often the main goal of biomedical research. We have developed an algorithm for estimating the likelihood that recognition sites of restriction endonucleases will be represented in CpG islands and present a method of reducing the effective size of the RRBS library without a significant loss of CpG islands, based on the use of the XmaI endonuclease for library preparation. In silico analysis demonstrates that the optimum range of XmaI-RRBS fragment lengths is 110–200 base pairs. Sequencing of this library allows one to assess the methylation status of over 125,000 CpG dinucleotides, of which over 90,000 belong to CpG islands.
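The in silico digestion and size selection mentioned in this abstract can be sketched as follows. This is an illustration only, not the authors' algorithm for scoring recognition sites against CpG islands. It assumes the standard XmaI site (C^CCGGG, cut after the first base) and uses the 110–200 bp window quoted above. The function names and the toy sequence are hypothetical. The terminal fragments, which have only one cut end, are kept for simplicity.

```python
def digest(seq, site, cut_offset):
    """Cut a single-stranded representation of `seq` at every `site`.

    `cut_offset` is the number of bases of the site left on the upstream
    fragment (1 for XmaI, C^CCGGG; 1 for MspI, C^CGG).
    """
    seq = seq.upper()
    cuts = []
    pos = seq.find(site)
    while pos != -1:
        cuts.append(pos + cut_offset)
        pos = seq.find(site, pos + 1)  # step by 1 so overlapping sites are found
    bounds = [0] + cuts + [len(seq)]
    return [seq[a:b] for a, b in zip(bounds, bounds[1:])]


def size_select(fragments, lo=110, hi=200):
    """Keep fragments whose length falls within [lo, hi] bp."""
    return [f for f in fragments if lo <= len(f) <= hi]


if __name__ == "__main__":
    # Toy sequence with two XmaI sites flanking a 150-bp fragment.
    toy = "A" * 50 + "CCCGGG" + "T" * 144 + "CCCGGG" + "A" * 30
    fragments = digest(toy, "CCCGGG", 1)
    print([len(f) for f in fragments])
    print([len(f) for f in size_select(fragments)])
```

Calling `digest(genome, "CCGG", 1)` instead would model an MspI digest with the same code, which makes it easy to compare how many fragments each enzyme leaves in a given size window.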


11.
Next-generation sequencing (NGS) is emerging as a powerful tool for elucidating genetic information for a wide range of applications. Unfortunately, the surging popularity of NGS has not yet been accompanied by an improvement in automated techniques for preparing formatted sequencing libraries. To address this challenge, we have developed a prototype microfluidic system for preparing sequencer-ready DNA libraries for analysis by Illumina sequencing. Our system combines droplet-based digital microfluidic (DMF) sample handling with peripheral modules to create a fully-integrated, sample-in library-out platform. In this report, we use our automated system to prepare NGS libraries from samples of human and bacterial genomic DNA. E. coli libraries prepared on-device from 5 ng of total DNA yielded excellent sequence coverage over the entire bacterial genome, with >99% alignment to the reference genome, even genome coverage, and good quality scores. Furthermore, we produced a de novo assembly on a previously unsequenced multi-drug resistant Klebsiella pneumoniae strain BAA-2146 (KpnNDM). The new method described here is fast, robust, scalable, and automated. Our device for library preparation will assist in the integration of NGS technology into a wide variety of laboratories, including small research laboratories and clinical laboratories.  相似文献   

12.
Current efforts to recover the Neandertal and mammoth genomes by 454 DNA sequencing demonstrate the sensitivity of this technology. However, routine 454 sequencing applications still require microgram quantities of initial material. This is due to a lack of effective methods for quantifying 454 sequencing libraries, necessitating expensive and labour-intensive procedures when sequencing ancient DNA and other poor DNA samples. Here we report a 454 sequencing library quantification method based on quantitative PCR that effectively eliminates these limitations. We estimated both the molecule numbers and the fragment size distributions in sequencing libraries derived from Neandertal DNA extracts, SAGE ditags and bonobo genomic DNA, obtaining optimal sequencing yields without performing any titration runs. Using this method, 454 sequencing can routinely be performed from as little as 50 pg of initial material without titration runs, thereby drastically reducing costs while increasing the scope of sample throughput and protocol development on the 454 platform. The method should also apply to Illumina/Solexa and ABI/SOLiD sequencing, and should therefore help to widen the accessibility of all three platforms.  相似文献   
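Quantitative PCR estimates molecule numbers by comparing a sample's threshold cycle (Ct) with a standard curve. The sketch below shows that generic calculation, not the specific protocol of this study. The standard concentrations and Ct values are made-up illustrative numbers, and the function names are hypothetical.

```python
import math


def fit_standard_curve(copies, cts):
    """Least-squares fit of Ct = slope * log10(copies) + intercept."""
    xs = [math.log10(c) for c in copies]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(cts) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, cts))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def copies_from_ct(ct, slope, intercept):
    """Invert the standard curve to estimate the starting copy number."""
    return 10 ** ((ct - intercept) / slope)


def amplification_efficiency(slope):
    """Efficiency per cycle; 1.0 means perfect doubling."""
    return 10 ** (-1.0 / slope) - 1.0


if __name__ == "__main__":
    # Hypothetical dilution series of a library standard (illustrative values).
    standards = [1e3, 1e4, 1e5, 1e6, 1e7]
    standard_cts = [30.1, 26.7, 23.4, 20.0, 16.8]
    slope, intercept = fit_standard_curve(standards, standard_cts)
    print(round(amplification_efficiency(slope), 2))
    print(f"{copies_from_ct(24.5, slope, intercept):.3g}")
```

With the copy number in hand, the amount of library to load can be chosen directly rather than by running titration runs.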

13.
14.
Genome-wide mapping of 5-methylcytosine is of broad interest to many fields of biology and medicine. A variety of methods have been developed, and several have recently been advanced to genome-wide scale using arrays and next-generation sequencing approaches. We have previously reported reduced representation bisulfite sequencing (RRBS), a bisulfite-based protocol that enriches CG-rich parts of the genome, thereby reducing the amount of sequencing required while capturing the majority of promoters and other relevant genomic regions. The approach provides single-nucleotide resolution, is highly sensitive and provides quantitative DNA methylation measurements. This protocol should enable any standard molecular biology laboratory to generate RRBS libraries of high quality. Briefly, purified genomic DNA is digested by the methylation-insensitive restriction enzyme MspI to generate short fragments that contain CpG dinucleotides at the ends. After end-repair, A-tailing and ligation to methylated Illumina adapters, the CpG-rich DNA fragments (40-220 bp) are size selected, subjected to bisulfite conversion, PCR amplified and end sequenced on an Illumina Genome Analyzer. Note that alignment and analysis of RRBS sequencing reads are not covered in this protocol. The extremely low input requirements (10-300 ng), the applicability of the protocol to formalin-fixed and paraffin-embedded samples, and the technique's single-nucleotide resolution extend RRBS to a wide range of biological and clinical samples and research applications. The entire process of RRBS library construction takes ~9 d.

15.
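This paper describes a method for constructing double digest restriction-site associated DNA sequencing (ddRADseq) libraries for two rice stem borers, the striped stem borer (Chilo suppressalis) and the yellow stem borer (Scirpophaga incertulas). Using an Agilent 2100 Bioanalyzer, we analyzed the fragment sizes and size distributions of the products of four single-enzyme and two double-enzyme digests, and selected the restriction enzyme combination MluCI and NlaIII for digesting borer genomic DNA. After specific P1 and P2 adapters were ligated to the ends of the digested DNA fragments, fragments of 285-435 bp were recovered with a Pippin Prep. The library was enriched by PCR amplification, which also introduced the index sequences. The finished ddRADseq libraries were quality-checked by agarose gel electrophoresis and on the Bioanalyzer. The fragment length, size distribution and molar concentration of the libraries met the technical requirements for sequencing on the Illumina platform. This study confirms the feasibility of constructing ddRADseq libraries from rice stem borer genomes using the MluCI and NlaIII enzyme combination, and lays a foundation for applying ddRADseq in rice stem borers to research in biogeography, population genetics and phylogenetic reconstruction.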

16.
SCRATCHY is a methodology for the construction of libraries of chimeras between genes that display low sequence homology. We have developed a strategy for library creation termed enhanced crossover SCRATCHY, that significantly increases the number of clones containing multiple crossovers. Complementary chimeric gene libraries generated by incremental truncation (ITCHY) of two distinct parental sequences are created, and are then divided into arbitrarily defined sections. The respective sections are amplified by skewed sets of primers (i.e. a combination of gene A specific forward primer and gene B specific reverse primer, etc.) allowing DNA fragments containing non-homologous crossover points to be amplified. The amplified chimeric sections are then subjected to a DNA shuffling process generating an enhanced crossover SCRATCHY library. We have constructed such a library using the rat theta 2 glutathione transferase (rGSTT2) and the human theta 1 glutathione transferase (hGSTT1) genes (63% DNA sequence identity). DNA sequencing analysis of unselected library members revealed a greater diversity than that obtained by canonical family shuffling or with conventional SCRATCHY. Expression and high-throughput flow cytometric screening of the chimeric GST library identified several chimeric progeny that retained rat-like parental substrate specificity.  相似文献   

17.
Deep sequencing of strand-specific cDNA libraries is now a ubiquitous tool for identifying and quantifying RNAs in diverse sample types. The accuracy of conclusions drawn from these analyses depends on precise and quantitative conversion of the RNA sample into a DNA library suitable for sequencing. Here, we describe an optimized method of preparing strand-specific RNA deep sequencing libraries from small RNAs and variably sized RNA fragments obtained from ribonucleoprotein particle footprinting experiments or fragmentation of long RNAs. Our approach works across a wide range of input amounts (400 pg to 200 ng), is easy to follow and produces a library in 2–3 days at relatively low reagent cost, all while giving the user complete control over every step. Because all enzymatic reactions were optimized and driven to apparent completion, sequence diversity and species abundance in the input sample are well preserved.  相似文献   

18.
Advances in both high-throughput sequencing and whole-genome amplification (WGA) protocols have allowed genomes to be sequenced from femtograms of DNA, for example from individual cells or from precious clinical and archived samples. Using the highly curated Caenorhabditis elegans genome as a reference, we have sequenced and identified errors and biases associated with Illumina library construction, library insert size, different WGA methods and genome features such as GC bias and simple repeat content. Detailed analysis of the reads from amplified libraries revealed characteristics suggesting that the majority of amplified fragment ends are identical but inverted versions of each other. Read coverage in amplified libraries is correlated with both tandem and inverted repeat content, while GC content only influences sequencing in long-insert libraries. Nevertheless, single nucleotide polymorphism (SNP) calls and assembly metrics from reads in amplified libraries show comparable results with unamplified libraries. To help realize the full potential of WGA for questions of real biological interest, this article highlights the importance of recognizing additional sources of errors from amplified sequence reads and discusses the potential implications in downstream analyses.

19.
Accurate estimation of systemic tumor load from the blood of cancer patients has enormous potential. One avenue is to measure the presence of cell-free circulating tumor DNA in plasma. Various approaches have been investigated, predominantly covering hotspot mutations or customized, patient-specific assays. Therefore, we investigated the utility of using exome sequencing to monitor circulating tumor DNA levels through the detection of single nucleotide variants in plasma. Two technologies, claiming to offer efficient library preparation from nanogram levels of DNA, were evaluated. This allowed us to estimate the proportion of starting molecules measurable by sequence capture (<5%). As cell-free DNA is highly fragmented, we designed and provide software for efficient identification of PCR duplicates in single-end libraries with a varying size distribution. On average, this improved sequence coverage by 38% in comparison to standard tools. By exploiting the redundant information in PCR-duplicates the background noise was reduced to ∼1/35000. By applying our optimized analysis pipeline to a simulation analysis, we determined the current sensitivity limit to ∼1/2400, starting with 30 ng of cell-free DNA. Subsequently, circulating tumor DNA levels were assessed in seven breast- and one prostate cancer patient. One patient carried detectable levels of circulating tumor DNA, as verified by break-point specific PCR. These results demonstrate exome sequencing on cell-free DNA to be a powerful tool for disease monitoring of metastatic cancers. To enable a broad implementation in the diagnostic settings, the efficiency limitations of sequence capture and the inherent noise levels of the Illumina sequencing technology must be further improved.  相似文献   
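The abstract does not describe how its software identifies duplicates, so the following is only a plausible sketch under stated assumptions. It assumes each single-end read spans its whole cell-free DNA fragment, so that reads sharing chromosome, strand, start and length can be treated as PCR duplicates, and that a per-position majority vote over a duplicate group suppresses isolated errors. The read dictionaries and function names are hypothetical.

```python
from collections import Counter, defaultdict


def group_duplicates(reads):
    """Group reads that share chromosome, strand, start and fragment length.

    Each read is a dict with keys 'chrom', 'strand', 'start' and 'seq';
    the read is assumed to cover the entire fragment.
    """
    groups = defaultdict(list)
    for read in reads:
        key = (read["chrom"], read["strand"], read["start"], len(read["seq"]))
        groups[key].append(read["seq"])
    return groups


def consensus(seqs):
    """Per-position majority base across equal-length duplicate reads."""
    return "".join(Counter(column).most_common(1)[0][0] for column in zip(*seqs))


if __name__ == "__main__":
    reads = [
        {"chrom": "chr1", "strand": "+", "start": 100, "seq": "ACGTACGT"},
        {"chrom": "chr1", "strand": "+", "start": 100, "seq": "ACGTACGT"},
        {"chrom": "chr1", "strand": "+", "start": 100, "seq": "ACGAACGT"},  # one error
        {"chrom": "chr1", "strand": "+", "start": 100, "seq": "ACGTAC"},    # shorter fragment
    ]
    for key, seqs in group_duplicates(reads).items():
        print(key, len(seqs), consensus(seqs))
```

Keying on fragment length as well as start position keeps distinct fragments that happen to share a start from being collapsed, which is one way a tool could retain more coverage than a start-only duplicate filter.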

20.
The advent and widespread application of next-generation sequencing (NGS) technologies to the study of microbial genomes has led to a substantial increase in the number of studies in which whole genome sequencing (WGS) is applied to the analysis of microbial genomic epidemiology. However, microorganisms such as Mycobacterium tuberculosis (MTB) present unique problems for sequencing and downstream analysis based on their unique physiology and the composition of their genomes. In this study, we compare the quality of sequence data generated using the Nextera and TruSeq isolate preparation kits for library construction prior to Illumina sequencing-by-synthesis. Our results confirm that MTB NGS data quality is highly dependent on the purity of the DNA sample submitted for sequencing and its guanine-cytosine content (or GC-content). Our data additionally demonstrate that the choice of library preparation method plays an important role in mitigating downstream sequencing quality issues. Importantly for MTB, the Illumina TruSeq library preparation kit produces more uniform data quality than the Nextera XT method, regardless of the quality of the input DNA. Furthermore, specific genomic sequence motifs are commonly missed by the Nextera XT method, as are regions of especially high GC-content relative to the rest of the MTB genome. As coverage bias is highly undesirable, this study illustrates the importance of appropriate protocol selection when performing NGS studies in order to ensure that sound inferences can be made regarding mycobacterial genomes.  相似文献   
