Similar literature
1.
2.
Dabney J, Meyer M. BioTechniques 2012, 52(2):87-94
High-throughput sequencing technologies frequently necessitate the use of PCR for sequencing library amplification. PCR is a sometimes enigmatic process and is known to introduce biases. Here we perform a simple amplification-sequencing assay using 10 commercially available polymerase-buffer systems to amplify libraries prepared from both modern and ancient DNA. We compare the performance of the polymerases with respect to a previously uncharacterized template length bias, as well as GC-content bias, and find that simply avoiding certain polymerases can dramatically decrease the occurrence of both. For amplification of ancient DNA, we found that some commonly used polymerases strongly bias against amplification of endogenous DNA in favor of GC-rich microbial contamination, in our case reducing the fraction of endogenous sequences to almost half.

3.
Amplification by polymerase chain reaction is often used in the preparation of template DNA molecules for next-generation sequencing. Amplification increases the number of available molecules for sequencing but changes the representation of the template molecules in the amplified product and introduces random errors. Such changes in representation hinder applications requiring accurate quantification of template molecules, such as allele calling or estimation of microbial diversity. We present a simple method to count the number of template molecules using degenerate bases and show that it improves genotyping accuracy and removes noise from PCR amplification. This method can be easily added to existing DNA library preparation techniques and can improve the accuracy of variant calling.
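
The counting idea in this abstract can be sketched in a few lines. The following is a minimal illustration, not the authors' implementation: it assumes each read has already been reduced to a `(locus, tag, allele)` triple, where `tag` is the degenerate-base sequence attached before amplification, and the function name `count_templates` is hypothetical. Reads sharing a locus, allele and tag are treated as copies of one template molecule.

```python
from collections import defaultdict


def count_templates(reads):
    """Count distinct template molecules per allele at each locus.

    reads: iterable of (locus, tag, allele) triples (an assumed input format).
    Returns {locus: {allele: number_of_distinct_tags}}.
    """
    tags_seen = defaultdict(set)
    for locus, tag, allele in reads:
        # Reads with the same tag are assumed to be PCR copies of one template.
        tags_seen[(locus, allele)].add(tag)

    counts = {}
    for (locus, allele), tags in tags_seen.items():
        counts.setdefault(locus, {})[allele] = len(tags)
    return counts


# Allele A was read five times but comes from a single tagged template;
# allele G was read twice, from two different templates.
example_reads = (
    [("chr1:100", "ACGT", "A")] * 5
    + [("chr1:100", "TTGA", "G"), ("chr1:100", "CCAT", "G")]
)
template_counts = count_templates(example_reads)
```

In this toy input the raw read counts favor allele A (5 versus 2), while the tag counts favor allele G (1 versus 2), which is the kind of distortion that template counting is meant to remove. Sequencing errors inside the tag itself are ignored here.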

4.
PCR permits the exponential and sequence-specific amplification of DNA, even from minute starting quantities. PCR is a fundamental step in preparing DNA samples for high-throughput sequencing. However, there are errors associated with PCR-mediated amplification. Here we examine the effects of four important sources of error (bias, stochasticity, template switches and polymerase errors) on sequence representation in low-input next-generation sequencing libraries. We designed a pool of diverse PCR amplicons with a defined structure, and then used Illumina sequencing to search for signatures of each process. We further developed quantitative models for each process, and compared predictions of these models to our experimental data. We find that PCR stochasticity is the major force skewing sequence representation after amplification of a pool of unique DNA amplicons. Polymerase errors become very common in later cycles of PCR but have little impact on the overall sequence distribution as they are confined to small copy numbers. PCR template switches are rare and confined to low copy numbers. Our results provide a theoretical basis for removing distortions from high-throughput sequencing data. In addition, our findings on PCR stochasticity will have particular relevance to quantification of results from single cell sequencing, in which sequences are represented by only one or a few molecules.
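
To see why stochasticity alone can skew representation when each sequence starts from a single molecule, PCR can be approximated as a branching process. The sketch below is an illustrative assumption, not the quantitative model from the paper: every molecule is copied in each cycle with a fixed probability `efficiency`, and the function name `simulate_pcr` and its parameters are hypothetical.

```python
import random


def simulate_pcr(n_templates, cycles, efficiency, seed=0):
    """Simulate copy numbers of unique templates after PCR.

    Each template starts as one molecule. In every cycle each existing
    molecule is duplicated with probability `efficiency` (assumed to be the
    same for all templates, so any spread in the result is purely stochastic).
    """
    rng = random.Random(seed)
    copies = [1] * n_templates
    for _ in range(cycles):
        copies = [
            n + sum(rng.random() < efficiency for _ in range(n))
            for n in copies
        ]
    return copies


# Ten cycles at 80% per-cycle efficiency for 1000 single-molecule templates.
final_copies = simulate_pcr(n_templates=1000, cycles=10, efficiency=0.8)
spread = max(final_copies) / min(final_copies)
```

Because a failed duplication in the first cycles halves a lineage's eventual size, early random events dominate the final spread. With `efficiency=1.0` every template ends at exactly `2 ** cycles` copies and the spread disappears.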

5.

Background

Massively parallel sequencing technology is revolutionizing approaches to genomic and genetic research. Since its advent, the scale and efficiency of Next-Generation Sequencing (NGS) has rapidly improved. In spite of this success, sequencing genomes or genomic regions with extremely biased base composition is still a great challenge to the currently available NGS platforms. The genomes of some important pathogenic organisms like Plasmodium falciparum (high AT content) and Mycobacterium tuberculosis (high GC content) display extremes of base composition. The standard library preparation procedures that employ PCR amplification have been shown to cause uneven read coverage particularly across AT and GC rich regions, leading to problems in genome assembly and variation analyses. Alternative library-preparation approaches that omit PCR amplification require large quantities of starting material and hence are not suitable for small amounts of DNA/RNA such as those from clinical isolates. We have developed and optimized library-preparation procedures suitable for low quantity starting material and tolerant to extremely high AT content sequences.

Results

We have used our optimized conditions in parallel with standard methods to prepare Illumina sequencing libraries from a non-clinical and a clinical isolate (containing ~53% host contamination). By analyzing and comparing the quality of sequence data generated, we show that our optimized conditions, which involve a PCR additive (TMAC), produce amplified libraries with improved coverage of extremely AT-rich regions and reduced bias toward GC-neutral templates.

Conclusion

We have developed a robust and optimized Next-Generation Sequencing library amplification method suitable for extremely AT-rich genomes. The new amplification conditions significantly reduce bias and retain the complexity of either extreme of base composition. This development will greatly benefit sequencing of clinical samples, which often require amplification due to the low mass of DNA starting material.

6.
RAD‐tag is a powerful tool for high‐throughput genotyping. It relies on PCR amplification of the starting material, following enzymatic digestion and sequencing adaptor ligation. Amplification introduces duplicate reads into the data, which arise from the same template molecule and are statistically nonindependent, potentially introducing errors into genotype calling. In shotgun sequencing, data duplicates are removed by filtering reads starting at the same position in the alignment. However, restriction enzymes target specific locations within the genome, causing reads to start in the same place, and making it difficult to estimate the extent of PCR duplication. Here, we introduce a slight change to the Illumina sequencing adaptor chemistry, appending a unique four‐base tag to the first index read, which allows duplicate discrimination in aligned data. This approach was validated on the Illumina MiSeq platform, using double‐digest libraries of ants (Wasmannia auropunctata) and yeast (Saccharomyces cerevisiae) with known genotypes, producing modest though statistically significant gains in the odds of calling a genotype accurately. More importantly, removing duplicates also corrected for strong sample‐to‐sample variability of genotype calling accuracy seen in the ant samples. For libraries prepared from low‐input degraded museum bird samples (Mixornis gularis), which had low complexity, having been generated from relatively few starting molecules, adaptor tags show that virtually all of the genotypes were called with inflated confidence as a result of PCR duplicates. Quantification of library complexity by adaptor tagging does not significantly increase the difficulty of the overall workflow or its cost, but corrects for differences in quality between samples and permits analysis of low‐input material.

7.
1. DNA metabarcoding is a cost-effective species identification approach with great potential to assist entomological ecologists. This review presents a practical guide to help entomological ecologists design their own DNA metabarcoding studies and ensure that sound ecological conclusions can be obtained. 2. The review considers approaches to field sampling, laboratory work, and bioinformatic analyses, with the aim of providing the background knowledge needed to make decisions at each step of a DNA metabarcoding workflow. 3. Although most conventional sampling methods can be adapted to DNA metabarcoding, this review highlights techniques that will ensure suitable DNA preservation during field sampling and laboratory storage. The review also calls for a greater understanding of the occurrence, transportation, and deposition of environmental DNA when applying DNA metabarcoding approaches for different ecosystems. 4. Accurate species detection with DNA metabarcoding needs to consider biases introduced during DNA extraction and PCR amplification, cross-contamination resulting from inappropriate amplicon library preparation, and downstream bioinformatic analyses. Quantifying species abundance with DNA metabarcoding is in its infancy, yet recent studies demonstrate promise for estimating relative species abundance from DNA sequencing reads. 5. Given that bioinformatics is one of the biggest hurdles for researchers new to DNA metabarcoding, several useful graphical user interface programs are recommended for sequence data processing, and the application of emerging sequencing technologies is discussed.

8.
Suppression Subtractive Hybridization (SSH) and its derivative, Pooled Suppression Subtractive hybridization (PSSH), are powerful tools used to study variances larger than ~100 bp in prokaryotic genome structure. The initial steps involve ligating an oligonucleotide of known sequence (the “adaptor”) to a fragmented genome to facilitate amplification, subtraction and downstream sequencing. SSH results in the creation of a library of unique DNA fragments which have been traditionally analyzed via Sanger sequencing. Numerous next generation sequencing technologies have entered the market yet SSH is incompatible with these platforms. This is due to the high level of sequence conservation of the oligonucleotide used for SSH. This rigid adherence is partly because it has yet to be determined if alteration of this oligonucleotide will have a deleterious impact on subtraction efficiency. The subtraction occurs when non-unique fragments are inhibited by a secondary self-pairing structure which requires exact nucleotide sequence. We determine if appending custom sequence to the 5′ terminal ends of these oligonucleotides during the nested PCR stages of PSSH will reduce subtraction efficiency. We compare a pool of ten S. aureus clinical isolates with a standard PSSH and custom tailed-PSSH. We detected no statistically significant difference between their subtraction efficiencies. Our observations suggest that the adaptor’s terminal ends may be labeled during the nested PCR step. This produces libraries labeled with custom sequence. This does not lead to loss of subtraction efficiency and would be invaluable for groups wishing to combine SSH or PSSH with their own downstream applications, such as a high throughput sequencing platform.

9.
Sequencing PCR DNA amplified directly from a bacterial colony   Total citations: 7, self-citations: 0, citations by others: 7
We show that PCR product asymmetrically amplified directly from a bacterial colony can be sequenced to yield results as good as those obtained when purified template DNA is used for the PCR amplification step. With either template, greater than 300 nucleotides can be read from a typical sequencing reaction. Taq DNA polymerase was used for both the PCR amplification and sequencing reactions.

10.
The development of DNA sequencing methods for characterizing microbial communities has evolved rapidly over the past decades. To evaluate more traditional, as well as newer methodologies for DNA library preparation and sequencing, we compared fosmid, short-insert shotgun and 454 pyrosequencing libraries prepared from the same metagenomic DNA samples. GC content was elevated in all fosmid libraries, compared with shotgun and 454 libraries. Taxonomic composition of the different libraries suggested that this was caused by a relative underrepresentation of dominant taxonomic groups with low GC content, notably Prochlorales and the SAR11 cluster, in fosmid libraries. While these abundant taxa had a large impact on library representation, we also observed a positive correlation between taxon GC content and fosmid library representation in other low-GC taxa, suggesting a general trend. Analysis of gene category representation in different libraries indicated that the functional composition of a library was largely a reflection of its taxonomic composition, and no additional systematic biases against particular functional categories were detected at the level of sequencing depth in our samples. Another important but less predictable factor influencing the apparent taxonomic and functional library composition was the read length afforded by the different sequencing technologies. Our comparisons and analyses provide a detailed perspective on the influence of library type on the recovery of microbial taxa in metagenomic libraries and underscore the different uses and utilities of more traditional, as well as contemporary ‘next-generation’ DNA library construction and sequencing technologies for exploring the genomics of the natural microbial world.

11.

Background

PCR amplification is an important step in the preparation of DNA sequencing libraries prior to high-throughput sequencing. PCR amplification introduces redundant reads in the sequence data, and estimating the PCR duplication rate is important to assess the frequency of such reads. Existing computational methods do not distinguish PCR duplicates from “natural” read duplicates that represent independent DNA fragments and therefore overestimate the PCR duplication rate for DNA-seq and RNA-seq experiments.

Results

In this paper, we present a computational method to estimate the average PCR duplication rate of high-throughput sequence datasets that accounts for natural read duplicates by leveraging heterozygous variants in an individual genome. Analysis of simulated data and exome sequence data from the 1000 Genomes project demonstrated that our method can accurately estimate the PCR duplication rate on paired-end as well as single-end read datasets which contain a high proportion of natural read duplicates. Further, analysis of exome datasets prepared using the Nextera library preparation method indicated that 45–50% of read duplicates correspond to natural read duplicates likely due to fragmentation bias. Finally, analysis of RNA-seq datasets from individuals in the 1000 Genomes project demonstrated that 70–95% of read duplicates observed in such datasets correspond to natural duplicates sampled from genes with high expression and identified outlier samples with a 2-fold greater PCR duplication rate than other samples.

Conclusions

The method described here is a useful tool for estimating the PCR duplication rate of high-throughput sequence datasets and for assessing the fraction of read duplicates that correspond to natural read duplicates. An implementation of the method is available at https://github.com/vibansal/PCRduplicates.
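
The role that heterozygous variants play in this abstract can be illustrated with a back-of-the-envelope estimator. This is a simplified sketch under stated assumptions, not the method implemented in the linked repository: it considers only duplicate clusters of exactly two reads that both cover a heterozygous site, assumes both alleles are sampled equally, and ignores sequencing error. Under those assumptions two PCR copies always carry the same allele, while two independent ("natural") fragments differ half of the time, so the natural fraction is about twice the observed discordant fraction. The function name `estimate_pcr_fraction` is hypothetical.

```python
def estimate_pcr_fraction(n_pairs, n_discordant):
    """Estimate the fraction of duplicate pairs that are PCR duplicates.

    n_pairs: duplicate read pairs overlapping a heterozygous site.
    n_discordant: pairs whose two reads carry different alleles.
    """
    if n_pairs <= 0:
        raise ValueError("n_pairs must be positive")
    if not 0 <= n_discordant <= n_pairs:
        raise ValueError("n_discordant must lie between 0 and n_pairs")
    # Natural duplicates disagree with probability 1/2, PCR duplicates never do.
    natural_fraction = min(1.0, 2.0 * n_discordant / n_pairs)
    return 1.0 - natural_fraction


# 250 of 1000 duplicate pairs show different alleles.
pcr_fraction = estimate_pcr_fraction(1000, 250)
```

Counting every duplicate pair as a PCR duplicate would report a rate of 1.0 for this input, which is the overestimate the abstract attributes to methods that ignore natural duplicates.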

12.
Accurate estimation of systemic tumor load from the blood of cancer patients has enormous potential. One avenue is to measure the presence of cell-free circulating tumor DNA in plasma. Various approaches have been investigated, predominantly covering hotspot mutations or customized, patient-specific assays. Therefore, we investigated the utility of using exome sequencing to monitor circulating tumor DNA levels through the detection of single nucleotide variants in plasma. Two technologies, claiming to offer efficient library preparation from nanogram levels of DNA, were evaluated. This allowed us to estimate the proportion of starting molecules measurable by sequence capture (<5%). As cell-free DNA is highly fragmented, we designed and provide software for efficient identification of PCR duplicates in single-end libraries with a varying size distribution. On average, this improved sequence coverage by 38% in comparison to standard tools. By exploiting the redundant information in PCR duplicates, the background noise was reduced to ∼1/35000. By applying our optimized analysis pipeline to a simulation analysis, we determined the current sensitivity limit to be ∼1/2400, starting with 30 ng of cell-free DNA. Subsequently, circulating tumor DNA levels were assessed in seven breast cancer patients and one prostate cancer patient. One patient carried detectable levels of circulating tumor DNA, as verified by break-point specific PCR. These results demonstrate exome sequencing on cell-free DNA to be a powerful tool for disease monitoring of metastatic cancers. To enable a broad implementation in diagnostic settings, the efficiency limitations of sequence capture and the inherent noise levels of the Illumina sequencing technology must be further improved.
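
One common way to exploit the redundancy in PCR duplicates is to collapse each duplicate family into a consensus sequence, so that an error present in a single copy is outvoted. The sketch below illustrates that general idea and is not the software described in the abstract: it assumes every read is given as `(start, end, sequence)` with both fragment ends known, groups reads by those coordinates, and takes a strict per-position majority. The name `consensus_by_family` is hypothetical.

```python
from collections import Counter, defaultdict


def consensus_by_family(reads):
    """Collapse reads that share fragment coordinates into one consensus.

    reads: iterable of (start, end, sequence) triples (an assumed input format).
    Returns {(start, end): consensus_sequence}; positions without a strict
    majority base are reported as "N".
    """
    families = defaultdict(list)
    for start, end, sequence in reads:
        families[(start, end)].append(sequence)

    consensus = {}
    for key, sequences in families.items():
        length = min(len(s) for s in sequences)
        bases = []
        for i in range(length):
            base, count = Counter(s[i] for s in sequences).most_common(1)[0]
            # Require more than half of the family to agree on the base.
            bases.append(base if 2 * count > len(sequences) else "N")
        consensus[key] = "".join(bases)
    return consensus


# Three copies of one fragment, one of them with an error at the third base.
example = [
    (100, 104, "ACGT"),
    (100, 104, "ACGT"),
    (100, 104, "ACTT"),
    (200, 203, "GGA"),
]
collapsed = consensus_by_family(example)
```

Grouping by both fragment ends, not only the read start, is what keeps fragments of different lengths that happen to start at the same position from being merged.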

13.
PCR amplification of limited amounts of DNA template carries an increased risk of product redundancy and contamination. We use molecular barcoding to label each genomic DNA template with an individual sequence tag prior to PCR amplification. In addition, we include molecular ‘batch-stamps’ that effectively label each genomic template with a sample ID and analysis date. This highly sensitive method identifies redundant and contaminant sequences and serves as a reliable method for positive identification of desired sequences; we can therefore capture accurately the genomic template diversity in the sample analyzed. Although our application described here involves the use of hairpin-bisulfite PCR for amplification of double-stranded DNA, the method can readily be adapted to single-strand PCR. Useful applications will include analyses of limited template DNA for biomedical, ancient DNA and forensic purposes.

14.
PCR and sequencing artefacts can seriously bias population genetic analyses, particularly of populations with low genetic variation such as endangered vertebrate populations. Here, we estimate the error rates, discuss their population genetics implications, and propose a simple detection method that helps to reduce the risk of accepting such errors. We study the major histocompatibility complex (MHC) class IIB of guppies (Poecilia reticulata) and find that PCR base misincorporations inflate the apparent sequence diversity. When analysing neutral genes, such bias can inflate estimates of effective population size. Previously suggested protocols for identifying genuine alleles are unlikely to exclude all sequencing errors, or they ignore genuine sequence diversity. We present a novel and statistically robust method that reduces the likelihood of accepting PCR artefacts as genuine alleles, and which minimises the necessity of repeated genotyping. Our method identifies sequences that are unlikely to be a PCR artefact, and which need to be independently confirmed through additional PCR of the same template DNA. The proposed methods are recommended particularly for population genetic studies that involve multi-template DNA and in studies on genes with low genetic diversity.

15.
Longitudinal studies that integrate samples with variable biomass are essential to understand microbial community dynamics across space or time. Shotgun metagenomics is widely used to investigate these communities at the functional level, but little is known about the effects of combining low and high biomass samples on downstream analysis. We investigated the interacting effects of DNA input and library amplification by polymerase chain reaction on comparative metagenomic analysis using dilutions of a single complex template from an Arabidopsis thaliana‐associated microbial community. We modified the Illumina Nextera kit to generate high‐quality large‐insert (680 bp) paired‐end libraries using a range of 50 pg to 50 ng of input DNA. Using assembly‐based metagenomic analysis, we demonstrate that DNA input level has a significant impact on community structure due to overrepresentation of low‐GC genomic regions following library amplification. In our system, these differences were largely superseded by variations between biological replicates, but our results advocate verifying the influence of library amplification on a case‐by‐case basis. Overall, this study provides recommendations for quality filtering and de‐replication prior to analysis, as well as a practical framework to address the issue of low biomass or biomass heterogeneity in longitudinal metagenomic surveys.

16.
A comprehensive genomic analysis of single cells is instrumental for numerous applications in tumor genetics, clinical diagnostics and forensic analyses. Here, we provide a protocol for single-cell isolation and whole genome amplification, which includes the following stages: preparation of single-cell suspensions from blood or bone marrow samples and cancer cell lines; their characterization on the basis of morphology, interphase fluorescent in situ hybridization pattern and antibody staining; isolation of single cells by either laser microdissection or micromanipulation; and unbiased amplification of single-cell genomes by either linker-adaptor PCR or GenomePlex library technology. This protocol provides a suitable template to screen for chromosomal copy number changes by conventional comparative genomic hybridization (CGH) or array CGH. Expected results include the generation of several micrograms of DNA from single cells, which can be used for CGH or other analyses, such as sequencing. Using linker-adaptor PCR or GenomePlex library technology, the protocol takes 72 or 30 h, respectively.

17.
Shao K, Ding W, Wang F, Li H, Ma D, Wang H. PLoS ONE 2011, 6(9):e24910
Aptamers are short RNA or DNA oligonucleotides that can bind to a variety of targets. Typically, they are selected from large libraries of random DNA sequences. The main strategy for obtaining aptamers is systematic evolution of ligands by exponential enrichment (SELEX). One limitation of conventional PCR amplification of random DNA sequence libraries in aptamer selection is its low efficiency, caused by relatively low product yield and highly efficient by-product formation. Here, we developed emulsion PCR for aptamer selection. With this method, by-product formation decreased tremendously, to an undetectable level, while product formation increased significantly. Our results indicated that by-products in conventional PCR amplification arose from primer-product and product-product hybridization. In emulsion PCR, product-product hybridization can be avoided completely, and most primer-product hybridization can be avoided if the conditions are optimized. In addition, the molecular ratio of template to compartment was crucial to by-product formation efficiency in emulsion PCR amplification. Furthermore, the concentration of Taq DNA polymerase in the emulsion PCR mixture had a significant impact on product formation efficiency. Together, these results indicate that emulsion PCR could improve the efficiency of SELEX.

18.
Metabarcoding of environmental samples on second‐generation sequencing platforms has rapidly become a valuable tool for ecological studies. A fundamental assumption of this approach is the reliance on being able to track tagged amplicons back to the samples from which they originated. In this study, we address the problem of sequences in metabarcoding sequencing outputs with false combinations of used tags (tag jumps). Unless these sequences can be identified and excluded from downstream analyses, tag jumps creating sequences with false, but already used tag combinations, can cause incorrect assignment of sequences to samples and artificially inflate diversity. In this study, we document and investigate tag jumping in metabarcoding studies on Illumina sequencing platforms by amplifying mixed‐template extracts obtained from bat droppings and leech gut contents with tagged generic arthropod and mammal primers, respectively. We found that an average of 2.6% and 2.1% of sequences had tag combinations, which could be explained by tag jumping in the leech and bat diet study, respectively. We suggest that tag jumping can happen during blunt‐ending of pools of tagged amplicons during library build and as a consequence of chimera formation during bulk amplification of tagged amplicons during library index PCR. We argue that tag jumping and contamination between libraries represents a considerable challenge for Illumina‐based metabarcoding studies, and suggest measures to avoid false assignment of tag jumping‐derived sequences to samples.
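
The bookkeeping behind spotting tag jumps can be sketched as follows. This is an illustration under assumptions, not the authors' pipeline: it supposes that each sample was amplified with one forward/reverse tag pair and that read counts per observed pair are already available. A pair built from two tags that were each used, but never together, is flagged as a tag-jump candidate. The function name, the category labels and the sample names are hypothetical.

```python
def classify_tag_combinations(read_counts, sample_by_tags):
    """Sort reads by whether their tag pair was actually used.

    read_counts: {(forward_tag, reverse_tag): number_of_reads}
    sample_by_tags: {(forward_tag, reverse_tag): sample_name} for used pairs.
    Returns read totals for the three categories.
    """
    forward_used = {fwd for fwd, _ in sample_by_tags}
    reverse_used = {rev for _, rev in sample_by_tags}
    summary = {"assigned": 0, "tag_jump_candidate": 0, "unknown_tag": 0}

    for (fwd, rev), n_reads in read_counts.items():
        if (fwd, rev) in sample_by_tags:
            summary["assigned"] += n_reads
        elif fwd in forward_used and rev in reverse_used:
            # Both tags exist in the experiment, but not in this combination.
            summary["tag_jump_candidate"] += n_reads
        else:
            summary["unknown_tag"] += n_reads
    return summary


design = {("F1", "R1"): "sample_01", ("F2", "R2"): "sample_02"}
observed = {
    ("F1", "R1"): 950,
    ("F2", "R2"): 1000,
    ("F1", "R2"): 30,
    ("F2", "R1"): 20,
    ("F9", "R1"): 5,
}
summary = classify_tag_combinations(observed, design)
```

This only detects jumps that land on an unused combination. As the abstract points out, a jump that recreates a combination already assigned to another sample is indistinguishable from a genuine read, which is why the choice of which tag pairs to use matters.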

19.
The incorporation of locked nucleic acids (LNAs) into oligonucleotide primers has been shown to increase template binding strength and specificity for DNA amplification. Real-time PCR and DNA sequencing have been shown to be significantly enhanced by the use of LNAs. Theoretically, increasing primers' binding strength may also increase the sensitivity of conventional PCR, reducing minimum template requirements. We compared LNA-modified PCR primers with their standard DNA counterparts for amplification sensitivity with template amounts as low as 5 pg. Although the results are highly dependent on the design of the LNA primers, large increases in peak height can be achieved from as little as 75 pg, as well as clearer and more complete profiles. Increased amplification success with lower template amounts may also be seen. Additionally, the use of LNAs can enhance multiplexing. Thus, incorporating LNAs into PCR primers can increase amplification success, sensitivity, and performance under a wide range of conditions.

20.
We compared the species composition in phytobenthic communities at different sampling sites in a small French river presenting polluted and unpolluted areas. For each sampling point, the total DNA was extracted and used to construct an 18S rRNA gene clone library after PCR amplification of a ca. 400 bp fragment. Phytobenthic community composition was estimated by random sequencing of several clones per library. Most of the sequences corresponded to the Bacillariophyceae and Chlorophyceae groups. By combining phylogenetic and correspondence analyses, we showed that our molecular approach is able to estimate and compare the species composition at different sampling sites in order to assess the environmental impact of xenobiotics on phytobenthic communities. Changes in species composition of these communities were found, but no evident decrease in the diversity. We discuss the significance of these changes with regard to the existing level of pollution and their impact on the functionality of the ecosystem. Our findings suggest that it is now possible to use faster molecular methods (e.g., DGGE, ARISA) to test large numbers of samples in the context of ecotoxicological studies, and thus to assess the impact of pollution in an aquatic ecosystem.
