Similar literature
 20 similar documents found
1.
Next‐generation sequencing technologies have provided unprecedented insights into fungal diversity and ecology. However, intrinsic biases and insufficient quality control in next‐generation methods can lead to difficult‐to‐detect errors in estimating fungal community richness, distributions and composition. The aim of this study was to examine how tissue storage prior to DNA extraction, primer design and various quality‐control approaches commonly used in 454 amplicon pyrosequencing might influence ecological inferences in studies of endophytic and endolichenic fungi. We first contrast 454 data sets generated contemporaneously from subsets of the same plant and lichen tissues that were stored in CTAB buffer, dried in silica gel or freshly frozen prior to DNA extraction. We show that storage in silica gel markedly limits the recovery of sequence data and yields a small fraction of the diversity observed by the other two methods. Using lichen mycobiont sequences as internal positive controls, we next show that despite careful filtering of raw reads and utilization of current best‐practice OTU clustering methods, homopolymer errors in sequences representing rare taxa artificially increased estimates of richness c. 15‐fold in a model data set. Third, we show that inferences regarding endolichenic diversity can be improved using a novel primer that reduces amplification of the mycobiont. Together, our results provide a rationale for selecting tissue treatment regimes prior to DNA extraction, demonstrate the efficacy of reducing mycobiont amplification in studies of the fungal microbiomes of lichen thalli and highlight the difficulties in differentiating true information about fungal biodiversity from methodological artefacts.  相似文献   

2.
3.
The 454 Genome Sequencer (GS) FLX System is a next-generation sequencing system characterized by long reads, high accuracy, and ultra-high throughput. Because it relies on emulsion PCR, each unique DNA template should generate only one sequence read after being amplified and sequenced on the GS FLX. However, biased amplification of DNA templates can occur during emulsion PCR, producing artificial duplicate reads. Assuming that every DNA template is distinct from all others, 3.49%-18.14% of the total reads in GS FLX sequencing data were found to be artificial duplicates. These duplicate reads may lead to misinterpretation of sequencing data, and special attention should be paid to the potential biases they introduce.

4.
Eukaryotic diversity in environmental samples is often assessed via PCR-based amplification of nSSU genes. However, estimates of diversity derived from pyrosequencing environmental data sets are often inflated, mainly because of the formation of chimeric sequences during PCR amplification. Chimeras are hybrid products composed of distinct parental sequences that can lead to the misinterpretation of diversity estimates. We have analyzed the effect of sample richness, evenness and phylogenetic diversity on the formation of chimeras using a nSSU data set derived from 454 Roche pyrosequencing of replicated, large control pools of closely and distantly related nematode mock communities, of known intragenomic identity and richness. To further investigate how chimeric molecules are formed, the nSSU gene secondary structure was analyzed in several individuals. For the first time in eukaryotes, chimera formation proved to be higher in both richer and more genetically diverse samples, thus providing a novel perspective of chimera formation in pyrosequenced environmental data sets. Findings contribute to a better understanding of the nature and mechanisms involved in chimera formation during PCR amplification of environmentally derived DNA. Moreover, given the similarities between biodiversity analyses using amplicon sequencing and those used to assess genomic variation, our findings have potential broad application for identifying genetic variation in homologous loci or multigene families in general.  相似文献   

5.
Analyses of degraded DNA are typically hampered by contamination, especially when employing universal primers such as commonly used in environmental DNA studies. In addition to false-positive results, the amplification of contaminant DNA may cause false-negative results because of competition, or bias, during the PCR. In this study, we test the utility of human-specific blocking primers in mammal diversity analyses of ancient permafrost samples from Siberia. Using quantitative PCR (qPCR) on human and mammoth DNA, we first optimized the design and concentration of blocking primer in the PCR. Subsequently, 454 pyrosequencing of ancient permafrost samples amplified with and without the addition of blocking primer revealed that DNA sequences from a diversity of mammalian representatives of the Beringian megafauna were retrieved only when the blocking primer was added to the PCR. Notably, we observe the first retrieval of woolly rhinoceros (Coelodonta antiquitatis) DNA from ancient permafrost cores. In contrast, reactions without blocking primer resulted in complete dominance by human DNA sequences. These results demonstrate that in ancient environmental analyses, the PCR can be biased towards the amplification of contaminant sequences to such an extent that retrieval of the endogenous DNA is severely restricted. The application of blocking primers is a promising tool to avoid this bias and can greatly enhance the quantity and the diversity of the endogenous DNA sequences that are amplified.  相似文献   

6.
Proofreading polymerases have 3′ to 5′ exonuclease activity that allows the excision and correction of mis-incorporated bases during DNA replication. In a previous study, we demonstrated that in addition to correcting substitution errors and lowering the error rate of DNA amplification, proofreading polymerases can also edit PCR primers to match template sequences. Primer editing is a feature that can be advantageous in certain experimental contexts, such as amplicon-based microbiome profiling. Here we develop a set of synthetic DNA standards to report on primer editing activity and use these standards to dissect this phenomenon. The primer editing standards allow next-generation sequencing-based enzymological measurements, reveal the extent of editing, and allow the comparison of different polymerases and cycling conditions. We demonstrate that proofreading polymerases edit PCR primers in a concentration-dependent manner, and we examine whether primer editing exhibits any sequence specificity. In addition, we use these standards to show that primer editing is tunable through the incorporation of phosphorothioate linkages. Finally, we demonstrate the ability of primer editing to robustly rescue the drop-out of taxa with 16S rRNA gene-targeting primer mismatches using mock communities and human skin microbiome samples.  相似文献   

7.
Accurate genotyping of complex systems, such as the major histocompatibility complex (MHC), often requires simultaneous analysis of multiple co-amplifying loci. Here we explore the utility of the massively parallel 454 sequencing method as a universal tool for genotyping complex MHC systems in nonmodel vertebrates. The power of this approach stems from the use of tagged polymerase chain reaction (PCR) primers to identify individual amplicons, which can be sequenced simultaneously to an arbitrarily chosen coverage. However, the error-prone sequencing technology poses considerable challenges, as it may be difficult to discriminate between sequencing errors and true rare alleles; owing to the complex nature of artefacts and errors, efficient quality control is required. Nevertheless, our study demonstrates that parallel 454 sequencing can be an efficient genotyping platform for MHC and provides an alternative to classical genotyping methods. We introduce procedures to identify a threshold that can be used to reduce the number of genotyping errors by eliminating most artefactual alleles (AA) representing PCR or sequencing errors. Our procedures are based on two expectations: first, that AA should be relatively rare, both overall and on a per-individual basis, and second, that most AA result from errors introduced into sequences of true alleles. In our data set, alleles with an average per-individual frequency below 3% most likely represented artefacts. This threshold will vary in other applications according to the complexity of the genotyped system. We strongly suggest direct assessment of genotyping error in every experiment by running a fraction of duplicates: individuals amplified in independent PCRs.
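
The frequency threshold described in this abstract can be illustrated with a minimal sketch. It is not the authors' procedure: the function name `flag_putative_artefacts`, the input layout (a mapping from individual to per-allele read counts), and the choice to average an allele's frequency only over the individuals in which it was observed are all assumptions made for illustration. The 3% default comes from the abstract, which also notes that the threshold will differ between systems.

```python
from collections import defaultdict


def flag_putative_artefacts(read_counts, threshold=0.03):
    """Return alleles whose mean per-individual frequency is below `threshold`.

    read_counts: {individual: {allele: number_of_reads}} (hypothetical layout).
    The mean is taken over individuals in which the allele was observed,
    which is an assumption of this sketch.
    """
    per_individual_freqs = defaultdict(list)
    for individual, counts in read_counts.items():
        total = sum(counts.values())
        if total == 0:
            continue  # no reads for this individual, nothing to score
        for allele, n_reads in counts.items():
            if n_reads > 0:
                per_individual_freqs[allele].append(n_reads / total)
    return {
        allele
        for allele, freqs in per_individual_freqs.items()
        if sum(freqs) / len(freqs) < threshold
    }


# Toy data: X9 is rare in every individual that carries it.
example = {
    "ind1": {"A1": 480, "A2": 500, "X9": 20},
    "ind2": {"A1": 990, "X9": 10},
}
putative = flag_putative_artefacts(example)
```

In the toy data, `X9` has per-individual frequencies of 2% and 1%, so it falls below the threshold, while `A1` and `A2` do not. As the abstract recommends, any such cut-off would still need to be checked against duplicate PCRs.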

8.
ABSTRACT: BACKGROUND: Roche 454 sequencing is the leading sequencing technology for producing long-read, high-throughput sequence data. Unlike most methods, where sequencing errors translate to base uncertainties, 454 sequencing inaccuracies create nucleotide gaps. These gaps are particularly troublesome for translated search tools such as BLASTx, where they introduce frame-shifts and result in regions of decreased identity and/or terminated alignments, which affect further analysis. RESULTS: To address this issue, the Homopolymer Aware Cross Alignment Tool (HAXAT) was developed. HAXAT uses a novel dynamic programming algorithm to solve the optimal local alignment between a 454 nucleotide sequence and a protein sequence by allowing frame-shifts, guided by 454 flowpeak values. The algorithm is an efficient minimal extension of the Smith-Waterman-Gotoh algorithm that easily fits into other tools. Experiments using HAXAT demonstrate that introducing 454-specific frame-shift penalties significantly increases the accuracy of alignments spanning homopolymer sequence errors. The full effect of the new parameters introduced with this novel alignment model is explored. Experimental results evaluating homopolymer inaccuracy through alignments show a two- to five-fold increase in Matthews Correlation Coefficient over previous algorithms for 454-derived data. CONCLUSIONS: The increased accuracy provided by HAXAT not only results in improved homologue estimations, but also provides uninterrupted reading frames, which greatly facilitate further analysis of protein space, for example phylogenetic analysis. The alignment tool is available at http://bioinfo.ifm.liu.se/454tools/haxat.

9.

Background  

Advances in automated DNA sequencing technology have greatly increased the scale of genomic and metagenomic studies. An increasingly popular means of increasing project throughput is to multiplex samples during the sequencing phase. This can be achieved by covalently linking short, unique "barcode" DNA segments to genomic DNA samples, for instance through incorporation of barcode sequences in PCR primers. Although several strategies have been described to ensure that barcode sequences are unique and robust to sequencing errors, these have not been integrated into the overall primer design process, thus potentially introducing bias into PCR amplification and/or sequencing steps.
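
One common way to check that a barcode set is "robust to sequencing errors" is to require a minimum pairwise Hamming distance. The sketch below illustrates that general idea only; it is not the method of the paper summarized above, it considers substitution errors only (not the insertions and deletions typical of some platforms), and all function names are hypothetical.

```python
from itertools import combinations


def hamming(a, b):
    """Number of positions at which two equal-length barcodes differ."""
    if len(a) != len(b):
        raise ValueError("barcodes must have equal length")
    return sum(x != y for x, y in zip(a, b))


def min_pairwise_distance(barcodes):
    """Smallest Hamming distance between any two barcodes in the set."""
    return min(hamming(a, b) for a, b in combinations(barcodes, 2))


def correctable_substitutions(barcodes):
    """Substitutions per barcode that can be corrected unambiguously.

    A set with minimum distance d tolerates floor((d - 1) / 2) substitutions
    before a corrupted barcode becomes as close to another barcode as to
    its own.
    """
    return (min_pairwise_distance(barcodes) - 1) // 2


barcodes = ["ACGTAC", "TGCATG", "CATGCA"]  # made-up example set
d_min = min_pairwise_distance(barcodes)
```

A design pipeline would typically reject candidate sets whose minimum distance falls below a chosen value; the bias concerns raised in the abstract (effects of barcodes on amplification) are outside the scope of this check.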

10.

Background

Second-generation sequencers generate millions of relatively short, but error-prone, reads. These errors make sequence assembly and other downstream projects more challenging. Correcting these errors improves the quality of assemblies and projects which benefit from error-free reads.

Results

We have developed a general-purpose error corrector that corrects errors introduced by Illumina, Ion Torrent, and Roche 454 sequencing technologies and can be applied to single- or mixed-genome data. In addition to correcting substitution errors, we locate and correct insertion, deletion, and homopolymer errors while remaining sensitive to low coverage areas of sequencing projects. Using published data sets, we correct 94% of Illumina MiSeq errors, 88% of Ion Torrent PGM errors, 85% of Roche 454 GS Junior errors. Introduced errors are 20 to 70 times more rare than successfully corrected errors. Furthermore, we show that the quality of assemblies improves when reads are corrected by our software.

Conclusions

Pollux is highly effective at correcting errors across platforms, and consistently performs as well as or better than currently available error correction software. Pollux provides general-purpose error correction and may be used in applications with or without assembly.

11.
DNA barcoding is an efficient method to identify specimens and to detect undescribed/cryptic species. Sanger sequencing of individual specimens is the standard approach in generating large‐scale DNA barcode libraries and identifying unknowns. However, the Sanger sequencing technology is, in some respects, inferior to next‐generation sequencers, which are capable of producing millions of sequence reads simultaneously. Additionally, direct Sanger sequencing of DNA barcode amplicons, as practiced in most DNA barcoding procedures, is hampered by the need for relatively high target amplicon yield, coamplification of nuclear mitochondrial pseudogenes, confusion with sequences from intracellular endosymbiotic bacteria (e.g. Wolbachia) and instances of intraindividual variability (i.e. heteroplasmy). Any of these situations can lead to failed Sanger sequencing attempts or ambiguity of the generated DNA barcodes. Here, we demonstrate the potential application of next‐generation sequencing platforms for parallel acquisition of DNA barcode sequences from hundreds of specimens simultaneously. To facilitate retrieval of sequences obtained from individual specimens, we tag individual specimens during PCR amplification using unique 10‐mer oligonucleotides attached to DNA barcoding PCR primers. We employ 454 pyrosequencing to recover full‐length DNA barcodes of 190 specimens using 12.5% capacity of a 454 sequencing run (i.e. two lanes of a 16‐lane run). We obtained an average of 143 sequence reads for each individual specimen. The sequences produced are full‐length DNA barcodes for all but one of the included specimens. In a subset of samples, we also detected Wolbachia, nontarget species, and heteroplasmic sequences. Next‐generation sequencing is of great value because of its protocol simplicity, greatly reduced cost per barcode read, faster throughput and added information content.
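
The tagging scheme described above, in which a unique 10-mer identifies each specimen, implies a demultiplexing step after sequencing. The following is a minimal sketch of that step under stated assumptions: the tag is taken to sit at the very start of each read and to match exactly, and the function name, tag sequences, and specimen labels are invented for illustration rather than taken from the study.

```python
def demultiplex(reads, tag_to_specimen, tag_length=10):
    """Assign reads to specimens by their leading tag (exact match only).

    Returns (bins, unassigned): bins maps each specimen to its reads with
    the tag removed; unassigned holds reads whose prefix matches no tag.
    """
    bins = {specimen: [] for specimen in tag_to_specimen.values()}
    unassigned = []
    for read in reads:
        tag, insert = read[:tag_length], read[tag_length:]
        specimen = tag_to_specimen.get(tag)
        if specimen is None:
            unassigned.append(read)
        else:
            bins[specimen].append(insert)
    return bins, unassigned


# Hypothetical tags and reads.
tags = {"ACGTACGTAC": "specimen_001", "TGCATGCATG": "specimen_002"}
reads = [
    "ACGTACGTACGGTCAACAAATC",
    "TGCATGCATGGGTCAACAAATC",
    "NNNNNNNNNNGGTCAACAAATC",
]
bins, unassigned = demultiplex(reads, tags)
```

Exact matching is the simplest choice and discards reads with errors in the tag; a real pipeline might instead tolerate a small number of mismatches, which is only safe when the tags are sufficiently different from one another.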

12.
Next-generation Roche 454 pyrosequencing was used to rapidly identify polymorphic microsatellites from enriched DNA libraries for the pink stem borer, Sesamia inferens (Walker). A total of 1,459 simple sequence repeats (SSRs) were isolated from the microsatellite-enriched library using 454 sequencing. Thirty-nine microsatellite markers were selected to synthesize for further optimization, and 12 loci exhibited reliable amplification of a single product of expected size. The forward primer of 12 primer pairs was end labeled with a fluorescent dye. All of the 12 microsatellite loci were polymorphic, with 5–13 alleles per locus and observed heterozygosities ranging from 0.097 to 0.957. Here, we also tested these 12 SSRs for cross-species amplification in Chilo suppressalis (Walker), Tryporyza incertulas (Walker) and Cnaphalocrocis medinalis (Guenée). These polymorphic markers will be a valuable tool for analyses of population connectivity and genetic structure in this rice pest.  相似文献   

13.
The use of automated fluorescent DNA sequencer systems and PCR-based DNA sequencing methods plays an important role in the current effort to improve the efficiency of large-scale DNA analysis. Here we show the application of linear PCR using a single fluorescent primer and dideoxynucleotide terminators in four separate sequencing reactions on the EMBL/Pharmacia fluorescent automated DNA sequencer. We have used dideoxy/deoxynucleoside triphosphate ratios and linear amplification cycle conditions to obtain accurate sequence reads of 500 bases or more from just 400 ng of double-stranded DNA template without chemical denaturation. The sequencing protocol described in this paper is well suited to enhancing the sensitivity and performance of the automated DNA sequencing system.

14.
Traditional Chinese medicine (TCM) preparations are widely used in healthcare and clinical practice. So far, the methods commonly used for quality evaluation of TCM preparations have mainly focused on chemical ingredients. Biological ingredient analysis of TCM preparations is also important because TCM preparations usually contain both plant and animal ingredients, which often include mis-identified herbal materials, adulterants or even biological contaminants. For biological ingredient analysis, the efficiency of DNA extraction is an important factor that might affect the accuracy and reliability of identification. The component complexity of TCM preparations is high, and DNA might be destroyed or degraded to different degrees after a series of processing procedures. Therefore, it is necessary to establish an effective protocol for DNA extraction from TCM preparations. In this study, we chose a classical TCM preparation, Liuwei Dihuang Wan (LDW), as an example to develop a TCM-specific DNA extraction method. An optimized cetyl trimethyl ammonium bromide (CTAB) method (TCM-CTAB) and three commonly used extraction kits were tested for extraction of DNA from LDW samples. Experimental results indicated that DNA with the highest purity and concentration was obtained by using TCM-CTAB. To further evaluate the different extraction methods, amplification of the second internal transcribed spacer (ITS2) and the chloroplast genome trnL intron was carried out. The results showed that PCR amplification was successful only with template DNA extracted by using TCM-CTAB. Moreover, we performed high-throughput 454 sequencing using DNA extracted by TCM-CTAB. Data analysis showed that 3–4 out of 6 prescribed species were detected from LDW samples, while up to 5 contaminating species were detected, suggesting

15.
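Taq DNA polymerase is fast, active over a wide temperature range, and highly processive, which makes it an ideal enzyme for DNA sequence analysis. This paper first analyzes the accumulation of single- and double-stranded DNA products during asymmetric PCR amplification, and then uses a two-step labeling-extension method to examine the properties of Taq DNA polymerase and the factors that affect it. To further improve sequencing with Taq DNA polymerase, a "Klenow-type" sequencing method with direct incorporation of the isotope label was established, in which unlabeled (cold) dNTP corresponding to the labeled nucleotide is added to the reaction mixture at a defined concentration. This method not only overcomes the drawback of the two-step method, in which the DNA sequence immediately downstream of the primer cannot be read, but also simplifies the reaction steps and gives satisfactory sequencing results, with at least 400 bases read per run.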

16.
Multiplex polymerase chain reaction (PCR) has multiple applications in molecular biology, including developing new targeted next-generation sequencing (NGS) panels. We present NGS-PrimerPlex, an efficient and versatile command-line application that designs primers for different refined types of amplicon-based genome target enrichment. It supports nested and anchored multiplex PCR, redistribution among multiplex reactions of primers constructed earlier, and extension of existing NGS panels. The primer design process takes into consideration the formation of secondary structures, non-target amplicons between all primers of a pool, and overlap of primers with high-frequency genomic single-nucleotide polymorphisms (SNPs). Moreover, users of NGS-PrimerPlex do not need to define input genome regions manually, because this can be done automatically from a list of genes or their parts, such as exon or codon numbers. Using the program, an NGS panel for sequencing the LRRK2 gene coding regions was created, and 354 DNA samples were studied successfully with a median coverage of 97.4% of target regions by at least 30 reads. To show that NGS-PrimerPlex can also be applied to bacterial genomes, we designed primers to detect the foodborne pathogens Salmonella enterica, Escherichia coli O157:H7, Listeria monocytogenes, and Staphylococcus aureus, taking variable positions of the genomes into account.

17.
Recently, 454 sequencing has emerged as a popular method for isolating microsatellites owing to cost-effectiveness and time saving. In this study, repeat-enriched libraries from two southern African endemic sparids (Pachymetopon blochii and Lithognathus lithognathus) were 454 GS-FLX sequenced. From these, 7370 sequences containing repeats (SCRs) were identified. A brief survey of 23 studies showed a significant difference between the number of SCRs when enrichment was performed first before 454 sequencing. We designed primers for 302 unique fragments containing more than five repeat units and suitable flanking regions. A fraction (<11%) of these loci were characterized with 18 polymorphic microsatellite loci (nine in each of the focal species) being described. Sanger sequencing of alleles confirmed that size variation was because of differences in the number of tandem repeats. However, a case of homoplasy and sequencing errors in the 454 sequencing were identified. These newly developed and four previously isolated loci were successfully used to identify polymorphic markers in nine other economically important species, representative of sparid diversity. The combination of newly developed markers with data from previous sparid cross-species studies showed a significant negative correlation between genetic divergence to focal species and microsatellite transferability. The high level of transferability we described (48% amplification success and 32% polymorphism) suggests that the 302 microsatellite loci identified represent an excellent resource for future studies on sparids. Microsatellite marker development should commonly include tests of transferability to reduce costs and increase feasibility of population genetics studies in nonmodel organisms.  相似文献   

18.
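Application of the 16S rRNA gene in microbial ecology   Total citations: 10, self-citations: 0, citations by others: 10
The 16S rRNA (small subunit ribosomal RNA) gene is the most commonly used molecular marker (biomarker) for the phylogenetic classification of prokaryotic microorganisms and is widely used in microbial ecology research. In recent years, with continuing advances in high-throughput sequencing technologies and data analysis methods, a large body of 16S rRNA gene-based research has driven rapid progress in microbial ecology. However, using the 16S rRNA gene as a molecular marker also raises a number of problems, such as horizontal gene transfer, heterogeneity among multiple gene copies, differences in amplification efficiency, and the choice of data analysis method, all of which affect the accuracy of analyses of microbial community composition and diversity. This review summarizes current progress in using the 16S rRNA gene to analyze microbial community composition and diversity, focusing on the main outstanding problems and on the development of analysis methods, especially the experimental and data-processing issues associated with high-throughput sequencing.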

19.
The emergence of next-generation sequencing (NGS) technologies has significantly improved sequencing throughput and reduced costs. However, the short read length, duplicate reads and massive volume of data make data processing much more difficult and complicated than with first-generation sequencing technology. Although some software packages have been developed to assess data quality, those packages either are not easily available to users or require bioinformatics skills and computer resources. Moreover, almost none of the quality assessment software currently available takes sequencing errors into account when assessing duplicates in NGS data. Here, we present a new user-friendly quality assessment software package called BIGpre, which works for both Illumina and 454 platforms. BIGpre contains all the functions of other quality assessment software, such as the correlation between forward and reverse reads, read GC-content distribution, and base Ns quality. More importantly, BIGpre incorporates associated programs to detect and remove duplicate reads after taking sequencing errors into account, and to trim low-quality reads from raw data as well. BIGpre is primarily written in Perl and integrates graphical capability from the statistics package R. This package produces both tabular and graphical summaries of data quality for sequencing datasets from Illumina and 454 platforms. Processing hundreds of millions of reads within minutes, this package provides immediate diagnostic information that lets users prepare sequencing data for downstream analyses. BIGpre is freely available at http://bigpre.sourceforge.net/.

20.
In its basic concept, in vitro DNA amplification by the polymerase chain reaction (PCR) is restricted to those instances in which segments of known sequence flank the fragment to be amplified. Recently, techniques have been developed for amplification of unknown DNA sequences. These techniques, however, are dependent on the presence of suitable restriction endonuclease sites. Here, we describe a strategy for PCR amplification of DNA that lies outside the boundaries of known sequence. It is based on the use of one specific primer, homologous to the known sequence, and one semi-random primer. Restriction sites in the 5' proximal regions of both primers allow for cloning of the amplified DNA in a suitable sequencing vector or any other vector. It was shown by sequence analysis that the cloned DNA fragments represent contiguous DNA fragments that are flanked at one side by the sequence of the specific primer. When omitting the semi-random primer, a single clone was obtained, which originated from PCR amplification of target DNA by the specific primer in both directions.  相似文献   
