Similar literature
1.
Targeted high-throughput sequencing of tagged nucleic acid samples   Total citations: 6, self-citations: 2, citations by others: 4
High-throughput 454 DNA sequencing technology allows much faster and more cost-effective sequencing than traditional Sanger sequencing. However, the technology imposes inherent limitations on the number of samples that can be processed in parallel. Here we introduce parallel tagged sequencing (PTS), a simple, inexpensive and flexible barcoding technique that can be used for parallel sequencing of any number and type of double-stranded nucleic acid samples. We demonstrate that PTS is particularly powerful for sequencing contiguous DNA fragments such as mtDNA genomes: in theory as many as 250 mammalian mtDNA genomes can be sequenced in a single GS FLX run. PTS dramatically increases the sequencing throughput of samples in parallel and thus fully mobilizes the resources of the 454 technology for targeted sequencing.

2.
High-throughput sequencing (HTS) of PCR amplicons is becoming the method of choice to sequence one or several targeted loci for phylogenetic and DNA barcoding studies. Although the development of HTS has allowed rapid generation of massive amounts of DNA sequence data, preparing amplicons for HTS remains a rate-limiting step. For example, HTS platforms require platform-specific adapter sequences to be present at the 5′ and 3′ ends of the DNA fragment to be sequenced. In addition, short multiplex identifier (MID) tags are typically added to allow multiple samples to be pooled in a single HTS run. Existing methods to incorporate HTS adapters and MID tags into PCR amplicons are either inefficient, requiring multiple enzymatic reactions and clean-up steps, or costly when applied to multiple samples or loci (fusion primers). We describe a method to amplify a target locus and add HTS adapters and MID tags via a linker sequence using a single PCR. We demonstrate our approach by generating reference sequence data for two mitochondrial loci (COI and 16S) for a diverse suite of insect taxa. Our approach provides a flexible, cost-effective and efficient method to prepare amplicons for HTS.

3.
DNA barcoding is an efficient method to identify specimens and to detect undescribed/cryptic species. Sanger sequencing of individual specimens is the standard approach in generating large-scale DNA barcode libraries and identifying unknowns. However, the Sanger sequencing technology is, in some respects, inferior to next-generation sequencers, which are capable of producing millions of sequence reads simultaneously. Additionally, direct Sanger sequencing of DNA barcode amplicons, as practiced in most DNA barcoding procedures, is hampered by the need for relatively high target amplicon yield, coamplification of nuclear mitochondrial pseudogenes, confusion with sequences from intracellular endosymbiotic bacteria (e.g. Wolbachia) and instances of intraindividual variability (i.e. heteroplasmy). Any of these situations can lead to failed Sanger sequencing attempts or ambiguity of the generated DNA barcodes. Here, we demonstrate the potential application of next-generation sequencing platforms for parallel acquisition of DNA barcode sequences from hundreds of specimens simultaneously. To facilitate retrieval of sequences obtained from individual specimens, we tag individual specimens during PCR amplification using unique 10-mer oligonucleotides attached to DNA barcoding PCR primers. We employ 454 pyrosequencing to recover full-length DNA barcodes of 190 specimens using 12.5% capacity of a 454 sequencing run (i.e. two lanes of a 16 lane run). We obtained an average of 143 sequence reads for each individual specimen. The sequences produced are full-length DNA barcodes for all but one of the included specimens. In a subset of samples, we also detected Wolbachia, nontarget species, and heteroplasmic sequences. Next-generation sequencing is of great value because of its protocol simplicity, greatly reduced cost per barcode read, faster throughput and added information content.

4.
DNA-encoded chemical libraries are large collections of small organic molecules, individually coupled to DNA fragments that serve as amplifiable identification bar codes. The isolation of specific binders requires a quantitative analysis of the distribution of DNA fragments in the library before and after capture on an immobilized target protein of interest. Here, we show how Illumina sequencing can be applied to the analysis of DNA-encoded chemical libraries, yielding over 10 million DNA sequence tags per flow-lane. The technology can be used in a multiplex format, allowing the encoding and subsequent sequencing of multiple selections in the same experiment. The sequence distributions in DNA-encoded chemical library selections were found to be similar to the ones obtained using 454 technology, thus reinforcing the concept that DNA sequencing is an appropriate avenue for the decoding of library selections. The large number of sequences obtained with the Illumina method now enables the study of very large DNA-encoded chemical libraries (>500,000 compounds) and reduces decoding costs.

5.
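To find a DNA barcode sequence suitable for identifying the source plants of the Chinese medicinal material Curcumae Rhizoma (Ezhu, 莪术), and to explore a new rapid and efficient identification method, this study first evaluated seven DNA barcode sequences (ITS, ITS2, matK, psbA-trnH, trnL-trnF, rpoB and atpB-rbcL) in nine samples from the three source plant species of Ezhu, using amplification and sequencing success rates. The high-quality sequences obtained were then further evaluated with MEGA 6.0 through analysis of variable sites, calculation of genetic distances and phylogenetic tree analysis. Finally, the selected DNA barcode was used to identify the source of a test sample of unknown origin. The results showed that: (1) ITS, ITS2 and matK had low amplification or sequencing success rates in the source plants of Ezhu and were difficult to apply in practical identification; psbA-trnH, trnL-trnF and rpoB had too few variable sites to distinguish the three source plants; only atpB-rbcL had high amplification and sequencing success rates, readily yielded high-quality sequences, had an ideal sequence length (642-645 bp) and many variable sites (11), and could distinguish the three source plants of Ezhu. (2) The test sample was identified as Curcuma wenyujin (温郁金) by a phylogenetic tree built from atpB-rbcL sequences. In summary, the chloroplast atpB-rbcL sequence can accurately identify the different source plants of Ezhu and can serve as a barcode sequence for identifying the source plants of this medicinal material.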

6.
BC Faircloth, TC Glenn. PLoS ONE, 2012, 7(8): e42543
Ligating adapters with unique synthetic oligonucleotide sequences (sequence tags) onto individual DNA samples before massively parallel sequencing is a popular and efficient way to obtain sequence data from many individual samples. Tag sequences should be numerous and sufficiently different to ensure sequencing, replication, and oligonucleotide synthesis errors do not cause tags to be unrecoverable or confused. However, many design approaches only protect against substitution errors during sequencing and extant tag sets contain too few tag sequences. We developed an open-source software package to validate sequence tags for conformance to two distance metrics and design sequence tags robust to indel and substitution errors. We use this software package to evaluate several commercial and non-commercial sequence tag sets, design several large sets (maxcount = 7,198) of edit metric sequence tags having different lengths and degrees of error correction, and integrate a subset of these edit metric tags to polymerase chain reaction (PCR) primers and sequencing adapters. We validate a subset of these edit metric tagged PCR primers and sequencing adapters by sequencing on several platforms and subsequent comparison to commercially available alternatives. We find that several commonly used sets of sequence tags or design methodologies used to produce sequence tags do not meet the minimum expectations of their underlying distance metric, and we find that PCR primers and sequencing adapters incorporating edit metric sequence tags designed by our software package perform as well as their commercial counterparts. We suggest that researchers evaluate sequence tags prior to use or evaluate tags that they have been using. The sequence tag sets we design improve on extant sets because they are large, valid across the set, and robust to the suite of substitution, insertion, and deletion errors affecting massively parallel sequencing workflows on all currently used platforms.
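
The kind of check this abstract describes, validating a tag set against an edit (Levenshtein) distance threshold, can be sketched as follows. This is a minimal illustration and not the authors' software: the function names `edit_distance` and `violating_pairs`, and the example tags and threshold, are assumptions made for this sketch.

```python
from itertools import combinations


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: minimum substitutions, insertions and deletions."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion from a
                            curr[j - 1] + 1,      # insertion into a
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1]


def violating_pairs(tags, min_distance):
    """Return (tag_a, tag_b, distance) for every pair closer than min_distance."""
    bad = []
    for a, b in combinations(tags, 2):
        d = edit_distance(a, b)
        if d < min_distance:
            bad.append((a, b, d))
    return bad


# Hypothetical tags; a set is "valid" at min_distance when this list is empty.
print(violating_pairs(["AAAA", "AAAT", "CCGG"], min_distance=3))
```

A set whose minimum pairwise edit distance is at least 3 can in principle correct one error of any type, which is why the threshold is applied to every pair rather than to a sample of pairs.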

7.
A goal of many environmental DNA barcoding studies is to infer quantitative information about relative abundances of different taxa based on sequence read proportions generated by high-throughput sequencing. However, potential biases associated with this approach are only beginning to be examined. We sequenced DNA amplified from faeces (scats) of captive harbour seals (Phoca vitulina) to investigate whether sequence counts could be used to quantify the seals’ diet. Seals were fed fish in fixed proportions, a chordate-specific mitochondrial 16S marker was amplified from scat DNA and amplicons sequenced using an Ion Torrent PGM™. For a given set of bioinformatic parameters, there was generally low variability between scat samples in proportions of prey species sequences recovered. However, proportions varied substantially depending on sequencing direction, level of quality filtering (due to differences in sequence quality between species) and minimum read length considered. Short primer tags used to identify individual samples also influenced species proportions. In addition, there were complex interactions between factors; for example, the effect of quality filtering was influenced by the primer tag and sequencing direction. Resequencing of a subset of samples revealed some, but not all, biases were consistent between runs. Less stringent data filtering (based on quality scores or read length) generally produced more consistent proportional data, but overall proportions of sequences were very different from dietary mass proportions, indicating additional technical or biological biases are present. Our findings highlight that quantitative interpretations of sequence proportions generated via high-throughput sequencing will require careful experimental design and thoughtful data analysis.

8.
Timely and accurate biodiversity analysis poses an ongoing challenge for the success of biomonitoring programs. Morphology-based identification of bioindicator taxa is time consuming, and rarely supports species-level resolution especially for immature life stages. Much work has been done in the past decade to develop alternative approaches for biodiversity analysis using DNA sequence-based approaches such as molecular phylogenetics and DNA barcoding. On-going assembly of DNA barcode reference libraries will provide the basis for a DNA-based identification system. The use of recently introduced next-generation sequencing (NGS) approaches in biodiversity science has the potential to further extend the application of DNA information for routine biomonitoring applications to an unprecedented scale. Here we demonstrate the feasibility of using 454 massively parallel pyrosequencing for species-level analysis of freshwater benthic macroinvertebrate taxa commonly used for biomonitoring. We designed our experiments in order to directly compare morphology-based, Sanger sequencing DNA barcoding, and next-generation environmental barcoding approaches. Our results show the ability of 454 pyrosequencing of mini-barcodes to accurately identify all species with more than 1% abundance in the pooled mixture. Although the approach failed to identify 6 rare species in the mixture, the presence of sequences from 9 species that were not represented by individuals in the mixture provides evidence that DNA based analysis may yet provide a valuable approach in finding rare species in bulk environmental samples. We further demonstrate the application of the environmental barcoding approach by comparing benthic macroinvertebrates from an urban region to those obtained from a conservation area. Although considerable effort will be required to robustly optimize NGS tools to identify species from bulk environmental samples, our results indicate the potential of an environmental barcoding approach for biomonitoring programs.

9.
Single-cell genomic sequencing using Multiple Displacement Amplification   Total citations: 1, self-citations: 0, citations by others: 1
Single microbial cells can now be sequenced using DNA amplified by the Multiple Displacement Amplification (MDA) reaction. The few femtograms of DNA in a bacterium are amplified into micrograms of high molecular weight DNA suitable for DNA library construction and Sanger sequencing. The MDA-generated DNA also performs well when used directly as template for pyrosequencing by the 454 Life Sciences method. While MDA from single cells loses some of the genomic sequence, this approach will greatly accelerate the pace of sequencing from uncultured microbes. The genetically linked sequences from single cells are also a powerful tool to be used in guiding genomic assembly of shotgun sequences of multiple organisms from environmental DNA extracts (metagenomic sequences).

10.
The well-known massively parallel sequencing method is efficient and it can obtain sequence data from multiple individual samples. In order to ensure that sequencing, replication, and oligonucleotide synthesis errors do not result in tags (or barcodes) that are unrecoverable or confused, the tag sequences should be abundant and sufficiently different. Recently, many design methods have been proposed for correcting errors in data using error-correcting codes. The existing tag sets contain small tag sequences, so we used a modified genetic algorithm to improve the lower bound of the tag sets in this study. Compared with previous research, our algorithm is effective for designing sets of DNA tags. Moreover, the GC content determined by existing methods includes an imprecise range. Thus, we improved the GC content determination method to obtain tag sets that control the GC content in a more precise range. Finally, previous studies have only considered perfect self-complementarity. Thus, we considered the crossover between different tags and introduced an improved constraint into the design of tag sets.
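
Two of the constraints mentioned in this abstract, a bounded GC content and the exclusion of perfectly self-complementary tags, can be illustrated with a short filter. This is only a sketch under stated assumptions: the 40-60% GC window, the function names and the example tags are hypothetical, and the abstract's genetic algorithm and its improved cross-tag constraint are not reproduced here.

```python
# Translation table for complementing DNA bases (A<->T, C<->G).
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def gc_content(tag: str) -> float:
    """Fraction of bases in the tag that are G or C."""
    return sum(base in "GC" for base in tag) / len(tag)


def reverse_complement(tag: str) -> str:
    """Reverse complement of a DNA tag."""
    return tag.translate(COMPLEMENT)[::-1]


def passes_constraints(tag: str, gc_min: float = 0.4, gc_max: float = 0.6) -> bool:
    """Keep a tag if its GC content is in range and it is not its own reverse complement.

    gc_min and gc_max are assumed example bounds, not values from the abstract.
    """
    in_gc_range = gc_min <= gc_content(tag) <= gc_max
    self_complementary = tag == reverse_complement(tag)
    return in_gc_range and not self_complementary


# Hypothetical candidate tags.
candidates = ["ACGT", "AACG", "AAAT"]
print([t for t in candidates if passes_constraints(t)])
```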

11.
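The ITS, ITS2, psbA-trnH, rbcL and matK sequences of Vitis amurensis (Amur grape) germplasm samples were PCR-amplified and sequenced, and the annealing temperature of the PCR was optimized. Amplification efficiency, sequencing success rate, inter- and intra-cultivar variation and barcoding gap plots were compared among the sequences, and BLAST and neighbor-joining (NJ) tree methods were used to compare their identification ability, in order to select, from the five DNA regions, universal DNA barcode sequences for identifying V. amurensis germplasm. The results showed that, among the 33 samples collected from 11 accessions, psbA-trnH and ITS2 had higher amplification and sequencing success rates, and their inter- and intra-cultivar variation and barcoding gaps showed clear advantages over ITS, rbcL and matK; in addition, ITS2 could identify cultivars that psbA-trnH could not. The experiments show that ITS2 and psbA-trnH are a DNA barcode combination well suited to identifying V. amurensis resources. DNA barcoding compensates for the shortcomings of morphological identification and can provide a scientific basis for the accurate identification of V. amurensis germplasm.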

12.
DNA barcoding has been recently promoted as a method for both assigning specimens to known species and for discovering new and cryptic species. Here we test both the potential and the limitations of DNA barcodes by analysing a group of well-studied organisms--the primates. Our results show that DNA barcodes provide enough information to efficiently identify and delineate primate species, but that they cannot reliably uncover many of the deeper phylogenetic relationships. Our conclusion is that these short DNA sequences do not contain enough information to build reliable molecular phylogenies or define new species, but that they can provide efficient sequence tags for assigning unknown specimens to known species. As such, DNA barcoding provides enormous potential for use in global biodiversity studies.

13.
DNA barcoding is a promising approach to the diagnosis of biological diversity in which DNA sequences serve as the primary key for information retrieval. Most existing software for evolutionary analysis of DNA sequences was designed for phylogenetic analyses and, hence, those algorithms do not offer appropriate solutions for the rapid, but precise analyses needed for DNA barcoding, and are also unable to process the often large comparative datasets. We developed a flexible software tool for DNA taxonomy, named TaxI. This program calculates sequence divergences between a query sequence (taxon to be barcoded) and each sequence of a dataset of reference sequences defined by the user. Because the analysis is based on separate pairwise alignments this software is also able to work with sequences characterized by multiple insertions and deletions that are difficult to align in large sequence sets (i.e. thousands of sequences) by multiple alignment algorithms because of computational restrictions. Here, we demonstrate the utility of this approach with two datasets of fish larvae and juveniles from Lake Constance and juvenile land snails under different models of sequence evolution. Sets of ribosomal 16S rRNA sequences, characterized by multiple indels, performed as well as or better than cox1 sequence sets in assigning sequences to species, demonstrating the suitability of rRNA genes for DNA barcoding.

14.
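Screening of universal DNA barcode sequences for Malvaceae plants   Total citations: 1, self-citations: 0, citations by others: 1
Wang Ke, Chen Keli, Liu Zhen, Chen Shilin. 《植物学报》 (Chinese Bulletin of Botany), 2011, 46(3): 276-284
The ITS, ITS2, rbcL, matK and psbA-trnH sequences of Malvaceae samples were PCR-amplified and sequenced. Amplification efficiency, sequencing success rate, differences between intra- and interspecific variation, and barcoding gap plots were compared among the sequences, and the BLAST1 and Nearest Distance methods were used to evaluate identification ability, in order to select from these candidates the DNA barcode sequences best suited to identifying Malvaceae plants. The results showed that ITS had a high amplification success rate across the 26 collected samples from 11 Malvaceae species, and that its intra- versus interspecific variation and barcoding gap showed clearer advantages than those of ITS2, psbA-trnH and rbcL; after including online data for 1,228 samples from 316 species in 60 genera, its identification success rate reached 89.9%. psbA-trnH had the highest amplification and sequencing success rates, with an identification success rate of 63.2%, and could identify some species that ITS could not. The results indicate that ITS and psbA-trnH are a DNA barcode combination well suited to identifying Malvaceae plants.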

15.
Chloroplast DNA sequence data are a versatile tool for plant identification or barcoding and establishing genetic relationships among plant species. Different chloroplast loci have been utilized for use at close and distant evolutionary distances in plants, and no single locus has been identified that can distinguish between all plant species. Advances in DNA sequencing technology are providing new cost-effective options for genome comparisons on a much larger scale. Universal PCR amplification of chloroplast sequences or isolation of pure chloroplast fractions, however, are non-trivial. We now propose the analysis of chloroplast genome sequences from massively parallel sequencing (MPS) of total DNA as a simple and cost-effective option for plant barcoding, and analysis of plant relationships to guide gene discovery for biotechnology. We present chloroplast genome sequences of five grass species derived from MPS of total DNA. These data accurately established the phylogenetic relationships between the species, correcting an apparent error in the published rice sequence. The chloroplast genome may be the elusive single-locus DNA barcode for plants.

16.
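The ITS, ITS2, rbcL, matK and psbA-trnH sequences of Malvaceae samples were PCR-amplified and sequenced. Amplification efficiency, sequencing success rate, differences between intra- and interspecific variation, and barcoding gap plots were compared among the sequences, and the BLAST1 and Nearest Distance methods were used to evaluate identification ability, in order to select from these candidates the DNA barcode sequences best suited to identifying Malvaceae plants. The results showed that ITS had a high amplification success rate across the 26 collected samples from 11 Malvaceae species, and that its intra- versus interspecific variation and barcoding gap showed clearer advantages than those of ITS2, psbA-trnH and rbcL; after including online data for 1,228 samples from 316 species in 60 genera, its identification success rate reached 89.9%. psbA-trnH had the highest amplification and sequencing success rates, with an identification success rate of 63.2%, and could identify some species that ITS could not. The results indicate that ITS and psbA-trnH are a DNA barcode combination well suited to identifying Malvaceae plants.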

17.
DNA barcoding remains a challenge when applied to diet analyses, ancient DNA studies, environmental DNA samples and, more generally, in any cases where DNA samples have not been adequately preserved. Because the size of the commonly used barcoding marker (COI) is over 600 base pairs (bp), amplification fails when the DNA molecule is degraded into smaller fragments. However, relevant information for specimen identification may not be evenly distributed along the barcoding region, and a shorter target can be sufficient for identification purposes. This study proposes a new, widely applicable method to compare the performance of all potential 'mini-barcodes' for a given molecular marker and to objectively select the shortest and most informative one. Our method is based on a sliding window analysis implemented in the new R package SPIDER (Species IDentity and Evolution in R). This method is applicable to any taxon and any molecular marker. Here, it was tested on earthworm DNA that had been degraded through digestion by carnivorous landsnails. A 100 bp region of 16S rDNA was selected as the shortest informative fragment (mini-barcode) required for accurate specimen identification. Corresponding primers were designed and used to amplify degraded earthworm (prey) DNA from 46 landsnail (predator) faeces using 454-pyrosequencing. This led to the detection of 18 earthworm species in the diet of the snail. We encourage molecular ecologists to use this method to objectively select the most informative region of the gene they aim to amplify from degraded DNA. The method and tools provided here can be particularly useful (1) when dealing with degraded DNA for which only small fragments can be amplified, (2) for cases where no consensus has yet been reached on the appropriate barcode gene, or (3) to allow direct analysis of short reads derived from massively parallel sequencing without the need for bioinformatic consolidation.
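
The general idea of a sliding window analysis over an alignment can be sketched as below. This is a simplified illustration that only counts variable sites per window; it is not the SPIDER package's implementation, which is written in R and evaluates more than variable-site counts. The function names and the toy alignment are assumptions made for this sketch.

```python
def variable_sites(alignment):
    """One flag per alignment column: True if the column holds more than one base."""
    return [len(set(column)) > 1 for column in zip(*alignment)]


def sliding_window_counts(alignment, width, step=1):
    """Return (start, number_of_variable_sites) for each window of the given width."""
    flags = variable_sites(alignment)
    last_start = len(flags) - width
    return [(start, sum(flags[start:start + width]))
            for start in range(0, last_start + 1, step)]


def best_window(alignment, width):
    """Leftmost window with the most variable sites (a crude proxy for informativeness)."""
    return max(sliding_window_counts(alignment, width), key=lambda item: item[1])


# Toy alignment of equal-length sequences (hypothetical data).
toy = ["AAAAAAAA",
       "AAAACTAA",
       "AAAAGTAA"]
print(sliding_window_counts(toy, width=3))
print(best_window(toy, width=3))
```

Counting variable sites is the simplest possible score; a fuller analysis would score each window by how well it separates species, for example by comparing intra- and interspecific distances within the window.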

18.
Multiplexed high-throughput pyrosequencing is currently limited in complexity (number of samples sequenced in parallel), and in capacity (number of sequences obtained per sample). Physical-space segregation of the sequencing platform into a fixed number of channels allows limited multiplexing, but obscures available sequencing space. To overcome these limitations, we have devised a novel barcoding approach to allow for pooling and sequencing of DNA from independent samples, and to facilitate subsequent segregation of sequencing capacity. Forty-eight forward–reverse barcode pairs are described: each forward and each reverse barcode unique with respect to at least 4 nt positions. With improved read lengths of pyrosequencers, combinations of forward and reverse barcodes may be used to sequence from as many as n² independent libraries for each set of ‘n’ forward and ‘n’ reverse barcodes, for each defined set of cloning-linkers. In two pilot series of barcoded sequencing using the GS20 Sequencer (454/Roche), we found that over 99.8% of obtained sequences could be assigned to 25 independent, uniquely barcoded libraries based on the presence of either a perfect forward or a perfect reverse barcode. The false-discovery rate, as measured by the percentage of sequences with unexpected perfect pairings of unmatched forward and reverse barcodes, was estimated to be <0.005%.
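
The combinatorial point in this abstract, that n forward and n reverse barcodes can label up to n² libraries, can be illustrated with a small demultiplexing sketch. It assumes, purely for illustration, that each read begins with the forward barcode and ends with the reverse barcode as written, and that both must match perfectly; the barcode sequences, library names and function names are hypothetical and do not come from the paper.

```python
from itertools import product


def build_library_map(forward, reverse):
    """Map every (forward, reverse) barcode pair to a library name: n * n entries."""
    return {(f, r): f"lib_{i}_{j}"
            for (i, f), (j, r) in product(enumerate(forward), enumerate(reverse))}


def assign_read(read, library_map, barcode_len):
    """Return the library for a read with perfect barcodes at both ends, else None."""
    key = (read[:barcode_len], read[-barcode_len:])
    return library_map.get(key)


# Two forward and two reverse barcodes give 2 * 2 = 4 distinguishable libraries.
forward = ["ACGT", "TGCA"]
reverse = ["GGAA", "CCTT"]
libraries = build_library_map(forward, reverse)
print(len(libraries))
print(assign_read("ACGT" + "TTTTTTTT" + "CCTT", libraries, barcode_len=4))
```

Requiring both barcodes is stricter than the either-end assignment reported in the pilot runs, but it is what makes the pairing itself informative when combinations are used to label libraries.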

19.
Mapping of genomic DNA methylation is an indispensable part of functional genomics. We have developed a novel method based on methylation-specific primer and serial analysis of gene expression, called MSP-SAGE, with the potential for high-throughput quantification of genomic DNA methylation. We used a 6-mer methylation-specific primer to extend methylated CpG sequences rather than non-methylated CpG sequences. The 17 bp tags, which contain a methylated CpG sequence, were obtained from the extended methylated sequences by restriction endonuclease digestion and were then concatenated and cloned for sequencing. We can identify the locations of methylation according to the sequences of the tags and quantify the methylation status from the frequency of the tags. MSP-SAGE has good linearity over a broad methylation range from 5% to 100%, with good accuracy and high precision. The proof-of-principle study shows that MSP-SAGE is a reliable high-throughput assay for quantification of DNA methylation.

20.
There has been a dramatic increase in the throughput of sequenced bases in recent years, but the ability to sequence a multitude of samples in parallel has not developed equally. Here we present a novel strategy where the combination of two tags is used to link sequencing reads back to their origins from a pool of samples. By incorporating the tags in two steps, sample-handling complexity is lowered by nearly 100 times compared to conventional indexing protocols. In addition, the method described here enables accurate identification and typing of thousands of samples in parallel. In this study the system was designed to test 4992 samples using only 122 tags. To prove the concept of the two-tagging method, the highly polymorphic 2nd exon of DLA-DRB1 in dogs and wolves was sequenced using the 454 GS FLX Titanium Chemistry. By requiring a minimum sequence depth of 20 reads per sample, 94% of the successfully amplified samples were genotyped. In addition, the method allowed digital detection of chimeric fragments. These results demonstrate that it is possible to sequence thousands of samples in parallel without complex pooling patterns or primer combinations. Furthermore, the method is highly scalable, as only a limited number of additional tags leads to a substantial increase in the sample size.
