Similar literature
 20 similar articles found (search time: 15 ms)
1.
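In recent years DNA sequencing technology has developed rapidly, progressing from first-generation biochemical sequencing to third-generation single-molecule sequencing. Unlike other currently popular sequencing technologies, nanopore sequencing, one of the third-generation technologies, is a physical method based on electrical signals. Researchers often apply high-throughput sequencing to the study of food microorganisms, but applications of nanopore sequencing to the detection of microorganisms in food have rarely been reported. The MinION DNA sequencer developed by Oxford Nanopore Technologies was the world's first commercially available nanopore sequencer; after continuous refinement, it has been widely used for DNA sequencing in recent years. A MinION run requires about 1 μg of DNA; its standard recognition speed is 250 bases per second, the average read length can reach 13 kb to 20 kb, and sequencing accuracy can reach 98%. The high recognition speed and high accuracy of nanopore sequencing fully meet the requirements of rapid detection, so applying it to the detection of microorganisms in food is entirely feasible.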

2.
DNA barcodes are useful for species discovery and species identification, but obtaining barcodes currently requires a well-equipped molecular laboratory and is time-consuming and/or expensive. We here address these issues by developing a barcoding pipeline for Oxford Nanopore MinION and demonstrating that one flow cell can generate barcodes for ~500 specimens despite the high basecall error rates of MinION reads. The pipeline overcomes these errors by first summarizing all reads for the same tagged amplicon as a consensus barcode. Consensus barcodes are overall mismatch-free but retain indel errors that are concentrated in homopolymeric regions. They are addressed with an optional error correction pipeline that is based on conserved amino acid motifs from publicly available barcodes. The effectiveness of this pipeline is documented by analysing reads from three MinION runs that represent three different stages of MinION development. They generated data for (i) 511 specimens of a mixed Diptera sample, (ii) 575 specimens of ants and (iii) 50 specimens of Chironomidae. The run based on the latest chemistry yielded MinION barcodes for 490 of the 511 specimens which were assessed against reference Sanger barcodes (N = 471). Overall, the MinION barcodes have an accuracy of 99.3%–100% with the number of ambiguous bases after correction ranging from <0.01% to 1.5% depending on which correction pipeline is used. We demonstrate that it requires ~2 hr of sequencing to gather all information needed for obtaining reliable barcodes for most specimens (>90%). We estimate that up to 1,000 barcodes can be generated in one flow cell and that the cost per barcode can be...
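The idea of summarizing all reads for one amplicon as a consensus can be illustrated with a column-wise majority vote. This is only a minimal sketch, not the pipeline described in the abstract: it assumes the reads have already been aligned to equal length with `-` as the gap character, and the function name `consensus` and the `min_fraction` threshold are hypothetical.

```python
from collections import Counter


def consensus(aligned_reads, min_fraction=0.5):
    """Column-wise majority vote over equal-length, pre-aligned reads.

    Hypothetical helper: '-' marks a gap, and columns without a clear
    majority are reported as 'N' (ambiguous).
    """
    if not aligned_reads:
        return ""
    length = len(aligned_reads[0])
    if any(len(read) != length for read in aligned_reads):
        raise ValueError("reads must be aligned to the same length")
    out = []
    for column in zip(*aligned_reads):
        base, count = Counter(column).most_common(1)[0]
        if count / len(column) <= min_fraction:
            out.append("N")      # no clear majority in this column
        elif base != "-":
            out.append(base)     # columns where the gap wins are dropped
    return "".join(out)


# Toy reads: one substitution error and one spurious insertion.
reads = ["ACGT-A", "ACGTTA", "ACCT-A", "ACGT-A"]
print(consensus(reads))
```

With enough reads per amplicon, isolated substitution errors are outvoted, which is consistent with the abstract's remark that consensus barcodes are largely mismatch-free. Systematic indels in homopolymers are not removed by voting, since most reads may share them.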

3.
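Principles and applications of single-molecule real-time sequencing technology   Cited by: 1 (self-citations: 0, citations by others: 1)
Liu Yanhu  Wang Lu  Yu Li 《遗传》(Hereditas (Beijing)) 2015,37(3):259-268
Single-molecule DNA sequencing is a new generation of sequencing technology developed over the past 10 years, also known as third-generation sequencing; it includes single-molecule real-time sequencing, true single-molecule sequencing, and single-molecule nanopore sequencing. This article introduces the basic principles, performance, and applications of single-molecule real-time (SMRT) sequencing. Compared with Sanger sequencing and next-generation sequencing, SMRT sequencing offers very long reads, short run times, no need for template amplification, and direct detection of epigenetic modification sites, giving researchers a new option. At the same time, the low accuracy of SMRT sequencing (about 85%) is controversial; about 93% of the errors are insertions and deletions, so the data must be error-corrected before being used for genome assembly. SMRT sequencing has already been applied successfully to de novo sequencing and complete assembly of small genomes, and it has shown or will show its strengths in epigenetics, transcriptomics, and large-genome assembly, advancing genomics research.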

4.
Nucleotide insertions and deletions (indels) are responsible for gaps in sequence alignments. Indels are one of the major sources of evolutionary change at the molecular level. We examined the patterns of insertions and deletions in 19 mammalian genomes and found that deletion events are more common than insertions. The numbers of both insertions and deletions decrease rapidly as gap length increases, and single-nucleotide indels are the most frequent of all indel events. The frequencies of both insertions and deletions are well described by a power law. Key Words: Insertion, deletion, gap, indel, mammalian genome.
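A power-law relationship between indel frequency and gap length, f(L) = c · L^(-k), becomes a straight line on log-log axes, so the exponent can be estimated by ordinary least squares on the logarithms. The sketch below is an assumption about how such a fit could be done, not the authors' procedure; the function name `fit_power_law` and the synthetic counts are hypothetical.

```python
import math


def fit_power_law(lengths, counts):
    """Fit counts ~ c * length**(-k) by least squares in log-log space.

    Returns (c, k). Hypothetical helper for illustration only.
    """
    xs = [math.log(length) for length in lengths]
    ys = [math.log(count) for count in counts]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return math.exp(intercept), -slope


# Synthetic data (not from the paper): counts follow 1000 * L**-2 exactly.
gap_lengths = list(range(1, 11))
indel_counts = [1000 * length ** -2 for length in gap_lengths]
c, k = fit_power_law(gap_lengths, indel_counts)
print(f"c = {c:.1f}, k = {k:.2f}")
```

A log-log regression is the simplest choice; on real counts it weights rare long indels as heavily as common short ones, so a maximum-likelihood fit may be preferable.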

5.
Opinions are split when it comes to the significance and thus the weighting of indel characters as phylogenetic markers. This paper attempts to test the phylogenetic information content of indels and nucleotide substitutions by proposing an a priori weighting system for non-protein-coding genes. Theoretically, the system rests on a weighting scheme which is based on a falsificationist approach to cladistic inference. It assigns weights to insertions, deletions and nucleotide substitutions according to their specific number of identical classes of potential falsifiers, resulting in the following system: nucleotide substitutions weight = 3, deletions of n nucleotides weight = (2n–1), and insertions of n nucleotides weight = (5n–1). This weighting system and the utility of indels as phylogenetic markers are tested against a suitable data set of 18S rDNA sequences of Diptera and Strepsiptera taxa together with other Metazoa species. The indels support the same clades as the nucleotide substitution data, and the application of the weighting system increases the corresponding consistency indices of the differentially weighted character types. As a consequence, applying the weighting system seems to be reasonable, and indels appear to be good phylogenetic markers.
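The weighting system stated in the abstract (substitution = 3, deletion of n nucleotides = 2n–1, insertion of n nucleotides = 5n–1) can be written as a small function. The function name `character_weight` and its string labels are hypothetical; only the three formulas come from the text.

```python
def character_weight(kind, n=1):
    """Weight of a character under the scheme quoted in the abstract.

    kind: "substitution", "deletion" or "insertion" (hypothetical labels).
    n:    indel length in nucleotides (ignored for substitutions).
    """
    if kind == "substitution":
        return 3
    if n < 1:
        raise ValueError("indel length must be at least 1")
    if kind == "deletion":
        return 2 * n - 1
    if kind == "insertion":
        return 5 * n - 1
    raise ValueError(f"unknown character type: {kind}")


for kind, n in [("substitution", 1), ("deletion", 1), ("deletion", 3),
                ("insertion", 1), ("insertion", 2)]:
    print(kind, n, character_weight(kind, n))
```

Under these formulas a single-nucleotide deletion (weight 1) counts for less than a substitution (weight 3), while a single-nucleotide insertion (weight 4) counts for more.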

6.
7.
Oxford Nanopore MinION Sequencing and Genome Assembly   Cited by: 1 (self-citations: 0, citations by others: 1)
The revolution of genome sequencing is continuing after the successful second-generation sequencing (SGS) technology. The third-generation sequencing (TGS) technology, led by Pacific Biosciences (PacBio), is progressing rapidly, moving from a technology once only capable of providing data for small genome analysis, or for performing targeted screening, to one that promises high quality de novo assembly and structural variation detection for human-sized genomes. In 2014, the MinION, the first commercial sequencer using nanopore technology, was released by Oxford Nanopore Technologies (ONT). MinION identifies DNA bases by measuring the changes in electrical conductivity generated as DNA strands pass through a biological pore. Its portability, affordability, and speed in data production make it suitable for real-time applications; the release of the long-read sequencer MinION has thus generated much excitement and interest in the genomics community. While de novo genome assemblies can be cheaply produced from SGS data, assembly continuity is often relatively poor, due to the limited ability of short reads to handle long repeats. Assembly quality can be greatly improved by using TGS long reads, since longer reads can more easily span repetitive regions, despite having higher error rates at the base level. The potential of nanopore sequencing has been demonstrated by various studies in genome surveillance at locations where rapid and reliable sequencing is needed, but where resources are limited.

8.
Nanopore-based DNA sequencing is the most promising third-generation sequencing method. It has superior read length, speed, and sample requirements compared with state-of-the-art second-generation methods. However, base-calling still presents substantial difficulty because the resolution of the technique is limited compared with the measured signal/noise ratio. Here we demonstrate a method to decode 3-bp-resolution nanopore electrical measurements into a DNA sequence using a Hidden Markov model. This method shows tremendous potential for accuracy (~98%), even with a poor signal/noise ratio.
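Decoding a noisy signal with a Hidden Markov model is commonly done with the Viterbi algorithm, which finds the most likely sequence of hidden states. The toy below is a generic sketch, not the model from the abstract: the two states, the discretized "lo"/"hi" current levels, and all probabilities are invented for illustration, whereas a real base-caller would use states tied to the 3-bp context in the pore.

```python
import math


def viterbi(observations, states, start_p, trans_p, emit_p):
    """Return the most likely hidden-state path for the observations."""
    # Work in log space so long signals do not underflow.
    score = {s: math.log(start_p[s]) + math.log(emit_p[s][observations[0]])
             for s in states}
    backpointers = []
    for obs in observations[1:]:
        new_score, pointer = {}, {}
        for s in states:
            # Best predecessor for state s at this step.
            prev = max(states, key=lambda p: score[p] + math.log(trans_p[p][s]))
            new_score[s] = (score[prev] + math.log(trans_p[prev][s])
                            + math.log(emit_p[s][obs]))
            pointer[s] = prev
        backpointers.append(pointer)
        score = new_score
    # Trace back from the best final state.
    state = max(states, key=lambda s: score[s])
    path = [state]
    for pointer in reversed(backpointers):
        state = pointer[state]
        path.append(state)
    return path[::-1]


# Invented toy model: two "bases", two discretized current levels.
STATES = ["A", "C"]
START = {"A": 0.5, "C": 0.5}
TRANS = {"A": {"A": 0.7, "C": 0.3}, "C": {"A": 0.3, "C": 0.7}}
EMIT = {"A": {"lo": 0.9, "hi": 0.1}, "C": {"lo": 0.2, "hi": 0.8}}

print(viterbi(["lo", "lo", "hi", "hi"], STATES, START, TRANS, EMIT))
```

The transition probabilities let the decoder tolerate individual noisy measurements, which is one reason an HMM can remain accurate at a poor signal/noise ratio.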

9.
Brandström M  Ellegren H 《Genetics》2007,176(3):1691-1701
It is increasingly recognized that insertions and deletions (indels) are an important source of genetic as well as phenotypic divergence and diversity. We analyzed length polymorphisms identified through partial (0.25x) shotgun sequencing of three breeds of domestic chicken made by the International Chicken Polymorphism Map Consortium. A data set of 140,484 short indel polymorphisms in unique DNA was identified after filtering for microsatellite structures. There was a significant excess of tandem duplicates at indel sites, with deletions of a duplicate motif outnumbering the generation of duplicates through insertion. Indel density was lower in microchromosomes than in macrochromosomes, in the Z chromosome than in autosomes, and in 100 bp of upstream sequence, 5'-UTR, and first introns than in intergenic DNA and in other introns. Indel density was highly correlated with single nucleotide polymorphism (SNP) density. The mean density of indels in pairwise sequence comparisons was 1.9 × 10^-4 indel events/bp, approximately 5% of the density of SNPs segregating in the chicken genome. The great majority of indels involved a limited number of nucleotides (median 1 bp), with A-rich motifs being overrepresented at indel sites. The overrepresentation of deletions at tandem duplicates indicates that replication slippage in duplicate sequences is a common mechanism behind indel mutation. The correlation between indel and SNP density indicates common effects of mutation and/or selection on the occurrence of indels and point mutations.

10.
11.
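Abstract. Objective: To verify whether different high-fidelity DNA polymerases affect nanopore sequencing of the novel coronavirus using the ARTIC workflow. Methods: Using the MinION sequencer from the UK company Oxford Nanopore, nucleic acid samples from two confirmed COVID-19 cases whose whole-genome sequences had already been obtained were amplified by the ARTIC workflow multiplex PCR with KAPA HiFi HotStart ReadyMix, PrimeSTAR GXL DNA Polymerase, and NEBNext High-Fidelity 2X PCR Master Mix, respectively; the amplification products were sequenced and the sequencing quality was analyzed. Results: Under the same amplification conditions, the different high-fidelity DNA polymerases gave different quality-control results for the amplification products and different sequencing quality; NEBNext High-Fidelity 2X PCR Master Mix was clearly better than the other two enzymes in coverage and sequencing depth. Conclusion: NEBNext High-Fidelity 2X PCR Master Mix performs well in the ARTIC rapid nanopore sequencing workflow for the novel coronavirus.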

12.
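With continual advances in high-throughput sequencing, third-generation technologies that read nucleotide sequences at the single-molecule level have developed rapidly. Nanopore sequencing is a representative single-molecule technology: it identifies bases by detecting changes in the transmembrane current as a single-stranded DNA molecule passes through a nanopore. Nanopore sequencers have clear advantages over traditional first- and second-generation sequencing in portability, base-reading speed, and read length. With nanopore...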

13.
《Biophysical journal》2022,121(5):742-754
Transmembrane protein channels enable fast and highly sensitive detection of single molecules. Nanopore sequencing of DNA was achieved using an engineered Mycobacterium smegmatis porin A (MspA) in combination with a motor enzyme. Due to its favorable channel geometry, the octameric MspA pore exhibits the highest current level compared with other pore proteins. To date, MspA is the only protein nanopore with a published record of DNA sequencing. While widely used in commercial devices, nanopore sequencing of DNA suffers from significant base-calling errors due to stochastic events of the complex DNA-motor-pore combination and the contribution of up to five nucleotides to the signal at each position. Different mutations in specific subunits of a pore protein offer an enormous potential to improve nucleotide resolution and sequencing accuracy. However, individual subunits of MspA and other oligomeric protein pores are randomly assembled in vivo and in vitro, preventing the efficient production of designed pores with different subunit mutations. In this study, we converted octameric MspA into a single-chain pore by connecting eight subunits using peptide linkers. Lipid bilayer experiments demonstrated that single-chain MspA formed membrane-spanning channels and discriminated all four nucleotides identically to MspA produced from monomers in DNA hairpin experiments. Single-chain constructs comprising three, five, six, and seven connected subunits assembled to functional channels, demonstrating a remarkable plasticity of MspA to different subunit stoichiometries. Thus, single-chain MspA constitutes a new milestone in the optimization of MspA as a biosensor for DNA sequencing and many other applications by enabling the production of pores with distinct subunit mutations and pore diameters.

14.

Background

Ultra-deep pyrosequencing (UDPS) is used to identify rare sequence variants. The sequence depth is influenced by several factors including the error frequency of PCR and UDPS. This study investigated the characteristics and source of errors in raw and cleaned UDPS data.

Results

UDPS of a 167-nucleotide fragment of the HIV-1 SG3Δenv plasmid was performed on the Roche/454 platform. The plasmid was diluted to one copy, PCR amplified and subjected to bidirectional UDPS on three occasions. The dataset consisted of 47,693 UDPS reads. Raw UDPS data had an average error frequency of 0.30% per nucleotide site. Most errors were insertions and deletions in homopolymeric regions. We used a cleaning strategy that removed almost all indel errors, but had little effect on substitution errors, which reduced the error frequency to 0.056% per nucleotide. In cleaned data the error frequency was similar in homopolymeric and non-homopolymeric regions, but varied considerably across sites. These site-specific error frequencies were moderately, but still significantly, correlated between runs (r = 0.15–0.65) and between forward and reverse sequencing directions within runs (r = 0.33–0.65). Furthermore, transition errors were 48-times more common than transversion errors (0.052% vs. 0.001%; p<0.0001). Collectively the results indicate that a considerable proportion of the sequencing errors that remained after data cleaning were generated during the PCR that preceded UDPS.
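The transition versus transversion comparison above rests on a simple classification: a substitution between two purines (A, G) or two pyrimidines (C, T) is a transition, and any other substitution is a transversion. The sketch below tallies both per nucleotide site. It assumes reads already aligned to the reference at equal length, and the function names and toy reads are hypothetical rather than taken from the study.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def substitution_type(ref, alt):
    """Classify a single-base substitution as transition or transversion."""
    pair = {ref.upper(), alt.upper()}
    if len(pair) != 2 or not pair <= PURINES | PYRIMIDINES:
        raise ValueError("expected two different bases from A, C, G, T")
    if pair <= PURINES or pair <= PYRIMIDINES:
        return "transition"
    return "transversion"


def substitution_error_frequencies(reference, reads):
    """Per-site transition and transversion frequencies over aligned reads.

    Positions where either base is not A, C, G or T (gaps, N) are skipped.
    """
    counts = {"transition": 0, "transversion": 0}
    sites = 0
    for read in reads:
        for ref, base in zip(reference, read):
            if ref not in "ACGT" or base not in "ACGT":
                continue
            sites += 1
            if ref != base:
                counts[substitution_type(ref, base)] += 1
    if sites == 0:
        raise ValueError("no comparable sites")
    return {kind: n / sites for kind, n in counts.items()}


# Toy data: one A>G transition and one G>T transversion in 12 sites.
print(substitution_error_frequencies("ACGT", ["ACGT", "GCGT", "ACTT"]))
```

A strong excess of transitions, as reported here, is one of the observations the authors use to attribute the residual errors to the PCR step rather than to the sequencing chemistry.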

Conclusions

A majority of the sequencing errors that remained after data cleaning were introduced by PCR prior to sequencing, which means that they will be independent of platform used for next-generation sequencing. The transition vs. transversion error bias in cleaned UDPS data will influence the detection limits of rare mutations and sequence variants.

15.
It is generally accepted that cancers result from the aggregation of somatic mutations. The emergence of next-generation sequencing (NGS) technologies during the past half-decade has enabled studies of cancer genomes with high sensitivity and resolution through whole-genome and whole-exome sequencing approaches, among others. This saltatory advance introduces the possibility of assembling multiple cancer genomes for analysis in a cost-effective manner. Analytical approaches are now applied to the detection of a number of somatic genome alterations, including nucleotide substitutions, insertions/deletions, copy number variations, and chromosomal rearrangements. This review provides a thorough introduction to the cancer genomics pipeline as well as a case study of these methods put into practice.

16.

Background

The Ion Torrent PGM is a popular benchtop sequencer that shows promise in replacing conventional Sanger sequencing as the gold standard for mutation detection. Despite the PGM’s reported high accuracy in calling single nucleotide variations, it tends to generate many false positive calls in detecting insertions and deletions (indels), which may hinder its utility for clinical genetic testing.

Results

Recently, the proprietary analytical workflow for the Ion Torrent sequencer, Torrent Suite (TS), underwent a series of upgrades. We evaluated three major upgrades of TS by calling indels in the BRCA1 and BRCA2 genes. Our analysis revealed that false negative indels could be generated by TS under both default calling parameters and parameters adjusted for maximum sensitivity. However, indel calling with the same data using the open source variant callers, GATK and SAMtools showed that false negatives could be minimised with the use of appropriate bioinformatics analysis. Furthermore, we identified two variant calling measures, Quality-by-Depth (QD) and VARiation of the Width of gaps and inserts (VARW), which substantially reduced false positive indels, including non-homopolymer associated errors without compromising sensitivity. In our best case scenario that involved the TMAP aligner and SAMtools, we achieved 100% sensitivity, 99.99% specificity and 29% False Discovery Rate (FDR) in indel calling from all 23 samples, which is a good performance for mutation screening using PGM.
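The three figures quoted above follow from standard confusion-matrix definitions: sensitivity = TP/(TP+FN), specificity = TN/(TN+FP), and FDR = FP/(FP+TP). The sketch below shows why a very high specificity and a sizeable FDR can coexist when true negatives vastly outnumber true variants. The counts are hypothetical and chosen only to give numbers of similar magnitude; they are not the study's data.

```python
def calling_metrics(tp, fp, tn, fn):
    """Sensitivity, specificity and false discovery rate from raw counts."""
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "fdr": fp / (fp + tp),
    }


# Hypothetical counts: 5 true indels all found, 2 false calls,
# and 20,000 positions correctly called as reference.
metrics = calling_metrics(tp=5, fp=2, tn=20000, fn=0)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

Because specificity is normalized by the large number of non-variant positions, a handful of false positives barely moves it, whereas FDR is normalized by the small number of calls and therefore changes sharply.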

Conclusions

New versions of TS, BWA and GATK have shown improvements in indel calling sensitivity and specificity over their older counterparts. However, the variant caller of TS exhibits a lower sensitivity than GATK and SAMtools. Our findings demonstrate that although indel calling from PGM sequences may appear to be noisy at first glance, proper computational indel calling analysis is able to maximize both the sensitivity and specificity at the single base level, paving the way for the usage of this technology for future clinical genetic testing.

Electronic supplementary material

The online version of this article (doi:10.1186/1471-2164-15-516) contains supplementary material, which is available to authorized users.

17.
18.

Background

With the advance of next generation sequencing (NGS) technologies, a large number of insertion and deletion (indel) variants have been identified in human populations. Despite much research into variant calling, it has been found that a non-negligible proportion of the identified indel variants might be false positives due to sequencing errors, artifacts caused by ambiguous alignments, and annotation errors.

Results

In this paper, we examine indel redundancy in dbSNP, one of the central databases for indel variants, and develop a standalone computational pipeline, dubbed Vindel, to detect redundant indels. The pipeline first applies indel position information to form candidate redundant groups, then performs indel mutations to the reference genome to generate corresponding indel variant substrings. Finally the indel variant substrings in the same candidate redundant groups are compared in a pairwise fashion to identify redundant indels. We applied our pipeline to check for redundancy in the human indels in dbSNP. Our pipeline identified approximately 8% redundancy in insertion type indels, 12% in deletion type indels, and overall 10% for insertions and deletions combined. These numbers are largely consistent across all human autosomes. We also investigated indel size distribution and adjacent indel distance distribution for a better understanding of the mechanisms generating indel variants.
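The core of the redundancy check described above is that two differently annotated indels are redundant if applying each of them to the reference yields the same sequence. The sketch below is a simplified stand-in, not Vindel itself: it applies each indel to a short toy reference and groups indels by the resulting string, skipping the position-based candidate grouping and the local substrings the pipeline uses. The tuple format and function names are hypothetical.

```python
def apply_indel(reference, indel):
    """Return the reference after applying one indel.

    indel is a hypothetical tuple (pos, kind, payload):
      ("ins": payload is the inserted string, placed before index pos;
       "del": payload is the number of bases removed starting at pos).
    """
    pos, kind, payload = indel
    if kind == "ins":
        return reference[:pos] + payload + reference[pos:]
    if kind == "del":
        return reference[:pos] + reference[pos + payload:]
    raise ValueError(f"unknown indel type: {kind}")


def redundant_groups(reference, indels):
    """Group indels that produce identical mutated sequences."""
    groups = {}
    for indel in indels:
        groups.setdefault(apply_indel(reference, indel), []).append(indel)
    return [group for group in groups.values() if len(group) > 1]


# Deleting any single T from the TTTT run gives the same variant sequence.
reference = "GATTTTC"
indels = [(2, "del", 1), (4, "del", 1), (1, "ins", "A"), (6, "del", 1)]
print(redundant_groups(reference, indels))
```

Grouping by the mutated string avoids explicit pairwise comparison on this toy input; on a real genome one would compare only local windows around each indel, as the abstract describes, rather than whole chromosomes.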

Conclusions

Vindel, a simple yet effective computational pipeline, can be used to check whether a set of indels is redundant with respect to those already in the database of interest, such as NCBI’s dbSNP. Of the approximately 5.9 million indels we examined, nearly 0.6 million are redundant, revealing a serious limitation in the current indel annotation. Statistical results demonstrate the consistency of the pipeline's indel redundancy detection across all 22 autosomes. Apart from the standalone Vindel pipeline, the indel redundancy check algorithm is also implemented in the web server http://bioinformatics.cs.vt.edu/zhanglab/indelRedundant.php.

Electronic supplementary material

The online version of this article (doi:10.1186/s12859-014-0359-1) contains supplementary material, which is available to authorized users.

19.
To gauge the processes that might direct the length of introns, I studied the balance of indels (insertions or deletions, determined using Alu and LINE1 retroposon repeats) and the density of these repeats in the introns of the human genome. The indel balance is biased in favour of deletions and correlated with the divergence of repeats. At fixed repeat divergence, the indel bias correlated with the intron size: the shorter the intron, the more deletions were favoured over insertions. This correlation with the intron size was stronger than with the gene-wide or isochore-wide parameters. The density of repeats (the number of repeats in a unit of intron length) correlated positively with the intron size. Thus, quite different mechanisms, the indel bias and the integration and/or persistence of retroposons, act in the same direction in regards to intron size, which suggests selection for the size of individual introns.

20.
Insertions and deletions (indels) are important types of structural variations. Obtaining accurate genotypes of indels may facilitate further genetic study. There are a few existing methods for calling indel genotypes from sequence reads. However, none of these tools can accurately call indel genotypes for indels of all lengths, especially for low coverage sequence data. In this paper, we present GINDEL, an approach for calling genotypes of both insertions and deletions from sequence reads. GINDEL uses a machine learning approach which combines multiple features extracted from next generation sequencing data. We test our approach on both simulated and real data and compare with existing tools, including Genome STRiP, Pindel and Clever-sv. Results show that GINDEL works well for deletions larger than 50 bp on both high and low coverage data. Also, GINDEL performs well for insertion genotyping on both simulated and real data. For comparison, Genome STRiP performs less well for shorter deletions (50–200 bp) on both simulated and real sequence data from the 1000 Genomes Project. Clever-sv performs well for intermediate deletions (200–1500 bp) but is less accurate when coverage is low. Pindel only works well for high coverage data, but does not perform well at low coverage. To summarize, we show that GINDEL not only can call genotypes of insertions and deletions (both short and long) for high and low coverage population sequence data, but also is more accurate and efficient than other approaches. The program GINDEL can be downloaded at: http://sourceforge.net/p/gindel
