Similar articles
 20 similar articles found (search time: 15 ms)
1.
Marker gene amplicon sequencing is often preferred over whole genome sequencing for microbial community characterization, due to its lower cost while still enabling assessment of uncultivable organisms. This technique involves many experimental steps, each of which can be a source of errors and bias. We present an up-to-date overview of the whole experimental pipeline, from sampling to sequencing reads, and give information allowing for informed choices at each step of both planning and execution of a microbial community assessment study. When applicable, we also suggest ways of avoiding inherent pitfalls in amplicon sequencing.

2.
3.

Background

The popularity of new sequencing technologies has led to an explosion of possible applications, including new approaches in biodiversity studies. However, each of these sequencing technologies suffers from sequencing errors originating from different factors. For 16S rRNA metagenomics studies, the 454 pyrosequencing technology is one of the most frequently used platforms, but sequencing errors still lead to important data analysis issues (e.g. in clustering into taxonomic units and in biodiversity estimation). Moreover, retaining a higher portion of the sequencing data, by preserving as much of the read length as possible while keeping the error rate within an acceptable range, will have important consequences for taxonomic precision.

Results

The new error correction algorithm proposed in this work, NoDe (Noise Detector), is trained to identify those positions in 454 sequencing reads that are likely to contain an error, and subsequently clusters those error-prone reads with correct reads, resulting in an error-free representative read. A benchmarking study with other denoising algorithms shows that NoDe can detect up to 75% more errors in a large-scale mock community dataset, at a low computational cost compared to the second-best algorithm considered in this study. The positive effect of NoDe in 16S rRNA studies was confirmed by its beneficial effect on the precision of the clustering of pyrosequencing reads into operational taxonomic units.

Conclusions

NoDe was shown to be a computationally efficient denoising algorithm for pyrosequencing reads, producing the lowest error rates in an extensive benchmarking study with other denoising algorithms.

Electronic supplementary material

The online version of this article (doi:10.1186/s12859-015-0520-5) contains supplementary material, which is available to authorized users.

4.
5.

Background  

Artificial duplicates from pyrosequencing reads may lead to incorrect interpretation of the abundance of species and genes in metagenomic studies. Duplicated reads were filtered out in many metagenomic projects. However, since the duplicated reads observed in a pyrosequencing run also include natural (non-artificial) duplicates, simply removing all duplicates may also cause underestimation of abundance associated with natural duplicates.

6.
7.
The emergence of massively parallel sequencing technology has revolutionized microbial profiling, allowing the unprecedented comparison of microbial diversity across time and space in a wide range of host-associated and environmental ecosystems. Although the high-throughput nature of such methods enables the detection of low-frequency bacteria, these advances come at the cost of sequencing read length, limiting the phylogenetic resolution possible by current methods. Here, we present a generic approach for integrating short reads from large genomic regions, thus enabling phylogenetic resolution far exceeding current methods. The approach is based on a mapping to a statistical model that is later solved as a constrained optimization problem. We demonstrate the utility of this method by analyzing human saliva and Drosophila samples, using Illumina single-end sequencing of a 750 bp amplicon of the 16S rRNA gene. Phylogenetic resolution is significantly extended while reducing the number of falsely detected bacteria, as compared with standard single-region Roche 454 Pyrosequencing. Our approach can be seamlessly applied to simultaneous sequencing of multiple genes providing a higher resolution view of the composition and activity of complex microbial communities.

8.
Analysis of microbial communities by high-throughput pyrosequencing of SSU rRNA gene PCR amplicons has transformed microbial ecology research and led to the observation that many communities contain a diverse assortment of rare taxa, a phenomenon termed the Rare Biosphere. Multiple studies have investigated the effect of pyrosequencing read quality on operational taxonomic unit (OTU) richness for contrived communities, yet there is limited information on the fidelity of community structure estimates obtained through this approach. Given that PCR biases are widely recognized, and further unknown biases may arise from the sequencing process itself, a priori assumptions about the neutrality of the data generation process are at best unvalidated. Furthermore, post-sequencing quality control algorithms have not been explicitly evaluated for the accuracy of recovered representative sequences and its impact on downstream analyses, reducing useful discussion on pyrosequencing reads to their diversity and abundances. Here we report on community structures and sequences recovered for in vitro-simulated communities consisting of twenty 16S rRNA gene clones tiered at known proportions. PCR amplicon libraries of the V3-V4 and V6 hypervariable regions from the in vitro-simulated communities were sequenced using the Roche 454 GS FLX Titanium platform. Commonly used quality control protocols resulted in the formation of OTUs with >1% abundance composed entirely of erroneous sequences, while over-aggressive clustering approaches obfuscated real, expected OTUs. The pyrosequencing process itself did not appear to impose significant biases on overall community structure estimates, although the detection limit for rare taxa may be affected by PCR amplicon size and the quality control approach employed. Meanwhile, PCR biases associated with the initial amplicon generation may impose greater distortions in the observed community structure.

9.
Two sequencing batch reactors (SBR) were constructed and filled with different inocula of activated sludge (AS) and mature fine tailings (MFT) to treat oil sands process-affected water (OSPW). The COD was reduced by 82% in the AS-SBR and 43% in the MFT-SBR during phase I using 10% OSPW and 90% synthetic wastewater as reactor feed. However, COD removal reached 12% and 20% in the AS-SBR and the MFT-SBR, respectively, when 100% raw OSPW was fed into the reactors. Maximum removal of acid-extractable organics (AEO) was 8.7% and 16.6% in the AS-SBR and the MFT-SBR, respectively, with a hydraulic retention time of one day. Pyrosequencing analysis revealed that Proteobacteria was the dominant phylum and that beta- and gamma-Proteobacteria were the dominant classes in both reactors. Evidence of a microbial community change was observed when influent raw OSPW was switched from 50 to 100%. More significant changes were detected in the AS-SBR community.

10.
Structural variations (SVs) play a crucial role in genetic diversity. However, the alignments of reads near/across SVs are made inaccurate by the presence of polymorphisms. BatAlign is an algorithm that integrates two strategies called ‘Reverse-Alignment’ and ‘Deep-Scan’ to improve the accuracy of read alignment. In our experiments, BatAlign obtained the highest F-measures in read alignments on mismatch-aberrant, indel-aberrant, concordantly/discordantly paired and SV-spanning data sets. On real data, the alignments of BatAlign recovered 4.3% more PCR-validated SVs with 73.3% fewer calls. These results suggest that BatAlign is effective in accurately detecting SVs and other polymorphic variants using high-throughput data. BatAlign is publicly available at https://goo.gl/a6phxB.

11.
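Technical evaluation of high-throughput sequencing and DGGE for analyzing soil microbial communities   Total citations: 28, self-citations: 7, citations by others: 28
Xia Weiwei (夏围围), Jia Zhongjun (贾仲君). Acta Microbiologica Sinica (《微生物学报》), 2014, 54(12): 1489-1499
Objective: To compare next-generation high-throughput sequencing with traditional denaturing gradient gel electrophoresis (DGGE) fingerprinting, and to evaluate the strengths and weaknesses of the two techniques for studying soil microbial community structure. Methods: For typical grassland and forest soils from New Zealand, the 16S rRNA gene was targeted, and high-throughput sequencing and DGGE were used to analyze the composition, abundance and diversity of the soil microbial communities and to compare the suitability of the two methods for soil microbial research. Results: Across taxonomic ranks, high-throughput sequencing detected 22 phyla, 54 classes, 60 orders, 131 families and 350 genera in grassland soil, whereas DGGE detected only 6 phyla, 9 classes, 8 orders, 10 families and 10 genera, indicating that DGGE markedly underestimated soil microbial community composition. A similar pattern was found for forest soil, where the detection sensitivity of high-throughput sequencing was 3.8, 6.7, 6.4, 19.2 and 39.4 times that of DGGE. Further analysis of the relative abundance of the major microbial groups showed that the lower the taxonomic rank, the greater the difference between the two methods, with the largest differences at the family and genus levels. Taking the high-throughput sequencing results as the reference, DGGE clearly overestimated the relative abundance of most soil microbial groups, by up to 2000-fold. Both methods indicated that the diversity index of grassland soil was higher than that of forest soil, but the absolute values of the DGGE diversity indices were far lower than those from high-throughput sequencing. Conclusion: High-throughput sequencing reflects soil microbial community structure relatively comprehensively and accurately, whereas DGGE reflects only a limited set of dominant microbial groups and is very likely to substantially underestimate the species composition of soil microbes and overestimate their abundance.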
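The fold differences implied by the grassland counts in the abstract above can be checked with a few lines of arithmetic. The following is a minimal sketch, not the authors' analysis; the variable and function names are illustrative assumptions, and only the counts come from the abstract.

```python
# Taxon counts for grassland soil as reported in the abstract above.
RANKS = ["phylum", "class", "order", "family", "genus"]
HTS_GRASSLAND = [22, 54, 60, 131, 350]   # high-throughput sequencing
DGGE_GRASSLAND = [6, 9, 8, 10, 10]       # DGGE

def fold_differences(hts, dgge):
    """Return the ratio of taxa detected by sequencing to taxa detected by DGGE, per rank."""
    return [h / d for h, d in zip(hts, dgge)]

for rank, fold in zip(RANKS, fold_differences(HTS_GRASSLAND, DGGE_GRASSLAND)):
    print(f"{rank}: {fold:.1f}x")
# prints:
# phylum: 3.7x
# class: 6.0x
# order: 7.5x
# family: 13.1x
# genus: 35.0x
```

For comparison, the abstract reports ratios of 3.8, 6.7, 6.4, 19.2 and 39.4 for forest soil; the underlying forest counts are not given there, so they are not reproduced here.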

12.
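Clubroot is an obligately parasitic soil-borne disease caused by infection with Plasmodiophora brassicae, and it severely constrains the sustainable production of rapeseed and other cruciferous crops. Earlier studies found that soybean as a preceding crop can markedly reduce the incidence and severity of clubroot in the following rapeseed crop, making "soybean-rapeseed rotation" a new approach to clubroot control that is worth exploring and applying. To uncover the mechanism by which soybean as a preceding crop controls clubroot, this study used amplicon sequencing to examine differences in rhizosphere soil microbial community structure between soybean and rapeseed. The results showed that the dominant groups at the phylum level were the same in soybean and rapeseed rhizosphere soils, with Proteobacteria, Bacteroidetes, Acidobacteria, Actinobacteria, Ascomycota, Zygomycota, Basidiomycota and Chytridiomycota all at high abundance. However, compared with rapeseed rhizosphere soil, soybean rhizosphere soil was enriched in microbes with biocontrol and plant-growth-promoting activity, such as Flavobacterium, Sphingomonas, Bacillus, Streptomyces, Pseudomonas, Trichoderma and Coniothyrium, while some plant-pathogenic bacteria (e.g. Enterobacter and Xanthomonas) and fungi (Colletotrichum and Cercospora) were less abundant than in rapeseed rhizosphere soil. In addition, soybean rhizosphere soil was rich in the nitrogen-fixing genera Rhizobium and Bradyrhizobium, as well as in arbuscular mycorrhizal fungi (e.g. Glomus). Soybean rhizosphere soil therefore favors the growth of beneficial microbes and can suppress the proliferation of pathogens. The differences between the soybean and rapeseed rhizosphere microbiomes provide a theoretical basis for controlling clubroot through soybean-rapeseed rotation and offer some potential biocontrol resources for clubroot management.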

13.
Initial environmental pyrosequencing studies suggested highly complex protistan communities with phylotype richness decisively higher than previously estimated. However, recent studies on individual bacteria or artificial bacterial communities showed that pyrosequencing errors may skew our view of the true complexity of microbial communities. We pyrosequenced two diversity markers (hypervariable regions V4 and V9 of the small-subunit rDNA) of an intertidal protistan model community, using the Roche GS-FLX and the most recent GS-FLX Titanium sequencing systems. After pyrosequencing 24 reference sequences we obtained up to 2039 unique tags (from 3879 V4 GS-FLX Titanium reads), 77% of which were singletons. Even binning sequences that share 97% similarity still emulated a pseudodiversity exceeding the true complexity of the model community up to three times (V9 GS-FLX). Pyrosequencing error rates were higher for V4 fragments compared with the V9 domain and for the GS-FLX Titanium compared with the GS-FLX system. Furthermore, this experiment revealed that error rates are taxon-specific. As an outcome of this study we suggest a fast and efficient strategy to discriminate pyrosequencing signals from noise in order to more realistically depict the structure of protistan communities using simple tools that are implemented in standard tag data-processing pipelines.
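The "unique tags" and "singletons" figures in the abstract above come from collapsing identical reads and counting how many distinct sequences occur only once. The sketch below illustrates that bookkeeping under the assumption of exact-match dereplication; the function name and the toy reads are hypothetical and are not data or code from the study.

```python
from collections import Counter

def tag_summary(reads):
    """Return (number of unique tags, fraction of unique tags observed exactly once)."""
    counts = Counter(reads)                 # collapse identical reads into tags
    unique = len(counts)
    singletons = sum(1 for n in counts.values() if n == 1)
    return unique, (singletons / unique if unique else 0.0)

# Hypothetical toy reads: one true sequence plus two single-base error variants.
toy_reads = ["ACGT", "ACGT", "ACGA", "TCGT", "ACGT"]
unique_tags, singleton_fraction = tag_summary(toy_reads)
print(unique_tags, round(singleton_fraction, 2))  # prints: 3 0.67
```

In this toy case a single underlying sequence yields three unique tags, two of them singletons, which is the kind of inflation the abstract describes.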

14.
Microarrays for bacterial detection and microbial community analysis   Total citations: 27, self-citations: 0, citations by others: 27
Several types of microarrays have recently been developed and evaluated for bacterial detection and microbial community analysis. These studies demonstrated that specific, sensitive and quantitative detection could be obtained with both functional gene arrays and community genome arrays. Although single-base mismatches can be differentiated with phylogenetic oligonucleotide arrays, reliable specific detection at the single-base level is still problematic. Microarray-based hybridization approaches are also useful for defining genome diversity and bacterial relatedness. However, more rigorous and systematic assessment and development are needed to realize the full potential of microarrays for microbial detection and community analysis.

15.
SUMMARY: Metagenomic studies use high-throughput sequence data to investigate microbial communities in situ. However, considerable challenges remain in the analysis of these data, particularly with regard to speed and reliable analysis of microbial species as opposed to higher level taxa such as phyla. We here present Genometa, a computationally undemanding graphical user interface program that enables identification of bacterial species and gene content from datasets generated by inexpensive high-throughput short read sequencing technologies. Our approach was first verified on two simulated metagenomic short read datasets, detecting 100% and 94% of the bacterial species included with few false positives or false negatives. Subsequent comparative benchmarking analysis against three popular metagenomic algorithms on an Illumina human gut dataset revealed Genometa to attribute the most reads to bacteria at species level (i.e. including all strains of that species) and demonstrate similar or better accuracy than the other programs. Lastly, speed was demonstrated to be many times that of BLAST due to the use of modern short read aligners. Our method is highly accurate if bacteria in the sample are represented by genomes in the reference sequence but cannot find species absent from the reference. This method is one of the most user-friendly and resource efficient approaches and is thus feasible for rapidly analysing millions of short reads on a personal computer. AVAILABILITY: The Genometa program, a step by step tutorial and Java source code are freely available from http://genomics1.mh-hannover.de/genometa/ and on http://code.google.com/p/genometa/. This program has been tested on Ubuntu Linux and Windows XP/7.

16.

Background  

Plastid genome sequence information is vital to several disciplines in plant biology, including phylogenetics and molecular biology. The past five years have witnessed a dramatic increase in the number of completely sequenced plastid genomes, fuelled largely by advances in conventional Sanger sequencing technology. Here we report a further significant reduction in time and cost for plastid genome sequencing through the successful use of a newly available pyrosequencing platform, the Genome Sequencer 20 (GS 20) System (454 Life Sciences Corporation), to rapidly and accurately sequence the whole plastid genomes of the basal eudicot angiosperms Nandina domestica (Berberidaceae) and Platanus occidentalis (Platanaceae).

17.
Current ecological knowledge of methanotrophic biofilms is incomplete, although they have been broadly studied in biotechnological processes. Four individual DNA samples were prepared from a methanotrophic biofilm, and multiplex 16S rDNA pyrosequencing was performed. The complete library (before being de-multiplexed) contained 33,639 sequences (average length, 415 nt). Interestingly, methanotrophs were not dominant, making up only 23% of the community. Methylosinus, Methylomonas, and Methylosarcina were the dominant methanotrophs. Type II methanotrophs were more abundant than type I (56 vs. 44%), but less rich and diverse. Dominant non-methanotrophic genera included Hydrogenophaga, Flavobacterium, and Hyphomicrobium. The library was de-multiplexed into four libraries with different sequencing efforts (3,915-20,133 sequences). Sørensen abundance similarity results showed that the four libraries were almost identical (indices > 0.97), and phylogenetic comparisons using the UniFrac test and P-test gave the same result. This demonstrated that the pyrosequencing was highly reproducible. These survey results can provide insight into the management and/or manipulation of methanotrophic biofilms.
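The abundance-based Sørensen similarity mentioned in the abstract above is commonly computed as twice the shared abundance divided by the total abundance of both samples. The sketch below is one such formulation, not necessarily the exact estimator the authors used. Converting counts to relative abundances first is an assumption made here so that libraries of different sequencing depth are comparable, and the taxon counts are hypothetical.

```python
def sorensen_abundance(a, b):
    """Abundance-based Sørensen similarity between two {taxon: count} libraries.

    Assumption: counts are converted to relative abundances before comparison.
    """
    total_a, total_b = sum(a.values()), sum(b.values())
    if total_a == 0 or total_b == 0:
        raise ValueError("both libraries must contain at least one sequence")
    rel_a = {t: n / total_a for t, n in a.items()}
    rel_b = {t: n / total_b for t, n in b.items()}
    # Shared abundance: the smaller of the two relative abundances for each taxon.
    shared = sum(min(rel_a.get(t, 0.0), rel_b.get(t, 0.0)) for t in set(rel_a) | set(rel_b))
    return 2 * shared / (sum(rel_a.values()) + sum(rel_b.values()))

# Hypothetical counts for two libraries of different depth (not data from the study).
lib_a = {"Methylosinus": 50, "Methylomonas": 30, "Hydrogenophaga": 20}
lib_b = {"Methylosinus": 100, "Methylomonas": 70, "Flavobacterium": 30}
print(round(sorensen_abundance(lib_a, lib_b), 2))  # prints: 0.8
```

An index of 1 means the two libraries have identical relative abundances, so values above 0.97, as reported in the abstract, indicate near-identical community profiles.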

18.
19.
20.