Similar literature
1.
Fan L  McElroy K  Thomas T 《PloS one》2012,7(6):e39948
Direct sequencing of environmental DNA (metagenomics) has great potential for describing the 16S rRNA gene diversity of microbial communities. However, current approaches that use this 16S rRNA gene information to describe community diversity suffer from low taxonomic resolution or chimera problems. Here we describe a new strategy that involves stringent assembly and data filtering to reconstruct full-length 16S rRNA genes from metagenomic pyrosequencing data. Simulations showed that reconstructed 16S rRNA genes provided a true picture of the community diversity, had minimal rates of chimera formation and gave taxonomic resolution down to genus level. The strategy was furthermore compared to PCR-based methods to determine the microbial diversity in two marine sponges. This showed that about 30% of the abundant phylotypes reconstructed from metagenomic data failed to be amplified by PCR. Our approach is readily applicable to existing metagenomic datasets and is expected to lead to the discovery of new microbial phylotypes.

2.
Assembling individual genomes from complex community metagenomic data remains a challenging issue for environmental studies. We evaluated the quality of genome assemblies from community short read data (Illumina 100 bp paired-end sequences) using datasets recovered from freshwater and soil microbial communities as well as in silico simulations. Our analyses revealed that the genome of a single genotype (or species) can be accurately assembled from a complex metagenome when it shows at least about 20× coverage. At lower coverage, however, the derived assemblies contained a substantial fraction of non-target sequences (chimeras), which explains, at least in part, the higher number of hypothetical genes recovered in metagenomic relative to genomic projects. We also provide examples of how to detect intrapopulation structure in metagenomic datasets and estimate the type and frequency of errors in assembled genes and contigs from datasets of varied species complexity.
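The roughly 20× threshold above can be related to sequencing effort with a back-of-the-envelope calculation. The sketch below is illustrative only and is not taken from the paper: the function names are hypothetical, coverage is taken as sequenced bases divided by genome size, and `relative_abundance` is assumed to be the fraction of reads (not cells) that originate from the target genome.

```python
import math


def mean_coverage(num_reads: int, read_length: int, genome_size: int) -> float:
    """Expected mean depth of coverage: sequenced bases divided by genome size."""
    return num_reads * read_length / genome_size


def reads_needed(target_coverage: float, read_length: int,
                 genome_size: int, relative_abundance: float) -> int:
    """Total community reads needed for one genome to reach the target coverage.

    relative_abundance is the assumed fraction of all reads that come
    from the target genome (a hypothetical input, not a measured value).
    """
    reads_for_genome = target_coverage * genome_size / read_length
    return math.ceil(reads_for_genome / relative_abundance)


# One million 100 bp reads from a 5 Mbp genome give 20x coverage.
print(mean_coverage(1_000_000, 100, 5_000_000))
# If that genome contributes a quarter of all reads, the whole community
# needs four times as many reads.
print(reads_needed(20, 100, 5_000_000, 0.25))
```

Under these assumptions, a genome that contributes only a small fraction of the reads needs proportionally more total sequencing to clear the threshold.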

3.
Phylogenetic diversity (patterns of phylogenetic relatedness among organisms in ecological communities) provides important insights into the mechanisms underlying community assembly. Studies that measure phylogenetic diversity in microbial communities have primarily been limited to a single marker gene approach, using the small subunit of the rRNA gene (SSU-rRNA) to quantify phylogenetic relationships among microbial taxa. In this study, we present an approach for inferring phylogenetic relationships among microorganisms based on the random metagenomic sequencing of DNA fragments. To overcome challenges caused by the fragmentary nature of metagenomic data, we leveraged fully sequenced bacterial genomes as a scaffold to enable inference of phylogenetic relationships among metagenomic sequences from multiple phylogenetic marker gene families. The resulting metagenomic phylogeny can be used to quantify the phylogenetic diversity of microbial communities based on metagenomic data sets. We applied this method to understand patterns of microbial phylogenetic diversity and community assembly along an oceanic depth gradient, and compared our findings to previous studies of this gradient using SSU-rRNA gene and metagenomic analyses. Bacterial phylogenetic diversity was highest at intermediate depths beneath the ocean surface, whereas taxonomic diversity (diversity measured by binning sequences into taxonomically similar groups) showed no relationship with depth. Phylogenetic diversity estimates based on the SSU-rRNA gene and the multi-gene metagenomic phylogeny were broadly concordant, suggesting that our approach will be applicable to other metagenomic data sets for which corresponding SSU-rRNA gene sequences are unavailable. Our approach opens up the possibility of using metagenomic data to study microbial diversity in a phylogenetic context.

4.
The composition and gene content of a biogas-producing microbial community from a production-scale biogas plant fed with renewable primary products were analysed by means of a metagenomic approach applying the ultrafast 454-pyrosequencing technology. Sequencing of isolated total community DNA on a Genome Sequencer FLX System resulted in 616,072 reads with an average read length of 230 bases, accounting for 141,664,289 bases of sequence information. Assignment of obtained single reads to COG (Clusters of Orthologous Groups of proteins) categories revealed a genetic profile characteristic of an anaerobic microbial consortium conducting fermentative metabolic pathways. Assembly of single reads resulted in the formation of 8752 contigs larger than 500 bases in size. Contigs longer than 10 kb mainly encode house-keeping proteins, e.g. DNA polymerase, recombinase, DNA ligase, sigma factor RpoD and genes involved in sugar and amino acid metabolism. A significant portion of contigs was allocated to the genome sequence of the archaeal methanogen Methanoculleus marisnigri JR1. Mapping of single reads to the M. marisnigri JR1 genome revealed that approximately 64% of the reference genome including methanogenesis gene regions are deeply covered. These results suggest that species related to those of the genus Methanoculleus play a dominant role in methanogenesis in the analysed fermentation sample. Moreover, assignment of numerous contig sequences to clostridial genomes including gene regions for cellulolytic functions indicates that clostridia are important for hydrolysis of cellulosic plant biomass in the biogas fermenter under study. Metagenome sequence data from a biogas-producing microbial community residing in a fermenter of a biogas plant provide the basis for a rational approach to improve the biotechnological process of biogas production.

5.

Background

Microbial life dominates the earth, but many species are difficult or even impossible to study under laboratory conditions. Sequencing DNA directly from the environment, a technique commonly referred to as metagenomics, is an important tool for cataloging microbial life. This culture-independent approach involves collecting samples that include microbes in them, extracting DNA from the samples, and sequencing the DNA. A sample may contain many different microorganisms, macroorganisms, and even free-floating environmental DNA. A fundamental challenge in metagenomics has been estimating the abundance of organisms in a sample based on the frequency with which the organism's DNA was observed in reads generated via DNA sequencing.

Methodology/Principal Findings

We created mixtures of ten microbial species for which genome sequences are known. Each mixture contained an equal number of cells of each species. We then extracted DNA from the mixtures, sequenced the DNA, and measured the frequency with which genomic regions from each organism were observed in the sequenced DNA. We found that the observed frequency of reads mapping to each organism did not reflect the equal numbers of cells that were known to be included in each mixture. The relative organism abundances varied significantly depending on the DNA extraction and sequencing protocol utilized.

Conclusions/Significance

We describe a new data resource for measuring the accuracy of metagenomic binning methods, created by in vitro simulation of a metagenomic community. Our in vitro simulation can be used to complement previous in silico benchmark studies. In constructing a synthetic community and sequencing its metagenome, we encountered several sources of observation bias that likely affect most metagenomic experiments to date and present challenges for comparative metagenomic studies. DNA preparation methods have a particularly profound effect in our study, implying that samples prepared with different protocols are not suitable for comparative metagenomics.
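A comparison of observed read frequencies against the known equal cell counts, as described in this entry, could be set up along the following lines. This is a minimal sketch under stated assumptions, not the authors' analysis: the function names and the example counts are hypothetical, and reads are normalized by genome size on the assumption that larger genomes yield proportionally more reads per cell.

```python
def relative_abundance(read_counts: dict, genome_sizes: dict) -> dict:
    """Genome-size-normalized relative abundance for each organism."""
    per_base = {name: read_counts[name] / genome_sizes[name]
                for name in read_counts}
    total = sum(per_base.values())
    return {name: value / total for name, value in per_base.items()}


def fold_deviation(observed: dict, expected_fraction: float) -> dict:
    """Observed abundance divided by the abundance expected for an even mix."""
    return {name: frac / expected_fraction for name, frac in observed.items()}


# Hypothetical two-organism mock community mixed at equal cell numbers.
counts = {"org_a": 300, "org_b": 100}
sizes = {"org_a": 2_000_000, "org_b": 1_000_000}
observed = relative_abundance(counts, sizes)
print(fold_deviation(observed, expected_fraction=0.5))
```

A fold deviation far from 1 for any organism indicates that read frequency is not tracking cell number for that extraction and sequencing protocol.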

6.
Culture-independent diagnostics reduce the reliance on traditional (and slower) culture-based methodologies. Here we capitalize on advances in next-generation sequencing (NGS) to apply this approach to food pathogen detection utilizing NGS as an analytical tool. In this study, spiking spinach with Shiga toxin-producing Escherichia coli (STEC) following an established FDA culture-based protocol was used in conjunction with shotgun metagenomic sequencing to determine the limits of detection, sensitivity, and specificity levels and to obtain information on the microbiology of the protocol. We show that an expected level of contamination (∼10 CFU/100 g) could be adequately detected (including key virulence determinants and strain-level specificity) within 8 h of enrichment at a sequencing depth of 10,000,000 reads. We also rationalize the relative benefit of static versus shaking culture conditions and the addition of selected antimicrobial agents, thereby validating the long-standing culture-based parameters behind such protocols. Moreover, the shotgun metagenomic approach was informative regarding the dynamics of microbial communities during the enrichment process, including initial surveys of the microbial loads associated with bagged spinach; the microbes found included key genera such as Pseudomonas, Pantoea, and Exiguobacterium. Collectively, our metagenomic study highlights and considers various parameters required for transitioning to such sequencing-based diagnostics for food safety and the potential to develop better enrichment processes in a high-throughput manner not previously possible. Future studies will investigate new species-specific DNA signature target regimens, rational design of medium components in concert with judicious use of additives, such as antibiotics, and alterations in the sample processing protocol to enhance detection.

7.
To gain a predictive understanding of the interspecies interactions within microbial communities that govern community function, the genomic complement of every member population must be determined. Although metagenomic sequencing has enabled the de novo reconstruction of some microbial genomes from environmental communities, microdiversity confounds current genome reconstruction techniques. To overcome this issue, we performed short-read metagenomic sequencing on parallel consortia, defined as consortia cultivated under the same conditions from the same natural community with overlapping species composition. The differences in species abundance between the two consortia allowed reconstruction of near-complete (at an estimated >85% of gene complement) genome sequences for 17 of the 20 detected member species. Two Halomonas spp. indistinguishable by amplicon analysis were found to be present within the community. In addition, comparison of metagenomic reads against the consensus scaffolds revealed within-species variation for one of the Halomonas populations, one of the Rhodobacteraceae populations, and the Rhizobiales population. Genomic comparison of these representative instances of inter- and intraspecies microdiversity suggests differences in functional potential that may result in the expression of distinct roles in the community. In addition, isolation and complete genome sequence determination of six member species allowed an investigation into the sensitivity and specificity of genome reconstruction processes, demonstrating robustness across a wide range of sequence coverage (9× to 2,700×) within the metagenomic data set.  相似文献   

8.
陈嘉焕  孙政  王晓君  苏晓泉  宁康 《遗传》2015,37(7):645-654
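Microbial communities are found throughout the human body; they live in symbiosis with their host and have an important and profound influence on human health. The combined genomes of all microbes living in symbiosis with humans are called the "metagenome" or the "second human genome". Studying human microbial communities and the associated metagenomic data is of great value for basic research and clinical applications in translational medicine. Analysis of biomedically relevant high-throughput metagenomic data not only offers new ideas and methods for translating basic medical research into clinical applications but also has broad application prospects. Drawing on data from next-generation sequencing, metagenomic analysis techniques and methods can overcome the shortcomings of earlier culture-then-identify approaches to human microbes, and can effectively identify and analyze the composition and function of microbial communities. This allows the relationship between microbial communities and the physiological state of the host to be explored and revealed further, providing a new angle of approach and a new way of thinking for many difficult problems in medicine. This article systematically introduces the current state of metagenomic research, including its methods, concepts, and research progress. Focusing on applications of metagenomics in medical research, it reviews progress in metagenomics for translational medicine and further explains the important position of metagenomic research in translational medicine applications.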

9.
Xia LC  Cram JA  Chen T  Fuhrman JA  Sun F 《PloS one》2011,6(12):e27992
Accurate estimation of microbial community composition based on metagenomic sequencing data is fundamental for subsequent metagenomics analysis. Prevalent estimation methods are mainly based on directly summarizing alignment results or variants thereof, and often yield biased and/or unstable estimates. We have developed a unified probabilistic framework (named GRAMMy) by explicitly modeling read assignment ambiguities, genome size biases and read distributions along the genomes. A maximum likelihood method is employed to compute the Genome Relative Abundance of microbial communities using Mixture Model theory (GRAMMy). GRAMMy has been demonstrated to give estimates that are accurate and robust across both simulated and real read benchmark datasets. We applied GRAMMy to a collection of 34 metagenomic read sets from four metagenomics projects and identified 99 frequent species (minimally 0.5% abundant in at least 50% of the data sets) in the human gut samples. Our results show substantial improvements over previous studies, such as adjusting the over-estimated abundance for Bacteroides species for human gut samples, by providing a new reference-based strategy for metagenomic sample comparisons. GRAMMy can be used flexibly with many read assignment tools (mapping, alignment or composition-based) even with low-sensitivity mapping results from huge short-read datasets. It will be increasingly useful as an accurate and robust tool for abundance estimation with the growing size of read sets and the expanding database of reference genomes.
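The idea of a maximum likelihood estimate under a mixture model with ambiguous read assignments can be illustrated with a generic expectation-maximization (EM) loop. The sketch below is not GRAMMy's implementation; it is a textbook EM for mixing proportions, and the function names, the input layout (one list of per-genome likelihoods per read) and the final division by genome length are assumptions made for illustration.

```python
def em_mixing_proportions(read_likelihoods, n_iter=200, tol=1e-10):
    """Estimate the fraction of reads drawn from each genome by EM.

    read_likelihoods[r][g] is an assumed likelihood of read r given genome g
    (for example 1.0 where the read maps and 0.0 where it does not).
    """
    n_genomes = len(read_likelihoods[0])
    pi = [1.0 / n_genomes] * n_genomes
    for _ in range(n_iter):
        totals = [0.0] * n_genomes
        for probs in read_likelihoods:
            # E-step: posterior responsibility of each genome for this read.
            weighted = [pi[g] * probs[g] for g in range(n_genomes)]
            norm = sum(weighted)
            if norm == 0.0:
                continue  # read matches no reference genome; ignore it
            for g in range(n_genomes):
                totals[g] += weighted[g] / norm
        assigned = sum(totals)
        if assigned == 0.0:
            raise ValueError("no read could be assigned to any genome")
        # M-step: new proportions are the average responsibilities.
        new_pi = [t / assigned for t in totals]
        converged = max(abs(a - b) for a, b in zip(pi, new_pi)) < tol
        pi = new_pi
        if converged:
            break
    return pi


def genome_relative_abundance(pi, genome_lengths):
    """Convert read proportions to genome proportions by length normalization."""
    per_length = [p / length for p, length in zip(pi, genome_lengths)]
    total = sum(per_length)
    return [x / total for x in per_length]


# Six reads unique to genome 0, two unique to genome 1, two ambiguous.
reads = [[1.0, 0.0]] * 6 + [[0.0, 1.0]] * 2 + [[1.0, 1.0]] * 2
pi = em_mixing_proportions(reads)
print(pi)
print(genome_relative_abundance(pi, [3_000_000, 1_000_000]))
```

In this toy input the ambiguous reads are shared out in proportion to the current estimates rather than counted fully for both genomes, which is the kind of ambiguity handling the abstract contrasts with direct summarization of alignment results.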

10.
Advances in DNA extraction and next‐generation sequencing have made a vast number of historical herbarium specimens available for genomic investigation. These specimens contain not only genomic information from the individual plants themselves, but also from associated microorganisms such as bacteria and fungi. These microorganisms may have colonized the living plant (e.g., pathogens or host‐associated commensal taxa) or may result from postmortem colonization that may include decomposition processes or contamination during sample handling. Here we characterize the metagenomic profile from shotgun sequencing data from herbarium specimens of two widespread plant species (Ambrosia artemisiifolia and Arabidopsis thaliana) collected up to 180 years ago. We used BLAST searching in combination with MEGAN and were able to infer the metagenomic community even from the oldest herbarium sample. Through comparison with contemporary plant collections, we identify three microbial species that are nearly exclusive to herbarium specimens, including the fungus Alternaria alternata, which can comprise up to 7% of the total sequencing reads. This species probably colonizes the herbarium specimens during preparation for mounting or during storage. By removing the probable contaminating taxa, we observe a temporal shift in the metagenomic composition of the invasive weed Am. artemisiifolia. Our findings demonstrate that it is generally possible to use herbarium specimens for metagenomic analyses, but that the results should be treated with caution, as some of the identified species may be herbarium contaminants rather than representing the natural metagenomic community of the host plant.

11.
Accessing the soil metagenome for studies of microbial diversity
Soil microbial communities contain the highest level of prokaryotic diversity of any environment, and metagenomic approaches involving the extraction of DNA from soil can improve our access to these communities. Most analyses of soil biodiversity and function assume that the DNA extracted represents the microbial community in the soil, but subsequent interpretations are limited by the DNA recovered from the soil. Unfortunately, extraction methods do not provide a uniform and unbiased subsample of metagenomic DNA, and as a consequence, accurate species distributions cannot be determined. Moreover, any bias will propagate errors in estimations of overall microbial diversity and may exclude some microbial classes from study and exploitation. To improve metagenomic approaches, investigate DNA extraction biases, and provide tools for assessing the relative abundances of different groups, we explored the biodiversity of the accessible community DNA by fractioning the metagenomic DNA as a function of (i) vertical soil sampling, (ii) density gradients (cell separation), (iii) cell lysis stringency, and (iv) DNA fragment size distribution. Each fraction had a unique genetic diversity, with different predominant and rare species (based on ribosomal intergenic spacer analysis [RISA] fingerprinting and phylochips). All fractions contributed to the number of bacterial groups uncovered in the metagenome, thus increasing the DNA pool for further applications. Indeed, we were able to access a more genetically diverse proportion of the metagenome (a gain of more than 80% compared to the best single extraction method), limit the predominance of a few genomes, and increase the species richness per sequencing effort. This work stresses the difference between extracted DNA pools and the currently inaccessible complete soil metagenome.  相似文献   

12.
Several PCR methods have recently been developed to identify fecal contamination in surface waters. In all cases, researchers have relied on one gene or one microorganism for selection of host-specific markers. Here we describe the application of a genome fragment enrichment (GFE) method to identify host-specific genetic markers from fecal microbial community DNA. As a proof of concept, bovine fecal DNA was challenged against a porcine fecal DNA background to select for bovine-specific DNA sequences. Bioinformatic analyses of 380 bovine enriched metagenomic sequences indicated a preponderance of Bacteroidales-like regions predicted to encode membrane-associated and secreted proteins. Oligonucleotide primers capable of annealing to select Bacteroidales-like bovine GFE sequences exhibited extremely high specificity (>99%) in PCR assays with total fecal DNAs from 279 different animal sources. These primers also demonstrated a broad distribution of corresponding genetic markers (81% positive) among 148 different bovine sources. These data demonstrate that direct metagenomic DNA analysis by the competitive solution hybridization approach described is an efficient method for identifying potentially useful fecal genetic markers and for characterizing differences between environmental microbial communities.  相似文献   

13.
Molecular techniques previously used for genome comparisons of closely related bacterial species could prove extremely valuable for comparisons of complex microbial communities, or metagenomes. Our study aimed to determine the breadth and value of suppressive subtractive hybridization (SSH) in a pilot-scale analysis of metagenomic DNA from communities of microorganisms in the rumen. Suppressive subtractive hybridization was performed using total genomic DNA isolated from rumen fluid samples of two hay-fed steers, arbitrarily designated as tester or driver. Ninety-six subtraction DNA fragments from the tester metagenome were amplified and cloned, and their DNA sequences were determined. Verification of the isolation of DNA fragments unique to the tester metagenome was accomplished through dot blot and Southern blot hybridizations. Tester-specific SSH fragments were found in 95 of 96 randomly selected clones. DNA sequences of subtraction fragments were analysed by computer-assisted DNA and amino acid comparisons. Putative translations of 26 (32.1%) subtractive hybridization fragments exhibited significant similarity to bacterial proteins, whereas 15 (18.5%) distinctive subtracted fragments had significant similarity to proteins from Archaea. The remainder of the subtractive hybridization fragments displayed no similarity to GenBank sequences. This metagenomic approach has exposed an unexpectedly large difference in archaeal community structure between the rumen microbial populations of two steers fed identical diets and housed together. 16S rRNA dot blot hybridizations revealed similar proportions of Bacteria and Archaea in both rumen samples and suggest that the differences uncovered by SSH are the result of varying community structural composition. Our study demonstrates a novel approach to comparative analyses of environmental microbial communities through the use of SSH.

15.
Metagenomic shotgun sequencing data can identify microbes populating a microbial community and their proportions, but existing taxonomic profiling methods are inefficient for increasingly large data sets. We present an approach that uses clade-specific marker genes to unambiguously assign reads to microbial clades more accurately and >50× faster than current approaches. We validated our metagenomic phylogenetic analysis tool, MetaPhlAn, on terabases of short reads and provide the largest metagenomic profiling to date of the human gut. It can be accessed at http://huttenhower.sph.harvard.edu/metaphlan/.

16.
Over the past quarter-century, microbiologists have used DNA sequence information to aid in the characterization of microbial communities. During the last decade, this has expanded from single genes to microbial community genomics, or metagenomics, in which the gene content of an environment can provide not just a census of the community members but direct information on metabolic capabilities and potential interactions among community members. Here we introduce a method for the quantitative characterization and comparison of microbial communities based on the normalization of metagenomic data by estimating average genome sizes. This normalization can relieve comparative biases introduced by differences in community structure, number of sequencing reads, and sequencing read lengths between different metagenomes. We demonstrate the utility of this approach by comparing metagenomes from two different marine sources using both conventional small-subunit (SSU) rRNA gene analyses and our quantitative method to calculate the proportion of genomes in each sample that are capable of a particular metabolic trait. In both environments, we characterize three different types of autotrophic organisms to determine what proportion of each community they make up and how differences in environment affect their abundances: aerobic, photosynthetic carbon fixers (the Cyanobacteria); anaerobic, photosynthetic carbon fixers (the Chlorobi); and anaerobic, nonphotosynthetic carbon fixers (the Desulfobacteraceae). These analyses demonstrate how genome proportionality compares to SSU rRNA gene relative abundance and how factors such as average genome size and SSU rRNA gene copy number affect sampling probability and therefore both types of community analysis.
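The normalization idea in this entry, expressing gene counts per genome rather than per read, can be sketched with simple arithmetic. The formula below is a simplified assumption for illustration and is not the paper's method: it treats the number of genome equivalents as total sequenced bases divided by an already estimated average genome size, and the number of trait gene copies as bases assigned to that gene divided by its length. All function and parameter names are hypothetical.

```python
def genome_equivalents(total_bases: float, average_genome_size: float) -> float:
    """Approximate number of genomes' worth of sequence in a metagenome."""
    return total_bases / average_genome_size


def trait_proportion(trait_gene_bases: float, trait_gene_length: float,
                     total_bases: float, average_genome_size: float) -> float:
    """Estimated fraction of genomes carrying a single-copy trait gene."""
    gene_copies = trait_gene_bases / trait_gene_length
    return gene_copies / genome_equivalents(total_bases, average_genome_size)


# Hypothetical sample: 100 Mbp sequenced, 4 Mbp average genome size,
# 15 kbp of reads assigned to a 1.5 kbp trait gene.
print(trait_proportion(15_000, 1_500, 100_000_000, 4_000_000))
```

Because the denominator depends on average genome size, two samples with the same number of reads and the same gene hits can give different proportions, which is the comparative bias the abstract describes.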

17.
Metagenomics is an emerging field in which the power of genomic analysis is applied to an entire microbial community, bypassing the need to isolate and culture individual microbial species. Assembly of metagenomic DNA fragments closely resembles the overlap-layout-consensus procedure for assembling isolated genomes, but is augmented by an additional binning step that sorts scaffolds, contigs and unassembled reads into taxonomic groups. In this paper, we employed n-mer oligonucleotide frequencies as the features and developed a hierarchical classifier (PCAHIER) for binning short (≤ 1,000 bp) metagenomic fragments. Principal component analysis was used to reduce the high dimensionality of the feature space. The hierarchical classifier consists of four layers of local classifiers that are implemented based on linear discriminant analysis. These local classifiers are responsible for binning prokaryotic DNA fragments into superkingdoms, fragments of the same superkingdom into phyla, fragments of the same phylum into genera, and fragments of the same genus into species, respectively. We evaluated the performance of PCAHIER by using our own simulated data sets as well as the widely used simHC synthetic metagenome data set from the IMG/M system. The effectiveness of PCAHIER was demonstrated through comparisons against a non-hierarchical classifier and two existing binning algorithms (TETRA and Phylopythia).
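The feature extraction step described in this entry, turning a DNA fragment into a vector of n-mer frequencies, might look like the sketch below. It covers only the features; the dimensionality reduction and the layered discriminant classifiers are not shown. The function name, the default `k=4` and the choice to skip windows containing ambiguous bases are assumptions for illustration, not details from the paper.

```python
from itertools import product


def kmer_frequencies(sequence: str, k: int = 4) -> list:
    """Return the relative frequency of every DNA k-mer, in lexicographic order."""
    kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    counts = dict.fromkeys(kmers, 0)
    seq = sequence.upper()
    for i in range(len(seq) - k + 1):
        window = seq[i:i + k]
        if window in counts:  # skip windows containing N or other codes
            counts[window] += 1
    total = sum(counts.values())
    return [counts[m] / total if total else 0.0 for m in kmers]


features = kmer_frequencies("ACGTACGTNACGT", k=2)
print(len(features))   # 16 possible dinucleotides
print(sum(features))   # frequencies sum to 1 (up to rounding)
```

The vector length grows as 4^k (256 for tetranucleotides), which is why a reduction step is applied before classification.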

18.
With the rapid accumulation of microbial community samples and related metagenomic sequencing data, a data integration and analysis system is urgently needed for in-depth analysis of large numbers of metagenomic samples (also referred to as “microbial communities”) of interest. Although several existing databases have collected a large number of metagenomic samples, they mostly serve as data repositories with crude annotations, and offer limited functionality for analysis. Moreover, the few tools available in the literature for comparative analysis support the comparison of only a few pre-defined sets of metagenomic samples. To facilitate comprehensive comparative analysis of large numbers of diverse microbial community samples, we have designed the Meta-Mesh system for a variety of analyses, including quantitative analysis of similarities among microbial communities and computation of the correlation between the meta-information of these samples. We have used Meta-Mesh for systematic and efficient analyses of diverse sets of human-associated habitat microbial community samples. Results show that Meta-Mesh can serve as an efficient data analysis platform for the discovery of clusters, biomarkers and other valuable biological information from a large pool of human microbial samples.

19.
Hua  Kui  Zhang  Xuegong 《BMC genomics》2019,20(2):93-101
Background

Metagenomic sequencing is a powerful technology for studying mixtures of microbes, or microbiomes, on humans and in the environment. One basic task in analyzing metagenomic data is to identify the component genomes in the community. This task is challenging due to the complexity of microbiome composition, the limited availability of known reference genomes, and usually insufficient sequencing coverage.

Results

As an initial step toward understanding the complete composition of a metagenomic sample, we studied the problem of estimating the total length of all distinct component genomes in a metagenomic sample. We showed that this problem can be solved by estimating the total number of distinct k-mers in all the metagenomic sequencing data. We proposed a method for this estimation based on the sequencing coverage distribution of observed k-mers, and introduced a k-mer redundancy index (KRI) to fill in the gap between the count of distinct k-mers and the total genome length. We showed the effectiveness of the proposed method on a set of carefully designed simulation data corresponding to multiple situations of true metagenomic data. Results on real data indicate that the uncaptured genomic information can vary dramatically across metagenomic samples, with the potential to mislead downstream analyses.

Conclusions

We posed the question of what the total length of all distinct genomes in a microbial community is and introduced a method to estimate it.
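The two observable quantities this entry builds on, the number of distinct k-mers in the reads and the distribution of how often each k-mer was seen, can be computed as in the sketch below. It does not implement the paper's estimator for unobserved k-mers or the k-mer redundancy index, whose definitions are not given in the abstract. Function names are hypothetical, and for simplicity a k-mer and its reverse complement are counted separately.

```python
from collections import Counter


def kmer_counts(reads, k: int) -> Counter:
    """Count every k-mer occurrence across all reads."""
    counts = Counter()
    for read in reads:
        for i in range(len(read) - k + 1):
            counts[read[i:i + k]] += 1
    return counts


def coverage_histogram(counts: Counter) -> dict:
    """Map each observed multiplicity to the number of k-mers seen that often."""
    return dict(sorted(Counter(counts.values()).items()))


reads = ["ACGTA", "CGTAC"]
counts = kmer_counts(reads, k=3)
print(len(counts))                  # number of distinct observed 3-mers
print(coverage_histogram(counts))   # multiplicity -> number of k-mers
```

A histogram dominated by k-mers seen only once suggests that many more k-mers remain unobserved, which is the situation of insufficient coverage described in the Background.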


20.
Longitudinal studies that integrate samples with variable biomass are essential to understand microbial community dynamics across space or time. Shotgun metagenomics is widely used to investigate these communities at the functional level, but little is known about the effects of combining low and high biomass samples on downstream analysis. We investigated the interacting effects of DNA input and library amplification by polymerase chain reaction on comparative metagenomic analysis using dilutions of a single complex template from an Arabidopsis thaliana‐associated microbial community. We modified the Illumina Nextera kit to generate high‐quality large‐insert (680 bp) paired‐end libraries using a range of 50 pg to 50 ng of input DNA. Using assembly‐based metagenomic analysis, we demonstrate that DNA input level has a significant impact on community structure due to overrepresentation of low‐GC genomic regions following library amplification. In our system, these differences were largely superseded by variations between biological replicates, but our results advocate verifying the influence of library amplification on a case‐by‐case basis. Overall, this study provides recommendations for quality filtering and de‐replication prior to analysis, as well as a practical framework to address the issue of low biomass or biomass heterogeneity in longitudinal metagenomic surveys.  相似文献   
