Similar articles
 Found 20 similar articles (search time: 15 ms)
1.
Massive metagenomic sequencing combined with gene prediction methods was previously used to compile the gene catalogue of the ocean and host-associated microbes. Global expeditions conducted over the past 15 years have sampled the ocean to build a catalogue of genes from pelagic microbes. Here we undertook a large sequencing effort of a perturbed Red Sea plankton community and found that the rate of gene discovery increases continuously with sequencing effort, with no indication that the retrieved 2.83 million non-redundant (complete) genes predicted from the experiment represented a nearly complete inventory of the genes present in the sampled community (i.e., no evidence of saturation). The underlying reason is the Pareto-like distribution of the abundance of genes in the plankton community, resulting in a very long tail of millions of genes present at remarkably low abundances, which can only be retrieved through massive sequencing. Microbial metagenomic projects retrieve a variable number of unique genes per terabase pair (Tbp), with a median value of 14.7 million unique genes per Tbp sequenced across projects. The increase in the rate of gene discovery in microbial metagenomes with sequencing effort implies that there is ample room for new gene discovery in further ocean and holobiont sequencing studies.
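The link between a heavy-tailed abundance distribution and a non-saturating discovery curve can be illustrated with a small simulation. The sketch below is not the authors' analysis: the function name `discovery_curve`, the power-law weight `1 / rank**alpha`, and all parameter values are assumptions chosen only for illustration.

```python
import random


def discovery_curve(n_genes, n_reads, checkpoints, alpha=1.2, seed=0):
    """Count distinct genes seen after each checkpoint number of sampled reads.

    The gene of rank r (1-based) is drawn with weight 1 / r**alpha, a simple
    heavy-tailed abundance model assumed here purely for illustration.
    Checkpoints larger than n_reads are never reached and are omitted.
    """
    rng = random.Random(seed)
    weights = [1.0 / (rank ** alpha) for rank in range(1, n_genes + 1)]
    # One simulated "read" = one draw of a gene index, proportional to abundance.
    draws = rng.choices(range(n_genes), weights=weights, k=n_reads)

    seen = set()
    curve = []
    targets = sorted(checkpoints)
    t = 0
    for i, gene in enumerate(draws, start=1):
        seen.add(gene)
        while t < len(targets) and i == targets[t]:
            curve.append((i, len(seen)))
            t += 1
    return curve


if __name__ == "__main__":
    for reads, genes in discovery_curve(50_000, 200_000, [10_000, 50_000, 200_000]):
        print(f"{reads:>8} reads -> {genes:>6} distinct genes")
```

Under this toy model the number of distinct genes keeps growing at every checkpoint, because rare genes in the long tail are only reached with more draws.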

2.
In the pelagic environment, iron is a scarce but essential micronutrient. The iron acquisition capabilities of selected marine bacteria have been investigated, but the recent proliferation of marine prokaryotic genomes and metagenomes offers a more comprehensive picture of microbial iron uptake pathways in the ocean. Searching these data sets, we were able to identify uptake mechanisms for Fe(3+), Fe(2+) and iron chelates (e.g. siderophore and haem iron complexes). Transport of iron chelates is accomplished by TonB-dependent transporters (TBDTs). After clustering the TBDTs from marine prokaryotic genomes, we identified TBDT clusters for the transport of hydroxamate and catecholate siderophore iron complexes and haem using gene neighbourhood analysis and co-clustering of TBDTs of known function. The genomes also contained two classes of siderophore biosynthesis genes: NRPS (non-ribosomal peptide synthetase) genes and NIS (NRPS Independent Siderophore) genes. The most common iron transporters, in both the genomes and metagenomes, were Fe(3+) ABC transporters. Iron uptake-related TBDTs and siderophore biosynthesis genes were less common in pelagic marine metagenomes relative to the genomic data set, in part because Pelagibacter ubique and Prochlorococcus species, which almost entirely lacked these Fe uptake systems, dominate the metagenomes. Our results are largely consistent with current knowledge of iron speciation in the ocean, but suggest that in certain niches the ability to acquire siderophores and/or haem iron chelates is beneficial.

3.
Bacteria in the 16S rRNA clade SAR86 are among the most abundant uncultivated constituents of microbial assemblages in the surface ocean for which little genomic information is currently available. Bioinformatic techniques were used to assemble two nearly complete genomes from marine metagenomes and single-cell sequencing provided two more partial genomes. Recruitment of metagenomic data shows that these SAR86 genomes substantially increase our knowledge of non-photosynthetic bacteria in the surface ocean. Phylogenomic analyses establish SAR86 as a basal and divergent lineage of γ-proteobacteria, and the individual genomes display a temperature-dependent distribution. Modestly sized at 1.25–1.7 Mbp, the SAR86 genomes lack several pathways for amino-acid and vitamin synthesis as well as sulfate reduction, trends commonly observed in other abundant marine microbes. SAR86 appears to be an aerobic chemoheterotroph with the potential for proteorhodopsin-based ATP generation, though the apparent lack of a retinal biosynthesis pathway may require it to scavenge exogenously-derived pigments to utilize proteorhodopsin. The genomes contain an expanded capacity for the degradation of lipids and carbohydrates acquired using a wealth of tonB-dependent outer membrane receptors. Like the abundant planktonic marine bacterial clade SAR11, SAR86 exhibits metabolic streamlining, but also a distinct carbon compound specialization, possibly avoiding competition.

4.

Background

The biological and clinical consequences of the tight interactions between host and microbiota are rapidly being unraveled by next generation sequencing technologies and sophisticated bioinformatics, also referred to as microbiota metagenomics. The recent success of metagenomics has created a demand to rapidly apply the technology to large case–control cohort studies and to studies of microbiota from various habitats, including habitats relatively poor in microbes. It is therefore of foremost importance to enable a robust and rapid quality assessment of metagenomic data from samples that challenge present technological limits (sample numbers and size). Here we demonstrate that the distribution of overlapping k-mers of metagenome sequence data predicts sequence quality as defined by gene distribution and efficiency of sequence mapping to a reference gene catalogue.

Results

We used serial dilutions of gut microbiota metagenomic datasets to generate well-defined high to low quality metagenomes. We also analyzed a collection of 52 microbiota-derived metagenomes. We demonstrate that k-mer distributions of metagenomic sequence data identify sequence contaminations, such as sequences derived from “empty” ligation products. Of note, k-mer distributions were also able to predict the frequency of sequences mapping to a reference gene catalogue not only for the well-defined serial dilution datasets, but also for 52 human gut microbiota derived metagenomic datasets.

Conclusions

We propose that k-mer analysis of raw metagenome sequence reads should be implemented as a first quality assessment prior to more extensive bioinformatics analysis, such as sequence filtering and gene mapping. With the rising demand for metagenomic analysis of microbiota it is crucial to provide tools for rapid and efficient decision making. This will eventually lead to a faster turn-around time, improved analytical quality including sample quality metrics and a significant cost reduction. Finally, improved quality assessment will have a major impact on the robustness of biological and clinical conclusions drawn from metagenomic studies.

Electronic supplementary material

The online version of this article (doi:10.1186/s12864-015-1406-7) contains supplementary material, which is available to authorized users.
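As a rough illustration of the k-mer analysis proposed in this abstract, the sketch below counts overlapping k-mers in raw reads and summarises their distribution. The abstract does not state the value of k or the statistic used, so the function names (`kmer_counts`, `kmer_spectrum`, `top_kmer_fraction`) and the choice of summary are assumptions, not the published method.

```python
from collections import Counter

VALID_BASES = set("ACGT")


def kmer_counts(reads, k):
    """Count overlapping k-mers across reads, skipping k-mers with non-ACGT bases."""
    counts = Counter()
    for read in reads:
        seq = read.upper()
        for i in range(len(seq) - k + 1):
            kmer = seq[i:i + k]
            if set(kmer) <= VALID_BASES:
                counts[kmer] += 1
    return counts


def kmer_spectrum(counts):
    """Map multiplicity -> number of distinct k-mers observed that many times."""
    return Counter(counts.values())


def top_kmer_fraction(counts, n=1):
    """Fraction of all k-mer occurrences taken up by the n most frequent k-mers.

    A high value hints at a few sequences dominating the data, for example
    adapter-like contaminants (an assumed heuristic for illustration).
    """
    total = sum(counts.values())
    if total == 0:
        return 0.0
    return sum(c for _, c in counts.most_common(n)) / total


if __name__ == "__main__":
    reads = ["ACGTAC", "ACGTNN"]
    counts = kmer_counts(reads, k=3)
    print(dict(counts))
    print(dict(kmer_spectrum(counts)))
    print(top_kmer_fraction(counts, n=1))
```

In practice such a summary would be computed on a subsample of reads before filtering or mapping, which is the position in the workflow the abstract argues for.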

5.
The animal gastrointestinal tract contains a complex community of microbes, whose composition ultimately reflects the co-evolution of microorganisms with their animal host. An analysis of 78,619 pyrosequencing reads generated from pygmy loris fecal DNA extracts was performed to help better understand the microbial diversity and functional capacity of the pygmy loris gut microbiome. The taxonomic analysis of the metagenomic reads indicated that pygmy loris fecal microbiomes were dominated by Bacteroidetes and Proteobacteria phyla. The hierarchical clustering of several gastrointestinal metagenomes demonstrated the similarities of the microbial community structures of pygmy loris and mouse gut systems despite their differences in functional capacity. The comparative analysis of function classification revealed that the metagenome of the pygmy loris was characterized by an overrepresentation of those sequences involved in aromatic compound metabolism compared with humans and other animals. The key enzymes related to the benzoate degradation pathway were identified based on the Kyoto Encyclopedia of Genes and Genomes pathway assignment. These results would contribute to the limited body of primate metagenome studies and provide a framework for comparative metagenomic analysis between human and non-human primates, as well as a comparative understanding of the evolution of humans and their microbiome. However, future studies on the metagenome sequencing of pygmy loris and other prosimians regarding the effects of age, genetics, and environment on the composition and activity of the metagenomes are required.

6.
Hadal ecosystems are found at a depth of 6,000 m below sea level and below, occupying less than 1% of the total area of the ocean. The microbial communities and metabolic potential in these ecosystems are largely uncharacterized. Here, we present four single amplified genomes (SAGs) obtained from 8,219 m below the sea surface within the hadal ecosystem of the Puerto Rico Trench (PRT). These SAGs are derived from members of deep-sea clades, including the Thaumarchaeota and SAR11 clade, and two are related to previously isolated piezophilic (high-pressure-adapted) microorganisms. In order to identify genes that might play a role in adaptation to deep-sea environments, comparative analyses were performed with genomes from closely related shallow-water microbes. The archaeal SAG possesses genes associated with mixotrophy, including lipoylation and the glycine cleavage pathway. The SAR11 SAG encodes glycolytic enzymes previously reported to be missing from this abundant and cosmopolitan group. The other SAGs, which are related to piezophilic isolates, possess genes that may supplement energy demands through the oxidation of hydrogen or the reduction of nitrous oxide. We found evidence for potential trench-specific gene distributions, as several SAG genes were observed only in a PRT metagenome and not in shallower deep-sea metagenomes. These results illustrate new ecotype features that might perform important roles in the adaptation of microorganisms to life in hadal environments.

7.
Microbial communities carry out the majority of the biochemical activity on the planet, and they play integral roles in processes including metabolism and immune homeostasis in the human microbiome. Shotgun sequencing of such communities' metagenomes provides information complementary to organismal abundances from taxonomic markers, but the resulting data typically comprise short reads from hundreds of different organisms and are at best challenging to assemble comparably to single-organism genomes. Here, we describe an alternative approach to infer the functional and metabolic potential of a microbial community metagenome. We determined the gene families and pathways present or absent within a community, as well as their relative abundances, directly from short sequence reads. We validated this methodology using a collection of synthetic metagenomes, recovering the presence and abundance both of large pathways and of small functional modules with high accuracy. We subsequently applied this method, HUMAnN, to the microbial communities of 649 metagenomes drawn from seven primary body sites on 102 individuals as part of the Human Microbiome Project (HMP). This provided a means to compare functional diversity and organismal ecology in the human microbiome, and we determined a core of 24 ubiquitously present modules. Core pathways were often implemented by different enzyme families within different body sites, and 168 functional modules and 196 metabolic pathways varied in metagenomic abundance specifically to one or more niches within the microbiome. These included glycosaminoglycan degradation in the gut, as well as phosphate and amino acid transport linked to host phenotype (vaginal pH) in the posterior fornix. An implementation of our methodology is available at http://huttenhower.sph.harvard.edu/humann. 
This provides a means to accurately and efficiently characterize microbial metabolic pathways and functional modules directly from high-throughput sequencing reads, enabling the determination of community roles in the HMP cohort and in future metagenomic studies.

8.
In the collective genomes (the metagenome) of the microorganisms inhabiting the Earth’s diverse environments is written the history of life on this planet. New molecular tools developed and used for the past 15 years by microbial ecologists are facilitating the extraction, cloning, screening, and sequencing of these genomes. This approach allows microbial ecologists to access and study the full range of microbial diversity, regardless of our ability to culture organisms, and provides an unprecedented access to the breadth of natural products that these genomes encode. However, there is no way that the mere collection of sequences, no matter how expansive, can provide full coverage of the complex world of microbial metagenomes within the foreseeable future. Furthermore, although it is possible to fish out highly informative and useful genes from the sea of gene diversity in the environment, this can be a highly tedious and inefficient procedure. Microbial ecologists must be clever in their pursuit of ecologically relevant, valuable, and niche-defining genomic information within the vast haystack of microbial diversity. In this report, we seek to describe advances and prospects that will help microbial ecologists glean more knowledge from investigations into metagenomes. These include technological advances in sequencing and cloning methodologies, as well as improvements in annotation and comparative sequence analysis. More significant, however, will be ways to focus in on various subsets of the metagenome that may be of particular relevance, either by limiting the target community under study or improving the focus or speed of screening procedures. 
Lastly, given the cost and infrastructure necessary for large metagenome projects, and the almost inexhaustible amount of data they can produce, trends toward broader use of metagenome data across the research community coupled with the needed investment in bioinformatics infrastructure devoted to metagenomics will no doubt further increase the value of metagenomic studies in various environments.

9.
10.
Metagenomic analyses of marine viruses generate an overview of viral genes present in a sample, but the percentage of the resulting sequence fragments that can be reassembled is low and the phenotype of the virus from which a given sequence derives is usually unknown. In this study, we employed physical fractionation to characterize the morphological and genomic traits of a subset of uncultivated viruses from a natural marine assemblage. Viruses from Kāne‘ohe Bay, Hawai‘i were fractionated by equilibrium buoyant density centrifugation in a cesium chloride (CsCl) gradient, and one fraction from the CsCl gradient was then further fractionated by strong anion-exchange chromatography. One of the fractions resulting from this two-dimensional separation appeared to be dominated by only a few virus types based on genome sizes and morphology. Sequences generated from a shotgun clone library of the viruses in this fraction were assembled into significantly more numerous contigs than have been generated with previous metagenomic investigations of whole DNA viral assemblages with comparable sequencing effort. Analysis of the longer contigs (up to 6.5 kb) assembled from our metagenome allowed us to assess gene arrangement in this subset of marine viruses. Our results demonstrate the potential for physical fractionation to facilitate sequence assembly from viral metagenomes and permit linking of morphological and genomic data for uncultivated viruses.

11.
Environment-dependent genomic features have been defined for different metagenomes, whose genes and their associated processes are related to specific environments. Identifying ORFs and their functional categories is the most common way to associate functional and environmental features. However, this analysis based on finding ORFs misses noncoding sequences and, therefore, some metagenome regulatory or structural information could be discarded. In this work we analyzed 23 whole metagenomes, including coding and noncoding sequences, using the following sequence patterns: (G+C) content, Codon Usage (Cd), Trinucleotide Usage (Tn), and functional assignments for ORF prediction. Herein, we present evidence of a high proportion of noncoding sequences discarded in common similarity-based methods in metagenomics, and the kind of relevant information present in those. We found a high density of trinucleotide repeat sequences (TRS) in noncoding sequences, with a regulatory and adaptive function for metagenome communities. We present associations between trinucleotide values and gene function, where metagenome clustering correlates with microorganism adaptations and kinds of metagenomes. We propose here that noncoding sequences have relevant information to describe metagenomes that could be considered in a whole metagenome analysis in order to improve their organization, classification protocols, and their relation with the environment.
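Two of the sequence patterns named in this abstract, (G+C) content and trinucleotide usage, are simple to compute from any sequence, coding or not. The following is a minimal sketch under stated assumptions: the function names are hypothetical, trinucleotides are counted in overlapping windows (the abstract does not say whether windows overlap), and ambiguous bases are ignored.

```python
from collections import Counter

VALID_BASES = set("ACGT")


def gc_content(seq):
    """Fraction of unambiguous bases (A, C, G, T) that are G or C."""
    seq = seq.upper()
    acgt = sum(seq.count(base) for base in "ACGT")
    if acgt == 0:
        return 0.0
    return (seq.count("G") + seq.count("C")) / acgt


def trinucleotide_usage(seq):
    """Relative frequency of each trinucleotide over overlapping windows.

    Unlike codon usage, no reading frame is assumed, so the same function
    applies to noncoding sequence.
    """
    seq = seq.upper()
    counts = Counter()
    for i in range(len(seq) - 2):
        tri = seq[i:i + 3]
        if set(tri) <= VALID_BASES:
            counts[tri] += 1
    total = sum(counts.values())
    if total == 0:
        return {}
    return {tri: n / total for tri, n in counts.items()}


if __name__ == "__main__":
    print(gc_content("GGCCAATT"))
    print(trinucleotide_usage("CAGCAGCAG"))
```

A repeat-rich sequence such as `CAGCAGCAG` concentrates its usage in a few trinucleotides, which is the kind of signal a trinucleotide repeat analysis would pick up.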

12.
Plasmid diversity is still poorly understood in pelagic marine environments. Metagenomic approaches have the potential to reveal the genetic diversity of microbes actually present in an environment and the contribution of mobile genetic elements such as plasmids. By searching metagenomic datasets from flow cytometry-sorted coastal California seawater samples dominated by cyanobacteria (SynMeta) and from the Global Ocean Survey (GOS), putative marine plasmid sequences were identified as well as their possible hosts in the same samples. Based on conserved plasmid replication protein sequences predicted from the SynMeta metagenomes, PCR primers were designed for amplification of one plasmid family and used to confirm that metagenomic contigs of this family were derived from plasmids. These results suggest that the majority of plasmids in SynMeta metagenomes were small and cryptic, encoding mostly their own replication proteins. In contrast, probable plasmid sequences identified in the GOS dataset showed more complexity, consistent with a much more diverse microbial population, and included genes involved in plasmid transfer, mobilization, stability and partitioning. Phylogenetic trees were constructed based on common replication protein functional domains and, even within one replication domain family, substantial diversity was found within and between different samples. However, some replication protein domain families appear to be rare in the marine environment.

13.
Bacterioplankton of the SAR11 clade are the most abundant microorganisms in marine systems, usually representing 25% or more of the total bacterial cells in seawater worldwide. SAR11 is divided into subclades with distinct spatiotemporal distributions (ecotypes), some of which appear to be specific to deep water. Here we examine the genomic basis for deep ocean distribution of one SAR11 bathytype (depth-specific ecotype), subclade Ic. Four single-cell Ic genomes, with estimated completeness of 55%–86%, were isolated from 770 m at station ALOHA and compared with eight SAR11 surface genomes and metagenomic datasets. Subclade Ic genomes dominated metagenomic fragment recruitment below the euphotic zone. They had similar COG distributions, high local synteny and shared a large number (69%) of orthologous clusters with SAR11 surface genomes, yet were distinct at the 16S rRNA gene and amino-acid level, and formed a separate, monophyletic group in phylogenetic trees. Subclade Ic genomes were enriched in genes associated with membrane/cell wall/envelope biosynthesis and showed evidence of unique phage defenses. The majority of subclade Ic-specific genes were hypothetical, and some were highly abundant in deep ocean metagenomic data, potentially masking mechanisms for niche differentiation. However, the evidence suggests these organisms have a similar metabolism to their surface counterparts, and that subclade Ic adaptations to the deep ocean do not involve large variations in gene content, but rather more subtle differences previously observed in deep ocean genomic data, like preferential amino-acid substitutions, larger coding regions among SAR11 clade orthologs, larger intergenic regions and larger estimated average genome size.

14.
15.
We have analyzed metagenomic fosmid clones from the deep chlorophyll maximum (DCM), which, by genomic parameters, correspond to the 16S ribosomal RNA (rRNA)-defined marine Euryarchaeota group IIB (MGIIB). The fosmid collections associated with this group add up to 4 Mb and correspond to at least two species within this group. From the proposed essential genes contained in the collections, we infer that large sections of the conserved regions of the genomes of these microbes have been recovered. The genomes indicate a photoheterotrophic lifestyle, similar to that of the available genome of MGIIA (assembled from an estuarine metagenome in Puget Sound, Washington Pacific coast), with a proton-pumping rhodopsin of the same kind. Several genomic features support an aerobic metabolism with diversified substrate degradation capabilities that include xenobiotics and agar. On the other hand, these MGIIB representatives are non-motile and possess similar genome size to the MGIIA-assembled genome, but with a lower GC content. The large phylogenomic gap with other known archaea indicates that this is a new class of marine Euryarchaeota for which we suggest the name Thalassoarchaea. The analysis of recruitment from available metagenomes indicates that the representatives of group IIB described here are largely found at the DCM (ca. 50 m deep), in which they are abundant (up to 0.5% of the reads), and at the surface mostly during the winter mixing, which explains formerly described 16S rRNA distribution patterns. Their uneven representation in environmental samples that are close in space and time might indicate sporadic blooms.

16.
Metagenomic sequencing has contributed important new knowledge about the microbes that live in a symbiotic relationship with humans. With modern sequencing technology it is possible to generate large numbers of sequencing reads from a metagenome, but analysis of the data is challenging. Here we present the bioinformatics pipeline MEDUSA that facilitates analysis of metagenomic reads at the gene and taxonomic level. We also constructed a global human gut microbial gene catalogue by combining data from 4 studies spanning 3 continents. Using MEDUSA we mapped 782 gut metagenomes to the global gene catalogue and a catalogue of sequenced microbial species. We find that all studies share about half a million genes and that on average 300 000 genes are shared by half the studied subjects. The gene richness is higher in the European studies than in the Chinese and American studies, and this is also reflected in the species richness. Even though it is possible to identify common species and a core set of genes, we find that there are large variations in abundance of species and genes.
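The notion of genes "shared by half the studied subjects" amounts to a prevalence threshold over per-sample gene sets. The sketch below shows that calculation on toy data; it is not part of MEDUSA, and the function name `genes_by_prevalence` and the input layout (sample name mapped to a set of detected gene identifiers) are assumptions for illustration.

```python
from collections import Counter


def genes_by_prevalence(samples, min_fraction):
    """Return the genes detected in at least min_fraction of the samples.

    samples: dict mapping a sample name to an iterable of gene identifiers.
    min_fraction: value in [0, 1]; 1.0 gives the strict core, 0.5 gives
    genes present in at least half of the samples.
    """
    if not samples:
        return set()
    presence = Counter()
    for genes in samples.values():
        # Count each gene once per sample, regardless of its abundance.
        presence.update(set(genes))
    threshold = min_fraction * len(samples)
    return {gene for gene, n in presence.items() if n >= threshold}


if __name__ == "__main__":
    toy = {
        "s1": {"geneA", "geneB"},
        "s2": {"geneA", "geneC"},
        "s3": {"geneA", "geneB", "geneD"},
        "s4": {"geneA"},
    }
    print(sorted(genes_by_prevalence(toy, 1.0)))
    print(sorted(genes_by_prevalence(toy, 0.5)))
```

Presence is treated here as binary; a real analysis would first need a rule for calling a gene detected from mapped read counts, which the abstract does not specify.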

17.
Metagenomic analyses of microbial communities have revealed a large degree of interspecies and intraspecies genetic diversity through the reconstruction of metagenome assembled genomes (MAGs). Yet, metabolic modeling efforts mainly rely on reference genomes as the starting point for reconstruction and simulation of genome scale metabolic models (GEMs), neglecting the immense intra- and inter-species diversity present in microbial communities. Here, we present metaGEM (https://github.com/franciscozorrilla/metaGEM), an end-to-end pipeline enabling metabolic modeling of multi-species communities directly from metagenomes. The pipeline automates all steps from the extraction of context-specific prokaryotic GEMs from MAGs to community level flux balance analysis (FBA) simulations. To demonstrate the capabilities of metaGEM, we analyzed 483 samples spanning lab culture, human gut, plant-associated, soil, and ocean metagenomes, reconstructing over 14,000 GEMs. We show that GEMs reconstructed from metagenomes have fully represented metabolism comparable to isolated genomes. We demonstrate that metagenomic GEMs capture intraspecies metabolic diversity and identify potential differences in the progression of type 2 diabetes at the level of gut bacterial metabolic exchanges. Overall, metaGEM enables FBA-ready metabolic model reconstruction directly from metagenomes, provides a resource of metabolic models, and showcases community-level modeling of microbiomes associated with disease conditions allowing generation of mechanistic hypotheses.

18.
Candidate bacterial phylum BRC1 has been identified in a broad range of mostly organic-rich oxic and anoxic environments through molecular analysis of microbial communities. None of the members of BRC1 have been cultivated and only a few draft genome sequences have been obtained from metagenomes or as a result of single-cell sequencing. We have reconstructed the complete genome of a BRC1 bacterium, BY40, from the metagenome of the microbial community of a deep subsurface thermal aquifer in the Tomsk Region of Western Siberia, Russia, and used it for metabolic reconstruction and comparison with existing genomic data. Analysis of the 3.3 Mb genome of the BY40 bacterium revealed numerous glycoside hydrolases that could enable utilization of carbohydrates, including enzymes of the chitin-degradation pathway. The bacterium lacks flagellar machinery but twitching motility is encoded. The reconstructed central metabolism revealed pathways enabling the fermentation of organic substrates, as well as their complete oxidation through aerobic and anaerobic respiration. Phylogenetic analysis using the BY40 genome supported the phylum-level classification of the BRC1 lineage. Based on phylogenetic and genomic analyses, the novel bacterium is proposed to be classified as Candidatus Sumerlaea chitinivorans, within a candidate phylum Sumerlaeota.

19.
Pyrosequence data were used to analyze the composition and metabolic potential of a metagenome from a hydrocarbon-contaminated site. Unamplified and whole genome amplified (WGA) sequence data from this source were compared. According to MG-RAST, an additional 2,742,252 bp of DNA was obtained with the WGA, indicating that WGA has the ability to generate a large amount of DNA from a small amount of starting sample. However, it was observed that WGA introduced a bias with respect to the distribution of the amplified DNA and the types of microbial populations that were accessed from the metagenome. The dominant order in the WGA metagenome was Flavobacteriales, whereas the unamplified metagenome was dominated by Actinomycetales as determined by RDPII and CARMA databases. According to the SEED database, the subsystems shown to be present for the individual metagenomes were associated with the metabolic potential that was expected to be present in the contaminated groundwater, such as the metabolism of aromatic compounds. A higher percentage (4.4%) of genes associated with the metabolism of aromatic compounds was identified in the unamplified metagenome when compared to the WGA metagenome (0.66%). This could be attributed to the increased number of hydrocarbon degrading bacteria that had been accessed from this metagenome (Mycobacteria, Nocardia, Brevibacteria, Clavibacter, Rubrobacter, and Rhodococcus). Therefore, it was possible to relate the taxonomic groups accessed to the contamination profile of the metagenome. By collating the sequencing data obtained pre- and post-amplification, this study provided insight regarding the survival strategies of microbial communities inhabiting contaminated environments.

20.
A great amount of attention has been paid to the study of the microbiota–gut–brain axis in recent years. Gut microbiota can affect development and functioning of the brain through synthesis of various neuroactive metabolites, such as neurotransmitters, hormones, and other compounds. In the present study, the presence and distribution of genes encoding enzymes involved in the production of neuroactive compounds are analyzed in 147 gut metagenomes of healthy people from the Human Microbiome Project database and in a synthetic metagenome artificially assembled from 508 bacterial genomes. The analysis is conducted using the collected catalog of orthologs for 17 key enzymes and an algorithm developed for their search. As a result of analyses of genomic and metagenomic data of healthy people, the seven bacterial genera containing the greatest number of enzyme genes and the 8 enzymes out of 17 that are observed most frequently are chosen. It is assumed that the selected “core” genera and enzymes form a metagenomic signature reflecting the neurometabolic potential of the human intestinal microbiota in the healthy state.
