Similar articles
 Found 20 similar articles; search took 31 ms
1.

Background  

The remarkable advance of metagenomics presents significant new challenges in data analysis. Metagenomic datasets (metagenomes) are large collections of sequencing reads from anonymous species within particular environments. Computational analyses for very large metagenomes are extremely time-consuming, and there are often many novel sequences in these metagenomes that are not fully utilized. The number of available metagenomes is rapidly increasing, so fast and efficient metagenome comparison methods are in great demand.

2.
By comparing the SEED and Pfam functional profiles of metagenomes of two Brazilian coral species with 29 datasets that are publicly available, we were able to identify some functions, such as protein secretion systems, that are overrepresented in the metagenomes of corals and may play a role in the establishment and maintenance of bacteria-coral associations. However, only a small percentage of the reads of these metagenomes could be annotated by these reference databases, which may lead to a strong bias in the comparative studies. For this reason, we have searched for identical sequences (99% nucleotide identity) among these metagenomes in order to perform a reference-independent comparative analysis, and we were able to identify groups of microbial communities that may be under similar selective pressures. The identification of sequences shared among the metagenomes was found to be even better for the identification of groups of communities with similar niche requirements than the traditional analysis of functional profiles. This approach is not only helpful for the investigation of similarities between microbial communities with a high proportion of unknown reads, but also enables an indirect overview of gene exchange between communities.
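The reference-independent comparison described in this abstract can be illustrated with a toy sketch. It is not the authors' pipeline: the function names `percent_identity` and `shared_fraction` are hypothetical, and the position-wise comparison of equal-length reads is an assumption that stands in for the sequence alignment a real analysis would use.

```python
def percent_identity(a, b):
    """Percent of matching positions between two equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("this sketch compares equal-length sequences only")
    if not a:
        return 0.0
    matches = sum(1 for x, y in zip(a, b) if x == y)
    return 100.0 * matches / len(a)


def shared_fraction(sample_a, sample_b, min_identity=99.0):
    """Fraction of reads in sample_a that match some read in sample_b.

    A read counts as shared when at least one equal-length read in
    sample_b reaches min_identity percent (99% in the abstract).
    """
    if not sample_a:
        return 0.0
    shared = sum(
        1
        for read in sample_a
        if any(
            len(read) == len(other)
            and percent_identity(read, other) >= min_identity
            for other in sample_b
        )
    )
    return shared / len(sample_a)
```

Computing `shared_fraction` for every pair of metagenomes would give a similarity matrix that could then be clustered, without consulting any annotation database.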

3.
Shotgun metagenome sequencing has become a fast, cheap and high-throughput technology for characterizing microbial communities in complex environments and human body sites. However, accurate identification of microorganisms at the strain/species level remains extremely challenging. We present a novel k-mer-based approach, termed GSMer, that identifies genome-specific markers (GSMs) from currently sequenced microbial genomes, which were then used for strain/species-level identification in metagenomes. Using 5390 sequenced microbial genomes, 8 770 321 50-mer strain-specific and 11 736 360 species-specific GSMs were identified for 4088 strains and 2005 species (4933 strains), respectively. The GSMs were first evaluated against mock community metagenomes, recently sequenced genomes and real metagenomes from different body sites, suggesting that the identified GSMs were specific to their target genomes. Sensitivity evaluation against synthetic metagenomes with different coverage suggested that 50 GSMs per strain were sufficient to identify most microbial strains with ≥0.25× coverage, and 10% of selected GSMs in a database should be detected for confident positive calls. Application of GSMs identified 45 and 74 microbial strains/species significantly associated with type 2 diabetes patients and obese/lean individuals from corresponding gastrointestinal tract metagenomes, respectively. Our results agreed with previous studies but provided strain-level information. The approach can be directly applied to identify microbial strains/species from raw metagenomes, without the effort of complex data pre-processing.
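The idea of genome-specific k-mer markers can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the GSMer implementation: the function names are hypothetical, reverse complements and near matches are ignored, and markers are matched to reads by exact substring search. The only detail taken from the abstract is the 10% detection threshold.

```python
from collections import defaultdict


def kmers(seq, k):
    """Return the set of all k-length substrings of seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def genome_specific_kmers(genomes, k):
    """Map each genome name to the k-mers found in that genome and no other."""
    owners = defaultdict(set)  # k-mer -> names of genomes containing it
    for name, seq in genomes.items():
        for kmer in kmers(seq, k):
            owners[kmer].add(name)
    specific = {name: set() for name in genomes}
    for kmer, names in owners.items():
        if len(names) == 1:
            specific[next(iter(names))].add(kmer)
    return specific


def detected_fraction(markers, reads):
    """Fraction of markers that occur as an exact substring of any read."""
    if not markers:
        return 0.0
    hits = sum(1 for m in markers if any(m in r for r in reads))
    return hits / len(markers)


def is_present(markers, reads, min_fraction=0.10):
    """Call a genome present when at least min_fraction of its markers are hit."""
    return detected_fraction(markers, reads) >= min_fraction


# Toy data (hypothetical): two short "genomes" and one read, with k = 4.
toy_genomes = {"strain_a": "ACGTACGGTA", "strain_b": "ACGTTTGGCA"}
toy_markers = genome_specific_kmers(toy_genomes, 4)
toy_reads = ["TTCGTACGG"]
```

With the toy data, `is_present(toy_markers["strain_a"], toy_reads)` is true and the same call for `strain_b` is false, because the read carries several k-mers unique to `strain_a` and none unique to `strain_b`. The k-mer `ACGT` occurs in both genomes, so it is a marker for neither.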

4.
Environmental parameters drive phenotypic and genotypic frequency variations in microbial communities and thus control the extent and structure of microbial diversity. We tested the extent to which microbial community composition changes are controlled by shifting physicochemical properties within a hypersaline lagoon. We sequenced four sediment metagenomes from the Coorong, South Australia from samples which varied in salinity by 99 Practical Salinity Units (PSU), an order of magnitude in ammonia concentration and two orders of magnitude in microbial abundance. Despite the marked divergence in environmental parameters observed between samples, hierarchical clustering of taxonomic and metabolic profiles of these metagenomes showed striking similarity between the samples (>89%). Comparison of these profiles to those derived from a wide variety of publicly available datasets demonstrated that the Coorong sediment metagenomes were similar to other sediment, soil, biofilm and microbial mat samples regardless of salinity (>85% similarity). Overall, clustering of solid substrate and water metagenomes into discrete similarity groups based on functional potential indicated that the dichotomy between water and solid matrices is a fundamental determinant of community microbial metabolism that is not masked by salinity, nutrient concentration or microbial abundance.

5.

Background  

Investigation of metagenomes provides greater insight into uncultured microbial communities. The improvement in sequencing technology, which yields a large amount of sequence data, has led to major breakthroughs in the field. However, at present, taxonomic binning tools for metagenomes discard 30-40% of Sanger sequencing data due to the stringency of BLAST cut-offs. In an attempt to provide a comprehensive overview of metagenomic data, we re-analyzed the discarded metagenomes by using less stringent cut-offs. Additionally, we introduced a new criterion, namely, the evolutionary conservation of adjacency between neighboring genes. To evaluate the feasibility of our approach, we re-analyzed discarded contigs and singletons from several environments with different levels of complexity. We also compared the consistency between our taxonomic binning and those reported in the original studies.

6.
Schmieder R  Edwards R 《PloS one》2011,6(3):e17288
High-throughput sequencing technologies have strongly impacted microbiology, providing a rapid and cost-effective way of generating draft genomes and exploring microbial diversity. However, sequences obtained from impure nucleic acid preparations may contain DNA from sources other than the sample. Those sequence contaminations are a serious concern to the quality of the data used for downstream analysis, causing misassembly of sequence contigs and erroneous conclusions. Therefore, the removal of sequence contaminants is a necessary and required step for all sequencing projects. We developed DeconSeq, a robust framework for the rapid, automated identification and removal of sequence contamination in longer-read datasets (150 bp mean read length). DeconSeq is publicly available as standalone and web-based versions. The results can be exported for subsequent analysis, and the databases used for the web-based version are automatically updated on a regular basis. DeconSeq categorizes possible contamination sequences, eliminates redundant hits with higher similarity to non-contaminant genomes, and provides graphical visualizations of the alignment results and classifications. Using DeconSeq, we conducted an analysis of possible human DNA contamination in 202 previously published microbial and viral metagenomes and found possible contamination in 145 (72%) metagenomes with as high as 64% contaminating sequences. This new framework allows scientists to automatically detect and efficiently remove unwanted sequence contamination from their datasets while eliminating critical limitations of current methods. DeconSeq's web interface is simple and user-friendly. The standalone version allows offline analysis and integration into existing data processing pipelines. DeconSeq's results reveal whether the sequencing experiment has succeeded, whether the correct sample was sequenced, and whether the sample contains any sequence contamination from DNA preparation or host. 
In addition, the analysis of 202 metagenomes demonstrated significant contamination of the non-human associated metagenomes, suggesting that this method is appropriate for screening all metagenomes. DeconSeq is available at http://deconseq.sourceforge.net/.

7.
In the pelagic environment, iron is a scarce but essential micronutrient. The iron acquisition capabilities of selected marine bacteria have been investigated, but the recent proliferation of marine prokaryotic genomes and metagenomes offers a more comprehensive picture of microbial iron uptake pathways in the ocean. Searching these data sets, we were able to identify uptake mechanisms for Fe(3+), Fe(2+) and iron chelates (e.g. siderophore and haem iron complexes). Transport of iron chelates is accomplished by TonB-dependent transporters (TBDTs). After clustering the TBDTs from marine prokaryotic genomes, we identified TBDT clusters for the transport of hydroxamate and catecholate siderophore iron complexes and haem using gene neighbourhood analysis and co-clustering of TBDTs of known function. The genomes also contained two classes of siderophore biosynthesis genes: NRPS (non-ribosomal peptide synthase) genes and NIS (NRPS Independent Siderophore) genes. The most common iron transporters, in both the genomes and metagenomes, were Fe(3+) ABC transporters. Iron uptake-related TBDTs and siderophore biosynthesis genes were less common in pelagic marine metagenomes relative to the genomic data set, in part because Pelagibacter ubique and Prochlorococcus species, which almost entirely lacked these Fe uptake systems, dominate the metagenomes. Our results are largely consistent with current knowledge of iron speciation in the ocean, but suggest that in certain niches the ability to acquire siderophores and/or haem iron chelates is beneficial.

8.
Lactic acid bacteria (LAB) (n = 152) in African pearl millet slurries and in the metagenomes of amylaceous fermented foods were investigated by screening 33 genes involved in probiotic and nutritional functions. All isolates belonged to six species of the genera Pediococcus and Lactobacillus, and Lactobacillus fermentum was the dominant species. We screened the isolates for the abilities to survive passage through the gastrointestinal tract and to synthesize folate and riboflavin. The isolates were also tested in vitro for their abilities to survive exposure to bile salts and to survive at pH 2. Because the ability to hydrolyze starch confers an ecological advantage on LAB that grow in starchy matrices as well as improving the nutritional properties of the gruels, we screened for genes involved in starch metabolism. The results showed that genes with the potential ability to survive passage through the gastrointestinal tract were widely distributed among isolates and metagenomes, whereas in vitro tests showed that only a limited set of isolates, mainly those belonging to L. fermentum, could tolerate a low pH. In contrast, the wide distribution of genes associated with bile salt tolerance, in particular bsh, is consistent with the high frequency of tolerance to bile salts observed. Genetic screening revealed a potential for folate and riboflavin synthesis in both isolates and metagenomes, as well as high variability among genes related to starch metabolism. Genetic screening of isolates and metagenomes from fermented foods is thus a promising approach for assessing the functional potential of food microbiotas.

9.
Metagenomic analyses of microbial communities have revealed a large degree of interspecies and intraspecies genetic diversity through the reconstruction of metagenome assembled genomes (MAGs). Yet, metabolic modeling efforts mainly rely on reference genomes as the starting point for reconstruction and simulation of genome scale metabolic models (GEMs), neglecting the immense intra- and inter-species diversity present in microbial communities. Here, we present metaGEM (https://github.com/franciscozorrilla/metaGEM), an end-to-end pipeline enabling metabolic modeling of multi-species communities directly from metagenomes. The pipeline automates all steps from the extraction of context-specific prokaryotic GEMs from MAGs to community level flux balance analysis (FBA) simulations. To demonstrate the capabilities of metaGEM, we analyzed 483 samples spanning lab culture, human gut, plant-associated, soil, and ocean metagenomes, reconstructing over 14,000 GEMs. We show that GEMs reconstructed from metagenomes have fully represented metabolism comparable to isolated genomes. We demonstrate that metagenomic GEMs capture intraspecies metabolic diversity and identify potential differences in the progression of type 2 diabetes at the level of gut bacterial metabolic exchanges. Overall, metaGEM enables FBA-ready metabolic model reconstruction directly from metagenomes, provides a resource of metabolic models, and showcases community-level modeling of microbiomes associated with disease conditions allowing generation of mechanistic hypotheses.

10.
Mercury (Hg) methylation genes (hgcAB) mediate the formation of the toxic methylmercury and have been identified from diverse environments, including freshwater and marine ecosystems, Arctic permafrost, forest and paddy soils, coal-ash amended sediments, chlor-alkali plant discharges and geothermal springs. Here we present the first attempt at a standardized protocol for the detection, identification and quantification of hgc genes from metagenomes. Our Hg-cycling microorganisms in aquatic and terrestrial ecosystems (Hg-MATE) database, a catalogue of hgc genes, provides the most accurate information to date on the taxonomic identity and functional/metabolic attributes of microorganisms responsible for Hg methylation in the environment. Furthermore, we introduce “marky-coco”, a ready-to-use bioinformatic pipeline based on de novo single-metagenome assembly, for easy and accurate characterization of hgc genes from environmental samples. We compared the recovery of hgc genes from environmental metagenomes using the marky-coco pipeline with an approach based on coassembly of multiple metagenomes. Our data show similar efficiency in both approaches for most environments except those with high diversity (i.e., paddy soils) for which a coassembly approach was preferred. Finally, we discuss the definition of true hgc genes and methods to normalize hgc gene counts from metagenomes.

11.
The discovery of several new structured non-coding RNAs in bacterial and archaeal genomes and metagenomes raises burning questions about their biological and biochemical functions.

12.
《Genomics》2020,112(4):2903-2913
Tanneries pose a serious threat to the environment by generating large amounts of solid tannery waste (STW). Two metagenomes representing tannery waste dumpsites Jajmau (JJK) and Unnao (UNK) were sequenced using the Illumina HiSeq platform. Microbial diversity analysis revealed domination of Proteobacteria, Firmicutes, Bacteroidetes, Actinobacteria, and Planctomycetes in both metagenomes. Presence of pollutant degrading microbes such as Bacillus, Clostridium, Halanaerobium and Pseudomonas strongly indicated their bioremediation ability. KEGG and SEED annotated main functional categories included carbohydrate metabolism, amino acids metabolism, and protein metabolism. KEGG displayed 5848 and 9633 protease-encoding ORFs compared to 5159 and 8044 ORFs displayed by SEED classification in JJK and UNK metagenomes, respectively. Abundantly present serine- and metallo-proteases belonging to Bacillaceae, Clostridiaceae, Xanthomonadaceae, Flavobacteriaceae and Chitinophagaceae families exhibited proteinaceous waste degrading ability of these metagenomes. Further structural and functional analysis of metagenome encoded enzymes may facilitate the discovery of novel proteases useful in bioremediation of STW.

13.
Viruses are abundant yet understudied members of soil environments that influence terrestrial biogeochemical cycles. Here, we characterized the dsDNA viral diversity in biochar-amended agricultural soils at the preplanting and harvesting stages of a tomato growing season via paired total metagenomes and viral size fraction metagenomes (viromes). Size fractionation prior to DNA extraction reduced sources of nonviral DNA in viromes, enabling the recovery of a greater richness of viral populations (vOTUs), greater viral taxonomic diversity, a broader range of predicted hosts, and better access to the rare virosphere, relative to total metagenomes, which tended to recover only the most persistent and abundant vOTUs. Of 2961 detected vOTUs, 2684 were recovered exclusively from viromes, while only three were recovered from total metagenomes alone. Both viral and microbial communities differed significantly over time, suggesting a coupled response to rhizosphere recruitment processes and/or nitrogen amendments. Viral communities alone were also structured along an 18 m spatial gradient. Overall, our results highlight the utility of soil viromics and reveal similarities between viral and microbial community dynamics throughout the tomato growing season yet suggest a partial decoupling of the processes driving their spatial distributions, potentially due to differences in dispersal, decay rates, and/or sensitivities to soil heterogeneity.
Subject terms: Microbial ecology, Soil microbiology, Metagenomics

14.
15.
Numerous marine sponges harbor enormous amounts of as-yet-uncultivated bacteria in their tissues. There is increasing evidence that these symbionts play an important role in the synthesis of protective metabolites, many of which are of great pharmacological interest. In this study, genes for the biosynthesis of polyketides, one of the most important classes of bioactive natural products, were systematically investigated in 20 demosponge species from different oceans. Unexpectedly, the sponge metagenomes were dominated by a ubiquitously present, evolutionarily distinct, and highly sponge-specific group of polyketide synthases (PKSs). Open reading frames resembling animal fatty acid genes were found on three corresponding DNA regions isolated from the metagenomes of Theonella swinhoei and Aplysina aerophoba. Their architecture suggests that methyl-branched fatty acids are the metabolic product. According to a phylogenetic analysis of housekeeping genes, at least one of the PKSs belongs to a bacterium of the Deinococcus-Thermus phylum. The results provide new insights into the chemistry of sponge symbionts and allow inference of a detailed phylogeny of the diverse functional PKS types present in sponge metagenomes. Based on these qualitative and quantitative data, we propose a significantly simplified strategy for the targeted isolation of biomedically relevant PKS genes from complex sponge-symbiont associations.

16.
Natural populations of bacteria in different environments can be astonishingly diverse, as was revealed graphically by large-scale sequencing of samples of their so-called metagenomes. Among the sequence datasets from four different samples of marine bacterial metagenomes, we noted that nitrogen fixation (nif) genes were conspicuous by their absence from three of them. However, in one sample, more than one-third of the bacteria appeared to have a complement of these genes. Here, some reasons behind this site-to-site variability and their implications for how molecular methods, involving large-scale sequencing and/or functional metagenomics, can best be used to describe bacterial diversity in natural environments are discussed.

17.
The advances of next-generation sequencing technology have facilitated metagenomics research that attempts to determine directly the whole collection of genetic material within an environmental sample (i.e. the metagenome). Identification of genes directly from short reads has become an important yet challenging problem in annotating metagenomes, since the assembly of metagenomes is often not available. Gene predictors developed for whole genomes (e.g. Glimmer) and recently developed for metagenomic sequences (e.g. MetaGene) show a significant decrease in performance as the sequencing error rates increase, or as reads get shorter. We have developed a novel gene prediction method, FragGeneScan, which combines sequencing error models and codon usages in a hidden Markov model to improve the prediction of protein-coding regions in short reads. The performance of FragGeneScan was comparable to Glimmer and MetaGene for complete genomes. But for short reads, FragGeneScan consistently outperformed MetaGene (accuracy improved ∼62% for reads of 400 bases with 1% sequencing errors, and ∼18% for short reads of 100 bases that are error free). When applied to metagenomes, FragGeneScan recovered substantially more genes than MetaGene predicted (>90% of the genes identified by homology search), and many novel genes with no homologs in current protein sequence databases.

18.
Studies of the genomes of individual microbial organisms as well as aggregate genomes (metagenomes) of microbial communities are expected to lead to advances in various areas, such as healthcare, environmental cleanup, and alternative energy production. A variety of specialized data resources manage the results of different microbial genome data processing and interpretation stages, and represent different degrees of microbial genome characterization. Scientists studying microbial genomes and metagenomes often need one or several of these resources. Given their diversity, these resources cannot be used effectively without determining the scope and type of individual resources as well as the relationship between their data.

19.
Kadnikov  V. V.  Mardanov  A. V.  Beletsky  A. V.  Karnachuk  O. V.  Ravin  N. V. 《Microbiology》2020,89(3):328-336
Microbiology - The candidate phylum Riflebacteria was described based on analysis of genomes assembled from the metagenomes of various anaerobic ecosystems; however, to date, no member of...

20.
The tendency for chlorinated aliphatics and aromatic hydrocarbons to accumulate in environments such as groundwater and sediments poses a serious environmental threat. In this study, the metabolic capacity of hydrocarbon (aromatics and chlorinated aliphatics)-contaminated groundwater in the KwaZulu-Natal province of South Africa has been elucidated for the first time by analysis of pyrosequencing data. The taxonomic data revealed that the metagenomes were dominated by the phylum Proteobacteria (mainly Betaproteobacteria). In addition, Flavobacteriales, Sphingobacteria, Burkholderiales, and Rhodocyclales were the predominant orders present in the individual metagenomes. These orders included microorganisms (Flavobacteria, Dechloromonas aromatica RCB, and Azoarcus) involved in the degradation of aromatic compounds and various other hydrocarbons that were present in the groundwater. Although the metabolic reconstruction of the metagenome represented composite cell networks, the information obtained was sufficient to address questions regarding the metabolic potential of the microbial communities and to correlate the data to the contamination profile of the groundwater. Genes involved in the degradation of benzene and benzoate, together with heavy metal-resistance mechanisms, appeared to provide a survival strategy used by the microbial communities. Analysis of the pyrosequencing-derived data revealed that the metagenomes represent complex microbial communities that have adapted to the geochemical conditions of the groundwater as evidenced by the presence of key enzymes/genes conferring resistance to specific contaminants. Thus, pyrosequencing analysis of the metagenomes provided insights into the microbial activities in hydrocarbon-contaminated habitats.
