Similar literature
1.
姜忠俊  李小波 《微生物学报》 (Acta Microbiologica Sinica) 2022,62(8):2954-2968
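Metagenomic techniques can extract the entire genetic material of microorganisms directly from the environment, without the pure culture on growth media that traditional methods require. The emergence of this technology has given scientists an important means of understanding the structure and function of microbial communities, and it is of great significance for the diagnosis and treatment of disease, environmental remediation, and the understanding of life. The complete genetic material of the microorganisms is extracted from the environment and sequenced to obtain reads, which read-assembly tools can then assemble into contigs. Binning the contigs makes it possible to reconstruct more complete genes from a metagenomic sample. Because binning quality directly affects downstream biological analysis, effectively binning contig sequences that mix genes from different microorganisms has become both a focus and a difficulty of metagenomics research. Machine learning methods are widely applied to metagenomic contig binning and are usually divided into supervised contig classification methods and unsupervised contig clustering methods. This review gives a fairly comprehensive account of metagenomic contig binning methods and analyzes contig classification and clustering methods in depth, finding problems such as low classification accuracy, long binning times, and difficulty reconstructing more microbial genes from complex data sets; it also looks ahead to future research and development of contig binning methods. The authors suggest that semi-supervised learning, ensemble learning, and deep learning methods, together with more effective representations of data features, could be used to improve binning performance.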

2.
3.

Background  

Investigation of metagenomes provides greater insight into uncultured microbial communities. The improvement in sequencing technology, which yields a large amount of sequence data, has led to major breakthroughs in the field. However, at present, taxonomic binning tools for metagenomes discard 30-40% of Sanger sequencing data due to the stringency of BLAST cut-offs. In an attempt to provide a comprehensive overview of metagenomic data, we re-analyzed the discarded metagenomes by using less stringent cut-offs. Additionally, we introduced a new criterion, namely, the evolutionary conservation of adjacency between neighboring genes. To evaluate the feasibility of our approach, we re-analyzed discarded contigs and singletons from several environments with different levels of complexity. We also compared the consistency between our taxonomic binning and those reported in the original studies.

4.
Reddy RM  Mohammed MH  Mande SS 《Gene》2012,505(2):259-265
Phylogenetic assignment of individual sequence reads to their respective taxa, referred to as 'taxonomic binning', constitutes a key step of metagenomic analysis. Existing binning methods have limitations either with respect to time or accuracy/specificity of binning. Given these limitations, development of a method that can bin vast amounts of metagenomic sequence data in a rapid, efficient and computationally inexpensive manner can profoundly influence metagenomic analysis in computational resource poor settings. We introduce TWARIT, a hybrid binning algorithm that employs a combination of short-read alignment and composition-based signature sorting approaches to achieve rapid binning rates without compromising on binning accuracy and specificity. TWARIT is validated with simulated and real-world metagenomes, and the results demonstrate significantly lower overall binning times compared with existing methods. Furthermore, the binning accuracy and specificity of TWARIT are observed to be comparable or superior to those of existing methods. A web server implementing the TWARIT algorithm is available at http://metagenomics.atc.tcs.com/Twarit/

5.
In the collective genomes (the metagenome) of the microorganisms inhabiting the Earth’s diverse environments is written the history of life on this planet. New molecular tools developed and used for the past 15 years by microbial ecologists are facilitating the extraction, cloning, screening, and sequencing of these genomes. This approach allows microbial ecologists to access and study the full range of microbial diversity, regardless of our ability to culture organisms, and provides an unprecedented access to the breadth of natural products that these genomes encode. However, there is no way that the mere collection of sequences, no matter how expansive, can provide full coverage of the complex world of microbial metagenomes within the foreseeable future. Furthermore, although it is possible to fish out highly informative and useful genes from the sea of gene diversity in the environment, this can be a highly tedious and inefficient procedure. Microbial ecologists must be clever in their pursuit of ecologically relevant, valuable, and niche-defining genomic information within the vast haystack of microbial diversity. In this report, we seek to describe advances and prospects that will help microbial ecologists glean more knowledge from investigations into metagenomes. These include technological advances in sequencing and cloning methodologies, as well as improvements in annotation and comparative sequence analysis. More significant, however, will be ways to focus in on various subsets of the metagenome that may be of particular relevance, either by limiting the target community under study or improving the focus or speed of screening procedures. 
Lastly, given the cost and infrastructure necessary for large metagenome projects, and the almost inexhaustible amount of data they can produce, trends toward broader use of metagenome data across the research community coupled with the needed investment in bioinformatics infrastructure devoted to metagenomics will no doubt further increase the value of metagenomic studies in various environments.

6.
Phylogenetic diversity (patterns of phylogenetic relatedness among organisms in ecological communities) provides important insights into the mechanisms underlying community assembly. Studies that measure phylogenetic diversity in microbial communities have primarily been limited to a single marker gene approach, using the small subunit of the rRNA gene (SSU-rRNA) to quantify phylogenetic relationships among microbial taxa. In this study, we present an approach for inferring phylogenetic relationships among microorganisms based on the random metagenomic sequencing of DNA fragments. To overcome challenges caused by the fragmentary nature of metagenomic data, we leveraged fully sequenced bacterial genomes as a scaffold to enable inference of phylogenetic relationships among metagenomic sequences from multiple phylogenetic marker gene families. The resulting metagenomic phylogeny can be used to quantify the phylogenetic diversity of microbial communities based on metagenomic data sets. We applied this method to understand patterns of microbial phylogenetic diversity and community assembly along an oceanic depth gradient, and compared our findings to previous studies of this gradient using SSU-rRNA gene and metagenomic analyses. Bacterial phylogenetic diversity was highest at intermediate depths beneath the ocean surface, whereas taxonomic diversity (diversity measured by binning sequences into taxonomically similar groups) showed no relationship with depth. Phylogenetic diversity estimates based on the SSU-rRNA gene and the multi-gene metagenomic phylogeny were broadly concordant, suggesting that our approach will be applicable to other metagenomic data sets for which corresponding SSU-rRNA gene sequences are unavailable. Our approach opens up the possibility of using metagenomic data to study microbial diversity in a phylogenetic context.

7.
Plasmid diversity is still poorly understood in pelagic marine environments. Metagenomic approaches have the potential to reveal the genetic diversity of microbes actually present in an environment and the contribution of mobile genetic elements such as plasmids. By searching metagenomic datasets from flow cytometry-sorted coastal California seawater samples dominated by cyanobacteria (SynMeta) and from the Global Ocean Survey (GOS), putative marine plasmid sequences were identified, as well as their possible hosts in the same samples. Based on conserved plasmid replication protein sequences predicted from the SynMeta metagenomes, PCR primers were designed for amplification of one plasmid family and used to confirm that metagenomic contigs of this family were derived from plasmids. These results suggest that the majority of plasmids in SynMeta metagenomes were small and cryptic, encoding mostly their own replication proteins. In contrast, probable plasmid sequences identified in the GOS dataset showed more complexity, consistent with a much more diverse microbial population, and included genes involved in plasmid transfer, mobilization, stability and partitioning. Phylogenetic trees were constructed based on common replication protein functional domains and, even within one replication domain family, substantial diversity was found within and between different samples. However, some replication protein domain families appear to be rare in the marine environment.

8.
Metagenomics is a rapidly emerging field of research for studying microbial communities. To evaluate methods presently used to process metagenomic sequences, we constructed three simulated data sets of varying complexity by combining sequencing reads randomly selected from 113 isolate genomes. These data sets were designed to model real metagenomes in terms of complexity and phylogenetic composition. We assembled sampled reads using three commonly used genome assemblers (Phrap, Arachne and JAZZ), and predicted genes using two popular gene-finding pipelines (fgenesb and CRITICA/GLIMMER). The phylogenetic origins of the assembled contigs were predicted using one sequence similarity-based (blast hit distribution) and two sequence composition-based (PhyloPythia, oligonucleotide frequencies) binning methods. We explored the effects of the simulated community structure and method combinations on the fidelity of each processing step by comparison to the corresponding isolate genomes. The simulated data sets are available online to facilitate standardized benchmarking of tools for metagenomic analysis.
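The idea of building a simulated data set by drawing reads at random from known genomes can be illustrated with a minimal Python sketch. It is not the procedure used in the study above, which combined real sequencing reads from isolate genome projects; here reads are simply fixed-length substrings of toy sequences, and the function name `simulate_reads`, the abundance weights, and the genome strings are all hypothetical.

```python
import random

def simulate_reads(genomes, abundances, n_reads, read_len, seed=0):
    """Draw labelled reads from genomes in proportion to the given abundances.

    genomes:    dict mapping genome name -> sequence string
    abundances: dict mapping genome name -> relative weight (assumed input)
    Every genome must be at least read_len bases long.
    """
    rng = random.Random(seed)  # fixed seed so the toy data set is reproducible
    names = list(genomes)
    weights = [abundances[name] for name in names]
    reads = []
    for _ in range(n_reads):
        # Pick the source genome, then a uniformly random start position.
        name = rng.choices(names, weights=weights, k=1)[0]
        genome = genomes[name]
        start = rng.randrange(len(genome) - read_len + 1)
        reads.append((name, genome[start:start + read_len]))
    return reads

# Hypothetical toy genomes; real benchmarks would use isolate genome sequences.
toy_genomes = {"genome_a": "ACGT" * 10, "genome_b": "GGCC" * 10}
toy_reads = simulate_reads(toy_genomes, {"genome_a": 3, "genome_b": 1},
                           n_reads=20, read_len=8)
```

Keeping the source label with each read is what makes such a data set usable as a benchmark: the output of an assembler or binner can be compared against the known origin of every read.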

9.
A major research goal in microbial ecology is to understand the relationship between gene organization and function involved in environmental processes of potential interest. Given that more than an estimated 99% of microorganisms in most environments are not amenable to culturing, methods for culture-independent studies of genes of interest have been developed. The wealth of metagenomic approaches allows environmental microbiologists to directly explore the enormous genetic diversity of microbial communities. However, it is extremely difficult to obtain the appropriate sequencing depth of any particular gene that can entirely represent the complexity of microbial metagenomes and be able to draw meaningful conclusions about these communities. This review presents a summary of the metagenomic approaches that have been useful for collecting more information about specific genes. Specific subsets of metagenomes that focus on sequence analysis were selected in each metagenomic study. This 'targeted metagenomics' approach will provide extensive insight into the functional, ecological and evolutionary patterns of important genes found in microorganisms from various ecosystems.

10.
Bacteria and fungi are of utmost importance in determining environmental and host functioning. Despite close interactions between animals, plants, their associated microbiomes, and the environment they inhabit, the distribution and role of bacteria and especially fungi across hosts and environments as well as the cross-habitat determinants of their community compositions remain little investigated. Using a uniquely broad global dataset of 13,483 metagenomes, we analysed the microbiome structure and function of 25 host-associated and environmental habitats, focusing on potential interactions between bacteria and fungi. We found that the metagenomic relative abundance ratio of bacteria-to-fungi is a distinctive microbial feature of habitats. Compared with fungi, the cross-habitat distribution pattern of bacteria was more strongly driven by habitat type. Fungal diversity was depleted in host-associated communities compared with those in the environment, particularly terrestrial habitats, whereas this diversity pattern was less pronounced for bacteria. The relative gene functional potential of bacteria or fungi reflected their diversity patterns and appeared to depend on a balance between substrate availability and biotic interactions. Alongside helping to identify hotspots and sources of microbial diversity, our study provides support for differences in assembly patterns and processes between bacterial and fungal communities across different habitats.

11.
Viruses are the most abundant biological entities on the planet and play an important role in balancing microbes within an ecosystem and facilitating horizontal gene transfer. Although bacteriophages are abundant in rumen environments, little is known about the types of viruses present or their interaction with the rumen microbiome. We undertook random pyrosequencing of virus-enriched metagenomes (viromes) isolated from bovine rumen fluid and analysed the resulting data using comparative metagenomics. A high level of diversity was observed with up to 28,000 different viral genotypes obtained from each environment. The majority (~78%) of sequences did not match any previously described virus. Prophages outnumbered lytic phages approximately 2:1 with the most abundant bacteriophage and prophage types being associated with members of the dominant rumen phyla (Firmicutes and Proteobacteria). Metabolic profiling based on SEED subsystems revealed an enrichment of sequences with putative functional roles in DNA and protein metabolism, but a surprisingly low proportion of sequences assigned to carbohydrate and amino acid metabolism. We expanded our analysis to include previously described metagenomic data and 14 reference genomes. Clustered regularly interspaced short palindromic repeats (CRISPR) were detected in most of the microbial genomes, suggesting previous interactions between viral and microbial communities.

12.
The vast majority of microbial life remains uncatalogued due to the inability to cultivate these organisms in the laboratory. This “microbial dark matter” represents a substantial portion of the tree of life and of the populations that contribute to chemical cycling in many ecosystems. In this work, we leveraged an existing single-cell genomic data set representing the candidate bacterial phylum “Calescamantes” (EM19) to calibrate machine learning algorithms and define metagenomic bins directly from pyrosequencing reads derived from Great Boiling Spring in the U.S. Great Basin. Compared to other assembly-based methods, taxonomic binning with a read-based machine learning approach yielded final assemblies with the highest predicted genome completeness of any method tested. Read-first binning subsequently was used to extract Calescamantes bins from all metagenomes with abundant Calescamantes populations, including metagenomes from Octopus Spring and Bison Pool in Yellowstone National Park and Gongxiaoshe Spring in Yunnan Province, China. Metabolic reconstruction suggests that Calescamantes are heterotrophic, facultative anaerobes, which can utilize oxidized nitrogen sources as terminal electron acceptors for respiration in the absence of oxygen and use proteins as their primary carbon source. Despite their phylogenetic divergence, the geographically separate Calescamantes populations were highly similar in their predicted metabolic capabilities and core gene content, respiring O2 or oxidized nitrogen species for energy conservation in distant but chemically similar hot springs.

13.
ClaMS - "Classifier for Metagenomic Sequences" - is a Java application for binning assembled contigs in metagenomes using user-specified training sets and initial parameters. Since ClaMS trains on sequence composition-based genomic signatures, it is much faster than binning tools that rely on alignments to homologs; ClaMS can bin ~20,000 sequences in 3 minutes on a laptop with a 2.4 GHz Intel Core 2 Duo processor and 2 GB RAM. ClaMS is meant to be a desktop application for biologists and can be run on any machine under any Operating System on which the Java Runtime Environment can be installed.
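To make "binning on sequence composition-based signatures" concrete, the following is a minimal Python sketch of the general technique: each training set is reduced to a normalized k-mer frequency vector, and a contig is assigned to the training label with the nearest signature. It does not reproduce ClaMS itself, whose signatures and distance measure are not described in the abstract; the names `kmer_signature` and `assign_bin`, the Euclidean distance, the value of `K`, and the toy training sequences are all assumptions made for illustration.

```python
from collections import Counter
from itertools import product
import math

K = 2  # toy value; tetranucleotides (k=4) are a common choice for real contigs

def kmer_signature(seq, k=K):
    """Return a normalized k-mer frequency vector over the ACGT alphabet."""
    seq = seq.upper()
    kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    # Windows containing other symbols (e.g. N) are not in `kmers` and are ignored.
    total = sum(counts[m] for m in kmers)
    if total == 0:
        return [0.0] * len(kmers)
    return [counts[m] / total for m in kmers]

def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def assign_bin(contig, training, k=K):
    """Assign a contig to the training label whose signature is closest."""
    sig = kmer_signature(contig, k)
    return min(training, key=lambda label: euclidean(sig, training[label]))

# Hypothetical training sets; a real run would use reference genome sequences.
training = {
    "AT_rich": kmer_signature("AATT" * 10),
    "GC_rich": kmer_signature("GGCC" * 10),
}
print(assign_bin("ATTAATTAAT", training))
```

No alignment is performed at any point, which is why composition-based approaches of this kind can be fast: the cost per contig is one pass over the sequence plus one distance computation per training set.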

14.
Previous studies have shown that dinucleotide abundances capture the majority of variation in genome signatures and are useful for quantifying lateral gene transfer and building molecular phylogenies. Metagenomes contain a mixture of individual genomes, and might be expected to lack compositional signatures. In many metagenomic data sets the majority of sequences have no significant similarities to known sequences and are effectively excluded from subsequent analyses. To circumvent this limitation, di-, tri- and tetranucleotide abundances of 86 microbial and viral metagenomes consisting of short pyrosequencing reads were analysed to provide a method which includes all sequences that can be used in combination with other analyses to increase our knowledge about microbial and viral communities. Both principal component analysis and hierarchical clustering showed definitive groupings of metagenomes drawn from similar environments. Together these analyses showed that dinucleotide composition, as opposed to tri- and tetranucleotides, defines a metagenomic signature which can explain up to 80% of the variance between biomes, which is comparable to that obtained by functional genomics. Metagenomes with anomalous content were also identified using dinucleotide abundances. Subsequent analyses determined that these metagenomes were contaminated with exogenous DNA, suggesting that this approach is a useful metric for quality control. The predictive strength of the dinucleotide composition also opens the possibility of assigning ecological classifications to unknown fragments. Environmental selection may be responsible for this dinucleotide signature through direct selection of specific compositional signals; however, simulations suggest that the environment may select indirectly by promoting the increased abundance of a few dominant taxa.
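A dinucleotide signature of a read set can be computed in a few lines. The Python sketch below uses one common formulation, the odds ratio of the observed dinucleotide frequency to the product of the two mononucleotide frequencies; whether the study above used exactly this normalization is not stated in the abstract, so the formula, the function name `dinucleotide_odds_ratios`, and the single-strand counting are assumptions.

```python
from collections import Counter
from itertools import product

def dinucleotide_odds_ratios(reads):
    """Return {dinucleotide: observed / expected} pooled over all reads.

    observed = frequency of the dinucleotide among all ACGT dinucleotides
    expected = product of the two mononucleotide frequencies
    """
    mono = Counter()
    di = Counter()
    for read in reads:
        read = read.upper()
        mono.update(b for b in read if b in "ACGT")
        # Count only windows made of unambiguous bases.
        di.update(read[i:i + 2] for i in range(len(read) - 1)
                  if read[i] in "ACGT" and read[i + 1] in "ACGT")
    n_mono = sum(mono.values())
    n_di = sum(di.values())
    ratios = {}
    for x, y in product("ACGT", repeat=2):
        expected = (mono[x] / n_mono) * (mono[y] / n_mono) if n_mono else 0.0
        observed = di[x + y] / n_di if n_di else 0.0
        ratios[x + y] = observed / expected if expected > 0 else 0.0
    return ratios

# Toy input only; a metagenome would contribute many thousands of reads.
signature = dinucleotide_odds_ratios(["ACGT"])
```

The resulting 16-value vector per metagenome is the kind of input that principal component analysis or hierarchical clustering can then operate on.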

15.
Microbial ecologists can now start digging into the accumulating mountains of metagenomic data to uncover the occurrence of functional genes and their correlations to microbial community members. Limitations and biases in DNA extraction and sequencing technologies impact sequence distributions, and therefore, have to be considered. However, when comparing metagenomes from widely differing environments, these fluctuations have a relatively minor role in microbial community discrimination. As a consequence, any functional gene or species distribution pattern can be compared among metagenomes originating from various environments and projects. In particular, global comparisons would help to define ecosystem specificities, such as involvement and response to climate change (for example, carbon and nitrogen cycle), human health risks (e.g., presence of pathogen species, toxin genes and viruses) and biodegradation capacities. Although not all scientists have easy access to high-throughput sequencing technologies, they do have access to the sequences that have been deposited in databases, and therefore, can begin to intensively mine these metagenomic data to generate hypotheses that can be validated experimentally. Information about metabolic functions and microbial species compositions can already be compared among metagenomes from different ecosystems. These comparisons add to our understanding about microbial adaptation and the role of specific microbes in different ecosystems. Concurrent with the rapid growth of sequencing technologies, we have entered a new age of microbial ecology, which will enable researchers to experimentally confirm putative relationships between microbial functions and community structures.

16.
Surveying microbial diversity and function is accomplished by combining complementary molecular tools. Among them, metagenomics is a PCR-free approach that contains all genetic information from microbial assemblages and is today performed at a relatively large scale and reasonable cost, mostly based on very short reads. Here, we investigated the potential of metagenomics to provide taxonomic reports of marine microbial eukaryotes. We prepared a curated database with reference sequences of the V4 region of 18S rDNA clustered at 97% similarity and used this database to extract and classify metagenomic reads. More than half of them were unambiguously affiliated to a unique reference whilst the rest could be assigned to a given taxonomic group. The overall diversity reported by metagenomics was similar to that obtained by amplicon sequencing of the V4 and V9 regions of the 18S rRNA gene, although either one or both of these amplicon surveys performed poorly for groups like Excavata, Amoebozoa, Fungi and Haptophyta. We then studied the diversity of picoeukaryotes and nanoeukaryotes using 91 metagenomes from surface down to bathypelagic layers in different oceans, unveiling a clear taxonomic separation between size fractions and depth layers. Finally, we retrieved long rDNA sequences from assembled metagenomes that improved phylogenetic reconstructions of particular groups. Overall, this study shows metagenomics as an excellent resource for taxonomic exploration of marine microbial eukaryotes.

17.
Metagenomic analyses: past and future trends   Total citations: 2, self-citations: 0, citations by others: 2

18.
Assembling microbial and viral genomes from metagenomes is a powerful and appealing method to understand structure–function relationships in complex environments. To compare the recovery of genomes from microorganisms and their viruses from groundwater, we generated shotgun metagenomes with Illumina sequencing accompanied by long reads derived from the Oxford Nanopore Technologies (ONT) sequencing platform. Assembly and metagenome-assembled genome (MAG) metrics for both microbes and viruses were determined from an Illumina-only assembly, ONT-only assembly, and a hybrid assembly approach. The hybrid approach recovered 2× more mid to high-quality MAGs compared to the Illumina-only approach and 4× more than the ONT-only approach. A similar number of viral genomes were reconstructed using the hybrid and ONT methods, and both recovered nearly fourfold more viral genomes than the Illumina-only approach. While yielding fewer MAGs, the ONT-only approach generated MAGs with a high probability of containing rRNA genes, 3× higher than either of the other methods. Of the shared MAGs recovered from each method, the ONT-only approach generated the longest and least fragmented MAGs, while the hybrid approach yielded the most complete. This work provides quantitative data to inform a cost–benefit analysis of the decision to supplement shotgun metagenomic projects with long reads towards the goal of recovering genomes from environmentally abundant groups.

19.
Microbial community succession was examined over a two-year period using spatially and temporally coordinated water chemistry measurements, metagenomic sequencing, phylogenetic binning and de novo metagenomic assembly in the extreme hypersaline habitat of Lake Tyrrell, Victoria, Australia. Relative abundances of Haloquadratum-related sequences were positively correlated with co-varying concentrations of potassium, magnesium and sulfate, but not sodium, chloride or calcium ions, while relative abundances of Halorubrum, Haloarcula, Halonotius, Halobaculum and Salinibacter-related sequences correlated negatively with Haloquadratum and these same ionic factors. Nanohaloarchaea and Halorhabdus-related sequence abundances were inversely correlated with each other, but not other taxonomic groups. These data, along with predicted gene functions from nearly-complete assembled population metagenomes, suggest different ecological phenotypes for Nanohaloarchaea and Halorhabdus-related strains versus other community members. Nucleotide percent G+C compositions were consistently lower in community metagenomic reads from summer versus winter samples. The same seasonal G+C trends were observed within taxonomically binned read subsets from each of seven different genus-level archaeal groups. Relative seasonal abundances were also linked to percent G+C for assembled population genomes. Together, these data suggest that extreme ionic conditions may exert selective pressure on archaeal populations at the level of genomic nucleotide composition, thus contributing to seasonal successional processes. Despite the unavailability of cultured representatives for most of the organisms identified in this study, effective coordination of physical and biological measurements has enabled discovery and quantification of unexpected taxon-specific, environmentally mediated factors influencing microbial community structure.
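Percent G+C of a read set, the quantity compared between seasons above, is straightforward to compute. The following Python sketch pools all reads of a sample and ignores ambiguous bases; the function name `gc_percent` and the choice to exclude non-ACGT symbols from the denominator are assumptions, and the toy reads are purely illustrative and carry no data from the study.

```python
def gc_percent(reads):
    """Return percent G+C pooled over all reads, ignoring non-ACGT symbols."""
    gc = 0
    total = 0
    for read in reads:
        read = read.upper()
        gc += read.count("G") + read.count("C")
        total += sum(read.count(base) for base in "ACGT")
    # An empty sample has no defined composition; report 0.0 by convention here.
    return 100.0 * gc / total if total else 0.0

# Hypothetical reads for illustration only.
toy_sample = ["GGCC", "AATT"]
toy_gc = gc_percent(toy_sample)
```

Applying the same function to the reads of each taxonomic bin, rather than to the whole sample, would give the per-group values needed for a within-taxon comparison of the kind described in the abstract.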

20.
Microorganisms constitute two thirds of the Earth's biological diversity. As many as 99% of the microorganisms present in certain environments cannot be cultured by standard techniques. Culture-independent methods are required to understand the genetic diversity, population structure and ecological roles of the majority of organisms. Metagenomics is the genomic analysis of microorganisms by direct extraction and cloning of DNA from their natural environment. Protocols have been developed to capture unexplored microbial diversity to overcome the existing barriers in estimation of diversity. New screening methods have been designed to select specific functional genes within metagenomic libraries to detect novel biocatalysts as well as bioactive molecules applicable to mankind. To study the complete gene or operon clusters, various vectors including cosmid, fosmid or bacterial artificial chromosomes are being developed. Bioinformatics tools and databases have added much to the study of microbial diversity. This review describes the various methodologies and tools developed to understand the biology of uncultured microbes including bacteria, archaea and viruses through metagenomic analysis.
