Similar literature
 Found 20 similar articles; search time 734 ms
1.
Assembling individual genomes from complex community metagenomic data remains a challenging issue for environmental studies. We evaluated the quality of genome assemblies from community short read data (Illumina 100 bp paired-end sequences) using datasets recovered from freshwater and soil microbial communities as well as in silico simulations. Our analyses revealed that the genome of a single genotype (or species) can be accurately assembled from a complex metagenome when it shows at least about 20× coverage. At lower coverage, however, the derived assemblies contained a substantial fraction of non-target sequences (chimeras), which explains, at least in part, the higher number of hypothetical genes recovered in metagenomic relative to genomic projects. We also provide examples of how to detect intrapopulation structure in metagenomic datasets and estimate the type and frequency of errors in assembled genes and contigs from datasets of varied species complexity.
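
The coverage threshold mentioned in this abstract can be related to sequencing effort with the usual approximation coverage = (number of reads × read length) / genome size. The sketch below is only an illustration of that arithmetic, not the authors' method; the function names and the 5 Mb genome size are hypothetical, and in a metagenome the read count refers to reads originating from the target genome, not from the whole community.

```python
import math


def expected_coverage(num_reads, read_length, genome_size):
    """Average per-base coverage: total sequenced bases / genome length."""
    return num_reads * read_length / genome_size


def reads_needed(target_coverage, read_length, genome_size):
    """Smallest number of reads that reaches the target average coverage."""
    return math.ceil(target_coverage * genome_size / read_length)


# Hypothetical example: 100 bp reads and an assumed 5 Mb target genome.
print(expected_coverage(1_000_000, 100, 5_000_000))
print(reads_needed(20, 100, 5_000_000))
```

Under these assumptions, roughly one million 100 bp reads from the target genome would correspond to the 20× level cited in the abstract.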

2.
The gene neighborhood in prokaryotic genomes has been effectively utilized in inferring co-functional networks in various organisms. Previously, such genomic context information has been sought among completely assembled prokaryotic genomes. Here, we present a method to infer functional gene networks according to the gene neighborhood in metagenome contigs, which are incompletely assembled genomic fragments. Given that the amount of metagenome sequence data has now surpassed that of completely assembled prokaryotic genomes in the public domain, we expect benefits of inferring networks by the metagenome-based gene neighborhood. We generated co-functional networks for diverse taxonomical species using metagenomic contigs derived from the human microbiome and the ocean microbiome. We found that the networks based on the metagenome gene neighborhood outperformed those based on 1748 completely assembled prokaryotic genomes. We also demonstrated that the metagenome-based gene neighborhood could predict genes related to virulence-associated phenotypes in a bacterial pathogen, indicating that metagenome-based functional links could be sufficiently predictive for some phenotypes of medical importance. Owing to the exponential growth of metagenome sequence data in public repositories, metagenome-based inference of co-functional networks will facilitate understanding of gene functions and pathways in diverse species.

3.
Although remarkable progress in metagenomic sequencing of various environmental samples has been made, large numbers of fragment sequences have been registered in the international DNA databanks, primarily without information on gene function and phylotype, and thus with limited usefulness. Industrially useful biological activity is often carried out by a set of genes, such as those constituting an operon. In this connection, metagenomic approaches have a weakness because such gene sets are usually split up, since the sequences obtained by metagenome analyses are fragmented into 1-kb or much shorter segments. Therefore, even when a set of genes responsible for an industrially useful function is found in one metagenome library, it is usually difficult to know whether a single genome harbors the entire gene set or whether different genomes have individual genes. By modifying the Self-Organizing Map (SOM), we previously developed BLSOM for oligonucleotide composition, which allowed classification (self-organization) of sequence fragments according to genomes. Because BLSOM could reassociate genomic fragments according to genomes, BLSOM may ameliorate the abovementioned weakness of metagenome analyses. Here, we have developed a strategy for clustering of metagenomic sequences according to phylotypes and genomes, by testing a gene set contributing to environment preservation.

4.
Accessing the soil metagenome for studies of microbial diversity   (Cited by: 1 total, 0 self-citations, 1 by others)
Soil microbial communities contain the highest level of prokaryotic diversity of any environment, and metagenomic approaches involving the extraction of DNA from soil can improve our access to these communities. Most analyses of soil biodiversity and function assume that the DNA extracted represents the microbial community in the soil, but subsequent interpretations are limited by the DNA recovered from the soil. Unfortunately, extraction methods do not provide a uniform and unbiased subsample of metagenomic DNA, and as a consequence, accurate species distributions cannot be determined. Moreover, any bias will propagate errors in estimations of overall microbial diversity and may exclude some microbial classes from study and exploitation. To improve metagenomic approaches, investigate DNA extraction biases, and provide tools for assessing the relative abundances of different groups, we explored the biodiversity of the accessible community DNA by fractioning the metagenomic DNA as a function of (i) vertical soil sampling, (ii) density gradients (cell separation), (iii) cell lysis stringency, and (iv) DNA fragment size distribution. Each fraction had a unique genetic diversity, with different predominant and rare species (based on ribosomal intergenic spacer analysis [RISA] fingerprinting and phylochips). All fractions contributed to the number of bacterial groups uncovered in the metagenome, thus increasing the DNA pool for further applications. Indeed, we were able to access a more genetically diverse proportion of the metagenome (a gain of more than 80% compared to the best single extraction method), limit the predominance of a few genomes, and increase the species richness per sequencing effort. This work stresses the difference between extracted DNA pools and the currently inaccessible complete soil metagenome.

5.
Species’ responses at the genetic level are key to understanding the long‐term consequences of anthropogenic global change. Herbaria document such responses, and, with contemporary sampling, provide high‐resolution time‐series of plant evolutionary change. Characterizing genetic diversity is straightforward for model species with small genomes and a reference sequence. For nonmodel species—with small or large genomes—diversity is traditionally assessed using restriction‐enzyme‐based sequencing. However, age‐related DNA damage and fragmentation preclude the use of this approach for ancient herbarium DNA. Here, we combine reduced‐representation sequencing and hybridization‐capture to overcome this challenge and efficiently compare contemporary and historical specimens. Specifically, we describe how homemade DNA baits can be produced from reduced‐representation libraries of fresh samples, and used to efficiently enrich historical libraries for the same fraction of the genome to produce compatible sets of sequence data from both types of material. Applying this approach to both Arabidopsis thaliana and the nonmodel plant Cardamine bulbifera, we discovered polymorphisms de novo in an unbiased, reference‐free manner. We show that the recovered genetic variation recapitulates known genetic diversity in A. thaliana, and recovers geographical origin in both species and over time, independent of bait diversity. Hence, our method enables fast, cost‐efficient, large‐scale integration of contemporary and historical specimens for assessment of genome‐wide genetic trends over time, independent of genome size and presence of a reference genome.

6.
7.
8.
Marine mollusc shells enclose a wealth of information on coastal organisms and their environment. Their life history traits as well as (palaeo‐) environmental conditions, including temperature, food availability, salinity and pollution, can be traced through the analysis of their shell (micro‐) structure and biogeochemical composition. Adding to this list, the DNA entrapped in shell carbonate biominerals potentially offers a novel and complementary proxy both for reconstructing palaeoenvironments and tracking mollusc evolutionary trajectories. Here, we assess this potential by applying DNA extraction, high‐throughput shotgun DNA sequencing and metagenomic analyses to marine mollusc shells spanning the last ~7,000 years. We report successful DNA extraction from shells, including a variety of ancient specimens, and find that DNA recovery is highly dependent on their biomineral structure, carbonate layer preservation and disease state. We demonstrate positive taxonomic identification of mollusc species using a combination of mitochondrial DNA genomes, barcodes, genome‐scale data and metagenomic approaches. We also find shell biominerals to contain a diversity of microbial DNA from the marine environment. Finally, we reconstruct genomic sequences of organisms closely related to the Vibrio tapetis bacteria from Manila clam shells previously diagnosed with Brown Ring Disease. Our results reveal marine mollusc shells as novel genetic archives of the past, which opens new perspectives in ancient DNA research, with the potential to reconstruct the evolutionary history of molluscs, microbial communities and pathogens in the face of environmental changes. Other future applications include conservation of endangered mollusc species and aquaculture management.

9.
Metagenomic Characterization of Chesapeake Bay Virioplankton   (Cited by: 7 total, 1 self-citation, 6 by others)
Viruses are ubiquitous and abundant throughout the biosphere. In marine systems, virus-mediated processes can have significant impacts on microbial diversity and on global biogeochemical cycling. However, viral genetic diversity remains poorly characterized. To address this shortcoming, a metagenomic library was constructed from Chesapeake Bay virioplankton. The resulting sequences constitute the largest collection of long-read double-stranded DNA (dsDNA) viral metagenome data reported to date. BLAST homology comparisons showed that Chesapeake Bay virioplankton contained a high proportion of unknown (homologous only to environmental sequences) and novel (no significant homolog) sequences. This analysis suggests that dsDNA viruses are likely one of the largest reservoirs of unknown genetic diversity in the biosphere. The taxonomic origin of BLAST homologs to viral library sequences agreed well with reported abundances of cooccurring bacterial subphyla within the estuary and indicated that cyanophages were abundant. However, the low proportion of Siphophage homologs contradicts a previous assertion that this family comprises most bacteriophage diversity. Identification and analyses of cyanobacterial homologs of the psbA gene illustrated the value of metagenomic studies of virioplankton. The phylogeny of inferred PsbA protein sequences suggested that Chesapeake Bay cyanophage strains are endemic in that environment. The ratio of psbA homologous sequences to total cyanophage sequences in the metagenome indicated that the psbA gene may be nearly universal in Chesapeake Bay cyanophage genomes. Furthermore, the low frequency of psbD homologs in the library supports the prediction that Chesapeake Bay cyanophage populations are dominated by Podoviridae.

10.
We have analyzed metagenomic fosmid clones from the deep chlorophyll maximum (DCM), which, by genomic parameters, correspond to the 16S ribosomal RNA (rRNA)-defined marine Euryarchaeota group IIB (MGIIB). The fosmid collections associated with this group add up to 4 Mb and correspond to at least two species within this group. From the proposed essential genes contained in the collections, we infer that large sections of the conserved regions of the genomes of these microbes have been recovered. The genomes indicate a photoheterotrophic lifestyle, similar to that of the available genome of MGIIA (assembled from an estuarine metagenome in Puget Sound, Washington Pacific coast), with a proton-pumping rhodopsin of the same kind. Several genomic features support an aerobic metabolism with diversified substrate degradation capabilities that include xenobiotics and agar. On the other hand, these MGIIB representatives are non-motile and possess similar genome size to the MGIIA-assembled genome, but with a lower GC content. The large phylogenomic gap with other known archaea indicates that this is a new class of marine Euryarchaeota for which we suggest the name Thalassoarchaea. The analysis of recruitment from available metagenomes indicates that the representatives of group IIB described here are largely found at the DCM (ca. 50 m deep), in which they are abundant (up to 0.5% of the reads), and at the surface mostly during the winter mixing, which explains formerly described 16S rRNA distribution patterns. Their uneven representation in environmental samples that are close in space and time might indicate sporadic blooms.

11.
The extent to which cultured strains represent the genetic diversity of a population of microorganisms is poorly understood. Because they do not require culturing, metagenomic approaches have the potential to reveal the genetic diversity of the microbes actually present in an environment. From coastal California seawater, a complex and diverse environment, the marine cyanobacteria of the genus Synechococcus were enriched by flow cytometry-based sorting and the population metagenome was analysed with 454 sequencing technology. The sequence data were compared with model Synechococcus genomes, including those of two coastal strains, one isolated from the same and one from a very similar environment. The natural population metagenome had high sequence identity to most genes from the coastal model strains but diverged greatly from these genomes in multiple regions of atypical trinucleotide content that encoded diverse functions. These results can be explained by extensive horizontal gene transfer presumably with large differences in horizontally transferred genetic material between different strains. Some assembled contigs showed the presence of novel open reading frames not found in the model genomes, but these could not yet be unambiguously assigned to a Synechococcus clade. At least three distinct mobile DNA elements (plasmids) not found in model strain genomes were detected in the assembled contigs, suggesting for the first time their likely importance in marine cyanobacterial populations and possible role in horizontal gene transfer.

12.
Sulfate‐reducing methanotrophy by anaerobic methanotrophic archaea (ANME) and sulfate‐reducing bacteria (SRB) is a major biological sink of methane in anoxic methane‐enriched marine sediments. The physiology of a microbial community dominated by free‐living ANME‐1 at 14–16 cm below the seafloor in the G11 pockmark at Nyegga was investigated by integrated metagenomic and metaproteomic approaches. Total DNA was subjected to 454‐pyrosequencing (829 527 reads), and 16.6 Mbp of sequence information was assembled into 27 352 contigs. Taxonomic analysis supported a high abundance of Euryarchaea (70%) with 66% of the assembled metagenome belonging to ANME‐1. Extracted sediment proteins were separated in two dimensions and subjected to mass spectrometry (LTQ‐Orbitrap XL). Of 356 identified proteins, 245 were expressed by ANME‐1. These included proteins for cold‐adaptation and production of gas vesicles, reflecting both the adaptation of the ANME‐1 community to a permanently cold environment and its potential for positioning in specific sediment depths, respectively. In addition, key metabolic enzymes including the enzymes in the reverse methanogenesis pathway (except N5,N10‐methylene‐tetrahydromethanopterin reductase), heterodisulfide reductases and the F420H2:quinone oxidoreductase (Fqo) complex were identified. A complete dissimilatory sulfate reduction pathway was expressed by sulfate‐reducing Deltaproteobacteria. Interestingly, an APS‐reductase comprising Gram‐positive SRB and related sequences were identified in the proteome. Overall, the results demonstrated that our approach was effective in assessing in situ metabolic processes in cold seep sediments.

13.
Complete genomes can be recovered from metagenomes by assembling and binning DNA sequences into metagenome assembled genomes (MAGs). Yet, the presence of microdiversity can hamper the assembly and binning processes, possibly yielding chimeric, highly fragmented and incomplete genomes. Here, the metagenomes of four samples of aerobic granular sludge bioreactors containing Candidatus (Ca.) Accumulibacter, a phosphate-accumulating organism of interest for wastewater treatment, were sequenced with both PacBio and Illumina. Different strategies of genome assembly and binning were investigated, including published protocols and a binning procedure adapted to the binning of long contigs (MuLoBiSC). Multiple criteria were considered to select the best strategy for Ca. Accumulibacter, whose multiple strains in every sample represent a challenging microdiversity. In this case, the best strategy relies on long-read only assembly and a custom binning procedure including MuLoBiSC in metaWRAP. Several high-quality Ca. Accumulibacter MAGs, including a novel species, were obtained independently from different samples. Comparative genomic analysis showed that MAGs retrieved in different samples harbour genomic rearrangements in addition to accumulation of point mutations. The microdiversity of Ca. Accumulibacter, likely driven by mobile genetic elements, causes major difficulties in recovering MAGs, but it is also a hallmark of the panmictic lifestyle of these bacteria.

14.
Next‐generation sequencing (NGS) is emerging as an efficient and cost‐effective tool in population genomic analyses of nonmodel organisms, allowing simultaneous resequencing of many regions of multi‐genomic DNA from multiplexed samples. Here, we detail our synthesis of protocols for targeted resequencing of mitochondrial and nuclear loci by generating indexed genomic libraries for multiplexing up to 100 individuals in a single sequencing pool, and then enriching the pooled library using custom DNA capture arrays. Our use of DNA sequence from one species to capture and enrich the sequencing libraries of another species (i.e. cross‐species DNA capture) indicates that efficient enrichment occurs when sequences are up to about 12% divergent, allowing us to take advantage of genomic information in one species to sequence orthologous regions in related species. In addition to a complete mitochondrial genome on each array, we have included between 43 and 118 nuclear loci for low‐coverage sequencing of between 18 kb and 87 kb of DNA sequence per individual for single nucleotide polymorphism discovery from 50 to 100 individuals in a single sequencing lane. Using this method, we have generated a total of over 500 whole mitochondrial genomes from seven cetacean species and green sea turtles. The greater variation detected in mitogenomes relative to short mtDNA sequences is helping to resolve genetic structure ranging from geographic to species‐level differences. These NGS and analysis techniques have allowed for simultaneous population genomic studies of mtDNA and nDNA with greater genomic coverage and phylogeographic resolution than has previously been possible in marine mammals and turtles.

15.
Many economically important crops have large and complex genomes that hamper their sequencing by standard methods such as whole genome shotgun (WGS). Large tracts of methylated repeats occur in plant genomes that are interspersed by hypomethylated gene‐rich regions. Gene‐enrichment strategies based on methylation profiles offer an alternative to sequencing repetitive genomes. Here, we have applied methyl filtration with McrBC endonuclease digestion to enrich for euchromatic regions in the sugarcane genome. To verify the efficiency of methylation filtration and the assembly quality of sequences submitted to gene‐enrichment strategy, we have compared assemblies using methyl‐filtered (MF) and unfiltered (UF) libraries. The use of methyl filtration allowed a better assembly by filtering out 35% of the sugarcane genome and by producing 1.5× more scaffolds and 1.7× more assembled Mb in length compared with the unfiltered dataset. The coverage of sorghum coding sequences (CDS) by MF scaffolds was at least 36% higher than by the use of UF scaffolds. Using MF technology, we increased by 134× the coverage of gene regions of the monoploid sugarcane genome. The MF reads assembled into scaffolds that covered all genes of the sugarcane bacterial artificial chromosomes (BACs), 97.2% of sugarcane expressed sequence tags (ESTs), 92.7% of sugarcane RNA‐seq reads and 98.4% of sorghum protein sequences. Analysis of MF scaffolds from encoded enzymes of the sucrose/starch pathway discovered 291 single‐nucleotide polymorphisms (SNPs) in the wild sugarcane species, S. spontaneum and S. officinarum. A large number of microRNA genes was also identified in the MF scaffolds. The information achieved by the MF dataset provides a valuable tool for genomic research in the genus Saccharum and for improvement of sugarcane as a biofuel crop.

16.
The methanogenic endosymbionts of anaerobic protists represent the only known intracellular archaea, yet almost nothing is known about genome structure and content in these lineages. Here, an almost complete genome of an intracellular Methanobacterium species was assembled from a metagenome derived from its host ciliate, a Heterometopus species. Phylogenomic analysis showed that the endosymbiont was closely related to free‐living Methanobacterium isolates, and when compared with the genomes of free‐living Methanobacterium, the endosymbiont did not show significant reduction in genome size or GC content. Additionally, the Methanobacterium endosymbiont genome shared the majority of its genes with its closest relative, though it did also contain unique genes possibly involved in interactions with the host via membrane‐associated proteins, the removal of toxic by‐products from host metabolism and the production of small signalling molecules. Though anaerobic ciliates have been shown to transmit their endosymbionts to daughter cells during division, the results presented here could suggest that the endosymbiotic Methanobacterium did not experience significant genetic isolation or drift and/or that this lineage was only recently acquired. Altogether, comparative genomic analysis identified genes potentially involved in the establishment and maintenance of the symbiosis, as well as provided insight into the genomic consequences for an intracellular archaeum.

17.
Little is known about early plastic biofilm assemblage dynamics and successional changes over time. By incubating virgin microplastics along oceanic transects and comparing adhered microbial communities with those of naturally occurring plastic litter at the same locations, we constructed gene catalogues to contrast the metabolic differences between early and mature biofilm communities. Early colonization incubations were reproducibly dominated by Alteromonadaceae and harboured significantly higher proportions of genes associated with adhesion, biofilm formation, chemotaxis, hydrocarbon degradation and motility. Comparative genomic analyses among the Alteromonadaceae metagenome assembled genomes (MAGs) highlighted the importance of the mannose-sensitive hemagglutinin (MSHA) operon, recognized as a key factor for intestinal colonization, for early colonization of hydrophobic plastic surfaces. Synteny alignments of MSHA also demonstrated positive selection for mshA alleles across all MAGs, suggesting that mshA provides a competitive advantage for surface colonization and nutrient acquisition. Large-scale genomic characteristics of early colonizers varied little, despite environmental variability. Mature plastic biofilms were composed of predominantly Rhodobacteraceae and displayed significantly higher proportions of carbohydrate hydrolysis enzymes and genes for photosynthesis and secondary metabolism. Our metagenomic analyses provide insight into early biofilm formation on plastics in the ocean and how early colonizers self-assemble, compared to mature, phylogenetically and metabolically diverse biofilms.

18.
Metagenomics is an emerging field in which the power of genomic analysis is applied to an entire microbial community, bypassing the need to isolate and culture individual microbial species. Assembly of metagenomic DNA fragments is very much like the overlap-layout-consensus procedure for assembling isolated genomes, but is augmented by an additional binning step to differentiate scaffolds, contigs and unassembled reads into various taxonomic groups. In this paper, we employed n-mer oligonucleotide frequencies as the features and developed a hierarchical classifier (PCAHIER) for binning short (≤ 1,000 bp) metagenomic fragments. Principal component analysis was used to reduce the high dimensionality of the feature space. The hierarchical classifier consists of four layers of local classifiers that are implemented based on linear discriminant analysis. These local classifiers are responsible for binning prokaryotic DNA fragments into superkingdoms, of the same superkingdom into phyla, of the same phylum into genera, and of the same genus into species, respectively. We evaluated the performance of PCAHIER by using our own simulated data sets as well as the widely used simHC synthetic metagenome data set from the IMG/M system. The effectiveness of PCAHIER was demonstrated through comparisons against a non-hierarchical classifier, and two existing binning algorithms (TETRA and Phylopythia).
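
The feature-extraction step named in this abstract (n-mer oligonucleotide frequencies) can be sketched as below. This is a minimal illustration under stated assumptions, not the PCAHIER implementation: the function name `kmer_frequencies`, the default `k=4`, and the choice to skip windows containing ambiguous bases are all hypothetical. The PCA and linear discriminant analysis layers described in the abstract would operate on vectors like these and are not shown.

```python
from collections import Counter
from itertools import product


def kmer_frequencies(seq, k=4):
    """Return normalized k-mer frequencies of a DNA fragment.

    The vector has 4**k entries, ordered lexicographically over ACGT.
    Windows containing other characters (e.g. N) are ignored.
    """
    seq = seq.upper()
    kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    total = sum(counts[m] for m in kmers)  # only unambiguous windows
    if total == 0:
        return [0.0] * len(kmers)
    return [counts[m] / total for m in kmers]


# Hypothetical fragment; dinucleotides keep the vector short (16 entries).
features = kmer_frequencies("ACGTACGT", k=2)
print(len(features), features[1])  # features[1] is the frequency of "AC"
```

Normalizing by the number of valid windows makes fragments of different lengths comparable, which matters when binning short reads of varying size.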

19.
Plasmids have long been recognized as an important driver of DNA exchange and genetic innovation in prokaryotes. The success of plasmids has been attributed to their independent replication from the host's chromosome and their frequent self-transfer. It is thought that plasmids accumulate, rearrange and distribute nonessential genes, which may provide an advantage for host proliferation under selective conditions. In order to test this hypothesis independently of biases from culture selection, we study the plasmid metagenome from microbial communities in two activated sludge systems, one of which receives mostly household and the other chemical industry wastewater. We find that plasmids from activated sludge microbial communities carry among the largest proportion of unknown gene pools so far detected in metagenomic DNA, confirming their presumed role of DNA innovators. At a system level both plasmid metagenomes were dominated by functions associated with replication and transposition, and contained a wide variety of antibiotic and heavy metal resistances. Plasmid families were very different in the two metagenomes and grouped in deep-branching new families compared with known plasmid replicons. A number of abundant plasmid replicons could be completely assembled directly from the metagenome, providing insight in plasmid composition without culturing bias. Functionally, the two metagenomes strongly differed in several ways, including a greater abundance of genes for carbohydrate metabolism in the industrial and of general defense factors in the household activated sludge plasmid metagenome. This suggests that plasmids not only contribute to the adaptation of single individual prokaryotic species, but of the prokaryotic community as a whole under local selective conditions.

20.
In the epipelagic ocean, the genus Oithona is considered as one of the most abundant and widespread copepods and plays an important role in the trophic food web. Despite its ecological importance, little is known about Oithona and cyclopoid copepod genomics. Therefore, we sequenced, assembled and annotated the genome of Oithona nana. The comparative genomic analysis integrating available copepod genomes highlighted the expansions of genes related to stress response, cell differentiation and development, including genes coding Lin12‐Notch‐repeat (LNR) domain proteins. The Oithona biogeography based on 28S sequences and metagenomic reads from the Tara Oceans expedition showed the presence of O. nana mostly in the Mediterranean Sea (MS) and confirmed the amphitropical distribution of Oithona similis. The population genomics analyses of O. nana in the Northern MS, integrating the Tara Oceans metagenomic data and the O. nana genome, led to the identification of genetic structure between populations from the MS basins. Furthermore, 20 loci were found to be under positive selection including four missense and eight synonymous variants, harbouring soft or hard selective sweep patterns. One of the missense variants was localized in the LNR domain of the coding region of a male‐specific gene. The variation in the B‐allele frequency with respect to the MS circulation pattern showed the presence of genomic clines between O. nana and another undefined Oithona species possibly imported through Atlantic waters. This study provides new approaches and results in zooplankton population genomics through the integration of metagenomic and oceanographic data.
