Similar articles
1.
Metagenomic studies sequence DNA directly from environmental samples to explore the structure and function of complex microbial and viral communities. Individual, short pieces of sequenced DNA (“reads”) are classified into (putative) taxonomic or metabolic groups which are analyzed for patterns across samples. Analysis of such read matrices is at the core of using metagenomic data to make inferences about ecosystem structure and function. Non-negative matrix factorization (NMF) is a numerical technique for approximating high-dimensional data points as positive linear combinations of positive components. It is thus well suited to interpretation of observed samples as combinations of different components. We develop, test and apply an NMF-based framework to analyze metagenomic read matrices. In particular, we introduce a method for choosing NMF degree in the presence of overlap, and apply spectral-reordering techniques to NMF-based similarity matrices to aid visualization. We show that our method can robustly identify the appropriate degree and disentangle overlapping contributions using synthetic data sets. We then examine and discuss the NMF decomposition of a metabolic profile matrix extracted from 39 publicly available metagenomic samples, and identify canonical sample types, including one associated with coral ecosystems, one associated with highly saline ecosystems and others. We also identify specific associations between pathways and canonical environments, and explore how alternative choices of decompositions facilitate analysis of read matrices at a finer scale.
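As a rough illustration of the factorization this abstract describes, the sketch below approximates a non-negative matrix V as a product W @ H of two non-negative factors. It uses the classic multiplicative-update rule, which is an assumption made here for brevity; the abstract does not say which NMF solver, degree-selection rule, or reordering method the authors used, and the toy matrix, the function name `nmf`, and all parameter values are hypothetical.

```python
import numpy as np

def nmf(V, k, n_iter=2000, eps=1e-9, seed=0):
    """Approximate a non-negative matrix V (features x samples) as W @ H.

    W (features x k) holds k non-negative components; H (k x samples) holds
    the non-negative weight of each component in each sample.
    """
    rng = np.random.default_rng(seed)
    n, m = V.shape
    W = rng.random((n, k)) + eps
    H = rng.random((k, m)) + eps
    for _ in range(n_iter):
        # Multiplicative updates keep every entry non-negative.
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
    return W, H

# Hypothetical toy "read matrix": 6 functional groups x 5 samples,
# built from 2 hidden non-negative components.
toy_rng = np.random.default_rng(1)
V = toy_rng.random((6, 2)) @ toy_rng.random((2, 5))

W, H = nmf(V, k=2)
rel_err = np.linalg.norm(V - W @ H) / np.linalg.norm(V)
print(f"relative reconstruction error: {rel_err:.4f}")
```

Each column of W can be read as a candidate "canonical" profile and each column of H as the mixture of those profiles in one sample; choosing k (the NMF degree) is the step the abstract treats as a separate problem and is not attempted here.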

2.
Profiling microbial community function from metagenomic sequencing data remains a computationally challenging problem. Mapping millions of DNA reads from such samples to reference protein databases requires long run-times, and short read lengths can result in spurious hits to unrelated proteins (loss of specificity). We developed ShortBRED (Short, Better Representative Extract Dataset) to address these challenges, facilitating fast, accurate functional profiling of metagenomic samples. ShortBRED consists of two components: (i) a method that reduces reference proteins of interest to short, highly representative amino acid sequences (“markers”) and (ii) a search step that maps reads to these markers to quantify the relative abundance of their associated proteins. After evaluating ShortBRED on synthetic data, we applied it to profile antibiotic resistance protein families in the gut microbiomes of individuals from the United States, China, Malawi, and Venezuela. Our results support antibiotic resistance as a core function in the human gut microbiome, with tetracycline-resistant ribosomal protection proteins and Class A beta-lactamases being the most widely distributed resistance mechanisms worldwide. ShortBRED markers are applicable to other homology-based search tasks, which we demonstrate here by identifying phylogenetic signatures of antibiotic resistance across more than 3,000 microbial isolate genomes. ShortBRED can be applied to profile a wide variety of protein families of interest; the software, source code, and documentation are available for download at http://huttenhower.sph.harvard.edu/shortbred

3.

Background

The proportion of conserved DNA sequences with no clear function is steadily growing in bioinformatics databases. Studies of sequence and structural homology have indicated that many uncharacterized protein domain sequences are variants of functionally described domains. If these variants promote an organism's ecological fitness, they are likely to be conserved in the genome of its progeny and the population at large. The genetic composition of microbial communities in their native ecosystems is accessible through metagenomics. We hypothesize the co-variation of protein domain sequences across metagenomes from similar ecosystems will provide insights into their potential roles and aid further investigation.

Methodology/Principal findings

We calculated the correlation of Pfam protein domain sequences across the Global Ocean Sampling metagenome collection, employing conservative detection and correlation thresholds to limit results to well-supported hits and associations. We then examined intercorrelations between domains of unknown function (DUFs) and domains involved in known metabolic pathways using network visualization and cluster-detection tools. We used a cautious “guilt-by-association” approach, referencing knowledge-level resources to identify and discuss associations that offer insight into DUF function. We observed numerous DUFs associated with photobiologically active domains and prevalent in the Cyanobacteria. Other clusters included DUFs associated with DNA maintenance and repair, inorganic nutrient metabolism, and sodium-translocating transport domains. We also observed a number of clusters reflecting known metabolic associations and cases that predicted functional reclassification of DUFs.

Conclusion/Significance

Critically examining domain covariation across metagenomic datasets can grant new perspectives on the roles and associations of DUFs in an ecological setting. Targeted attempts at DUF characterization in the laboratory or in silico may draw from these insights, and opportunities to discover new associations and corroborate existing ones will arise as more large-scale metagenomic datasets emerge.
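The thresholded co-variation idea in this abstract can be sketched in a few lines. The example below is only an illustration under stated assumptions: it uses Pearson correlation, a simple "detected in at least N samples" filter, and made-up domain names, counts, and cut-offs. The abstract does not specify the correlation measure or thresholds the authors applied.

```python
import numpy as np

def correlated_pairs(counts, names, min_samples=3, min_r=0.9):
    """Return domain pairs whose abundances co-vary strongly across samples.

    counts: array of shape (domains, samples) with per-sample abundances.
    min_samples: detection threshold (domain must be present in this many samples).
    min_r: minimum Pearson correlation for a pair to be reported.
    """
    detected = (counts > 0).sum(axis=1) >= min_samples
    kept = np.flatnonzero(detected)
    R = np.corrcoef(counts[kept])
    pairs = []
    for a in range(len(kept)):
        for b in range(a + 1, len(kept)):
            if R[a, b] >= min_r:
                pairs.append((names[kept[a]], names[kept[b]], float(R[a, b])))
    return pairs

# Hypothetical abundances for four domains across five metagenomes.
names = ["DUF_A", "photo_domain", "transporter", "rare_domain"]
counts = np.array([
    [1, 2, 3, 4, 5],    # DUF_A
    [2, 4, 6, 8, 11],   # photo_domain: tracks DUF_A closely
    [5, 1, 4, 2, 3],    # transporter: no clear relationship
    [0, 0, 0, 7, 0],    # rare_domain: fails the detection filter
], dtype=float)

pairs = correlated_pairs(counts, names)
print(pairs)
```

The surviving pairs form the edges of a co-variation network; a domain of unknown function that clusters with well-annotated domains becomes a candidate for the guilt-by-association reasoning described above.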

4.
Antibiotic resistance is a dire clinical problem with important ecological dimensions. While antibiotic resistance in human pathogens continues to rise at alarming rates, the impact of environmental resistance on human health is still unclear. To investigate the relationship between human-associated and environmental resistomes, we analyzed functional metagenomic selections for resistance against 18 clinically relevant antibiotics from soil and human gut microbiota as well as a set of multidrug-resistant cultured soil isolates. These analyses were enabled by Resfams, a new curated database of protein families and associated highly precise and accurate profile hidden Markov models, confirmed for antibiotic resistance function and organized by ontology. We demonstrate that the antibiotic resistance functions that give rise to the resistance profiles observed in environmental and human-associated microbial communities significantly differ between ecologies. Antibiotic resistance functions that most discriminate between ecologies provide resistance to β-lactams and tetracyclines, two of the most widely used classes of antibiotics in the clinic and agriculture. We also analyzed the antibiotic resistance gene composition of over 6000 sequenced microbial genomes, revealing significant enrichment of resistance functions by both ecology and phylogeny. Together, our results indicate that environmental and human-associated microbial communities harbor distinct resistance genes, suggesting that antibiotic resistance functions are largely constrained by ecology.

5.
The various ecological habitats in the human body provide microbes a wide array of nutrient sources and survival challenges. Advances in technology such as DNA sequencing have allowed a deeper perspective into the molecular function of the human microbiota than has been achievable in the past. Here we aimed to examine the enzymes that cleave complex carbohydrates (CAZymes) in the human microbiome in order to determine (i) whether the CAZyme profiles of bacterial genomes are more similar within body sites or bacterial families and (ii) the sugar degradation and utilization capabilities of microbial communities inhabiting various human habitats. Upon examination of 493 bacterial reference genomes from 12 human habitats, we found that sugar degradation capabilities of taxa are more similar to others in the same bacterial family than to those inhabiting the same habitat. Yet, the analysis of 520 metagenomic samples from five major body sites shows that even when the community composition varies, the CAZyme profiles are very similar within a body site, suggesting that the observed functional profile and microbial habitation have adapted to the local carbohydrate composition. When broad sugar utilization was compared within the five major body sites, the gastrointestinal tract contained the highest potential for total sugar degradation, while dextran and peptidoglycan degradation were highest in oral and vaginal sites respectively. Our analysis suggests that the carbohydrate composition of each body site has a profound influence and probably constitutes one of the major driving forces that shapes the community composition and therefore the CAZyme profile of the local microbial communities, which in turn reflects the microbiome fitness to a body site.

6.
Microbial populations inhabiting a natural hypersaline lake ecosystem in Lake Tyrrell, Victoria, Australia, have been characterized using deep metagenomic sampling, iterative de novo assembly, and multidimensional phylogenetic binning. Composite genomes representing habitat-specific microbial populations were reconstructed for eleven different archaea and one bacterium, comprising between 0.6 and 14.1% of the planktonic community. Eight of the eleven archaeal genomes were from microbial species without previously cultured representatives. These new genomes provide habitat-specific reference sequences enabling detailed, lineage-specific compartmentalization of predicted functional capabilities and cellular properties associated with both dominant and less abundant community members, including organisms previously known only by their 16S rRNA sequences. Together, these data provide a comprehensive, culture-independent genomic blueprint for ecosystem-wide analysis of protein functions, population structure, and lifestyles of co-existing, co-evolving microbial groups within the same natural habitat. The “assembly-driven” community genomic approach demonstrated in this study advances our ability to push beyond single gene investigations, and promotes genome-scale reconstructions as a tangible goal in the quest to define the metabolic, ecological, and evolutionary dynamics that underpin environmental microbial diversity.

7.
A major research goal in microbial ecology is to understand the relationship between gene organization and function involved in environmental processes of potential interest. Given that more than an estimated 99% of microorganisms in most environments are not amenable to culturing, methods for culture-independent studies of genes of interest have been developed. The wealth of metagenomic approaches allows environmental microbiologists to directly explore the enormous genetic diversity of microbial communities. However, it is extremely difficult to obtain the appropriate sequencing depth of any particular gene that can entirely represent the complexity of microbial metagenomes and be able to draw meaningful conclusions about these communities. This review presents a summary of the metagenomic approaches that have been useful for collecting more information about specific genes. Specific subsets of metagenomes that focus on sequence analysis were selected in each metagenomic study. This 'targeted metagenomics' approach will provide extensive insight into the functional, ecological and evolutionary patterns of important genes found in microorganisms from various ecosystems.

8.
With the astonishing rate that genomic and metagenomic sequence data sets are accumulating, there are many reasons to constrain the data analyses. One approach to such constrained analyses is to focus on select subsets of gene families that are particularly well suited for the tasks at hand. Such gene families have generally been referred to as “marker” genes. We are particularly interested in identifying and using such marker genes for phylogenetic and phylogeny-driven ecological studies of microbes and their communities (e.g., construction of species trees, phylogeny-based assignment of metagenomic sequence reads to taxonomic groups, phylogeny-based assessment of alpha- and beta-diversity of microbial communities from metagenomic data). We therefore refer to these as PhyEco (for phylogenetic and phylogenetic ecology) markers. The dual use of these PhyEco markers means that we needed to develop and apply a set of somewhat novel criteria for identification of the best candidates for such markers. The criteria we focused on included universality across the taxa of interest, ability to be used to produce robust phylogenetic trees that reflect as much as possible the evolution of the species from which the genes come, and low variation in copy number across taxa. We describe here an automated protocol for identifying potential PhyEco markers from a set of complete genome sequences. The protocol combines rapid searching, clustering and phylogenetic tree building algorithms to generate protein families that meet the criteria listed above. We report here the identification of PhyEco markers for different taxonomic levels including 40 for “all bacteria and archaea”, 114 for “all bacteria” (greatly expanding on the ∼30 commonly used), and 100s to 1000s for some of the individual phyla of bacteria. This new list of PhyEco markers should allow much more detailed automated phylogenetic and phylogenetic ecology analyses of these groups than possible previously.

9.
Bacteria and fungi are of utmost importance in determining environmental and host functioning. Despite close interactions between animals, plants, their associated microbiomes, and the environment they inhabit, the distribution and role of bacteria and especially fungi across host and environments as well as the cross-habitat determinants of their community compositions remain little investigated. Using a uniquely broad global dataset of 13 483 metagenomes, we analysed the microbiome structure and function of 25 host-associated and environmental habitats, focusing on potential interactions between bacteria and fungi. We found that the metagenomic relative abundance ratio of bacteria-to-fungi is a distinctive microbial feature of habitats. Compared with fungi, the cross-habitat distribution pattern of bacteria was more strongly driven by habitat type. Fungal diversity was depleted in host-associated communities compared with those in the environment, particularly terrestrial habitats, whereas this diversity pattern was less pronounced for bacteria. The relative gene functional potential of bacteria or fungi reflected their diversity patterns and appeared to depend on a balance between substrate availability and biotic interactions. Alongside helping to identify hotspots and sources of microbial diversity, our study provides support for differences in assembly patterns and processes between bacterial and fungal communities across different habitats.

10.
A microbial species concept is crucial for interpreting the variation detected by genomics and environmental genomics among cultivated microorganisms and within natural microbial populations. Comparative genomic analyses of prokaryotic species as they are presently described and named have led to the provocative idea that prokaryotes may not form species as we think about them for plants and animals. There are good reasons to doubt whether presently recognized prokaryotic species are truly species. To achieve a better understanding of microbial species, we believe it is necessary to (i) re-evaluate traditional approaches in light of evolutionary and ecological theory, (ii) consider that different microbial species may have evolved in different ways and (iii) integrate genomic, metagenomic and genome-wide expression approaches with ecological and evolutionary theory. Here, we outline how we are using genomic methods to (i) identify ecologically distinct populations (ecotypes) predicted by theory to be species-like fundamental units of microbial communities, and (ii) test their species-like character through in situ distribution and gene expression studies. By comparing metagenomic sequences obtained from well-studied hot spring cyanobacterial mats with genomic sequences of two cultivated cyanobacterial ecotypes, closely related to predominant native populations, we can conduct in situ population genetics studies that identify putative ecotypes and functional genes that determine the ecotypes' ecological distinctness. If individuals within microbial communities are found to be grouped into ecologically distinct, species-like populations, knowing about such populations should guide us to a better understanding of how genomic variation is linked to community function.

11.
Viruses are the most abundant biological entities on our planet. Interactions between viruses and their hosts impact several important biological processes in the world's oceans such as horizontal gene transfer, microbial diversity and biogeochemical cycling. Interrogation of microbial metagenomic sequence data collected as part of the Sorcerer II Global Ocean Expedition (GOS) revealed a high abundance of viral sequences, representing approximately 3% of the total predicted proteins. Cluster analyses of the viral sequences revealed hundreds to thousands of viral genes encoding various metabolic and cellular functions. Quantitative analyses of viral genes of host origin performed on the viral fraction of aquatic samples confirmed the viral nature of these sequences and suggested that significant portions of aquatic viral communities behave as reservoirs of such genetic material. Distributional and phylogenetic analyses of these host-derived viral sequences also suggested that viral acquisition of environmentally relevant genes of host origin is a more abundant and widespread phenomenon than previously appreciated. The predominant viral sequences identified within microbial fractions originated from tailed bacteriophages and exhibited varying global distributions according to viral family. Recruitment of GOS viral sequence fragments against 27 complete aquatic viral genomes revealed that only one reference bacteriophage genome was highly abundant and was closely related, but not identical, to the cyanomyovirus P-SSM4. The co-distribution across all sampling sites of P-SSM4-like sequences with the dominant ecotype of its host, Prochlorococcus, supports the classification of the viral sequences as P-SSM4-like and suggests that this virus may influence the abundance, distribution and diversity of one of the most dominant components of picophytoplankton in oligotrophic oceans.
In summary, the abundance and broad geographical distribution of viral sequences within microbial fractions, the prevalence of genes among viral sequences that encode microbial physiological function and their distinct phylogenetic distribution lend strong support to the notion that viral-mediated gene acquisition is a common and ongoing mechanism for generating microbial diversity in the marine environment.

12.
Functional gene arrays (FGAs) have been considered as a specific, sensitive, quantitative, and high throughput metagenomic tool to detect, monitor and characterize microbial communities. In particular, GeoChips, the most comprehensive FGAs, have been applied to analyze the functional diversity, composition, structure, and metabolic potential or activity of a variety of microbial communities from different habitats, such as aquatic ecosystems, soils, contaminated sites, extreme environments, and bioreactors. FGAs are able to address fundamental questions related to global change, bioremediation, land use, human health, and ecological theories, and link the microbial community structure to environmental properties and ecosystem functioning. This review focuses on applications of FGA technology for profiling microbial communities, including target preparation, hybridization and data processing, and data analysis. We also discuss challenges and future directions of FGA applications.

13.
Human-associated microbial communities exert tremendous influence over human health and disease. With modern metagenomic sequencing methods it is now possible to follow the relative abundance of microbes in a community over time. These microbial communities exhibit rich ecological dynamics and an important goal of microbial ecology is to infer the ecological interactions between species directly from sequence data. Any algorithm for inferring ecological interactions must overcome three major obstacles: 1) a correlation between the abundances of two species does not imply that those species are interacting, 2) the sum constraint on the relative abundances obtained from metagenomic studies makes it difficult to infer the parameters in time-series models, and 3) errors due to experimental uncertainty, or mis-assignment of sequencing reads into operational taxonomic units, bias inferences of species interactions due to a statistical problem called “errors-in-variables”. Here we introduce an approach, Learning Interactions from MIcrobial Time Series (LIMITS), that overcomes these obstacles. LIMITS uses sparse linear regression with bootstrap aggregation to infer a discrete-time Lotka-Volterra model for microbial dynamics. We tested LIMITS on synthetic data and showed that it could reliably infer the topology of the inter-species ecological interactions. We then used LIMITS to characterize the species interactions in the gut microbiomes of two individuals and found that the interaction networks varied significantly between individuals. Furthermore, we found that the interaction networks of the two individuals are dominated by distinct “keystone species”, Bacteroides fragilis and Bacteroides stercoris, that have a disproportionate influence on the structure of the gut microbiome even though they are only found in moderate abundance.
Based on our results, we hypothesize that the abundances of certain keystone species may be responsible for individuality in the human gut microbiome.
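To make the regression idea in this abstract concrete, the sketch below fits a discrete-time Lotka-Volterra (Ricker-form) model, in which the log growth ratio ln x_i(t+1) - ln x_i(t) is linear in the abundances x_j(t), and aggregates bootstrap fits by taking the median coefficient. This is not the LIMITS algorithm: the sparse variable-selection step is omitted, absolute rather than relative abundances are assumed, and the simulated parameters, function names, and noise level are all hypothetical.

```python
import numpy as np

def simulate_ricker(r, C, x0, steps, noise_sd, rng):
    """Simulate x_i(t+1) = x_i(t) * exp(r_i + sum_j C[i, j] * x_j(t) + noise)."""
    X = np.empty((steps + 1, len(x0)))
    X[0] = x0
    for t in range(steps):
        growth = r + C @ X[t] + noise_sd * rng.standard_normal(len(x0))
        X[t + 1] = X[t] * np.exp(growth)
    return X

def fit_bagged(X, n_boot=200, seed=0):
    """Least-squares fit of (r, C) on bootstrap resamples, aggregated by median."""
    rng = np.random.default_rng(seed)
    Y = np.log(X[1:]) - np.log(X[:-1])               # log growth ratios, shape (T, n)
    A = np.column_stack([np.ones(len(Y)), X[:-1]])   # intercept + abundances at time t
    T = len(Y)
    fits = []
    for _ in range(n_boot):
        idx = rng.integers(0, T, size=T)             # resample time points with replacement
        coef, *_ = np.linalg.lstsq(A[idx], Y[idx], rcond=None)
        fits.append(coef)                            # shape (n + 1, n)
    med = np.median(fits, axis=0)
    return med[0], med[1:].T                         # r_hat, C_hat (C_hat[i, j]: effect of j on i)

# Hypothetical two-species system with weak cross-interactions.
r_true = np.array([0.5, 0.4])
C_true = np.array([[-0.5, -0.1],
                   [0.1, -0.4]])
sim_rng = np.random.default_rng(42)
X = simulate_ricker(r_true, C_true, x0=np.array([0.2, 0.3]),
                    steps=1000, noise_sd=0.2, rng=sim_rng)

r_hat, C_hat = fit_bagged(X)
print(np.round(C_hat, 2))
```

Taking the median over bootstrap fits damps the influence of any single unusual stretch of the time series; a sparse selection step, as the abstract describes, would additionally set weakly supported interactions to exactly zero.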

14.
15.
Eukaryotic cells commonly use protein kinases in signaling systems that relay information and control a wide range of processes. These enzymes have a fundamentally similar structure, but achieve functional diversity through variable regions that determine how the catalytic core is activated and recruited to phosphorylation targets. “Hippo” pathways are ancient protein kinase signaling systems that control cell proliferation and morphogenesis; the NDR/LATS family protein kinases, which associate with “Mob” coactivator proteins, are central but incompletely understood components of these pathways. Here we describe the crystal structure of budding yeast Cbk1–Mob2, to our knowledge the first of an NDR/LATS kinase–Mob complex. It shows a novel coactivator-organized activation region that may be unique to NDR/LATS kinases, in which a key regulatory motif apparently shifts from an inactive binding mode to an active one upon phosphorylation. We also provide a structural basis for a substrate docking mechanism previously unknown in AGC family kinases, and show that docking interaction provides robustness to Cbk1’s regulation of its two known in vivo substrates. Co-evolution of docking motifs and phosphorylation consensus sites strongly indicates that a protein is an in vivo regulatory target of this hippo pathway, and predicts a new group of high-confidence Cbk1 substrates that function at sites of cytokinesis and cell growth. Moreover, docking peptides arise in unstructured regions of proteins that are probably already kinase substrates, suggesting a broad sequential model for adaptive acquisition of kinase docking in rapidly evolving intrinsically disordered polypeptides.

16.
Capturing the uncultivated majority   Cited by: 1 (self-citations: 0, citations by others: 1)
The metagenomic analysis of environmental microbial communities continues to be a rapidly developing area of study. DNA isolation, the first step in capturing the uncultivated majority, has seen many advances in recent years. Protocols have been developed to distinguish DNA from live versus dead cells and to separate extracellular from intracellular DNA. Looking to increase our understanding of the role that members of a microbial community play in ecological processes, several techniques have been developed that are enabling greater in-depth analysis of environmental metagenomes. These include the development of environmental gene tags and the serial analysis of 16S rRNA gene sequence tags. In addition, new screening methods have been designed to select for specific functional genes within metagenomic libraries. Finally, new cultivation methods continue to be developed to improve our ability to capture a greater diversity of microorganisms within the environment.

17.
A pervasive challenge in microbial ecology is understanding the genetic level where ecological units can be differentiated. Ecological differentiation often occurs at fine genomic levels, yet it is unclear how to utilise ecological information to define ecotypes given the breadth of environmental variation among microbial taxa. Here, we present an analytical framework that infers clusters along genome‐based microbial phylogenies according to shared environmental responses. The advantage of our approach is the ability to identify genomic clusters that best fit complex environmental information whilst characterising cluster niches through model predictions. We apply our method to determine climate‐associated ecotypes in populations of nitrogen‐fixing symbionts using whole genomes, explicitly sampled to detect climate differentiation across a heterogeneous landscape. Although soil and plant host characteristics strongly influence distribution patterns of inferred ecotypes, our flexible statistical method enabled us to identify climate‐associated genomic clusters using environmental data, providing solid support for ecological specialisation in soil symbionts.

18.
Modern microbial mats are potential analogues of some of Earth's earliest ecosystems. Excellent examples can be found in Shark Bay, Australia, with mats of various morphologies. To further our understanding of the functional genetic potential of these complex microbial ecosystems, we conducted shotgun metagenomic analyses for the first time. We assembled metagenomic next-generation sequencing data to classify the taxonomic and metabolic potential across diverse morphologies of marine mats in Shark Bay. The microbial community across taxonomic classifications using protein-coding and small subunit rRNA genes directly extracted from the metagenomes suggests that three phyla, Proteobacteria, Cyanobacteria and Bacteroidetes, dominate all marine mats. However, the microbial community structures of the Shark Bay and Highbourne Cay (Bahamas) marine systems appear to be distinct from each other. The metabolic potential (based on SEED subsystem classifications) of the Shark Bay and Highbourne Cay microbial communities was also distinct. Shark Bay metagenomes have a metabolic pathway profile consisting of both heterotrophic and photosynthetic pathways, whereas Highbourne Cay appears to be dominated almost exclusively by photosynthetic pathways. Alternative non-rubisco-based carbon metabolism including reductive TCA cycle and 3-hydroxypropionate/4-hydroxybutyrate pathways is highly represented in Shark Bay metagenomes while not represented in Highbourne Cay microbial mats or any other mat-forming ecosystems investigated to date. Potentially novel aspects of nitrogen cycling were also observed, as well as putative heavy metal cycling (arsenic, mercury, copper and cadmium). Finally, archaea are highly represented in Shark Bay and may have critical roles in overall ecosystem function in these modern microbial mats.

19.
Settlement of many benthic marine invertebrates is stimulated by bacterial biofilms, although it is not known if patterns of settlement reflect microbial communities that are specific to discrete habitats. Here, we characterized the taxonomic and functional gene diversity (16S rRNA gene amplicon and metagenomic sequencing analyses), as well as the specific bacterial abundances, in biofilms from diverse nearby and distant locations, both inshore and offshore, and tested them for their ability to induce settlement of the biofouling tubeworm Hydroides elegans, an inhabitant of bays and harbours around the world. We found that compositions of the bacterial biofilms were site specific, with the greatest differences between inshore and offshore sites. Further, biofilms were highly diverse in their taxonomic and functional compositions across inshore sites, while relatively low diversity was found at offshore sites. Hydroides elegans settled on all biofilms tested, with settlement strongly correlated with bacterial abundance. Bacterial density in biofilms was positively correlated with biofilm age. Our results suggest that the localized distribution of H. elegans is not determined by ‘selection’ to locations by specific bacteria, but it is more likely linked to the prevailing local ecology and oceanographic features that affect the development of dense biofilms and the occurrence of larvae.

20.
Terrestrial ecosystems are receiving elevated inputs of nitrogen (N) from anthropogenic sources and understanding how these increases in N availability affect soil microbial communities is critical for predicting the associated effects on belowground ecosystems. We used a suite of approaches to analyze the structure and functional characteristics of soil microbial communities from replicated plots in two long-term N fertilization experiments located in contrasting systems. Pyrosequencing-based analyses of 16S rRNA genes revealed no significant effects of N fertilization on bacterial diversity, but significant effects on community composition at both sites; copiotrophic taxa (including members of the Proteobacteria and Bacteroidetes phyla) typically increased in relative abundance in the high N plots, with oligotrophic taxa (mainly Acidobacteria) exhibiting the opposite pattern. Consistent with the phylogenetic shifts under N fertilization, shotgun metagenomic sequencing revealed increases in the relative abundances of genes associated with DNA/RNA replication, electron transport and protein metabolism, increases that could be resolved even with the shallow shotgun metagenomic sequencing conducted here (average of 75 000 reads per sample). We also observed shifts in the catabolic capabilities of the communities across the N gradients that were significantly correlated with the phylogenetic and metagenomic responses, indicating possible linkages between the structure and functioning of soil microbial communities. Overall, our results suggest that N fertilization may, directly or indirectly, induce a shift in the predominant microbial life-history strategies, favoring a more active, copiotrophic microbial community, a pattern that parallels the often observed replacement of K-selected with r-selected plant species with elevated N.
