Similar literature
Found 20 similar documents (search time: 31 ms)
1.
Profiling microbial community function from metagenomic sequencing data remains a computationally challenging problem. Mapping millions of DNA reads from such samples to reference protein databases requires long run-times, and short read lengths can result in spurious hits to unrelated proteins (loss of specificity). We developed ShortBRED (Short, Better Representative Extract Dataset) to address these challenges, facilitating fast, accurate functional profiling of metagenomic samples. ShortBRED consists of two components: (i) a method that reduces reference proteins of interest to short, highly representative amino acid sequences (“markers”) and (ii) a search step that maps reads to these markers to quantify the relative abundance of their associated proteins. After evaluating ShortBRED on synthetic data, we applied it to profile antibiotic resistance protein families in the gut microbiomes of individuals from the United States, China, Malawi, and Venezuela. Our results support antibiotic resistance as a core function in the human gut microbiome, with tetracycline-resistant ribosomal protection proteins and Class A beta-lactamases being the most widely distributed resistance mechanisms worldwide. ShortBRED markers are applicable to other homology-based search tasks, which we demonstrate here by identifying phylogenetic signatures of antibiotic resistance across more than 3,000 microbial isolate genomes. ShortBRED can be applied to profile a wide variety of protein families of interest; the software, source code, and documentation are available for download at http://huttenhower.sph.harvard.edu/shortbred

2.
Metagenomic shotgun sequencing data can identify microbes populating a microbial community and their proportions, but existing taxonomic profiling methods are inefficient for increasingly large data sets. We present an approach that uses clade-specific marker genes to unambiguously assign reads to microbial clades more accurately and >50× faster than current approaches. We validated our metagenomic phylogenetic analysis tool, MetaPhlAn, on terabases of short reads and provide the largest metagenomic profiling to date of the human gut. It can be accessed at http://huttenhower.sph.harvard.edu/metaphlan/.

3.
Microbial communities carry out the majority of the biochemical activity on the planet, and they play integral roles in processes including metabolism and immune homeostasis in the human microbiome. Shotgun sequencing of such communities' metagenomes provides information complementary to organismal abundances from taxonomic markers, but the resulting data typically comprise short reads from hundreds of different organisms and are at best challenging to assemble comparably to single-organism genomes. Here, we describe an alternative approach to infer the functional and metabolic potential of a microbial community metagenome. We determined the gene families and pathways present or absent within a community, as well as their relative abundances, directly from short sequence reads. We validated this methodology using a collection of synthetic metagenomes, recovering the presence and abundance both of large pathways and of small functional modules with high accuracy. We subsequently applied this method, HUMAnN, to the microbial communities of 649 metagenomes drawn from seven primary body sites on 102 individuals as part of the Human Microbiome Project (HMP). This provided a means to compare functional diversity and organismal ecology in the human microbiome, and we determined a core of 24 ubiquitously present modules. Core pathways were often implemented by different enzyme families within different body sites, and 168 functional modules and 196 metabolic pathways varied in metagenomic abundance specifically to one or more niches within the microbiome. These included glycosaminoglycan degradation in the gut, as well as phosphate and amino acid transport linked to host phenotype (vaginal pH) in the posterior fornix. An implementation of our methodology is available at http://huttenhower.sph.harvard.edu/humann. 
This provides a means to accurately and efficiently characterize microbial metabolic pathways and functional modules directly from high-throughput sequencing reads, enabling the determination of community roles in the HMP cohort and in future metagenomic studies.

4.
Functional analysis of a clinical microbiome facilitates the elucidation of mechanisms by which microbiome perturbation can cause a phenotypic change in the patient. The direct approach for the analysis of the functional capacity of the microbiome is via shotgun metagenomics. An inexpensive method to estimate the functional capacity of a microbial community is to collect 16S rRNA gene profiles and then indirectly infer the abundance of functional genes. This inference approach has been implemented in the PICRUSt and Tax4Fun software tools. However, those tools have important limitations, since they rely on outdated functional databases and uncertain phylogenetic trees and require very specific data pre-processing protocols. Here we introduce Piphillin, a straightforward algorithm that is independent of any proposed phylogenetic tree, leverages contemporary functional databases, and is not tied to any single data pre-processing protocol. When all three inference tools were evaluated against actual shotgun metagenomics, Piphillin was superior in predicting gene composition in human clinical samples compared to both PICRUSt and Tax4Fun (p<0.01 and p<0.001, respectively), and Piphillin’s ability to predict disease associations with specific gene orthologs exhibited a 15% increase in balanced accuracy compared to PICRUSt. For laboratory animal samples, no performance advantage was observed for any one of the tools over the others, and for environmental samples all produced unsatisfactory predictions. Our results demonstrate that functional inference using the direct method implemented in Piphillin is preferable for clinical biospecimens. Piphillin is publicly available for academic use at http://secondgenome.com/Piphillin.

5.
Regulatory networks play a central role in cellular behavior and decision making. Learning these regulatory networks is a major task in biology, and devising computational methods and mathematical models for this task is a major endeavor in bioinformatics. Boolean networks have been used extensively for modeling regulatory networks. In this model, the state of each gene can be either ‘on’ or ‘off’, and the next state of a gene is updated, synchronously or asynchronously, according to a Boolean rule that is applied to the current state of the entire system. Inferring a Boolean network from a set of experimental data entails two main steps: first, the experimental time-series data are discretized into Boolean trajectories, and then a Boolean network is learned from these Boolean trajectories. In this paper, we consider three methods for data discretization, including a new one we propose, and three methods for learning Boolean networks, and study the performance of all nine possible combinations on four regulatory systems of varying dynamical complexity. We find that employing the right combination of methods for data discretization and network learning results in Boolean networks that capture the dynamics well and provide predictive power. Our findings are in contrast to a recent survey that placed Boolean networks on the low end of the “faithfulness to biological reality” and “ability to model dynamics” spectra. Further, contrary to the common argument in favor of Boolean networks, we find that a relatively large number of time points in the time-series data is required to learn good Boolean networks for certain data sets. Last but not least, while methods have been proposed for inferring Boolean networks, as discussed above, publicly available implementations of them are still missing. Here, we make our implementation of the methods publicly available as open source at http://bioinfo.cs.rice.edu/.
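The two ingredients named in the abstract above (discretizing a time series into Boolean values, and updating every gene synchronously from the current state) can be sketched as follows. This is only a minimal illustration: the mean-threshold discretization, the three-gene rule set, and all function names are assumptions made for the example and are not taken from the paper or its released implementation.

```python
from typing import Callable, Dict, List

State = Dict[str, bool]
Rule = Callable[[State], bool]


def discretize(series: List[float]) -> List[bool]:
    """Binarize one gene's time series by thresholding at its mean.

    Mean thresholding is an assumed, deliberately simple choice; the
    abstract does not say which three discretization methods were compared.
    """
    threshold = sum(series) / len(series)
    return [value > threshold for value in series]


def step_synchronous(state: State, rules: Dict[str, Rule]) -> State:
    """Compute the next state by applying every rule to the same current state."""
    return {gene: rule(state) for gene, rule in rules.items()}


def trajectory(state: State, rules: Dict[str, Rule], steps: int) -> List[State]:
    """Return the initial state followed by `steps` synchronous updates."""
    states = [state]
    for _ in range(steps):
        state = step_synchronous(state, rules)
        states.append(state)
    return states


# Hypothetical three-gene network, for illustration only.
RULES: Dict[str, Rule] = {
    "a": lambda s: not s["c"],
    "b": lambda s: s["a"],
    "c": lambda s: s["a"] and s["b"],
}

if __name__ == "__main__":
    start = {"a": True, "b": False, "c": False}
    for t, s in enumerate(trajectory(start, RULES, 3)):
        print(t, s)
```

An asynchronous variant would update one gene at a time instead of building the whole next state from a single snapshot.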

6.
Detection of remote sequence homology is essential for the accurate inference of protein structure, function and evolution. The most sensitive detection methods involve the comparison of evolutionary patterns reflected in multiple sequence alignments (MSAs) of protein families. We present PROCAIN, a new method for MSA comparison based on the combination of ‘vertical’ MSA context (substitution constraints at individual sequence positions) and ‘horizontal’ context (patterns of residue content at multiple positions). Based on a simple and tractable profile methodology and primitive measures for the similarity of horizontal MSA patterns, the method achieves a quality of homology detection comparable to that of a more complex, advanced method employing hidden Markov models (HMMs) and secondary structure (SS) prediction. Adding SS information further improves PROCAIN performance beyond the capabilities of current state-of-the-art tools. The potential value of the method for structure/function predictions is illustrated by the detection of subtle homology between evolutionarily distant yet structurally similar protein domains. PROCAIN, relevant databases and tools can be downloaded from: http://prodata.swmed.edu/procain/download. The web server can be accessed at http://prodata.swmed.edu/procain/procain.php.

7.
8.
Advances in biotechnology have resulted in large-scale studies of DNA methylation. A differentially methylated region (DMR) is a genomic region with multiple adjacent CpG sites that exhibit different methylation statuses among multiple samples. Many so-called “supervised” methods have been established to identify DMRs between two or more comparison groups. Methods for the identification of DMRs without reference to phenotypic information are, however, less well studied. We propose an alternative “unsupervised” approach, in which DMRs in the studied samples are identified by taking into account the natural dependence structure of methylation measurements between neighboring probes from tiling arrays. Through a simulation study, we investigated the effects of dependencies between neighboring probes on DMR detection and found that many spurious signals would be produced if the methylation data were analyzed without accounting for this dependence. In contrast, our newly proposed method could successfully correct for this effect with a well-controlled false positive rate and a comparable sensitivity. By applying it to two real datasets, we demonstrated that our method could provide a global picture of methylation variation in the studied samples. R source code implementing the proposed method is freely available at http://www.csjfann.ibms.sinica.edu.tw/eag/programlist/ICDMR/ICDMR.html.

9.
10.

Background and Purpose

The risk of stroke after a transient ischemic attack (TIA) for patients with a positive diffusion-weighted image (DWI), i.e., transient symptoms with infarction (TSI), is much higher than for those with a negative DWI. The aim of this study was to validate the predictive value of a web-based recurrence risk estimator (RRE; http://www.nmr.mgh.harvard.edu/RRE/) of TSI.

Methods

Data from the prospective hospital-based TIA database of the First Affiliated Hospital of Zhengzhou University were analyzed. The RRE and ABCD2 scores were calculated within 7 days of symptom onset. The predictive outcome was ischemic stroke occurrence at 90 days. The receiver-operating characteristics curves were plotted, and the predictive value of the two models was assessed by computing the C statistics.

Results

A total of 221 eligible patients were prospectively enrolled, of whom 46 (20.81%) experienced a stroke within 90 days. The 90-day stroke risk in high-risk TSI patients (RRE ≥4) was 3.406-fold greater than in those at low risk (P <0.001). The C statistic of RRE (0.681; 95% confidence interval [CI], 0.592–0.771) was statistically higher than that of ABCD2 score (0.546; 95% CI, 0.454–0.638; Z = 2.115; P = 0.0344) at 90 days.

Conclusion

The RRE score had a higher predictive value than the ABCD2 score for assessing the 90-day risk of stroke after TSI.
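For readers unfamiliar with the C statistic used in the abstract above to compare RRE and ABCD2, the sketch below computes it as the fraction of (event, non-event) patient pairs in which the patient with the event received the higher score, counting ties as half. The function name and the example scores are invented for illustration; the actual RRE and ABCD2 scoring rules and the study data are not reproduced here.

```python
from typing import Sequence


def c_statistic(scores: Sequence[float], outcomes: Sequence[bool]) -> float:
    """Concordance (C) statistic for a binary outcome.

    Equivalent to the area under the ROC curve: the probability that a
    randomly chosen patient with the event scores higher than a randomly
    chosen patient without it, with ties counted as 0.5.
    """
    with_event = [s for s, y in zip(scores, outcomes) if y]
    without_event = [s for s, y in zip(scores, outcomes) if not y]
    if not with_event or not without_event:
        raise ValueError("need at least one patient with and one without the event")

    concordant = 0.0
    for p in with_event:
        for n in without_event:
            if p > n:
                concordant += 1.0
            elif p == n:
                concordant += 0.5
    return concordant / (len(with_event) * len(without_event))


if __name__ == "__main__":
    # Made-up risk scores and 90-day outcomes (True = stroke), not study data.
    example_scores = [3, 1, 2, 2]
    example_outcomes = [True, False, False, True]
    print(c_statistic(example_scores, example_outcomes))
```

A value of 0.5 corresponds to chance-level discrimination and 1.0 to perfect ranking, which is the scale on which the reported 0.681 and 0.546 should be read.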

11.
It has long been known that solvation plays an important role in protein-protein interactions. Here, we use a minimalistic solvation-based model for predicting protein binding energy to estimate quantitatively the contribution of the solvation factor in protein binding. The factor is described by a simple linear combination of buried surface areas according to amino-acid types. Even without structural optimization, our minimalistic model demonstrates a predictive power comparable to more complex methods, making the proposed approach a basis for high-throughput applications. Application of the model to a proteomic database shows that receptor-substrate complexes involved in signaling have lower affinities than enzyme-inhibitor and antibody-antigen complexes, and that they differ in the chemical composition of their interfaces. Also, we found that protein complexes with components that come from the same genes generally have lower affinities than complexes formed by proteins from different genes, but in this case the difference originates from different interface areas. The model was implemented in Python, and the source code can be found on the Shakhnovich group webpage: http://faculty.chemistry.harvard.edu/shakhnovich/software.
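The "simple linear combination of buried surface areas according to amino-acid types" mentioned above has the general form of a weighted sum, one weight per residue type. The sketch below shows only that form. The coefficient values, units, residue labels, and function name are placeholders chosen for the example; they are not the fitted parameters of the published model, and the published code may organize the calculation differently.

```python
from typing import Dict


def binding_energy(buried_area: Dict[str, float],
                   coefficients: Dict[str, float]) -> float:
    """Weighted sum of buried surface area per amino-acid type.

    buried_area:  residue type -> surface area buried at the interface
    coefficients: residue type -> energy contribution per unit area
    Both dictionaries are illustrative inputs, not published values.
    """
    missing = set(buried_area) - set(coefficients)
    if missing:
        raise KeyError(f"no coefficient for residue types: {sorted(missing)}")
    return sum(coefficients[aa] * area for aa, area in buried_area.items())


if __name__ == "__main__":
    # Placeholder numbers for illustration only.
    example_area = {"LEU": 100.0, "ASP": 50.0}
    example_coeff = {"LEU": -0.02, "ASP": 0.01}
    print(binding_energy(example_area, example_coeff))
```

Because the estimate is a plain sum over residue types, it needs no structural optimization, which is consistent with the high-throughput use the abstract describes.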

12.
13.
Metagenomics has transformed our understanding of the microbial world, allowing researchers to bypass the need to isolate and culture individual taxa and to directly characterize both the taxonomic and gene compositions of environmental samples. However, associating the genes found in a metagenomic sample with the specific taxa of origin remains a critical challenge. Existing binning methods, based on nucleotide composition or alignment to reference genomes allow only a coarse-grained classification and rely heavily on the availability of sequenced genomes from closely related taxa. Here, we introduce a novel computational framework, integrating variation in gene abundances across multiple samples with taxonomic abundance data to deconvolve metagenomic samples into taxa-specific gene profiles and to reconstruct the genomic content of community members. This assembly-free method is not bounded by various factors limiting previously described methods of metagenomic binning or metagenomic assembly and represents a fundamentally different approach to metagenomic-based genome reconstruction. An implementation of this framework is available at http://elbo.gs.washington.edu/software.html. We first describe the mathematical foundations of our framework and discuss considerations for implementing its various components. We demonstrate the ability of this framework to accurately deconvolve a set of metagenomic samples and to recover the gene content of individual taxa using synthetic metagenomic samples. We specifically characterize determinants of prediction accuracy and examine the impact of annotation errors on the reconstructed genomes. We finally apply metagenomic deconvolution to samples from the Human Microbiome Project, successfully reconstructing genus-level genomic content of various microbial genera, based solely on variation in gene count. These reconstructed genera are shown to correctly capture genus-specific properties. 
With the accumulation of metagenomic data, this deconvolution framework provides an essential tool for characterizing microbial taxa never before seen, laying the foundation for addressing fundamental questions concerning the taxa comprising diverse microbial communities.
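One way to picture the deconvolution idea described above is a linear mixing model: if a gene's abundance in each sample is the sum, over taxa, of the taxon's abundance times the gene's copy number in that taxon, then the copy numbers can be estimated by least squares across samples. The sketch below does this for two taxa using the closed-form normal equations. The mixing model itself, the restriction to two taxa, and every name and number are assumptions for illustration; the abstract does not give the framework's exact equations.

```python
from typing import List, Sequence, Tuple


def deconvolve_two_taxa(taxa_abundance: Sequence[Tuple[float, float]],
                        gene_abundance: Sequence[float]) -> Tuple[float, float]:
    """Least-squares estimate of a gene's copy number in each of two taxa.

    Assumed model per sample i:  e_i = a_i1 * g_1 + a_i2 * g_2
    where a_ik is the abundance of taxon k and e_i the gene's abundance.
    Solves the 2x2 normal equations directly.
    """
    if len(taxa_abundance) != len(gene_abundance):
        raise ValueError("one gene abundance value is needed per sample")

    s11 = sum(a1 * a1 for a1, _ in taxa_abundance)
    s12 = sum(a1 * a2 for a1, a2 in taxa_abundance)
    s22 = sum(a2 * a2 for _, a2 in taxa_abundance)
    t1 = sum(a1 * e for (a1, _), e in zip(taxa_abundance, gene_abundance))
    t2 = sum(a2 * e for (_, a2), e in zip(taxa_abundance, gene_abundance))

    det = s11 * s22 - s12 * s12
    if abs(det) < 1e-12:
        raise ValueError("taxon abundances do not vary enough across samples")
    g1 = (t1 * s22 - t2 * s12) / det
    g2 = (s11 * t2 - s12 * t1) / det
    return g1, g2


if __name__ == "__main__":
    # Synthetic example: the gene has 1 copy in taxon 1 and 3 copies in taxon 2.
    abundances: List[Tuple[float, float]] = [(0.8, 0.2), (0.5, 0.5), (0.1, 0.9)]
    observed = [0.8 * 1 + 0.2 * 3, 0.5 * 1 + 0.5 * 3, 0.1 * 1 + 0.9 * 3]
    print(deconvolve_two_taxa(abundances, observed))
```

The singularity check reflects the point made in the abstract that the approach relies on variation across multiple samples: if taxon proportions are the same in every sample, the per-taxon contributions cannot be separated.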

14.
In this commentary, Rob Kulathinal describes two articles from the Perrimon lab, each describing a new online resource that can assist geneticists with the design of their RNA interference (RNAi) experiments. Hu et al.’s “UP-TORR: online tool for accurate and up-to-date annotation of RNAi reagents” and “FlyPrimerBank: An online database for Drosophila melanogaster gene expression analysis and knockdown evaluation of RNAi reagents” are published, respectively, in this month’s issues of GENETICS and G3.

15.
Polydrug use is common, and might occur because certain individuals experience positive effects from several different drugs during early stages of use. This study examined individual differences in subjective responses to single oral doses of d-amphetamine, alcohol, and delta-9-tetrahydrocannabinol (THC) in healthy social drinkers. Each of these drugs produces feelings of well-being in at least some individuals, and we hypothesized that subjective responses to these drugs would be positively correlated. We also examined participants’ drug responses in relation to personality traits associated with drug use. In this initial, exploratory study, 24 healthy, light drug users (12 male, 12 female), aged 21–31 years, participated in a fully within-subject, randomized, counterbalanced design with six 5.5-hour sessions in which they received d-amphetamine (20 mg), alcohol (0.8 g/kg), or THC (7.5 mg), each paired with a placebo session. Participants rated the drugs’ effects on both global measures (e.g. feeling a drug effect at all) and drug-specific measures. In general, participants’ responses to the three drugs were unrelated. Unexpectedly, “wanting more” alcohol was inversely correlated with “wanting more” THC. Additionally, in women, but not in men, “disliking” alcohol was negatively correlated with “disliking” THC. Positive alcohol and amphetamine responses were related, but only in individuals who experienced a stimulant effect of alcohol. Finally, high trait constraint (or lack of impulsivity) was associated with lower reports of liking alcohol. No personality traits predicted responses across multiple drug types. Generally, these findings do not support the idea that certain individuals experience greater positive effects across multiple drug classes, but instead provide some evidence for a “drug of choice” model, in which individuals respond positively to certain classes of drugs that share similar subjective effects, and dislike other types of drugs.

Trial Registration

ClinicalTrials.gov NCT02485158

16.
New microbial genomes are sequenced at a high pace, allowing insight into the genetics of not only cultured microbes, but a wide range of metagenomic collections such as the human microbiome. To understand the deluge of genomic data we face, computational approaches for gene functional annotation are invaluable. We introduce a novel model for computational annotation that refines two established concepts: annotation based on homology and annotation based on phyletic profiling. The phyletic profiling-based model that includes both inferred orthologs and paralogs (homologs separated by a speciation and a duplication event, respectively) provides more annotations at the same average Precision than the model that includes only inferred orthologs. For experimental validation, we selected 38 poorly annotated Escherichia coli genes for which the model assigned one of three GO terms with high confidence: involvement in DNA repair, protein translation, or cell wall synthesis. Results of antibiotic stress survival assays on E. coli knockout mutants showed high agreement with our model's estimates of accuracy: out of 38 predictions obtained at the reported Precision of 60%, we confirmed 25 predictions, indicating that our confidence estimates can be used to make informed decisions on experimental validation. Our work will contribute to making experimental validation of computational predictions more approachable, both in cost and time. Our predictions for 998 prokaryotic genomes include ∼400,000 specific annotations with the estimated Precision of 90%, ∼19,000 of which are highly specific (e.g. “penicillin binding,” “tRNA aminoacylation for protein translation,” or “pathogenesis”), and are freely available at http://gorbi.irb.hr/.

17.
The development of high-throughput sequencing technologies has transformed our capacity to investigate the composition and dynamics of the microbial communities that populate diverse habitats. Over the past decade, these advances have yielded an avalanche of metagenomic data. The current stage of “van Leeuwenhoek”–like cataloguing, as well as functional analyses, will likely accelerate as DNA and RNA sequencing, plus protein and metabolic profiling capacities and computational tools, continue to improve. However, it is time to consider: what’s next for microbiome research? The short pieces included here briefly consider the challenges and opportunities awaiting microbiome research.
This Perspective is part of the “Where next?” Series.
Soon, we will enter an era when “the number of population genomes deposited in public databases will dwarf those from isolates and single cells” (Gene Tyson). Clearly, as all authors noted in the following, our focus will move from describing the composition of microbial communities to elucidating the principles that govern their assembly, dynamics, and functions. How will such principles be discovered? Elhanan Borenstein proposes that a systems biology–based approach, particularly the development of mathematical and computational models of the interactions between the specific community components, will be critical for understanding the function and dynamics of microbiomes. Evolutionary biologists Howard Ochman and Andrew Moeller want to decipher how microbial assemblies evolve but challenge us to also consider the role of microbial communities in organismal evolution, and they make the exciting prediction that microbes will be implicated in the evolution of eusociality and cooperation. Brett Finlay underscores the need for deciphering the mechanistic bases—particularly the chemical/metabolite signals—for interactions between members of microbial communities and their hosts. He emphasizes how this knowledge will enable creation of new tools to manipulate the microbiota, a key challenge for future investigation. Heidi Kong also encourages deciphering the mechanisms that underlie associations between particular skin surfaces and disorders and their respective microbiota. Jeffrey Gordon considers several intriguing opportunities as well as challenges that manipulation of the gut microbiota presents for improved human nutrition and health. Finally, Karen Nelson, Karim Dabbagh and Hamilton Smith suggest that using synthetic genomes to create novel microbes or even synthetic microbiomes offers a new way to engineer the microbiota. 
Overall, future microbiome research regarding the molecules and mechanisms mediating interactions between members of microbial communities and their hosts should lead to discovery of exciting new biology and transformative therapeutics.

18.
The use of academic profiling sites is becoming more common, and emerging technologies boost researchers’ visibility and exchange of ideas. In our study we compared profiles at five different profiling sites: ResearchGate, Academia.edu, Google Scholar Citations, ResearcherID and ORCID. The data set is enriched by demographic information including age, gender, position and affiliation, which is provided by the national CRIS-system in Norway. We find that approximately 37% of researchers at the University of Bergen have at least one profile, the prevalence being highest (> 40%) for members of the Faculty of Psychology and the Faculty of Social Sciences. Across all disciplines, ResearchGate is the most widely used platform. However, within the Faculty of Humanities, Academia.edu is the preferred one. Researchers are reluctant to maintain multiple profiles, and there is little overlap between different services. Age turns out to be a poor indicator of presence on the investigated profiling sites; women are underrepresented, and professors together with PhD students are the most likely profile holders. We next investigated the correlation between bibliometric measures, such as publications and citations, and user activities, such as downloads and followers. We find that different bibliometric indicators correlate strongly within individual platforms and across platforms. There is, however, less agreement between the traditional bibliometric and social activity indicators.

19.
20.
Louisa A. Stark. Genetics, 2015, 200(3): 679-680
The Genetics Society of America’s Elizabeth W. Jones Award for Excellence in Education recognizes significant and sustained impact on genetics education. The 2015 awardee, Louisa Stark, has made a major impact on global access to genetics education through her work as director of the University of Utah Genetic Science Learning Center. The Center’s Learn.Genetics and Teach.Genetics websites are the most widely used online genetic education resources in the world. In 2014, they were visited by 18 million students, educators, scientists, and members of the public. With over 60 million page views annually, Learn.Genetics is among the most used sites on the Web.
