1.

Conveniently Pre-Tagged and Pre-Packaged: Extended Molecular Identification and Metagenomics Using Complete Metazoan Mitochondrial Genomes

Agnes Dettai Cyril Gallut Sophie Brouillet Joel Pothier Guillaume Lecointre Régis Debruyne 《PloS one》2012,7(12)

Background

Researchers sorely need markers and approaches for biodiversity exploration (both specimen linked and metagenomics) using the full potential of next generation sequencing technologies (NGST). Currently, most studies rely on expensive multiple tagging, PCR primer universality and/or the use of few markers, sometimes with insufficient variability.

Methodology/Principal Findings

We propose a novel approach for the isolation and sequencing of a universal, useful and popular marker across distant, non-model metazoans: the complete mitochondrial genome. It relies on the properties of metazoan mitogenomes for enrichment, on careful choice of the organisms to multiplex, as well as on the wide collection of accumulated mitochondrial reference datasets for post-sequencing sorting and identification instead of individual tagging. Multiple divergent organisms can be sequenced simultaneously, and their complete mitogenome obtained at a very low cost. We provide in silico testing of dataset assembly for a selected set of example datasets.

Conclusions/Significance

This approach generates large mitogenome datasets. These sequences are useful for phylogenetics, molecular identification and molecular ecology studies, and are compatible with all existing projects or available datasets based on mitochondrial sequences, such as the Barcode of Life project. Our method can yield sequences both from identified samples and metagenomic samples. The use of the same datasets for both kinds of studies makes for a powerful approach, especially since the datasets have a high variability even at species level, and would be a useful complement to the less variable 18S rDNA currently prevailing in metagenomic studies.

2.

A statistical toolbox for metagenomics: assessing functional diversity in microbial communities

Patrick D Schloss Jo Handelsman 《BMC bioinformatics》2008,9(1):34

3.

Community transcriptomics reveals universal patterns of protein sequence conservation in natural microbial communities

Stewart FJ Sharma AK Bryant JA Eppley JM DeLong EF 《Genome biology》2011,12(3):R26

4.

Quality control of microbiota metagenomics by k-mer analysis

Florian Plaza Onate Jean-Michel Batto Catherine Juste Jehane Fadlallah Cyrielle Fougeroux Doriane Gouas Nicolas Pons Sean Kennedy Florence Levenez Joel Dore S Dusko Ehrlich Guy Gorochov Martin Larsen 《BMC genomics》2015,16(1)

Background

The biological and clinical consequences of the tight interactions between host and microbiota are rapidly being unraveled by next generation sequencing technologies and sophisticated bioinformatics, also referred to as microbiota metagenomics. The recent success of metagenomics has created a demand to rapidly apply the technology to large case–control cohort studies and to studies of microbiota from various habitats, including habitats relatively poor in microbes. It is therefore of foremost importance to enable a robust and rapid quality assessment of metagenomic data from samples that challenge present technological limits (sample numbers and size). Here we demonstrate that the distribution of overlapping k-mers of metagenome sequence data predicts sequence quality as defined by gene distribution and efficiency of sequence mapping to a reference gene catalogue.

Results

We used serial dilutions of gut microbiota metagenomic datasets to generate well-defined high to low quality metagenomes. We also analyzed a collection of 52 microbiota-derived metagenomes. We demonstrate that k-mer distributions of metagenomic sequence data identify sequence contaminations, such as sequences derived from “empty” ligation products. Of note, k-mer distributions were also able to predict the frequency of sequences mapping to a reference gene catalogue not only for the well-defined serial dilution datasets, but also for 52 human gut microbiota derived metagenomic datasets.
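
As a rough illustration of what a k-mer distribution over raw reads looks like, the following minimal sketch counts overlapping k-mers and summarises them as an abundance histogram. The function names, the choice of k, and the handling of ambiguous bases are assumptions made for this example; the abstract does not specify the statistic the authors derive from the distribution.

```python
from collections import Counter


def kmer_counts(reads, k):
    """Count every overlapping k-mer across a collection of reads."""
    counts = Counter()
    for read in reads:
        seq = read.upper()
        for i in range(len(seq) - k + 1):
            kmer = seq[i:i + k]
            if "N" not in kmer:  # assumption: skip k-mers with ambiguous bases
                counts[kmer] += 1
    return counts


def abundance_histogram(counts):
    """Map each multiplicity to the number of distinct k-mers seen that often."""
    return Counter(counts.values())


if __name__ == "__main__":
    reads = ["ACGTAC", "ACGT"]  # toy reads, not real data
    hist = abundance_histogram(kmer_counts(reads, k=3))
    for multiplicity in sorted(hist):
        print(multiplicity, hist[multiplicity])
```

A histogram dominated by a handful of very frequent k-mers could hint at low-complexity or contaminating sequence, but any threshold for flagging a sample would have to be calibrated on real data.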

Conclusions

We propose that k-mer analysis of raw metagenome sequence reads should be implemented as a first quality assessment prior to more extensive bioinformatics analysis, such as sequence filtering and gene mapping. With the rising demand for metagenomic analysis of microbiota it is crucial to provide tools for rapid and efficient decision making. This will eventually lead to a faster turn-around time, improved analytical quality including sample quality metrics and a significant cost reduction. Finally, improved quality assessment will have a major impact on the robustness of biological and clinical conclusions drawn from metagenomic studies.

Electronic supplementary material

The online version of this article (doi:10.1186/s12864-015-1406-7) contains supplementary material, which is available to authorized users.

5.

AutoSOME: a clustering method for identifying gene expression modules without prior knowledge of cluster number

Aaron M Newman James B Cooper 《BMC bioinformatics》2010,11(1):117

Background

Clustering the information content of large high-dimensional gene expression datasets has widespread application in "omics" biology. Unfortunately, the underlying structure of these natural datasets is often fuzzy, and the computational identification of data clusters generally requires knowledge about cluster number and geometry.

6.

Combining gene prediction methods to improve metagenomic gene annotation

Non G Yok Gail L Rosen 《BMC bioinformatics》2011,12(1):20

Background

Traditional gene annotation methods rely on characteristics that may not be available in short reads generated from next generation technology, resulting in suboptimal performance for metagenomic (environmental) samples. Therefore, in recent years, new programs have been developed that optimize performance on short reads. In this work, we benchmark three metagenomic gene prediction programs and combine their predictions to improve metagenomic read gene annotation.

7.

Visualizing post genomics data-sets on customized pathway maps by ProMeTra – aeration-dependent gene expression and metabolism of Corynebacterium glutamicum as an example

Heiko Neuweger Marcus Persicke Stefan P Albaum Thomas Bekel Michael Dondrup Andrea T Hüser Jörn Winnebald Jessica Schneider Jörn Kalinowski Alexander Goesmann 《BMC systems biology》2009,3(1):82

8.

Functional enrichment analyses and construction of functional similarity networks with high confidence function prediction by PFP

Troy Hawkins Meghana Chitale Daisuke Kihara 《BMC bioinformatics》2010,11(1):265

Background

A new paradigm of biological investigation takes advantage of technologies that produce large high throughput datasets, including genome sequences, interactions of proteins, and gene expression. The ability of biologists to analyze and interpret such data relies on functional annotation of the included proteins, but even in highly characterized organisms many proteins can lack the functional evidence necessary to infer their biological relevance.

9.

Outlier detection in BLAST hits

Nidhi Shah Stephen F. Altschul Mihai Pop 《Algorithms for molecular biology : AMB》2018,13(1):7

Background

An important task in a metagenomic analysis is the assignment of taxonomic labels to sequences in a sample. Most widely used methods for taxonomy assignment compare a sequence in the sample to a database of known sequences. Many approaches use the best BLAST hit(s) to assign the taxonomic label. However, it is known that the best BLAST hit may not always correspond to the best taxonomic match. An alternative approach involves phylogenetic methods, which take into account alignments and a model of evolution in order to more accurately define the taxonomic origin of sequences. Similarity-search based methods typically run faster than phylogenetic methods and work well when the organisms in the sample are well represented in the database. In contrast, phylogenetic methods have the capability to identify new organisms in a sample but are computationally quite expensive.

Results

We propose a two-step approach for metagenomic taxon identification; i.e., use a rapid method that accurately classifies sequences using a reference database (this is a filtering step) and then use a more complex phylogenetic method for the sequences that were unclassified in the previous step. In this work, we explore whether and when using top BLAST hit(s) yields a correct taxonomic label. We develop a method to detect outliers among BLAST hits in order to separate the phylogenetically most closely related matches from matches to sequences from more distantly related organisms. We used modified BILD (Bayesian Integral Log-Odds) scores, a multiple-alignment scoring function, to define the outliers within a subset of top BLAST hits and assign taxonomic labels. We compared the accuracy of our method to the RDP classifier and show that our method yields fewer misclassifications while properly classifying organisms that are not present in the database. Finally, we evaluated the use of our method as a pre-processing step before more expensive phylogenetic analyses (in our case TIPP) in the context of real 16S rRNA datasets.
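
The two-step idea (label quickly when the top hits agree, otherwise defer to a slower method) can be sketched as follows. This is not the authors' BILD-score procedure: it swaps in a simple relative drop in bit score as the outlier criterion, and the function names, the `(taxon, bit_score)` tuple layout and the `max_drop` threshold are all hypothetical choices for illustration.

```python
def split_top_hits(hits, max_drop=0.1):
    """Split hits into leading hits and outliers.

    hits: list of (taxon, bit_score) tuples (assumed layout).
    Hits are kept until the score falls by more than `max_drop`
    (as a fraction of the previous score); the rest are outliers.
    """
    ranked = sorted(hits, key=lambda h: h[1], reverse=True)
    if not ranked:
        return [], []
    kept = [ranked[0]]
    for prev, cur in zip(ranked, ranked[1:]):
        if prev[1] - cur[1] > max_drop * prev[1]:
            break  # first large score gap marks the start of the outliers
        kept.append(cur)
    return kept, ranked[len(kept):]


def two_step_label(hits, max_drop=0.1):
    """Return a taxon if the retained hits agree, else None.

    None means the read should be passed to a slower
    phylogenetic method in the second step.
    """
    kept, _ = split_top_hits(hits, max_drop)
    taxa = {taxon for taxon, _ in kept}
    if len(taxa) == 1:
        return taxa.pop()
    return None


if __name__ == "__main__":
    print(two_step_label([("A", 200.0), ("A", 195.0), ("B", 120.0)]))
    print(two_step_label([("A", 200.0), ("B", 198.0)]))
```

The design point is only the routing: reads whose retained hits agree receive a label cheaply, and everything else is left for the more expensive analysis.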

Conclusion

Our experiments make a good case for using a two-step approach for accurate taxonomic assignment. We show that our method can be used as a filtering step before using phylogenetic methods and provides a way to interpret BLAST results using more information than provided by E-values and bit-scores alone.

10.

Broad spectrum microarray for fingerprint-based bacterial species identification

Frédérique Pasquer Cosima Pelludat Brion Duffy Jürg E Frey 《BMC biotechnology》2010,10(1):13

Background

Microarrays are powerful tools for DNA-based molecular diagnostics and identification of pathogens. Most target a limited range of organisms and are based on only one or a very few genes for specific identification. Such microarrays are limited to organisms for which specific probes are available, and often have difficulty discriminating closely related taxa. We have developed an alternative broad-spectrum microarray that employs hybridisation fingerprints generated by high-density anonymous markers distributed over the entire genome for identification based on comparison to a reference database.

11.

Expression profiles of switch-like genes accurately classify tissue and infectious disease phenotypes in model-based classification

Michael Gormley Aydin Tozeren 《BMC bioinformatics》2008,9(1):486

Background

Large-scale compilation of gene expression microarray datasets across diverse biological phenotypes provided a means of gathering a priori knowledge in the form of identification and annotation of bimodal genes in the human and mouse genomes. These switch-like genes consist of 15% of known human genes, and are enriched with genes coding for extracellular and membrane proteins. It is of interest to determine the prediction potential of bimodal genes for class discovery in large-scale datasets.

12.

Identification of Enterobacter sakazakii from closely related species: The use of Artificial Neural Networks in the analysis of biochemical and 16S rDNA data

Carol Iversen Lee Lancashire Michael Waddington Stephen Forsythe Graham Ball 《BMC microbiology》2006,6(1):28-8

Background

Enterobacter sakazakii is an emergent pathogen associated with ingestion of infant formula and accurate identification is important in both industrial and clinical settings. Bacterial species can be difficult to accurately characterise from complex biochemical datasets and computer algorithms can potentially simplify the process.

13.

Comparative phosphoproteomics reveals evolutionary and functional conservation of phosphorylation across eukaryotes

Boekhorst J van Breukelen B Heck A Snel B 《Genome biology》2008,9(10):R144

Background

Reversible phosphorylation of proteins is involved in a wide range of processes, ranging from signaling cascades to regulation of protein complex assembly. Little is known about the structure and evolution of phosphorylation networks. Recent high-throughput phosphoproteomics studies have resulted in the rapid accumulation of phosphopeptide datasets for many model organisms. Here, we exploit these novel data for the comparative analysis of phosphorylation events between different species of eukaryotes.

14.

Parallel-META: efficient metagenomic data analysis based on high-performance computation

Xiaoquan Su Jian Xu Kang Ning 《BMC systems biology》2012,6(Z1):S16

Background

Metagenomics directly sequences and analyses genome information from microbial communities. A single community usually contains hundreds or more genomes from different microbial species, and the main computational tasks in metagenomic data analysis include taxonomical and functional component examination of all genomes in the microbial community. Metagenomic data analysis is both data- and computation-intensive, and so requires extensive computational power. Most current metagenomic data analysis software was designed to be used on a single computer or a single computer cluster, which cannot keep pace with the rapidly increasing computational requirements of large metagenomic projects. Therefore, advanced computational methods and pipelines have to be developed to meet this need for efficient analyses.

Result

In this paper, we propose Parallel-META, a GPU- and multi-core-CPU-based open-source pipeline for metagenomic data analysis, which enables the efficient and parallel analysis of multiple metagenomic datasets and the visualization of the results for multiple samples. In Parallel-META, the similarity-based database search is parallelized using GPU computing and multi-core CPU computing optimization. Experiments have shown that Parallel-META achieves a speed-up of at least 15 times over a traditional metagenomic data analysis method, with the same accuracy (http://www.computationalbioenergy.org/parallel-meta.html).

Conclusion

The parallel processing of current metagenomic data would be very promising: with current speed up of 15 times and above, binning would not be a very time-consuming process any more. Therefore, some deeper analysis of the metagenomic data, such as the comparison of different samples, would be feasible in the pipeline, and some of these functionalities have been included into the Parallel-META pipeline.

15.

Artificial and natural duplicates in pyrosequencing reads of metagenomic data

Beifang Niu Limin Fu Shulei Sun Weizhong Li 《BMC bioinformatics》2010,11(1):187

Background

Artificial duplicates from pyrosequencing reads may lead to incorrect interpretation of the abundance of species and genes in metagenomic studies. Duplicated reads were filtered out in many metagenomic projects. However, since the duplicated reads observed in a pyrosequencing run also include natural (non-artificial) duplicates, simply removing all duplicates may also cause underestimation of abundance associated with natural duplicates.

16.

An improved statistical model for taxonomic assignment of metagenomics

Yujing Yao Zhezhen Jin Joseph H Lee 《BMC genetics》2018,19(1):98

Background

With the advances in the next-generation sequencing technologies, researchers can now rapidly examine the composition of samples from humans and their surroundings. To enhance the accuracy of taxonomy assignments in metagenomic samples, we developed a method that allows multiple mismatch probabilities from different genomes.

Results

We extended the algorithm of taxonomic assignment of metagenomic sequence reads (TAMER) by developing an improved method that can set a different mismatch probability for each genome rather than imposing a single parameter for all genomes, thereby obtaining a greater degree of accuracy. This method, which we call TADIP (Taxonomic Assignment of metagenomics based on DIfferent Probabilities), was comprehensively tested in simulated and real datasets. The results support that TADIP improved the performance of TAMER especially in large sample size datasets with high complexity.
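
To make the effect of a genome-specific mismatch probability concrete, the sketch below scores one read against several candidate genomes with a simple independent-sites likelihood and picks the best-scoring genome. This is only an illustration of the idea of per-genome parameters, not the TAMER or TADIP model itself; the function names, the likelihood form and the example probabilities are assumptions.

```python
import math


def read_log_likelihood(mismatches, read_len, p_mismatch):
    """Log-probability of an alignment with `mismatches` mismatching sites.

    Assumes sites are independent and each mismatches with
    probability `p_mismatch` (0 < p_mismatch < 1).
    """
    matches = read_len - mismatches
    return mismatches * math.log(p_mismatch) + matches * math.log(1.0 - p_mismatch)


def assign_read(alignments, p_by_genome, read_len):
    """Pick the genome whose alignment has the highest log-likelihood.

    alignments: dict mapping genome name -> mismatch count for this read.
    p_by_genome: dict mapping genome name -> mismatch probability.
    """
    return max(
        alignments,
        key=lambda g: read_log_likelihood(alignments[g], read_len, p_by_genome[g]),
    )


if __name__ == "__main__":
    alignments = {"g1": 2, "g2": 3}  # toy mismatch counts for one 100 bp read
    shared = {"g1": 0.01, "g2": 0.01}
    per_genome = {"g1": 0.001, "g2": 0.03}
    print(assign_read(alignments, shared, 100))
    print(assign_read(alignments, per_genome, 100))
```

In this toy case the single shared parameter favours the genome with fewer mismatches, while the genome-specific parameters can reverse the choice, which is the kind of difference a per-genome setting is meant to capture.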

Conclusions

TADIP was developed as a statistical model to improve the estimate accuracy of taxonomy assignments. Based on its varying mismatch probability setting and correlated variance matrix setting, its performance was enhanced for high complexity samples when compared with TAMER.

17.

EDISA: extracting biclusters from multiple time-series of gene expression profiles

Jochen Supper Martin Strauch Dierk Wanke Klaus Harter Andreas Zell 《BMC bioinformatics》2007,8(1):334

Background

Cells dynamically adapt their gene expression patterns in response to various stimuli. This response is orchestrated into a number of gene expression modules consisting of co-regulated genes. A growing pool of publicly available microarray datasets allows the identification of modules by monitoring expression changes over time. These time-series datasets can be searched for gene expression modules by one of the many clustering methods published to date. For an integrative analysis, several time-series datasets can be joined into a three-dimensional gene-condition-time dataset, to which standard clustering or biclustering methods are, however, not applicable. We thus devise a probabilistic clustering algorithm for gene-condition-time datasets.

18.

MetaSim: a sequencing simulator for genomics and metagenomics

Richter DC Ott F Auch AF Schmid R Huson DH 《PloS one》2008,3(10):e3373

Background

The new research field of metagenomics is providing exciting insights into various, previously unclassified ecological systems. Next-generation sequencing technologies are producing a rapid increase of environmental data in public databases. There is great need for specialized software solutions and statistical methods for dealing with complex metagenome data sets.

Methodology/Principal Findings

To facilitate the development and improvement of metagenomic tools and the planning of metagenomic projects, we introduce a sequencing simulator called MetaSim. Our software can be used to generate collections of synthetic reads that reflect the diverse taxonomical composition of typical metagenome data sets. Based on a database of given genomes, the program allows the user to design a metagenome by specifying the number of genomes present at different levels of the NCBI taxonomy, and then to collect reads from the metagenome using a simulation of a number of different sequencing technologies. A population sampler optionally produces evolved sequences based on source genomes and a given evolutionary tree.
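
The core of such a simulator, drawing reads from a set of source genomes according to a chosen community composition, can be sketched in a few lines. This is not MetaSim's implementation or interface: there is no sequencing error model or evolutionary sampler here, and the function name, the abundance weighting (per genome, not per base) and the toy sequences are assumptions for illustration.

```python
import random


def simulate_reads(genomes, abundances, n_reads, read_len, seed=0):
    """Draw error-free reads from genomes in proportion to their abundances.

    genomes: dict mapping name -> sequence (each at least `read_len` long).
    abundances: dict mapping name -> relative weight of that genome.
    Returns a list of (source_name, read_sequence) tuples.
    """
    rng = random.Random(seed)  # fixed seed so a test scenario is reproducible
    names = list(genomes)
    weights = [abundances[name] for name in names]
    reads = []
    for _ in range(n_reads):
        name = rng.choices(names, weights=weights, k=1)[0]
        seq = genomes[name]
        start = rng.randrange(len(seq) - read_len + 1)
        reads.append((name, seq[start:start + read_len]))
    return reads


if __name__ == "__main__":
    genomes = {"a": "ACGTACGTAC", "b": "TTTTGGGGCC"}  # toy sequences
    for source, read in simulate_reads(genomes, {"a": 3, "b": 1}, 5, 4):
        print(source, read)
```

Because each read keeps its source label, the output can serve as a small ground-truth set when checking a classifier or binning tool.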

Conclusions/Significance

MetaSim allows the user to simulate individual read datasets that can be used as standardized test scenarios for planning sequencing projects or for benchmarking metagenomic software.

19.

Prediction of regulatory elements in mammalian genomes using chromatin signatures

Kyoung-Jae Won Iouri Chepelev Bing Ren Wei Wang 《BMC bioinformatics》2008,9(1):547

Background

Recent genomic scale survey of epigenetic states in the mammalian genomes has shown that promoters and enhancers are correlated with distinct chromatin signatures, providing a pragmatic way for systematic mapping of these regulatory elements in the genome. With rapid accumulation of chromatin modification profiles in the genome of various organisms and cell types, this chromatin based approach promises to uncover many new regulatory elements, but computational methods to effectively extract information from these datasets are still limited.

20.

Functional Metagenomics: A High Throughput Screening Method to Decipher Microbiota-Driven NF-κB Modulation in the Human Gut

Omar Lakhdari Antonietta Cultrone Julien Tap Karine Gloux Françoise Bernard S. Dusko Ehrlich Fabrice Lefèvre Joël Doré Hervé M. Blottière 《PloS one》2010,5(9)

Background/Aim

The human intestinal microbiota plays an important role in modulation of mucosal immune responses. To study interactions between intestinal epithelial cells (IECs) and commensal bacteria, a functional metagenomic approach was developed. One interest of metagenomics is to provide access to genomes of uncultured microbes. We aimed at identifying bacterial genes involved in regulation of NF-κB signaling in IECs. A high throughput cell-based screening assay allowing rapid detection of NF-κB modulation in IECs was established using the reporter-gene strategy to screen metagenomic libraries issued from the human intestinal microbiota.

Methods

A plasmid containing the secreted alkaline phosphatase (SEAP) gene under the control of NF-κB binding elements was stably transfected in HT-29 cells. The reporter clone HT-29/kb-seap-25 was selected and characterized. Then, a first screening of a metagenomic library from Crohn's disease patients was performed to identify NF-κB modulating clones. Furthermore, genes potentially involved in the effect of one stimulatory metagenomic clone were determined by sequence analysis associated to mutagenesis by transposition.

Results

The two proinflammatory cytokines, TNF-α and IL-1β, were able to activate the reporter system, reflecting activation of the NF-κB signaling pathway, and the NF-κB inhibitors BAY 11-7082, caffeic acid phenethyl ester and MG132 were effective. A screening of 2640 metagenomic clones led to the identification of 171 modulating clones. Among them, one stimulatory metagenomic clone, 52B7, was further characterized. Sequence analysis revealed that its metagenomic DNA insert might belong to a new Bacteroides strain, and we identified 2 loci encoding an ABC transport system and a putative lipoprotein potentially involved in the effect of 52B7 on NF-κB.

Conclusions

We have established a robust high throughput screening assay for metagenomic libraries derived from the human intestinal microbiota to study bacteria-driven NF-κB regulation. This opens a strategic path toward the identification of bacterial strains and molecular patterns presenting a potential therapeutic interest.