1.

Separating the wheat from the chaff: a prioritisation pipeline for the analysis of metabolomics datasets

Jankevics Andris Merlo Maria Elena de Vries Marcel Vonk Roel J. Takano Eriko Breitling Rainer 《Metabolomics : Official journal of the Metabolomic Society》2011,8(1):29-36

Liquid Chromatography Mass Spectrometry (LC-MS) is a powerful and widely applied method for the study of biological systems, biomarker discovery and pharmacological interventions. LC-MS measurements are, however, significantly complicated by several technical challenges, including: (1) ionisation suppression/enhancement, disturbing the correct quantification of analytes, and (2) the detection of large amounts of separate derivative ions, increasing the complexity of the spectra, but not their information content. Here we introduce an experimental and analytical strategy that leads to robust metabolome profiles in the face of these challenges. Our method is based on rigorous filtering of the measured signals based on a series of sample dilutions. Such data sets have the additional characteristic that they allow a more robust assessment of detection signal quality for each metabolite. Using our method, almost 80% of the recorded signals can be discarded as uninformative, while important information is retained. As a consequence, we obtain a broader understanding of the information content of our analyses and a better assessment of the metabolites detected in the analyzed data sets. We illustrate the applicability of this method using standard mixtures, as well as cell extracts from bacterial samples. It is evident that this method can be applied in many types of LC-MS analyses and more specifically in untargeted metabolomics.
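The abstract above does not spell out the filtering rule, so the following is only a minimal sketch of one way a dilution series could be used to prioritise signals. It assumes that an informative signal should track the relative sample concentration across the dilution steps, and it keeps signals whose Pearson correlation with concentration reaches a threshold. The function names, the threshold `min_r`, and the use of Pearson correlation are assumptions made for illustration, not the authors' published procedure.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences; None if undefined."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return None  # a constant series has no defined correlation
    return cov / sqrt(var_x * var_y)


def filter_by_dilution(signals, concentrations, min_r=0.9):
    """Keep signals whose intensity follows the dilution series.

    signals: dict mapping a signal name to its intensities, one per dilution step.
    concentrations: relative sample concentration at each dilution step.
    min_r: hypothetical cut-off on the correlation (an assumption, not from the paper).
    Returns a dict of the retained signal names and their correlations.
    """
    kept = {}
    for name, intensities in signals.items():
        r = pearson(concentrations, intensities)
        if r is not None and r >= min_r:
            kept[name] = r
    return kept
```

With a two-fold dilution series, a signal that roughly halves at each step is retained, while a signal that fluctuates independently of the dilution, or stays constant, is discarded.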

2.

Separating the wheat from the chaff: a prioritisation pipeline for the analysis of metabolomics datasets

Andris Jankevics Maria Elena Merlo Marcel de Vries Roel J. Vonk Eriko Takano Rainer Breitling 《Metabolomics : Official journal of the Metabolomic Society》2012,8(1):29-36

Liquid Chromatography Mass Spectrometry (LC-MS) is a powerful and widely applied method for the study of biological systems, biomarker discovery and pharmacological interventions. LC-MS measurements are, however, significantly complicated by several technical challenges, including: (1) ionisation suppression/enhancement, disturbing the correct quantification of analytes, and (2) the detection of large amounts of separate derivative ions, increasing the complexity of the spectra, but not their information content. Here we introduce an experimental and analytical strategy that leads to robust metabolome profiles in the face of these challenges. Our method is based on rigorous filtering of the measured signals based on a series of sample dilutions. Such data sets have the additional characteristic that they allow a more robust assessment of detection signal quality for each metabolite. Using our method, almost 80% of the recorded signals can be discarded as uninformative, while important information is retained. As a consequence, we obtain a broader understanding of the information content of our analyses and a better assessment of the metabolites detected in the analyzed data sets. We illustrate the applicability of this method using standard mixtures, as well as cell extracts from bacterial samples. It is evident that this method can be applied in many types of LC-MS analyses and more specifically in untargeted metabolomics.

3.

Seeking unique and common biological themes in multiple gene lists or datasets: pathway pattern extraction pipeline for pathway-level comparative analysis

Ming Yi Uma Mudunuri Anney Che Robert M Stephens 《BMC bioinformatics》2009,10(1):200

Background

One of the challenges in the analysis of microarray data is to integrate and compare the selected (e.g., differential) gene lists from multiple experiments for common or unique underlying biological themes. A common way to approach this problem is to extract common genes from these gene lists and then subject these genes to enrichment analysis to reveal the underlying biology. However, the capacity of this approach is largely restricted by the limited number of common genes shared by datasets from multiple experiments, which could be caused by the complexity of the biological system itself.

4.

RIEMS: a software pipeline for sensitive and comprehensive taxonomic classification of reads from metagenomics datasets

Matthias Scheuch Dirk Höper Martin Beer 《BMC bioinformatics》2015,16(1)

Background

Fuelled by the advent and subsequent development of next generation sequencing technologies, metagenomics became a powerful tool for the analysis of microbial communities both scientifically and diagnostically. The biggest challenge is the extraction of relevant information from the huge sequence datasets generated for metagenomics studies. Although a plethora of tools are available, data analysis is still a bottleneck.

Results

To overcome the bottleneck of data analysis, we developed an automated computational workflow called RIEMS – Reliable Information Extraction from Metagenomic Sequence datasets. RIEMS assigns every individual read sequence within a dataset taxonomically by cascading different sequence analyses with decreasing stringency of the assignments using various software applications. After completion of the analyses, the results are summarised in a clearly structured result protocol organised taxonomically. The high accuracy and performance of RIEMS analyses were proven in comparison with other tools for metagenomics data analysis using simulated sequencing read datasets.

Conclusions

RIEMS has the potential to fill the gap that still exists with regard to data analysis for metagenomics studies. The usefulness and power of RIEMS for the analysis of genuine sequencing datasets was demonstrated with an early version of RIEMS in 2011 when it was used to detect the orthobunyavirus sequences leading to the discovery of Schmallenberg virus.

Electronic supplementary material

The online version of this article (doi:10.1186/s12859-015-0503-6) contains supplementary material, which is available to authorized users.

5.

Data resolution: a jackknife procedure for determining the consistency of molecular marker datasets

van Hintum TJ 《TAG. Theoretical and applied genetics. Theoretische und angewandte Genetik》2007,115(3):343-349

The results of genetic diversity studies using molecular markers depend not only on the biology of the studied objects but also on the quality of the marker data. Poor data quality may hamper the correct answering of biological questions. A new statistic is proposed to estimate the quality of a marker data set with regard to its ability to describe the structure of the biological material under study. This statistic is called data resolution (DR). It is calculated by splitting a marker data set at random into two sets, each with half the number of markers. In each set, similarities between all pairs of objects are calculated. Subsequently, the similarities obtained for the two sets are correlated. This process is repeated a large number of times. The average of the correlation coefficients obtained in this way is the DR of the dataset. In the present paper, the DR statistic is applied to four studies involving amplified fragment length polymorphism as well as microsatellite markers. In addition, some properties and possible applications of DR are discussed, including the prediction of the added value of scoring additional markers, and the determination of which similarity measure is, apart from genetic considerations, most appropriate for analyzing the data.
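The DR procedure described in this abstract can be sketched in a few lines. The version below is a minimal illustration, assuming binary (0/1) marker scores, a simple matching coefficient as the similarity measure, and Pearson correlation between the two sets of pairwise similarities. The abstract leaves the similarity measure open, so these choices, the function names, and the default number of repeats are assumptions rather than the author's implementation.

```python
import random
from itertools import combinations
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences; None if undefined."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return None
    return cov / sqrt(var_x * var_y)


def simple_matching(a, b, markers):
    """Fraction of the given marker positions at which objects a and b agree."""
    return sum(a[m] == b[m] for m in markers) / len(markers)


def data_resolution(data, n_repeats=200, seed=0):
    """Estimate DR for `data`, a list of objects, each a list of marker scores.

    Each repeat splits the markers at random into two halves, computes all
    pairwise similarities within each half, and correlates the two sets of
    similarities. DR is the average correlation over the repeats.
    With an odd number of markers, one marker is left out in each repeat.
    Repeats with an undefined correlation are skipped (an assumption).
    """
    rng = random.Random(seed)
    n_markers = len(data[0])
    half = n_markers // 2
    pairs = list(combinations(range(len(data)), 2))
    correlations = []
    for _ in range(n_repeats):
        order = list(range(n_markers))
        rng.shuffle(order)
        first, second = order[:half], order[half:2 * half]
        sims_first = [simple_matching(data[i], data[j], first) for i, j in pairs]
        sims_second = [simple_matching(data[i], data[j], second) for i, j in pairs]
        r = pearson(sims_first, sims_second)
        if r is not None:
            correlations.append(r)
    if not correlations:
        return float("nan")
    return sum(correlations) / len(correlations)
```

A data set in which every random half of the markers reproduces the same pairwise similarities gives a DR close to 1, while noisy or sparse marker data pull the value toward 0.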

6.

GENE-counter: a computational pipeline for the analysis of RNA-Seq data for gene expression differences

Cumbie JS Kimbrel JA Di Y Schafer DW Wilhelm LJ Fox SE Sullivan CM Curzon AD Carrington JC Mockler TC Chang JH 《PloS one》2011,6(10):e25279

7.

GOToolBox: functional analysis of gene datasets based on Gene Ontology

Martin D Brun C Remy E Mouren P Thieffry D Jacq B 《Genome biology》2004,5(12):R101

We have developed methods and tools based on the Gene Ontology (GO) resource allowing the identification of statistically over- or under-represented terms in a gene dataset; the clustering of functionally related genes within a set; and the retrieval of genes sharing annotations with a query gene. GO annotations can also be constrained to a slim hierarchy or a given level of the ontology. The source code is available upon request and is distributed under the GPL license.

8.

SPdel: A pipeline to compare and visualize species delimitation methods for single-locus datasets

Jorge L. Ramirez Paola Valdivia Ulises Rosas-Puchuri Nereida L. Valdivia 《Molecular ecology resources》2023,23(8):1959-1965

Accurate species delimitation is critical for biological studies. In this context, the use of molecular techniques along with species delimitation methods would help achieve a rapid and accurate biodiversity assessment. Species delimitation methods cluster data sets of orthologous sequences into molecular operational taxonomic units (MOTUs). In particular, the methods based on a single gene are easily integrated with the widely used DNA barcoding approach. We developed SPdel, a user-friendly pipeline to integrate different single-gene species delimitation methods. SPdel is designed to calculate and compare MOTUs obtained by different species delimitation approaches. SPdel also outputs diverse publication-quality figures that facilitate the interpretation of results. SPdel aims to help researchers use species delimitation methods that would improve biodiversity studies.

9.

MADGene: retrieval and processing of gene identifier lists for the analysis of heterogeneous microarray datasets

Baron D Bihouée A Teusan R Dubois E Savagner F Steenman M Houlgatte R Ramstein G 《Bioinformatics (Oxford, England)》2011,27(5):725-726

MADGene is a software environment comprising a web-based database and a Java application. This platform aims at unifying gene identifiers (ids) and performing gene set analysis. MADGene allows the user to perform inter-conversion of clone and gene ids over a large range of nomenclatures relative to 17 species. We propose a set of 23 functions to facilitate the analysis of gene sets and we give two microarray applications to show how MADGene can be used to conduct meta-analyses. AVAILABILITY: The MADGene resources are freely available online from http://www.madtools.org, a website dedicated to the analysis and annotation of DNA microarray data.

10.

Tropomyosin is a nice marker gene for phylogenetic analysis of molluscs

Wang X Li L Xu F Zhang G 《Molecular biology reports》2011,38(7):4589-4593

Molluscs are an extraordinarily diverse group of animals, and discriminating them based on a single molecular marker/gene is very difficult because the rate of nucleotide substitution is either too fast or too slow. In this study, the tropomyosin coding sequences (CDS) of 43 animal species were analyzed. The results suggested that the tropomyosin gene is a suitable marker gene for phylogenetic analysis of molluscs, and even of all the studied animals. In addition, InDels (insertions and deletions) in the tropomyosin CDS of Turbo cornutus were studied, and one segment repeat, which probably happened recently and was of functional importance, was found.

11.

SHMTool: a webserver for comparative analysis of somatic hypermutation datasets

Maccarthy T Roa S Scharff MD Bergman A 《DNA Repair》2009,8(1):137-141

The somatic hypermutation (SHM) of Immunoglobulin variable (V) regions is a key process in the generation of antibody diversity. The growing number of datasets of point mutations that occur during SHM in mice and humans often include comparisons between wild-type and individuals or strains genetically defective in the repair mechanisms that contribute to SHM. However, it has been difficult to compare the results of different studies because the analyses have not been standardized for criteria such as correction for base composition and the inclusion of unique mutations. If many mutations are involved, the analysis can also be time-consuming. To overcome these problems and facilitate a standardized analysis and display of similar data, we present a webserver (SHMTool) for comparing SHM datasets, available at http://scb.aecom.yu.edu/shmtool.

12.

Repetitive elements as a transcriptomic marker of aging: Evidence in multiple datasets and models

Thomas J. LaRocca Alyssa N. Cavalier Devin Wahl 《Aging cell》2020,19(7)

13.

ANCHOR: a 16S rRNA gene amplicon pipeline for microbial analysis of multiple environmental samples

Emmanuel Gonzalez Frederic E. Pitre Nicholas J. B. Brereton 《Environmental microbiology》2019,21(7):2440-2468

Analysis of 16S ribosomal RNA (rRNA) gene amplification data for microbial barcoding can be inaccurate across complex environmental samples. A method, ANCHOR, is presented and designed for improved species-level microbial identification using paired-end sequences directly, multiple high-complexity samples and multiple reference databases. A standard operating procedure (SOP) is reported alongside benchmarking against artificial, single sample and replicated mock data sets. The method is then directly tested using a real-world data set from surface swabs of the International Space Station (ISS). Simple mock community analysis identified 100% of the expected species and 99% of expected gene copy variants (100% identical). A replicated mock community revealed similar or better numbers of expected species than MetaAmp, DADA2, Mothur and QIIME1. Analysis of the ISS microbiome identified 714 putative unique species/strains and differential abundance analysis distinguished significant differences between the Destiny module (U.S. laboratory) and Harmony module (sleeping quarters). Harmony was remarkably dominated by human gastrointestinal tract bacteria, similar to enclosed environments on earth; however, Destiny module bacteria also derived from nonhuman microbiome carriers present on the ISS, the laboratory's research animals. ANCHOR can help substantially improve sequence resolution of 16S rRNA gene amplification data within biologically replicated environmental experiments and integrated multidatabase annotation enhances interpretation of complex, nonreference microbiomes.

14.

A pipeline for the identification and characterization of chromatin modifications derived from ChIP-Seq datasets

Antony Kaspi Mark Ziemann Haloom Rafehi Ross Lazarus Assam El-Osta 《Biochimie》2012

The advent of massive parallel sequencing of immunopurified chromatin and its determinants has provided new avenues for researchers to map epigenome-wide changes, and there is tremendous interest in uncovering regulatory signatures to understand fundamental questions associated with chromatin structure and function. Indeed, the rapid development of large genome annotation projects has seen a resurgence in chromatin immunoprecipitation (ChIP) based protocols which are used to distinguish protein interactions coupled with large scale sequencing (Seq) to precisely map epigenome-wide interactions. Despite some of the great advances in our understanding of chromatin modifying complexes and their determinants, the development of ChIP-Seq technologies also poses specific demands on the integration of data for visualization, manipulation and analysis. In this article we discuss some of the considerations for experimental design planning, quality control, and bioinformatic analysis. The key aspects of post sequencing analysis are the identification of regions of interest, differentiation between biological conditions and the characterization of sequence differences for chromatin modifications. We provide an overview of best-practice approaches with background information and considerations of integrative analysis from ChIP-Seq experiments.

15.

MeQA: a pipeline for MeDIP-seq data quality assessment and analysis

Huang J Renault V Sengenès J Touleimat N Michel S Lathrop M Tost J 《Bioinformatics (Oxford, England)》2012,28(4):587-588

16.

Bioprocess monitoring by marker gene analysis

Schweder T 《Biotechnology journal》2011,6(8):926-933

The optimization and the scale-up of industrial fermentation processes require an efficient and possibly comprehensive analysis of the physiology of the production system throughout the process development. Furthermore, to ensure a good quality control of established bioprocesses, on-line analysis techniques for the determination of marker gene expression are of interest to monitor the productivity and the safety of bioprocesses. A prerequisite for such analyses is the knowledge of genes, the expression of which is critical either for the productivity or for the performance of the bioprocess. This work reviews marker genes that are specific indicators for stress- and nutrient-limitation conditions or for the physiological status of the bacterial production hosts Bacillus subtilis, Bacillus licheniformis and Escherichia coli. The suitability of existing gene expression analysis techniques for bioprocess monitoring is discussed. Analytical approaches that enable a robust and sensitive determination of selected marker mRNAs or proteins are presented.

17.

CAP-miRSeq: a comprehensive analysis pipeline for microRNA sequencing data

Zhifu Sun Jared Evans Aditya Bhagwate Sumit Middha Matthew Bockol Huihuang Yan Jean-Pierre Kocher 《BMC genomics》2014,15(1)

Background

miRNAs play a key role in normal physiology and various diseases. miRNA profiling through next generation sequencing (miRNA-seq) has become the main platform for biological research and biomarker discovery. However, analyzing miRNA sequencing data is challenging as it requires a significant amount of computational resources and bioinformatics expertise. Several web-based analytical tools have been developed, but they are limited to processing one sample or a pair of samples at a time and are not suitable for a large-scale study. Lack of flexibility and reliability of these web applications are also common issues.

Results

We developed a Comprehensive Analysis Pipeline for microRNA Sequencing data (CAP-miRSeq) that integrates read pre-processing, alignment, mature/precursor/novel miRNA detection and quantification, data visualization, variant detection in miRNA coding region, and more flexible differential expression analysis between experimental conditions. Depending on the available computational infrastructure, users can install the package locally or deploy it in Amazon Cloud, and run samples sequentially or in parallel to speed up the analysis of a large number of samples. In either case, summary and expression reports for all samples are generated for easier quality assessment and downstream analyses. Using well characterized data, we demonstrated the pipeline’s superior performance, flexibility, and practical use in research and biomarker discovery.

Conclusions

CAP-miRSeq is a powerful and flexible tool for users to process and analyze miRNA-seq data, scalable from a few to hundreds of samples. The results are presented in a convenient way for investigators or analysts to conduct further investigation and discovery.

Electronic supplementary material

The online version of this article (doi:10.1186/1471-2164-15-423) contains supplementary material, which is available to authorized users.

18.

Immunogenetic Management Software: a new tool for visualization and analysis of complex immunogenetic datasets

Johnson ZP Eady RD Ahmad SF Agravat S Morris T Else J Lank SM Wiseman RW O'Connor DH Penedo MC Larsen CP Kean LS 《Immunogenetics》2012,64(4):329-336

Here we describe the Immunogenetic Management Software (IMS) system, a novel web-based application that permits multiplexed analysis of complex immunogenetic traits that are necessary for the accurate planning and execution of experiments involving large animal models, including nonhuman primates. IMS is capable of housing complex pedigree relationships, microsatellite-based MHC typing data, as well as MHC pyrosequencing expression analysis of class I alleles. It includes a novel, automated MHC haplotype naming algorithm and an innovative visualization protocol that allows users to view multiple familial and MHC haplotype relationships through a single, interactive graphical interface. Detailed DNA and RNA-based data can also be queried and analyzed in a highly accessible fashion, and flexible search capabilities allow experimental choices to be made based on multiple, individualized and expandable immunogenetic factors. This web application is implemented in Java, MySQL, Tomcat, and Apache, with supported browsers including Internet Explorer and Firefox on Windows and Safari on Mac OS. The software is freely available for distribution to noncommercial users by contacting Leslie.kean@emory.edu. A demonstration site for the software is available at http://typing.emory.edu/typing_demo, user name: imsdemo7@gmail.com and password: imsdemo.

19.

HiChIP: a high-throughput pipeline for integrative analysis of ChIP-Seq data

Huihuang Yan Jared Evans Mike Kalmbach Raymond Moore Sumit Middha Stanislav Luban Liguo Wang Aditya Bhagwate Ying Li Zhifu Sun Xianfeng Chen Jean-Pierre A Kocher 《BMC bioinformatics》2014,15(1)

20.

VolPy: Automated and scalable analysis pipelines for voltage imaging datasets

Changjia Cai Johannes Friedrich Amrita Singh M. Hossein Eybposh Eftychios A. Pnevmatikakis Kaspar Podgorski Andrea Giovannucci 《PLoS computational biology》2021,17(4)

Voltage imaging enables monitoring neural activity at sub-millisecond and sub-cellular scale, unlocking the study of subthreshold activity, synchrony, and network dynamics with unprecedented spatio-temporal resolution. However, high data rates (>800 MB/s) and low signal-to-noise ratios create bottlenecks for analyzing such datasets. Here we present VolPy, an automated and scalable pipeline to pre-process voltage imaging datasets. VolPy features motion correction, memory mapping, automated segmentation, denoising and spike extraction, all built on a highly parallelizable, modular, and extensible framework optimized for memory and speed. To aid automated segmentation, we introduce a corpus of 24 manually annotated datasets from different preparations, brain areas and voltage indicators. We benchmark VolPy against ground truth segmentation, simulations and electrophysiology recordings, and we compare its performance with existing algorithms in detecting spikes. Our results indicate that VolPy’s performance in spike extraction and scalability are state-of-the-art.