1.

MCMC implementation of the optimal Bayesian classifier for non-Gaussian models: model-based RNA-Seq classification

Jason M Knight Ivan Ivanov Edward R Dougherty 《BMC bioinformatics》2014,15(1)

Background

Sequencing datasets consist of a finite number of reads which map to specific regions of a reference genome. Most effort in modeling these datasets focuses on the detection of univariate differentially expressed genes. However, for classification, we must consider multiple genes and their interactions.

Results

Thus, we introduce a hierarchical multivariate Poisson (MP) model and the associated optimal Bayesian classifier (OBC) for classifying samples using sequencing data. Lacking closed-form solutions, we employ a Markov chain Monte Carlo (MCMC) approach to perform classification. We demonstrate superior or equivalent classification performance compared to typical classifiers for two synthetic datasets and over a range of classification problem difficulties. We also introduce the Bayesian minimum mean squared error (MMSE) conditional error estimator and demonstrate its computation over the feature space. In addition, we demonstrate superior or class-leading performance on an RNA-Seq dataset containing two lung cancer tumor types from The Cancer Genome Atlas (TCGA).

Conclusions

Through model-based, optimal Bayesian classification, we demonstrate superior classification performance for both synthetic and real RNA-Seq datasets. A tutorial video and Python source code are available under an open source license at http://bit.ly/1gimnss.

Electronic supplementary material

The online version of this article (doi:10.1186/s12859-014-0401-3) contains supplementary material, which is available to authorized users.

2.

IM-TORNADO: A Tool for Comparison of 16S Reads from Paired-End Libraries

Patricio Jeraldo Krishna Kalari Xianfeng Chen Jaysheel Bhavsar Ashutosh Mangalam Bryan White Heidi Nelson Jean-Pierre Kocher Nicholas Chia 《PloS one》2014,9(12)

Motivation

16S rDNA hypervariable tag sequencing has become the de facto method for assessing microbial diversity. Illumina paired-end sequencing, which produces two separate reads for each DNA fragment, has become the platform of choice for this application. However, when the two reads do not overlap, existing computational pipelines analyze data from each read separately and underutilize the information contained in the paired-end reads.

Results

We created a workflow known as Illinois Mayo Taxon Organization from RNA Dataset Operations (IM-TORNADO) for processing non-overlapping reads while retaining maximal information content. Using synthetic mock datasets, we show that the use of both reads produced answers with greater correlation to those from full length 16S rDNA when looking at taxonomy, phylogeny, and beta-diversity.

Availability and Implementation

IM-TORNADO is freely available at http://sourceforge.net/projects/imtornado and produces BIOM format output for cross compatibility with other pipelines such as QIIME, mothur, and phyloseq.

3.

Corset: enabling differential gene expression analysis for de novo assembled transcriptomes

Nadia M Davidson Alicia Oshlack 《Genome biology》2014,15(7)

4.

Comparison of Metatranscriptomic Samples Based on k-Tuple Frequencies

Ying Wang Lin Liu Lina Chen Ting Chen Fengzhu Sun 《PloS one》2014,9(1)

5.

deGPS is a powerful tool for detecting differential expression in RNA-sequencing studies

Chen Chu Zhaoben Fang Xing Hua Yaning Yang Enguo Chen Allen W. Cowley Jr. Mingyu Liang Pengyuan Liu Yan Lu 《BMC genomics》2015,16(1)

6.

MaxSSmap: a GPU program for mapping divergent short reads to genomes with the maximum scoring subsequence

Turki Turki Usman Roshan 《BMC genomics》2014,15(1)

Background

Programs based on hash tables and the Burrows-Wheeler transform are very fast for mapping short reads to genomes but have low accuracy in the presence of mismatches and gaps. Such reads can be aligned accurately with the Smith-Waterman algorithm, but it can take hours or days to map millions of reads, even for bacterial genomes.

Results

We introduce a GPU program called MaxSSmap with the aim of achieving accuracy comparable to Smith-Waterman but with faster runtimes. Like most programs, MaxSSmap first identifies a local region of the genome and then performs exact alignment. Instead of using hash tables or Burrows-Wheeler in the first part, MaxSSmap calculates the maximum scoring subsequence score between the read and disjoint fragments of the genome in parallel on a GPU and selects the highest scoring fragment for exact alignment. We evaluate MaxSSmap's accuracy and runtime when mapping simulated Illumina E. coli and human chromosome one reads of different lengths and 10% to 30% mismatches with gaps to the E. coli genome and human chromosome one. We also demonstrate applications on real data by mapping ancient horse DNA reads to modern genomes and unmapped paired reads from NA12878 in 1000 Genomes.
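The fragment-selection step described above can be illustrated with a minimal CPU sketch in Python. It is not the MaxSSmap implementation: the scoring values (+1 match, -1 mismatch), the ungapped comparison, the fragment length, and the function names `max_scoring_subsequence` and `best_fragment` are all assumptions made for illustration, and the per-fragment loop stands in for work that the abstract says is done in parallel on a GPU.

```python
def max_scoring_subsequence(scores):
    """Largest sum of any contiguous run of scores (0 if every run is negative)."""
    best = current = 0
    for s in scores:
        # Restart the run whenever its running sum would drop below zero.
        current = max(0, current + s)
        best = max(best, current)
    return best


def best_fragment(read, genome, fragment_len, match=1, mismatch=-1):
    """Return (fragment_start, score) for the highest scoring genome fragment.

    Scoring values and the ungapped comparison are illustrative assumptions.
    """
    best_start, best_score = -1, -1
    # Disjoint fragments are scored independently of one another.
    for start in range(0, len(genome), fragment_len):
        frag_best = 0
        end = min(start + fragment_len, len(genome))
        for offset in range(start, end):
            window = genome[offset:offset + len(read)]
            scores = [match if a == b else mismatch for a, b in zip(read, window)]
            frag_best = max(frag_best, max_scoring_subsequence(scores))
        if frag_best > best_score:
            best_start, best_score = start, frag_best
    return best_start, best_score


if __name__ == "__main__":
    genome = "TTTTTTTTACGTACGGTTTTTTTT"
    print(best_fragment("ACGTACGG", genome, fragment_len=8))
```

In this sketch the selected fragment would then be handed to an exact aligner; that second stage is left out here.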

Conclusions

We show that MaxSSmap attains high accuracy and low error comparable to fast Smith-Waterman programs yet has much lower runtimes. We show that MaxSSmap can map reads rejected by BWA and NextGenMap with high accuracy and low error much faster than if Smith-Waterman were used. At short read lengths of 36 and 51, both MaxSSmap and Smith-Waterman have lower accuracy than at longer read lengths. On real data, MaxSSmap produces many alignments with high score and mapping quality that are not given by NextGenMap and BWA. The MaxSSmap source code in CUDA and OpenCL is freely available from http://www.cs.njit.edu/usman/MaxSSmap.

Electronic supplementary material

The online version of this article (doi:10.1186/1471-2164-15-969) contains supplementary material, which is available to authorized users.

7.

Global Expression Profiling in Atopic Eczema Reveals Reciprocal Expression of Inflammatory and Lipid Genes

Annika M. Sääf Maria Tengvall-Linder Howard Y. Chang Adam S. Adler Carl-Fredrik Wahlgren Annika Scheynius Magnus Nordenskjöld Maria Bradley 《PloS one》2008,3(12)

8.

aTRAM - automated target restricted assembly method: a fast method for assembling loci across divergent taxa from next-generation sequencing data

Julie M Allen Daisie I Huang Quentin C Cronk Kevin P Johnson 《BMC bioinformatics》2015,16(1)

Background

Assembling genes from next-generation sequencing data is not only time-consuming but also computationally difficult, particularly for taxa without a closely related reference genome. Assembling even a draft genome using de novo approaches can take days, even on a powerful computer, and these assemblies typically require data from a variety of genomic libraries. Here we describe software that alleviates these issues by rapidly assembling genes from distantly related taxa using a single library of paired-end reads: aTRAM, automated Target Restricted Assembly Method. The aTRAM pipeline uses a reference sequence, BLAST, and an iterative approach to target and locally assemble the genes of interest.

Results

Our results demonstrate that aTRAM rapidly assembles genes across distantly related taxa. In comparative tests with a closely related taxon, aTRAM assembled the same sequence as reference-based and de novo approaches, taking on average <1 min per gene. As a test case with divergent sequences, we assembled >1,000 genes from six taxa that diverged from the reference taxon 25–110 million years ago. Gene recovery was between 97% and 99% for each taxon.

Conclusions

aTRAM can quickly assemble genes across distantly related taxa, obviating the need for draft genome assembly of all taxa of interest. Because aTRAM uses a targeted approach, loci can be assembled in minutes depending on the size of the target. Our results suggest that this software will be useful in rapidly assembling genes for phylogenomic projects covering a wide taxonomic range, as well as other applications. The software is freely available at http://www.github.com/juliema/aTRAM.

Electronic supplementary material

The online version of this article (doi:10.1186/s12859-015-0515-2) contains supplementary material, which is available to authorized users.

9.

VTBuilder: a tool for the assembly of multi isoform transcriptomes

John Archer Gareth Whiteley Nicholas R Casewell Robert A Harrison Simon C Wagstaff 《BMC bioinformatics》2014,15(1)

10.

Integrating biological pathways and genomic profiles with ChiBE 2

Özgün Babur Ugur Dogrusoz Merve Çakır Bülent Arman Aksoy Nikolaus Schultz Chris Sander Emek Demir 《BMC genomics》2014,15(1)

Background

Dynamic visual exploration of detailed pathway information can help researchers digest and interpret complex mechanisms and genomic datasets.

Results

ChiBE is a free, open-source software tool for visualizing, querying, and analyzing human biological pathways in BioPAX format. The recently released version 2 can search for neighborhoods, paths between molecules, and common regulators/targets of molecules, on large integrated cellular networks in the Pathway Commons database as well as in local BioPAX models. Resulting networks can be automatically laid out for visualization using a graphically rich, process-centric notation. Profiling data from the cBioPortal for Cancer Genomics and expression data from the Gene Expression Omnibus can be overlaid on these networks.

Conclusions

ChiBE’s new capabilities are organized around a genomics-oriented workflow and offer a unique comprehensive pathway analysis solution for genomics researchers. The software is freely available at http://code.google.com/p/chibe.

11.

From Gigabyte to Kilobyte: A Bioinformatics Protocol for Mining Large RNA-Seq Transcriptomics Data

Jilong Li Jie Hou Lin Sun Jordan Maximillian Wilkins Yuan Lu Chad E. Niederhuth Benjamin Ryan Merideth Thomas P. Mawhinney Valeri V. Mossine C. Michael Greenlief John C. Walker William R. Folk Mark Hannink Dennis B. Lubahn James A. Birchler Jianlin Cheng 《PloS one》2015,10(4)

RNA-Seq techniques generate hundreds of millions of short RNA reads using next-generation sequencing (NGS). These RNA reads can be mapped to reference genomes to investigate changes in gene expression, but improved procedures are needed for mining large RNA-Seq datasets to extract valuable biological knowledge. RNAMiner, a multi-level bioinformatics protocol and pipeline, has been developed for such datasets. It includes five steps: mapping RNA-Seq reads to a reference genome, calculating gene expression values, identifying differentially expressed genes, predicting gene functions, and constructing gene regulatory networks. To demonstrate its utility, we applied RNAMiner to datasets generated from human, mouse, Arabidopsis thaliana, and Drosophila melanogaster cells, and successfully identified differentially expressed genes, clustered them into cohesive functional groups, and constructed novel gene regulatory networks. The RNAMiner web service is available at http://calla.rnet.missouri.edu/rnaminer/index.html.

12.

trieFinder: an efficient program for annotating Digital Gene Expression (DGE) tags

Gabriel Renaud Matthew C LaFave Jin Liang Tyra G Wolfsberg Shawn M Burgess 《BMC bioinformatics》2014,15(1)

13.

Analyzing allele specific RNA expression using mixture models

Rong Lu Ryan M Smith Michal Seweryn Danxin Wang Katherine Hartmann Amy Webb Wolfgang Sadee Grzegorz A Rempala 《BMC genomics》2015,16(1)

14.

RNA-Seq profile of flavescence dorée phytoplasma in grapevine

Simona Abbà Luciana Galetto Patricia Carle Sébastien Carrère Massimo Delledonne Xavier Foissac Sabrina Palmano Flavio Veratti Cristina Marzachì 《BMC genomics》2014,15(1)

15.

Gene expression profiles responses to aphid feeding in chrysanthemum (Chrysanthemum morifolium)

Xiaolong Xia Yafeng Shao Jiafu Jiang Liping Ren Fadi Chen Weimin Fang Zhiyong Guan Sumei Chen 《BMC genomics》2014,15(1)

16.

Integrating dilution-based sequencing and population genotypes for single individual haplotyping

Hirotaka Matsumoto Hisanori Kiryu 《BMC genomics》2014,15(1)

17.

QoRTs: a comprehensive toolset for quality control and data processing of RNA-Seq experiments

Stephen W. Hartley James C. Mullikin 《BMC bioinformatics》2015,16(1)

18.

Masking as an effective quality control method for next-generation sequencing data analysis

Sajung Yun Sijung Yun 《BMC bioinformatics》2014,15(1)

19.

Safety and Immunogenicity of DNA and MVA HIV-1 Subtype C Vaccine Prime-Boost Regimens: A Phase I Randomised Trial in HIV-Uninfected Indian Volunteers

Sanjay Mehendale Madhuri Thakar Seema Sahay Makesh Kumar Ashwini Shete Pattabiraman Sathyamurthi Amita Verma Swarali Kurle Aparna Shrotri Jill Gilmour Rajat Goyal Len Dally Eddy Sayeed Devika Zachariah James Ackland Sonali Kochhar Josephine H. Cox Jean-Louis Excler Vasanthapuram Kumaraswami Ramesh Paranjape Vadakkuppatu Devasenapathi Ramanathan 《PloS one》2013,8(2)

Study Design

A randomized, double-blind, placebo-controlled phase I trial.

Methods

The trial was conducted in 32 HIV-uninfected healthy volunteers to assess the safety and immunogenicity of prime-boost vaccination regimens with either 2 doses of ADVAX, a DNA vaccine containing Chinese HIV-1 subtype C env gp160, gag, pol and nef/tat genes, as a prime and 2 doses of TBC-M4, a recombinant MVA encoding Indian HIV-1 subtype C env gp160, gag, RT, rev, tat, and nef genes, as a boost in Group A or 3 doses of TBC-M4 alone in Group B participants. Out of 16 participants in each group, 12 received vaccine candidates and 4 received placebos.

Results

Both vaccine regimens were found to be generally safe and well tolerated. The breadth of anti-HIV binding antibodies and the titres of anti-HIV neutralizing antibodies were significantly higher (p<0.05) in Group B volunteers at 14 days after the last vaccination. Neutralizing antibodies were detected mainly against Tier-1 subtype B and C viruses. HIV-specific IFN-γ ELISPOT responses were directed mostly to Env and Gag proteins. Although the IFN-γ ELISPOT responses were infrequent after ADVAX vaccinations, the response rate was significantly higher in Group A after the 1st and 2nd MVA doses than in Group B volunteers. However, the priming effect was short-lasting, leading to no difference in the frequency, breadth and magnitude of IFN-γ ELISPOT responses between the groups at 3, 6 and 9 months after the last vaccination.

Conclusions

Although DNA priming enhanced immune responses after the 1st MVA boost, the overall DNA prime and MVA boost regimen was not found to be immunologically superior to homologous MVA boosting.

Trial Registration

Clinical Trial Registry CTRI/2009/091/000051

20.

RNA-Seq versus oligonucleotide array assessment of dose-dependent TCDD-elicited hepatic gene expression in mice

Rance Nault Kelly A Fader Tim Zacharewski 《BMC genomics》2015,16(1)
