1.

An investigation of biomarkers derived from legacy microarray data for their utility in the RNA-seq era

Zhenqiang Su Hong Fang Huixiao Hong Leming Shi Wenqian Zhang Wenwei Zhang Yanyan Zhang Zirui Dong Lee J Lancashire Marina Bessarabova Xi Yang Baitang Ning Binsheng Gong Joe Meehan Joshua Xu Weigong Ge Roger Perkins Matthias Fischer Weida Tong 《Genome biology》2014,15(12)

Background

Gene expression microarray has been the primary biomarker platform ubiquitously applied in biomedical research, resulting in enormous data, predictive models, and biomarkers accrued. Recently, RNA-seq has looked likely to replace microarrays, but there will be a period where both technologies co-exist. This raises two important questions: Can microarray-based models and biomarkers be directly applied to RNA-seq data? Can future RNA-seq-based predictive models and biomarkers be applied to microarray data to leverage past investment?

Results

We systematically evaluated the transferability of predictive models and signature genes between microarray and RNA-seq using two large clinical data sets. The complexity of cross-platform sequence correspondence was considered in the analysis and examined using three human and two rat data sets, and three levels of mapping complexity were revealed. Three algorithms representing different modeling complexity were applied to the three levels of mappings for each of the eight binary endpoints and Cox regression was used to model survival times with expression data. In total, 240,096 predictive models were examined.

Conclusions

Signature genes of predictive models are reciprocally transferable between microarray and RNA-seq data for model development, and microarray-based models can accurately predict RNA-seq-profiled samples; while RNA-seq-based models are less accurate in predicting microarray-profiled samples and are affected both by the choice of modeling algorithm and the gene mapping complexity. The results suggest continued usefulness of legacy microarray data and established microarray biomarkers and predictive models in the forthcoming RNA-seq era.

Electronic supplementary material

The online version of this article (doi:10.1186/s13059-014-0523-y) contains supplementary material, which is available to authorized users.

2.

Skeletal muscle alterations and exercise performance decrease in erythropoietin-deficient mice: a comparative study

Laurence Mille-Hamard Veronique L Billat Elodie Henry Blandine Bonnamy Florence Joly Philippe Benech Eric Barrey 《BMC medical genomics》2012,5(1):1-20

3.

Single-cell mRNA sequencing identifies subclonal heterogeneity in anti-cancer drug responses of lung adenocarcinoma cells

Kyu-Tae Kim Hye Won Lee Hae-Ock Lee Sang Cheol Kim Yun Jee Seo Woosung Chung Hye Hyeon Eum Do-Hyun Nam Junhyong Kim Kyeung Min Joo Woong-Yang Park 《Genome biology》2015,16(1)

4.

An empirical strategy to detect bacterial transcript structure from directional RNA-seq transcriptome data

Yejun Wang Keith D MacKenzie Aaron P White 《BMC genomics》2015,16(1)

5.

Comparative assessment of methods for the computational inference of transcript isoform abundance from RNA-seq data

Alexander Kanitz Foivos Gypas Andreas J. Gruber Andreas R. Gruber Georges Martin Mihaela Zavolan 《Genome biology》2015,16(1)

6.

Comparison of stranded and non-stranded RNA-seq transcriptome profiling and investigation of gene overlap

Shanrong Zhao Ying Zhang William Gordon Jie Quan Hualin Xi Sarah Du David von Schack Baohong Zhang 《BMC genomics》2015,16(1)

7.

Multi-platform assessment of transcriptional profiling technologies utilizing a precise probe mapping methodology

Jinsheng Yu Paul F. Cliften Twyla I. Juehne Toni M. Sinnwell Chris S. Sawyer Mala Sharma Andrew Lutz Eric Tycksen Mark R. Johnson Matthew R. Minton Elliott T. Klotz Andrew E. Schriefer Wei Yang Michael E. Heinz Seth D. Crosby Richard D. Head 《BMC genomics》2015,16(1)

8.

Deep transcriptome sequencing provides new insights into the structural and functional organization of the wheat genome

Lise Pingault Frédéric Choulet Adriana Alberti Natasha Glover Patrick Wincker Catherine Feuillet Etienne Paux 《Genome biology》2015,16(1)

9.

Shifting from population-wide to personalized cancer prognosis with microarrays

Shao L Fan X Cheng N Wu L Xiong H Fang H Ding D Shi L Cheng Y Tong W 《PloS one》2012,7(1):e29534

The era of personalized medicine for cancer therapeutics has taken an important step forward in making accurate prognoses for individual patients with the adoption of high-throughput microarray technology. However, microarray technology in cancer diagnosis or prognosis has been primarily used for the statistical evaluation of patient populations, and thus excludes inter-individual variability and patient-specific predictions. Here we propose a metric called clinical confidence that serves as a measure of prognostic reliability to facilitate the shift from population-wide to personalized cancer prognosis using microarray-based predictive models. The performance of sample-based models predicted with different clinical confidences was evaluated and compared systematically using three large clinical datasets studying the following cancers: breast cancer, multiple myeloma, and neuroblastoma. Survival curves for patients, with different confidences, were also delineated. The results show that the clinical confidence metric separates patients with different prediction accuracies and survival times. Samples with high clinical confidence were likely to have accurate prognoses from predictive models. Moreover, patients with high clinical confidence would be expected to live for a notably longer or shorter time if their prognosis was good or grim based on the models, respectively. We conclude that clinical confidence could serve as a beneficial metric for personalized cancer prognosis prediction utilizing microarrays. Ascribing a confidence level to prognosis with the clinical confidence metric provides the clinician an objective, personalized basis for decisions, such as choosing the severity of the treatment.

10.

CodingQuarry: highly accurate hidden Markov model gene prediction in fungal genomes using RNA-seq transcripts

Alison C Testa James K Hane Simon R Ellwood Richard P Oliver 《BMC genomics》2015,16(1)

11.

Oxytocin stimulates expression of a noncoding RNA tumor marker in a human neuroblastoma cell line

Taka-aki Koshimizu Yoko Fujiwara Nobuya Sakai Katsushi Shibata Hiroyoshi Tsuchiya 《Life sciences》2010,86(11-12):455-460

12.

Dynamics of gene silencing during X inactivation using allele-specific RNA-seq

Hendrik Marks Hindrik H. D. Kerstens Tahsin Stefan Barakat Erik Splinter René A. M. Dirks Guido van Mierlo Onkar Joshi Shuang-Yin Wang Tomas Babak Cornelis A. Albers Tüzer Kalkan Austin Smith Alice Jouneau Wouter de Laat Joost Gribnau Hendrik G. Stunnenberg 《Genome biology》2015,16(1)

13.

Pattern of alternative splicing different associated with difference in rooting depth in rice

Wei Haibin Lou Qiaojun Xu Kai Zhou Liguo Chen Shoujun Chen Liang Luo Lijun 《Plant and Soil》2020,449(1-2):233-248

14.

Experimental validation of methods for differential gene expression analysis and sample pooling in RNA-seq

Anto P. Rajkumar Per Qvist Ross Lazarus Francesco Lescai Jia Ju Mette Nyegaard Ole Mors Anders D. Børglum Qibin Li Jane H. Christensen 《BMC genomics》2015,16(1)

Background

Massively parallel cDNA sequencing (RNA-seq) experiments are gradually superseding microarrays in quantitative gene expression profiling. However, many biologists are uncertain about the choice of differentially expressed gene (DEG) analysis methods and the validity of cost-saving sample pooling strategies for their RNA-seq experiments. Hence, we performed experimental validation of DEGs identified by Cuffdiff2, edgeR, DESeq2 and Two-stage Poisson Model (TSPM) in a RNA-seq experiment involving mice amygdalae micro-punches, using high-throughput qPCR on independent biological replicate samples. Moreover, we sequenced RNA-pools and compared their results with sequencing corresponding individual RNA samples.

Results

False-positivity rate of Cuffdiff2 and false-negativity rates of DESeq2 and TSPM were high. Among the four investigated DEG analysis methods, sensitivity and specificity of edgeR was relatively high. We documented the pooling bias and that the DEGs identified in pooled samples suffered low positive predictive values.

Conclusions

Our results highlighted the need for combined use of more sensitive DEG analysis methods and high-throughput validation of identified DEGs in future RNA-seq experiments. They indicated limited utility of sample pooling strategies for RNA-seq in similar setups and supported increasing the number of biological replicate samples.

Electronic supplementary material

The online version of this article (doi:10.1186/s12864-015-1767-y) contains supplementary material, which is available to authorized users.
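
The abstract above compares methods by sensitivity, specificity, false-positive and false-negative rates, and positive predictive value. As a reminder of how these quantities relate, here is a minimal Python sketch that derives them from validation counts. The function name `validation_metrics` and the example counts are hypothetical and are not taken from the paper.

```python
import math


def validation_metrics(tp: int, fp: int, tn: int, fn: int) -> dict[str, float]:
    """Standard metrics from counts of validated calls.

    tp/fp: genes called differentially expressed that did / did not validate.
    tn/fn: genes not called that were truly unchanged / truly changed.
    Returns NaN for a metric whose denominator is zero.
    """
    def ratio(num: int, den: int) -> float:
        return num / den if den else math.nan

    return {
        "sensitivity": ratio(tp, tp + fn),          # true positive rate
        "specificity": ratio(tn, tn + fp),          # true negative rate
        "ppv": ratio(tp, tp + fp),                  # positive predictive value
        "false_positive_rate": ratio(fp, fp + tn),  # 1 - specificity
        "false_negative_rate": ratio(fn, fn + tp),  # 1 - sensitivity
    }


# Made-up counts, for illustration only.
metrics = validation_metrics(tp=40, fp=10, tn=80, fn=20)
```

A method with a high false-positive rate has low specificity, and a method with a high false-negative rate has low sensitivity, which is why the abstract can summarize the four tools along these two axes.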

15.

Quantification of gene expression while taking into account RNA alternative splicing

《Genomics》2019,111(6):1517-1528

16.

Microarray reality checks in the context of a complex disease (total citations: 9; self-citations: 0; citations by others: 9)

Miklos GL Maleszka R 《Nature biotechnology》2004,22(5):615-621

A problem in analyzing microarray-based gene expression data is the separation of genes causally involved in a disease from innocent bystander genes, whose expression levels have been secondarily altered by primary changes elsewhere. To investigate this issue systematically in the context of a class of complex human diseases, we have compared microarray-based gene expression data with non-microarray-based clinical and biological data about the schizophrenias to ask whether these two approaches prioritize the same genes. We find that genes whose expression changes are deemed to be of importance from microarrays are rarely those classified as of importance from clinical, in situ, molecular, single-nucleotide polymorphism (SNP) association, knockout and drug perturbation data. This disparity is not limited to the schizophrenias but characterizes other human disease data sets. It also extends to biological validation of microarray data in model organisms, in which genome-wide phenotypic data have been systematically compared with microarray data. In addition, different bioinformatic protocols applied to the same microarray data yield quite different gene sets and thus make clinical decisions less straightforward. We discuss how progress may be improved in the clinical area by the assignment of high-quality phenotypic values to each member of a microarray-assigned gene set.

17.

Gene Expression Ratios Lead to Accurate and Translatable Predictors of DR5 Agonism across Multiple Tumor Lineages

Anupama Reddy Joseph D. Growney Nick S. Wilson Caroline M. Emery Jennifer A. Johnson Rebecca Ward Kelli A. Monaco Joshua Korn John E. Monahan Mark D. Stump Felipa A. Mapa Christopher J. Wilson Janine Steiger Jebediah Ledell Richard J. Rickles Vic E. Myer Seth A. Ettenberg Robert Schlegel William R. Sellers Heather A. Huet Joseph Lehár 《PloS one》2015,10(9)

Death Receptor 5 (DR5) agonists demonstrate anti-tumor activity in preclinical models but have yet to demonstrate robust clinical responses. A key limitation may be the lack of patient selection strategies to identify those most likely to respond to treatment. To overcome this limitation, we screened a DR5 agonist Nanobody across >600 cell lines representing 21 tumor lineages and assessed molecular features associated with response. High expression of DR5 and Casp8 were significantly associated with sensitivity, but their expression thresholds were difficult to translate due to low dynamic ranges. To address the translational challenge of establishing thresholds of gene expression, we developed a classifier based on ratios of genes that predicted response across lineages. The ratio classifier outperformed the DR5+Casp8 classifier, as well as standard approaches for feature selection and classification using genes, instead of ratios. This classifier was independently validated using 11 primary patient-derived pancreatic xenograft models showing perfect predictions as well as a striking linearity between prediction probability and anti-tumor response. A network analysis of the genes in the ratio classifier captured important biological relationships mediating drug response, specifically identifying key positive and negative regulators of DR5 mediated apoptosis, including DR5, CASP8, BID, cFLIP, XIAP and PEA15. Importantly, the ratio classifier shows translatability across gene expression platforms (from Affymetrix microarrays to RNA-seq) and across model systems (in vitro to in vivo). Our approach of using gene expression ratios presents a robust and novel method for constructing translatable biomarkers of compound response, which can also probe the underlying biology of treatment response.
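
One reason within-sample gene ratios can carry over between platforms is that any per-sample multiplicative scale factor cancels in the ratio. The Python sketch below illustrates only that property. It is not the authors' classifier: the function name `log_ratio_features`, the choice of gene pairs, and the expression values are assumptions made for illustration, with gene symbols borrowed from the abstract.

```python
import math
from itertools import combinations


def log_ratio_features(expr: dict[str, float], genes: list[str]) -> dict[str, float]:
    """Return log2(a / b) for every unordered pair (a, b) of the given genes.

    Assumes strictly positive, linear-scale expression values.
    """
    feats = {}
    for a, b in combinations(genes, 2):
        feats[f"{a}/{b}"] = math.log2(expr[a]) - math.log2(expr[b])
    return feats


genes = ["DR5", "CASP8", "XIAP"]

# Made-up expression values for one sample.
sample = {"DR5": 8.0, "CASP8": 4.0, "XIAP": 2.0}

# The same sample measured on a hypothetical scale that is 10x larger.
rescaled = {gene: 10.0 * value for gene, value in sample.items()}

original_feats = log_ratio_features(sample, genes)
rescaled_feats = log_ratio_features(rescaled, genes)
# The two feature dictionaries agree up to floating-point error,
# whereas the raw per-gene values differ by a factor of 10.
```

A threshold set on a single gene's expression would need recalibrating after such a rescaling, while a threshold on a ratio would not. Real platform differences are more complex than one scale factor, so this shows the motivation rather than a guarantee.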

18.

Bovine and murine tissue expression of insulin like growth factor-I

A.M. Oberbauer J.M. Belanger G. Rincon A. Cánovas A. Islas-Trejo R. Gularte-Mérida M.G. Thomas J.F. Medrano 《Gene》2014

19.

Accurate Prediction of Transposon-Derived piRNAs by Integrating Various Sequential and Physicochemical Features

Longqiang Luo Dingfang Li Wen Zhang Shikui Tu Xiaopeng Zhu Gang Tian 《PloS one》2016,11(4)

Background

Piwi-interacting RNA (piRNA) is the largest class of small non-coding RNA molecules. The transposon-derived piRNA prediction can enrich the research contents of small ncRNAs as well as help to further understand generation mechanism of gamete.

Methods

In this paper, we attempt to differentiate transposon-derived piRNAs from non-piRNAs based on their sequential and physicochemical features by using machine learning methods. We explore six sequence-derived features, i.e. spectrum profile, mismatch profile, subsequence profile, position-specific scoring matrix, pseudo dinucleotide composition and local structure-sequence triplet elements, and systematically evaluate their performances for transposon-derived piRNA prediction. Finally, we consider two approaches: direct combination and ensemble learning to integrate useful features and achieve high-accuracy prediction models.

Results

We construct three datasets, covering three species: Human, Mouse and Drosophila, and evaluate the performances of prediction models by 10-fold cross validation. In the computational experiments, direct combination models achieve AUC of 0.917, 0.922 and 0.992 on Human, Mouse and Drosophila, respectively; ensemble learning models achieve AUC of 0.922, 0.926 and 0.994 on the three datasets.

Conclusions

Compared with other state-of-the-art methods, our methods can lead to better performances. In conclusion, the proposed methods are promising for the transposon-derived piRNA prediction. The source codes and datasets are available in S1 File.
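
Of the six sequence-derived features listed in the abstract above, the spectrum profile is the simplest to illustrate: it represents a sequence by the frequencies of all its length-k substrings (k-mers). The Python sketch below is a generic version of that idea, not the authors' implementation from S1 File. The function name `spectrum_profile`, the RNA alphabet, the default k, and the normalization to frequencies are assumptions.

```python
from collections import Counter
from itertools import product

ALPHABET = "ACGU"  # assumed RNA alphabet; DNA input is converted below


def spectrum_profile(seq: str, k: int = 2) -> list[float]:
    """Return k-mer frequencies of seq, ordered lexicographically by k-mer.

    The vector has len(ALPHABET) ** k entries. Windows containing characters
    outside the alphabet (for example N) are ignored.
    """
    seq = seq.upper().replace("T", "U")
    kmers = ["".join(p) for p in product(ALPHABET, repeat=k)]
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    total = sum(counts[kmer] for kmer in kmers)
    if total == 0:
        return [0.0] * len(kmers)
    return [counts[kmer] / total for kmer in kmers]


# A 1-mer profile is simply the base composition.
composition = spectrum_profile("ACGU", k=1)  # [0.25, 0.25, 0.25, 0.25]

# A 2-mer profile of a hypothetical short sequence has 16 entries.
profile = spectrum_profile("UGAGGUAGUAGGUUGUAUAGUU", k=2)
```

Fixed-length vectors like this can be fed to any standard classifier, which is what makes combining several feature types, by concatenation or by an ensemble, straightforward.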

20.

RNA-Seq vs Dual- and Single-Channel Microarray Data: Sensitivity Analysis for Differential Expression and Clustering

Alina Sîrbu Gráinne Kerr Martin Crane Heather J. Ruskin 《PloS one》2012,7(12)

With the fast development of high-throughput sequencing technologies, a new generation of genome-wide gene expression measurements is under way. This is based on mRNA sequencing (RNA-seq), which complements the already mature technology of microarrays, and is expected to overcome some of the latter’s disadvantages. These RNA-seq data pose new challenges, however, as strengths and weaknesses have yet to be fully identified. Ideally, Next (or Second) Generation Sequencing measures can be integrated for more comprehensive gene expression investigation to facilitate analysis of whole regulatory networks. At present, however, the nature of these data is not very well understood. In this paper we study three alternative gene expression time series datasets for the Drosophila melanogaster embryo development, in order to compare three measurement techniques: RNA-seq, single-channel and dual-channel microarrays. The aim is to study the state of the art for the three technologies, with a view of assessing overlapping features, data compatibility and integration potential, in the context of time series measurements. This involves using established tools for each of the three different technologies, and technical and biological replicates (for RNA-seq and microarrays, respectively), due to the limited availability of biological RNA-seq replicates for time series data. The approach consists of a sensitivity analysis for differential expression and clustering. In general, the RNA-seq dataset displayed highest sensitivity to differential expression. The single-channel data performed similarly for the differentially expressed genes common to gene sets considered. Cluster analysis was used to identify different features of the gene space for the three datasets, with higher similarities found for the RNA-seq and single-channel microarray dataset.