RNA sequencing (RNA-seq) is the current gold-standard method to quantify gene expression for expression quantitative trait locus (eQTL) studies. However, a potential caveat in these studies is that RNA-seq reads carrying the non-reference allele at variant loci may be less likely to map correctly to the reference genome, which could bias gene quantifications and cause false positive eQTL associations. In this study, we analyze the effect of this allelic mapping bias on eQTL discovery.

Results

We simulate RNA-seq read mapping over 9.5 M common SNPs and indels, with 15.6% of variants showing a biased mapping rate for reference versus non-reference reads. However, removing potentially biased RNA-seq reads from an eQTL dataset of 185 individuals has a very small effect on gene and exon quantifications and on eQTL discovery. We detect only a handful of likely false positive eQTLs, and overall eQTL SNPs show no significant enrichment for high mapping bias.

Conclusion

Our results suggest that RNA-seq quantifications are generally robust against allelic mapping bias, and that this does not have a severe effect on eQTL discovery. Nevertheless, we provide our catalog of putatively biased loci to allow better controlling for mapping bias to obtain more accurate results in future RNA-seq studies.
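A catalog of putatively biased loci of the kind mentioned above could, in principle, be derived by comparing how often simulated reference and non-reference reads map correctly at each variant. The sketch below is only an illustration under stated assumptions: the input layout, the function names and the 0.05 threshold are hypothetical and are not taken from the study.

```python
def mapping_rate_difference(ref_mapped, ref_total, alt_mapped, alt_total):
    """Difference in correct-mapping rate between reference and
    non-reference (alt) simulated reads at one variant."""
    if ref_total <= 0 or alt_total <= 0:
        raise ValueError("read totals must be positive")
    return ref_mapped / ref_total - alt_mapped / alt_total


def flag_biased(variants, threshold=0.05):
    """Return IDs of variants whose absolute mapping-rate difference
    exceeds `threshold` (a hypothetical cutoff, not the paper's).

    `variants` maps a variant ID to a tuple
    (ref_mapped, ref_total, alt_mapped, alt_total).
    """
    flagged = []
    for variant_id, (rm, rt, am, at) in variants.items():
        if abs(mapping_rate_difference(rm, rt, am, at)) > threshold:
            flagged.append(variant_id)
    return flagged


if __name__ == "__main__":
    # Made-up counts for two hypothetical variants.
    simulated = {
        "variant_a": (100, 100, 100, 100),  # both alleles map equally well
        "variant_b": (100, 100, 80, 100),   # alt reads map less often
    }
    print(flag_biased(simulated))
```

Reads overlapping flagged variants could then be removed before quantification, which is the kind of filtering the Results describe.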

Electronic supplementary material

The online version of this article (doi:10.1186/s13059-014-0467-2) contains supplementary material, which is available to authorized users.

Similar articles

6.

limma powers differential expression analyses for RNA-sequencing and microarray studies (total citations: 1; self-citations: 0; citations by others: 1)

Matthew E. Ritchie Belinda Phipson Di Wu Yifang Hu Charity W. Law Wei Shi Gordon K. Smyth 《Nucleic acids research》2015,43(7):e47

limma is an R/Bioconductor software package that provides an integrated solution for analysing data from gene expression experiments. It contains rich features for handling complex experimental designs and for information borrowing to overcome the problem of small sample sizes. Over the past decade, limma has been a popular choice for gene discovery through differential expression analyses of microarray and high-throughput PCR data. The package contains particularly strong facilities for reading, normalizing and exploring such data. Recently, the capabilities of limma have been significantly expanded in two important directions. First, the package can now perform both differential expression and differential splicing analyses of RNA sequencing (RNA-seq) data. All the downstream analysis tools previously restricted to microarray data are now available for RNA-seq as well. These capabilities allow users to analyse both RNA-seq and microarray data with very similar pipelines. Second, the package is now able to go past the traditional gene-wise expression analyses in a variety of ways, analysing expression profiles in terms of co-regulated sets of genes or in terms of higher-order expression signatures. This provides enhanced possibilities for biological interpretation of gene expression differences. This article reviews the philosophy and design of the limma package, summarizing both new and historical features, with an emphasis on recent enhancements and features that have not been previously described.

7.

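Processing and analysis of next-generation high-throughput RNA sequencing data (total citations: 4; self-citations: 0; citations by others: 4)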

王曦 汪小我 王立坤 冯智星 张学工 《生物化学与生物物理进展 (Progress in Biochemistry and Biophysics)》2010,37(8):834-846

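With the rapid development of next-generation high-throughput DNA sequencing technologies, RNA sequencing (RNA-seq) has become an important new approach to gene expression and transcriptome analysis. The massive volume of data generated by RNA-seq brings new opportunities and challenges to bioinformatics. Effective, targeted bioinformatic processing and analysis of the sequencing data is key to whether RNA-seq can play a major role in scientific exploration. Taking data produced by the next-generation Illumina/Solexa sequencing platform as an example, and after briefly introducing the high-throughput RNA-seq workflow, this article gives a fairly comprehensive review of methods and existing software for RNA-seq data processing and analysis, and discusses open problems that await further research.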

8.

A statistical framework for eQTL mapping using RNA-seq data

Sun W 《Biometrics》2012,68(1):1-11

RNA-seq may replace gene expression microarrays in the near future. Using RNA-seq, the expression of a gene can be estimated using the total number of sequence reads mapped to that gene, known as the total read count (TReC). Traditional expression quantitative trait locus (eQTL) mapping methods, such as linear regression, can be applied to TReC measurements after they are properly normalized. In this article, we show that eQTL mapping, by directly modeling TReC using discrete distributions, has higher statistical power than the two-step approach: data normalization followed by linear regression. In addition, RNA-seq provides information on allele-specific expression (ASE) that is not available from microarrays. By combining the information from TReC and ASE, we can computationally distinguish cis- and trans-eQTL and further improve the power of cis-eQTL mapping. Both simulation and real data studies confirm the improved power of our new methods. We also discuss the design issues of RNA-seq experiments. Specifically, we show that by combining TReC and ASE measurements, it is possible to minimize cost and retain the statistical power of cis-eQTL mapping by reducing sample size while increasing the number of sequence reads per sample. In addition to RNA-seq data, our method can also be employed to study the genetic basis of other types of sequencing data, such as chromatin immunoprecipitation followed by DNA sequencing data. In this article, we focus on eQTL mapping of a single gene using the association-based method. However, our method establishes a statistical framework for future developments of eQTL mapping methods using RNA-seq data (e.g., linkage-based eQTL mapping), and the joint study of multiple genetic markers and/or multiple genes.
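The two-step baseline that this abstract compares against (normalization followed by linear regression) can be sketched as below. This is a minimal illustration rather than the paper's TReC/ASE model: the log2 counts-per-million normalization, the pseudocount, and the names `log_cpm` and `ols_fit` are assumptions made for the example.

```python
import math


def log_cpm(counts, library_sizes, pseudocount=0.5):
    """Step 1: normalize one gene's total read counts (TReC) to
    log2 counts-per-million, one value per sample."""
    return [
        math.log2((c + pseudocount) / n * 1e6)
        for c, n in zip(counts, library_sizes)
    ]


def ols_fit(genotypes, expression):
    """Step 2: least-squares fit of expression on genotype dosage
    (0, 1 or 2 copies of the alternative allele).
    Returns (slope, intercept)."""
    n = len(genotypes)
    if n != len(expression) or n < 2:
        raise ValueError("need two equal-length vectors with >= 2 samples")
    mean_g = sum(genotypes) / n
    mean_e = sum(expression) / n
    sxx = sum((g - mean_g) ** 2 for g in genotypes)
    if sxx == 0:
        raise ValueError("genotype has no variation across samples")
    sxy = sum((g - mean_g) * (e - mean_e)
              for g, e in zip(genotypes, expression))
    slope = sxy / sxx
    return slope, mean_e - slope * mean_g


if __name__ == "__main__":
    # Made-up data: six samples with equal library sizes.
    genotypes = [0, 1, 2, 0, 1, 2]
    counts = [100, 200, 400, 100, 200, 400]
    library_sizes = [1_000_000] * 6
    slope, intercept = ols_fit(genotypes, log_cpm(counts, library_sizes))
    print(slope, intercept)
```

The slope estimates the change in log2 expression per copy of the alternative allele. The abstract's point is that modeling the counts directly with a discrete distribution has more power than this pipeline.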

9.

Long non-coding RNAs regulate eukaryotic gene expression through higher-order nuclear structure, and their clinical significance

施剑 李艳明 方向东 《遗传 (Hereditas, Beijing)》2017,39(3):189-199

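Long non-coding RNAs (lncRNAs) are a class of RNAs whose transcripts are longer than 200 nt and do not encode proteins. In recent years, with the development of techniques such as chromatin conformation capture and transcriptome sequencing, the relationship between lncRNAs and chromatin conformation has received increasing attention. Multiple studies have shown that lncRNAs play an important role in gene regulatory networks and can regulate eukaryotic gene expression by influencing dynamic changes in higher-order nuclear structure. Because of their broad gene-regulatory functions and their important role in tumorigenesis, lncRNAs are regarded as one of the new types of markers for future clinical diagnosis and prognosis of tumors. This article introduces the molecular mechanisms by which lncRNAs alter higher-order nuclear structure to regulate the expression of key genes, and describes in detail the clinical significance of lncRNAs in tumor therapy.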

10.

Studying bacterial transcriptomes using RNA-seq

Croucher NJ Thomson NR 《Current opinion in microbiology》2010,13(5):619-624

11.

Small RNA-directed transcriptional control: New insights into mechanisms and therapeutic applications

《Cell cycle (Georgetown, Tex.)》2013,12(12):2353-2362

12.

A two-parameter generalized Poisson model to improve the analysis of RNA-seq data

Sudeep Srivastava Liang Chen 《Nucleic acids research》2010,38(17):e170

13.

Chemical capping improves template switching and enhances sequencing of small RNAs

Madalee G Wulf Sean Maguire Nan Dai Alice Blondel Dora Posfai Keerthana Krishnan Zhiyi Sun Shengxi Guan Ivan R Corrêa Jr 《Nucleic acids research》2022,50(1):e2

14.

In-Depth Transcriptome Analysis Reveals Novel TARs and Prevalent Antisense Transcription in Human Cell Lines

Daniel Klevebring Magnus Bjursell Olof Emanuelsson Joakim Lundeberg 《PloS one》2010,5(3)

15.

Transcriptome diversity is a systematic source of variation in RNA-sequencing data

Pablo E. García-Nieto Ban Wang Hunter B. Fraser 《PLoS computational biology》2022,18(3)

16.

Removing technical variability in RNA-seq data using conditional quantile normalization

Hansen KD Irizarry RA Wu Z 《Biostatistics (Oxford, England)》2012,13(2):204-216

The ability to measure gene expression on a genome-wide scale is one of the most promising accomplishments in molecular biology. Microarrays, the technology that first permitted this, were riddled with problems due to unwanted sources of variability. Many of these problems are now mitigated, after a decade's worth of statistical methodology development. The recently developed RNA sequencing (RNA-seq) technology has generated much excitement in part due to claims of reduced variability in comparison to microarrays. However, we show that RNA-seq data demonstrate unwanted and obscuring variability similar to what was first observed in microarrays. In particular, we find guanine-cytosine content (GC-content) has a strong sample-specific effect on gene expression measurements that, if left uncorrected, leads to false positives in downstream results. We also report on commonly observed data distortions that demonstrate the need for data normalization. Here, we describe a statistical methodology that improves precision by 42% without loss of accuracy. Our resulting conditional quantile normalization algorithm combines robust generalized regression to remove systematic bias introduced by deterministic features such as GC-content and quantile normalization to correct for global distortions.
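The quantile normalization component named in this abstract can be illustrated with a short sketch. This is plain quantile normalization only, not the conditional algorithm the authors describe: it omits the regression on GC-content, breaks ties by input order, and the function name `quantile_normalize` is an assumption made for the example.

```python
def quantile_normalize(samples):
    """Force every sample to share the same empirical distribution.

    `samples` is a list of equal-length lists, one list of per-gene
    values per sample. Each value is replaced by the mean, across
    samples, of the values holding the same rank.
    """
    if not samples:
        return []
    n_genes = len(samples[0])
    if any(len(s) != n_genes for s in samples):
        raise ValueError("all samples must cover the same genes")

    # Mean of the r-th smallest value across samples, for each rank r.
    sorted_samples = [sorted(s) for s in samples]
    rank_means = [
        sum(s[r] for s in sorted_samples) / len(samples)
        for r in range(n_genes)
    ]

    normalized = []
    for s in samples:
        # Gene indices ordered by their value in this sample.
        order = sorted(range(n_genes), key=lambda i: s[i])
        out = [0.0] * n_genes
        for rank, gene_index in enumerate(order):
            out[gene_index] = rank_means[rank]
        normalized.append(out)
    return normalized


if __name__ == "__main__":
    # Made-up values: two samples, three genes.
    print(quantile_normalize([[5, 2, 3], [4, 1, 6]]))
```

After normalization, every sample contains the same set of values, so global distortions between samples are removed while each gene keeps its within-sample rank.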

17.

Role of Genomics and RNA-seq in Studies of Fungal Virulence

Alessandro Riccombeni Geraldine Butler 《Current fungal infection reports》2012,6(4):267-274

18.

Library preparation methods for next-generation sequencing: Tone down the bias

Erwin L. van Dijk Yan Jaszczyszyn Claude Thermes 《Experimental cell research》2014

Next-generation sequencing (NGS) has caused a revolution in biology. NGS requires the preparation of libraries in which (fragments of) DNA or RNA molecules are fused with adapters followed by PCR amplification and sequencing. It is evident that robust library preparation methods that produce a representative, non-biased source of nucleic acid material from the genome under investigation are of crucial importance. Nevertheless, it has become clear that NGS libraries for all types of applications contain biases that compromise the quality of NGS datasets and can lead to their erroneous interpretation. A detailed knowledge of the nature of these biases will be essential for a careful interpretation of NGS data on the one hand and will help to find ways to improve library quality or to develop bioinformatics tools to compensate for the bias on the other hand. In this review we discuss the literature on bias in the most common NGS library preparation protocols, both for DNA sequencing (DNA-seq) as well as for RNA sequencing (RNA-seq). Strikingly, almost all steps of the various protocols have been reported to introduce bias, especially in the case of RNA-seq, which is technically more challenging than DNA-seq. For each type of bias we discuss methods for improvement with a view to providing some useful advice to the researcher who wishes to convert any kind of raw nucleic acid into an NGS library.

19.

Dual RNA-seq of pathogen and host

AJ Westermann SA Gorski J Vogel 《Nature reviews. Microbiology》2012,10(9):618-630

20.

Detecting cell-type-specific allelic expression imbalance by integrative analysis of bulk and single-cell RNA sequencing data

Jiaxin Fan Xuran Wang Rui Xiao Mingyao Li 《PLoS genetics》2021,17(3)