首页 | 本学科首页   官方微博 | 高级检索  
相似文献
 共查询到20条相似文献,搜索用时 15 毫秒
1.
2.
RNA sequencing: advances, challenges and opportunities   总被引:5,自引:0,他引:5  
  相似文献   

3.
4.
5.

Background

RNA sequencing (RNA-seq) is the current gold-standard method to quantify gene expression for expression quantitative trait locus (eQTL) studies. However, a potential caveat in these studies is that RNA-seq reads carrying the non-reference allele of variant loci can have lower probability to map correctly to the reference genome, which could bias gene quantifications and cause false positive eQTL associations. In this study, we analyze the effect of this allelic mapping bias in eQTL discovery.

Results

We simulate RNA-seq read mapping over 9.5 M common SNPs and indels, with 15.6% of variants showing biased mapping rate for reference versus non-reference reads. However, removing potentially biased RNA-seq reads from an eQTL dataset of 185 individuals has a very small effect on gene and exon quantifications and eQTL discovery. We detect only a handful of likely false positive eQTLs, and overall eQTL SNPs show no significant enrichment for high mapping bias.

Conclusion

Our results suggest that RNA-seq quantifications are generally robust against allelic mapping bias, and that this does not have a severe effect on eQTL discovery. Nevertheless, we provide our catalog of putatively biased loci to allow better controlling for mapping bias to obtain more accurate results in future RNA-seq studies.

Electronic supplementary material

The online version of this article (doi:10.1186/s13059-014-0467-2) contains supplementary material, which is available to authorized users.  相似文献   

6.
limma is an R/Bioconductor software package that provides an integrated solution for analysing data from gene expression experiments. It contains rich features for handling complex experimental designs and for information borrowing to overcome the problem of small sample sizes. Over the past decade, limma has been a popular choice for gene discovery through differential expression analyses of microarray and high-throughput PCR data. The package contains particularly strong facilities for reading, normalizing and exploring such data. Recently, the capabilities of limma have been significantly expanded in two important directions. First, the package can now perform both differential expression and differential splicing analyses of RNA sequencing (RNA-seq) data. All the downstream analysis tools previously restricted to microarray data are now available for RNA-seq as well. These capabilities allow users to analyse both RNA-seq and microarray data with very similar pipelines. Second, the package is now able to go past the traditional gene-wise expression analyses in a variety of ways, analysing expression profiles in terms of co-regulated sets of genes or in terms of higher-order expression signatures. This provides enhanced possibilities for biological interpretation of gene expression differences. This article reviews the philosophy and design of the limma package, summarizing both new and historical features, with an emphasis on recent enhancements and features that have not been previously described.  相似文献   

7.
新一代高通量RNA测序数据的处理与分析   总被引:4,自引:0,他引:4  
随着新一代高通量DNA测序技术的快速发展,RNA测序(RNA-seq)已成为基因表达和转录组分析新的重要手段.RNA-seq技术产生的海量数据为生物信息学带来了新的机遇和挑战.有效地对测序数据进行针对性的生物信息学处理和分析,成为RNA-seq技术能否在科学探索中发挥重大作用的关键.以新一代Illumina/Solexa测序平台所产生的数据为例,在扼要介绍高通量RNA-seq测序流程的基础上,对RNA-seq数据处理和分析的方法和现有软件做一个较为全面的综述,并对其中有待进一步研究的问题进行展望.  相似文献   

8.
Sun W 《Biometrics》2012,68(1):1-11
RNA-seq may replace gene expression microarrays in the near future. Using RNA-seq, the expression of a gene can be estimated using the total number of sequence reads mapped to that gene, known as the total read count (TReC). Traditional expression quantitative trait locus (eQTL) mapping methods, such as linear regression, can be applied to TReC measurements after they are properly normalized. In this article, we show that eQTL mapping, by directly modeling TReC using discrete distributions, has higher statistical power than the two-step approach: data normalization followed by linear regression. In addition, RNA-seq provides information on allele-specific expression (ASE) that is not available from microarrays. By combining the information from TReC and ASE, we can computationally distinguish cis- and trans-eQTL and further improve the power of cis-eQTL mapping. Both simulation and real data studies confirm the improved power of our new methods. We also discuss the design issues of RNA-seq experiments. Specifically, we show that by combining TReC and ASE measurements, it is possible to minimize cost and retain the statistical power of cis-eQTL mapping by reducing sample size while increasing the number of sequence reads per sample. In addition to RNA-seq data, our method can also be employed to study the genetic basis of other types of sequencing data, such as chromatin immunoprecipitation followed by DNA sequencing data. In this article, we focus on eQTL mapping of a single gene using the association-based method. However, our method establishes a statistical framework for future developments of eQTL mapping methods using RNA-seq data (e.g., linkage-based eQTL mapping), and the joint study of multiple genetic markers and/or multiple genes.  相似文献   

9.
施剑  李艳明  方向东 《遗传》2017,39(3):189-199
长链非编码RNA(long non-coding RNA, lncRNA)是一类转录本长度超过200nt、不编码蛋白质的RNA。近年来,随着染色质构象捕获及转录组测序等技术的发展,lncRNA与染色质构象间的关系越来越受到重视。多项研究表明,lncRNA在基因调控网络中具有重要的作用,可通过影响细胞核高级结构的动态变化来调控真核基因的表达。因其广泛的基因调控功能及在肿瘤发生过程中的重要作用,lncRNA被认为是未来肿瘤临床诊断和预后判定的新型标志物之一。本文旨在介绍lncRNA改变细胞核高级结构从而调控关键基因表达的分子机制,并详细介绍lncRNA在肿瘤治疗中的临床意义。  相似文献   

10.
11.
12.
13.
14.
15.
16.
The ability to measure gene expression on a genome-wide scale is one of the most promising accomplishments in molecular biology. Microarrays, the technology that first permitted this, were riddled with problems due to unwanted sources of variability. Many of these problems are now mitigated, after a decade's worth of statistical methodology development. The recently developed RNA sequencing (RNA-seq) technology has generated much excitement in part due to claims of reduced variability in comparison to microarrays. However, we show that RNA-seq data demonstrate unwanted and obscuring variability similar to what was first observed in microarrays. In particular, we find guanine-cytosine content (GC-content) has a strong sample-specific effect on gene expression measurements that, if left uncorrected, leads to false positives in downstream results. We also report on commonly observed data distortions that demonstrate the need for data normalization. Here, we describe a statistical methodology that improves precision by 42% without loss of accuracy. Our resulting conditional quantile normalization algorithm combines robust generalized regression to remove systematic bias introduced by deterministic features such as GC-content and quantile normalization to correct for global distortions.  相似文献   

17.
18.
Next-generation sequencing (NGS) has caused a revolution in biology. NGS requires the preparation of libraries in which (fragments of) DNA or RNA molecules are fused with adapters followed by PCR amplification and sequencing. It is evident that robust library preparation methods that produce a representative, non-biased source of nucleic acid material from the genome under investigation are of crucial importance. Nevertheless, it has become clear that NGS libraries for all types of applications contain biases that compromise the quality of NGS datasets and can lead to their erroneous interpretation. A detailed knowledge of the nature of these biases will be essential for a careful interpretation of NGS data on the one hand and will help to find ways to improve library quality or to develop bioinformatics tools to compensate for the bias on the other hand. In this review we discuss the literature on bias in the most common NGS library preparation protocols, both for DNA sequencing (DNA-seq) as well as for RNA sequencing (RNA-seq). Strikingly, almost all steps of the various protocols have been reported to introduce bias, especially in the case of RNA-seq, which is technically more challenging than DNA-seq. For each type of bias we discuss methods for improvement with a view to providing some useful advice to the researcher who wishes to convert any kind of raw nucleic acid into an NGS library.  相似文献   

19.
20.
设为首页 | 免责声明 | 关于勤云 | 加入收藏

Copyright©北京勤云科技发展有限公司  京ICP备09084417号