Similar literature
Found 20 similar articles (search time: 15 ms)
1.
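Processing and analysis of next-generation high-throughput RNA sequencing data   Cited by: 4 (self-citations: 0, citations by others: 4)
With the rapid development of next-generation high-throughput DNA sequencing technology, RNA sequencing (RNA-seq) has become an important new tool for gene expression and transcriptome analysis. The massive volumes of data produced by RNA-seq bring new opportunities and challenges for bioinformatics. Effective, targeted bioinformatic processing and analysis of the sequencing data is key to whether RNA-seq can play a major role in scientific discovery. Taking data produced by the next-generation Illumina/Solexa sequencing platform as an example, this paper briefly introduces the high-throughput RNA-seq workflow, then gives a fairly comprehensive review of the methods and existing software for RNA-seq data processing and analysis, and discusses open problems that need further study.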

3.
TopHat-Fusion: an algorithm for discovery of novel fusion transcripts   Cited by: 2 (self-citations: 0, citations by others: 2)

5.
With the fast development of high-throughput sequencing technologies, a new generation of genome-wide gene expression measurements is under way. This is based on mRNA sequencing (RNA-seq), which complements the already mature technology of microarrays, and is expected to overcome some of the latter’s disadvantages. These RNA-seq data pose new challenges, however, as strengths and weaknesses have yet to be fully identified. Ideally, Next (or Second) Generation Sequencing measures can be integrated for more comprehensive gene expression investigation to facilitate analysis of whole regulatory networks. At present, however, the nature of these data is not very well understood. In this paper we study three alternative gene expression time series datasets for Drosophila melanogaster embryo development, in order to compare three measurement techniques: RNA-seq, single-channel and dual-channel microarrays. The aim is to study the state of the art for the three technologies, with a view to assessing overlapping features, data compatibility and integration potential, in the context of time series measurements. This involves using established tools for each of the three different technologies, and technical and biological replicates (for RNA-seq and microarrays, respectively), due to the limited availability of biological RNA-seq replicates for time series data. The approach consists of a sensitivity analysis for differential expression and clustering. In general, the RNA-seq dataset displayed the highest sensitivity to differential expression. The single-channel data performed similarly for the differentially expressed genes common to the gene sets considered. Cluster analysis was used to identify different features of the gene space for the three datasets, with higher similarities found for the RNA-seq and single-channel microarray datasets.

7.
RNA-Seq is a powerful tool for the annotation of genomes, in particular for the identification of isoforms and UTRs. Nevertheless, several software tools exist and no standard strategy for obtaining a reliable annotation has yet been established. We tested different combinations of the most commonly used reference-based alignment tools (TopHat, GSNAP) in combination with two frequently used reference-based assemblers (Cufflinks, Scripture) and evaluated the potential of RNA-Seq to improve the annotation of Drosophila pseudoobscura. While GSNAP maps a higher proportion of reads, TopHat resulted in a more accurate annotation when used in combination with Cufflinks. Scripture had the lowest sensitivity. Interestingly, after subsampling to the same coverage for GSNAP and TopHat, we find that both mappers have similar performance, implying that the advantage of TopHat is mainly an artifact of the lower coverage. Overall, we observed a low concordance among the different approaches tested both at junction and isoform levels. Using data from both sexes of two adult strains of D. pseudoobscura we detected alternative splicing for about 30% of the FlyBase multiple-exon genes. Moreover, we extended the boundaries for 6523 genes (about 40%). We annotated 669 new genes, 45% of them with splicing evidence. Most of the new genes are located on unassembled contigs, reflecting their incomplete annotation. Finally, we identified 99 additional new genes that are not represented in the current genome contigs of D. pseudoobscura, probably due to location in genomic regions that are difficult to assemble (e.g. heterochromatic regions).

10.
limma is an R/Bioconductor software package that provides an integrated solution for analysing data from gene expression experiments. It contains rich features for handling complex experimental designs and for information borrowing to overcome the problem of small sample sizes. Over the past decade, limma has been a popular choice for gene discovery through differential expression analyses of microarray and high-throughput PCR data. The package contains particularly strong facilities for reading, normalizing and exploring such data. Recently, the capabilities of limma have been significantly expanded in two important directions. First, the package can now perform both differential expression and differential splicing analyses of RNA sequencing (RNA-seq) data. All the downstream analysis tools previously restricted to microarray data are now available for RNA-seq as well. These capabilities allow users to analyse both RNA-seq and microarray data with very similar pipelines. Second, the package is now able to go past the traditional gene-wise expression analyses in a variety of ways, analysing expression profiles in terms of co-regulated sets of genes or in terms of higher-order expression signatures. This provides enhanced possibilities for biological interpretation of gene expression differences. This article reviews the philosophy and design of the limma package, summarizing both new and historical features, with an emphasis on recent enhancements and features that have not been previously described.
