Similar literature
 Found 20 similar articles (search time: 156 ms)
1.
2.
3.
It is crucial for researchers to optimize RNA-seq experimental designs for differential expression detection. Currently, the field lacks general methods to estimate power and sample size for RNA-Seq in complex experimental designs, under the assumption of the negative binomial distribution. We simulate RNA-Seq count data based on parameters estimated from six widely different public data sets (including cell line comparison, tissue comparison, and cancer data sets) and calculate the statistical power in paired and unpaired sample experiments. We comprehensively compare five differential expression analysis packages (DESeq, edgeR, DESeq2, sSeq, and EBSeq) and evaluate their performance by power, receiver operating characteristic (ROC) curves, and other metrics including areas under the curve (AUC), Matthews correlation coefficient (MCC), and F-measures. DESeq2 and edgeR tend to give the best performance in general. Increasing sample size or sequencing depth increases power; however, increasing sample size is more potent than increasing sequencing depth, especially when the sequencing depth reaches 20 million reads. Long intergenic noncoding RNAs (lincRNAs) yield lower power relative to the protein coding mRNAs, given their lower expression level in the same RNA-Seq experiment. On the other hand, paired-sample RNA-Seq significantly enhances the statistical power, confirming the importance of considering the multifactor experimental design. Finally, a local optimal power is achievable for a given budget constraint, and the dominant contributing factor is sample size rather than the sequencing depth. In conclusion, we provide a power analysis tool (http://www2.hawaii.edu/~lgarmire/RNASeqPowerCalculator.htm) that captures the dispersion in the data and can serve as a practical reference under the budget constraint of RNA-Seq experiments.
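The abstract above names MCC and F-measures as evaluation metrics without defining them. As a minimal sketch, assuming each gene has been classified as a true/false positive or negative against a simulated ground truth, the two metrics can be computed from confusion-matrix counts as follows. The function names and the example counts are hypothetical and are not taken from the paper.

```python
import math


def mcc(tp, fp, tn, fn):
    """Matthews correlation coefficient from confusion-matrix counts."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # Convention: return 0 when any marginal total is zero.
    return 0.0 if denom == 0 else (tp * tn - fp * fn) / denom


def f_measure(tp, fp, fn, beta=1.0):
    """F-beta score; beta=1 gives the usual F1 (harmonic mean of precision and recall)."""
    b2 = beta * beta
    denom = (1 + b2) * tp + b2 * fn + fp
    return 0.0 if denom == 0 else (1 + b2) * tp / denom


# Hypothetical counts: 6 true DE genes called, 2 false calls,
# 10 non-DE genes correctly left uncalled, 2 true DE genes missed.
example_mcc = mcc(tp=6, fp=2, tn=10, fn=2)
example_f1 = f_measure(tp=6, fp=2, fn=2)
```

Unlike the F-measure, MCC also uses the true-negative count, which matters when most genes are not differentially expressed.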

4.
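Objective: To screen, in a high-throughput manner, for differentially expressed miRNAs in the myocardial tissue of fetuses with congenital heart disease by combining expression-profiling microarrays with next-generation sequencing. Methods: The experimental group comprised second-trimester fetuses with congenital malformations, and the control group comprised gestational-age-matched fetuses without cardiac malformations from inevitable abortions. Ventricular myocardial tissue was collected from the fetuses, and changes in microRNA expression were examined using both the Agilent Human 2.0 microRNAs expression microarray and SOLiD next-generation sequencing. The data were analyzed with bioinformatics methods, and the microarray results were validated by real-time PCR. Results: Screening identified 24 miRNAs that were differentially expressed in the congenital heart malformation group on both the microarray and next-generation sequencing. Bioinformatics analysis predicted 1,606 target genes. Gene Ontology analysis showed that the target genes were mainly related to cellular processes, metabolic processes, and biological regulation, and pathway significance analysis showed that some target genes are key factors in biological signaling pathways. Four of the commonly differentially expressed miRNAs were randomly selected for validation, and the quantitative PCR results were largely consistent with the results obtained jointly from the microarray and next-generation sequencing. Conclusion: These miRNAs, which are aberrantly expressed in congenital heart disease, provide important clues for studying the molecular pathogenesis of congenital heart disease and may offer new targets for the diagnosis and treatment of heart-related diseases and for the development of new drugs.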

5.
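Objective: To screen for aberrantly expressed genes associated with nasopharyngeal carcinoma using differential display. Methods: The eukaryotic expression plasmid pSFFV-bax-neo, which carries the bax gene, was transfected into CNE2 cells by lipofection; CNE2 cells transfected with the empty pSFFV-neo plasmid served as the control. mRNA was extracted by the rapid Trizol reagent method and reverse-transcribed into cDNA, which was amplified by PCR with anchored primers and random primers in the presence of a radioisotope. The PCR products were separated by electrophoresis on a 6% denaturing polyacrylamide sequencing gel. Six clearly differential bands were excised from the polyacrylamide gel, and the differential cDNA was recovered and subjected to a second round of PCR amplification. After recovery from low-melting-point agarose gel, large quantities of the PCR-amplified differential cDNA fragments were obtained and isotope-labeled as probes. RNA was extracted, and Northern dot-blot hybridization of cellular RNA was performed through spotting, hybridization, membrane washing, and autoradiography. Results: Four cDNA fragments were shown to be fragments expressed in CNE2 cells. Conclusion: bax can induce the expression of certain related genes in CNE2 cells and suppress the expression of certain other related genes.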

6.
7.
8.
9.
10.
11.
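Objective: To analyze the expression of autophagy-related genes and the modes of interaction of autophagy-related proteins in the liver tissue of mice with ATP7B gene deficiency (Wilson's disease, WD), and to explore possible mechanisms by which copper accumulation induces autophagy activation in the liver. Methods: Copper content measurement and transcriptome sequencing were performed on liver tissue from 4-week-old and 12-week-old WD mice. Differentially expressed genes were subjected to GO and KEGG enrichment analysis, and autophagy-related differentially expressed genes were selected for validation by qRT-PCR and Western blot. The GeneMANIA database was used to construct a protein-protein interaction (PPI) network of the autophagy-related differential proteins and to perform functional annotation analysis. Expression of autophagy-related proteins was inhibited to analyze the effect on autophagy. Results: Compared with wild-type mice, WD mice had significantly elevated hepatic copper content, and copper accumulation altered gene expression patterns. Based on the GO database, there were 8 and 51 autophagy-related differentially expressed genes at 4 and 12 weeks of age, respectively; based on the KEGG database, there were 5 and 19, respectively. Nine genes, including Ulk1, Ddit4, and Plk3, were selected for qRT-PCR, and the quantitative results were largely consistent with the expression trends from sequencing. The proteins they encode interact through co-expression, co-localization, and other modes. Western blot showed that copper accumulation significantly increased Ulk1, Plk3, and Park2 protein expression and induced autophagy, and inhibiting the protein expression of Ulk1, Plk3, and Park2 significantly reduced the level of autophagy. Conclusion: Copper accumulation at different stages of WD can regulate the expression of multiple autophagy-related genes in the liver, and the autophagy-related proteins they encode interact to jointly induce hepatic autophagy activation and thereby alleviate liver injury.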

12.
13.
Moderated statistical tests for assessing differences in tag abundance   Total citations: 2 (self-citations: 0, citations by others: 2)
MOTIVATION: Digital gene expression (DGE) technologies measure gene expression by counting sequence tags. They are sensitive technologies for measuring gene expression on a genomic scale, without the need for prior knowledge of the genome sequence. As the cost of sequencing DNA decreases, the number of DGE datasets is expected to grow dramatically. Various tests of differential expression have been proposed for replicated DGE data using binomial, Poisson, negative binomial or pseudo-likelihood (PL) models for the counts, but none of these are usable when the number of replicates is very small. RESULTS: We develop tests using the negative binomial distribution to model overdispersion relative to the Poisson, and use conditional weighted likelihood to moderate the level of overdispersion across genes. Not only is our strategy applicable even with the smallest number of libraries, but it also proves to be more powerful than previous strategies when more libraries are available. The methodology is equally applicable to other counting technologies, such as proteomic spectral counts. AVAILABILITY: An R package can be accessed from http://bioinf.wehi.edu.au/resources/
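To make "overdispersion relative to the Poisson" concrete, the sketch below writes the negative binomial in the mean/dispersion form commonly used for count data, in which the variance is mu + phi * mu^2 and the Poisson is recovered as phi approaches 0. This parameterization and the function names are assumptions made for illustration. The abstract does not spell out its parameterization, and the conditional weighted likelihood step is not reproduced here.

```python
import math


def nb_log_pmf(k, mu, phi):
    """Log-probability of count k under a negative binomial with
    mean mu > 0 and dispersion phi > 0 (size r = 1/phi)."""
    r = 1.0 / phi
    return (math.lgamma(k + r) - math.lgamma(r) - math.lgamma(k + 1)
            + r * math.log(r / (r + mu))
            + k * math.log(mu / (r + mu)))


def nb_variance(mu, phi):
    """Variance under the same parameterization: the Poisson variance mu
    plus an overdispersion term phi * mu^2."""
    return mu + phi * mu * mu


# Hypothetical gene with mean count 10 and dispersion 0.5:
# a Poisson model would give variance 10, the negative binomial gives 60.
probs = [math.exp(nb_log_pmf(k, 10.0, 0.5)) for k in range(2000)]
mean_check = sum(k * p for k, p in enumerate(probs))
var_check = sum((k - mean_check) ** 2 * p for k, p in enumerate(probs))
```

Summing the probabilities numerically, as in `mean_check` and `var_check`, is a quick way to confirm that the formula and the chosen parameterization agree.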

14.
15.
16.
17.
With the fast development of high-throughput sequencing technologies, a new generation of genome-wide gene expression measurements is under way. This is based on mRNA sequencing (RNA-seq), which complements the already mature technology of microarrays, and is expected to overcome some of the latter’s disadvantages. These RNA-seq data pose new challenges, however, as strengths and weaknesses have yet to be fully identified. Ideally, Next (or Second) Generation Sequencing measures can be integrated for more comprehensive gene expression investigation to facilitate analysis of whole regulatory networks. At present, however, the nature of these data is not very well understood. In this paper we study three alternative gene expression time series datasets for the Drosophila melanogaster embryo development, in order to compare three measurement techniques: RNA-seq, single-channel and dual-channel microarrays. The aim is to study the state of the art for the three technologies, with a view to assessing overlapping features, data compatibility and integration potential, in the context of time series measurements. This involves using established tools for each of the three different technologies, and technical and biological replicates (for RNA-seq and microarrays, respectively), due to the limited availability of biological RNA-seq replicates for time series data. The approach consists of a sensitivity analysis for differential expression and clustering. In general, the RNA-seq dataset displayed the highest sensitivity to differential expression. The single-channel data performed similarly for the differentially expressed genes common to gene sets considered. Cluster analysis was used to identify different features of the gene space for the three datasets, with higher similarities found for the RNA-seq and single-channel microarray dataset.

18.
In comparative high-throughput sequencing assays, a fundamental task is the analysis of count data, such as read counts per gene in RNA-seq, for evidence of systematic changes across experimental conditions. Small replicate numbers, discreteness, large dynamic range and the presence of outliers require a suitable statistical approach. We present DESeq2, a method for differential analysis of count data, using shrinkage estimation for dispersions and fold changes to improve stability and interpretability of estimates. This enables a more quantitative analysis focused on the strength rather than the mere presence of differential expression. The DESeq2 package is available at http://www.bioconductor.org/packages/release/bioc/html/DESeq2.html.

Electronic supplementary material

The online version of this article (doi:10.1186/s13059-014-0550-8) contains supplementary material, which is available to authorized users.
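The general idea of shrinking fold changes can be illustrated with a textbook normal-normal model. This is not DESeq2's actual procedure, which the abstract above does not detail. The sketch assumes a zero-centered normal prior with a known standard deviation, and the function name and numbers are hypothetical.

```python
def shrink_lfc(lfc, se, prior_sd):
    """Posterior mean of a log fold change under a zero-centered normal
    prior (sd = prior_sd) and a normal likelihood with standard error se.
    Illustrative only; not the DESeq2 implementation."""
    weight = prior_sd ** 2 / (prior_sd ** 2 + se ** 2)
    return weight * lfc


# The same raw estimate is pulled strongly toward zero when it is noisy
# and barely moved when it is precise.
noisy = shrink_lfc(lfc=2.0, se=1.0, prior_sd=1.0)
precise = shrink_lfc(lfc=2.0, se=0.1, prior_sd=1.0)
```

Under this simplified model, estimates with little supporting information are moderated most, which illustrates how shrinkage can improve the stability of estimates.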

19.
20.
C57BL/6J (B6) and DBA/2J (D2) are two of the most commonly used inbred mouse strains in neuroscience research. However, the only currently available mouse genome is based entirely on the B6 strain sequence. Subsequently, oligonucleotide microarray probes are based solely on this B6 reference sequence, making their application for gene expression profiling comparisons across mouse strains dubious due to their allelic sequence differences, including single nucleotide polymorphisms (SNPs). The emergence of next-generation sequencing (NGS) and the RNA-Seq application provides a clear alternative to oligonucleotide arrays for detecting differential gene expression without the problems inherent to hybridization-based technologies. Using RNA-Seq, an average of 22 million short sequencing reads were generated per sample for 21 samples (10 B6 and 11 D2), and these reads were aligned to the mouse reference genome, allowing 16,183 Ensembl genes to be queried in striatum for both strains. To determine differential expression, 'digital mRNA counting' is applied based on reads that map to exons. The current study compares RNA-Seq (Illumina GA IIx) with two microarray platforms (Illumina MouseRef-8 v2.0 and Affymetrix MOE 430 2.0) to detect differential striatal gene expression between the B6 and D2 inbred mouse strains. We show that by using stringent data processing requirements differential expression as determined by RNA-Seq is concordant with both the Affymetrix and Illumina platforms in more instances than it is concordant with only a single platform, and that instances of discordance with respect to direction of fold change were rare. Finally, we show that additional information is gained from RNA-Seq compared to hybridization-based techniques as RNA-Seq detects more genes than either microarray platform. 
The majority of genes differentially expressed in RNA-Seq were only detected as present in RNA-Seq, which is important for studies with smaller effect sizes where the sensitivity of hybridization-based techniques could bias interpretation.
