Similar literature
1.
MOTIVATION: Many methods of identifying differential expression in genes depend on testing the null hypotheses of exactly equal means or distributions of expression levels for each gene across groups, even though a statistically significant difference in the expression level does not imply the occurrence of any difference of biological or clinical significance. This is because a mathematical definition of 'differential expression' as any non-zero difference does not correspond to the differential expression biologists seek. Furthermore, while some current methods account for multiple comparisons in hypothesis tests, they do not accordingly adjust estimates of the degrees to which genes are differentially expressed. Both problems lead to overstating the relevance of findings. RESULTS: Testing whether genes have relevant differential expression can be accomplished with customized null hypotheses, thereby redefining 'differential expression' in a way that is more biologically meaningful. When such tests control the false discovery rate, they effectively discover genes based on a desired quantile of differential gene expression. Estimation of the degree to which genes are differentially expressed has been corrected for multiple comparisons. AVAILABILITY: R code is freely available from http://www.davidbickel.com and may become available from www.r-project.org or www.bioconductor.org. SUPPLEMENTARY INFORMATION: Applications to cancer microarrays, an application in the absence of differential expression, pseudocode, and a guide to customizing the methods may be found at www.davidbickel.com and www.mathpreprints.com.
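
The idea of a customized null hypothesis can be illustrated with a minimal Python sketch. It is not the authors' R code: the function name `min_effect_pvalue`, the threshold `delta`, the Welch-style standard error, and the normal approximation are all assumptions made here for illustration. Instead of testing for exactly zero difference, it tests whether the absolute mean difference exceeds a chosen relevance threshold.

```python
import math
from statistics import mean, variance


def min_effect_pvalue(x, y, delta):
    """Approximate one-sided p-value for H0: |mean(x) - mean(y)| <= delta.

    Hypothetical helper: uses a Welch-style standard error and a normal
    approximation, so it is only reasonable for moderately large samples.
    """
    diff = mean(x) - mean(y)
    se = math.sqrt(variance(x) / len(x) + variance(y) / len(y))
    # Distance of the observed difference beyond the relevance threshold.
    z = (abs(diff) - delta) / se
    # Upper-tail probability of a standard normal at z.
    return 0.5 * math.erfc(z / math.sqrt(2))
```

With `delta = 0` this behaves like a conventional test of any non-zero difference; raising `delta` makes the same observed difference less significant, which is the point of redefining the null hypothesis.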

2.
Hambuch TM  Parsch J 《Genetics》2005,170(4):1691-1700
The nonrandom use of synonymous codons (codon bias) is a well-established phenomenon in Drosophila. Recent reports suggest that levels of codon bias differ among genes that are differentially expressed between the sexes, with male-expressed genes showing less codon bias than female-expressed genes. To examine the relationship between sex-biased gene expression and level of codon bias on a genomic scale, we surveyed synonymous codon usage in 7276 D. melanogaster genes that were classified as male-, female-, or non-sex-biased in their expression in microarray experiments. We found that male-biased genes have significantly less codon bias than both female- and non-sex-biased genes. This pattern holds for both germline and somatically expressed genes. Furthermore, we found a significantly negative correlation between level of codon bias and degree of sex-biased expression for male-biased genes. In contrast, female-biased genes do not differ from non-sex-biased genes in their level of codon bias and show a significantly positive correlation between codon bias and degree of sex-biased expression. These observations cannot be explained by differences in chromosomal distribution, mutational processes, recombinational environment, gene length, or absolute expression level among genes of the different expression classes. We propose that the observed codon bias differences result from differences in selection at synonymous and/or linked nonsynonymous sites between genes with male- and female-biased expression.

7.
The availability of full genome sequences has allowed the construction of microarrays, with which screening of the full genome for changes in gene expression is possible. This method can provide a wealth of information about biology at the level of gene expression and is a powerful method to identify genes and pathways involved in various processes. In this study, we report a detailed analysis of the full heat stress response in Drosophila melanogaster females, using whole genome gene expression arrays (Affymetrix Inc, Santa Clara, CA, USA). The study focuses on up- as well as downregulation of genes from just before and at 8 time points after an application of short heat hardening (36 °C for 1 hour). The expression changes were followed up to 64 hours after the heat stress, using 4 biological replicates. This study describes in detail the dramatic change in gene expression over time induced by a short-term heat treatment. We found that both known stress-responding genes and new candidate genes and processes are involved in the stress response. We identified 3 main groups of stress responsive genes that were early-upregulated, early-downregulated, and late-upregulated, respectively, among 1222 differentially expressed genes in the data set. Comparisons with stress sensitive genes identified by studies of responses to other types of stress allow the discussion of heat-specific and general stress responses in Drosophila. Several unexpected features were revealed by this analysis, which suggests that novel pathways and mechanisms are involved in the responses to heat stress and to stress in general. The majority of stress responsive genes identified in this and other studies were downregulated, and the degree of overlap among downregulated genes was relatively high, whereas genes responding by upregulation to heat and other stress factors were more specific to the stress applied or to the conditions of the particular study. As an expected exception, heat shock genes were generally found to be upregulated by stress in general.

9.
RNA-seq is now the technology of choice for genome-wide differential gene expression experiments, but it is not clear how many biological replicates are needed to ensure valid biological interpretation of the results or which statistical tools are best for analyzing the data. An RNA-seq experiment with 48 biological replicates in each of two conditions was performed to answer these questions and provide guidelines for experimental design. With three biological replicates, nine of the 11 tools evaluated found only 20%–40% of the significantly differentially expressed (SDE) genes identified with the full set of 42 clean replicates. This rises to >85% for the subset of SDE genes changing in expression by more than fourfold. To achieve >85% for all SDE genes regardless of fold change requires more than 20 biological replicates. The same nine tools successfully control their false discovery rate at ≲5% for all numbers of replicates, while the remaining two tools fail to control their FDR adequately, particularly for low numbers of replicates. For future RNA-seq experiments, these results suggest that at least six biological replicates should be used, rising to at least 12 when it is important to identify SDE genes for all fold changes. If fewer than 12 replicates are used, a superior combination of true positive and false positive performances makes edgeR and DESeq2 the leading tools. For higher replicate numbers, minimizing false positives is more important and DESeq marginally outperforms the other tools.

10.
Gene expression studies generate large quantities of data with the defining characteristic that the number of genes (whose expression profiles are to be determined) exceeds the number of available replicates by several orders of magnitude. Standard spot-by-spot analysis still seeks to extract useful information for each gene on the basis of the number of available replicates, and thus plays to the weakness of microarrays. On the other hand, because of the data volume, treating the entire data set as an ensemble and developing theoretical distributions for these ensembles provides a framework that plays instead to the strength of microarrays. We present theoretical results showing that, under reasonable assumptions, the distribution of microarray intensities follows the Gamma model, with the biological interpretations of the model parameters emerging naturally. We subsequently establish that for each microarray data set, the fractional intensities can be represented as a mixture of Beta densities, and develop a procedure for using these results to draw statistical inference regarding differential gene expression. We illustrate the results with experimental data from gene expression studies on Deinococcus radiodurans following DNA damage using cDNA microarrays.

12.
Differential analysis of DNA microarray gene expression data   Total citations: 6, self-citations: 0, citations by others: 6
Here, we briefly review the sources of experimental and biological variance that affect the interpretation of high-dimensional DNA microarray experiments. We discuss methods using a regularized t-test based on a Bayesian statistical framework that allow the identification of differentially regulated genes with a higher level of confidence than a simple t-test when only a few experimental replicates are available. We also describe a computational method for calculating the global false-positive and false-negative levels inherent in a DNA microarray data set. This method provides a probability of differential expression for each gene based on experiment-wide false-positive and false-negative levels driven by experimental error and biological variance.
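
A regularized t-test of the kind described can be sketched as follows. The abstract does not give the formula, so the weighting below is one common form of variance shrinkage, and `prior_var` and `prior_df` are hypothetical inputs (in practice the prior variance is often estimated from genes with similar expression levels). The sample variance of each group is blended with the prior variance before the t-like statistic is formed, which stabilizes the statistic when replicates are few.

```python
import math
from statistics import mean, variance


def regularized_t(x, y, prior_var, prior_df):
    """t-like statistic with per-group variances shrunk toward prior_var.

    Assumed form of the shrinkage; requires prior_df + len(group) > 2.
    """
    def shrunk_var(values):
        n = len(values)
        # Weighted blend of the prior variance and the sample variance.
        return (prior_df * prior_var + (n - 1) * variance(values)) / (prior_df + n - 2)

    se = math.sqrt(shrunk_var(x) / len(x) + shrunk_var(y) / len(y))
    return (mean(x) - mean(y)) / se
```

A gene whose three replicates happen to agree very closely would get an extreme ordinary t-statistic; with a nonzero `prior_df` the shrunk variance keeps the statistic moderate.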

17.
MOTIVATION: A common objective of microarray experiments is the detection of differential gene expression between samples obtained under different conditions. The task of identifying differentially expressed genes consists of two aspects: ranking and selection. Numerous statistics have been proposed to rank genes in order of evidence for differential expression. However, no one statistic is universally optimal and there is seldom any basis or guidance that can direct toward a particular statistic of choice. RESULTS: Our new approach, which addresses both ranking and selection of differentially expressed genes, integrates differing statistics via a distance synthesis scheme. Using a set of (Affymetrix) spike-in datasets, in which differentially expressed genes are known, we demonstrate that our method compares favorably with the best individual statistics, while achieving robustness properties that the individual statistics lack. We further evaluate performance on one other microarray study.

18.
Hsieh WP  Chu TM  Wolfinger RD  Gibson G 《Genetics》2003,165(2):747-757
An emerging issue in evolutionary genetics is whether it is possible to use gene expression profiling to identify genes that are associated with morphological, physiological, or behavioral divergence between species and whether these genes have undergone positive selection. Some of these questions were addressed in a recent study (Enard et al. 2002) of the difference in gene expression among human, chimp, and orangutan, which suggested an accelerated rate of divergence in gene expression in the human brain relative to liver. Reanalysis of the Affymetrix data set using analysis of variance methods to quantify the contributions of individuals and species to variation in expression of 12,600 genes indicates that as much as one-quarter of the genome shows divergent expression between primate species at the 5% level. The magnitude of fold change ranges from 1.2-fold up to 8-fold. Similar conclusions apply to reanalysis of the parallel murine data set of Enard et al. (2002). However, biases inherent to short oligonucleotide microarray technology may account for some of the tissue and species effects. At high significance levels, more differences were observed in the liver than in the brain in each of the pairwise species comparisons, so it is not clear that expression divergence is accelerated in the human brain. Further, there is an apparent bias toward upregulation of gene expression in the brain in both primates and mice, whereas genes are equally likely to be up- or downregulated in the liver when these species diverge. A small subset of genes that are candidates for adaptive divergence may be identified on the basis of a high ratio of interspecific to intraspecific divergence.

19.
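When the degree of differential gene expression between two groups of samples is low, or the sample size is small, the usual false discovery rate (FDR) control levels (e.g., 5% or 10%) may fail to identify enough differentially expressed genes for downstream functional enrichment analysis. However, functional enrichment analysis is somewhat robust to false discoveries among the differentially expressed genes. Therefore, identifying differentially expressed genes with a less stringent FDR control level (i.e., allowing a higher FDR) may still reliably reveal disease-related functions. This paper analyzes five gene expression profile datasets from studies of breast cancer metastasis. Using the three datasets with relatively strong differential expression signals, we demonstrate that the results of functional enrichment analysis remain highly robust even when the FDR of the differentially expressed genes reaches 25%. Then, in the other two datasets, whose differential expression signals are weak, we select differentially expressed genes at an FDR control level of 25% for functional enrichment analysis and compare the results with the enrichment results from the first three datasets. The results show that selecting differentially expressed genes with a less stringent FDR control level can still reliably identify functions related to breast cancer metastasis. The analysis also suggests that, during breast cancer metastasis, some broadly defined biological processes (such as cell division, the cell cycle, and DNA replication) are perturbed as a whole, reflecting that breast cancer metastasis is a systemic disease involving widespread changes in gene expression.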
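
The effect of relaxing the FDR control level can be seen in a small Python sketch. The abstract does not name the FDR procedure used, so the Benjamini-Hochberg step-up procedure is assumed here, and the p-values are invented purely for illustration.

```python
def benjamini_hochberg(pvalues, fdr):
    """Return sorted indices of hypotheses rejected by the BH step-up procedure."""
    m = len(pvalues)
    # Indices of the p-values in ascending order of p-value.
    order = sorted(range(m), key=lambda i: pvalues[i])
    cutoff_rank = 0
    for rank, idx in enumerate(order, start=1):
        # Keep the largest rank whose p-value falls under its BH threshold.
        if pvalues[idx] <= rank * fdr / m:
            cutoff_rank = rank
    return sorted(order[:cutoff_rank])


# Made-up p-values for eight genes (illustrative only).
pvals = [0.001, 0.02, 0.04, 0.09, 0.12, 0.30, 0.55, 0.80]
strict = benjamini_hochberg(pvals, 0.05)   # selects gene 0 only
relaxed = benjamini_hochberg(pvals, 0.25)  # selects genes 0 through 4
```

With these values the 5% level yields a single gene, too few for enrichment analysis, while the 25% level yields five, at the cost of a higher expected proportion of false discoveries among them.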

20.
A flexible statistical framework is developed for the analysis of read counts from RNA-Seq gene expression studies. It provides the ability to analyse complex experiments involving multiple treatment conditions and blocking variables while still taking full account of biological variation. Biological variation between RNA samples is estimated separately from the technical variation associated with sequencing technologies. Novel empirical Bayes methods allow each gene to have its own specific variability, even when there are relatively few biological replicates from which to estimate such variability. The pipeline is implemented in the edgeR package of the Bioconductor project. A case study analysis of carcinoma data demonstrates the ability of generalized linear model methods (GLMs) to detect differential expression in a paired design, and even to detect tumour-specific expression changes. The case study demonstrates the need to allow for gene-specific variability, rather than assuming a common dispersion across genes or a fixed relationship between abundance and variability. Genewise dispersions de-prioritize genes with inconsistent results and allow the main analysis to focus on changes that are consistent between biological replicates. Parallel computational approaches are developed to make non-linear model fitting faster and more reliable, making the application of GLMs to genomic data more convenient and practical. Simulations demonstrate the ability of adjusted profile likelihood estimators to return accurate estimators of biological variability in complex situations. When variation is gene-specific, empirical Bayes estimators provide an advantageous compromise between the extremes of assuming common dispersion or separate genewise dispersion. The methods developed here can also be applied to count data arising from DNA-Seq applications, including ChIP-Seq for epigenetic marks and DNA methylation analyses.
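
The separation of biological from technical variation can be made concrete with the negative binomial mean-variance relationship commonly used for RNA-Seq counts. The following Python sketch is illustrative only and does not call edgeR; the function names are hypothetical. It assumes a count with mean `mu` and dispersion `dispersion`, where the square root of the dispersion is interpreted as the biological coefficient of variation and the `1 / mu` term as Poisson-like technical noise.

```python
import math


def nb_variance(mu, dispersion):
    """Variance of a negative binomial count with the given mean and dispersion."""
    return mu + dispersion * mu ** 2


def total_cv(mu, dispersion):
    """Total coefficient of variation: technical (1 / mu) plus biological (dispersion)."""
    return math.sqrt(1.0 / mu + dispersion)


# A gene with mean count 100 and dispersion 0.04 (biological CV of 0.2).
var_100 = nb_variance(100, 0.04)
cv_100 = total_cv(100, 0.04)
```

For highly expressed genes the `1 / mu` term becomes negligible, so the total variation is dominated by the gene's dispersion, which is why allowing gene-specific dispersions matters.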
