Found 20 similar articles (search time: 15 ms)
1.
Time-course studies of gene expression are essential in biomedical research to understand biological phenomena that evolve in a temporal fashion. We introduce a functional hierarchical model for detecting temporally differentially expressed (TDE) genes between two experimental conditions for cross-sectional designs, where the gene expression profiles are treated as functional data and modeled by basis function expansions. A Monte Carlo EM algorithm was developed for estimating both the gene-specific parameters and the hyperparameters in the second level of modeling. We use a direct posterior probability approach to bound the rate of false discovery at a pre-specified level and evaluate the methods by simulations and application to microarray time-course gene expression data on Caenorhabditis elegans developmental processes. Simulation results suggested that the procedure performs better than the two-way ANOVA in identifying TDE genes, resulting in both higher sensitivity and specificity. Genes identified from the C. elegans developmental data set show clear patterns of changes between the two experimental conditions.
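The abstract above does not spell out its "direct posterior probability approach". The sketch below shows one common form of such a rule, assuming each gene already has an estimated posterior probability of being null. It rejects the largest set of genes whose average posterior null probability stays at or below the target level. The function name `direct_posterior_rejections` and the input values are hypothetical and are not taken from the paper.

```python
def direct_posterior_rejections(null_probs, alpha=0.05):
    """Flag genes so that the mean posterior null probability of the
    flagged set does not exceed alpha (an estimate of the Bayesian FDR).

    null_probs: posterior probabilities that each gene is NOT
                differentially expressed (assumed to be given).
    """
    m = len(null_probs)
    # Visit genes from strongest to weakest evidence against the null.
    order = sorted(range(m), key=lambda i: null_probs[i])
    rejected = [False] * m
    running_sum = 0.0
    for count, idx in enumerate(order, start=1):
        running_sum += null_probs[idx]
        # The running mean is non-decreasing, so stop at the first excess.
        if running_sum / count > alpha:
            break
        rejected[idx] = True
    return rejected


if __name__ == "__main__":
    probs = [0.01, 0.9, 0.02, 0.2, 0.03]  # illustrative values only
    print(direct_posterior_rejections(probs, alpha=0.05))
```

Because the probabilities are visited in ascending order, the first point at which the running mean exceeds `alpha` marks the largest admissible rejection set.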
2.
Wavelet thresholding with Bayesian false discovery rate control (total citations: 1; self-citations: 0; citations by others: 1)
The false discovery rate (FDR) procedure has become a popular method for handling multiplicity in high-dimensional data. The definition of FDR has a natural Bayesian interpretation; it is the expected proportion of null hypotheses mistakenly rejected given a measure of evidence for their truth. In this article, we propose controlling the positive FDR using a Bayesian approach where the rejection rule is based on the posterior probabilities of the null hypotheses. Correspondence between Bayesian and frequentist measures of evidence in hypothesis testing has been studied in several contexts. Here we extend the comparison to multiple testing with control of the FDR and illustrate the procedure with an application to wavelet thresholding. The problem consists of recovering signal from noisy measurements. This involves extracting wavelet coefficients that result from true signal and can be formulated as a multiple hypotheses-testing problem. We use simulated examples to compare the performance of our approach to the Benjamini and Hochberg (1995, Journal of the Royal Statistical Society, Series B 57, 289-300) procedure. We also illustrate the method with nuclear magnetic resonance spectral data from human brain.
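For readers unfamiliar with the Benjamini and Hochberg (1995) procedure used as the comparator above, here is a minimal sketch of the standard linear step-up rule: reject the k smallest p-values, where k is the largest rank with p_(k) <= k * alpha / m. The function name `benjamini_hochberg` and the example p-values are illustrative assumptions. The sketch shows only the frequentist baseline, not the Bayesian method proposed in the article.

```python
def benjamini_hochberg(p_values, alpha=0.05):
    """Return a list of booleans, True where the hypothesis is rejected
    by the Benjamini-Hochberg linear step-up procedure at level alpha."""
    m = len(p_values)
    # Indices of the p-values in ascending order.
    order = sorted(range(m), key=lambda i: p_values[i])
    # Largest 1-based rank k whose p-value is at or below k * alpha / m.
    k_max = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank * alpha / m:
            k_max = rank
    rejected = [False] * m
    # Step-up: every hypothesis ranked at or below k_max is rejected.
    for idx in order[:k_max]:
        rejected[idx] = True
    return rejected


if __name__ == "__main__":
    pvals = [0.001, 0.008, 0.039, 0.041, 0.042,
             0.06, 0.074, 0.205, 0.212, 0.216]  # illustrative values only
    print(benjamini_hochberg(pvals, alpha=0.05))
```

The step-up behaviour matters: a p-value that misses its own threshold is still rejected if a larger p-value further down the ranking meets its threshold.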
3.
Although both clustering and identification of differentially expressed genes are equally essential in most microarray studies, the two tasks are often conducted without regard to each other. This is clearly not the most efficient way of extracting information. The main aim of this article is to develop a coherent statistical method that can simultaneously cluster and detect differentially expressed genes. Through information sharing between the two tasks, the proposed approach gives more sensible clustering among genes and is more sensitive in identifying differentially expressed genes. The improvement over existing methods is illustrated in both our simulation results and a case study.
4.
Viktorian Miok Saskia M Wilting Mark A van de Wiel Annelieke Jaspers Paula I van Noort Ruud H Brakenhoff Peter JF Snijders Renske DM Steenbergen Wessel N van Wieringen 《BMC bioinformatics》2014,15(1)
Background
To determine which changes in the host cell genome are crucial for cervical carcinogenesis, a longitudinal in vitro model system of HPV-transformed keratinocytes was profiled in a genome-wide manner. Four cell lines affected with either HPV16 or HPV18 were assayed at 8 sequential time points for gene expression (mRNA) and gene copy number (DNA) using high-resolution microarrays. Available methods for temporal differential expression analysis are not designed for integrative genomic studies.
Results
Here, we present a method that allows for the identification of differential gene expression associated with DNA copy number changes over time. The temporal variation in gene expression is described by a generalized linear mixed model employing low-rank thin-plate splines. Model parameters are estimated with an empirical Bayes procedure, which exploits integrated nested Laplace approximation for fast computation. Iteratively, posteriors of hyperparameters and model parameters are estimated. The empirical Bayes procedure shrinks multiple dispersion-related parameters. Shrinkage leads to more stable estimates of the model parameters, better control of false positives and improvement of reproducibility. In addition, to make estimates of the DNA copy number more stable, model parameters are also estimated in a multivariate way using triplets of features, imposing a spatial prior for the copy number effect.
Conclusion
With the proposed method for analysis of time-course multilevel molecular data, more profound insight may be gained through the identification of temporal differential expression induced by DNA copy number abnormalities. In particular, in the analysis of an integrative oncogenomics study with a time-course set-up our method finds genes previously reported to be involved in cervical carcinogenesis. Furthermore, the proposed method yields improvements in sensitivity, specificity and reproducibility compared to existing methods. Finally, the proposed method is able to handle count (RNAseq) data from time course experiments as is shown on a real data set.
Electronic supplementary material
The online version of this article (doi:10.1186/1471-2105-15-327) contains supplementary material, which is available to authorized users.
5.
Nonparametric and parametric approaches have been proposed to estimate false discovery rate under the independent hypothesis testing assumption. The parametric approach has been shown to have better performance than the nonparametric approaches. In this article, we study the nonparametric approaches and quantify the underlying relations between parametric and nonparametric approaches. Our study reveals the conservative nature of the nonparametric approaches, and establishes the connections between the empirical Bayes method and p-value-based nonparametric methods. Based on our results, we advocate using the parametric approach, or directly modeling the test statistics using the empirical Bayes method.
6.
Tan YD 《Genomics》2011,98(5):390-399
Receiver operating characteristic (ROC) has been widely used to evaluate statistical methods, but a fatal problem is that ROC cannot evaluate estimation of the false discovery rate (FDR) of a statistical method and hence the area under the curve as a criterion cannot tell us if a statistical method is conservative. To address this issue, we propose an alternative criterion, work efficiency. Work efficiency is defined as the product of the power and degree of conservativeness of a statistical method. We conducted large-scale simulation comparisons among the optimizing discovery procedure (ODP), the Bonferroni (B-) procedure, Local FDR (Localfdr), ranking analysis of the F-statistics (RAF), the Benjamini-Hochberg (BH-) procedure, and significance analysis of microarray data (SAM). The results show that ODP, SAM, and the B-procedure perform with low efficiencies while the BH-procedure, RAF, and Localfdr work with higher efficiency. ODP and SAM have the same ROC curves but their efficiencies are significantly different.
7.
Benjamini and Hochberg's method for controlling the false discovery rate is applied to the problem of testing infinitely many contrasts in linear models. Exact, easily calculated critical values are derived, defining a new multiple comparisons method for testing contrasts in linear models. The method is adaptive, depending on the data through the F-statistic, like the Waller–Duncan Bayesian multiple comparisons method. Comparisons with Scheffé's method are given, and the method is extended to the simultaneous confidence intervals of Benjamini and Yekutieli.
8.
Microarray technology provides a powerful tool for profiling the expression of thousands of genes simultaneously, which makes it possible to explore the molecular and metabolic etiology of the development of a complex disease under study. However, classical statistical methods and technologies fail to be applicable to microarray data. Therefore, it is necessary and motivating to develop powerful methods for large-scale statistical analyses. In this paper, we describe a novel method, called Ranking Analysis of Microarray Data (RAM). RAM, which is a large-scale two-sample t-test method, is based on comparisons between a set of ranked T statistics and a set of ranked Z values (a set of ranked estimated null scores) yielded by a "randomly splitting" approach instead of a "permutation" approach and a two-simulation strategy for estimating the proportion of genes identified by chance, i.e., the false discovery rate (FDR). The results obtained from the simulated and observed microarray data show that RAM is more efficient than Significance Analysis of Microarrays in identifying differentially expressed genes and estimating the FDR under undesirable conditions such as a large fudge factor, small sample size, or mixture distribution of noises.
9.
Degenkolbe T Hannah MA Freund S Hincha DK Heyer AG Köhl KI 《Analytical biochemistry》2005,346(2):217-224
Gene expression profiling on microarrays is widely used to measure the expression of large numbers of genes in a single experiment. Because of the high cost of this method, feasible numbers of replicates are limited, thus impairing the power of statistical analysis. As a step toward reducing technically induced variation, we developed a procedure of sample preparation and analysis that minimizes the number of sample manipulation steps, introduces quality control before array hybridization, and allows recovery of the prepared mRNA for independent validation of results. Sample preparation is based on mRNA separation using oligo(dT) magnetic beads, which are subsequently used for first-strand cDNA synthesis on the beads. cDNA covalently bound to the magnetic beads is used as template for second-strand cDNA synthesis, leaving the intact mRNA in solution for further analysis. The quality of the synthesized cDNA can be assessed by quantitative polymerase chain reaction using 3'- and 5'-specific primer pairs for housekeeping genes such as glyceraldehyde-3-phosphate dehydrogenase. Second-strand cDNA is chemically labeled with fluorescent dyes to avoid dye bias in enzymatic labeling reactions. After hybridization of two differently labeled samples to microarray slides, arrays are scanned and images analyzed automatically with high reproducibility. Quantile-normalized data from five biological replicates display a coefficient of variation 45% for 90% of profiled genes, allowing detection of twofold changes with false positive and false negative rates of 10% each. We demonstrate successful application of the procedure for expression profiling in plant leaf tissue. However, the method could be easily adapted for samples of animal (including human) or microbial origin.
10.
In DNA microarray studies, gene-set analysis (GSA) has become the focus of gene expression data analysis. GSA utilizes the gene expression profiles of functionally related gene sets in Gene Ontology (GO) categories or a priori defined biological classes to assess the significance of gene sets associated with clinical outcomes or phenotypes. Many statistical approaches have been proposed to determine whether such functionally related gene sets express differentially (enrichment and/or deletion) in variations of phenotypes. However, little attention has been given to the discriminatory power of gene sets and classification of patients.
11.
Objectives
Metastasis is the most significant prognostic factor for laryngeal carcinoma, which necessitates the identification of molecular alterations associated with metastasis. The identification of such molecular alterations will not only prove useful in treatment but also provide insight into mechanisms of cancer metastasis. The studies conducted so far have not specifically focused on metastasis or invasion pathways. Therefore we investigated the expression profiles with a pathway-focused approach.
Materials and methods
Total RNA was extracted from 36 laryngeal tumors and paired cancer-free tissue. Expression levels of 88 genes were determined using a PCR array system following cDNA synthesis. The data obtained were used to calculate altered expression levels with the relevant algorithms. Significant alterations were determined according to their p-value obtained by Student's t-test.
Results
Sixteen genes showed altered expression when compared with adjacent cancer-free tissue. Two of these 16 genes showed differential expression in tumors with neck metastasis relative to non-metastatic tumors.
Conclusion
We found that TGFB1, TIMP1, c-Myc, SPARC, COL4A2 and SOX4 show altered expression in laryngeal tumors. c-Myc and SOX4 expression is decreased as laryngeal tumors switch to metastatic phenotype.
12.
Although many statistical methods have been proposed for identifying differentially expressed genes, the optimal approach has still not been resolved. Therefore, it is necessary to develop more efficient methods of finding differentially expressed genes while accounting for noise and false discovery rate (FDR). We propose a method based on multi-resolution wavelet transformation analysis combined with SAM for identifying differentially expressed genes by adjusting the Δ and computing the FDR. This method was applied to a microarray expression dataset from adenoma patients and normal subjects. The number of differentially expressed genes gradually reduced with an increasing Δ value, and the FDR was reduced after wavelet transformation. At a given Δ value, the FDR was also reduced before and after wavelet transformation. In conclusion, a greater number and quality of differentially expressed genes were detected using the method when compared to non-transformed data, and the FDRs were notably more controlled and reduced.
13.
Digital gene expression (DGE) was performed to investigate the gene expression profiles of 4008 and p50 silkworm strains at 48 h after oral infection with BmCPV. 3,668,437 clean tags were identified in the BmCPV-infected p50 silkworms and 3,540,790 clean tags in the control p50. By contrast, 4,498,263 clean tags were identified in the BmCPV-infected 4008 silkworms and 4,164,250 clean tags in the control 4008. A total of 691 differentially expressed genes were detected in the infected 4008 DGE library and 185 were detected in the infected p50 DGE library, respectively. The expression profiles identified some important differentially expressed genes involved in signal transduction, enzyme activity and apoptotic changes, some of which were verified using quantitative real-time PCR (qRT-PCR). These results provide important clues on the molecular mechanism of BmCPV invasion and resistance mechanism of silkworms against BmCPV infection.
14.
Relationships of differential gene expression in leaves with heterosis and heterozygosity in a rice diallel cross (total citations: 31; self-citations: 0; citations by others: 31)
Xiong L. Z. Yang G. P. Xu C. G. Zhang Qifa Saghai Maroof M. A. 《Molecular breeding : new strategies in plant improvement》1998,4(2):129-136
Using differential display analysis, we assessed the patterns of differential gene expression in hybrids relative to their parents in a diallel cross involving 8 elite rice lines. The analysis revealed several patterns of differential expression including: (1) bands present in one parent and F1 but absent in the other parent, (2) bands observed in both parents but not in the F1, (3) bands occurring in only one parent but not in the F1 or the other parent, and (4) bands detected only in the F1 but in neither of the parents. Relationships between differential gene expression and heterosis and marker heterozygosity were evaluated using data for RFLPs, SSRs and a number of agronomic characters. The analysis showed that there was very little correlation between patterns of differential expression and the F1 means for all six agronomic traits. Differentially expressed fragments that occurred only in one parent but not in the other parent or in F1 in each of the respective crosses were positively correlated with heterosis and heterozygosity. Conversely, fragments that were detected in F1s but in neither of the respective parents were negatively correlated with heterosis and heterozygosity. The remaining patterns of differential expression were not correlated with heterosis or heterozygosity. The relationships between the patterns of differential expression and heterosis observed in this study were not consistent with expectations based on dominance or overdominance hypotheses.
15.
16.
17.
Dudoit S Gilbert HN van der Laan MJ 《Biometrical journal. Biometrische Zeitschrift》2008,50(5):716-744
This article proposes resampling-based empirical Bayes multiple testing procedures for controlling a broad class of Type I error rates, defined as generalized tail probability (gTP) error rates, gTP (q,g) = Pr(g (V(n),S(n)) > q), and generalized expected value (gEV) error rates, gEV (g) = E [g (V(n),S(n))], for arbitrary functions g (V(n),S(n)) of the numbers of false positives V(n) and true positives S(n). Of particular interest are error rates based on the proportion g (V(n),S(n)) = V(n) /(V(n) + S(n)) of Type I errors among the rejected hypotheses, such as the false discovery rate (FDR), FDR = E [V(n) /(V(n) + S(n))]. The proposed procedures offer several advantages over existing methods. They provide Type I error control for general data generating distributions, with arbitrary dependence structures among variables. Gains in power are achieved by deriving rejection regions based on guessed sets of true null hypotheses and null test statistics randomly sampled from joint distributions that account for the dependence structure of the data. The Type I error and power properties of an FDR-controlling version of the resampling-based empirical Bayes approach are investigated and compared to those of widely-used FDR-controlling linear step-up procedures in a simulation study. The Type I error and power trade-off achieved by the empirical Bayes procedures under a variety of testing scenarios allows this approach to be competitive with or outperform the Storey and Tibshirani (2003) linear step-up procedure, as an alternative to the classical Benjamini and Hochberg (1995) procedure.
18.
Use of a simple semiquantitative method for appraisal of green fluorescent protein gene expression in transgenic tobacco plants (total citations: 1; self-citations: 0; citations by others: 1)
We have applied a simple method for evaluation of gfp gene expression in plants using a CCD camera and computerized processing of images. Transgenic tobacco plants were obtained by Agrobacterium tumefaciens-mediated transfer of plasmid T-DNA bearing a m-gfp5-ER sequence governed by the 35S promoter together with the nptII selectable marker gene. Presence of the gfp gene in plants was confirmed by a polymerase chain reaction method. Mean brightness values measured using image analysis software showed differences between transgenic and control plants and suggest the possibility of rapid selection of transgenic individuals among regenerants and their progenies.
19.
20.