Similar articles
 20 similar articles found (search time: 0 ms)
1.
Wu B  Guan Z  Zhao H 《Biometrics》2006,62(3):735-744
Nonparametric and parametric approaches have been proposed to estimate false discovery rate under the independent hypothesis testing assumption. The parametric approach has been shown to have better performance than the nonparametric approaches. In this article, we study the nonparametric approaches and quantify the underlying relations between parametric and nonparametric approaches. Our study reveals the conservative nature of the nonparametric approaches, and establishes the connections between the empirical Bayes method and p-value-based nonparametric methods. Based on our results, we advocate using the parametric approach, or directly modeling the test statistics using the empirical Bayes method.

2.
In most analyses of large-scale genomic data sets, differential expression analysis is typically assessed by testing for differences in the mean of the distributions between 2 groups. A recent finding by Tomlins and others (2005) is of a different type of pattern of differential expression in which a fraction of samples in one group have overexpression relative to samples in the other group. In this work, we describe a general mixture model framework for the assessment of this type of expression, called outlier profile analysis. We start by considering the single-gene situation and establishing results on identifiability. We propose 2 nonparametric estimation procedures that have natural links to familiar multiple testing procedures. We then develop multivariate extensions of this methodology to handle genome-wide measurements. The proposed methodologies are compared using simulation studies as well as data from a prostate cancer gene expression study.

3.
Westfall  Peter H. 《Biometrika》2008,95(3):709-719
Benjamini and Hochberg's method for controlling the false discovery rate is applied to the problem of testing infinitely many contrasts in linear models. Exact, easily calculated critical values are derived, defining a new multiple comparisons method for testing contrasts in linear models. The method is adaptive, depending on the data through the F-statistic, like the Waller–Duncan Bayesian multiple comparisons method. Comparisons with Scheffé's method are given, and the method is extended to the simultaneous confidence intervals of Benjamini and Yekutieli.
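For orientation, the basic Benjamini and Hochberg step-up procedure that the abstract above builds on can be sketched as follows. This is the standard version for a finite list of p-values, not Westfall's method for infinitely many contrasts; the function name `benjamini_hochberg` and the default level `alpha=0.05` are illustrative assumptions.

```python
def benjamini_hochberg(p_values, alpha=0.05):
    """Standard BH step-up rule for a finite list of p-values.

    Returns a list of booleans, True where the hypothesis is rejected.
    The function name and default alpha are illustrative assumptions.
    """
    m = len(p_values)
    # Indices of the p-values, ordered from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])

    # Find the largest rank k (1-based) with p_(k) <= k * alpha / m.
    k_max = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank * alpha / m:
            k_max = rank

    # Reject every hypothesis whose p-value is among the k_max smallest.
    rejected = [False] * m
    for idx in order[:k_max]:
        rejected[idx] = True
    return rejected
```

Because the rule is step-up, a hypothesis can be rejected even when its own p-value exceeds its own threshold, as long as some larger-ranked p-value falls below its threshold.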

4.
Liang  Faming; Zhang  Jian 《Biometrika》2008,95(4):961-977
Testing of multiple hypotheses involves statistics that are strongly dependent in some applications, but most work on this subject is based on the assumption of independence. We propose a new method for estimating the false discovery rate of multiple hypothesis tests, in which the density of test scores is estimated parametrically by minimizing the Kullback–Leibler distance between the unknown density and its estimator using the stochastic approximation algorithm, and the false discovery rate is estimated using the ensemble averaging method. Our method is applicable under general dependence between test statistics. Numerical comparisons between our method and several competitors, conducted on simulated and real data examples, show that our method achieves more accurate control of the false discovery rate in almost all scenarios.

5.
A Bayesian model-based clustering approach is proposed for identifying differentially expressed genes in meta-analysis. A Bayesian hierarchical model is used as a scientific tool for combining information from different studies, and a mixture prior is used to separate differentially expressed genes from non-differentially expressed genes. Posterior estimation of the parameters and missing observations is done by using a simple Markov chain Monte Carlo method. From the estimated mixture model, useful measures of significance of a test such as the Bayesian false discovery rate (FDR), the local FDR (Efron et al., 2001), and the integration-driven discovery rate (IDR; Choi et al., 2003) can be easily computed. The model-based approach is also compared with commonly used permutation methods, and it is shown that the model-based approach is superior to the permutation methods when there are excessive under-expressed genes compared to over-expressed genes or vice versa. The proposed method is applied to four publicly available prostate cancer gene expression data sets and simulated data sets.

6.
Hong F  Li H 《Biometrics》2006,62(2):534-544
Time-course studies of gene expression are essential in biomedical research to understand biological phenomena that evolve in a temporal fashion. We introduce a functional hierarchical model for detecting temporally differentially expressed (TDE) genes between two experimental conditions for cross-sectional designs, where the gene expression profiles are treated as functional data and modeled by basis function expansions. A Monte Carlo EM algorithm was developed for estimating both the gene-specific parameters and the hyperparameters in the second level of modeling. We use a direct posterior probability approach to bound the rate of false discovery at a pre-specified level and evaluate the methods by simulations and application to microarray time-course gene expression data on Caenorhabditis elegans developmental processes. Simulation results suggested that the procedure performs better than the two-way ANOVA in identifying TDE genes, resulting in both higher sensitivity and specificity. Genes identified from the C. elegans developmental data set show clear patterns of changes between the two experimental conditions.
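The abstract above does not spell out its selection rule, so the following is only a sketch of one common form of a direct posterior probability approach: rank genes by their posterior probability of being null and keep the largest set whose average posterior null probability stays at or below the target level. The function name `bayesian_fdr_select` and its inputs are hypothetical and are not taken from the paper.

```python
def bayesian_fdr_select(null_probs, alpha=0.05):
    """Select items so the mean posterior null probability is <= alpha.

    null_probs[i] is assumed to be the posterior probability that item i
    is NOT differentially expressed. Names here are illustrative only.
    """
    # Consider the items most likely to be non-null first.
    order = sorted(range(len(null_probs)), key=lambda i: null_probs[i])

    selected = []
    running_sum = 0.0
    for idx in order:
        running_sum += null_probs[idx]
        # The running mean estimates the FDR of the candidate set. It is
        # nondecreasing along this ordering, so stopping here is safe.
        if running_sum / (len(selected) + 1) > alpha:
            break
        selected.append(idx)
    return selected
```

The average of the posterior null probabilities over the selected set serves as the estimated Bayesian FDR of that set, which is why the running mean is compared with `alpha`.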

7.
False discovery control with p-value weighting   Total citations: 2, self-citations: 0, citations by others: 2

8.
Tan YD 《Genomics》2011,98(5):390-399
Receiver operating characteristic (ROC) has been widely used to evaluate statistical methods, but a fatal problem is that ROC cannot evaluate estimation of the false discovery rate (FDR) of a statistical method and hence the area under the curve as a criterion cannot tell us if a statistical method is conservative. To address this issue, we propose an alternative criterion, work efficiency. Work efficiency is defined as the product of the power and degree of conservativeness of a statistical method. We conducted large-scale simulation comparisons among the optimizing discovery procedure (ODP), the Bonferroni (B-) procedure, Local FDR (Localfdr), ranking analysis of the F-statistics (RAF), the Benjamini-Hochberg (BH-) procedure, and significance analysis of microarray data (SAM). The results show that ODP, SAM, and the B-procedure perform with low efficiencies while the BH-procedure, RAF, and Localfdr work with higher efficiency. ODP and SAM have the same ROC curves but their efficiencies are significantly different.

9.
10.
11.
Tan YD  Fornage M  Fu YX 《Genomics》2006,88(6):846-854
Microarray technology provides a powerful tool for the expression profile of thousands of genes simultaneously, which makes it possible to explore the molecular and metabolic etiology of the development of a complex disease under study. However, classical statistical methods and technologies fail to be applicable to microarray data. Therefore, it is necessary and motivating to develop powerful methods for large-scale statistical analyses. In this paper, we described a novel method, called Ranking Analysis of Microarray Data (RAM). RAM, which is a large-scale two-sample t-test method, is based on comparisons between a set of ranked T statistics and a set of ranked Z values (a set of ranked estimated null scores) yielded by a "randomly splitting" approach instead of a "permutation" approach and a two-simulation strategy for estimating the proportion of genes identified by chance, i.e., the false discovery rate (FDR). The results obtained from the simulated and observed microarray data show that RAM is more efficient in identification of genes differentially expressed and estimation of FDR under undesirable conditions such as a large fudge factor, small sample size, or mixture distribution of noises than Significance Analysis of Microarrays.

12.
13.
14.
Summary. In this article, we apply the recently developed Bayesian wavelet-based functional mixed model methodology to analyze MALDI-TOF mass spectrometry proteomic data. By modeling mass spectra as functions, this approach avoids reliance on peak detection methods. The flexibility of this framework in modeling nonparametric fixed and random effect functions enables it to model the effects of multiple factors simultaneously, allowing one to perform inference on multiple factors of interest using the same model fit, while adjusting for clinical or experimental covariates that may affect both the intensities and locations of peaks in the spectra. For example, this provides a straightforward way to account for systematic block and batch effects that characterize these data. From the model output, we identify spectral regions that are differentially expressed across experimental conditions, in a way that takes both statistical and clinical significance into account and controls the Bayesian false discovery rate to a prespecified level. We apply this method to two cancer studies.

15.
Due to advances in experimental technologies, it is feasible to collect measurements for a large number of variables. When these variables are simultaneously screened by a statistical test, it is necessary to consider the adjustment for multiple hypothesis testing. The false discovery rate has been proposed and widely used to address this issue. A related problem is the estimation of the proportion of true null hypotheses. The long-standing difficulty to this problem is the identifiability of the nonparametric model. In this study, we propose a moment-based method coupled with sample splitting for estimating this proportion. If the p values from the alternative hypothesis are homogeneously distributed, then the proposed method will solve the identifiability and give its optimal performances. When the p values from the alternative hypothesis are heterogeneously distributed, we propose to approximate this mixture distribution so that the identifiability can be achieved. Theoretical aspects of the approximation error are discussed. The proposed estimation method is completely nonparametric and simple with an explicit formula. Simulation studies show the favorable performances of the proposed method when it is compared to the other existing methods. Two microarray gene expression data sets are considered for applications.

16.
Dunson DB  Herring AH 《Biometrics》2003,59(4):916-923
In studying the relationship between an ordered categorical predictor and an event time, it is standard practice to include dichotomous indicators of the different levels of the predictor in a Cox model. One can then use a multiple degree-of-freedom score or partial likelihood ratio test for hypothesis testing. Often, interest focuses on comparing the null hypothesis of no difference to an order-restricted alternative, such as a monotone increase across levels of a predictor. This article proposes a Bayesian approach for addressing hypotheses of this type. We reparameterize the Cox model in terms of a cumulative product of parameters having conjugate prior densities, consisting of mixtures of point masses at one, and truncated gamma densities. Due to the structure of the model, posterior computation can proceed via a simple and efficient Gibbs sampling algorithm. Posterior probabilities for the global null hypothesis and subhypotheses, comparing the hazards for specific groups, can be calculated directly from the output of a single Gibbs chain. The approach allows for level sets across which a predictor has no effect. Generalizations to multiple predictors are described, and the method is applied to a study of emergency medical treatment for stroke.

17.
18.
We consider multiple testing with false discovery rate (FDR) control when p values have discrete and heterogeneous null distributions. We propose a new estimator of the proportion of true null hypotheses and demonstrate that it is less upwardly biased than Storey's estimator and two other estimators. The new estimator induces two adaptive procedures, that is, an adaptive Benjamini–Hochberg (BH) procedure and an adaptive Benjamini–Hochberg–Heyse (BHH) procedure. We prove that the adaptive BH (aBH) procedure is conservative nonasymptotically. Through simulation studies, we show that these procedures are usually more powerful than their nonadaptive counterparts and that the adaptive BHH procedure is usually more powerful than the aBH procedure and a procedure based on randomized p‐value. The adaptive procedures are applied to a study of HIV vaccine efficacy, where they identify more differentially polymorphic positions than the BH procedure at the same FDR level.
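As a point of comparison for the abstract above, Storey's estimator of the proportion of true null hypotheses, which serves there as a baseline, can be sketched as below. This is the usual fixed-threshold form, not the new estimator the abstract proposes; the function name `storey_pi0`, the default `lam=0.5`, and the truncation at 1 are assumptions made for illustration.

```python
def storey_pi0(p_values, lam=0.5):
    """Storey-type estimate of the proportion of true nulls.

    Computes #{p_i > lam} / (m * (1 - lam)), capped at 1.
    The name, the default lam, and the cap are illustrative assumptions.
    """
    m = len(p_values)
    if m == 0:
        raise ValueError("p_values must not be empty")
    if not 0.0 <= lam < 1.0:
        raise ValueError("lam must lie in [0, 1)")

    # Under the null, p-values are roughly uniform, so those above lam
    # are taken to come mostly from true null hypotheses.
    count_above = sum(1 for p in p_values if p > lam)
    estimate = count_above / (m * (1.0 - lam))

    # A proportion cannot exceed 1, so truncate the raw estimate.
    return min(1.0, estimate)
```

An adaptive procedure typically plugs such an estimate into the BH procedure by running it at level `alpha / pi0` instead of `alpha`, which is where the gain in power over the nonadaptive version comes from.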

19.
Summary. Pharmacovigilance systems aim at early detection of adverse effects of marketed drugs. They maintain large spontaneous reporting databases for which several automatic signaling methods have been developed. One limit of those methods is that the decision rules for the signal generation are based on arbitrary thresholds. In this article, we propose a new signal-generation procedure. The decision criterion is formulated in terms of a critical region for the P-values resulting from the reporting odds ratio method as well as from the Fisher's exact test. For the latter, we also study the use of mid-P-values. The critical region is defined by the false discovery rate, which can be estimated by adapting the P-values mixture model based procedures to one-sided tests. The methodology is mainly illustrated with the location-based estimator procedure. It is studied through a large simulation study and applied to the French pharmacovigilance database.

20.