Similar literature
 20 similar articles found; search took 31 ms
1.
Wavelet thresholding with Bayesian false discovery rate control   Total citations: 1, self-citations: 0, citations by others: 1
The false discovery rate (FDR) procedure has become a popular method for handling multiplicity in high-dimensional data. The definition of FDR has a natural Bayesian interpretation; it is the expected proportion of null hypotheses mistakenly rejected given a measure of evidence for their truth. In this article, we propose controlling the positive FDR using a Bayesian approach where the rejection rule is based on the posterior probabilities of the null hypotheses. Correspondence between Bayesian and frequentist measures of evidence in hypothesis testing has been studied in several contexts. Here we extend the comparison to multiple testing with control of the FDR and illustrate the procedure with an application to wavelet thresholding. The problem consists of recovering signal from noisy measurements. This involves extracting wavelet coefficients that result from true signal and can be formulated as a multiple hypotheses-testing problem. We use simulated examples to compare the performance of our approach to the Benjamini and Hochberg (1995, Journal of the Royal Statistical Society, Series B 57, 289-300) procedure. We also illustrate the method with nuclear magnetic resonance spectral data from human brain.
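A rejection rule based on posterior null probabilities can be sketched as follows. This is only an illustration, not the authors' exact procedure: it assumes the posterior probabilities of the null hypotheses are already available, and it uses one common Bayesian FDR rule (reject the hypotheses with the smallest posterior null probabilities for as long as their average stays at or below a target level). The function name `bayesian_fdr_reject` and the parameter `alpha` are hypothetical.

```python
from typing import List, Sequence


def bayesian_fdr_reject(post_null: Sequence[float], alpha: float = 0.05) -> List[bool]:
    """Flag hypotheses to reject so that the mean posterior null
    probability among the rejected ones is at most alpha.

    Assumption: post_null[i] is P(H0_i is true | data), supplied by the caller.
    """
    m = len(post_null)
    # Visit hypotheses from the strongest to the weakest evidence against H0.
    order = sorted(range(m), key=lambda i: post_null[i])
    reject = [False] * m
    running_sum = 0.0
    for rank, idx in enumerate(order, start=1):
        running_sum += post_null[idx]
        # The running mean is non-decreasing because the values are sorted,
        # so we can stop at the first rank where it exceeds alpha.
        if running_sum / rank > alpha:
            break
        reject[idx] = True
    return reject
```

The average posterior null probability over the rejected set is an estimate of the proportion of false discoveries in that set, which is why it is compared with `alpha`.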

2.
One of the multiple testing problems in drug finding experiments is the comparison of several treatments with one control. In this paper we discuss a particular situation of such an experiment, i.e., a microarray setting, where the many-to-one comparisons need to be addressed for thousands of genes simultaneously. For a gene-specific analysis, Dunnett's single step procedure is considered within gene tests, while the FDR controlling procedures such as Significance Analysis of Microarrays (SAM) and Benjamini and Hochberg (BH) False Discovery Rate (FDR) adjustment are applied to control the error rate across genes. The method is applied to a microarray experiment with four treatment groups (three microarrays in each group) and 16,998 genes. Simulation studies are conducted to investigate the performance of the SAM method and the BH-FDR procedure with regard to controlling the FDR, and to investigate the effect of small-variance genes on the FDR in the SAM procedure.

3.
The multiple testing problem attributed to gene expression analysis is challenging not only by its size, but also by possible dependence between the expression levels of different genes resulting from coregulations of the genes. Furthermore, the measurement errors of these expression levels may be dependent as well since they are subjected to several technical factors. Multiple testing of such data faces the challenge of correlated test statistics. In such a case, the control of the False Discovery Rate (FDR) is not straightforward, and thus demands new approaches and solutions that will address multiplicity while accounting for this dependency. This paper investigates the effects of dependency between normal test statistics on FDR control in two-sided testing, using the linear step-up procedure (BH) of Benjamini and Hochberg (1995). The case of two multiple hypotheses is examined first. A simulation study offers primary insight into the behavior of the FDR subjected to different levels of correlation and distance between null and alternative means. A theoretical analysis follows in order to obtain explicit upper bounds to the FDR. These results are then extended to more than two multiple tests, thereby offering a better perspective on the effect of the proportion of false null hypotheses, as well as the structure of the test statistics correlation matrix. An example from gene expression data analysis is presented.
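For readers unfamiliar with the linear step-up (BH) procedure named above, a minimal sketch follows. It assumes the p-values have already been computed; the function name `benjamini_hochberg` and the parameter `q` (the target FDR level) are illustrative choices, not taken from the paper.

```python
from typing import List, Sequence


def benjamini_hochberg(p_values: Sequence[float], q: float = 0.05) -> List[bool]:
    """Return one reject flag per hypothesis using the BH linear step-up rule."""
    m = len(p_values)
    if m == 0:
        return []
    # Indices of the hypotheses, ordered by ascending p-value.
    order = sorted(range(m), key=lambda i: p_values[i])
    # Step-up: find the LARGEST rank k (1-based) with p_(k) <= k * q / m,
    # even if some smaller ranks fail the comparison.
    k_max = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank * q / m:
            k_max = rank
    # Reject every hypothesis whose p-value is among the k_max smallest.
    reject = [False] * m
    for idx in order[:k_max]:
        reject[idx] = True
    return reject
```

With p-values 0.01, 0.03, 0.04, 0.045 and `q = 0.05`, the second and third ordered p-values miss their thresholds, but the fourth meets its threshold of 0.05, so all four hypotheses are rejected. That is the step-up behaviour that distinguishes this rule from a step-down one.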

4.
In most analyses of large-scale genomic data sets, differential expression analysis is typically assessed by testing for differences in the mean of the distributions between 2 groups. A recent finding by Tomlins and others (2005) is of a different type of pattern of differential expression in which a fraction of samples in one group have overexpression relative to samples in the other group. In this work, we describe a general mixture model framework for the assessment of this type of expression, called outlier profile analysis. We start by considering the single-gene situation and establishing results on identifiability. We propose 2 nonparametric estimation procedures that have natural links to familiar multiple testing procedures. We then develop multivariate extensions of this methodology to handle genome-wide measurements. The proposed methodologies are compared using simulation studies as well as data from a prostate cancer gene expression study.

5.
Benjamini Y  Heller R 《Biometrics》2008,64(4):1215-1222
SUMMARY: We consider the problem of testing for partial conjunction of hypotheses, which argues that at least u out of n tested hypotheses are false. It offers an in-between approach to the testing of the conjunction of null hypotheses against the alternative that at least one is not, and the testing of the disjunction of null hypotheses against the alternative that all hypotheses are not null. We suggest powerful test statistics for testing such a partial conjunction hypothesis that are valid under dependence between the test statistics as well as under independence. We then address the problem of testing many partial conjunction hypotheses simultaneously using the false discovery rate (FDR) approach. We prove that if the FDR controlling procedure in Benjamini and Hochberg (1995, Journal of the Royal Statistical Society, Series B 57, 289-300) is used for this purpose the FDR is controlled under various dependency structures. Moreover, we can screen at all levels simultaneously in order to display the findings on a superimposed map and still control an appropriate FDR measure. We apply the method to examples from microarray analysis and functional magnetic resonance imaging (fMRI), two application areas where the need for partial conjunction analysis has been identified.
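To make the idea of a partial conjunction p-value concrete, here is a minimal sketch of one simple combination rule. It is a Bonferroni-style rule, (n - u + 1) times the u-th smallest p-value, which is shown here as an assumption for illustration; the abstract does not say which test statistics the authors recommend. The function name is hypothetical.

```python
from typing import Sequence


def partial_conjunction_bonferroni(p_values: Sequence[float], u: int) -> float:
    """Bonferroni-style p-value for the null 'fewer than u of the n hypotheses are false'.

    Assumption: this is one simple combination rule, shown for illustration only.
    """
    n = len(p_values)
    if not 1 <= u <= n:
        raise ValueError("u must satisfy 1 <= u <= n")
    sorted_p = sorted(p_values)
    # Ignore the u - 1 smallest p-values, then apply a Bonferroni correction
    # to the smallest of the remaining n - u + 1 p-values.
    return min(1.0, (n - u + 1) * sorted_p[u - 1])
```

Setting `u = 1` gives the usual Bonferroni test of the global null, and `u = n` gives the maximum p-value, which matches the two extremes the abstract describes.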

6.
Liang  Faming; Zhang  Jian 《Biometrika》2008,95(4):961-977
Testing of multiple hypotheses involves statistics that are strongly dependent in some applications, but most work on this subject is based on the assumption of independence. We propose a new method for estimating the false discovery rate of multiple hypothesis tests, in which the density of test scores is estimated parametrically by minimizing the Kullback–Leibler distance between the unknown density and its estimator using the stochastic approximation algorithm, and the false discovery rate is estimated using the ensemble averaging method. Our method is applicable under general dependence between test statistics. Numerical comparisons between our method and several competitors, conducted on simulated and real data examples, show that our method achieves more accurate control of the false discovery rate in almost all scenarios.

7.
Computer simulation techniques were used to investigate the Type I and Type II error rates of one parametric (Dunnett) and two nonparametric multiple comparison procedures for comparing treatments with a control under nonnormality and variance homogeneity. It was found that Dunnett's procedure is quite robust with respect to violations of the normality assumption. Power comparisons show that for small sample sizes Dunnett's procedure is superior to the nonparametric procedures also in non-normal cases, but for larger sample sizes the multiple analogue to Wilcoxon and Kruskal-Wallis rank statistics are superior to Dunnett's procedure in all considered nonnormal cases. Further investigations under nonnormality and variance heterogeneity show robustness properties with respect to the risks of first kind and power comparisons yield similar results as in the equal variance case.

8.
Summary: Methods for performing multiple tests of paired proportions are described. A broadly applicable method using McNemar's exact test and the exact distributions of all test statistics is developed; the method controls the familywise error rate in the strong sense under minimal assumptions. A closed form (not simulation-based) algorithm for carrying out the method is provided. A bootstrap alternative is developed to account for correlation structures. Operating characteristics of these and other methods are evaluated via a simulation study. Applications to multiple comparisons of predictive models for disease classification and to postmarket surveillance of adverse events are given.
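McNemar's exact test for a single pair of proportions, the building block mentioned above, can be sketched as follows. This covers only the single-test p-value, not the multiplicity adjustment the authors develop. It assumes the two discordant-pair counts are given; the names `b` and `c` follow the usual 2x2 table convention and are not taken from the paper.

```python
from math import comb


def mcnemar_exact(b: int, c: int) -> float:
    """Two-sided exact McNemar p-value from the two discordant-pair counts.

    Under the null, b is Binomial(b + c, 0.5), so the p-value is twice the
    smaller binomial tail, capped at 1.
    """
    if b < 0 or c < 0:
        raise ValueError("counts must be non-negative")
    n = b + c
    if n == 0:
        return 1.0  # no discordant pairs, so no evidence against the null
    k = min(b, c)
    # P(X <= k) for X ~ Binomial(n, 0.5), computed with exact integer arithmetic.
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2.0 * tail)
```

Only the discordant pairs enter the calculation; pairs on which both classifications agree carry no information about the difference between the paired proportions.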

9.
Reiter  Jerome P. 《Biometrika》2008,95(4):933-946
When some of the records used to estimate the imputation models in multiple imputation are not used or available for analysis, the usual multiple imputation variance estimator has positive bias. We present an alternative approach that enables unbiased estimation of variances and, hence, calibrated inferences in such contexts. First, using all records, the imputer samples m values of the parameters of the imputation model. Second, for each parameter draw, the imputer simulates the missing values for all records n times. From these mn completed datasets, the imputer can analyse or disseminate the appropriate subset of records. We develop methods for interval estimation and significance testing for this approach. Methods are presented in the context of multiple imputation for measurement error.

10.
A multiple character analysis was undertaken of a broadly representative sample of three species: Canis lupus (wolf), C. latrans (coyote), and C. familiaris (dog). These species are clearly and significantly distinguished by the technique of linear discrimination. The analysis provides a basis for the identification of skulls not obviously distinguishable by size or other diagnostic characters. Early populations of Canis n. niger and C. n. gregoryi (red wolf) are compared with the three species above and are found to form a cluster with lupus and to be sharply distinct from the other two species. Additional comparisons show that while lupus lycaon and niger both overlap with lupus, they are distinct from each other. This entire cluster is quite distinct from latrans, with niger being the farthest removed. A sample population of C. n. gregoryi, from the edge of the extending range of C. latrans, was examined and found to show too great a range of variation to be attributed to a single species.

11.
We consider multiple testing with false discovery rate (FDR) control when p values have discrete and heterogeneous null distributions. We propose a new estimator of the proportion of true null hypotheses and demonstrate that it is less upwardly biased than Storey's estimator and two other estimators. The new estimator induces two adaptive procedures, that is, an adaptive Benjamini–Hochberg (BH) procedure and an adaptive Benjamini–Hochberg–Heyse (BHH) procedure. We prove that the adaptive BH (aBH) procedure is conservative nonasymptotically. Through simulation studies, we show that these procedures are usually more powerful than their nonadaptive counterparts and that the adaptive BHH procedure is usually more powerful than the aBH procedure and a procedure based on randomized p-value. The adaptive procedures are applied to a study of HIV vaccine efficacy, where they identify more differentially polymorphic positions than the BH procedure at the same FDR level.
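As background for the comparison above, Storey's estimator of the proportion of true null hypotheses can be sketched as below. This is the baseline estimator the abstract compares against, not the new estimator proposed in the paper. The tuning parameter `lam` and the cap at 1 are assumptions of this sketch (the cap is a common convention).

```python
from typing import Sequence


def storey_pi0(p_values: Sequence[float], lam: float = 0.5) -> float:
    """Storey-type estimate of the proportion of true nulls.

    Idea: under continuous uniform null p-values, about pi0 * m * (1 - lam)
    of the m p-values should exceed lam, so the observed count is rescaled.
    """
    if not 0.0 <= lam < 1.0:
        raise ValueError("lam must lie in [0, 1)")
    m = len(p_values)
    if m == 0:
        raise ValueError("at least one p-value is required")
    count_above = sum(1 for p in p_values if p > lam)
    # Assumption: cap the estimate at 1, since a proportion cannot exceed 1.
    return min(1.0, count_above / (m * (1.0 - lam)))
```

The uniformity argument in the docstring fails for discrete p-values, which is the setting of the abstract and the reason this estimator tends to be biased upward there.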

12.
Simultaneous inference in general parametric models   Total citations: 6, self-citations: 0, citations by others: 6
Simultaneous inference is a common problem in many areas of application. If multiple null hypotheses are tested simultaneously, the probability of rejecting erroneously at least one of them increases beyond the pre-specified significance level. Simultaneous inference procedures have to be used which adjust for multiplicity and thus control the overall type I error rate. In this paper we describe simultaneous inference procedures in general parametric models, where the experimental questions are specified through a linear combination of elemental model parameters. The framework described here is quite general and extends the canonical theory of multiple comparison procedures in ANOVA models to linear regression problems, generalized linear models, linear mixed effects models, the Cox model, robust linear models, etc. Several examples using a variety of different statistical models illustrate the breadth of the results. For the analyses we use the R add-on package multcomp, which provides a convenient interface to the general approach adopted here.

13.
Boos DD  Stefanski LA  Wu Y 《Biometrics》2009,65(3):692-700
Summary. A new version of the false selection rate variable selection method of Wu, Boos, and Stefanski (2007, Journal of the American Statistical Association 102, 235–243) is developed that requires no simulation. This version allows the tuning parameter in forward selection to be estimated simply by hand calculation from a summary table of output even for situations where the number of explanatory variables is larger than the sample size. Because of the computational simplicity, the method can be used in permutation tests and inside bagging loops for improved prediction. Illustration is provided in clinical trials for linear regression, logistic regression, and Cox proportional hazards regression.

14.
A method is suggested for handling multiple comparisons in repeated measurement situations with completely random missing values. Exact results are obtained for the situation with normally distributed observations in the case of compound symmetry. The method uses grouping with respect to the positions of the missing values. It is most efficient and best suited when there are not too many measurement occasions in the longitudinal investigation.

15.
On weighted Hochberg procedures   Total citations: 1, self-citations: 0, citations by others: 1
Tamhane  Ajit C.; Liu  Lingyun 《Biometrika》2008,95(2):279-294
We consider different ways of constructing weighted Hochberg-type step-up multiple test procedures including closed procedures based on weighted Simes tests and their conservative step-up short-cuts, and step-up counterparts of two weighted Holm procedures. It is shown that the step-up counterparts have some serious pitfalls such as lack of familywise error rate control and lack of monotonicity in rejection decisions in terms of p-values. Therefore an exact closed procedure appears to be the best alternative, its only drawback being lack of simple stepwise structure. A conservative step-up short-cut to the closed procedure may be used instead, but with accompanying loss of power. Simulations are used to study the familywise error rate and power properties of the competing procedures for independent and correlated p-values. Although many of the results of this paper are negative, they are useful in highlighting the need for caution when procedures with similar pitfalls may be used.

16.
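Predicting forest yield using neural network and multiple regression techniques   Total citations: 16, self-citations: 0, citations by others: 16
The use of traditional statistical techniques is often limited by small samples and by measurement data that do not follow a particular distribution. This paper evaluates a feed-forward neural network algorithm for predicting the yield of deciduous broad-leaved forest. It also introduces a data transformation method that converts qualitative data into quantitative data, so that a multiple regression prediction model can be built from a relatively small sample. The data transformation method helps improve the predictive performance of the multiple regression model. Under the conditions of this experiment, the results show that the neural network technique produces the best predictions.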

17.
Summary: Microarray gene expression studies over ordered categories are routinely conducted to gain insights into biological functions of genes and the underlying biological processes. Some common experiments are time-course/dose-response experiments where a tissue or cell line is exposed to different doses and/or durations of time to a chemical. A goal of such studies is to identify gene expression patterns/profiles over the ordered categories. This problem can be formulated as a multiple testing problem where for each gene the null hypothesis of no difference between the successive mean gene expressions is tested and further directional decisions are made if it is rejected. Most of the existing multiple testing procedures are devised for controlling the usual false discovery rate (FDR) rather than the mixed directional FDR (mdFDR), the expected proportion of Type I and directional errors among all rejections. Benjamini and Yekutieli (2005, Journal of the American Statistical Association 100, 71–93) proved that an augmentation of the usual Benjamini–Hochberg (BH) procedure can control the mdFDR while testing simple null hypotheses against two-sided alternatives in terms of one-dimensional parameters. In this article, we consider the problem of controlling the mdFDR involving multidimensional parameters. To deal with this problem, we develop a procedure extending that of Benjamini and Yekutieli based on the Bonferroni test for each gene. A proof is given for its mdFDR control when the underlying test statistics are independent across the genes. The results of a simulation study evaluating its performance under independence as well as under dependence of the underlying test statistics across the genes relative to other relevant procedures are reported. Finally, the proposed methodology is applied to time-course microarray data obtained by Lobenhofer et al. (2002, Molecular Endocrinology 16, 1215–1229).
We identified several important cell-cycle genes, such as DNA replication/repair gene MCM4 and replication factor subunit C2, which were not identified by the previous analyses of the same data by Lobenhofer et al. (2002) and Peddada et al. (2003, Bioinformatics 19, 834–841). Although some of our findings overlap with previous findings, we identify several other genes that complement the results of Lobenhofer et al. (2002).

18.
Wu B  Guan Z  Zhao H 《Biometrics》2006,62(3):735-744
Nonparametric and parametric approaches have been proposed to estimate false discovery rate under the independent hypothesis testing assumption. The parametric approach has been shown to have better performance than the nonparametric approaches. In this article, we study the nonparametric approaches and quantify the underlying relations between parametric and nonparametric approaches. Our study reveals the conservative nature of the nonparametric approaches, and establishes the connections between the empirical Bayes method and p-value-based nonparametric methods. Based on our results, we advocate using the parametric approach, or directly modeling the test statistics using the empirical Bayes method.

19.
The Newman-Keuls (NK) procedure for testing all pairwise comparisons among a set of treatment means, introduced by Newman (1939) and in a slightly different form by Keuls (1952), was proposed as a reasonable way to alleviate the inflation of error rates when a large number of means are compared. It was proposed before the concepts of different types of multiple error rates were introduced by Tukey (1952a, b; 1953). Although it was popular in the 1950s and 1960s, once control of the familywise error rate (FWER) was accepted generally as an appropriate criterion in multiple testing, and it was realized that the NK procedure does not control the FWER at the nominal level at which it is performed, the procedure gradually fell out of favor. Recently, a more liberal criterion, control of the false discovery rate (FDR), has been proposed as more appropriate in some situations than FWER control. This paper notes that the NK procedure and a nonparametric extension control the FWER within any set of homogeneous treatments. It proves that the extended procedure controls the FDR when there are well-separated clusters of homogeneous means and between-cluster test statistics are independent, and extensive simulation provides strong evidence that the original procedure controls the FDR under the same conditions and some dependent conditions when the clusters are not well-separated. Thus, the test has two desirable error-controlling properties, providing a compromise between FDR control with no subgroup FWER control and global FWER control. Yekutieli (2002) developed an FDR-controlling procedure for testing all pairwise differences among means, without any FWER-controlling criteria when there is more than one cluster.
The empirical example in Yekutieli's paper was used to compare the Benjamini-Hochberg (1995) method with apparent FDR control in this context, Yekutieli's proposed method with proven FDR control, the Newman-Keuls method that controls FWER within equal clusters with apparent FDR control, and several methods that control FWER globally. The Newman-Keuls is shown to be intermediate in number of rejections to the FWER-controlling methods and the FDR-controlling methods in this example, although it is not always more conservative than the other FDR-controlling methods.

20.
Sabatti C  Service S  Freimer N 《Genetics》2003,164(2):829-833
We explore the implications of the false discovery rate (FDR) controlling procedure in disease gene mapping. With the aid of simulations, we show how, under models commonly used, the simple step-down procedure introduced by Benjamini and Hochberg controls the FDR for the dependent tests on which linkage and association genome screens are based. This adaptive multiple comparison procedure may offer an important tool for mapping susceptibility genes for complex diseases.
