Similar articles
20 similar articles found.
1.
For multiple testing based on discrete p-values, we propose a false discovery rate (FDR) procedure, "BH+", with proven conservativeness. BH+ is at least as powerful as the BH (i.e., Benjamini-Hochberg) procedure when they are applied to superuniform p-values. Further, when applied to mid-p-values, BH+ can be more powerful than when it is applied to conventional p-values; an easily verifiable necessary and sufficient condition for this is provided. BH+ is perhaps the first conservative FDR procedure applicable to mid-p-values and to p-values with general distributions. It is applied to multiple testing based on discrete p-values in a methylation study, an HIV study and a clinical safety study, where it makes considerably more discoveries than the BH procedure. In addition, we propose an adaptive version of the BH+ procedure, prove its conservativeness under certain conditions, and provide evidence of its excellent performance via simulation studies.
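The abstract above compares BH+ against the standard Benjamini-Hochberg step-up procedure. The details of BH+ are not given here, but the BH baseline it builds on can be sketched in a few lines of Python (the p-values in the example are illustrative, not from any of the cited studies):

```python
def bh_procedure(pvals, alpha=0.05):
    """Benjamini-Hochberg step-up procedure: returns the set of indices of
    rejected hypotheses. Controls the FDR at level alpha for independent
    (or positively dependent) superuniform p-values."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])   # indices, smallest p first
    k = 0
    for rank, i in enumerate(order, start=1):
        if pvals[i] <= alpha * rank / m:               # step-up comparison
            k = rank                                   # largest rank meeting the bound
    return set(order[:k])                              # reject the k smallest p-values

print(sorted(bh_procedure([0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.3, 0.6])))
# → [0, 1]
```

Note the step-up character: the largest rank whose p-value meets its bound determines the cut, and everything below it is rejected, even p-values that individually miss their own bound.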

2.
False discovery control with p-value weighting

3.
Sabatti C, Service S, Freimer N. Genetics. 2003;164(2):829-833
We explore the implications of the false discovery rate (FDR) controlling procedure in disease gene mapping. With the aid of simulations, we show how, under models commonly used, the simple step-up procedure introduced by Benjamini and Hochberg controls the FDR for the dependent tests on which linkage and association genome screens are based. This adaptive multiple comparison procedure may offer an important tool for mapping susceptibility genes for complex diseases.

4.
The mass spectrometric identification of chemically cross-linked peptides (CXMS) specifies spatial restraints of protein complexes; these values complement data obtained from common structure-determination techniques. Generic methods for determining false discovery rates of cross-linked peptide assignments are currently lacking, thus making data sets from CXMS studies inherently incomparable. Here we describe an automated target-decoy strategy and the software tool xProphet, which solve this problem for large multicomponent protein complexes.

5.
False discovery rate, sensitivity and sample size for microarray studies
MOTIVATION: In microarray data studies most researchers are keenly aware of the potentially high rate of false positives and the need to control it. One key statistical shift is the move away from the well-known P-value to the false discovery rate (FDR). Less discussion, perhaps, has been devoted to sensitivity or the associated false negative rate (FNR). The purpose of this paper is to explain in simple ways why the shift from P-value to FDR for statistical assessment of microarray data is necessary, to elucidate the determining factors of FDR and, for a two-sample comparative study, to discuss its control via sample size at the design stage. RESULTS: We use a mixture model, involving differentially expressed (DE) and non-DE genes, that captures the most common problem of finding DE genes. Factors determining FDR are (1) the proportion of truly differentially expressed genes, (2) the distribution of the true differences, (3) measurement variability and (4) sample size. Many current small microarray studies are plagued with large FDR, but controlling FDR alone can lead to an unacceptably large FNR. In evaluating the design of a microarray study, sensitivity or FNR curves should be computed routinely together with FDR curves. Under certain assumptions, the FDR and FNR curves coincide, thus simplifying the choice of sample size for controlling the FDR and FNR jointly.
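Under a two-component mixture model of the kind described above, FDR and FNR at a fixed per-test threshold have closed forms, which makes the dependence on the four factors easy to explore numerically. A sketch follows; the two-sided two-sample z-test alternative and the specific parameter values are illustrative assumptions, not taken from the paper:

```python
import math

def norm_cdf(x):
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def norm_ppf(q):
    """Standard normal quantile by bisection (ample precision for design work)."""
    lo, hi = -10.0, 10.0
    for _ in range(80):
        mid = (lo + hi) / 2.0
        if norm_cdf(mid) < q:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0

def fdr_fnr(pi0, delta, n, alpha=0.001):
    """FDR and FNR when each test rejects at level alpha, a fraction pi0 of
    genes are null, and non-null genes have standardized effect size delta
    in a two-sided two-sample z-test with n samples per group."""
    z = norm_ppf(1 - alpha / 2)
    ncp = delta * math.sqrt(n / 2)                    # noncentrality parameter
    power = (1 - norm_cdf(z - ncp)) + norm_cdf(-z - ncp)
    fdr = pi0 * alpha / (pi0 * alpha + (1 - pi0) * power)
    fnr = (1 - pi0) * (1 - power) / (pi0 * (1 - alpha) + (1 - pi0) * (1 - power))
    return fdr, fnr

# Larger samples drive both error rates down at a fixed threshold: power
# rises, so a greater share of rejections are true discoveries and fewer
# true effects are missed.
for n in (10, 20, 50):
    print(n, fdr_fnr(pi0=0.99, delta=1.0, n=n))
```

This mirrors the paper's point that sample size, effect size, and the null proportion jointly determine the achievable FDR, and that FDR and FNR must be examined together at the design stage.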

6.
Cheng C, Pounds S. Bioinformation. 2007;1(10):436-446
Microarray gene expression applications have greatly stimulated statistical research on the problem of massive multiple hypothesis testing. There is now a large body of literature in this area, with essentially five paradigms for massive multiple tests: control of the false discovery rate (FDR), estimation of FDR, significance threshold criteria, control of the family-wise error rate (FWER) or generalized FWER (gFWER), and empirical Bayes approaches. This paper contains a technical survey of the developments of the FDR-related paradigms, emphasizing precise formulation of the problem, concepts of error measurement, and considerations in applications. The goal is not to give an exhaustive literature survey, but rather to review the current state of the field.

7.
Wavelet thresholding with Bayesian false discovery rate control
The false discovery rate (FDR) procedure has become a popular method for handling multiplicity in high-dimensional data. The definition of FDR has a natural Bayesian interpretation: it is the expected proportion of null hypotheses mistakenly rejected, given a measure of evidence for their truth. In this article, we propose controlling the positive FDR using a Bayesian approach in which the rejection rule is based on the posterior probabilities of the null hypotheses. Correspondence between Bayesian and frequentist measures of evidence in hypothesis testing has been studied in several contexts. Here we extend the comparison to multiple testing with control of the FDR and illustrate the procedure with an application to wavelet thresholding. The problem consists of recovering signal from noisy measurements; this involves extracting the wavelet coefficients that result from true signal, and can be formulated as a multiple hypothesis-testing problem. We use simulated examples to compare the performance of our approach to the Benjamini and Hochberg (1995, Journal of the Royal Statistical Society, Series B 57, 289-300) procedure. We also illustrate the method with nuclear magnetic resonance spectral data from the human brain.

8.
Optimal two-stage screening designs for survival comparisons

9.
Two-stage randomization designs (TSRDs) are becoming increasingly common in oncology and AIDS clinical trials, as they make more efficient use of study participants to examine therapeutic regimens. In these designs, patients are initially randomized to an induction treatment, followed by randomization to a maintenance treatment conditional on their induction response and consent to further study treatment. Broader acceptance of TSRDs in drug development may hinge on the ability to make appropriate intent-to-treat type inference within this design framework as to whether an experimental induction regimen is better than a standard induction regimen when maintenance treatment is fixed. Recently Lunceford, Davidian, and Tsiatis (2002, Biometrics 58, 48-57) introduced an inverse probability weighting based analytical framework for estimating survival distributions and mean restricted survival times, as well as for comparing treatment policies at landmarks in the TSRD setting. In practice Cox regression is widely used, and in this article we extend the analytical framework of Lunceford et al. (2002) to derive a consistent estimator for the log hazard in the Cox model and a robust score test to compare treatment policies. Large sample properties of these methods are derived, illustrated via a simulation study, and applied to a TSRD clinical trial.

10.
Implementing false discovery rate control: increasing your power
Popular procedures to control the chance of making type I errors when multiple statistical tests are performed come at a high cost: a reduction in power. As the number of tests increases, power for an individual test may become unacceptably low. This is a consequence of minimizing the chance of making even a single type I error, which is the aim of, for instance, the Bonferroni and sequential Bonferroni procedures. An alternative approach, control of the false discovery rate (FDR), has recently been advocated for ecological studies. This approach aims at controlling the proportion of significant results that are in fact type I errors. Keeping the proportion of type I errors low among all significant results is a sensible, powerful, and easy-to-interpret way of addressing the multiple testing issue. To encourage practical use of the approach, in this note we illustrate how the proposed procedure works, we compare it to more traditional methods that control the familywise error rate, and we discuss some recent useful developments in FDR control.
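The power contrast drawn above can be made concrete with a toy comparison of rejection counts under Bonferroni (familywise error control) and Benjamini-Hochberg (FDR control) at the same nominal level; the p-values below are made up for illustration:

```python
def n_rejections_bonferroni(pvals, alpha=0.05):
    """Reject only p-values below the familywise threshold alpha/m."""
    return sum(p <= alpha / len(pvals) for p in pvals)

def n_rejections_bh(pvals, alpha=0.05):
    """Number of rejections under the Benjamini-Hochberg step-up rule."""
    m = len(pvals)
    k = 0
    for rank, p in enumerate(sorted(pvals), start=1):
        if p <= alpha * rank / m:
            k = rank            # largest rank whose p-value meets the BH bound
    return k

# 20 tests: five small p-values (plausible true effects) among fifteen larger ones.
pvals = [0.0001, 0.0004, 0.0019, 0.0095, 0.02] + [0.1 + 0.05 * i for i in range(15)]
print(n_rejections_bonferroni(pvals))  # only p <= 0.05/20 = 0.0025 survive → 3
print(n_rejections_bh(pvals))          # BH admits one more at the same level → 4
```

The gap widens as the number of tests grows, since the Bonferroni threshold shrinks as 1/m while the BH bounds only shrink for the smallest ranks.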

12.
Zheng G, Song K, Elston RC. Human Heredity. 2007;63(3-4):175-186
We study a two-stage analysis of genetic association for case-control studies. In the first stage, we compare Hardy-Weinberg disequilibrium coefficients between cases and controls and, in the second stage, we apply the Cochran-Armitage trend test. The two analyses are statistically independent when Hardy-Weinberg equilibrium holds in the population, so all the samples are used in both stages. The significance level in the first stage is adaptively determined based on its conditional power. Given the level in the first stage, the level for the second-stage analysis is determined with the overall Type I error being asymptotically controlled. For finite sample sizes, a parametric bootstrap method is used to control the overall Type I error rate. This two-stage analysis is often more powerful than the Cochran-Armitage trend test alone for a large association study. The new approach is applied to SNPs from a real study.
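The Cochran-Armitage trend test used in the second stage has a simple closed form for a 2 × k case-control table. A sketch with additive dosage scores (0, 1, 2) for a genotype table follows; the counts in the example are invented for illustration:

```python
import math

def cochran_armitage_trend(cases, controls, scores=(0, 1, 2)):
    """Cochran-Armitage trend test for a 2 x k case-control table.
    cases/controls are counts per genotype column; scores are the dosage
    weights (0, 1, 2 encodes an additive genetic model). Returns the Z
    statistic and a two-sided p-value from the normal approximation."""
    r, s, x = list(cases), list(controls), list(scores)
    n = [ri + si for ri, si in zip(r, s)]          # column totals
    N, R = sum(n), sum(r)                          # grand total, total cases
    U = sum(xi * (ri - ni * R / N) for xi, ri, ni in zip(x, r, n))
    var = R * (N - R) / N**3 * (
        N * sum(ni * xi**2 for ni, xi in zip(n, x))
        - sum(ni * xi for ni, xi in zip(n, x))**2
    )
    z = U / math.sqrt(var)
    p = math.erfc(abs(z) / math.sqrt(2))           # 2 * (1 - Phi(|z|))
    return z, p

z, p = cochran_armitage_trend(cases=(10, 20, 30), controls=(30, 20, 10))
print(round(z, 3), p)   # clear upward trend in case counts → z ≈ 4.472
```

No continuity correction is applied here; for small counts, an exact or bootstrap calibration (as the abstract's parametric bootstrap suggests for the two-stage level) would be preferable.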

13.
Zhao Y, Wang S. Human Heredity. 2009;67(1):46-56
Study cost remains the major limiting factor for genome-wide association studies, due to the necessity of genotyping a large number of SNPs for a large number of subjects. Both DNA pooling strategies and two-stage designs have been proposed to reduce genotyping costs. In this study, we propose a cost-effective two-stage approach with a DNA pooling strategy. During stage I, all markers are evaluated on a subset of individuals using DNA pooling. The most promising set of markers is then evaluated with individual genotyping for all individuals during stage II. The goal is to determine the optimal parameters (π_sample, the proportion of samples used during stage I with DNA pooling; and π_marker, the proportion of markers evaluated during stage II with individual genotyping) that minimize the cost of a two-stage DNA pooling design while maintaining a desired overall significance level and achieving a level of power similar to that of a one-stage individual genotyping design. We considered the effects of three factors on optimal two-stage DNA pooling designs. Our results suggest that, under most scenarios considered, the optimal two-stage DNA pooling design may be much more cost-effective than the optimal two-stage individual genotyping design, which uses individual genotyping during both stages.

14.
RNA interference (RNAi) is a modality in which small double-stranded RNA molecules (siRNAs), designed to lead to the degradation of specific mRNAs, are introduced into cells or organisms. siRNA libraries have been developed in which siRNAs targeting virtually every gene in the human genome are designed, synthesized, and presented for introduction into cells by transfection in a microtiter plate array. These siRNAs can then be transfected into cells using high-throughput screening (HTS) methodologies. The goal of RNAi HTS is to identify a set of siRNAs that inhibit or activate defined cellular phenotypes. Commonly used analysis methods, including median ± k·MAD thresholding, raise issues about error rates in multiple hypothesis testing and about plate-wise versus experiment-wise analysis. We propose a methodology based on a Bayesian framework to address these issues. Our approach allows for sharing of information across plates in a plate-wise analysis, which obviates the need to choose between a plate-wise and an experiment-wise analysis. The proposed approach incorporates information from reliable controls to achieve higher power and a balance between the contributions from the sample and control wells. Our approach provides false discovery rate (FDR) control to address multiple testing issues, and it is robust to outliers.

15.
We consider two-stage sampling designs, including so-called nested case-control studies, where one takes a random sample from a target population and completes measurements on each subject in the first stage. The second stage involves drawing a subsample from the original sample and collecting additional data on the subsample. This data structure can be viewed as a missing data structure on the full-data structure collected in the second stage of the study. Methods for analyzing two-stage designs include parametric maximum likelihood estimation and estimating equation methodology. We propose an inverse probability of censoring weighted targeted maximum likelihood estimator (IPCW-TMLE) for two-stage sampling designs and present simulation studies featuring this estimator.

16.
False discoveries and models for gene discovery
In the search for genes underlying complex traits, there is a tendency to impose increasingly stringent criteria to avoid false discoveries. These stringent criteria make it hard to find true effects, and we argue that it might be better to optimize our procedures for eliminating and controlling false discoveries. Focusing on achieving an acceptable ratio of true to false positives, we show that false discoveries could be eliminated much more efficiently using a stepwise approach. To avoid a relatively high false discovery rate, corrections for 'multiple testing' might also be needed in candidate gene studies. If the appropriate methods are used, the proportion of true effects appears to be a more important determinant of the genotyping burden than the desired false discovery rate. This raises the question of whether current models for gene discovery are shaped excessively by a fear of false discoveries.

18.
The study of gene functions requires a DNA library of high quality; such a library is obtained through a large amount of testing and screening. Pooling design is a very helpful tool for reducing the number of tests in DNA library screening. In this paper, we present new one- and two-stage pooling designs, together with new probabilistic pooling designs. The approach in this paper works in both error-free and error-tolerant scenarios.

19.
Liu Q, Chi GY. Biometrics. 2001;57(1):172-177
Proschan and Hunsberger (1995, Biometrics 51, 1315-1324) proposed a two-stage adaptive design that maintains the Type I error rate. For practical applications, a two-stage adaptive design is also required to achieve a desired statistical power while limiting the maximum overall sample size. In our proposal, a two-stage adaptive design comprises a main stage and an extension stage: the main stage has sufficient power to reject the null hypothesis under the anticipated effect size, and the extension stage allows the sample size to be increased in case the true effect size is smaller than anticipated. For statistical inference, methods are developed for obtaining the overall adjusted p-value, point estimate and confidence intervals. An exact two-stage test procedure is also outlined for robust inference.

20.
Lin Y, Shih WJ. Biometrics. 2004;60(2):482-490
The main purpose of a phase IIA trial of a new anticancer therapy is to determine whether the therapy has sufficient promise against a specific type of tumor to warrant its further development. The therapy will be rejected for further investigation if the true response rate is less than some uninteresting level, and the test is powered at a specific target response rate. Two-stage designs are commonly used for this situation. However, investigators are often uncertain, at the planning stage, about the target response rate at which to power the study. In this article, motivated by a real example, we propose a strategy for adaptive two-stage designs that uses the information from the first stage of the study either to reject the therapy or to continue testing with either an optimistic or a skeptical target response rate, while the type I error rate is controlled. We also introduce new optimality criteria to reduce the expected total sample size.
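The operating characteristics of a two-stage phase II design of this kind can be computed exactly from the binomial distribution. A sketch follows, using boundary values commonly cited for Simon's optimal design at p0 = 0.10 vs p1 = 0.30 (stage-1 boundary 1 of 10, overall boundary 5 of 29); treat these numbers as an assumed example, since any (r1, n1, r, n) can be plugged in:

```python
from math import comb

def binom_pmf(k, n, p):
    """Binomial probability of exactly k successes in n trials."""
    return comb(n, k) * p ** k * (1 - p) ** (n - k)

def prob_declare_promising(p, n1, r1, n, r):
    """Probability that a two-stage phase II design declares the therapy
    promising: stop after stage 1 unless responses exceed r1 out of n1;
    otherwise declare promising if total responses exceed r out of n."""
    total = 0.0
    for x1 in range(r1 + 1, n1 + 1):                  # interim bar passed
        need = max(r - x1 + 1, 0)                     # stage-2 responses still needed
        p2 = sum(binom_pmf(x2, n - n1, p) for x2 in range(need, n - n1 + 1))
        total += binom_pmf(x1, n1, p) * p2
    return total

# Under the null response rate this is the type I error; under the target
# response rate it is the power of the design.
alpha = prob_declare_promising(0.10, n1=10, r1=1, n=29, r=5)
power = prob_declare_promising(0.30, n1=10, r1=1, n=29, r=5)
print(round(alpha, 4), round(power, 4))
```

An adaptive variant like the one proposed in the abstract would recompute the stage-2 boundary after the interim data are seen, keeping this overall rejection probability under the null at or below the nominal level.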
