Similar Documents
1.
    
We consider multiple testing with false discovery rate (FDR) control when p-values have discrete and heterogeneous null distributions. We propose a new estimator of the proportion of true null hypotheses and demonstrate that it is less upwardly biased than Storey's estimator and two other estimators. The new estimator induces two adaptive procedures: an adaptive Benjamini–Hochberg (BH) procedure and an adaptive Benjamini–Hochberg–Heyse (BHH) procedure. We prove that the adaptive BH (aBH) procedure is nonasymptotically conservative. Through simulation studies, we show that these procedures are usually more powerful than their nonadaptive counterparts, and that the adaptive BHH procedure is usually more powerful than the aBH procedure and a procedure based on randomized p-values. The adaptive procedures are applied to a study of HIV vaccine efficacy, where they identify more differentially polymorphic positions than the BH procedure at the same FDR level.
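The adaptive BH idea in this abstract can be sketched generically. Below is a minimal Python illustration of an adaptive BH procedure using Storey's estimator, which is the baseline the paper's new estimator is compared against; it is not the authors' discrete-aware estimator:

```python
# Sketch of an adaptive Benjamini-Hochberg procedure: estimate the
# proportion pi0 of true null hypotheses (here with Storey's estimator),
# then run the step-up BH procedure at the adjusted level alpha / pi0_hat.

def storey_pi0(pvals, lam=0.5):
    """Storey's estimator: fraction of p-values above lam, rescaled.

    Floored at 1/m to avoid dividing alpha by zero when all p-values
    are small."""
    m = len(pvals)
    est = sum(p > lam for p in pvals) / ((1 - lam) * m)
    return min(1.0, max(est, 1.0 / m))

def bh_rejections(pvals, alpha):
    """Indices rejected by the step-up BH procedure at level alpha."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    k = 0
    for rank, i in enumerate(order, start=1):
        if pvals[i] <= rank * alpha / m:
            k = rank  # largest rank satisfying the step-up condition
    return set(order[:k])

def adaptive_bh(pvals, alpha, lam=0.5):
    """Adaptive BH: BH applied at the inflated level alpha / pi0_hat."""
    return bh_rejections(pvals, alpha / storey_pi0(pvals, lam))
```

Since pi0_hat is at most 1, the adaptive procedure always rejects at least as much as plain BH; its power gain comes from inflating the level when many hypotheses appear non-null.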

2.
    
In this paper, we consider online multiple testing with familywise error rate (FWER) control, where the probability of committing at least one type I error remains under control while testing a possibly infinite sequence of hypotheses over time. Currently, adaptive-discard (ADDIS) procedures appear to be the most promising online procedures with FWER control in terms of power. Our main contribution is a uniform improvement of the ADDIS principle, and thus of all ADDIS procedures: the methods we propose reject at least as many hypotheses as ADDIS procedures, and in some cases more, while maintaining FWER control. In addition, we show that no other FWER-controlling procedure enlarges the event of rejecting any hypothesis. Finally, we apply the new principle to derive uniform improvements of ADDIS-Spending and ADDIS-Graph.

3.
The paper is concerned with the expected type I errors of some stepwise multiple test procedures based on independent p-values that control the so-called false discovery rate (FDR). We derive an asymptotic result for the supremum of the expected type I error rate (EER) as the number of hypotheses tends to infinity. Among other things, it will be shown that when the original Benjamini–Hochberg step-up procedure controls the FDR at level α, its EER may approach a value slightly larger than α/4 as the number of hypotheses increases. Moreover, we derive some least favourable parameter configuration results, bounds for the FDR and the EER, as well as easily computable formulae for the familywise error rate (FWER) of two FDR-controlling procedures. Finally, we discuss some undesirable properties of the FDR concept, especially the problem of cheating.

4.
    
John Lawrence, Biometrics, 2019, 75(4):1334-1344
It is known that the one-sided Simes' test controls the error rate if the underlying distribution is multivariate totally positive of order 2 (MTP2), but not in general. The two-sided test also controls the error rate when the coordinate absolute values have an MTP2 distribution, which holds more generally. We prove mathematically that when the two-sided test controls the error rate at level 2α, certain kinds of truncated Simes' tests also control the one-sided error rate at level α. We also compare the closure of the truncated tests with the Holm, Hochberg, and Hommel procedures in many scenarios where the test statistics are multivariate normal.
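For reference, the classical (unweighted) Simes global test that the truncated variants modify can be sketched as follows; the truncation rule itself is specific to the paper and is not reproduced here:

```python
# Simes' global test: reject the global null hypothesis at level alpha
# if p_(i) <= i * alpha / m for some i, where p_(1) <= ... <= p_(m)
# are the ordered p-values.

def simes_test(pvals, alpha):
    """Return True iff Simes' test rejects the global null at level alpha."""
    m = len(pvals)
    return any(p <= (i + 1) * alpha / m
               for i, p in enumerate(sorted(pvals)))
```

For example, with p-values (0.03, 0.05) at α = 0.05, Bonferroni fails (both exceed α/2) but Simes rejects because the larger p-value is at most α.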

5.
    
The Globaltest is a powerful test of the global null hypothesis that there is no association between a group of features and a response of interest, and is popular for pathway testing in metabolomics. Evaluating multiple feature sets, however, requires multiple testing correction. In this paper, we propose a multiple testing method, based on closed testing, specifically designed for the Globaltest. The proposed method controls the familywise error rate simultaneously over all possible feature sets and therefore allows post hoc inference; that is, the researcher may choose feature sets of interest after seeing the data without jeopardizing error control. To circumvent the exponential computation time of closed testing, we derive a novel shortcut that allows exact closed testing to be performed at the scale of metabolomics data. An R package, ctgt, is available on the Comprehensive R Archive Network for the implementation of the shortcut procedure, with applications to several real metabolomics data examples.

6.
Gene expression signatures from microarray experiments promise to provide important prognostic tools for predicting disease outcome or response to treatment. A number of microarray studies in various cancers have reported such gene signatures. However, the overlap of gene signatures in the same disease has been limited so far, and some reported signatures have not been reproduced in other populations. Clearly, the methods used for verifying novel gene signatures need improvement. In this article, we describe an experiment in which microarrays and sample hybridization are designed according to the statistical principles of randomization, replication and blocking. Our results show that such designs provide unbiased estimation of differential expression levels as well as powerful tests for them.

7.
    
Multiple testing (MT) with false discovery rate (FDR) control has been widely conducted in the "discrete paradigm," where p-values have discrete and heterogeneous null distributions. In this scenario, however, existing FDR procedures often lose power and may yield unreliable inference, and there does not seem to be an FDR procedure that partitions hypotheses into groups, employs data-adaptive weights, and is nonasymptotically conservative. We propose a weighted p-value-based FDR procedure, the "weighted FDR (wFDR) procedure" for short, for MT in the discrete paradigm that efficiently adapts to both the heterogeneity and the discreteness of p-value distributions. We theoretically justify the nonasymptotic conservativeness of the wFDR procedure under independence, and show via simulation studies that, for MT based on p-values of the binomial test or Fisher's exact test, it is more powerful than six other procedures. The wFDR procedure is applied to two examples based on discrete data, a drug safety study and a differential methylation study, where it makes more discoveries than two existing methods.

8.
    
Hung et al. (2007) considered the problem of controlling the type I error rate for a primary and secondary endpoint in a clinical trial using a gatekeeping approach in which the secondary endpoint is tested only if the primary endpoint crosses its monitoring boundary. They considered a two-look trial and showed by simulation that the naive method of testing the secondary endpoint at full level α at the time the primary endpoint reaches statistical significance does not control the familywise error rate at level α. Tamhane et al. (2010) derived analytic expressions for the familywise error rate and power and confirmed the inflated error rate of the naive approach. Nonetheless, many people mistakenly believe that the closure principle can be used to prove that the naive procedure controls the familywise error rate. The purpose of this note is to explain in greater detail why there is a problem with the naive approach and to show that the degree of alpha inflation can be as high as that of unadjusted monitoring of a single endpoint.

9.
10.
    
For multiple testing based on discrete p-values, we propose a false discovery rate (FDR) procedure, "BH+," with proven conservativeness. BH+ is at least as powerful as the BH (i.e., Benjamini–Hochberg) procedure when they are applied to superuniform p-values. Further, when applied to mid-p-values, BH+ can be more powerful than when it is applied to conventional p-values; an easily verifiable necessary and sufficient condition for this is provided. BH+ is perhaps the first conservative FDR procedure applicable to mid-p-values and to p-values with general distributions. It is applied to multiple testing based on discrete p-values in a methylation study, an HIV study, and a clinical safety study, where it makes considerably more discoveries than the BH procedure. In addition, we propose an adaptive version of the BH+ procedure, prove its conservativeness under certain conditions, and provide evidence of its excellent performance via simulation studies.
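The mid-p-value referred to above replaces half of the point mass at the observed value of a discrete statistic, which makes discrete p-values less conservative. A minimal sketch for a one-sided binomial test; this illustrates the mid-p idea only, not the BH+ procedure itself:

```python
# For discrete test statistics, the conventional p-value P(X >= x) is
# conservative; the mid-p-value keeps only half the point mass at the
# observed value: mid-p = P(X > x) + 0.5 * P(X = x).

from math import comb

def binom_pmf(k, n, p):
    """Binomial probability mass function."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

def binom_p_and_midp(x, n, p):
    """Conventional and mid-p-values for the one-sided alternative
    that the success probability exceeds p."""
    tail = sum(binom_pmf(k, n, p) for k in range(x, n + 1))  # P(X >= x)
    point = binom_pmf(x, n, p)                               # P(X = x)
    return tail, tail - 0.5 * point
```

For instance, observing 8 successes in 10 trials under p = 0.5 gives a conventional p-value of 56/1024 ≈ 0.055 but a mid-p-value of 33.5/1024 ≈ 0.033, so the mid-p version crosses the 0.05 threshold while the conventional one does not.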

11.
    
In this paper, we consider multiplicity testing approaches, mainly for phase 3 trials with two doses. We review a few available approaches and propose some new ones. The doses selected for phase 3 usually have the same or a similar efficacy profile, so they exhibit some degree of consistency in efficacy. We review the Hochberg procedure, the Bonferroni procedure, and a few consistency-adjusted procedures, and suggest new ones by applying the available procedures to the pooled dose and the high dose, that is, the dose thought to be the more efficacious of the two. The rationale is that the pooled dose and the high dose are more consistent than the original two doses if the high dose is more efficacious than the low dose. We compare all approaches via simulations and recommend a procedure combining 4A and the pooling approach. We also briefly discuss the testing strategy for trials with more than two doses.

12.
    
Kun Liang, Biometrics, 2016, 72(2):639-648

13.
14.
    
Mass spectrometry-based proteomics starts with the identification of peptides and proteins, which provides the basis for forming the next-level hypotheses, whose "validations" are often employed to form even higher-level hypotheses, and so forth. Scientifically meaningful conclusions are thus attainable only if the number of falsely identified peptides/proteins is accurately controlled. For this reason, RAId has been continually developed over the past decade. RAId employs rigorous statistics for peptide/protein identification, assigning accurate P-values/E-values that can be used confidently to control the number of falsely identified peptides and proteins. The RAId web service is a versatile tool built to identify peptides and proteins from tandem mass spectrometry data. In addition to recognizing various spectra file formats, the web service offers four peptide scoring functions and a choice of three statistical methods for assigning P-values/E-values to identified peptides. Users may upload their own protein database or use one of the available knowledge-integrated organismal databases, which contain annotated information such as single amino acid polymorphisms, post-translational modifications, and their disease associations. The web service also provides a friendly interface to display, sort by different criteria, and download the identified peptides and proteins. The RAId web service is freely available at https://www.ncbi.nlm.nih.gov/CBBresearch/Yu/raid

15.
The two-sided Simes test is known to control the type I error rate with bivariate normal test statistics. For one-sided hypotheses, control of the type I error rate requires that the correlation between the bivariate normal test statistics is non-negative. In this article, we introduce a trimmed version of the one-sided weighted Simes test for two hypotheses, which rejects if (i) the one-sided weighted Simes test rejects and (ii) both p-values are below one minus their respective weighted Bonferroni-adjusted levels. We show that the trimmed version controls the type I error rate at nominal significance level α if (i) the common distribution of the test statistics is point symmetric and (ii) the two-sided weighted Simes test at level 2α controls the type I error rate. These assumptions apply, for instance, to bivariate normal test statistics with arbitrary correlation. In a simulation study, we compare the power of the trimmed weighted Simes test with that of the weighted Bonferroni test and the untrimmed weighted Simes test. An additional result of this article ensures type I error rate control of the usual weighted Simes test under a weak version of the positive regression dependence condition for the case of two hypotheses. This condition is shown to apply to the two-sided p-values of one- or two-sample t-tests for bivariate normal endpoints with arbitrary correlation, and to the corresponding one-sided p-values if the correlation is non-negative. The Simes test for such bivariate t-tests has not been considered before. According to our main result, the trimmed version of the weighted Simes test then also applies to the one-sided bivariate t-test with arbitrary correlation.
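The trimming rule stated in this abstract can be sketched directly for two hypotheses. This is a minimal illustration assuming the standard weighted Simes test with weights attached to the hypotheses in the order of their p-values; it is not the authors' implementation:

```python
# Trimmed one-sided weighted Simes test for two hypotheses: reject iff
# (i) the weighted Simes test rejects and (ii) both p-values are below
# one minus their respective weighted Bonferroni-adjusted levels.
# Weights are assumed to satisfy w1 + w2 = 1.

def weighted_simes_two(p1, p2, w1, w2, alpha):
    """Weighted Simes test for m = 2: weights follow the p-value order."""
    (pa, wa), (pb, wb) = sorted([(p1, w1), (p2, w2)])
    return pa <= wa * alpha or pb <= (wa + wb) * alpha

def trimmed_weighted_simes_two(p1, p2, w1, w2, alpha):
    """Trimmed version: additionally require both p-values to be below
    one minus their weighted Bonferroni level."""
    trim_ok = p1 < 1 - w1 * alpha and p2 < 1 - w2 * alpha
    return trim_ok and weighted_simes_two(p1, p2, w1, w2, alpha)
```

With equal weights and α = 0.05, the p-value pair (0.02, 0.99) is rejected by the untrimmed test but not by the trimmed test, since 0.99 exceeds 1 − α/2 = 0.975; the trimming removes exactly such configurations with an extreme p-value.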

16.
Li J, Jiang J, Leung FC, Gene, 2012, 494(1):57-64
Next-generation 454 pyrosequencing technology for whole bacterial genome sequencing involves a deep sequencing strategy, with at least 15-20× depth proposed by official protocols but usually over 20× used in practice. In this study, we carried out a comprehensive evaluation of the quality of de novo assemblies based on realistic pyrosequencing data simulated from 1480 prokaryote genomes and 7 runs of machine-generated data. Our results demonstrate that for most prokaryote genomes, 6-10× sequencing in qualified runs with 400 bp reads can produce a high-quality draft assembly (> 98% genome coverage, < 100 contigs with N50 size > 100 kb, single-base accuracy > 99.99%, indel error rate < 0.01%, false gene loss/duplication rate < 0.5%). Our study demonstrates the power of a low-depth pyrosequencing strategy, which provides a cost-effective way to sequence whole prokaryote genomes in a short time and enables further studies in microbial population diversity and comparative genomics.
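The N50 statistic quoted in the quality thresholds above is the contig length such that contigs of at least that length cover half of the total assembly; a minimal sketch:

```python
# N50: sort contig lengths in decreasing order and accumulate until at
# least 50% of the total assembly length is covered; the contig length
# reached at that point is the N50.

def n50(contig_lengths):
    """Return the N50 of a non-empty list of contig lengths."""
    total = sum(contig_lengths)
    running = 0
    for length in sorted(contig_lengths, reverse=True):
        running += length
        if running * 2 >= total:
            return length
```

For contigs of lengths 100, 200, 300, and 400 (total 1000), the two largest contigs already cover 700 bases, so the N50 is 300.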

17.
Detection of positive Darwinian selection has become ever more important with the rapid growth of genomic data sets. Recent branch-site models of codon substitution account for variation of selective pressure over branches on the tree and across sites in the sequence, and provide a means to detect short episodes of molecular adaptation affecting just a few sites. In likelihood ratio tests based on such models, the branches to be tested for positive selection have to be specified a priori. In the absence of a biological hypothesis to designate so-called foreground branches, one may test many branches, but a correction for multiple testing becomes necessary. In this paper, we employ computer simulation to evaluate the performance of 6 multiple test correction procedures when the branch-site models are used to test every branch on the phylogeny for positive selection. Four of the methods control the familywise error rate (FWER), whereas the other 2 control the false discovery rate (FDR). We found that all correction procedures achieved acceptable FWER control except for extremely divergent sequences and serious model violations, where the test may become unreliable. The power of the test to detect positive selection is influenced by the strength of selection and the sequence divergence, with the highest power observed at intermediate divergences. The 4 correction procedures that control the FWER had similar power. We recommend Rom's procedure for its slightly higher power, but the simple Bonferroni correction is usable as well. The 2 correction procedures that control the FDR had slightly more power but also higher FWER. We demonstrate the multiple test procedures by analyzing gene sequences from the extracellular domain of the cluster of differentiation 2 (CD2) gene from 10 mammalian species. Both our simulation and the real data analysis suggest that the multiple test procedures are useful when multiple branches have to be tested on the same data set.
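Rom's procedure uses recursively computed critical constants and is not reproduced here; as a simpler illustration of the FWER corrections compared above, here are sketches of the Bonferroni correction and Holm's step-down procedure:

```python
# Two FWER-controlling corrections: the single-step Bonferroni rule and
# Holm's step-down refinement, which is uniformly at least as powerful.

def bonferroni_reject(pvals, alpha):
    """Bonferroni: reject H_i iff p_i <= alpha / m."""
    m = len(pvals)
    return [p <= alpha / m for p in pvals]

def holm_reject(pvals, alpha):
    """Holm's step-down procedure: test ordered p-values against
    alpha / m, alpha / (m-1), ... and stop at the first failure."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    reject = [False] * m
    for step, i in enumerate(order):
        if pvals[i] > alpha / (m - step):
            break
        reject[i] = True
    return reject
```

For p-values (0.01, 0.02, 0.3) at α = 0.05, Bonferroni rejects only the first hypothesis (threshold 0.0167), while Holm also rejects the second because its step-down threshold relaxes to 0.025.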

18.
There are many common misapprehensions about statistics in the literature. We are sure that the three misapprehensions we deal with in this short review are widespread. They concern:
1) what P values mean;
2) what an insignificant result means, and what it does not mean, including the question of the 'power' of a statistical test;
3) the difference between importance and statistical significance.
We produce no formulae or recipes for dealing with particular situations; instead we concentrate on the commonsense use of simple statistics. We emphasise that if the use of any but the simplest statistics is intended, it is much better to get proper statistical help before starting experiments rather than afterwards. Copyright © 2009 John Wiley & Sons, Ltd.

19.
Quantitative research, especially in the social but also in the biological sciences, has been limited by the availability and applicability of analytic techniques that elaborate interactions among behaviours, treatment effects, and mediating variables. This gap has been filled by a newly developed statistical technique known as graphical interaction modelling. The merit of graphical models for analyzing highly structured data is explored in this paper via an empirical study on coping with a chronic condition as a function of the interrelationships among three sets of factors: background factors, illness context factors, and four self-care practices. Based on a graphical chain model, the direct and indirect dependencies are revealed and discussed in comparison with the results obtained from a simple logistic regression model that ignores possible interaction effects. Both techniques are introduced from a tutorial point of view rather than in full technical detail.

20.
