Similar Literature
1.
The Newman-Keuls (NK) procedure for testing all pairwise comparisons among a set of treatment means, introduced by Newman (1939) and in a slightly different form by Keuls (1952), was proposed as a reasonable way to alleviate the inflation of error rates when a large number of means are compared. It was proposed before the concepts of different types of multiple error rates were introduced by Tukey (1952a, b; 1953). Although it was popular in the 1950s and 1960s, once control of the familywise error rate (FWER) was accepted generally as an appropriate criterion in multiple testing, and it was realized that the NK procedure does not control the FWER at the nominal level at which it is performed, the procedure gradually fell out of favor. Recently, a more liberal criterion, control of the false discovery rate (FDR), has been proposed as more appropriate in some situations than FWER control. This paper notes that the NK procedure and a nonparametric extension control the FWER within any set of homogeneous treatments. It proves that the extended procedure controls the FDR when there are well-separated clusters of homogeneous means and between-cluster test statistics are independent, and extensive simulation provides strong evidence that the original procedure controls the FDR under the same conditions and some dependent conditions when the clusters are not well-separated. Thus, the test has two desirable error-controlling properties, providing a compromise between FDR control with no subgroup FWER control and global FWER control. Yekutieli (2002) developed an FDR-controlling procedure for testing all pairwise differences among means, without any FWER-controlling criteria when there is more than one cluster. The empirical example in Yekutieli's paper was used to compare the Benjamini-Hochberg (1995) method with apparent FDR control in this context, Yekutieli's proposed method with proven FDR control, the Newman-Keuls method that controls FWER within equal clusters with apparent FDR control, and several methods that control FWER globally. The Newman-Keuls procedure is shown to be intermediate in number of rejections between the FWER-controlling methods and the FDR-controlling methods in this example, although it is not always more conservative than the other FDR-controlling methods.
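The stepwise logic behind the NK procedure is easy to make concrete. Below is a minimal sketch for a balanced one-way layout (equal group size n), using SciPy's studentized range distribution (available from SciPy 1.7); the function name and the toy inputs are illustrative, not taken from the paper.

```python
import numpy as np
from scipy.stats import studentized_range

def newman_keuls(means, mse, n, df_error, alpha=0.05):
    """Pairs (i, j) declared different by the NK step-down range test."""
    order = np.argsort(means)            # positions of means, ascending
    m = np.asarray(means)[order]
    se = np.sqrt(mse / n)                # standard error of a group mean
    significant, seen = set(), set()

    def step_down(lo, hi):
        if hi - lo < 1 or (lo, hi) in seen:
            return
        seen.add((lo, hi))
        p = hi - lo + 1                  # number of ordered means spanned
        q_obs = (m[hi] - m[lo]) / se
        if q_obs > studentized_range.ppf(1 - alpha, p, df_error):
            a, b = int(order[lo]), int(order[hi])
            significant.add((min(a, b), max(a, b)))
            step_down(lo, hi - 1)        # only significant stretches are
            step_down(lo + 1, hi)        # opened up; others stay homogeneous

    step_down(0, len(m) - 1)
    return significant

# toy usage: 4 group means with ANOVA MSE = 1.2, n = 8 per group, 28 error df
print(newman_keuls([10.1, 10.3, 12.0, 12.2], mse=1.2, n=8, df_error=28))
```

The key NK feature is visible in the recursion: a stretch of ordered means is tested further only when its extreme range is significant, which is exactly what confines the error control to homogeneous subsets.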

2.
Kwong KS  Cheung SH  Chan WS 《Biometrics》2004,60(2):491-498
In clinical studies, multiple superiority/equivalence testing procedures can be applied to classify a new treatment as superior, equivalent (same therapeutic effect), or inferior to each of a set of standard treatments. Previous stepwise approaches (Dunnett and Tamhane, 1997, Statistics in Medicine 16, 2489-2506; Kwong, 2001, Journal of Statistical Planning and Inference 97, 359-366) are only appropriate for balanced designs. Unfortunately, the construction of similar tests for unbalanced designs is far more complex, with two major difficulties: (i) the ordering of the test statistics for superiority may not be the same as the ordering of the test statistics for equivalence; and (ii) the correlation structure of the test statistics is not equi-correlated but product-correlated. In this article, we develop a two-stage testing procedure for unbalanced designs, which are very popular in clinical experiments. The procedure is a combination of step-up and single-step testing procedures, and the familywise error rate is proved to be controlled at a designated level. Furthermore, a simulation study is conducted to compare the average power of the proposed procedure with that of the single-step procedure. In addition, a clinical example is provided to illustrate the application of the new procedure.

3.
Müller BU  Stich B  Piepho HP 《Heredity》2011,106(5):825-831
Control of the genome-wide type I error rate (GWER) is an important issue in association mapping and linkage mapping experiments. For the latter, different approaches, such as permutation procedures or the Bonferroni correction, have been proposed. The permutation test, however, cannot account for the population structure present in most association mapping populations, which can lead to false positive associations. The Bonferroni correction is applicable but usually conservative, because the correlation of tests cannot be exploited. Therefore, a new approach is proposed that controls the genome-wide error rate while accounting for population structure. This approach is based on a simulation procedure that is equally applicable in a linkage and an association mapping context. Using the parameter settings of three real data sets, it is shown that the procedure provides control of the GWER and the generalized genome-wide type I error rate (GWER(k)).
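The core of any such simulation-based threshold is the null distribution of the genome-wide maximal test statistic. The sketch below illustrates the idea with an equicorrelated normal toy model; the authors' procedure instead simulates from the model fitted to the mapping population, so all settings here (marker count, correlation rho) are stand-in assumptions.

```python
import numpy as np
rng = np.random.default_rng(0)

n_markers, reps, alpha, rho = 500, 5000, 0.05, 0.5   # toy settings
max_stats = np.empty(reps)
for r in range(reps):
    shared = rng.normal()                             # common factor -> correlation rho
    z = np.sqrt(rho) * shared + np.sqrt(1 - rho) * rng.normal(size=n_markers)
    max_stats[r] = np.max(np.abs(z))                  # genome-wide maximal |z| under H0

# (1 - alpha) quantile of the simulated maximum controls the GWER at alpha
threshold = np.quantile(max_stats, 1 - alpha)
print("GWER-controlling |z| threshold:", round(threshold, 3))
```

Because the simulation respects the correlation among tests, the resulting threshold is less conservative than the Bonferroni cutoff for the same number of markers.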

4.
The paper is concerned with expected type I errors of some stepwise multiple test procedures based on independent p‐values controlling the so‐called false discovery rate (FDR). We derive an asymptotic result for the supremum of the expected type I error rate (EER) when the number of hypotheses tends to infinity. Among others, it will be shown that when the original Benjamini‐Hochberg step‐up procedure controls the FDR at level α, its EER may approach a value slightly larger than α/4 as the number of hypotheses increases. Moreover, we derive some least favourable parameter configuration results, some bounds for the FDR and the EER, as well as easily computable formulae for the familywise error rate (FWER) of two FDR‐controlling procedures. Finally, we discuss some undesirable properties of the FDR concept, especially the problem of cheating.
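For reference, the Benjamini‐Hochberg step‐up procedure analyzed here is simple to state: reject the k smallest p-values, where k is the largest index with p_(k) ≤ kα/m. A minimal sketch, with illustrative inputs not taken from the paper:

```python
import numpy as np

def benjamini_hochberg(pvals, alpha=0.05):
    """Boolean mask of hypotheses rejected by the BH step-up procedure."""
    p = np.asarray(pvals)
    m = p.size
    order = np.argsort(p)                          # ascending p-values
    ranked = p[order]
    below = ranked <= (np.arange(1, m + 1) / m) * alpha
    rejected = np.zeros(m, dtype=bool)
    if below.any():
        k = np.max(np.nonzero(below)[0])           # largest k with p_(k) <= k*alpha/m
        rejected[order[: k + 1]] = True            # reject the k smallest p-values
    return rejected

print(benjamini_hochberg([0.001, 0.008, 0.039, 0.041, 0.27, 0.6]))
```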

5.
One of the main tasks when dealing with the impacts of infrastructures on wildlife is to identify hotspots of high mortality so one can devise and implement mitigation measures. A common strategy is to divide an infrastructure into several segments and flag a segment when its number of collisions exceeds a threshold reflecting a desired significance level, obtained by assuming a probability distribution for the number of collisions, often the Poisson distribution. The problem with this approach, when applied to each segment individually, is that the probability of identifying false hotspots (Type I error) is potentially high. The way to solve this problem is to recognize that it requires multiple testing corrections or a Bayesian approach. Here, we apply three different methods that implement the required corrections to the identification of hotspots: (i) the familywise error rate correction, (ii) the false discovery rate, and (iii) a Bayesian hierarchical procedure. We illustrate the application of these methods with data on two bird species collected on a road in Brazil. The proposed methods provide practitioners with procedures that are reliable and simple to use in real situations and, in addition, can reflect a practitioner's concerns towards identifying false positives or missing true hotspots. Although one may argue that an overly cautious approach (reducing the probability of Type I error) may be beneficial from a biological conservation perspective, it may lead to a waste of resources and, probably worse, it may raise doubts about the methodology adopted and the credibility of those suggesting it.
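A hedged sketch of the screening step just described: per-segment collision counts are compared against a Poisson null, and the resulting p-values are then multiplicity-corrected; Bonferroni gives the FWER-controlling variant, while the BH sketch under entry 4 covers the FDR variant. The counts and null mean below are toy values, not the Brazilian road data.

```python
import numpy as np
from scipy.stats import poisson

counts = np.array([2, 0, 9, 1, 3, 12, 1, 0, 2, 4])   # collisions per segment (toy)
lam = counts.mean()                                   # null: common Poisson mean

# upper-tail p-value per segment: P(X >= observed count) under Poisson(lam)
pvals = poisson.sf(counts - 1, lam)

alpha = 0.05
bonferroni_hot = pvals <= alpha / len(counts)         # FWER-controlling cutoff
print("hotspots (Bonferroni):", np.nonzero(bonferroni_hot)[0])
```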

6.
Liu Q  Chi GY 《Biometrics》2001,57(1):172-177
Proschan and Hunsberger (1995, Biometrics 51, 1315-1324) proposed a two-stage adaptive design that maintains the Type I error rate. For practical applications, a two-stage adaptive design is also required to achieve a desired statistical power while limiting the maximum overall sample size. In our proposal, a two-stage adaptive design comprises a main stage and an extension stage, where the main stage has sufficient power to reject the null under the anticipated effect size and the extension stage allows increasing the sample size in case the true effect size is smaller than anticipated. For statistical inference, methods for obtaining the overall adjusted p-value, point estimate, and confidence intervals are developed. An exact two-stage test procedure is also outlined for robust inference.

7.
In many applications where it is necessary to test multiple hypotheses simultaneously, the data encountered are discrete. In such cases, it is important for the multiplicity adjustment to take into account the discreteness of the distributions of the p‐values, to ensure that the procedure is not overly conservative. In this paper, we review some known multiple testing procedures for discrete data that control the familywise error rate, the probability of making any false rejection. Taking advantage of the fact that the exact permutation or exact pairwise permutation distributions of the p‐values can often be determined when the sample size is small, we investigate procedures that incorporate the dependence structure through the exact permutation distribution and propose two new procedures that incorporate the exact pairwise permutation distributions. A step‐up procedure is also proposed that accounts for the discreteness of the data. The performance of the proposed procedures is investigated through simulation studies and two applications. The results show that by incorporating both the discreteness and the dependency of the p‐value distributions, gains in power can be achieved.
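Not one of the paper's proposals, but a classical illustration of why discreteness helps: Tarone's modification of the Bonferroni procedure discards hypotheses whose minimal attainable p-value (computed from the exact test's discrete support, e.g. Fisher's exact test) is too large ever to be rejected. All inputs below are toy values.

```python
def tarone_bonferroni(pvals, min_attainable, alpha=0.05):
    """FWER-controlling Tarone procedure for discrete tests."""
    m = len(pvals)
    # smallest k such that at most k tests can ever attain p <= alpha/k
    for k in range(1, m + 1):
        eligible = [i for i in range(m) if min_attainable[i] <= alpha / k]
        if len(eligible) <= k:
            break
    # reject at the effective level alpha/k among the eligible tests
    return [i for i in eligible if pvals[i] <= alpha / k]

# toy: 5 tests; tests 3 and 4 can never reach small p-values, so they
# do not count toward the Bonferroni denominator
print(tarone_bonferroni([0.012, 0.030, 0.001, 0.40, 0.25],
                        [0.005, 0.005, 0.0005, 0.12, 0.12]))
```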

8.
The Globaltest is a powerful test for the global null hypothesis that there is no association between a group of features and a response of interest, and it is popular for pathway testing in metabolomics. Evaluating multiple feature sets, however, requires multiple testing correction. In this paper, we propose a multiple testing method, based on closed testing, specifically designed for the Globaltest. The proposed method controls the familywise error rate simultaneously over all possible feature sets and therefore allows post hoc inference; that is, the researcher may choose feature sets of interest after seeing the data without jeopardizing error control. To circumvent the exponential computation time of closed testing, we derive a novel shortcut that allows exact closed testing to be performed on the scale of metabolomics data. An R package, ctgt, is available on the Comprehensive R Archive Network (CRAN) for the implementation of the shortcut procedure, with applications to several real metabolomics data examples.
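Not the paper's Globaltest shortcut, but a generic sketch of the closed testing principle it builds on: an elementary hypothesis is rejected only if every intersection hypothesis containing it is rejected by a local level-α test (here an illustrative Bonferroni local test). The brute-force enumeration also shows why a shortcut matters: the number of intersections grows exponentially in the number of hypotheses.

```python
from itertools import combinations

def closed_testing(pvals, alpha=0.05):
    """Elementary hypotheses rejected by closed testing (Bonferroni local tests)."""
    m = len(pvals)
    hyps = range(m)
    rejected_sets = set()
    for k in range(1, m + 1):
        for S in combinations(hyps, k):
            # local Bonferroni test of the intersection hypothesis H_S
            if min(pvals[i] for i in S) <= alpha / len(S):
                rejected_sets.add(frozenset(S))
    # H_i is rejected iff every intersection containing i was locally rejected
    return [i for i in hyps
            if all(frozenset(S) in rejected_sets
                   for k in range(1, m + 1)
                   for S in combinations(hyps, k) if i in S)]

print(closed_testing([0.004, 0.021, 0.50]))   # -> [0, 1]
```

With Bonferroni local tests this reproduces the Holm procedure; the paper's contribution is making the same principle exact and fast for Globaltest local tests.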

9.
In this paper, we focus our discussion on testing the homogeneity of the risk difference for sparse data, in which we have few patients in each stratum but a moderate or large number of strata. When the number of patients per treatment within strata is small (2 to 5 patients), none of the test procedures proposed previously for testing the homogeneity of the risk difference for sparse data performs well. On the basis of bootstrap methods, we develop a simple test procedure that can improve the power of the previous test procedures. Using Monte Carlo simulations, we demonstrate that the test procedure developed here can perform reasonably well with respect to Type I error even when the number of patients per stratum for each treatment is as small as two. We evaluate and study the power of the proposed test procedure in a variety of situations and include a comparison of the performance between the test statistics proposed elsewhere and the test procedure developed here. Finally, we briefly discuss the limitations of using the proposed test procedure. We use data comparing two chemotherapy treatments in patients with multiple myeloma to illustrate the use of the proposed test procedure.

10.
Tsai CA  Hsueh HM  Chen JJ 《Biometrics》2003,59(4):1071-1081
Testing for significance with gene expression data from DNA microarray experiments involves simultaneous comparisons of hundreds or thousands of genes. If R denotes the number of rejections (declared significant genes) and V denotes the number of false rejections, then V/R, if R > 0, is the proportion of falsely rejected hypotheses. This paper proposes a model for the distribution of the number of rejections and the conditional distribution of V given R, V | R. Under the independence assumption, the distribution of R is a convolution of two binomials and the distribution of V | R is a noncentral hypergeometric distribution. Under an equicorrelated model, the distributions are more complex and are also derived. Five false discovery rate probability error measures are considered: FDR = E(V/R), pFDR = E(V/R | R > 0) (positive FDR), cFDR = E(V/R | R = r) (conditional FDR), mFDR = E(V)/E(R) (marginal FDR), and eFDR = E(V)/r (empirical FDR). The pFDR, cFDR, and mFDR are shown to be equivalent under the Bayesian framework, in which the number of true null hypotheses is modeled as a random variable. We present a parametric and a bootstrap procedure to estimate the FDRs. Monte Carlo simulations were conducted to evaluate the performance of these two methods. The bootstrap procedure appears to perform reasonably well, even when the alternative hypotheses are correlated (ρ = 0.25). An example from a toxicogenomic microarray experiment is presented for illustration.
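The distinction among the measures defined above is easy to see by Monte Carlo: FDR treats V/R as 0 when R = 0, while pFDR conditions on R > 0, so the two separate whenever "no rejections" is a likely outcome. A small sketch under independent normal statistics; all settings are illustrative.

```python
import numpy as np
rng = np.random.default_rng(1)

m, m0, mu = 10, 8, 2.5                   # 8 true nulls, 2 means shifted by 2.5
reps = 10_000
fdp = np.empty(reps)
any_rejection = np.empty(reps, dtype=bool)
for r in range(reps):
    z = rng.normal(size=m)
    z[m0:] += mu                         # false nulls get a mean shift
    reject = np.abs(z) > 2.576           # unadjusted per-test level-0.01 z-tests
    R, V = reject.sum(), reject[:m0].sum()
    fdp[r] = V / R if R > 0 else 0.0     # convention: V/R := 0 when R = 0
    any_rejection[r] = R > 0

print("FDR  ~", fdp.mean())                    # estimates E(V/R)
print("pFDR ~", fdp[any_rejection].mean())     # estimates E(V/R | R > 0)
```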

11.
We propose a new procedure for constructing inferences about a measure of interobserver agreement in studies involving a binary outcome and multiple raters. The proposed procedure, based on a chi-square goodness-of-fit test as applied to the correlated binomial model (Bahadur, 1961, in Studies in Item Analysis and Prediction, 158-176), is an extension of the goodness-of-fit procedure developed by Donner and Eliasziw (1992, Statistics in Medicine 11, 1511-1519) for the case of two raters. The new procedure is shown to provide confidence-interval coverage levels that are close to nominal over a wide range of parameter combinations. The procedure also provides a sample-size formula that may be used to determine the required number of subjects and raters for such studies.

12.
In this paper, we consider multiple testing approaches mainly for phase 3 trials with two doses. We review a few available approaches and propose some new ones. The doses selected for phase 3 usually have the same or a similar efficacy profile, so they have some degree of consistency in efficacy. We review the Hochberg procedure, the Bonferroni procedure, and a few consistency‐adjusted procedures, and suggest new ones by applying the available procedures to the pooled dose and the high dose, i.e., the dose thought to be the more efficacious of the two. The reason behind the idea is that the pooled dose and the high dose are more consistent than the original two doses if the high dose is more efficacious than the low dose. We compare all approaches via simulations and recommend using a procedure combining 4A and the pooling approach. We also briefly discuss the testing strategy for trials with more than two doses.
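Among the reviewed baselines, the Hochberg step-up procedure is the easiest to state for exactly two dose-versus-control hypotheses; a minimal sketch with illustrative p-values:

```python
def hochberg_two(p_low, p_high, alpha=0.05):
    """Hochberg step-up decisions for two dose-versus-control hypotheses."""
    if max(p_low, p_high) <= alpha:        # step 1: both rejected at level alpha
        return True, True
    # step 2: only the smaller p-value can be rejected, at level alpha/2
    return p_low <= alpha / 2, p_high <= alpha / 2

print(hochberg_two(0.080, 0.012))          # -> (False, True)
```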

13.
A general Akaike-type criterion for model selection in robust regression
Burman P; Nolan D 《Biometrika》1995,82(4):877-886
Akaike's procedure (1970) for selecting a model minimises an estimate of the expected squared error in predicting new, independent observations. This selection criterion was designed for models fitted by least squares. A different model-fitting technique, such as least absolute deviation regression, requires an appropriate model selection procedure. This paper presents a general Akaike-type criterion applicable to a wide variety of loss functions for model fitting. It requires only that the function be convex with a unique minimum, and twice differentiable in expectation. Simulations show that the estimators proposed here well approximate their respective prediction errors.

14.
Under the model of independent test statistics, we propose a two-parameter family of Bayes multiple testing procedures. The two parameters can be viewed as tuning parameters. Using the Benjamini–Hochberg step-up procedure for controlling the false discovery rate as a baseline for conservativeness, we choose the tuning parameters to compromise between the operating characteristics of that procedure and a less conservative procedure that focuses on alternatives that a priori might be considered likely or meaningful. The Bayes procedures do not have the theoretical and practical shortcomings of the popular stepwise procedures. In terms of the number of mistakes, simulations for two examples indicate that over a large segment of the parameter space, the Bayes procedure is preferable to the step-up procedure. Another desirable feature of the procedures is that they are computationally feasible for any number of hypotheses.

15.
In a randomized clinical trial (RCT), noncompliance with an assigned treatment can occur due to serious side effects, while missing outcomes on patients may happen due to patients' withdrawal or loss to follow-up. To avoid the possible loss of power to detect a given risk difference (RD) of interest between two treatments, it is essential to incorporate the information on noncompliance and missing outcomes into the sample size calculation. Under the compound exclusion restriction model proposed elsewhere, we first derive the maximum likelihood estimator (MLE) of the RD among compliers between two treatments for an RCT with noncompliance and missing outcomes and its asymptotic variance in closed form. Based on the MLE with the tanh^{-1}(x) transformation, we develop an asymptotic test procedure for testing equality of two treatment effects among compliers. We further derive a sample size calculation formula accounting for both noncompliance and missing outcomes for a desired power 1 - β at a nominal α-level. To evaluate the performance of the test procedure and the accuracy of the sample size calculation formula, we employ Monte Carlo simulation to calculate the estimated Type I error and power of the proposed test procedure corresponding to the resulting sample size in a variety of situations. We find that both the test procedure and the sample size formula developed here can perform well. Finally, we include a discussion on the effects of various parameters, including the proportion of compliers, the probability of non-missing outcomes, and the ratio of sample size allocation, on the minimum required sample size.
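A rough, hedged illustration of why such adjustments matter — this is a textbook-style inflation of the usual two-proportion sample size, not the paper's MLE-based formula: under an exclusion restriction, noncompliance dilutes the risk difference by the compliance proportion pc (inflating n by roughly 1/pc²), and missing outcomes observed with probability po inflate it by a further 1/po. All parameter values below are illustrative.

```python
from math import ceil
from scipy.stats import norm

def n_per_arm(p1, p2, alpha=0.05, power=0.8, pc=1.0, po=1.0):
    """Two-proportion sample size per arm, crudely inflated for
    noncompliance (pc) and missing outcomes (po)."""
    z_a, z_b = norm.ppf(1 - alpha / 2), norm.ppf(power)
    var = p1 * (1 - p1) + p2 * (1 - p2)
    n = (z_a + z_b) ** 2 * var / (p1 - p2) ** 2   # complete-data sample size
    return ceil(n / (pc ** 2 * po))               # heuristic inflation factor

print(n_per_arm(0.30, 0.45))                      # ideal trial
print(n_per_arm(0.30, 0.45, pc=0.8, po=0.9))      # with noncompliance + missingness
```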

16.
Haibing Zhao  Xinping Cui 《Biometrics》2020,76(4):1098-1108
In large-scale problems, it is common practice to select important parameters by a procedure such as the Benjamini and Hochberg procedure and construct confidence intervals (CIs) for further investigation while the false coverage-statement rate (FCR) for the CIs is controlled at a desired level. Although the well-known BY CIs control the FCR, they are uniformly inflated. In this paper, we propose two methods to construct shorter selective CIs. The first method produces shorter CIs by allowing a reduced number of selective CIs. The second method produces shorter CIs by allowing a prefixed proportion of CIs containing the values of uninteresting parameters. We theoretically prove that the proposed CIs are uniformly shorter than BY CIs and control the FCR asymptotically for independent data. Numerical results confirm our theoretical results and show that the proposed CIs still work for correlated data. We illustrate the advantage of the proposed procedures by analyzing the microarray data from an HIV study.
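The BY baseline referred to above is itself easy to sketch: after BH selection of R out of m parameters, each selected parameter gets a marginal CI at the widened level 1 - Rα/m, which is what makes the intervals uniformly inflated. The estimates and unit standard errors below are toy values.

```python
import numpy as np
from scipy.stats import norm

est = np.array([2.8, 0.3, -3.1, 0.1, 1.9])    # parameter estimates (toy)
se = np.ones_like(est)                        # assume known unit standard errors
alpha, m = 0.05, est.size

# BH selection on two-sided z-test p-values
pvals = 2 * norm.sf(np.abs(est / se))
order = np.argsort(pvals)
below = pvals[order] <= (np.arange(1, m + 1) / m) * alpha
R = (np.max(np.nonzero(below)[0]) + 1) if below.any() else 0

if R > 0:
    z = norm.ppf(1 - R * alpha / (2 * m))     # widened marginal level 1 - R*alpha/m
    for i in order[:R]:
        print(f"theta[{i}]: {est[i] - z*se[i]:.2f} .. {est[i] + z*se[i]:.2f}")
```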

17.
Many group-sequential test procedures have been proposed to meet the ethical need for interim analyses. All of these papers, however, focus their discussion on the situation where there is only one standard control and one experimental treatment. In this paper, we consider a trial with one standard control but more than one experimental treatment. We have developed a group-sequential test procedure to accommodate any finite number of experimental treatments. To facilitate the practical application of the proposed test procedure, on the basis of Monte Carlo simulation, we have derived the critical values for α-levels of 0.01, 0.05, and 0.10, for the number of experimental treatments ranging from 2 to 4 and the number of group sequential analyses ranging from 1 to 10. Compared with a single non-sequential analysis that has a reasonable power (say, 0.80), we have demonstrated that the application of the proposed test procedure may substantially reduce the required sample size without seriously sacrificing the original power.

18.
We propose a multiple comparison procedure to identify the minimum effective dose level by sequentially comparing each dose level with the zero-dose level in a dose finding test. If we can find the minimum effective dose level at an early stage of the sequential test, it is possible to terminate the procedure after only a few group observations up to that dose level. Thus, the procedure is attractive from an economical point of view when high costs are involved in obtaining the observations. In the procedure, we present an integral formula to determine the critical values satisfying a predefined familywise type I error rate. Furthermore, we show how to determine the required sample size in order to guarantee the power of the test in the procedure. In simulation studies, we compare the power of the test and the required sample size for various configurations of the population means, and we apply our sequential procedure to the dose-response test in a case study.

19.
Case‐control studies are primary study designs used in genetic association studies. Sasieni (Biometrics 1997, 53, 1253–1261) pointed out that the allelic chi‐square test used in genetic association studies is invalid when Hardy‐Weinberg equilibrium (HWE) is violated in the combined population. It is therefore important to know how far the type I error rate deviates from the nominal level when HWE is violated. We examine bounds on the type I error rate of the allelic chi‐square test. We also investigate the power of the goodness‐of‐fit test for HWE, which can be used as a guideline for selecting between the allelic chi‐square test and the modified allelic chi‐square test, the latter of which was proposed for cases of violated HWE. In small samples, the power is not large enough to detect Wright's inbreeding model with small values of the inbreeding coefficient. Therefore, when the null hypothesis of HWE is barely accepted, the modified test should be considered as an alternative method.
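The screening test mentioned above, the chi-square goodness-of-fit test for HWE, can be sketched directly from genotype counts; the counts below are illustrative, not from the paper.

```python
from scipy.stats import chi2

def hwe_chisq(n_hom_major, n_het, n_hom_minor):
    """Chi-square goodness-of-fit test for Hardy-Weinberg equilibrium."""
    n = n_hom_major + n_het + n_hom_minor
    p = (2 * n_hom_major + n_het) / (2 * n)          # major-allele frequency
    expected = [n * p**2, 2 * n * p * (1 - p), n * (1 - p)**2]
    observed = [n_hom_major, n_het, n_hom_minor]
    stat = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # df = 3 genotype classes - 1 - 1 estimated allele frequency = 1
    return stat, chi2.sf(stat, df=1)

print(hwe_chisq(180, 95, 25))   # (statistic, p-value)
```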

20.
In a typical clinical trial, there are one or two primary endpoints, and a few secondary endpoints. When at least one primary endpoint achieves statistical significance, there is considerable interest in using results for the secondary endpoints to enhance characterization of the treatment effect. Because multiple endpoints are involved, regulators may require that the familywise type I error rate be controlled at a pre-set level. This requirement can be achieved by using "gatekeeping" methods. However, existing methods suffer from logical oddities such as allowing results for secondary endpoint(s) to impact the likelihood of success for the primary endpoint(s). We propose a novel and easy-to-implement gatekeeping procedure that is devoid of such deficiencies. A real data example and simulation results are used to illustrate efficiency gains of our method relative to existing methods.
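For orientation, here is a minimal sketch of a serial gatekeeping rule of the kind the paper builds on — not the authors' new procedure: the secondary family is tested only if the primary gate is passed, so secondary results cannot influence the primary decision. The Bonferroni split within the secondary family is an illustrative choice.

```python
def serial_gatekeeper(p_primary, p_secondary, alpha=0.05):
    """One primary endpoint gates a Bonferroni-tested secondary family."""
    if p_primary > alpha:                 # gate closed: stop, reject nothing
        return {"primary": False, "secondary": [False] * len(p_secondary)}
    # gate open: the full alpha is passed on to the secondary family
    k = len(p_secondary)
    return {"primary": True,
            "secondary": [p <= alpha / k for p in p_secondary]}

print(serial_gatekeeper(0.012, [0.020, 0.160]))
```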
