Similar articles
20 similar articles found (search time: 31 ms)
1.
A modified exact test is proposed for 2×2 contingency tables. This test, which is based on a less conservative definition of the concept of significance (Stone, 1969), is compared with a modified form of Pearson's X² test and with Tocher's randomized exact (UMPU) test. The sizes of the new test lie near the nominal 0.05 level, while those of the X² test usually exceed the nominal level, sometimes by a factor of 2 or more. The power of the modified test is usually close to that of the UMPU test.

2.
We consider uniformly most powerful (UMP) as well as uniformly most powerful unbiased (UMPU) tests and their non‐randomized versions for certain hypotheses concerning a binomial parameter. It will be shown that the power function of a UMP(U)‐test based on sample size n can coincide on the entire parameter space with the power function of the corresponding test based on sample size n + 1. A complete characterization of this paradox will be derived. Apart from some exceptional cases for two‐sided tests and equivalence tests, the paradox appears if and only if a test based on sample size n is non‐randomized.
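The coincidence of power functions described in this abstract can be checked numerically in the simplest one-sided case. The sketch below is an illustration under stated assumptions, not the authors' derivation: it assumes a non-randomized test that rejects when X >= c for X ~ Bin(n, p), and compares it with a randomized test on n + 1 observations that rejects when Y > c and rejects with probability gamma = (n + 1 - c)/(n + 1) when Y = c. The function names and the choice n = 10, c = 8 are hypothetical.

```python
from math import comb


def binom_pmf(n, k, p):
    """Probability that a Bin(n, p) variable equals k."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def power_n(n, c, p):
    """Power of the non-randomized test 'reject if X >= c', X ~ Bin(n, p)."""
    return sum(binom_pmf(n, k, p) for k in range(c, n + 1))


def power_n_plus_1(n, c, p):
    """Power of a randomized test on n + 1 observations (assumed form):
    reject if Y > c, reject with probability gamma if Y == c."""
    gamma = (n + 1 - c) / (n + 1)  # randomization constant at the boundary
    tail = sum(binom_pmf(n + 1, k, p) for k in range(c + 1, n + 2))
    return tail + gamma * binom_pmf(n + 1, c, p)


if __name__ == "__main__":
    n, c = 10, 8  # hypothetical sample size and critical value
    for p in (0.1, 0.3, 0.5, 0.7, 0.9):
        print(p, power_n(n, c, p), power_n_plus_1(n, c, p))
```

The two columns agree up to floating-point error because C(n + 1, c)(n + 1 - c)/(n + 1) = C(n, c), so the extra observation is absorbed entirely by the randomization at the boundary.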

3.
A two-tailed P-value is presented for a significance test in two by two contingency tables. There is no extraneous quasi-observation such as is needed in the exact randomized uniformly most powerful unbiased (UMPU) test of the hypothesis of independence. The proposed P-value can never exceed unity and is always two-tailed, unlike other P-values proposed in the literature.

4.
Based on uniformly most powerful unbiased (UMPU) tests for two-sided hypotheses and a short note in Lehmann (1959) on critical levels for randomized tests, Meulepas (1998, 1999) proposed (two-tailed) P-values taking into account the randomization constant(s) of the UMPU tests. While UMPU tests need an extra uniform observation if randomization is required, the P-values proposed by Meulepas need no extra uniform observation. At first glance, his idea looks very promising for defining a suitable and powerful P-value. Unfortunately, such P-values are generally too conservative.

5.
Wellek S 《Biometrics》2004,60(3):694-703
The classical χ²-procedure for the assessment of genetic equilibrium is tailored for establishing lack rather than goodness of fit of an observed genotype distribution to a model satisfying the Hardy-Weinberg law, and the same is true for the exact competitors to the large-sample procedure, which have been proposed in the biostatistical literature since the late 1930s. In this contribution, the methodology of statistical equivalence testing is adopted for the construction of tests for problems in which the assumption of approximate compatibility of the genotype distribution actually sampled with Hardy-Weinberg equilibrium (HWE) plays the role of the alternative hypothesis one aims to establish. The result of such a construction highly depends on the choice of a measure of distance to be used for defining an indifference zone containing those genotype distributions whose degree of disequilibrium shall be considered irrelevant. The first such measure proposed here is the Euclidean distance of the true parameter vector from that of a genotype distribution with identical allele frequencies being in strict HWE. The second measure is based on the (scalar) parameter of the distribution first introduced into the present context by Stevens (1938, Annals of Eugenics 8, 377-383). The first approach leads to a nonconditional test (which nevertheless can be carried out in a numerically exact way), the second to an exact conditional test shown to be uniformly most powerful unbiased (UMPU) for the associated pair of hypotheses. Both tests are compared in terms of the exact power attained against the class of those specific alternatives under which HWE is strictly satisfied.
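The first distance measure mentioned in this abstract can be sketched for a biallelic locus. This is a minimal illustration under assumptions that are not taken from the paper: the genotype distribution is parametrized by the three genotype probabilities (AA, AB, BB), and the function name `hwe_distance` is hypothetical. The paper's exact parametrization may differ.

```python
from math import sqrt


def hwe_distance(p_aa, p_ab, p_bb):
    """Euclidean distance between a genotype distribution and the
    Hardy-Weinberg distribution with the same allele frequencies.

    Assumed parametrization: genotype probabilities (AA, AB, BB)
    that sum to 1.
    """
    q = p_aa + p_ab / 2  # frequency of allele A
    hwe = (q * q, 2 * q * (1 - q), (1 - q) ** 2)  # strict HWE proportions
    observed = (p_aa, p_ab, p_bb)
    return sqrt(sum((o - h) ** 2 for o, h in zip(observed, hwe)))


if __name__ == "__main__":
    print(hwe_distance(0.25, 0.50, 0.25))  # distribution already in HWE
    print(hwe_distance(0.50, 0.00, 0.50))  # no heterozygotes at all
```

An equivalence test in the sense of the abstract would then take "distance below some tolerance" as the alternative hypothesis; the choice of tolerance is outside the scope of this sketch.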

6.
We consider the problem of testing for independence against the consistent superiority of one treatment over another when the response variable is binary and is compared across two treatments in each of several strata. Specifically, we consider the randomized clinical trial setting. A number of issues arise in this context. First, should tables be combined if there are small or zero margins? Second, should one assume a common odds ratio across strata? Third, if the odds ratios differ across strata, then how does the standard test (based on a common odds ratio) perform? Fourth, are there other analyses that are more appropriate for handling a situation in which the odds ratios may differ across strata? In addressing these issues we find that the frequently used Cochran–Mantel–Haenszel test may have a poor power profile, despite being optimal when the odds ratios are common. We develop novel tests that are analogous to the Smirnov, modified Smirnov, convex hull, and adaptive tests that have been proposed for ordered categorical data. (© 2006 WILEY‐VCH Verlag GmbH & Co. KGaA, Weinheim)

7.
Mehrotra DV  Chan IS  Berger RL 《Biometrics》2003,59(2):441-450
Fisher's exact test for comparing response proportions in a randomized experiment can be overly conservative when the group sizes are small or when the response proportions are close to zero or one. This is primarily because the null distribution of the test statistic becomes too discrete, a partial consequence of the inference being conditional on the total number of responders. Accordingly, exact unconditional procedures have gained in popularity, on the premise that power will increase because the null distribution of the test statistic will presumably be less discrete. However, we caution researchers that a poor choice of test statistic for exact unconditional inference can actually result in a substantially less powerful analysis than Fisher's conditional test. To illustrate, we study a real example and provide exact test size and power results for several competing tests, for both balanced and unbalanced designs. Our results reveal that Fisher's test generally outperforms exact unconditional tests based on using as the test statistic either the observed difference in proportions, or the observed difference divided by its estimated standard error under the alternative hypothesis, the latter for unbalanced designs only. On the other hand, the exact unconditional test based on the observed difference divided by its estimated standard error under the null hypothesis (score statistic) outperforms Fisher's test, and is recommended. Boschloo's test, in which the p-value from Fisher's test is used as the test statistic in an exact unconditional test, is uniformly more powerful than Fisher's test, and is also recommended.
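The conservativeness of Fisher's exact test noted in this abstract can be made concrete by computing the test's actual size for small groups. The sketch below is an illustration, not the authors' computation: it assumes the common "sum of tables no more likely than the observed one" definition of the two-sided p-value, a shared response probability p in both groups, and hypothetical function names and group sizes.

```python
from math import comb


def binom_pmf(n, k, p):
    """Probability that a Bin(n, p) variable equals k."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def fisher_two_sided_p(x1, n1, x2, n2):
    """Two-sided Fisher p-value: total conditional (hypergeometric)
    probability of all tables no more likely than the observed one."""
    m = x1 + x2  # total number of responders (conditioned on)
    total = comb(n1 + n2, m)
    lo, hi = max(0, m - n2), min(n1, m)
    probs = {k: comb(n1, k) * comb(n2, m - k) / total for k in range(lo, hi + 1)}
    p_obs = probs[x1]
    # Small relative tolerance so ties are not lost to rounding.
    return sum(pr for pr in probs.values() if pr <= p_obs * (1 + 1e-9))


def actual_size(n1, n2, p, alpha=0.05):
    """Unconditional rejection probability when both groups share the
    same response probability p (that is, under the null hypothesis)."""
    size = 0.0
    for x1 in range(n1 + 1):
        for x2 in range(n2 + 1):
            if fisher_two_sided_p(x1, n1, x2, n2) <= alpha:
                size += binom_pmf(n1, x1, p) * binom_pmf(n2, x2, p)
    return size


if __name__ == "__main__":
    for p in (0.1, 0.3, 0.5):
        print(p, actual_size(10, 10, p))
```

Because the test is valid conditionally on every margin, the printed sizes cannot exceed the nominal 0.05; how far they fall below it indicates how much room an exact unconditional test such as Boschloo's has to gain power.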

8.
Smooth tests for the zero-inflated Poisson distribution   Total citations: 1 (self-citations: 0; citations by others: 1)
Thas O  Rayner JC 《Biometrics》2005,61(3):808-815
In this article we construct three smooth goodness-of-fit tests for testing for the zero-inflated Poisson (ZIP) distribution against general smooth alternatives in the sense of Neyman. We apply our tests to a data set previously claimed to be ZIP distributed, and show that the ZIP is not a good model to describe the data. At rejection of the null hypothesis of ZIP, the individual components of the test statistic, which are directly related to interpretable parameters in a smooth model, may be used to gain insight into an alternative distribution.

9.
Background: The indirect immunofluorescence assay (IFA) is considered a reference test for scrub typhus. Recently, the Scrub Typhus Infection Criteria (STIC; a combination of culture, PCR assays and IFA IgM) were proposed as a reference standard for evaluating alternative diagnostic tests. Here, we use Bayesian latent class models (LCMs) to estimate the true accuracy of each diagnostic test, and of STIC, for diagnosing scrub typhus. Conclusions: The low specificity of STIC was caused by the low specificity of IFA IgM. Neither STIC nor IFA IgM can be used as reference standards against which to evaluate alternative diagnostic tests. Further evaluation of new diagnostic tests should be done with a carefully selected set of diagnostic tests and appropriate statistical models.

10.
The problem of dropout is a common one in longitudinal studies. One usually assumes for the analysis that dropout is at random. There are some tests to investigate this assumption, but these tests depend on normally distributed data or lack power; cf. Listing and Schlittgen (1998). We here propose an overall test which combines several Wilcoxon rank sum tests. The alternative hypothesis states that there is a tendency for larger (smaller) values of the target variable the last time the probands show up. The test is also applicable when there are many ties. It proves to perform well compared to the test developed for normally distributed data, as well as to a test for completely missing at random proposed by Little (1988). An application to real data is also given.

11.
The paper deals with the classical two-sample testing problem for the equality of two populations, one of the most fundamental problems in biomedical experiments and case–control studies. The most familiar alternatives are the difference in location parameters, the difference in scale parameters, or the difference in both parameters of the population density. All the tests designed for classical location, scale, or location–scale alternatives assume that there is no change in the shape of the distribution. Some authors also consider the Lehmann-type alternative that addresses the change in shape. Two-sample tests under the Lehmann alternative assume that the location and scale parameters are invariant. In real life, when a shift in the distribution occurs, one or more of the location, scale, and shape parameters may change simultaneously. We refer to a change in one or more of the three parameters as a versatile alternative. Noting the dearth of literature on testing the equality of two populations against such a versatile alternative, we introduce two distribution-free tests based on the Euclidean and Mahalanobis distances. We obtain the asymptotic distributions of the two test statistics and study asymptotic power. We also discuss approximating p-values of the proposed tests in real applications with small samples. We compare the power performance of the two tests with several popular existing distribution-free tests against various fixed alternatives using Monte Carlo simulation. We provide two illustrations based on biomedical experiments. Unlike existing tests, which are suitable only in certain situations, the proposed tests offer very good power in almost all types of shifts.

12.
Brown RP 《Genetica》1997,101(1):67-74
Heterogeneous phenotypic correlations may be suggestive of underlying changes in genetic covariance among life-history, morphology, and behavioural traits, and their detection is therefore relevant to many biological studies. Two new statistical tests are proposed and their performances compared with existing methods. Of all tests considered, the existing approximate test of homogeneity of product-moment correlations provides the greatest power to detect heterogeneous correlations, when based on Hotelling's z*-transformation. The use of this transformation and test is recommended under conditions of bivariate normality. A new distribution-free randomisation test of homogeneity of Spearman's rank correlations is described and recommended for use when the bivariate samples are taken from populations with non-normal or unknown distributions. An alternative randomisation test of homogeneity of product-moment correlations is shown to be a useful compromise between the approximate tests and the randomisation tests on Spearman's rank correlations: it is not as sensitive to departures from normality as the approximate tests, but has greater power than the rank correlation test. An example is provided that shows how choice of test will have a considerable influence on the conclusions of a particular study. This revised version was published online in July 2006 with corrections to the Cover Date.

13.
Tests for a monotonic trend between an ordered categorical exposure and disease status are routinely carried out from case‐control data using the Mantel‐extension trend test or the asymptotically equivalent Cochran‐Armitage test. In this study, we considered two alternative tests based on isotonic regression, namely an order‐restricted likelihood ratio test and an isotonic modification of the Mantel‐extension test extending the recent proposal by Mancuso, Ahn and Chen (2001) to case‐control data. Furthermore, we considered three tests based on contrasts, namely a single contrast (SC) test based on Schaafsma's coefficients, the Dosemeci and Benichou (DB) test, and a multiple contrast (MC) test based on the Helmert, reverse‐Helmert and linear contrasts, and we derived their case‐control versions. Using simulations, we compared the statistical properties of these five alternative tests to those of the Mantel‐extension test under various patterns including no relationship, as well as monotonic and non‐monotonic relationships between exposure and disease status. In the case of no relationship, all tests had close to nominal type I error except in situations combining a very unbalanced exposure distribution and small sample size, where the asymptotic versions of the three tests based on contrasts were highly anticonservative. The use of bootstrap instead of asymptotic versions corrected this anticonservatism. For monotonic patterns, all tests had similar power. For non‐monotonic patterns, the DB‐test showed the most favourable results as it was the least powerful test. The two tests based on isotonic regression were the most powerful tests, and the Mantel‐extension test and the SC‐ and MC‐tests had intermediate power. The six tests were applied to data from a case‐control study investigating the relationship between alcohol consumption and risk of laryngeal cancer in Turkey. In situations with no evidence of a monotonic relationship between exposure and disease status, the three tests based on contrasts did not conclude in favour of a significant trend whereas all the other tests did. (© 2004 WILEY‐VCH Verlag GmbH & Co. KGaA, Weinheim)

14.
Linear rank tests are widely used when testing for independence against stochastic order in a 2 × J contingency table with two treatments and J ordered outcome levels. For this purpose, numerical scores are assigned, possibly by default, to the J outcome levels. When the choice of scores is not apparent, integer (equally spaced) scores are often assigned. We show that this practice generally leads to unnecessarily conservative tests. The use of slightly perturbed scores will result in a less conservative and uniformly more powerful test.

15.
Statistical association between a single nucleotide polymorphism (SNP) genotype and a quantitative trait in genome-wide association studies is usually assessed using a linear regression model or, in the case of non-normally distributed trait values, using the Kruskal-Wallis test. While linear regression models assume an additive mode of inheritance via equidistant genotype scores, the Kruskal-Wallis test merely tests global differences in trait values associated with the three genotype groups. Both approaches thus exhibit suboptimal power when the underlying inheritance mode is dominant or recessive. Furthermore, these tests do not perform well in the common situations when only a few trait values are available in a rare genotype category (imbalance), or when the values associated with the three genotype categories exhibit unequal variance (variance heterogeneity). We propose a maximum test based on a Marcus-type multiple contrast test for relative effect sizes. This test allows model-specific testing of either a dominant, additive or recessive mode of inheritance, and it is robust against variance heterogeneity. We show how to obtain mode-specific simultaneous confidence intervals for the relative effect sizes to aid in interpreting the biological relevance of the results. Further, we discuss the use of a related all-pairwise comparisons contrast test with range-preserving confidence intervals as an alternative to the Kruskal-Wallis heterogeneity test. We applied the proposed maximum test to the Bogalusa Heart Study dataset, and gained a remarkable increase in the power to detect association, particularly for rare genotypes. Our simulation study also demonstrated that the proposed non-parametric tests control the family-wise error rate in the presence of non-normality and variance heterogeneity, in contrast to the standard parametric approaches. We provide a publicly available R library nparcomp that can be used to estimate simultaneous confidence intervals or compatible multiplicity-adjusted p-values associated with the proposed maximum test.

16.
We consider the problem of using permutation-based methods to test for treatment–covariate interactions from randomized clinical trial data. Testing for interactions is common in the field of personalized medicine, as subgroups with enhanced treatment effects arise when treatment-by-covariate interactions exist. Asymptotic tests can often be performed for simple models, but in many cases more complex methods are used to identify subgroups, non-standard test statistics are proposed, and asymptotic results may be difficult to obtain. In such cases, it is natural to consider permutation-based tests, which shuffle selected parts of the data in order to remove one or more associations of interest; however, in the case of interactions, it is generally not possible to remove only the associations of interest by simple permutations of the data. We propose a number of alternative permutation-based methods, designed to remove only the associations of interest while preserving other associations. These methods estimate the interaction term in a model, then create data that “looks like” the original data except that the interaction term has been permuted. The proposed methods are shown to outperform traditional permutation methods in a simulation study. In addition, the proposed methods are illustrated using data from a randomized clinical trial of patients with hypertension.

17.
In some infectious disease studies and 2‐step treatment studies, a 2 × 2 table with a structural zero can arise in situations where it is theoretically impossible for a particular cell to contain observations or where a structural void is introduced by design. In this article, we propose a score test of hypotheses pertaining to the marginal and conditional probabilities in a 2 × 2 table with a structural zero via the risk/rate difference measure. A score test‐based confidence interval is also outlined. We evaluate the performance of the score test and the existing likelihood ratio test. Our empirical results evince the similar and satisfactory performance of the two tests (with appropriate adjustments) in terms of coverage probability and expected interval width. Both tests consistently perform well in small‐ to moderate‐sample designs. The score test, however, has the advantage that it is undefined in only one scenario, while the likelihood ratio test can be undefined in many scenarios. We illustrate our method with a real example from a two‐step tuberculosis skin test study.

18.
We consider testing whether the nonparametric function in a semiparametric additive mixed model is a simple fixed degree polynomial, for example, a simple linear function. This test provides a goodness-of-fit test for checking parametric models against nonparametric models. It is based on the mixed-model representation of the smoothing spline estimator of the nonparametric function and the variance component score test by treating the inverse of the smoothing parameter as an extra variance component. We also consider testing the equivalence of two nonparametric functions in semiparametric additive mixed models for two groups, such as treatment and placebo groups. The proposed tests are applied to data from an epidemiological study and a clinical trial and their performance is evaluated through simulations.

19.
Perlman MD  Wu L 《Biometrics》2004,60(1):276-280
Testing problems with multivariate one-sided alternative hypotheses are common in clinical trials with multiple endpoints. In the case of comparing two treatments, treatment 1 is often preferred if it is superior for at least one of the endpoints and not biologically inferior for the remaining endpoints. Bloch et al. (2001, Biometrics 57, 1039-1047) propose an intersection-union test (IUT) for this testing problem, but their test does not utilize the appropriate multivariate one-sided test. In this note we modify their test by an alternative IUT that does utilize the appropriate one-sided test. Empirical and graphical evidence shows that the proposed test is more appropriate for this testing problem.

20.
We consider the problem of drawing superiority inferences on individual endpoints following non-inferiority testing. Röhmel et al. (2006) pointed this out as an important problem which had not been addressed by the previous procedures that only tested for global superiority. Röhmel et al. objected to incorporating the non-inferiority tests in the assessment of the global superiority test by exploiting the relationship between the two, since the results of the latter test then depend on the non-inferiority margins specified for the former test. We argue that this is justified, besides the fact that it enhances the power of the global superiority test. We provide a closed testing formulation which generalizes the three-step procedure proposed by Röhmel et al. for two endpoints. For the global superiority test, Röhmel et al. suggest using the Läuter (1996) test, which is modified to make it monotone. The resulting test is not only complicated to use, but the modification does not readily extend to more than two endpoints, and it is less powerful in general than several of its competitors. This is verified in a simulation study. Instead, we suggest applying the one-sided likelihood ratio test used by Perlman and Wu (2004) or the union-intersection t(max) test used by Tamhane and Logan (2004).
