Similar Articles
20 similar articles found (search time: 15 ms)
1.
TAKEUCHI (1969) provides a uniformly most powerful (UMP) one-sided test for the location parameter of the two-parameter exponential model when the scale parameter is unknown. The power of his similar size-α test depends, however, on the unknown scale parameter. In this case, and in more general situations where a sufficient statistic exists for the nuisance parameter, the theory of generalized THOMPSON's distributions, more specifically the Thompsonization of a test statistic (LAURENT, 1959, 1972), provides a UMP test whose power does not depend on the nuisance parameter. Examples of application of this general nuisance-parameter-free test procedure include the truncated exponential, the inverse Gaussian, and the geometric distributions.
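As background for the abstract above, the classical size-α test for the exponential location parameter with unknown scale can be sketched numerically. This is a simulation sketch of the standard similar test (not Laurent's Thompsonization itself), using the textbook pivot: under θ = θ₀, the statistic (n-1) * n * (min(x) - θ₀) / Σ(xᵢ - min(x)) follows an F(2, 2n-2) distribution. All data here are synthetic.

```python
import numpy as np
from scipy import stats

def exp_location_test(x, theta0):
    """One-sided test of H0: theta <= theta0 for the two-parameter
    exponential density f(x) = (1/sigma) exp(-(x - theta)/sigma), x >= theta,
    with sigma unknown.  Under theta = theta0 the statistic below is
    F(2, 2n-2)-distributed; large values favour theta > theta0."""
    x = np.asarray(x, dtype=float)
    n = len(x)
    xmin = x.min()
    stat = (n - 1) * n * (xmin - theta0) / np.sum(x - xmin)
    pval = stats.f.sf(stat, 2, 2 * n - 2)
    return stat, pval

rng = np.random.default_rng(1)
# Under H0 (theta = 0) the p-value is uniform, so rejection is rare.
x0 = 0.0 + rng.exponential(scale=2.0, size=50)
stat0, p0 = exp_location_test(x0, theta0=0.0)
# Under a shifted alternative (theta = 1) the test rejects decisively.
x1 = 1.0 + rng.exponential(scale=2.0, size=50)
stat1, p1 = exp_location_test(x1, theta0=0.0)
```

Note that the power of this classical test still depends on the unknown scale, which is exactly the drawback the abstract's nuisance-parameter-free procedure removes.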

2.
In this article, we describe a conditional score test for detecting a monotone dose-response relationship with ordinal response data. We consider three different versions of this test: asymptotic, conditional exact, and mid-P conditional score test. Exact and asymptotic power formulae based on these tests will be studied. Asymptotic sample size formulae based on the asymptotic conditional score test will be derived. The proposed formulae are applied to a vaccination study and a developmental toxicity study for illustrative purposes. Actual significance level and exact power properties of these tests are compared in a small empirical study. The mid-P conditional score test is observed to be the most powerful test with actual significance level close to the pre-specified nominal level.

3.
Sensitivity and specificity have traditionally been used to assess the performance of a diagnostic procedure. Diagnostic procedures with both high sensitivity and high specificity are desirable, but these procedures are frequently too expensive, hazardous, and/or difficult to operate. A less sophisticated procedure may be preferred, if the loss of the sensitivity or specificity is determined to be clinically acceptable. This paper addresses the problem of simultaneous testing of sensitivity and specificity for an alternative test procedure with a reference test procedure when a gold standard is present. The hypothesis is formulated as a compound hypothesis of two non-inferiority (one-sided equivalence) tests. We present an asymptotic test statistic based on the restricted maximum likelihood estimate in the framework of comparing two correlated proportions under the prospective and retrospective sampling designs. The sample size and power of an asymptotic test statistic are derived. The actual type I error and power are calculated by enumerating the exact probabilities in the rejection region. For applications that require high sensitivity as well as high specificity, a large number of positive subjects and a large number of negative subjects are needed. We also propose a weighted sum statistic as an alternative test by comparing a combined measure of sensitivity and specificity of the two procedures. The sample size determination is independent of the sampling plan for the two tests.
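The compound decision rule described above can be sketched very simply: declare the alternative procedure acceptable only if non-inferiority is shown for sensitivity AND for specificity. The sketch below uses plain Wald z-statistics rather than the paper's restricted-MLE statistic for correlated proportions, and the counts and margins are hypothetical.

```python
import numpy as np
from scipy import stats

def noninf_pvalue(p_hat, n, p_ref, margin):
    """One-sided Wald z-test of H0: p_hat <= p_ref - margin (non-inferiority).
    A simplification: the paper's statistic uses the restricted MLE for
    correlated proportions instead of this naive standard error."""
    se = np.sqrt(p_hat * (1 - p_hat) / n)
    z = (p_hat - (p_ref - margin)) / se
    return float(stats.norm.sf(z))

# Hypothetical study: observed sensitivity/specificity of the alternative
# procedure, reference values for the standard procedure, margin 0.10.
p_sens = noninf_pvalue(p_hat=0.88, n=200, p_ref=0.90, margin=0.10)
p_spec = noninf_pvalue(p_hat=0.92, n=300, p_ref=0.95, margin=0.10)
# The compound hypothesis rejects (non-inferiority on BOTH measures is
# declared) only if both one-sided tests reject.
noninferior = bool(p_sens < 0.05 and p_spec < 0.05)
```

Requiring both one-sided tests to reject is what drives the abstract's observation that large numbers of both positive and negative subjects are needed.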

4.
One-tailed statistical tests are often used in ecology, animal behaviour and in most other fields in the biological and social sciences. Here we review the frequency of their use in the 1989 and 2005 volumes of two journals (Animal Behaviour and Oecologia), their advantages and disadvantages, the extensive erroneous advice on them in both older and modern statistics texts, and their utility in certain narrow areas of applied research. Of those articles with data sets susceptible to one-tailed tests, at least 24% in Animal Behaviour and at least 13% in Oecologia used one-tailed tests at least once. They were used 35% more frequently with nonparametric methods than with parametric ones and about twice as often in 1989 as in 2005. Debate in the psychological literature of the 1950s established the logical criterion that one-tailed tests should be restricted to situations where there is interest only in results in one direction. ‘Interest’ should be defined, however, in terms of collective or societal interest and not by the individual investigator. By this ‘collective interest’ criterion, all uses of one-tailed tests in the journals surveyed seem invalid. In his book Nonparametric Statistics, S. Siegel unrelentingly suggested the use of one-tailed tests whenever the investigator predicts the direction of a result. That work has been a major proximate source of confusion on this issue, but so are most recent statistics textbooks. The utility of one-tailed tests in research aimed at obtaining regulatory approval of new drugs and new pesticides is briefly described, to exemplify the narrow range of research situations where such tests can be appropriate. These situations are characterized by null hypotheses stating that the difference or effect size does not exceed, or is at least as great as, some ‘amount of practical interest’. One-tailed tests rarely should be used for basic or applied research in ecology, animal behaviour or any other science.
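For readers unfamiliar with the mechanics at issue, the sketch below shows the numerical relationship between one-tailed and two-tailed p-values for a two-sample t-test: when the observed effect lies in the hypothesized direction, the one-tailed p-value is exactly half the two-tailed one, which is precisely why directional prediction alone is a tempting but (by the collective-interest criterion above) invalid justification. The data are made up.

```python
import numpy as np
from scipy import stats

# Hypothetical measurements from two groups.
a = np.array([2.1, 1.9, 2.4, 2.0, 2.2, 1.8])
b = np.array([1.0, 1.2, 0.9, 1.1, 0.8, 1.3])

t_two, p_two = stats.ttest_ind(a, b)                         # two-tailed
t_one, p_one = stats.ttest_ind(a, b, alternative='greater')  # one-tailed

# The test statistic is identical; only the tail probability changes.
# Since the observed effect is in the hypothesized direction (t > 0),
# the one-tailed p-value is half the two-tailed p-value.
```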

5.
Recently, Brown, Hwang, and Munk (1998) proposed an unbiased test for the average equivalence problem which improves noticeably in power over the standard two one-sided tests procedure. Nevertheless, from a practical point of view there are some objections against the use of this test, which are mainly addressed to the ‘unusual’ shape of the critical region. We show that every unbiased test has a critical region with such an ‘unusual’ shape. Therefore, we discuss three (biased) modifications of the unbiased test. We come to the conclusion that a suitable modification represents a good compromise between a most powerful test and a test with an appealing shape of its critical region. In order to perform these tests, figures containing the rejection regions are given. Finally, we compare all tests in an example from neurophysiology. This shows that it is beneficial to use these improved tests instead of the two one-sided tests procedure.
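The baseline that the improved tests are measured against, the standard two one-sided tests (TOST) procedure, can be sketched as follows for a one-sample setting. The equivalence margins and differences below are hypothetical.

```python
import numpy as np
from scipy import stats

def tost_one_sample(d, low, high, alpha=0.05):
    """Two one-sided t-tests (TOST) for average equivalence: equivalence is
    declared if the mean of d is significantly above `low` AND significantly
    below `high`.  The overall p-value is the larger of the two one-sided
    p-values."""
    d = np.asarray(d, dtype=float)
    n = len(d)
    m, se = d.mean(), d.std(ddof=1) / np.sqrt(n)
    p_low = stats.t.sf((m - low) / se, n - 1)     # H0: mean <= low
    p_high = stats.t.cdf((m - high) / se, n - 1)  # H0: mean >= high
    p_tost = max(p_low, p_high)
    return p_tost, bool(p_tost < alpha)

# Hypothetical paired differences, equivalence margins (-0.5, 0.5).
d = np.array([0.05, -0.1, 0.02, 0.08, -0.03, 0.01, 0.06, -0.04, 0.0, 0.03])
p_tost, equivalent = tost_one_sample(d, low=-0.5, high=0.5)
```

The rectangular rejection region implied by taking the maximum of the two p-values is exactly the "appealing shape" that the unbiased test of Brown, Hwang, and Munk trades away for power.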

6.
7.
The ANOVA-based F-test used for testing the significance of the random effect variance component is a valid test for an unbalanced one-way random model. However, it does not have a uniform optimum property. For example, this test is not uniformly most powerful invariant (UMPI). In fact, there is no UMPI test in the unbalanced case (see Khuri, Mathew, and Sinha, 1998). The power of the F-test depends not only on the design used, but also on the true values of the variance components. As Khuri (1996) noted, we can gain a better insight into the effect of data imbalance on the power of the F-test using a method for modelling the power in terms of the design parameters and the variance components. In this study, generalized linear modelling (GLM) techniques are used for this purpose. It is shown that GLM, in combination with a method of generating designs with a specified degree of imbalance, is an effective way of studying the behavior of the power of the F-test in a one-way random model.

8.
Tests for a monotonic trend between an ordered categorical exposure and disease status are routinely carried out from case-control data using the Mantel-extension trend test or the asymptotically equivalent Cochran-Armitage test. In this study, we considered two alternative tests based on isotonic regression, namely an order-restricted likelihood ratio test and an isotonic modification of the Mantel-extension test extending the recent proposal by Mancuso, Ahn and Chen (2001) to case-control data. Furthermore, we considered three tests based on contrasts, namely a single contrast (SC) test based on Schaafsma's coefficients, the Dosemeci and Benichou (DB) test, and a multiple contrast (MC) test based on the Helmert, reverse-Helmert and linear contrasts, and we derived their case-control versions. Using simulations, we compared the statistical properties of these five alternative tests to those of the Mantel-extension test under various patterns including no relationship, as well as monotonic and non-monotonic relationships between exposure and disease status. In the case of no relationship, all tests had close to nominal type I error except in situations combining a very unbalanced exposure distribution and small sample size, where the asymptotic versions of the three tests based on contrasts were highly anticonservative. The use of bootstrap instead of asymptotic versions corrected this anticonservatism. For monotonic patterns, all tests had similar powers. For non-monotonic patterns, the DB-test showed the most favourable results, as it was the least powerful (low power is desirable there, since no monotonic trend is present). The two tests based on isotonic regression were the most powerful tests, and the Mantel-extension, SC- and MC-tests had intermediate powers. The six tests were applied to data from a case-control study investigating the relationship between alcohol consumption and risk of laryngeal cancer in Turkey.
In situations with no evidence of a monotonic relationship between exposure and disease status, the three tests based on contrasts did not conclude in favour of a significant trend, whereas all the other tests did. (© 2004 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim)
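The Cochran-Armitage trend test that anchors the comparison above can be sketched in a few lines. The version below uses equally spaced integer scores and hypothetical counts of diseased subjects per ordered exposure category.

```python
import numpy as np
from scipy import stats

def cochran_armitage(cases, totals, scores=None):
    """Cochran-Armitage test for a linear trend in proportions across
    ordered exposure categories (asymptotically equivalent to the
    Mantel-extension trend test)."""
    cases = np.asarray(cases, dtype=float)
    totals = np.asarray(totals, dtype=float)
    if scores is None:
        scores = np.arange(len(cases), dtype=float)  # equally spaced scores
    N = totals.sum()
    pbar = cases.sum() / N
    num = np.sum(scores * (cases - totals * pbar))
    var = pbar * (1 - pbar) * (np.sum(totals * scores ** 2)
                               - np.sum(totals * scores) ** 2 / N)
    z = num / np.sqrt(var)
    return z, 2 * stats.norm.sf(abs(z))

# Hypothetical counts: cases out of 50 subjects in each of 4 ordered doses.
z, p = cochran_armitage(cases=[5, 10, 20, 30], totals=[50, 50, 50, 50])
```

With a non-monotonic pattern (e.g. an umbrella shape), this linear-trend statistic can miss the signal entirely, which is the motivation for the isotonic and contrast-based alternatives studied in the paper.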

9.
We are concerned with calculating the sample size required for estimating the mean of the continuous distribution in the context of a two-component nonstandard mixture distribution (i.e., a mixture of an identifiable point degenerate function F at a constant with probability P and a continuous distribution G with probability 1 – P). A common ad hoc procedure of escalating the naïve sample size n (calculated under the assumption of no point degenerate function F) by a factor of 1/(1 – P) has about 0.5 probability of achieving the pre-specified statistical power. Such an ad hoc approach may seriously underestimate the necessary sample size and jeopardize inferences in scientific investigations. We argue that sample size calculations in this context should have a pre-specified probability of achieving power ≥ 1 – β, set by the researcher at a level greater than 0.5. To that end, we propose an exact method and an approximate method to calculate sample size in this context so that the pre-specified probability of achieving a desired statistical power is determined by the researcher. (© 2004 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim)
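The ad hoc escalation criticized above is easy to state in code: compute a naive sample size ignoring the point mass, then inflate it by 1/(1 − P). The sketch below uses a standard z-approximation for detecting a normal mean shift; σ, δ, and P are hypothetical values, and per the abstract the escalated n attains the planned power with only about 0.5 probability, because the number of observations falling in the continuous component is random.

```python
import math
from scipy import stats

def naive_n(sigma, delta, alpha=0.05, power=0.8):
    """Usual z-approximation sample size for detecting a mean shift `delta`
    of a normal outcome with standard deviation `sigma` (two-sided test)."""
    z_a = stats.norm.ppf(1 - alpha / 2)
    z_b = stats.norm.ppf(power)
    return math.ceil(((z_a + z_b) * sigma / delta) ** 2)

n_naive = naive_n(sigma=1.0, delta=0.25)  # ignores the point mass entirely
P = 0.5                                   # probability of the degenerate component
n_adhoc = math.ceil(n_naive / (1 - P))    # the criticized 1/(1-P) escalation
```

The paper's exact and approximate methods replace this fixed inflation with a sample size chosen so that the probability of achieving the target power is itself at a researcher-specified level above 0.5.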

10.
This article proposes new tests to compare the vaccine and placebo groups in randomized vaccine trials when a small fraction of volunteers become infected. A simple approach that is consistent with the intent-to-treat principle is to assign a score, say W, equal to 0 for the uninfecteds and some postinfection outcome X > 0 for the infecteds. One can then test the equality of this skewed distribution of W between the two groups. This burden of illness (BOI) test was introduced by Chang, Guess, and Heyse (1994, Statistics in Medicine 13, 1807–1814). If infections are rare, the massive number of 0s in each group tends to dilute the vaccine effect and this test can have poor power, particularly if the X's are not close to zero. Comparing X in just the infecteds is no longer a comparison of randomized groups and can produce misleading conclusions. Gilbert, Bosch, and Hudgens (2003, Biometrics 59, 531–541) and Hudgens, Hoering, and Self (2003, Statistics in Medicine 22, 2281–2298) introduced tests of the equality of X in a subgroup: the principal stratum of those “doomed” to be infected under either randomization assignment. This can be more powerful than the BOI approach, but requires unexaminable assumptions. We suggest new “chop-lump” Wilcoxon and t-tests (CLW and CLT) that can be more powerful than the BOI tests in certain situations. When the number of volunteers in each group is equal, the chop-lump tests remove an equal number of zeros from both groups and then perform a test on the remaining W's, which are mostly >0. A permutation approach provides a null distribution. We show that under local alternatives, the CLW test is always more powerful than the usual Wilcoxon test provided the true vaccine and placebo infection rates are the same. We also identify the crucial role of the “gap” between 0 and the X's on power for the t-tests. The chop-lump tests are compared to established tests via simulation for planned HIV and malaria vaccine trials.
A reanalysis of the first phase III HIV vaccine trial is used to illustrate the method.
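The chop-lump idea can be sketched roughly as follows, under the stated assumptions of equal-sized arms and scores W = 0 for uninfected subjects. This illustration chops an equal number of zeros from each arm, compares mean retained scores (not the authors' exact CLW/CLT statistics), and uses a label permutation for the null distribution; the trial data are invented.

```python
import numpy as np

def chop_lump_stat(a, b):
    """Remove an equal number of zeros from each (sorted) arm, then compare
    the mean scores of what remains.  Assumes equal-sized arms and at least
    one subject with W > 0 overall."""
    n_chop = int(min((a == 0).sum(), (b == 0).sum()))
    a_keep, b_keep = np.sort(a)[n_chop:], np.sort(b)[n_chop:]
    return a_keep.mean() - b_keep.mean()

def chop_lump_pvalue(a, b, n_perm=999, seed=0):
    """Two-sided permutation p-value: reshuffle group labels and recompute
    the chop-lump statistic to build the null distribution."""
    rng = np.random.default_rng(seed)
    obs = chop_lump_stat(a, b)
    pooled, n = np.concatenate([a, b]), len(a)
    hits = 0
    for _ in range(n_perm):
        perm = rng.permutation(pooled)
        if abs(chop_lump_stat(perm[:n], perm[n:])) >= abs(obs):
            hits += 1
    return (hits + 1) / (n_perm + 1)

# Hypothetical trial: 20 volunteers per arm, W = 0 if uninfected,
# W = postinfection burden X > 0 if infected.
vaccine = np.array([0.0] * 18 + [1.0, 2.0])  # few, mild infections
placebo = np.array([0.0] * 12 + [5, 6, 7, 4, 8, 6, 5, 9], dtype=float)
p = chop_lump_pvalue(vaccine, placebo)
```

Chopping the shared zeros keeps the comparison anchored to the randomization while removing the mass of 0s that dilutes the BOI test.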

11.
12.
The intraclass version of the kappa coefficient has been commonly applied as a measure of agreement for two ratings per subject with binary outcome in reliability studies. We present an efficient statistic for testing the strength of kappa agreement using likelihood scores, and derive asymptotic power and sample size formulae. Exact evaluation shows that the score test is generally conservative and more powerful than a method based on a chi-square goodness-of-fit statistic (Donner and Eliasziw, 1992, Statistics in Medicine 11, 1511–1519). In particular, when the research question is one-directional, the one-sided score test is substantially more powerful and the reduction in sample size is appreciable.
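The quantity being tested, the intraclass kappa for two binary ratings per subject, can be computed as one minus the ratio of observed to chance-expected discordance. A minimal sketch of the point estimate only (not the likelihood score test itself), on hypothetical ratings:

```python
import numpy as np

def intraclass_kappa(r1, r2):
    """Intraclass kappa for two binary ratings per subject:
    1 - (observed discordance) / (discordance expected by chance),
    where chance discordance uses the pooled positive-rating rate."""
    r1, r2 = np.asarray(r1), np.asarray(r2)
    n = len(r1)
    p = (r1.sum() + r2.sum()) / (2 * n)   # overall positive rating rate
    discordant = np.sum(r1 != r2)
    return 1 - discordant / (2 * n * p * (1 - p))

# Hypothetical ratings: perfect agreement and partial agreement.
k_perfect = intraclass_kappa([1, 1, 0, 0, 1], [1, 1, 0, 0, 1])
k_partial = intraclass_kappa([1, 0, 1, 0], [1, 1, 1, 0])
```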

13.
The paired-t, sign, and signed rank tests were compared for samples from a bivariate exponential distribution. Each is a valid α-level test. One test was not uniformly more powerful than the others for all sample sizes, α levels, correlations, and alternative hypotheses considered, but the signed rank test did well consistently. It was always preferable to the sign test and never was appreciably worse than the paired-t test. The relative performance of the tests depends on α as well as the sample size.

14.
The effects of sex hormones on immune function have received much attention, especially following the proposal of the immunocompetence handicap hypothesis. Many studies, both experimental and correlational, have been conducted to test the relationship between immune function and the sex hormones testosterone in males and oestrogen in females. However, the results are mixed. We conducted four cross‐species meta‐analyses to investigate the relationship between sex hormones and immune function: (i) the effect of testosterone manipulation on immune function in males, (ii) the correlation between circulating testosterone level and immune function in males, (iii) the effect of oestrogen manipulation on immune function in females, and (iv) the correlation between circulating oestrogen level and immune function in females. The results from the experimental studies showed that testosterone had a medium‐sized immunosuppressive effect on immune function. The effect of oestrogen, on the other hand, depended on the immune measure used. Oestrogen suppressed cell‐mediated immune function while reducing parasite loads. The overall correlation (meta‐analytic relationship) between circulating sex hormone level and immune function was not statistically significant for either testosterone or oestrogen despite the power of meta‐analysis. These results suggest that correlational studies have limited value for testing the effects of sex hormones on immune function. We found little evidence of publication bias in the four data sets using indirect tests. There was a weak and positive relationship between year of publication and effect size for experimental studies of testosterone that became non‐significant after we controlled for castration and immune measure, suggesting that the temporal trend was due to changes in these moderators over time. Graphical analyses suggest that the temporal trend was due to an increased use of cytokine measures across time. 
We found substantial heterogeneity in effect sizes, except in correlational studies of testosterone, even after we accounted for the relevant random and fixed factors. In conclusion, our results provide good evidence that testosterone suppresses immune function and that the effect of oestrogen varies depending on the immune measure used.

15.
Health care utilization and outcome studies call for hierarchical approaches. The objectives were to predict major complications following percutaneous coronary interventions by health providers, and to compare Bayesian and non-Bayesian sample size calculation methods. The hierarchical data structure consisted of: (1) Strata: PGY4, PGY7, and physician assistant as providers with varied experiences; (2) Clusters: k_s providers per stratum; (3) Individuals: n_s patients reviewed by each provider. The main outcome event illustrated was mortality, modeled by a Bayesian beta-binomial model. Pilot information and assumptions were utilized to elicit beta prior distributions. Sample size calculations were based on the approximated average length, fixed at 1%, of 95% posterior intervals of the mean event rate parameter. Necessary sample sizes by both non-Bayesian and Bayesian methods were compared. We demonstrated that the developed Bayesian methods can be efficient and may require fewer subjects to satisfy the same length criterion.

16.
Paired data arise in a wide variety of applications where often the underlying distribution of the paired differences is unknown. When the differences are normally distributed, the t-test is optimum. On the other hand, if the differences are not normal, the t-test can have substantially less power than the appropriate optimum test, which depends on the unknown distribution. In textbooks, when the normality of the differences is questionable, the non-parametric Wilcoxon signed rank test is typically suggested. An adaptive procedure that uses the Shapiro-Wilk test of normality to decide whether to use the t-test or the Wilcoxon signed rank test has been employed in several studies. Faced with data from heavy tails, the U.S. Environmental Protection Agency (EPA) introduced another approach: it applies both the sign and t-tests to the paired differences; the alternative hypothesis is accepted if either test is significant. This paper investigates the statistical properties of a currently used adaptive test and the EPA's method, and suggests an alternative technique. The new procedure is easy to use and generally has higher empirical power, especially when the differences are heavy-tailed, than currently used methods.
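Both procedures compared in this abstract are easy to sketch with standard scipy tests; the differences below are illustrative only, and the EPA-style rule is shown unadjusted, which is exactly why its type I error rate can exceed the nominal level.

```python
import numpy as np
from scipy import stats

def adaptive_paired_test(d, alpha=0.05):
    """Adaptive procedure discussed above: pretest the paired differences
    for normality (Shapiro-Wilk); use the one-sample t-test if normality is
    not rejected, otherwise the Wilcoxon signed-rank test."""
    d = np.asarray(d, dtype=float)
    if stats.shapiro(d).pvalue > alpha:
        return 't', stats.ttest_1samp(d, 0.0).pvalue
    return 'wilcoxon', stats.wilcoxon(d).pvalue

def epa_paired_test(d, alpha=0.05):
    """EPA-style rule: apply both the t-test and the sign test to the
    differences; reject if either is significant (no multiplicity
    adjustment, so the true type I error rate can exceed alpha)."""
    d = np.asarray(d, dtype=float)
    p_t = stats.ttest_1samp(d, 0.0).pvalue
    nonzero = d[d != 0]
    p_sign = stats.binomtest(int((nonzero > 0).sum()), len(nonzero)).pvalue
    return bool(p_t < alpha or p_sign < alpha)

# Hypothetical paired differences with a clear positive shift.
d = np.array([0.8, 1.2, 0.9, 1.1, 1.0, 0.7, 1.3, 0.95, 1.05, 1.15])
which, p = adaptive_paired_test(d)
```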

17.
The two-sided Simes test is known to control the type I error rate with bivariate normal test statistics. For one-sided hypotheses, control of the type I error rate requires that the correlation between the bivariate normal test statistics is non-negative. In this article, we introduce a trimmed version of the one-sided weighted Simes test for two hypotheses which rejects if (i) the one-sided weighted Simes test rejects and (ii) both p-values are below one minus the respective weighted Bonferroni adjusted level. We show that the trimmed version controls the type I error rate at nominal significance level α if (i) the common distribution of test statistics is point symmetric and (ii) the two-sided weighted Simes test at level 2α controls the level. These assumptions apply, for instance, to bivariate normal test statistics with arbitrary correlation. In a simulation study, we compare the power of the trimmed weighted Simes test with the power of the weighted Bonferroni test and the untrimmed weighted Simes test. An additional result of this article ensures type I error rate control of the usual weighted Simes test under a weak version of the positive regression dependence condition for the case of two hypotheses. This condition is shown to apply to the two-sided p-values of one- or two-sample t-tests for bivariate normal endpoints with arbitrary correlation and to the corresponding one-sided p-values if the correlation is non-negative. The Simes test for such types of bivariate t-tests has not been considered before. According to our main result, the trimmed version of the weighted Simes test then also applies to the one-sided bivariate t-test with arbitrary correlation.

18.
For clinical trials with interim analyses, conditional rejection probabilities play an important role when stochastic curtailment or design adaptations are performed. The conditional rejection probability gives the conditional probability of finally rejecting the null hypothesis given the interim data. It is computed either under the null or the alternative hypothesis. We investigate the properties of the conditional rejection probability for the one-sided, one-sample t-test and show that it can be non-monotone in the interim mean of the data and non-monotone in the non-centrality parameter for the alternative. We give several proposals on how to implement design adaptations (that are based on the conditional rejection probability) for the t-test and give a numerical example. Additionally, the conditional rejection probability given the interim t-statistic is investigated. It does not depend on the unknown σ and can be used in stochastic curtailment procedures. (© 2004 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim)

19.
Mendoza and Gutiérrez-Peña (1999) presented a Bayesian analysis of the ratio of two normal means when the ratio of the means determines the ratio of the variances, X ~ N(βμ, β²σ²), Y ~ N(μ, σ²). They claimed the superiority of the Bayesian analysis, based on the whole likelihood function with non-informative priors, by comparing it with a non-Bayesian analysis based only on the Fieller pivotal. The purpose here is to show that the Fieller pivotal constitutes only part of the likelihood function of β in this model, so that any analysis, including Bayesian, based solely on the Fieller pivotal is highly inefficient, possibly to the extent of discarding most of the information in the sample. A non-Bayesian analysis based on the structure of the whole likelihood function exhibits this and rectifies such an oversight. The role of the variance ratio is discussed and exemplified. (© 2004 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim)

20.
David I. Warton, Biometrics 2011, 67(1):116–123
A modification of generalized estimating equations (GEEs) methodology is proposed for hypothesis testing of high-dimensional data, with particular interest in multivariate abundance data in ecology, an important application of interest in thousands of environmental science studies. Such data are typically counts characterized by high dimensionality (in the sense that cluster size exceeds the number of clusters, n > K) and over-dispersion relative to the Poisson distribution. Usual GEE methods cannot be applied in this setting primarily because sandwich estimators become numerically unstable as n increases. We propose instead using a regularized sandwich estimator that assumes a common correlation matrix R, and shrinks the sample estimate of R toward the working correlation matrix to improve its numerical stability. It is shown via theory and simulation that this substantially improves the power of Wald statistics when cluster size is not small. We apply the proposed approach to study the effects of nutrient addition on nematode communities, and in doing so discuss important issues in implementation, such as using statistics that have good properties when parameter estimates approach the boundary, and using resampling to enable valid inference that is robust to high dimensionality and to possible model misspecification.
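The shrinkage idea at the heart of the regularized sandwich estimator can be sketched in a few lines: form a convex combination of the sample correlation matrix and a working correlation (taken here to be the identity), which is guaranteed positive definite even when the number of variables exceeds the number of clusters. The fixed shrinkage weight delta is a hypothetical choice; the paper's estimator is more elaborate than this sketch.

```python
import numpy as np

def regularized_correlation(Y, delta=0.5):
    """Shrink the sample correlation matrix toward a working correlation
    (identity here) so that it stays well-conditioned when the number of
    variables (columns of Y) exceeds the number of clusters (rows)."""
    R_sample = np.corrcoef(Y, rowvar=False)   # rank-deficient when n > K
    R_working = np.eye(R_sample.shape[0])
    return delta * R_sample + (1 - delta) * R_working

# Synthetic abundance-like data: K = 10 clusters, n = 25 variables.
rng = np.random.default_rng(0)
Y = rng.normal(size=(10, 25))
R = regularized_correlation(Y, delta=0.5)
```

Because the sample correlation matrix is positive semi-definite, every eigenvalue of the shrunk matrix is at least (1 − delta), so Wald statistics built on it avoid the numerical instability described above.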


Copyright © Beijing Qinyun Technology Development Co., Ltd. (京ICP备09084417号)