Similar articles
 Found 20 similar articles; search time: 31 ms
1.
The intraclass version of the kappa coefficient is commonly applied as a measure of agreement for two ratings per subject with a binary outcome in reliability studies. We present an efficient statistic for testing the strength of kappa agreement using likelihood scores, and derive asymptotic power and sample size formulas. Exact evaluation shows that the score test is generally conservative and more powerful than a method based on a chi‐square goodness‐of‐fit statistic (Donner and Eliasziw, 1992, Statistics in Medicine 11, 1511–1519). In particular, when the research question is one directional, the one‐sided score test is substantially more powerful and the reduction in sample size is appreciable.
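
For orientation, the quantity under test can be sketched in a few lines. This is a minimal illustration of the intraclass kappa estimate under the common correlation model for two binary ratings per subject, not the score test described in the abstract; the function name and the counts in the usage note are hypothetical.

```python
def intraclass_kappa(n_both_pos, n_discordant, n_both_neg):
    """Intraclass kappa estimate for two binary ratings per subject."""
    n = n_both_pos + n_discordant + n_both_neg
    if n == 0:
        raise ValueError("no subjects")
    # Pooled estimate of the probability of a positive rating.
    p = (2 * n_both_pos + n_discordant) / (2 * n)
    if p in (0.0, 1.0):
        raise ValueError("kappa is undefined when all ratings are identical")
    # Under the common correlation model, P(discordant pair) = 2p(1-p)(1-kappa).
    return 1.0 - n_discordant / (2 * n * p * (1 - p))
```

For example, 40 concordant positive pairs, 20 discordant pairs, and 40 concordant negative pairs give a pooled positive-rating probability of 0.5 and an estimate of 0.6.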

2.
Case‐parent trio studies considering genotype data from children affected by a disease and their parents are frequently used to detect single nucleotide polymorphisms (SNPs) associated with disease. The most popular statistical tests for this study design are transmission/disequilibrium tests (TDTs). Several types of these tests have been developed, for example, procedures based on alleles or genotypes. Therefore, it is of great interest to examine which of these tests has the highest statistical power to detect SNPs associated with disease. Comparisons of the allelic and the genotypic TDT for individual SNPs have so far been conducted based on simulation studies, since the test statistic of the genotypic TDT was determined numerically. Recently, however, it has been shown that this test statistic can be presented in closed form. In this article, we employ this analytic solution to derive equations for calculating the statistical power and the required sample size for different types of the genotypic TDT. The power of this test is then compared with that of the corresponding score test assuming the same mode of inheritance, as well as with that of the allelic TDT based on a multiplicative mode of inheritance, which is equivalent to the score test assuming an additive mode of inheritance. This is, thus, the first time the power of these tests is compared based on equations, yielding instant results and removing the need for time‐consuming simulation studies. This comparison reveals that these tests have almost the same power, with the score test being slightly more powerful.
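
As a point of reference, the classic allelic TDT can be written in its McNemar form. The sketch below shows only that allelic statistic, computed from hypothetical transmission counts of heterozygous parents; it does not reproduce the closed-form genotypic TDT or the power equations derived in the article.

```python
import math

def allelic_tdt(transmitted, not_transmitted):
    """McNemar-type allelic TDT from heterozygous-parent transmission counts."""
    b, c = transmitted, not_transmitted
    if b + c == 0:
        raise ValueError("no informative transmissions")
    stat = (b - c) ** 2 / (b + c)
    # Upper tail of a chi-square with 1 df: P(|Z| > sqrt(stat)).
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value
```

With 60 transmissions and 40 non-transmissions of the candidate allele, the statistic is 4.0 on one degree of freedom.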

3.
In this article, we describe a conditional score test for detecting a monotone dose‐response relationship with ordinal response data. We consider three different versions of this test: asymptotic, conditional exact, and mid‐P conditional score test. Exact and asymptotic power formulae based on these tests are studied, and asymptotic sample size formulae based on the asymptotic conditional score test are derived. The proposed formulae are applied to a vaccination study and a developmental toxicity study for illustrative purposes. Actual significance level and exact power properties of these tests are compared in a small empirical study. The mid‐P conditional score test is observed to be the most powerful test, with an actual significance level close to the pre‐specified nominal level.

4.
This paper investigates homogeneity tests of rate ratios in stratified matched-pair studies on the basis of asymptotic and bootstrap-resampling methods. Based on the efficient score approach, we develop a simple and computationally tractable score test statistic. Several other homogeneity test statistics are also proposed on the basis of the weighted least-squares estimate and logarithmic transformation. Sample size formulae are derived to guarantee a pre-specified power for the proposed tests at a pre-specified significance level. Empirical results confirm that (i) the modified score statistic based on the bootstrap-resampling method performs better, in the sense that its empirical type I error rate is much closer to the pre-specified nominal level than those of the other tests and its power is greater, and is hence recommended, whilst the statistics based on the weighted least-squares estimate and logarithmic transformation are slightly conservative under some of the considered settings; (ii) the derived sample size formulae are rather accurate, in the sense that the empirical powers obtained from the estimated sample sizes are very close to the pre-specified nominal powers. A real example is used to illustrate the proposed methodologies.

5.
Investigations of sample size for planning case-control studies have usually been limited to detecting a single factor. In this paper, we investigate sample size for multiple risk factors in strata-matched case-control studies. We construct an omnibus statistic for testing M different risk factors based on the jointly sufficient statistics of parameters associated with the risk factors. The statistic is non-iterative, and it reduces to the Cochran statistic when M = 1. The asymptotic power function of the test is based on a non-central chi-square distribution with M degrees of freedom, and the sample size required for a specific power can be obtained by the inverse relationship. We find that equal sample allocation is optimal. A Monte Carlo experiment demonstrates that an approximate formula for calculating sample size is satisfactory in typical epidemiologic studies. An approximate sample size obtained using Bonferroni's method for multiple comparisons is much larger than that obtained using the omnibus test. The approximate sample size formulas investigated in this paper using the omnibus test, as well as the individual tests, can be useful in designing case-control studies for detecting multiple risk factors.
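
The inverse relationship between power and sample size mentioned above can be sketched numerically. The code below is a minimal illustration, assuming that the noncentrality parameter grows linearly with the sample size (n times a hypothetical per-subject effect); the abstract does not give the paper's actual noncentrality expression, and all function names are illustrative. Only the Python standard library is used.

```python
import math

def chi2_cdf(x, df):
    """Central chi-square CDF via the series for the lower incomplete gamma."""
    if x <= 0:
        return 0.0
    a, half = df / 2.0, x / 2.0
    term = 1.0 / a
    total = term
    n = 0
    while term > 1e-15 * total:
        n += 1
        term *= half / (a + n)
        total += term
    return total * math.exp(a * math.log(half) - half - math.lgamma(a))

def chi2_quantile(prob, df):
    """Chi-square quantile by bisection on the CDF."""
    lo, hi = 0.0, 1.0
    while chi2_cdf(hi, df) < prob:
        hi *= 2.0
    for _ in range(100):
        mid = (lo + hi) / 2.0
        if chi2_cdf(mid, df) < prob:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0

def noncentral_chi2_sf(x, df, lam):
    """P(X > x) for a noncentral chi-square, as a Poisson mixture."""
    weight = math.exp(-lam / 2.0)      # Poisson(lam/2) probability at j = 0
    sf, cum, j = 0.0, 0.0, 0
    while cum < 1.0 - 1e-12 and j < 1000:
        sf += weight * (1.0 - chi2_cdf(x, df + 2 * j))
        cum += weight
        j += 1
        weight *= (lam / 2.0) / j
    return sf

def required_sample_size(effect_per_subject, m, power=0.80, alpha=0.05):
    """Smallest n whose power reaches the target (assumes lam = n * effect)."""
    crit = chi2_quantile(1.0 - alpha, m)
    n = 1
    while noncentral_chi2_sf(crit, m, n * effect_per_subject) < power:
        n += 1
    return n
```

Under this assumed parameterization, testing more factors jointly (larger M) requires a larger noncentrality, and hence a larger n, for the same power.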

6.
Tang ML, Tang NS, Chan IS, Chan BP. Biometrics 2002, 58(4): 957-963
In this article, we propose approximate sample size formulas for establishing equivalence or noninferiority of two treatments in matched-pair designs. Using the ratio of two proportions as the equivalence measure, we derive sample size formulas based on a score statistic for two types of analyses: hypothesis testing and confidence interval estimation. Depending on the purpose of a study, these formulas can be used to provide a sample size estimate that guarantees a prespecified power of a hypothesis test at a certain significance level or controls the width of a confidence interval with a certain confidence level. Our empirical results confirm that these score methods are reliable in terms of true size, coverage probability, and skewness. A liver scan detection study is used to illustrate the proposed methods.

7.
Overdispersion is a common phenomenon in Poisson modeling, and the negative binomial (NB) model is frequently used to account for it. Testing approaches (Wald test, likelihood ratio test (LRT), and score test) for overdispersion in the Poisson regression versus the NB model are available. Because the generalized Poisson (GP) model is similar to the NB model, we consider the former as an alternative model for overdispersed count data. The score test has an advantage over the LRT and the Wald test in that it only requires the parameter of interest to be estimated under the null hypothesis. This paper proposes a score test for overdispersion based on the GP model and compares its power with that of the LRT and Wald tests. A simulation study indicates that the score test based on the asymptotic standard normal distribution is more appropriate in practical applications because of its higher empirical power; however, it underestimates the nominal significance level, especially in small samples. Examples illustrate the results of comparing the candidate tests between the Poisson and GP models. A bootstrap test is also proposed to adjust for the underestimation of the nominal level by the score statistic when the sample size is small. The simulation study indicates that the bootstrap test has a significance level closer to the nominal size and uniformly greater power than the score test based on the asymptotic standard normal distribution. From a practical perspective, if the score test gives even a weak indication that the Poisson model is inappropriate, say at the 0.10 significance level, we advise using the more accurate bootstrap procedure to test whether the GP model is more appropriate than the Poisson model. Finally, the Vuong test is illustrated to choose between the GP and NB2 models for the same dataset.

8.
Tests for linkage are usually performed using the lod score method. A critical question in linkage analyses is the choice of sample size. The appropriate sample size depends on the desired type-I error and power of the test. This paper investigates the exact type-I error and power of the lod score method in a segregating F2 population with co-dominant markers and a qualitative monogenic dominant-recessive trait. For illustration, a disease-resistance trait is considered, where the susceptible allele is recessive. A procedure is suggested for finding the appropriate sample size. It is shown that recessive plants have about twice the information content of dominant plants, so the former should be preferred for linkage detection. In some cases the exact alpha-values for a given nominal alpha may be rather small due to the discrete nature of the sampling distribution in small samples. We show that a gain in power is possible by using exact methods.
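
The effect of discreteness on the exact type-I error can be illustrated in a deliberately simplified setting. The sketch below assumes fully informative, phase-known gametes (k recombinants among n), which is not the F2 design with a dominant-recessive trait studied in the paper; the function names and the threshold of 3 are illustrative choices.

```python
import math

def lod_score(k, n):
    """Maximum lod score for k recombinants among n informative gametes."""
    theta = min(k / n, 0.5)            # MLE restricted to [0, 0.5]
    log_lik = 0.0
    if k > 0:
        log_lik += k * math.log10(theta)
    if k < n:
        log_lik += (n - k) * math.log10(1.0 - theta)
    return log_lik + n * math.log10(2.0)   # subtract log10 of L(0.5)

def exact_type1_error(n, threshold=3.0):
    """P(lod >= threshold) when theta = 0.5, enumerating Binomial(n, 0.5)."""
    return sum(math.comb(n, k) * 0.5 ** n
               for k in range(n + 1) if lod_score(k, n) >= threshold)
```

With n = 10 in this simplified setting, only the outcome with zero recombinants reaches a lod score of 3, so the exact type-I error is 1/1024.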

9.
Eng KH, Kosorok MR. Biometrics 2005, 61(1): 86-91
An advantage of the supremum log-rank over the standard log-rank statistic is an increased sensitivity to a wider variety of stochastic ordering alternatives. In this article, we develop a formula for sample size computation for studies utilizing the supremum log-rank statistic. The idea is to base power on the proportional hazards alternative, so that the supremum log rank will have the same power as the standard log rank in the setting where the standard log rank is optimal. This results in a slight increase in sample size over that required for the standard log rank. For example, a 5.733% increase occurs for a two-sided test having type I error 0.05 and power 0.80. This slight increase in sample size is offset by the significant gains in power the supremum log-rank test achieves for a wide range of nonproportional hazards alternatives. A small simulation study is used for illustration. These results should facilitate the wider use of the supremum log-rank statistic in clinical trials.

10.
Many medical and biological studies entail classifying a number of observations according to two factors, where one has two and the other three possible categories. This is the case, for example, in genetic association studies of complex traits with single-nucleotide polymorphisms (SNPs), where the a priori statistical planning, analysis, and interpretation of results are of critical importance. Here, we present methodology to determine the minimum sample size required to detect dependence in 2 × 3 tables based on Fisher's exact test, assuming that neither of the two margins is fixed and only the grand total N is known in advance. We provide the numerical tools necessary to determine these sample sizes for desired power, significance level, and effect size, where only the computational time can be a limitation for extreme parameter values. These programs can be accessed at . This solution of the sample size problem for an exact test will permit experimentalists to plan efficient sampling designs, determine the extent of statistical support for their hypotheses, and gain insight into the repeatability of their results. We apply this solution of the sample size problem to three empirical studies, and discuss the results with specified power and nominal significance levels.

11.
We consider power and sample size calculation for diagnostic studies with normally distributed multiple correlated test results. We derive test statistics and obtain power and sample size formulas. The methods are illustrated using an example comparing CT and PET scanners for detecting extra-hepatic disease in colorectal cancer.

12.
We consider statistical testing for non-inferiority of a new treatment compared with the standard one under a matched-pair setting in a stratified study or in several trials. A non-inferiority test based on the efficient scores and a Mantel-Haenszel (M-H) like procedure with restricted maximum likelihood estimators (RMLEs) of nuisance parameters, together with their corresponding sample size formulae, are presented. We evaluate the above tests and the M-H type Wald test in level and power. The stratified score test is conservative and provides the best power. The M-H like procedure with RMLEs gives an accurate level. However, the Wald test is anti-conservative and we suggest caution when it is used. The unstratified score test is not biased, but it is less powerful than the stratified score test when the baseline probabilities related to strata are not the same. This investigation shows that the stratified score test possesses optimum statistical properties in testing non-inferiority. A common difference between two proportions across strata is the basic assumption of the stratified tests; we present appropriate tests to validate the assumption, along with related remarks.

13.
In ophthalmologic studies, measurements obtained from both eyes of an individual are often highly correlated. Ignoring the correlation could lead to incorrect inferences. An asymptotic method was proposed by Tang and others (2008) for testing equality of proportions between two groups under Rosner's model. In this article, we investigate three testing procedures for general g ≥ 2 groups. Our simulation results show that the score testing procedure usually produces satisfactory type I error control and has reasonable power. The results of the three test procedures converge as the sample size increases. Examples from ophthalmologic studies are used to illustrate our proposed methods.

14.
Modification of sample size in group sequential clinical trials (total citations: 1; self-citations: 0; citations by others: 1)
Cui L, Hung HM, Wang SJ. Biometrics 1999, 55(3): 853-857
In group sequential clinical trials, sample size reestimation can be a complicated issue when it allows for change of sample size to be influenced by an observed sample path. Our simulation studies show that increasing sample size based on an interim estimate of the treatment difference can substantially inflate the probability of type I error in most practical situations. A new group sequential test procedure is developed by modifying the weights used in the traditional repeated significance two-sample mean test. The new test has the type I error probability preserved at the target level and can provide a substantial gain in power with the increase of sample size. Generalization of the new procedure is discussed.
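
The weighting idea can be sketched for a two-stage simplification. The code below assumes one interim look and combines the stage-wise standardized statistics with weights fixed by the originally planned stage sizes, so the weights do not change if the second stage is enlarged; the function and argument names are hypothetical, and the article's general group sequential procedure is not reproduced.

```python
import math

def weighted_two_stage_z(z_stage1, z_stage2, planned_n1, planned_n2):
    """Combine stage-wise z statistics using the originally planned weights."""
    total = planned_n1 + planned_n2
    w1 = planned_n1 / total
    # Squared weights sum to 1, so the result is standard normal under the
    # null hypothesis regardless of the actual (re-estimated) stage-2 size.
    return math.sqrt(w1) * z_stage1 + math.sqrt(1.0 - w1) * z_stage2
```

Here `z_stage2` is computed from the second-stage data alone, whatever its actual size after re-estimation.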

15.
The Cochran-Armitage test has commonly been used as a trend test for binomial proportions. The quasi-likelihood method provides a simple approach to modeling extra-binomial proportions. Two versions of the score and Wald tests using different parameterizations for the extra-binomial variance were investigated: one in terms of intercluster correlation, and another in terms of variance. Monte Carlo simulation was used to evaluate the performance of each version of the score test and the Wald test, and of the Cochran-Armitage test. The simulation shows that the Cochran-Armitage test has the proper size only for binomial sample data, and that the test is no longer valid when applied to extra-binomial data. The Wald test is more likely to exceed the nominal level than the score test under either the intercluster correlation model or the variance model. Both score tests performed very well even with binomial data; the tests control the type I error while maintaining the power to detect dose effects. Based on the design considered in this paper, the two score tests are comparable. The score test based on the intercluster correlation model seems to control the type I error better but appears less powerful than that based on the variance model. An example from a developmental toxicity experiment is given.
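
For reference, the ordinary Cochran-Armitage trend statistic for purely binomial data can be computed as follows. This is a minimal sketch of the baseline test only; it does not implement the quasi-likelihood score or Wald tests for extra-binomial data compared in the article, and the dose scores and counts in the usage note are hypothetical.

```python
import math

def cochran_armitage_z(doses, totals, responders):
    """Cochran-Armitage trend z statistic for binomial proportions."""
    n = sum(totals)
    p_bar = sum(responders) / n
    # Score-weighted deviation of observed from expected responders.
    t = sum(d * (x - m * p_bar)
            for d, m, x in zip(doses, totals, responders))
    sum_nd = sum(m * d for d, m in zip(doses, totals))
    sum_nd2 = sum(m * d * d for d, m in zip(doses, totals))
    var = p_bar * (1.0 - p_bar) * (sum_nd2 - sum_nd ** 2 / n)
    if var <= 0:
        raise ValueError("trend statistic is undefined for these data")
    return t / math.sqrt(var)
```

With dose scores 0, 1, 2, ten subjects per group, and 2, 5, 8 responders, the statistic equals 6 divided by the square root of 5.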

16.
Small study effects occur when smaller studies show different, often larger, treatment effects than large ones, which may threaten the validity of systematic reviews and meta-analyses. The most well-known reasons for small study effects include publication bias, outcome reporting bias, and clinical heterogeneity. Methods to account for small study effects in univariate meta-analysis have been extensively studied. However, detecting small study effects in a multivariate meta-analysis setting remains an untouched research area. One of the complications is that different types of selection processes can be involved in the reporting of multivariate outcomes. For example, some studies may be completely unpublished while others may selectively report multiple outcomes. In this paper, we propose a score test as an overall test of small study effects in multivariate meta-analysis. Two detailed case studies are given to demonstrate the advantage of the proposed test over various naive applications of univariate tests in practice. Through simulation studies, the proposed test is found to retain nominal Type I error rates with considerable power in moderate sample size settings. Finally, we also evaluate the concordance between the proposed test and the naive application of univariate tests by evaluating 44 systematic reviews with multiple outcomes from the Cochrane Database.

17.
Rosner B, Glynn RJ. Biometrics 2011, 67(2): 646-653
The Wilcoxon rank sum test is widely used for two-group comparisons of nonnormal data. An assumption of this test is independence of sampling units both within and between groups, which will be violated in the clustered data setting such as in ophthalmological clinical trials, where the unit of randomization is the subject, but the unit of analysis is the individual eye. For this purpose, we have proposed the clustered Wilcoxon test to account for clustering among multiple subunits within the same cluster (Rosner, Glynn, and Lee, 2003, Biometrics 59, 1089-1098; 2006, Biometrics 62, 1251-1259). However, power estimation is needed to plan studies that use this analytic approach. We have recently published methods for estimating power and sample size for the ordinary Wilcoxon rank sum test (Rosner and Glynn, 2009, Biometrics 65, 188-197). In this article we present extensions of this approach to estimate power for the clustered Wilcoxon test. Simulation studies show a good agreement between estimated and empirical power. These methods are illustrated with examples from randomized trials in ophthalmology. Enhanced power is achieved with use of the subunit as the unit of analysis instead of the cluster using the ordinary Wilcoxon rank sum test.

18.
Testing Heterozygote Excess and Deficiency (total citations: 32; self-citations: 2; citations by others: 30)
F. Rousset, M. Raymond. Genetics 1995, 140(4): 1413-1419
Currently used tests of Hardy-Weinberg proportions do not take into account the nature of the alternative hypothesis, which is generally a heterozygote deficiency. Different exact tests, appropriate for small sample size and large number of alleles, are proposed in this perspective, and their properties are evaluated by power comparisons. Some tests are found to be close to optimal for the detection of inbreeding or heterozygote excess, one of which is a score test closely related to Robertson and Hill's estimator of the inbreeding coefficient. This test is also easily applied to multiple samples. Such tests are not always the most appropriate if alternative hypotheses differ from those considered here.

19.
Detecting the association between genetic markers and complex diseases can be a critical first step toward identification of the genetic basis of disease. Misleading associations can be avoided by choosing as controls the parents of diseased cases, but the availability of parents often limits this design to early-onset disease. Alternatively, sib controls offer a valid design. A general multivariate score statistic is presented, to detect the association between a multiallelic genetic marker locus and affection status; this general approach is applicable to designs that use parents as controls, sibs as controls, or even unrelated controls whose genotypes do not fit Hardy-Weinberg proportions or that pool any combination of these different designs. The benefit of this multivariate score statistic is that it will tend to be the most powerful method when multiple marker alleles are associated with affection status. To plan these types of studies, we present methods to compute sample size and power, allowing for varying sibship sizes, ascertainment criteria, and genetic models of risk. The results indicate that sib controls have less power than parental controls and that the power of sib controls can be increased by increasing either the number of affected sibs per sibship or the number of unaffected control sibs. The sample-size results indicate that the use of sib controls to test for associations, by use of either a single-marker locus or a genomewide screen, will be feasible for markers that have a dominant effect and for common alleles having a recessive effect. The results presented will be useful for investigators planning studies using sibs as controls.

20.
Summary: As the nonparametric generalization of the one‐way analysis of variance model, the Kruskal–Wallis test applies when the goal is to test the difference between multiple samples and the underlying population distributions are nonnormal or unknown. Although the Kruskal–Wallis test has been widely used for data analysis, power and sample size methods for this test have been investigated to a much lesser extent. This article proposes new power and sample size calculation methods for the Kruskal–Wallis test based on a pilot study in either a completely nonparametric model or a semiparametric location model. No assumption is made on the shape of the underlying population distributions. Simulation results show that, in terms of sample size calculation for the Kruskal–Wallis test, the proposed methods are more reliable than, and preferable to, some more traditional methods. A mouse peritoneal cavity study is used to demonstrate the application of the methods.
