Similar literature
 20 similar records retrieved (search time: 640 ms)
1.
Classical power analysis for sample size determination is typically performed in clinical trials. A “hybrid” classical-Bayesian or a “fully Bayesian” approach can alternatively be used to add flexibility to the design assumptions needed at the planning stage of the study and to explicitly incorporate prior information in the procedure. In this paper, we exploit and compare these approaches to obtain the optimal sample size of a single-arm trial based on Poisson data. We adopt exact methods to establish the rejection of the null hypothesis within a frequentist or a Bayesian perspective and suggest the use of a conservative criterion for sample size determination that accounts for the not strictly monotonic behavior of the power function in the presence of discrete data. A Shiny web app in R has been developed to provide a user-friendly interface to easily compute the optimal sample size according to the proposed criteria and to ensure the reproducibility of the results.
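The frequentist half of this comparison can be sketched directly. Below is a minimal Python illustration (not the authors' Shiny app): an exact one-sided test based on the total Poisson count, with a conservative rule that accepts a sample size only if the power target also holds for every larger sample size examined, since the exact power is not monotone in n. The rates `lam0` and `lam1` and the error rates are assumed values for the example.

```python
# Hypothetical sketch: exact frequentist sample size for a single-arm Poisson trial,
# H0: rate <= lam0 vs H1: rate = lam1 > lam0, based on the total count S ~ Poisson(n*rate).
import numpy as np
from scipy.stats import poisson

def critical_value(n, lam0, alpha=0.05):
    """Smallest integer c with P(S >= c) <= alpha when S ~ Poisson(n * lam0)."""
    c = 0
    while poisson.sf(c - 1, n * lam0) > alpha:   # sf(c-1) = P(S >= c)
        c += 1
    return c

def exact_power(n, lam0, lam1, alpha=0.05):
    c = critical_value(n, lam0, alpha)
    return poisson.sf(c - 1, n * lam1)

def conservative_n(lam0, lam1, alpha=0.05, beta=0.2, n_max=200):
    """Smallest n whose power stays >= 1 - beta for that n and every larger n up to n_max,
    guarding against the saw-tooth (non-monotone) behaviour of the exact power in n."""
    power = np.array([exact_power(n, lam0, lam1, alpha) for n in range(1, n_max + 1)])
    ok = power >= 1 - beta
    if not ok[-1]:
        raise ValueError("increase n_max")
    fail = np.where(~ok)[0]
    return int(fail[-1]) + 2 if fail.size else 1

print(conservative_n(lam0=0.5, lam1=0.8))
```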

2.
Noether (1987) proposed a method of sample size determination for the Wilcoxon-Mann-Whitney test. To obtain a sample size formula, he restricted himself to alternatives that differ only slightly from the null hypothesis, so that the unknown variance σ² of the Mann-Whitney statistic can be approximated by the known variance under the null hypothesis, which depends only on n. This fact is frequently forgotten in statistical practice. In this paper, we compare Noether's large-sample solution against an alternative approach based on upper bounds of σ², which is valid for any alternative. This comparison shows that Noether's approximation is sufficiently reliable with small and large deviations from the null hypothesis.
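Noether's large-sample solution reduces to a closed-form total sample size. The sketch below writes it in the commonly cited form N = (z_alpha + z_beta)^2 / (12 c (1 - c) (p - 1/2)^2), where p = P(X < Y) under the alternative and c is the fraction allocated to the first group; the function name and the example value of p are ours.

```python
# Hedged sketch of Noether's (1987) large-sample formula for the total sample size
# of the Wilcoxon-Mann-Whitney test, which uses the null variance of the statistic.
from math import ceil
from scipy.stats import norm

def noether_n(p1, alpha=0.05, power=0.8, ratio=0.5, two_sided=True):
    """Total sample size N; p1 = P(X < Y) under the alternative, `ratio` = n1 / N."""
    z_a = norm.ppf(1 - alpha / 2) if two_sided else norm.ppf(1 - alpha)
    z_b = norm.ppf(power)
    return (z_a + z_b) ** 2 / (12 * ratio * (1 - ratio) * (p1 - 0.5) ** 2)

print(ceil(noether_n(p1=0.65)))   # about 117 in total with equal allocation
```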

3.
Posch M, Bauer P. Biometrics 2000, 56(4):1170-1176
This article deals with sample size reassessment for adaptive two-stage designs based on conditional power arguments utilizing the variability observed at the first stage. Fisher's product test for the p-values from the disjoint samples at the two stages is considered in detail for the comparison of the means of two normal populations. We show that stopping rules allowing for the early acceptance of the null hypothesis that are optimal with respect to the average sample size may lead to a severe decrease of the overall power if the sample size is a priori underestimated. This problem can be overcome by choosing designs with low probabilities of early acceptance or by midtrial adaptations of the early acceptance boundary using the variability observed in the first stage. This modified procedure is negligibly anticonservative and preserves the power.

4.
Segregation analysis, employing nuclear families, is the most frequently used method to evaluate the mode of inheritance of a trait. To our knowledge, there exists no tabular information regarding the sample sizes, in terms of individuals and families, needed to perform a significance test of a specific segregation ratio for a predetermined power and significance level. To fill this gap, we have developed sample-size tables based on the asymptotic variance of the maximum likelihood estimate of the segregation ratio and on the normal approximation for two-sided hypothesis testing. Assuming homogeneous sibship size, minimum sample sizes were determined for testing the null hypothesis for the segregation ratio of 1/4 or 1/2 vs. alternative values of .05-.80, for the significance level of .05 and power of .8, for ascertainment probabilities of nearly 0 to 1.0, and sibship sizes 2-7. The results of these calculations indicate a complex interaction of the null and the alternative hypotheses, ascertainment probability, and sibship size in determining the sample size required for simple segregation analysis. The accompanying tables should aid in the appropriate design and cost assessment of future genetic epidemiologic studies.
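The normal-approximation step behind such tables can be illustrated as follows. This is a hedged sketch only: the `unit_variance` function below is a simple binomial stand-in, whereas the paper's tables use the asymptotic variance of the maximum likelihood estimate under the actual ascertainment probability and sibship size.

```python
# Hedged sketch of the two-sided normal-approximation sample-size formula for testing a
# segregation ratio p0 against an alternative p1. The per-family variance below is a
# placeholder, not the ascertainment-corrected variance tabulated in the paper.
from math import sqrt
from scipy.stats import norm

def unit_variance(p):
    # stand-in for the per-family asymptotic variance of the estimated ratio
    return p * (1 - p)

def n_families(p0, p1, alpha=0.05, power=0.8):
    z_a = norm.ppf(1 - alpha / 2)
    z_b = norm.ppf(power)
    num = z_a * sqrt(unit_variance(p0)) + z_b * sqrt(unit_variance(p1))
    return (num / (p1 - p0)) ** 2

print(round(n_families(p0=0.25, p1=0.40)))   # roughly 71 under the stand-in variance
```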

5.
The Haseman-Elston (HE) regression method offers a mathematically and computationally simpler alternative to variance-components (VC) models for the linkage analysis of quantitative traits. However, current versions of HE regression and VC models are not optimised for binary traits. Here, we present a modified HE regression and a liability-threshold VC model for binary traits. The new HE method is based on the regression of a linear combination of the trait squares and the trait cross-product on the proportion of alleles identical by descent (IBD) at the putative locus, for sibling pairs. We have implemented the new HE regression-based method and performed analytic and simulation studies to assess its type I error rate and power under a range of conditions. These studies showed that the new HE method is well-behaved under the null hypothesis in large samples, is more powerful than both the original and the revisited HE methods, and is approximately equivalent in power to the liability-threshold VC model.

6.
Although phylogenetic hypotheses can provide insights into mechanisms of evolution, their utility is limited by our inability to differentiate simultaneous speciation events (hard polytomies) from rapid cladogenesis (soft polytomies). In the present paper, we tested the potential for statistical power analysis to differentiate between hard and soft polytomies in molecular phylogenies. Classical power analysis typically is used a priori to determine the sample size required to detect a particular effect size at a particular level of significance (α) with a certain power (1 − β). A posteriori, power analysis is used to infer whether failure to reject a null hypothesis results from lack of an effect or from insufficient data (i.e., low power). We adapted this approach to molecular data to infer whether polytomies result from simultaneous branching events or from insufficient sequence information. We then used this approach to determine the amount of sequence data (sample size) required to detect a positive branch length (effect size). A worked example is provided based on the auklets (Charadriiformes: Alcidae), a group of seabirds among which relationships are represented by a polytomy, despite analyses of over 3000 bp of sequence data. We demonstrate the calculation of effect sizes and sample sizes from sequence data using a normal curve test for a difference of a proportion from an expected value and a t-test for a difference of a mean from an expected value. Power analyses indicated that the data for the auklets should be sufficient to differentiate speciation events that occurred at least 100,000 yr apart (the duration of the shortest glacial and interglacial events of the Pleistocene), 2.6 million years ago.
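The second of these calculations, a test of a mean against an expected value, amounts to a standard a priori sample-size formula. The sketch below is an illustration with made-up per-site effect and noise values, not the auklet data.

```python
# Hedged illustration (not the authors' exact calculation): a priori power analysis for
# detecting a positive branch length with a one-sided z/t-style test of a mean against 0.
# delta is the hypothesised mean signal per site and sigma its standard deviation;
# both are assumptions chosen only for this example.
from math import ceil
from scipy.stats import norm

def n_sites(delta, sigma, alpha=0.05, power=0.8):
    """Approximate number of sites needed to detect a true mean `delta` with one-sided power."""
    z_a, z_b = norm.ppf(1 - alpha), norm.ppf(power)
    return ceil(((z_a + z_b) * sigma / delta) ** 2)

# a small per-site signal relative to its noise requires many sites
print(n_sites(delta=0.001, sigma=0.03))   # about 5565 sites under these assumptions
```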

7.
In the Haseman-Elston approach the squared phenotypic difference is regressed on the proportion of alleles shared identical by descent (IBD) to map a quantitative trait to a genetic marker. In applications the IBD distribution is estimated and usually cannot be determined uniquely owing to incomplete marker information. At Genetic Analysis Workshop (GAW) 13, Jacobs et al. [BMC Genet 2003, 4(Suppl 1):S82] proposed to improve the power of the Haseman-Elston algorithm by weighting for information available from marker genotypes. The authors did not, however, show the validity of the employed asymptotic distribution. In this paper, we use the simulated data provided for GAW 14 and show that weighting Haseman-Elston by marker information results in increased type I error rates. Specifically, we demonstrate that the number of significant findings throughout the chromosome is significantly increased with weighting schemes. Furthermore, we show that the classical Haseman-Elston method keeps its nominal significance level when applied to the same data. We therefore recommend using Haseman-Elston with marker informativity weights only in conjunction with empirical p-values. Whether this approach in fact yields an increase in power needs to be investigated further.
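For reference, the classical (unweighted) Haseman-Elston step that the authors fall back on can be written in a few lines; the simulated data below are purely illustrative.

```python
# Minimal sketch of classical Haseman-Elston regression: the squared sib-pair trait
# difference is regressed on the estimated proportion of alleles shared IBD; linkage
# corresponds to a negative slope (one-sided test).
import numpy as np
from scipy.stats import linregress

rng = np.random.default_rng(1)
n_pairs = 500
pihat = rng.choice([0.0, 0.5, 1.0], size=n_pairs, p=[0.25, 0.5, 0.25])   # IBD proportion
# simulate sib-pair differences whose variance shrinks with IBD sharing (a linked QTL)
y_diff = rng.normal(0.0, np.sqrt(2.0 - 0.8 * pihat))
sq_diff = y_diff ** 2

fit = linregress(pihat, sq_diff)
one_sided_p = fit.pvalue / 2 if fit.slope < 0 else 1 - fit.pvalue / 2
print(f"slope = {fit.slope:.3f}, one-sided p = {one_sided_p:.4f}")
```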

8.
Scientists often need to test hypotheses and construct corresponding confidence intervals. In designing a study to test a particular null hypothesis, traditional methods lead to a sample size large enough to provide sufficient statistical power. In contrast, traditional methods based on constructing a confidence interval lead to a sample size likely to control the width of the interval. With either approach, a sample size so large as to waste resources or introduce ethical concerns is undesirable. This work was motivated by the concern that existing sample size methods often make it difficult for scientists to achieve their actual goals. We focus on situations which involve a fixed, unknown scalar parameter representing the true state of nature. The width of the confidence interval is defined as the difference between the (random) upper and lower bounds. An event width is said to occur if the observed confidence interval width is less than a fixed constant chosen a priori. An event validity is said to occur if the parameter of interest is contained between the observed upper and lower confidence interval bounds. An event rejection is said to occur if the confidence interval excludes the null value of the parameter. In our opinion, scientists often implicitly seek to have all three occur: width, validity, and rejection. New results illustrate that neglecting rejection or width (and less so validity) often provides a sample size with a low probability of the simultaneous occurrence of all three events. We recommend considering all three events simultaneously when choosing a criterion for determining a sample size. We provide new theoretical results for any scalar (mean) parameter in a general linear model with Gaussian errors and fixed predictors. Convenient computational forms are included, as well as numerical examples to illustrate our methods.
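The interplay of the three events is easy to explore by simulation for the simplest case of a one-sample Gaussian mean. The sketch below is an illustration under assumed values of the mean, standard deviation and target width, not the authors' general-linear-model results.

```python
# Hedged Monte Carlo illustration of the three events for a one-sample Gaussian mean:
# 'width' (CI narrower than w), 'validity' (CI covers the true mean) and 'rejection'
# (CI excludes the null value 0). All parameter values are assumptions for the example.
import numpy as np
from scipy.stats import t

def prob_all_three(n, mu=0.5, sigma=1.0, w=0.8, alpha=0.05, reps=20000, seed=0):
    rng = np.random.default_rng(seed)
    x = rng.normal(mu, sigma, size=(reps, n))
    m, s = x.mean(axis=1), x.std(axis=1, ddof=1)
    half = t.ppf(1 - alpha / 2, n - 1) * s / np.sqrt(n)   # CI half-width
    width = 2 * half < w
    validity = np.abs(m - mu) < half
    rejection = np.abs(m) > half                          # CI excludes 0
    return np.mean(width & validity & rejection)

for n in (20, 30, 40, 60):
    print(n, prob_all_three(n))
```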

9.
We revisit the results of the recent Reproducibility Project: Psychology by the Open Science Collaboration. We compute Bayes factors—a quantity that can be used to express comparative evidence for an hypothesis but also for the null hypothesis—for a large subset (N = 72) of the original papers and their corresponding replication attempts. In our computation, we take into account the likely scenario that publication bias had distorted the originally published results. Overall, 75% of studies gave qualitatively similar results in terms of the amount of evidence provided. However, the evidence was often weak (i.e., Bayes factor < 10). The majority of the studies (64%) did not provide strong evidence for either the null or the alternative hypothesis in either the original or the replication, and no replication attempts provided strong evidence in favor of the null. In all cases where the original paper provided strong evidence but the replication did not (15%), the sample size in the replication was smaller than the original. Where the replication provided strong evidence but the original did not (10%), the replication sample size was larger. We conclude that the apparent failure of the Reproducibility Project to replicate many target effects can be adequately explained by overestimation of effect sizes (or overestimation of evidence against the null hypothesis) due to small sample sizes and publication bias in the psychological literature. We further conclude that traditional sample sizes are insufficient and that a more widespread adoption of Bayesian methods is desirable.
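As a toy illustration of how a Bayes factor weighs evidence for the null against an alternative, the following uses the analytically exact normal-mean example with known standard deviation; it is a stand-in only, since the authors use default Bayes factors that additionally model publication bias.

```python
# Toy, analytically exact Bayes factor for a normal mean with known sigma:
# H0: mu = 0 versus H1: mu ~ N(0, tau^2). Illustration only, not the authors' method.
import numpy as np
from scipy.stats import norm

def bf01_normal(xbar, n, sigma=1.0, tau=1.0):
    """Bayes factor in favour of the null for an observed sample mean `xbar`."""
    se2 = sigma ** 2 / n
    return norm.pdf(xbar, 0, np.sqrt(se2)) / norm.pdf(xbar, 0, np.sqrt(se2 + tau ** 2))

# a 'significant' but modest effect in a small sample gives only weak evidence
xbar, n = 0.45, 20                   # z is about 2.0, two-sided p about 0.044
print(1 / bf01_normal(xbar, n))      # BF10 about 1.5: far from 'strong' evidence
```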

10.
Traditional resampling-based tests for homogeneity in covariance matrices across multiple groups resample residuals, that is, data centered by group means. These residuals do not share the same second moments when the null hypothesis is false, which makes them difficult to use in the setting of multiple testing. An alternative approach is to resample standardized residuals, data centered by group sample means and standardized by group sample covariance matrices. This approach, however, has been observed to inflate type I error when sample size is small or data are generated from heavy-tailed distributions. We propose to improve this approach by using robust estimation for the first and second moments. We discuss two statistics: the Bartlett statistic and a statistic based on eigen-decomposition of sample covariance matrices. Both statistics can be expressed in terms of standardized errors under the null hypothesis. These methods are extended to test homogeneity in correlation matrices. Using simulation studies, we demonstrate that the robust resampling approach provides comparable or superior performance, relative to traditional approaches, for single testing and reasonable performance for multiple testing. The proposed methods are applied to data collected in an HIV vaccine trial to investigate possible determinants, including vaccine status, vaccine-induced immune response level and viral genotype, of unusual correlation pattern between HIV viral load and CD4 count in newly infected patients.
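The traditional (non-robust) variant that the paper starts from can be sketched as follows: a Bartlett/Box's M type statistic for two groups, with a resampling p-value obtained by permuting group-mean-centered residuals. The data and group sizes are illustrative assumptions, and this is not the paper's robust procedure.

```python
# Hedged sketch of a traditional resampling test for homogeneity of two covariance
# matrices, using a Bartlett / Box's M type statistic and permuted centered residuals.
import numpy as np

def bartlett_stat(groups):
    k = len(groups)
    ns = np.array([g.shape[0] for g in groups])
    covs = [np.cov(g, rowvar=False) for g in groups]
    pooled = sum((n - 1) * c for n, c in zip(ns, covs)) / (ns.sum() - k)
    # (N - k) * log|S_pooled| - sum_i (n_i - 1) * log|S_i|
    return (ns.sum() - k) * np.linalg.slogdet(pooled)[1] - sum(
        (n - 1) * np.linalg.slogdet(c)[1] for n, c in zip(ns, covs))

rng = np.random.default_rng(11)
g1 = rng.multivariate_normal([0, 0], [[1.0, 0.3], [0.3, 1.0]], size=60)
g2 = rng.multivariate_normal([1, 1], [[2.0, 0.8], [0.8, 1.5]], size=80)

obs = bartlett_stat([g1, g2])
resid = np.vstack([g1 - g1.mean(0), g2 - g2.mean(0)])   # group-mean-centered residuals
perm = []
for _ in range(999):
    rng.shuffle(resid)                                   # reallocate residuals to groups
    perm.append(bartlett_stat([resid[:60], resid[60:]]))
p = (1 + np.sum(np.array(perm) >= obs)) / 1000
print(f"Bartlett-type statistic = {obs:.2f}, resampling p = {p:.3f}")
```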

11.
Yin G, Shen Y. Biometrics 2005, 61(2):362-369
Clinical trial designs involving correlated data often arise in biomedical research. The intracluster correlation needs to be taken into account to ensure the validity of sample size and power calculations. In contrast to the fixed-sample designs, we propose a flexible trial design with adaptive monitoring and inference procedures. The total sample size is not predetermined, but adaptively re-estimated using observed data via a systematic mechanism. The final inference is based on a weighted average of the block-wise test statistics using generalized estimating equations, where the weight for each block depends on cumulated data from the ongoing trial. When there are no significant treatment effects, the devised stopping rule allows for early termination of the trial and acceptance of the null hypothesis. The proposed design updates information regarding both the effect size and within-cluster correlation based on the cumulated data in order to achieve a desired power. Estimation of the parameter of interest and its confidence interval are proposed. We conduct simulation studies to examine the operating characteristics and illustrate the proposed method with an example.

12.
The nonparametric Behrens-Fisher hypothesis is the most appropriate null hypothesis for the two-sample comparison when one does not wish to make restrictive assumptions about possible distributions. In this paper, a numerical approach is described by which the likelihood ratio test can be calculated for the nonparametric Behrens-Fisher problem. The approach taken here effectively reduces the number of parameters in the score equations to one by using a recursive formula for the remaining parameters. The resulting single dimensional problem can be solved numerically. The power of the likelihood ratio test is compared by simulation to that of a generalized Wilcoxon test of Brunner and Munzel. The tests have similar power for all alternatives considered when a simulated null distribution is used to generate cutoff values for the tests. The methods are illustrated on data on shoulder pain from a clinical trial.
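The comparator test is available in SciPy. A minimal sketch, with simulated groups that differ in both location and spread:

```python
# Sketch: the generalized Wilcoxon (Brunner-Munzel) test of the nonparametric
# Behrens-Fisher hypothesis H0: P(X < Y) + 0.5 * P(X = Y) = 1/2.
import numpy as np
from scipy.stats import brunnermunzel

rng = np.random.default_rng(42)
x = rng.normal(0.0, 1.0, size=30)        # group 1
y = rng.normal(0.5, 2.0, size=45)        # group 2: shifted and more variable

stat, p = brunnermunzel(x, y)
print(f"Brunner-Munzel statistic = {stat:.3f}, p = {p:.4f}")
```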

13.
We propose a general likelihood-based approach to the linkage analysis of qualitative and quantitative traits using identity by descent (IBD) data from sib-pairs. We consider the likelihood of IBD data conditional on phenotypes and test the null hypothesis of no linkage between a marker locus and a gene influencing the trait using a score test in the recombination fraction theta between the two loci. This method unifies the linkage analysis of qualitative and quantitative traits into a single inferential framework, yielding a simple and intuitive test statistic. Conditioning on phenotypes avoids unrealistic random sampling assumptions and allows sib-pairs from differing ascertainment mechanisms to be incorporated into a single likelihood analysis. In particular, it allows the selection of sib-pairs based on their trait values and the analysis of only those pairs having the most informative phenotypes. The score test is based on the full likelihood, i.e. the likelihood based on all phenotype data rather than just differences of sib-pair phenotypes. Considering only phenotype differences, as in Haseman and Elston (1972) and Kruglyak and Lander (1995), may result in important losses in power. The linkage score test is derived under general genetic models for the trait, which may include multiple unlinked genes. Population genetic assumptions, such as random mating or linkage equilibrium at the trait loci, are not required. This score test is thus particularly promising for the analysis of complex human traits. The score statistic readily extends to accommodate incomplete IBD data at the test locus, by using the hidden Markov model implemented in the programs MAPMAKER/SIBS and GENEHUNTER (Kruglyak and Lander, 1995; Kruglyak et al., 1996). Preliminary simulation studies indicate that the linkage score test generally matches or outperforms the Haseman-Elston test, the largest gains in power being for selected samples of sib-pairs with extreme phenotypes.

14.
Joshua Ladau, Sadie J. Ryan. Oikos 2010, 119(7):1064-1069
Null model tests of presence–absence data (‘NMTPAs’) provide important tools for inferring effects of competition, facilitation, habitat filtering, and other ecological processes from observational data. Many NMTPAs have been developed, but they often yield conflicting conclusions when applied to the same data. Type I and II error rates, size, power, robustness and bias provide important criteria for assessing which tests are valid, but these criteria need to be evaluated contingent on the sample size, null hypothesis of interest, and assumptions that are appropriate for the data set that is being analyzed. In this paper, we confirm that this is the case using the software MPower, evaluating the validity of NMTPAs contingent on the null hypothesis being tested, assumptions that can be made, and sample size. Evaluating the validity of NMTPAs contingent on these factors is important for ensuring that reliable inferences are drawn from observational data about the processes controlling community assembly.
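A minimal example of such a test, under the simplest possible null model (equiprobable placement preserving only total fill) and with a randomly generated matrix, is sketched below; the paper's point is precisely that the choice of null hypothesis and randomization algorithm must be evaluated case by case, so this is one option among many, not a recommendation.

```python
# Hedged sketch of one null model test of a presence-absence matrix: the observed
# C-score (checkerboard index) compared with its distribution under an equiprobable
# null model that preserves only the total number of presences.
import numpy as np

def c_score(m):
    # mean number of 'checkerboard units' over all pairs of species (rows)
    r = m.sum(axis=1)
    shared = m @ m.T
    i, j = np.triu_indices(m.shape[0], k=1)
    cu = (r[i] - shared[i, j]) * (r[j] - shared[i, j])
    return cu.mean()

rng = np.random.default_rng(5)
obs_matrix = (rng.random((12, 20)) < 0.4).astype(int)   # species x sites, illustrative

obs = c_score(obs_matrix)
null = [c_score(rng.permutation(obs_matrix.reshape(-1)).reshape(obs_matrix.shape))
        for _ in range(999)]
p = (1 + np.sum(np.array(null) >= obs)) / 1000
print(f"observed C-score = {obs:.2f}, upper-tail p = {p:.3f}")
```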

15.
Ryman N, Jorde PE. Molecular Ecology 2001, 10(10):2361-2373
A variety of statistical procedures are commonly employed when testing for genetic differentiation. In a typical situation two or more samples of individuals have been genotyped at several gene loci by molecular or biochemical means, and in a first step a statistical test for allele frequency homogeneity is performed at each locus separately, using, e.g., the contingency chi-square test, Fisher's exact test, or some modification thereof. In a second step the results from the separate tests are combined for evaluation of the joint null hypothesis that there is no allele frequency difference at any locus, corresponding to the important case where the samples would be regarded as drawn from the same statistical and, hence, biological population. Presently, there are two conceptually different strategies in use for testing the joint null hypothesis of no difference at any locus. One approach is based on the summation of chi-square statistics over loci. Another method is employed by investigators applying the Bonferroni technique (adjusting the P-value required for rejection to account for the elevated alpha errors when performing multiple tests simultaneously) to test if the heterogeneity observed at any particular locus can be regarded as significant when considered separately. Under this approach the joint null hypothesis is rejected if one or more of the component single-locus tests is considered significant under the Bonferroni criterion. We used computer simulations to evaluate the statistical power and realized alpha errors of these strategies when evaluating the joint hypothesis after scoring multiple loci. We find that the 'extended' Bonferroni approach is generally associated with low statistical power and should not be applied in the current setting. Further, and contrary to what might be expected, we find that 'exact' tests typically behave poorly when combined in existing procedures for joint hypothesis testing. Thus, while exact tests are generally to be preferred over approximate ones when testing each particular locus, approximate tests such as the traditional chi-square seem preferable when addressing the joint hypothesis.
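The two strategies can be written down in a few lines. The per-locus statistics below are made-up numbers used only to show the mechanics of the summed chi-square test versus the 'extended' Bonferroni rule.

```python
# Sketch of the two strategies for the joint null hypothesis of no allele-frequency
# difference at any locus: summing chi-square statistics over loci versus an
# 'extended' Bonferroni rule on the per-locus p-values.
import numpy as np
from scipy.stats import chi2

# hypothetical per-locus contingency chi-square statistics and degrees of freedom
stats = np.array([3.2, 0.8, 5.9, 1.4, 2.1])
dfs   = np.array([1,   2,   1,   3,   1  ])
alpha = 0.05

# 1) summation over loci: the sum is chi-square with summed df under the joint null
p_sum = chi2.sf(stats.sum(), dfs.sum())

# 2) 'extended' Bonferroni: reject the joint null if any per-locus p < alpha / L
p_locus = chi2.sf(stats, dfs)
reject_bonf = bool((p_locus < alpha / len(stats)).any())

print(f"summed chi-square: p = {p_sum:.3f}; Bonferroni rejects joint null: {reject_bonf}")
```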

16.
The problem of whether the hominid fossil sample of habiline specimens comprises more than one species has received much attention in paleoanthropology. The core of this debate has critical implications for when and how variation can be explained by taxonomy. In this paper, we examine the problem of whether the observed variation in habiline samples reflects species differences. We test the null hypothesis of no difference by examining the degree of variability in the habiline sample in comparison with other single-species early hominid fossil samples from Sterkfontein and Swartkrans (Sterkfontein is earlier than the habiline sample, Swartkrans may be within the habiline time span). We developed a new method for this examination, which we call the STandard Error Test of the null hypothesis of no difference (STET). Our sampling statistic is based on the standard error of the slope of regressions between pairs of specimens, relating all of the homologous measurements that each pair shares. We show that the null hypothesis for the habiline sample cannot be rejected. The similarities of specimen pairs within the habiline sample are no greater than those observed between the specimens in the australopithecine samples we analyzed.

17.
Wang T, Elston RC. Human Heredity 2004, 57(2):109-116
The original and revisited Haseman-Elston methods are simple robust methods to detect linkage, but neither is uniformly optimal in terms of power. In this report, we propose a simple modification of the revisited Haseman-Elston method that retains the simplicity and robustness properties, but increases its power. We demonstrate theoretically that the modification can be more powerful than the optimally weighted Haseman-Elston method when the sibship mean can be correctly specified. We then examine the properties of this modification by simulation when the sibship mean is replaced by its best linear unbiased predictor. The simulation results indicate that this modification maintains good control over type I error, even in the case of larger sibships, and that the empirical power of this modification is similar to that of the optimally weighted Haseman-Elston method in most cases.

18.
In experiments with many statistical tests there is a need to balance type I and type II error rates while taking multiplicity into account. In the traditional approach, the nominal α-level such as 0.05 is adjusted by the number of tests, L, i.e., as 0.05/L. Assuming that some proportion of tests represent “true signals”, that is, originate from a scenario where the null hypothesis is false, power depends on the number of true signals and the respective distribution of effect sizes. One way to define power is for it to be the probability of making at least one correct rejection at the assumed α-level. We advocate an alternative way of establishing how “well-powered” a study is. In our approach, useful for studies with multiple tests, the ranking probability is controlled, defined as the probability of making at least k correct rejections while rejecting the k hypotheses with the smallest P-values. The two approaches are statistically related. The probability that the smallest P-value is a true signal (i.e., k = 1) is, to an excellent approximation, equal to the power of a single test at the multiplicity-adjusted level. Ranking probabilities are also related to the false discovery rate and to the Bayesian posterior probability of the null hypothesis. We study properties of our approach when the effect size distribution is replaced for convenience by a single “typical” value taken to be the mean of the underlying distribution. We conclude that its performance is often satisfactory under this simplification; however, substantial imprecision is to be expected when the number of tests is very large and k is small. Precision is largely restored when three effect-size values with their respective abundances are used instead of a single typical value.
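The contrast between the two notions of power is easy to simulate. In the sketch below, the number of tests, the number and size of true signals, and k are illustrative assumptions; `rank_hits` estimates the ranking probability and `bonf_hits` the traditional any-rejection power at the Bonferroni-adjusted level.

```python
# Hedged simulation sketch of the 'ranking probability' idea: the chance that the k
# smallest P-values out of L tests all correspond to true signals, compared with the
# traditional power of at least one rejection at the alpha/L level.
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(0)
L, n_true, k = 1000, 20, 5            # L tests, n_true true signals, top-k of interest
effect, alpha, reps = 3.0, 0.05, 5000

rank_hits, bonf_hits = 0, 0
for _ in range(reps):
    z = rng.normal(0.0, 1.0, L)
    z[:n_true] += effect                        # the first n_true tests carry a signal
    p = 2 * norm.sf(np.abs(z))                  # two-sided P-values
    top_k = np.argsort(p)[:k]
    rank_hits += np.all(top_k < n_true)         # all k smallest P-values are true signals
    bonf_hits += np.any(p[:n_true] < alpha / L) # at least one correct rejection at alpha/L

print(f"P(top {k} are all true signals) ~ {rank_hits / reps:.3f}")
print(f"P(>=1 correct rejection at alpha/L) ~ {bonf_hits / reps:.3f}")
```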

19.
Both theoretical calculations and simulation studies have been used to compare and contrast the statistical power of methods for mapping quantitative trait loci (QTLs) in simple and complex pedigrees. A widely used approach in such studies is to derive or simulate the expected mean test statistic under the alternative hypothesis of a segregating QTL and to equate a larger mean test statistic with larger power. In the present study, we show that, even when the test statistic under the null hypothesis of no linkage follows a known asymptotic distribution (the standard being χ²), it cannot be assumed that the distribution under the alternative hypothesis is noncentral χ². Hence, mean test statistics cannot be used to indicate power differences, and a comparison between methods that are based on simulated average test statistics may lead to the wrong conclusion. We illustrate this important finding, through simulations and analytical derivations, for a recently proposed new regression method for the analysis of general pedigrees to map quantitative trait loci. We show that this regression method is not necessarily more powerful nor computationally more efficient than a maximum-likelihood variance-component approach. We advocate the use of empirical power to compare trait-mapping methods.
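The point can be reproduced with a toy statistic whose null distribution is χ²(1) but whose alternative distribution is a mixture rather than a noncentral χ²: matching the mean of the simulated statistics to a noncentral χ² overstates the power relative to the empirical exceedance rate. The mixture below is an invented example, not one of the trait-mapping methods compared in the paper.

```python
# Sketch of the paper's recommendation: estimate empirical power by counting exceedances
# of the null critical value, instead of comparing average test statistics.
import numpy as np
from scipy.stats import chi2, ncx2

rng = np.random.default_rng(7)
crit = chi2.ppf(0.95, df=1)
reps = 100000

# alternative: a 50/50 mixture of a null-like and a strongly shifted statistic
stat = np.where(rng.random(reps) < 0.5,
                rng.chisquare(1, reps),
                rng.noncentral_chisquare(1, 8.0, reps))

empirical_power = np.mean(stat > crit)
# naive approach: match the mean and pretend the alternative is noncentral chi-square
# (a noncentral chi-square with 1 df and noncentrality nc has mean 1 + nc)
naive_power = ncx2.sf(crit, 1, stat.mean() - 1)
print(f"empirical power = {empirical_power:.3f}, mean-based 'power' = {naive_power:.3f}")
```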

20.
Genetic analysis of quantitative traits can be carried out by selective genotyping. This paper proposes a statistic T for association analysis of quantitative trait loci (QTL) using extreme samples. The statistic T compares the difference in trait values between the homozygous marker genotypes within the upper-extreme sample. Computer simulations were used to examine the distribution of T and its type I error rate under no association; the results show that, under a range of sample-selection strategies, the distribution of T is approximately χ², and the type I error rate is close to the nominal significance level. The effects of heritability, sample size, and the sample-selection threshold on the statistical power of T were also examined under various genetic models; the results show that the power of T increases with stronger linkage disequilibrium between the marker and the QTL and with larger heritability and sample size, and that power is higher when the selection threshold is more stringent.
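A statistic in the spirit of T can be sketched as follows; the definition below (a squared standardized mean difference between the two homozygote groups in the upper-extreme sample, referred to a χ² distribution with 1 df) is an illustration, not necessarily the paper's exact formula.

```python
# Hedged sketch of an extreme-sample association statistic for a QTL: within the
# upper-extreme sample, compare mean trait values between the two homozygous marker
# genotypes; under no association the squared standardized difference is roughly chi2(1).
import numpy as np
from scipy.stats import chi2

def extreme_sample_T(y, genotype, upper_quantile=0.9):
    """y: trait values; genotype: 0/1/2 marker genotype counts."""
    cut = np.quantile(y, upper_quantile)
    sel = y >= cut                                    # upper-extreme individuals
    y0, y2 = y[sel & (genotype == 0)], y[sel & (genotype == 2)]
    diff = y0.mean() - y2.mean()
    var = y0.var(ddof=1) / len(y0) + y2.var(ddof=1) / len(y2)
    T = diff ** 2 / var
    return T, chi2.sf(T, df=1)

rng = np.random.default_rng(3)
g = rng.binomial(2, 0.5, size=5000)                   # marker genotypes (assumed data)
y = 0.3 * g + rng.normal(size=5000)                   # trait influenced by a linked QTL
print(extreme_sample_T(y, g))
```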
