Similar Literature
20 similar documents found (search time: 15 ms)
1.
Rosenbaum PR. Biometrics 2011, 67(3), 1017–1027
Summary: In an observational or nonrandomized study of treatment effects, a sensitivity analysis indicates the magnitude of bias from unmeasured covariates that would need to be present to alter the conclusions of a naïve analysis that presumes adjustments for observed covariates suffice to remove all bias. The power of a sensitivity analysis is the probability that it will reject a false hypothesis about treatment effects while allowing for a departure from random assignment of a specified magnitude; in particular, if this specified magnitude is "no departure," this is the same as the power of a randomization test in a randomized experiment. A new family of u‐statistics is proposed that includes Wilcoxon's signed rank statistic but also other statistics with substantially higher power when a sensitivity analysis is performed in an observational study. Wilcoxon's statistic has high power to detect small effects in large randomized experiments—that is, it often has good Pitman efficiency—but small effects are invariably sensitive to small unobserved biases. Members of this family of u‐statistics that emphasize medium to large effects can have substantially higher power in a sensitivity analysis. For example, in one situation with 250 pair differences that are Normal with expectation 1/2 and variance 1, the power of a sensitivity analysis that uses Wilcoxon's statistic is 0.08, while the power of another member of the family of u‐statistics is 0.66. The topic is examined by performing a sensitivity analysis in three observational studies, using an asymptotic measure called the design sensitivity, and by simulating power in finite samples. The three examples are drawn from epidemiology, clinical medicine, and genetic toxicology.
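The standard signed-rank sensitivity bound behind this abstract can be sketched as follows. This is not the paper's new family of u-statistics; it is the classical large-sample Rosenbaum bound for Wilcoxon's signed-rank statistic, where a bias parameter gamma = 1 recovers the usual randomization test. The simulated data mirror the abstract's example of 250 Normal(1/2, 1) pair differences; function and variable names are my own.

```python
import numpy as np
from scipy import stats

def signed_rank_sensitivity_pvalue(d, gamma=1.0):
    """Upper bound on the one-sided p-value for Wilcoxon's signed-rank
    statistic when treated-minus-control pair differences d may be biased
    by unobserved covariates of magnitude gamma (gamma = 1 corresponds to
    randomization, giving the usual large-sample randomization test)."""
    d = np.asarray(d, dtype=float)
    d = d[d != 0]                          # zero differences carry no sign information
    ranks = stats.rankdata(np.abs(d))      # midranks of |d|
    T = ranks[d > 0].sum()                 # signed-rank statistic
    p_plus = gamma / (1.0 + gamma)         # worst-case P(positive difference)
    mean = p_plus * ranks.sum()
    var = p_plus * (1.0 - p_plus) * (ranks ** 2).sum()
    z = (T - mean) / np.sqrt(var)
    return stats.norm.sf(z)                # normal approximation

rng = np.random.default_rng(1)
d = rng.normal(0.5, 1.0, size=250)         # 250 pairs, as in the abstract's example
p_random = signed_rank_sensitivity_pvalue(d, gamma=1.0)
p_biased = signed_rank_sensitivity_pvalue(d, gamma=3.0)
# the p-value bound can only grow as the allowed bias gamma grows
```

The monotone increase of the bound in gamma is exactly why small effects are "invariably sensitive to small unobserved biases": the bound crosses 0.05 at a small gamma.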

2.
We present a survey of sample size formulas derived in other papers for pairwise comparisons of k treatments and for comparisons of k treatments with a control. We consider the calculation of sample sizes with preassigned per‐pair, any‐pair, and all‐pairs power for tests that control either the comparisonwise or the experimentwise type I error rate. The comparison reveals interesting similarities among the parametric, nonparametric, and binomial cases.
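As a rough illustration of the per-pair-power calculation with the two error-rate conventions, the sketch below uses a normal-approximation formula with a Bonferroni adjustment for experimentwise control. This is a crude stand-in, not one of the surveyed exact formulas (those would use Dunnett's multivariate t); all names and defaults are my own.

```python
import math
from scipy.stats import norm

def n_per_group(delta, sigma, k, alpha=0.05, beta=0.20, experimentwise=True):
    """Approximate sample size per group for two-sided comparisons of k
    treatments with a control at per-pair power 1 - beta.  A Bonferroni
    correction (alpha / k) gives crude experimentwise type I error control;
    an exact approach would use Dunnett's multivariate t distribution."""
    a = alpha / k if experimentwise else alpha
    z = norm.ppf(1.0 - a / 2.0) + norm.ppf(1.0 - beta)
    return math.ceil(2.0 * (z * sigma / delta) ** 2)

# one comparison at nominal alpha: the classical two-sample result, n = 16
n_single = n_per_group(delta=1.0, sigma=1.0, k=1)
# three comparisons with experimentwise control need more subjects per group
n_triple = n_per_group(delta=1.0, sigma=1.0, k=3)
```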

3.
We consider sample size determination for ordered categorical data when the alternative hypothesis is the proportional odds model. In this paper the sample size formula proposed by Whitehead (Statistics in Medicine, 12, 2257–2271, 1993) is compared with methods based on exact and asymptotic linear rank tests with Wilcoxon and trend scores. We show that Whitehead's formula, which is based on a normal approximation, works well when the sample size is moderate to large, but we recommend the exact method with Wilcoxon scores for small sample sizes. The consequences of model misspecification are also investigated.
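A simulation check of this kind of comparison can be sketched directly: generate ordered categories under a proportional-odds shift of the cumulative logits and estimate the power of the Wilcoxon-score (Mann-Whitney) test. The category probabilities, odds ratio, and group size below are hypothetical, chosen only for illustration.

```python
import numpy as np
from scipy.stats import mannwhitneyu
from scipy.special import expit, logit

def po_probs(control_probs, odds_ratio):
    """Category probabilities under the proportional odds alternative:
    shift every cumulative logit by log(odds_ratio)."""
    cum = np.cumsum(control_probs)[:-1]
    shifted = expit(logit(cum) + np.log(odds_ratio))
    return np.diff(np.concatenate(([0.0], shifted, [1.0])))

def simulated_power(control_probs, odds_ratio, n, reps=500, alpha=0.05, seed=7):
    """Monte Carlo power of the two-sided Wilcoxon (Mann-Whitney) test on
    ordinal scores 0..k-1, treatment generated under proportional odds."""
    rng = np.random.default_rng(seed)
    treat_probs = po_probs(control_probs, odds_ratio)
    k = len(control_probs)
    hits = 0
    for _ in range(reps):
        x = rng.choice(k, size=n, p=control_probs)
        y = rng.choice(k, size=n, p=treat_probs)
        if mannwhitneyu(x, y, alternative="two-sided").pvalue < alpha:
            hits += 1
    return hits / reps

probs = [0.2, 0.3, 0.3, 0.2]              # hypothetical control distribution
power_alt = simulated_power(probs, odds_ratio=2.5, n=80)
power_null = simulated_power(probs, odds_ratio=1.0, n=80)
```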

4.
Computer simulation techniques were used to investigate the type I and type II error rates of one parametric (Dunnett) and two nonparametric multiple comparison procedures for comparing treatments with a control under nonnormality and variance homogeneity. Dunnett's procedure was found to be quite robust to violations of the normality assumption. Power comparisons show that for small sample sizes Dunnett's procedure is superior to the nonparametric procedures even in nonnormal cases, but for larger sample sizes the multiple-comparison analogues of the Wilcoxon and Kruskal–Wallis rank statistics are superior to Dunnett's procedure in all nonnormal cases considered. Further investigations under nonnormality and variance heterogeneity show robustness with respect to the type I error rate, and power comparisons yield results similar to those in the equal-variance case.
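The simulation design behind such robustness studies is easy to reproduce in miniature. The sketch below (not a replication of the paper's Dunnett study) estimates the type I error of a parametric test and a rank test under skewed lognormal data at the global null; all parameters are made up for illustration.

```python
import numpy as np
from scipy.stats import f_oneway, kruskal

def type1_rate(test, reps=2000, n=15, groups=3, alpha=0.05, seed=11):
    """Monte Carlo type I error rate under lognormal (skewed) data with
    all group distributions identical, i.e. the global null hypothesis."""
    rng = np.random.default_rng(seed)
    hits = 0
    for _ in range(reps):
        samples = [rng.lognormal(0.0, 1.0, n) for _ in range(groups)]
        if test(*samples).pvalue < alpha:
            hits += 1
    return hits / reps

rate_anova = type1_rate(f_oneway)      # parametric, normality violated
rate_kw = type1_rate(kruskal)          # rank-based, distribution-free
```

With equal group sizes, both rejection rates should stay near the nominal 5% level, which is the "robustness of the risk of the first kind" the abstract refers to.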

5.
The one‐degree‐of‐freedom Cochran‐Armitage (CA) test statistic for linear trend has been widely applied in various dose‐response studies (e.g., anti‐ulcer medications and short‐term antibiotics, animal carcinogenicity bioassays, and occupational toxicant studies). This approximate statistic relies, however, on asymptotic theory that is reliable only when the sample sizes are reasonably large and well balanced across dose levels. For small, sparse, or skewed data, the asymptotic theory is suspect, and the exact conditional method (based on the CA statistic) provides a dependable alternative. Unfortunately, the exact conditional method is practical only for the linear logistic model, from which the sufficient statistics for the regression coefficients can be obtained explicitly. In this article, a simple and efficient recursive polynomial multiplication algorithm for an exact unconditional test (based on the CA statistic) for detecting a linear trend in proportions is derived. The method is applicable to all choices of monotone-trend model, including logistic, probit, arcsine, extreme value, and one-hit. We also show that this algorithm can be easily extended to exact unconditional power calculation for studies with up to a moderately large sample size. A real example illustrates the applicability of the proposed method.
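The asymptotic CA statistic itself is a short computation, shown below for a hypothetical three-dose example (the article's contribution, the exact unconditional version, replaces the normal reference distribution used here).

```python
import numpy as np
from scipy.stats import norm

def cochran_armitage(successes, totals, doses):
    """One-degree-of-freedom Cochran-Armitage test for a linear trend in
    proportions across ordered dose groups (asymptotic version; exact
    methods address its small-sample deficiencies)."""
    r = np.asarray(successes, float)
    n = np.asarray(totals, float)
    d = np.asarray(doses, float)
    N, p_bar = n.sum(), r.sum() / n.sum()
    num = np.sum(d * (r - n * p_bar))
    var = p_bar * (1 - p_bar) * (np.sum(n * d**2) - np.sum(n * d) ** 2 / N)
    z = num / np.sqrt(var)
    return z, 2 * norm.sf(abs(z))          # two-sided p-value

# hypothetical dose-response data: 1/10, 3/10, 7/10 responders at doses 0, 1, 2
z, p = cochran_armitage([1, 3, 7], [10, 10, 10], [0, 1, 2])
```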

6.
Investigations of sample size for planning case-control studies have usually been limited to detecting a single factor. In this paper, we investigate sample size for multiple risk factors in strata-matched case-control studies. We construct an omnibus statistic for testing M different risk factors based on the jointly sufficient statistics of the parameters associated with the risk factors. The statistic is non-iterative, and it reduces to the Cochran statistic when M = 1. The asymptotic power function of the test is a non-central chi-square with M degrees of freedom, and the sample size required for a specified power can be obtained by inverting this relationship. We find that equal sample allocation is optimal. A Monte Carlo experiment demonstrates that an approximate formula for calculating sample size is satisfactory in typical epidemiologic studies. An approximate sample size obtained using Bonferroni's method for multiple comparisons is much larger than that obtained using the omnibus test. The approximate sample size formulas investigated in this paper, based on the omnibus test as well as the individual tests, can be useful in designing case-control studies for detecting multiple risk factors.
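The "inverse relationship" step, solving the noncentral chi-square power function for the noncentrality and hence the sample size, can be sketched generically. The per-subject noncentrality lam1 is an assumed input here, not a quantity defined in the abstract.

```python
import math
from scipy.stats import chi2, ncx2
from scipy.optimize import brentq

def required_noncentrality(M, power=0.80, alpha=0.05):
    """Noncentrality at which a chi-square test with M degrees of freedom
    attains the target power at level alpha."""
    crit = chi2.isf(alpha, M)
    return brentq(lambda nc: ncx2.sf(crit, M, nc) - power, 1e-6, 100.0)

def sample_size(M, lam1, power=0.80, alpha=0.05):
    """Total sample size when each subject contributes noncentrality lam1
    (an assumed per-subject effect measure)."""
    return math.ceil(required_noncentrality(M, power, alpha) / lam1)

nc1 = required_noncentrality(M=1)   # classical value (z_{a/2} + z_b)^2, about 7.85
nc3 = required_noncentrality(M=3)   # more df in the omnibus test needs more noncentrality
```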

7.
Sensitivity and specificity have traditionally been used to assess the performance of a diagnostic procedure. Procedures with both high sensitivity and high specificity are desirable, but they are frequently too expensive, hazardous, and/or difficult to operate. A less sophisticated procedure may be preferred if the loss of sensitivity or specificity is judged clinically acceptable. This paper addresses the problem of simultaneously testing the sensitivity and specificity of an alternative test procedure against those of a reference procedure when a gold standard is present. The hypothesis is formulated as a compound hypothesis of two non‐inferiority (one‐sided equivalence) tests. We present an asymptotic test statistic based on the restricted maximum likelihood estimate in the framework of comparing two correlated proportions under prospective and retrospective sampling designs. The sample size and power of the asymptotic test are derived. The actual type I error and power are calculated by enumerating the exact probabilities in the rejection region. For applications that require high sensitivity as well as high specificity, large numbers of both positive and negative subjects are needed. We also propose a weighted sum statistic as an alternative test that compares a combined measure of the sensitivity and specificity of the two procedures. The sample size determination is independent of the sampling plan for the two tests.
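The compound-hypothesis logic is an intersection-union test: accept the alternative procedure only if both one-sided non-inferiority tests reject. The sketch below uses simple Wald statistics and treats the two procedures as evaluated on independent samples, unlike the paper's correlated-proportions, restricted-ML formulation; all numbers and names are hypothetical.

```python
from math import sqrt
from scipy.stats import norm

def noninferiority_z(p_new, p_ref, n_new, n_ref, margin):
    """One-sided Wald z for H0: p_new <= p_ref - margin (new inferior)
    against the non-inferiority alternative."""
    se = sqrt(p_new * (1 - p_new) / n_new + p_ref * (1 - p_ref) / n_ref)
    return (p_new - p_ref + margin) / se

def compound_test(se_new, se_ref, sp_new, sp_ref, n_dis, n_nondis,
                  margin=0.10, alpha=0.05):
    """Intersection-union test: declare the new procedure acceptable only
    if BOTH sensitivity and specificity are non-inferior."""
    z_se = noninferiority_z(se_new, se_ref, n_dis, n_dis, margin)
    z_sp = noninferiority_z(sp_new, sp_ref, n_nondis, n_nondis, margin)
    crit = norm.isf(alpha)
    return (z_se > crit and z_sp > crit), (z_se, z_sp)

# hypothetical study: 300 diseased and 300 non-diseased subjects
ok, (z_se, z_sp) = compound_test(0.88, 0.90, 0.97, 0.95, n_dis=300, n_nondis=300)
```

Because both component tests must reject, large numbers of positive and negative subjects are needed, which is exactly the sample-size implication the abstract notes.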

8.
Summary: In this article, we consider problems with correlated data that can be summarized in a 2 × 2 table with a structural zero in one of the off‐diagonal cells. Data of this kind sometimes appear in infectious disease studies and two‐step procedure studies. Lui (1998, Biometrics 54, 706–711) considered confidence interval estimation of the rate ratio based on Fieller‐type, Wald‐type, and logarithmic transformation statistics. We reexamine the same problem in the context of confidence interval construction for the false‐negative rate ratio in diagnostic performance when combining two diagnostic tests. We propose a score statistic for testing the null hypothesis of a nonunity false‐negative rate ratio. Score test–based confidence interval construction for the false‐negative rate ratio is also discussed. Simulation studies are conducted to compare the performance of the newly derived score test statistic with that of existing statistics for small to moderate sample sizes. In terms of confidence interval construction, our asymptotic score test–based confidence interval estimator has significantly shorter expected width, with coverage probability close to the anticipated confidence level. In terms of hypothesis testing, our asymptotic score test procedure has an actual type I error rate close to the pre‐assigned nominal level. We illustrate our methodologies with real examples from a clinical laboratory study and a cancer study.

9.
Since its introduction in 1959, the classical Mantel–Haenszel (M–H) procedure for combining the odds ratios of a set of I 2 × 2 tables has also been used in stratified or multicentre clinical trials. A familiar application is the M–H logrank test in survival analysis. An extension of the M–H procedure covering the case of 2 × K contingency tables with ordered levels (Mantel, 1963) retains the essential property of pooling the results of I homogeneous tables (i.e., in the absence of qualitative interactions). Assigning scores to the K columns of a table is essential for the use of the method in comparing two treatments. Several possibilities for score assignment are discussed: for clinical outcome variables such as the degree of severity of a disease, pain, and so on, a natural score is at hand. A less well-known type of scoring consists of ranking the observations of a continuous variable, leading to cell sizes of 1 or 0. In this case, with equidistant ranks, the extended M–H (E–M–H) procedure appears as an extension of Wilcoxon's rank sum test and represents a powerful nonparametric approach in stratified or multicentre designs with non-normally distributed outcome variables. Results of Monte Carlo simulations for two possible equidistant ranking procedures are presented; they indicate only a moderate gain in power compared with Wilcoxon's rank sum test in the common situation of centre effects not exceeding treatment effects. Use of the E–M–H procedure is also recommended as a simple way to overcome potential bias due to prognostic factors unequally distributed among treatment groups.
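The extended M–H statistic with column scores can be sketched as follows: within each stratum, compare the score sum of one treatment row with its conditional expectation, then pool numerators and variances across strata. The tables and scores below are hypothetical; function names are my own.

```python
import numpy as np
from scipy.stats import norm

def extended_mh(strata, scores):
    """Extended Mantel-Haenszel statistic for I strata of 2 x K tables
    with common column scores: the score sum of row 1 is compared with
    its conditional expectation, pooling evidence across strata."""
    scores = np.asarray(scores, float)
    num, var = 0.0, 0.0
    for table in strata:
        t = np.asarray(table, float)       # shape (2, K)
        col = t.sum(axis=0)                # column totals
        m0, m1, N = t[0].sum(), t[1].sum(), t.sum()
        T = np.sum(scores * t[0])          # observed score sum, row 1
        E = m0 * np.sum(scores * col) / N  # conditional expectation
        V = (m0 * m1 / (N**2 * (N - 1))) * (
            N * np.sum(scores**2 * col) - np.sum(scores * col) ** 2)
        num += T - E
        var += V
    z = num / np.sqrt(var)
    return z, 2 * norm.sf(abs(z))

# two hypothetical centres, 3 ordered outcome categories, natural scores 1 < 2 < 3
strata = [[[10, 8, 2], [4, 8, 8]],
          [[12, 6, 2], [6, 6, 8]]]
z, p = extended_mh(strata, scores=[1, 2, 3])
```

With within-stratum ranks as scores (cell counts of 1 or 0), this same computation becomes the stratified extension of Wilcoxon's rank sum test that the abstract describes.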

10.
Quantifying dispersal within wild populations is an important but challenging task. Here we present a method to estimate contemporary, individual‐based dispersal distance from noninvasively collected samples using a specialized panel of 96 SNPs (single nucleotide polymorphisms). One main issue in conducting dispersal studies is the requirement for a high sampling resolution at a geographic scale appropriate for capturing the majority of dispersal events. In this study, fecal samples of brown bear (Ursus arctos) were collected by volunteer citizens, resulting in a high sampling resolution spanning over 45,000 km² in Gävleborg and Dalarna counties in Sweden. SNP genotypes were obtained for the unique individuals sampled (n = 433) and subsequently used to reconstruct pedigrees. A Mantel test for isolation by distance suggests that the sampling scale was appropriate for females but not for males, which are known to disperse long distances. Euclidean distance was estimated between mother and offspring pairs identified through the reconstructed pedigrees. The mean dispersal distance was 12.9 km (SE 3.2) and 33.8 km (SE 6.8) for females and males, respectively. These results were significantly different (Wilcoxon's rank‐sum test: P‐value = 0.02) and are in agreement with the previously identified pattern of male‐biased dispersal. Our results illustrate the potential of using a combination of noninvasively collected samples at high resolution and specialized SNPs for pedigree‐based dispersal models.

11.
Asymptotically correct 90% and 95% points are given for multiple comparisons with a control and for all pairwise comparisons of several independent samples of equal size from multinomial distributions. The test statistics are the maxima of the χ²-statistics for the single comparisons. For only two categories, the asymptotic distributions of these test statistics follow from Dunnett's many-one tests and Tukey's range test (cf. Miller, 1981). The percentage points for comparisons with a control are computed from the limit distribution of the test statistic under the overall hypothesis H0. The applicability of these bounds is investigated to some extent by simulation. The bounds can also be used to improve Holm's sequentially rejective Bonferroni test procedure (cf. Holm, 1979). The percentage points for all pairwise comparisons are obtained by large simulations. For 3 × 3 tables in particular, the limit distribution of the test statistic under H0 is also derived for samples of unequal size. These bounds, too, can improve the corresponding Bonferroni–Holm procedure. Finally, from Šidák's probability inequality for normal random vectors (cf. Šidák, 1967), a similar inequality is derived for dependent χ²-variables that is applicable to simultaneous χ²-tests.
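Simulating such percentage points is straightforward in the two-category (binomial) case: draw null data, take the maximum of the single-comparison χ² statistics against a shared control, and read off an empirical quantile. The group size, success probability, and replication count below are hypothetical choices, not the paper's settings.

```python
import numpy as np
from scipy.stats import chi2

def simulate_crit(k=3, n=100, p=0.5, alpha=0.05, reps=5000, seed=3):
    """Empirical (1 - alpha) point of max X^2 over k many-one comparisons
    of treatment groups with a shared control, all data null binomial(n, p)."""
    rng = np.random.default_rng(seed)
    maxima = np.empty(reps)
    for i in range(reps):
        control = rng.binomial(n, p)
        stats = []
        for _ in range(k):
            treat = rng.binomial(n, p)
            pooled = (control + treat) / (2 * n)
            if pooled in (0.0, 1.0):
                stats.append(0.0)
                continue
            # X^2 for the 2x2 table, i.e. the squared pooled two-proportion z
            z2 = ((treat - control) / n) ** 2 / (2 * pooled * (1 - pooled) / n)
            stats.append(z2)
        maxima[i] = max(stats)
    return np.quantile(maxima, 1 - alpha)

crit = simulate_crit()
bonferroni = chi2.isf(0.05 / 3, 1)     # conservative Bonferroni reference, about 5.73
single = chi2.isf(0.05, 1)             # anti-conservative single-test reference, 3.84
```

The simulated point sits between the two references, which is what makes these bounds an improvement over the Bonferroni–Holm critical values.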

12.
A generalization of the Behrens‐Fisher problem for two samples is examined in a nonparametric model. It is not assumed that the underlying distribution functions are continuous, so data with arbitrary ties can be handled. A rank test is considered in which the asymptotic variance is estimated consistently using the ranks over all observations as well as the ranks within each sample. The consistency of the estimator is derived in the appendix. For small samples (n1, n2 ≥ 10), a simple approximation by a central t‐distribution is suggested, where the degrees of freedom are taken from the Satterthwaite‐Smith‐Welch approximation to the parametric Behrens‐Fisher problem. A simulation study demonstrates that the Wilcoxon–Mann–Whitney test may be conservative or liberal depending on the ratio of the sample sizes and the variances of the underlying distribution functions. For the suggested approximation, however, the nominal level is maintained rather accurately. The suggested nonparametric procedure is applied to a data set from a clinical trial. Moreover, a confidence interval for the nonparametric treatment effect is given.
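A rank test of this kind, with a t-approximation whose degrees of freedom come from a Satterthwaite-Welch-type formula, is available in SciPy as the Brunner-Munzel test. The example data below are hypothetical, built to have both a location shift and unequal variances.

```python
import numpy as np
from scipy.stats import brunnermunzel, mannwhitneyu

rng = np.random.default_rng(5)
x = rng.normal(0.0, 1.0, 40)        # narrow control distribution
y = rng.normal(3.0, 2.0, 40)        # shifted and much more variable group

# Brunner-Munzel: t-approximation valid under unequal variances,
# the "nonparametric Behrens-Fisher" setting
bm = brunnermunzel(x, y)
# Wilcoxon-Mann-Whitney assumes exchangeability under H0 and can be
# conservative or liberal when variances and sample sizes differ
wmw = mannwhitneyu(x, y, alternative="two-sided")
```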

13.
Summary: Meta‐analysis seeks to combine the results of several experiments in order to improve the accuracy of decisions. It is common to use a test of homogeneity to determine whether the results of the several experiments are sufficiently similar to warrant combination into an overall result. Cochran's Q statistic is frequently used for this homogeneity test. It is often assumed that Q follows a chi‐square distribution under the null hypothesis of homogeneity, but it has long been known that this asymptotic distribution for Q is not accurate for moderate sample sizes. Here, we present an expansion for the mean of Q under the null hypothesis that is valid when the effect and the weight for each study depend on a single parameter, but for which neither normality nor independence of the effect and weight estimators is needed. This expansion represents an order O(1/n) correction to the usual chi‐square moment in the one‐parameter case. We apply the result to the homogeneity test for meta‐analyses in which the effects are measured by the standardized mean difference (Cohen's d statistic). In this situation, we recommend approximating the null distribution of Q by a chi‐square distribution with fractional degrees of freedom estimated from the data using our expansion for the mean of Q. The resulting homogeneity test is substantially more accurate than the currently used test. We provide a program for the necessary calculations at the Paper Information link at the Biometrics website, http://www.biometrics.tibs.org.
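The uncorrected test that the article improves upon is a few lines: Cochran's Q with inverse-variance weights, referred to a chi-square with k − 1 degrees of freedom. The study effects and variances below are hypothetical.

```python
import numpy as np
from scipy.stats import chi2

def cochran_q(effects, variances):
    """Cochran's Q homogeneity statistic with inverse-variance weights and
    the usual chi-square(k - 1) p-value; the article's correction replaces
    k - 1 with fractional degrees of freedom estimated from the data."""
    theta = np.asarray(effects, float)
    w = 1.0 / np.asarray(variances, float)
    theta_bar = np.sum(w * theta) / np.sum(w)
    Q = np.sum(w * (theta - theta_bar) ** 2)
    df = len(theta) - 1
    return Q, chi2.sf(Q, df)

# three hypothetical study effects, each with within-study variance 0.01
Q, p = cochran_q([0.2, 0.4, 0.6], [0.01, 0.01, 0.01])
```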

14.
Mating preference can be a driver of sexual selection and assortative mating and is, therefore, a key element in evolutionary dynamics. Positive mating preference by similarity is the tendency of the choosy individual to select a mate possessing a similar variant of a trait. Such preference can be modelled using Gaussian‐like mathematical functions that describe the strength of preference, but such functions cannot be applied to empirical data collected in the field. As a result, mating preference is traditionally estimated indirectly by the degree of assortative mating (using Pearson's correlation coefficient, r) in wild‐captured mating pairs. Unfortunately, r and similar coefficients are often biased because different variants of a given trait are nonrandomly distributed in the wild, and pooling mating pairs from such heterogeneous samples may lead to false‐positive results, termed the "scale‐of‐choice effect" (SCE). Here we provide two new estimators of mating preference (Crough and Cscaled) derived from Gaussian‐like functions that can be applied to empirical data. Computer simulations demonstrated that the r coefficient has robust estimation properties for mating preference but is severely affected by SCE; Crough has reasonable estimation properties and is little affected by SCE; and Cscaled has the best properties at infinite sample sizes and is unaffected by SCE but fails at realistic biological sample sizes. We recommend using Crough combined with the r coefficient to infer mating preference in future empirical studies.

15.
The determination of sample sizes for the comparison of k treatments with a control by means of the test of Dunnett (1955, 1964), as well as by means of the multiple t-test, is considered. Power in multiple comparisons can be defined in different ways; see Hochberg and Tamhane (1987). We derive formulas for the per-pair power, the any-pair power, and the all-pairs power for both one- and two-sided comparisons. Tables are provided that allow sample sizes to be determined for preassigned values of the power.

16.
Chris J. Lloyd. Biometrics 2010, 66(3), 975–982
Summary: Clinical trial data often come in the form of low‐dimensional tables of small counts. Standard approximate tests such as score and likelihood ratio tests are imperfect in several respects. First, they can give quite different answers from the same data. Second, the actual type I error can differ significantly from the nominal level, even for quite large sample sizes. Third, exact inferences based on these tests can be strongly nonmonotonic functions of the null parameter and lead to confidence sets that are discontiguous. There are two modern approaches to small-sample inference. One is to use so‐called higher-order asymptotics (Reid, 2003, Annals of Statistics 31, 1695–1731) to provide an explicit adjustment to the likelihood ratio statistic. The theory for this is complex, but the statistic is quick to compute. The second approach is to perform an exact calculation of significance assuming the nuisance parameters equal their null estimates (Lee and Young, 2005, Statistics and Probability Letters 71, 143–153), which is a kind of parametric bootstrap. The purpose of this article is to explain and evaluate these two methods for testing whether a difference in probabilities p2 − p1 exceeds a prechosen noninferiority margin δ0. On the basis of an extensive numerical study, we recommend bootstrap P‐values as superior to all other alternatives. First, they produce practically identical answers regardless of the basic test statistic chosen. Second, they have excellent size accuracy and higher power. Third, they vary much less erratically with the null parameter value δ0.
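The parametric-bootstrap idea can be sketched in a few lines: fix the nuisance parameter at an estimate satisfying the null, resample tables, and take the bootstrap p-value as the fraction of resampled statistics at least as extreme as the observed one. For brevity this sketch plugs in a simple constrained estimate rather than the restricted MLE the article evaluates; the counts and margin are hypothetical.

```python
import numpy as np

def wald_stat(x1, n1, x2, n2, delta0):
    """Wald statistic for the alternative H1: p2 - p1 > delta0."""
    p1, p2 = x1 / n1, x2 / n2
    se = np.sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    return (p2 - p1 - delta0) / se if se > 0 else 0.0

def bootstrap_pvalue(x1, n1, x2, n2, delta0, B=4000, seed=17):
    """Parametric bootstrap p-value: fix the nuisance parameter at a
    simple constrained estimate (p1 at its MLE, p2 on the null boundary
    p1 + delta0) and estimate P(T >= t_obs) by simulation.  A crude
    stand-in for the restricted-MLE version discussed in the article."""
    rng = np.random.default_rng(seed)
    t_obs = wald_stat(x1, n1, x2, n2, delta0)
    p1_null = x1 / n1
    p2_null = min(max(p1_null + delta0, 0.0), 1.0)   # boundary of H0
    sims1 = rng.binomial(n1, p1_null, B)
    sims2 = rng.binomial(n2, p2_null, B)
    t_sim = np.array([wald_stat(a, n1, b, n2, delta0)
                      for a, b in zip(sims1, sims2)])
    return float(np.mean(t_sim >= t_obs))

# hypothetical two-arm counts with margin delta0 = 0
p_strong = bootstrap_pvalue(70, 100, 82, 100, delta0=0.0)  # clear difference
p_weak = bootstrap_pvalue(70, 100, 72, 100, delta0=0.0)    # negligible difference
```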

17.
A general nonparametric approach to asymptotic multiple test procedures is proposed that is based on relative effects and covers continuous as well as discontinuous distributions. The results can be applied to all relevant multiple testing problems in the one‐way layout and include the well-known Steel tests as special cases. Moreover, a general estimator for the asymptotic covariance matrix is considered that is consistent even under the alternative. This estimator is used to derive simultaneous confidence intervals for the relative effects as well as a test procedure for the multiple nonparametric Behrens‐Fisher problem.
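The relative effect underlying this approach, p = P(X < Y) + 0.5 P(X = Y), is estimated from midranks, which is what lets discontinuous distributions (ties) be handled. A minimal sketch for two samples:

```python
import numpy as np
from scipy.stats import rankdata

def relative_effect(x, y):
    """Estimated relative effect p = P(X < Y) + 0.5 * P(X = Y), computed
    from midranks over the combined sample so that ties are handled."""
    combined = np.concatenate([x, y])
    ranks = rankdata(combined)               # midranks over all observations
    ny = len(y)
    mean_rank_y = ranks[len(x):].mean()
    return (mean_rank_y - (ny + 1) / 2.0) / len(x)

# identical samples show no tendency either way: relative effect 1/2
p_equal = relative_effect(np.array([1.0, 2.0, 3.0]), np.array([1.0, 2.0, 3.0]))
# complete separation: every y exceeds every x, relative effect 1
p_dom = relative_effect(np.array([1.0, 2.0]), np.array([3.0, 4.0]))
```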

18.
Summary: A time‐specific log‐linear regression method for quantile residual lifetime is proposed. Under the proposed regression model, any quantile of the time‐to‐event distribution among survivors beyond a certain time point can be associated with selected covariates under right censoring. Consistency and asymptotic normality of the regression estimator are established. An asymptotic test statistic is proposed to evaluate the covariate effects on the quantile residual lifetimes at a specific time point. Evaluation of the test statistic does not require estimating the variance–covariance matrix of the regression estimators, which involves the probability density function of the survival distribution under censoring. Simulation studies are performed to assess the finite-sample properties of the regression parameter estimator and the test statistic. The new regression method is applied to a breast cancer data set with long‐term follow‐up to estimate the patients' median residual lifetimes, adjusting for important prognostic factors.

19.
The three‐arm design with a test treatment, an active control, and a placebo group is the gold-standard design for non‐inferiority trials when it is ethically justifiable to expose patients to placebo. In this paper, we first use the closed testing principle to establish a hierarchical testing procedure for the multiple comparisons involved in the three‐arm design. For the effect preservation test we derive an explicit formula for the optimal allocation ratios. We propose a group sequential type design that naturally accommodates the hierarchical testing procedure. Under the proposed design, Monte Carlo simulations are conducted to evaluate the performance of the sequential effect preservation test when the variance of the test statistic is estimated from the restricted maximum likelihood estimators of the response rates under the null hypothesis. When there is uncertainty about the placebo response rate, the proposed design demonstrates better operating characteristics than the fixed sample design.

20.
A new phylogenetic comparative method is proposed, based on mapping two continuous characters onto a tree to generate data pairs for regression or correlation analysis; it resolves problems of multiple character reconstructions, phylogenetic dependence, and asynchronous responses (evolutionary lags). Data pairs are formed in two ways (tree‐down and tree‐up) by matching corresponding changes, Δx and Δy. Delayed responses (Δy occurring later in the tree than Δx) are penalized by weighting pairs using the nodal or branch‐length distance between Δx and Δy; immediate (same‐node) responses are given maximum weight. All combinations of character reconstructions (or a random sample thereof) are used to find the observed range of the weighted correlation coefficient r (or weighted slope b). This range is used as the test statistic, and the null distribution is generated by randomly reallocating the changes (Δx and Δy) in the topology. Unlike randomization of terminal values, this procedure complies with Generalized Monte Carlo requirements while saving considerable computation time. Phylogenetic dependence is avoided by randomization without data transformations, yielding acceptable type I error rates and statistical power. We show that ignoring delayed responses can lead to falsely nonsignificant results. Issues that arise from considering delayed responses based on optimization are discussed.

