Similar Articles
20 similar articles found (search time: 31 ms)
1.
A simple, closed-form jackknife estimate of the actual variance of the Mann-Whitney-Wilcoxon statistic, as opposed to the standard permutational variance under the test's null hypothesis, is derived; it avoids anticonservative performance in the presence of heteroscedasticity. The formulation allows modifications of the exponential scores test, of the censored-data tests of Gehan (1965), Peto & Peto (1977), and Prentice (1978), of Kendall's (1962) test for monotonic τ association, and of tests of ordered k-sample hypotheses. A Monte Carlo study supports recommendations for the jackknife procedures, but also shows their limited advantages in the exponential scores and censored-data versions. The paper thus extends results by Fligner & Policello (1981).
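The delete-one jackknife idea behind such a variance estimate can be sketched in a few lines (a generic, quadratic-time two-sample jackknife, not the paper's closed form; `mww_theta` is a hypothetical helper estimating P(X < Y), with ties counted as 1/2):

```python
from itertools import product

def mww_theta(x, y):
    # Mann-Whitney estimate of P(X < Y); ties contribute 1/2.
    n, m = len(x), len(y)
    u = sum(1.0 if xi < yj else 0.5 if xi == yj else 0.0
            for xi, yj in product(x, y))
    return u / (n * m)

def jackknife_var(x, y):
    # Delete-one jackknife: recompute theta with each observation removed
    # from its own sample, then combine the leave-one-out spreads.
    n, m = len(x), len(y)
    lx = [mww_theta(x[:i] + x[i + 1:], y) for i in range(n)]
    ly = [mww_theta(x, y[:j] + y[j + 1:]) for j in range(m)]
    vx = (n - 1) / n * sum((t - sum(lx) / n) ** 2 for t in lx)
    vy = (m - 1) / m * sum((t - sum(ly) / m) ** 2 for t in ly)
    return vx + vy
```

Unlike the permutational null variance, this estimate stays valid when the two samples have unequal spreads, which is the anticonservativeness the abstract addresses.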

2.
Sensitivity and specificity have traditionally been used to assess the performance of a diagnostic procedure. Procedures with both high sensitivity and high specificity are desirable, but they are frequently too expensive, hazardous, and/or difficult to operate. A less sophisticated procedure may be preferred if the loss of sensitivity or specificity is clinically acceptable. This paper addresses simultaneous testing of the sensitivity and specificity of an alternative test procedure against those of a reference procedure when a gold standard is present. The hypothesis is formulated as a compound hypothesis of two non-inferiority (one-sided equivalence) tests. We present an asymptotic test statistic based on the restricted maximum likelihood estimate, in the framework of comparing two correlated proportions under prospective and retrospective sampling designs. The sample size and power of the asymptotic test are derived. The actual type I error and power are calculated by enumerating the exact probabilities in the rejection region. For applications that require both high sensitivity and high specificity, large numbers of positive and negative subjects are needed. We also propose a weighted sum statistic as an alternative test, comparing a combined measure of sensitivity and specificity of the two procedures. The sample size determination is independent of the sampling plan for the two tests.
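The intersection-union structure of the compound hypothesis can be sketched as follows (a simplified Wald-type variant with a hypothetical margin `delta`; the paper's statistic uses a restricted-MLE variance rather than this plain paired-difference one):

```python
from math import sqrt

def noninf_z(b, c, n, delta):
    # One-sided non-inferiority Z for a paired difference of proportions
    # p_new - p_ref = (c - b)/n, where b and c are the discordant counts
    # among the n subjects of known true status. H0: difference <= -delta.
    d = (c - b) / n
    var = (b + c) / n ** 2 - (c - b) ** 2 / n ** 3
    return (d + delta) / sqrt(var) if var > 0 else float('inf')

def iut_noninferior(sens_disc, spec_disc, delta, z_alpha=1.645):
    # Intersection-union test: claim simultaneous non-inferiority only if
    # BOTH the sensitivity test (among diseased) and the specificity test
    # (among non-diseased) reject their own one-sided null.
    return (noninf_z(*sens_disc, delta) > z_alpha
            and noninf_z(*spec_disc, delta) > z_alpha)
```

Because both component tests must reject, the compound test controls its level at alpha without multiplicity adjustment.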

3.
Decady and Thomas (2000, Biometrics 56, 893-896) propose a first-order corrected Umesh-Loughin-Scherer statistic to test for association in an r x c contingency table with multiple column responses. Agresti and Liu (1999, Biometrics 55, 936-943) point out that such statistics are not invariant to the arbitrary designation of a zero or one to a positive response. This paper shows that, in addition, the proposed testing procedure does not hold the correct size when there are strong pairwise associations between responses.

4.
Anthony Almudevar. Biometrics, 2001, 57(4): 1080-1088
The problem of inferring kinship structure among a sample of individuals using genetic markers is considered, with the objective of developing hypothesis tests for genetic relatedness with nearly optimal properties. The class of tests considered are those constrained to be permutation invariant, which in this context means tests whose properties do not depend on the labeling of the individuals. This is appropriate when all individuals are to be treated identically from a statistical point of view. The approach taken is to derive tests that are most powerful for a permutation invariant alternative hypothesis that is, in some sense, close to a null hypothesis of mutual independence. This is analogous to the locally most powerful test commonly used in parametric inference. Although the resulting test statistic is a U-statistic, normal approximation theory is found to be inapplicable because of high skewness. As an alternative, a conditional procedure based on the most powerful test statistic is found to give accurate significance levels without much loss in power. Examples are given in which this type of test proves more powerful than a number of alternatives considered in the literature, including Queller and Goodnight's (1989) estimate of genetic relatedness, the average number of shared alleles (Blouin, 1996), and the number of feasible sibling triples (Almudevar and Field, 1999).

5.
Guan Y. Biometrics, 2008, 64(3): 800-806
Summary. We propose a formal method to test stationarity for spatial point processes. The proposed test statistic is based on the integrated squared deviations of observed counts of events from their means estimated under stationarity. We show that the resulting test statistic converges in distribution to a functional of a two-dimensional Brownian motion. To conduct the test, we compare the calculated statistic with the upper-tail critical values of this functional. Our method requires only a weak dependence condition on the process and does not assume any parametric model for it; as a result, it can be applied to a wide class of spatial point process models. We study the efficacy of the test through both simulations and applications to two real data examples that were previously suspected to be nonstationary on graphical evidence. Our test formally confirmed the suspected nonstationarity of both datasets.

6.
Briggs WM, Zaretzki R. Biometrics, 2008, 64(1): 250-256; discussion 256-261
Summary. We introduce the Skill Plot, a method that is directly relevant to a decision maker who must use a diagnostic test. In contrast to ROC curves, the skill curve allows easy graphical inspection of the optimal cutoff or decision rule for a diagnostic test. The skill curve and test also determine whether diagnoses based on this cutoff improve upon a naive forecast (of always present or of always absent). The skill measure makes it easy to compare the predictive utility of two different classifiers directly, in analogy to the area under the curve statistic in ROC analysis. Finally, this article shows that the skill-based cutoff inferred from the plot is equivalent to the cutoff obtained by optimizing the posterior odds in accordance with Bayesian decision theory. A method for constructing a confidence interval for this optimal point is presented and briefly discussed.
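The cutoff logic can be illustrated with a toy accuracy-based version (a hypothetical simplification: the actual skill measure can weight misclassification costs and prevalence, which this sketch ignores):

```python
def best_cutoff(scores, labels):
    # Scan candidate cutoffs (score >= c is called positive) and keep the
    # one with the highest accuracy. "Skill" here is the gain in accuracy
    # over the naive forecast that always predicts the majority class.
    n = len(labels)
    naive = max(sum(labels), n - sum(labels)) / n
    best = None
    for c in sorted(set(scores)):
        acc = sum((s >= c) == bool(y) for s, y in zip(scores, labels)) / n
        if best is None or acc > best[1]:
            best = (c, acc)
    cutoff, acc = best
    return cutoff, acc - naive  # positive skill => beats the naive forecast
```

A classifier whose best cutoff yields zero or negative skill adds nothing over always predicting the more common outcome, which is exactly the comparison the plot makes visible.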

7.
In survivorship modelling using the proportional hazards model of Cox (1972, Journal of the Royal Statistical Society, Series B 34, 187–220), it is often desired to test a subset of the vector of unknown regression parameters β in the expression for the hazard rate at time t. The likelihood ratio test statistic is well behaved in most situations but may be expensive to calculate. The Wald (1943, Transactions of the American Mathematical Society 54, 426–482) statistic is easier to calculate but has some drawbacks. In testing a single parameter in a binomial logit model, Hauck and Donner (1977, Journal of the American Statistical Association 72, 851–853) show that the Wald statistic decreases to zero the further the parameter estimate lies from the null, and that the asymptotic power of the test decreases to the significance level. The Wald statistic is extensively used in statistical software packages for survivorship modelling, so it is important to understand its behavior. The present work examines empirically the behavior of the Wald statistic under various departures from the null hypothesis and in the presence of Type I censoring and covariates in the model. It is shown via examples that the Wald statistic's behavior is not as aberrant as found for the logistic model. For the single-parameter case, the asymptotic non-null distribution of the Wald statistic is examined.
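The Hauck-Donner aberration for a single logit parameter is easy to reproduce numerically (a sketch for the one-sample binomial logit, not the Cox setting of the paper; the delta-method variance of the logit is roughly 1/(n p q)):

```python
from math import log

def wald_stat(p_hat, n):
    # Wald chi-square for H0: beta = 0 in a one-parameter binomial logit,
    # i.e. W = beta_hat^2 / var(beta_hat) with var(beta_hat) ~ 1/(n p q).
    q_hat = 1 - p_hat
    beta = log(p_hat / q_hat)
    return beta ** 2 * n * p_hat * q_hat

# The estimate moves FURTHER from the null (p = 0.5), yet W shrinks:
w_far = wald_stat(0.999, 100)
w_near = wald_stat(0.9, 100)
```

The point of the abstract is that, empirically, this collapse of the Wald statistic is far less severe in the Cox model than in the logistic case demonstrated here.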

8.
The one-degree-of-freedom Cochran-Armitage (CA) test statistic for linear trend has been widely applied in dose-response studies (e.g., anti-ulcer medications and short-term antibiotics, animal carcinogenicity bioassays, and occupational toxicant studies). This approximate statistic relies, however, on asymptotic theory that is reliable only when the sample sizes are reasonably large and well balanced across dose levels. For small, sparse, or skewed data, the asymptotic theory is suspect, and the exact conditional method (based on the CA statistic) seems to provide a dependable alternative. Unfortunately, the exact conditional method is practical only for the linear logistic model, for which the sufficient statistics for the regression coefficients can be obtained explicitly. In this article, a simple and efficient recursive polynomial multiplication algorithm is derived for an exact unconditional test (based on the CA statistic) for detecting a linear trend in proportions. The method is applicable to all monotone-trend models, including the logistic, probit, arcsine, extreme value, and one-hit models. We also show that this algorithm extends easily to exact unconditional power calculations for studies with up to a moderately large sample size. A real example illustrates the applicability of the proposed method.
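The CA statistic itself is short; the paper's contribution is the recursive algorithm for its exact unconditional distribution. For reference, a sketch of the asymptotic version (hypothetical helper; dose scores default to equal spacing):

```python
from math import sqrt

def cochran_armitage(cases, totals, scores=None):
    # One-df CA trend Z: covariance of the case counts with the dose
    # scores, standardized under the pooled-binomial null.
    k = len(cases)
    x = scores or list(range(k))
    N = sum(totals)
    p = sum(cases) / N
    xbar = sum(xi * ni for xi, ni in zip(x, totals)) / N
    num = sum(di * (xi - xbar) for di, xi in zip(cases, x))
    den = p * (1 - p) * sum(ni * (xi - xbar) ** 2
                            for ni, xi in zip(totals, x))
    return num / sqrt(den)  # compare with a standard normal
```

It is the normal reference for this Z that breaks down in small or unbalanced designs, motivating the exact unconditional computation.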

9.
BHATTI (1993) considered efficient estimation of a random coefficient model based on survey data. The main objective of this paper is to construct a one-sided test for the equicorrelation coefficient in the presence of random coefficients, using an optimal testing procedure. The test statistic is a ratio of quadratic forms in normal variables and is most powerful and point-optimal invariant.

10.
Parzen M, Lipsitz SR. Biometrics, 1999, 55(2): 580-584
In this paper, a global goodness-of-fit test statistic for a Cox regression model, which has an approximate chi-squared distribution when the model has been correctly specified, is proposed. Our goodness-of-fit statistic is global and has power to detect whether interactions or higher-order powers of covariates in the model are needed. The proposed statistic is similar to the Hosmer and Lemeshow (1980, Communications in Statistics A10, 1043-1069) goodness-of-fit statistic for binary data as well as Schoenfeld's (1980, Biometrika 67, 145-153) statistic for the Cox model. The methods are illustrated using data from a Mayo Clinic trial in primary biliary cirrhosis of the liver (Fleming and Harrington, 1991, Counting Processes and Survival Analysis), in which the outcome is the time until liver transplantation or death. There are 17 possible covariates. Two Cox proportional hazards models are fit to the data, and the proposed goodness-of-fit statistic is applied to the fitted models.
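For reference, the binary-data Hosmer-Lemeshow statistic that the proposal parallels can be sketched as (deciles-of-risk grouping; the paper adapts this grouping idea to the Cox model, which this sketch does not cover):

```python
def hosmer_lemeshow(y, p, g=10):
    # Hosmer-Lemeshow chi-square for binary outcomes: sort subjects by
    # fitted probability, split into g "deciles of risk", and compare
    # observed vs expected event counts within each group.
    pairs = sorted(zip(p, y))
    n = len(pairs)
    stat = 0.0
    for i in range(g):
        chunk = pairs[i * n // g:(i + 1) * n // g]
        if not chunk:
            continue
        ng = len(chunk)
        obs = sum(yi for _, yi in chunk)
        exp = sum(pi for pi, _ in chunk)
        pbar = exp / ng
        v = ng * pbar * (1 - pbar)
        if v > 0:
            stat += (obs - exp) ** 2 / v
    return stat  # refer to a chi-square with g - 2 df
```

A large value signals that observed event counts drift systematically away from the model's fitted probabilities across the risk groups.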

11.
In a comparative clinical trial, if the maximum information is adjusted on the basis of unblinded data, the usual test statistic should be avoided because of possible type I error inflation; an adaptive test can be used as an alternative. The usual point estimate of the treatment effect and the usual confidence interval should also be avoided. In this article, we construct a point estimate and a confidence interval motivated by an adaptive test statistic. The estimator is consistent for the treatment effect, and the confidence interval asymptotically has correct coverage probability.

12.
An exact rank test for two dependent samples based on overall mid-ranks is discussed that can be applied to metric as well as to ordinal data. The exact conditional distribution of the test statistic, given the observed vector of rank differences, is determined. A recursion formula is given, together with a fast shift algorithm in SAS/IML code. A simulation study demonstrates that the paired rank test can be more powerful than other tests for paired samples. Finally, the test is applied to a psychiatric trial with longitudinal ordinal data.
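The shift (polynomial-multiplication) algorithm behind such exact conditional distributions fits in a few lines. This is a generic signed-rank-sum version in Python rather than the paper's SAS/IML code; mid-ranks simply enter as possibly non-integer values of r:

```python
from collections import Counter

def exact_signed_rank_pvalue(ranks, observed):
    # Shift algorithm: build the exact conditional distribution of the
    # positive-rank sum by convolving one (1 + z^r) factor per rank,
    # i.e. each rank independently contributes 0 or r under the null.
    dist = Counter({0: 1})
    for r in ranks:
        nxt = Counter()
        for s, c in dist.items():
            nxt[s] += c        # sign "-": contributes 0
            nxt[s + r] += c    # sign "+": contributes r
        dist = nxt
    total = sum(dist.values())  # 2 ** len(ranks) sign assignments
    tail = sum(c for s, c in dist.items() if s >= observed)
    return tail / total  # one-sided exact p-value
```

Each pass "shifts" the current distribution by r and adds it to itself, so the cost grows with the support of the statistic rather than with the 2^n sign assignments.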

13.
Motivated by investigating the relationship between progesterone and the days in a menstrual cycle in a longitudinal study, we propose a multikink quantile regression model for longitudinal data analysis. It relaxes the linearity condition and assumes different regression forms in different regions of the domain of the threshold covariate. We first propose a multikink quantile regression for longitudinal data, with two estimation procedures for the regression coefficients and the kink-point locations: a computationally efficient profile estimator under the working-independence framework, and an estimator that accounts for within-subject correlations via the unbiased generalized estimating equation approach. The selection consistency of the number of kink points and the asymptotic normality of the two proposed estimators are established. Second, we construct a rank score test, based on partial subgradients, for the existence of a kink effect in longitudinal studies; both the null distribution and the local alternative distribution of the test statistic are derived. Simulation studies show that the proposed methods have excellent finite-sample performance. In the application to the longitudinal progesterone data, we identify two kink points in the progesterone curves over different quantiles and observe that the progesterone level remains stable before the day of ovulation, rises quickly for 5 to 6 days after ovulation, and then stabilizes again or drops slightly.

14.
Leisenring W, Alonzo T, Pepe MS. Biometrics, 2000, 56(2): 345-351
Positive and negative predictive values of a diagnostic test are key clinically relevant measures of test accuracy. Surprisingly, statistical methods for comparing tests with regard to these parameters have not been available for the most common study design, in which each test is applied to each study individual. In this paper, we propose a statistic for comparing the predictive values of two diagnostic tests under this paired study design. The proposed statistic is a score statistic derived from a marginal regression model and bears some relation to McNemar's statistic. Just as McNemar's statistic can be used to compare the sensitivities and specificities of diagnostic tests, parameters that condition on disease status, our statistic can be considered an analog of McNemar's test for the problem of comparing predictive values, parameters that condition on test outcome. We report the results of a simulation study designed to examine the properties of this test under a variety of conditions. The method is illustrated with data from a study of methods for diagnosis of coronary artery disease.
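The sensitivity/specificity side of the analogy is the classic McNemar comparison, sketched below (a hypothetical helper; the paper's score statistic for predictive values, which condition on test outcome instead, is more involved and is not reproduced here):

```python
def mcnemar_from_pairs(test_a, test_b, truth, condition):
    # Compare two binary tests applied to the same subjects: restrict to
    # subjects with the given true status (1 for a sensitivity comparison,
    # 0 for specificity) and count discordant pairs, as McNemar's test does.
    b = c = 0
    for ra, rb, t in zip(test_a, test_b, truth):
        if t != condition:
            continue
        if ra == 1 and rb == 0:
            b += 1
        elif ra == 0 and rb == 1:
            c += 1
    # McNemar chi-square on 1 df; 0.0 when there are no discordant pairs.
    return (b - c) ** 2 / (b + c) if b + c else 0.0
```

Conditioning on `truth` is exactly what makes this a sensitivity/specificity comparison; the paper's statistic flips the conditioning to the test result.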

15.
A "gold" standard test, providing definitive verification of disease status, may be quite invasive or expensive. Current technological advances provide less invasive, or less expensive, diagnostic tests. Ideally, a diagnostic test is evaluated by comparing it with a definitive gold standard test. However, the decision to perform the gold standard test to establish the presence or absence of disease is often influenced by the results of the diagnostic test, along with other measured or unmeasured risk factors. If only data from patients who received the gold standard test are used to assess test performance, the commonly used measures of diagnostic test performance, sensitivity and specificity, are likely to be biased: sensitivity would often be higher, and specificity lower, than the true values. This bias is called verification bias. Without adjustment for verification bias, one may introduce into medical practice a diagnostic test with apparently, but not truly, high sensitivity. In this article, verification bias is treated as a missing covariate problem. We propose a flexible modeling and computational framework for evaluating the performance of a diagnostic test, with adjustment for nonignorable verification bias. The computational method can be used with any software that can repetitively call a logistic regression module. The approach is likelihood based and allows categorical or continuous covariates. An explicit formula for the observed information matrix is presented, so that standard errors of estimated parameters are easy to compute. The methodology is illustrated with a cardiology data example, and we perform a sensitivity analysis of the dependence of the verification selection process on disease status.
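A quick simulation shows the direction of the bias described above (all numbers are hypothetical: true sensitivity and specificity 0.8, and verification probabilities of 0.9 after a positive screen versus 0.2 after a negative one):

```python
import random

def naive_accuracy_under_verification(n=20000, sens=0.8, spec=0.8,
                                      p_disease=0.3, seed=1):
    # Simulate verification bias: the gold standard is obtained with
    # probability 0.9 after a positive screening test but only 0.2 after
    # a negative one; naive sensitivity/specificity then use verified
    # subjects only.
    rng = random.Random(seed)
    tp = fn = tn = fp = 0
    for _ in range(n):
        d = rng.random() < p_disease                     # true disease status
        t = rng.random() < (sens if d else 1 - spec)     # screening result
        verified = rng.random() < (0.9 if t else 0.2)    # gold standard done?
        if not verified:
            continue
        if d and t:
            tp += 1
        elif d and not t:
            fn += 1
        elif not d and not t:
            tn += 1
        else:
            fp += 1
    return tp / (tp + fn), tn / (tn + fp)  # naive (biased) sens, spec
```

With these settings the naive sensitivity comes out near 0.95 and the naive specificity near 0.47, reproducing the "sensitivity too high, specificity too low" pattern the abstract warns about.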

16.
MOTIVATION: Recently a class of nonparametric statistical methods, including the empirical Bayes (EB) method, the significance analysis of microarrays (SAM) method, and the mixture model method (MMM), have been proposed to detect differential gene expression in replicated microarray experiments conducted under two conditions. All of these methods depend on constructing a test statistic Z and a so-called null statistic z, where z is used to provide a reference distribution for Z so that statistical inference can be carried out. A common way of constructing z is to apply Z to randomly permuted data. Here we point out that the distribution of z may not approximate the null distribution of Z well, leading to possibly too conservative inference; this observation may apply to other permutation-based nonparametric methods. We propose a new method of constructing a null statistic that aims to estimate the null distribution of a test statistic directly. RESULTS: Using simulated data and real data, we assess and compare the performance of the existing method and our new method when applied in EB, SAM, and MMM. Some interesting findings on the operating characteristics of EB, SAM, and MMM are also reported. Finally, by combining the ideas of SAM and MMM, we outline a simple nonparametric method based on the direct use of a test statistic and a null statistic.
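The Z-versus-z construction being criticized can be sketched generically (hypothetical helper names, with a plain difference-in-means as a stand-in for the test statistic Z):

```python
import random

def two_sample_z(x, y):
    # Stand-in test statistic Z: difference in group means.
    return sum(x) / len(x) - sum(y) / len(y)

def permutation_null(x, y, n_perm=1000, seed=0):
    # The common construction of the null statistic z: relabel (permute)
    # the pooled data and re-apply the same statistic. The abstract's
    # point is that the distribution of these z values need not match the
    # true null distribution of Z.
    rng = random.Random(seed)
    pooled = list(x) + list(y)
    nulls = []
    for _ in range(n_perm):
        rng.shuffle(pooled)
        nulls.append(two_sample_z(pooled[:len(x)], pooled[len(x):]))
    return nulls
```

An observed Z is then compared against the empirical distribution of `nulls`; the paper's alternative is to estimate the null distribution of Z directly instead of relying on this permutation surrogate.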

17.
Mehrotra DV, Chan IS, Berger RL. Biometrics, 2003, 59(2): 441-450
Fisher's exact test for comparing response proportions in a randomized experiment can be overly conservative when the group sizes are small or when the response proportions are close to zero or one. This is primarily because the null distribution of the test statistic becomes too discrete, partly a consequence of the inference being conditional on the total number of responders. Accordingly, exact unconditional procedures have gained in popularity, on the premise that power will increase because the null distribution of the test statistic will presumably be less discrete. However, we caution researchers that a poor choice of test statistic for exact unconditional inference can actually result in a substantially less powerful analysis than Fisher's conditional test. To illustrate, we study a real example and provide exact test size and power results for several competing tests, for both balanced and unbalanced designs. Our results reveal that Fisher's test generally outperforms exact unconditional tests based on using as the test statistic either the observed difference in proportions, or the observed difference divided by its estimated standard error under the alternative hypothesis (the latter for unbalanced designs only). On the other hand, the exact unconditional test based on the observed difference divided by its estimated standard error under the null hypothesis (the score statistic) outperforms Fisher's test and is recommended. Boschloo's test, in which the p-value from Fisher's test is used as the test statistic in an exact unconditional test, is uniformly more powerful than Fisher's test and is also recommended.
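Both tests are easy to prototype exactly for small tables (a pure-Python sketch; `grid` is a hypothetical discretization of the nuisance response probability, so the Boschloo value is a grid approximation of the supremum):

```python
from math import comb

def fisher_exact_onesided(x1, n1, x2, n2):
    # One-sided Fisher p-value: condition on the total number of
    # responders s = x1 + x2 and sum hypergeometric probabilities for
    # outcomes at least as extreme (as large) as the observed x1.
    s = x1 + x2
    denom = comb(n1 + n2, s)
    hi = min(n1, s)
    return sum(comb(n1, k) * comb(n2, s - k)
               for k in range(x1, hi + 1)) / denom

def boschloo_pvalue(x1, n1, x2, n2, grid=99):
    # Boschloo's exact unconditional test: use the Fisher p-value as the
    # test statistic and take the worst case over a grid of common
    # response probabilities p under the null.
    obs = fisher_exact_onesided(x1, n1, x2, n2)
    worst = 0.0
    for i in range(1, grid + 1):
        p = i / (grid + 1)
        tot = 0.0
        for k1 in range(n1 + 1):
            for k2 in range(n2 + 1):
                if fisher_exact_onesided(k1, n1, k2, n2) <= obs:
                    tot += (comb(n1, k1) * p ** k1 * (1 - p) ** (n1 - k1)
                            * comb(n2, k2) * p ** k2 * (1 - p) ** (n2 - k2))
        worst = max(worst, tot)
    return worst
```

Because rejection is no longer conditional on the observed margin, the Boschloo p-value is never larger than Fisher's, which is the source of its uniform power advantage.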

18.
Summary. In this article, we consider problems with correlated data that can be summarized in a 2 × 2 table with a structural zero in one of the off-diagonal cells. Data of this kind sometimes arise in infectious disease studies and two-step procedure studies. Lui (1998, Biometrics 54, 706–711) considered confidence interval estimation of the rate ratio based on Fieller-type, Wald-type, and logarithmic-transformation statistics. We reexamine the same problem in the context of constructing a confidence interval for the false-negative rate ratio in diagnostic performance when combining two diagnostic tests. We propose a score statistic for testing the null hypothesis of nonunity false-negative rate ratio, and discuss score test-based confidence interval construction for the false-negative rate ratio. Simulation studies compare the performance of the newly derived score test statistic and existing statistics for small to moderate sample sizes. In terms of confidence interval construction, our asymptotic score test-based interval estimator has a significantly shorter expected width, with coverage probability close to the anticipated confidence level; in terms of hypothesis testing, the asymptotic score test procedure has an actual type I error rate close to the pre-assigned nominal level. We illustrate our methodologies with real examples from a clinical laboratory study and a cancer study.

19.
Lui KJ, Kelly C. Biometrics, 2000, 56(1): 309-315
Lipsitz et al. (1998, Biometrics 54, 148-160) discussed testing the homogeneity of the risk difference for a series of 2 × 2 tables. They proposed and evaluated several weighted test statistics, including the commonly used weighted least squares test statistic. Here we suggest several important improvements on these test statistics. First, we propose using one-sided analogues of the procedures proposed by Lipsitz et al., because the null hypothesis of homogeneity should be rejected only when the variation of the estimated risk differences between centers is large. Second, we generalize their study by redesigning the simulations to include the situations considered by Lipsitz et al. (1998) as special cases. Third, we consider a logarithmic transformation of the weighted least squares test statistic to improve the normal approximation of its sampling distribution. On the basis of Monte Carlo simulations, we note that, as long as the mean treatment group size per table is moderate or large (≥ 16), this simple test statistic, in conjunction with the commonly used adjustment procedure for sparse data, can be useful when the number of 2 × 2 tables is small or moderate (≤ 32). In these situations, in fact, our proposed method generally outperforms all the statistics considered by Lipsitz et al. Finally, we include a general guideline on which test statistic should be used in a variety of situations.
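The baseline weighted least squares statistic being improved on can be sketched as follows (basic two-sided form, a hypothetical minimal version; the paper's recommendations add a one-sided analogue, a log transformation, and a sparse-data continuity adjustment, none of which are included here):

```python
def wls_homogeneity(tables):
    # tables: list of (x1, n1, x2, n2) per center. Weighted least squares
    # chi-square for homogeneity of risk differences: weight each center's
    # difference by its inverse estimated variance and measure the spread
    # around the pooled weighted mean. Assumes at least one table with a
    # nondegenerate variance.
    ds, ws = [], []
    for x1, n1, x2, n2 in tables:
        p1, p2 = x1 / n1, x2 / n2
        var = p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2
        ds.append(p1 - p2)
        ws.append(1 / var if var > 0 else 0.0)
    dbar = sum(w * d for w, d in zip(ws, ds)) / sum(ws)
    # Refer to a chi-square with (number of tables - 1) df.
    return sum(w * (d - dbar) ** 2 for w, d in zip(ws, ds))
```

The statistic is zero when all centers show the same risk difference and grows with between-center variation, which is why the paper argues only its upper (one-sided) tail should trigger rejection.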

20.
Dukic V, Gatsonis C. Biometrics, 2003, 59(4): 936-946
Current meta-analytic methods for diagnostic test accuracy are generally applicable to a selection of studies reporting only estimates of sensitivity and specificity or, at most, to studies whose results are reported using an equal number of ordered categories. In this article, we propose a new meta-analytic method to evaluate test accuracy and arrive at a summary receiver operating characteristic (ROC) curve for a collection of studies evaluating diagnostic tests, even when test results are reported in unequal numbers of nonnested ordered categories. We discuss both non-Bayesian and Bayesian formulations of the approach. In the Bayesian setting, we propose several ways to construct summary ROC curves and their credible bands. We illustrate the approach with data from a recently published meta-analysis evaluating a single serum progesterone test for diagnosing pregnancy failure.
