Similar articles
20 similar articles found.
1.
A modified chi-squared statistic Z is proposed for testing hypotheses about category occupancy rates for individuals distributed by clusters, when the cluster sizes are observed. This statistic is the Pearson chi-square statistic based on the individuals' counts divided by 1 + M*, where M* is the mean number of other individuals per cluster per individual. The kind of alternative hypothesis for which the Z-based test compares favourably in power with the Pearson chi-square test based on the cluster frequencies is given. However, we prove that the latter test is more powerful than the former as long as the equidistribution of the random choice vectors is assumed.
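
The construction in this abstract can be illustrated with a minimal sketch. It assumes that M* is computed as the sum of n(n - 1) over clusters divided by the total number of individuals, which is one reading of "mean number of other individuals per cluster per individual"; the abstract does not give the formula. All function names are hypothetical.

```python
def pearson_chi_square(observed, expected_probs):
    """Pearson chi-square of category counts against hypothesized rates."""
    total = sum(observed)
    return sum(
        (o - total * p) ** 2 / (total * p)
        for o, p in zip(observed, expected_probs)
    )


def mean_other_individuals(cluster_sizes):
    """Assumed M*: average, over individuals, of the number of
    other individuals sharing that individual's cluster."""
    total = sum(cluster_sizes)
    return sum(n * (n - 1) for n in cluster_sizes) / total


def modified_chi_square(observed, expected_probs, cluster_sizes):
    """Z = Pearson chi-square on individual counts divided by 1 + M*."""
    m_star = mean_other_individuals(cluster_sizes)
    return pearson_chi_square(observed, expected_probs) / (1 + m_star)
```

Under this reading, singleton clusters give M* = 0, so Z reduces to the ordinary Pearson statistic, and larger clusters shrink it.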

2.
A distribution-free test is considered for testing treatment effects in block designs with different cell frequencies. A test statistic that is a function of the treatment ranks is proposed; it is distributed as chi-square for large samples. The null distribution of the test statistic is obtained. The entire procedure is illustrated with a numerical example.

3.
Donner A, Klar N, Zou G. Biometrics, 2004, 60(4): 919-925
Split-cluster designs are frequently used in the health sciences when naturally occurring clusters such as multiple sites or organs in the same subject are assigned to different treatments. However, statistical methods for the analysis of binary data arising from such designs are not well developed. The purpose of this article is to propose and evaluate a new procedure for testing the equality of event rates in a design dividing each of k clusters into two segments having multiple sites (e.g., teeth, lesions). The test statistic proposed is a generalization of a previously published procedure based on adjusting the standard Pearson chi-square statistic, but can also be derived as a score test using the approach of generalized estimating equations.

4.
With large amounts of experimental data, modern molecular biology needs appropriate methods to deal with biological sequences. In this work, we apply a statistical method (Pearson's chi-square test) to recognize the signals that appear in the whole genome of Escherichia coli. To show the effectiveness of the method, we compare Pearson's chi-square test with linguistic complexity on the complete genome of E. coli. The results suggest that Pearson's chi-square test is an efficient method for distinguishing genes (coding regions) from pseudogenes (noncoding regions). On the other hand, the performance of linguistic complexity is much lower than that of the chi-square test. We also use Pearson's chi-square test to determine which parts of the Open Reading Frame (ORF) have a significant effect on discriminating genes from pseudogenes. Moreover, different complexity measures and Pearson's chi-square test are applied to the genes with high values of Pearson's chi-square statistic. We also compute the measures on homologs of these genes. The results illustrate that there is a region near the start codon with a high value of the chi-square statistic and low complexity that is conserved between homologous genes.

5.
A modified chi-square test for testing the equality of two multinomial populations against an ordering restricted alternative in one sample and two sample cases is constructed. The relation between a concept of dependence called dependence by chi-square and stochastic ordering is established. A tabulation of the asymptotic distribution of the test statistic under the null hypothesis is given. Simulations are used to compare the power of this test with the power of the likelihood ratio test of stochastic ordering of the two multinomial populations.

6.
A multi-sample slippage test based on ordered observations is given. The test statistic is based on the sum of ranks of the sample. The probability distribution of the test statistic is worked out for small samples, and it turns out to follow a chi-square distribution for large samples. The analytical procedure is illustrated with a numerical example.

7.
Bilder CR, Loughin TM. Biometrics, 2004, 60(1): 241-248
Questions that ask respondents to "choose all that apply" from a set of items occur frequently in surveys. Categorical variables that summarize this type of survey data are called both pick any/c variables and multiple-response categorical variables. It is often of interest to test for independence between two categorical variables. When both categorical variables can have multiple responses, traditional Pearson chi-square tests for independence should not be used because of the within-subject dependence among responses. An intuitively constructed version of the Pearson statistic is proposed to perform the test using bootstrap procedures to approximate its sampling distribution. First- and second-order adjustments to the proposed statistic are given in order to use a chi-square distribution approximation. A Bonferroni adjustment is proposed to perform the test when the joint set of responses for individual subjects is unavailable. Simulations show that the bootstrap procedures hold the correct size more consistently than the other procedures.

8.
The association between a binary variable Y and a variable X having an at least ordinal measurement scale might be examined by selecting a cutpoint in the range of X and then performing an association test for the resulting 2 × 2 contingency table using the chi-square statistic. The distribution of the maximally selected chi-square statistic (i.e., the maximal chi-square statistic over all possible cutpoints) under the null hypothesis of no association between X and Y is different from the known chi-square distribution. In the last decades, this topic has been extensively studied for continuous X variables, but not for non-continuous variables of at least ordinal measurement scale (which include, e.g., classical ordinal or discretized continuous variables). In this paper, we suggest an exact method to determine the finite-sample distribution of maximally selected chi-square statistics in this context. This novel approach can be seen as a method to measure the association between a binary variable and variables having an at least ordinal scale of different types (ordinal, discretized continuous, etc.). As an illustration, this method is applied to a new data set describing pregnancy and birth for 811 babies.
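
To make the statistic itself concrete, the following is a minimal sketch of a maximally selected chi-square: the 2 × 2 chi-square is computed at every candidate cutpoint and the largest value is kept. It does not implement the exact finite-sample distribution proposed in the paper, and the function names and the 0/1 coding of Y are assumptions made for illustration.

```python
def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for the table [[a, b], [c, d]]."""
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        return 0.0  # degenerate table: an empty row or column
    return n * (a * d - b * c) ** 2 / denom


def maximally_selected_chi_square(x, y):
    """Return (largest chi-square, cutpoint) over all splits X <= cut vs X > cut.

    y is assumed to be coded 0/1. Ties keep the smallest cutpoint.
    """
    best_stat, best_cut = 0.0, None
    for cut in sorted(set(x))[:-1]:  # the largest value would leave one side empty
        a = sum(1 for xi, yi in zip(x, y) if xi <= cut and yi == 1)
        b = sum(1 for xi, yi in zip(x, y) if xi <= cut and yi == 0)
        c = sum(1 for xi, yi in zip(x, y) if xi > cut and yi == 1)
        d = sum(1 for xi, yi in zip(x, y) if xi > cut and yi == 0)
        stat = chi_square_2x2(a, b, c, d)
        if stat > best_stat:
            best_stat, best_cut = stat, cut
    return best_stat, best_cut
```

Because the maximum is taken over several correlated tests, comparing the returned value with an ordinary chi-square critical value would overstate significance, which is the problem the abstract addresses.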

9.
As a useful tool for geographical cluster detection of events, the spatial scan statistic is widely applied in many fields and plays an increasingly important role. The classic version of the spatial scan statistic for a binary outcome was developed by Kulldorff, based on the Bernoulli or the Poisson probability model. In this paper, we apply the Hypergeometric probability model to construct the likelihood function under the null hypothesis. Compared with existing methods, the likelihood function under the null hypothesis is an alternative and indirect method to identify the potential cluster, and the test statistic is the extreme value of the likelihood function. As in Kulldorff's methods, we adopt a Monte Carlo test to assess significance. Both methods are applied to detect spatial clusters of Japanese encephalitis in Sichuan province, China, in 2009, and the detected clusters are identical. A simulation on independent benchmark data indicates that the test statistic based on the Hypergeometric model outperforms Kulldorff's statistics for clusters of high population density or large size; otherwise Kulldorff's statistics are superior.

10.
J W Choi, R B McHugh. Biometrics, 1989, 45(3): 979-996
Situations often arise in a large-scale household survey where a complex probability sample of clusters rather than of individuals is drawn from a large population. Typically, the clusters of such complex samples include a number of correlated members. The responses of these members are then weighted to obtain estimates for the population. Such weighted data are commonly published by the National Center for Health Statistics and other U.S. federal agencies. Frequently, problems arise when such data are tested by usual chi-square test statistics for goodness of fit or independence. Researchers have discovered that the usual chi-square tests provide spuriously inflated results when applied to cluster samples and that new methods are required to correct such problems. This paper proposes a strategy for a goodness-of-fit or independence test based on correlated and weighted data arising in cluster samples, and provides a factor that validly reduces the inflation of the usual chi-square statistics. This method is applied to the chronic condition data collected from the St Paul-Minneapolis, Minnesota, primary sampling unit (PSU) during the 1975 National Health Interview Survey (NHIS). This analysis, together with simulation studies presented elsewhere, provides evidence that the usual chi-square statistics from such data can be corrected for the impacts of clustering and weighting by use of the proposed reduction factor.

11.
We address the problem of tests of homogeneity in two-way contingency tables in case-control studies when the case category is subdivided into k subcategories. In this situation, we have two cells with large frequencies and 2 × k cells with frequencies that become small as k increases. We propose two ad hoc statistics in which a statistic for the sparse cells is combined with a statistic for the cells with large frequencies. We will study these tests along with the Pearson test (using a chi-square approximation) in a Monte Carlo simulation study. Two sets of null hypothesis models and two sets of alternative hypothesis models are considered. The best test for the models considered is the usual Pearson test (using an approximate chi-square distribution) although the ad hoc models are more powerful under one alternative model considered.

12.
A multiple testing procedure for clinical trials. (Cited 57 times in total: 0 self-citations, 57 by others)
A multiple testing procedure is proposed for comparing two treatments when response to treatment is both dichotomous (i.e., success or failure) and immediate. The proposed test statistic for each test is the usual (Pearson) chi-square statistic based on all data collected to that point. The maximum number (N) of tests and the number (m1 + m2) of observations collected between successive tests are fixed in advance. The overall size of the procedure is shown to be controlled with virtually the same accuracy as the single sample chi-square test based on N(m1 + m2) observations. The power is also found to be virtually the same. However, by affording the opportunity to terminate early when one treatment performs markedly better than the other, the multiple testing procedure may eliminate the ethical dilemmas that often accompany clinical trials.
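
The repeated-testing idea can be sketched as follows: after each batch of m1 + m2 observations, the Pearson chi-square is recomputed on all data accumulated so far. The abstract does not state the stopping boundaries, so the `boundaries` argument below is a hypothetical, caller-supplied list and the numbers used with it are placeholders, not values from the paper. All function names are likewise illustrative.

```python
def chi_square_2x2(s1, f1, s2, f2):
    """Pearson chi-square for successes/failures in treatments 1 and 2."""
    n = s1 + f1 + s2 + f2
    denom = (s1 + f1) * (s2 + f2) * (s1 + s2) * (f1 + f2)
    if denom == 0:
        return 0.0  # degenerate table
    return n * (s1 * f2 - f1 * s2) ** 2 / denom


def cumulative_chi_squares(batches):
    """Chi-square after each batch, always using all data collected so far.

    Each batch is a tuple (successes_1, failures_1, successes_2, failures_2).
    """
    totals = [0, 0, 0, 0]
    stats = []
    for batch in batches:
        totals = [t + b for t, b in zip(totals, batch)]
        stats.append(chi_square_2x2(*totals))
    return stats


def first_crossing(stats, boundaries):
    """1-based index of the first test whose statistic reaches its boundary, else None.

    `boundaries` is an assumption: one critical value per test, chosen by the caller.
    """
    for stage, (stat, bound) in enumerate(zip(stats, boundaries), start=1):
        if stat >= bound:
            return stage
    return None
```

For example, `first_crossing(cumulative_chi_squares(batches), boundaries)` returns the test at which the trial would stop early, or `None` if all N tests are completed without crossing.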

13.
Kim W, Gordon D, Sebat J, Ye KQ, Finch SJ. PLoS ONE, 2008, 3(10): e3475
Recent studies suggest that copy number polymorphisms (CNPs) may play an important role in disease susceptibility and onset. Currently, the detection of CNPs mainly depends on microarray technology. For case-control studies, conventionally, subjects are assigned to a specific CNP category based on the continuous quantitative measure produced by microarray experiments, and cases and controls are then compared using a chi-square test of independence. The purpose of this work is to specify the likelihood ratio test statistic (LRTS) for case-control sampling design based on the underlying continuous quantitative measurement, and to assess its power and relative efficiency (as compared to the chi-square test of independence on CNP counts). The sample size and power formulas of both methods are given. For the latter, the CNPs are classified using the Bayesian classification rule. The LRTS is more powerful than this chi-square test for the alternatives considered, especially alternatives in which the at-risk CNP categories have low frequencies. An example of the application of the LRTS is given for a comparison of CNP distributions in individuals of Caucasian or Taiwanese ethnicity, where the LRTS appears to be more powerful than the chi-square test, possibly due to misclassification of the most common CNP category into a less common category.

14.
Tango T. Biometrics, 2007, 63(1): 119-127
A class of tests with quadratic forms for detecting spatial clustering of health events based on case-control point data is proposed. It includes Cuzick and Edwards's test statistic (1990, Journal of the Royal Statistical Society, Series B 52, 73-104). Although they used the property of asymptotic normality of the test statistic, we show that such an approximation is generally poor for moderately large sample sizes. Instead, we suggest a central chi-square distribution as a better approximation to the asymptotic distribution of the test statistic. Furthermore, not only to estimate the optimal value of the unknown parameter on the scale of cluster but also to adjust for multiple testing due to repeating the procedure by changing the parameter value, we propose the minimum of the profile p-value of the test statistic for the parameter as an integrated test statistic. We also provide a statistic to estimate the areas or cases which make large contributions to significant clustering. The proposed methods are illustrated with a data set concerning the locations of cases of childhood leukemia and lymphoma and another on early medieval grave site locations consisting of affected and nonaffected grave sites.

15.
S R Paul, K Y Liang, S G Self. Biometrics, 1989, 45(1): 231-236
This paper is concerned with testing the multinomial (binomial) assumption against the Dirichlet-multinomial (beta-binomial) alternatives. In particular, we discuss the distribution of the asymptotic likelihood ratio (LR) test and obtain the C(alpha) goodness-of-fit test statistic. The inadequacy of the regular chi-square approximation to the LR test is supported by some Monte Carlo experiments. The C(alpha) test is recommended based on empirical significance level and power and also computational simplicity. Two examples are given.

16.
When a set of populations are compared in respect of gene frequencies, and the chi-square test of heterogeneity is found to be significant, it is pertinent to find out whether the heterogeneity can be explained by a few linear combinations of the gene frequencies, and the total heterogeneity chi-square value can be partitioned as the sum of heterogeneity chi-square values contributed by the linear combinations. The present report describes such a method, and the linear combination that explains the maximum heterogeneity is called the principal axis. An application of this method is presented to find clusters of 31 Mongoloid tribal populations of eastern India using ABO gene frequency data.

17.
J Nam, J J Gart. Biometrics, 1985, 41(2): 455-466
The general method of the discrepancy or heterogeneity chi-square is applied to ABO-like data in which there are no observed double blanks in either the disease or the control group. When the recessive gene frequency is assumed zero, this method leads to an approximate chi-square test identical to that suggested by Smouse and Williams (1982, Biometrics 38, 757-768). When this assumption is relaxed, there arise two cases which are determined by whether the maximum likelihood estimate of this frequency is zero or not. It is shown that the value of the simple score statistic of Gart and Nam (1984, Biometrics 40, 887-894) discriminates between the two cases. The various omnibus test statistics for comparing groups are shown to differ little in several practical examples. However, under the more general assumption the appropriate degrees of freedom is one more than the number previously suggested.

18.
Halpern AL. Biometrics, 1999, 55(4): 1044-1050
A novel changepoint statistic based on the minimum value, over possible changepoint locations, of Fisher's Exact Test, is introduced. Specific points in the exact distribution of the minimally selected Fisher's value may be rapidly calculated as a lattice-path counting problem via known recurrence methods. The test is compared to the Kolmogorov-Smirnov two-sample test, the maximally selected chi-square, and a likelihood ratio test. The tests are applied to assessing recombination in genetic sequences of HIV.
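
As an illustration of the statistic only, the sketch below scans every split of a binary sequence, computes a Fisher's exact p-value for the before/after by 0/1 table, and keeps the minimum. Two assumptions are made here that the abstract does not confirm: the data are a single 0/1 sequence, and the two-sided p-value is the one being minimized. The lattice-path computation of the exact distribution is not attempted, and the function names are hypothetical.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact p-value for the table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of all tables (with the same
    margins) that are no more likely than the observed one.
    """
    row1, col1, n = a + b, a + c, a + b + c + d

    def prob(k):
        return comb(col1, k) * comb(n - col1, row1 - k) / comb(n, row1)

    p_obs = prob(a)
    lo = max(0, row1 - (n - col1))
    hi = min(row1, col1)
    # Small relative tolerance guards against floating-point ties.
    total = sum(prob(k) for k in range(lo, hi + 1) if prob(k) <= p_obs * (1 + 1e-9))
    return min(1.0, total)


def minimally_selected_fisher(sequence):
    """Return (smallest p-value, split index k) over splits seq[:k] | seq[k:]."""
    n, total_ones = len(sequence), sum(sequence)
    best_p, best_k = 1.0, None
    ones_before = 0
    for k in range(1, n):
        ones_before += sequence[k - 1]
        a, b = ones_before, k - ones_before            # ones / zeros before the split
        c = total_ones - ones_before                   # ones after the split
        d = (n - k) - c                                # zeros after the split
        p = fisher_exact_two_sided(a, b, c, d)
        if p < best_p:
            best_p, best_k = p, k
    return best_p, best_k
```

The returned minimum is a test statistic, not a p-value in its own right: it has been selected over many splits, which is why the paper derives its exact distribution separately.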

19.
Significance testing for correlated binary outcome data (Cited 1 time in total: 0 self-citations, 1 by others)
B Rosner, R C Milton. Biometrics, 1988, 44(2): 505-512
Multiple logistic regression is a commonly used multivariate technique for analyzing data with a binary outcome. One assumption needed for this method of analysis is the independence of outcome for all sample points in a data set. In ophthalmologic data and other types of correlated binary data, this assumption is often grossly violated and the validity of the technique becomes an issue. A technique has been developed (Rosner, 1984) that utilizes a polychotomous logistic regression model to allow one to look at multiple exposure variables in the context of a correlated binary data structure. This model is an extension of the beta-binomial model, which has been widely used to model correlated binary data when no covariates are present. In this paper, a relationship is developed between the two techniques, whereby it is shown that use of ordinary logistic regression in the presence of correlated binary data can result in true significance levels that are considerably larger than nominal levels in frequently encountered situations. This relationship is explored in detail in the case of a single dichotomous exposure variable. In this case, the appropriate test statistic can be expressed as an adjusted chi-square statistic based on the 2 × 2 contingency table relating exposure to outcome. The test statistic is easily computed as a function of the ordinary chi-square statistic and the correlation between eyes (or more generally between cluster members) for outcome and exposure, respectively. This generalizes some previous results obtained by Koval and Donner (1987, in Festschrift for V. M. Joshi, I. B. MacNeill (ed.), Vol. V, 199-224. (ABSTRACT TRUNCATED AT 250 WORDS)

20.
The central theme in case-control genetic association studies is to efficiently identify genetic markers associated with trait status. Powerful statistical methods are critical to accomplishing this goal. A popular method is the omnibus Pearson's chi-square test applied to genotype counts. To achieve increased power, tests based on an assumed trait model have been proposed. However, they are not robust to model misspecification. Much research has been carried out on enhancing the robustness of such model-based tests. An analysis framework is proposed that tests the equality of allele frequencies while allowing for different deviations from Hardy-Weinberg equilibrium (HWE) between cases and controls. The proposed method requires neither the specification of a trait model nor HWE. It involves only 1 degree of freedom. The likelihood ratio statistic, score statistic, and Wald statistic associated with this framework are introduced. Their performance is evaluated by extensive computer simulation in comparison with existing methods.
