Similar Articles
20 similar articles found.
1.
Tango T 《Biometrics》2007,63(1):119-127
A class of tests with quadratic forms for detecting spatial clustering of health events based on case-control point data is proposed. It includes Cuzick and Edwards's test statistic (1990, Journal of the Royal Statistical Society, Series B 52, 73-104). Although they used the asymptotic normality of the test statistic, we show that this approximation is generally poor even for moderately large sample sizes. Instead, we suggest a central chi-square distribution as a better approximation to the asymptotic distribution of the test statistic. Furthermore, both to estimate the optimal value of the unknown cluster-scale parameter and to adjust for the multiple testing incurred by repeating the procedure over values of that parameter, we propose the minimum of the profile p-value of the test statistic over the parameter as an integrated test statistic. We also provide a statistic to identify the areas or cases that contribute most to significant clustering. The proposed methods are illustrated with a data set on the locations of cases of childhood leukemia and lymphoma, and another on early medieval grave site locations consisting of affected and nonaffected grave sites.

2.
Nonparametric all-pairs multiple comparisons based on pairwise rankings can be performed in the one-way design with the Steel-Dwass procedure. To apply this test, Wilcoxon's rank sum statistic is calculated for all pairs of groups; the maximum of the rank sums is the test statistic. We provide exact calculations of the asymptotic critical values (and P-values, respectively) even for unbalanced designs. We recommend this asymptotic method whenever large sample sizes are present. For small sample sizes we recommend using the statistic of Baumgartner, Weiss, and Schindler (1998, Biometrics 54, 1129-1135) instead of Wilcoxon's rank sum for the multiple comparisons. We show that the resulting procedure can be less conservative and, according to simulation results, more powerful than the original Steel-Dwass procedure. We illustrate the methods with a practical data set.
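The pairwise construction underlying the Steel-Dwass procedure can be sketched in a few lines of Python. This is an illustrative sketch, not the authors' implementation: it standardizes each pairwise Wilcoxon rank-sum statistic using the no-tie null mean and variance and reports the maximum, whereas the paper supplies exact asymptotic critical values.

```python
import itertools
from math import sqrt

def rank(values):
    # midranks, with ties receiving the average of their positions
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(values):
        j = i
        while j + 1 < len(values) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # 1-based midrank
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def steel_dwass_max(groups):
    """Maximum standardized pairwise Wilcoxon rank-sum statistic."""
    zmax = 0.0
    for a, b in itertools.combinations(range(len(groups)), 2):
        x, y = groups[a], groups[b]
        m, n = len(x), len(y)
        r = rank(list(x) + list(y))
        w = sum(r[:m])                  # rank sum of the first group
        mean = m * (m + n + 1) / 2
        var = m * n * (m + n + 1) / 12  # no tie correction in this sketch
        zmax = max(zmax, abs(w - mean) / sqrt(var))
    return zmax
```

The observed maximum would then be compared against the range distribution of correlated normals; the paper's contribution is computing those critical values exactly even for unbalanced designs.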

3.
The use of the Pearson chi-square statistic for testing hypotheses on biological populations is not appropriate when the individuals are distributed in clusters. For the case where the clusters are distributed independently of each other, we propose an asymptotically chi-square distributed test statistic that takes the cluster size distribution into account. An example based on European corn borer egg data illustrates the test procedure.

4.
A distribution-free test is considered for testing treatment effects in block designs with unequal cell frequencies. We propose a test statistic, a function of treatment ranks, that is asymptotically chi-square distributed, and we obtain its null distribution. The entire procedure is explained with a numerical example.
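For intuition, a rank-based chi-square statistic of this kind reduces, in the complete-block special case (one observation per cell), to the familiar Friedman statistic. The sketch below covers only that special case, not the paper's unequal-cell-frequency setting, and ignores ties:

```python
def friedman_statistic(blocks):
    """Friedman chi-square statistic for a complete block design.

    blocks: list of blocks, each a list of k treatment responses.
    Returns Q, asymptotically chi-square with k-1 degrees of freedom."""
    n, k = len(blocks), len(blocks[0])
    rank_sums = [0.0] * k
    for block in blocks:
        order = sorted(range(k), key=lambda j: block[j])
        for r, j in enumerate(order, 1):  # rank treatments within the block
            rank_sums[j] += r
    return 12 / (n * k * (k + 1)) * sum(s * s for s in rank_sums) - 3 * n * (k + 1)
```

With two blocks both ranking three treatments identically, Q reaches its maximum of 4 for n = 2, k = 3.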

5.
Sensitivity and specificity have traditionally been used to assess the performance of a diagnostic procedure. Diagnostic procedures with both high sensitivity and high specificity are desirable, but these procedures are frequently too expensive, hazardous, and/or difficult to operate. A less sophisticated procedure may be preferred, if the loss of the sensitivity or specificity is determined to be clinically acceptable. This paper addresses the problem of simultaneous testing of sensitivity and specificity for an alternative test procedure with a reference test procedure when a gold standard is present. The hypothesis is formulated as a compound hypothesis of two non-inferiority (one-sided equivalence) tests. We present an asymptotic test statistic based on the restricted maximum likelihood estimate in the framework of comparing two correlated proportions under the prospective and retrospective sampling designs. The sample size and power of an asymptotic test statistic are derived. The actual type I error and power are calculated by enumerating the exact probabilities in the rejection region. For applications that require high sensitivity as well as high specificity, a large number of positive subjects and a large number of negative subjects are needed. We also propose a weighted sum statistic as an alternative test by comparing a combined measure of sensitivity and specificity of the two procedures. The sample size determination is independent of the sampling plan for the two tests.

6.
In this article we propose a new technique for identifying clusters in temporal point processes. It relies on a comparison among all the m-order spacings and is entirely independent of any alternative hypothesis. A recursive procedure is introduced that allows multiple clusters to be identified independently. This new scan statistic appears more efficient than the classical scan statistic for detecting and recovering cluster alternatives. These results have applications in epidemiological studies of rare diseases.
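As a toy illustration of the spacing idea (not the authors' recursive procedure), the m-order spacings of a point process are the lengths of the intervals spanning m + 1 consecutive event times; an unusually small minimum spacing suggests a temporal cluster of m + 1 events:

```python
def min_m_spacing(times, m):
    """Smallest m-order spacing: the shortest interval containing
    m + 1 consecutive event times. Small values suggest clustering."""
    ts = sorted(times)
    return min(ts[i + m] - ts[i] for i in range(len(ts) - m))
```

For example, with events at 0, 1, 1.1, 1.2, and 5, the minimum 2-order spacing is the 0.2-long window covering the three events near time 1.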

7.
Although linear rank statistics for the two-sample problem are distribution-free tests, their power depends on the distribution of the data. In the planning phase of an experiment, researchers are often uncertain about the shape of this distribution, so the choice of test statistic for the analysis and the determination of the required sample size are based on vague information. Adaptive designs with interim analysis can potentially overcome both problems; in particular, adaptive tests based on a selector statistic address the first. We investigate whether adaptive tests can be usefully implemented in flexible two-stage designs to gain power. In a simulation study, we compare several methods for choosing a test statistic for the second stage of an adaptive design based on interim data with the procedure that applies adaptive tests in both stages. We find that the latter is a sensible approach that leads to the best results in most situations considered here. The different methods are illustrated using a clinical trial example.

8.
We present a class of likelihood-based score statistics that accommodate genotypes of both unrelated individuals and families, thereby combining the advantages of case-control and family-based designs. The likelihood extends the one proposed by Schaid and colleagues (Schaid and Sommer 1993, 1994; Schaid 1996; Schaid and Li 1997) to arbitrary family structures with arbitrary patterns of missing data and to dense sets of multiple markers. The score statistic comprises two component test statistics. The first component statistic, the nonfounder statistic, evaluates disequilibrium in the transmission of marker alleles from parents to offspring. This statistic, when applied to nuclear families, generalizes the transmission/disequilibrium test to arbitrary numbers of affected and unaffected siblings, with or without typed parents. The second component statistic, the founder statistic, compares observed or inferred marker genotypes in the family founders with those of controls or those of some reference population. The founder statistic generalizes the statistics commonly used for case-control data. The strengths of the approach include both the ability to assess, by comparison of nonfounder and founder statistics, the potential bias resulting from population stratification and the ability to accommodate arbitrary family structures, thus eliminating the need for many different ad hoc tests. A limitation of the approach is the potential power loss and/or bias resulting from inappropriate assumptions on the distribution of founder genotypes. The systematic likelihood-based framework provided here should be useful in the evaluation of both the relative merits of case-control and various family-based designs and the relative merits of different tests applied to the same design. It should also be useful for genotype-disease association studies done with the use of a dense set of multiple markers.

9.
Tao Sun  Yu Cheng  Ying Ding 《Biometrics》2023,79(3):1713-1725
Copula is a popular method for modeling the dependence among marginal distributions in multivariate censored data. As many copula models are available, it is essential to check if the chosen copula model fits the data well for analysis. Existing approaches to testing the fitness of copula models are mainly for complete or right-censored data. No formal goodness-of-fit (GOF) test exists for interval-censored or recurrent events data. We develop a general GOF test for copula-based survival models using the information ratio (IR) to address this research gap. It can be applied to any copula family with a parametric form, such as the frequently used Archimedean, Gaussian, and D-vine families. The test statistic is easy to calculate, and the test procedure is straightforward to implement. We establish the asymptotic properties of the test statistic. The simulation results show that the proposed test controls the type-I error well and achieves adequate power when the dependence strength is moderate to high. Finally, we apply our method to test various copula models in analyzing multiple real datasets. Our method consistently separates different copula models for all these datasets in terms of model fitness.

10.
A multiple testing procedure for clinical trials.
A multiple testing procedure is proposed for comparing two treatments when response to treatment is both dichotomous (i.e., success or failure) and immediate. The proposed test statistic for each test is the usual (Pearson) chi-square statistic based on all data collected to that point. The maximum number (N) of tests and the number (m1 + m2) of observations collected between successive tests is fixed in advance. The overall size of the procedure is shown to be controlled with virtually the same accuracy as the single sample chi-square test based on N(m1 + m2) observations. The power is also found to be virtually the same. However, by affording the opportunity to terminate early when one treatment performs markedly better than the other, the multiple testing procedure may eliminate the ethical dilemmas that often accompany clinical trials.
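The mechanics of such a group-sequential scheme are easy to sketch: after each batch of m1 + m2 new observations, recompute the Pearson chi-square statistic on all data accumulated so far and stop if it exceeds a critical value. This is a minimal sketch of the repeated-testing loop only; the paper's contribution is showing how to set the critical value so that the overall size is controlled.

```python
def pearson_chi2(s1, f1, s2, f2):
    """Pearson chi-square for a 2x2 table of (successes, failures)
    in two treatment groups."""
    n = s1 + f1 + s2 + f2
    num = n * (s1 * f2 - s2 * f1) ** 2
    den = (s1 + s2) * (f1 + f2) * (s1 + f1) * (s2 + f2)
    return num / den if den else 0.0

def sequential_test(batches1, batches2, crit):
    """Test after each batch, on all data so far; stop early if crit
    is exceeded. batches*: lists of (successes, failures) per look.
    Returns (stage stopped at or None, last statistic)."""
    s1 = f1 = s2 = f2 = 0
    stat = 0.0
    for k, ((a, b), (c, d)) in enumerate(zip(batches1, batches2), 1):
        s1 += a; f1 += b; s2 += c; f2 += d
        stat = pearson_chi2(s1, f1, s2, f2)
        if stat > crit:
            return k, stat
    return None, stat
```

With an extreme first batch (10/0 successes versus 0/10), the procedure stops at the first look; with identical batches it runs to completion with statistic 0.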

11.
Despite the potential pitfalls of stratification, population-based association studies are nowadays conducted more often than family-based association studies. However, the mechanism of genomic imprinting has lately been implicated in the etiology of complex genetic diseases and can be detected statistically only in family-based designs. Powerful tests for association and imprinting have been proposed previously for case-parent trios and single markers. Since the power of association studies can be improved if multiple affected children and haplotypes are considered, we extended the parental asymmetry test (PAT) for imprinting to a test suited to both general nuclear families and haplotypes, called HAP-PAT. Significance of the HAP-PAT is determined via a Monte Carlo simulation procedure. In addition to the HAP-PAT, we modified a haplotype-based association test, proposed by us before, so that either only paternal or only maternal transmissions contribute to the test statistic. The approaches were implemented in FAMHAP, and we evaluated their performance under a variety of disease models. We were able to demonstrate the usefulness of our haplotype-based approaches for detecting parent-of-origin effects. Furthermore, we showed that even in the presence of imprinting it is more reasonable to consider all affected children of a nuclear family than to randomly select one affected child from each family and conduct a trio study using the selected individuals.

12.
Two-stage designs for experiments with a large number of hypotheses
MOTIVATION: When a large number of hypotheses are investigated the false discovery rate (FDR) is commonly applied in gene expression analysis or gene association studies. Conventional single-stage designs may lack power due to low sample sizes for the individual hypotheses. We propose two-stage designs where the first stage is used to screen the 'promising' hypotheses which are further investigated at the second stage with an increased sample size. A multiple test procedure based on sequential individual P-values is proposed to control the FDR for the case of independent normal distributions with known variance. RESULTS: The power of optimal two-stage designs is impressively larger than the power of the corresponding single-stage design with equal costs. Extensions to the case of unknown variances and correlated test statistics are investigated by simulations. Moreover, it is shown that the simple multiple test procedure using first stage data for screening purposes and deriving the test decisions only from second stage data is a very powerful option.
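The FDR control that such designs build on is typically the Benjamini-Hochberg step-up procedure. A minimal sketch (the single-stage version, not the paper's sequential two-stage test):

```python
def benjamini_hochberg(pvalues, q=0.05):
    """Benjamini-Hochberg step-up procedure.

    Returns the indices of rejected hypotheses: find the largest rank i
    with p_(i) <= i*q/m and reject the i smallest p-values."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    k = 0
    for rank, idx in enumerate(order, 1):
        if pvalues[idx] <= rank * q / m:
            k = rank
    return sorted(order[:k])
```

For p-values (0.01, 0.02, 0.03, 0.5) at q = 0.05, the step-up thresholds are 0.0125, 0.025, 0.0375, 0.05, so the first three hypotheses are rejected.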

13.
Kwong KS  Cheung SH  Chan WS 《Biometrics》2004,60(2):491-498
In clinical studies, multiple superiority/equivalence testing procedures can be applied to classify a new treatment as superior, equivalent (same therapeutic effect), or inferior to each of a set of standard treatments. Previous stepwise approaches (Dunnett and Tamhane, 1997, Statistics in Medicine 16, 2489-2506; Kwong, 2001, Journal of Statistical Planning and Inference 97, 359-366) are only appropriate for balanced designs. Unfortunately, the construction of similar tests for unbalanced designs is far more complex, with two major difficulties: (i) the ordering of test statistics for superiority may not be the same as the ordering of test statistics for equivalence; and (ii) the correlation structure of the test statistics is not equi-correlated but product-correlated. In this article, we develop a two-stage testing procedure for unbalanced designs, which are very popular in clinical experiments. The procedure combines step-up and single-step testing procedures, and the familywise error rate is proved to be controlled at a designated level. Furthermore, a simulation study compares the average powers of the proposed procedure with those of the single-step procedure, and a clinical example illustrates the application of the new procedure.

14.
The w statistic introduced by Lockhart et al. (1998. A covariotide model explains apparent phylogenetic structure of oxygenic photosynthetic lineages. Mol Biol Evol. 15:1183-1188) is a simple and easily calculated statistic intended to detect heterotachy by comparing amino acid substitution patterns between two monophyletic groups of protein sequences. It is defined as the difference between the fraction of varied sites in both groups and the fraction of varied sites in each group. The w test has been used to distinguish a covarion process from equal rates and rates variation across sites processes. Using simulation we show that the w test is effective for small data sets and for data sets that have low substitution rates in the groups but can have difficulties when these conditions are not met. Using site entropy as a measure of variability of a sequence site, we modify the w statistic to a w' statistic by assigning as varied in one group those sites that are actually varied in both groups but have a large entropy difference. We show that the w' test has more power to detect two kinds of heterotachy processes (covarion and bivariate rate shifts) in large and variable data. We also show that a test of Pearson's correlation of the site entropies between two monophyletic groups can be used to detect heterotachy and has more power than the w' test. Furthermore, we demonstrate that there are settings where the correlation test as well as w and w' tests do not detect heterotachy signals in data simulated under a branch length mixture model. In such cases, it is sometimes possible to detect heterotachy through subselection of appropriate taxa. Finally, we discuss the abilities of the three statistical tests to detect a fourth mode of heterotachy: lineage-specific changes in proportion of variable sites.

15.
As a useful tool for detecting geographical clusters of events, the spatial scan statistic is widely applied in many fields and plays an increasingly important role. The classic version of the spatial scan statistic for binary outcomes was developed by Kulldorff, based on the Bernoulli or the Poisson probability model. In this paper, we apply the hypergeometric probability model to construct the likelihood function under the null hypothesis. Compared with existing methods, this likelihood function is an alternative, indirect way to identify the potential cluster, and the test statistic is the extreme value of the likelihood function. As in Kulldorff's methods, we adopt a Monte Carlo test of significance. Both methods are applied to detect spatial clusters of Japanese encephalitis in Sichuan province, China, in 2009, and the detected clusters are identical. A simulation with independent benchmark data indicates that the test statistic based on the hypergeometric model outperforms Kulldorff's statistics for clusters of high population density or large size; otherwise Kulldorff's statistics are superior.
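The Monte Carlo test of significance used by scan statistics has a simple generic shape: redistribute the observed total case count over regions at random (proportionally to population), recompute the statistic each time, and rank the observed value among the simulated ones. The sketch below is generic scaffolding under that assumption, with the scan statistic left as a caller-supplied function; it is not the paper's hypergeometric likelihood.

```python
import random

def monte_carlo_pvalue(statistic, cases, population, reps=999, seed=7):
    """Monte Carlo p-value: rank the observed statistic among values
    computed on random multinomial redistributions of the cases."""
    rng = random.Random(seed)
    t_obs = statistic(cases, population)
    total = sum(cases)
    exceed = 0
    for _ in range(reps):
        sim = [0] * len(population)
        for i in rng.choices(range(len(population)), weights=population, k=total):
            sim[i] += 1
        if statistic(sim, population) >= t_obs:
            exceed += 1
    return (exceed + 1) / (reps + 1)  # standard Monte Carlo p-value
```

With a toy statistic (the highest regional case rate), a strongly concentrated case pattern gets a small p-value, while a perfectly uniform pattern is never exceeded and gets p = 1.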

16.
Mehrotra DV  Chan IS  Berger RL 《Biometrics》2003,59(2):441-450
Fisher's exact test for comparing response proportions in a randomized experiment can be overly conservative when the group sizes are small or when the response proportions are close to zero or one. This is primarily because the null distribution of the test statistic becomes too discrete, a partial consequence of the inference being conditional on the total number of responders. Accordingly, exact unconditional procedures have gained in popularity, on the premise that power will increase because the null distribution of the test statistic will presumably be less discrete. However, we caution researchers that a poor choice of test statistic for exact unconditional inference can actually result in a substantially less powerful analysis than Fisher's conditional test. To illustrate, we study a real example and provide exact test size and power results for several competing tests, for both balanced and unbalanced designs. Our results reveal that Fisher's test generally outperforms exact unconditional tests based on using as the test statistic either the observed difference in proportions, or the observed difference divided by its estimated standard error under the alternative hypothesis, the latter for unbalanced designs only. On the other hand, the exact unconditional test based on the observed difference divided by its estimated standard error under the null hypothesis (score statistic) outperforms Fisher's test, and is recommended. Boschloo's test, in which the p-value from Fisher's test is used as the test statistic in an exact unconditional test, is uniformly more powerful than Fisher's test, and is also recommended.  
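The conditional test being discussed is easy to state concretely: with the margins of the 2x2 table fixed, the first cell follows a hypergeometric distribution under the null, and the two-sided p-value sums the probabilities of all tables no more likely than the observed one. A minimal stdlib sketch (one common two-sided convention; implementations differ):

```python
from math import comb

def fisher_exact_2x2(a, b, c, d):
    """Two-sided Fisher exact p-value for the table [[a, b], [c, d]]:
    sum of hypergeometric probabilities <= that of the observed table."""
    r1, r2, c1 = a + b, c + d, a + c
    n = r1 + r2
    def prob(x):  # P(first cell = x) with all margins fixed
        return comb(r1, x) * comb(r2, c1 - x) / comb(n, c1)
    p_obs = prob(a)
    lo, hi = max(0, c1 - r2), min(r1, c1)
    return sum(prob(x) for x in range(lo, hi + 1) if prob(x) <= p_obs + 1e-12)
```

For the table [[3, 0], [0, 3]] the attainable probabilities are 0.05, 0.45, 0.45, 0.05, so the two-sided p-value is 0.1 — a concrete instance of the discreteness the abstract describes, since no outcome of this design can reach p < 0.05.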

17.
Many recently developed nonparametric jump tests can be viewed as multiple hypothesis testing problems. For such multiple hypothesis tests, it is well known that controlling only the type I error can produce a large proportion of erroneous rejections, and the situation becomes even worse when jump occurrences are rare events. To obtain more reliable results, we aim to control the false discovery rate (FDR), an efficient compound error measure for erroneous rejections in multiple testing problems. We perform the test via the Barndorff-Nielsen and Shephard (BNS) test statistic, and control the FDR with the Benjamini and Hochberg (BH) procedure. We provide asymptotic results for the FDR control. From simulations, we examine the relevant theoretical results and demonstrate the advantages of controlling the FDR. The hybrid approach is then applied to an empirical analysis of two benchmark stock indices with high frequency data.

18.
Cheng R  Ma JZ  Wright FA  Lin S  Gao X  Wang D  Elston RC  Li MD 《Genetics》2003,164(3):1175-1187
As the speed and efficiency of genotyping single-nucleotide polymorphisms (SNPs) increase, using the SNP map, it becomes possible to evaluate the extent to which a common haplotype contributes to the risk of disease. In this study we propose a new procedure for mapping functional sites or regions of a candidate gene of interest using multiple linked SNPs. Based on a case-parent trio family design, we use expectation-maximization (EM) algorithm-derived haplotype frequency estimates of multiple tightly linked SNPs from both unambiguous and ambiguous families to construct a contingency statistic S for linkage disequilibrium (LD) analysis. In the procedure, a moving-window scan for functional SNP sites or regions can cover an unlimited number of loci except for the limitation of computer storage. Within a window, all possible widths of haplotypes are utilized to find the maximum statistic S* for each site (or locus). Furthermore, this method can be applied to regional or genome-wide scanning for determining linkage disequilibrium using SNPs. The sensitivity of the proposed procedure was examined on the simulated data set from the Genetic Analysis Workshop (GAW) 12. Compared with the conventional and generalized TDT methods, our procedure is more flexible and powerful.

19.
Large-scale whole genome association studies are increasingly common, due in large part to recent advances in genotyping technology. With this change in paradigm for genetic studies of complex diseases, it is vital to develop valid, powerful, and efficient statistical tools to evaluate such data. Despite a dramatic drop in genotyping costs, it is still expensive to genotype thousands of individuals for hundreds of thousands of single-nucleotide polymorphisms (SNPs) in large-scale whole genome association studies. A multi-stage (or two-stage) design has been a promising alternative: in the first stage, only a fraction of samples are genotyped and tested using a dense set of SNPs, and only a small subset of markers that show moderate associations with the disease is genotyped in later stages. Multi-stage designs have also been used in candidate gene association studies, usually in regions that have shown strong signals in linkage studies. To decide which set of SNPs to genotype in the next stage, a common practice is to use a simple test (such as a chi-square test for case-control data) and a liberal significance level without corrections for multiple testing, to ensure that no true signals are filtered out. In this paper, I develop a novel SNP selection procedure within the framework of multi-stage designs. Based on data from stage 1, the method explicitly explores correlations (linkage disequilibrium) among SNPs and their possible interactions in determining the disease phenotype. Compared with a regular multi-stage design, the approach can select a much reduced set of SNPs with high discriminative power for later stages. Therefore, not only does it reduce the genotyping cost in later stages, it also increases the statistical power by reducing the number of tests. Combined analysis is proposed to further improve power, and the theoretical significance level of the combined statistic is derived. Extensive simulations show that the procedure can reduce the number of SNPs required in later stages, with improved power to detect associations. The procedure is also applied to a real data set from a genome-wide association study of sporadic amyotrophic lateral sclerosis (ALS), and an interesting set of candidate SNPs is identified.

20.
Randomization tests and computing software for the significance of biodiversity and evenness
Diversity indices and evenness measures, being simple to use, are widely applied in community biology and biodiversity research. However, the lack of suitable statistical testing methods often lowers the credibility of such analyses and limits their application. Since subjective, direct comparisons are widely used in biodiversity studies, more rigorous statistical tests for diversity are needed. This study develops and applies the following randomization tests: significance tests for the diversity index and evenness of a single community, confidence intervals for the diversity index and evenness of a single community, and significance tests for differences in diversity and evenness between communities. Randomization methods have been applied successfully in community ecology. The principle is to randomly permute the elements of a vector, or randomly exchange corresponding elements between two vectors, compute the diversity and evenness of the randomized data, repeat the process many times, and tally the p-value of the significance test; the statistical significance of diversity values and of their differences can thus be determined. A corresponding Internet-based computing tool, BiodiversityTest, was developed. The software consists of seven Java classes and one HTML file, runs on various operating systems and web browsers, and can read several types of ODBC database files, such as Access, Excel, FoxPro, and dBase. It implements the Shannon-Wiener, Simpson, McIntosh, Berger-Parker, Hurlbert, and Brillouin diversity indices. Using the Shannon-Wiener and Berger-Parker indices, BiodiversityTest was applied to compare and analyze arthropod community diversity in rice paddies (15 sites, 17 functional groups, 125 arthropod species). The results show that both sets of indices reflect the significance of differences in rice-paddy arthropod community diversity, and that the tests effectively capture changes in diversity indices and evenness. Compared with direct comparison of diversity between rice-paddy arthropod communities, the randomization tests yield more objective results. The algorithms and software help improve some of the less rigorous analysis methods used in biodiversity research and provide a usable tool for the further application of randomization tests in biodiversity studies.
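The element-swapping randomization described above can be sketched directly. This is an illustrative sketch, not the BiodiversityTest implementation: it uses the Shannon-Wiener index and the convention of randomly exchanging corresponding species counts between the two communities.

```python
import random
from math import log

def shannon(abundances):
    """Shannon-Wiener diversity index H' = -sum(p_i * ln p_i)."""
    total = sum(abundances)
    return -sum(n / total * log(n / total) for n in abundances if n > 0)

def randomization_test(comm1, comm2, reps=999, seed=1):
    """Randomization p-value for the difference in diversity between two
    communities: randomly swap corresponding species counts and count how
    often the randomized difference reaches the observed one."""
    rng = random.Random(seed)
    observed = abs(shannon(comm1) - shannon(comm2))
    hits = 0
    for _ in range(reps):
        a, b = [], []
        for x, y in zip(comm1, comm2):
            if rng.random() < 0.5:  # exchange this species' counts
                x, y = y, x
            a.append(x); b.append(y)
        if abs(shannon(a) - shannon(b)) >= observed:
            hits += 1
    return (hits + 1) / (reps + 1)
```

A small p-value indicates that the two communities' diversities differ more than chance relabeling would produce.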
