Similar literature
20 similar documents found.
1.
Lehne B  Lewis CM  Schlitt T 《PloS one》2011,6(6):e20133
Interpreting Genome-Wide Association Studies (GWAS) at a gene level is an important step towards understanding the molecular processes that lead to disease. In order to incorporate prior biological knowledge such as pathways and protein interactions in the analysis of GWAS data it is necessary to derive one measure of association for each gene. We compare three different methods to obtain gene-wide test statistics from Single Nucleotide Polymorphism (SNP) based association data: choosing the test statistic from the most significant SNP; the mean test statistics of all SNPs; and the mean of the top quartile of all test statistics. We demonstrate that the gene-wide test statistics can be controlled for the number of SNPs within each gene and show that all three methods perform considerably better than expected by chance at identifying genes with confirmed associations. By applying each method to GWAS data for Crohn's Disease and Type 1 Diabetes we identified new potential disease genes.
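The three gene-wide summaries named in this abstract can be sketched in a few lines of Python. This is only an illustration, not the authors' code: the function name is hypothetical, "top quartile" is assumed to mean the largest quarter of the statistics (rounded up), and the adjustment for the number of SNPs per gene mentioned in the abstract is not reproduced.

```python
from statistics import mean


def gene_wide_statistics(snp_stats):
    """Summarise per-SNP association statistics for one gene in three ways."""
    if not snp_stats:
        raise ValueError("a gene needs at least one SNP statistic")
    ordered = sorted(snp_stats, reverse=True)  # largest (most significant) first
    # Assumption: "top quartile" = the ceil(n / 4) largest statistics.
    top_n = -(-len(ordered) // 4)
    return {
        "max": ordered[0],
        "mean": mean(ordered),
        "top_quartile_mean": mean(ordered[:top_n]),
    }


if __name__ == "__main__":
    # Hypothetical chi-squared statistics for eight SNPs in one gene.
    print(gene_wide_statistics([0.2, 1.1, 0.7, 5.4, 3.0, 0.1, 2.2, 9.8]))
```

All three summaries grow with the number of SNPs in a gene to different degrees, which is presumably why the abstract stresses controlling for SNP count before comparing genes.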

2.
Zhang K  Traskin M  Small DS 《Biometrics》2012,68(1):75-84
For group-randomized trials, randomization inference based on rank statistics provides robust, exact inference against nonnormal distributions. However, in a matched-pair design, the currently available rank-based statistics lose significant power compared to normal linear mixed model (LMM) test statistics when the LMM is true. In this article, we investigate and develop an optimal test statistic over all statistics in the form of the weighted sum of signed Mann-Whitney-Wilcoxon statistics under certain assumptions. This test is almost as powerful as the LMM even when the LMM is true, but it is much more powerful for heavy tailed distributions. A simulation study is conducted to examine the power.

3.
Linkage heterogeneity is common for complex diseases. It is well known that loss of statistical power for detecting linkage will result if one assumes complete homogeneity in the presence of linkage heterogeneity. To this end, Smith (1963, Annals of Human Genetics 27, 175-182) proposed an admixture model to account for linkage heterogeneity. It is well known that for this model, the conventional chi-squared approximation to the likelihood ratio test for no linkage does not apply even when the sample size is large. By dealing with nuclear families and one marker at a time for genetic diseases with simple modes of inheritance, score-based test statistics (Liang and Rathouz, 1999, Biometrics 55, 65-74) and likelihood-ratio-based test statistics (Lemdani and Pons, 1995, Biometrics 51, 1033-1041) have been proposed which have a simple large-sample distribution under the null hypothesis of no linkage. In this paper, we extend their work to more practical situations that include information from multiple markers and multi-generational pedigrees while allowing for a class of general genetic models. Three different approaches are proposed to eliminate the nuisance parameters in these test statistics. We show that all three approaches lead to the same asymptotic distribution under the null hypothesis of no linkage. Simulation results show that the proposed test statistics have adequate power to detect linkage and that the performances of these two classes of test statistics are quite comparable. We have applied the proposed method to a family study of asthma (Barnes et al., 1996), in which the score-based test shows evidence of linkage with p-value < 0.0001 in the region of interest on chromosome 12. Additionally, we have implemented this score-based test within the frequently used computer package GENEHUNTER.

4.
This paper considers four summary test statistics, including the one recently proposed by Bennett (1986, Biometrical Journal 28, 859–862), for hypothesis testing of association in a series of independent fourfold tables under inverse sampling. It provides a systematic and quantitative evaluation of the small-sample performance of these summary test statistics on the basis of a Monte Carlo simulation. The test statistic developed by Bennett (1986) can be conservative and may thereby lose power when the underlying disease is not rare. Given a fixed total number of cases in each table, the conditional test statistic is the best at controlling type I error among all test statistics considered here.

5.
Assunção R  Maia A 《Biometrics》2007,63(1):290-294
In environmental risk analysis, it is common to assume the stochastic independence (or separability) between the marks associated with the random events of a spatial-temporal point process. Schoenberg (2004, Biometrics 60, 471–481) proposed several test statistics for this hypothesis and used simulated data to evaluate their performance. He found that a Cramér-von Mises-type test is powerful to detect gradual departures from separability although it is not uniformly powerful over a large class of alternative models. We present a semiparametric approach to model alternative hypotheses to separability and derive a score test statistic. We show that there is a relationship between this score test and some of the test statistics proposed by Schoenberg. Specifically, all are different versions of weighted Cramér-von Mises-type statistics. This gives some insight into the reasons for the similarities and differences between the test statistics' performance. We also point out some difficulties in controlling the type I error probability in Schoenberg's residual test.

6.
This paper is concerned with the power behaviour of four goodness-of-fit test statistics in sparse multinomials with k cells. Most previous work has been concerned only with Pearson's X² and the likelihood ratio test statistics. In this study we consider two additional test statistics, namely the Cressie-Read test statistic I(2/3) and the modified Freeman-Tukey (FT) test statistic. Because k ≥ 10 in this study, a Monte Carlo procedure based on 1000 simulated samples is used to estimate the powers for the four test statistics. Alternatives on various line segments are employed. Results suggest that none of the test statistics completely dominates the others and that the choice of which test to use depends on the nature of the alternative hypothesis. These results are consistent with those obtained by West and Kempthorne (1972), although Pearson's X² test statistic may be preferred because of its closer approximation to the χ² distribution in terms of the attained α levels.
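For orientation, the Cressie-Read family I(λ) that contains several of the statistics compared here can be written as a single function. The sketch below uses the usual textbook form 2/(λ(λ+1)) · Σ O[(O/E)^λ − 1] and assumes the observed and expected counts have the same total; the function name is hypothetical and the modified Freeman-Tukey variant studied in the paper is not reproduced.

```python
import math


def power_divergence(observed, expected, lam):
    """Cressie-Read power divergence statistic I(lam), for lam > -1."""
    if lam <= -1:
        raise ValueError("this sketch only covers lam > -1")
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    cells = [(o, e) for o, e in zip(observed, expected) if o > 0]
    # Empty cells (o == 0) contribute zero to the sums below when lam > -1.
    if abs(lam) < 1e-12:
        # Limit lam -> 0: likelihood ratio statistic 2 * sum(O * ln(O / E)).
        return 2.0 * sum(o * math.log(o / e) for o, e in cells)
    total = sum(o * ((o / e) ** lam - 1.0) for o, e in cells)
    return 2.0 * total / (lam * (lam + 1.0))


if __name__ == "__main__":
    obs, exp = [10, 20, 30], [20, 20, 20]  # hypothetical counts, equal totals
    for lam in (0.0, 2.0 / 3.0, 1.0):
        print(lam, power_divergence(obs, exp, lam))
```

With equal totals, λ = 1 reproduces Pearson's X² and λ → 0 the likelihood ratio statistic, so I(2/3) sits between the two classical choices.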

7.
Testing for a change in the slope of the simple linear regression model has many applications in bio-sciences, quality control and survival analysis. This paper compares Anderson-Darling and Erdős-Darling type test statistics, which are based on the least squares change point process of Sen (1980), with the corresponding Kolmogorov-Smirnov and Cramér-von Mises type test statistics. We estimated the limiting critical values of these test statistics and conducted Monte Carlo simulation studies to compare their powers.

8.
The two-sided Simes test is known to control the type I error rate with bivariate normal test statistics. For one-sided hypotheses, control of the type I error rate requires that the correlation between the bivariate normal test statistics is non-negative. In this article, we introduce a trimmed version of the one-sided weighted Simes test for two hypotheses which rejects if (i) the one-sided weighted Simes test rejects and (ii) both p-values are below one minus the respective weighted Bonferroni adjusted level. We show that the trimmed version controls the type I error rate at nominal significance level α if (i) the common distribution of test statistics is point symmetric and (ii) the two-sided weighted Simes test at level 2α controls the level. These assumptions apply, for instance, to bivariate normal test statistics with arbitrary correlation. In a simulation study, we compare the power of the trimmed weighted Simes test with the power of the weighted Bonferroni test and the untrimmed weighted Simes test. An additional result of this article ensures type I error rate control of the usual weighted Simes test under a weak version of the positive regression dependence condition for the case of two hypotheses. This condition is shown to apply to the two-sided p-values of one- or two-sample t-tests for bivariate normal endpoints with arbitrary correlation and to the corresponding one-sided p-values if the correlation is non-negative. The Simes test for such types of bivariate t-tests has not been considered before. According to our main result, the trimmed version of the weighted Simes test then also applies to the one-sided bivariate t-test with arbitrary correlation.
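The rejection rule described in conditions (i) and (ii) can be sketched as follows. This is an interpretation rather than the article's code: the weighted Simes rule for two hypotheses is written in its commonly used form (weights w1 + w2 = 1), "respective" is assumed to refer to each p-value's own level w_i·α, and both function names are hypothetical.

```python
def weighted_simes_rejects(p1, p2, w1, w2, alpha):
    """Weighted Simes test of the intersection of two hypotheses (w1 + w2 = 1)."""
    return p1 <= w1 * alpha or p2 <= w2 * alpha or max(p1, p2) <= alpha


def trimmed_weighted_simes_rejects(p1, p2, w1, w2, alpha):
    """Trimmed version: the Simes rejection must also satisfy condition (ii)."""
    # Assumption: each p-value is compared with one minus its own w_i * alpha.
    below_upper = p1 < 1.0 - w1 * alpha and p2 < 1.0 - w2 * alpha
    return weighted_simes_rejects(p1, p2, w1, w2, alpha) and below_upper


if __name__ == "__main__":
    # One tiny and one very large one-sided p-value (hypothetical numbers):
    # the untrimmed test rejects, the trimmed test does not.
    print(weighted_simes_rejects(0.01, 0.99, 0.5, 0.5, 0.05))
    print(trimmed_weighted_simes_rejects(0.01, 0.99, 0.5, 0.5, 0.05))
```

The trimming only removes rejections in which one p-value is very small while the other is close to 1, a pattern that becomes likely when the test statistics are negatively correlated.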

9.
Zheng G  Chen Z 《Biometrics》2005,61(1):254-258
In many practical problems, hypothesis testing involves a nuisance parameter which appears only under the alternative hypothesis. Davies (1977, Biometrika 64, 247-254) proposed the maximum of the score statistics over the whole range of the nuisance parameter as a test statistic for this type of hypothesis testing. Freidlin, Podgor, and Gastwirth (1999, Biometrics 55, 883-886) studied two other simpler maximum test statistics, the maximum of the score statistics at two extreme points of the nuisance parameter, and the maximum of the score statistics at three points of the nuisance parameter including the two extreme points. In this article, we compare the powers of these three maximum-type statistics in the context of three genetic problems.

10.
In data analysis involving the proportional-hazards regression model due to Cox (1972, Journal of the Royal Statistical Society, Series B 34, 187-220), the test criteria commonly used for assessing the partial contribution to survival of subsets of concomitant variables are the classical likelihood ratio (LR) and Wald statistics. This paper presents an investigation of three other test criteria with potentially major computational advantages over the classical tests, especially for stepwise variable selection in moderate to large data sets. The alternative criteria considered are Rao's efficient score statistic and two other score statistics. Under the Cox model, the performance of these tests is examined empirically and compared with the performance of the LR and Wald statistics. Rao's test performs comparably to the LR test in all the cases considered. The performance of the other criteria is competitive in many cases. The use of these statistics is illustrated in a study of coronary artery disease.

11.
The transmission/disequilibrium test (TDT) is a popular, simple, and powerful test of linkage, which can be used to analyze data consisting of transmissions to the affected members of families with any kind of pedigree structure, including affected sib pairs (ASPs). Although it is based on the preferential transmission of a particular marker allele across families, it is not a valid test of association for ASPs. Martin et al. devised a similar statistic for ASPs, Tsp, which is also based on preferential transmission of a marker allele but which is a valid test of both linkage and association for ASPs. It is, however, less powerful than the TDT as a test of linkage for ASPs. What I show is that the differences between the TDT and Tsp are due to the fact that, although both statistics are based on preferential transmission of a marker allele, the TDT also exploits excess sharing in identity-by-descent transmissions to ASPs. Furthermore, I show that both of these statistics are members of a family of "TDT-like" statistics for ASPs. The statistics in this family are based on preferential transmission but also, to varying extents, exploit excess sharing. From this family of statistics, we see that, although the TDT exploits excess sharing to some extent, it is possible to do so to a greater extent, and thus produce a more powerful test of linkage for ASPs than is provided by the TDT. Power simulations conducted under a number of disease models are used to verify that the most powerful member of this family of TDT-like statistics is more powerful than the TDT for ASPs.
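As background, the classical TDT itself is a McNemar-type statistic on transmissions from heterozygous parents. The sketch below shows only that standard form, not the Tsp statistic or the TDT-like family discussed in the abstract; the function name and the counts in the example are hypothetical.

```python
import math


def tdt(transmitted, not_transmitted):
    """Classical TDT for one marker allele.

    transmitted / not_transmitted: number of times heterozygous parents
    did / did not pass the allele of interest to an affected child.
    Returns the statistic and its p-value from a chi-squared
    distribution with 1 degree of freedom.
    """
    total = transmitted + not_transmitted
    if total == 0:
        raise ValueError("no informative transmissions")
    statistic = (transmitted - not_transmitted) ** 2 / total
    # Survival function of chi-squared(1): P(X > x) = erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(statistic / 2.0))
    return statistic, p_value


if __name__ == "__main__":
    print(tdt(60, 40))  # hypothetical counts
```

Because the statistic only counts preferential transmission, it ignores the identity-by-descent sharing between sibs that the abstract identifies as an additional source of power.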

12.
The predictive abilities of two-group classification models (CMs) are often expressed in terms of their Cooper statistics. These statistics are often reported without any indication of their uncertainty, making it impossible to judge whether the predicted classifications are significantly better than the predictions made by a different CM, or whether the predictive performance of the CM exceeds predefined performance criteria in a statistically significant way. Bootstrap resampling routines are reported that provide a means of expressing the uncertainty associated with Cooper statistics. The usefulness of the bootstrapping routines is illustrated by constructing 95% confidence intervals for the Cooper statistics of four alternative skin-corrosivity tests (the rat skin transcutaneous electrical resistance assay, EPISKIN, Skin(2) and CORROSITEX), and four two-step sequences in which each in vitro test is used in combination with a physicochemical test for skin corrosion based on pH measurements.
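A percentile bootstrap is one simple way to attach an interval to a Cooper statistic such as sensitivity or specificity. The abstract does not say which resampling scheme the authors used, so the sketch below is an assumption throughout: it resamples (truth, prediction) pairs with replacement, and all names and data are hypothetical.

```python
import random


def sensitivity(pairs):
    """Fraction of true positives predicted positive (None if no positives)."""
    predictions = [pred for truth, pred in pairs if truth]
    return sum(predictions) / len(predictions) if predictions else None


def specificity(pairs):
    """Fraction of true negatives predicted negative (None if no negatives)."""
    predictions = [pred for truth, pred in pairs if not truth]
    return sum(not p for p in predictions) / len(predictions) if predictions else None


def bootstrap_ci(pairs, statistic, n_boot=2000, level=0.95, seed=0):
    """Percentile bootstrap confidence interval for a classification statistic."""
    rng = random.Random(seed)
    values = []
    for _ in range(n_boot):
        sample = [rng.choice(pairs) for _ in pairs]  # resample with replacement
        value = statistic(sample)
        if value is not None:  # skip resamples where the statistic is undefined
            values.append(value)
    if not values:
        raise ValueError("statistic undefined in every resample")
    values.sort()
    tail = (1.0 - level) / 2.0
    lower = values[int(tail * (len(values) - 1))]
    upper = values[int((1.0 - tail) * (len(values) - 1))]
    return lower, upper


if __name__ == "__main__":
    # Hypothetical results: 20 corrosive chemicals (17 detected) and
    # 20 non-corrosive chemicals (16 correctly classified as negative).
    data = ([(True, True)] * 17 + [(True, False)] * 3
            + [(False, False)] * 16 + [(False, True)] * 4)
    print("sensitivity", sensitivity(data), bootstrap_ci(data, sensitivity))
    print("specificity", specificity(data), bootstrap_ci(data, specificity))
```

Two models can then be compared by checking whether their intervals overlap, or, more directly, by bootstrapping the difference between their statistics on the same resamples.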

13.
Zhao J  Jin L  Xiong M 《Genetics》2006,174(3):1529-1538
As millions of single-nucleotide polymorphisms (SNPs) have been identified and high-throughput genotyping technologies have been rapidly developed, large-scale genomewide association studies will soon be within reach. However, since a genomewide association study involves a large number of SNPs, it is nearly impossible to ensure a genomewide significance level of 0.05 using the available statistics, although the multiple-test problems can be alleviated, but not sufficiently, by the use of tagging SNPs. One strategy to circumvent the multiple-test problem associated with genomewide association tests is to develop novel test statistics with high power. In this report, we introduce several nonlinear tests, which are based on nonlinear transformation of allele or haplotype frequencies. We investigate the power of the nonlinear test statistics and demonstrate that under certain conditions, some nonlinear test statistics have much higher power than the standard χ² test statistic. Type I error rates of the nonlinear tests are validated using simulation studies. We also show that a class of similarity measure-based test statistics is based on the quadratic function of allele or haplotype frequencies, and thus they belong to nonlinear tests. To evaluate their performance, the nonlinear test statistics are also applied to three real data sets. Our study shows that nonlinear test statistics have great potential in association studies of complex diseases.

14.
A new statistical test for linkage heterogeneity.
A new statistical test for linkage heterogeneity is described. It is a likelihood-ratio test based on a beta distribution for the prior distribution of the recombination fraction among families (or individuals). The null distribution for this statistic (called the B-test) is derived under a broad range of circumstances. Two other heterogeneity test statistics are also examined: the admixture test, or A-test, first described by Smith, and Morton's test (here referred to as the K-test). The probability distribution for the K-test statistic is very sensitive to family size, whereas the other two statistics are not. All three statistics are somewhat sensitive to the magnitude of the recombination fraction θ. Critical values for each of the test statistics are given. A conservative approximation for both the A-test and B-test is given by a χ² distribution when P/2 instead of P is used for the observed significance level. In terms of power, the B-test performs best among the three tests over a broad range of alternative heterogeneity hypotheses, except for the specific case of admixture with loose linkage, in which the A-test performs best. Overall, the difference in power among the three tests is not large. An application to some recently published data on the fragile-X syndrome and X-chromosome markers is given.

15.
Hothorn T  Zeileis A 《Biometrics》2008,64(4):1263-1269
Maximally selected statistics for the estimation of simple cutpoint models are embedded into a generalized conceptual framework based on conditional inference procedures. This powerful framework contains most of the published procedures in this area as special cases, such as maximally selected χ² and rank statistics, but also allows for direct construction of new test procedures for less standard test problems. As an application, a novel maximally selected rank statistic is derived from this framework for a censored response partitioned with respect to two ordered categorical covariates and potential interactions. This new test is employed to search for a high-risk group of rectal cancer patients treated with a neo-adjuvant chemoradiotherapy. Moreover, a new efficient algorithm for the evaluation of the asymptotic distribution for a large class of maximally selected statistics is given, enabling the fast evaluation of a large number of cutpoints.

16.
Koopman WJ  Gort G 《Genetics》2004,167(4):1915-1928
Many AFLP studies include relatively unrelated genotypes that contribute noise to data sets instead of signal. We developed: (1) estimates of expected AFLP similarities between unrelated genotypes, (2) significance tests for AFLP similarities, enabling the detection of unrelated genotypes, and (3) weighted similarity coefficients, including band position information. Detection of unrelated genotypes and use of weighted similarity coefficients will make the analysis of AFLP data sets more informative and more reliable. Test statistics and weighted coefficients were developed for total numbers of shared bands and for Dice, Jaccard, Nei and Li, and simple matching (dis)similarity coefficients. Theoretical and in silico AFLP fragment length distributions (FLDs) were examined as a basis for the tests. The in silico AFLP FLD based on the Arabidopsis thaliana genome sequence was the most appropriate for angiosperms. The G + C content of the selective nucleotides in the in silico AFLP procedure significantly influenced the FLD. Therefore, separate test statistics were calculated for AFLP procedures with high, average, and low G + C contents in the selective nucleotides. The test statistics are generally applicable for angiosperms with a G + C content of approximately 35-40%, but represent conservative estimates for genotypes with higher G + C contents. For the latter, test statistics based on a rice genome sequence are more appropriate.
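The unweighted coefficients named in this abstract have standard textbook definitions on presence/absence band profiles, sketched below. The weighted variants and significance tests developed in the paper are not reproduced, and the function name and example profiles are hypothetical.

```python
def band_similarities(x, y):
    """Dice, Jaccard and simple matching coefficients for two 0/1 band profiles."""
    if len(x) != len(y):
        raise ValueError("profiles must score the same band positions")
    a = sum(1 for i, j in zip(x, y) if i and j)          # band in both
    b = sum(1 for i, j in zip(x, y) if i and not j)      # band only in x
    c = sum(1 for i, j in zip(x, y) if not i and j)      # band only in y
    d = sum(1 for i, j in zip(x, y) if not i and not j)  # band in neither
    nan = float("nan")
    return {
        "shared_bands": a,
        "dice": 2 * a / (2 * a + b + c) if a + b + c else nan,
        "jaccard": a / (a + b + c) if a + b + c else nan,
        "simple_matching": (a + d) / (a + b + c + d) if x else nan,
    }


if __name__ == "__main__":
    # Hypothetical presence/absence scores at five band positions.
    print(band_similarities([1, 1, 0, 0, 1], [1, 0, 0, 1, 1]))
```

Only simple matching counts shared absences (d) as agreement, which is one reason the coefficients can rank the same pair of genotypes differently.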

17.
In this paper we compare the behaviour of a class of the Cressie and Read (1984) power divergence test statistics I(λ), indexed by the parameter λ, with the modified X² test statistic (LU) proposed by Lawal and Upton (1984), for sparse contingency tables ranging from 3×3 to 10×10. We present a sample of our results here. The results indicate that the LU test outperforms both the Cressie-Read suggested test I(2/3) and Pearson's test I(1). Our results further show that the modification to the likelihood ratio test [Y² = I'(0)] proposed by Williams (1976) performs like the parent Y² test, very poorly compared with the I(2/3), X² or LU test statistics. Power results also indicate that the powers of the LU test are, in all cases considered in this study, slightly higher than those of the X² and I(2/3) tests. The LU test is therefore strongly recommended for use with sparse two-way contingency tables because, in all of the cases considered, none of the other test statistics consistently outperforms the LU test with respect to attained α level or power.

18.
In combining several tests of significance the individual test statistics are allowed to be stochastically dependent. By choosing the weighted inverse normal method for the combination, the dependency of the original test statistics is then characterized by a correlation of the transformed statistics. For this correlation a confidence region, an unbiased estimator and an unbiased estimate of its variance are derived. The combined test statistic is extended to include the case of possibly dependent original test statistics. Simulation studies show the performance of the actual significance level.
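The weighted inverse normal combination can be sketched as below. To keep the example short it assumes a single, known common correlation rho between all pairs of transformed statistics, whereas the paper estimates this correlation; the function name is hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def weighted_inverse_normal(p_values, weights, rho=0.0):
    """Combine one-sided p-values (each strictly between 0 and 1).

    Returns the combined z statistic and its one-sided p-value.
    rho is an assumed common correlation of the transformed statistics.
    """
    std_normal = NormalDist()
    z_scores = [std_normal.inv_cdf(1.0 - p) for p in p_values]
    numerator = sum(w * z for w, z in zip(weights, z_scores))
    sum_w = sum(weights)
    sum_w2 = sum(w * w for w in weights)
    # Var(sum w_i Z_i) = sum w_i^2 + rho * ((sum w_i)^2 - sum w_i^2)
    variance = sum_w2 + rho * (sum_w ** 2 - sum_w2)
    z_combined = numerator / sqrt(variance)
    return z_combined, 1.0 - std_normal.cdf(z_combined)


if __name__ == "__main__":
    print(weighted_inverse_normal([0.05, 0.05], [1.0, 1.0]))           # independent
    print(weighted_inverse_normal([0.05, 0.05], [1.0, 1.0], rho=0.5))  # dependent
```

Ignoring a positive correlation understates the variance in the denominator and so makes the combined test anticonservative, which is why the dependence has to enter the combined statistic.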

19.
False discovery rate (FDR) methodologies are essential in the study of high-dimensional genomic and proteomic data. The R package 'fdrtool' facilitates such analyses by offering a comprehensive set of procedures for FDR estimation. Its distinctive features include: (i) many different types of test statistics are allowed as input data, such as P-values, z-scores, correlations and t-scores; (ii) simultaneously, both local FDR and tail area-based FDR values are estimated for all test statistics and (iii) empirical null models are fit where possible, thereby taking account of potential over- or underdispersion of the theoretical null. In addition, 'fdrtool' provides readily interpretable graphical output, and can be applied to very large scale (on the order of millions of hypotheses) multiple testing problems. Consequently, 'fdrtool' implements a flexible FDR estimation scheme that is unified across different test statistics and variants of FDR. AVAILABILITY: The program is freely available from the Comprehensive R Archive Network (http://cran.r-project.org/) under the terms of the GNU General Public License (version 3 or later). CONTACT: strimmer@uni-leipzig.de.

20.
Sano A  Tachida H 《Genetics》2005,169(3):1687-1697
We consider the Wright-Fisher model with exponential population growth and investigate effects of population growth on the shape of genealogy and the distributions of several test statistics of neutrality. In the limiting case as the population grows rapidly, the rapid-growth-limit genealogy is characterized. We obtained approximate expressions for expectations and variances of test statistics in the rapid-growth-limit genealogy and star genealogy. The distributions in the star genealogy are narrower than those in the cases of the simulated and rapid-growth-limit genealogies. The expectations and variances of the test statistics are monotone decreasing functions of the time length of the expansion, and the higher power of R2 against population growth is suggested to be due to their smaller variances rather than to change of the expectations. We also investigated by simulation how quickly the distributions of test statistics approach those of the rapid-growth-limit genealogy.
