Similar Articles
20 similar articles found
1.
Wu CC, Amos CI. Human Heredity 2003;55(4):153-162
Genetic linkage analysis is a powerful tool for the identification of disease susceptibility loci. Among the most commonly applied genetic linkage strategies are affected sib-pair tests, but the statistical properties of these tests have not been well characterized. Here, we present a study of the distribution of affected sib-pair tests comparing the type I error rate and the power of the mean test and the proportion test, which are the most commonly used, along with a novel exact test. In contrast to existing literature, our findings showed that the mean and proportion tests have inflated type I error rates, especially when used with small samples. We developed and applied corrections to the tests which provide an excellent adjustment to the type I error rate for both small and large samples. We also developed a novel approach to identify the areas of higher power for the mean test versus the proportion test, providing a wider and simpler comparison with fewer assumptions about parameter values than existing approaches require.

2.
In attempting to improve the efficiency of McNemar's test statistic, we develop two test procedures that account for the information on both the discordant and concordant pairs for testing equality between two comparison groups in dichotomous data with matched pairs. Furthermore, we derive a test procedure from one of the most commonly used interval estimators for the odds ratio. We compare these procedures with those using McNemar's test, McNemar's test with the continuity correction, and the exact test with respect to type I error and power in a variety of situations. We note that the test procedures using McNemar's test with the continuity correction and the exact test can be quite conservative and hence lose much efficiency, while the test procedure using McNemar's test can actually perform well even when the expected number of discordant pairs is small. We also find that the two test procedures, which incorporate the information on all matched pairs into hypothesis testing, may slightly improve the power of using McNemar's test without essentially losing the precision of type I error. On the other hand, the test procedure derived from an interval estimator of the odds ratio with use of the logarithmic transformation may have type I error much larger than the nominal α-level when the expected number of discordant pairs is not large and therefore is not recommended for general use.
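As a point of reference for the tests compared above: McNemar's statistic uses only the two discordant counts b and c. A minimal sketch (function name and example counts are illustrative; the 1-df chi-square approximation assumes b + c is not too small, which is exactly the regime the abstract cautions about):

```python
import math

def mcnemar(b, c, continuity=True):
    """McNemar's test for paired dichotomous data.

    b, c: counts of the two kinds of discordant pairs.
    Returns (chi-square statistic, two-sided p-value) from the 1-df
    chi-square approximation; an exact binomial test is preferable
    when b + c is small.
    """
    if b + c == 0:
        return 0.0, 1.0
    diff = abs(b - c) - (1 if continuity else 0)
    stat = max(diff, 0) ** 2 / (b + c)
    # Survival function of chi-square with 1 df: P(X >= x) = erfc(sqrt(x/2))
    p = math.erfc(math.sqrt(stat / 2))
    return stat, p

# Illustrative counts: 25 vs. 10 discordant pairs
stat, p = mcnemar(25, 10, continuity=False)   # stat ≈ 6.43, p ≈ 0.011
stat_c, p_c = mcnemar(25, 10)                 # continuity correction: larger p
```

As the abstract notes, the continuity-corrected version is more conservative: here it yields a larger p-value for the same data.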

3.
Implementing false discovery rate control: increasing your power
Popular procedures to control the chance of making type I errors when multiple statistical tests are performed come at a high cost: a reduction in power. As the number of tests increases, power for an individual test may become unacceptably low. This is a consequence of minimizing the chance of making even a single type I error, which is the aim of, for instance, the Bonferroni and sequential Bonferroni procedures. An alternative approach, control of the false discovery rate (FDR), has recently been advocated for ecological studies. This approach aims at controlling the proportion of significant results that are in fact type I errors. Keeping the proportion of type I errors low among all significant results is a sensible, powerful, and easy-to-interpret way of addressing the multiple testing issue. To encourage practical use of the approach, in this note we illustrate how the proposed procedure works, we compare it to more traditional methods that control the familywise error rate, and we discuss some recent useful developments in FDR control.
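For concreteness, the FDR-controlling procedure usually meant in this context is the Benjamini-Hochberg step-up rule: sort the m p-values, find the largest k with p(k) <= k·q/m, and reject the k smallest. A minimal, self-contained sketch (the p-values below are invented for illustration):

```python
def benjamini_hochberg(pvalues, q=0.05):
    """Benjamini-Hochberg step-up procedure controlling the FDR at level q.

    Returns a list of booleans marking which hypotheses are rejected:
    sort p-values ascending, find the largest rank k with
    p_(k) <= k * q / m, and reject the hypotheses with the k smallest
    p-values.
    """
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    k_max = 0
    for rank, i in enumerate(order, start=1):
        if pvalues[i] <= rank * q / m:
            k_max = rank
    reject = [False] * m
    for rank, i in enumerate(order, start=1):
        if rank <= k_max:
            reject[i] = True
    return reject

# Invented p-values from m = 10 hypothetical tests
pvals = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205, 0.212, 0.36]
rej = benjamini_hochberg(pvals, q=0.05)  # rejects the two smallest p-values
```

On these numbers BH rejects two hypotheses, whereas a Bonferroni cutoff of 0.05/10 = 0.005 would reject only one: the power gain the title refers to.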

4.
Kang SH, Shin D. Human Heredity 2004;58(1):10-17
Many scientific problems can be formulated in terms of a statistical model indexed by parameters, only some of which are of scientific interest; the others, called nuisance parameters, are not of interest in themselves. For testing the Hardy-Weinberg law, the relation among genotype and allele probabilities is of interest, while the allele probabilities themselves are not and become nuisance parameters. In this paper we investigate how the size of the chi-square test for the Hardy-Weinberg law (the maximum of the type I error rate over the nuisance parameter space, i.e., the worst-case type I error rate) is affected by the nuisance parameters. Whether the size is well controlled under the nominal level is a basic property of a statistical test and has been frequently investigated. We prove that the size is always greater than the nominal level for sufficiently large samples. Extensive computations show that the size deviates further above the nominal level as the sample size gets larger, and that the value of the nuisance parameter at which the maximum type I error rate occurs moves closer to the edges of the nuisance parameter space with increasing sample size. An exact test is recommended as an alternative when the type I error is inflated.
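The test under study is the standard 1-df Pearson chi-square goodness-of-fit test of Hardy-Weinberg proportions, in which the allele frequency (the nuisance parameter) is estimated from the data. A minimal sketch for a biallelic marker (counts are illustrative; the code assumes both alleles are present so no expected count is zero):

```python
import math

def hwe_chisq(n_AA, n_Aa, n_aa):
    """Pearson chi-square test of Hardy-Weinberg proportions for a
    biallelic marker: 1 df after estimating the allele frequency,
    which is the nuisance parameter discussed above.
    Assumes both alleles occur, so all expected counts are positive.
    """
    n = n_AA + n_Aa + n_aa
    p = (2 * n_AA + n_Aa) / (2 * n)  # estimated frequency of allele A
    exp = (n * p * p, 2 * n * p * (1 - p), n * (1 - p) * (1 - p))
    obs = (n_AA, n_Aa, n_aa)
    stat = sum((o - e) ** 2 / e for o, e in zip(obs, exp))
    pval = math.erfc(math.sqrt(stat / 2))  # chi-square(1 df) tail probability
    return stat, pval

# Illustrative genotype counts near HWE, then far from it
stat1, p1 = hwe_chisq(30, 50, 20)  # close to HW proportions: large p
stat2, p2 = hwe_chisq(50, 10, 40)  # heterozygote deficit: tiny p
```

Note this gives only the pointwise type I error at the estimated allele frequency; the abstract's point is that the *maximum* of this error over all allele frequencies exceeds the nominal level.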

5.
Hao K, Wang X. Human Heredity 2004;58(3-4):154-163
OBJECTIVES: Genotyping error commonly occurs and can reduce power and bias statistical inference in genetic studies. In addition to genotypes, some automated biotechnologies also provide a quality measurement for each individual genotype. We studied the relationship between this quality measurement and the genotyping error rate. Furthermore, we propose two association tests incorporating the genotyping quality information, with the goal of improving statistical power and inference. METHODS: 50 pairs of DNA sample duplicates were typed for 232 SNPs by BeadArray technology. We used scatter plots, smoothing functions and generalized additive models to investigate the relationship between the genotype quality score (q) and the inconsistency rate among duplicates. We constructed two association tests to incorporate the individual genotype error rate (epsilon(i)) in an unmatched case-control setting: (1) a weighted contingency table test (WCT) and (2) a likelihood ratio test (LRT). RESULTS: In the 50 duplicates, we found that q and the inconsistency rate were strongly negatively associated, suggesting that genotypes with low quality scores were more likely to be mistyped. The WCT improved statistical power and partially corrected the bias in point estimation. The LRT offered a moderate power gain but was able to correct the bias in odds ratio estimation. The two new methods also performed favorably in some scenarios where epsilon(i) was mis-specified. CONCLUSIONS: With the increasing number of genetic studies and the application of automated genotyping technology, there is a growing need to adequately account for individual genotype error rates in statistical analysis. Our study represents an initial step toward addressing this need and points out a promising direction for further research.

6.
Observed variations in rates of taxonomic diversification have been attributed to a range of factors including biological innovations, ecosystem restructuring, and environmental changes. Before inferring causality of any particular factor, however, it is critical to demonstrate that the observed variation in diversity is significantly greater than that expected from natural stochastic processes. Relative tests that assess whether observed asymmetry in species richness between sister taxa in monophyletic pairs is greater than would be expected under a symmetric model have been used widely in studies of rate heterogeneity and are particularly useful for groups in which paleontological data are problematic. Although one such test introduced by Slowinski and Guyer a decade ago has been applied to a wide range of clades and evolutionary questions, the statistical behavior of the test has not been examined extensively, particularly when used with Fisher's procedure for combining probabilities to analyze data from multiple independent taxon pairs. Here, certain pragmatic difficulties with the Slowinski-Guyer test are described, further details of the development of a recently introduced likelihood-based relative rates test are presented, and standard simulation procedures are used to assess the behavior of the two tests in a range of situations to determine: (1) the accuracy of the tests' nominal Type I error rate; (2) the statistical power of the tests; (3) the sensitivity of the tests to inclusion of taxon pairs with few species; (4) the behavior of the tests with datasets comprised of few taxon pairs; and (5) the sensitivity of the tests to certain violations of the null model assumptions. 
Our results indicate that in most biologically plausible scenarios, the likelihood-based test has superior statistical properties in terms of both Type I error rate and power, and we found no scenario in which the Slowinski-Guyer test was distinctly superior, although the degree of the discrepancy varies among the different scenarios. The Slowinski-Guyer test tends to be much more conservative (i.e., very disinclined to reject the null hypothesis) in datasets with many small pairs. In most situations, the performance of both the likelihood-based test and particularly the Slowinski-Guyer test improves when pairs with few species are excluded from the computation, although this is balanced against a decline in the tests' power and accuracy as fewer pairs are included in the dataset. The performance of both tests is quite poor when they are applied to datasets in which the taxon sizes do not conform to the distribution implied by the usual null model. Thus, results of analyses of taxonomic rate heterogeneity using the Slowinski-Guyer test can be misleading because the test's ability to reject the null hypothesis (equal rates) when true is often inaccurate and its ability to reject the null hypothesis when the alternative (unequal rates) is true is poor, particularly when small taxon pairs are included. Although not always perfect, the likelihood-based test provides a more accurate and powerful alternative as a relative rates test.

7.
OBJECTIVE: In affected sib pair studies without genotyped parents the effect of genotyping error is generally to reduce the type I error rate and power of tests for linkage. The effect of genotyping error when parents have been genotyped is unknown. We investigated the type I error rate of the single-point Mean test for studies in which genotypes of both parents are available. METHODS: Datasets were simulated assuming no linkage and one of five models for genotyping error. In each dataset, Mendelian-inconsistent families were either excluded or regenotyped, and then the Mean test applied. RESULTS: We found that genotyping errors lead to an inflated type I error rate when inconsistent families are excluded. Depending on the genotyping-error model assumed, regenotyping inconsistent families has one of several effects. It may produce the same type I error rate as if inconsistent families are excluded; it may reduce the type I error, but still leave an anti-conservative test; or it may give a conservative test. Departures of the type I error rate from its nominal level increase with both the genotyping error rate and sample size. CONCLUSION: We recommend that markers with high error rates either be excluded from the analysis or be regenotyped in all families.
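For orientation, the single-point Mean test referred to above compares the average proportion of alleles shared identical-by-descent (IBD) by affected sib pairs with its null value of 1/2 (per-pair variance 1/8 under no linkage). A minimal one-sided sketch under the simplifying assumption of a fully informative marker; the data below are invented:

```python
import math

def mean_test(ibd_counts):
    """Single-point Mean test for n affected sib pairs.

    ibd_counts: alleles shared IBD (0, 1, or 2) per pair, assumed
    fully informative. Under no linkage the mean proportion shared
    is 1/2 with per-pair variance 1/8, so z = (mean - 0.5) * sqrt(8n);
    a large positive z suggests linkage. Returns (z, one-sided p).
    """
    n = len(ibd_counts)
    mean_prop = sum(ibd_counts) / (2 * n)
    z = (mean_prop - 0.5) * math.sqrt(8 * n)
    p = 0.5 * math.erfc(z / math.sqrt(2))  # one-sided normal tail
    return z, p

# Invented sample: 100 pairs with excess sharing
ibd = [0] * 20 + [1] * 45 + [2] * 35
z, p = mean_test(ibd)  # z ≈ 2.12, p ≈ 0.017
```

This asymptotic-normal form is what becomes anti-conservative under the genotyping-error scenarios the abstract describes.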

8.

Background

Spurious associations between single nucleotide polymorphisms and phenotypes are a major issue in genome-wide association studies and have led to underestimation of type 1 error rate and overestimation of the number of quantitative trait loci found. Many authors have investigated the influence of population structure on the robustness of methods by simulation. This paper is aimed at developing further the algebraic formalization of power and type 1 error rate for some of the classical statistical methods used: simple regression, two approximate methods of mixed models involving the effect of a single nucleotide polymorphism (SNP) and a random polygenic effect (GRAMMAR and FASTA) and the transmission/disequilibrium test for quantitative traits and nuclear families. Analytical formulae were derived using matrix algebra for the first and second moments of the statistical tests, assuming a true mixed model with a polygenic effect and SNP effects.

Results

The expectation and variance of the test statistics and their marginal expectations and variances according to the distribution of genotypes and estimators of variance components are given as a function of the relationship matrix and of the heritability of the polygenic effect. These formulae were used to compute type 1 error rate and power for any kind of relationship matrix between phenotyped and genotyped individuals and for any level of heritability. The type 1 error rate increased with the variability of relationships and with heritability for the simple regression method, decreased with the GRAMMAR method, and was not affected by the FASTA and quantitative transmission/disequilibrium test methods.

Conclusions

The formulae can be easily used to provide the correct threshold of type 1 error rate and to calculate the power when designing experiments or data collection protocols. The results concerning the efficacy of each method agree with simulation results in the literature but were generalized in this work. The power of the GRAMMAR method was equal to the power of the FASTA method at the same type 1 error rate. The power of the quantitative transmission/disequilibrium test was low. In conclusion, the FASTA method, which is very close to the full mixed model, is recommended in association mapping studies.

9.
Regression models are often used to test for cause-effect relationships from data collected in randomized trials or experiments. This practice has deservedly come under heavy scrutiny, because commonly used models such as linear and logistic regression will often not capture the actual relationships between variables, and incorrectly specified models potentially lead to incorrect conclusions. In this article, we focus on hypothesis tests of whether the treatment given in a randomized trial has any effect on the mean of the primary outcome, within strata of baseline variables such as age, sex, and health status. Our primary concern is ensuring that such hypothesis tests have correct type I error for large samples. Our main result is that for a surprisingly large class of commonly used regression models, standard regression-based hypothesis tests (but using robust variance estimators) are guaranteed to have correct type I error for large samples, even when the models are incorrectly specified. To the best of our knowledge, this robustness of such model-based hypothesis tests to incorrectly specified models was previously unknown for Poisson regression models and for other commonly used models we consider. Our results have practical implications for understanding the reliability of commonly used, model-based tests for analyzing randomized trials.

10.
Studies using haplotypes of multiple tightly linked markers are more informative than those using a single marker. However, studies based on multimarker haplotypes have some difficulties. First, if we consider each haplotype as an allele and use the conventional single-marker transmission/disequilibrium test (TDT), then the rapid increase in the degrees of freedom with an increasing number of markers means that the statistical power of the conventional tests will be low. Second, the parental haplotypes cannot always be unambiguously reconstructed. In the present article, we propose a haplotype-sharing TDT (HS-TDT) for linkage or association between a disease-susceptibility locus and a chromosome region in which several tightly linked markers have been typed. This method is applicable to both quantitative traits and qualitative traits. It is applicable to any size of nuclear family, with or without ambiguous phase information, and it is applicable to any number of alleles at each of the markers. The degrees of freedom (in a broad sense) of the test increase linearly as the number of markers considered increases but do not increase as the number of alleles at the markers increases. Our simulation results show that the HS-TDT has the correct type I error rate in structured populations and that, in most cases, the power of HS-TDT is higher than the power of the existing single-marker TDTs and haplotype-based TDTs.

11.
We present a new method of quantitative-trait linkage analysis that combines the simplicity and robustness of regression-based methods and the generality and greater power of variance-components models. The new method is based on a regression of estimated identity-by-descent (IBD) sharing between relative pairs on the squared sums and squared differences of trait values of the relative pairs. The method is applicable to pedigrees of arbitrary structure and to pedigrees selected on the basis of trait value, provided that population parameters of the trait distribution can be correctly specified. Ambiguous IBD sharing (due to incomplete marker information) can be accommodated in the method by appropriate specification of the variance-covariance matrix of IBD sharing between relative pairs. We have implemented this regression-based method and have performed simulation studies to assess, under a range of conditions, estimation accuracy, type I error rate, and power. For normally distributed traits and in large samples, the method is found to give the correct type I error rate and an unbiased estimate of the proportion of trait variance accounted for by the additive effects of the locus, although in cases where asymptotic theory is doubtful, significance levels should be checked by simulations. In large sibships, the new method is slightly more powerful than variance-components models. The proposed method provides a practical and powerful tool for the linkage analysis of quantitative traits.

12.
The central issue for Genetic Analysis Workshop 14 (GAW14) is the question of which is the better strategy for linkage analysis: the use of single-nucleotide polymorphisms (SNPs) or microsatellite markers? To answer this question we analyzed the simulated data using Duffy's SIB-PAIR program, which can incorporate parental genotypes, and our identity-by-state/identity-by-descent (IBS-IBD) transformation method of affected sib-pair linkage analysis, which uses the matrix transformation between IBS and IBD. The advantages of our method are as follows: the assumption of Hardy-Weinberg equilibrium is not necessary; the parental genotype information may be entirely unknown; both IBS and its related IBD transformation can be used in the linkage analysis; and the determinant of the IBS-IBD transformation matrix provides a quantitative measure of the quality of the marker in linkage analysis. With the originally distributed simulated data, we found that 1) for microsatellite markers there are virtually no differences in type I and type II error rates when parental genotypes were or were not used; 2) on average, a microsatellite marker has more power than a SNP marker in linkage detection; 3) if parental genotype information is used, SNP markers show lower type I error rates than microsatellite markers; and 4) if parental genotypes are not available, SNP markers show considerable variation in type I error rates for different methods.

13.
The Mantel test, based on comparisons of distance matrices, is commonly employed in comparative biology, but its statistical properties in this context are unknown. Here, we evaluate the performance of the Mantel test for two applications in comparative biology: testing for phylogenetic signal, and testing for an evolutionary correlation between two characters. We find that the Mantel test has poor performance compared to alternative methods, including low power and, under some circumstances, inflated type I error. We identify a remedy for the inflated type I error of three-way Mantel tests using phylogenetic permutations; however, this test still has considerably lower power than independent contrasts. We recommend that use of the Mantel test should be restricted to cases in which data can only be expressed as pairwise distances among taxa.
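For readers unfamiliar with it, the simple (two-way) Mantel test correlates the upper-triangle entries of two distance matrices and assesses significance by permuting the taxa of one matrix. A pure-Python sketch — matrix values, seed, and permutation count are all illustrative, and this is the plain test, not the phylogenetic-permutation remedy the abstract proposes:

```python
import random

def mantel(dist_x, dist_y, n_perm=999, seed=1):
    """Simple Mantel test: permutation p-value for the Pearson
    correlation between two n x n symmetric distance matrices,
    permuting rows and columns of dist_y simultaneously."""
    n = len(dist_x)
    idx = [(i, j) for i in range(n) for j in range(i + 1, n)]

    def corr(order):
        xs = [dist_x[i][j] for i, j in idx]
        ys = [dist_y[order[i]][order[j]] for i, j in idx]
        mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
        sxy = sum((a - mx) * (b - my) for a, b in zip(xs, ys))
        sxx = sum((a - mx) ** 2 for a in xs)
        syy = sum((b - my) ** 2 for b in ys)
        return sxy / (sxx * syy) ** 0.5

    rng = random.Random(seed)
    observed = corr(list(range(n)))
    hits = sum(1 for _ in range(n_perm)
               if corr(rng.sample(range(n), n)) >= observed)
    return observed, (hits + 1) / (n_perm + 1)

# Toy example: dist_y is an exact rescaling of dist_x, so the
# observed correlation is 1 and the p-value should be small.
n = 6
rnd = random.Random(7)
vals = [[0.0] * n for _ in range(n)]
for i in range(n):
    for j in range(i + 1, n):
        vals[i][j] = vals[j][i] = rnd.random()
dist_x = vals
dist_y = [[2 * v for v in row] for row in vals]
r, p = mantel(dist_x, dist_y)
```

With taxa that share phylogenetic history, these raw permutations are exactly what inflates the type I error the abstract documents.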

14.
Zhongxue Chen, Qingzhong Liu, Kai Wang. Genomics 2019;111(5):1152-1159
Gene- and pathway-based variant association tests are important tools in finding genetic variants that are associated with phenotypes of interest. Although some methods have been proposed in the literature, powerful and robust statistical tests are still desirable in this area. In this study, we propose a statistical test based on decomposing the genotype data into orthogonal parts from which powerful and robust independent p-value combination approaches can be utilized. Through a comprehensive simulation study, we compare the proposed test with some existing popular ones. Our simulation results show that the new test has great performance in terms of controlling type I error rate and statistical power. Real data applications are also conducted to illustrate the performance and usefulness of the proposed test.

15.
There has been considerable debate in the literature concerning bias in case-control association mapping studies due to population stratification. In this paper, we perform a theoretical analysis of the effects of population stratification by measuring the inflation in the test's type I error (or false-positive rate). Using a model of stratified sampling, we derive an exact expression for the type I error as a function of population parameters and sample size. We give necessary and sufficient conditions for the bias to vanish when there is no statistical association between disease and marker genotype in each of the subpopulations making up the total population. We also investigate the variation of bias with increasing subpopulations and show, both theoretically and by using simulations, that the bias can sometimes be quite substantial even with a very large number of subpopulations. In a companion simulation-based paper (Heiman et al., Part I, this issue), we have focused on the CRR (confounding risk ratio) and its relationship to the type I error in the case of two subpopulations, and have also quantified the magnitude of the type I error that can occur with relatively low CRR values.

16.
Several independent clinical trials are usually conducted to demonstrate and support the evidence of the efficacy of a new drug. When not all the trials demonstrate a treatment effect because of a lack of statistically significant findings, the sponsor sometimes conducts a post hoc pooled test and uses the pooled result as extra statistical evidence. In this paper, we study the extent of type I error rate inflation with the post hoc pooled analysis and the power of the interaction test in assessing the homogeneity of the trials with respect to treatment effect size. We also compare the power of several test procedures with or without the pooled test involved and discuss the appropriateness of pooled tests under different alternative hypotheses.

17.
Extreme discordant sibling pairs (EDSPs) are theoretically powerful for the mapping of quantitative-trait loci (QTLs) in humans. EDSPs have not been used much in practice, however, because of the need to screen very large populations to find enough pairs that are extreme and discordant. Given appropriate statistical methods, another alternative is to use moderately discordant sibling pairs (MDSPs): pairs that are discordant but not at the far extremes of the distribution. Such pairs can be powerful yet far easier to collect than extreme discordant pairs. Recent work on statistical methods for QTL mapping in humans has included a number of methods that, though not developed specifically for discordant pairs, may well be powerful for MDSPs and possibly even EDSPs. In the present article, we survey the new statistics and discuss their applicability to discordant pairs. We then use simulation to study the type I error and the power of various statistics for EDSPs and for MDSPs. We conclude that the best statistics for discordant pairs (moderate or extreme) are to be found among the new statistics. We suggest that the new statistics are appropriate for many other designs as well, and that, in fact, they open the way for the exploration of entirely novel designs.

18.
The maximum-likelihood-binomial (MLB) method, based on the binomial distribution of parental marker alleles among affected offspring, recently was shown to provide promising results by two-point linkage analysis of affected-sibship data. In this article, we extend the MLB method to multipoint linkage analysis, using the general framework of hidden Markov models. Furthermore, we perform a large simulation study to investigate the robustness and power of the MLB method, compared with those of the maximum-likelihood-score (MLS) method as implemented in MAPMAKER/SIBS, in the multipoint analysis of different affected-sibship samples. Analyses of multiple-affected sibships by means of the MLS were conducted by consideration of all possible sib pairs, with (weighted MLS [MLSw]) or without (unweighted MLS [MLSu]) application of a classic weighting procedure. In simulations under the null hypothesis, the MLB provided very consistent type I errors regardless of the type of family sample (sib pairs or multiple-affected sibships), as did the MLS for samples with sib pairs only. When samples included multiple-affected sibships, the MLSu led to inflation of low type I errors, whereas the MLSw yielded very conservative tests. Power comparisons showed that the MLB generally was more powerful than the MLS, except in recessive models with allele frequencies <.3. Missing parental marker data did not strongly influence type I error and power results in these multipoint analyses. The MLB approach, which in a natural way accounts for multiple-affected sibships and which provides a simple likelihood-ratio test for linkage, is an interesting alternative for multipoint analysis of sibships.

19.
We examine the efficiency of a number of schemes to select cases from nuclear families for case-control association analysis using the Genetic Analysis Workshop 14 simulated dataset. We show that with this simulated dataset comparing all affected siblings with unrelated controls is considerably more powerful than all of the other approaches considered. We find that the test statistic is increased by almost 3-fold compared to the next best sampling schemes of selecting all affected sibs only from families with affected parents (AF aff), one affected sib with most evidence of allele-sharing from each family (SF), and all affected sibs from families with evidence for linkage (AF L). We consider accounting for biological relatedness of samples in the association analysis to maintain the correct type I error. We also discuss the relative efficiencies of increasing the ratio of unrelated cases to controls, methods to confirm associations and issues to consider when applying our conclusions to other complex disease datasets.

20.
Tests for equal relative variation are valuable and frequently used tools for evaluating hypotheses about taxonomic heterogeneity in fossil hominids. In this study, Monte Carlo methods and simulated data are used to evaluate and compare 11 tests for equal relative variation. The tests evaluated include CV-based parametric bootstrap tests, modifications of Levene's test, and modified weighted scores tests. The results of these simulations show that a modified version of the weighted scores test developed by Fligner and Killeen ([1976] J. Am. Stat. Assoc. 71:210-213) is the only test that maintains an acceptable balance of type I and type II errors, even under conditions where all other tests have extraordinarily high type I error rates or little power.
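To make the family of "modifications of Levene's test" concrete: the Brown-Forsythe variant replaces each observation by its absolute deviation from the group median and runs a one-way ANOVA on those deviations; pairing the statistic with a permutation p-value avoids distributional assumptions. This is an illustrative sketch, not the modified Fligner-Killeen weighted scores test the study recommends; applied to log-transformed measurements it gives an approximate test of equal CVs. Data and names are invented:

```python
import random
from statistics import median

def brown_forsythe_W(groups):
    """Brown-Forsythe variant of Levene's statistic: one-way ANOVA F
    computed on absolute deviations from each group's median."""
    devs = [[abs(x - median(g)) for x in g] for g in groups]
    N = sum(len(d) for d in devs)
    k = len(devs)
    grand = sum(sum(d) for d in devs) / N
    means = [sum(d) / len(d) for d in devs]
    ss_between = sum(len(d) * (mu - grand) ** 2 for d, mu in zip(devs, means))
    ss_within = sum(sum((x - mu) ** 2 for x in d) for d, mu in zip(devs, means))
    return (ss_between / (k - 1)) / (ss_within / (N - k))

def perm_pvalue(groups, n_perm=999, seed=3):
    """Permutation p-value: shuffle observations among the groups and
    count permuted statistics at least as large as the observed one."""
    sizes = [len(g) for g in groups]
    pooled = [x for g in groups for x in g]
    obs = brown_forsythe_W(groups)
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        it = iter(pooled)
        perm_groups = [[next(it) for _ in range(s)] for s in sizes]
        if brown_forsythe_W(perm_groups) >= obs:
            hits += 1
    return (hits + 1) / (n_perm + 1)

# Toy example: two groups with very different spread
g1 = [1.0, 2, 3, 4, 5, 6, 7, 8, 9, 10]
g2 = [5.0, 5.1, 5.2, 5.3, 5.4, 5.5, 5.6, 5.7, 5.8, 5.9]
w = brown_forsythe_W([g1, g2])
p = perm_pvalue([g1, g2])  # small p: the spreads differ
```

The Monte Carlo comparison in the study above does essentially this kind of simulation, at scale, for all 11 competing tests.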
