Similar articles
 20 similar articles found (search time: 31 ms)
1.
MOTIVATION: An important application of microarray experiments is to identify differentially expressed genes. Because microarray data are often not normally distributed, nonparametric methods have been suggested for their statistical analysis. Here, the Baumgartner-Weiss-Schindler test, a novel and powerful test based on ranks, is investigated and compared with the parametric t-test as well as with two other nonparametric tests (Wilcoxon rank sum test, Fisher-Pitman permutation test) recently recommended for the analysis of gene expression data. RESULTS: Simulation studies show that an exact permutation test based on the Baumgartner-Weiss-Schindler statistic B is preferable to the other three tests. It is less conservative than the Wilcoxon test and more powerful, in particular for asymmetric or heavy-tailed distributions. When the underlying distribution is symmetric, the differences in power between the tests are relatively small. Thus, the Baumgartner-Weiss-Schindler test is recommended for the usual situation in which the underlying distribution is a priori unknown. AVAILABILITY: SAS code available on request from the authors.
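To make the idea of an exact permutation test concrete, the following is a minimal sketch of the Fisher-Pitman permutation test named in the abstract, not the authors' SAS code. The function name `fisher_pitman_exact` and the choice of the difference in means as the statistic are assumptions made for illustration; the same enumeration could in principle be run with the statistic B instead.

```python
from itertools import combinations


def fisher_pitman_exact(x, y):
    """Two-sided exact permutation p-value for a difference in group means.

    Hypothetical helper: enumerates every way of assigning len(x) of the
    pooled observations to the first group, so it is only feasible for
    small samples.
    """
    pooled = list(x) + list(y)
    n = len(x)
    m = len(pooled) - n
    total = sum(pooled)

    def mean_diff(sum_first):
        # Difference in means implied by the sum of the first group.
        return sum_first / n - (total - sum_first) / m

    observed = abs(mean_diff(sum(x)))
    extreme = 0
    n_splits = 0
    for idx in combinations(range(len(pooled)), n):
        n_splits += 1
        s = sum(pooled[i] for i in idx)
        # Small tolerance so that ties with the observed value count as extreme.
        if abs(mean_diff(s)) >= observed - 1e-12:
            extreme += 1
    return extreme / n_splits


p = fisher_pitman_exact([1, 2, 3], [4, 5, 6])
```

With three observations per group there are only 20 possible splits, so the smallest attainable two-sided p-value is 2/20. This discreteness is one reason exact tests on small samples can be conservative.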

2.
Summary: This article proposes new tests to compare the vaccine and placebo groups in randomized vaccine trials when a small fraction of volunteers become infected. A simple approach that is consistent with the intent‐to‐treat principle is to assign a score, say W, equal to 0 for the uninfecteds and some postinfection outcome X > 0 for the infecteds. One can then test the equality of this skewed distribution of W between the two groups. This burden of illness (BOI) test was introduced by Chang, Guess, and Heyse (1994, Statistics in Medicine 13, 1807–1814). If infections are rare, the massive number of 0s in each group tends to dilute the vaccine effect and this test can have poor power, particularly if the X's are not close to zero. Comparing X in just the infecteds is no longer a comparison of randomized groups and can produce misleading conclusions. Gilbert, Bosch, and Hudgens (2003, Biometrics 59, 531–541) and Hudgens, Hoering, and Self (2003, Statistics in Medicine 22, 2281–2298) introduced tests of the equality of X in a subgroup: the principal stratum of those “doomed” to be infected under either randomization assignment. This can be more powerful than the BOI approach, but requires unexaminable assumptions. We suggest new “chop‐lump” Wilcoxon and t‐tests (CLW and CLT) that can be more powerful than the BOI tests in certain situations. When the numbers of volunteers in the two groups are equal, the chop‐lump tests remove an equal number of zeros from both groups and then perform a test on the remaining W's, which are mostly >0. A permutation approach provides a null distribution. We show that under local alternatives, the CLW test is always more powerful than the usual Wilcoxon test provided the true vaccine and placebo infection rates are the same. We also identify the crucial role of the “gap” between 0 and the X's on power for the t‐tests. The chop‐lump tests are compared to established tests via simulation for planned HIV and malaria vaccine trials. A reanalysis of the first phase III HIV vaccine trial is used to illustrate the method.
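The chop step can be sketched roughly as follows. This is an illustration under stated assumptions, not the published procedure: it assumes equal group sizes, removes as many zeros as both arms can spare, uses a plain difference in means of the retained scores as a stand-in for the CLT and CLW statistics, and re-chops inside every random relabelling. The names `chop`, `mean_difference` and `chop_lump_p_value` are hypothetical.

```python
import random


def chop(vaccine, placebo):
    """Remove the same number of zeros from each arm (assumed: as many as possible)."""
    k = min(vaccine.count(0), placebo.count(0))
    # All scores are >= 0, so sorting puts the zeros first and slicing drops k of them.
    return sorted(vaccine)[k:], sorted(placebo)[k:]


def mean_difference(vaccine, placebo):
    """Difference in mean retained score (vaccine minus placebo) after chopping."""
    v, p = chop(vaccine, placebo)
    if not v or not p:
        return 0.0  # nothing left to compare
    return sum(v) / len(v) - sum(p) / len(p)


def chop_lump_p_value(vaccine, placebo, n_perm=2000, seed=0):
    """One-sided Monte Carlo permutation p-value (small values favour the vaccine)."""
    rng = random.Random(seed)
    observed = mean_difference(vaccine, placebo)
    pooled = list(vaccine) + list(placebo)
    n = len(vaccine)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        if mean_difference(pooled[:n], pooled[n:]) <= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)


# Toy data: 10 volunteers per arm, W = 0 for the uninfected.
p_value = chop_lump_p_value([0] * 8 + [1, 1], [0] * 6 + [5, 6, 7, 8])
```

Because the statistic is recomputed on relabelled data, the resulting test keeps its validity under randomization whatever statistic is plugged in; the choice of statistic only affects power.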

3.
As a consequence of the "large p small n" characteristic for microarray data, hypothesis tests based on individual genes often result in low average power. There are several proposed tests that attempt to improve power. Among these, the FS test that was developed using the concept of James-Stein shrinkage to estimate the variances showed a striking average power improvement. In this paper, we establish a framework in which we model the key parameters with a distribution to find an optimal Bayes test which we call the MAP test (where MAP stands for Maximum Average Power). Under this framework, the FS test can be derived as an empirical Bayes test approximating the MAP test corresponding to modeling the variances. By modeling both the means and the variances with a distribution, a MAP statistic is derived which is optimal in terms of average power but is computationally intensive. An empirical Bayes test called the FSS test is derived as an approximation to the MAP tests and can be computed instantaneously. The FSS statistic shrinks both the means and the variances and has numerically identical average power to the MAP tests. Much numerical evidence is presented in this paper that shows that the proposed test performs uniformly better in average power than the other tests in the literature, including the classical F test, the FS test, the test of Wright and Simon, the moderated t-test, SAM, Efron's t test, the B-statistic and Storey's optimal discovery procedure. A theory is established which indicates that the proposed test is optimal in power when controlling the false discovery rate (FDR).

4.
The Cochran–Armitage (CA) linear trend test for proportions is often used for genotype‐based analysis of candidate gene association. Depending on the underlying genetic mode of inheritance, the use of model‐specific scores maximises the power. Commonly, the underlying genetic model, i.e. additive, dominant or recessive mode of inheritance, is a priori unknown. Association studies are commonly analysed using permutation tests, where both inference and identification of the underlying mode of inheritance are important. Especially interesting are tests for case–control studies, defined by a maximum over a series of standardised CA tests, because such a procedure has power under all three genetic models. We reformulate the test problem and propose a conditional maximum test of scores‐specific linear‐by‐linear association tests. For maximum‐type, sum and quadratic test statistics the asymptotic expectation and covariance can be derived in a closed form and the limiting distribution is known. Both the limiting distribution and approximations of the exact conditional distribution can easily be computed using standard software packages. In addition to these technical advances, we extend the area of application to stratified designs, studies involving more than two groups and the simultaneous analysis of multiple loci by means of multiplicity‐adjusted p‐values for the underlying multiple CA trend tests. The new test is applied to reanalyse a study investigating genetic components of different subtypes of psoriasis. A new and flexible inference tool for association studies is thus available both theoretically and practically, since existing software packages can easily be used to implement the suggested test procedures.
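As a rough illustration of a maximum over standardised CA statistics, the sketch below computes the textbook CA trend z-statistic for a 2×3 genotype table under three score vectors and reports the largest in absolute value. The score vectors, the genotype ordering (no, one, two copies of the risk allele) and the names `SCORES`, `ca_trend_z` and `max_ca` are assumptions for illustration. The sketch does not compute a p-value for the maximum, which requires the joint distribution discussed in the abstract.

```python
from math import sqrt

# Assumed scores for genotypes ordered as (aa, Aa, AA), A being the risk allele.
SCORES = {
    "recessive": (0, 0, 1),
    "additive": (0, 1, 2),
    "dominant": (0, 1, 1),
}


def ca_trend_z(cases, controls, scores):
    """Standardised Cochran-Armitage trend statistic for one score vector."""
    col_totals = [r + s for r, s in zip(cases, controls)]
    n_cases, n_controls = sum(cases), sum(controls)
    n = n_cases + n_controls
    sum_x_cases = sum(x * r for x, r in zip(scores, cases))
    sum_x_all = sum(x * c for x, c in zip(scores, col_totals))
    sum_xx_all = sum(x * x * c for x, c in zip(scores, col_totals))
    numerator = n * sum_x_cases - n_cases * sum_x_all
    variance = n_cases * n_controls / n * (n * sum_xx_all - sum_x_all ** 2)
    return numerator / sqrt(variance)


def max_ca(cases, controls):
    """Return the model with the largest |z| together with all three z-values."""
    z = {model: ca_trend_z(cases, controls, s) for model, s in SCORES.items()}
    best = max(z, key=lambda model: abs(z[model]))
    return best, z


best_model, z_values = max_ca([10, 20, 30], [30, 20, 10])
```

Referring the maximum to a standard normal distribution would ignore the selection over models and inflate the type I error rate, which is why a joint or conditional reference distribution is needed.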

5.
Pop‐Inference is an educational tool designed to help teach hypothesis testing using populations. The application allows for the statistical comparison of demographic parameters among populations. Input demographic data are projection matrices or raw demographic data. Randomization tests are used to compare populations. The tests evaluate the hypothesis that demographic parameters differ among groups of individuals more than would be expected from random allocation of individuals to populations. Confidence intervals for demographic parameters are obtained using the bootstrap. Tests may be global or pairwise. In addition to tests on differences, one‐way life table response experiments (LTRE) are available for random and fixed factors. Planned (a priori) comparisons are possible. Power of comparison tests is evaluated by constructing the distribution of the test statistic when the null hypothesis is true and when it is false. The relationship between power and sample size is explored by evaluating differences among populations at increasing population sizes, while keeping vital rates constant.

6.
OBJECTIVES: The association of a candidate gene with disease can be evaluated by a case-control study in which the genotype distribution is compared for diseased cases and unaffected controls. Usually, the data are analyzed with Armitage's test using the asymptotic null distribution of the test statistic. Since this test does not generally guarantee a type I error rate less than or equal to the significance level alpha, tests based on exact null distributions have been investigated. METHODS: An algorithm to generate the exact null distribution for both Armitage's test statistic and a recently proposed modification of the Baumgartner-Weiss-Schindler statistic is presented. I have compared the tests in a simulation study. RESULTS: The asymptotic Armitage test is slightly anticonservative whereas the exact tests control the type I error rate. The exact Armitage test is very conservative, but the exact test based on the modification of the Baumgartner-Weiss-Schindler statistic has a type I error rate close to alpha. The exact Armitage test is the least powerful test; the difference in power between the other two tests is often small and the comparison does not show a clear winner. CONCLUSION: Simulation results indicate that an exact test based on the modification of the Baumgartner-Weiss-Schindler statistic is preferable for the analysis of case-control studies of genetic markers.

7.
Consider a study to compare treatment A with a placebo in two or more groups of patients. If treatment A is beneficial to one group of patients and harmful to another, then we say that there is qualitative interaction or crossover interaction between patient groups and the treatments. Gail and Simon (1985, Biometrics 41, 361-372) developed a large-sample procedure for this testing problem. Their test has received favorable coverage in the literature. In this article, we obtain corresponding exact finite sample results for normal error distribution and provide a table of critical values. The test statistic is similar to the familiar F-ratio, and its p-value is equal to a weighted sum of tail areas of F-distributions. The computations to implement this are simple. A simulation study shows that the exact critical values provided here for normal error distribution are preferable to the asymptotic critical values for a wide range of error distributions. We also develop tests that are power robust against long-tailed error distributions. Our robust test uses M-estimators instead of the least squares estimators. We show that the efficiency robustness of the M-estimator translates to power robustness of the corresponding test. Therefore, our robust tests are better if outliers are expected. A simulation study illustrates the substantial power advantages of our robust tests.

8.
A new statistical test for linkage heterogeneity.   Cited by: 6 (self-citations: 5; citations by others: 1)
A new statistical test for linkage heterogeneity is described. It is a likelihood-ratio test based on a beta distribution for the prior distribution of the recombination fraction among families (or individuals). The null distribution for this statistic (called the B-test) is derived under a broad range of circumstances. Two other heterogeneity test statistics, the admixture test (A-test) first described by Smith and Morton's test (here referred to as the K-test), are also examined. The probability distribution for the K-test statistic is very sensitive to family size, whereas the other two statistics are not. All three statistics are somewhat sensitive to the magnitude of the recombination fraction theta. Critical values for each of the test statistics are given. A conservative approximation for both the A-test and B-test is given by a χ² distribution when P/2 instead of P is used for the observed significance level. In terms of power, the B-test performs best among the three tests over a broad range of alternative heterogeneity hypotheses, except for the specific case of admixture with loose linkage, in which the A-test performs best. Overall, the difference in power among the three tests is not large. An application to some recently published data on the fragile-X syndrome and X-chromosome markers is given.

9.
Simple test statistics for major gene detection: a numerical comparison   Cited by: 5 (self-citations: 0; citations by others: 5)
Summary: We compare 22 simple tests for the detection of major gene segregation in livestock populations. These tests belong to two groups: methods based on the comparison of within-family distributions and methods based on the comparison of parents' and offspring performances. The power of the 22 tests and the robustness of the two most powerful of them are evaluated by simulation. Thirteen types of major loci, differing in the within-genotype means, variances or allele frequencies, are studied. Thirty hierarchically balanced populations defined by the number of sire families (5–20), dams per sire (1–20) and progenies per dam (1–20) are simulated. The quantiles are estimated from 2000 samples, the power from 1000 samples and the robustness from 100 samples. The most powerful tests are the within-family variance heterogeneity test (Bartlett test) and the within-family mean-variance regression (Fain 1978). Their robustness may be very low, in particular when the trait distribution is skewed.
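For readers unfamiliar with the Bartlett test mentioned above, here is a minimal sketch of the standard Bartlett statistic for homogeneity of variances, applied to a list of families. It is the textbook form rather than the exact variant used in the study, and the function name `bartlett_statistic` and the toy data are assumptions.

```python
from math import log
from statistics import variance


def bartlett_statistic(groups):
    """Bartlett's statistic and its degrees of freedom for k groups.

    Under normality and equal variances the statistic is approximately
    chi-square distributed with k - 1 degrees of freedom.
    """
    k = len(groups)
    sizes = [len(g) for g in groups]
    n_total = sum(sizes)
    variances = [variance(g) for g in groups]  # unbiased, n - 1 denominator
    pooled = sum((n - 1) * v for n, v in zip(sizes, variances)) / (n_total - k)
    numerator = (n_total - k) * log(pooled) - sum(
        (n - 1) * log(v) for n, v in zip(sizes, variances)
    )
    correction = 1 + (sum(1 / (n - 1) for n in sizes) - 1 / (n_total - k)) / (3 * (k - 1))
    return numerator / correction, k - 1


# Toy example: two families with very different within-family spread.
stat, df = bartlett_statistic([[1, 2, 3], [0, 10, 20]])
```

The statistic relies on normality within groups, which is consistent with the abstract's warning that robustness can be very low when the trait distribution is skewed.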

10.
P. J. E. Goss, R. C. Lewontin. Genetics 1996, 143(1): 589-602
Regions of differing constraint, mutation rate or recombination along a sequence of DNA or amino acids lead to a nonuniform distribution of polymorphism within species or fixed differences between species. The power of five tests to reject the null hypothesis of a uniform distribution is studied for four classes of alternate hypothesis. The tests explored are the variance of interval lengths; a modified variance test, which includes covariance between neighboring intervals; the length of the longest interval; the length of the shortest third-order interval; and a composite test. Although there is no uniformly most powerful test over the range of alternate hypotheses tested, the variance and modified variance tests usually have the highest power. Therefore, we recommend that one of these two tests be used to test departure from uniformity in all circumstances. Tables of critical values for the variance and modified variance tests are given. The critical values depend both on the number of events and the number of positions in the sequence. A computer program is available on request that calculates both the critical values for a specified number of events and number of positions as well as the significance level of a given data set.
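A variance-of-interval-lengths test can be approximated by simulation as sketched below. This is not the authors' program: the interval definition (gaps between consecutive events, with the two sequence ends treated as boundaries), the Monte Carlo null in place of tabulated critical values, and the names `interval_variance` and `uniformity_p_value` are all assumptions.

```python
import random
from statistics import pvariance


def interval_variance(positions, length):
    """Variance of the gaps between consecutive events on a sequence of given length."""
    points = [0] + sorted(positions) + [length + 1]  # assumed boundary convention
    gaps = [b - a for a, b in zip(points, points[1:])]
    return pvariance(gaps)


def uniformity_p_value(positions, length, n_sim=5000, seed=0):
    """Monte Carlo p-value: large variances indicate clustering of events."""
    rng = random.Random(seed)
    observed = interval_variance(positions, length)
    k = len(positions)
    hits = 0
    for _ in range(n_sim):
        simulated = rng.sample(range(1, length + 1), k)  # k distinct uniform sites
        if interval_variance(simulated, length) >= observed:
            hits += 1
    return (hits + 1) / (n_sim + 1)


# Four events packed at the start of a 49-site sequence.
p_clustered = uniformity_p_value([1, 2, 3, 4], 49, n_sim=2000)
```

As the abstract notes for the tabulated critical values, the null distribution here depends on both the number of events and the number of positions, which is why both are passed to the simulation.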

11.
May S, Degruttola V. Biometrics 2007, 63(1): 194-200
We propose new tests for two-group comparisons of repeated measures of a response where the repeated measures might be obtained at arbitrary time points that differ over individuals. The tests are almost U-statistics in that the kernel contains some unknown parameters that need to be estimated from the data. Our methods are designed for settings in which response means of one group are strictly greater than the response means of the other group. The tests do not make any assumptions regarding the distribution of the repeated measures except that one of the tests assumes that the repeated measures can be grouped into distinct periods of observations (e.g., around fixed follow-up time points) such that the covariance between scores only depends on the periods the observations belong to and that the covariance matrices are the same in the two groups. The tests are valid even if the probability that a response is observed depends on the level of response provided that the missing data mechanism is the same in both groups. Inference can conveniently be based on resampling. We provide asymptotic results for the test statistics. We investigate size and power of the tests and use them to assess differences in viral load decline for drug-resistant and drug-sensitive human immunodeficiency virus (HIV)-1 infected patients.

12.
With the development of massively parallel sequencing technologies, there is a substantial need for developing powerful rare variant association tests. Common approaches include burden and non-burden tests. Burden tests assume all rare variants in the target region have effects on the phenotype in the same direction and of similar magnitude. The recently proposed sequence kernel association test (SKAT) (Wu, M. C., and others, 2011. Rare-variant association testing for sequencing data with the SKAT. The American Journal of Human Genetics 89, 82-93), an extension of the C-alpha test (Neale, B. M., and others, 2011. Testing for an unusual distribution of rare variants. PLoS Genetics 7, 161-165), provides a robust test that is particularly powerful in the presence of protective and deleterious variants and null variants, but is less powerful than burden tests when a large number of variants in a region are causal and in the same direction. As the underlying biological mechanisms are unknown in practice and vary from one gene to another across the genome, it is of substantial practical interest to develop a test that is optimal for both scenarios. In this paper, we propose a class of tests that include burden tests and SKAT as special cases, and derive an optimal test within this class that maximizes power. We show that this optimal test outperforms burden tests and SKAT in a wide range of scenarios. The results are illustrated using simulation studies and triglyceride data from the Dallas Heart Study. In addition, we have derived sample size/power calculation formulas for SKAT with a new family of kernels to facilitate designing new sequence association studies.
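One common way to write a class of statistics containing both the burden test and SKAT is a convex combination of the two, Q(rho) = (1 - rho) * Q_SKAT + rho * Q_burden. The sketch below uses that form as an assumption, since the abstract does not state the formula. It also assumes no covariates (the null model is the phenotype mean) and hypothetical names `score_statistics` and `q_rho`. Selecting the optimal rho and obtaining a p-value both need the null distribution, which is omitted here.

```python
def score_statistics(genotypes, phenotype):
    """Per-variant score S_j = sum_i g_ij * (y_i - mean(y)), assuming no covariates."""
    mean_y = sum(phenotype) / len(phenotype)
    residuals = [y - mean_y for y in phenotype]
    n_variants = len(genotypes[0])
    return [
        sum(row[j] * r for row, r in zip(genotypes, residuals))
        for j in range(n_variants)
    ]


def q_rho(scores, weights, rho):
    """Combined statistic: rho = 0 gives the SKAT form, rho = 1 the burden form."""
    q_skat = sum((w * s) ** 2 for w, s in zip(weights, scores))
    q_burden = sum(w * s for w, s in zip(weights, scores)) ** 2
    return (1 - rho) * q_skat + rho * q_burden


# Two variants with opposite effects: the burden form cancels, the SKAT form does not.
opposite = [1.0, -1.0]
values = [q_rho(opposite, [1.0, 1.0], rho) for rho in (0.0, 0.5, 1.0)]
```

The toy scores show why a burden statistic loses power when protective and deleterious variants are mixed, whereas a sum of squared scores is unaffected by sign.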

13.
Acta Oecologica 2007, 31(1): 102-108
Biological data often tend to have heterogeneous, discontinuous non-normal distributions. Statistical non-parametric tests, like the Mann–Whitney U-test or the extension for more than two samples, the Kruskal–Wallis test, are often used in these cases, although they assume certain preconditions which are often ignored. We developed a permutation test procedure that uses the ratio of the interquartile distances and the median differences of the original non-classified data to assess the properties of the real distribution more appropriately than the classical methods. We used this test on a heterogeneous, skewed biological data set on invertebrate dispersal and showed how different the reactions of the Kruskal–Wallis test and the permutation approach are. We then evaluated the new testing procedure with reproducible data that were generated from the normal distribution. Here, we tested the influence of four different experimental trials on the new testing procedure in comparison to the Kruskal–Wallis test. These trials showed the impact of data that varied in terms of (a) negative correlation between variances and means of the samples, (b) changing variances that were not correlated with the means of the samples, and (c) constant variances and means, but different sample sizes; in trial (d) we evaluated the testing power of the new procedure. Due to the different test statistics, the permutation test reacted more sensitively to the data presented in trials (a) and (c) and non-uniformly in trial (b). In the evaluation of the testing power, no significant differences between the Kruskal–Wallis test and the new permutation testing procedure could be detected. We consider this test to be an alternative for working on heterogeneous data where the preconditions of the classical non-parametric tests are not met.

14.
Pan W, Basu S, Shen X. Human Heredity 2011, 72(2): 98-109
There has been an increasing interest in detecting gene-gene and gene-environment interactions in genetic association studies. A major statistical challenge is how to deal with a large number of parameters measuring possible interaction effects, which leads to reduced power of any statistical test due to a large number of degrees of freedom or high cost of adjustment for multiple testing. Hence, a popular idea is to first apply some dimension reduction techniques before testing, while another is to apply only statistical tests that are developed for and robust to high-dimensional data. To combine both ideas, we propose applying an adaptive sum of squared score (SSU) test and several other adaptive tests. These adaptive tests are extensions of the adaptive Neyman test [Fan, 1996], which was originally proposed for high-dimensional data, providing a simple and effective way for dimension reduction. On the other hand, the original SSU test coincides with a version of a test specifically developed for high-dimensional data. We apply these adaptive tests and their original nonadaptive versions to simulated data to detect interactions between two groups of SNPs (e.g. multiple SNPs in two candidate regions). We found that for sparse models (i.e. with only a few non-zero interaction parameters), the adaptive SSU test and its close variant, an adaptive version of the weighted sum of squared score (SSUw) test, improved the power over their non-adaptive versions, and performed consistently well across various scenarios. The proposed adaptive tests are built in the general framework of regression analysis, and can thus be applied to various types of traits in the presence of covariates.

15.
Based on the result that the distribution of estimated spectral power in broad frequency bands can be approximated by a χ²-distribution, the accuracy of the approximation was tested with different models of stochastic processes. Two significance tests for comparing spectral band powers on the basis of this approximation, together with the estimation of confidence intervals, were developed and tested by application to EEG data. By means of these test statistics, a new basis for statistical comparison between EEG maps of groups as well as between maps of groups and single maps can be suggested.

16.
Brown RP. Genetica 1997, 101(1): 67-74
Heterogeneous phenotypic correlations may be suggestive of underlying changes in genetic covariance among life-history, morphology, and behavioural traits, and their detection is therefore relevant to many biological studies. Two new statistical tests are proposed and their performances compared with existing methods. Of all tests considered, the existing approximate test of homogeneity of product-moment correlations provides the greatest power to detect heterogeneous correlations, when based on Hotelling's z*-transformation. The use of this transformation and test is recommended under conditions of bivariate normality. A new distribution-free randomisation test of homogeneity of Spearman's rank correlations is described and recommended for use when the bivariate samples are taken from populations with non-normal or unknown distributions. An alternative randomisation test of homogeneity of product-moment correlations is shown to be a useful compromise between the approximate tests and the randomisation tests on Spearman's rank correlations: it is not as sensitive to departures from normality as the approximate tests, but has greater power than the rank correlation test. An example is provided that shows how choice of test will have a considerable influence on the conclusions of a particular study. This revised version was published online in July 2006 with corrections to the Cover Date.
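The approximate test of homogeneity of product-moment correlations can be sketched as below. Note the substitution: this sketch uses Fisher's z-transformation (the inverse hyperbolic tangent of r, with weight n - 3) rather than the Hotelling z*-transformation that the abstract recommends, because the Fisher form is the simpler standard version. The function name `correlation_homogeneity` is an assumption.

```python
from math import atanh


def correlation_homogeneity(correlations, sample_sizes):
    """Chi-square statistic and degrees of freedom for equality of k correlations.

    Uses Fisher's z = atanh(r), whose approximate variance is 1 / (n - 3).
    """
    z_values = [atanh(r) for r in correlations]
    weights = [n - 3 for n in sample_sizes]
    z_bar = sum(w * z for w, z in zip(weights, z_values)) / sum(weights)
    chi2 = sum(w * (z - z_bar) ** 2 for w, z in zip(weights, z_values))
    return chi2, len(correlations) - 1


# Two samples of 13 pairs each, with r = 0.0 and r = 0.5.
chi2, df = correlation_homogeneity([0.0, 0.5], [13, 13])
```

The statistic is referred to a chi-square distribution with k - 1 degrees of freedom under bivariate normality; for non-normal data the abstract points to randomisation tests instead.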

17.
Zhou JY, Hu YQ, Lin S, Fung WK. Human Heredity 2009, 67(1): 1-12
Parent-of-origin effects are important in studying genetic traits. More than 1% of all mammalian genes are believed to show parent-of-origin effects. Some statistical methods may be ineffective or fail to detect linkage or association for a gene with parent-of-origin effects. Based on case-parents trios, the parental-asymmetry test (PAT) is simple and powerful in detecting parent-of-origin effects. However, it is common in practice to collect nuclear families with both parents as well as nuclear families with only one parent. In this paper, when only one parent is available for each family with an arbitrary number of affected children, we first develop a new test statistic 1-PAT to test for parent-of-origin effects in the presence of association between an allele at the marker locus under study and a disease gene. Then we extend the PAT to accommodate complete nuclear families each with one or more affected children. Combining families with both parents and families with only one parent, the C-PAT is proposed to detect parent-of-origin effects. The validity of the test statistics is verified by simulation in various scenarios of parameter values. A power study shows that using the additional information from incomplete nuclear families in the analysis greatly improves the power of the tests, compared to that based on only complete nuclear families. Also, utilizing all affected children in each family, the proposed tests have a higher power than when only one affected child from each family is selected. Additional power comparison also demonstrates that the C-PAT is more powerful than a number of other tests for detecting parent-of-origin effects.

18.
In survival studies with families or geographical units it may be of interest to test whether such groups are homogeneous for given explanatory variables. In this paper we consider score type tests for group homogeneity based on a mixing model in which the group effect is modelled as a random variable. As opposed to hazard-based frailty models, this model presents survival times that, conditioned on the random effect, have an accelerated failure time representation. The test statistics require only estimation of the conventional regression model without the random effect and do not require specifying the distribution of the random effect. The tests are derived for a Weibull regression model and, in the uncensored situation, a closed form is obtained for the test statistic. A simulation study is used for comparing the power of the tests. The proposed tests are applied to real data sets with censored data.

19.
Several tests of molecular phylogenies have been proposed over the last decades, but most of them lead to strikingly different P-values. I propose that such discrepancies are principally due to different forms of null hypotheses. To support this hypothesis, two new tests are described. Both consider the composite null hypothesis that all the topologies are equidistant from the true but unknown topology. This composite hypothesis can either be reduced to the simple hypothesis at the least favorable distribution (frequentist significance test [FST]) or to the maximum likelihood topology (frequentist hypothesis test [FHT]). In both cases, the reduced null hypothesis is tested against each topology included in the analysis. The tests proposed have an information-theoretic justification, and the distribution of their test statistic is estimated by a nonparametric bootstrap, adjusting P-values for multiple comparisons. I applied the new tests to the reanalysis of two chloroplast genes, psaA and psbB, and compared the results with those of previously described tests. As expected, the FST and the FHT behaved approximately like the Shimodaira-Hasegawa test and the bootstrap, respectively. Although the tests give overconfidence in a wrong tree when an overly simple nucleotide substitution model is assumed, more complex models incorporating heterogeneity among codon positions resolve some conflicts. To further investigate the influence of the null hypothesis, a power study was conducted. Simulations showed that FST and the Shimodaira-Hasegawa test are the least powerful and FHT is the most powerful across the parameter space. Although the size of all the tests is affected by misspecification, the two new tests appear more robust against misspecification of the model of evolution and consistently supported the hypothesis that the Gnetales are nested within gymnosperms.

20.
Consider a study of two groups of individuals infected with a population of a genetically related heterogeneous mixture of viruses, in which multiple viral sequences are sampled from each person. Based on estimates of genetic distances between pairs of aligned viral sequences within individuals, we develop four new tests to compare intra-individual genetic sequence diversity between the two groups. This problem is complicated by two levels of dependency in the data structure: (i) Within an individual, any pairwise distances that share a common sequence are positively correlated; and (ii) for any two pairings of individuals which share a person, the two differences in intra-individual distances between the paired individuals are positively correlated. The first proposed test is based on the difference in mean intra-individual pairwise distances pooled over all individuals in each group, standardized by a variance estimate that corrects for the correlation structure using U-statistic theory. The second procedure is a nonparametric rank-based analog of the first test, and the third test contrasts the set of subject-specific average intra-individual pairwise distances between the groups. These tests are very easy to use and solve correlation problem (i). The fourth procedure is based on a linear combination of all possible U-statistics calculated on independent, identically distributed sequence subdatasets, over the two levels (i) and (ii) of dependencies in the data, and is more complicated than the other tests but can be more powerful. Although the proposed methods are empirical and do not fully utilize knowledge from population genetics, the tests reflect biology through the evolutionary models used to derive the pairwise sequence distances. The new tests are evaluated theoretically and in a simulation study, and are applied to a dataset of 200 HIV sequences sampled from 21 children.
