Should We Abandon the t-Test in the Analysis of Gene Expression Microarray Data: A Comparison of Variance Modeling Strategies
Authors:Marine Jeanmougin  Aurelien de Reynies  Laetitia Marisa  Caroline Paccard  Gregory Nuel  Mickael Guedj
Affiliation:1. Programme Cartes d'Identité des Tumeurs (CIT), Ligue Nationale Contre le Cancer, Paris, France; 2. Department of Biostatistics, Pharnext, Paris, France; 3. Department of Applied Mathematics (MAP5) UMR CNRS 8145, Paris Descartes University, Paris, France; 4. Statistics and Genome Laboratory UMR CNRS 8071, University of Evry, Evry, France; University of Michigan, United States of America
Abstract:High-throughput post-genomic studies are now routine in biological and biomedical research. The main statistical approach for selecting genes differentially expressed between two groups is the t-test, which has been criticized in the literature. Numerous alternatives have been developed, based on different variance modeling strategies. A critical issue, however, is that choosing a different test usually leads to a different gene list. In this context, and given the current tendency to apply the t-test, identifying the most efficient approach in practice remains crucial. To help answer this question, we compare eight tests representative of variance modeling strategies in gene expression data: Welch's t-test, ANOVA [1], Wilcoxon's test, SAM [2], RVM [3], limma [4], VarMixt [5] and SMVar [6]. Our comparison relies on four steps (gene list analysis, simulations, spike-in data and re-sampling) to reach comprehensive and robust conclusions about test performance in terms of statistical power, false-positive rate, execution time and ease of use. Our results raise concerns about the ability of some methods to control the expected number of false positives at a desirable level. In addition, two tests (limma and VarMixt) show significant improvement over the t-test, particularly for small sample sizes. limma also offers several practical advantages, so we advocate its use for analyzing gene expression data.
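
To make the false-positive comparison concrete, the following is a minimal sketch of a null simulation for two of the tests named above (Welch's t-test and Wilcoxon's rank-sum test), written with NumPy and SciPy. It is not the authors' simulation design: the Gaussian noise model, the number of genes, the group size of 5, the 0.05 threshold, and the function names `simulate_null_pvalues` and `false_positive_rate` are all assumptions chosen for illustration. The variance-modeling methods (SAM, RVM, limma, VarMixt, SMVar) are not reproduced here.

```python
import numpy as np
from scipy import stats


def simulate_null_pvalues(n_genes=2000, n_per_group=5, seed=0):
    """Simulate genes with no true difference between two groups
    and return per-gene p-values for two tests.

    Assumption: expression values are i.i.d. standard normal,
    which is a simplification of real microarray data.
    """
    rng = np.random.default_rng(seed)
    group_a = rng.normal(size=(n_genes, n_per_group))
    group_b = rng.normal(size=(n_genes, n_per_group))

    # Welch's t-test: unequal variances, one test per gene (row).
    welch = stats.ttest_ind(group_a, group_b, axis=1, equal_var=False).pvalue

    # Wilcoxon rank-sum (Mann-Whitney U) test, computed gene by gene.
    wilcoxon = np.array([
        stats.mannwhitneyu(a, b, alternative="two-sided").pvalue
        for a, b in zip(group_a, group_b)
    ])
    return welch, wilcoxon


def false_positive_rate(pvalues, alpha=0.05):
    """Fraction of null genes called significant at level alpha."""
    return float(np.mean(pvalues < alpha))


welch_p, wilcoxon_p = simulate_null_pvalues()
print("Welch FPR:   ", false_positive_rate(welch_p))
print("Wilcoxon FPR:", false_positive_rate(wilcoxon_p))
```

Because every simulated gene is null, any gene with a p-value below the threshold is a false positive, so the printed fractions can be compared directly with the nominal level. With very small groups the rank-based test can only take a few distinct p-values, which is one reason behavior at small sample sizes differs between tests.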