Analysis of a simulated microarray dataset: Comparison of methods for data normalisation and detection of differential expression (Open Access publication)
Authors:Michael Watson  Mónica Pérez-Alegre  Michael Denis Baron  Céline Delmas  Peter Dovč  Mylène Duval  Jean-Louis Foulley  Juan José Garrido-Pavón  Ina Hulsegge  Florence Jaffrézic  Ángeles Jiménez-Marín  Miha Lavrič  Kim-Anh Lê Cao  Guillemette Marot  Daphné Mouzaki  Marco H Pool  Christèle Robert-Granié  Magali San Cristobal  Gwenola Tosser-Klopp  David Waddington  Dirk-Jan de Koning
Affiliation:1. Institute for Animal Health, Compton, UK (IAH_C); 2. University of Cordoba, Cordoba, Spain (CDB); 3. Institute for Animal Health, Pirbright, UK (IAH_P); 4. INRA, Castanet-Tolosan, France (INRA_T); 5. University of Ljubljana, Slovenia (SLN); 6. INRA, Jouy-en-Josas, France (INRA_J); 7. Animal Sciences Group Wageningen UR, Lelystad, NL (IDL); 8. Roslin Institute, Roslin, UK (ROSLIN); 9. Institute for Animal Health Informatic groups, Compton Laboratory, Compton RG20 7NN Newbury Berkshire, UK
Abstract:Microarrays allow researchers to measure the expression of thousands of genes in a single experiment. Before statistical comparisons can be made, the data must be assessed for quality and normalisation procedures must be applied, of which many have been proposed. Methods of comparing the normalised data are also abundant, and no clear consensus has yet been reached. The purpose of this paper was to compare those methods used by the EADGENE network on a very noisy simulated data set. With the a priori knowledge of which genes are differentially expressed, it is possible to compare the success of each approach quantitatively. Use of an intensity-dependent normalisation procedure was common, as was correction for multiple testing. Most variety in performance resulted from differing approaches to data quality and the use of different statistical tests. Very few of the methods used any kind of background correction. A number of approaches achieved a success rate of 95% or above, with relatively small numbers of false positives and negatives. Applying stringent spot selection criteria and elimination of data did not improve the false positive rate and greatly increased the false negative rate. However, most approaches performed well, and it is encouraging that widely available techniques can achieve such good results on a very noisy data set.
Keywords:gene expression   two colour microarray   simulation   statistical analysis
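Illustration (not part of the paper): the abstract describes correcting for multiple testing and then counting false positives and false negatives against the known set of differentially expressed genes. The sketch below shows one way this kind of scoring could be done. The Benjamini-Hochberg procedure is used here only as a common example of multiple-testing correction; the abstract does not say which correction each group applied. The function names, the p-values and the truth labels are all hypothetical.

```python
def benjamini_hochberg(p_values, alpha=0.05):
    """Return a list of booleans, True where a gene is called
    differentially expressed at false discovery rate alpha."""
    m = len(p_values)
    # Indices of the p-values, sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    # Find the largest rank k such that p_(k) <= (k / m) * alpha.
    max_rank = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * alpha:
            max_rank = rank
    # Every gene ranked at or below max_rank is called significant.
    called = [False] * m
    for rank, idx in enumerate(order, start=1):
        if rank <= max_rank:
            called[idx] = True
    return called


def count_errors(called, truth):
    """Count false positives and false negatives given the known truth."""
    false_pos = sum(1 for c, t in zip(called, truth) if c and not t)
    false_neg = sum(1 for c, t in zip(called, truth) if not c and t)
    return false_pos, false_neg


# Hypothetical toy data: five genes, with simulated ground truth.
p_values = [0.001, 0.008, 0.039, 0.041, 0.6]
truth = [True, False, True, False, False]

called = benjamini_hochberg(p_values, alpha=0.05)
false_pos, false_neg = count_errors(called, truth)
print(called, false_pos, false_neg)
```

With simulated data the truth vector is known for every gene, which is what makes a quantitative comparison of analysis pipelines possible.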