Estimating dataset size requirements for classifying DNA microarray data.
Authors: Sayan Mukherjee, Pablo Tamayo, Simon Rogers, Ryan Rifkin, Anna Engle, Colin Campbell, Todd R Golub, Jill P Mesirov
Affiliation: Whitehead Institute/Massachusetts Institute of Technology Center for Genome Research, Cambridge, MA 02139, USA. sayan@genome.wi.mit.edu
Abstract: A statistical methodology for estimating dataset size requirements for classifying microarray data using learning curves is introduced. The goal is to use existing classification results to estimate dataset size requirements for future classification experiments and to evaluate the gain in accuracy and significance of classifiers built with additional data. The method is based on fitting inverse power-law models to construct empirical learning curves. It also includes a permutation test procedure to assess the statistical significance of classification performance for a given dataset size. This procedure is applied to several molecular classification problems representing a broad spectrum of levels of complexity.
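
Illustrative sketch (not the authors' implementation): the abstract names two ingredients, an inverse power-law fit to a learning curve and a permutation test for significance. The Python below shows one way these could look. It assumes the learning curve takes the form e(n) = a * n^(-alpha) + b, where e(n) is the error rate at training-set size n, a is a learning-rate scale, alpha is the decay rate, and b is the asymptotic error. That functional form, the grid search over alpha, and all function names are assumptions made for illustration. The permutation step is reduced to an empirical p-value computed from error rates that are assumed to have been obtained already by retraining on label-permuted data.

```python
def fit_inverse_power_law(sizes, errors, alphas=None):
    """Fit e(n) = a * n**(-alpha) + b to observed (size, error) pairs.

    Assumed approach: grid search over alpha; for each alpha the model is
    linear in x = n**(-alpha), so a and b follow from ordinary least squares.
    """
    if alphas is None:
        alphas = [k / 100 for k in range(1, 301)]  # candidate decay rates 0.01 .. 3.00
    best = None
    for alpha in alphas:
        x = [n ** (-alpha) for n in sizes]
        mean_x = sum(x) / len(x)
        mean_y = sum(errors) / len(errors)
        sxx = sum((xi - mean_x) ** 2 for xi in x)
        sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, errors))
        a = sxy / sxx
        b = mean_y - a * mean_x
        sse = sum((a * xi + b - yi) ** 2 for xi, yi in zip(x, errors))
        if best is None or sse < best[0]:
            best = (sse, a, alpha, b)
    _, a, alpha, b = best
    return a, alpha, b


def predict_error(n, a, alpha, b):
    """Extrapolate the fitted learning curve to a training-set size n."""
    return a * n ** (-alpha) + b


def empirical_p_value(observed_error, permuted_errors):
    """Fraction of label-permuted runs that did at least as well as the real run.

    The +1 terms count the observed run itself, so the p-value is never zero.
    """
    hits = sum(1 for e in permuted_errors if e <= observed_error)
    return (1 + hits) / (1 + len(permuted_errors))


if __name__ == "__main__":
    # Synthetic, noise-free learning curve used only to exercise the code.
    sizes = [10, 20, 30, 40, 50, 60]
    errors = [0.8 * n ** (-0.5) + 0.05 for n in sizes]
    a, alpha, b = fit_inverse_power_law(sizes, errors)
    print("fitted a, alpha, b:", a, alpha, b)
    print("extrapolated error at n=400:", predict_error(400, a, alpha, b))
    # Hypothetical error rates from classifiers trained on permuted labels.
    print("p-value:", empirical_p_value(0.10, [0.50, 0.45, 0.55, 0.40]))
```

In this sketch the fitted b estimates the error rate that would remain with unlimited data, and predict_error gives a rough estimate of how much a larger dataset might help. With real data the error estimates at each size are noisy, so a fit of this kind is only as reliable as the subsampling behind it.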