

A feature selection method for classification within functional genomics experiments based on the proportional overlapping score
Authors: Osama Mahmoud, Andrew Harrison, Aris Perperoglou, Asma Gul, Zardad Khan, Metodi V Metodiev, Berthold Lausen
Institution: Department of Mathematical Sciences, University of Essex, Wivenhoe Park, CO4 3SQ Colchester, UK; School of Biological Sciences/Proteomics Unit, University of Essex, Wivenhoe Park, CO4 3SQ Colchester, UK; Department of Applied Statistics, Helwan University, Cairo, Egypt
Abstract:

Background

Microarray technology, as well as other functional genomics experiments, allows simultaneous measurement of thousands of genes within each sample. Both the prediction accuracy and the interpretability of a classifier can be enhanced by performing the classification based only on selected discriminative genes. We propose a statistical method for selecting genes based on an analysis of how their expression values overlap across classes. This method yields a novel measure, the proportional overlapping score (POS), of a feature's relevance to a classification task.

Results

We apply POS, along with four widely used gene selection methods, to several benchmark gene expression datasets. Classification error rates computed using the Random Forest, k-Nearest Neighbor, and Support Vector Machine classifiers show that POS achieves better performance than the compared methods.

Conclusions

A novel gene selection method, POS, is proposed. POS analyzes the overlap of gene expression values across classes, taking into account the proportion of overlapping samples. It defines a robust mask for each gene, which allows the method to minimize the effect of expression outliers. The constructed masks, along with a novel gene score, are then used to produce the selected subset of genes.
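
The idea of scoring a gene by the proportion of samples that fall in the region where the classes overlap can be illustrated with a minimal Python sketch. This is not the authors' definition of POS: the abstract does not give the formulas, so the sketch assumes a two-class problem, uses each class's interquartile range as a stand-in for a robust per-class interval, and treats the mask as "samples outside the overlap region". The names `core_interval` and `overlap_sketch` are hypothetical.

```python
from statistics import quantiles


def core_interval(values):
    """Interquartile range of one class's expression values.

    Assumption: the IQR stands in for a robust per-class interval;
    the abstract does not specify how the interval is built.
    """
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    return q1, q3


def overlap_sketch(expr, labels):
    """Return (score, mask) for one gene in a two-class problem.

    score: proportion of all samples lying in the region shared by
           the two class intervals (lower means more discriminative).
    mask:  True for samples outside that shared region (a simplified
           stand-in for the gene mask mentioned in the abstract).
    """
    classes = sorted(set(labels))
    if len(classes) != 2:
        raise ValueError("this sketch handles exactly two classes")
    a, b = classes
    lo_a, hi_a = core_interval([x for x, y in zip(expr, labels) if y == a])
    lo_b, hi_b = core_interval([x for x, y in zip(expr, labels) if y == b])

    # The shared region is the intersection of the two class intervals.
    lo, hi = max(lo_a, lo_b), min(hi_a, hi_b)
    if lo > hi:
        in_overlap = [False] * len(expr)  # intervals are disjoint
    else:
        in_overlap = [lo <= x <= hi for x in expr]

    score = sum(in_overlap) / len(expr)
    mask = [not flag for flag in in_overlap]
    return score, mask


# A gene whose classes partly overlap: two of eight samples fall in
# the shared region.
score, mask = overlap_sketch([1, 2, 3, 4, 2, 3, 4, 5], [0, 0, 0, 0, 1, 1, 1, 1])
print(score)  # 0.25

# One extreme value (100) in class 0 does not create an overlap with
# class 1, because the intervals are quartile-based rather than min-max.
score, _ = overlap_sketch(
    [1, 2, 3, 4, 5, 6, 7, 100, 20, 21, 22, 23, 24, 25, 26, 27],
    [0] * 8 + [1] * 8,
)
print(score)  # 0.0
```

In the second call a min-max interval for class 0 would span 1 to 100 and swallow every class 1 sample, whereas the quartile-based interval leaves the two classes disjoint. This is the kind of outlier sensitivity that a robust interval is meant to avoid.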

Electronic supplementary material

The online version of this article (doi:10.1186/1471-2105-15-274) contains supplementary material, which is available to authorized users.
Keywords: Feature selection; Gene ranking; Microarray classification; Proportional overlap score; Gene mask; Minimum subset of genes