首页 | 本学科首页   官方微博 | 高级检索  
     


Stroke Treatment Prediction Using Features Selection Methods and Machine Learning Classifiers
Affiliation:1. STICODE Departement, RIADI laboratory of National School of Computer Sciences, Manouba, Tunisia;2. MATHSTIC Departement, ITI laboratory of IMT Atlantique, Brest, France;3. Intradys, Brest, France
Abstract:ObjectivesFeature selection in data sets is an important task allowing to alleviate various machine learning and data mining issues. The main objectives of a feature selection method consist on building simpler and more understandable classifier models in order to improve the data mining and processing performances. Therefore, a comparative evaluation of the Chi-square method, recursive feature elimination method, and tree-based method (using Random Forest) used on the three common machine learning methods (K-Nearest Neighbor, naïve Bayesian classifier and decision tree classifier) are performed to select the most relevant primitives from a large set of attributes. Furthermore, determining the most suitable couple (i.e., feature selection method-machine learning method) that provides the best performance is performed.Materials and methodsIn this paper, an overview of the most common feature selection techniques is first provided: the Chi-Square method, the Recursive Feature Elimination method (RFE) and the tree-based method (using Random Forest). A comparative evaluation of the improvement (brought by such feature selection methods) to the three common machine learning methods (K- Nearest Neighbor, naïve Bayesian classifier and decision tree classifier) are performed. For evaluation purposes, the following measures: micro-F1, accuracy and root mean square error are used on the stroke disease data set.ResultsThe obtained results show that the proposed approach (i.e., Tree Based Method using Random Forest, TBM-RF, decision tree classifier, DTC) provides accuracy higher than 85%, F1-score higher than 88%, thus, better than the KNN and NB using the Chi-Square, RFE and TBM-RF methods.ConclusionThis study shows that the couple - Tree Based Method using Random Forest (TBM-RF) decision tree classifier successfully and efficiently contributes to find the most relevant features and to predict and classify patient suffering of stroke disease.”
Keywords:Stroke disease  Feature selection  Data mining  Decision tree classifier  Naive Bayes  K-nearest neighbor  Recursive feature elimination  Tree-based model  Chi-square
本文献已被 ScienceDirect 等数据库收录!
设为首页 | 免责声明 | 关于勤云 | 加入收藏

Copyright©北京勤云科技发展有限公司  京ICP备09084417号