

Benchmarking the Clustering Performances of Evolutionary Algorithms: A Case Study on Varying Data Size
Institution:1. Department of Pediatrics, Kansas Mercy Children’s Hospital;2. Department of Pediatrics, Medical College of Wisconsin;3. Department of Surgery, Medical College of Wisconsin;1. Department of Computer Engineering, North Tehran Branch, Islamic Azad University, Tehran, Iran;2. Department of Computer Engineering, Faculty of Engineering, Kharazmi University, Tehran, Iran;3. Department of Industrial Engineering, Birjand University of Technology, Birjand, Iran
Abstract:
Background and objective: Clustering is a widely used method for data analysis, and many clustering algorithms have been proposed over the years. Today it is used in prediction, collaborative filtering and automatic segmentation systems across different domains. To be broadly used in practice, clustering algorithms need to offer both better performance and better robustness than those currently in use. In recent years, evolutionary algorithms have been used in many domains because they are robust and easy to implement, and many clustering problems can be readily solved with such algorithms once the problem is modeled as an optimization problem. In this paper, we present an optimization approach to clustering that uses four well-known evolutionary algorithms: Biogeography-Based Optimization (BBO), Grey Wolf Optimization (GWO), Genetic Algorithm (GA) and Particle Swarm Optimization (PSO).
Method: The objective function is specified to minimize the total distance from the cluster centers to the data points, with Euclidean distance used for the distance calculation. We applied this objective function to the four algorithms both to find the most efficient clustering algorithm and to compare the clustering performance of the algorithms across different data sizes. To benchmark the algorithms, we used a number of datasets of different sizes: small-scale, medium-scale and big data. The clustering performance was compared with that of K-means, which has been widely used in the literature for years. The Rand Index, Adjusted Rand Index, Mirkin's Index and Hubert's Index were used as the criteria for evaluating clustering performance.
Result: Across the datasets of varying size and under the specified performance criteria, the GA and GWO algorithms showed better clustering performance than the others.
Conclusions: Although the algorithms gave satisfactory clustering results on small- and medium-scale datasets, their clustering performance on big data needs to be improved.
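As a rough illustration of the method described in the abstract, the sketch below shows one way the objective function and the Rand Index could be written. It is not the authors' implementation. It assumes that a candidate solution is a flat vector holding k cluster centers, and that "total distance" means the sum of Euclidean distances from each data point to its nearest center. The names `clustering_objective`, `assign_labels` and `rand_index` are hypothetical. Any of the optimizers named in the abstract (BBO, GWO, GA, PSO) could minimize such a function over candidate vectors.

```python
from itertools import combinations

import numpy as np


def _distances(candidate, data, k):
    """Distance matrix: entry [i, j] is the Euclidean distance from point i to center j."""
    data = np.asarray(data, dtype=float)
    # Assumption: the candidate is a flat vector of k centers, each with data.shape[1] coordinates.
    centers = np.asarray(candidate, dtype=float).reshape(k, data.shape[1])
    return np.linalg.norm(data[:, None, :] - centers[None, :, :], axis=2)


def clustering_objective(candidate, data, k):
    """Sum of distances from every point to its nearest center (lower is better)."""
    return _distances(candidate, data, k).min(axis=1).sum()


def assign_labels(candidate, data, k):
    """Label each point with the index of its nearest center."""
    return _distances(candidate, data, k).argmin(axis=1)


def rand_index(labels_a, labels_b):
    """Fraction of point pairs on which two labelings agree (same cluster vs. different cluster)."""
    pairs = list(combinations(range(len(labels_a)), 2))
    agreements = sum(
        (labels_a[i] == labels_a[j]) == (labels_b[i] == labels_b[j])
        for i, j in pairs
    )
    return agreements / len(pairs)


if __name__ == "__main__":
    # Toy data: two well-separated groups of two points each.
    points = np.array([[0.0, 0.0], [0.0, 1.0], [10.0, 10.0], [10.0, 11.0]])
    candidate = [0.0, 0.5, 10.0, 10.5]  # two 2-D centers, flattened
    print(clustering_objective(candidate, points, k=2))
    print(rand_index(assign_labels(candidate, points, k=2), [1, 1, 0, 0]))
```

The pairwise Rand Index above costs O(n^2) in the number of points, which is adequate for small datasets but would need a contingency-table formulation for large ones.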
Keywords:Clustering  Optimization  Evolutionary algorithms  PSO  GWO  BBO  K-means
This article is indexed in ScienceDirect and other databases.