

Detecting outliers in a univariate time series dataset using unsupervised combined statistical methods: A case study on surface water temperature
Institution:1. Environmental Technology, School of Industrial Technology, Universiti Sains Malaysia, 11800, Pulau Pinang, Malaysia;2. Centre for Marine and Coastal Studies (CEMACS), Universiti Sains Malaysia, Pulau Pinang, Malaysia;1. Griffith School of Engineering and Built Environment, Griffith University, Parklands Drive, Southport, Queensland 4222, Australia;2. Cities Research Institute, Griffith University, Parklands Drive, Southport, Queensland 4222, Australia;3. Australian Rivers Institute, Griffith University, 170 Kessels Road, Nathan, Queensland 4111, Australia;4. South Australia Water (SAWater), Adelaide, Australia;1. Ikerlan Technology Research Centre, Big Data Architectures Team, Pº J.M. Arizmediarrieta, 2, 20500 Arrasate-Mondragón, Spain;2. Department of Computer Science and Artificial intelligence, Andalusian Research Institute in Data Science and Computational Intelligence (DaSCI), University of Granada, 18071 Granada, Spain
Abstract: Surface water temperature is a vital ecological and climate variable, and monitoring it is critical. An extensive sensor network measures the ocean, but outliers pervade the monitoring data because of sudden changes in the water surface level. No single algorithm identifies these outliers efficiently. Hence, this work proposes and evaluates three statistical outlier detection algorithms for surface water temperature: 1) the Standard Z-Score method, 2) the Modified Z-Score coupled with decomposition, and 3) the Exponential Moving Average coupled with the Modified Z-Score and decomposition. A threshold was set to flag outlier values, and each model's performance was evaluated using the F-score. The results showed that flagging more points as outliers can reduce precision, that is, the share of flagged points that are actual outliers. The Exponential Moving Average with the Modified Z-Score gave the highest F-score (0.83) of the three methods. This proposed algorithm is therefore recommended for detecting outliers efficiently in large surface water temperature datasets.
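The three detectors and the F-score evaluation named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not give the thresholds, the smoothing factor, or the decomposition procedure, so the values below (a cutoff of 3.5, the 0.6745 scaling constant commonly used for the Modified Z-Score, `alpha=0.3`) and all function names are assumptions, and the decomposition step is omitted. The sample temperatures are made up for illustration.

```python
from statistics import mean, median, pstdev


def z_scores(x):
    """Standard Z-Score: distance from the mean in standard deviations."""
    mu, sigma = mean(x), pstdev(x)
    if sigma == 0:
        return [0.0] * len(x)
    return [(v - mu) / sigma for v in x]


def modified_z_scores(x):
    """Modified Z-Score: uses the median and the median absolute deviation."""
    med = median(x)
    mad = median([abs(v - med) for v in x])
    if mad == 0:
        return [0.0] * len(x)
    # 0.6745 is the conventional scaling constant (an assumption here).
    return [0.6745 * (v - med) / mad for v in x]


def ema(x, alpha=0.3):
    """Exponential moving average; alpha is an assumed smoothing factor."""
    out = [x[0]]
    for v in x[1:]:
        out.append(alpha * v + (1 - alpha) * out[-1])
    return out


def flag(scores, threshold):
    """Mark a point as an outlier when its absolute score exceeds the threshold."""
    return [abs(s) > threshold for s in scores]


def ema_modified_z_flags(x, alpha=0.3, threshold=3.5):
    """Score the residuals around the EMA with the Modified Z-Score."""
    residuals = [v - s for v, s in zip(x, ema(x, alpha))]
    return flag(modified_z_scores(residuals), threshold)


def f_score(pred, truth):
    """F1 score: harmonic mean of precision and recall."""
    tp = sum(p and t for p, t in zip(pred, truth))
    fp = sum(p and not t for p, t in zip(pred, truth))
    fn = sum(t and not p for p, t in zip(pred, truth))
    if tp == 0:
        return 0.0
    precision, recall = tp / (tp + fp), tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical temperatures in degrees Celsius with one injected spike.
temps = [28.1, 28.3, 28.2, 28.4, 35.0, 28.3, 28.2, 28.5, 28.4, 28.3]
truth = [i == 4 for i in range(len(temps))]

standard_flags = flag(z_scores(temps), 3.0)
modified_flags = flag(modified_z_scores(temps), 3.5)
ema_flags = ema_modified_z_flags(temps)

for name, flags in [("standard", standard_flags),
                    ("modified", modified_flags),
                    ("ema+modified", ema_flags)]:
    print(name, f_score(flags, truth))
```

Because the mean and standard deviation are themselves inflated by a spike, the Standard Z-Score can understate how extreme that spike is, which is one reason median-based scores are often preferred for short or contaminated series.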
This article is indexed in ScienceDirect and other databases.