SPSRG: a prediction approach for correlated failures in distributed computing systems
Authors: Weiwei Zheng, Zhili Wang, Haoqiu Huang, Luoming Meng, Xuesong Qiu
Affiliation: State Key Laboratory of Networking and Switching Technology, Beijing University of Posts and Telecommunications, Beijing, China
Abstract: Failure instances in distributed computing systems (DCSs) exhibit temporal and spatial correlations: a single failure can trigger a set of further failures, either simultaneously or in succession within a short time interval. In this work, we propose a correlated failure prediction approach (CFPA) to predict correlated failures of computing elements in DCSs. The approach models correlated-failure patterns using the concept of probabilistic shared risk groups and predicts correlated failures by running association rule mining in parallel. We conduct extensive experiments to evaluate the feasibility and effectiveness of CFPA using both failure traces from Los Alamos National Lab and simulated datasets. The experimental results show that the proposed approach outperforms the other approaches in both failure prediction performance and execution time, and that it can potentially provide better prediction performance in a larger system.
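
The following is a minimal illustrative sketch of the general idea of mining association rules from failure co-occurrences. It is not the authors' CFPA algorithm. It assumes that failures have already been grouped into short time windows, and it treats the confidence of a rule "x fails => y fails" as a rough estimate of a conditional failure probability. The function name mine_pair_rules, the thresholds, and the node IDs are hypothetical.

```python
from collections import Counter
from itertools import combinations


def mine_pair_rules(windows, min_support=2, min_confidence=0.6):
    """Mine pairwise rules x -> y from failure windows.

    windows: iterable of sets of node IDs that failed in the same time window
             (assumed pre-grouped; the windowing step is not shown here).
    Returns a dict mapping (x, y) to the confidence count(x, y) / count(x).
    """
    single = Counter()  # number of windows in which each node failed
    pair = Counter()    # number of windows in which both nodes failed
    for window in windows:
        nodes = sorted(set(window))
        single.update(nodes)
        pair.update(combinations(nodes, 2))

    rules = {}
    for (a, b), n_ab in pair.items():
        if n_ab < min_support:
            continue  # the pair co-fails too rarely to be trusted
        for x, y in ((a, b), (b, a)):
            confidence = n_ab / single[x]
            if confidence >= min_confidence:
                rules[(x, y)] = confidence
    return rules


# Hypothetical toy trace: four time windows of failed nodes.
windows = [{"n1", "n2"}, {"n1", "n2", "n3"}, {"n1"}, {"n3"}]
rules = mine_pair_rules(windows)
```

In this toy trace only the pair (n1, n2) meets the support threshold, so the sketch yields the two rules n1 => n2 and n2 => n1. Because each window is counted independently, the counting loop could be split across workers and the counters merged, which is one simple way such mining can be parallelized.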
This article is indexed in SpringerLink and other databases.