SmartJoin: a network-aware multiway join for MapReduce
Authors: Kenn Slagter, Ching-Hsien Hsu, Yeh-Ching Chung, Gangman Yi
Affiliations: 1. Department of Computer Science, National Tsing Hua University, Hsinchu, Taiwan, ROC
2. Department of Computer Science, Chung Hua University, Hsinchu, Taiwan, ROC
3. Department of Computer Science, Gangneung-Wonju National University, Gangwon, Gangneung, South Korea
Abstract: MapReduce is an effective tool for processing large amounts of data in parallel on a cluster of processors or computers. One common data processing task is the join operation, which combines two or more datasets based on values common to each. In this paper, we present SmartJoin, a network-aware multiway join for MapReduce that improves performance by taking network traffic into account when redistributing workload among reducers. SmartJoin achieves this by dynamically redistributing tuples directly between reducers using an intelligent network-aware algorithm. We show that the presented technique has significant potential to minimize the time required to join multiple datasets. In our evaluation, SmartJoin achieves up to a 39% improvement over the non-redistribution method, a 26.8% improvement over random redistribution, and a 27.6% improvement over worst join redistribution.
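To make the join operation described in the abstract concrete, the following is a minimal in-memory sketch of a reduce-side multiway join in the MapReduce style, written in Python. It is an illustration only and is not the SmartJoin algorithm from the paper. All function names are hypothetical, rows are assumed to carry the join key in their first field, and `pick_target` is an assumed toy heuristic (reducer load plus a made-up link cost) that only hints at what a network-aware choice of redistribution target might look like.

```python
from collections import defaultdict
from itertools import product


def map_phase(datasets):
    """Emit (join_key, (dataset_index, remaining_fields)) for every row.

    Assumption: the join key is the first field of each row.
    """
    for idx, rows in enumerate(datasets):
        for row in rows:
            yield row[0], (idx, row[1:])


def shuffle(pairs, num_reducers):
    """Hash-partition keys across reducers, like a default partitioner."""
    partitions = [defaultdict(list) for _ in range(num_reducers)]
    for key, value in pairs:
        partitions[hash(key) % num_reducers][key].append(value)
    return partitions


def reduce_join(partition, num_datasets):
    """Join all datasets on the shared key inside one reducer's partition."""
    out = []
    for key, values in partition.items():
        buckets = [[] for _ in range(num_datasets)]
        for idx, rest in values:
            buckets[idx].append(rest)
        # product() yields nothing if any dataset lacks this key (inner join).
        for combo in product(*buckets):
            out.append((key,) + tuple(f for rest in combo for f in rest))
    return out


def join_all(datasets, num_reducers):
    """Run map, shuffle, and reduce in sequence and collect every result."""
    partitions = shuffle(map_phase(datasets), num_reducers)
    results = []
    for partition in partitions:
        results.extend(reduce_join(partition, len(datasets)))
    return results


def pick_target(loads, link_cost, source):
    """Hypothetical heuristic, NOT the paper's method: pick the reducer that
    minimizes its current load plus the cost of the link from `source`."""
    candidates = [r for r in range(len(loads)) if r != source]
    return min(candidates, key=lambda r: loads[r] + link_cost[source][r])
```

For example, calling `join_all([users, orders, cities], 3)` on three lists of tuples keyed by a shared ID returns one combined tuple for every combination of matching rows, and drops any key that is missing from at least one dataset. A real deployment would run the reducers on separate machines, which is where the cost of moving tuples between them starts to matter.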
This article is indexed in SpringerLink and other databases.