Abstract
To address the shortcomings of Hadoop's default scheduling algorithm and of the LATE scheduling algorithm in heterogeneous environments, this paper proposes an enhanced self-adaptive MapReduce scheduling algorithm based on the SAMR scheduling algorithm. The algorithm records historical information for each node and uses K-means clustering to dynamically adjust the stage progress values, so as to identify the slow tasks that genuinely need a backup task launched. Experimental results show that the enhanced self-adaptive MapReduce scheduling algorithm is effective to some extent in reducing the estimation error of task execution time and in accurately identifying slow tasks.
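The idea in the abstract can be illustrated with a minimal Python sketch. It is not the paper's implementation. It assumes a map task with two stages whose first-stage weight `m1` is drawn from per-node history, clusters that history with a plain one-dimensional K-means, and then applies the LATE-style estimate (remaining progress divided by progress rate). All function names, the sample history, and the `provisional_weight` input are hypothetical.

```python
from statistics import mean


def kmeans_1d(values, k, iters=50):
    """Plain Lloyd's k-means on scalars, seeded deterministically
    from evenly spaced points of the sorted input."""
    pts = sorted(values)
    step = max(k - 1, 1)
    centroids = [pts[i * (len(pts) - 1) // step] for i in range(k)]
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for v in pts:
            nearest = min(range(k), key=lambda j: abs(v - centroids[j]))
            clusters[nearest].append(v)
        # Keep the old centroid if a cluster ends up empty.
        updated = [mean(c) if c else centroids[j] for j, c in enumerate(clusters)]
        if updated == centroids:
            break
        centroids = updated
    return centroids


def pick_stage_weight(centroids, provisional_weight):
    """Assumption: use the centroid closest to a provisional weight
    observed for the running task."""
    return min(centroids, key=lambda c: abs(c - provisional_weight))


def progress_score(stage, sub_progress, m1):
    """Overall progress of a two-stage map task with stage weights m1 and 1 - m1."""
    if stage == 1:
        return m1 * sub_progress
    return m1 + (1.0 - m1) * sub_progress


def time_to_end(score, elapsed_seconds):
    """LATE-style estimate: remaining progress divided by progress rate."""
    if score <= 0:
        return float("inf")
    rate = score / elapsed_seconds
    return (1.0 - score) / rate


# Hypothetical history of first-stage weights recorded on one node.
history_m1 = [0.58, 0.60, 0.62, 0.88, 0.90, 0.92]
centroids = kmeans_1d(history_m1, k=2)
m1 = pick_stage_weight(centroids, provisional_weight=0.85)
score = progress_score(stage=1, sub_progress=0.5, m1=m1)
remaining = time_to_end(score, elapsed_seconds=45.0)
```

Under these assumptions, a scheduler would compute `remaining` for each running task and treat those with the longest estimated time to end as backup candidates. How the paper actually selects the cluster and the threshold for a slow task is not stated in the abstract.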
Source
《计算机工程与应用》
CSCD
2013, No. 19, pp. 39-43, 140 (6 pages in total)
Computer Engineering and Applications
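Funding
National Natural Science Foundation of China (No. 12A520021)
Key Project Financially Supported by the Department of Science and Technology of Henan Province (No. 122102310309, No. 122102210117)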