Abstract
To address master-node and worker-node failures in the traditional MapReduce framework, this paper proposes adding unit clusters to a hierarchical master-slave MapReduce framework that is configured with master backup nodes. In the improved framework, the unit cluster is the smallest unit of task processing for a sub-master node. When a worker node in a unit cluster fails or exceeds the time threshold, the sub-master node assigns the task to an idle worker node in the same unit cluster and does not need to retransmit the task's file block. This saves the time otherwise spent reselecting a worker node and reduces the load on the network. In experiments with this framework across different numbers of data blocks, the disaster recovery time of worker nodes was reduced by about 25 ms in every case, which shows that the unit-cluster approach can effectively handle worker-node failures.
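The reassignment step described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the class names (`Worker`, `UnitCluster`, `SubMaster`), the `transfers` counter, and the idea that a file block sent once to a unit cluster is readable by every worker in that cluster are all assumptions made here for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Set


@dataclass
class Worker:
    """Hypothetical worker node inside a unit cluster."""
    name: str
    alive: bool = True
    busy: bool = False


@dataclass
class UnitCluster:
    """Hypothetical unit cluster: a group of workers sharing stored file blocks."""
    workers: List[Worker]
    blocks: Set[str] = field(default_factory=set)  # block ids already in this unit

    def idle_worker(self) -> Optional[Worker]:
        # Return the first worker that is alive and not running a task.
        for worker in self.workers:
            if worker.alive and not worker.busy:
                return worker
        return None


class SubMaster:
    """Hypothetical sub-master node that schedules tasks on one unit cluster."""

    def __init__(self, unit: UnitCluster, timeout_s: float) -> None:
        self.unit = unit
        self.timeout_s = timeout_s
        self.assignment: Dict[str, Worker] = {}  # block id -> current worker
        self.transfers = 0  # number of block transfers into the unit cluster

    def assign(self, block_id: str) -> Worker:
        # Assumption: a block is transferred to the unit cluster only once.
        if block_id not in self.unit.blocks:
            self.unit.blocks.add(block_id)
            self.transfers += 1
        worker = self.unit.idle_worker()
        if worker is None:
            raise RuntimeError("no idle worker left in the unit cluster")
        worker.busy = True
        self.assignment[block_id] = worker
        return worker

    def check(self, block_id: str, elapsed_s: float) -> Worker:
        # Keep the current worker if it is alive and within the time threshold.
        worker = self.assignment[block_id]
        if worker.alive and elapsed_s <= self.timeout_s:
            return worker
        # Otherwise treat it as failed and pick an idle worker in the same
        # unit cluster; the block is already there, so nothing is retransmitted.
        worker.alive = False
        worker.busy = False
        return self.assign(block_id)
```

In this sketch, calling `check` after a failure or a timeout moves the task to another worker while `transfers` stays unchanged, which mirrors the claimed saving in network traffic. How the real framework stores blocks within a unit cluster is not stated in the abstract.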
Source
Microcomputer & Its Applications (《微型机与应用》)
2013, Issue 16, pp. 81-84 (4 pages)