Abstract
Current Hadoop job schedulers abstract the system's multiple resource types into a single resource and allocate resources in fixed-size partitions of a node, called slots. These slot-based schedulers ignore the differences among resource types and the differing resource demands of different kinds of jobs, which leads to poor throughput and long average job completion time. This paper studies how to implement fair scheduling in a multi-resource environment in Hadoop and designs a Multi-resource Fair Scheduler (MFS). MFS adopts the idea of Dominant Resource Fairness (DRF): it uses a demand vector to describe a job's demand for each type of resource and allocates resources to the job according to the amounts in that vector. MFS uses the system's resources more fully and efficiently and satisfies jobs with heterogeneous resource demands. Experimental results show that, compared with Hadoop's slot-based Fair Scheduler and Capacity Scheduler, MFS achieves higher throughput and a shorter average job completion time.
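The DRF idea referred to above can be illustrated with a minimal sketch. This is not the MFS implementation: it assumes a single pooled cluster rather than per-node accounting, and the function name `drf_allocate` and its parameters are hypothetical. At each step the job with the smallest dominant share (its largest fractional use of any one resource) receives one more task sized by its demand vector. As a simplification, a job whose next task no longer fits is dropped from consideration and the loop continues with the remaining jobs.

```python
from typing import Dict, List


def drf_allocate(capacity: List[float],
                 demands: Dict[str, List[float]]) -> Dict[str, int]:
    """Return the number of tasks granted to each job under a simplified DRF.

    capacity: total amount of each resource, e.g. [cpus, memory_gb].
    demands:  per-task demand vector of each job, in the same resource order.
    """
    if any(c <= 0 for c in capacity):
        raise ValueError("every resource capacity must be positive")
    for job, demand in demands.items():
        if len(demand) != len(capacity) or not any(d > 0 for d in demand):
            raise ValueError(f"invalid demand vector for job {job!r}")

    used = [0.0] * len(capacity)          # resources handed out so far
    tasks = {job: 0 for job in demands}   # tasks granted per job
    active = set(demands)                 # jobs that may still receive tasks

    def dominant_share(job: str) -> float:
        # Largest fraction of any single resource currently held by the job.
        return max(tasks[job] * d / c
                   for d, c in zip(demands[job], capacity))

    while active:
        # Pick the job with the smallest dominant share (ties broken by name).
        job = min(sorted(active), key=dominant_share)
        demand = demands[job]
        if all(u + d <= c for u, d, c in zip(used, demand, capacity)):
            used = [u + d for u, d in zip(used, demand)]
            tasks[job] += 1
        else:
            # Simplification: this job's next task does not fit, so stop
            # offering it resources and continue with the others.
            active.remove(job)
    return tasks


if __name__ == "__main__":
    # Illustrative numbers: 9 CPUs and 18 GB of memory shared by two jobs.
    print(drf_allocate([9, 18], {"A": [1, 4], "B": [3, 1]}))
```

With these illustrative numbers, job A (memory-dominant) ends with three tasks and job B (CPU-dominant) with two, so both hold a dominant share of two thirds. A slot-based scheduler cannot express this distinction, because every slot has the same fixed size regardless of what the job needs.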
Source
Journal of Integration Technology (《集成技术》)
2012, No. 3, pp. 66-71 (6 pages)