摘要
迭代计算是相同逻辑的重复执行,在各种机器学习和数据挖掘方法中被广泛使用.在大数据的处理与分析领域中,分布式迭代计算更是当前的热点研究问题之一.容错机制是分布式系统高可用性的必要保证.现有分布式系统的容错机制虽然在高可用性上表现良好,但忽略了面向迭代计算的容错效率问题.本文针对批流混合大数据计算系统Apache Flink的迭代容错效率问题,进行了系统的研究.执行流处理任务时,Flink采用“分布式快照”的检查点机制来完成容错.对于海量数据的迭代分析,检查点增加了不必要的延迟.执行批处理任务时,Flink采用从头执行任务的方式来实现容错,该方式虽然实现简单,但带来了很大的时间开销.针对以上问题,本文首先提出了一种基于补偿函数的乐观迭代容错机制.该容错机制在迭代任务发生故障时采用乐观补偿的思想恢复任务,在迭代执行过程中不采用任何额外的容错手段(不会引入额外的容错开销),采用用户自定义的补偿函数收集健康节点上的迭代数据,并结合初始的迭代数据对故障节点上丢失的分区数据进行恢复,继续执行至迭代收敛状态,保证了迭代任务的高效顺利执行.由于乐观迭代容错机制并不保证得到的结果与无故障执行得到的结果完全一致,因此针对精度要求较高的迭代任务,本文结合Flink系统的迭代数据流模型,进一步提出一种基于头尾检查点悲观迭代容错机制.与传统的阻塞检查点(阻塞下游操作符)的工作方式不同,该容错机制以非阻塞的方式编写检查点,充分结合Flink迭代数据流的特点,将可变数据集的检查点注入迭代流本身.通过设计迭代感知,简化了系统架构,降低了检查点成本和故障恢复时间.本文基于Flink系统,在大量的真实数据集和模拟数据集上,从增量迭代和全量迭代两方面对提出的两种容错机制进行了全面的实验研究,验证了本文提出的迭代容错优化技术的高效性.实验结果证实,本文基于Flink系统提出的乐观容错机制和悲观容错机制在计算效率上均优于现有的分布式迭代容错机制.前者在全量迭代计算任务中运行时间最高可提升22.8%,在增量迭代计算任务中最高可提升33.8%;后者在全量迭代任务中最高可节省15.3%的时间开销,在增量迭代任务中最高可节省18.5%的时间开销.
Iterative calculation is the repeated execution of the same logic and is widely used in various machine learning and data mining methods.In the field of big data processing and analysis,distributed iterative computing is one of the current hot research issues.Fault tolerance is a necessary guarantee for high availability of distributed systems.Although the fault tolerance mechanism of existing distributed systems performs well in high availability,it ignores the problem of fault tolerance efficiency for iterative computing.This paper systematically studies the iterative fault-tolerant efficiency of batch-flow hybrid big data computing system Apache Flink.When performing stream processing tasks,Flink uses a“distributed snapshot”checkpoint mechanism to complete fault tolerance.For iterative analysis of massive data,checkpoints add unnecessary delay.When performing batch processing tasks,Flink uses the task execution method from the beginning to achieve fault tolerance.Although this method is simple to implement,it brings a lot of time overhead.In view of the above problems,this paper first proposes an optimistic iterative fault tolerance mechanism based on compensation functions.This fault-tolerant mechanism uses optimistic compensation to recover tasks when iterative tasks fail.It does not use any additional fault-tolerant methods(it does not introduce additional fault-tolerant overhead)during iterative execution,and uses user-defined compensation functions to collect healthy nodes.Iterative data,combined with the initial iterative data,recovers the lost partition data on the failed node,and continues execution to the iterative convergence state,ensuring the efficient and smooth execution of the iterative task.Because the optimistic iterative fault tolerance mechanism does not guarantee that the results obtained are completely consistent with the results obtained by fault-free execution,for the iteration tasks with higher accuracy requirements,this paper combines the iterative data flow model of the Flink system to further propose a head-to-tail checkpoint.Pessimistic iterative fault tolerance mechanism.Unlike traditional blocking checkpoints(blocking downstream operators),this fault-tolerant mechanism writes checkpoints in a non-blocking manner,fully combines the characteristics of Flink iterative data flow,and injects variable data set checkpoints into the iterative flow itself.By designing iterative awareness,the system architecture is simplified,and checkpoint costs and failure recovery times are reduced.This paper is based on the Flink system.On a large number of real data sets and simulated data sets,a comprehensive experimental study of the two proposed fault tolerance mechanisms from the aspects of incremental iteration and full iteration is conducted,and the effectiveness of the proposed iterative fault tolerance optimization technology is verified.Efficiency.The experimental results confirm that the optimistic and pessimistic fault-tolerant mechanisms proposed in this paper based on the Flink system are superior to the existing distributed iterative fault-tolerant mechanisms in terms of computational efficiency.The former can increase the running time by up to 22.8%in full iterative computing tasks and up to 33.8%in incremental iterative computing tasks;the latter can save up to 15.3%of the time overhead in full iterative tasks,and in incremental iterative tasks Saves up to 18.5%of time.
作者
郭文鹏
赵宇海
王国仁
韦刘国
GUO Wen-Peng;ZHAO Yu-Hai;WANG Guo-Ren;WEI Liu-Guo(School of Computer Science and Engineering,Northeastern University,Shenyang 110169;School of Computer Science and Technology,Beijing Institute of Technology University,Beijing 100081)
出处
《计算机学报》
EI
CSCD
北大核心
2020年第11期2101-2118,共18页
Chinese Journal of Computers
基金
科技部重点研发项目“云计算和大数据”重点专项项目(2018YFB1004402)
国家自然科学基金(61772124)资助.
关键词
分布式迭代计算
Apache
Flink
乐观容错
悲观容错
检查点
distributed iterative calculation
Apache Flink
optimistic fault tolerance
pessimistic fault tolerance
checkpoint