Parallel implementation of the probabilistic spreading algorithm using the MapReduce programming model (cited by: 1)
Abstract: To overcome the limits of the single-machine serial probabilistic spreading algorithm on large-scale datasets, the algorithm was parallelized with the MapReduce programming model. The serial algorithm was split along its processing flow, with each step implemented as one MapReduce program, and the input and output data of every step were stored in the Hadoop Distributed File System (HDFS). Hit ratio was used to compare the parallelized probabilistic spreading algorithm with the global ranking method, and speedup was measured for different data volumes and different numbers of nodes. The experimental results showed that the MapReduce-based algorithm, deployed on a Hadoop cluster, achieved a significantly higher hit ratio than the global ranking method and parallelized well. Parallelization extended the data scale that the single-machine algorithm could handle and increased its computation speed.
Source: Journal of Shandong University (Engineering Science) (《山东大学学报(工学版)》), indexed in CAS and the Peking University Core Journals list, 2015, No. 5, pp. 22-28 (7 pages)
Funding: Supported by the Beijing Municipal Education Commission Fund (PXM2011_014204_09_000232)
Keywords: MapReduce; cloud computing platform; bipartite network; probabilistic spreading algorithm; distributed
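The abstract says the serial algorithm was split into steps, each implemented as one MapReduce program, but it does not list the steps. The sketch below is therefore only an illustration. It assumes the standard formulation of probabilistic spreading on a user-item bipartite network: each item the target user has collected passes one unit of resource equally to its users, and each user passes what it received equally to its items. The helper `run_job`, the three-job split, and all function names are hypothetical. `run_job` is an in-memory stand-in for a Hadoop job, not Hadoop API code, and the input is assumed to contain no duplicate (user, item) pairs.

```python
from collections import defaultdict


def run_job(records, mapper, reducer):
    """In-memory stand-in for one MapReduce job: map, group by key, reduce."""
    groups = defaultdict(list)
    for record in records:
        for key, value in mapper(record):
            groups[key].append(value)
    output = []
    for key, values in groups.items():
        output.extend(reducer(key, values))
    return output


def probs_scores(edges, target_user):
    """Score items for target_user with three chained jobs (assumed split)."""
    collected = {item for user, item in edges if user == target_user}

    # Job 1: every collected item splits one unit of resource among its users.
    def map_by_item(edge):
        user, item = edge
        yield item, user

    def spread_to_users(item, users):
        if item in collected:
            for user in users:
                yield user, 1.0 / len(users)

    user_resource = run_job(edges, map_by_item, spread_to_users)

    # Job 2: every user splits the resource it received among its items.
    def map_by_user(record):
        tag, user, payload = record
        yield user, (tag, payload)

    def spread_to_items(user, values):
        items = [p for tag, p in values if tag == "edge"]
        total = sum(p for tag, p in values if tag == "res")
        for item in items:
            yield item, total / len(items)

    job2_input = [("edge", u, i) for u, i in edges]
    job2_input += [("res", u, r) for u, r in user_resource]
    item_shares = run_job(job2_input, map_by_user, spread_to_items)

    # Job 3: sum the shares that arrive at each item.
    def identity(pair):
        yield pair

    def sum_shares(item, shares):
        yield item, sum(shares)

    return dict(run_job(item_shares, identity, sum_shares)), collected


def recommend(edges, target_user, top_n=10):
    """Rank items the user has not collected by final resource, highest first."""
    scores, collected = probs_scores(edges, target_user)
    ranked = [(i, s) for i, s in scores.items() if i not in collected and s > 0]
    ranked.sort(key=lambda pair: (-pair[1], pair[0]))
    return [item for item, _ in ranked[:top_n]]


# Toy bipartite network of (user, item) pairs, invented for illustration.
EDGES = [("u1", "a"), ("u1", "b"), ("u2", "b"), ("u2", "c"),
         ("u3", "a"), ("u3", "c"), ("u3", "d")]
print(recommend(EDGES, "u1"))
```

On this toy network, user `u1` has collected items `a` and `b`, so two units of resource enter the network and the final scores sum to two. Items `c` and `d` are the candidates, with `c` ranked first. On a real cluster, each `run_job` call would correspond to a separate job reading from and writing to HDFS, which matches the abstract's statement that every step's input and output are stored there.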