期刊文献+

面向异构架构的传递闭包并行算法 被引量:3

Parallel Transitive Closure Algorithm for Heterogeneous Architecture
下载PDF
导出
摘要 传统求图传递闭包的方法存在计算量大与计算时间长的问题。为加快处理大数据量的传递闭包算法的计算速度,结合算法密集计算和开放式计算语言(OpenCL)框架的特征,采用本地存储器优化的并行子矩阵乘和分块的矩阵乘并行计算,提出一种基于OpenCL的传递闭包并行算法。利用本地存储器优化的并行子矩阵乘算法来优化计算步骤,提高图形处理器(GPU)的存储器利用率,降低数据获取延迟。通过分块矩阵乘并行计算算法实现大数据量的矩阵乘,提高GPU计算核心的利用率。数据结果表明,与CPU串行算法、基于开放多处理的并行算法和基于统一设备计算架构的并行算法相比,传递闭包并行算法在OpenCL架构下NVIDIA GeForce GTX 1070计算平台上分别获得了593.14倍、208.62倍和1.05倍的加速比。 The traditional method for obtaining the transitive closure of the graphs faces the large amount of calculation and long calculation time.In order to improve the computing speed of the transitive closure algorithm for dealing with large amounts of data,an Open Computing Language(OpenCL)-based parallel algorithm for transitive closure is proposed.The algorithm combines the characteristics of algorithm-intensive computation and OpenCL architecture,and uses the parallel submatrix multiplication and block matrix multiplication optimized by local memory for parallel computing.The parallel submatrix multiplication algorithm is used to optimize the computational steps,improves the memory utilization of the Graphic Processing Unit(GPU),and reduces the data acquisition delay.The parallel block matrix multiplication algorithm is used to implement matrix multiplication involving large amounts of data,and improve the utilization of the GPU computing cores.The experimental results show that compared with the sequential CPU-based algorithm,parallel algorithm based on Open Multi-Processing,and parallel algorithm based on Compute Unified Device Architecture(CUDA),the proposed parallel transitive closure algorithm provides a 593.14 times,208.62 times and 1.05 times speedup respectively on the NVIDIA GeForce GTX 1070 computing platform with OpenCL architecture.
作者 肖汉 郭宝云 李彩林 周清雷 XIAO Han;GUO Baoyun;LI Cailin;ZHOU Qinglei(School of Information Science and Technology,Zhengzhou Normal University,Zhengzhou 450044,China;School of Civil and Architectural Engineering,Shandong University of Technology,Zibo,Shandong 255000,China;School of Information Engineering,Zhengzhou University,Zhengzhou 450001,China)
出处 《计算机工程》 CAS CSCD 北大核心 2021年第8期131-139,共9页 Computer Engineering
基金 国家自然科学基金(41601496,41701525,61572444) 山东省自然科学基金(ZR2017LD002) 山东省重点研发计划项目(2018GGX106002)。
关键词 矩阵乘 传递闭包 图形处理器 开放式计算语言 并行算法 matrix multiplication transitive closure Graphic Processing Unit(GPU) Open Computing Language(OpenCL) parallel algorithm
  • 相关文献

参考文献6

二级参考文献44

  • 1陈显强.二元关系的传递性和传递闭包探讨[J].数学的实践与认识,2004,34(9):135-137. 被引量:13
  • 2何小亚,王洪山.利用关系矩阵求传递闭包的一种方法[J].数学的实践与认识,2005,35(3):172-175. 被引量:23
  • 3翟璐璐,谢维奇.关系传递闭包的计算[J].河南教育学院学报(自然科学版),2005,14(1):25-26. 被引量:7
  • 4戚晓芳,徐宝文,周晓宇.一种基于程序可达图的并发程序依赖性分析方法[J].电子学报,2007,35(2):287-291. 被引量:14
  • 5Agrawal R, Borgida A,Jagadish H V. Efficient management of transitive relationships in large data and knowledge bases[C]// SIGMOD. 1989 : 253-262.
  • 6Jin Ruo-ming, Xiang Yang, Ruan Ning, et al. Efficiently answering reachability queries on very large directed graphs[C]///SIGMOD Conference. 2008 : 595-608.
  • 7Wang Hai-xun, He Hao, Yang Jun, et al. Dual Labeling: Answering graph teachability queries in constant time[C] //ICDE' 06. 2006:75.
  • 8Cohen E, Halperin E, Kaplan H. Reachability and distance queries via 2-Hop labels[C]//Proceedings of the 13th annual ACM- SIAM Symposium on Discrete algorithms. 2002 : 937-946.
  • 9Jagadish H V. A compression technique to materialize transitive closure[J]. ACM Trans. Database Syst. , 1990,15(4):558-598.
  • 10Kapoor S, Ramesh H. Algorithms for generating all spanning trees of undirected and weighted graphs[J]. SIAM J. Comput. , 1995,24(2).

共引文献9

同被引文献17

引证文献3

二级引证文献4

相关作者

内容加载中请稍等...

相关机构

内容加载中请稍等...

相关主题

内容加载中请稍等...

浏览历史

内容加载中请稍等...
;
使用帮助 返回顶部