摘要
传统求图传递闭包的方法存在计算量大与计算时间长的问题。为加快处理大数据量的传递闭包算法的计算速度,结合算法密集计算和开放式计算语言(OpenCL)框架的特征,采用本地存储器优化的并行子矩阵乘和分块的矩阵乘并行计算,提出一种基于OpenCL的传递闭包并行算法。利用本地存储器优化的并行子矩阵乘算法来优化计算步骤,提高图形处理器(GPU)的存储器利用率,降低数据获取延迟。通过分块矩阵乘并行计算算法实现大数据量的矩阵乘,提高GPU计算核心的利用率。数据结果表明,与CPU串行算法、基于开放多处理的并行算法和基于统一设备计算架构的并行算法相比,传递闭包并行算法在OpenCL架构下NVIDIA GeForce GTX 1070计算平台上分别获得了593.14倍、208.62倍和1.05倍的加速比。
The traditional method for obtaining the transitive closure of the graphs faces the large amount of calculation and long calculation time.In order to improve the computing speed of the transitive closure algorithm for dealing with large amounts of data,an Open Computing Language(OpenCL)-based parallel algorithm for transitive closure is proposed.The algorithm combines the characteristics of algorithm-intensive computation and OpenCL architecture,and uses the parallel submatrix multiplication and block matrix multiplication optimized by local memory for parallel computing.The parallel submatrix multiplication algorithm is used to optimize the computational steps,improves the memory utilization of the Graphic Processing Unit(GPU),and reduces the data acquisition delay.The parallel block matrix multiplication algorithm is used to implement matrix multiplication involving large amounts of data,and improve the utilization of the GPU computing cores.The experimental results show that compared with the sequential CPU-based algorithm,parallel algorithm based on Open Multi-Processing,and parallel algorithm based on Compute Unified Device Architecture(CUDA),the proposed parallel transitive closure algorithm provides a 593.14 times,208.62 times and 1.05 times speedup respectively on the NVIDIA GeForce GTX 1070 computing platform with OpenCL architecture.
作者
肖汉
郭宝云
李彩林
周清雷
XIAO Han;GUO Baoyun;LI Cailin;ZHOU Qinglei(School of Information Science and Technology,Zhengzhou Normal University,Zhengzhou 450044,China;School of Civil and Architectural Engineering,Shandong University of Technology,Zibo,Shandong 255000,China;School of Information Engineering,Zhengzhou University,Zhengzhou 450001,China)
出处
《计算机工程》
CAS
CSCD
北大核心
2021年第8期131-139,共9页
Computer Engineering
基金
国家自然科学基金(41601496,41701525,61572444)
山东省自然科学基金(ZR2017LD002)
山东省重点研发计划项目(2018GGX106002)。
关键词
矩阵乘
传递闭包
图形处理器
开放式计算语言
并行算法
matrix multiplication
transitive closure
Graphic Processing Unit(GPU)
Open Computing Language(OpenCL)
parallel algorithm