Abstract
Thanks to its parallel processing capability and flexibility, the Field Programmable Gate Array (FPGA) is widely used for hardware-accelerated computation in Convolutional Neural Networks (CNNs). However, traditional FPGA implementations of image convolution offer limited modularity and incur a large storage overhead. This study presents a general-purpose image convolution development platform for hardware acceleration. Its modular design greatly improves flexibility when implementing image convolution for different convolution kernels. In addition, an image batch-processing technique exploits data repetition to share memory, which reduces the storage space required. Experimental results show that the platform provides a better reconfigurable architecture in terms of modular design and is well suited to experimental teaching. In terms of storage, the BRAM complexity grows only linearly as parallelism increases, which is advantageous for reducing power consumption.
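The memory-sharing idea mentioned in the abstract can be illustrated with a small software sketch. It is not the paper's architecture: it assumes a row-streaming line-buffer scheme, a common way for overlapping convolution windows to reuse stored pixels, and the names `conv2d_direct`, `conv2d_line_buffer`, and `rows_needed` are hypothetical. Under that assumption, producing `p` vertically adjacent output rows at once needs `k + p - 1` buffered rows rather than `k * p`, so storage grows linearly with the degree of parallelism.

```python
from collections import deque


def conv2d_direct(image, kernel):
    """Reference: valid-mode sliding-window sum of products (no kernel flip, as in CNNs)."""
    k = len(kernel)
    h, w = len(image), len(image[0])
    return [
        [
            sum(kernel[i][j] * image[r + i][c + j] for i in range(k) for j in range(k))
            for c in range(w - k + 1)
        ]
        for r in range(h - k + 1)
    ]


def conv2d_line_buffer(image, kernel):
    """Same result, but rows are streamed in and only the last k rows are held."""
    k = len(kernel)
    w = len(image[0])
    rows = deque(maxlen=k)  # k row buffers; the oldest row is dropped automatically
    out = []
    for row in image:  # one image row arrives at a time
        rows.append(row)
        if len(rows) == k:  # enough rows buffered to emit one output row
            out.append(
                [
                    sum(kernel[i][j] * rows[i][c + j] for i in range(k) for j in range(k))
                    for c in range(w - k + 1)
                ]
            )
    return out


def rows_needed(k, p):
    """Rows to buffer for p vertically adjacent output rows when shared rows are stored once."""
    return k + p - 1


if __name__ == "__main__":
    img = [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12], [13, 14, 15, 16]]
    sobel_x = [[1, 0, -1], [2, 0, -2], [1, 0, -1]]
    print(conv2d_line_buffer(img, sobel_x) == conv2d_direct(img, sobel_x))
    print([rows_needed(3, p) for p in (1, 2, 4, 8)])
```

Both functions compute the same output; the streamed version only ever holds `k` rows, which is the property a hardware line buffer relies on.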
Author
KAN Bao-Qiang (阚保强), Faculty of Information Technology, Concord College, Fujian Normal University, Fuzhou 350003, China
Source
Computer Systems & Applications (《计算机系统应用》)
2021, No. 2, pp. 77-82 (6 pages)
Funding
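National Natural Science Foundation of China (61201216)
Fujian Province Teacher Education Research Project (JAT191117)
Quanzhou Science and Technology Program (2017T009)
Research Fund of Concord College, Fujian Normal University (KY20200202)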
Keywords
FPGA
hardware acceleration
image convolution
parallelism