Optimization of BP Neural Network Algorithm Based on Many-Core Architecture (cited by: 1)
Abstract: In recent years, many-core processors (Many Integrated Core, MIC) have attracted growing attention, and many-core architectures have become the first choice for many supercomputers. A BP neural network is an artificial neural network trained with the error back-propagation (Back Propagation, BP) algorithm, which places high demands on a processor's floating-point performance. The latest Intel Xeon Phi (KNL) many-core processor reaches a peak double-precision floating-point performance of 3 TFLOPS. This paper vectorizes a BP neural network on KNL and optimizes it further with register blocking and cache blocking. Experimental results show that the fastest processing speed on KNL reaches 220 img/s, with a speedup of 13.2, which is 2.9 times that of the GPU and 2.28 times that of KNC.
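Author: Zhou Wen (周文)
Source: Electronics World (《电子世界》), 2017, No. 3, pp. 48-51 (4 pages)
Funding: Supported by the National Natural Science Foundation of China (Grant No. 61571226) and the Natural Science Foundation of Jiangsu Province (Youth Science Fund) (Grant No. BK20140823)
Keywords: many-core architecture; BP neural network; cache blocking; vectorization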
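
The cache blocking named in the abstract can be illustrated with a small sketch. The abstract does not give the paper's implementation, so everything below is an assumption made for illustration: the function names `matmul_naive` and `matmul_blocked`, the `tile` parameter, and the choice of a matrix product (the dominant operation in a fully connected layer's forward and backward passes) as the kernel. Pure Python is used only to show the loop structure; it yields no speedup here. On KNL such a kernel would more plausibly be written in C with vector instructions, with a small tile of the output held in registers (register blocking).

```python
# Illustrative sketch only: loop tiling (cache blocking) for a matrix product.
# All names and the tile size are hypothetical, not taken from the paper.

def matmul_naive(a, b):
    """Reference triple loop computing C = A x B on lists of lists."""
    n, k, m = len(a), len(b), len(b[0])
    c = [[0.0] * m for _ in range(n)]
    for i in range(n):
        for p in range(k):
            aip = a[i][p]
            for j in range(m):
                c[i][j] += aip * b[p][j]
    return c


def matmul_blocked(a, b, tile=4):
    """Same product, but iterated tile by tile.

    Each (i0, p0, j0) step touches at most tile x tile elements of A, B
    and C, so in a compiled language the working set can stay in cache
    while it is reused.
    """
    n, k, m = len(a), len(b), len(b[0])
    c = [[0.0] * m for _ in range(n)]
    for i0 in range(0, n, tile):              # tile over rows of C
        for p0 in range(0, k, tile):          # tile over the shared dimension
            for j0 in range(0, m, tile):      # tile over columns of C
                # min() handles edge tiles when sizes are not multiples of tile
                for i in range(i0, min(i0 + tile, n)):
                    row_c = c[i]
                    for p in range(p0, min(p0 + tile, k)):
                        aip = a[i][p]
                        row_b = b[p]
                        for j in range(j0, min(j0 + tile, m)):
                            row_c[j] += aip * row_b[j]
    return c
```

For any tile size the blocked version accumulates each output element over the shared dimension in the same ascending order as the naive loop, so the two functions return identical results. In a real implementation the tile size would be tuned to the cache sizes of the target processor.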