Research on homegrown manycore architecture for intelligent computing (面向智能计算的国产众核处理器架构研究)

Cited by: 2
Abstract: The demand of artificial intelligence (AI) for computing power is currently growing faster than Moore's law. AI algorithms are highly parallel and exhibit strong data reuse, which opens a larger design space for processor architecture. With their strong on-chip computing power, flexible on-chip architecture, efficient on-chip communication, and flexibly optimized storage, manycore processors give AI broader room for development. After reviewing the development history of manycore processors, this paper outlines the main technical routes and focuses on the requirements that AI applications place on the architecture and key features of domestic manycore processors.
Authors: Hongliang LI (李宏亮), Fang ZHENG (郑方), Ziyu HAO (郝子宇), Hongguang GAO (高红光), Feng GUO (过锋), Yong TANG (唐勇), Hui LV (吕晖), Xin LIU (刘鑫), Fangyuan CHEN (陈芳园) (Jiangnan Institute of Computing Technology, Wuxi 214083, China)
Source: Scientia Sinica Informationis (《中国科学:信息科学》), 2019, No. 3, pp. 247-255 (9 pages). Indexed in CSCD and the Peking University Core Journals list.
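Funding: Supported by the "Core Electronic Devices, High-end Generic Chips, and Basic Software" (核高基) major project "Intelligent computing unit for data centers (cloud platforms) and cluster computing" (Grant No. 2018ZX01028-102).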
Keywords: manycore processor; intelligent computing; computer architecture; communication mechanism; memory system