
Taxonomy of Data Prefetching for Multicore Processors (cited by: 1)

Abstract: Data prefetching is an effective data access latency hiding technique to mask the CPU stall caused by cache misses and to bridge the performance gap between processor and memory. With hardware and/or software support, data prefetching brings data closer to a processor before it is actually needed. Many prefetching techniques have been developed for single-core processors. Recent developments in processor technology have brought multicore processors into the mainstream. While some of the single-core prefetching techniques are directly applicable to multicore processors, numerous novel strategies have been proposed in the past few years to take advantage of multiple cores. This paper aims to provide a comprehensive review of the state-of-the-art prefetching techniques, and proposes a taxonomy that classifies various design concerns in developing a prefetching strategy, especially for multicore processors. We compare various existing methods through analysis as well.
Source: Journal of Computer Science & Technology (计算机科学技术学报(英文版)), SCIE EI CSCD, 2009, No. 3, pp. 405-417 (13 pages)
Funding: Supported in part by the National Science Foundation of the USA under Grant Nos. EIA-0224377, CNS-0406328, CNS-0509118, and CCF-0621435.
Keywords: taxonomy of prefetching strategies, multicore processors, data prefetching, memory hierarchy
