An efficient GPU-accelerated CPML scheme for the finite-difference time-domain method

High performance CPML acceleration scheme with GPU for FDTD

Abstract: To address the computational redundancy and memory-access redundancy of conventional GPU-accelerated (parallel) CPML, a division-free, minimum-memory-access CPML update scheme is proposed for the finite-difference time-domain (FDTD) method on graphics processing units. The scheme rearranges the CPML iteration formulas so that every division is absorbed into fixed coefficients, which removes the division operations that are costly on a GPU. It then merges the regular FDTD field update and the CPML update inside the PML region, which eliminates the memory accesses duplicated between the two steps and minimizes the memory traffic of the algorithm. Numerical validation shows that, at the same accuracy, the computation time of the CPML update is reduced by up to 70% and that of the overall field update in the PML region by 44%, relative to the conventional GPU-based CPML implementation.
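The abstract does not reproduce the update equations, so the following is only a minimal sketch of the two ideas it names, under stated assumptions: a 1D H-field update with the standard Roden-Gedney CPML recursion, normalized units, and made-up grading parameters. All function names (`make_cpml_profiles`, `update_reference`, `precompute`, `update_fused`, `demo`) are hypothetical and are not taken from the paper. The reference version divides inside the loop and reads the E field in two separate passes; the fused version folds `dt`, `mu`, `dx` and `kappa` into precomputed coefficients and reads each E difference once. On a GPU, one loop iteration would typically map to one thread.

```python
import math


def make_cpml_profiles(n, dt, sigma_max=1.5, kappa_max=5.0, alpha=0.05, eps0=1.0):
    """Hypothetical cubic-graded CPML profiles for n PML cells (normalized units)."""
    kappa, b, a = [], [], []
    for i in range(n):
        depth = (i + 0.5) / n                      # 0 at the interface, 1 at the outer wall
        sigma = sigma_max * depth ** 3
        k = 1.0 + (kappa_max - 1.0) * depth ** 3
        bi = math.exp(-(sigma / k + alpha) * dt / eps0)
        ai = sigma * (bi - 1.0) / (k * (sigma + k * alpha))
        kappa.append(k)
        b.append(bi)
        a.append(ai)
    return kappa, b, a


def update_reference(hy, psi, ez, kappa, b, a, dt, mu, dx):
    """Two passes with divisions in the loop: FDTD update, then CPML correction."""
    n = len(hy)
    for i in range(n):                             # pass 1: regular field update
        hy[i] += dt / mu * (ez[i + 1] - ez[i]) / (kappa[i] * dx)
    for i in range(n):                             # pass 2: reads ez a second time
        psi[i] = b[i] * psi[i] + a[i] * (ez[i + 1] - ez[i]) / dx
        hy[i] += dt / mu * psi[i]


def precompute(kappa, a, dt, mu, dx):
    """Fold every division into fixed per-cell coefficients (done once)."""
    c_field = [dt / (mu * k * dx) for k in kappa]
    c_psi = [ai * dt / (mu * dx) for ai in a]
    return c_field, c_psi


def update_fused(hy, psi_scaled, ez, c_field, b, c_psi):
    """Single pass, no division. psi_scaled stores (dt / mu) * psi."""
    for i in range(len(hy)):
        d = ez[i + 1] - ez[i]                      # E difference is read once
        p = b[i] * psi_scaled[i] + c_psi[i] * d
        psi_scaled[i] = p
        hy[i] += c_field[i] * d + p


def demo(n=16, steps=50):
    """Drive both variants with the same prescribed E field; return the max deviation."""
    dt, mu, dx = 0.1, 2.0, 0.25
    kappa, b, a = make_cpml_profiles(n, dt)
    c_field, c_psi = precompute(kappa, a, dt, mu, dx)
    hy_ref, psi_ref = [0.0] * n, [0.0] * n
    hy_fast, psi_fast = [0.0] * n, [0.0] * n
    for step in range(steps):
        ez = [math.sin(0.3 * i + 0.2 * step) for i in range(n + 1)]
        update_reference(hy_ref, psi_ref, ez, kappa, b, a, dt, mu, dx)
        update_fused(hy_fast, psi_fast, ez, c_field, b, c_psi)
    return max(abs(x - y) for x, y in zip(hy_ref, hy_fast))
```

Calling `demo()` returns the largest difference between the two H-field arrays, which should stay at floating-point rounding level because the fused form is an algebraic rearrangement of the reference form. The sketch says nothing about the speedups reported in the abstract; those depend on the GPU implementation described in the paper.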

Authors: Bai Bing, Niu Zhongqi

Affiliation: School of Electronic Engineering, Xidian University

Source: Journal of Xidian University (西安电子科技大学学报), 2015, No. 1, pp. 194-199, 212 (7 pages in total). Indexed in EI, CAS, CSCD, and the Peking University Core Journals list.

Funding: National Natural Science Foundation of China (30870577, 61301288); Fundamental Research Funds for the Central Universities (JB140218, K5051302057)

Keywords: finite difference time domain method; convolution perfectly matched layer; graphics processing unit; parallel computing; compute unified device architecture

Classification: TP391.9 [Automation and Computer Technology: Computer Application Technology]
