
Mixed precision CGS algorithm based on GPU (cited by: 6)
Abstract Current GPU-based numerical algorithms suffer from low double-precision performance. This paper proposes a mixed-precision conjugate gradient squared (CGS) algorithm for the Fermi-CUDA unified GPU computing architecture to solve sparse linear systems. The algorithm combines single-precision inner iterations with double-precision outer iterations, exploiting both the high throughput of single precision and the accuracy of double precision on the GPU. All computation runs on the GPU, which reduces data transfer between the CPU and the GPU. GPU implementations of the CGS method, Jacobi iteration, and Gauss-Seidel iteration are provided, and their influence on convergence when used as the inner iteration operator is analyzed. Experiments show that the algorithm achieves the same accuracy as full double-precision computation, nearly doubles floating-point performance compared with full double precision on the GPU, and reaches a maximum speedup of more than 70 over a serial full double-precision CPU algorithm.
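The inner/outer structure described in the abstract can be illustrated with a minimal CPU-only sketch. It is not the authors' GPU code: it uses a small dense matrix rather than a sparse one, and it picks Jacobi as the inner operator (one of the three the abstract mentions) instead of CGS, purely for brevity. Single precision is emulated by rounding through `struct`, and every name here (`f32`, `jacobi_single`, `residual`, `mixed_precision_solve`, and the default values for `tol`, `max_outer`, and `sweeps`) is a hypothetical choice made for this illustration.

```python
import struct


def f32(v):
    """Round a Python float (a double) to the nearest single-precision value."""
    return struct.unpack("f", struct.pack("f", v))[0]


def jacobi_single(A, r, sweeps):
    """Inner iteration: approximately solve A d = r with Jacobi sweeps,
    rounding the inputs and every intermediate result to single precision."""
    n = len(r)
    A32 = [[f32(a) for a in row] for row in A]
    r32 = [f32(v) for v in r]
    d = [0.0] * n
    for _ in range(sweeps):
        new_d = []
        for i in range(n):
            s = r32[i]
            for j in range(n):
                if j != i:
                    s = f32(s - f32(A32[i][j] * d[j]))
            new_d.append(f32(s / A32[i][i]))
        d = new_d
    return d


def residual(A, x, b):
    """Compute r = b - A x in double precision."""
    n = len(x)
    return [b[i] - sum(A[i][j] * x[j] for j in range(n)) for i in range(n)]


def mixed_precision_solve(A, b, tol=1e-12, max_outer=50, sweeps=10):
    """Outer iteration in double precision: compute the residual, obtain a
    single-precision correction from the inner solver, and update x."""
    x = [0.0] * len(b)
    for outer in range(max_outer):
        r = residual(A, x, b)
        if max(abs(v) for v in r) <= tol:
            return x, outer
        d = jacobi_single(A, r, sweeps)
        x = [xi + di for xi, di in zip(x, d)]
    return x, max_outer


if __name__ == "__main__":
    # Small diagonally dominant system, so Jacobi converges.
    A = [[4.0, -1.0, 0.0], [-1.0, 4.0, -1.0], [0.0, -1.0, 4.0]]
    b = [2.0, 4.0, 10.0]
    x, outer_iterations = mixed_precision_solve(A, b)
    print(x, outer_iterations)
```

Only the residual and the solution update are kept in double precision. The inner solve needs to deliver just a few correct digits of the correction per pass, which is why it can run in the faster format without limiting the final accuracy.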
Source Chinese Journal of Scientific Instrument (《仪器仪表学报》), indexed in EI, CAS, CSCD, and the PKU Core list, 2012, No. 1, pp. 97-104 (8 pages)
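Funding Supported by the National Natural Science Foundation of China (60973089, 60873148, 60773097, 61003101); the Jilin Province Science and Technology Development Program (201101039, 20101501, 20100185, 20090108, 20080107); an EU cooperation project (155776-EM-1-2009-1-IT-ERAMUNDUS-ECW-L12); the Doctoral Program Special Fund of the Ministry of Education of China (20100061110031); the Open Project of the Key Laboratory of Symbolic Computation and Knowledge Engineering of the Ministry of Education, Jilin University (93K-17-2011-K01, 93K-17-2009-K05); and the Jilin University Frontier Science and Interdisciplinary Innovation Project (201103134).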
Keywords linear equations; CGS algorithm; inner and outer iteration; mixed precision; graphics processing unit
