摘要
广义极小残量法(GMRES)是最常用的求解非对称大规模稀疏线性方程组的方法之一,其收敛速度快且稳定性良好。Intel Xeon Phi众核协处理器(MIC)具有计算能力强、易编程、易移植等特点。采用MPI+OpenMP+offload混合编程模型将GMRES算法移植到MIC集群平台上。采用进程间集合通信异步隐藏、数据传输优化、向量化以及线程亲和性优化等多种手段,大幅提升了GMRES算法的求解效率。最后将并行算法应用到"局部径向基函数求解高维偏微分方程"问题的求解中。测试表明,CPU节点集群上开启32个进程,并行效率高达71.74%,4块MIC卡的最高加速性能可达单颗CPU的7倍。
Generalized minimal residual method(GMRES)is the most commonly used method for solving asymmetric large-scale linear algebraic equations,and it has fast convergence and stable property.Intel many integrated co-processors(MIC)has strong computing power and it can program easily.In this paper,MPI+OpenMP+offload hybrid programming paradigm was used to port GMRES algorithm to the MIC heterogeneous cluster platform.The perfor-mance of GMRES parallel algorithm was greatly improved by using kinds of optimization methods,such as hiding collective communications using asynchronous execution model,vectorization optimization,data transfer optimization,extensibility of MIC thread optimization,etc.Finally,GMRES parallel algorithm was used to improve the perfomance of solving high dimensional PDEs by the localized radical basis functions(RBFs)collocation methods.Results from tests indicate that the parallel efficiency can be up to 71.74% when using 32 processes in cluster,and the maximum speedup ratio of 4MICs to 1CPU can be up to 7.
出处
《计算机科学》
CSCD
北大核心
2017年第4期197-201,240,共6页
Computer Science