
A Method for Constructing a Performance Analysis Model for High-Performance Applications Based on a Random Forest Classifier
Abstract: Traditional performance analysis methods for high-performance applications suffer from shortcomings such as additional overhead during analysis and inaccurate results, which cost users more time and demand more domain knowledge. To address these issues, this paper recasts program performance analysis as a multi-class classification problem on an imbalanced, small-sample dataset with high-dimensional features. The authors collect 500 performance records covering seven types of runtime metrics, including the number of process switches, memory utilization, and disk I/O load. After data preprocessing such as PCA dimensionality reduction, a random forest classifier is trained to produce a model for analyzing program performance problems. Experiments show that the model can identify five classes of performance problems, including excessive memory utilization and heavy disk I/O load. To evaluate how effectively the model guides optimization, performance data are collected from runs of the HotSpot3D and LU-Decomposition programs, and the two validation programs are then optimized at the runtime level and the compilation level, respectively, according to the model's output. The results indicate that the proposed method can effectively guide the optimization of program performance, with speedups of 1.056 and 5.657 for the two programs, respectively.
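The pipeline summarized in the abstract (preprocessing with PCA, then a random forest over a small, imbalanced, five-class dataset) can be sketched roughly as follows. This is a minimal illustration using scikit-learn, not the authors' implementation: the data are synthetic stand-ins generated with `make_classification`, and the class weights, number of PCA components, number of trees, and train/test split are all assumptions chosen for the example. The abstract does not give these settings, and the keywords suggest further steps (variational autoencoder, clustering) that are not modeled here.

```python
from sklearn.datasets import make_classification
from sklearn.decomposition import PCA
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the 500 records with 7 metrics and 5 problem classes.
# The imbalanced class proportions below are hypothetical.
X, y = make_classification(
    n_samples=500,
    n_features=7,
    n_informative=5,
    n_redundant=2,
    n_classes=5,
    n_clusters_per_class=1,
    weights=[0.40, 0.25, 0.15, 0.12, 0.08],
    random_state=0,
)

# Stratify so that every class appears in both the training and test splits.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0
)

# Standardize, reduce dimensionality with PCA, then fit a random forest.
# class_weight="balanced" is one common way to compensate for class imbalance;
# whether the paper uses it is not stated.
model = make_pipeline(
    StandardScaler(),
    PCA(n_components=5),  # assumed component count
    RandomForestClassifier(
        n_estimators=200, class_weight="balanced", random_state=0
    ),
)
model.fit(X_train, y_train)

pred = model.predict(X_test)
acc = accuracy_score(y_test, pred)
print(f"held-out accuracy: {acc:.3f}")
```

With real measurements, `X` would hold the collected metric values (for example, as exported from Nmon) and `y` the labeled performance-problem class for each record.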
Authors: CHAI Xu-qing (柴旭清); QIAO Yi-hang (乔一航); FAN Li-lin (范黎林) (College of Computer and Information Engineering, Henan Normal University, Xinxiang 453007; High Performance Computing Center, Henan Normal University, Xinxiang 453007; Henan Engineering Laboratory of Intelligent Commerce and Internet of Things Technology, Xinxiang 453007, China)
Source: Computer Engineering & Science (《计算机工程与科学》), CSCD, Peking University Core Journal, 2024, No. 7, pp. 1218-1228 (11 pages)
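Funding: National Natural Science Foundation of China (12274117); Henan Province Outstanding Youth Science Foundation (202300410226); Henan Province University Science and Technology Innovation Program (20HASTIT026).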
Keywords: Nmon; performance analysis; variational autoencoder; clustering; random forest
