Abstract
To address the poor execution efficiency of the MapReduce framework, this paper studies several approaches to MapReduce performance optimization. It first describes the definition and characteristics of cloud computing and the composition of Hadoop, a dedicated batch-processing PaaS platform, and then briefly introduces the MapReduce framework and application development under it. It then focuses on the three mainstream directions of MapReduce performance optimization: system implementation optimization, parameter tuning, and application optimization. Starting from the application side, it proposes several solutions and carries out experiments on an in-Map Reduce optimization algorithm, a comparison of scripting and compiled languages, and preprocessing optimization for small files. Finally, the optimization techniques and experimental data are analyzed. The experimental results show that optimizing the application is an effective means of improving MapReduce performance.
Source
Computer Technology and Development (《计算机技术与发展》)
2015, No. 7, pp. 96-99, 106 (5 pages in total)
Funding
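Jiangsu Province Excellent Engineer (Software) Program Pilot Major (Su Jiao Gao Han [2012] No. 17)
Jiangsu Province Higher Education Embedded Talent Training Project for Software Service Outsourcing Majors (Su Jiao Gao Han [2014] No. 14)
Jiangsu Electric Power Company Science and Technology Project (J2014057)
Sanjiang University Undergraduate Engineering Phase II Project (J14021)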