Research on Parallel KMeans Clustering Model Based on Spark

Abstract: This paper designs and implements a parallel KMeans clustering model on the Spark distributed computing framework and uses it in comparative training experiments on MovieLens datasets of different sizes. The results indicate that the parallel KMeans clustering model is well suited to running in a distributed cluster environment and that its parallel computing efficiency is good. In addition, the data are loaded in partitions by means of the repartition operator, which optimizes the parallel scheme and effectively reduces the training time of the model.
Authors: HOU Jingru; WU Sheng; LI Yingna (School of Information Engineering and Automation, Kunming University of Science and Technology, Kunming 650500)
Source: Computer & Digital Engineering (《计算机与数字工程》), 2018, No. 3, pp. 537-540, 555 (5 pages)
Keywords: Spark; KMeans; MovieLens; parallel clustering; repartition
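
The abstract describes a KMeans model whose training data are split into partitions and processed in parallel. The following is a minimal sketch of that general idea in plain Python, not the authors' Spark implementation: each partition computes per-cluster partial sums and counts, and the partial results are then merged to update the centers. The function names (`repartition`, `partial_stats`, `kmeans`) and the round-robin split are assumptions made for illustration; in particular, this `repartition` only stands in for the even redistribution that Spark's repartition operator performs across a cluster.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, ...]


def repartition(points: Sequence[Point], num_partitions: int) -> List[List[Point]]:
    """Hypothetical stand-in for a repartition step: round-robin split into even chunks."""
    return [list(points[i::num_partitions]) for i in range(num_partitions)]


def closest(point: Point, centers: Sequence[Point]) -> int:
    """Index of the center with the smallest squared Euclidean distance to the point."""
    return min(
        range(len(centers)),
        key=lambda k: sum((a - b) ** 2 for a, b in zip(point, centers[k])),
    )


def partial_stats(partition: Sequence[Point], centers: Sequence[Point]):
    """'Map' side: per-cluster coordinate sums and point counts for one partition."""
    dim = len(centers[0])
    sums = [[0.0] * dim for _ in centers]
    counts = [0] * len(centers)
    for p in partition:
        k = closest(p, centers)
        counts[k] += 1
        for d, value in enumerate(p):
            sums[k][d] += value
    return sums, counts


def kmeans(points: Sequence[Point], initial_centers: Sequence[Point],
           num_partitions: int = 4, max_iterations: int = 20) -> List[Point]:
    """Lloyd's algorithm with the assignment step evaluated partition by partition."""
    centers = [tuple(c) for c in initial_centers]
    partitions = repartition(points, num_partitions)
    for _ in range(max_iterations):
        # In a cluster setting each partition would be handled by a separate worker;
        # here the partitions are simply processed one after another.
        stats = [partial_stats(part, centers) for part in partitions]
        new_centers = []
        for j, old in enumerate(centers):
            n = sum(counts[j] for _, counts in stats)
            if n == 0:
                new_centers.append(old)  # keep an empty cluster's center unchanged
                continue
            # 'Reduce' side: merge the partial sums from all partitions.
            total = [sum(sums[j][d] for sums, _ in stats) for d in range(len(old))]
            new_centers.append(tuple(t / n for t in total))
        if new_centers == centers:
            break  # assignments are stable, so the centers no longer move
        centers = new_centers
    return centers
```

Because only sums and counts leave each partition, the merged result does not depend on how the points are split; the partition count affects only how the work is divided, which is the lever the abstract attributes to repartition.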