Highly efficient parallel algorithm of K-Medoids based on Hadoop platform

Cited by: 4
Abstract: The traditional K-Medoids algorithm is sensitive to the initial cluster centers and converges slowly, and in big-data environments it runs into bottlenecks in memory capacity and CPU processing speed. To address these problems, this paper improves the initial center selection scheme and the center replacement strategy, and uses the Hadoop distributed computing platform together with a Top K based parallel random sampling strategy to implement an efficient and stable parallel K-Medoids algorithm. The algorithm is further optimized by tuning the Hadoop platform. Experiments show that the improved K-Medoids algorithm achieves a good speedup ratio and that both its convergence and its clustering accuracy are improved.
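The abstract names a Top K based parallel random sampling strategy but does not describe it. One common way to realize such a strategy in a MapReduce setting is sketched below, purely as an illustration and not as the authors' implementation: each mapper tags its records with a random key and keeps only its k largest keys, and a single reducer merges the local candidates and keeps the global top k. The function names (`map_top_k`, `reduce_top_k`, `parallel_sample`) and the in-memory lists standing in for HDFS splits are assumptions made for this sketch.

```python
import heapq
import random


def map_top_k(partition, k, rng):
    """Mapper side (hypothetical): tag each record with a random key
    and keep only the k records with the largest keys."""
    tagged = ((rng.random(), record) for record in partition)
    return heapq.nlargest(k, tagged, key=lambda pair: pair[0])


def reduce_top_k(local_samples, k):
    """Reducer side (hypothetical): merge the mappers' candidates and
    keep the k records with the largest keys overall."""
    merged = (pair for sample in local_samples for pair in sample)
    top = heapq.nlargest(k, merged, key=lambda pair: pair[0])
    return [record for _, record in top]


def parallel_sample(partitions, k, seed=0):
    """Simulate the map and reduce phases sequentially.

    Each inner list of `partitions` stands in for one input split.
    Because every record receives an independent uniform key, the k
    records with the largest keys form a uniform random sample
    without replacement.
    """
    rng = random.Random(seed)
    local = [map_top_k(split, k, rng) for split in partitions]
    return reduce_top_k(local, k)


if __name__ == "__main__":
    # Three splits of unequal size, 300 records in total.
    splits = [list(range(0, 100)), list(range(100, 250)), list(range(250, 300))]
    print(parallel_sample(splits, k=5))
```

Under this scheme each mapper emits at most k records, so the data shuffled to the reducer is bounded by k times the number of splits regardless of input size. The sampled records could then serve as candidates for initial medoids, although how the paper uses the sample is not stated in the abstract.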
Source: Computer Engineering and Applications (《计算机工程与应用》), indexed in CSCD and the Peking University Core Journals list, 2015, No. 16, pp. 47-54 (8 pages)
Keywords: K-Medoids; distributed computation; Hadoop; parallel sampling