Abstract
Gaussian Mixture Model (GMM) clustering is sensitive to its initial values and easily falls into local minima. To address these problems, the strong global search ability of the Density Peaks (DP) algorithm was used to optimize the initial cluster centers of the GMM algorithm, and a GMM clustering algorithm based on DP (DP-GMMC) was proposed. First, the DP algorithm searches for the cluster centers to obtain the initial parameters of the mixture model. Next, the Expectation Maximization (EM) algorithm iteratively estimates the parameters of the mixture model. Finally, the data points are clustered according to the Bayesian posterior probability criterion. On the Iris dataset, DP-GMMC reaches a clustering accuracy of 96.67%, which is 33.6 percentage points higher than that of the traditional GMM algorithm, and it addresses the dependence on the initial cluster centers. The experimental results show that the proposed DP-GMMC clusters low-dimensional datasets well.
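The three stages named in the abstract (DP-based initialization, EM estimation, posterior-based assignment) can be sketched as follows. This is a minimal NumPy illustration, not the authors' implementation: the function names, the 2% cutoff-distance rule, the Gaussian density kernel, the nearest-center rule for the initial weights and covariances, and the covariance regularization term are all assumptions made for this sketch, and the demo uses synthetic data rather than Iris.

```python
import numpy as np

def density_peak_centers(X, k, dc_percentile=2.0):
    """Return indices of k points with the largest rho * delta (DP heuristic)."""
    d = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)  # pairwise distances
    # Cutoff distance: a low percentile of all pairwise distances (assumed rule).
    dc = max(np.percentile(d[np.triu_indices(len(X), 1)], dc_percentile), 1e-12)
    rho = np.exp(-(d / dc) ** 2).sum(axis=1) - 1.0  # local density, self term removed
    delta = np.empty(len(X))
    for i in range(len(X)):
        higher = rho > rho[i]
        # Distance to the nearest denser point; the densest point gets the max distance.
        delta[i] = d[i, higher].min() if higher.any() else d[i].max()
    return np.argsort(rho * delta)[-k:]

def init_from_centers(X, center_idx, reg=1e-6):
    """Build initial weights, means, covariances from the DP centers (assumed rule)."""
    mu = X[center_idx].copy()
    k, dim = mu.shape
    nearest = np.linalg.norm(X[:, None, :] - mu[None, :, :], axis=2).argmin(axis=1)
    pi = np.bincount(nearest, minlength=k) / len(X)
    cov = np.empty((k, dim, dim))
    for j in range(k):
        diff = X[nearest == j] - mu[j]
        cov[j] = diff.T @ diff / len(diff) + reg * np.eye(dim)
    return pi, mu, cov

def log_gauss(X, mu, cov):
    """Log density of N(mu, cov) at each row of X, via a Cholesky factor."""
    L = np.linalg.cholesky(cov)
    z = np.linalg.solve(L, (X - mu).T)  # whitened residuals, shape (dim, n)
    log_det = 2.0 * np.log(np.diag(L)).sum()
    return -0.5 * (X.shape[1] * np.log(2 * np.pi) + log_det + (z ** 2).sum(axis=0))

def e_step(X, pi, mu, cov):
    """Posterior probabilities (responsibilities) and total log-likelihood."""
    log_p = np.log(pi) + np.stack(
        [log_gauss(X, mu[j], cov[j]) for j in range(len(pi))], axis=1)
    m = log_p.max(axis=1, keepdims=True)
    log_norm = m + np.log(np.exp(log_p - m).sum(axis=1, keepdims=True))
    return np.exp(log_p - log_norm), log_norm.sum()

def em_gmm(X, pi, mu, cov, n_iter=200, tol=1e-6, reg=1e-6):
    """Standard EM updates for a full-covariance GMM."""
    n, dim = X.shape
    prev = -np.inf
    for _ in range(n_iter):
        resp, ll = e_step(X, pi, mu, cov)
        nk = resp.sum(axis=0)
        pi = nk / n
        mu = (resp.T @ X) / nk[:, None]
        cov = np.stack([
            (resp[:, j, None] * (X - mu[j])).T @ (X - mu[j]) / nk[j] + reg * np.eye(dim)
            for j in range(len(nk))])
        if ll - prev < tol:  # stop once the log-likelihood stops improving
            break
        prev = ll
    return pi, mu, cov

def dp_gmm_cluster(X, k):
    """DP initialization, EM refinement, then maximum-posterior assignment."""
    X = np.asarray(X, dtype=float)
    pi, mu, cov = init_from_centers(X, density_peak_centers(X, k))
    pi, mu, cov = em_gmm(X, pi, mu, cov)
    resp, _ = e_step(X, pi, mu, cov)
    return resp.argmax(axis=1)

if __name__ == "__main__":
    rng = np.random.default_rng(42)
    # Synthetic 2-D demo data: three well-separated Gaussian blobs.
    demo = np.vstack([rng.normal(c, 0.4, size=(50, 2)) for c in ([0, 0], [5, 0], [0, 5])])
    print(np.bincount(dp_gmm_cluster(demo, 3)))
```

Because the initial means come from the density peaks rather than from random draws, repeated runs on the same data start EM from the same point, which is the property the abstract relies on to reduce sensitivity to initialization.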
Authors
TAO Zhiyong
LIU Xiaofang
WANG Hezhang
(School of Electronic and Information Engineering, Liaoning Technical University, Huludao, Liaoning 125105, China; Fuxin Lixing Technology Company Limited, Fuxin, Liaoning 123000, China)
Source
Journal of Computer Applications (《计算机应用》)
CSCD
PKU Core Journal (北大核心)
2018, No. 12, pp. 3433-3437, 3443 (6 pages)
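Funding
Liaoning Province Doctoral Start-up Fund (20170520098)
Natural Science Foundation of Liaoning Province (2015020100)
Liaoning Province Undergraduate Teaching Reform Research Project for General Higher Education (551610001095)
General Project of the Education Department of Liaoning Province (LJ2017QL013)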
Keywords
clustering
Gaussian Mixture Model (GMM)
Expectation Maximization (EM) algorithm
Density Peaks (DP)