Abstract
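The multi-exemplar affinity propagation clustering algorithm (MEAP) often fails to produce satisfactory clustering results on arbitrarily shaped data with a manifold distribution structure. To address this, a new similarity measure is designed based on the idea of manifold learning. The measure enlarges the similarity between data points lying on the same manifold and shrinks the similarity between data points lying on different manifolds, so that the similarity matrix accurately reflects the intrinsic manifold structure of the data set. Combining this similarity measure with MEAP yields the manifold structure based multi-exemplar affinity propagation clustering algorithm, MS-MEAP (Manifold Structure based Multi-Exemplar Affinity Propagation), which extends the algorithm's ability to handle arbitrarily shaped data sets with a manifold distribution structure and also improves its running efficiency. Experiments were carried out on artificial data sets and the USPS handwritten digits data set. The simulation results and the analysis of the algorithm's effectiveness show that MS-MEAP achieves better clustering performance than the original algorithm on arbitrarily shaped data with a manifold distribution structure.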
Multi-exemplar affinity propagation (MEAP) often fails to produce good clustering results on arbitrarily shaped data sets with a manifold structure. To overcome this shortcoming, this paper designs a new similarity measure based on the idea of manifold learning. The measure amplifies the similarity between data points on the same manifold and reduces the similarity between data points on different manifolds, so that the similarity matrix accurately reflects the intrinsic manifold structure of the data set. Based on this similarity matrix, the paper proposes manifold structure based multi-exemplar affinity propagation (MS-MEAP), which handles such data sets effectively and also improves the efficiency of the algorithm. Experiments on artificial data sets and the USPS handwritten digits data set show that the new method outperforms the original MEAP algorithm.
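The abstract does not spell out the similarity measure itself. As a rough illustration of the general idea only, the sketch below assumes a k-nearest-neighbour graph with shortest-path (geodesic) distances, a common device in manifold learning, and defines similarity as the negative geodesic distance. This is not the MS-MEAP measure; the function names, the choice of `k`, and the C-shaped toy data are all hypothetical.

```python
import heapq
import math


def euclidean(p, q):
    """Straight-line distance between two points given as tuples."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def knn_graph(points, k):
    """Symmetric k-nearest-neighbour graph as a list of {neighbour: weight} dicts."""
    n = len(points)
    graph = [dict() for _ in range(n)]
    for i in range(n):
        nearest = sorted(
            (euclidean(points[i], points[j]), j) for j in range(n) if j != i
        )
        for d, j in nearest[:k]:
            graph[i][j] = d
            graph[j][i] = d  # keep the graph undirected
    return graph


def geodesic_distances(graph, source):
    """Dijkstra shortest-path lengths from `source` along graph edges."""
    dist = [math.inf] * len(graph)
    dist[source] = 0.0
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist[u]:
            continue  # stale heap entry
        for v, w in graph[u].items():
            nd = d + w
            if nd < dist[v]:
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist


def manifold_similarity(points, k=2):
    """Assumed stand-in measure: similarity = negative geodesic distance.

    Points in different graph components get a similarity of -inf.
    """
    graph = knn_graph(points, k)
    return [[-d for d in geodesic_distances(graph, i)] for i in range(len(points))]


# Hypothetical toy data: eight points tracing a "C" shape.
c_shape = [(0, 0), (1, 0), (2, 0), (2, 1), (2, 2), (2, 3), (1, 3), (0, 3)]
S = manifold_similarity(c_shape, k=2)

# The two tips of the "C" are 3.0 apart in a straight line,
# but 7.0 apart when the path has to follow the curve.
print(euclidean(c_shape[0], c_shape[7]), S[0][7])
```

Under these assumptions, the two tips of the curve are close in Euclidean terms yet receive a low similarity, because the graph path between them has to follow the curve. A matrix built this way could be passed to an affinity-propagation style algorithm in place of the usual negative Euclidean similarity.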
Source
Computer Engineering and Applications (《计算机工程与应用》)
CSCD
Peking University Core Journals (北大核心)
2016, No. 6, pp. 67-73 (7 pages)
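Funding
National Natural Science Foundation of China (No.61305017, No.60975027)
Natural Science Foundation of Jiangsu Province (No.BK20130154)
Project Funded by the Priority Academic Program Development of Jiangsu Higher Education Institutions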