Abstract
Without prior information, spectral clustering algorithms have difficulty building appropriate similarity graphs for datasets with complex shapes and varying densities. In addition, similarity measures that apply a Gaussian kernel to Euclidean distance ignore global consistency. To address these problems, a density adaptive neighborhood spectral clustering algorithm based on shared nearest neighbors (SC-DANSN) is proposed. An undirected graph is constructed with a parameter-free density adaptive neighborhood construction method, and shared nearest neighbors are used as the similarity measure between samples. This removes the influence of parameters on similarity graph construction and reflects both global and local consistency. Experimental results show that SC-DANSN achieves higher clustering accuracy than the K-means algorithm and spectral clustering based on K nearest neighbors (SC-KNN), and is less sensitive to parameter selection than SC-KNN.
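The shared-nearest-neighbor idea in the abstract can be illustrated with a minimal sketch. It is not the SC-DANSN method itself: the parameter-free density adaptive neighborhood construction is not described in this record, so the sketch assumes a plain fixed-size neighborhood with a hypothetical parameter `k`, and the function names `knn_sets`, `snn_similarity`, and `spectral_embedding` are illustrative only.

```python
import numpy as np

def knn_sets(X, k):
    """For each sample, the set of indices of its k nearest neighbors (Euclidean).

    Assumption: a fixed k stands in for the paper's parameter-free neighborhood.
    """
    d = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
    np.fill_diagonal(d, np.inf)  # a point is not its own neighbor
    order = np.argsort(d, axis=1)[:, :k]
    return [set(row.tolist()) for row in order]

def snn_similarity(X, k):
    """Symmetric matrix whose (i, j) entry counts the neighbors shared by i and j."""
    nbrs = knn_sets(X, k)
    n = len(nbrs)
    W = np.zeros((n, n))
    for i in range(n):
        for j in range(i + 1, n):
            W[i, j] = W[j, i] = len(nbrs[i] & nbrs[j])
    return W

def spectral_embedding(W, n_clusters):
    """Eigenvectors of the normalized Laplacian for its smallest eigenvalues."""
    deg = W.sum(axis=1)
    # Guard against isolated points (zero degree) when scaling by D^(-1/2).
    inv_sqrt = np.where(deg > 0, 1.0 / np.sqrt(np.maximum(deg, 1e-12)), 0.0)
    L = np.eye(len(W)) - inv_sqrt[:, None] * W * inv_sqrt[None, :]
    _, vecs = np.linalg.eigh(L)  # eigenvalues are returned in ascending order
    return vecs[:, :n_clusters]
```

Running K-means on the rows of the returned embedding would complete a standard spectral clustering pipeline. Because the similarity counts shared neighbors rather than raw distance, two points in groups of different density can still be strongly linked as long as their neighborhoods overlap.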
Authors
GE Junwei (葛君伟)
YANG Guangxin (杨广欣)
College of Computer Science and Technology, Chongqing University of Posts and Telecommunications, Chongqing 400065, China
Source
Computer Engineering (《计算机工程》)
CAS
CSCD
Peking University Core Journal (北大核心)
2021, Issue 8, pp. 116-123 (8 pages)
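Funding
Chongqing Major Theme Special Project for Innovation in Common Key Technologies of Key Industries (cstc2017zdcy-zdzx0046)
Chongqing Basic and Frontier Research Program (cstc2017jcyjA0755)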
Keywords
spectral clustering (SC)
similarity matrix
density adaptive neighborhood
shared nearest neighbor
K nearest neighbor (KNN)