Abstract
Building on an analysis of the equivalence between the objective function of classical spectral clustering and that of weighted kernel k-means, an improved spectral clustering algorithm for large-scale data based on a sampling subspace constraint is designed. The algorithm uses iterative weighted kernel k-means optimization to avoid the heavy resource consumption of eigendecomposition of the Laplacian matrix. By sampling the data and constraining the cluster centers to the subspace spanned by the sampled points, it also avoids using the full kernel matrix, thereby reducing the time and space complexity of the classical algorithm. Theoretical analysis and experimental results show that the improved algorithm greatly improves clustering efficiency while maintaining clustering accuracy close to that of the classical algorithm, which verifies its effectiveness.
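The idea summarized in the abstract can be illustrated with a minimal sketch. It is not the paper's algorithm: it is a generic weighted kernel k-means loop in which each cluster center is restricted to the span of m sampled points, so only an n x m kernel block is ever formed and no eigendecomposition is performed. The function names, the Gaussian kernel, the uniform default weights, the random initialization, and the small ridge term `reg` are all assumptions made for illustration. In the known equivalence with normalized cut, the weights would be the node degrees and the kernel would be built from the affinity matrix.

```python
import numpy as np


def rbf_kernel(X, Y, gamma=1.0):
    """Gaussian kernel between the rows of X and the rows of Y (assumed kernel)."""
    sq_dists = (
        np.sum(X ** 2, axis=1)[:, None]
        + np.sum(Y ** 2, axis=1)[None, :]
        - 2.0 * X @ Y.T
    )
    return np.exp(-gamma * np.maximum(sq_dists, 0.0))


def sampled_weighted_kernel_kmeans(X, k, m, weights=None, gamma=1.0,
                                   n_iter=50, reg=1e-8, seed=0):
    """Hypothetical helper: weighted kernel k-means with centers in the
    span of m sampled points. Weights are assumed strictly positive."""
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    w = np.ones(n) if weights is None else np.asarray(weights, dtype=float)

    # Only an n x m block of the kernel matrix is computed, never n x n.
    sample_idx = rng.choice(n, size=m, replace=False)
    K_nm = rbf_kernel(X, X[sample_idx], gamma)
    K_mm = K_nm[sample_idx]
    # Small ridge for numerical stability (an assumption of this sketch).
    K_reg = K_mm + reg * np.eye(m)

    labels = rng.integers(0, k, size=n)
    for _ in range(n_iter):
        # alpha[:, c] holds the coefficients of center c over the sampled points.
        alpha = np.zeros((m, k))
        for c in range(k):
            members = labels == c
            if not members.any():
                # Empty cluster: place its center on a random sampled point.
                alpha[rng.integers(m), c] = 1.0
                continue
            wc = w[members]
            rhs = K_nm[members].T @ wc / wc.sum()
            alpha[:, c] = np.linalg.solve(K_reg, rhs)

        # Squared feature-space distance to each center, dropping the
        # per-point constant K(x_i, x_i), which does not affect the argmin.
        center_norms = np.sum(alpha * (K_mm @ alpha), axis=0)
        dist = center_norms[None, :] - 2.0 * (K_nm @ alpha)
        new_labels = np.argmin(dist, axis=1)
        if np.array_equal(new_labels, labels):
            break
        labels = new_labels
    return labels
```

With n points and m sampled points, the sketch stores O(nm) kernel entries instead of O(n^2), which is the kind of saving the abstract attributes to data sampling and the subspace constraint.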
Author
NIE Ru (聂茹)
Electronic Information Engineering Institute, Guangzhou College of South China University of Technology, Guangzhou 510800, China
Source
Telecommunications Science (《电信科学》)
2018, No. 11, pp. 41-47 (7 pages)
Funding
Young Innovative Talents Fund Project of the Department of Education of Guangdong Province (No. 2016KQNCX227)
Keywords
large-scale data spectral clustering
weighted kernel k-means algorithm
data sampling
matrix eigendecomposition
kernel matrix