A Stacking ensemble clustering algorithm based on differential privacy protection
Cited by: 7
Abstract: To address the insufficient accuracy and security of single clustering algorithms under differential privacy protection, a Stacking ensemble clustering algorithm based on differential privacy protection is proposed. Stacking is used to integrate several heterogeneous clustering algorithms: K-means clustering, BIRCH hierarchical clustering, spectral clustering, and Gaussian mixture clustering serve as the primary clustering algorithms. The clustering results produced by the primary algorithms are weighted by their silhouette coefficients and appended to the original data, and K-means is then used as the secondary clustering algorithm to cluster the extended data set. Adaptive ε functions are proposed separately for the original data and for the clustering results of the primary algorithms to determine the privacy budget, so that data of different sensitivity receive different amounts of Laplace noise. Theoretical analysis and experimental results both show that, compared with a single clustering algorithm, the proposed algorithm satisfies ε-differential privacy while effectively improving clustering accuracy, and strikes a strong balance between privacy protection and data utility.
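The data-extension step described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the abstract does not give the adaptive ε functions, the sensitivities, or the exact weighting rule, so `eps_data` and `eps_labels` are plain parameters here, features are assumed to be scaled to [0, 1] (sensitivity 1), cluster labels are assumed to lie in 0..k-1 (sensitivity k-1), and the weights are assumed to be silhouette scores clipped at zero and normalized to sum to 1. All function names are hypothetical.

```python
import random


def laplace_noise(scale, rng):
    # The difference of two i.i.d. exponentials with mean `scale`
    # follows a Laplace(0, scale) distribution.
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def add_laplace(values, sensitivity, epsilon, rng):
    # Laplace mechanism: noise scale is sensitivity / epsilon.
    if epsilon <= 0 or sensitivity <= 0:
        raise ValueError("epsilon and sensitivity must be positive")
    scale = sensitivity / epsilon
    return [v + laplace_noise(scale, rng) for v in values]


def silhouette_weights(scores):
    # Assumed weighting rule: clip negative scores to zero and normalize.
    clipped = [max(s, 0.0) for s in scores]
    total = sum(clipped)
    if total == 0:
        return [1.0 / len(scores)] * len(scores)  # fall back to uniform
    return [c / total for c in clipped]


def build_extended_dataset(rows, primary_labels, scores,
                           eps_data, eps_labels, n_clusters, seed=0):
    # rows: list of feature vectors, assumed scaled to [0, 1].
    # primary_labels: one label list per primary clusterer.
    # scores: one silhouette coefficient per primary clusterer.
    rng = random.Random(seed)
    weights = silhouette_weights(scores)
    extended = []
    for i, row in enumerate(rows):
        noisy_row = add_laplace(row, 1.0, eps_data, rng)
        labels_i = [labels[i] for labels in primary_labels]
        noisy_labels = add_laplace(labels_i, n_clusters - 1, eps_labels, rng)
        weighted = [w * l for w, l in zip(weights, noisy_labels)]
        extended.append(noisy_row + weighted)
    return extended
```

In practice the primary labels and silhouette scores could come from a library such as scikit-learn (`KMeans`, `Birch`, `SpectralClustering`, `GaussianMixture`, `silhouette_score`), and the extended rows would be passed to a secondary K-means. The sketch makes no claim about the overall privacy guarantee; that depends on how the budget is split, which the paper handles with its adaptive ε functions.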
Authors: LI Shuai; CHANG Jin-cai; LI-LÜ Mu-zhi; CAI Kun-jie (College of Science, North China University of Science and Technology, Tangshan 063210; Hebei Provincial Key Laboratory of Data Science and Application, Tangshan 063210, China)
Source: Computer Engineering & Science (《计算机工程与科学》), indexed in CSCD and the Peking University Core Journals list, 2022, No. 8, pp. 1402-1408 (7 pages)
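Funding: Science and Technology Project of the North China Air Traffic Management Bureau (HBKG202002); Tangshan Science and Engineering Computing Innovation Team Project (18130209B).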
Keywords: differential privacy; ensemble clustering; Stacking algorithm; adaptive ε; privacy protection