Abstract
This paper proposes a k-anonymity algorithm based on a rounded partition function and proves theoretically that, on non-trivial data sets, it attains a lower upper bound on the size of the anonymization groups. In particular, when the size of the original dataset exceeds 2k^2, the upper bound on group size is k+1. Moreover, when the dataset is sufficiently large, the average size of the anonymization groups approaches k. Experimental results on real datasets show that the algorithm is effective and feasible.
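The arithmetic behind a k+1 bound can be illustrated with a plain balanced split. The sketch below is not the paper's algorithm, whose rounded partition function is not given in this abstract. It is a minimal illustration that assumes only group sizes matter and ignores the attribute values that a real k-anonymity method must also consider. The function name `balanced_group_sizes` is hypothetical.

```python
def balanced_group_sizes(n: int, k: int) -> list[int]:
    """Split n records into floor(n/k) groups whose sizes differ by at most 1.

    Illustrative only: this is a generic floor/ceil split, not the
    rounded partition function defined in the paper.
    """
    if k < 1 or n < k:
        raise ValueError("need k >= 1 and n >= k")
    m = n // k                   # number of groups; each gets at least k records
    base, extra = divmod(n, m)   # base = floor(n/m); `extra` groups get one more
    return [base + 1] * extra + [base] * (m - extra)


if __name__ == "__main__":
    sizes = balanced_group_sizes(53, 5)
    print(sizes, max(sizes), sum(sizes) / len(sizes))
```

For k = 5 and n = 53 (which exceeds 2k^2 = 50), this yields three groups of 6 and seven groups of 5, so no group exceeds k+1. Writing n = mk + r with 0 <= r < k, the average group size is k + r/m, which tends to k as n grows.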
Source
Journal of Software (《软件学报》)
EI
CSCD
PKU Core (北大核心)
2012, Issue 8, pp. 2138-2148 (11 pages)
Funding
National Natural Science Foundation of China (61003057)
Natural Science Foundation of Fujian Province (2010J01330)
Keywords
privacy preservation
data publishing
k-anonymity algorithm
rounded partition function
upper bound on the size of anonymization groups