Abstract
Supporting data mining is one of the problems in improving the utility of data released under the k-anonymity privacy protection model. Our analysis shows that the quasi-identifier attribute values in a k-anonymous table and the attribute values of some non-leaf nodes of a decision tree built from the original (precise) table are both produced by generalization. Based on this correspondence, we propose a decision tree generation algorithm that operates on k-anonymous tables. The algorithm takes the k-anonymous table directly as input, which avoids the data preparation that the classical ID3 algorithm requires before it runs. Experimental results show that the algorithm saves the time otherwise spent building the generalization hierarchy tree and that it is effective.
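To illustrate the idea of feeding a k-anonymous table straight into a decision tree learner, here is a minimal sketch. It is plain textbook ID3 that treats each generalized quasi-identifier value (an age range, a masked ZIP code) as an ordinary categorical value; it is not the algorithm proposed in the paper, whose details the abstract does not give. The table `K_ANON_TABLE`, its column names, and its values are hypothetical, chosen so that every quasi-identifier combination occurs at least twice (k = 2).

```python
import math
from collections import Counter


def entropy(rows, target):
    """Shannon entropy (in bits) of the class label over the given rows."""
    counts = Counter(row[target] for row in rows)
    total = len(rows)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def partition(rows, attr):
    """Group rows by their (possibly generalized) value of attr."""
    groups = {}
    for row in rows:
        groups.setdefault(row[attr], []).append(row)
    return groups


def information_gain(rows, attr, target):
    """Entropy reduction obtained by splitting rows on attr."""
    total = len(rows)
    remainder = sum(
        len(group) / total * entropy(group, target)
        for group in partition(rows, attr).values()
    )
    return entropy(rows, target) - remainder


def id3(rows, attrs, target):
    """Build a decision tree as nested dicts; leaves are class labels."""
    labels = [row[target] for row in rows]
    if len(set(labels)) == 1:
        return labels[0]  # pure node
    if not attrs:
        return Counter(labels).most_common(1)[0][0]  # majority vote
    best = max(attrs, key=lambda a: information_gain(rows, a, target))
    rest = [a for a in attrs if a != best]
    return {
        best: {
            value: id3(subset, rest, target)
            for value, subset in partition(rows, best).items()
        }
    }


# Hypothetical 2-anonymous table: "age" and "zip" are already generalized.
K_ANON_TABLE = [
    {"age": "[20-30)", "zip": "4300**", "disease": "flu"},
    {"age": "[20-30)", "zip": "4300**", "disease": "flu"},
    {"age": "[30-40)", "zip": "4300**", "disease": "cancer"},
    {"age": "[30-40)", "zip": "4300**", "disease": "cancer"},
    {"age": "[20-30)", "zip": "4301**", "disease": "flu"},
    {"age": "[20-30)", "zip": "4301**", "disease": "flu"},
]

tree = id3(K_ANON_TABLE, ["age", "zip"], "disease")
print(tree)  # {'age': {'[20-30)': 'flu', '[30-40)': 'cancer'}}
```

Because the generalized values are used as they are, no generalization hierarchy has to be built before the tree is grown, which is the kind of saving the abstract describes.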
Source
Journal of Wuhan University (Natural Science Edition) (《武汉大学学报(理学版)》)
Indexed in: CAS, CSCD, Peking University Core Journals (北大核心)
2011, No. 6, pp. 494-498 (5 pages)
Funding
Supported by the National Natural Science Foundation of China (61070032)