An Algorithm of Decision Tree for k-Anonymity Data

Abstract: Data mining is one of the problems in improving the utility of data under the k-anonymity privacy protection model. Analysis shows that the quasi-identifier attribute values in a k-anonymity table are produced by generalization, as are the attribute values of some non-leaf nodes in a decision tree built from the precise (original) table. Based on this correspondence, this paper proposes a decision tree generation algorithm based on the k-anonymity table. The algorithm takes the k-anonymity table directly as input, which avoids the data preparation work required before running the classical ID3 algorithm. Experiments show that the algorithm saves the time otherwise spent building the generalization hierarchy tree and that it is effective on k-anonymity data tables.
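As an illustration of the idea in the abstract, the following is a minimal sketch of classical ID3 run directly on a table whose quasi-identifier values are already generalized. It is not the authors' algorithm, whose details are not given in this record. It only shows that generalized values such as `[20-30)` or `200**` can serve as branch labels without first building a generalization hierarchy. The table, the attribute names (`age`, `zip`, `buys`), and the function names are hypothetical.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(labels).values())


def info_gain(rows, attr, target):
    """Information gain obtained by splitting rows on attr."""
    base = entropy([r[target] for r in rows])
    groups = {}
    for r in rows:
        groups.setdefault(r[attr], []).append(r[target])
    remainder = sum(len(g) / len(rows) * entropy(g) for g in groups.values())
    return base - remainder


def id3(rows, attrs, target):
    """Build a tree as nested dicts; leaves are class labels."""
    labels = [r[target] for r in rows]
    majority = Counter(labels).most_common(1)[0][0]
    # Stop when the node is pure or no attributes remain.
    if len(set(labels)) == 1 or not attrs:
        return majority
    best = max(attrs, key=lambda a: info_gain(rows, a, target))
    rest = [a for a in attrs if a != best]
    branches = {}
    # Each distinct generalized value becomes one branch.
    for value in sorted({r[best] for r in rows}):
        subset = [r for r in rows if r[best] == value]
        branches[value] = id3(subset, rest, target)
    return {best: branches}


# Hypothetical 2-anonymous table: every (age, zip) combination
# appears at least twice, and the values are already generalized.
table = [
    {"age": "[20-30)", "zip": "200**", "buys": "no"},
    {"age": "[20-30)", "zip": "200**", "buys": "no"},
    {"age": "[30-40)", "zip": "200**", "buys": "yes"},
    {"age": "[30-40)", "zip": "200**", "buys": "yes"},
    {"age": "[30-40)", "zip": "201**", "buys": "yes"},
    {"age": "[30-40)", "zip": "201**", "buys": "yes"},
]

tree = id3(table, ["age", "zip"], "buys")
print(tree)
```

In this toy table, splitting on `age` separates the classes completely, so the sketch selects it as the root. Treating each generalized value as an opaque category is a simplifying assumption; it ignores overlap between generalized values at different levels, which a method designed for k-anonymity data would presumably need to handle.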

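Authors: Lin Bingchun, Liu Guohua, Wang Mei

Affiliation: School of Computer Science and Technology, Donghua University

Source: Journal of Wuhan University (Natural Science Edition), indexed in CAS, CSCD, and the Peking University Core Journals list; 2011, No. 6, pp. 494-498 (5 pages)

Funding: National Natural Science Foundation of China (61070032)

Keywords: k-anonymity; decision tree; ID3; uncertain data mining

Classification: TP311.131 [Automation and Computer Technology: Computer Software and Theory]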



