Design of Apriori algorithm optimization based on double compression
Abstract To address the performance bottleneck of the Apriori algorithm, a double-compression Apriori algorithm (Apriori double compression, Apriori_DC) is proposed. The algorithm improves efficiency in two ways: it repeatedly compresses the transaction database, reducing both the number of transaction records and the number of data items, and it trims the frequent item sets, which reduces the number of candidate frequent item sets generated in the next step. Experiments showed that Apriori_DC outperformed Apriori both when the support threshold was fixed and the data volume varied, and when the data volume was fixed and the support threshold varied. They also showed that the size of the transaction database shrank continually during the execution of Apriori_DC.
Authors ZHENG Jianhua (郑建华); XU Longqin (徐龙琴); LIU Shuangyin (刘双印); ZHANG Shilong (张世龙) (College of Information Science and Technology, Zhongkai University of Agriculture and Engineering, Guangzhou 510225, China)
Source Journal of Zhongkai University of Agriculture and Engineering (《仲恺农业工程学院学报》), CAS, 2017, No. 4, pp. 26-31 (6 pages)
Funding Supported by the National Natural Science Foundation of China (61471133, 61571444), the Guangdong Province Science and Technology Program (2013B090600065, 2017A070712019), and the Guangzhou Science and Technology Program (201704030098)
Keywords Apriori algorithm; Apriori_DC algorithm; association rule; frequent item set; compression
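
The abstract names two compression steps but does not give their exact rules, so the following Python sketch only illustrates the general idea and should not be read as the authors' Apriori_DC implementation. It assumes two standard pruning rules that are known to be safe: (1) after pass k, items that no longer appear in any usable frequent k-item set are deleted from every transaction, and transactions left with fewer than k+1 items are dropped; (2) a frequent k-item set is excluded from candidate generation if it contains an item that occurs in fewer than k frequent k-item sets. The function name `apriori_dc_sketch`, the returned `db_sizes` list, and the sample transactions are all hypothetical and chosen for illustration.

```python
from collections import Counter
from itertools import combinations


def apriori_dc_sketch(transactions, min_count):
    """Illustrative Apriori with two assumed compression steps.

    Returns (frequent, db_sizes): a dict mapping each frequent item set
    (frozenset) to its support count, and the number of transactions
    remaining in the database after each compression.
    """
    db = [frozenset(t) for t in transactions]
    db_sizes = [len(db)]

    # Pass 1: count single items and keep the frequent ones.
    item_counts = Counter(item for t in db for item in t)
    level = {frozenset([i]): c for i, c in item_counts.items() if c >= min_count}
    frequent = dict(level)
    k = 1

    while level:
        # Assumed compression of the frequent item sets: every item of a
        # frequent (k+1)-item set occurs in at least k frequent k-item sets,
        # so k-item sets containing a rarer item cannot help in the join.
        occurrences = Counter(i for s in level for i in s)
        join_sets = [s for s in level if all(occurrences[i] >= k for i in s)]

        # Assumed compression of the transaction database: delete items that
        # can no longer appear in a candidate, then drop short transactions.
        live_items = {i for s in join_sets for i in s}
        db = [t & live_items for t in db]
        db = [t for t in db if len(t) >= k + 1]
        db_sizes.append(len(db))

        # Classic Apriori join and prune on the reduced set of k-item sets.
        candidates = set()
        for a, b in combinations(join_sets, 2):
            union = a | b
            if len(union) == k + 1 and all(
                frozenset(sub) in level for sub in combinations(union, k)
            ):
                candidates.add(union)

        # Count candidate support on the compressed database only.
        counts = Counter()
        for t in db:
            for c in candidates:
                if c <= t:
                    counts[c] += 1

        level = {c: n for c, n in counts.items() if n >= min_count}
        frequent.update(level)
        k += 1

    return frequent, db_sizes


# Hypothetical sample data, used only to exercise the sketch.
transactions = [
    {"I1", "I2", "I5"}, {"I2", "I4"}, {"I2", "I3"},
    {"I1", "I2", "I4"}, {"I1", "I3"}, {"I2", "I3"},
    {"I1", "I3"}, {"I1", "I2", "I3", "I5"}, {"I1", "I2", "I3"},
]
frequent, db_sizes = apriori_dc_sketch(transactions, min_count=2)
print(len(frequent), db_sizes)
```

Tracking `db_sizes` mirrors the abstract's observation that the transaction database keeps shrinking while the algorithm runs; both pruning rules leave the support counts of the reported item sets unchanged.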