期刊文献+

改进的基于频繁模式树的最大频繁项集挖掘算法——FP-MFIA 被引量:16

FP-MFIA: improved algorithm for mining maximum frequent itemsets based on frequent-pattern tree
下载PDF
导出
摘要 针对最大频繁项目集挖掘算法(DMFIA)当候选项目集维数高而最大频繁项目集维数较低的情况下要产生大量的候选项目集的缺点,提出了一种改进的基于频繁模式树(FP-tree)结构的最大频繁项目集挖掘算法——FPMFIA。该算法根据FP-tree的项目头表,采用自底向上的搜索策略逐层挖掘最大频繁项目集,从而加速每次对候选集计数的操作。在挖掘时根据每层的条件模式基产生维数较低的非频繁项目集,尽早对候选项目集进行剪枝和降维,可大量减少候选项目集的数量。同时在挖掘时充分利用最大频繁项集的性质,减少搜索空间。通过算法在不同支持度下挖掘时间的对比可知,算法FP-MFIA在最小支持度较低的情况下时间效率是DMFIA以及基于降维的最大频繁模式挖掘算法(BDRFI)的2倍以上,说明FP-MFIA在候选集维数较高的时候优势明显。 Focusing on the drawback that Discovering Maximum Frequent Itemsets Algorithm (DMFIA) has to generate lots of maximum frequent candidate itemsets in each dimension when given datasets with many candidate items and each maximum frequent itemset is not long, an improved Algorithm for mining Maximum Frequent Itemsets based of Frequent- Pattern tree (FP-MFIA) for mining maximum frequent itemsets based on FP-tree was proposed. According to Htable of FP- tree, this algorithm used bottom-up searches to mine maximum frequent itemsets, thus accelerated the count of candidates. Producing infrequent itemsets with lower dimension according to conditional pattern base of every layer when mining, cutting and reducing dimensions of candidate itemsets can largely reduce the amount of candidate itemsets. At the same time taking full advantage of properties of maximum frequent itemsets will reduce the search space. The time efficiency of FP-MFIA is at least two times as much as the algorithm of DMFIA and BDRFI ( algorithm for mining frequent itemsets based on dimensionality reduction of frequent itemset) according to computational time contrast based on different supports. It shows that FP-MFIA has a clear advantage when candidate itemsets are with high dimension.
出处 《计算机应用》 CSCD 北大核心 2015年第3期775-778,共4页 journal of Computer Applications
基金 国家科技支撑计划项目(2012BAF12B08)
关键词 最大频繁项集 频繁模式树 数据挖掘 关联规则 非频繁项集 maximum frequent itemset Frequent Pattern tree (FP-tree) data mining association rule infrequentitemset
  • 相关文献

参考文献13

  • 1AGRAWAL R, IMIELINSKI T, SWAMI A. Mining association rules between sets of items in large database[C]//Proceedings of 1993 ACM SIGMOD Conference on Management of Data. New York: ACM, 1993:207-216.
  • 2PARK J S, CHEN M-S, YU P S. An effective Hash-based algorithm for mining association rules[C]//Proceedings of 1995 ACM SIGMOD International Conference on Management of Data. New York: ACM, 1995: 175-186.
  • 3LI H, WANG Y, ZHANG D, et al. PFP: parallel FP-growth for query recommendation[C]//Proceedings of the 2008 ACM Conference on Recommender Systems. New York: ACM, 2008:125-137.
  • 4付冬梅,王志强.基于FP-tree和约束概念格的关联规则挖掘算法及应用研究[J].计算机应用研究,2014,31(4):1013-1015. 被引量:25
  • 5GRAHNE G, ZHU J. High performance mining of maximal frequent itemsets[EB/OL].[2014-07-06]. http://www.docin.com/p-773109811.html.
  • 6LIN D, KEDEM Z. Pincer-search: a new algorithm for discovering the maximum frequent set[C]//Proceedings of the 6th European Conference on Extending Database Technology. Berlin: Springer-Verlag, 1998:105-119.
  • 7宋余庆,朱玉全,孙志挥,陈耿.基于FP-Tree的最大频繁项目集挖掘及更新算法[J].软件学报,2003,14(9):1586-1592. 被引量:164
  • 8HAN J, PEI J, YIN Y. Mining frequent patterns without candidate generation[C]//Proceedings of the 2000 ACM-SIGMOD International Conference on Management of Data. New York: ACM, 2000:1-12.
  • 9钱雪忠,惠亮.关联规则中基于降维的最大频繁模式挖掘算法[J].计算机应用,2011,31(5):1339-1343. 被引量:13
  • 10赵志刚,王芳,万军.基于OWSFP-Tree的最大频繁项目集挖掘算法[J].计算机工程与设计,2013,34(5):1687-1690. 被引量:5

二级参考文献56

共引文献230

同被引文献134

引证文献16

二级引证文献70

相关作者

内容加载中请稍等...

相关机构

内容加载中请稍等...

相关主题

内容加载中请稍等...

浏览历史

内容加载中请稍等...
;
使用帮助 返回顶部