Algorithm for mining maximal frequent itemsets based on FP-tree (cited by: 4)
Abstract: Mining maximal frequent itemsets is one of the most important fundamental problems in data mining. Building on an analysis of existing algorithms, this paper presents FP-MMFI, an extension of the FP-growth algorithm to maximal frequent itemset mining. It introduces the concept of a frequent path, which is used to compress the FP-tree and shrink the search space, and it uses projection to optimize superset checking, reducing the number of item comparisons. Experimental results show that the new algorithm outperforms existing algorithms of the same kind, such as MAFIA.
Source: Computer Engineering and Design (《计算机工程与设计》), CSCD, Peking University Core Journal, 2008, No. 2, pp. 385-388 (4 pages)
Keywords: data mining; association rules; frequent itemsets; maximal frequent itemsets; frequent pattern tree
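
To make the terms in the abstract concrete, the following is a minimal brute-force sketch of what a maximal frequent itemset is and what a superset check does. It is not the FP-MMFI algorithm: it uses no FP-tree, no frequent paths, and no projection, and it only suits tiny inputs. The function names, the sample transactions, and the absolute `min_support` threshold are all assumptions made for illustration.

```python
from itertools import combinations


def frequent_itemsets(transactions, min_support):
    """Enumerate every frequent itemset by brute force (tiny inputs only)."""
    items = sorted({item for t in transactions for item in t})
    frequent = []
    for size in range(1, len(items) + 1):
        found_any = False
        for candidate in combinations(items, size):
            cset = frozenset(candidate)
            # Support = number of transactions that contain the candidate.
            support = sum(1 for t in transactions if cset <= t)
            if support >= min_support:
                frequent.append(cset)
                found_any = True
        if not found_any:
            # No frequent itemset of this size, so none of any larger size.
            break
    return frequent


def maximal_frequent_itemsets(transactions, min_support):
    """Keep only frequent itemsets that have no frequent proper superset."""
    frequent = frequent_itemsets(transactions, min_support)
    maximal = []
    # Visit larger itemsets first, so the superset check only needs to
    # compare against the maximal itemsets kept so far.
    for itemset in sorted(frequent, key=len, reverse=True):
        if not any(itemset < kept for kept in maximal):
            maximal.append(itemset)
    return maximal


# Hypothetical sample data, not taken from the paper.
transactions = [frozenset(t) for t in (
    {"a", "b", "c"}, {"a", "b", "c"}, {"a", "b"}, {"b", "d"}, {"c", "d"},
)]
result = maximal_frequent_itemsets(transactions, min_support=2)
```

With this sample data the frequent itemsets include {a, b}, {a, c}, and {b, c}, but each is contained in the frequent itemset {a, b, c}, so only {a, b, c} and {d} are maximal. The repeated subset comparisons in the last loop are the kind of cost that the abstract says the projection method is meant to reduce.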
References (8)

  • 1 Han J, Pei J, Yin Y. Mining frequent patterns without candidate generation [C]. Dallas, TX: ACM-SIGMOD, 2000.
  • 2 Aggarwal C, Agarwal R, Prasad V V V. Depth first generation of long patterns [C]. Boston, MA, USA: Proc of the 6th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 2000: 108-118.
  • 3 Burdick D, Calimlim M, Gehrke J. MAFIA: A maximal frequent itemset algorithm for transactional databases [C]. Heidelberg, Germany: Proc of the 17th International Conference on Data Engineering, 2001: 443-452.
  • 4 Grahne G, Zhu J. High performance mining of maximal frequent itemsets [C]. San Francisco, CA: Proc of the 6th SIAM Int'l Workshop on High Performance Data Mining (HPDM), 2003: 135-143.
  • 5 Liu Junqiang, Sun Xiaoying, Wang Xun, Pan Yunhe. A new method for mining maximal frequent patterns [J]. Chinese Journal of Computers (计算机学报), 2004, 27(10): 1328-1334. Cited by: 15
  • 6 Yan Yuejin, Li Zhoujun, Chen Huowang. Efficiently mining maximal frequent itemsets based on FP-Tree [J]. Journal of Software (软件学报), 2005, 16(2): 215-222. Cited by: 68
  • 7 Ma Lisheng, Deng Huiwen, Qi Yi. A new algorithm for mining maximal frequent itemsets [J]. Journal of Computer Applications (计算机应用), 2006, 26(11): 2670-2673. Cited by: 6
  • 8 Fan Ming, Meng Xiaofeng, Han J, et al. Data Mining [M]. Beijing: China Machine Press, 2001.
