Algorithm for mining maximal frequent itemsets based on FP-tree (基于FP-tree的最大频繁项目集挖掘算法)

Cited by: 4

Abstract: Mining maximal frequent itemsets is one of the most important and fundamental problems in data mining. Building on an analysis of existing algorithms, this paper presents FP-MMFI, an extension of the FP-growth algorithm to maximal frequent itemset mining. It introduces the concept of a frequent path, which can be used to compress the FP-tree and shrink the search space effectively. Superset checking is optimized with a projection method that reduces the number of item comparisons. Experimental results show that the algorithm outperforms existing algorithms of the same kind, such as MAFIA.
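To make the terms in the abstract concrete, the following is a minimal Python sketch of what "maximal frequent itemset" and "superset checking" mean. It is not the FP-MMFI algorithm: it enumerates candidates by brute force rather than using an FP-tree, frequent paths, or projection, and the function names `frequent_itemsets` and `maximal_itemsets` are hypothetical names chosen for illustration.

```python
from itertools import combinations


def frequent_itemsets(transactions, min_support):
    """Enumerate all frequent itemsets by brute force (illustration only).

    An itemset is frequent when at least `min_support` transactions contain it.
    """
    items = sorted(set().union(*transactions))
    frequent = []
    for size in range(1, len(items) + 1):
        for candidate in combinations(items, size):
            cand = frozenset(candidate)
            support = sum(1 for t in transactions if cand <= t)
            if support >= min_support:
                frequent.append(cand)
    return frequent


def maximal_itemsets(frequent):
    """Keep only the itemsets that have no frequent proper superset."""
    maximal = []
    # Visit longer itemsets first, so any maximal superset is already kept.
    for itemset in sorted(frequent, key=len, reverse=True):
        # Superset check: discard the itemset if a kept itemset strictly contains it.
        if not any(itemset < kept for kept in maximal):
            maximal.append(itemset)
    return maximal


# Toy transaction database (made up for this example).
transactions = [
    {"a", "b", "c"},
    {"a", "b"},
    {"a", "b", "d"},
    {"c", "d"},
    {"c", "d"},
]
result = maximal_itemsets(frequent_itemsets(transactions, min_support=2))
print(sorted(sorted(s) for s in result))
```

With a minimum support of 2, the frequent itemsets in this toy database are {a}, {b}, {c}, {d}, {a, b}, and {c, d}; only {a, b} and {c, d} have no frequent proper superset. The repeated `itemset < kept` comparisons are the superset checks whose cost the abstract says the projection method is meant to reduce.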

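Authors: Ma Lisheng (马丽生), Deng Huiwen (邓辉文), Qi Yi (齐逸)

Affiliations: Department of Computer Science and Technology, Chuzhou University; School of Computer and Information Science, Southwest University

Source: Computer Engineering and Design (《计算机工程与设计》), CSCD, Peking University Core Journal, 2008, No. 2, pp. 385-388 (4 pages)

Keywords: data mining; association rules; frequent itemsets; maximal frequent itemsets; frequent pattern tree

Classification: TP311 [Automation and Computer Technology: Computer Software and Theory]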

References (8)

1. Han J, Pei J, Yin Y. Mining frequent patterns without candidate generation[C]. Dallas, TX: ACM-SIGMOD, 2000.
2. Aggarwal C, Agarwal R, Prasad V V V. Depth first generation of long patterns[C]. Boston, MA, USA: Proc of the 6th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 2000: 108-118.
3. Burdick D, Calimlim M, Gehrke J. MAFIA: A maximal frequent itemset algorithm for transactional databases[C]. Heidelberg, Germany: Proc of the 17th International Conference on Data Engineering, 2001: 443-452.
4. Grahne G, Zhu J. High performance mining of maximal frequent itemsets[C]. San Francisco, CA: Proc of the 6th SIAM Int'l Workshop on High Performance Data Mining (HPDM), 2003: 135-143.
5. Liu Junqiang, Sun Xiaoying, Wang Xun, Pan Yunhe. A new method for mining maximal frequent patterns[J]. Chinese Journal of Computers, 2004, 27(10): 1328-1334.
6. Yan Yuejin, Li Zhoujun, Chen Huowang. Efficient mining of maximal frequent itemsets based on FP-Tree[J]. Journal of Software, 2005, 16(2): 215-222.
7. Ma Lisheng, Deng Huiwen, Qi Yi. A new algorithm for mining maximal frequent itemsets[J]. Journal of Computer Applications, 2006, 26(11): 2670-2673.
8. Fan Ming, Meng Xiaofeng, Han J, et al. Data Mining[M]. Beijing: China Machine Press, 2001.
