Abstract
Mining maximal frequent itemsets is one of the most important fundamental problems in data mining. Building on an analysis of existing algorithms, this paper presents FP-MMFI, an extension of the FP-growth algorithm to maximal frequent itemset mining. It introduces the concept of a frequent path, which is used to compress the FP-tree and shrink the search space. A projection method optimizes superset checking by reducing the number of item comparisons. Experimental results show that the new algorithm outperforms previously developed algorithms of the same kind, such as MAFIA.
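To make the terms in the abstract concrete, the following is a minimal brute-force sketch of what a maximal frequent itemset is and what a superset check does. It is not the FP-MMFI algorithm: it builds no FP-tree and uses neither frequent paths nor projection. The function names, the `min_support` parameter (an absolute transaction count), and the sample data are assumptions made for illustration only.

```python
from itertools import combinations


def frequent_itemsets(transactions, min_support):
    """Enumerate every frequent itemset by brute force.

    Exponential in the number of distinct items; for illustration only.
    `min_support` is an absolute count of transactions (an assumption here).
    """
    items = sorted(set().union(*transactions))
    frequent = []
    for size in range(1, len(items) + 1):
        for candidate in combinations(items, size):
            cand = frozenset(candidate)
            # Support = number of transactions that contain the candidate.
            support = sum(1 for t in transactions if cand <= t)
            if support >= min_support:
                frequent.append(cand)
    return frequent


def maximal_frequent_itemsets(transactions, min_support):
    """Keep only frequent itemsets that have no frequent proper superset."""
    frequent = frequent_itemsets(transactions, min_support)
    maximal = []
    # Visit larger itemsets first, so that any frequent proper superset of
    # the current itemset is already covered by an entry in `maximal`.
    for itemset in sorted(frequent, key=len, reverse=True):
        # Superset check: discard the itemset if a known maximal set contains it.
        if not any(itemset < m for m in maximal):
            maximal.append(itemset)
    return maximal


# Hypothetical sample data: five transactions over the items a, b, c, d.
transactions = [
    frozenset("abc"),
    frozenset("abc"),
    frozenset("ab"),
    frozenset("cd"),
    frozenset("cd"),
]
result = maximal_frequent_itemsets(transactions, min_support=2)
```

In this sample, nine itemsets are frequent at a minimum support of 2, but only {a, b, c} and {c, d} are maximal. The superset check in the inner loop is the step whose comparison count the abstract says the projection method reduces.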
Source
Computer Engineering and Design (《计算机工程与设计》)
CSCD
Peking University Core Journal (北大核心)
2008, No. 2, pp. 385-388 (4 pages)
Keywords
data mining
association rules
frequent itemsets
maximal frequent itemsets
frequent pattern tree