Abstract
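Mining frequent closed itemsets is an important research topic in data mining. Existing algorithms for mining frequent closed itemsets mainly target single-machine environments, and research on mining global frequent closed itemsets in distributed environments remains scarce. To address this, this paper proposes a fast algorithm for mining global frequent closed itemsets and studies the associated updating problem. It also proposes a corresponding incremental updating algorithm for frequent closed itemsets, which makes full use of earlier mining results to reduce the time needed to discover new global frequent closed itemsets. Experimental results show that the algorithms are effective.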
Discovering frequent closed itemsets is a key problem in data mining applications. Many sequential algorithms have been proposed for mining frequent closed itemsets. However, very little work has been done on discovering frequent closed itemsets in distributed environments. In this paper, an efficient algorithm, GFCIA, and its updating algorithm, UGFCIA, are presented for mining global frequent closed itemsets; both use far less communication overhead. Experimental results show the feasibility and effectiveness of the algorithms.
Source
《计算机科学》
CSCD
Peking University Core Journals (北大核心)
2008, No. 1, pp. 193-195 (3 pages)
Computer Science
Funding
National Natural Science Foundation of China (No. 60572112)
Keywords
Data mining, Distributed database, Frequent closed itemsets, Global frequent closed itemsets