Abstract
The set of frequent closed patterns uniquely determines the complete set of frequent patterns and is usually much smaller. Mining frequent closed patterns over a sliding window, however, remains a major challenge. Based on the characteristics of data streams, a new algorithm called DS_CFI is proposed for discovering the frequent closed itemsets in a sliding window. DS_CFI divides the sliding window into several basic windows and uses the basic window as the unit of update. The potential frequent closed itemsets of each basic window are mined with existing frequent closed pattern mining algorithms, and these itemsets, together with their subsets, are stored in a new data structure called DSCFI_tree. The DSCFI_tree can be updated incrementally, and all frequent closed itemsets in the sliding window can be found quickly from it. Experimental results show the feasibility and effectiveness of the algorithm.
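The basic-window update scheme described above can be illustrated with a minimal sketch. It is not the DS_CFI algorithm itself: the abstract does not give the internals of DSCFI_tree, so this sketch assumes a plain `Counter` in its place, enumerates every itemset by brute force instead of calling an existing closed-pattern miner, and uses hypothetical names (`SlidingWindowMiner`, `add_basic_window`, `frequent_closed`). It only shows how per-basic-window counts can be added and expired incrementally, and how closed itemsets are then read off.

```python
from collections import Counter, deque
from itertools import combinations


def count_itemsets(transactions):
    """Count the support of every non-empty itemset in one basic window.

    Brute-force enumeration, suitable only for tiny examples.
    """
    counts = Counter()
    for transaction in transactions:
        items = sorted(set(transaction))
        for size in range(1, len(items) + 1):
            for combo in combinations(items, size):
                counts[frozenset(combo)] += 1
    return counts


class SlidingWindowMiner:
    """Hypothetical stand-in for the DSCFI_tree: a flat table of supports."""

    def __init__(self, num_basic_windows, min_support):
        self.num_basic_windows = num_basic_windows
        self.min_support = min_support  # absolute count over the sliding window
        self.windows = deque()          # per-basic-window counts, oldest first
        self.totals = Counter()         # supports aggregated over the sliding window

    def add_basic_window(self, transactions):
        """Add the newest basic window and expire the oldest one if needed."""
        counts = count_itemsets(transactions)
        self.windows.append(counts)
        self.totals.update(counts)
        if len(self.windows) > self.num_basic_windows:
            expired = self.windows.popleft()
            self.totals.subtract(expired)
            for itemset in expired:
                if self.totals[itemset] <= 0:
                    del self.totals[itemset]

    def frequent_closed(self):
        """Return frequent itemsets that have no proper superset of equal support."""
        frequent = {s: c for s, c in self.totals.items() if c >= self.min_support}
        return {
            s: c
            for s, c in frequent.items()
            if not any(s < t and c == ct for t, ct in frequent.items())
        }


miner = SlidingWindowMiner(num_basic_windows=2, min_support=2)
miner.add_basic_window([["a", "b"], ["a", "b", "c"]])
miner.add_basic_window([["a", "c"]])
print(miner.frequent_closed())
```

With two basic windows of capacity, adding a third basic window drops the contribution of the first one without rescanning the transactions that remain in the sliding window, which is the point of using the basic window as the update unit.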
Source
Journal of Computer Research and Development (《计算机研究与发展》)
EI
CSCD
PKU Core (北大核心)
2006, No. 10, pp. 1738-1743 (6 pages)
Funding
Jiangsu Province High-Technology Fund Project (BG2004034)
Jiangsu Province 2004 Graduate Innovation Program Fund Project (xm04-36)
Keywords
data stream
frequent closed itemset
sliding window
association rule
knowledge discovery