Abstract
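Sequential patterns have important applications in areas such as gene analysis and financial forecasting, and sequential pattern mining is a major branch of data mining. Given the growing number of data stream applications, this paper builds on a study of traditional sequential pattern mining algorithms and proposes a sequential pattern mining algorithm for data streams based on an extendable sliding window and Bayesian probability filtering (the BMSP-DS algorithm). The aim is to simplify the intermediate results of sequential pattern discovery and improve mining efficiency, so that frequent sequential patterns in stream data can be found quickly with little storage and low computation time. The algorithm also reduces the negative effect that a poorly chosen subjective support threshold has on pattern discovery. Experimental results show that the algorithm is feasible and performs comparatively well.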
Although mining sequential patterns is becoming increasingly essential in many scientific and commercial domains, it is challenging to extend it to data streams. In this paper, we present a new, efficient algorithm, BMSP-DS, for mining sequential patterns in data streams, which is based on an extendable sliding window and Bayesian probability filtration. The algorithm reduces temporary data in the mining process by eliminating low-probability sequence candidates, and speeds up frequent sequential pattern mining under limited time and restricted space. Finally, the experimental results demonstrate that the algorithm is effective.
Source
《小型微型计算机系统》
CSCD
PKU Core (Peking University Core Journals)
2006, No. 7, pp. 1292-1295 (4 pages)
Journal of Chinese Computer Systems
Funding
Supported by the National Natural Science Foundation of China (60273075).
Keywords
data stream
sequential patterns
sliding window
Bayesian probability