Abstract
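Many data stream applications have emerged in recent years, such as web logs and sensor networks. Because stream data is unbounded in volume and its distribution changes over time, traditional mining algorithms do not handle these applications well. To address this, a frequent-pattern-based classification algorithm for data streams, CAPE (classification using frequent pattern), is proposed. CAPE classifies by means of the frequent patterns found in the data stream, preserving the classification information in the data while compressing it. Experiments show that the algorithm is more accurate than other algorithms.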
Classification has been an important data mining task over the past decade. Many effective and efficient methods, e.g. decision trees and Bayesian networks, have been developed for classification on large static databases. However, these methods are not suited to processing data streams. A new algorithm, CAPE (classification using frequent patterns), is therefore presented to deal with classification over data streams. Frequent patterns are introduced into classification and used mainly to record the data distribution over the stream during a certain time slice. The experimental results show that, in most cases, classification using frequent patterns over a stream is more accurate than the "weighted classifier ensembles" algorithm, which is regarded as the best classification algorithm over streams at present.
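The abstract gives no algorithmic detail, so the following is only a minimal sketch of the general idea it describes: summarising one time slice of a stream as per-class frequent patterns and classifying new records against that summary. It is not the CAPE algorithm itself. The function names `mine_slice` and `classify`, the support threshold, the pattern-length limit, the scoring rule, and the sample records are all assumptions made for illustration.

```python
from collections import Counter, defaultdict
from itertools import combinations


def mine_slice(records, min_support=0.5, max_len=2):
    """Summarise one time slice as per-class frequent patterns (hypothetical helper).

    records: list of (items, label) pairs, where items is a set of
    attribute-value tokens. Returns {label: {pattern: support}}, with
    support measured relative to the number of records carrying that label.
    """
    counts = defaultdict(Counter)
    class_sizes = Counter()
    for items, label in records:
        class_sizes[label] += 1
        ordered = sorted(items)  # canonical order so each pattern has one key
        for k in range(1, max_len + 1):
            for pattern in combinations(ordered, k):
                counts[label][pattern] += 1
    summary = {}
    for label, counter in counts.items():
        n = class_sizes[label]
        # Keep only frequent patterns: this is the compressed view of the slice.
        summary[label] = {p: c / n for p, c in counter.items() if c / n >= min_support}
    return summary


def classify(summary, items):
    """Pick the class whose frequent patterns best cover the record (assumed rule)."""
    scores = {
        label: sum(s for p, s in patterns.items() if set(p) <= items)
        for label, patterns in summary.items()
    }
    return max(scores, key=scores.get) if scores else None


# Made-up records standing in for one time slice of a network log stream.
slice_records = [
    ({"proto=tcp", "port=80"}, "normal"),
    ({"proto=tcp", "port=80"}, "normal"),
    ({"proto=udp", "port=53"}, "normal"),
    ({"proto=tcp", "port=23"}, "attack"),
    ({"proto=tcp", "port=23"}, "attack"),
]
summary = mine_slice(slice_records, min_support=0.6)
prediction = classify(summary, {"proto=tcp", "port=23"})
```

In a streaming setting one would presumably rebuild or age such a summary for each new time slice so that it tracks changes in the data distribution; how CAPE does this is not stated in the abstract.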
Source
《计算机研究与发展》
EI
CSCD
PKU Core Journal (北大核心)
2004, No. 10, pp. 1677-1683 (7 pages)
Journal of Computer Research and Development
Funding
Key Program of the National Natural Science Foundation of China (69933010, 60303008)
National High Technology Research and Development Program of China (863 Program) (2002AA4Z3430, 2002AA231041)
Keywords
data stream
classification
decision tree
frequent pattern