CAPE: A Classification Algorithm Using Frequent Patterns over Data Streams

Cited by: 7


Abstract: Many data stream applications have emerged in recent years, such as web logs and sensor networks. Because a data stream is unbounded in volume and its data distribution changes over time, traditional mining algorithms handle these problems poorly. To address this, the paper proposes CAPE (classification using frequent patterns), a frequent-pattern-based classification algorithm over data streams. CAPE classifies using the frequent patterns in the stream, which compresses the data while preserving its class information. Experiments show that the algorithm achieves higher accuracy than other algorithms.

Classification has been an important data mining task over the past decade. Many effective and efficient methods, e.g. decision trees and Bayesian networks, have been developed for classification on large static databases. However, these methods are not suited to processing data streams. A new algorithm, CAPE (classification using frequent patterns), is therefore presented for classification over data streams. Frequent patterns are introduced into classification and used to record the data distribution over the stream, mainly within a certain time slice. The experimental results show that classification using frequent patterns over a stream is more accurate in most cases than the “weighted classifier ensembles” algorithm, which is known to be the best classification algorithm over streams at present.
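The abstract gives only the general idea: frequent patterns summarize the recent part of the stream and are then used to classify. The sketch below is a toy illustration of that idea, not the CAPE algorithm from the paper. The class name `WindowPatternClassifier`, the fixed-size sliding window, the itemset length cap, and the scoring rule (summing per-class support of matching patterns) are all assumptions made for this example.

```python
from collections import Counter, deque
from itertools import combinations


class WindowPatternClassifier:
    """Toy classifier: per-class frequent itemsets over a sliding window.

    Hypothetical illustration only; it does not reproduce CAPE.
    """

    def __init__(self, window_size=100, min_support=0.3, max_len=2):
        # A bounded deque drops the oldest record once the window is full.
        self.window = deque(maxlen=window_size)
        self.min_support = min_support
        self.max_len = max_len

    def _itemsets(self, items):
        # All non-empty itemsets of size <= max_len, as sorted tuples.
        items = sorted(set(items))
        for k in range(1, self.max_len + 1):
            yield from combinations(items, k)

    def update(self, items, label):
        # Add one labeled record from the stream to the window.
        self.window.append((frozenset(items), label))

    def frequent_patterns(self):
        # Returns {label: {itemset: relative support within that class}}.
        counts, sizes = {}, Counter()
        for items, label in self.window:
            sizes[label] += 1
            counts.setdefault(label, Counter()).update(self._itemsets(items))
        return {
            label: {p: c / sizes[label] for p, c in ctr.items()
                    if c / sizes[label] >= self.min_support}
            for label, ctr in counts.items()
        }

    def predict(self, items):
        # Score each class by the total support of its frequent patterns
        # that occur in the record; return None if the window is empty.
        patterns = self.frequent_patterns()
        present = set(self._itemsets(items))
        scores = {label: sum(s for p, s in pats.items() if p in present)
                  for label, pats in patterns.items()}
        return max(scores, key=scores.get) if scores else None


clf = WindowPatternClassifier(window_size=4, min_support=0.5)
clf.update({"a", "b"}, "pos")
clf.update({"a", "b", "c"}, "pos")
clf.update({"c", "d"}, "neg")
clf.update({"d"}, "neg")
print(clf.predict({"a", "b"}))  # pos
```

Only the patterns and their supports are needed at prediction time, which is the sense in which patterns can stand in for the raw records. Recomputing them from the window on every call is a simplification for readability; a real stream algorithm would maintain the counts incrementally.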

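Authors: Wang Peng, Wu Xiaochen, Wang Chen, Wang Wei, Shi Baile

Affiliation: Department of Computing and Information Technology, Fudan University

Source: Journal of Computer Research and Development (《计算机研究与发展》), 2004, No. 10, pp. 1677-1683 (7 pages). Indexed in EI, CSCD, and the Peking University Core Journals list.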

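Funding: Key Program of the National Natural Science Foundation of China (69933010, 60303008); National High Technology Research and Development Program of China ("863" Program) (2002AA4Z3430, 2002AA231041)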

Keywords: data stream; classification; decision tree; frequent pattern

Classification code: TP311.13 [Automation and Computer Technology: Computer Software and Theory]

