CAPE: A Classification Algorithm Using Frequent Patterns over Data Streams (cited by: 7)
Abstract  Many data stream applications have emerged in recent years, such as web logs and sensor networks. Because stream data is unbounded in volume and its distribution changes over time, traditional mining algorithms handle it poorly. To address this, the paper proposes CAPE (classification using frequent patterns), a frequent-pattern-based classification algorithm over data streams. CAPE classifies using the frequent patterns found in the stream, which compresses the data while preserving the classification information it contains. Experiments show that the algorithm is more accurate than other algorithms.

Classification has been an important data mining task over the past decade, and many effective and efficient methods, e.g., decision trees and Bayesian networks, have been developed for classification on large static databases. However, these methods are not suited to processing data streams. A new algorithm, CAPE (classification using frequent patterns), is therefore presented for classification over data streams. Frequent patterns are introduced into classification and are used mainly to record the data distribution over the stream during a given time slice. The experimental results show that, in most cases, classification using frequent patterns over a stream is more accurate than the "weighted classifier ensembles" algorithm, which is regarded as the best stream classification algorithm at present.
Source  Journal of Computer Research and Development (《计算机研究与发展》), indexed in EI, CSCD, and the Peking University Core Journals list; 2004, No. 10, pp. 1677-1683 (7 pages)
Funding  National Natural Science Foundation of China, Key Program (69933010, 60303008); National High Technology Research and Development Program of China (863 Program) (2002AA4Z3430, 2002AA231041)
Keywords  data stream; classification; decision tree; frequent pattern
References (11)

  • 1. J Han, M Kamber. Data Mining: Concepts and Techniques. San Francisco: Morgan Kaufmann, 2000
  • 2. B Babcock, S Babu, M Datar, et al. Models and issues in data stream systems. In: Proc of ACM Symp on Principles of Database Systems (PODS-02). New York: ACM Press, 2002
  • 3. Y Chen, G Dong, J Han, et al. Multi-dimensional regression analysis of time-series data streams. In: Proc of Very Large Database (VLDB02). San Francisco: Morgan Kaufmann, 2002
  • 4. J-M Adamo. Data Mining for Association Rules and Sequential Patterns: Sequential and Parallel Algorithms. New York: Springer-Verlag, 2001
  • 5. G Hulten, L Spencer, P Domingos. Mining time-changing data streams. In: Proc of the Int'l Conf on Knowledge Discovery and Data Mining (SIGKDD01). New York: ACM Press, 2001. 97-106
  • 6. Haixun Wang, Wei Fan, Philip S Yu, Jiawei Han. Mining concept-drifting data streams using ensemble classifiers. In: Proc of the Int'l Conf on Knowledge Discovery and Data Mining (SIGKDD03). New York: ACM Press, 2003
  • 7. B Liu, W Hsu, Y Ma. Integrating classification and association rule mining. KDD'98, New York, 1998
  • 8. W Li, J Han, J Pei. CMAR: Accurate and efficient classification based on multiple class-association rules. In: Proc of ICDM'01. Washington, DC: IEEE Computer Society Press, 2001. 369-376
  • 9. X Yin, J Han. CPAR: Classification based on predictive association rules. The 2003 SIAM Int'l Conf on Data Mining (SDM'03), San Francisco, CA, 2003
  • 10. Joong Hyuk Chang, Won Suk Lee. Finding recent frequent itemsets adaptively over online data streams. In: Proc of SIGKDD03. New York: ACM Press, 2003
