
An Ensemble Classifier Framework for Mining Imbalanced Data Streams

Cited by: 23
Abstract: To address the classification of imbalanced data streams, this paper combines weight-based ensemble classifiers with sampling techniques and proposes an ensemble classifier model for imbalanced data streams. Theoretical analysis and experimental validation show that the ensemble classifier has lower computational complexity, adapts better to mining and classifying imbalanced data streams with concept drift, achieves better overall classification performance than the weight-based ensemble classifier model, and markedly improves classification accuracy on the minority class.

Many real-world data stream mining applications involve learning from imbalanced data streams, and such applications expect higher predictive accuracy on the minority class. However, most classification models assume relatively balanced data streams and cannot handle imbalanced distributions. In this paper, we propose a novel ensemble classifier framework (IMDWE) for mining concept-drifting data streams with imbalanced distributions, combining a weighted ensemble classifier framework with sampling techniques, including over-sampling and under-sampling. Our empirical study shows that IMDWE improves both the efficiency of learning the model and the classification accuracy on the minority class.
Source: Acta Electronica Sinica (《电子学报》), indexed in EI, CAS, CSCD, and the Peking University Core Journals list; 2010, No. 1, pp. 184-189 (6 pages)
Keywords: classification; ensemble classifier; imbalanced data streams; concept drift
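
The abstract names the ingredients of the approach (a weighted ensemble of classifiers, over-sampling, and under-sampling) but does not give the IMDWE algorithm itself. The Python sketch below is therefore only an illustration of how those ingredients can fit together, not the authors' method. Everything in it is an assumption made for the example: the names `rebalance`, `CentroidClassifier`, `balanced_accuracy`, `WeightedChunkEnsemble`, and `make_chunk`, the nearest-centroid base learner, the midpoint resampling target, the use of balanced accuracy on the newest chunk as the member weight, and the convention that label 1 is the minority class.

```python
import random
from statistics import mean


def rebalance(chunk, rng):
    """Under-sample the larger class and over-sample the smaller one
    (by random duplication) so both reach the midpoint of their sizes."""
    by_label = {}
    for x, y in chunk:
        by_label.setdefault(y, []).append((x, y))
    if len(by_label) != 2:
        return list(chunk)  # nothing to balance with a single class
    small, large = sorted(by_label.values(), key=len)
    target = (len(small) + len(large)) // 2
    kept = rng.sample(large, target)  # under-sampling
    extra = [rng.choice(small) for _ in range(target - len(small))]  # over-sampling
    return kept + small + extra


class CentroidClassifier:
    """Toy base learner (assumption): predicts the class with the nearest centroid."""

    def fit(self, data):
        self.centroids = {}
        for label in {y for _, y in data}:
            rows = [x for x, y in data if y == label]
            self.centroids[label] = [mean(col) for col in zip(*rows)]
        return self

    def predict(self, x):
        def sq_dist(label):
            return sum((a - b) ** 2 for a, b in zip(x, self.centroids[label]))
        return min(self.centroids, key=sq_dist)


def balanced_accuracy(clf, chunk):
    """Mean per-class recall, so the minority class counts as much as the majority."""
    recalls = []
    for label in {y for _, y in chunk}:
        rows = [x for x, y in chunk if y == label]
        recalls.append(mean(1.0 if clf.predict(x) == label else 0.0 for x in rows))
    return mean(recalls)


class WeightedChunkEnsemble:
    """Keeps the best-scoring classifiers, one trained per rebalanced chunk."""

    def __init__(self, max_members=5, seed=0):
        self.max_members = max_members
        self.rng = random.Random(seed)
        self.members = []  # list of (weight, classifier) pairs

    def update(self, chunk):
        """Train on the new chunk, then re-weight every member on that chunk.
        Simplification: the newest member is scored on its own training chunk."""
        new_clf = CentroidClassifier().fit(rebalance(chunk, self.rng))
        candidates = [clf for _, clf in self.members] + [new_clf]
        scored = [(balanced_accuracy(clf, chunk), clf) for clf in candidates]
        scored.sort(key=lambda m: m[0], reverse=True)
        self.members = scored[: self.max_members]  # drop the weakest members

    def predict(self, x):
        """Weighted vote; call update() at least once before predicting."""
        votes = {}
        for weight, clf in self.members:
            label = clf.predict(x)
            votes[label] = votes.get(label, 0.0) + weight
        return max(votes, key=votes.get)


def make_chunk(rng, n_major, n_minor):
    """Synthetic 2-D chunk: class 0 near (0, 0), minority class 1 near (5, 5)."""
    major = [((rng.gauss(0, 0.5), rng.gauss(0, 0.5)), 0) for _ in range(n_major)]
    minor = [((rng.gauss(5, 0.5), rng.gauss(5, 0.5)), 1) for _ in range(n_minor)]
    return major + minor
```

A possible usage pattern is to call `update` once per arriving chunk, for example `ens = WeightedChunkEnsemble(max_members=3)` followed by `ens.update(make_chunk(random.Random(1), 95, 5))`, and then call `ens.predict(x)` on new instances. Re-weighting every member on the newest chunk is what lets such an ensemble discount classifiers trained before a concept drift; a real implementation would normally estimate the newest member's weight by cross-validation rather than on its own training data.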


