
Concept Drift Detection of Positive Class in Skewed Data Streams (倾斜数据流中正例样本的漂移检测方法)
Abstract: Concept drift is common in skewed data streams (SDS), yet most existing concept drift detection methods for data streams assume a balanced class distribution and are therefore ill-suited to skewed streams. To address this, the paper proposes CDPSD, a concept drift detection method for skewed data streams based on the distribution of positive instances. It first applies an improved resampling method, which avoids sampling instances from different concepts into the same data block, and builds the classifiers. It then detects concept drift and updates the classifiers by monitoring changes in the class distribution of the positive instances rather than of all instances. Experiments show that CDPSD detects concept drift in skewed data streams promptly and updates the classification model quickly, improving classification performance on the positive class.
Source: Journal of Frontiers of Computer Science and Technology (《计算机科学与探索》), CSCD, 2013, No. 6, pp. 545-550 (6 pages)
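Funding: National Natural Science Foundation of China, No. 60975034; Fundamental Research Funds for the Central Universities, Nos. 2011HGBZ1329 and 2011HGQC1013; Natural Science Foundation of Anhui Province, No. 1208085QF122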
Keywords: concept drift; skewed data streams; resample; classification; positive class
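
The idea of checking only the positive instances for drift can be illustrated with a minimal sketch. The abstract does not state which statistic CDPSD uses, so everything here is an assumption: the two-sample Kolmogorov-Smirnov test stands in for the paper's detector, each instance is reduced to a single numeric feature, and the names `ks_statistic` and `positive_drift` are hypothetical. The resampling step is not shown.

```python
import math


def ks_statistic(a, b):
    """Two-sample Kolmogorov-Smirnov statistic: largest gap between empirical CDFs."""
    a, b = sorted(a), sorted(b)
    i = j = 0
    d = 0.0
    while i < len(a) and j < len(b):
        x = min(a[i], b[j])
        # Advance both pointers past every value <= x, then compare the CDFs at x.
        while i < len(a) and a[i] <= x:
            i += 1
        while j < len(b) and b[j] <= x:
            j += 1
        d = max(d, abs(i / len(a) - j / len(b)))
    return d


def positive_drift(ref_block, new_block, coeff=1.36):
    """Flag drift when the positive instances of two data blocks differ in distribution.

    Each block is a list of (feature_value, label) pairs, with label 1 = positive.
    coeff=1.36 is the usual KS critical coefficient for a 5% significance level.
    This is an illustrative stand-in, not the test used by CDPSD.
    """
    ref_pos = [x for x, y in ref_block if y == 1]
    new_pos = [x for x, y in new_block if y == 1]
    if not ref_pos or not new_pos:
        return False  # nothing to compare without positives in both blocks
    n, m = len(ref_pos), len(new_pos)
    threshold = coeff * math.sqrt((n + m) / (n * m))
    return ks_statistic(ref_pos, new_pos) > threshold


# Skewed blocks: 20 positives against 200 negatives.
positives = [(i / 10, 1) for i in range(20)]
negatives = [(5 + i / 10, 0) for i in range(200)]
reference = positives + negatives

# Only the negatives move: the positive-only check ignores this change.
negatives_moved = positives + [(x + 2, y) for x, y in negatives]
# The positives move: the check fires.
positives_moved = [(x + 3, y) for x, y in positives] + negatives

print(positive_drift(reference, negatives_moved))
print(positive_drift(reference, positives_moved))
```

In this sketch a shift confined to the majority class leaves the detector silent, while a shift in the few positive instances triggers it, which is the behaviour the abstract motivates for skewed streams.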
