
Survey of Online Learning Algorithms for Streaming Data Classification

Cited by: 25
Abstract: Streaming data classification aims to incrementally learn a decision function that maps input variables to a label variable from continuously arriving streaming data, so that test data arriving at any time can be classified accurately. The online learning paradigm, as an incremental machine learning technique, is an effective tool for streaming data classification. This paper surveys, from the perspective of online learning, the state of research on algorithms for streaming data classification. Specifically, it first introduces the basic framework and the performance evaluation methodology of online learning. It then reviews online learning algorithms for general streaming data, for alleviating the "curse of dimensionality" problem in high-dimensional streaming data, and for handling the "concept drifting" problem in evolving streaming data. Finally, it discusses the remaining challenges and promising research directions for classifying high-dimensional and evolving streaming data.
Authors: ZHAI Ting-Ting (翟婷婷); GAO Yang (高阳); ZHU Jun-Wu (朱俊武) (School of Information Engineering, Yangzhou University, Yangzhou 225127, China; State Key Laboratory for Novel Software Technology (Nanjing University), Nanjing 210023, China)
Source: Journal of Software (《软件学报》), indexed in EI, CSCD, and the Peking University Core Journals list; 2020, No. 4, pp. 912-931 (20 pages)
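Funding: National Key Research and Development Program of China (2017YFB0702600, 2017YFB0702601); National Natural Science Foundation of China (61906165, 61432008, 61872313); Natural Science Research Project of Jiangsu Higher Education Institutions (19KJB520064).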
Keywords: online learning; streaming data classification; curse of dimensionality; concept drifting; sparse online learning; evolving data stream classification
