A novel pattern classification method for imbalanced data set based on manifold embedded over-sampling

Cited by: 11
Abstract: Normal and abnormal samples in industrial monitoring data are usually imbalanced. Traditional over-sampling methods often fail to give satisfactory pattern classification results when the imbalanced data are nonlinear, high-dimensional, and noisy. Exploiting the nonlinear dimensionality reduction of manifold learning, this paper proposes a manifold embedded over-sampling method, which provides a unified framework for imbalanced-data pattern classification methods that combine manifold learning with over-sampling. Unlike methods that map generated samples from the manifold embedding space back to the original data space, the proposed method trains the classifier directly on the low-dimensional embedding of the over-sampled, balanced data, which reduces the cost of the inverse mapping and of classification. In addition, manifold learning preserves the structure of the original data, so over-sampling in the manifold embedding space yields nonlinear interpolation that better matches the characteristics of the original data. On two imbalanced industrial monitoring data sets of different size and characteristics, the TE process and mine micro-seism, the proposed method increased the average F1 value by 21.94% and 37.34% and the average AUC value by 37.85% and 10.64%, respectively. The results indicate that the method gives stable, good classification performance on imbalanced pattern classification problems with relatively large data sizes.
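The over-sampling step described in the abstract can be illustrated with a minimal sketch. The abstract does not say which manifold learning or over-sampling algorithm is used, so this example assumes SMOTE-style linear interpolation between a minority point and one of its k nearest minority neighbours, applied to coordinates that are assumed to have already been embedded by some manifold learning method. The function name `smote_in_embedding`, its parameters, and the sample coordinates are hypothetical and are not taken from the paper.

```python
import math
import random


def smote_in_embedding(minority, n_new, k=3, seed=0):
    """Generate n_new synthetic points by SMOTE-style interpolation.

    `minority` is a list of low-dimensional coordinate tuples, assumed to be
    the manifold embedding of the minority (abnormal) class.
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority points to interpolate")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # k nearest minority neighbours of the base point (excluding itself)
        others = [p for j, p in enumerate(minority) if j != i]
        others.sort(key=lambda p: math.dist(base, p))
        neighbour = rng.choice(others[:k])
        # random point on the segment between base and neighbour
        gap = rng.random()
        synthetic.append(
            tuple(b + gap * (n - b) for b, n in zip(base, neighbour))
        )
    return synthetic


# Hypothetical 2-D embedded coordinates (illustrative values, not from the paper).
majority = [(0.1 * i, 0.05 * i) for i in range(20)]
minority = [(2.0, 2.0), (2.2, 2.1), (2.1, 2.4), (2.4, 2.2)]

# Add enough synthetic minority points to match the majority class size.
new_points = smote_in_embedding(minority, n_new=len(majority) - len(minority))
balanced_minority = minority + new_points
print(len(majority), len(balanced_minority))
```

Following the abstract, a classifier would then be trained directly on `majority` and `balanced_minority` in the embedded space, so the synthetic points never need to be mapped back to the original data space.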
Authors: CHENG Jian; YANG Lingkai; CUI Ning; GUO Yinan (School of Information and Control Engineering, China University of Mining and Technology, Xuzhou, Jiangsu, 221116, China)
Source: Journal of China University of Mining & Technology (indexed in EI, CAS, CSCD, Peking University Core Journals), 2018, No. 6, pp. 1325-1333 (9 pages)
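Funding: National Key Research and Development Program of China (2016YFC0801406); National Natural Science Foundation of China (61573361); Six Talent Peaks Project of Jiangsu Province (2017-DZXX-046); Frontier Discipline Research Special Project of China University of Mining and Technology (2015XKQY19)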
Keywords: manifold learning; over-sampling; imbalanced data; pattern classification; TE process; mine micro-seism
