Abstract
Anomaly detection in industrial control systems suffers from class imbalance, which makes it difficult for general-purpose classifiers to identify abnormal data accurately. Sampling methods are commonly used to balance the classes and thereby improve classifier performance. However, traditional sampling methods are sensitive to the characteristics of the data set, so their sampling effect is unstable and anomaly detection accuracy fluctuates widely. Based on the generative adversarial network (GAN), this paper proposes a GAN-Cross sampling model. The model learns the probability distribution of the target data and generates data with a similar distribution, thereby improving the balance of the data set. To achieve better feature extraction, a cross layer is added to both the generator and the discriminator. Finally, the model is combined with four classic classifiers (random forest, K-nearest neighbor, Gaussian naive Bayes, and support vector machine) and compared with four other conventional sampling methods on four public class-imbalanced data sets. Experimental results show that, compared with traditional sampling methods, the model significantly improves the classifiers' anomaly detection performance on class-imbalanced data.
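The abstract states that a cross layer is added to the generator and the discriminator but does not give its form. As a rough illustration only, the sketch below assumes the cross layer follows the Deep & Cross Network formulation, x_{l+1} = x_0 (x_l · w) + b + x_l; this is an assumption, not something the record confirms. The function name `cross_layer` and all input values are hypothetical.

```python
from typing import List


def cross_layer(x0: List[float], xl: List[float],
                w: List[float], b: List[float]) -> List[float]:
    """One cross layer in the Deep & Cross Network style (an assumed form).

    Computes x_{l+1} = x0 * (xl . w) + b + xl, where (xl . w) is a scalar.
    """
    if not (len(x0) == len(xl) == len(w) == len(b)):
        raise ValueError("x0, xl, w and b must all have the same length")
    # Scalar projection of the current layer input onto the weight vector.
    s = sum(xl_i * w_i for xl_i, w_i in zip(xl, w))
    # Scale the original input by that scalar, then add the bias and a
    # residual connection back to the current layer input.
    return [x0_i * s + b_i + xl_i for x0_i, b_i, xl_i in zip(x0, b, xl)]


# Hypothetical 3-feature input with hand-picked weights, for illustration only.
x0 = [1.0, 2.0, 3.0]
w = [1.0, 0.0, 0.5]
b = [1.0, 0.0, -1.0]

x1 = cross_layer(x0, x0, w, b)  # first cross layer: the input crosses itself
x2 = cross_layer(x0, x1, w, b)  # second layer crosses x1 with the original x0
```

Under this assumed form, each layer multiplies the original input by a scalar derived from the current features, so stacking layers yields higher-order feature interactions at a cost of only two vectors of parameters per layer.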
Authors
顾兆军
刘婷婷
高冰
隋翯
GU Zhaojun; LIU Tingting; GAO Bing; SUI He (Information Security Evaluation Center, Civil Aviation University of China, Tianjin 300300, China; College of Computer Science and Technology, Civil Aviation University of China, Tianjin 300300, China; College of Aeronautical Engineering, Civil Aviation University of China, Tianjin 300300, China)
Source
《信息网络安全》
CSCD
Peking University Core Journal
2022, Issue 8, pp. 81-89 (9 pages)
Netinfo Security
Funding
National Natural Science Foundation of China [61601467]
Civil Aviation Safety Capacity Building Fund [PESA2019073, PESA2019074, PESA2020100]
Keywords
industrial control system
imbalanced data
generative adversarial network
sampling method
anomaly detection