
Intrusion Detection System Using Random Forests Classifier and Information Gain
Cited by: 4
Abstract: Many misuse detection systems cannot detect unknown attacks. Anomaly detection systems can detect unknown attacks accurately, but intrusion events are inherently far rarer than normal events, and this severe class imbalance makes it difficult to detect intrusions efficiently with machine learning methods. To address this, an intrusion detection system based on information gain and a random forest classifier is proposed. To reduce the class imbalance, the Synthetic Minority Over-sampling Technique (SMOTE) is applied to the training set. A feature selection method based on information gain is proposed and used to build a reduced feature subset of the data set. First, the random forest algorithm builds an intrusion model from the training set, yielding a misuse detection model that detects known attacks by matching the features of network connections. Then, using the features retained by information-gain feature selection, the network connection data of uncertain attacks are clustered with the random forest, which enables the detection of unknown attacks. The experiments use the NSL-KDD intrusion detection data set, an enhanced version of the KDD CUP 99 data set; reflecting the nature of intrusion detection, its classes are highly imbalanced. The experimental results show that the random forest classifier combined with SMOTE and information-gain-based feature selection reaches a detection rate of 0.962 for minority classes.
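The abstract does not state how information gain is computed, so the following is only a minimal sketch of the standard definition, IG(Y; X) = H(Y) - H(Y | X), applied to discrete features. The helper names (`entropy`, `information_gain`, `rank_features`) and the four-row toy data are assumptions made for illustration; they are not the authors' implementation or results.

```python
import math
from collections import Counter, defaultdict


def entropy(labels):
    """Shannon entropy, in bits, of a sequence of class labels."""
    total = len(labels)
    counts = Counter(labels)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def information_gain(feature_values, labels):
    """IG(Y; X) = H(Y) - H(Y | X) for a single discrete feature X."""
    # Group the class labels by the value the feature takes.
    groups = defaultdict(list)
    for value, label in zip(feature_values, labels):
        groups[value].append(label)
    total = len(labels)
    # H(Y | X): entropy of each group, weighted by the group's share of rows.
    conditional = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - conditional


def rank_features(rows, labels, feature_names):
    """Return (feature name, information gain) pairs, highest gain first."""
    scores = {
        name: information_gain([row[i] for row in rows], labels)
        for i, name in enumerate(feature_names)
    }
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


# Hypothetical toy data: two categorical features and a binary label.
feature_names = ["protocol_type", "flag"]
rows = [("tcp", "SF"), ("tcp", "S0"), ("udp", "SF"), ("udp", "S0")]
labels = ["normal", "attack", "normal", "attack"]

ranking = rank_features(rows, labels, feature_names)
for name, gain in ranking:
    print(f"{name}: {gain:.3f}")
```

In this toy data `flag` separates the two classes perfectly while `protocol_type` carries no information about the label, so a reduced feature subset would keep `flag` first. Continuous features would need to be discretized before this kind of scoring.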
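Authors: WEI Jintai (魏金太), GAO Qiong (高穹)
Source: Journal of North University of China (Natural Science Edition), CAS, 2018, No. 1, pp. 74-79, 88 (7 pages)
Funding: National Natural Science Foundation of China (11404398); Key Research Project of the Henan Provincial Department of Science and Technology (142102210097)
Keywords: network security; intrusion detection (IDS); random forest; feature selection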