Broad-spectrum feature selection and evaluation for malicious URL detection


Abstract: To address feature selection and dimensionality reduction in malicious URL detection systems, several feature subsets are proposed based on the optimized results of feature selection methods. Using classifier-based performance metrics such as accuracy and recall, four machine learning methods (random forest, Bayesian network, J48, and random tree) are used to evaluate the feature subsets determined by multiple feature selection algorithms: information gain, the chi-square test, information gain ratio, and algorithms based on the Relief value, the OneR classifier, correlation rules, and correlation attribute evaluation. The results show that, except for the subset determined by the algorithm based on correlation attribute evaluation, the feature subsets determined by the other algorithms all achieve good classification performance. In particular, the subset selected by the correlation-rule-based algorithm has only 5 dimensions, yet every classifier achieves a classification accuracy above 99% on it.
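As an illustration of two of the ranking criteria named in the abstract (information gain and information gain ratio) and of the accuracy and recall metrics, the following is a minimal sketch in plain Python. It is not the authors' implementation, and the record does not say which toolkit the paper used. The feature names (`has_ip_host`, `long_url`, `uses_https`) and the toy data are hypothetical and chosen only to make the ranking easy to follow.

```python
import math
from collections import Counter


def entropy(values):
    """Shannon entropy (in bits) of a sequence of discrete values."""
    total = len(values)
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(values).values())


def information_gain(feature, labels):
    """H(labels) - H(labels | feature) for one discrete feature column."""
    total = len(labels)
    groups = {}
    for value, label in zip(feature, labels):
        groups.setdefault(value, []).append(label)
    conditional = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - conditional


def gain_ratio(feature, labels):
    """Information gain normalised by the entropy of the feature itself."""
    split_info = entropy(feature)
    return information_gain(feature, labels) / split_info if split_info > 0 else 0.0


def rank_features(columns, labels, score=information_gain):
    """Return feature names sorted from most to least informative."""
    return sorted(columns, key=lambda name: score(columns[name], labels),
                  reverse=True)


def accuracy_and_recall(y_true, y_pred, positive=1):
    """Accuracy over all samples and recall for the positive class."""
    pairs = list(zip(y_true, y_pred))
    correct = sum(t == p for t, p in pairs)
    tp = sum(t == positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    recall = tp / (tp + fn) if tp + fn else 0.0
    return correct / len(pairs), recall


# Hypothetical toy data: 1 = malicious URL, 0 = benign URL.
labels = [1, 1, 1, 0, 0, 0]
columns = {
    "has_ip_host": [1, 1, 1, 0, 0, 0],  # separates the classes perfectly
    "long_url":    [1, 1, 0, 1, 0, 0],  # weakly informative
    "uses_https":  [1, 0, 1, 1, 0, 1],  # uninformative on this sample
}

if __name__ == "__main__":
    for name in rank_features(columns, labels):
        print(name,
              round(information_gain(columns[name], labels), 4),
              round(gain_ratio(columns[name], labels), 4))
```

A feature subset would then be formed by keeping the top-ranked names and retraining each classifier on those columns only; the subset size is a tuning choice that this sketch does not fix.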

Authors: ZHANG Hui; QIAN Liping; WANG Lidong; YUAN Chen; ZHANG Ting (College of Electrical and Information Engineering, Beijing University of Civil Engineering and Architecture, Beijing 100044, China)

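Affiliation: College of Electrical and Information Engineering, Beijing University of Civil Engineering and Architecture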

Source: Modern Electronics Technique (《现代电子技术》), Peking University Core Journal, 2019, No. 9, pp. 60-64 (5 pages)

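Funding: National Natural Science Foundation of China (61571144); Doctoral Fund of Beijing University of Civil Engineering and Architecture (00331616014)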

Keywords: network security; malicious URL detection; feature extraction; feature selection; feature subset; information security

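Classification: TN915.08-34 [Electronics and Telecommunications: Communication and Information Systems]; TP391 [Automation and Computer Technology: Computer Application Technology]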





