Abstract
To address the low accuracy and poor real-time performance of traditional URL (uniform resource locator) detection methods when detecting malicious URLs, an algorithm model based on Stacking ensemble learning is proposed. The model uses three single machine learning methods (ridge classification, support vector machine, and naive Bayes) as primary learners and logistic regression as the secondary learner. URLs are detected by the two-layer structure that combines the primary and secondary learners. A large amount of URL data is used to train both the single-method models and the Stacking ensemble model, and each model is evaluated. The evaluation results show that the Stacking ensemble model reaches an accuracy of 98.75% in malicious URL detection, at least 0.75% higher than the other models. A malicious URL detection system is implemented with Flask as the development framework and deployed to the cloud and other environments. Given a URL entered by the user, the system outputs the detection result for that URL, which gives it good application value.
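The two-layer structure described in the abstract can be sketched with scikit-learn's `StackingClassifier`. This is a minimal illustration, not the authors' implementation: the abstract does not state how URLs are turned into features, so the character n-gram TF-IDF step, the default hyperparameters, the choice of `LinearSVC` and `MultinomialNB` as the concrete SVM and naive Bayes variants, the function name `build_stacking_url_classifier`, and the toy URLs are all assumptions made for the example.

```python
from sklearn.ensemble import StackingClassifier
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression, RidgeClassifier
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC

# Made-up toy data for illustration only (1 = malicious, 0 = benign).
TOY_URLS = [
    "http://secure-login.example-bank.xyz/verify?id=123",
    "http://free-prizes.example.top/win.exe",
    "http://192.0.2.10/update/account/confirm.php",
    "http://paypa1-example.info/signin/session",
    "https://www.python.org/downloads/",
    "https://en.wikipedia.org/wiki/URL",
    "https://scikit-learn.org/stable/",
    "https://www.example.com/about",
]
TOY_LABELS = [1, 1, 1, 1, 0, 0, 0, 0]


def build_stacking_url_classifier(cv=5):
    """Build a two-layer Stacking model over raw URL strings.

    Primary learners: ridge classifier, linear SVM, naive Bayes.
    Secondary learner: logistic regression, trained on the
    cross-validated outputs of the primary learners.
    """
    primary_learners = [
        ("ridge", RidgeClassifier()),
        ("svm", LinearSVC()),
        ("nb", MultinomialNB()),
    ]
    stack = StackingClassifier(
        estimators=primary_learners,
        final_estimator=LogisticRegression(),
        cv=cv,  # folds used to produce the secondary learner's training inputs
    )
    # Assumed feature step: character 1- to 3-gram TF-IDF of the URL string.
    vectorizer = TfidfVectorizer(analyzer="char", ngram_range=(1, 3))
    return make_pipeline(vectorizer, stack)


if __name__ == "__main__":
    # cv=2 only because the toy dataset is tiny.
    model = build_stacking_url_classifier(cv=2)
    model.fit(TOY_URLS, TOY_LABELS)
    print(model.predict(["http://example.com/account/verify"]))
```

With so few samples the predictions carry no meaning; the sketch only shows how the primary learners' cross-validated outputs become the inputs of the logistic regression layer.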
Authors
张永刚
吕鹏飞
张悦
姚兴博
冯艳丽
ZHANG Yonggang; LÜ Pengfei; ZHANG Yue; YAO Xingbo; FENG Yanli (State Grid Inner Mongolia East Power Co., Ltd., Hohhot 010020, China; College of Artificial Intelligence, Nanjing Agricultural University, Nanjing 210095, China)
Source
《现代电子技术》
2023, No. 10, pp. 105-109 (5 pages)
Modern Electronics Technique