Abstract
Growing software complexity poses major challenges to software security. As software grows in scale and vulnerabilities take increasingly diverse forms, traditional vulnerability mining methods, which suffer from high false-positive and false-negative rates, can no longer meet the security analysis needs of complex software. In recent years, with the rise of the artificial intelligence industry, many machine learning methods have been applied to software vulnerability mining. This paper first reviews existing research on machine-learning-based software vulnerability mining and summarizes its technical characteristics and workflow. Then, starting from the core step of extracting features from raw data, it classifies existing work by the form of code representation used and compares the approaches systematically. Finally, based on this review, it discusses the challenges facing machine-learning-based software vulnerability mining and outlines the field's likely development trends.
Authors
李韵
黄辰林
王中锋
袁露
王晓川
LI Yun; HUANG Chen-Lin; WANG Zhong-Feng; YUAN Lu; WANG Xiao-Chuan (College of Computer Science and Technology, National University of Defense Technology, Changsha 410073, China; PLA 61302 Troops, Beijing 100016, China)
Source
《软件学报》
EI
CSCD
Peking University Core Journals (北大核心)
2020, Issue 7, pp. 2040-2061 (22 pages)
Journal of Software
Funding
National Key Research and Development Program of China (2018YFB0803501).
Keywords
machine learning
vulnerability mining
code representation
software quality
deep learning