Abstract
Automatic program repair has become an active research topic in recent years and has made some progress. Most existing automatic program repair methods rely on test suites to validate patch correctness. However, validating the large number of candidate patches these methods generate against a test suite is costly, and an imperfect test suite also leads to patch overfitting. Improving the efficiency of patch validation and validating patch correctness effectively have therefore become pressing problems. To reduce validation cost and raise the proportion of correct patches, this paper proposes an approach that combines two embedding techniques to validate patch correctness. The approach first uses Doc2Vec to compute the similarity between each patch and the buggy code, and then applies a BERT-based classifier to filter incorrect patches out of those retained by the similarity screening. To evaluate the approach, experiments are carried out on five open-source Java defect benchmarks. The results show that the approach validates patch correctness effectively and improves validation efficiency.
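The two-stage flow described in the abstract (similarity screening, then classifier filtering) can be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' implementation: the token-count `embed` function is a stand-in for Doc2Vec, the `is_correct` callable is a hypothetical placeholder for the BERT-based classifier, and both the `threshold` value and the choice to keep patches at or above it are assumptions not given in the abstract.

```python
import math
from collections import Counter
from typing import Callable, List


def embed(code: str) -> Counter:
    """Stand-in embedding: whitespace-token counts (NOT Doc2Vec)."""
    return Counter(code.split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def validate_patches(
    buggy_code: str,
    patches: List[str],
    is_correct: Callable[[str, str], bool],
    threshold: float = 0.5,  # assumed value, not from the paper
) -> List[str]:
    """Two-stage filter: similarity screening, then a classifier."""
    bug_vec = embed(buggy_code)
    # Stage 1: keep patches whose similarity to the buggy code
    # reaches the (assumed) threshold.
    screened = [p for p in patches if cosine(bug_vec, embed(p)) >= threshold]
    # Stage 2: a classifier (hypothetical placeholder for the
    # BERT-based model) removes patches it judges incorrect.
    return [p for p in screened if is_correct(buggy_code, p)]
```

In a fuller version, `embed` would be replaced by a trained Doc2Vec model and `is_correct` by a fine-tuned BERT classifier; running the cheap similarity stage first means the more expensive classifier only sees the patches that survive screening.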
Authors
HUANG Ying (黄颖)
JIANG Shu-juan (姜淑娟)
JIANG Ting-ting (蒋婷婷)
Engineering Research Center of Mine Digitalization of Ministry of Education, China University of Mining and Technology, Xuzhou, Jiangsu 221116, China; School of Computer Science and Technology, China University of Mining and Technology, Xuzhou, Jiangsu 221116, China
Source
Computer Science (《计算机科学》)
CSCD
PKU Core Journal (北大核心)
2022, No. 11, pp. 83-89 (7 pages)
Funding
National Natural Science Foundation of China (61673384).
Keywords
Automatic program repair
Patch validation
Code similarity
Embedding technology