Abstract
Existing drift detection algorithms are not suited to drift in a single firing sequence, so a sudden drift detection method based on changes in activity distance is proposed. First, the activity relation matrix of each sliding window is extracted to obtain the feature vectors of the relations. Second, to reduce the dimensionality of the relation matrix, the Jaccard distance between activities across sliding windows is computed, converting the activity relation matrix into a Jaccard distance distribution matrix. Then, Kullback-Leibler (KL) divergence is used to compare the change in probability distribution between adjacent distance matrices and locate the drift interval. Finally, to resolve the uncertainty caused by the choice of granularity, the positions of loop relations are used in turn as window sizes, and the intersection of the resulting drift intervals is taken to locate the drift point. The method was evaluated on a simulated dataset covering 12 change patterns, each with 5 logs of different sizes, and on a real dataset of execution logs from two software repositories. The results show that the method can effectively locate sudden drift in a single firing sequence.
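The two distance measures named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes the "relation" is the directly-follows pair of activities, uses non-overlapping fixed-size windows, and applies a small smoothing constant `eps` so the KL divergence stays finite. All function names are hypothetical.

```python
import math
from collections import Counter


def directly_follows(window):
    """Return the list of (a, b) pairs where b immediately follows a."""
    return list(zip(window, window[1:]))


def jaccard_distance(s, t):
    """1 - |s & t| / |s | t|; defined as 0.0 when both sets are empty."""
    union = s | t
    if not union:
        return 0.0
    return 1.0 - len(s & t) / len(union)


def kl_divergence(p, q, eps=1e-9):
    """KL(p || q) over the union of keys, with additive smoothing (assumption)."""
    keys = set(p) | set(q)
    p_total = sum(p.values()) + eps * len(keys)
    q_total = sum(q.values()) + eps * len(keys)
    result = 0.0
    for k in keys:
        pk = (p.get(k, 0) + eps) / p_total
        qk = (q.get(k, 0) + eps) / q_total
        result += pk * math.log(pk / qk)
    return result


def split_windows(sequence, size):
    """Cut the sequence into consecutive non-overlapping windows of `size`."""
    return [sequence[i:i + size] for i in range(0, len(sequence) - size + 1, size)]


def adjacent_scores(sequence, size):
    """For each pair of adjacent windows, return (Jaccard distance, KL divergence)."""
    wins = split_windows(sequence, size)
    scores = []
    for left, right in zip(wins, wins[1:]):
        rel_left, rel_right = directly_follows(left), directly_follows(right)
        jd = jaccard_distance(set(rel_left), set(rel_right))
        kl = kl_divergence(Counter(rel_left), Counter(rel_right))
        scores.append((jd, kl))
    return scores
```

Calling `adjacent_scores(list("abcabcabcabc" + "acbacbacbacb"), 6)` scores three window boundaries. The middle one, where the ordering of activities changes, gets the highest Jaccard distance and KL divergence. Repeating the scan with several window sizes and intersecting the flagged intervals would mirror the granularity step described in the abstract.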
Authors
原佳怡
朱锐
林雷蕾
李彤
郑明
YUAN Jiayi; ZHU Rui; LIN Leilei; LI Tong; ZHENG Ming (School of Software, Yunnan University, Kunming 650091, China; Key Laboratory in Software Engineering of Yunnan Province, Kunming 650091, China; School of Software, Tsinghua University, Beijing 100084, China; School of Big Data, Yunnan Agricultural University, Kunming 650201, China; School of Information, Yunnan University, Kunming 650500, China; College of Teacher Education, Shanxi Normal University, Taiyuan 030092, China)
Source
《计算机集成制造系统》
EI
CSCD
Peking University Core Journal
2021, Issue 9, pp. 2636-2646 (11 pages)
Computer Integrated Manufacturing Systems
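Funding
National Natural Science Foundation of China (62002310)
Yunnan Provincial Major Science and Technology Special Program (202002AD080002)
Yunnan Provincial Natural Science Foundation, Basic Research General Program (202101AT070004, 2019FB135)
Open Fund of the Key Laboratory in Software Engineering of Yunnan Province (2020SE404)
Yunnan University Data-Driven Software Engineering Provincial Science and Technology Innovation Team (2017HC012)
Yunnan University "Donglu Young and Middle-aged Backbone Teachers" Training Program (C176220200)
Yunnan Philosophy and Social Sciences Youth Project (QN2020024)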
Keywords
sudden drift
single firing sequence
Jaccard distance
Kullback-Leibler divergence
drift detection algorithm