Abstract
Model mining, one of the most active areas of process mining, aims to generate models that describe business processes from event logs. Event logs may contain activities with decomposable cyclic dependencies. Such activities can neither be removed by filtering infrequent activities nor treated as chaotic activities, and they lower the precision of the mined process models. Existing methods cannot split a noisy event log according to the presence or absence of cyclic structures, so they cannot correctly identify activities with decomposable cyclic dependencies on the sub-log without cyclic structures; they also depend on activity attributes. To overcome these shortcomings and improve the quality of the mined models, a decomposed process model mining framework that separates cyclic structures from decomposable cyclic dependencies is proposed. First, a heuristic method divides the event log into two parts, one with and one without cyclic structures, and activities with decomposable cyclic dependencies are identified in the part without cyclic structures according to the frequencies of the reachability relations and directly-follows relations between activities. Next, these activities are filtered out of the part with cyclic structures so that the cyclic structures of the event log can be identified and a set of sub-logs obtained by projection. Finally, existing process model mining methods are used to mine sub-models, which are then merged according to the branch-structure relations of their boundary activities. The method is implemented on the ProM platform and compared on public event logs with the direct use of Inductive Miner, the maximal partition framework, and the stage-based business process model mining method. The experimental results show that precision improves by 0.08-0.42 and complexity is reduced by 3.86-45.92.
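The first step of the framework can be illustrated with a small Python sketch. It is not the authors' implementation (which runs on ProM), and the abstract does not spell out the heuristic or any thresholds. The sketch assumes that a trace "has a cyclic structure" when some activity repeats in it, a simplification that ignores noise, and it counts a reachability relation once per trace. The function names and the toy log are hypothetical.

```python
from collections import Counter


def has_repeat(trace):
    """Assumed proxy for a cyclic structure: some activity occurs more than once."""
    return len(set(trace)) < len(trace)


def split_log(log):
    """Partition traces into (without repeated activities, with repeated activities)."""
    acyclic = [t for t in log if not has_repeat(t)]
    cyclic = [t for t in log if has_repeat(t)]
    return acyclic, cyclic


def directly_follows(log):
    """Count how often activity b occurs immediately after activity a."""
    counts = Counter()
    for trace in log:
        counts.update(zip(trace, trace[1:]))
    return counts


def reachable(log):
    """Count the traces in which b occurs somewhere after a (once per trace)."""
    counts = Counter()
    for trace in log:
        pairs = {(a, b) for i, a in enumerate(trace) for b in trace[i + 1:]}
        counts.update(pairs)
    return counts


# Toy event log: each trace is a tuple of activity names.
log = [
    ("a", "b", "c", "d"),
    ("a", "c", "b", "d"),
    ("a", "b", "c", "b", "c", "d"),
]
acyclic, cyclic = split_log(log)
df = directly_follows(acyclic)
reach = reachable(acyclic)
```

On the sub-log without repeats, `b` and `c` reach each other in different traces (`reach[("b", "c")]` and `reach[("c", "b")]` are both nonzero). Mutual reachability of this kind in a loop-free sub-log is the sort of frequency signal the abstract describes; the actual decision rule is defined in the paper itself.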
Authors
WANG Kang (王康)
LIU Cong (刘聪)
WANG Lu (王路)
ZENG Qingtian (曾庆田)
Affiliations: School of Electronic Information Engineering, Shandong University of Science and Technology, Qingdao 266590, Shandong, China; School of Computer Science and Technology, Shandong University of Technology, Zibo 255000, Shandong, China; College of Computer Science and Engineering, Shandong University of Science and Technology, Qingdao 266590, Shandong, China
Source
Computer Engineering (《计算机工程》)
CAS
CSCD
PKU Core (北大核心)
2023, No. 11, pp. 94-105, 114 (13 pages in total)
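Funding
National Natural Science Foundation of China (61902222)
Taishan Scholars Program of Shandong Province, special fund (ts20190936, tsqn201909109)
Shandong Provincial Natural Science Foundation, Excellent Young Scientists Fund (ZR2021YQ45)
Shandong Provincial Higher Education Youth Innovation Science and Technology Program, Innovation Team Project (QC2021948080)
Ministry of Education Humanities and Social Sciences Research Youth Fund Project (20YJCZH159)
Shandong Provincial Natural Science Foundation, Youth Fund (ZR2022QF020)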