Abstract
Process mining discovers structured process descriptions from the logs of actual business executions. It is widely used for process discovery and as an aid to process modeling, and it can help improve existing business processes through delta analysis. How to handle duplicate tasks in a process model is a key problem for process mining. This paper proposes a duplicate-task treatment stage that runs before a standard process mining algorithm; the stage is compatible with existing process mining algorithms and enables them to handle duplicate tasks. The paper also defines a distance measure that expresses differences in the context of event records numerically, so that clustering can be used to identify duplicate tasks in the input data. Simulation experiments on typical process models containing duplicate tasks show that the proposed method performs well.
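The preprocessing idea can be illustrated with a minimal sketch. It is not the paper's method: the abstract does not give the distance definition, so the sketch assumes that an event's context is just its immediate predecessor and successor, that the distance is the fraction of those two neighbours that differ, and that occurrences are grouped by simple threshold-based single-linkage clustering. The names `contexts`, `distance`, `cluster`, `relabel_duplicates` and the `#k` suffix are all hypothetical.

```python
from collections import defaultdict

START, END = "<start>", "<end>"  # hypothetical trace-boundary markers


def contexts(log):
    """Collect (trace index, position, predecessor, successor) per task label."""
    occ = defaultdict(list)
    for t, trace in enumerate(log):
        for i, task in enumerate(trace):
            pred = trace[i - 1] if i > 0 else START
            succ = trace[i + 1] if i + 1 < len(trace) else END
            occ[task].append((t, i, pred, succ))
    return occ


def distance(a, b):
    """Assumed context distance: fraction of differing neighbours (0, 0.5 or 1)."""
    return ((a[2] != b[2]) + (a[3] != b[3])) / 2


def cluster(items, threshold):
    """Single-linkage clustering: link occurrences whose distance <= threshold."""
    parent = list(range(len(items)))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path compression
            x = parent[x]
        return x

    for i in range(len(items)):
        for j in range(i + 1, len(items)):
            if distance(items[i], items[j]) <= threshold:
                parent[find(i)] = find(j)
    return [find(i) for i in range(len(items))]


def relabel_duplicates(log, threshold=0.5):
    """Return a copy of the log in which each context cluster of a label
    gets its own name, so a standard mining algorithm sees distinct tasks."""
    out = [list(trace) for trace in log]
    for task, items in contexts(log).items():
        roots = cluster(items, threshold)
        ids = {}
        for r in roots:
            ids.setdefault(r, len(ids) + 1)
        if len(ids) > 1:  # more than one cluster: treat as duplicate tasks
            for (t, i, _, _), r in zip(items, roots):
                out[t][i] = f"{task}#{ids[r]}"
    return out


log = [["A", "B", "C", "A", "D"], ["A", "B", "C", "A", "D"]]
relabeled = relabel_duplicates(log)
```

In this toy log the two occurrences of `A` share no neighbours, so they fall into separate clusters and are renamed, while `B`, `C` and `D` are left unchanged. The relabeled log could then be passed to any mining algorithm that assumes unique task labels.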
Source
《中国科学院研究生院学报》
CAS
CSCD
PKU Core Journals (北大核心)
2009, Issue 1, pp. 107-113 (7 pages)
Journal of the Graduate School of the Chinese Academy of Sciences
Funding
Supported by the Ministry of Science and Technology of the People's Republic of China (2005DKA64100, 2005DKA10201)
Keywords
process mining, duplicate tasks, clustering