Abstract
Large IT systems occasionally experience faults or anomalies, and restoring service promptly requires a way to locate the root cause of alarms quickly. The most common approach to alarm root cause analysis is association rule mining, but ordinary association rules have shortcomings on large data sets: they can yield invalid alarm rules that are statistically correlated but logically unrelated. This paper proposes a method for computing alarm root causes based on positive and negative association rules. The method clusters and compresses error logs and alarms, incorporates the topological relationships between machines, and mines the positive and negative association between abnormal events to find the pairwise relationships that serve as the basis for root cause judgments. On the experimental data, the method yields fairly accurate root cause analysis results, which shows that the algorithm can reduce redundant and invalid rules and improve mining efficiency.
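The abstract does not give the formulas used, so the following is only a minimal sketch of how pairwise positive and negative association between alarms could be scored. It assumes the common lift-based formulation (lift above 1 suggests a positive rule A -> B, lift below 1 suggests a negative rule A -> not B). The function name `pairwise_rules`, the thresholds, and the sample alarm names are hypothetical, and the clustering and topology steps described in the paper are left out.

```python
from itertools import permutations


def pairwise_rules(windows, min_support=0.2, min_confidence=0.6):
    """Score ordered alarm pairs as positive or negative association rules.

    windows: list of sets, each holding the alarm types seen in one time window.
    Returns tuples (antecedent, consequent, kind, confidence, lift).
    Assumption: lift = P(A,B) / (P(A) * P(B)) decides the rule polarity.
    """
    n = len(windows)
    alarms = sorted(set().union(*windows))
    # Fraction of windows in which each alarm type appears.
    p = {a: sum(a in w for w in windows) / n for a in alarms}
    rules = []
    for a, b in permutations(alarms, 2):
        p_ab = sum(a in w and b in w for w in windows) / n
        lift = p_ab / (p[a] * p[b])
        if lift > 1:
            # Candidate positive rule: a -> b.
            support, consequent, kind = p_ab, b, "positive"
        elif lift < 1:
            # Candidate negative rule: a -> not b.
            support, consequent, kind = p[a] - p_ab, "not " + b, "negative"
        else:
            continue  # Statistically independent: no rule in either direction.
        confidence = support / p[a]
        if support >= min_support and confidence >= min_confidence:
            rules.append((a, consequent, kind, confidence, lift))
    return rules


if __name__ == "__main__":
    # Hypothetical alarm windows, for illustration only.
    sample = [
        {"disk_full", "db_write_fail"},
        {"disk_full", "db_write_fail"},
        {"disk_full", "db_write_fail", "cpu_high"},
        {"cpu_high"},
        {"cpu_high", "net_loss"},
        {"net_loss"},
    ]
    for rule in pairwise_rules(sample):
        print(rule)
```

In a sketch like this, pairs with a strong positive rule would be candidates for a cause-and-effect link, while negative rules flag pairs that co-occur by chance and can be pruned. How the paper combines these scores with machine topology is not stated in the abstract.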
Source
《科技创新与应用》
2022, No. 5, pp. 158-160 (3 pages)
Technology Innovation and Application
Keywords
root cause analysis
positive and negative association rules
topological relationship