
Fuzzy multi-granularity anomaly detection for incomplete mixed data
Abstract Most existing anomaly detection methods cannot effectively handle incomplete mixed data. To address this problem, a fuzzy multi-granularity anomaly detection algorithm for incomplete mixed data, ADFIIS (Anomaly Detection in Fuzzy Incomplete Information System), is proposed. The algorithm accounts for missing values in both nominal and numeric attributes and can handle mixed-attribute data. First, the fuzzy similarity between attributes is defined. Second, the fuzzy entropy of each attribute is calculated, and multiple attribute sequences are constructed from the entropy values following a multi-granularity approach. Third, an anomaly score is calculated for each sample to characterize its degree of anomaly. Finally, the corresponding ADFIIS algorithm is designed and its complexity is analyzed. Experiments on publicly available datasets compare the proposed algorithm with mainstream outlier detection algorithms such as ILGNI (Incomplete Local and Global Neighborhood Information network). The results show that ADFIIS achieves better Receiver Operating Characteristic (ROC) curve performance on incomplete mixed datasets. The average Area Under the ROC Curve (AUC) of ADFIIS is better than that of 90% of the comparison methods, and it is 7 percentage points higher than that of ILGNI, which can also handle incomplete mixed data. The proposed algorithm uses the model expansion method to detect anomalies in incomplete datasets without changing the original datasets, which expands the application scope of anomaly detection.
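The first two steps of the abstract (a fuzzy similarity per attribute, then a fuzzy entropy used to order attributes) can be illustrated with a minimal sketch. This is not the authors' ADFIIS implementation: the abstract gives no formulas, so the sketch assumes a common fuzzy rough set formulation in which similarity is `1 - |x - y|` on min-max-normalized numeric values, exact match on nominal values, and a missing value (the hypothetical marker `MISSING`) is treated as fully similar to anything. The entropy used is the widely used fuzzy information entropy `H = -(1/n) * sum(log2(|[x_i]| / n))`, and all function names are illustrative.

```python
import math

MISSING = None  # hypothetical marker for a missing value


def fuzzy_similarity(column, numeric):
    """Return the n x n fuzzy similarity matrix induced by one attribute."""
    n = len(column)
    span = 1.0
    if numeric:
        known = [v for v in column if v is not MISSING]
        if known:
            # Fall back to 1.0 on constant columns to avoid division by zero.
            span = (max(known) - min(known)) or 1.0
    sim = [[1.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            a, b = column[i], column[j]
            if a is MISSING or b is MISSING:
                continue  # assumption: a missing value is compatible with anything
            if numeric:
                sim[i][j] = 1.0 - abs(a - b) / span
            else:
                sim[i][j] = 1.0 if a == b else 0.0
    return sim


def fuzzy_entropy(sim):
    """Fuzzy information entropy of a similarity matrix (assumed definition)."""
    n = len(sim)
    # sum(row) is the fuzzy cardinality of the granule around one sample.
    return -sum(math.log2(sum(row) / n) for row in sim) / n


def rank_attributes(table, numeric_flags):
    """Order attribute indices by ascending fuzzy entropy."""
    columns = [list(col) for col in zip(*table)]
    entropies = [
        fuzzy_entropy(fuzzy_similarity(col, flag))
        for col, flag in zip(columns, numeric_flags)
    ]
    order = sorted(range(len(columns)), key=lambda k: entropies[k])
    return order, entropies


if __name__ == "__main__":
    # Toy table: one nominal attribute, one numeric attribute with a missing value.
    table = [["a", 1.0], ["b", 1.0], ["c", MISSING], ["d", 1.0]]
    print(rank_attributes(table, numeric_flags=[False, True]))
```

An ordering like this could serve as the starting point for building nested attribute sequences at several granularities. How ADFIIS actually constructs its sequences and turns them into per-sample anomaly scores is described in the paper itself, not in this abstract.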
Authors TANG Yuhao; PENG Dezhong; YUAN Zhong (College of Computer Science, Sichuan University, Chengdu, Sichuan 610065, China)
Source Journal of Computer Applications (《计算机应用》), CSCD, Peking University Core Journal, 2024, Issue 10, pp. 3097-3104 (8 pages)
Funding National Natural Science Foundation of China (62306196); Sichuan Science and Technology Program (2023YFQ0020, 2022YFSY0047); Fundamental Research Funds for the Central Universities (YJ202245).
Keywords fuzzy rough set; multi-granularity; anomaly detection; outlier detection; incomplete mixed data