Abstract
Outliers in dam safety monitoring data undermine the correctness and timeliness of dam safety analysis and decision-making. To detect such outliers accurately and efficiently, an anomaly detection algorithm based on IF-Encoder is proposed: the target sequence is reconstructed from the correlations between time series, and outliers are identified by the size of the residual between the reconstructed and target sequences. In addition, in accordance with the requirements of the relevant specifications, a correlation-based outlier identification method is proposed that classifies the detected outliers as true or false anomalies, retaining the true anomalies and removing the false ones. The results show that, compared with the quartile method, the Pauta (3σ) criterion, the k-nearest neighbors (KNN) method, and DBSCAN clustering, the IF-Encoder algorithm achieves higher recall, precision, and accuracy in outlier detection and identifies outliers more reliably and effectively. The correlation-based identification method attains an accuracy of 92% for true anomalies and 100% for false anomalies, so it can classify outliers effectively.
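The residual-based idea in the abstract (reconstruct a target series from correlated series, then flag points whose residual is large) can be illustrated with a minimal sketch. This is not the IF-Encoder algorithm: ordinary least squares stands in for the paper's reconstruction model, and the function name `detect_outliers_by_reconstruction`, the threshold multiplier `k`, the MAD-based scale, and the synthetic data are all assumptions made for illustration.

```python
import numpy as np


def detect_outliers_by_reconstruction(target, related, k=3.0):
    """Flag points whose reconstruction residual is unusually large.

    target  : 1-D array, the monitored series to check
    related : 2-D array (n_samples, n_series) of correlated series
    k       : hypothetical threshold multiplier (not taken from the paper)
    """
    target = np.asarray(target, dtype=float)
    related = np.asarray(related, dtype=float)

    # Design matrix: correlated series plus an intercept column.
    X = np.column_stack([related, np.ones(len(target))])

    # Stand-in reconstruction: ordinary least squares (NOT the paper's encoder).
    coef, *_ = np.linalg.lstsq(X, target, rcond=None)
    reconstructed = X @ coef
    residual = target - reconstructed

    # Robust residual scale from the median absolute deviation (MAD).
    med = np.median(residual)
    mad = np.median(np.abs(residual - med))
    scale = 1.4826 * mad
    if scale == 0.0:
        scale = np.finfo(float).eps  # avoid a zero threshold on perfect fits

    flags = np.abs(residual - med) > k * scale
    return flags, residual


# Synthetic example: two correlated series and one injected spike.
t = np.linspace(0.0, 4.0 * np.pi, 200)
related = np.column_stack([np.sin(t), np.cos(t)])
target = 2.0 * np.sin(t) - 0.5 * np.cos(t) + 1.0 + 0.01 * np.cos(7.0 * t)
target[50] += 5.0  # artificial outlier

flags, residual = detect_outliers_by_reconstruction(target, related)
print("flagged indices:", np.flatnonzero(flags))
```

In this sketch a flagged point is only a candidate. The abstract describes a separate correlation-based step that decides whether a detected outlier is a true or a false anomaly, and that step is not reproduced here.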
Authors
刘鹤鹏
李登华
丁勇
LIU Hepeng; LI Denghua; DING Yong (School of Science, Nanjing University of Science and Technology, Nanjing 210094, China; Nanjing Hydraulic Research Institute, Nanjing 210029, China; Key Laboratory of Reservoir Dam Safety, Ministry of Water Resources, Nanjing 210029, China)
Source
《人民黄河》
CAS
Peking University Core Journals
2024, No. 10, pp. 148-153 (6 pages)
Yellow River
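Funding
National Key Research and Development Program of China (2022YFC3005502)
National Natural Science Foundation of China (51979174)
Joint Funds of the National Natural Science Foundation of China (U2040221)
Fundamental Research Funds for Central Public Welfare Research Institutes (Y322008)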
Keywords
Isolation Forest
outlier detection
correlation
convolutional long short-term memory neural network