Abstract
The local outlier factor (LOF) quantifies how strongly a process data point deviates from its local neighbourhood. Industrial processes, however, require anomaly detection in real time, and computing the outlier factor of every sampling point is computationally expensive. This paper therefore modifies the LOF algorithm: the local reachability density of each object is computed from its k nearest neighbours, and a preprocessing method, CDC (Closest Distance to Center), first prunes the sampling points by computing the distance from each point to the center, discarding most points that cannot be outliers. The modified LOF value then needs to be computed only for the remaining points, which improves the efficiency of outlier detection. Simulation on TE process data shows that the method shortens the running time relative to LOF while preserving the accuracy of outlier detection.
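The prune-then-score idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not give the pruning rule or the details of the modified LOF, so the sketch assumes that (a) the center is the arithmetic mean of all points, (b) pruning keeps a fixed fraction of the points farthest from that center, and (c) the retained candidates are scored with the standard LOF, with neighbours drawn from the full dataset. The names `cdc_prune`, `lof_scores`, and `keep_fraction` are hypothetical.

```python
import math
from functools import lru_cache


def cdc_prune(points, keep_fraction=0.2):
    """Return indices of the points farthest from the centroid.

    Assumption: the pruning rule (keep a fixed fraction) is illustrative;
    the abstract does not state the threshold the paper uses.
    """
    n, dim = len(points), len(points[0])
    center = [sum(p[d] for p in points) / n for d in range(dim)]
    dist = [math.dist(p, center) for p in points]
    n_keep = max(1, math.ceil(keep_fraction * n))
    order = sorted(range(n), key=lambda i: dist[i], reverse=True)
    return sorted(order[:n_keep])


def lof_scores(points, candidates, k=3):
    """Standard LOF for the candidate indices only.

    Neighbours are searched in the full dataset (an assumption), and exactly
    k neighbours are used, so distance ties are broken by index.
    """
    @lru_cache(maxsize=None)
    def knn(i):
        # (distance, index) pairs of the k nearest neighbours of point i
        pairs = sorted(
            (math.dist(points[i], points[j]), j)
            for j in range(len(points)) if j != i
        )
        return tuple(pairs[:k])

    @lru_cache(maxsize=None)
    def lrd(i):
        # Local reachability density: inverse of the mean reachability
        # distance from point i to its neighbours.
        reach = [max(knn(j)[-1][0], d) for d, j in knn(i)]
        return len(reach) / sum(reach)

    # LOF: mean neighbour density divided by the point's own density
    return {i: sum(lrd(j) for _, j in knn(i)) / (k * lrd(i)) for i in candidates}


# Toy data: a 5x5 grid cluster plus one distant point (index 25).
points = [(float(x), float(y)) for x in range(5) for y in range(5)] + [(10.0, 10.0)]
candidates = cdc_prune(points, keep_fraction=0.2)
scores = lof_scores(points, candidates, k=3)
outlier = max(scores, key=scores.get)
print(candidates, outlier)
```

Only the candidates that survive pruning receive a LOF score, and the caching means density values are computed only for those candidates and their neighbours. That is where the time saving would come from under these assumptions.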
Source
《计算机与应用化学》
CAS
CSCD
PKU Core Journals (北大核心)
2013, No. 1, pp. 53-56 (4 pages)
Computers and Applied Chemistry
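Funding
National Natural Science Foundation of China (61134007)
Jiangsu Higher Education Institutions Outstanding Science and Technology Innovation Team Program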
Keywords
local outlier factor (LOF)
k-nearest neighbours
closest distance to center (CDC)