Outlier Detection Algorithm in High Dimension and Its Application

Abstract: The aim of outlier detection is to find abnormal data patterns that are relatively sparse and isolated within massive data sets. Because of the particular nature of high-dimensional data, traditional outlier mining algorithms are often unsuitable for finding outliers in high-dimensional space. Mine disasters occur frequently in China, so an effective method is urgently needed to prevent them and to protect miners' lives and property. This paper uses the ant colony algorithm to improve the hypergraph model and proposes a new outlier detection algorithm, AHHDOD, which detects outlying data patterns and also gives the attribution of each outlier. Tests indicate that the algorithm converges effectively to the optimal solution and has lower time complexity than existing algorithms. Finally, the method is applied to mine disaster early-warning detection, where it can issue warnings for potential crisis situations. Experiments show that the resulting warnings are credible and acceptable.

Authors: Ju Keyi (鞠可一), Zhou Dequn (周德群), Zhang Yuqiang (张玉强)

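Affiliations: College of Economics and Management, Nanjing University of Aeronautics and Astronautics; Experiment Center, Naval Flight Academy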

Source: Systems Engineering (《系统工程》), CSCD, Peking University Core Journal, 2008, No. 11, pp. 116-122 (7 pages)

Funding: National Natural Science Foundation of China (90510010); Doctoral Program Foundation of the Ministry of Education (20050287026)

Keywords: Ant Colony Algorithm; Hypergraph; High Dimensional Data; Outlier Detection; Mine Disaster Forewarning

Classification: TP311 [Automation and Computer Technology: Computer Software and Theory]
