
Outlier Detection Algorithm in High Dimension and Its Application
Abstract: Outlier detection aims to find abnormal data patterns that are relatively sparse and isolated within massive data sets. Because of the particular nature of high-dimensional data, traditional outlier mining algorithms are often ill-suited to discovering outliers in high-dimensional space. Mine disasters occur frequently in China, so an effective method to prevent them and to protect miners' lives and property is urgently needed. This paper uses the ant colony algorithm to improve the hypergraph model and proposes a new outlier detection algorithm, AHHDOD, which not only detects outlying data patterns but also gives the attribution of the outliers. Tests show that the algorithm converges effectively to the optimal solution and has lower time complexity than existing algorithms. Finally, the method is applied to mine disaster forewarning, where it can issue warnings for possible crisis situations. Experiments show that the forewarning results are credible and acceptable.
Source: Systems Engineering (《系统工程》), indexed in CSCD and the Peking University Core Journals list, 2008, No. 11, pp. 116-122 (7 pages)
Funding: National Natural Science Foundation of China (90510010); Doctoral Program Foundation of the Ministry of Education (20050287026)
Keywords: ant colony algorithm; hypergraph; high-dimensional data; outlier detection; mine disaster forewarning


