
CABOSFV algorithm for high dimensional sparse data clustering (cited by: 7)

Abstract: An algorithm, Clustering Algorithm Based On Sparse Feature Vector (CABOSFV), was proposed for clustering high-dimensional binary sparse data. The algorithm compresses the data effectively using a tool called the 'Sparse Feature Vector', which greatly reduces the data scale and allows the clustering result to be obtained in a single data scan. Both theoretical analysis and empirical tests showed that CABOSFV has low computational complexity. The algorithm finds clusters in large high-dimensional datasets efficiently and handles noise effectively.
Source: Journal of University of Science and Technology Beijing (English edition; 北京科技大学学报(英文版)), CSCD, 2004, No. 3, pp. 283-288 (6 pages)
Keywords: CABOSFV; clustering algorithm; data mining; sparse feature vector; high-dimensional sparse data; high dimensionality
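
The one-scan idea described in the abstract can be illustrated with a minimal sketch. The abstract does not define the Sparse Feature Vector, so everything below is an assumption made for illustration: the `ClusterSummary` fields, the `difference` measure, the `threshold` parameter, and the rule of joining the best-fitting cluster are hypothetical and should not be read as the paper's definitions. Each binary object is represented by the set of attribute indices whose value is 1, and each cluster is held only as a compact summary that is updated as objects arrive.

```python
from dataclasses import dataclass, field


@dataclass
class ClusterSummary:
    """Hypothetical compressed summary of one cluster of binary sparse objects."""
    count: int          # number of objects in the cluster
    shared: frozenset   # attributes equal to 1 in every member
    partial: frozenset  # attributes equal to 1 in some, but not all, members
    members: list = field(default_factory=list)  # object ids, kept for inspection only

    def merged_with(self, obj_id, attrs):
        """Return the summary obtained by adding one object (no rescan of members)."""
        attrs = frozenset(attrs)
        shared = self.shared & attrs
        partial = (self.partial | self.shared | attrs) - shared
        return ClusterSummary(self.count + 1, shared, partial, self.members + [obj_id])

    def difference(self):
        """Assumed dissimilarity: partial attributes per object per shared attribute."""
        if not self.shared:
            return float("inf")
        return len(self.partial) / (self.count * len(self.shared))


def one_scan_cluster(objects, threshold):
    """Assign each (id, attribute-set) pair to a cluster in a single pass."""
    clusters = []
    for obj_id, attrs in objects:
        best_index, best_summary = None, None
        for i, summary in enumerate(clusters):
            candidate = summary.merged_with(obj_id, attrs)
            # Accept only merges that keep the cluster within the assumed threshold,
            # and prefer the merge with the smallest resulting difference.
            if candidate.difference() <= threshold and (
                best_summary is None
                or candidate.difference() < best_summary.difference()
            ):
                best_index, best_summary = i, candidate
        if best_index is None:
            # No existing cluster fits: start a new one from this object alone.
            clusters.append(ClusterSummary(1, frozenset(attrs), frozenset(), [obj_id]))
        else:
            clusters[best_index] = best_summary
    return clusters
```

For example, calling `one_scan_cluster([("a", {1, 2, 3}), ("b", {1, 2, 3, 4}), ("c", {7, 8})], threshold=0.5)` reads each object once and compares it only against the cluster summaries, never against previously seen objects, which is the property the abstract attributes to the compressed representation.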

