
Hybrid: A Two-Phase Clustering Algorithm (cited by: 3)
Abstract: This paper presents a two-phase clustering algorithm named Hybrid. The first phase generates round atom clusters of equal size; the second phase merges these atom clusters into molecule clusters of arbitrary shape and size. When expanding cluster boundaries, Hybrid considers not only the distance between two atom clusters but also the similarity of their densities. This lets it exclude the influence of noise (outliers) more effectively and yields molecule clusters with a more homogeneous internal structure.
Source: Computer Engineering (《计算机工程》), indexed in EI, CAS, CSCD, and the Peking University Core Journals list; 2005, No. 13, pp. 1-3, 50 (4 pages in total).
Funding: Supported by the National Natural Science Foundation of China (Grant No. 60173058).
Keywords: data mining; clustering algorithm; atom cluster; molecule cluster; noise (outlier)
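
The two phases outlined in the abstract can be illustrated with a rough Python sketch. This is not the authors' implementation: the abstract gives no procedural detail, so the leader-style pass used to form atom clusters, the point count used as a density proxy, and every name and parameter below (`build_atoms`, `merge_atoms`, `hybrid_sketch`, `radius`, `max_gap`, `min_density_ratio`, `min_points`) are assumptions made only for illustration.

```python
import math
from collections import defaultdict


def build_atoms(points, radius):
    """Phase 1 (assumed leader-style pass): put each point into the first
    atom whose center lies within `radius`; otherwise start a new atom."""
    centers, members = [], []
    for p in points:
        for i, c in enumerate(centers):
            if math.dist(p, c) <= radius:
                members[i].append(p)
                break
        else:
            centers.append(p)
            members.append([p])
    return centers, members


def merge_atoms(centers, members, max_gap, min_density_ratio, min_points):
    """Phase 2 (assumed rule): union two atoms when their centers are within
    `max_gap` AND their densities are similar. All atoms share one radius,
    so the point count serves as the density proxy here."""
    n = len(centers)
    parent = list(range(n))

    def find(i):
        # Union-find root lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # Assumption: atoms with too few points are treated as noise.
    dense = [len(m) >= min_points for m in members]
    for i in range(n):
        for j in range(i + 1, n):
            if not (dense[i] and dense[j]):
                continue
            if math.dist(centers[i], centers[j]) > max_gap:
                continue
            lo, hi = sorted((len(members[i]), len(members[j])))
            if lo / hi >= min_density_ratio:
                parent[find(i)] = find(j)

    clusters, noise = defaultdict(list), []
    for i in range(n):
        if dense[i]:
            clusters[find(i)].extend(members[i])
        else:
            noise.extend(members[i])
    return list(clusters.values()), noise


def hybrid_sketch(points, radius=1.0, max_gap=2.0,
                  min_density_ratio=0.5, min_points=2):
    """Run both assumed phases; return (molecule_clusters, noise_points)."""
    centers, members = build_atoms(points, radius)
    return merge_atoms(centers, members, max_gap,
                       min_density_ratio, min_points)
```

Calling `hybrid_sketch(points)` on a list of coordinate tuples returns the merged clusters and the points set aside as noise. Checking density similarity alongside distance is what keeps a sparse atom from being absorbed into a dense neighbor merely because it is close, which is the behavior the abstract attributes to Hybrid.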

References (6)

  • 1 Ng R T, Han J. Efficient and Effective Clustering Methods for Spatial Data Mining[A]. The 20th VLDB Conference, Santiago, Chile, 1994: 144-155.
  • 2 Zhang T, Ramakrishnan R, Livny M. BIRCH: An Efficient Data Clustering Method for Very Large Databases[A]. Proceedings of the ACM SIGMOD Conference on Management of Data, Montreal, Canada, 1996: 103-114.
  • 3 Ester M, Kriegel H P, Sander J, et al. A Density-Based Algorithm for Discovering Clusters in Large Spatial Databases with Noise[A]. 2nd Intl Conf on Knowledge Discovery and Data Mining, Portland, USA, 1996: 226-231.
  • 4 Guha S, Rastogi R, Shim K. CURE: An Efficient Clustering Algorithm for Large Databases[J]. Information Systems, 2001, 26(1): 35-58.
  • 5 Zhou Bing, Shen Junyi, Peng Qinke. A Clustering Algorithm Based on Random Sampling and Cluster Features[J]. Journal of Xi'an Jiaotong University, 2003, 37(12): 1234-1237. (cited by: 6)
  • 6 Karypis G, Han E H, Kumar V. CHAMELEON: A Hierarchical Clustering Algorithm Using Dynamic Modeling[J]. Computer, 1999, 32: 68-75.

