
A Density Clustering Algorithm Based on Effective Neighboring Points and Adaptive Density Distribution (cited by: 4)
Abstract: Density clustering is an important family of cluster analysis methods. It does not require the number of clusters to be specified in advance, and it can identify clusters of arbitrary shape. However, the choice of the K-nearest-neighbor value or the neighborhood radius (Eps) used to compute density strongly affects the clustering result. When the spacing between clusters in a dataset varies widely, density clustering cannot adapt to changes in the density of data objects within clusters, so the result deviates substantially from the actual structure of the data. To address these shortcomings, a density clustering algorithm based on effective neighboring points and adaptive density distribution is proposed. First, a telescopic radius is determined from the relative distance and the effective neighboring points of each data object are defined, which overcomes the influence of the choice of K on the clustering result. Second, thresholds for core points and boundary points are calculated using the relative distance, and the data objects in the core region of each cluster are determined from the effective neighboring points, which improves the efficiency of cluster analysis. Third, the effective distance within each cluster is adjusted, which mitigates the problems of uneven density distribution within clusters and large distances between clusters. Finally, the effectiveness of the proposed algorithm is verified on artificial and UCI datasets.
Authors: YAN Qiang-qiang; ZHANG Min; XUN Ya-ling (School of Computer Science and Technology, Taiyuan University of Science and Technology, Taiyuan 030024, China)
Source: Computer Technology and Development, 2022, No. 9, pp. 14-22 (9 pages)
Funding: National Science Foundation for Young Scientists of China (61602335); Natural Science Foundation of Shanxi Province (201901D211302).
Keywords: density clustering; telescopic radius; effective neighboring points; adaptive density distribution; relative distance
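
The sensitivity to the neighborhood radius that motivates this work can be illustrated with a minimal sketch. The code below is plain DBSCAN written from scratch, not the algorithm proposed in the paper, whose details this record does not give. The function names (`region_query`, `dbscan`, `summarize`) and the toy dataset are assumptions made for illustration only. The data holds one dense and one sparse cluster, and the same code is run with three values of Eps.

```python
import math

def region_query(points, i, eps):
    """Indices of all points within eps of points[i], including i itself."""
    return [j for j, q in enumerate(points) if math.dist(points[i], q) <= eps]

def dbscan(points, eps, min_pts):
    """Plain DBSCAN. Returns one label per point; -1 marks noise."""
    UNSEEN, NOISE = None, -1
    labels = [UNSEEN] * len(points)
    cluster = -1
    for i in range(len(points)):
        if labels[i] is not UNSEEN:
            continue
        neighbors = region_query(points, i, eps)
        if len(neighbors) < min_pts:
            labels[i] = NOISE          # may later be relabeled as a border point
            continue
        cluster += 1                   # i is a core point: start a new cluster
        labels[i] = cluster
        queue = [j for j in neighbors if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster    # border point reached from a core point
            if labels[j] is not UNSEEN:
                continue
            labels[j] = cluster
            j_neighbors = region_query(points, j, eps)
            if len(j_neighbors) >= min_pts:
                queue.extend(j_neighbors)  # j is also core: keep expanding
    return labels

def summarize(labels):
    """Return (number of clusters, number of noise points)."""
    return len({l for l in labels if l != -1}), labels.count(-1)

# Hypothetical toy data: a dense 3x3 grid (spacing 1) and a sparse 3x3 grid
# (spacing 10) placed far away from it.
dense = [(i, j) for i in range(3) for j in range(3)]
sparse = [(100 + 10 * i, 100 + 10 * j) for i in range(3) for j in range(3)]
points = dense + sparse

for eps in (1.5, 15, 150):
    print(eps, summarize(dbscan(points, eps, min_pts=3)))
# 1.5 -> (1, 9): the sparse cluster is discarded as noise
# 15  -> (2, 0): both clusters are recovered
# 150 -> (1, 0): the two clusters are merged into one
```

Only the middle value of Eps recovers both groups, which is the kind of parameter dependence the abstract describes when cluster densities differ widely.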
