Abstract
Density-based clustering algorithms can discover clusters of arbitrary shape and handle noisy data, but they also suffer from high time cost, limitations imposed by their parameters, and sensitivity to input order. To address these drawbacks, this paper proposes DCHT (Density Clustering Based on Hierarchical Tree), a density clustering algorithm that uses a hierarchical tree to describe sub-cluster information, adjusts the density parameter dynamically, and obtains the final clusters by density-based detection of adjacent sub-clusters in the tree structure. Theoretical analysis and experimental results indicate that the algorithm is suitable for large-scale, high-dimensional data, and that it offers dynamic parameter adjustment and shields the result from input-order sensitivity.
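For readers unfamiliar with the baseline that the abstract refers to, the following is a minimal sketch of generic density-based clustering in the style of DBSCAN. It is not the DCHT algorithm, whose tree construction and parameter adjustment are not specified in this record. The names `dbscan`, `region_query`, `eps` and `min_pts` are illustrative assumptions. The sketch shows two of the properties mentioned above: isolated points are labelled as noise, and each neighbourhood query scans all points, which is the source of the high time cost.

```python
from math import dist

NOISE = -1  # label for points that belong to no cluster


def region_query(points, i, eps):
    """Return indices of all points within eps of points[i] (including i)."""
    return [j for j, q in enumerate(points) if dist(points[i], q) <= eps]


def dbscan(points, eps, min_pts):
    """Label each point with a cluster id (0, 1, ...) or NOISE.

    Generic DBSCAN-style baseline, assumed here for illustration only.
    Each region_query is a linear scan, so the whole run is O(n^2).
    """
    labels = [None] * len(points)
    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue  # already assigned to a cluster or marked as noise
        neighbors = region_query(points, i, eps)
        if len(neighbors) < min_pts:
            labels[i] = NOISE  # may later be claimed as a border point
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in neighbors if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point: joins but does not expand
            if labels[j] is not None:
                continue
            labels[j] = cluster
            j_neighbors = region_query(points, j, eps)
            if len(j_neighbors) >= min_pts:
                queue.extend(j_neighbors)  # core point: keep expanding
    return labels


# Two dense groups and one isolated point (hypothetical toy data).
points = [(0, 0), (0, 1), (1, 0), (1, 1),
          (10, 10), (10, 11), (11, 10), (11, 11),
          (50, 50)]
labels = dbscan(points, eps=1.5, min_pts=3)
```

In this baseline, a border point that lies within `eps` of core points from two different clusters is assigned to whichever cluster reaches it first, which is one way the input-order sensitivity mentioned in the abstract can arise.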
Source
Journal of Hefei University of Technology (Natural Science)
CAS
CSCD
PKU Core (Peking University Core Journals)
2008, No. 2, pp. 187-190, 195 (5 pages)
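Funding
Natural Science Foundation of Anhui Province (050420207)
Scientific Research Development Fund of Hefei University of Technology (050504F)
Funding Program for University Teachers of Anhui Province (2005jq1012)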
Keywords
data mining
clustering
density-based clustering
sensitivity of input order