
A Gaussian Distribution Based Cluster Distance Computing Method

Cited by: 10
Abstract: Agglomerative hierarchical clustering is a high-performing clustering algorithm that repeatedly merges the closest clusters until the data set is partitioned into a user-specified number of classes. The accuracy of the cluster distance computation is a key factor in the algorithm's performance. This paper proposes a new method for computing the distance between clusters based on the Gaussian distribution. The method improves accuracy by taking into account properties of the clusters themselves, such as their size and density distribution. Comparative experiments against existing cluster distance methods on several text collections show that the proposed method effectively improves the performance of hierarchical clustering.
Source: Journal of Chinese Information Processing (《中文信息学报》), CSCD, Peking University Core Journal, 2008, No. 3, pp. 50-55 (6 pages)
Funding: National 863 Program project (2006AA01Z148); Key Project of Science and Technology Research, Ministry of Education (207148)
Keywords: computer application; Chinese information processing; hierarchical clustering; cluster distance computing; text clustering
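
The abstract describes the merge loop of agglomerative hierarchical clustering and says the proposed distance accounts for each cluster's size and density distribution, but it does not give the formula. The sketch below is only an illustration under stated assumptions: `agglomerate` is a plain merge loop that accepts any cluster distance function, `centroid_distance` is a common baseline, and `spread_scaled_distance` is a hypothetical stand-in that measures the gap between centroids in units of the clusters' combined spread. It is not the paper's method, and the `floor` parameter is an assumption added to keep single-point clusters well defined.

```python
import math
from typing import Callable, List, Sequence, Tuple

Point = Tuple[float, ...]
Cluster = List[Point]


def centroid(cluster: Cluster) -> Point:
    """Coordinate-wise mean of the points in a cluster."""
    dim = len(cluster[0])
    return tuple(sum(p[i] for p in cluster) / len(cluster) for i in range(dim))


def spread(cluster: Cluster) -> float:
    """Root-mean-square distance of the points to the centroid."""
    c = centroid(cluster)
    return math.sqrt(sum(math.dist(p, c) ** 2 for p in cluster) / len(cluster))


def centroid_distance(a: Cluster, b: Cluster) -> float:
    """Baseline: Euclidean distance between the two centroids."""
    return math.dist(centroid(a), centroid(b))


def spread_scaled_distance(a: Cluster, b: Cluster, floor: float = 1.0) -> float:
    """Hypothetical stand-in, NOT the paper's formula.

    The centroid gap is divided by the combined spread of the two clusters,
    so a given gap counts as smaller between diffuse clusters than between
    tight ones. `floor` (an assumption) avoids division by zero.
    """
    return math.dist(centroid(a), centroid(b)) / (spread(a) + spread(b) + floor)


def agglomerate(
    points: Sequence[Point],
    k: int,
    distance: Callable[[Cluster, Cluster], float],
) -> List[Cluster]:
    """Merge the closest pair of clusters until only k clusters remain."""
    clusters: List[Cluster] = [[p] for p in points]
    while len(clusters) > k:
        best = None  # (distance, i, j) of the closest pair found so far
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = distance(clusters[i], clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters
```

Calling `agglomerate(points, 2, centroid_distance)` on two well-separated groups of 2-D points returns one cluster per group. Passing `spread_scaled_distance` instead shows how the distance function can be swapped without changing the merge loop.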