Abstract
Cluster analysis is an unsupervised knowledge discovery method that can effectively process large, complex data with many attributes and no class labels. The DBSCAN algorithm can cluster datasets of arbitrary shape; fuzzy C-means (FCM) suits datasets that are evenly distributed around their cluster centers; and the CABOSFV algorithm clusters high-dimensional sparse datasets (for example, Web data) well. Embedding the DBSCAN, CABOSFV, and fuzzy C-means clustering algorithms in I-Miner better meets users' needs, allowing them to build data mining models and support production decision-making.
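To illustrate the claim that DBSCAN can cluster datasets of arbitrary shape, here is a minimal pure-Python sketch. It is not the I-Miner implementation described in the paper; the function name `dbscan`, the parameters `eps` and `min_pts`, and the two-ring test data are all assumptions chosen for illustration, and the neighborhood search is brute force.

```python
import math

def dbscan(points, eps, min_pts):
    """Return one label per point: a cluster id (0, 1, ...) or -1 for noise."""
    UNVISITED, NOISE = None, -1
    labels = [UNVISITED] * len(points)

    def neighbors(i):
        # Brute-force eps-neighborhood query; the result includes point i itself.
        return [j for j, q in enumerate(points) if math.dist(points[i], q) <= eps]

    cluster = -1
    for i in range(len(points)):
        if labels[i] is not UNVISITED:
            continue
        seeds = neighbors(i)
        if len(seeds) < min_pts:
            labels[i] = NOISE          # may later be relabeled as a border point
            continue
        cluster += 1                   # point i is a core point: start a new cluster
        labels[i] = cluster
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster    # border point: joins the cluster, not expanded
            if labels[j] is not UNVISITED:
                continue
            labels[j] = cluster
            reachable = neighbors(j)
            if len(reachable) >= min_pts:
                queue.extend(reachable)  # j is also a core point: keep expanding
    return labels

def ring(radius, n):
    # n points evenly spaced on a circle centered at the origin.
    return [(radius * math.cos(2 * math.pi * k / n),
             radius * math.sin(2 * math.pi * k / n)) for k in range(n)]

# Two concentric rings (a non-convex layout) plus one distant outlier.
points = ring(1.0, 40) + ring(3.0, 40) + [(10.0, 10.0)]
labels = dbscan(points, eps=0.6, min_pts=3)
```

With these assumed parameters, each ring forms its own cluster and the outlier is labeled as noise, which a purely center-based method would not achieve for concentric rings.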
Source
Modern Computer (《现代计算机》)
2009, Issue 2, pp. 30-34 (5 pages)
Keywords
Clustering Analysis
DBSCAN (Density-Based Spatial Clustering of Applications with Noise)
FCM (Fuzzy C-Means)
CABOSFV (Clustering Algorithm Based On Sparse Feature Vector)