Abstract
Fuzzy clustering algorithms are widely used for data analysis in big data mining models. The commonly used fuzzy C-means (FCM) clustering algorithm has several obvious drawbacks: poor noise immunity, slow convergence, and no way to determine the number of clusters automatically. Common incremental fuzzy clustering methods replace the single center point that represents each cluster with multiple center points. However, this approach also has problems, chiefly that the selection of center points is fixed, which makes it inflexible. For these reasons, this paper improves the existing basic algorithm, focusing on the incremental fuzzy clustering algorithm for data mining models in big data. Practical verification shows that the improved algorithm is feasible and broadly applicable.
Authors
李小红
常振云
LI Xiaohong; CHANG Zhenyun (School of Information Science and Engineering, Tianshi College, Tianjin 301700, China)
Source
《现代电子技术》
PKU Core Journal (北大核心)
2020, No. 3, pp. 177-182 (6 pages)
Modern Electronics Technique
Keywords
incremental fuzzy clustering
big data
data mining model
clustering algorithm
cosine similarity
membership matrix