A missing data imputation method for sparse data based on fuzzy C-means clustering (cited by: 5)
Abstract: Missing data are usually handled with statistical methods that fill in the missing values during data preprocessing, and both the efficiency and the accuracy of this approach are limited. This paper therefore proposes an embedded imputation method based on fuzzy C-means (FCM) clustering, called Fuzzy C-Means based sparse data imputation (FCMSI). The algorithm first initializes the sparse data with the average ratio method (ARM). It then clusters the data with an FCM variant improved by a local distance strategy. The missing entries are treated as variables, and after each clustering iteration their values are replaced within each cluster, following the idea of collaborative filtering (CF), until the result converges. Comparative experiments on UCI standard data sets, measured with three different evaluation indexes, show that FCMSI performs significantly better than traditional imputation methods.
Authors: ZHANG Kaihui (张楷卉); LI Peng (李鹏) (College of Entrepreneurship Education, Heilongjiang University, Harbin 150080, China; School of Software and Microelectronics, Harbin University of Science and Technology, Harbin 150080, China)
Source: Journal of Natural Science of Heilongjiang University (《黑龙江大学自然科学学报》), CAS, 2019, No. 6, pp. 750-756 (7 pages)
Funding: National Natural Science Foundation of China (61103149); Special Fund for Basic Scientific Research Operating Expenses of Heilongjiang Provincial Universities (LGYC2018JQ003).
Keywords: missing data filling; sparse data; fuzzy C-means clustering; collaborative filtering
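
The loop outlined in the abstract (initial fill, fuzzy clustering, re-estimation of the missing entries, repeat until convergence) might look roughly like the sketch below. It is not the authors' FCMSI implementation, because the abstract does not give the formulas. Everything specific here is an assumption: the function name `fcm_impute` is hypothetical, a column-mean fill stands in for ARM, a partial distance over observed features is one possible reading of the "local distance strategy", and a membership-weighted average of cluster centers stands in for the collaborative-filtering replacement step.

```python
def fcm_impute(rows, n_clusters=2, m=2.0, max_iter=200, tol=1e-6):
    """Fill None entries by alternating fuzzy C-means updates and re-imputation.

    Illustrative only: the fill, distance and replacement rules are assumptions,
    not the ARM / local-distance / CF steps of the paper.
    Assumes every column has at least one observed value.
    """
    n, d = len(rows), len(rows[0])
    missing = [[v is None for v in row] for row in rows]

    # Initial fill: column mean of the observed values (stand-in for ARM).
    col_mean = []
    for j in range(d):
        seen = [row[j] for row in rows if row[j] is not None]
        col_mean.append(sum(seen) / len(seen))
    filled = [[col_mean[j] if missing[i][j] else rows[i][j] for j in range(d)]
              for i in range(n)]

    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    # Deterministic farthest-point choice of the initial cluster centers.
    centers = [list(filled[0])]
    while len(centers) < n_clusters:
        far = max(filled, key=lambda p: min(sq_dist(p, c) for c in centers))
        centers.append(list(far))

    for _ in range(max_iter):
        # Membership update from partial squared distances: only observed
        # features count, rescaled to the full number of features.
        weights = []  # weights[i][k] holds u_ik ** m
        for i in range(n):
            obs = [j for j in range(d) if not missing[i][j]]
            scale = d / max(len(obs), 1)
            inv = []
            for c in centers:
                d2 = scale * sum((filled[i][j] - c[j]) ** 2 for j in obs)
                inv.append(max(d2, 1e-12) ** (-1.0 / (m - 1.0)))
            total = sum(inv)
            weights.append([(v / total) ** m for v in inv])

        # Center update: weighted mean of the currently filled rows.
        for k in range(n_clusters):
            wk = sum(weights[i][k] for i in range(n))
            centers[k] = [sum(weights[i][k] * filled[i][j] for i in range(n)) / wk
                          for j in range(d)]

        # Re-estimate each missing entry from the membership-weighted centers.
        change = 0.0
        for i in range(n):
            wi = sum(weights[i])
            for j in range(d):
                if missing[i][j]:
                    est = sum(weights[i][k] * centers[k][j]
                              for k in range(n_clusters)) / wi
                    change = max(change, abs(est - filled[i][j]))
                    filled[i][j] = est
        if change < tol:  # stop once the imputed values no longer move
            break
    return filled


# Toy data: two well-separated groups, one missing value in the first group.
data = [[1.0, 1.0], [1.1, 0.9], [0.9, None],
        [5.0, 5.0], [5.1, 4.9], [4.9, 5.1]]
result = fcm_impute(data)
```

On the toy data the missing entry is pulled toward the values of the group its row belongs to, instead of staying at the global column mean. That shift is the motivation for imputing inside the clustering loop and not once during preprocessing.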