
An EM Missing-Data Imputation Algorithm Based on Naive Bayes (cited by: 7)

Original English title: EM algorithm to implement missing values based on Naive Bayesian
Abstract: Incomplete datasets are common in real applications. They cause information loss and complicate analysis, so the handling of missing data has become an active research topic in classification. Because the EM method selects its initial cluster centers at random, the resulting clustering is unstable. This paper instead uses the classification result of the naive Bayes algorithm as the initial range for the EM algorithm, then refines it through repeated E-steps and M-steps, and fills in the missing data with the maximized values obtained. Experimental results show that the algorithm improves clustering stability and gives better imputation results.
Authors: 邹薇 (Zou Wei), 王会进 (Wang Huijin)
Source: 《微型机与应用》 (Microcomputer & Its Applications), 2011, No. 16, pp. 75-77, 81 (4 pages)
Keywords: data imputation; EM algorithm; naive Bayes algorithm
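
The abstract describes the procedure only in outline, so the following is a minimal Python sketch of one way it could look, not the authors' implementation. It assumes numeric attributes, a diagonal-covariance Gaussian mixture with one component per class, class labels available for the complete rows, and values missing at random. The function name `nb_em_impute`, its parameters, and the choice to fill each gap with the responsibility-weighted component mean are all illustrative assumptions.

```python
import numpy as np


def nb_em_impute(X, y_complete, n_iter=100, tol=1e-6, var_floor=1e-6):
    """Hypothetical sketch: naive-Bayes-initialized EM imputation.

    X          : (n, d) float array, np.nan marks a missing value.
    y_complete : class labels of the complete rows of X, in row order.
    Returns (filled X, responsibilities of shape (n, K)).
    """
    X = np.asarray(X, dtype=float)
    y_complete = np.asarray(y_complete)
    miss = np.isnan(X)
    Xz = np.where(miss, 0.0, X)      # NaN-free copy for arithmetic
    Xc = X[~miss.any(axis=1)]        # complete rows only
    classes = np.unique(y_complete)

    # Initialization: Gaussian naive Bayes parameters from the complete rows,
    # used in place of randomly chosen cluster centers.
    mu = np.array([Xc[y_complete == c].mean(axis=0) for c in classes])
    var = np.array([Xc[y_complete == c].var(axis=0) for c in classes])
    var = np.maximum(var, var_floor)
    pi = np.array([np.mean(y_complete == c) for c in classes])

    def e_step():
        # Per-attribute Gaussian log density, summed over observed attributes.
        ll = -0.5 * (np.log(2 * np.pi * var) + (Xz[:, None, :] - mu) ** 2 / var)
        log_joint = np.log(pi) + np.where(miss[:, None, :], 0.0, ll).sum(axis=2)
        log_norm = np.logaddexp.reduce(log_joint, axis=1)
        return np.exp(log_joint - log_norm[:, None]), log_norm.sum()

    prev = -np.inf
    for _ in range(n_iter):
        resp, loglik = e_step()                      # E-step
        nk = resp.sum(axis=0)
        w = resp[:, :, None]
        # Expected value of each entry under each component: the observed
        # value if present, otherwise the current component mean.
        xhat = np.where(miss[:, None, :], mu, Xz[:, None, :])
        # Missing entries also contribute the component variance.
        extra = np.where(miss[:, None, :], var, 0.0)
        new_mu = (w * xhat).sum(axis=0) / nk[:, None]            # M-step
        new_var = (w * ((xhat - new_mu) ** 2 + extra)).sum(axis=0) / nk[:, None]
        mu, var, pi = new_mu, np.maximum(new_var, var_floor), nk / nk.sum()
        if abs(loglik - prev) < tol:
            break
        prev = loglik

    resp, _ = e_step()
    # Fill each gap with the responsibility-weighted component mean.
    filled = np.where(miss, resp @ mu, X)
    return filled, resp
```

Starting EM from the naive Bayes estimates rather than from random centers makes the result deterministic for a given dataset, which is the stability property the abstract emphasizes. Categorical attributes would need a different per-attribute likelihood than the Gaussian assumed here.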


