Research and Realization of an Improved Naive Bayes Classification Algorithm under a Big Data Environment

Original title: 大数据环境下朴素贝叶斯分类算法的改进与实现
Cited by: 13
Abstract: The naive Bayes classifier is simple and efficient, but its conditional independence assumption rarely holds in practice, which degrades its performance. To address this problem, the paper improves the algorithm on the basis of association rules and confidence: each attribute receives its own weight, derived from the mined association rules and the confidence of those rules. The weighted algorithm is also implemented in the MapReduce programming model, so classification performance improves while the simplicity of naive Bayes is preserved. Experiments on EMU operation and maintenance data show that the method improves both the accuracy and the efficiency of classification.
Authors: 张春, 郭明亮
Source: Journal of Beijing Jiaotong University (《北京交通大学学报》), indexed in CAS, CSCD, and the Peking University Core Journals list; 2015, No. 2, pp. 35-41 (7 pages)
Funding: Science and technology project of China Railway Corporation (K14D00061)
Keywords: MapReduce; naive Bayes; classification algorithm; association rule; confidence; electric motor train unit (EMU)
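
As an illustration of the approach summarized in the abstract, the following is a minimal Python sketch of an attribute-weighted naive Bayes classifier whose counting phase is written as map and reduce steps. It is not the authors' implementation. The abstract does not give the formula that turns association-rule confidence into weights, so `weights` is a hypothetical input here, and the decision rule (class log-prior plus weighted log-likelihoods, with Laplace smoothing) is a common form of weighted naive Bayes that is assumed rather than taken from the paper. The function names, the toy records, and the local in-memory reduce are also illustrative assumptions; a real MapReduce job would distribute these steps across a cluster.

```python
import math
from collections import Counter


def map_record(record, label):
    """Map step: emit a count key for the class and for each (attribute, value, class)."""
    yield ("class", label), 1
    for attr_index, value in enumerate(record):
        yield ("attr", attr_index, value, label), 1


def reduce_counts(pairs):
    """Reduce step: sum the counts emitted for each key."""
    totals = Counter()
    for key, count in pairs:
        totals[key] += count
    return totals


def train(records, labels):
    """Run the map and reduce steps locally over a small in-memory data set."""
    pairs = (
        pair
        for record, label in zip(records, labels)
        for pair in map_record(record, label)
    )
    return reduce_counts(pairs)


def classify(counts, record, weights, values_per_attr):
    """Return the class maximizing log P(c) + sum_i w_i * log P(x_i | c).

    weights[i] is a hypothetical per-attribute weight (assumed given);
    values_per_attr[i] is the number of distinct values of attribute i.
    """
    class_totals = {key[1]: n for key, n in counts.items() if key[0] == "class"}
    total = sum(class_totals.values())
    best_label, best_score = None, -math.inf
    for label, class_count in class_totals.items():
        score = math.log(class_count / total)
        for i, value in enumerate(record):
            # Laplace smoothing avoids log(0) for unseen attribute values.
            numerator = counts[("attr", i, value, label)] + 1
            denominator = class_count + values_per_attr[i]
            score += weights[i] * math.log(numerator / denominator)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Toy data (invented for illustration): two categorical attributes per record.
records = [("high", "yes"), ("high", "yes"), ("low", "no"), ("low", "no"), ("high", "no")]
labels = ["fault", "fault", "ok", "ok", "ok"]
counts = train(records, labels)
print(classify(counts, ("high", "yes"), weights=[1.0, 2.0], values_per_attr=[2, 2]))
```

Setting every weight to 1.0 reduces `classify` to ordinary naive Bayes, which makes it easy to compare the weighted and unweighted decisions on the same counts.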

References (9)

  • 1. Han J W, Kamber M. 数据挖掘:概念与技术 [Data Mining: Concepts and Techniques] [M]. 范明, 孟小峰, trans. 北京: 机械工业出版社, 2005: 200-204.
  • 2. Domingos P, Pazzani M. Beyond independence: conditions for the optimality of the simple Bayesian classifier [C]//Proc 13th Intl Conf Machine Learning, 1996: 105-112.
  • 3. 汪为汉, 唐学文, 邓一贵. 基于贝叶斯学习的集成流量分类方法 [An ensemble traffic classification method based on Bayesian learning] [J]. 计算机工程, 2012, 38(16): 164-166. Cited by: 4
  • 4. Hall M. A decision tree-based attribute weighting filter for naive Bayes [J]. Knowledge-Based Systems, 2007, 20(2): 120-126.
  • 5. Taheri S, Yearwood J, Mammadov M, et al. Attribute weighted naive Bayes classifier using a local optimization [J]. Neural Computing and Applications, 2014, 24(5): 995-1002.
  • 6. Wu J, Cai Z, Zhu X. Self-adaptive probability estimation for naive Bayes classification [C]//IEEE International Joint Conference on Neural Networks, 2013: 1-8.
  • 7. Zhang H, Sheng S. Learning weighted naive Bayes with accurate ranking [C]//The 4th IEEE International Conference on Data Mining, 2004: 567-570.
  • 8. 张明卫, 王波, 张斌, 朱志良. 基于相关系数的加权朴素贝叶斯分类算法 [A weighted naive Bayes classification algorithm based on correlation coefficients] [J]. 东北大学学报(自然科学版), 2008, 29(7): 952-955. Cited by: 32
  • 9. Dean J, Ghemawat S. MapReduce: simplified data processing on large clusters [J]. Communications of the ACM, 2008, 51(1): 107-113.
