Abstract
Naive Bayes is a simple and efficient classification algorithm, but its conditional independence assumption rarely holds in practice, which degrades its performance. To address this problem, this paper improves the algorithm using association rules and their confidence: the mined association rules and their confidence values are used to assign different weights to different attributes. The algorithm is also implemented in the MapReduce programming model, so that classification performance improves effectively while the simplicity of naive Bayes is retained. Experiments in EMU operation and maintenance show that the algorithm improves both the accuracy and the efficiency of classification.
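The abstract does not give the weighting formula, so the following is only a minimal sketch of the general idea of attribute-weighted naive Bayes, not the paper's method. It assumes a hypothetical scheme in which each attribute's weight is the support-weighted mean of the best confidence among single-attribute rules of the form (attribute = value -> class), and in which each weight scales that attribute's log-likelihood term. All function names and the toy data are illustrative.

```python
import math
from collections import Counter, defaultdict


def attribute_weights(rows, labels):
    """Hypothetical weighting: for each attribute, average over its values
    (weighted by support) the best confidence of a rule value -> class."""
    n = len(rows)
    weights = []
    for j in range(len(rows[0])):
        value_counts = Counter(row[j] for row in rows)
        pair_counts = Counter((row[j], y) for row, y in zip(rows, labels))
        best = defaultdict(float)
        for (value, _label), count in pair_counts.items():
            confidence = count / value_counts[value]
            best[value] = max(best[value], confidence)
        weights.append(sum(value_counts[v] / n * best[v] for v in value_counts))
    return weights


def train(rows, labels):
    """Collect the counts needed by naive Bayes plus the attribute weights."""
    return {
        "n": len(rows),
        "class_counts": Counter(labels),
        "cond_counts": Counter(
            (j, row[j], y) for row, y in zip(rows, labels) for j in range(len(row))
        ),
        "n_values": [len({row[j] for row in rows}) for j in range(len(rows[0]))],
        "weights": attribute_weights(rows, labels),
    }


def predict(model, row):
    """Return the class maximising log P(c) + sum_j w_j * log P(x_j | c),
    with Laplace smoothing on the conditional probabilities."""
    best_label, best_score = None, -math.inf
    for label, class_count in model["class_counts"].items():
        score = math.log(class_count / model["n"])
        for j, value in enumerate(row):
            p = (model["cond_counts"][(j, value, label)] + 1) / (
                class_count + model["n_values"][j]
            )
            score += model["weights"][j] * math.log(p)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Toy data (made up): attribute 0 determines the class, attribute 1 is noise.
rows = [("hot", "a"), ("hot", "b"), ("cold", "a"), ("cold", "b")]
labels = ["fault", "fault", "ok", "ok"]
model = train(rows, labels)
prediction = predict(model, ("hot", "b"))
```

Every quantity in `train` and `attribute_weights` is a sum of per-record counts, which is the kind of computation that maps naturally onto a MapReduce job (emit count keys in the map step, sum them in the reduce step). The sketch runs in a single process and does not reproduce the paper's MapReduce implementation.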
Source
Journal of Beijing Jiaotong University (《北京交通大学学报》)
Indexed in: CAS, CSCD, PKU Core Journals (北大核心)
2015, No. 2, pp. 35-41 (7 pages)
Funding
Supported by the China Railway Corporation Science and Technology Project (K14D00061)
Keywords
MapReduce
naive Bayes
classification algorithm
association rule
confidence
electric motor train unit (EMU)