Abstract
With the rise of the Internet, advances in technology have led to explosive growth in many kinds of data, text data among them, and classification algorithms face new challenges when confronted with massive data sets. To address the shortcomings of the traditional naive Bayes classification algorithm under these conditions, this paper weights keywords to improve classification accuracy and then uses the MapReduce programming model to design an implementation of the naive Bayes algorithm on the Hadoop platform. Experiments show that the parallelized naive Bayes classification algorithm performs well on a Hadoop cluster and scales reliably.
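The abstract does not give the weighting formula or the job layout, so the following is only a minimal sketch of the general idea, written in plain Python rather than against the Hadoop API. It assumes a multinomial naive Bayes model with Laplace smoothing, counts gathered in a map step and summed in a reduce step, and a hypothetical per-term `weights` dictionary that scales each term's log-likelihood. The function names (`map_doc`, `reduce_counts`, `train`, `classify`) and the weighting scheme are illustrative assumptions, not the paper's method.

```python
import math
from collections import defaultdict


def map_doc(label, tokens):
    """Map step: emit a count of 1 for the document and for each token."""
    yield (("doc", label), 1)
    for term in tokens:
        yield (("term", label, term), 1)


def reduce_counts(pairs):
    """Shuffle and reduce step: sum the values that share a key."""
    totals = defaultdict(int)
    for key, value in pairs:
        totals[key] += value
    return dict(totals)


def train(docs):
    """Run the map and reduce steps over (label, tokens) pairs."""
    pairs = [p for label, tokens in docs for p in map_doc(label, tokens)]
    counts = reduce_counts(pairs)
    doc_counts = {k[1]: v for k, v in counts.items() if k[0] == "doc"}
    term_counts = {(k[1], k[2]): v for k, v in counts.items() if k[0] == "term"}
    vocab = {term for (_, term) in term_counts}
    return doc_counts, term_counts, vocab


def classify(tokens, model, weights=None):
    """Pick the label with the highest weighted log-posterior.

    weights is a hypothetical term -> weight map; terms not listed get 1.0,
    which reduces to ordinary multinomial naive Bayes.
    """
    doc_counts, term_counts, vocab = model
    weights = weights or {}
    n_docs = sum(doc_counts.values())
    best_label, best_score = None, -math.inf
    for label, n in doc_counts.items():
        total_terms = sum(c for (l, _), c in term_counts.items() if l == label)
        score = math.log(n / n_docs)  # log prior
        for term in tokens:
            if term not in vocab:
                continue  # ignore terms never seen in training
            # Laplace-smoothed likelihood of the term given the label
            p = (term_counts.get((label, term), 0) + 1) / (total_terms + len(vocab))
            score += weights.get(term, 1.0) * math.log(p)
        if score > best_score:
            best_label, best_score = label, score
    return best_label
```

On a real cluster the map and reduce functions would run as distributed tasks over input splits; here they are ordinary functions so the counting logic can be checked locally. Calling `classify(tokens, train(docs))` gives the unweighted prediction, and passing a `weights` dictionary shows how emphasizing particular keywords can change the result.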
Source
Industrial Control Computer (《工业控制计算机》), 2016, No. 4, pp. 96-97, 100 (3 pages in total)