Abstract
This paper studies the recognition of spam text. Feature vectors extracted from iPhone review texts are used to build an SVM classification model that identifies spam reviews, and the results are compared with those of a BP neural network discriminant model. The classification accuracy is 71% on the first 400 samples (the training set) and 70.12% on the remaining 196 samples (the test set). The main factors affecting the recognition of opinion spam text are therefore: 1) the extraction of feature terms from the review text and the computation of the text feature space vectors; 2) the choice of classification method, among which SVM gave the best recognition results.
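The pipeline summarized in the abstract (review text, then feature vector, then SVM decision) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the four toy reviews and their labels are invented, the features are plain bag-of-words counts, and the classifier is a bias-free linear SVM trained by subgradient descent on the hinge loss. The abstract does not specify the paper's actual features, kernel, or parameters.

```python
# Minimal linear SVM sketch: hinge loss with L2 shrinkage, no bias term.
# The reviews and labels are invented toy data, not the paper's corpus.

def tokenize(text):
    return text.lower().split()

def build_vocab(docs):
    # Map every word seen in training to a vector index.
    words = sorted({w for d in docs for w in tokenize(d)})
    return {w: i for i, w in enumerate(words)}

def vectorize(text, vocab):
    # Bag-of-words counts; words unseen in training are ignored.
    vec = [0.0] * len(vocab)
    for w in tokenize(text):
        if w in vocab:
            vec[vocab[w]] += 1.0
    return vec

def train_svm(X, y, lr=0.1, lam=0.01, epochs=50):
    w = [0.0] * len(X[0])
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            margin = yi * sum(wj * xj for wj, xj in zip(w, xi))
            # Regularization step: shrink all weights slightly.
            w = [(1 - lr * lam) * wj for wj in w]
            # Hinge-loss step: only samples inside the margin move w.
            if margin < 1:
                w = [wj + lr * yi * xj for wj, xj in zip(w, xi)]
    return w

def predict(text, w, vocab):
    score = sum(wj * xj for wj, xj in zip(w, vectorize(text, vocab)))
    return 1 if score > 0 else -1   # +1 = spam, -1 = genuine

train_docs = [
    "buy cheap phone click link",       # spam (hypothetical)
    "click link win free phone",        # spam (hypothetical)
    "battery life is good",             # genuine (hypothetical)
    "screen is good camera is sharp",   # genuine (hypothetical)
]
train_labels = [1, 1, -1, -1]

vocab = build_vocab(train_docs)
X = [vectorize(d, vocab) for d in train_docs]
weights = train_svm(X, train_labels)

train_preds = [predict(d, weights, vocab) for d in train_docs]
accuracy = sum(p == t for p, t in zip(train_preds, train_labels)) / len(train_labels)
```

On this toy set the two classes share no vocabulary, so the data are trivially separable. Real review corpora are not, which is consistent with the abstract's point that feature extraction and the choice of classifier drive the recognition accuracy.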
Source
《数学的实践与认识》
Peking University Core Journal (北大核心)
2016, Issue 7, pp. 144-153 (10 pages)
Mathematics in Practice and Theory
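Funding
National Natural Science Foundation of China project "Research on the Drivers of Real-Time Online Customer Service Quality Considering Text Sentiment" (71471102)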