Abstract
[Objective] This paper addresses the quantification of sentiment polarity in online comments from the perspective of the attribute features of the comment target. [Methods] We decompose online comment text into a three-layer structure (comment target, target attribute, comment description) and extract the attribute word sets and the corresponding comment sets at the attribute level. To account for the differing influence of each attribute, we introduce attribute factors and compute them with a modified TFIDF formula. Combining comment modes and comment context, we propose an attribute-feature-based algorithm for quantitative sentiment analysis of comments and implement it in Python. [Results] Compared with traditional machine learning classifiers (NB, SVM) and with a variant in which the attribute factors are set to equal weights, the proposed algorithm significantly improves the accuracy of sentiment classification of comment text. [Limitations] The comment sets cover a limited choice of domains, and the coefficient settings in the quantification algorithm are partly subjective. [Conclusions] The proposed algorithm effectively addresses the quantification of sentiment polarity and further improves the accuracy of sentiment classification.
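The role of the attribute factors can be illustrated with a minimal Python sketch. It is not the authors' algorithm: the abstract does not give the modified TFIDF formula, so the sketch substitutes a plain smoothed TF-IDF score, and the function names `attribute_factors` and `review_sentiment`, the sample tokens, and the per-attribute sentiment scores are all hypothetical.

```python
import math
from collections import Counter


def attribute_factors(reviews, attributes):
    """Return normalized attribute weights from a plain smoothed TF-IDF score.

    Assumption: this stands in for the paper's modified TFIDF, which the
    abstract does not specify. `reviews` is a list of token lists.
    """
    n_docs = len(reviews)
    term_freq = Counter()   # total occurrences of each attribute word
    doc_freq = Counter()    # number of reviews mentioning each attribute word
    total_tokens = 0
    for tokens in reviews:
        total_tokens += len(tokens)
        counts = Counter(tokens)
        for attr in attributes:
            if counts[attr] > 0:
                term_freq[attr] += counts[attr]
                doc_freq[attr] += 1

    raw = {}
    for attr in attributes:
        tf = term_freq[attr] / total_tokens if total_tokens else 0.0
        idf = math.log((1 + n_docs) / (1 + doc_freq[attr])) + 1.0
        raw[attr] = tf * idf

    norm = sum(raw.values())
    if norm == 0:
        # No attribute word occurs anywhere: fall back to equal weights.
        return {attr: 1.0 / len(attributes) for attr in attributes}
    return {attr: score / norm for attr, score in raw.items()}


def review_sentiment(attr_scores, factors):
    """Weighted average of per-attribute sentiment scores for one review."""
    used = [a for a in attr_scores if a in factors]
    weight = sum(factors[a] for a in used)
    if weight == 0:
        return 0.0
    return sum(factors[a] * attr_scores[a] for a in used) / weight


# Hypothetical tokenized reviews and attribute words.
reviews = [
    ["screen", "bright", "battery", "weak"],
    ["battery", "lasts", "long"],
    ["screen", "sharp"],
]
factors = attribute_factors(reviews, ["screen", "battery", "price"])
score = review_sentiment({"screen": 1.0, "battery": -1.0}, factors)
```

Setting every factor to the same value reproduces the equal-weight baseline mentioned in the results, which is why the weighting step is kept separate from the scoring step here.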
Source
Data Analysis and Knowledge Discovery (《数据分析与知识发现》)
CSSCI
CSCD
2017, Issue 10, pp. 1-11 (11 pages)
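Funding
This work is one of the research outputs of the following projects:
National Natural Science Foundation of China project "Knowledge Base Construction Methods and Research Applications Based on Trusted Semantic Wiki" (Grant No. 71203173)
Fundamental Research Funds for the Central Universities project "Research on Topic-Model-Based Information Services in the Big Data Environment" (Grant No. JB160606)
Young Scientists Fund of the National Natural Science Foundation of China project "Research on Community Detection Algorithms for Large-Scale Dynamic Social Networks" (Grant No. 71401130)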
Keywords
Comment Text
Attribute Factor
Comment Mode
Sentiment Polarity