Abstract
[Purpose/Significance] Opinion extraction is the core task of fine-grained product review mining. Its main challenge is to automatically extract triples, each consisting of an opinion target, a degree word, and an opinion word, from review text. [Method/Process] Because conditional random fields (CRFs) depend on manually constructed linguistic features, this paper proposes a product review opinion extraction method that combines deep learning with CRFs. The method first obtains distributed word vectors by unsupervised training of the continuous bag-of-words (CBOW) model, then uses a bidirectional long short-term memory recurrent neural network (BLSTM RNN) to learn the text features of review sentences automatically, and finally applies a CRF layer to decode and label the sequence, thereby identifying the opinion targets, degree words, and opinion words that make up each triple. [Result/Conclusion] To verify the method, mobile phone and hotel reviews were crawled from Jingdong Mall and other e-commerce platforms, and a portion of them was manually annotated for training and testing. The experimental results show that the method achieves an average F-measure above 80% on the product review opinion extraction task.
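The final step described in the abstract, turning the label sequence produced by the CRF layer into (opinion target, degree word, opinion word) triples, can be illustrated with a minimal Python sketch. The abstract does not state the tagging scheme or the grouping rule, so everything here is an assumption made for illustration: the BIO labels `TGT`, `DEG`, and `OPI`, the function names `bio_to_spans` and `spans_to_triples`, and the heuristic that attaches each opinion word to the most recent target and to a degree word seen since that target.

```python
from typing import List, Optional, Tuple

Span = Tuple[str, str]                            # (label, text)
Triple = Tuple[Optional[str], Optional[str], str]  # (target, degree, opinion)


def bio_to_spans(tokens: List[str], tags: List[str]) -> List[Span]:
    """Collapse a BIO-tagged token sequence into labeled spans.

    Assumed (hypothetical) tag set: B-/I- prefixes over TGT, DEG, OPI, plus O.
    """
    spans: List[Span] = []
    label: Optional[str] = None
    words: List[str] = []
    for token, tag in zip(tokens, tags):
        prefix, _, name = tag.partition("-")
        if prefix == "I" and name == label:
            words.append(token)          # continue the current span
            continue
        if label is not None:
            spans.append((label, " ".join(words)))  # close the previous span
        if prefix in ("B", "I"):         # a stray I- tag starts a new span
            label, words = name, [token]
        else:                            # an O tag: no open span
            label, words = None, []
    if label is not None:
        spans.append((label, " ".join(words)))
    return spans


def spans_to_triples(spans: List[Span]) -> List[Triple]:
    """Group spans into triples with a simple left-to-right heuristic.

    This grouping rule is an illustrative assumption, not the paper's method.
    """
    triples: List[Triple] = []
    target: Optional[str] = None
    degree: Optional[str] = None
    for label, text in spans:
        if label == "TGT":
            target, degree = text, None
        elif label == "DEG":
            degree = text
        elif label == "OPI":
            triples.append((target, degree, text))
            degree = None
    return triples


# Toy example with hand-written tags standing in for the CRF output.
tokens = ["the", "screen", "is", "very", "clear", "but",
          "battery", "life", "is", "short"]
tags = ["O", "B-TGT", "O", "B-DEG", "B-OPI", "O",
        "B-TGT", "I-TGT", "O", "B-OPI"]
triples = spans_to_triples(bio_to_spans(tokens, tags))
print(triples)
```

For Chinese review text the spans would more naturally be joined without spaces; the space-joined form is used here only because the toy example is in English.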
Authors
睢国钦
那日萨
彭振
Sui Guoqin; Zhao Narisa; Peng Zhen (Institute of Systems Engineering, Dalian University of Technology, Dalian 116024)
Source
Journal of Intelligence (《情报杂志》)
CSSCI
Peking University Core Journals (北大核心)
2019, No. 5, pp. 177-185 (9 pages)
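Funding
National Natural Science Foundation of China, General Program: "Research on Intelligent Technologies for Predicting the Group Behavior of Online Consumers Based on Online Reviews" (No. 61471083)
Ministry of Education Humanities and Social Sciences Research Planning Fund: "Mechanism and Prediction of Online Consumer Group Behavior Based on Online Reviews" (No. 14YJA630044)
Dalian Science and Technology Innovation Fund: "Research on Big-Data-Based Intelligent Decision Theory, Methods, and Supporting Technologies in the Construction of Dalian Smart City" (No. 2018J11CY009)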