E-commerce Event Detection with Chinese Character Glyph Based ELMo (original title: 基于中文字形的ELMo在电商事件识别上的应用)
Cited by: 4
Abstract: Mining e-commerce events from review text is of great help in analyzing customer shopping behavior and in commodity scene classification. This paper gives a definition of the e-commerce event, casts e-commerce event detection as a sequence labeling problem, and builds an annotated event detection corpus from e-commerce reviews. The paper first extends a character-based BiLSTM-CRF neural model with Embeddings from Language Models (ELMo) to improve detection performance. It then considers Chinese character glyph features, namely Wubi (five-stroke) and stroke features, and proposes two new models that incorporate the glyph information of events into the pre-trained language model. Experimental results on the newly built dataset show that ELMo with glyph features further improves performance. Finally, the paper trains language models on two large unlabeled corpora, one from the news domain and one from the e-commerce domain; the results show that the e-commerce corpus helps the system more.
Authors: WANG Mingtao (王铭涛), FANG Yewei (方晔玮), CHEN Wenliang (陈文亮) (School of Computer Science and Technology, Soochow University, Suzhou, Jiangsu 215006, China)
Source: Journal of Chinese Information Processing (《中文信息学报》), indexed in CSCD and the Peking University Core Journals list, 2021, No. 12, pp. 94-102 (9 pages)
Funding: National Natural Science Foundation of China (61525205, 61876115)
Keywords: e-commerce event; sequence labeling; glyph features; ELMo
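
Illustration: the abstract frames event detection as character-level sequence labeling. The sketch below shows one common way such labels could be turned back into event spans. It assumes a plain B/I/O tag set, which the abstract does not specify; the function name `extract_events` and the sample review are hypothetical and are not taken from the paper or its dataset.

```python
from typing import List, Tuple


def extract_events(chars: List[str], tags: List[str]) -> List[Tuple[int, int, str]]:
    """Collect (start, end_exclusive, text) spans from per-character B/I/O tags.

    Assumption: "B" opens an event, "I" continues it, "O" is outside any event.
    """
    if len(chars) != len(tags):
        raise ValueError("chars and tags must have the same length")

    spans: List[Tuple[int, int, str]] = []
    start = None  # index where the currently open span began, if any

    for i, tag in enumerate(tags):
        if tag == "B":
            # A new event starts; close the previous one if it is still open.
            if start is not None:
                spans.append((start, i, "".join(chars[start:i])))
            start = i
        elif tag == "I":
            # Tolerate an "I" with no preceding "B" by opening a span here.
            if start is None:
                start = i
        else:
            # Any other tag is treated as "O" and closes an open span.
            if start is not None:
                spans.append((start, i, "".join(chars[start:i])))
                start = None

    # Close a span that runs to the end of the sentence.
    if start is not None:
        spans.append((start, len(tags), "".join(chars[start:])))
    return spans


# Hypothetical review text with a hand-written tag sequence.
sentence = list("买给妈妈过生日")
labels = ["O", "O", "O", "O", "B", "I", "I"]
print(extract_events(sentence, labels))
```

In a tagger of the kind the abstract describes, the tag sequence would come from the CRF layer's decoded output rather than being written by hand as it is here.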