Named Entity Recognition Method in Tourism Domain Based on RoBERTa-WWM
Abstract To address the long, structurally complex, and heavily nested entities found in travel reviews, this paper proposes a tourism entity recognition method based on the RoBERTa-WWM-BiLSTM-CRF model. First, the RoBERTa-WWM (A Robustly Optimized BERT Pre-training Approach with Whole Word Masking) pre-trained language model produces character vectors that carry prior semantic information from travel reviews. Second, a bidirectional long short-term memory network (BiLSTM) derives a bidirectional representation of the text sequence that incorporates contextual information. Finally, a conditional random field (CRF) outputs the optimal label sequence. Experiments on a tourism dataset constructed by the authors show that the RoBERTa-WWM-BiLSTM-CRF model outperforms existing mainstream models, which supports the effectiveness of the method for named entity recognition.
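The last stage of the pipeline described in the abstract, where the CRF selects the optimal label sequence, is commonly implemented with Viterbi decoding. The following is a minimal sketch of that step only, not the authors' implementation. It assumes a hypothetical BIO tag set with a single entity type (`LOC`), hand-picked transition scores in place of learned CRF parameters, and invented per-character emission scores standing in for the BiLSTM outputs.

```python
from typing import Dict, List

# Hypothetical BIO tag set with one entity type; the paper's real tag set is not given here.
TAGS = ["O", "B-LOC", "I-LOC"]

NEG = -1e9  # large negative score that effectively forbids a transition

# Illustrative transition scores TRANS[prev][curr]; a trained CRF would learn these.
TRANS = {
    "O":     {"O": 0.0, "B-LOC": 0.0, "I-LOC": NEG},  # I-LOC may not follow O
    "B-LOC": {"O": 0.0, "B-LOC": 0.0, "I-LOC": 0.5},
    "I-LOC": {"O": 0.0, "B-LOC": 0.0, "I-LOC": 0.5},
}
# Scores for starting a sequence with each tag; a sequence may not start with I-LOC.
START = {"O": 0.0, "B-LOC": 0.0, "I-LOC": NEG}


def viterbi_decode(emissions: List[Dict[str, float]]) -> List[str]:
    """Return the highest-scoring tag sequence for per-character emission scores."""
    if not emissions:
        return []
    # Best score of any path ending in each tag at the first position.
    score = {t: START[t] + emissions[0][t] for t in TAGS}
    backpointers: List[Dict[str, str]] = []
    for emit in emissions[1:]:
        new_score: Dict[str, float] = {}
        pointers: Dict[str, str] = {}
        for curr in TAGS:
            # Pick the previous tag that maximizes path score plus transition score.
            best_prev = max(TAGS, key=lambda p: score[p] + TRANS[p][curr])
            new_score[curr] = score[best_prev] + TRANS[best_prev][curr] + emit[curr]
            pointers[curr] = best_prev
        score = new_score
        backpointers.append(pointers)
    # Follow the back-pointers from the best final tag to recover the path.
    path = [max(TAGS, key=lambda t: score[t])]
    for pointers in reversed(backpointers):
        path.append(pointers[path[-1]])
    path.reverse()
    return path


# Invented emission scores for a four-character input (stand-in for BiLSTM output).
emissions = [
    {"O": 2.0, "B-LOC": 0.0, "I-LOC": 0.0},
    {"O": 0.2, "B-LOC": 1.0, "I-LOC": 1.2},  # per-character argmax would pick I-LOC
    {"O": 0.3, "B-LOC": 0.1, "I-LOC": 1.5},
    {"O": 2.0, "B-LOC": 0.0, "I-LOC": 0.1},
]
print(viterbi_decode(emissions))
```

In this toy input, choosing the best tag for each character independently would label the second character `I-LOC` directly after `O`, which is not a valid BIO sequence. The transition scores steer the decoder to `B-LOC` instead, which illustrates why a CRF layer is typically placed on top of the BiLSTM rather than a per-character classifier.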
Authors 李胜楠 (LI Sheng-nan); 徐春 (XU Chun) (School of Information Management, Xinjiang University of Finance and Economics, Urumqi 830012, China)
Source Computer and Information Technology (《电脑与信息技术》), 2022, No. 6, pp. 34-38 (5 pages)
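Funding Xinjiang Natural Science Foundation (Grant No. 2019D01A23); Xinjiang University of Finance and Economics Research Fund (Grant No. 2022XGC073); Xinjiang Social Science Foundation (Grant No. 18BGL086).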
Keywords named entity recognition; RoBERTa-WWM; bidirectional long short-term memory (BiLSTM); conditional random field (CRF)