
Survey of Chinese Named Entity Recognition (中文命名实体识别综述)

Cited by: 25
Abstract: Chinese named entity recognition (NER) is a subtask of information extraction: given a piece of unstructured text, the goal is to find, identify, and classify the relevant entities in its sentences, such as names of people, places, and organizations. It is a fundamental task in natural language processing (NLP) and plays an important role in many downstream NLP tasks, including information retrieval, relation extraction, and question answering. This paper comprehensively reviews existing neural-network-based Chinese NER models built on word-character lattice structures. First, it explains why Chinese NER is harder than English NER, citing challenges such as the difficulty of determining entity boundaries in Chinese text and the complexity of Chinese grammatical structure. Second, it surveys the most representative lattice-structured Chinese NER models under different neural network architectures: recurrent neural networks (RNN), convolutional neural networks (CNN), graph neural networks (GNN), and Transformers. Because word sequence information gives character-based sequence learning additional boundary information, prior work has proposed integrating word information into the character sequence through a word-character lattice, so that the lexical information associated with each character is exploited explicitly. On the Chinese NER task, these neural word-character lattice models perform significantly better than word-based or character-based approaches. Finally, the paper introduces the datasets and evaluation criteria for Chinese NER.
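The word-character lattice described in the abstract can be illustrated with a minimal sketch. It assumes a toy lexicon and a simple exhaustive dictionary match; the names `build_lattice`, `words_ending_at`, `SENTENCE`, and `LEXICON` are hypothetical and are not taken from the paper or from any surveyed model. The sketch only shows how lexicon words are attached to character positions; the neural encoders (RNN, CNN, GNN, Transformer) that consume such a lattice are out of scope here.

```python
from typing import List, Set, Tuple

# Toy inputs, chosen for illustration only (assumption, not from the paper).
SENTENCE = "南京市长江大桥"
LEXICON: Set[str] = {"南京", "南京市", "市长", "长江", "长江大桥", "大桥"}

Span = Tuple[int, int, str]  # (start index, inclusive end index, word)


def build_lattice(sentence: str, lexicon: Set[str], max_len: int = 4) -> List[Span]:
    """Collect every lexicon word of length 2..max_len found in the sentence."""
    spans: List[Span] = []
    for start in range(len(sentence)):
        for length in range(2, max_len + 1):
            end = start + length  # exclusive end of the candidate word
            if end > len(sentence):
                break
            word = sentence[start:end]
            if word in lexicon:
                spans.append((start, end - 1, word))
    return spans


def words_ending_at(spans: List[Span], n_chars: int) -> List[List[str]]:
    """Group matched words by the character position at which they end."""
    ends: List[List[str]] = [[] for _ in range(n_chars)]
    for _, end, word in spans:
        ends[end].append(word)
    return ends


if __name__ == "__main__":
    lattice = build_lattice(SENTENCE, LEXICON)
    for char, words in zip(SENTENCE, words_ending_at(lattice, len(SENTENCE))):
        print(char, words)
```

In this toy sentence the overlapping matches 市长 ("mayor") and 长江 ("Yangtze River") show why boundaries are ambiguous at the character level: the lattice keeps both candidates available at their character positions and leaves the choice to the model.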
Authors: ZHAO Shan (赵山); LUO Rui (罗睿); CAI Zhiping (蔡志平) (College of Computer, National University of Defense Technology, Changsha 410073, China)
Source: Journal of Frontiers of Computer Science and Technology (《计算机科学与探索》), indexed in CSCD and the Peking University Core Journals list, 2022, No. 2, pp. 296-304 (9 pages)
Funding: National Key Research and Development Program of China (2020YFC2003400).
Keywords: named entity recognition (NER); lattice structure; neural network
