Abstract
Domain-specific entities are sparsely distributed, limited in type, and strongly tied to their domain, which sets them apart from ordinary named entities. Building a neural recognition model for them is difficult because the training corpus is small and labeled entities are sparse. Taking the recognition of military equipment names as an example, we study how part-of-speech, syntactic, and domain-knowledge features can be integrated into a neural network model within a deep learning framework, and what effect each has. The experimental results show that integrating part-of-speech features and domain knowledge raises the F value of military equipment name recognition by 0.97% and 9.5%, respectively. In addition, by running experiments at different corpus sizes and quantitatively analyzing how the different feature types are distributed, we offer a preliminary explanation of why different feature types support the deep learning model to different degrees.
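As an illustration of what integrating part-of-speech and domain-knowledge features into a neural model's input might look like, the sketch below concatenates a word vector with a one-hot part-of-speech vector and a binary domain-lexicon flag for each token. This is only one common way to fuse such features; the abstract does not say which fusion scheme the authors used. The tag set `POS_TAGS`, the lexicon `DOMAIN_LEXICON`, the sample sentence, and all function names are hypothetical, and the pseudo-random `word_vector` merely stands in for a learned embedding.

```python
import random

# Hypothetical toy tag set and lexicon; the paper's actual resources
# are not given in the abstract.
POS_TAGS = ["n", "v", "m", "q", "other"]
DOMAIN_LEXICON = {"J-20", "Liaoning"}


def word_vector(token, dim=4):
    """Deterministic pseudo-embedding standing in for a learned word embedding."""
    rng = random.Random(token)  # seeding with the token makes the vector reproducible
    return [rng.uniform(-1.0, 1.0) for _ in range(dim)]


def pos_one_hot(tag):
    """One-hot vector over POS_TAGS; unknown tags fall back to 'other'."""
    idx = POS_TAGS.index(tag) if tag in POS_TAGS else POS_TAGS.index("other")
    return [1.0 if i == idx else 0.0 for i in range(len(POS_TAGS))]


def lexicon_flag(token):
    """Single binary feature: 1.0 if the token appears in the domain lexicon."""
    return [1.0 if token in DOMAIN_LEXICON else 0.0]


def token_features(token, tag, dim=4):
    """Concatenate word, part-of-speech, and domain-knowledge features."""
    return word_vector(token, dim) + pos_one_hot(tag) + lexicon_flag(token)


# Assumed example sentence as (token, POS tag) pairs.
sentence = [("Liaoning", "n"), ("began", "v"), ("training", "v")]
features = [token_features(tok, tag) for tok, tag in sentence]
```

Each resulting vector would then be fed to the sequence model in place of the plain word embedding, so the network sees the extra evidence at every token position.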
Authors
雷树杰
邢富坤
王闻慧
Lei Shujie; Xing Fukun; Wang Wenhui (Luoyang Campus, Information Engineering University of PLA Strategic Support Forces, Luoyang 471003, Henan, China; School of Foreign Languages, Qingdao University, Qingdao 266000, Shandong, China)
Source
《计算机应用与软件》
Peking University Core Journal (北大核心)
2019, No. 11, pp. 210-217 (8 pages)
Computer Applications and Software