Abstract
Traditional methods for extracting event trigger words depend heavily on natural language processing tools during feature extraction. This dependence requires considerable manual effort and makes the methods prone to error propagation and data sparsity. To address these problems, a method for extracting event trigger words based on the CNN-BiGRU model is proposed. The model takes the concatenation of word vectors and position vectors as input and extracts both word-level features and global sentence features to improve trigger word extraction: a Convolutional Neural Network (CNN) extracts the word-level features, and a Bidirectional Gated Recurrent Unit (BiGRU) captures the contextual semantic information of the text. Experimental results show that the F1 value of the model in event trigger word recognition reaches 74.9% on the ACE2005 English corpus and 79.29% on the Chinese Emergency Corpus (CEC), indicating that the model effectively improves the performance of event trigger word extraction.
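The input step described in the abstract, concatenating word vectors with position vectors, can be illustrated with a minimal pure-Python sketch. The abstract does not say how the position vectors are defined. The sketch assumes a clipped relative distance from each token to a candidate trigger, which is a common choice but is not confirmed by this record. The function names, dimensions, vocabulary, and randomly initialised tables are all hypothetical, and the CNN and BiGRU layers are left out.

```python
import random


def relative_positions(sentence_len, candidate_idx, max_dist):
    """Clipped distance of each token to the candidate trigger, shifted to be >= 0.

    Assumption: position features are relative distances; the record does not say.
    """
    return [
        max(-max_dist, min(max_dist, i - candidate_idx)) + max_dist
        for i in range(sentence_len)
    ]


def make_table(rows, dim, rng):
    """Randomly initialised lookup table standing in for trained embeddings."""
    return [[rng.uniform(-0.1, 0.1) for _ in range(dim)] for _ in range(rows)]


def build_inputs(tokens, candidate_idx, vocab, word_table, pos_table, max_dist):
    """Return one concatenated [word vector ; position vector] row per token."""
    pos_ids = relative_positions(len(tokens), candidate_idx, max_dist)
    rows = []
    for tok, pid in zip(tokens, pos_ids):
        word_vec = word_table[vocab.get(tok, 0)]  # index 0 is the unknown word
        rows.append(word_vec + pos_table[pid])    # list concatenation
    return rows


# Hypothetical toy setup: 6-dim word vectors, 2-dim position vectors.
rng = random.Random(0)
tokens = ["troops", "attacked", "the", "village"]
vocab = {"<unk>": 0, "troops": 1, "attacked": 2, "the": 3, "village": 4}
WORD_DIM, POS_DIM, MAX_DIST = 6, 2, 3
word_table = make_table(len(vocab), WORD_DIM, rng)
pos_table = make_table(2 * MAX_DIST + 1, POS_DIM, rng)

# Treat token 1 ("attacked") as the candidate trigger.
inputs = build_inputs(tokens, 1, vocab, word_table, pos_table, MAX_DIST)
print(len(inputs), len(inputs[0]))  # one 8-dim row per token
```

In a full model, the resulting matrix (one row per token) would be passed to the convolutional and recurrent layers. Trained embeddings would replace the random tables used here.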
Authors
MIAO Jia (苗佳)
DUAN Yuexing (段跃兴)
ZHANG Yueqin (张月琴)
ZHANG Zehua (张泽华)
College of Information and Computer, Taiyuan University of Technology, Jinzhong, Taiyuan 030600, China
Source
Computer Engineering (《计算机工程》)
CAS
CSCD
Peking University Core Journals (北大核心)
2021, Issue 9, pp. 69-74, 83 (7 pages)
Funding
National Natural Science Foundation of China (61503273).
Keywords
event extraction
trigger word detection
event type recognition
Convolutional Neural Network (CNN)
Recurrent Neural Network (RNN)
Bidirectional Gated Recurrent Unit (BiGRU)
feature extraction