Abstract
Unlike the traditional definition of an event, the open-domain definition centers on event trigger words from any domain, together with structured data made up of associated elements such as time, place, participants, and quantity; such events are unpredictable. For open-domain trigger word extraction, a hybrid model that combines rules with binary classification (the R-Two model) is proposed. The rule-based method requires manually constructed rules; it extracts quickly and has strong representational power, but its rules are incomplete and it relies heavily on syntactic analysis. The binary classification method is more laborious to train, but it extracts with high accuracy and is less affected by syntactic analysis. The two methods are therefore fused, and experiments demonstrate the effectiveness of the fused approach.
Authors
苏晓丹
周刚
陈海勇
丁宣宣
SU Xiao-dan, ZHOU Gang, CHEN Hai-yong, DING Xuan-xuan (PLA Information Engineering University, Zhengzhou, Henan 450001, China; State Key Laboratory of Mathematical Engineering and Advanced Computing, Zhengzhou, Henan 450001, China)
Source
《通信技术》
2017, No. 1, pp. 24-29 (6 pages)
Communications Technology
Keywords
open domain
trigger
rule
binary classification