Abstract
[Purpose/significance] Historical Records (《史记》) is China's first history book written in the biographical (纪传体) style. It covers nearly all the major historical events of more than 3,000 years, from the era of the Yellow Emperor to the first year of the Yuanshou period of Emperor Wu of Han. Identifying these events and the intrinsic links among them quickly and accurately is important for looking past historical phenomena, revealing their essence, and discovering historical laws. [Method/process] Building on the BERT model and the LSTM-CRF model, the paper proposes a method for extracting historical events and their constituent elements from Historical Records, and constructs an event graph of Historical Records on that basis. [Result/conclusion] Experimental results show that the proposed method reaches F1 values of 0.823 and 0.760 for extracting historical events and their constituent elements, respectively. The event graph can surface little-known knowledge contained in Historical Records, which provides necessary source material for experts in philology, history, sociology, and related fields.
Authors
Liu Zhongbao (刘忠宝)
Dang Jianfei (党建飞)
Zhang Zhijian (张志剑)
Liu Zhongbao; Dang Jianfei; Zhang Zhijian (Key Laboratory of Cloud Computing and Internet-of-Things Technology (Quanzhou University of Information Engineering), Fujian Province University, Quanzhou 362000; School of Software, North University of China, Taiyuan 030051)
Source
《图书情报工作》
CSSCI
Peking University Core Journal (北大核心)
2020, No. 11, pp. 116-124 (9 pages)
Library and Information Service
Funding
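One of the research outputs of the National Social Science Fund of China General Project "Research on Cross-Media Knowledge Services for Library Resources in the Big Data Environment" (大数据环境下面向图书馆资源的跨媒体知识服务研究; Project No. 19BTQ012).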
Keywords
Historical Records (《史记》)
extraction of historical events (历史事件抽取)
event graph (事理图谱)
bidirectional encoder representations from transformers (BERT) (BERT模型)
bidirectional long short-term memory (BiLSTM) (双向长短期记忆网络)
conditional random field (CRF) (条件随机场)