Abstract
The entity and entity-relation classification schemes of existing medical corpora struggle to meet the needs of precision medicine. To address this problem, this paper focuses on pediatric diseases. Under the guidance of medical domain experts, it defines an annotation scheme and detailed annotation guidelines for named entities and entity relations suited to pediatrics. Drawing on relevant Chinese and international medical standard resources, annotation tools were used to carry out machine pre-annotation, manual annotation, and manual proofreading of entities and entity relations in more than 2.98 million Chinese characters of pediatric medical text, producing a corpus of medical entities and relations for pediatric diseases. The corpus covers 504 common pediatric diseases and contains 23,603 annotated named entities and 36,513 entity relations; multi-round annotation consistency is 0.85 for entities and 0.82 for relations. Based on this corpus, the paper also constructs a pediatric medical knowledge graph and develops a knowledge-graph-based pediatric medical question-answering system.
Authors
昝红英
刘涛
牛常勇
赵悦淑
张坤丽
穗志方
ZAN Hongying; LIU Tao; NIU Changyong; ZHAO Yueshu; ZHANG Kunli; SUI Zhifang (School of Information Engineering, Zhengzhou University, Zhengzhou, Henan 450001, China; The Peng Cheng Laboratory, Shenzhen, Guangdong 518052, China; The Third Affiliated Hospital of Zhengzhou University, Zhengzhou, Henan 450001, China; Key Laboratory of Computational Linguistics, Ministry of Education, Peking University, Beijing 100871, China)
Source
《中文信息学报》
CSCD
Peking University Core Journals
2020, Issue 5, pp. 19-26 (8 pages)
Journal of Chinese Information Processing
Funding
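National Social Science Fund of China (18ZDA315)
Key Scientific Research Project of Higher Education Institutions of Henan Province (20A520038)
Henan Province Science and Technology Research Project (192102210260)
International Cooperation Project of the Henan Province Science and Technology Research Program (172102410065)
Province-Ministry Co-construction Project of the Henan Province Medical Science and Technology Research Program (SB201901021)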
Keywords
pediatric diseases
corpus construction
named entity
entity relation
knowledge graph