Abstract
Pre-Qin Chinese occupies an important place in the study of the history of the Chinese language. However, previous research has never produced a structured lexical resource for Pre-Qin vocabulary, which makes it difficult to meet the needs of ancient Chinese information processing and cross-language comparison. Internationally, wordnets for dozens of languages have been built on the synset architecture of the English WordNet and have become a foundational resource for multilingual natural language processing and cross-language comparison. This paper surveys the construction of wordnets in China and abroad, with particular attention to wordnets for ancient languages and for Chinese. It then describes in detail the construction and checking process of the WordNet for Pre-Qin ancient Chinese (PQAC-WN), which covers 43,591 words, 61,227 senses, and 17,975 synsets. Through a cross-language comparison with the ancient Sanskrit WordNet, the paper attempts to analyze the lexical similarities and differences between the two ancient languages, thereby providing a preliminary verification of the value of PQAC-WN.
Authors
LU Xuehui
XU Huidan
LI Bin
CHEN Siyu
(School of Chinese Language and Literature, Nanjing Normal University, Nanjing, Jiangsu 210024, China; School of International Chinese Language Education, Beijing Normal University, Beijing 100091, China)
Source
Journal of Chinese Information Processing (《中文信息学报》)
CSCD
Peking University Core Journals
2023, Issue 3, pp. 36-45 (10 pages)
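Funding
State Language Commission Project (YB145-41)
Key Project on Ancient Books Work (22GJK006)
National Social Science Fund of China (21&ZD331, 22&ZD262)
Jiangsu Provincial Social Science Fund (20JYB004)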
Keywords
WordNet
Pre-Qin ancient Chinese
cross-language comparison
ancient Chinese information processing