Abstract
This paper examines semantic analysis at the three compositional levels of natural language: words, sentences, and texts. For each level it reviews what semantic analysis means, the existing research strategies, their theoretical basis, and the main methods, and it compares the two principal research strategies now in use. Word-level semantic analysis means determining the meaning of a word and measuring the semantic similarity or relatedness between two words. Sentence-level semantic analysis covers both sentence meaning analysis and sentence similarity analysis. Text-level semantic analysis is the process of identifying semantic information such as a text's meaning, topic, and category. Current research follows two main strategies: semantic analysis based on knowledge or semantic rules, and semantic analysis based on statistics. The paper argues that methods combining statistics with rules will become the mainstream approach to natural language semantic analysis, and that ontological semantics is an important foundation for it.
Source
《图书情报工作》
CSSCI
Peking University Core Journal
2014, Issue 22, pp. 130-137 (8 pages)
Library and Information Service
Funding
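National Natural Science Foundation of China project "Research on Peer-to-Peer Semantic Communities and Their Knowledge Sharing Based on Knowledge Maps" (Grant No. 71103138)
Fundamental Research Funds for the Central Universities project "Research on Business Intelligence Models Based on User-Generated Content in the Big Data Context" (Project No. 7214484902); one of the research outputs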
Keywords
semantic analysis
semantic analysis theory
corpus
knowledge base