Abstract
In geographic space, a location can be described formally, by geographic coordinates, or informally, by a toponym in natural-language text. A single toponym in text may refer to many different locations, which gives rise to toponym ambiguity. Toponym disambiguation resolves this divergence in what a name refers to and assigns a unique geographic location to each ambiguous toponym. Starting from toponym knowledge, this paper first proposes a unified representation of toponym knowledge based on a toponym ontology, and describes the sources of toponym knowledge and the workflow for building a toponym knowledge base. It then presents the principle and algorithm flow of Chinese toponym disambiguation, which computes the geographic correlation between toponym entities along four dimensions: semantic relation, topological relation, distance relation, and toponym density. The semantic relation refers mainly to the type of relationship between toponym concepts. Topological relations mainly include equal, contain, intersect, adjacent, and disjoint. The distance relation is the quantitative distance between toponym entities on the map. Toponym density is the density of toponym entities within a region of the toponym knowledge base. Finally, the method is validated and evaluated experimentally; the results show that it achieves high precision, recall, coverage, and F-value.
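The abstract names four dimensions of geographic correlation but gives no formulas, so the following is only a minimal sketch of how such a score might be combined, not the paper's algorithm. Everything in it is an assumption made for illustration: the `Candidate` fields, the equal weights, the exponential distance decay with a 500 km scale, the use of a shared administrative-path prefix as a stand-in for topological relations, type equality as a stand-in for the semantic relation, and the made-up example entities.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    """Hypothetical gazetteer entry; field names are illustrative assumptions."""
    name: str
    lat: float
    lon: float
    admin_path: tuple  # administrative hierarchy, outermost first
    kind: str          # toponym type, e.g. "city" or "county"
    density: float     # assumed regional toponym density, scaled to [0, 1]


def haversine_km(a, b):
    """Great-circle distance in kilometres between two entries."""
    p1, p2 = math.radians(a.lat), math.radians(b.lat)
    dphi = p2 - p1
    dlmb = math.radians(b.lon - a.lon)
    h = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * 6371.0 * math.asin(math.sqrt(h))


def semantic_score(a, b):
    """Assumed proxy: 1 if both toponyms have the same type, else 0."""
    return 1.0 if a.kind == b.kind else 0.0


def topological_score(a, b):
    """Assumed proxy: fraction of the administrative path the two share."""
    shared = 0
    for x, y in zip(a.admin_path, b.admin_path):
        if x != y:
            break
        shared += 1
    longest = max(len(a.admin_path), len(b.admin_path))
    return shared / longest if longest else 0.0


def distance_score(a, b, scale_km=500.0):
    """Assumed exponential decay: nearer entities score closer to 1."""
    return math.exp(-haversine_km(a, b) / scale_km)


def correlation(a, b, weights=(0.25, 0.25, 0.25, 0.25)):
    """Weighted sum of the four dimensions; equal weights are an assumption."""
    ws, wt, wd, wn = weights
    return (ws * semantic_score(a, b) + wt * topological_score(a, b)
            + wd * distance_score(a, b) + wn * a.density)


def disambiguate(candidates, context):
    """Pick the candidate with the highest total correlation to the context."""
    return max(candidates, key=lambda c: sum(correlation(c, ctx) for ctx in context))


# Made-up data: one ambiguous name with two candidate locations.
candidates = [
    Candidate("Example Town (Province A)", 30.0, 104.0, ("Country", "Province A"), "county", 0.3),
    Candidate("Example Town (Province B)", 45.0, 125.0, ("Country", "Province B"), "county", 0.3),
]
context = [
    Candidate("City X", 30.5, 104.5, ("Country", "Province A", "City X"), "city", 0.5),
]
best = disambiguate(candidates, context)
print(best.name)
```

In this sketch an unambiguous toponym from the same text serves as context, and the candidate that is closer to it and shares more of its administrative path wins. A real system would take the relation definitions and weights from the toponym knowledge base described in the paper.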
Source
《地理与地理信息科学》
CSCD
PKU Core Journal (北大核心)
2016, No. 4, pp. 5-10 (6 pages)
Geography and Geo-Information Science
Funding
National Natural Science Foundation of China (40871183, 41140012, 41271392, 41401463, 41571394)
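Open Fund of the Sichuan Engineering Technology Research Center for Emergency Surveying and Mapping and Disaster Prevention and Mitigation (K2014B016, K2015B014)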
Keywords
toponym
knowledge
toponym ontology
toponym disambiguation
correlation