Abstract
Existing algorithms for matching data elements to data items rely mainly on literal (string) similarity. Such algorithms depend heavily on data item names following a standardized structure, and because the data items in most business databases carry no Chinese name, they often fail to produce a match. This paper proposes a data element and data item matching algorithm designed around several aspects of a data item: the name of the entity it belongs to, the item name, type, length, and data characteristics. The algorithm is broadly applicable and achieves effective matching even when data item names are non-standard or have no Chinese name.
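The abstract does not give the scoring details, so the following is only a minimal sketch of how a multi-aspect match could be combined, not the paper's algorithm. The `Field` record, the weights in `WEIGHTS`, the use of `difflib.SequenceMatcher` for name similarity, and Jaccard overlap on feature words are all assumptions made for illustration.

```python
from dataclasses import dataclass
from difflib import SequenceMatcher


@dataclass(frozen=True)
class Field:
    """Hypothetical record describing either a data element or a data item."""
    entity: str                      # name of the owning entity (e.g. table)
    name: str                        # element / item name
    dtype: str                       # declared data type
    length: int                      # declared length
    feature_words: frozenset = frozenset()  # assumed stand-in for "data characteristics"


# Assumed weights (not from the paper); they sum to 1.0.
WEIGHTS = {"entity": 0.15, "name": 0.25, "dtype": 0.15, "length": 0.15, "features": 0.30}


def text_sim(a: str, b: str) -> float:
    """Literal similarity in [0, 1], case-insensitive."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def jaccard(a: frozenset, b: frozenset) -> float:
    """Set overlap in [0, 1]; defined as 0.0 when both sets are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def match_score(element: Field, item: Field) -> float:
    """Weighted sum of per-aspect similarities between an element and an item."""
    scores = {
        "entity": text_sim(element.entity, item.entity),
        "name": text_sim(element.name, item.name),
        "dtype": 1.0 if element.dtype == item.dtype else 0.0,
        "length": 1.0 if element.length == item.length else 0.0,
        "features": jaccard(element.feature_words, item.feature_words),
    }
    return sum(WEIGHTS[k] * scores[k] for k in WEIGHTS)


def best_match(item: Field, elements: list) -> Field:
    """Return the candidate data element with the highest score for this item."""
    return max(elements, key=lambda e: match_score(e, item))


# Illustrative data: the item name "sfzh" is an abbreviation with no Chinese name.
id_element = Field("person", "citizen_id_number", "string", 18, frozenset({"id", "digits"}))
name_element = Field("person", "full_name", "string", 50, frozenset({"name"}))
item = Field("t_person", "sfzh", "string", 18, frozenset({"id", "digits"}))
chosen = best_match(item, [id_element, name_element])
```

In this sketch the item name contributes almost nothing, yet type, length, and feature words still separate the two candidates, which is the situation the abstract describes for non-standard names.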
Author
LI Min (李敏) (Public Safety Information Technology Department, China Electronics Technology Company, Beijing 100083, China)
Source
Computer Knowledge and Technology (《电脑知识与技术》)
2016, No. 1, pp. 5-6 (2 pages)
Keywords
data element
data item
matching
feature words