Abstract
The core problem that an intelligent enterprise information retrieval system must solve is improving the standardization and completeness of the search keyword set, which can be achieved through text similarity computation and related classification algorithms. This paper gives a definition of the similarity of linear sequences, discusses the properties of the match matrix, presents an algorithm for computing linear-sequence similarity, and describes an optimization of that algorithm.
Source
《科学技术与工程》
2011, No. 15, pp. 3571-3575, 3584 (6 pages)
Science Technology and Engineering
Keywords
information retrieval
text similarity algorithm
global optimization
state space