Abstract
An analysis of query phrase structure shows that a query phrase is generally composed of a keyword and a feature word. The feature word summarizes the content of a Web page and implies that the page contains a specific set of feature terms. Based on this idea, a feature base for Web page content is built. Taking the meta search engine as the object of study, the paper examines how to achieve semantic comprehension of query phrases on the basis of this feature base and proposes a relevance-level algorithm. Query tests on the feature words already collected in the base yield a precision of 86.7%. The experiments indicate that the model largely achieves comprehension of query phrases and has a significant effect on improving search engine precision.
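The idea of splitting a query into a keyword and a feature word, then grading pages by how many of the feature word's terms they contain, might be sketched as follows. This is only an illustration under stated assumptions: the abstract does not give the paper's actual feature base or relevance-level algorithm, so the `FEATURE_BASE` contents, the function names `split_query` and `relevance_level`, and the level thresholds are all hypothetical.

```python
# Hypothetical feature base: feature word -> feature terms that a matching
# page is expected to contain. The entries are invented for illustration.
FEATURE_BASE = {
    "recipe": {"ingredients", "minutes", "serves", "preheat"},
    "lyrics": {"chorus", "verse", "album", "song"},
}


def split_query(query):
    """Split a query phrase into (keywords, feature_word).

    The first word found in the feature base is treated as the feature
    word; all remaining words are treated as keywords.
    """
    words = query.lower().split()
    feature = next((w for w in words if w in FEATURE_BASE), None)
    keywords = [w for w in words if w != feature]
    return keywords, feature


def relevance_level(page_text, keywords, feature):
    """Grade a page from 0 (irrelevant) to 3 (highly relevant).

    The thresholds are assumptions, not the paper's definition.
    """
    tokens = set(page_text.lower().split())
    # A page missing any keyword is treated as irrelevant.
    if not all(k in tokens for k in keywords):
        return 0
    # Without a feature word, only keyword matching is possible.
    if feature is None:
        return 1
    terms = FEATURE_BASE[feature]
    coverage = len(terms & tokens) / len(terms)
    if coverage >= 0.5:
        return 3
    if coverage > 0:
        return 2
    return 1


keywords, feature = split_query("chicken recipe")
page = "Chicken soup ingredients preheat oven serves four"
print(relevance_level(page, keywords, feature))
```

In a meta search engine, a grade of this kind could be used to re-rank the merged results returned by the underlying engines; how the paper does this is not stated in the abstract.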
Source
《计算机工程》
EI
CAS
CSCD
PKU Core Journals (北大核心)
2006, Issue 7, pp. 210-211 (2 pages)
Computer Engineering
Keywords
Semantic comprehension
Web page feature base
Meta search engine