Abstract
Building on a study of traditional feature-code-based duplicate detection algorithms, this paper addresses duplicate Web pages in meta search engine results by proposing a duplicate detection method based on the keywords of the user's query, with the aim of improving the retrieval quality of meta search engines. The implementation of the algorithm is described, and its effectiveness is verified experimentally.
Source
New Technology of Library and Information Service (《现代图书情报技术》)
CSSCI
Peking University Core Journal (北大核心)
2008, No. 7, pp. 43-46 (4 pages)
Keywords
Duplicate detection
Meta search
Feature code
Chinese word segmentation