期刊文献+

基于元搜索的网页消重方法研究 被引量:5

Study on the Duplicated Web Pages Detection Algorithm with Meta Search Engine
下载PDF
导出
摘要 本文在对现有主流网页消重技术进行分析基础上,针对元搜索引擎技术,提出一种基于元搜索的网页消重算法。介绍了算法的具体实现步骤,并且通过实验验证了算法的有效性。
作者 谢蕙 秦杰
出处 《计算机系统应用》 2008年第8期94-96,共3页 Computer Systems & Applications
  • 相关文献

参考文献6

二级参考文献9

  • 1谢立,王永强,于德敏,许增朴.利用图像的灰度特征实现半透明产品的识别[J].微计算机信息,2005,21(07X):44-45. 被引量:10
  • 2[1]T.W. Yan and H. Garcia- Molina. Duplicate removal in information dissemination. In Proceedings of the 21st International Conference on Very Large Data Bases(VLDB' 95) ,66 - 77,San Francisco,Ca., USA,September 1995. Morgan Kaufmann Publishers, Inc.
  • 3[2]Narayanan Shivakumar and Hector Garcia- Molina. SCAM: a copy detection mechanism for digital documents. In Proceedings of 2nd International Conference in Theory and Practice of Digital Libraries (DL'95) ,Austin, Texas,June 1995.
  • 4[3]T. Yan and H. Garcia- Molina. The sift information dissemination system. In ACM TODS,2000.
  • 5[4]J.W. Kirriemuir & P. Willett Identification of duplicate and near - duplicate full - text records in database search outputs using hierarchic cluster analysis,in Program-automated library and information,(1995)29(3) :241-256.
  • 6[5]Buckley C. ,Cardie C. ,Mardis S. ,Mitra M. ,Pierce D. ,Wagstaff K. ,Walz J. ,The Smart/Empire TIPSTER IR System, TIPSTER Phase Ⅲ Proceedings,Morgan Kaufmann,San Francisco,CA,2000.
  • 7Finding near-replicas of documents on the web. Narayanan Shivakumar, et al. WebDB 1998
  • 8Finding replicated web collections. Junghoo Cho, N. Shivakumar et al. In Proceedings of 2000 ACM International Conference on Management of Data (SIGMOD), May 2000.
  • 9牛伟霞,张永奎.潜在语义索引方法在信息过滤中的应用[J].计算机工程与应用,2001,37(9):57-60. 被引量:16

共引文献47

同被引文献29

引证文献5

二级引证文献16

相关作者

内容加载中请稍等...

相关机构

内容加载中请稍等...

相关主题

内容加载中请稍等...

浏览历史

内容加载中请稍等...
;
使用帮助 返回顶部