
Analysis and Design of URL Indexing in Distributed Information Retrieval System

Cited by: 2
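Abstract The efficiency of URL storage and retrieval is key to building a large-scale distributed information gathering system, because it determines how efficiently the system collects Web documents. A quantitative analysis of URL storage and retrieval performance yields the speed targets that URL storage and URL retrieval must each meet. On this basis, two URL storage and retrieval prototypes are proposed: storage and retrieval on a centralized URL server, and distributed URL storage and retrieval. The two prototype systems are compared in terms of retrieval speed, cost-performance ratio, scalability, and reliability. In practice, the choice between them can be made according to the optimization goal.
With the scale of the World Wide Web increasing exponentially, the efficiency of URL storage and indexing is key to improving the performance of a distributed crawler system. Based on a quantitative analysis of the performance metrics for URL indexing and storage, this paper presents two URL storage and indexing architectures for a distributed crawler system: centralized URL server storage and indexing, and distributed URL storage and indexing. The advantages and disadvantages of each are discussed. The distributed URL system was implemented in the authors' distributed crawler system, where it works efficiently.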
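The contrast between the two prototypes can be illustrated with a minimal sketch. The abstract does not say how URLs are assigned to nodes in the distributed prototype, so hashing the host name is an assumption made here for illustration only. The class and method names (`CentralUrlServer`, `DistributedUrlStore`, `add_if_new`, `node_for`) are hypothetical and are not taken from the paper.

```python
import hashlib
from urllib.parse import urlsplit


class CentralUrlServer:
    """Centralized prototype: one server holds every URL seen so far."""

    def __init__(self):
        self._seen = set()

    def add_if_new(self, url: str) -> bool:
        """Store the URL; return True only if it was not stored before."""
        if url in self._seen:
            return False
        self._seen.add(url)
        return True


class DistributedUrlStore:
    """Distributed prototype: each node stores only the URLs routed to it."""

    def __init__(self, node_count: int):
        # One in-memory set per node stands in for a real crawler node.
        self._nodes = [set() for _ in range(node_count)]

    def node_for(self, url: str) -> int:
        # Assumption (not from the paper): route by a hash of the host name,
        # so that all URLs of one site are checked on the same node.
        host = urlsplit(url).netloc.lower()
        digest = hashlib.sha256(host.encode("utf-8")).digest()
        return int.from_bytes(digest[:4], "big") % len(self._nodes)

    def add_if_new(self, url: str) -> bool:
        """Route the URL to its node and store it there if it is new."""
        node = self._nodes[self.node_for(url)]
        if url in node:
            return False
        node.add(url)
        return True
```

In this sketch both stores answer the same question (has this URL been stored already?). The centralized version keeps every URL in one place, while the distributed version spreads storage and lookups across nodes, which is the trade-off the abstract evaluates in terms of speed, cost, scalability, and reliability.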
Source Journal of Shanghai Jiaotong University (indexed in EI, CAS, CSCD, Peking University Core), 2003, No. 3, pp. 454-457 (4 pages)
Funding Key Basic Research Project of the Shanghai Municipal Science and Technology Commission (02DJ14045)
Keywords distributed system; Web crawler; URL storage and index

References (7)

  • 1. Heydon A, Najork M. Mercator: a scalable, extensible Web crawler [EB/OL]. http://research.compaq.com/SRC/mercator/papers/www/paper.html, 1999-06-10.
  • 2. Darren R Hardy. Harvest user's manual [R]. Boulder: University of Colorado, 1995.
  • 3. Sergey Brin, Lawrence Page. The anatomy of a large-scale hypertextual Web search engine [J]. Computer Networks and ISDN Systems, 1998, 30: 107-117.
  • 4. WANG Ji-cheng, JIN Xiang-yu. Distributed and cooperative information retrieval on the World Wide Web [J]. Journal of Computer Science & Technology, 2000, 15(6): 611-618.
  • 5. Mark A, Overmeer C J. My personal search engine [J]. Computer Networks, 1999, 31: 2271-2279.
  • 6. Kansas City Public Library. Introduction to search engines [EB/OL]. http://www.kcpl.lib.mo.us/search/srchengines.htm, 2001-06-05.
  • 7. Jupitermedia Corporation. Search engine size [EB/OL]. http://searchenginewatch.com/reports/sizes.html, 2001-07-01.

