Abstract
Crawler algorithms are an active research topic in the search engine field. This paper analyzes the main existing approaches to crawler algorithm design and program implementation, weighs their advantages and disadvantages, and derives a crawler algorithm suited to downloading web pages from small and medium-sized websites. The algorithm is implemented with JBuilder 8.0. Experiments show that the program downloads 188 to 242 web pages per minute, at 41.92 to 74.59 KB per second.
Source
《计算机应用》
CSCD
Peking University Core Journals
2004, Issue 1, pp. 33-35 (3 pages)
Journal of Computer Applications
Keywords
crawler algorithm
crawler program
search engine