Abstract
Given the sheer volume of information on the web and the efficiency demands placed on topic-focused web crawlers, this paper applies PostgreSQL database clustering to a topic-focused web crawler to meet the crawler's need for large-scale storage, and uses caching to overcome the efficiency bottleneck that clustering introduces in crawler applications.
Source
《计算机系统应用》
2010, Issue 12, pp. 160-163 (4 pages)
Computer Systems & Applications