Abstract
A general-purpose crawler was developed on top of SES. It crawls and analyzes enterprise databases, portal web pages, document files, office system content, and similar sources, extracts the information that enterprise users care about, indexes the crawled data, and stores it in an index repository. It also provides an incremental crawling mechanism. The system has a friendly interface and is accurate and efficient.
Source
《哈尔滨商业大学学报(自然科学版)》
CAS
2011, No. 4, pp. 605-608 (4 pages)
Journal of Harbin University of Commerce: Natural Sciences Edition
Funding
National High Technology Research and Development Program of China (2006AA09A102-15)
National Science and Technology Major Project (2008ZX05023-05-05)