Abstract
This paper presents an information auditing system for the HTTP protocol. To suit the characteristics of large enterprise intranets, the system combines several capture techniques to collect traffic data and reassembles the HTTP content, which accounts for a large share of that traffic, for auditing. The open-source search engine Sphinx is introduced to process the large volume of page snapshots and make keyword matching more efficient. The open-source tools Memcached and Starling are used to build a high-speed network cache, which reduces coupling between modules and gives the system good load capacity and scalability.
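The decoupling described above can be illustrated with a minimal sketch. It is not the system's implementation: an in-process `queue.Queue` stands in for the Starling queue, a toy inverted index stands in for Sphinx, and the names `capture_module`, `audit_module`, and `build_index` are hypothetical, chosen only for this example.

```python
import queue
import re
from collections import defaultdict

# Stand-in for the Starling message queue (in-process, for illustration only).
snapshot_queue = queue.Queue()


def capture_module(pages):
    """Producer: push reassembled HTTP page snapshots onto the queue."""
    for url, body in pages:
        snapshot_queue.put({"url": url, "body": body})


def build_index(snapshots):
    """Toy inverted index (stand-in for Sphinx): word -> set of URLs."""
    index = defaultdict(set)
    for snap in snapshots:
        for word in re.findall(r"\w+", snap["body"].lower()):
            index[word].add(snap["url"])
    return index


def audit_module(keywords):
    """Consumer: drain the queue, index the snapshots, report keyword hits."""
    snapshots = []
    while not snapshot_queue.empty():
        snapshots.append(snapshot_queue.get())
    index = build_index(snapshots)
    # Look each keyword up in the index instead of rescanning every page.
    return {kw: sorted(index.get(kw.lower(), set())) for kw in keywords}
```

Because the capture side and the audit side share only the queue, either one could be replaced or scaled out without changing the other, which is the sense in which the queue lowers coupling between modules.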