Abstract
A log data management system is one of the key pieces of infrastructure for cloud services. Missing important log data makes the resulting analysis and decisions one-sided and inaccurate. However, the stronger the log capturing capability, the higher its runtime overhead and the more time-consuming the management and analysis of massive log data become, which has a non-negligible impact on the performance of the whole cloud service environment. To capture the necessary log data while keeping runtime overhead as low as possible, this paper first proposes the concept of a log data capturing grain level, and then designs and implements a grain-level self-configuring log data capturing platform for cloud computing. The platform consists of a log data capturing tool; a knowledge base that stores grain-level log capturing rules and facts; a rule-based inference engine that dynamically adds or removes the relevant log data capturing modules; and graphical interfaces, namely a management interface for adding or modifying knowledge base rules and a user interface for viewing log data. Finally, a preliminary case study demonstrates the effectiveness of the platform.
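The interaction between the knowledge base and the inference engine can be illustrated with a minimal sketch. This is not the paper's implementation: the rule format, the fact name `cpu_util`, the thresholds, and the module names `cpu`, `load`, and `io` are all hypothetical and chosen only to show how rules evaluated against facts might switch capturing modules on or off.

```python
from dataclasses import dataclass
from typing import Callable, Dict, FrozenSet, List, Set

Facts = Dict[str, float]  # hypothetical fact store: metric name -> latest value


@dataclass(frozen=True)
class Rule:
    """One knowledge-base rule (hypothetical format)."""
    name: str
    condition: Callable[[Facts], bool]      # when the rule fires
    enable: FrozenSet[str] = frozenset()    # modules to switch on
    disable: FrozenSet[str] = frozenset()   # modules to switch off


def infer_active_modules(rules: List[Rule], facts: Facts, active: Set[str]) -> Set[str]:
    """Apply every rule whose condition holds and return the new module set."""
    result = set(active)
    for rule in rules:
        if rule.condition(facts):
            result |= rule.enable
            result -= rule.disable
    return result


# Assumed example rules: a finer grain under high load, a coarser grain when idle.
RULES = [
    Rule("high_cpu_fine_grain",
         lambda f: f.get("cpu_util", 0.0) > 0.8,
         enable=frozenset({"cpu", "load"})),
    Rule("idle_coarse_grain",
         lambda f: f.get("cpu_util", 0.0) < 0.2,
         disable=frozenset({"load", "io"})),
]

busy = infer_active_modules(RULES, {"cpu_util": 0.9}, {"cpu"})
idle = infer_active_modules(RULES, {"cpu_util": 0.1}, {"cpu", "load", "io"})
```

In this sketch a grain level is simply the set of active modules; a real platform would presumably also persist the rules and apply the result to the capturing tool's configuration.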
Source
Journal of Integration Technology (《集成技术》)
2014, No. 3, pp. 22-31 (10 pages)
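Funding
National Natural Science Foundation of China (61103001, 61170077, 61272445, and 61202377)
2012 Guangdong Province Undergraduate Innovation Experiment Project (0000175782)
Shenzhen Science and Technology Program (JCYJ20120613102030248)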
Keywords
log data capturing
inference engine
knowledge base
cloud computing
Tsar, the open source system of Taobao