Abstract
To detect attacks and malicious behavior in real time with a low false alarm rate and a high detection rate, a distributed data stream clustering method, DSCLS, is proposed based on the Spark framework and the locality-sensitive hashing (LSH) algorithm. The method handles real-time data streams and scales out horizontally according to the data flow rate. A network intrusion detection prototype system is built on the DSCLS distributed clustering algorithm. The system analyzes data streams in real time at high speed, clusters related patterns, detects known attacks and intrusions in real time, and can also detect previously unknown attacks. Theoretical analysis and experimental results show that, compared with the mainstream data stream clustering algorithm D-Stream, DSCLS effectively improves the detection rate and reduces the false alarm rate, and it has advantages in time performance and scalability.
Source
Computer Engineering and Design (《计算机工程与设计》)
Peking University Core Journal (北大核心)
2015, Issue 7, pp. 1720-1726 (7 pages)
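Funding
National Basic Research Program of China (973 Program) (2007CB310803)
National Natural Science Foundation of China, Key Program (61035004)
National Natural Science Foundation of China (60875029)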
Keywords
intrusion detection systems
data stream
clustering algorithm
locality-sensitive hashing
DSCLS algorithm