Abstract
The rapid development of network information technology has given rise to a new data model, the data stream model, and a growing number of fields now require real-time processing of data streams. Both the large volume and high speed of the data and the real-time requirements of application scenarios have driven the development of data stream mining technology. This paper first introduces the common data stream models, then summarizes the supporting techniques of data stream mining according to the characteristics of those models. Finally, it analyzes the importance and effectiveness of distributed data stream mining, presents a mathematical model of algorithm parallelization, and introduces several representative distributed data stream processing systems.
Source
Microcomputer & Its Applications (《微型机与应用》)
2016, No. 21, pp. 8-10, 13 (4 pages in total)
Keywords
data stream model
data stream mining
distributed
parallelization
data stream processing system