Abstract
With advances in communication technology and hardware, particularly the widespread use of small wireless sensor devices, data collection and generation have become more convenient and increasingly automated, and organizations and researchers face the growing problem of how to manage and analyze large dynamic datasets. Applications that produce data streams are now commonplace, including sensor networks, financial and securities data management, network monitoring, Web log analysis, and online analysis of communication data. In many of these applications the environment is equipped with multiple distributed computing nodes, which are often located near the data sources. Analyzing and monitoring data in such environments requires mining techniques that take into account the mining task, the distribution of the data, and the data arrival rate. This paper surveys current progress in distributed data stream mining and identifies potential directions for future research.
Source
《计算机科学》
CSCD
Peking University Core Journal (北大核心)
2012, No. 1, pp. 1-8, 36 (9 pages in total)
Computer Science
Funding
Supported by the National Natural Science Foundation of China (Grant No. 60875029)
Keywords
Distributed data stream mining
Data stream mining
Data stream