Abstract
Many existing studies of uncertain data stream clustering suffer from a mismatch between the clustering model and the data model of the stream. They also usually assume that the probability density function, probability distribution function, or probabilities of the uncertain data are known, although this information is hard to obtain in real systems. To address these problems, this paper proposes UIDMicro (Uncertain Interval Data Micro), a multi-dimensional uncertain data stream clustering algorithm based on interval data. The algorithm first represents the multi-dimensional uncertain data stream using interval data combined with statistical information about the uncertain data. It then clusters the stream with two levels of cluster windows, the "current cluster" and the "candidate cluster", and dynamically adjusts the two levels so that the clustering model matches the data model in real time. Experimental results show that the proposed algorithm achieves high clustering accuracy and processing efficiency.
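The two-level idea described in the abstract can be illustrated with a minimal Python sketch. It is not the published UIDMicro procedure, whose details the abstract does not give. The sketch assumes that each uncertain attribute is summarised by a mean and a standard deviation and turned into the interval [mean - k*std, mean + k*std]. It also assumes that a point joins the nearest cluster within a fixed `radius`, and that a candidate cluster is promoted to a current cluster once it has absorbed `promote_at` points. The names `Interval`, `to_intervals`, `MicroCluster`, `TwoLevelClusterer`, `radius`, `promote_at` and `k` are hypothetical.

```python
from __future__ import annotations

import math
from dataclasses import dataclass


@dataclass
class Interval:
    lo: float
    hi: float


def to_intervals(means, stds, k=1.0):
    """Assumed representation: one interval [mean - k*std, mean + k*std] per dimension."""
    return [Interval(m - k * s, m + k * s) for m, s in zip(means, stds)]


def distance(a, b):
    """Euclidean distance over the lower and upper bounds (an assumed metric)."""
    return math.sqrt(sum((x.lo - y.lo) ** 2 + (x.hi - y.hi) ** 2 for x, y in zip(a, b)))


@dataclass
class MicroCluster:
    centre: list[Interval]
    count: int = 1

    def absorb(self, point):
        # Running average of the interval bounds in every dimension.
        n = self.count
        self.centre = [
            Interval((c.lo * n + p.lo) / (n + 1), (c.hi * n + p.hi) / (n + 1))
            for c, p in zip(self.centre, point)
        ]
        self.count = n + 1


class TwoLevelClusterer:
    """Hypothetical two-level scheme: 'current' clusters and 'candidate' clusters."""

    def __init__(self, radius, promote_at):
        self.radius = radius
        self.promote_at = promote_at
        self.current = []
        self.candidate = []

    def insert(self, point):
        # Try the current clusters first, then the candidate clusters.
        for level in (self.current, self.candidate):
            nearest = min(level, key=lambda c: distance(c.centre, point), default=None)
            if nearest is not None and distance(nearest.centre, point) <= self.radius:
                nearest.absorb(point)
                self._promote()
                return
        # No cluster is close enough: open a new candidate cluster.
        self.candidate.append(MicroCluster(list(point)))
        self._promote()

    def _promote(self):
        # Candidates that have absorbed enough points become current clusters.
        promoted = [c for c in self.candidate if c.count >= self.promote_at]
        self.candidate = [c for c in self.candidate if c.count < self.promote_at]
        self.current.extend(promoted)
```

A caller would feed each arriving record through `to_intervals` and then `insert`. The sketch only promotes clusters; ageing or demoting stale clusters, which a real stream algorithm would need, is left out.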
Source
《仪器仪表学报》
EI
CAS
CSCD
PKU Core (Peking University Core Journals)
2013, No. 6, pp. 1330-1338 (9 pages)
Chinese Journal of Scientific Instrument
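Funding
National Natural Science Foundation of China (61102038/F010908)
Key Fund for Equipment Pre-research (9140A17040409HT01)
Specialized Research Fund for the Doctoral Program of Higher Education, Ministry of Education (20092302110013)
Program for New Century Excellent Talents in University, Ministry of Education (NCET-10-0062)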
Keywords
uncertain data stream
clustering algorithm
interval data