Abstract
Few algorithms exist for mining time-series dynamic data. To address this, the paper takes the dependence between adjacent dynamic data points into account and combines the Hidden Markov Model (HMM) with a heuristic clustering strategy to mine the changing characteristics and trends of time-series dynamic data. In the first step, the original time series are transformed into likelihood (probabilistic) space using an extension of the HMM, with the Symmetric Kullback-Leibler (SKL) distance serving as the measure of likelihood. In the second step, an SKL distance (confusion) matrix is constructed, and hierarchical clustering is applied to it to classify the change patterns of the time series. The method was verified on dynamic statistical data on job requirements in the computer network major, where it discovered five patterns of change in job requirements.
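The two steps described in the abstract (an SKL distance between series, then hierarchical clustering on the resulting matrix) can be illustrated with a minimal sketch. This is not the paper's implementation: the HMM step is omitted, and each series is assumed to have already been summarized as a discrete probability distribution. The function names (`kl`, `skl`, `skl_matrix`, `agglomerative`), the average-linkage rule, and the toy data are all assumptions made for illustration.

```python
import math


def kl(p, q, eps=1e-12):
    """Directed KL divergence D(p || q) for two discrete distributions.

    eps guards against division by zero; terms with p_i == 0 contribute 0.
    """
    return sum(pi * math.log((pi + eps) / (qi + eps))
               for pi, qi in zip(p, q) if pi > 0)


def skl(p, q):
    """Symmetric KL distance: the mean of the two directed divergences."""
    return 0.5 * (kl(p, q) + kl(q, p))


def skl_matrix(dists):
    """Pairwise SKL distances between all distributions in `dists`."""
    n = len(dists)
    return [[skl(dists[i], dists[j]) for j in range(n)] for i in range(n)]


def agglomerative(matrix, k):
    """Bottom-up hierarchical clustering, stopped when k clusters remain.

    Average linkage is an assumption here; the abstract only says
    "hierarchical clustering".
    """
    clusters = [[i] for i in range(len(matrix))]

    def linkage(a, b):
        # Mean pairwise distance between members of clusters a and b.
        return sum(matrix[i][j] for i in a for j in b) / (len(a) * len(b))

    while len(clusters) > k:
        # Find the closest pair of clusters and merge them.
        _, a, b = min(
            (linkage(clusters[a], clusters[b]), a, b)
            for a in range(len(clusters))
            for b in range(a + 1, len(clusters))
        )
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    return [sorted(c) for c in clusters]


# Toy data (hypothetical): four series summarized as 3-bin distributions.
toy = [
    [0.70, 0.20, 0.10],
    [0.65, 0.25, 0.10],
    [0.10, 0.20, 0.70],
    [0.10, 0.25, 0.65],
]
print(agglomerative(skl_matrix(toy), 2))  # prints [[0, 1], [2, 3]]
```

In the paper's setting the distance would be computed between HMM-based likelihood representations rather than raw histograms, and the number of clusters would be five rather than two.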
Source
Journal of Computer Applications (《计算机应用》)
CSCD
Peking University Core Journal (北大核心)
2014, Issue A02, pp. 120-122, 150 (4 pages)
Keywords
time series data
data mining
Hidden Markov Model (HMM)
clustering
job requirement