Abstract
In data mining tasks such as time series classification, class-based similarity behaves markedly differently from one dataset to another, so a reasonable and effective similarity measure is crucial. Traditional measures such as Euclidean Distance (ED), cosine distance, and Dynamic Time Warping (DTW) compute similarity from the data alone using a fixed formula, and ignore the influence that the label annotations carried by each dataset have on the similarity measure. To address this problem, a time series similarity measure learning method based on the Siamese Neural Network (SNN) is proposed. The method learns the neighborhood relationships among the data from the supervision information in the sample labels and builds an efficient distance measure between time series. Similarity measurement and confirmatory classification experiments were performed on time series datasets provided by UCR. The results show that, compared with ED/DTW-based 1-Nearest-Neighbor (1NN) classification, SNN clearly improves overall classification quality. Although DTW-based 1NN classification outperforms SNN-based 1NN classification on some datasets, SNN outperforms DTW in the computational complexity and speed of similarity calculation during classification. The proposed method can therefore significantly improve the efficiency of similarity measurement on classification datasets, and it performs well on high-dimensional, complex time series classification.
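The ED-1NN and DTW-1NN baselines named in the abstract can be sketched as follows. This is a minimal illustration, not the authors' code: the function names (`euclidean`, `dtw`, `one_nn`) and the toy series are assumptions made for the example, and the DTW shown is the textbook unconstrained variant. The abstract does not describe the SNN architecture, so no network is implemented here; a learned SNN measure would simply be passed in as another `dist` function.

```python
import math


def euclidean(a, b):
    """Euclidean distance (ED) between equal-length series: O(n) time."""
    if len(a) != len(b):
        raise ValueError("ED requires series of equal length")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def dtw(a, b):
    """Textbook unconstrained DTW, squared point cost: O(len(a) * len(b)) time."""
    inf = float("inf")
    # prev[j] holds the accumulated cost of aligning the rows seen so far with b[:j].
    prev = [0.0] + [inf] * len(b)
    for x in a:
        curr = [inf] * (len(b) + 1)
        for j, y in enumerate(b, start=1):
            cost = (x - y) ** 2
            # Extend the cheapest of the three allowed warping moves.
            curr[j] = cost + min(prev[j], curr[j - 1], prev[j - 1])
        prev = curr
    return math.sqrt(prev[len(b)])


def one_nn(query, train_series, train_labels, dist):
    """1NN: return the label of the training series closest to the query."""
    best = min(range(len(train_series)),
               key=lambda i: dist(query, train_series[i]))
    return train_labels[best]


# Toy data (hypothetical): one "bump" series and one "flat" series.
train_series = [[0, 0, 1, 2, 1, 0, 0, 0], [0, 0, 0, 0, 0, 0, 0, 0]]
train_labels = ["bump", "flat"]
query = [0, 0, 0, 0, 1, 2, 1, 0]  # the same bump, shifted in time

print(one_nn(query, train_series, train_labels, euclidean))  # flat
print(one_nn(query, train_series, train_labels, dtw))        # bump
```

On this toy query, ED is misled by the time shift while DTW is not, at the price of a quadratic alignment per comparison. That per-comparison cost is the complexity and speed trade-off the abstract refers to when it compares DTW with the learned measure.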
Authors
姜逸凡
叶青
JIANG Yifan; YE Qing (School of Electrical & Information Engineering, Changsha University of Science & Technology, Changsha, Hunan 411000, China)
Source
《计算机应用》
CSCD
Peking University Core Journal (北大核心)
2019, Issue 4, pp. 1041-1045 (5 pages)
Journal of Computer Applications
Keywords
time series
similarity measure
neural network
Siamese Neural Network (SNN)