摘要
时间序列shapelets是时间序列中能够最大限度地表示一个类别的子序列.解决时间序列分类问题的有效途径之一是通过shapelets转换技术,将shapelets的发现与分类器的构建相分离,其主要优点是优化了shapelets的选择过程,并能够灵活应用不同的分类策略.但该方法也存在不足:一是在shapelets转换时,用于产生最好分类结果的shapelets数量是很难确定的;二是被选择的shapelets之间往往存在着较大的相似性.针对这两个问题,首先提出了一种简单有效的shapelet剪枝技术,用于过滤掉相似的shapelets;其次,提出了一种基于shapelets覆盖的方法来确定用于数据转换的shapelets的数量.通过在多个数据集上的测试实验,表明了所提出的算法具有更高的分类准确率.
Time series shapelets are subsequences of time series that can maximally represent a class. One of the most promising approaches to solve the problem of time series classification is to separate the process of finding shapelets from classification algorithm by adopting a shapelet transformation. The main advantages of that technique are that it optimizes the process of shapelets selection and different classification strategies could be applied. Important limitations also exist in that method. First, although the number of shapelets selected for the transformation directly affects the classification result, the quantity of shapelets which yields the best data for classification is hard to be decided. Second, previous algorithms often inevitably result in similar shapelets among the selected shapelets. This work addresses the latter problem by introducing an efficient and effective shapelet pruning technique to filter similar shapelets and decrease the number of candidate shapelets at the same time. On this basis, a shapelet coverage method is proposed for selecting the number of shapelets for a given dataset. Experiments using the classic benchmark datasets for time series classification demonstrate that the proposed transformation can improve classification accuracy.
出处
《软件学报》
EI
CSCD
北大核心
2015年第9期2311-2325,共15页
Journal of Software
基金
北京市自然科学基金(4142042)
中央高校基本科研基金(2015YJS049)