Abstract
To address the high time and space complexity of the traditional support vector machine (SVM) algorithm, this paper proposes a support vector pre-selection algorithm based on cross-validated KNN. For each original sample, the algorithm first finds its k nearest neighbors, then computes the proportion p1 of those neighbors that belong to a different class, and finally selects as support vectors the original samples whose p1 exceeds a threshold p. The most suitable values of k and p are determined by cross-validation. Simulation experiments on UCI standard data sets and a speaker recognition data set show that the algorithm effectively reduces the running time of SVM classifiers while retaining good classification performance.
Source
Science Technology and Engineering (《科学技术与工程》)
Peking University Core Journal (北大核心)
2013, Issue 20, pp. 5839-5842, 5847 (5 pages)
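Funding
Supported by:
National Natural Science Foundation of China (61101160)
Natural Science Foundation of Guangdong Province (9151009001000043)
Dongguan Science and Technology Plan Project for Universities and Research Institutions (2011108102016)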
Keywords
support vector machine
cross-validation
KNN algorithm
speaker recognition