Abstract
To address the problem of choosing a fixed k in the k-NN algorithm, sparse learning and reconstruction techniques are introduced into nearest-neighbor classification, so that the value of k is obtained in a data-driven way rather than set by hand. Because samples are correlated, the training samples are used to reconstruct all test samples, which produces a reconstruction coefficient matrix. An l1-norm penalty makes this coefficient matrix sparse, so that each test sample is reconstructed from the k nearest training samples in its neighborhood, where k varies from sample to sample. This avoids the loss of accuracy that k-NN incurs by classifying every sample with the same k. Experiments on UCI datasets show that the improved k-NN algorithm classifies better than the classical k-NN algorithm.
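The idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give the exact objective or solver, so the sketch assumes the objective 0.5*||X_test - W X_train||_F^2 + lam*||W||_1 and solves it with plain ISTA (iterative soft-thresholding). The names `sparse_codes`, `predict`, `lam`, and `n_iter` are hypothetical, and both the majority vote over training samples with nonzero coefficients and the 1-NN fallback are assumptions about how the sparse coefficients would be turned into a class label.

```python
import numpy as np

def soft_threshold(v, t):
    # Proximal operator of the l1-norm: shrink every entry toward zero by t.
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def sparse_codes(X_train, X_test, lam=0.1, n_iter=500):
    # ISTA for: min_W 0.5 * ||X_test - W @ X_train||_F^2 + lam * ||W||_1
    # Row i of W holds the coefficients that reconstruct test sample i
    # from the training samples (assumed objective, see lead-in).
    gram = X_train @ X_train.T
    step = 1.0 / max(np.linalg.norm(gram, 2), 1e-12)  # 1 / Lipschitz constant
    W = np.zeros((X_test.shape[0], X_train.shape[0]))
    for _ in range(n_iter):
        grad = (W @ X_train - X_test) @ X_train.T
        W = soft_threshold(W - step * grad, step * lam)
    return W

def predict(X_train, y_train, X_test, lam=0.1, tol=1e-8):
    # Each test sample votes among the training samples that received a
    # nonzero coefficient, so k is the size of that set and can differ
    # from one test sample to the next.
    W = sparse_codes(X_train, X_test, lam)
    preds, ks = [], []
    for i, w in enumerate(W):
        idx = np.flatnonzero(np.abs(w) > tol)
        if idx.size == 0:
            # Assumed fallback: use the single nearest training sample.
            dists = np.linalg.norm(X_train - X_test[i], axis=1)
            idx = np.array([np.argmin(dists)])
        labels, counts = np.unique(y_train[idx], return_counts=True)
        preds.append(labels[np.argmax(counts)])
        ks.append(idx.size)
    return np.array(preds), np.array(ks)
```

Calling `predict(X_train, y_train, X_test)` returns the predicted labels together with the per-sample k. A larger `lam` drives more coefficients to zero and so tends to shrink k.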
Source
Computer Engineering and Design (《计算机工程与设计》)
Peking University Core Journal (北大核心)
2015, No. 7, pp. 1912-1916 (5 pages)
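Funding
National Natural Science Foundation of China (61170131, 61263035, 61363009)
National High Technology Research and Development Program of China (863 Program) (2012AA011005)
National Key Basic Research Program of China (973 Program) (2013CB329404)
Guangxi Natural Science Foundation (2012GXNSFGA060004)
Key Science and Technology Research Fund of Guangxi Higher Education Institutions (2013ZD041)
Innovation Project of Guangxi Graduate Education (YCSZ2015095, YCZ2015096)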
Keywords
sparse learning
reconstruction techniques
data-driven
l1-norm
neighborhood