Abstract
The effectiveness of image representations based on the bag-of-visual-words (BoW) model is largely limited by the quantization error of local features. To address this issue, this paper proposes an improved image representation based on multiple visual codebooks, which considers both visual codebook construction and feature coding. The method consists of two parts: 1) multiple visual codebook construction, in which compact and complementary visual codebooks are generated iteratively; and 2) image representation, in which visual words are first selected from each codebook in turn, the coding coefficients are then estimated by regularized linear regression, and the final representation is formed by combining the codes with the spatial pyramid structure of the image. Experimental results on several benchmark image classification datasets demonstrate consistent and significant improvements from the proposed method.
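The two stages named in the abstract can be illustrated with a minimal sketch. The abstract does not specify how the codebooks are made complementary or how words are selected, so everything below is an assumption made for illustration: each new codebook is built by k-means on the residuals left by the previous codebooks, the nearest word is picked from each codebook in turn, and the coefficients are fitted by ridge (L2-regularized) regression. The function names `kmeans`, `build_codebooks`, and `encode` are hypothetical, and spatial pyramid pooling is omitted.

```python
import numpy as np

def kmeans(X, k, iters=20, seed=0):
    """Plain Lloyd's k-means; returns a (k, d) array of centroids."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=k, replace=False)].copy()
    for _ in range(iters):
        # Squared distance from every point to every centroid.
        d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = d2.argmin(axis=1)
        for j in range(k):
            members = X[labels == j]
            if len(members) > 0:
                centers[j] = members.mean(axis=0)
    return centers

def build_codebooks(X, n_books=3, k=8):
    """ASSUMPTION: each new codebook clusters the residuals of the previous ones."""
    books, residual = [], X.copy()
    for b in range(n_books):
        C = kmeans(residual, k, seed=b)
        d2 = ((residual[:, None, :] - C[None, :, :]) ** 2).sum(axis=2)
        # Subtract the nearest word so the next codebook models what is left.
        residual = residual - C[d2.argmin(axis=1)]
        books.append(C)
    return books

def encode(x, books, lam=1e-3):
    """Select one word per codebook in turn, then fit coefficients by ridge regression."""
    chosen, idx, residual = [], [], x.copy()
    for C in books:
        j = int(((C - residual) ** 2).sum(axis=1).argmin())
        chosen.append(C[j])
        idx.append(j)
        residual = residual - C[j]
    B = np.stack(chosen, axis=1)  # (d, n_books): selected words as columns
    # Ridge solution: w = (B^T B + lam * I)^{-1} B^T x
    w = np.linalg.solve(B.T @ B + lam * np.eye(B.shape[1]), B.T @ x)
    code = [np.zeros(len(C)) for C in books]
    for c, j, wi in zip(code, idx, w):
        c[j] = wi
    return np.concatenate(code)  # one sparse block per codebook

# Usage on synthetic descriptors (random data, for shape checking only).
X = np.random.default_rng(0).normal(size=(200, 16))
books = build_codebooks(X, n_books=3, k=8)
code = encode(X[0], books)
```

In this sketch the code for one descriptor has one block per codebook and at most one nonzero entry per block; an image-level representation would then pool such codes over the cells of a spatial pyramid.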
Source
Pattern Recognition and Artificial Intelligence (《模式识别与人工智能》)
Indexed in: EI, CSCD, Peking University Core Journals (北大核心)
2013, No. 10, pp. 909-915 (7 pages)
Funding
National Natural Science Foundation of China (No. 61172158)
Keywords
Image Classification, Visual Codebook, Clustering Analysis, Image Representation