Image Representation Based on Multiple Visual Codebooks

Cited by: 1
Abstract: The effectiveness of image representations based on the bag-of-visual-words (BoW) model is limited mainly by the quantization error of local features. To address this issue, this paper proposes an image representation based on multiple visual codebooks that improves both codebook construction and feature coding. The method consists of two parts: 1) multiple visual codebook construction, in which several compact and complementary visual codebooks are generated iteratively; and 2) image representation, in which visual words are first selected from each codebook in turn, the coding coefficients are then estimated by regularized linear regression, and the final representation is formed by combining the codes with the spatial pyramid structure of the image. Experimental results on several benchmark image classification datasets demonstrate that the proposed method yields consistent and significant improvements.
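The abstract names the three stages of the pipeline but gives no algorithmic detail, so the following is only a minimal sketch under stated assumptions, not the authors' method. It assumes that "complementary" codebooks are obtained by clustering the residuals left by earlier codebooks, that one word per codebook is chosen greedily, that the regularized regression is ridge regression, and that pooling over pyramid cells is max pooling of absolute coefficients. All function names and parameters (`build_codebooks`, `encode`, `spatial_pyramid`, `lam`, `levels`) are hypothetical.

```python
import numpy as np

def kmeans(X, k, iters=20, seed=0):
    """Plain Lloyd k-means; returns a (k, d) array of centroids."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=k, replace=False)].copy()
    for _ in range(iters):
        d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(-1)
        labels = d2.argmin(1)
        for j in range(k):
            members = X[labels == j]
            if len(members):
                centers[j] = members.mean(0)
    return centers

def build_codebooks(X, n_books=3, k=8):
    """ASSUMPTION: each new codebook clusters the residuals of the previous ones."""
    books, residual = [], X.copy()
    for b in range(n_books):
        C = kmeans(residual, k, seed=b)
        nearest = ((residual[:, None, :] - C[None, :, :]) ** 2).sum(-1).argmin(1)
        residual = residual - C[nearest]  # what this codebook failed to explain
        books.append(C)
    return books

def encode(x, books, lam=1e-3):
    """Pick one word per codebook in turn, then fit coefficients by ridge regression."""
    k = books[0].shape[0]
    picked, residual = [], x.copy()
    for C in books:
        j = int(((C - residual) ** 2).sum(1).argmin())
        picked.append(j)
        residual = residual - C[j]
    B = np.stack([C[j] for C, j in zip(books, picked)])  # (n_books, d)
    # Ridge solution of min_c ||x - B^T c||^2 + lam * ||c||^2
    coef = np.linalg.solve(B @ B.T + lam * np.eye(len(books)), B @ x)
    code = np.zeros(len(books) * k)
    for b, (j, c) in enumerate(zip(picked, coef)):
        code[b * k + j] = c
    return code

def spatial_pyramid(codes, xy, levels=(1, 2, 4)):
    """ASSUMPTION: max-pool absolute codes over g x g grids; xy lies in [0, 1)^2."""
    mags, pooled = np.abs(codes), []
    for g in levels:
        cell = np.minimum((xy * g).astype(int), g - 1)
        cell_id = cell[:, 1] * g + cell[:, 0]
        for c in range(g * g):
            mask = cell_id == c
            pooled.append(mags[mask].max(0) if mask.any() else np.zeros(codes.shape[1]))
    return np.concatenate(pooled)

# Toy usage: random vectors stand in for local descriptors such as SIFT.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 16))   # 200 descriptors of dimension 16
xy = rng.random((200, 2))        # normalized keypoint positions
books = build_codebooks(X, n_books=3, k=8)
codes = np.stack([encode(x, books) for x in X])
rep = spatial_pyramid(codes, xy)
print(rep.shape)  # (504,) = (1 + 4 + 16) cells * 3 codebooks * 8 words
```

Each descriptor receives at most one nonzero coefficient per codebook, so the code stays sparse while the regression lets the selected words jointly compensate for the quantization error of any single word. The resulting vector would typically be passed to a linear classifier; that step is outside what the abstract describes.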
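Source: Pattern Recognition and Artificial Intelligence (《模式识别与人工智能》), indexed in EI, CSCD, and the Peking University Core Journals list, 2013, No. 10, pp. 909-915 (7 pages)
Funding: Supported by the National Natural Science Foundation of China (No. 61172158)
Keywords: Image Classification, Visual Codebook, Clustering Analysis, Image Representation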
