
Effective and Efficient Multi-Facet Web Image Annotation

Abstract: The vast amount of images available on the Web calls for an effective and efficient search service to help users find relevant images. The prevalent approach is to provide a keyword interface through which users submit queries. However, the number of images without any tags or annotations is far beyond the reach of manual effort. To overcome this, automatic image annotation techniques have emerged; they generally select a suitable set of tags for a given image without user intervention. Web-scale image annotation, however, faces three main challenges: scalability, noise-resistance and diversity. Scalability has a twofold meaning: first, an automatic image annotation system should scale to billions of images on the Web; second, it should be able to identify several relevant tags for a given image from a huge tag set within seconds or faster. Noise-resistance means that the system should be robust against typos and ambiguous terms used in tags. Diversity means that image content may include both scenes and objects, which are further described by multiple different image features constituting different facets in annotation. In this paper, we propose a unified framework that tackles these three challenges for automatic Web image annotation. It mainly involves two components: tag candidate retrieval and multi-facet annotation. In the former, content-based indexing and a concept-based codebook are leveraged to address the scalability and noise-resistance issues. In the latter, a joint feature map is designed to describe the different facets of tags in annotations and the relations between these facets. A tag graph is adopted to represent the tags in the entire annotation, and structured learning is employed to build a learning model on top of the tag graph based on the generated joint feature map. Millions of images from Flickr are used in our evaluation. Experimental results show a 33% performance improvement over single-facet approaches in terms of three metrics: precision, recall and F1 score.
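To make the reported metrics concrete, the following minimal Python sketch (not from the paper) shows how per-image precision, recall and F1 can be computed for a predicted tag set against ground-truth tags. The function name and the example tag sets are hypothetical, and the paper's exact evaluation protocol may differ.

def tag_annotation_scores(predicted, ground_truth):
    # Per-image precision, recall and F1 for a predicted set of tags,
    # following the standard definitions of the three metrics named above.
    predicted, ground_truth = set(predicted), set(ground_truth)
    if not predicted or not ground_truth:
        return 0.0, 0.0, 0.0
    correct = len(predicted & ground_truth)           # predicted tags that match the ground truth
    precision = correct / len(predicted)              # fraction of predicted tags that are correct
    recall = correct / len(ground_truth)              # fraction of true tags that were predicted
    f1 = 0.0 if correct == 0 else 2 * precision * recall / (precision + recall)
    return precision, recall, f1

# Hypothetical example: five predicted tags scored against four user-supplied tags.
p, r, f1 = tag_annotation_scores(
    predicted={"beach", "sunset", "ocean", "people", "dog"},
    ground_truth={"beach", "sunset", "sea", "people"},
)
print(f"precision={p:.2f} recall={r:.2f} F1={f1:.2f}")  # precision=0.60 recall=0.75 F1=0.67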
Source: Journal of Computer Science & Technology (SCIE, EI, CSCD), 2012, No. 3, pp. 541-553 (13 pages).
Funding: Supported by the National Natural Science Foundation of China under Grant No. 60931160445.
Keywords: image annotation, multi-facet, web-scale
