Conference Paper

Dynamic Two-Stage Image Retrieval from Large Multimodal Databases.

Department of Electrical and Computer Engineering, Democritus University of Thrace, University Campus, 67100 Xanthi, Greece
DOI: 10.1007/978-3-642-20161-5_33
Conference: Advances in Information Retrieval - 33rd European Conference on IR Research, ECIR 2011, Dublin, Ireland, April 18-21, 2011. Proceedings
Source: DBLP

ABSTRACT Content-based image retrieval (CBIR) with global features is notoriously noisy, especially for image queries with low percentages of relevant images in a collection. Moreover, CBIR typically ranks the whole collection, which is inefficient for large databases. We experiment with a method for image retrieval from multimodal databases, which improves both the effectiveness and efficiency of traditional CBIR by exploring secondary modalities. We perform retrieval in a two-stage fashion: first rank by a secondary modality, and then perform CBIR only on the top-K items. Thus, effectiveness is improved by performing CBIR on a ‘better’ subset. Using a relatively ‘cheap’ first stage, efficiency is also improved via the fewer CBIR operations performed. Our main novelty is that K is dynamic, i.e. estimated per query to optimize a predefined effectiveness measure. We show that such dynamic two-stage setups can be significantly more effective and robust than similar setups with static thresholds previously proposed.
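
The two-stage procedure described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the names `two_stage_retrieval`, `score_ratio_cutoff`, and `visual_score` are hypothetical, first-stage scores are assumed to be non-negative, and items beyond the top-K are assumed to keep their first-stage order. The abstract does not say how K is estimated, so `score_ratio_cutoff` is only a placeholder heuristic standing in for a per-query estimator.

```python
from typing import Callable, Dict, List, Sequence


def score_ratio_cutoff(sorted_scores: Sequence[float], ratio: float = 0.5) -> int:
    """Placeholder per-query K (NOT the paper's estimator): keep items whose
    first-stage score is at least `ratio` times the top score."""
    if not sorted_scores:
        return 0
    top = sorted_scores[0]
    return sum(1 for s in sorted_scores if s >= ratio * top)


def two_stage_retrieval(
    first_stage_scores: Dict[str, float],
    visual_score: Callable[[str], float],
    choose_k: Callable[[Sequence[float]], int] = score_ratio_cutoff,
) -> List[str]:
    """Rank by a secondary modality, then re-rank only the top-K visually."""
    # Stage 1: rank the whole collection by the cheap secondary modality.
    ranked = sorted(first_stage_scores, key=first_stage_scores.get, reverse=True)
    sorted_scores = [first_stage_scores[item] for item in ranked]

    # K is chosen per query from the first-stage score list, then clamped.
    k = max(1, min(len(ranked), choose_k(sorted_scores)))
    head, tail = ranked[:k], ranked[k:]

    # Stage 2: run the expensive CBIR scoring on the top-K items only.
    reranked_head = sorted(head, key=visual_score, reverse=True)

    # Assumption: items below the cutoff keep their first-stage order.
    return reranked_head + tail


# Toy data: text scores for stage 1, visual similarities for stage 2.
text_scores = {"a": 0.9, "b": 0.8, "c": 0.2, "d": 0.1}
visual_sim = {"a": 0.1, "b": 0.7, "c": 0.95, "d": 0.99}
result = two_stage_retrieval(text_scores, lambda item: visual_sim[item])
```

With this toy data the cutoff keeps two items, so `visual_score` is called for `a` and `b` only. That is where the efficiency gain mentioned in the abstract would come from.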

  • ABSTRACT: We compare two methods for retrieval from multimodal collections. The first is a score-based fusion of results, retrieved visually and textually. The second is a two-stage method that visually re-ranks the top-K results textually retrieved. We discuss their underlying hypotheses and practical limitations, and conduct a comparative evaluation on a standardized snapshot of Wikipedia. Both methods are found to be significantly more effective than single-modality baselines, with no clear winner but with different robustness features. Nevertheless, two-stage retrieval provides efficiency benefits over fusion.
    Advances in Information Retrieval - 33rd European Conference on IR Research, ECIR 2011, Dublin, Ireland, April 18-21, 2011. Proceedings; 01/2011
  • ABSTRACT: In this article we address the problem of searching for products using an image as the query, instead of the more popular approach of searching by textual keywords. With the fast development of the Internet and the popularization of mobile devices and e-commerce systems, searching for specific products by image has become an interesting research topic. In this context, Content-Based Image Retrieval (CBIR) techniques have been used to support and enhance the customer shopping experience. We propose an image re-ranking strategy based on multimedia information available in product databases. Our re-ranking strategy relies on category and textual information associated with the top-k images of an initial ranking computed purely with CBIR techniques. Experiments were carried out with users' relevance judgments on two image datasets collected from e-commerce Web sites. Our results show that our re-ranking strategy outperforms the baselines that use only CBIR techniques.
    Proceedings of the 35th European conference on Advances in Information Retrieval; 03/2013
  • ABSTRACT: Bag-of-visual-words (BOVW) is a representation of images which is built using a large set of local features. To date, the experimental results presented in the literature have shown that this approach achieves high retrieval scores in several benchmarking image databases because of its ability to recognize objects and retrieve near-duplicate (to the query) images. In this paper, we propose a novel method that combines the spatial relationships of the visual words in an image with the conventional Visual Words method. Incorporating the visual distribution entropy leads to a robust scale-invariant descriptor. The experimental results show that the proposed method performs better than the classic Visual Words approach, and that it also outperforms several other descriptors from the literature.
    Computer Vision/Computer Graphics Collaboration Techniques - 5th International Conference, MIRAGE 2011, Rocquencourt, France, October 10-11, 2011. Proceedings; 01/2011
