Conference Paper

Dynamic Two-Stage Image Retrieval from Large Multimodal Databases

Department of Electrical and Computer Engineering, Democritus University of Thrace, University Campus, 67100 Xanthi, Greece
DOI: 10.1007/978-3-642-20161-5_33 Conference: Advances in Information Retrieval - 33rd European Conference on IR Research, ECIR 2011, Dublin, Ireland, April 18-21, 2011. Proceedings
Source: DBLP

ABSTRACT: Content-based image retrieval (CBIR) with global features is notoriously noisy, especially for image queries with low percentages of relevant images in a collection. Moreover, CBIR typically ranks the whole collection, which is inefficient for large databases. We experiment with a method for image retrieval from multimodal databases, which improves both the effectiveness and efficiency of traditional CBIR by exploiting secondary modalities. We perform retrieval in a two-stage fashion: first rank by a secondary modality, and then perform CBIR only on the top-K items. Effectiveness is thus improved by performing CBIR on a ‘better’ subset. With a relatively ‘cheap’ first stage, efficiency also improves because fewer CBIR operations are performed. Our main novelty is that K is dynamic, i.e. estimated per query so as to optimize a predefined effectiveness measure. We show that such dynamic two-stage setups can be significantly more effective and robust than similar setups with static thresholds proposed previously.
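The two-stage pipeline described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: in particular, the threshold-based `dynamic_k` below is a hypothetical stand-in for the paper's per-query optimization of a predefined effectiveness measure.

```python
import numpy as np

def dynamic_k(text_scores, tau=0.5):
    """Toy per-query K: count items whose first-stage score clears a
    threshold (stand-in for optimizing an effectiveness measure)."""
    return max(1, int(np.sum(np.asarray(text_scores) >= tau)))

def two_stage_retrieval(text_scores, cbir_distance, k):
    """Stage 1: cheap ranking by the secondary (text) modality.
    Stage 2: expensive CBIR re-ranking of the top-k items only."""
    first_stage = np.argsort(-np.asarray(text_scores))  # descending score
    shortlist = first_stage[:k]
    dists = np.array([cbir_distance(i) for i in shortlist])
    return list(shortlist[np.argsort(dists)])  # most similar image first

# Toy example: 6 items; CBIR distances only matter for the shortlist.
text_scores = [0.9, 0.1, 0.8, 0.4, 0.7, 0.2]
cbir = {0: 0.5, 2: 0.1, 4: 0.3}
k = dynamic_k(text_scores)  # 3 scores clear the 0.5 threshold for this query
ranking = two_stage_retrieval(text_scores, lambda i: cbir.get(i, 1.0), k)
print(ranking)  # CBIR re-orders only the text top-3; the rest are cut off
```

Note that only `k` CBIR distance computations are issued, which is where the efficiency gain over ranking the whole collection comes from.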

  •
    ABSTRACT: Advances in computer and network infrastructure, together with the fast evolution of multimedia data, have drawn growing attention to digital video: its archiving, indexing, accessibility, acquisition, storage, and even its processing and usability. All of these tasks require extracting the important information of a video, especially when metadata are missing. The main goal of this paper is the construction of a system that automatically generates and provides all the essential information of a video, in both visual and textual form. Using the visual or the textual information, a user can on the one hand locate a specific video, and on the other hand rapidly grasp its basic points and overall concept without watching the whole of it. The visual information emanates from a video summarization method, while the textual information derives from a keyword-based video annotation approach. The annotation technique operates on the key frames that constitute the video abstract; therefore, the first part of the system is the new video summarization method. In the proposed video abstraction technique, each frame of the video is first described by Compact Composite Descriptors (CCDs) and a visual word histogram. The approach then utilizes the Self-Growing and Self-Organized Neural Gas (SGONG) network to classify the frames into clusters, and extracting a representative key frame from every cluster yields the video abstract. The most significant advantage of the video summarization approach is its ability to calculate dynamically the appropriate number of final clusters.
Subsequently, a new video annotation method is applied to the generated video summary, automatically producing keywords capable of describing the semantic content of the given video. This approach is based on the recently proposed N-closest Photos Model (NCP). Experimental results on several videos are presented both to evaluate the proposed system and to demonstrate its effectiveness.
    Expert Systems with Applications 10/2013 · 1.97 Impact Factor
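The summarization step described in the abstract above — describe each frame, cluster the descriptors, keep one representative frame per cluster — can be sketched roughly as follows. Plain k-means is used here as a simple stand-in for the SGONG network, which additionally determines the number of clusters on its own; the descriptors and data are toy values.

```python
import numpy as np

def summarize(frame_features, n_clusters, n_iter=20, seed=0):
    """Toy key-frame extraction: cluster frame descriptors with k-means
    and keep, per cluster, the frame nearest its centroid."""
    X = np.asarray(frame_features, dtype=float)
    rng = np.random.default_rng(seed)
    centroids = X[rng.choice(len(X), n_clusters, replace=False)].copy()
    for _ in range(n_iter):  # Lloyd iterations
        labels = np.argmin(((X[:, None] - centroids) ** 2).sum(-1), axis=1)
        for c in range(n_clusters):
            if (labels == c).any():
                centroids[c] = X[labels == c].mean(axis=0)
    keys = []
    for c in range(n_clusters):  # representative = member nearest centroid
        members = np.flatnonzero(labels == c)
        if members.size:
            d = ((X[members] - centroids[c]) ** 2).sum(-1)
            keys.append(int(members[np.argmin(d)]))
    return sorted(keys)

# Four 2-D "descriptors" forming two obvious groups -> one key frame each.
print(summarize([[0, 0], [0.1, 0], [10, 10], [10, 10.1]], n_clusters=2))
```

In the paper's pipeline the annotation stage then runs only on these key frames, which is why the quality of the abstract matters downstream.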
  •
    ABSTRACT: In this article we address the problem of searching for products using an image as the query, instead of the more popular approach of searching by textual keywords. With the fast development of the Internet and the popularization of mobile devices and e-commerce systems, searching for specific products by image has become an interesting research topic. In this context, Content-Based Image Retrieval (CBIR) techniques have been used to support and enhance the customer shopping experience. We propose an image re-ranking strategy based on the multimedia information available in product databases. Our re-ranking strategy relies on the category and textual information associated with the top-k images of an initial ranking computed purely with CBIR techniques. Experiments were carried out with users' relevance judgments on two image datasets collected from e-commerce Web sites. Our results show that our re-ranking strategy outperforms baselines that use only CBIR techniques.
    Proceedings of the 35th European Conference on Advances in Information Retrieval; 03/2013
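A rough sketch of such metadata-based re-ranking of a CBIR top-k follows. The scoring function here — a bonus for the majority category of the top-k plus keyword overlap with the query's associated terms — is hypothetical, chosen only to illustrate the idea, and is not the authors' formula.

```python
from collections import Counter

def rerank_top_k(initial_ranking, k, categories, keywords, query_terms):
    """Re-rank the top-k of a CBIR ranking using product metadata:
    favour the majority category of the top-k and keyword overlap
    with the query's terms (hypothetical scoring)."""
    top = list(initial_ranking[:k])
    majority = Counter(categories[i] for i in top).most_common(1)[0][0]

    def score(pos_item):
        pos, item = pos_item
        s = -pos                                    # keep CBIR order as base
        s += 5 if categories[item] == majority else 0
        s += 2 * len(keywords[item] & query_terms)  # textual agreement
        return s

    reranked = [item for _, item in
                sorted(enumerate(top), key=score, reverse=True)]
    return reranked + list(initial_ranking[k:])

# Toy product database: ids 0-3 with a category and a keyword set each.
categories = {0: "shoe", 1: "shoe", 2: "bag", 3: "shoe"}
keywords = {0: set(), 1: {"red", "shoe"}, 2: {"bag"}, 3: {"red"}}
print(rerank_top_k([3, 1, 2, 0], 3, categories, keywords, {"red", "shoe"}))
# -> [1, 3, 2, 0]: item 1 overtakes 3 on keywords; the "bag" outlier drops
```

Items below rank k are left untouched, so the metadata pass costs only k score evaluations per query.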
  • Source
    ACHI 2012, The Fifth International Conference on Advances in Computer-Human Interactions; 01/2012
