Conference Paper

Photo Context as a Bag of Words

LIG-Lab. Inf. de Grenoble, St. Martin d'Heres
DOI: 10.1109/ISM.2008.15
Conference: Tenth IEEE International Symposium on Multimedia (ISM2008), December 15-17, 2008, Berkeley, California, USA


In recent years, photo context metadata (e.g., date, GPS coordinates) have proved useful in the management of personal photos. However, these metadata are still poorly exploited in photo retrieval systems. To overcome this limitation, we propose an approach that incorporates contextual metadata into a keyword-based photo retrieval process. We use metadata about the photo shot context (address location, nearby objects, season, light status...) to generate a bag of words for indexing each photo. We extend the Vector Space Model in order to transform these shot context words into document-vector terms. In addition, spatial reasoning and geographical ontologies are used to infer new indexing terms. This facilitates the query-document matching process and also allows semantic comparison between the query terms and photo annotations.
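The indexing idea in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation: the abstract does not specify the weighting scheme or the ontology, so the sketch assumes plain TF-IDF weights with cosine similarity, and it stands in for the geographical ontology with a tiny hypothetical `BROADER` dictionary. The photo identifiers, context words, and function names are all invented for illustration.

```python
import math
from collections import Counter

# Hypothetical "is located in" relation, a toy stand-in for a geographical ontology.
BROADER = {"grenoble": "isere", "isere": "france", "berkeley": "california"}

def expand(words):
    """Return the words plus every broader place inferred from BROADER."""
    out = list(words)
    for w in words:
        while w in BROADER:
            w = BROADER[w]
            out.append(w)
    return out

def tfidf_vectors(bags):
    """Turn each bag of context words into a sparse TF-IDF vector (assumed weighting)."""
    n = len(bags)
    df = Counter(t for bag in bags.values() for t in set(bag))
    idf = {t: math.log((1 + n) / (1 + d)) + 1.0 for t, d in df.items()}
    vectors = {pid: {t: c * idf[t] for t, c in Counter(bag).items()}
               for pid, bag in bags.items()}
    return vectors, idf

def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    nu = math.sqrt(sum(w * w for w in u.values()))
    nv = math.sqrt(sum(w * w for w in v.values()))
    return dot / (nu * nv) if nu and nv else 0.0

def search(query_words, vectors, idf):
    """Rank photos by similarity to the query; photos with zero score are dropped."""
    q = {t: c * idf.get(t, 0.0) for t, c in Counter(query_words).items()}
    scored = ((cosine(q, vec), pid) for pid, vec in vectors.items())
    return sorted((pair for pair in scored if pair[0] > 0), reverse=True)

# Invented shot-context words for three photos.
photos = {
    "p1": ["grenoble", "winter", "night", "museum"],
    "p2": ["berkeley", "summer", "daylight", "campus"],
    "p3": ["grenoble", "summer", "daylight", "park"],
}
bags = {pid: expand(words) for pid, words in photos.items()}
vectors, idf = tfidf_vectors(bags)
ranked = search(["france", "summer"], vectors, idf)
```

In this toy collection, the query word "france" appears in no photo's raw metadata; it only matches because the expansion step added it as an inferred indexing term for the photos taken in Grenoble.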



Available from: Windson Viana, Mar 17, 2014
  • "Even though contextual information is helpful to organize photos and provides the first descriptions that are remembered by users (Naaman et al., 2004), it does not fully describe the content of the image. These metadata are rather related to the users' situation when the photo is captured (Viana et al., 2008). However, users might be interested as well in the content of the image (Content Based Image Retrieval -CBIR). "

    COnférence en Recherche d'Informations et Applications - CORIA 2011, 8th French Information Retrieval Conference, Avignon, France, March 16-18, 2011. Proceedings; 01/2011
  • ABSTRACT: Due to geotagging capabilities of consumer cameras, it has become easy to capture the exact geometric location where a picture is taken. However, the location is not the whereabouts of the scene taken by the photographer but the whereabouts of the photographer himself. To determine the actual location of an object seen in a photo some sophisticated and tiresome steps are required on a special camera rig, which are generally not available in common digital cameras. This article proposes a novel method to determine the geometric location corresponding to a specific image pixel. A new technique of stereo triangulation is introduced to compute the relative depth of a pixel position. Geographical metadata embedded in images are utilized to convert relative depths to absolute coordinates. When a geographic database is available we can also infer the semantically meaningful description of a scene object from where the specified pixel is projected onto the photo. Experimental results demonstrate the effectiveness of the proposed approach in accurately identifying actual locations.
    ACM Transactions on Multimedia Computing Communications and Applications 02/2013; 9(1). DOI:10.1145/2422956.2422961 · 0.97 Impact Factor
  • ABSTRACT: Personal photo revisitation on smart phones is a common yet uneasy task for users due to the large volume of photos taken in daily life. Inspired by the human memory and its natural recall characteristics, we build a personal photo revisitation tool, PhotoPrev, to facilitate users to revisit previous photos through associated memory cues. To mimic users’ episodic memory recall, we present a way to automatically generate an abundance of related contextual metadata (e.g., weather, temperature) and organize them as context lattices for each photo in a life cycle. Meanwhile, photo content (e.g., object, text) is extracted and managed in a weighted term list, which corresponds to semantic memory. A threshold algorithm based photo revisitation framework for context- and content-based keyword search on a personal photo collection, together with a user feedback mechanism, is also given. We evaluate the scalability on a large synthetic dataset by crawling users’ photos from Flickr, and a 12-week user study demonstrates the feasibility and effectiveness of our photo revisitation strategies.
    Journal of Computer Science and Technology 05/2015; 30(3):453-466. DOI:10.1007/s11390-015-1536-z · 0.67 Impact Factor