Conference Paper

Photo Context as a Bag of Words

LIG (Laboratoire d'Informatique de Grenoble), Saint-Martin-d'Hères, France
DOI: 10.1109/ISM.2008.15 Conference: Tenth IEEE International Symposium on Multimedia (ISM2008), December 15-17, 2008, Berkeley, California, USA
Source: DBLP

ABSTRACT In recent years, photo context metadata (e.g., date, GPS coordinates) have proved useful in the management of personal photos. However, these metadata are still poorly exploited by photo retrieval systems. To overcome this limitation, we propose an approach that incorporates contextual metadata into a keyword-based photo retrieval process. We use metadata about the photo shot context (address location, nearby objects, season, light status, etc.) to generate a bag of words for indexing each photo. We extend the Vector Space Model to transform these shot-context words into document-vector terms. In addition, spatial reasoning and geographical ontologies are used to infer new indexing terms. This facilitates the query-document matching process and also allows semantic comparison between the query terms and the photo annotations.
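As a rough illustration of how shot-context metadata might be turned into a bag of indexing words, here is a minimal Python sketch. The metadata field names (`address`, `datetime`, `nearby`), the season and daylight rules, and the raw-frequency vector are assumptions made for illustration, not the paper's actual pipeline.

```python
from collections import Counter
from datetime import datetime

def context_bag_of_words(metadata):
    """Derive a bag of indexing words from shot-context metadata.

    The field names and inference rules below are hypothetical; the paper
    describes the general idea (address, nearby objects, season, light
    status as index terms), not this exact code.
    """
    words = []
    # Address terms come straight from a reverse-geocoded location string.
    words.extend(metadata.get("address", "").lower().split())
    # Season and light status are inferred from the capture timestamp
    # (Northern-hemisphere calendar seasons, fixed daylight hours).
    taken = datetime.fromisoformat(metadata["datetime"])
    season = ["winter", "spring", "summer", "autumn"][(taken.month % 12) // 3]
    words.append(season)
    words.append("daylight" if 7 <= taken.hour < 19 else "night")
    # Nearby objects (e.g., from a geographical gazetteer) become terms too.
    words.extend(obj.lower() for obj in metadata.get("nearby", []))
    # A Counter of raw term frequencies can serve as a simple
    # Vector Space Model document vector.
    return Counter(words)

photo = {
    "address": "Grenoble France",
    "datetime": "2008-07-14T10:30:00",
    "nearby": ["Bastille", "cable-car"],
}
bag = context_bag_of_words(photo)
# e.g., bag contains "grenoble", "summer", "daylight", "bastille"
```

In a full system these raw frequencies would typically be reweighted (e.g., tf-idf) before cosine matching against query vectors.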

  • Conférence en Recherche d'Informations et Applications - CORIA 2011, 8th French Information Retrieval Conference, Avignon, France, March 16-18, 2011. Proceedings; 01/2011
  • ABSTRACT: Thanks to the geotagging capabilities of consumer cameras, it has become easy to capture the exact geographic location where a picture is taken. However, that location is not the whereabouts of the scene captured by the photographer but the whereabouts of the photographer himself. Determining the actual location of an object seen in a photo normally requires sophisticated and tedious steps on a special camera rig, which is generally not available with common digital cameras. This article proposes a novel method to determine the geographic location corresponding to a specific image pixel. A new stereo triangulation technique is introduced to compute the relative depth of a pixel position. Geographic metadata embedded in the images are used to convert relative depths to absolute coordinates. When a geographic database is available, we can also infer a semantically meaningful description of the scene object onto which the specified pixel is projected. Experimental results demonstrate the effectiveness of the proposed approach in accurately identifying actual locations.
    ACM Transactions on Multimedia Computing, Communications, and Applications (TOMCCAP), 9(1), 02/2013.
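The depth-then-offset idea in the abstract above can be sketched as follows. The pinhole stereo formula (depth = focal length x baseline / disparity), the flat-earth GPS offset, and all numeric values are illustrative assumptions, not the article's exact method.

```python
import math

def pixel_depth(disparity_px, baseline_m, focal_px):
    """Stereo triangulation under the pinhole model: depth = f * B / d.

    Assumes rectified image pairs; this is a textbook formula, not
    necessarily the article's specific triangulation technique.
    """
    return focal_px * baseline_m / disparity_px

def project_location(lat, lon, bearing_deg, depth_m):
    """Offset the camera's GPS fix by depth_m metres along the view bearing.

    Uses a flat-earth approximation (about 111,320 m per degree of
    latitude), which is adequate for the short ranges of a photo scene.
    """
    dlat = depth_m * math.cos(math.radians(bearing_deg)) / 111_320.0
    dlon = (depth_m * math.sin(math.radians(bearing_deg))
            / (111_320.0 * math.cos(math.radians(lat))))
    return lat + dlat, lon + dlon

# Hypothetical example: 40 px disparity, 0.5 m baseline, 800 px focal length
# gives a relative depth of 10 m; offset the camera's GPS fix due east.
depth = pixel_depth(disparity_px=40.0, baseline_m=0.5, focal_px=800.0)
obj_lat, obj_lon = project_location(45.1885, 5.7245, bearing_deg=90.0,
                                    depth_m=depth)
```

With the object's absolute coordinates in hand, a lookup in a geographic database would supply the semantic scene description the abstract mentions.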
