Conference Paper

Adding Affine Invariant Geometric Constraint for Partial-Duplicate Image Retrieval.

DOI: 10.1109/ICPR.2010.212
Conference: 20th International Conference on Pattern Recognition, ICPR 2010, Istanbul, Turkey, 23-26 August 2010
Source: DBLP


The emergence of large numbers of partial-duplicate images on the internet poses a new challenge for image retrieval systems. Rather than treating the image as a whole, researchers bundle the local visual words into groups using the MSER detector and add a simple relative-ordering geometric constraint to the bundles. Experiments show that bundled features are much more discriminative than a single feature. However, this weak geometric constraint is applicable only when there is no significant rotation between duplicate images, and it cannot handle image flips or large rotations. In this paper, we improve the bundled features with an affine invariant geometric constraint. It employs the area-ratio invariance property of affine transformations to build an affine invariant matrix for the bundled visual words. Such an affine invariant geometric constraint copes well with flip, rotation or other transformations. Experimental results on an internet partial-duplicate image database verify the improvement it brings to the original bundled features approach. Since there is currently no publicly available corpus for partial-duplicate image retrieval, we also publish our dataset for future studies.
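The area-ratio property mentioned in the abstract can be checked numerically. The sketch below is only an illustration of that property, not the paper's affine invariant matrix, whose construction the abstract does not describe. The point coordinates, the transformation, and the helper names (`triangle_area`, `area_ratio`, `apply_affine`) are all assumptions made for the example. It applies a flip, a 120-degree rotation, and an anisotropic scaling with shear to four points, then compares the ratio of two triangle areas and the left-to-right ordering of the points before and after.

```python
import numpy as np

def triangle_area(p, q, r):
    """Unsigned area of triangle pqr from the 2D cross product."""
    return 0.5 * abs((q[0] - p[0]) * (r[1] - p[1]) - (q[1] - p[1]) * (r[0] - p[0]))

def area_ratio(points, tri_a, tri_b):
    """Ratio of the areas of two triangles given as index triples into points."""
    return triangle_area(*points[list(tri_a)]) / triangle_area(*points[list(tri_b)])

def apply_affine(points, A, t):
    """Apply x -> A x + t to every row of points."""
    return points @ A.T + t

# Hypothetical feature locations inside one bundle (illustrative values only).
points = np.array([[0.0, 0.0], [4.0, 1.0], [1.0, 3.0], [5.0, 5.0]])

# An affine map combining a horizontal flip, a large rotation, and scale/shear.
theta = np.deg2rad(120.0)
rotation = np.array([[np.cos(theta), -np.sin(theta)],
                     [np.sin(theta),  np.cos(theta)]])
flip = np.diag([-1.0, 1.0])
scale_shear = np.array([[1.5, 0.4],
                        [0.0, 0.8]])
A = rotation @ flip @ scale_shear
t = np.array([10.0, -3.0])

moved = apply_affine(points, A, t)

# Every area is scaled by |det A|, so the ratio of two areas is unchanged.
ratio_before = area_ratio(points, (0, 1, 2), (1, 2, 3))
ratio_after = area_ratio(moved, (0, 1, 2), (1, 2, 3))

# A relative-ordering cue (points sorted by x coordinate) for comparison.
x_order_before = np.argsort(points[:, 0]).tolist()
x_order_after = np.argsort(moved[:, 0]).tolist()

print("area ratio before:", ratio_before)
print("area ratio after: ", ratio_after)
print("x-order before:", x_order_before)
print("x-order after: ", x_order_after)
```

Because an affine map multiplies every area by the same factor, the absolute value of the determinant of its linear part, any quantity built from area ratios survives flips and rotations, whereas an ordering along a fixed axis generally does not.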



Available from: Zhipeng Wu, May 06, 2014
  • Source
    • "Aiming to deal with similar feature-order problem encountered in [12], Zhang et al. [13] propose to identify unbounded-order spatial features by efficient kernels, which could also be used by kernel-based learning algorithms. The method of bundled feature [14] exploits the relative spatial information between SIFT features by bundling them via the MSER region [15] and measure the image similarity by accumulating the spatial matching score of bundled features; this method [14] is further improved by Wu et al. [16] with an affine invariant geometric constraint. The geometric-preserving visual phrase method [17] not only considers the co-occurrences of visual words in the neighborhood, but also captures their local and long-range spatial layouts."
    ABSTRACT: Partial duplicate images often have large non-duplicate regions and small duplicate regions with random rotation, which lead to the following problems: 1) large number of noisy features from the non-duplicate regions; 2) small number of representative features from the duplicate regions; 3) randomly rotated or deformed duplicate regions. These problems challenge many content based image retrieval (CBIR) approaches, since most of them cannot distinguish the representative features from a large proportion of noisy features in a rotation invariant way. In this paper, we propose a rotation invariant partial duplicate image retrieval (PDIR) approach, which effectively and efficiently retrieves the partial duplicate images by accurately matching the representative SIFT features. Our method is based on the Combined-Orientation-Position (COP) consistency graph model, which consists of the following two parts: 1) The COP consistency, which is a rotation invariant measurement of the relative spatial consistency among the candidate matches of SIFT features; it uses a coarse-to-fine family of evenly sectored polar coordinate systems to softly quantize and combine the orientations and positions of the SIFT features. 2) The consistency graph model, which robustly rejects the spatially inconsistent noisy features by effectively detecting the group of candidate feature matches with the largest average COP consistency. Extensive experiments on five large scale image data sets show promising retrieval performances.
    IEEE Transactions on Multimedia 12/2013; 15(8):1982-1996. DOI:10.1109/TMM.2013.2270455 · 2.30 Impact Factor
  • Source
    • "Ground truth annotations for 15 queries (transformed videos) | [12] [13]
      Google search engine web crawling | Images of various objects and people | N/A | [14] [15] [16] [17]
      INRIA Copydays dataset [18] [19] | Personal holiday photos with artificial degradations (no people) | 500 queries and their correct retrieval results | [12] [20]
      Oxford buildings dataset [21] [22] | 5062 images of landmarks collected from Flickr | Ground truth for 11 different landmarks, with 5 possible queries | [15] [20]
      Internet partial-duplicate image database [23] | Brand logos | N/A | [24]
      UKbench dataset (object recognition evaluation) [25] | 2550 groups of 4 images, from four different viewpoints | N/A | [20]
      Corel Photo CD collection | Scenery, animals, flowers, object close-ups, activities | N/A | [4] [17]
      Flickr | Various images used as distractions for other databases | N/A | [12] [20] [26] [27]
      lic corpus in which near duplicate images have been annotated."
    ABSTRACT: Managing photo collections involves a variety of image quality assessment tasks, e.g. the selection of the “best” photos. Detecting near-duplicate images is a prerequisite for automating these tasks. This paper presents a new dataset that may assist researchers in testing algorithms for the detection of near-duplicates in personal photo libraries. The proposed dataset is derived directly from an actual personal travel photo collection. It contains many difficult cases and types of near-duplicates. More importantly, in order to deal with the inevitable ambiguity that the near-duplicate cases exhibit, the dataset is annotated by 10 different subjects. These annotations are combined into a non-binary ground truth, which indicates the probability that a pair of images may be considered a near-duplicate by an observer.
    Quality of Multimedia Experience (QoMEX), 2013 Fifth International Workshop on; 07/2013
  • Source
    • "In [8], bundled feature is improved by adding an affine invariant geometric constraint (calling this method "bundled+"). All of these approaches share the common SIFT vocabulary of 5,000 visual words, and the weighting parameter is set to be 2 for [7] and 1 for [8]. In [6], besides the SIFT descriptor, another local self-similarity descriptor (LSSD) is also used."
    ABSTRACT: In traditional partial-duplicate image retrieval, images are commonly represented using the Bag-of-Visual-Words (BOV) model built from image local features, such as SIFT. In practice, partial-duplicate images share only a small similar portion, so such a representation of the whole image is not adequate for the partial-duplicate image retrieval task. In this paper, we propose a novel perspective to retrieve partial-duplicate images with Contented-based Saliency Region (CSR). CSRs are sub-regions with abundant visual content and high visual attention in the image. The content of a CSR is represented with the BOV model, while saliency analysis is employed to ensure the high visual attention of the CSR. Each CSR is regarded as an independent unit to be retrieved in the dataset. To effectively retrieve the CSRs, we design a relative saliency ordering constraint, which captures a weak saliency relative layout among interest points in the CSR. Comparison experiments with four state-of-the-art methods on the standard partial-duplicate image dataset clearly verify the effectiveness of our scheme. Further, our approach can provide a more diverse retrieval result, which facilitates the interaction of portable-device users.
    Proceedings of the 2011 IEEE International Conference on Multimedia and Expo, ICME 2011, 11-15 July, 2011, Barcelona, Catalonia, Spain; 07/2011