Conference Paper

Semantic manifold learning for image retrieval

DOI: 10.1145/1101149.1101193 Conference: Proceedings of the 13th ACM International Conference on Multimedia, Singapore, November 6-11, 2005
Source: DBLP


Learning the user's semantics for CBIR involves two different sources of information: the similarity relations entailed by the content-based features, and the relevance relations specified in the feedback. Given that, we propose an augmented relation embedding (ARE) to map the image space into a semantic manifold that faithfully grasps the user's preferences. Besides ARE, we also look into the issues of selecting a good feature set for improving the retrieval performance. With these two aspects of efforts we have established a system that yields far better results than those previously reported. Overall, our approach can be characterized by three key properties: 1) The framework uses one relational graph to describe the similarity relations, and the other two to encode the relevant/irrelevant relations indicated in the feedback. 2) With the relational graphs so defined, learning a semantic manifold can be transformed into solving a constrained optimization problem, and is reduced to the ARE algorithm accounting for both the representation and the classification points of views. 3) An image representation based on augmented features is introduced to couple with the ARE learning. The use of these features is significant in capturing the semantics concerning different scales of image regions. We conclude with experimental results and comparisons to demonstrate the effectiveness of our method.

Download full-text


Available from: Yen-Yu Lin, Oct 08, 2015
37 Reads
  • Source
    • "Note that Manifold Regularization is able to explore the manifold structure possessed by multimedia data [31] [35] [36] "
    [Show abstract] [Hide abstract]
    ABSTRACT: In this paper, we propose a novel semi-supervised feature selection framework by mining correlations among multiple tasks and apply it to different multimedia applications. Instead of independently computing the importance of features for each task, our algorithm leverages shared knowledge from multiple related tasks, thus, improving the performance of feature selection. Note that we build our algorithm on assumption that different tasks share common structures. The proposed algorithm selects features in a batch mode, by which the correlations between different features are taken into consideration. Besides, considering the fact that labeling a large amount of training data in real world is both time-consuming and tedious, we adopt manifold learning which exploits both labeled and unlabeled training data for feature space analysis. Since the objective function is non-smooth and difficult to solve, we propose an iterative algorithm with fast convergence. Extensive experiments on different applications demonstrate that our algorithm outperforms other state-of-the-art feature selection algorithms.
  • Source
    • "We further leverage Manifold Regularization [17] built upon the graph Laplacian to extend our framework to a semi-supervised scenario. Manifold Regularization is adopted because multimedia data have been normally shown to possess a manifold structure [18], [19] and Manifold Regularization can explore it. Consequently, by applying Manifold Regularization to the loss function in (1), we obtain (3) where denotes the trace operator. "
    [Show abstract] [Hide abstract]
    ABSTRACT: In this paper, we propose a novel semi-supervised feature analyzing framework for multimedia data understanding and apply it to three different applications: image annotation, video concept detection and 3-D motion data analysis. Our method is built upon two advancements of the state of the art: (1) l2, 1-norm regularized feature selection which can jointly select the most relevant features from all the data points. This feature selection approach was shown to be robust and efficient in literature as it considers the correlation between different features jointly when conducting feature selection; (2) manifold learning which analyzes the feature space by exploiting both labeled and unlabeled data. It is a widely used technique to extend many algorithms to semi-supervised scenarios for its capability of leveraging the manifold structure of multimedia data. The proposed method is able to learn a classifier for different applications by selecting the discriminating features closely related to the semantic concepts. The objective function of our method is non-smooth and difficult to solve, so we design an efficient iterative algorithm with fast convergence, thus making it applicable to practical applications. Extensive experiments on image annotation, video concept detection and 3-D motion data analysis are performed on different real-world data sets to demonstrate the effectiveness of our algorithm.
    IEEE Transactions on Multimedia 12/2012; 14(6):1662-1672. DOI:10.1109/TMM.2012.2199293 · 2.30 Impact Factor
  • Source
    • "In pattern recognition, dimensionality reduction is an effective technique to solve the 'curse of dimensionality' [1] and improve classification performance and computational efficiency in many applications, such as face recognition [2] [3] [4] [5] [6] and image retrieval [7] [8] [9]. Recently, a general framework for dimensionality reduction has been developed in [10]. "
    [Show abstract] [Hide abstract]
    ABSTRACT: In this article, we develop a linear supervised subspace learning method called locality-based discriminant neighborhood embedding (LDNE), which can take advantage of the underlying submanifold-based structures of the data for classification. Our LDNE method can simultaneously consider both 'locality' of locality preserving projection (LPP) and 'discrimination' of discriminant neighborhood embedding (DNE) in manifold learning. It can find an embedding that not only preserves local information to explore the intrinsic submanifold structure of data from the same class, but also enhances the discrimination among submanifolds from different classes. To investigate the performance of LDNE, we compare it with the state-of-the-art dimensionality reduction techniques such as LPP and DNE on publicly available datasets. Experimental results show that our LDNE can be an effective and robust method for classification.
    The Computer Journal 08/2012; 56(9):1063-1082. DOI:10.1093/comjnl/bxs113 · 0.79 Impact Factor
Show more