Semantic manifold learning for image retrieval.
ABSTRACT: Learning the user's semantics for content-based image retrieval (CBIR) involves two different sources of information: the similarity relations entailed by the content-based features, and the relevance relations specified in the feedback. Given that, we propose an augmented relation embedding (ARE) to map the image space into a semantic manifold that faithfully captures the user's preferences. Besides ARE, we also look into the issue of selecting a good feature set to improve retrieval performance. Combining these two efforts, we have built a system that yields far better results than those previously reported. Overall, our approach can be characterized by three key properties: 1) The framework uses one relational graph to describe the similarity relations, and two others to encode the relevant and irrelevant relations indicated in the feedback. 2) With the relational graphs so defined, learning a semantic manifold can be transformed into solving a constrained optimization problem, which reduces to the ARE algorithm and accounts for both the representation and the classification points of view. 3) An image representation based on augmented features is introduced to couple with the ARE learning. These features are significant in capturing the semantics of image regions at different scales. We conclude with experimental results and comparisons that demonstrate the effectiveness of our method.
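The three-graph formulation described in the abstract can be illustrated with a small sketch. The abstract does not state the exact objective, so everything specific here is an assumption made for illustration: a 0/1 k-nearest-neighbour similarity graph, fully connected relevant/irrelevant feedback graphs, a linear projection, and a generalized eigenproblem that pushes irrelevant pairs apart and pulls relevant pairs together while normalizing by the similarity graph. The function names (`knn_similarity_graph`, `feedback_graphs`, `embed`) and the parameters `gamma` and `eps` are hypothetical and are not taken from the paper.

```python
import numpy as np

def graph_laplacian(W):
    """Unnormalized Laplacian L = D - W of a symmetric weight matrix."""
    return np.diag(W.sum(axis=1)) - W

def knn_similarity_graph(X, k=3):
    """Symmetric 0/1 k-nearest-neighbour graph over the rows of X (assumed form)."""
    n = X.shape[0]
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=2)
    np.fill_diagonal(d2, np.inf)          # exclude self-matches
    W = np.zeros((n, n))
    for i in range(n):
        W[i, np.argsort(d2[i])[:k]] = 1.0
    return np.maximum(W, W.T)             # symmetrize

def feedback_graphs(n, relevant, irrelevant):
    """Relevant graph links relevant pairs; irrelevant graph links
    each relevant image to each irrelevant one (assumed construction)."""
    W_pos = np.zeros((n, n))
    W_neg = np.zeros((n, n))
    for i in relevant:
        for j in relevant:
            if i != j:
                W_pos[i, j] = 1.0
        for j in irrelevant:
            W_neg[i, j] = W_neg[j, i] = 1.0
    return W_pos, W_neg

def embed(X, relevant, irrelevant, dim=2, k=3, gamma=1.0, eps=1e-6):
    """Linear embedding from the generalized eigenproblem A p = lambda B p,
    with A built from the feedback graphs and B from the similarity graph."""
    n, d = X.shape
    Xc = X - X.mean(axis=0)
    W_pos, W_neg = feedback_graphs(n, relevant, irrelevant)
    L_sim = graph_laplacian(knn_similarity_graph(Xc, k))
    A = Xc.T @ (graph_laplacian(W_neg) - gamma * graph_laplacian(W_pos)) @ Xc
    B = Xc.T @ L_sim @ Xc + eps * np.eye(d)   # regularized to be positive definite
    # Reduce to a standard symmetric eigenproblem via Cholesky whitening.
    C_inv = np.linalg.inv(np.linalg.cholesky(B))
    M = C_inv @ A @ C_inv.T
    vals, vecs = np.linalg.eigh((M + M.T) / 2)
    top = np.argsort(vals)[::-1][:dim]        # largest eigenvalues first
    P = C_inv.T @ vecs[:, top]
    return Xc @ P, P

rng = np.random.default_rng(0)
X = rng.normal(size=(20, 5))                  # 20 toy "images", 5 features each
Y, P = embed(X, relevant=[0, 1, 2], irrelevant=[3, 4, 5], dim=2)
```

In a relevance-feedback loop, one would presumably recompute the embedding after each round of feedback and rank the database by distance to the relevant images in the embedded space; that usage is likewise an assumption rather than something the abstract specifies.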
Available from: Yen-Yu Lin, Jul 06, 2015
- Source available from: Xiaojun Chang
ABSTRACT: In this paper, we propose a novel semi-supervised feature selection framework that mines correlations among multiple tasks, and we apply it to different multimedia applications. Instead of independently computing the importance of features for each task, our algorithm leverages shared knowledge from multiple related tasks, thereby improving the performance of feature selection. Note that we build our algorithm on the assumption that different tasks share common structures. The proposed algorithm selects features in a batch mode, which takes the correlations between different features into consideration. Moreover, because labeling a large amount of real-world training data is both time-consuming and tedious, we adopt manifold learning, which exploits both labeled and unlabeled training data for feature space analysis. Since the objective function is non-smooth and difficult to solve, we propose an iterative algorithm with fast convergence. Extensive experiments on different applications demonstrate that our algorithm outperforms other state-of-the-art feature selection algorithms.
- ABSTRACT: In this paper, we propose a novel semi-supervised feature analysis framework for multimedia data understanding and apply it to three different applications: image annotation, video concept detection and 3-D motion data analysis. Our method is built upon two advancements of the state of the art: (1) l2,1-norm regularized feature selection, which can jointly select the most relevant features from all the data points. This feature selection approach has been shown in the literature to be robust and efficient, as it considers the correlation between different features jointly when conducting feature selection; (2) manifold learning, which analyzes the feature space by exploiting both labeled and unlabeled data. It is a widely used technique for extending algorithms to semi-supervised scenarios because it can leverage the manifold structure of multimedia data. The proposed method is able to learn a classifier for different applications by selecting the discriminating features closely related to the semantic concepts. The objective function of our method is non-smooth and difficult to solve, so we design an efficient iterative algorithm with fast convergence, making it suitable for practical applications. Extensive experiments on image annotation, video concept detection and 3-D motion data analysis are performed on different real-world data sets to demonstrate the effectiveness of our algorithm.
IEEE Transactions on Multimedia 12/2012; 14(6):1662-1672. DOI:10.1109/TMM.2012.2199293 · 1.78 Impact Factor
- The Computer Journal 08/2012; 56(9):1063-1082. DOI:10.1093/comjnl/bxs113 · 0.89 Impact Factor
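
As a brief illustration of the l2,1-norm mentioned in the IEEE Transactions on Multimedia abstract above: for a weight matrix with one row per feature, the l2,1-norm is the sum of the Euclidean norms of the rows, so penalizing it tends to drive whole rows to zero. The sketch below only computes the norm and ranks features by row norm; the matrix `W` and the function names are hypothetical, and this is not the iterative solver the abstract refers to.

```python
import numpy as np

def l21_norm(W):
    """Sum over rows of the Euclidean norm of each row."""
    return float(np.sqrt((W ** 2).sum(axis=1)).sum())

def rank_features(W):
    """Feature indices ordered from largest to smallest row norm."""
    return np.argsort(-np.linalg.norm(W, axis=1))

# Toy weight matrix: 3 features (rows) by 2 outputs (columns).
W = np.array([[3.0, 4.0],
              [0.0, 0.0],
              [1.0, 0.0]])

total = l21_norm(W)         # row norms are 5, 0 and 1
order = rank_features(W)    # feature 0 first, the all-zero feature 1 last
```

A feature whose row is entirely zero contributes nothing to any output, which is why row norms are a natural selection criterion under this kind of regularizer.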