Article

Image Retrieval with a Bayesian Model of Relevance Feedback


Abstract

A content-based image retrieval system based on multinomial relevance feedback is proposed. The system relies on an interactive search paradigm where at each round a user is presented with k images and selects the one closest to their ideal target. Two approaches, one based on the Dirichlet distribution and one based on the Beta distribution, are used to model the problem, motivating an algorithm that trades off exploration against exploitation in presenting the images in each round. Experimental results show that the new approach compares favourably with previous work.
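The interaction loop the abstract describes can be sketched as follows. This is a minimal illustration, not the authors' algorithm: it assumes a Dirichlet posterior maintained directly over image indices, Thompson-style sampling to choose the k displayed images, and a simulated user who picks by index distance to a hypothetical target; the function names and the index-distance selection proxy are assumptions.

```python
import numpy as np

def select_images(alpha, k, rng):
    """Thompson-style display: sample relevance weights from the
    Dirichlet posterior and show the k highest-weight images."""
    weights = rng.dirichlet(alpha)
    return np.argsort(weights)[-k:][::-1]

def update(alpha, chosen):
    """The user picked image `chosen` from the display: add one pseudo-count."""
    alpha = alpha.copy()
    alpha[chosen] += 1.0
    return alpha

rng = np.random.default_rng(0)
n_images, k = 100, 4
alpha = np.ones(n_images)          # uniform Dirichlet prior over the collection
target = 42                        # simulated user's ideal image (hypothetical)

for _ in range(50):
    shown = select_images(alpha, k, rng)
    # Simulated user: picks the displayed image closest to the target index.
    chosen = min(shown, key=lambda i: abs(int(i) - target))
    alpha = update(alpha, chosen)
```

Because sampled weights drive the display, images with few pseudo-counts still surface occasionally (exploration) while frequently-chosen ones dominate (exploitation).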


... Early on, the rise of image retrieval spurred the development of many feature descriptors, such as the color histogram [1], texture histogram [2], SIFT [3], rgSIFT [4], PHOG [5], and GIST [6]. Meanwhile, a large number of retrieval models, such as the Bayesian model [7,8], random forest model [9], and SVM model [10][11][12], directly support the retrieval process. Moreover, quite a few improved retrieval models [13][14][15] increase the accuracy of image retrieval significantly. ...
Article
Full-text available
Given one specific image, it would be quite significant if one could simply retrieve all pictures that fall into a similar category. However, traditional methods tend to achieve high-quality retrieval by utilizing adequate learning instances, ignoring the extraction of the image's essential information, which makes it difficult to retrieve similar-category images using just one reference image. Aiming to solve this problem, we propose in this paper a refined sparse-representation-based similar-category image retrieval model. On the one hand, saliency detection and multi-level decomposition help take salient and spatial information into consideration more fully. On the other hand, the cross mutual sparse coding model aims to extract the image's essential features to the maximum extent possible. Finally, we set up a database containing a large number of multi-source images. Adequate groups of comparative experiments show that our method can retrieve similar-category images effectively, and adequate groups of ablation experiments show that nearly all procedures play their respective roles.
... In this case, extensions of the query can lead to more inappropriate results than the initial query because of irrelevant concepts, which significantly increases noise [17]. Another way to grasp the user's intention is relevance feedback [18][19][20][21] and pseudo relevance feedback. In the relevance feedback technique, the user chooses from the returned images those that he/she finds relevant (positive examples) and those that he/she finds not relevant (negative examples); a query-specific similarity metric is then learned from the selected examples. ...
... Fig. 6 shows a sample medical image super-resolution illustration (low-resolution input images and the high-resolution grid). ... achieve the purpose of improving the performance [27,34,52]. We first show the image sampling procedure as formula 16. ...
Article
Full-text available
This paper proposes a novel joint registration and super-resolution paradigm for medical images in the Internet of Things environment. In medical image processing, the matching problem is one that attracts wide attention within the research domain. Image registration techniques can be divided into similarity measure, optimization, geometric transformation, and interpolation, etc. As the first essential component of our model, we propose a novel registration algorithm based on energy feature extraction. Generally, the terms of the matching energy function given by the similarity measurement and a penalty constraint are called the external force and endogenic force, respectively. Matching is a mutual competition between the external and endogenic forces that eventually reaches a balanced state. Further, we integrate game analysis and area feature selection to achieve a better image super-resolution mode through pretreatment of the image to change the initial value, so as to improve performance. Beyond the algorithmic innovation, we integrate the GPU and the IoT to construct a hardware-based implementation of the proposed medical image processing system. The latency of registers to read and write data across a GPU's entire storage system is minimal; register storage is private to each thread and can only be accessed by its owning thread. For each thread, local memory is also private; it is often used to handle register overflow, substantially reducing the possibility of buffer overflow across the entire application, while shared memory is visible to all threads within the thread block. We thus achieve an optimal integration of IoT and GPU. The experimental results demonstrate the robustness of the method.
Conference Paper
With the increase of digital media databases, the need for methods that can allow the user to efficiently peruse them has risen dramatically. This paper studies how to explore image datasets more efficiently in online content-based image retrieval (CBIR). We present a new approach for exploratory CBIR that is dynamic, robust and gives a good coverage of the search space, while maintaining a high retrieval precision. Our method uses deep similarity-based learning to find a new representation of the image space. With this metric, it finds the central point of interest and clusters its local region to present the user with representative images within the vicinity of their target search. This clustering provides a more varied training set for the next iteration, allowing the location of relevant features faster. Additionally, relearning a representation of the user’s search interest in each round enables the system to find other non-local regions of interest in the search space, thus preventing the user from getting stuck in a context trap. We test our method in a simulated online setting, taking into consideration the accuracy, coverage and flexibility of adapting to changes in the user’s interest.
Article
Full-text available
This paper presents a tutorial introduction to the use of variational methods for inference and learning in graphical models (Bayesian networks and Markov random fields). We present a number of examples of graphical models, including the QMR-DT database, the sigmoid belief network, the Boltzmann machine, and several variants of hidden Markov models, in which it is infeasible to run exact inference algorithms. We then introduce variational methods, which exploit laws of large numbers to transform the original graphical model into a simplified graphical model in which inference is efficient. Inference in the simplified model provides bounds on probabilities of interest in the original model. We describe a general framework for generating variational transformations based on convex duality. Finally we return to the examples and demonstrate how variational algorithms can be formulated in each case.
Article
Full-text available
This paper presents the theory, design principles, implementation and performance results of PicHunter, a prototype content-based image retrieval (CBIR) system. In addition, this document presents the rationale, design and results of psychophysical experiments that were conducted to address some key issues that arose during PicHunter's development. The PicHunter project makes four primary contributions to research on CBIR. First, PicHunter represents a simple instance of a general Bayesian framework which we describe for using relevance feedback to direct a search. With an explicit model of what users would do, given the target image they want, PicHunter uses Bayes's rule to predict the target they want, given their actions. This is done via a probability distribution over possible image targets, rather than by refining a query. Second, an entropy-minimizing display algorithm is described that attempts to maximize the information obtained from a user at each iteration of the search. Third, PicHunter makes use of hidden annotation rather than a possibly inaccurate/inconsistent annotation structure that the user must learn and make queries in. Finally, PicHunter introduces two experimental paradigms to quantitatively evaluate the performance of the system, and psychophysical experiments are presented that support the theoretical claims.
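The core of PicHunter's first contribution — predicting the target from the user's action via Bayes' rule — can be sketched roughly as follows. This is an illustrative simplification, not the paper's exact user model: the soft-max-over-similarity likelihood, the toy feature vectors, and the `sigma` parameter are all assumptions.

```python
import numpy as np

def bayes_update(prior, features, shown, chosen, sigma=1.0):
    """One Bayes-rule step over candidate targets: for each candidate t,
    estimate the probability that a user whose target is t would pick
    `chosen` out of the displayed set `shown`, then apply Bayes' rule."""
    # Distance of every candidate target to each displayed image.
    d = np.linalg.norm(features[:, None, :] - features[shown][None, :, :], axis=2)
    sim = np.exp(-d / sigma)                     # assumed soft user model
    likelihood = sim[:, shown.index(chosen)] / sim.sum(axis=1)
    posterior = prior * likelihood
    return posterior / posterior.sum()

rng = np.random.default_rng(1)
feats = rng.normal(size=(50, 8))                 # toy image features (assumed)
p = np.full(50, 1 / 50)                          # uniform prior over targets
shown = [3, 17, 31, 44]                          # hypothetical display
p = bayes_update(p, feats, shown, chosen=17)
```

After the update, posterior mass shifts toward candidates resembling the chosen image; iterating over rounds concentrates the distribution on the target, which is the search mechanism the abstract describes in place of query refinement.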
Article
Thompson sampling is one of the oldest heuristics for addressing the exploration/exploitation trade-off, but it is surprisingly unpopular in the literature. We present here some empirical results using Thompson sampling on simulated and real data, and show that it is highly competitive. Since this heuristic is very easy to implement, we argue that it should be part of the standard baselines to compare against.
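The heuristic the abstract refers to is simple enough to show in full for the Bernoulli bandit case; the true arm means below are made up for the simulation.

```python
import numpy as np

def thompson_bernoulli(true_means, n_rounds, seed=0):
    """Thompson sampling for Bernoulli bandits: keep a Beta(wins+1,
    losses+1) posterior per arm, draw one sample from each posterior,
    and play the arm with the highest draw."""
    rng = np.random.default_rng(seed)
    k = len(true_means)
    wins, losses = np.zeros(k), np.zeros(k)
    for _ in range(n_rounds):
        theta = rng.beta(wins + 1, losses + 1)   # one draw per posterior
        arm = int(np.argmax(theta))
        reward = rng.random() < true_means[arm]  # simulated Bernoulli reward
        wins[arm] += reward
        losses[arm] += 1 - reward
    return wins, losses

wins, losses = thompson_bernoulli([0.2, 0.5, 0.8], n_rounds=2000)
pulls = wins + losses
```

Sampling from the posterior (rather than taking its mean) is what supplies exploration: uncertain arms occasionally produce large draws and get tried, while clearly inferior arms are pulled less and less often.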
Article
Latent Dirichlet analysis, or topic modeling, is a flexible latent variable framework for modeling high-dimensional sparse count data. Various learning algorithms have been developed in recent years, including collapsed Gibbs sampling, variational inference, and maximum a posteriori estimation, and this variety motivates the need for careful empirical comparisons. In this paper, we highlight the close connections between these approaches. We find that the main differences are attributable to the amount of smoothing applied to the counts. When the hyperparameters are optimized, the differences in performance among the algorithms diminish significantly. The ability of these algorithms to achieve solutions of comparable accuracy gives us the freedom to select computationally efficient approaches. Using the insights gained from this comparative study, we show how accurate topic models can be learned in several seconds on text corpora with thousands of documents.
Chapter
We investigate models for content-based image retrieval with relevance feedback, in particular focusing on the exploration-exploitation dilemma. We propose quantitative models for the user behavior and investigate implications of these models. Three search algorithms for efficient searches based on the user models are proposed and evaluated. In the first model a user queries a database for the most (or a sufficiently) relevant image. The user gives feedback to the system by selecting the most relevant image from a number of images presented by the system. In the second model we consider a filtering task where relevant images should be extracted from a database and presented to the user. The feedback of the user is a binary classification of each presented image as relevant or irrelevant. While these models are related, they differ significantly in the kind of feedback provided by the user. This requires very different mechanisms to trade off exploration (finding out what the user wants) and exploitation (serving images which the system believes relevant for the user).
Article
Representing texture images statistically as histograms over a discrete vocabulary of local features has proven widely effective for texture classification tasks. Images are described locally by vectors of, for example, responses to some filter bank; and a visual vocabulary is defined as a partition of this descriptor-response space, typically based on clustering. In this paper, we investigate the performance of an approach which represents textures as histograms over a visual vocabulary which is defined geometrically, based on the Basic Image Features of Griffin and Lillholm (Proc. SPIE 6492(09):1–11, 2007), rather than by clustering. BIFs provide a natural mathematical quantisation of a filter-response space into qualitatively distinct types of local image structure. We also extend our approach to deal with intra-class variations in scale. Our algorithm is simple: there is no need for a pre-training step to learn a visual dictionary, as in methods based on clustering, and no tuning of parameters is required to deal with different datasets. We have tested our implementation on three popular and challenging texture datasets and find that it produces consistently good classification results on each, including what we believe to be the best reported for the KTH-TIPS and equal best reported for the UIUCTex databases. Keywords: Texture classification · Basic Image Features · Textons
Article
We have witnessed great interest and a wealth of promise in content-based image retrieval as an emerging technology. While the last decade laid the foundation for such promise, it also paved the way for a large number of new techniques and systems, got many new people involved, and triggered stronger association of weakly related fields. In this paper, we survey almost 300 key theoretical and empirical contributions in the current decade related to image retrieval and automatic image annotation, and discuss the spawning of related sub-fields in the process. We also discuss significant challenges involved in the adaptation of existing image retrieval techniques to build systems that can be useful in the real world. In retrospect of what has been achieved so far, we also conjecture what the future may hold for image retrieval research.
Article
Thesis (Ph.D.)--University of London, 2003.
Article
With the advent of the Internet, billions of images are now freely available online and constitute a dense sampling of the visual world. Using a variety of non-parametric methods, we explore this world with the aid of a large dataset of 79,302,017 images collected from the Internet. Motivated by psychophysical results showing the remarkable tolerance of the human visual system to degradations in image resolution, the images in the dataset are stored as 32 x 32 color images. Each image is loosely labeled with one of the 75,062 non-abstract nouns in English, as listed in the Wordnet lexical database. Hence the image database gives a comprehensive coverage of all object categories and scenes. The semantic information from Wordnet can be used in conjunction with nearest-neighbor methods to perform object classification over a range of semantic levels minimizing the effects of labeling noise. For certain classes that are particularly prevalent in the dataset, such as people, we are able to demonstrate a recognition performance comparable to class-specific Viola-Jones style detectors.
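The nearest-neighbour recipe over tiny images can be sketched as below. This is a toy illustration under stated assumptions: crude box downsampling instead of proper resampling, synthetic grayscale "images", and made-up labels; the paper's dataset uses 32 x 32 colour thumbnails with WordNet labels.

```python
import numpy as np

def tiny(img, s=32):
    """Crude box downsampling to an s x s 'tiny image' descriptor,
    mean-subtracted and L2-normalised for comparison."""
    h, w = img.shape[:2]
    ys = np.arange(s) * h // s
    xs = np.arange(s) * w // s
    t = img[np.ix_(ys, xs)].astype(float)
    t -= t.mean()
    n = np.linalg.norm(t)
    return t / n if n else t

def nearest_label(query, dataset):
    """1-NN over tiny-image descriptors; dataset is (image, label) pairs."""
    q = tiny(query).ravel()
    best = min(dataset, key=lambda item: np.linalg.norm(q - tiny(item[0]).ravel()))
    return best[1]

rng = np.random.default_rng(3)
flat = np.full((64, 64), 0.5)                       # synthetic exemplars
grad = np.linspace(0, 1, 64)[None, :] * np.ones((64, 1))
dataset = [(flat, "flat"), (grad, "gradient")]
label = nearest_label(grad + 0.02 * rng.normal(size=(64, 64)), dataset)
```

The point the abstract makes is that with a dense enough sampling of the visual world, even this weak descriptor plus nearest-neighbour lookup becomes competitive, because almost every query has a close neighbour in the dataset.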
Article
We provide an introduction to the theory and use of variational methods for inference and estimation in the context of graphical models. Variational methods become useful as efficient approximate methods when the structure of the graph model no longer admits feasible exact probabilistic calculations. The emphasis of this tutorial is on illustrating how inference and estimation problems can be transformed into variational form, along with describing the resulting approximation algorithms and their properties insofar as these are currently known. 1 Introduction The term variational methods refers to a large collection of optimization techniques. The classical context for these methods involves finding the extremum of an integral depending on an unknown function and its derivatives. This classical definition, however, and the accompanying calculus of variations no longer adequately characterize modern variational methods. Modern variational approaches have become indispensable tools in...
Article
This paper presents a novel practical framework for Bayesian model averaging and model selection in probabilistic graphical models. Our approach approximates full posterior distributions over model parameters and structures, as well as latent variables, in an analytical manner. These posteriors fall out of a free-form optimization procedure, which naturally incorporates conjugate priors. Unlike in large sample approximations, the posteriors are generally non-Gaussian and no Hessian needs to be computed. Predictive quantities are obtained analytically. The resulting algorithm generalizes the standard Expectation Maximization algorithm, and its convergence is guaranteed. We demonstrate that this approach can be applied to a large class of models in several domains, including mixture models and source separation. 1 Introduction A standard method to learn a graphical model from data is maximum likelihood (ML). Given a training dataset, ML estimates a single optimal value f...
Sifting through images with multinomial relevance feedback
  • D Głowacka
  • A Medlar
  • J Shawe-Taylor
D. Głowacka, A. Medlar, and J. Shawe-Taylor. Sifting through images with multinomial relevance feedback. In NIPS Workshop Beyond Classification: Machine Learning for Next Generation Computer Vision Challenges, 2010.
Using basic image features for texture classification
  • M Crosier
  • L Griffin
M. Crosier and L. Griffin. Using basic image features for texture classification. International Journal of Computer Vision, 88(3):447–460, 2010.

Image retrieval: Ideas, influences, and trends of the new age
  • R Datta
  • D Joshi
  • J Li
  • J Wang
R. Datta, D. Joshi, J. Li, and J. Wang. Image retrieval: Ideas, influences, and trends of the new age. ACM Comput. Surv., 2008.