Conference Paper

Using perceptually-based face indexing to facilitate human-computer collaborative retrieval

Abstract

While methods such as Principal Component Analysis, Linear/Fisher Discriminant Analysis, and Hidden Markov Models provide useful similarity measures between face images, they are not based on the factors that humans use to perceive facial similarity. This can make it difficult for humans to work collaboratively with face retrieval systems. For example, if a witness to a crime uses a query-by-example paradigm to retrieve the face of the perpetrator from a database of mug-shots, and if the similarity measures used for retrieval are not based on facial features that are salient or important to humans, the retrievals will likely be of limited value. Based on the observation that humans tend to name things that are particularly salient or important to them, this research uses words (such as bearded, bespectacled, big-eared, blond, buck-toothed, bug-eyed, curly-haired, dimpled, freckled, gap-toothed, long-faced, snub-nosed, thin-lipped, or wrinkled) to manually index face images. Pair-wise similarity values are then derived from the resulting feature vectors and are compared to ground-truth similarity values, which have been established by having humans hierarchically sort the same set of face images. This comparison indicates which words are most important for indexing the face images, allows the computation of a weighting factor for each word to enhance the overall quality of indexing, and suggests which facial features might provide a more intuitive basis for evaluating similarity.
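The indexing-and-comparison pipeline described in the abstract can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration: the four-word vocabulary, the binary feature vectors, the ground-truth values, the weighted-agreement similarity, and the use of Pearson correlation to compare against ground truth. The abstract does not state which similarity measure or comparison statistic the authors used.

```python
from itertools import combinations
from math import sqrt

# Hypothetical word list and manual index (1 = the word was applied to the face).
WORDS = ["bearded", "bespectacled", "freckled", "wrinkled"]
FACE_INDEX = {
    "face_a": [1, 1, 0, 0],
    "face_b": [1, 0, 0, 1],
    "face_c": [0, 0, 1, 0],
    "face_d": [1, 1, 0, 1],
}

# Hypothetical ground-truth similarities, standing in for values derived
# from human hierarchical sorting of the same images.
GROUND_TRUTH = {
    ("face_a", "face_b"): 0.60,
    ("face_a", "face_c"): 0.10,
    ("face_a", "face_d"): 0.90,
    ("face_b", "face_c"): 0.20,
    ("face_b", "face_d"): 0.70,
    ("face_c", "face_d"): 0.05,
}


def weighted_similarity(u, v, weights):
    """Weighted fraction of words on which two faces agree (0 to 1)."""
    agree = sum(w for a, b, w in zip(u, v, weights) if a == b)
    return agree / sum(weights)


def pairwise_similarities(index, weights):
    """Similarity for every unordered pair of faces, keyed by sorted names."""
    return {
        (p, q): weighted_similarity(index[p], index[q], weights)
        for p, q in combinations(sorted(index), 2)
    }


def pearson(xs, ys):
    """Pearson correlation between two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


def agreement_with_ground_truth(index, weights, ground_truth):
    """Correlate word-based similarities with the ground-truth values."""
    sims = pairwise_similarities(index, weights)
    pairs = sorted(ground_truth)
    return pearson([sims[p] for p in pairs], [ground_truth[p] for p in pairs])


uniform = [1.0] * len(WORDS)
print(agreement_with_ground_truth(FACE_INDEX, uniform, GROUND_TRUTH))

# Changing one word's weight shows how per-word weights shift the agreement.
reweighted = [1.0, 1.0, 0.5, 1.0]
print(agreement_with_ground_truth(FACE_INDEX, reweighted, GROUND_TRUTH))
```

Under these assumptions, a word whose weight change raises the correlation is a candidate for a higher weighting factor, which is one plausible reading of how the comparison could indicate the most important words.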
... In Black et al. (2002, 2005), the authors studied the method of using words that represent high-level concepts to estimate the similarity between face images. They indexed each face image by asking human participants to determine the usefulness of each word for describing that image. ...
Article
Content-based image retrieval (CBIR) research is currently faced with the so-called 'semantic gap' problem. CBIR researchers work at the near end of the gap, applying computer science methods to bridge it. Cognitive psychology researchers work at the far end of the gap, studying how humans perceive things in their environment. This article looks at the literature from a new perspective and presents a structured analysis of what CBIR researchers have done in their efforts to bridge the gap. This analysis suggests that more emphasis should be placed on studying the far end of the gap. The second part of this article discusses a study conducted at the far end of the gap, to begin the process of developing and testing techniques for externalising and analysing the visual concepts that are evoked by images. This more disciplined approach has the potential to guide researchers in their efforts to bridge the gap.
... In Black et al. (2002, 2005), the authors studied the method of using words that represent high-level concepts to estimate the similarity between face images. They indexed each face image by asking human participants to determine the usefulness of each word for describing that image. ...
Conference Paper
Over two decades of intensive research, researchers have employed many different approaches to solve the so-called “semantic gap” problem in image indexing and retrieval. Yet the gap is still recognized as a barrier to progress, and further work is needed to bridge (or at least narrow) it. This suggests that more emphasis should be placed on understanding how humans perceive images. This study measures the effectiveness of two image indexing techniques for estimating similarities between images: semantic basis functions and affective basis functions. These indexing techniques aim to provide a measure of similarity between outdoor natural images as humans see it. The results presented in this study suggest that the semantic basis functions outperform the affective basis functions, and are able to index the content of outdoor natural images in a manner that allows retrieval of images that have been judged to be subjectively similar.
Article
We present a system for recognizing human faces from single images out of a large database containing one image per person. Faces are represented by labeled graphs, based on a Gabor wavelet transform. Image graphs of new faces are extracted by an elastic graph matching process and can be compared by a simple similarity function. The system differs from the preceding one in three respects. Phase information is used for accurate node positioning. Object-adapted graphs are used to handle large rotations in depth. Image graph extraction is based on a novel data structure, the bunch graph, which is constructed from a small set of sample image graphs.
Article
We have developed a near-real-time computer system that can locate and track a subject's head, and then recognize the person by comparing characteristics of the face to those of known individuals. The computational approach taken in this system is motivated by both physiology and information theory, as well as by the practical requirements of near-real-time performance and accuracy. Our approach treats the face recognition problem as an intrinsically two-dimensional (2-D) recognition problem rather than requiring recovery of three-dimensional geometry, taking advantage of the fact that faces are normally upright and thus may be described by a small set of 2-D characteristic views. The system functions by projecting face images onto a feature space that spans the significant variations among known face images. The significant features are known as "eigenfaces," because they are the eigenvectors (principal components) of the set of faces; they do not necessarily correspond to features such as eyes, ears, and noses. The projection operation characterizes an individual face by a weighted sum of the eigenface features, and so to recognize a particular face it is necessary only to compare these weights to those of known individuals. Some particular advantages of our approach are that it provides for the ability to learn and later recognize new faces in an unsupervised manner, and that it is easy to implement using a neural network architecture.
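The projection-and-compare step described in this abstract can be sketched as follows. This is an illustrative reconstruction under stated assumptions, not the authors' implementation: it uses NumPy's singular value decomposition to obtain the principal components, random vectors as stand-in "face images", and nearest-neighbour matching on the projection weights. The function names `fit_eigenfaces`, `project`, and `recognize` are hypothetical.

```python
import numpy as np


def fit_eigenfaces(faces, n_components):
    """Return the mean face and the top principal components (eigenfaces).

    faces: array of shape (n_samples, n_pixels), one flattened image per row.
    """
    mean_face = faces.mean(axis=0)
    centered = faces - mean_face
    # Rows of vt are orthonormal principal directions of the centered data,
    # ordered by decreasing singular value.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return mean_face, vt[:n_components]


def project(face, mean_face, eigenfaces):
    """Weights of a face in the eigenface basis."""
    return eigenfaces @ (face - mean_face)


def recognize(face, mean_face, eigenfaces, known_weights):
    """Index of the known individual whose weights are closest to the probe."""
    weights = project(face, mean_face, eigenfaces)
    distances = np.linalg.norm(known_weights - weights, axis=1)
    return int(np.argmin(distances))


# Stand-in data (assumption): 5 random "faces" of 64 pixels each.
rng = np.random.default_rng(0)
gallery = rng.random((5, 64))
mean_face, eigenfaces = fit_eigenfaces(gallery, n_components=3)
known_weights = (gallery - mean_face) @ eigenfaces.T

# A slightly perturbed copy of gallery face 2 serves as the probe image.
probe = gallery[2] + 0.01 * rng.standard_normal(64)
print(recognize(probe, mean_face, eigenfaces, known_weights))
```

Comparing only the low-dimensional weights, rather than full images, is what makes the matching step cheap; the number of components kept is a tuning choice that the sketch fixes arbitrarily at three.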
Article
In this paper, we study how human observers judge image similarity. To do so, we have conducted two psychophysical scaling experiments and have compared the results to two algorithmic image similarity metrics. For these experiments, we selected a set of 97 digitized photographic images which represent a range of semantic categories, viewing distances, and colors. We then used the two perceptual and the two algorithmic methods to measure the similarity of each image to every other image in the data set, producing four similarity matrices. These matrices were analyzed using multidimensional scaling techniques to gain insight into the dimensions human observers use for judging image similarity, and how these dimensions differ from the results of algorithmic methods. This paper also describes and validates a new technique for collecting similarity judgments which can provide meaningful results with a factor of four fewer judgments, as compared with the paired comparisons method.
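As a rough illustration of analyzing a similarity matrix with multidimensional scaling, the sketch below implements classical (Torgerson) MDS with NumPy. The abstract does not say which MDS variant was used, so the choice of classical MDS is an assumption, as are the toy 3-by-3 similarity matrix and the conversion of similarity to dissimilarity as one minus similarity.

```python
import numpy as np


def classical_mds(dissimilarity, n_dims=2):
    """Embed items in n_dims dimensions from a symmetric dissimilarity matrix."""
    d = np.asarray(dissimilarity, dtype=float)
    n = d.shape[0]
    # Double-center the squared dissimilarities to obtain a Gram matrix.
    centering = np.eye(n) - np.ones((n, n)) / n
    gram = -0.5 * centering @ (d ** 2) @ centering
    # eigh returns eigenvalues in ascending order; keep the largest n_dims.
    eigvals, eigvecs = np.linalg.eigh(gram)
    top = np.argsort(eigvals)[::-1][:n_dims]
    # Clip small negative eigenvalues that arise from non-Euclidean input.
    scale = np.sqrt(np.clip(eigvals[top], 0.0, None))
    return eigvecs[:, top] * scale


# Toy similarity matrix (assumption): values in [0, 1], 1 on the diagonal.
similarity = np.array([
    [1.0, 0.8, 0.2],
    [0.8, 1.0, 0.3],
    [0.2, 0.3, 1.0],
])
coords = classical_mds(1.0 - similarity, n_dims=2)
print(coords)
```

Each row of `coords` is one item's position in the recovered space; inspecting which items spread out along each axis is one way to look for the dimensions observers might be using.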
Article
Humans detect and identify faces in a scene with little or no effort. However, building an automated system that accomplishes this task is very difficult. There are several related subproblems: detection of a pattern as a face, identification of the face, analysis of facial expressions, and classification based on physical features of the face. A system that performs these operations will find many applications, e.g. criminal identification, authentication in secure systems, etc. Most of the work to date has been in identification. This paper surveys the past work in solving these problems. The capability of the human visual system with respect to these problems is also discussed. It is meant to serve as a guide for an automated system. Some new approaches to these problems are also briefly discussed.