Conference Proceeding

Predicting experimental similarity ratings and recognition rates for individual natural stimuli with the NIM model.

01/2005; In Proceedings of: BNAIC 2005 - Proceedings of the Seventeenth Belgium-Netherlands Conference on Artificial Intelligence, Brussels, Belgium, October 17-18, 2005
Source: DBLP

ABSTRACT In earlier work, we proposed a recognition memory model that operates directly on digitized natural images. The model is called the Natural Input Memory (NIM) model. When presented with a natural image, the NIM model employs a biologically-informed perceptual pre-processing method that translates the image into a similarity-space representation. In this paper, the NIM model is validated on individual natural stimuli (i.e., images of faces) in two tasks: (1) a similarity-rating task and (2) a recognition task. The results obtained with the NIM model are compared with the results of corresponding behavioral experiments. The similarity structure of the face images that is reflected in the similarity space forms the basis for the comparison. The results reveal that the NIM model's similarity ratings and recognition rates for individual images correlate well with those obtained in the behavioral experiments. We conclude that the NIM model successfully simulates similarity ratings and recognition performance for individual natural stimuli.
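The pre-processing and recognition steps the abstract describes can be illustrated with a minimal sketch. Assumptions (not the authors' implementation): a single-level 2-D Haar wavelet transform stands in for the paper's multi-scale wavelet decomposition, PCA is computed via SVD, and recognition uses a summed-similarity rule over stored feature vectors. All function names and the toy data are illustrative.

```python
# Minimal NIM-style sketch: wavelet decomposition -> PCA "similarity space"
# -> storage of feature vectors -> recognition by summed similarity.
import numpy as np

def haar_decompose(img):
    """One level of a 2-D Haar transform (approximation + 3 detail bands)."""
    a = (img[0::2, 0::2] + img[1::2, 0::2] + img[0::2, 1::2] + img[1::2, 1::2]) / 4.0
    h = (img[0::2, 0::2] - img[1::2, 0::2] + img[0::2, 1::2] - img[1::2, 1::2]) / 4.0
    v = (img[0::2, 0::2] + img[1::2, 0::2] - img[0::2, 1::2] - img[1::2, 1::2]) / 4.0
    d = (img[0::2, 0::2] - img[1::2, 0::2] - img[0::2, 1::2] + img[1::2, 1::2]) / 4.0
    return np.concatenate([b.ravel() for b in (a, h, v, d)])

def fit_pca(X, k):
    """PCA by SVD; returns (mean, top-k components) for the similarity space."""
    mu = X.mean(axis=0)
    _, _, vt = np.linalg.svd(X - mu, full_matrices=False)
    return mu, vt[:k]

def project(x, mu, comps):
    return comps @ (x - mu)

# Toy data: random "images" standing in for face photographs.
rng = np.random.default_rng(0)
studied = [rng.random((16, 16)) for _ in range(8)]
feats = np.array([haar_decompose(im) for im in studied])
mu, comps = fit_pca(feats, k=5)
memory = np.array([project(f, mu, comps) for f in feats])  # storage process

def familiarity(img):
    """Recognition process: summed similarity (decaying with distance) to stored traces."""
    q = project(haar_decompose(img), mu, comps)
    return np.sum(np.exp(-np.linalg.norm(memory - q, axis=1)))

# An exactly studied image contributes exp(0) = 1, so its familiarity exceeds 1;
# a novel image contributes only the smaller cross-item similarity terms.
old, new = familiarity(studied[0]), familiarity(rng.random((16, 16)))
```

In this sketch, "recognition" reduces to thresholding the summed-similarity score, which mirrors how similarity structure in the low-dimensional space drives the model's behavior.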

  • Source
    ABSTRACT: The full version of this paper appeared in: Proceedings of the CogSci 2004 Conference. A new recognition memory model, the natural input memory (NIM) model, is proposed which differs from existing models of human memory in that it operates on natural input. A biologically-informed pre-processing method, which is commonly used in artificial intelligence (8), takes local samples from a natural image and translates these into a feature vector representation. Existing memory models (e.g., the REM model, (9); the model of differentiation, (5)) lack such a pre-processing method and often make simplifying assumptions about item representations. These models represent an item by a vector of abstract features. The feature values are usually drawn from a particular mathematical distribution, which describes the distributional statistics of real-world perceptual features. Since these models artificially generate representations, they do not address the informational contribution of the similarity structure intrinsic to natural data. However, we believe that the similarity structure of natural data contains important information. Therefore, the NIM model operates on natural input and represents the similarity structure of the input. The NIM model encompasses the following two stages: (1) a perceptual pre-processing stage, and (2) a memory stage. The perceptual pre-processing stage derives the similarity structure from high-dimensional natural images by applying a multi-scale wavelet decomposition followed by a principal component analysis. This is an often-applied method in the domain of visual object recognition to model the first three stages of processing of information in the human visual system (i.e., retina/LGN, V1/V2, V4/LOC; (7)). Pre-processing a high-dimensional image results in a number of low-dimensional feature vectors, which reside in a so-called 'similarity space' (1).
In this space, representations of perceptually similar images reside in close proximity to each other. The memory stage comprises two processes: (a) a storage process, and (b) a recognition process. The storage process simply stores feature vectors. The recognition process compares feature vectors of the image to be recognized with previously stored feature vectors. Simulations with the NIM model showed that it is able to produce a number of recent findings from experimental recognition memory studies that relate to the similarity of
  • Source
    ABSTRACT: We present ongoing work on a project for automatic recognition of spontaneous facial actions. Spontaneous facial expressions differ substantially from posed expressions, similar to how continuous, spontaneous speech differs from isolated words produced on command. Previous methods for automatic facial expression recognition assumed images were collected in controlled environments in which the subjects deliberately faced the camera. Since people often nod or turn their heads, automatic recognition of spontaneous facial behavior requires methods for handling out-of-image-plane head rotations. Here we explore an approach based on 3-D warping of images into canonical views. We evaluated the performance of the approach as a front-end for a spontaneous expression recognition system using support vector machines and hidden Markov models. This system employed general-purpose learning mechanisms that can be applied to recognition of any facial movement. The system was tested for recognition of a set of facial actions defined by the Facial Action Coding System (FACS). We showed that 3D tracking and warping, followed by machine learning techniques applied directly to the warped images, is a viable and promising technology for automatic facial expression recognition. One exciting aspect of the approach presented here is that information about movement dynamics emerged out of filters which were derived from the statistics of images.
    Advances in Neural Information Processing Systems 15 [Neural Information Processing Systems, NIPS 2002, December 9-14, 2002, Vancouver, British Columbia, Canada]; 01/2002
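The geometric core of the warping front-end described above can be illustrated with a minimal sketch. Assumptions (not the authors' system): the head pose is already known rather than estimated from tracked landmarks, the "warp" is just the inverse rotation of 3-D face points followed by orthographic projection, and the 3-point face model and all names are illustrative.

```python
# Minimal sketch: undo an out-of-image-plane head rotation to recover a
# canonical (frontal) view of 3-D facial landmarks.
import numpy as np

def rotation_y(theta):
    """Rotation about the vertical axis (an out-of-image-plane head turn)."""
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, 0.0, s],
                     [0.0, 1.0, 0.0],
                     [-s, 0.0, c]])

def warp_to_canonical(points_3d, head_rotation):
    """Apply the inverse head rotation, then project orthographically.

    Points are stored as rows, so rotating by R is `p @ R.T` and undoing it
    (applying R^{-1} = R^T) is `p @ R`.
    """
    frontal = points_3d @ head_rotation
    return frontal[:, :2]  # drop depth: orthographic projection to the image plane

# Canonical 3-D landmarks (two eyes and a nose tip) and a 30-degree head turn.
canonical = np.array([[-1.0, 1.0, 0.0], [1.0, 1.0, 0.0], [0.0, 0.0, 1.0]])
R = rotation_y(np.pi / 6)
observed = canonical @ R.T               # landmarks as seen after the head turn
recovered = warp_to_canonical(observed, R)  # matches the frontal 2-D layout
```

In the full system described in the abstract, the images warped this way are what the downstream classifiers (SVMs and HMMs) operate on.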
  • Source
    ABSTRACT: Pictures of facial expressions from the Ekman and Friesen set (Ekman, P., Friesen, W. V., (1976). Pictures of facial affect. Palo Alto, California: Consulting Psychologists Press) were submitted to a principal component analysis (PCA) of their pixel intensities. The output of the PCA was submitted to a series of linear discriminant analyses which revealed three principal findings: (1) a PCA-based system can support facial expression recognition, (2) continuous two-dimensional models of emotion (e.g. Russell, J. A. (1980). A circumplex model of affect. Journal of Personality and Social Psychology, 39, 1161-1178) are reflected in the statistical structure of the Ekman and Friesen facial expressions, and (3) components for coding facial expression information are largely different to components for facial identity information. The implications for models of face processing are discussed.
    Vision Research 05/2001; 41(9):1179-208.
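The PCA-plus-discriminant pipeline described above can be illustrated with a minimal sketch. Assumptions (not the paper's implementation): PCA of pixel intensities is computed via SVD, and a nearest-class-mean rule in PCA space stands in for the paper's series of linear discriminant analyses; the two toy "expression" classes are random clusters, not the Ekman and Friesen images.

```python
# Minimal sketch: PCA of pixel intensities, then a simplified linear
# discriminant step (nearest class mean in PCA space).
import numpy as np

rng = np.random.default_rng(1)

# Toy data: two well-separated clusters of 64-pixel "expression" images.
happy = rng.normal(0.8, 0.05, size=(10, 64))
sad = rng.normal(0.2, 0.05, size=(10, 64))
X = np.vstack([happy, sad])
labels = np.array([0] * 10 + [1] * 10)

# PCA of pixel intensities via SVD.
mu = X.mean(axis=0)
_, _, vt = np.linalg.svd(X - mu, full_matrices=False)
Z = (X - mu) @ vt[:3].T  # project onto the top 3 components

# Discriminant step (simplified): one mean per class in PCA space.
means = np.array([Z[labels == c].mean(axis=0) for c in (0, 1)])

def classify(img):
    """Assign a new image to the class with the nearest mean in PCA space."""
    z = (img - mu) @ vt[:3].T
    return int(np.argmin(np.linalg.norm(means - z, axis=1)))
```

The point mirrored from the abstract's finding (1) is that a purely statistical, PCA-based representation of raw pixel intensities can already support expression classification, with no hand-built facial features.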
