Predicting Experimental Similarity Ratings and Recognition Rates for Individual
Natural Stimuli with the NIM Model
Joyca P.W. Lacroix (firstname.lastname@example.org)
Eric O. Postma (email@example.com)
Department of Computer Science, IKAT, Maastricht University; Minderbroedersberg 6a, Maastricht, The Netherlands
Jaap M.J. Murre (firstname.lastname@example.org)
Department of Psychology, University of Amsterdam; Roeterstraat 15, 1018 WB Amsterdam, The Netherlands
Abstract
In earlier work, we proposed a recognition memory model that operates directly on digitized natural images. The model is called the Natural Input Memory (NIM) model. When presented with a natural image, the NIM model employs a biologically-informed perceptual pre-processing method that translates the image into a similarity-space representation. In this paper, the NIM model is validated on individual natural stimuli (i.e., images of faces) in two tasks: (1) a similarity-rating task and (2) a recognition task. The results obtained with the NIM model are compared with the results of corresponding behavioral experiments. The similarity structure of the face images that is reflected in the similarity space forms the basis for the comparison. The results reveal that the NIM model’s similarity ratings and recognition rates for individual images correlate well with those obtained in the behavioral experiments. We conclude that the NIM model successfully simulates similarity ratings and recognition performance for individual natural stimuli.
Keywords: Memory modeling, perception, representation, recognition, computer science, computer simulation.
Introduction
Many psychological modelers have developed so-called ‘similarity spaces’ that represent similar items in close proximity to each other (e.g., Shepard, 1957; Nosofsky, 1986; Shepard, 1986; Valentine, 1991; Busey, 1998; Edelman, 1998; Steyvers, Shiffrin, & Nelson, 2004). So far, existing memory models have usually been tested with three types of spaces: (1) artificially generated similarity spaces (e.g., Shiffrin & Steyvers, 1997), (2) similarity spaces derived from co-occurrence of objects in the environment (e.g., word co-occurrence in texts; Landauer & Dumais, 1997), and (3) similarity spaces derived from human similarity judgments (e.g., Busey & Tunnicliff, 1999). In the last case, the resulting similarity spaces reflect similarities obtained from human judgments. However, because the representations are not derived directly from the stimuli, these models fall short in constructing a representation that is grounded in the natural world.
We proposed the Natural Input Memory (NIM) model (Lacroix, Murre, Postma, & Van den Herik, 2004), which employs a biologically-informed perceptual pre-processing method to derive representations directly from natural visual stimuli. Therefore, unlike any of the existing models, the NIM model is able to make predictions for responses to individual natural stimuli. In our earlier work, we showed that the NIM model produces general effects found in experimental recognition memory studies (Lacroix et al., 2004; Lacroix, Murre, Postma, & Van den Herik, in prep). In this paper, we show that the NIM model can predict similarity ratings and recognition rates for individual natural stimuli. The model is tested on a similarity-rating task and a face-recognition task using the same facial stimuli that were used in behavioral experiments (Busey, unpublished results; Busey, 1998; Busey & Tunnicliff, 1999).
The outline of the remainder of this paper is as follows. In the following section, the NIM model is introduced and discussed briefly. Then, we test the NIM model on the similarity-rating task and the face-recognition task, discuss the results, and provide a conclusion.
The NIM model
The NIM model encompasses the following two stages.
1. A perceptual pre-processing stage that translates a natural image into feature vectors.
2. A memory stage comprising two processes:
(a) a storage process that stores feature vectors in a straightforward manner, and
(b) a recognition process that compares feature vectors of the image to be recognized with previously stored feature vectors.
Below we will discuss both stages briefly (see Lacroix et al.,
2004, for a more detailed exposition of the NIM model).
The perceptual pre-processing stage
The perceptual pre-processing stage in the NIM model is based on the processing of information in the human visual system. Motivated by eye fixations in human vision, the pre-processing stage takes local samples at selected locations in a natural image. Since humans tend to place fixations at or near contours in a visual scene, the NIM model ‘fixates’ randomly along the contours in an image (e.g., Norman, Phillips, & Ross, 2001). The features extracted at fixation locations in the face image consist of the responses of different derivatives of Gaussian filters. Such multi-scale wavelet decomposition models are biologically plausible (see, e.g., Bartlett, Littleworth, Braathen, Sejnowski, & Movellan, 2003; Kahana & Sekuler, 2002; Palmeri & Gauthier, 2004). By applying a set of filters at multiple scales and orientations, a representation of the image area centered at the fixation location is obtained (Freeman & Adelson, 1991). A pre-processed image is then represented by a number of feature vectors, each corresponding to a fixation. After feature-vector extraction, we apply a principal component analysis to the extracted feature vectors in order to reduce their dimensionality. We selected the first [...] approximately 50 components are sufficient to accurately represent faces (e.g., Hancock, Burton, & Bruce, 1996).
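The dimensionality-reduction step can be illustrated with a minimal sketch. It assumes the feature vectors are stacked as rows of a matrix and performs the principal component analysis through a singular value decomposition; the function name `pca_reduce`, the random stand-in data, and the 120 filter responses per fixation are hypothetical and are not taken from the NIM implementation.

```python
import numpy as np

def pca_reduce(features, n_components):
    """Project row-wise feature vectors onto their first principal components."""
    mean = features.mean(axis=0)
    centered = features - mean
    # Rows of vt are the principal axes, ordered by decreasing variance.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    components = vt[:n_components]
    return centered @ components.T, mean, components

# Hypothetical stand-in data: 200 fixations, 120 filter responses per fixation.
rng = np.random.default_rng(0)
raw_features = rng.normal(size=(200, 120))
reduced, mean, components = pca_reduce(raw_features, n_components=50)
print(reduced.shape)
```

New feature vectors would be projected with the same `mean` and `components`, so that stored and probe vectors live in one common similarity space.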
The memory stage
Whereas the pre-processing stage of the NIM model is biologically informed, the memory stage is kept as simple as possible. Instead of attempting to model the cortical or subcortical [...] we simply assume that feature vectors are retained (i.e., ‘stored’) in a similarity space (cf. Nosofsky, 1986). The memory stage comprises two processes: the storage process and the recognition process. The storage process stores the pre-processed natural image in memory. The recognition process determines the familiarity of the image to be recognized by comparing feature vectors of the image with previously stored feature vectors. The familiarity of an image is equal to the average familiarity of its feature vectors. The familiarity of a feature vector is determined by the number of previously stored feature vectors that fall within a hypersphere with radius r, centered around the feature vector (i.e., the number of nearest neighbors). Formally, the familiarity of image A, which we will denote as $f_A$, is defined as:
$$f_A = \frac{1}{t_A} \sum_{i=1}^{t_A} f_{A_i},$$
with $f_{A_i}$ the familiarity of the ith feature vector of image A, and $t_A$ the total number of feature vectors extracted from image A. To obtain false-alarm and hit rates, we compare our familiarity values, f, to a decision criterion θ. When the familiarity value, f, of an image to be recognized exceeds the criterion, the image is recognized; otherwise, the image is not recognized.
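The recognition process described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes Euclidean distance, treats vectors exactly at distance r as inside the hypersphere, and uses raw neighbor counts without any normalization (the text does not specify one). The function names and the toy vectors are hypothetical.

```python
import numpy as np

def vector_familiarity(probe, stored, r):
    """Number of stored vectors within distance r of one probe vector.

    Assumption: Euclidean distance, boundary included.
    """
    distances = np.linalg.norm(stored - probe, axis=1)
    return int(np.sum(distances <= r))

def image_familiarity(probe_vectors, stored, r):
    """f_A: mean familiarity over the t_A feature vectors of image A."""
    counts = [vector_familiarity(p, stored, r) for p in probe_vectors]
    return float(np.mean(counts))

def recognize(probe_vectors, stored, r, theta):
    """An image is recognized when its familiarity exceeds the criterion theta."""
    return image_familiarity(probe_vectors, stored, r) > theta

# Toy two-dimensional example (hypothetical values).
stored = np.array([[0.0, 0.0], [1.0, 0.0], [5.0, 5.0]])
probe = np.array([[0.0, 0.5], [5.0, 4.0]])
print(image_familiarity(probe, stored, r=1.2))  # 1.5
```

In the toy example the first probe vector has two stored neighbors within the radius and the second has one, so the image familiarity is their mean, 1.5.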
The similarity-rating task
To test the psychological plausibility of the similarity-space representations that the NIM model derives from natural stimuli, a comparison to human similarity ratings is needed. Therefore, we compared model and human similarity ratings on identical sets of natural stimuli. In this section, we review two behavioral similarity-rating experiments, describe the similarity-rating simulations with the NIM model, and provide a brief discussion of the results.
Behavioral similarity-rating experiments
Human similarity ratings were obtained in two studies: Busey (unpublished results; also referred to in Steyvers and Busey, 2000) and Busey (1998). In both experiments, subjects were repeatedly presented with two faces and were instructed to rate the similarity by assigning a number ranging from 1 (most similar) to 9 (least similar). This resulted in human similarity ratings for all possible pairs of faces. In the experiment of Busey (unpublished results), 238 subjects were tested on pairs of faces from a set of 60 faces (8 of the faces contained in the set were morphs, created by averaging 2 of the 52 other faces). Across subjects, this resulted in a total of about 25 ratings for each possible pair. In Busey (1998), 343 subjects were tested on pairs of faces from a set of 104 faces (16 of which were morphs, created by averaging two of the 88 other faces), resulting in at least six ratings for each possible pair. The similarity ratings for individual subjects were translated into z-scores, and within each experiment these scores were averaged over subjects.
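The normalization step can be sketched as below, under stated assumptions: ratings are arranged as a subjects-by-pairs array, `NaN` marks pairs a subject did not rate (no subject rated every pair), and z-scores use the population standard deviation. The function name and the toy ratings are hypothetical.

```python
import numpy as np

def zscore_and_average(ratings):
    """Z-score each subject's ratings, then average per face pair over subjects.

    ratings: subjects x pairs array; NaN marks pairs a subject did not rate.
    Assumption: population standard deviation (ddof=0) within each subject.
    """
    mean = np.nanmean(ratings, axis=1, keepdims=True)
    std = np.nanstd(ratings, axis=1, keepdims=True)
    z_scores = (ratings - mean) / std
    return np.nanmean(z_scores, axis=0)

# Two hypothetical subjects rating four face pairs on the 1-9 scale.
ratings = np.array([[1.0, 5.0, 9.0, np.nan],
                    [2.0, np.nan, 4.0, 6.0]])
average_z = zscore_and_average(ratings)
print(average_z)
```

Z-scoring removes differences in how individual subjects use the rating scale before the scores are pooled.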
Similarity-rating simulations with the NIM model
The NIM model bases a similarity rating for two faces on the similarity-space representations that result from pre-processing those faces. Since this task involves no memory processes, the memory stage is not employed in these simulations. We presented the NIM model with all possible face pairs from the sets of face images used in the experiments of Busey (unpublished results; 1998). For each face of a pair (A, B), 100 feature vectors were extracted. Then, for each feature vector of face A, the number of feature vectors of face B residing within a hypersphere with radius r centered around the feature vector (i.e., the nearest neighbors) was counted. The average number of nearest neighbors for all feature vectors of face A determined the similarity value, s, of faces A and B. Since subjects in the experimental study made a similarity rating from 1 (most similar) to 9 (least similar), we transformed each similarity value s into a similarity rating, SR, from 1 to 9. We applied the following linear transform:
SR = [...] if 0 < s < 8b + u
SR = [...] if s = 0
[...] integer value contained in x, r is the radius parameter, and b and u are constants. First, we varied the radius r to obtain the r value for which correlations between human similarity ratings and similarity values produced by the NIM model were optimal. With this optimal value r held constant, u and b were varied to find the u and b values for which human similarity ratings and model similarity ratings SR correlated best. Similarity values, s, and similarity ratings, SR, were averaged over [...]
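The computation of the similarity value s can be sketched as follows. The sketch assumes Euclidean distance and counts vectors exactly at distance r as neighbors; the function name and the toy vectors are hypothetical. It covers only the similarity value s and does not attempt the transform from s to SR.

```python
import numpy as np

def similarity_value(vectors_a, vectors_b, r):
    """Average number of face-B vectors within radius r of each face-A vector.

    Assumption: Euclidean distance, boundary included.
    """
    differences = vectors_a[:, None, :] - vectors_b[None, :, :]
    distances = np.linalg.norm(differences, axis=2)
    neighbor_counts = np.sum(distances <= r, axis=1)
    return float(neighbor_counts.mean())

# Toy example with two and three feature vectors (hypothetical values).
face_a = np.array([[0.0, 0.0], [10.0, 10.0]])
face_b = np.array([[0.0, 1.0], [0.0, 2.0], [10.0, 10.5]])
print(similarity_value(face_a, face_b, r=1.5))  # 1.0
print(similarity_value(face_a, face_b, r=2.5))  # 1.5
```

When both faces contribute the same number of feature vectors, as with the 100 vectors per face used here, this value is the same whichever face is taken as A.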
Similarity-rating simulation results
We varied the value of the radius r from 2.0 to 6.0. For both experiments (Busey, unpublished results; 1998), we found that simulations with r = 5.0 resulted in the highest correlations between model similarity values (s) and experimentally obtained similarity ratings. The precise value of r was not critical to the results. Then, for the similarity values s produced with r = 5.0, we varied the u and b values of the linear transform to find those that produced the best correlations between human and model similarity ratings SR. For both experiments, we found that for u = 0.02 and b = 0.02, the highest correlations were obtained, which were R = 0.65 and R = 0.70 for the two experiments, respectively. Fig. 1 shows the similarity ratings SR produced by the NIM model (r = 5.0, u = 0.02, and b = 0.02) as a function of the similarity ratings that were obtained experimentally by Busey (1998). A similar distribution of similarity ratings was found when comparing the NIM model’s similarity ratings to those obtained in the experiment by Busey (unpublished results).
Discussion of the similarity-rating results
Figure 1: The NIM model’s similarity ratings (z-scores; r = 5.0, u = 0.02, and b = 0.02) as a function of the experimental similarity ratings (z-scores+10) obtained by Busey (1998), R = 0.70.
An analysis of the similarity ratings obtained in the experiment by Busey (unpublished results) showed that there was considerable variation between different subjects’ similarity ratings. When the similarity ratings of the subjects were divided into 5 groups, correlations between the different group averages ranged from 0.72 to 0.75. This demonstrates the limited consistency of the similarity judgments across subjects. It is, therefore, highly unlikely that correlations between similarity ratings produced by our model and experimental similarity ratings will exceed the value of 0.75. Given this consideration, the similarity-space representations of the NIM model can be said to correlate reasonably well with the experimentally obtained similarity ratings.
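A between-group consistency check of this kind can be sketched as below. The sketch makes several assumptions that the text does not state: subjects are assigned to groups at random, every subject has a z-score for every face pair, and consistency is measured with the Pearson correlation. The function name and the synthetic ratings are hypothetical and do not reproduce the experimental data.

```python
import numpy as np
from itertools import combinations

def group_consistency(z_scores, n_groups, seed=0):
    """Pearson correlations between the per-pair averages of random subject groups.

    z_scores: subjects x pairs array of z-scored ratings (no missing values assumed).
    """
    rng = np.random.default_rng(seed)
    order = rng.permutation(z_scores.shape[0])
    groups = np.array_split(order, n_groups)
    group_means = [z_scores[g].mean(axis=0) for g in groups]
    return [float(np.corrcoef(group_means[i], group_means[j])[0, 1])
            for i, j in combinations(range(n_groups), 2)]

# Synthetic data: a shared similarity signal plus independent subject noise.
rng = np.random.default_rng(1)
signal = rng.normal(size=300)
synthetic_ratings = signal + rng.normal(scale=2.0, size=(50, 300))
correlations = group_consistency(synthetic_ratings, n_groups=5)
print(min(correlations), max(correlations))
```

The range of the resulting correlations indicates how well any model could be expected to agree with the averaged human ratings.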
Although very high correlations are unlikely, small increases might be achieved by testing different transforms of the similarity values into the similarity ratings. It is quite possible that human similarity-rating category boundaries are not linear.
Several studies have compared representation spaces resulting from various image pre-processing schemes to the outcomes of psychological experiments. For instance, Calder, Burton, Miller, Young, and Akamatsu (2001) and Dailey, Cottrell, Padgett, and Adolphs (2002) did this for coding facial expressions, and Hancock, Bruce, and Burton (1998) and Kalocsai, Zhao, and Biederman (1998) did this for face identity. Hancock et al. (1998) used an approach similar to ours, comparing human similarity judgments with similarity-space representations. They found small correlations of about 0.20 and attributed them to the noisiness of the human data. Kalocsai et al. (1998) used a same-different judgment task to compare the performances of a Gabor-filter pre-processing scheme and a global template-matching classifier with human data. In the experiment, subjects had to judge whether two sequentially presented images were of the same individual. They found correlations up to 0.91. However, these were based on a small sample of 16 trials. Moreover, the similarity judgments were obtained using a much simpler binary classification task.
The recognition task
In order to validate the recognition predictions of the NIM model for individual natural stimuli, we compared the recognition rates produced by the NIM model with those that were obtained in two behavioral experiments. The first behavioral recognition experiment is the face-recognition experiment of Busey (unpublished results), employing 60 faces and 238 subjects. The second is the face-recognition experiment of Busey and Tunnicliff (1999), employing 104 faces and 180 subjects. The face images used in these experiments are identical [...] our simulations used the same face images as were used in the [...]
Below, we describe the behavioral face-recognition experiments, present the simulation results obtained with the NIM model, and provide a brief discussion.
Behavioral recognition experiments
In the experiments, Busey (unpublished results) and Busey and Tunnicliff (1999) determined the recognition rates (i.e., hit rates and false-alarm rates) for different types of faces. The sets of 60 and 104 faces employed in the experiments contained three types of faces: (1) normal faces, (2) parent faces, and (3) morph faces. Each morph face was the average of two parent faces.
In the face-recognition experiment of Busey (unpublished results), the set of 60 faces was subdivided into two sets of 30 faces such that each set contained 18 normal faces, 8 parent faces, and 4 morph faces. When a morph was in one set, its parents were in the other. Half of the subjects were presented with a study list with the faces from set 1 and tested for old-new recognition for the faces from both set 1 (i.e., targets) and set 2 (i.e., lures). For the other half of the subjects this was reversed. This allowed hit and false-alarm rates to be obtained for each of the 60 faces.
In the face-recognition experiment of Busey and Tunnicliff (1999), subjects were presented with a study list with 36 normal faces and 32 parent faces. Then, old-new recognition was tested for these 36 normal faces (i.e., normal targets) and 32 parent faces (i.e., parent targets) along with 20 new normal faces (i.e., normal lures) and 16 new morph faces (i.e., morph lures) [...] that were either dissimilar or similar to each other. Dissimilar and similar morph lures resulted from dissimilar and similar morph parents, respectively.
Recognition simulations with the NIM model
In half of the simulations of Busey’s (unpublished results) face-recognition experiment, the NIM model was presented with the 18 normal faces, 8 parent faces, and 4 morph faces from set 1, and in the other half with those from set 2. Then the model was tested for old-new recognition of the presented faces (i.e., targets) along with the 18 normal faces, 8 parent faces, and 4 morph faces from the other set (i.e., lures).
In the simulations of the face-recognition experiment of Busey and Tunnicliff (1999), the model was presented with the 36 normal faces and 32 parent faces from the study list of the behavioral study. Then recognition was tested for these faces along with the 20 new normal faces (i.e., normal lures) and the 16 new morph faces (i.e., morph lures) from the test list of the behavioral experiment.
In the behavioral experiments, the images were presented for 1500 ms, followed by a two-second delay. In our simulations, the number of fixations selected and stored for each face was set to 10, which corresponds to approximately 2 seconds of viewing time. For recognition, the NIM model calculated the familiarity of each target and lure on the basis of 100 fixations (as described previously). Comparison of the familiarity values to the decision criterion, θ, determined the recognition rates, i.e., the hit rates for the targets and the false-alarm rates for the lures. In these simulations the radius r and the decision criterion θ were varied to determine the root mean square errors (RMSE) and the correlation values between recognition rates produced by the NIM model and experimentally obtained recognition rates. The radius r and the decision criterion θ were our only free parameters.
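The two fit measures can be sketched as follows. The sketch assumes that a face's recognition rate is the fraction of repeated simulation runs in which its familiarity exceeds θ; the text does not spell this step out, so it is an assumption, and the function names and numbers are hypothetical.

```python
import numpy as np

def recognition_rate(familiarities, theta):
    """Fraction of simulated runs in which a face's familiarity exceeds theta."""
    return float(np.mean(np.asarray(familiarities, dtype=float) > theta))

def rmse(model_rates, human_rates):
    """Root mean square error between model and experimental recognition rates."""
    model_rates = np.asarray(model_rates, dtype=float)
    human_rates = np.asarray(human_rates, dtype=float)
    return float(np.sqrt(np.mean((model_rates - human_rates) ** 2)))

def pearson_r(model_rates, human_rates):
    """Pearson correlation between model and experimental recognition rates."""
    return float(np.corrcoef(model_rates, human_rates)[0, 1])

# Hypothetical familiarity values of one face over four simulated runs.
print(recognition_rate([0.005, 0.02, 0.03, 0.008], theta=0.01))  # 0.5

# Hypothetical per-face rates for three faces.
model_rates = [0.9, 0.5, 0.1]
human_rates = [0.8, 0.6, 0.2]
print(rmse(model_rates, human_rates), pearson_r(model_rates, human_rates))
```

A search over r and θ would evaluate both measures for each parameter pair and keep the pair with the smallest RMSE.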
Recognition simulation results
For the simulations of the face-recognition experiment of Busey (unpublished results) in which set 1 were the targets and set 2 were the lures, as well as for the simulations in which this was reversed, the smallest RMSE between experimentally obtained recognition rates and those produced by the NIM model resulted with a radius r = 3.5 and a recognition decision criterion θ = 0.01. The correlations with these parameter settings were R = 0.82 for the experiment in which set 1 were the targets and set 2 were the lures and R = 0.68 for the experiment for which this was reversed. Fig. 2 shows the model recognition rates with these parameter settings ((a) for simulations in which set 1 were the targets and set 2 were the lures, (b) for simulations in which set 2 were the targets and set 1 were the lures) as a function of experimentally obtained recognition rates. For the simulations of the face-recognition experiment of Busey and Tunnicliff (1999), the smallest RMSE between experimentally obtained recognition rates and those produced by the NIM model resulted with a radius r = 3.5 and a recognition decision criterion θ = 0.02. The correlation obtained with these parameter settings was R = 0.60. Fig. 3 shows the recognition rates produced by the NIM model with these parameter settings as a function of the experimentally obtained recognition rates.
Discussion of the recognition results
Recognition rates produced by the NIM model agree quite well with experimentally obtained recognition rates. Across the two behavioral face-recognition experiments, three main effects were found in the data. These will be discussed and compared to the recognition results of the NIM model. Then, an important difference between the recognition rates produced by the NIM model and those obtained experimentally for dark-skin faces will be discussed. Finally, we will make a brief remark about the radius parameter r that was used in the simulations.
The first effect found in the experimental data is that morph lures are falsely recognized more often than normal lures. This effect was successfully produced by the NIM model for simulations of both the experiments of Busey (unpublished results) and Busey and Tunnicliff (1999).
The second effect is that in the experiment of Busey and Tunnicliff (1999), which employed similar and dissimilar morph lures, false-alarm rates for the similar morph lures were higher than those for the dissimilar morph lures. This effect was also produced successfully by the NIM model simulations of the experiment of Busey and Tunnicliff (1999).
Figure 2: The NIM model’s face-recognition rates (r = 3.5,
θ = 0.01) as a function of the experimental recognition rates
obtained by Busey (unpublished results). (a) set 1: targets, set
2: lures, R = 0.82, and RMSE = 0.156 and (b) set 2: targets,
set 1: lures, R = 0.68 and RMSE = 0.167.
The third effect concerns the false-alarm rates for morph lures compared to the hit rates for their parents. In the behavioral experiments the false-alarm rate for the morph lures approached the hit rates for their parents. In the experiment of Busey and Tunnicliff (1999) the false-alarm rates even marginally exceeded the hit rates for the similar lures (and not for the dissimilar lures). However, as can be seen in Figs. 2 and 3, the NIM model produced clearly smaller false-alarm rates for the morph lures compared to the hit rates for their parents. By changing the parameters of the NIM model, a better agreement with experimental findings could be obtained. Fig. 4 shows the results with parameter settings r = 3.9 and θ = 0.02 for simulations of Busey’s (unpublished results) experiment in which set 1 were the targets and set 2 were the lures, and Fig. 6 shows the results for simulations of Busey and Tunnicliff’s (1999) experiment with parameter settings r = 3.8 and θ = 0.03. Compared to the data in Figs. 2(a) and 3, the false-alarm rates for the morph lures are increased relative to the hit rates for their parents. This is at the expense, however, of lower overall correlations and larger RMSEs between experimental and model recognition rates.
Figure 3: The NIM model’s face-recognition rates (r = 3.5 and θ = 0.02) as a function of the experimental recognition rates obtained by Busey and Tunnicliff (1999), R = 0.60 and RMSE = 0.180. For the group of lures indicated by an ellipse, [...]
Two points should be noted about the third effect. First, the experimental data showed substantially larger variations in rates obtained for similar as well as dissimilar target parents than for other types of faces. Therefore, deviations between the model and experimental results concerning the difference in recognition of parents and their morphs can partly be explained by the noisiness of the experimental recognition rates for parents. Second, while in the experimental data of Busey and Tunnicliff (1999), the false-alarm rate for the similar morph lures exceeded the hit rate of their parents (and the false-alarm rate for the dissimilar lures did not), this effect was small and not significant (α = 0.05).
Figure 4: The NIM model’s face-recognition rates (r = 3.9, θ = 0.02) as a function of the experimental recognition rates obtained by Busey (unpublished results), R = 0.73, and RMSE = 0.184.
For the four specific lures, indicated by an ellipse in Fig. 3, considerable deviations from the experimental data are observed. The NIM model produces a recognition rate around 0, while humans tend to falsely recognize these faces very often. The corresponding faces are shown in Figs. 5(a), 5(b), 5(c), and 5(d). A striking fact is that these are faces of people with a very dark skin color. Moreover, no other lures were faces with such a dark skin color. Experimental studies have shown that much experience with faces of a given race improves recognition of these faces relative to others (for an overview, see Meissner & Brigham, 2001). As a result, faces of other races look more similar than faces from one’s own race. This effect is also known as the other-race effect. In the experiment, when a subject has seen a face of a certain racial type other than their own on the study list, they will have a hard time discriminating it from a different face of that same racial type presented on the test list. Presumably the larger part of the subjects participating in the experimental face-recognition task were Caucasians who have abundant experience with Caucasian faces compared to non-Caucasian faces. Therefore, they will exhibit a worse recognition performance for non-Caucasian faces. In contrast, the NIM model is presented with the 104 faces only, and bases the PCA on the characteristics of these faces, of which roughly a quarter were non-Caucasian. This is a possible explanation for the low false-alarm rate of the NIM model on the four dark-skin lures and the relatively high false-alarm rate of human subjects on these same faces.
Figure 5: The four faces corresponding to the four dots indicated by an ellipse in Fig. 3.
Figure 6: The NIM model’s face-recognition rates (r = 3.8
and θ = 0.03) as a function of the experimental recognition
rates obtained by Busey and Tunnicliff (1999), R = 0.52 and
RMSE = 0.210.
As a final remark we would like to note that for the similarity-rating and the recognition tasks, a different setting of the radius parameter r resulted in the best correlations between model and human data. Although we have not explicated the meaning of the radius parameter r, we interpret it as the attentional weighting of all dimensions, which depends on the task (cf. the Generalized Context Model (GCM); e.g., Nosofsky, 1986). In our future work, we intend to explore the integration of our model with Nosofsky’s GCM.
Previous studies showed that the NIM model produces general
effects found in experimental recognition-memory studies.
In this paper, the NIM model is validated on individual
natural stimuli. The model is tested on a similarity-rating task
and a face-recognition task using the same face images that
were used in experimental studies. From our results we
conclude that the NIM model can simulate similarity judgments
and recognition performance for individual natural stimuli.
We wish to thank Dr. Busey for providing us with his
experimental data. The research project is supported in the
framework of the NWO Cognition Program with financial aid from
the Netherlands Organization for Scientific Research (NWO).
ment: modeling and experimentation in humans and robots’
(project number: 051.02.2002).
Bartlett, M. S., Littlewort, G., Braathen, B., Sejnowski, T. J.,
& Movellan, J. R. (2003). A prototype for automatic
recognition of spontaneous facial actions. In S. Becker &
K. Obermayer (Eds.), Advances in neural information
processing systems (Vol. 15). Cambridge, MA: The MIT Press.
Busey, T. (1998). Physical and psychological representations
of faces: Evidence from morphing. Psychological Science,
Busey, T. A., & Tunnicliff, J. (1999). Accounts of blending,
typicality and distinctiveness in face recognition. Journal
of Experimental Psychology: Learning, Memory, and
Cognition, 25, 1210-1235.
Calder, A. J., Burton, A. M., Miller, P., Young, A. W., &
Akamatsu, S. (2001). A principal component analysis of
facial expressions. Vision Research, 41, 1179-1208.
Dailey, M. N., Cottrell, G. W., Padgett, C., & Adolphs, R.
(2002). A neural network that categorizes facial
expressions. Journal of Cognitive Neuroscience, 14, 1158-1173.
Edelman, S. (1998). Representation is representation of
similarities. Behavioral and Brain Sciences, 21, 449-498.
Freeman, W. T., & Adelson, E. H. (1991). The design and
use of steerable filters. IEEE Trans. Pattern Analysis and
Machine Intelligence, 13, 891-906.
Hancock, P. J. B., Bruce, V., & Burton, A. M. (1998). A
comparison of two computer-based face identification
systems with human perceptions of faces. Vision Research, 38,
Hancock, P. J. B., Burton, A. M., & Bruce, V. (1996). Face
processing: human perception and principal components
analysis. Memory and Cognition, 24, 26-40.
Kahana, M. J., & Sekuler, R. (2002). Human sensitivity to
face statistics computed on V1 similarity. Journal of Vision,
Kalocsai, P., Zhao, W., & Biederman, I. (1998). Face
similarity space as perceived by humans and artificial
systems. In Proceedings, Third International Conference on
Automatic Face and Gesture Recognition (pp. 177-180).
Lacroix, J. P. W., Murre, J. M. J., Postma, E. O., & Van den
Herik, H. J. (2004). The natural input memory model. In
Proceedings of the XXVI Annual meeting of the Cognitive
Science Society (pp. 773-778).
Lacroix, J. P. W., Murre, J. M. J., Postma, E. O., & Van den
Herik, H. J. (in prep). Modeling recognition memory using
the similarity structure of natural input. (Preprint available
Landauer, T. K., & Dumais, S. T. (1997). A solution to Plato’s
problem: The latent semantic analysis theory of
acquisition, induction, and representation of knowledge.
Psychological Review, 104, 211-240.
Meissner, C. A., & Brigham, J. C. (2001). Thirty years of
investigating the own-race bias in memory for faces: A
meta-analytic review. Psychology, Public Policy, and Law,
Norman, J. F., Phillips, F., & Ross, H. E. (2001).
Information concentration along the boundary contours of naturally
shaped solid objects. Perception, 30, 1285-1294.
Nosofsky, R. M. (1986). Attention, similarity, and the
identification-categorization relationship. Journal of
Experimental Psychology: General, 115, 39-57.
Palmeri, T. J., & Gauthier, I. (2004). Visual object
understanding. Nature Reviews Neuroscience, 5, 291-303.
Shepard, R. N. (1957). Stimulus and response generalization:
A stochastic model relating generalization to distance in
psychological space. Psychometrika, 22, 325-345.
Shepard, R. N. (1986). Discrimination and generalization
in identification and classification: Comment on Nosofsky.
Journal of Experimental Psychology: General, 115, 58-61.
Shiffrin, R. M., & Steyvers, M. (1997). A model for
recognition memory: REM: Retrieving effectively from memory.
Psychonomic Bulletin & Review, 4, 145-166.
Steyvers, M., & Busey, T. (2000). Predicting similarity
ratings to faces using physical descriptions. In M. Wenger
& J. Townsend (Eds.), Computational, geometric, and
process perspectives on facial cognition: Contexts and
challenges. Hillsdale, NJ: Erlbaum Associates.
Steyvers, M., Shiffrin, R. M., & Nelson, D. L. (2004). Word
in episodic memory. In A. Healy (Ed.), Experimental
cognitive psychology and its applications. Washington, DC:
American Psychological Association.
Valentine, T. (1991). Representation and process in face
recognition. In R. Watt (Ed.), Vision and visual
dysfunction, Vol. 14: Pattern recognition by man and machine.
London, UK: MacMillan Press.