Behavior Research Methods
https://doi.org/10.3758/s13428-023-02085-6
A new Asian version of the CFMT: The Cambridge Face Memory
Test – Chinese Malaysian (CFMT-MY)
Siew Kei Kho1 · Bryan Qi Zheng Leong1 · David R. T. Keeble1 · Hoo Keat Wong1 · Alejandro J. Estudillo1,2
Accepted: 7 February 2023
© The Psychonomic Society, Inc. 2023
Abstract
The Cambridge Face Memory Test (CFMT) is one of the most important measures of individual differences in face
recognition and for the diagnosis of prosopagnosia. Having two different CFMT versions using a different set of faces seems
to improve the reliability of the evaluation. However, at the present time, there is only one Asian version of the test. In this
study, we present the Cambridge Face Memory Test – Chinese Malaysian (CFMT-MY), a novel Asian CFMT using Chinese
Malaysian faces. In Experiment 1, Chinese Malaysian participants (N = 134) completed two versions of the Asian CFMT
and one object recognition test. The CFMT-MY scores were normally distributed and showed high internal reliability,
high consistency, and convergent and divergent validity. Additionally, in contrast to the original Asian CFMT, the
CFMT-MY showed increasing difficulty across stages. In Experiment 2, Caucasian participants (N = 135) completed the two versions
of the Asian CFMT and the original Caucasian CFMT. Results showed that the CFMT-MY exhibited the other-race effect.
Overall, the CFMT-MY seems to be suitable for the diagnosis of face recognition difficulties and could be used as a measure
of face recognition ability by researchers who wish to examine face-related research questions such as individual differences
or the other-race effect.
Keywords Cambridge Face Memory Test · Asian · Other-ethnicity effect · Face memory · Face recognition ·
Prosopagnosia · Neuropsychological test
Faces are one of the most critical stimuli for successful social interaction. However, despite their
importance, face recognition abilities present substantial inter-individual variability (Bowles et al.,
2009; Bruce et al., 2018; Wang et al., 2012; Wilmer, 2017), with some people showing superior face
recognition (i.e., super-recognizers) (Russell et al., 2009) while others present difficulties in
recognizing even highly familiar faces (i.e., prosopagnosics) (Rossion, 2014). Prosopagnosia, also known
as face blindness, is a visual impairment that affects face recognition despite intact visual acuity and
intelligence and can result from brain injury (i.e., acquired prosopagnosia) or abnormal development
(i.e., developmental or congenital prosopagnosia). Remarkably, these difficulties in face recognition
could contribute to negative social consequences (e.g., high anxiety in social situations) not only for
adults (Yardley et al., 2008), but also for children (Dalrymple et al., 2014). Although the estimated
prevalence of developmental prosopagnosia in the general population is around 2.5% (Bowles et al., 2009;
Kennerknecht et al., 2006, 2008), many cases remain undiagnosed (Duchaine, 2000). Given the limited
insights that people have into their own face recognition skills (Bate & Dudfield, 2019; Bobak et al.,
2019; Estudillo, 2021; Estudillo & Wong, 2021; Palermo et al., 2017), objective measures of face
identification are crucial for the study of individual differences in face recognition skills and the
diagnosis of prosopagnosia.
* Siew Kei Kho
khpy5ksk@nottingham.edu.my
* Alejandro J. Estudillo
aestudillo@bournemouth.ac.uk
1 School of Psychology, University of Nottingham Malaysia, Jalan Broga, 43500 Semenyih, Selangor, Malaysia
2 Department of Psychology, Bournemouth University, Poole House Talbot Campus, BH12, Bournemouth, UK
One of the most prominent objective measures of face recognition abilities is the Cambridge Face Memory
Test (CFMT) (Duchaine & Nakayama, 2006). This test, which can be completed in about 15 min, provides a
valid measure of face recognition, as it requires the identification of faces across different views
(Bruce, 1982; Estudillo & Bindemann, 2014). The CFMT is poorly correlated with general intelligence
(Shakeshaft & Plomin, 2015) and object recognition ability (Dennett et al., 2012; Shakeshaft & Plomin,
2015), which suggests that this test taps into face identification-specific processes. The original
version of the CFMT consists of a three-alternative forced choice paradigm subdivided into three stages
of increasing difficulty. Participants are first asked to study six Caucasian target faces.
Subsequently, during the recognition trials, the target faces are presented without any variation in
the image (learning stage), with different lighting and viewpoint (novel stage) and with the addition
of visual noise (novel-with-noise stage). The CFMT has been widely used to investigate different
aspects of face recognition, including its heritability (Wilmer et al., 2010), development (Germine
et al., 2011), relationship with holistic processing (DeGutis et al., 2013) and other group effects
(Childs et al., 2021; Estudillo et al., 2020; McKone et al., 2012; Wan et al., 2017). Importantly,
because it has high reliability (Cronbach's alpha (α) ≈ .90), the CFMT is also used to aid the
diagnosis of prosopagnosia (e.g., Bowles et al., 2009; Duchaine & Nakayama, 2006; Estudillo et al.,
2020; McKone et al., 2017). Specifically, individuals scoring two standard deviations below the mean
CFMT performance are considered as possible prosopagnosia cases (Duchaine & Nakayama, 2006).
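The two-standard-deviation rule described above can be sketched in a few lines of Python. This is only an illustration: the function names and the control scores are hypothetical, and the use of the sample standard deviation (n - 1 denominator) is an assumption rather than something stated in the text.

```python
from statistics import mean, stdev


def cutoff_score(control_scores):
    """Return the M - 2SD cut-off for a sample of control scores.

    Assumes the sample standard deviation (n - 1 denominator).
    """
    return mean(control_scores) - 2 * stdev(control_scores)


def flag_possible_cases(scores, cutoff):
    """Return the scores that fall below the cut-off."""
    return [s for s in scores if s < cutoff]


# Hypothetical control scores out of 72 (not data from any study).
controls = [50, 60, 70]
cutoff = cutoff_score(controls)
flagged = flag_possible_cases([38, 45, 61], cutoff)
```

A score flagged this way is only a possible case; as the following paragraphs note, inattentiveness or misunderstood instructions can also push a score below the cut-off.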
Despite the remarkable psychometric properties of the
CFMT, several factors (i.e., problems understanding the
instructions and inattentiveness) could influence the final
test scores irrespective of actual face recognition skills (see
e.g., Gamaldo & Allaire, 2016). Although repeating the same
test could provide a more reliable score, this practice is not
exempt from problems (McCaffrey & Westervelt, 1995). For
example, due to face familiarity effects as a consequence of
using the same face stimuli, an individual who scored below
the cut-off value during the first assessment may score above
the cut-off value in the next reassessment test (Murray &
Bate, 2020). This familiarity effect could be easily avoided
by using a complementary version of the CFMT contain-
ing a different set of face stimuli (Murray & Bate, 2020). In
addition, with the increasing interest in face training protocols (Bate, Adams, & Bennetts, 2019a;
Corrow et al., 2019; Davies-Thompson et al., 2017), having complementary versions of the CFMT is also
highly useful for rigorous pre-post training comparisons. For Caucasian participants, such a
complementary version does exist, the CFMT-Australian (CFMT-Aus) (McKone et al., 2011). Importantly,
the psychometric properties of the CFMT-Aus are comparable to those
of the original CFMT, making this test not only an alternative
to the original CFMT, but also a complementary assessment
tool in the aforementioned situations.
People tend to be better at recognizing faces from their own race compared to other-race faces, the
so-called other-race effect (Meissner & Brigham, 2001). Both the CFMT-original and the CFMT-Aus consist
of Caucasian face stimuli and have shown strong other-race effects (see e.g., Estudillo et al., 2020;
McKone et al., 2012; Wan et al., 2017), limiting their use to Caucasian populations. The CFMT-Chinese
(McKone et al., 2012) was introduced to study individual differences in face recognition and aid the
diagnosis of prosopagnosia in Asian populations. This test follows the same format as the original
version of the test and has comparable psychometric properties (McKone et al., 2017). However, at
present, the CFMT-Chinese is the only Asian version of the CFMT which, as previously discussed, might
present difficulties for the study of individual differences, the diagnosis of borderline cases of
prosopagnosia and pre-post face training comparisons.
Although the CFMT-Chinese aims to explore individual differences in face recognition and aid the
diagnosis of prosopagnosia in the Asian population, an other-race effect has also been found within the
Asian population (Wong et al., 2020). However, studies using the CFMT-Chinese found that scores on the
CFMT-Chinese were still higher than scores on the CFMT-original in Asian samples that included
participants of non-Chinese origin, such as Indonesian (McKone et al., 2012), Malay and Filipino
participants (Bate, Bennetts, et al., 2019b). Similarly, Estudillo et al. (2020) found that although
Malaysian Malay and Malaysian Indian participants showed a clear other-race effect for Caucasian faces,
they performed identically to Malaysian Chinese participants for Chinese faces in the CFMT-Chinese.
Altogether, these findings suggest that non-Chinese Asians may perform better with Chinese faces than
with Caucasian faces. Thus, although using a CFMT with Chinese faces to diagnose prosopagnosia in the
non-Chinese Asian population may not be ideal, the CFMT-Chinese may currently still be a better face
recognition measure than the Caucasian CFMT versions for this population.
Present study
In the current study, we present a novel Asian version of the CFMT, the CFMT-Chinese Malaysian
(CFMT-MY). In Experiment 1, we determined the psychometric properties of the CFMT-MY using a Chinese
Malaysian sample. Specifically, we explored the internal reliability, convergent validity and divergent
validity of the CFMT-MY. Experiment 1 also tested whether the three stages of the CFMT-MY represent
increasing levels of difficulty. The increase in difficulty across stages is an important property of
the CFMT-original (Duchaine & Nakayama, 2006) that has been overlooked in the CFMT-Chinese (e.g.,
Estudillo et al., 2020; McKone et al., 2012, 2017). After checking the psychometric properties of the
CFMT-MY, Experiment 2 used a sample of Caucasian participants to explore whether the CFMT-MY captures
an other-race effect of similar magnitude to that of the CFMT-Chinese.
Experiment 1
Experiment 1 aimed to investigate the psychometric proper-
ties of the CFMT-MY. In addition to measures of reliability
(Cronbach’s α) and internal consistency across stages, we
explored the convergent and divergent validities of the test.
Convergent validity was explored by correlating participants’ performance in the CFMT-MY with their
performance in the CFMT-Chinese. Divergent validity was explored by correlating participants’
performance in the CFMT-MY and their performance in a general object recognition task that follows the
same format as the CFMT: the Cambridge Car Memory Test (CCMT) (Dennett et al., 2012). If the CFMT-MY
had appropriate convergent and divergent validity we
would expect a stronger correlation between the CFMT-MY
and the CFMT-Chinese than between the CFMT-MY and
the CCMT. Additionally, we examined the increasing level
of difficulty across the three stages of the CFMT-Chinese and
the CFMT-MY. Differences in accuracy between the different
stages of the CFMT-MY and CFMT-Chinese were assessed
using repeated-measures ANOVA.
Methods
Participants
A total of 139 participants took part in this experiment,
but the final sample included 134 Chinese Malaysians
(92 females and 42 males) with an age range of 18 to 66
years (M = 22.81 years, SD = 5.53 years). The age range
for female participants was between 18 and 66 years (M =
22.50 years, SD = 6.24 years) while for male participants, the
age range was from 18 to 35 years (M = 23.48 years, SD =
3.47 years). Data from participants of other ethnicities (e.g., Malay, Indian, Eurasian, mixed; four
participants) and from one participant with a median reaction time of less than 500 ms were removed
from further analysis. Eight additional participants were excluded from the data analysis (except for
the internal reliability and internal consistency analyses) as their performance on the face memory
tasks was indicative of possible prosopagnosia (Appendix 1). The remaining participants were 126
Chinese Malaysians (87 females and 39
males) with an age range from 18 to 66 years (M = 22.92
years, SD = 5.64 years). The age range for female partici-
pants was between 18 and 66 years (M = 22.58 years, SD =
6.38 years) while for male participants, the age range was
between 18 and 35 years (M = 23.69 years, SD = 3.40 years).
An a priori power analysis was conducted using G*Power 3.1 (Faul et al., 2009) for a repeated-measures
ANOVA comparing the stages of the two Asian CFMT versions (CFMT-MY and CFMT-Chinese). The effect size
for the CFMT stage was based on Murray and Bate (2020), where ηp² = .824, a large effect size. A large
effect size estimate (ηp² = .14) was entered into the power analysis with the following parameters:
α = .05, power = .95. The power analysis implied that N = 50 would be required to detect a difference
between the CFMT versions with 95% probability. An a priori power analysis was also conducted for
correlation tests comparing the two versions of the face memory test (CFMT-Chinese and CFMT-MY). The
correlations between two different versions of the CFMT reported in past studies were higher than .5
(e.g., r = .71 in Arrington et al., 2022; r = .61 in McKone et al., 2011). A medium correlation
(r = .5) was entered into the power analysis with the following parameters: α = .05, power = .95. The
power analysis suggested that N = 46 would be required to detect a correlation between the two CFMT
versions with 95% probability.
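As a rough plausibility check on these figures, the sketch below converts a partial eta squared into Cohen's f (the effect size metric G*Power uses for F tests) and approximates the sample size for a correlation test with the Fisher z method. The function names are hypothetical, a two-tailed test is assumed, and G*Power may use a different (exact) procedure for correlations, so small discrepancies from the reported N are expected.

```python
from math import atanh, sqrt
from statistics import NormalDist


def cohens_f_from_partial_eta_sq(eta_p_sq):
    """Convert partial eta squared to Cohen's f."""
    return sqrt(eta_p_sq / (1.0 - eta_p_sq))


def n_for_correlation(r, alpha=0.05, power=0.95):
    """Approximate N needed to detect a correlation r (two-tailed).

    Uses the Fisher z approximation; the result is left unrounded.
    """
    std_normal = NormalDist()
    z_alpha = std_normal.inv_cdf(1.0 - alpha / 2.0)
    z_beta = std_normal.inv_cdf(power)
    return ((z_alpha + z_beta) / atanh(r)) ** 2 + 3.0


f_large = cohens_f_from_partial_eta_sq(0.14)  # roughly 0.40
n_approx = n_for_correlation(0.5)             # close to the reported N = 46
```

The approximation lands within one participant of the reported value, which suggests the correlation analysis used a two-tailed test. The repeated-measures ANOVA figure (N = 50) depends on further G*Power settings that the text does not report, so it is not reproduced here.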
All participants provided informed consent to partici-
pate in the study. Upon recruiting every 10 participants, a
lucky draw was held with each participant given a chance
to win RM20 or alternatively course credits were given for
participation. The study has been reviewed and approved
by the Science and Engineering Research Ethics Commit-
tee (SEREC) at the University of Nottingham Malaysia
(approval code: KSK050320).
Cambridge Face Memory Test – Chinese (CFMT‑Chinese)
The CFMT-Chinese was obtained from McKone et al.
(2012). Fifty-two male identities were used in the task. The
CFMT consists of three stages with increasing difficulty:
learning stage (i.e., faces are presented in the same lighting
and viewpoint condition), novel stage (i.e., faces are
presented in different lighting and viewpoint condition) and
novel-with-noise stage (i.e., faces are presented in different
lighting and viewpoint condition with Gaussian noise
applied). In total, there were 72 trials and six target faces to
be memorized throughout the whole task.
Three practice trials with feedback were given before the
experimental trials to familiarize participants with the pro-
cedure. The practice trials were identical to the procedure in
the learning stage, but using cartoon images of Bart Simpson. In the learning stage, three study
images (left 1/3 profile, frontal view and right 1/3 profile) of the same identity were presented
sequentially for 3 s each with an inter-trial interval of 500 ms (Fig. 1a). The target face was then
presented with two distractor faces and participants were required to select the target face using the
“1”, “2” or “3” key with no time limit (Fig. 1b). In total, there were 18 trials in
the learning stage (six target faces × three trials).
In the novel stage, participants were required to memorize
the same six target faces from the learning stage, which were
presented simultaneously in frontal view for 20 s. Similar
to the test phase of the previous stage, participants were
required to select the target face presented with two
distractor faces with no time limit. The images presented in
this stage differed from those in the learning stage in terms of
lighting and/or viewing angle (Fig. 1c). In total, there were
30 trials in the novel stage (six target faces × five trials).
The novel-with-noise stage was identical to the novel stage,
except that noise was added to the test images to increase
the difficulty level (Fig. 1d). In total, there were 24 trials in
the novel-with-noise stage (six target faces × four trials).
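The trial structure described above can be tallied as follows. This is a small arithmetic sketch with hypothetical variable names. It also derives the score expected from pure guessing in a three-alternative task.

```python
# Test trials per target face in each stage, as described in the text.
TRIALS_PER_TARGET = {"learning": 3, "novel": 5, "novel-with-noise": 4}
N_TARGETS = 6        # six target faces
N_ALTERNATIVES = 3   # one target plus two distractors per trial

trials_per_stage = {
    stage: N_TARGETS * per_target
    for stage, per_target in TRIALS_PER_TARGET.items()
}
total_trials = sum(trials_per_stage.values())

# Expected number of correct responses when guessing on every trial.
chance_score = total_trials / N_ALTERNATIVES
```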
Cambridge Face Memory Test – Chinese Malaysian
(CFMT‑MY)
The stimuli used in the CFMT-MY were created using
the University of Nottingham Malaysia face database,
where photographs of students from the University of
Nottingham Malaysia were obtained with informed consent
before photographing. In total, 52 male Chinese Malaysian
identities were used as stimuli. The faces had no piercings
or glasses. Editing of images was conducted using Adobe
Photoshop CS6. Blemishes, moles, and facial hair were
removed. Five different viewing angles of each identity were
used (frontal, 45 degrees left, 45 degrees right, 90 degrees
left and 90 degrees right). The face images were cropped
to a size of 210 pixels in height while the width of the face
images was resized according to the original proportion of
the face. Each image was then placed onto a 200 × 250
pixel black canvas. Examples of CFMT stimuli similar
to the actual test stimuli are shown in Fig. 1.
The CFMT-MY was designed to replicate the original
CFMT but using Chinese Malaysian faces. For the learning
stage, the same cropping template was used for all targets
and distractors. Frontal viewpoint, 45 degrees right and 45
degrees left were used. The distractors were matched to the
target faces in the testing phase based on their similarity
in appearance. Replicating the original CFMT, target faces
were never used as distractors and the distractors were
presented repeatedly to ensure that participants could not
use familiarity to decide whether the faces were previously
memorized or not.
Fig. 1 Sample CFMT-MY stimuli. (a) Three study images in the learning stage, presented in different
views. (b) Test trials in the learning stage (faces are presented with the same lighting and viewpoint
condition as in the study image). (c) Test trials in the novel stage (faces are presented with
different lighting and cropping template). (d) Test trials in the novel-with-noise stage (faces are
presented with different lighting and viewpoint condition with Gaussian noise applied). None of the
faces shown in the sample figure were used in the actual task to avoid familiarity with the actual
target faces used in the task
In the novel stage of the original CFMT, images of the same identity were captured with different
poses and physical lighting (i.e., the frontal view of the same identity was captured with lighting
from the bottom or a slightly different frontal pose). However, such images did not exist in our face
database. Thus, we used frontal viewpoint, 45 degrees right, 90 degrees right and 90 degrees left for
the novel stage. We followed the procedure of CFMT-Aus (McKone et al., 2011) where instead of poses,
different templates were used and lighting was added to the images using Adobe Photoshop
CS6. For the frontal view and the 45 degrees right view, the
images from the learning stage were used with modifications
(i.e., the use of different external template shape and/or the
addition of lighting). The external templates used were rep-
licated based on CFMT-Aus. Point light was added using the
function Lighting effects. The lighting was directed from the
right for the 45 degrees right images, from the left for half of
the frontal view images and from the bottom for the other half
of the frontal view images. As the 90 degrees right and left
images were not shown in the learning stage, only a template
was used with no lighting changes made.
In the novel-with-noise stage, the viewpoints used were
frontal, 45 degrees left and 90 degrees right. The light-
ing was directed from the right for half of the frontal view
images and the 45 degrees left images. For the 90 degrees
right images, the lighting was directed from the left. The
other half of the front-facing images were made to appear
lightly shadowed by adjusting the brightness and contrast
(–30 brightness and +30 contrast). Different templates were
applied to the frontal view and the 45 degrees left images.
Next, 30% colored Gaussian noise was added using the function
Add noise. The CFMT-MY materials (stimuli and trial
order) are available in the Open Science Framework repository,
https://osf.io/gu4fy/.
Cambridge Car Memory Test (CCMT)
The CCMT was obtained from the authors of the task (Dennett
et al., 2012). Fifty-two different cars were used in the
CCMT. The CCMT follows the same procedure as the CFMT,
except that the images presented were cars instead of faces.
Procedure
Testable (https://www.testable.org/) was used to run the online experiment (Rezlescu et al., 2020).
To ensure that the stimulus size remained the same across different screen sizes, a calibration step
was included before the start of the task in which participants had to match the length of a line on
the screen to the length of a bank card. The average vertical height of the face stimuli in the
CFMT-Chinese and CFMT-MY was 4 cm while the average vertical height of the car stimuli in the CCMT was
3.5 cm. Participants completed all three tasks (the CFMT-Chinese, the CFMT-MY and the CCMT) in random
order. The experiment took about 45 min to complete.
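The bank-card calibration can be illustrated with a short sketch of the underlying arithmetic. It assumes a standard ID-1 card (85.60 mm long), and the function names and the example line length are hypothetical. How Testable implements the calibration internally is not described in the text.

```python
CARD_LENGTH_CM = 8.56  # assumed ID-1 bank card length (85.60 mm)


def pixels_per_cm(matched_line_px):
    """Screen resolution implied by the line the participant matched to the card."""
    return matched_line_px / CARD_LENGTH_CM


def stimulus_height_px(target_height_cm, matched_line_px):
    """Pixel height needed to display a stimulus at a given physical height."""
    return target_height_cm * pixels_per_cm(matched_line_px)


# Hypothetical participant who matched the card with a 428-pixel line.
face_px = stimulus_height_px(4.0, matched_line_px=428)  # about 200 px
car_px = stimulus_height_px(3.5, matched_line_px=428)   # about 175 px
```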
Results
All data analysis was conducted using JASP (JASP Team,
2022), except for the internal reliability analysis which was
carried out using R software and R Studio (R Core Team,
2021; RStudio Team, 2021) including several R packages:
dplyr (Wickham et al., 2021), tidyr (Wickham, 2021),
data.table (Dowle & Srinivasan, 2021), psy (Falissard, 2012) and
reshape (Wickham, 2007).
Normal distribution
The skewness (skew = – 0.397, SE = 0.216) and kurtosis
(kurtosis = – 0.485, SE = 0.428) values for the CFMT-MY
score were between ± 1 which indicates normal distribution
(George & Mallery, 2019). Additionally, no significant skew
was found for the scores of CFMT-MY (z = – 1.838, p = .07).
The mean score for CFMT-MY was 59.94/72, SD = 6.93.
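The z value reported for skewness is the skewness statistic divided by its standard error. The sketch below reproduces that calculation and applies the same ratio to kurtosis. The function name is hypothetical and a two-tailed normal approximation is assumed for the p value.

```python
from statistics import NormalDist


def z_test(statistic, standard_error):
    """Return the z ratio and its two-tailed p value under a normal approximation."""
    z = statistic / standard_error
    p = 2.0 * NormalDist().cdf(-abs(z))
    return z, p


z_skew, p_skew = z_test(-0.397, 0.216)  # skewness and SE reported above
z_kurt, p_kurt = z_test(-0.485, 0.428)  # kurtosis and SE reported above
```

With the reported values, the skewness ratio matches the z and p given in the text, and the kurtosis ratio also stays within ±1.96.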
Internal reliability
The internal reliability of the test was measured using
Cronbach’s α. Across all trials, the internal reliability of the
CFMT-MY was α = .86. This high internal reliability is
in line with previous work on the
CFMT-Chinese, α = .86 (McKone et al., 2017), and the
CFMT-Aus, α = .88 (McKone et al., 2011).
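For readers unfamiliar with the statistic, Cronbach's α can be computed from a participants-by-trials matrix of correct/incorrect responses using the standard formula α = k/(k − 1) × (1 − Σ item variances / variance of total scores). The sketch below is a generic illustration with toy data. It is not the R code used in the study, and the function name is hypothetical.

```python
from statistics import variance


def cronbach_alpha(responses):
    """Cronbach's alpha for a list of participant rows of item scores.

    Each row holds one participant's scores (e.g., 1 = correct, 0 = incorrect).
    """
    k = len(responses[0])
    item_variances = [variance([row[i] for row in responses]) for i in range(k)]
    total_variance = variance([sum(row) for row in responses])
    return (k / (k - 1)) * (1.0 - sum(item_variances) / total_variance)


# Toy data: four participants, three trials (not data from the study).
toy_responses = [
    [1, 1, 1],
    [1, 1, 0],
    [1, 0, 0],
    [0, 0, 0],
]
alpha = cronbach_alpha(toy_responses)
```

For the CFMT-MY, each row would hold one participant's 72 trial scores.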
Internal consistency
The internal consistency of the CFMT-MY at stage level
(i.e., learning, novel, and novel-with-noise) was measured
using Pearson correlation (r). Results showed positive
correlation between the learning and novel stage, r(134) =
.55, p < .001, learning and novel-with-noise stage, r(134) =
.40, p < .001 and novel and novel-with-noise stage, r(134) =
.68, p < .001 showing that the scores were highly consistent
across the different stages of CFMT-MY.
Validity
Convergent and divergent validity were measured using
Pearson correlation (r). Convergent validity was measured
by examining the correlation between the CFMT-
Chinese and the CFMT-MY whereas divergent validity
was measured by examining the correlation between
the CCMT and the CFMT-MY. Results showed positive
correlation between the scores of the CFMT-MY and the
CFMT-Chinese, r(124) = .59, p < .001. A weak positive
correlation was found between the scores of the CCMT
and the CFMT-MY, r(124) = .26, p = .004. The difference
between the two correlations was further analyzed by
comparing the dependent overlapping correlations
(Diedenhofen & Musch, 2015; Hittner et al., 2003). The
test showed that the correlation between the CFMT-MY
and the CFMT-Chinese (i.e., convergent validity) was
larger than the correlation between the CCMT and the
CFMT-MY (i.e., divergent validity), z = 3.62, p < .001.
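Several tests exist for comparing two dependent correlations that share one variable, and the text does not say which one produced the reported z. As one illustration, the sketch below implements the z test of Meng, Rosenthal and Rubin (1992). It requires the correlation between the two non-shared variables (here the CFMT-Chinese and the CCMT), which is not reported in this section. The value of .20 used below is therefore purely hypothetical, and the resulting z is not expected to match the reported one.

```python
from math import atanh, sqrt
from statistics import NormalDist


def meng_z(r_xy, r_xz, r_yz, n):
    """Meng, Rosenthal & Rubin (1992) z test for r_xy versus r_xz.

    x is the shared variable; r_yz is the correlation between the two
    non-shared variables; n is the sample size.
    """
    mean_r_sq = (r_xy ** 2 + r_xz ** 2) / 2.0
    f = min((1.0 - r_yz) / (2.0 * (1.0 - mean_r_sq)), 1.0)
    h = (1.0 - f * mean_r_sq) / (1.0 - mean_r_sq)
    z = (atanh(r_xy) - atanh(r_xz)) * sqrt((n - 3) / (2.0 * (1.0 - r_yz) * h))
    p = 2.0 * NormalDist().cdf(-abs(z))
    return z, p


# r_xy and r_xz are the reported correlations; r_yz = .20 is a hypothetical value.
z_diff, p_diff = meng_z(r_xy=0.59, r_xz=0.26, r_yz=0.20, n=126)
```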
Repeated‑measures ANOVA
A repeated-measures ANOVA was conducted to explore
(1) potential differences between the CFMT-Chinese and
the CFMT-MY and (2) the increasing levels of difficulty
across the test stages. A 3 (stage: learning vs. novel vs.
novel-with-noise) × 2 (test version: CFMT-MY vs. CFMT-
Chinese) repeated-measures ANOVA was conducted on the
accuracy (calculated by proportion correct scores). When the
Mauchly’s test indicated that the assumption of sphericity
was violated, the degrees of freedom were corrected using
the Greenhouse–Geisser method.
Analysis revealed a significant main effect of stage on
accuracy, F(1.70, 212.05) = 357.49, p < .001, ηp² = .74.
A post hoc Holm–Bonferroni test demonstrated that the
accuracy of the learning stage (M = .98, SD = .05) was
higher than the novel stage (M = .77, SD = .15), p < .001,
d = 1.94. Similarly, the accuracy of the learning stage was
higher than the novel-with-noise stage (M = .74, SD = .16),
p < .001, d = 2.17. Accuracy was found to be higher in the
novel stage compared to the novel-with-noise stage, p = .01,
d = 0.23.
Results showed a significant main effect of test version on
accuracy, F(1, 125) = 14.03, p < .001, ηp² = .10, where the
accuracy of CFMT-MY (M = .83, SD = .10) was higher than
CFMT-Chinese (M = .79, SD = .12). A significant interaction
effect between stage and test version on accuracy was
found, F(2, 250) = 65.68, p < .001, ηp² = .34 (Fig. 2). Simple
main effects analysis showed no differences between the
test versions in the learning stage, F(1, 125) = 3.497, p =
.064, η² = .027, and novel-with-noise stage, F(1, 125) =
2.042, p = .156, η² = .016. However, a significant effect
was found in the novel stage, F(1, 125) = 89.45, p < .001,
η² = .417, where the novel stage score for CFMT-MY (M
= .83, SD = .12) was higher than CFMT-Chinese (M = .71,
SD = .16).
Additional simple main effects analysis showed a significant
main effect of stage on accuracy in the CFMT-MY, F(1.86,
231.82) = 245.16, p < .001, η² = .66. A post hoc Holm–Bonferroni
test showed that the accuracy of the learning stage (M
= .97, SD = .05) was higher than the novel stage (M = .83,
SD = .12), p < .001, d = 1.18. Similarly, the accuracy of the
learning stage was higher than the novel-with-noise stage (M
= .73, SD = .15), p < .001, d = 1.96. Accuracy for the novel
stage was found to be higher than the novel-with-noise stage, p
< .001, d = 0.78. Results also showed a significant main effect
of stage on accuracy in the CFMT-Chinese, F(1.81, 226.17)
= 270.01, p < .001, η² = .68. A post hoc Holm–Bonferroni
test demonstrated that the accuracy of the learning stage (M =
.98, SD = .04) was higher than the novel stage (M = .71, SD =
.16), p < .001, d = 1.93. Similarly, the accuracy of the learning
stage was higher than the novel-with-noise stage (M = .75, SD
= .17), p < .001, d = 1.61. Interestingly, the accuracy for the
novel stage was found to be lower than the novel-with-noise
stage, p < .001, d = –0.318.
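The effect sizes reported in this section can be cross-checked against the F statistics using the standard conversion from F and its degrees of freedom to partial eta squared, F × df_effect / (F × df_effect + df_error). The sketch below applies it to three of the reported tests. The function and variable names are hypothetical.

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Partial eta squared recovered from an F statistic and its degrees of freedom."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)


# F values and degrees of freedom as reported above.
stage_effect = partial_eta_squared(357.49, 1.70, 212.05)
version_effect = partial_eta_squared(14.03, 1, 125)
interaction_effect = partial_eta_squared(65.68, 2, 250)
```

Rounded to two decimals, these reproduce the reported values of .74, .10 and .34. Applying the same conversion to the simple main effects reproduces their reported effect sizes as well.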
Discussion
Overall, the results showed that the CFMT-MY seems to be
suitable to study individual differences in face recognition
and for the diagnosis of individuals with face recognition
impairments. The scores of CFMT-MY were normally
distributed when all trials were included in the analysis (72
trials). Hence, the standard method used to calculate the
cut-off score, M – 2SD seems to be a suitable option for the
diagnosis of face recognition impairments in the CFMT-MY.
Additionally, the CFMT-MY was highly consistent and
exhibited high internal reliability (α = .86), in
line with the values reported in previous work on the CFMT-Chinese
(α = .86; McKone et al., 2017) and the CFMT-Aus (α = .88; McKone et al., 2011).
This high reliability further supports the suitability of the
test to be used for diagnosis in clinical settings and for the
measurement of individual differences in face recognition.
The findings also demonstrated convergent validity where
the CFMT-MY was moderately correlated with the
CFMT-Chinese. This suggests that both tests tap very similar cognitive
processes. Results also demonstrated divergent validity where
the CFMT-MY was weakly correlated with the CCMT which
measures object recognition, despite both tests having similar
procedures and formats. Additionally, the correlation between
the Asian CFMT versions was larger compared to the correla-
tion between CCMT and CFMT-MY. Hence, there is strong
evidence that the CFMT-MY taps face-recognition-specific
processes rather than general visual memory.
Fig. 2 Proportion correct scores (accuracy) of Chinese Malaysian participants in the three stages of
the CFMT (learning, novel, novel-with-noise), shown separately for the CFMT-Chinese and the CFMT-MY.
Error bars represent 95% confidence intervals
Our results showed that the difficulty of the CFMT-MY
increases across stages. Specifically, the learning stage
achieved the highest accuracy followed by the novel and finally
the novel-with-noise stage. The CFMT-Chinese showed a similar
pattern of results where the learning stage achieved higher
accuracy compared to the novel and novel-with-noise stages.
However, the novel stage had lower accuracy compared to the
novel-with-noise stage. This finding is surprising and contra-
dicted the intended higher level of difficulty for the novel-with-
noise stage (Duchaine & Nakayama, 2006).
In summary, the analysis revealed that the CFMT-MY
seems to be suitable to use for diagnosis in clinical
settings and the measurement of individual differences in
face recognition ability with high consistency and high
internal reliability scores. The CFMT-MY also shows
appropriate convergent and divergent validity. In addition,
the CFMT-MY shows increasing difficulty across
stages, which is important for the assessment of a wide range
of face recognition abilities.
Experiment 2
In Experiment 2, we aimed to investigate whether the CFMT-MY
is sensitive to a classical effect in the face recognition
literature: the other-race effect. We also aimed to explore whether
Caucasian participants would present similar levels of
other-race effect for the CFMT-MY and the CFMT-Chinese.
Differences in accuracy between the CFMT-MY, CFMT-
Chinese and CFMT-original were assessed using a
repeated-measures ANOVA. Additionally, the CFMT-MY
scores of the Chinese Malaysian participants in Experiment 1
and the Caucasian participants in Experiment 2 were
compared using an independent-samples t test to determine
whether the CFMT-MY is sensitive to the other-race effect.
Methods
Participants
One hundred and fifty participants took part in this experiment,
but the final sample included 135 Caucasians (108
females, 25 males and two non-binary) with ages ranging
between 18 and 52 years (M = 22.04 years, SD = 6.62 years).
The age range for female participants was between 18 and
52 years (M = 21.64 years, SD = 6.54 years) while for male
participants, the age range was between 18 and 49 years (M
= 23.32 years, SD = 6.58 years). The age range for non-
binary participants was between 19 and 36 years (M = 27.50
years, SD = 12.02 years). Data from participants who had a
median reaction time of less than 500 ms (nine participants)
or scored below chance level (24/72) (one participant) for
any one of the CFMT versions were removed from further
analysis. Five participants of other ethnicities (e.g., Asian,
Other) other than White/Caucasian were also excluded. Ten
participants were excluded from the data analysis (except
for internal reliability and internal consistency analysis) as
their performance on the face memory tasks was indicative
of possible prosopagnosia (Appendix 2). The remaining par-
ticipants were 125 Caucasians (100 females, 24 males, and
one non-binary) with ages ranging between 18 and 52 years
(M = 21.53 years, SD = 5.66 years). The age range for female
participants was between 18 and 52 years (M = 21.38 years,
SD = 6.03 years) while for male participants, the age range
was between 18 and 34 years (M = 22.25 years, SD = 3.92
years). The age for the non-binary participant was 19 years.
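The two data-quality criteria described above (median reaction time below 500 ms, or a score below the chance level of 24/72, on any one CFMT version) can be sketched as a simple filter. This is only an illustration: the function name, the data layout and the example values are hypothetical and are not taken from the authors' analysis scripts.

```python
import statistics

CHANCE_SCORE = 24        # chance level out of 72 trials, as stated in the text
MIN_MEDIAN_RT_MS = 500   # minimum acceptable median reaction time (ms)

def should_exclude(versions):
    """Return True if any CFMT version fails either criterion.

    `versions` maps a version name to a tuple of
    (score out of 72, list of reaction times in ms). Hypothetical layout.
    """
    for score, reaction_times in versions.values():
        too_fast = statistics.median(reaction_times) < MIN_MEDIAN_RT_MS
        below_chance = score < CHANCE_SCORE
        if too_fast or below_chance:
            return True
    return False

# Made-up participant: acceptable on one version, below chance on another.
participant = {
    "CFMT-MY": (50, [800, 900, 1000]),
    "CFMT-Chinese": (20, [850, 950, 1100]),
}
print(should_exclude(participant))
```

Because the rule applies to any one of the versions, a single failing block is enough to remove the participant's data from all analyses.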
An a priori power analysis was conducted using G*Power
3.1 (Faul et al., 2009) for a repeated-measures ANOVA
comparing the three CFMT versions (CFMT-MY, CFMT-Chinese
and CFMT-original). The effect size for the other-race
effect was estimated from two studies which had used the
CFMT in measuring the other-race effect (McKone et al.,
2012; Wan et al., 2017), where the average ηp² = .44, a large
effect size (ηp² for these papers was calculated using
formula 13 in Lakens (2013)). Additionally, a meta-analysis
has reported a large effect size for the other-race effect,
Hedges' g = .82. Therefore, a large effect size estimate (ηp² =
.14) was entered into the power analysis with the following
parameters: α = .05, power = .95. The power analysis implied
that 50 participants would be required to detect a difference
between the CFMT versions with 95% probability.
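As a rough illustration of the effect-size conversions involved, the sketch below recovers ηp² from a reported F ratio using the standard relation ηp² = (F × df_effect) / (F × df_effect + df_error), which we assume is the relation the cited formula refers to, and converts ηp² to Cohen's f, the effect-size metric that G*Power takes as input for ANOVA designs. The function names are hypothetical, and the sketch does not reproduce the G*Power sample-size calculation itself.

```python
import math

def partial_eta_squared(f_value, df_effect, df_error):
    """Partial eta squared recovered from an F ratio and its degrees of freedom."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)

def cohens_f(eta_p2):
    """Cohen's f corresponding to a given partial eta squared."""
    return math.sqrt(eta_p2 / (1.0 - eta_p2))

# Main effect of test version reported in the Results of Experiment 2:
# F(2, 248) = 70.49.
print(round(partial_eta_squared(70.49, 2, 248), 2))

# The "large" effect size estimate entered into the power analysis.
print(round(cohens_f(0.14), 3))
```

Applied to F(2, 248) = 70.49, the first function gives a value that rounds to .36, in line with the ηp² reported for the main effect of test version.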
All participants provided informed consent to participate
in the study. Course credits were given for participation.
The study has been reviewed and approved by the Science
and Engineering Research Ethics Committee (SEREC) at
the University of Nottingham Malaysia (approval code:
KSK050320).
Materials andprocedure
Three versions of CFMT were used: the CFMT-original, the
CFMT-Chinese and the CFMT-MY. The CFMT-original was
obtained from the authors of the task (Duchaine & Nakayama,
2006). Fifty-two male identities were used in the task. The
CFMT-original follows the same procedure as the CFMT-
Chinese and the CFMT-MY (refer to Experiment 1 for full
procedure). Testable (https:// www. testa ble. org/) was used to
run the online experiment (Rezlescu etal., 2020). The aver-
age vertical height of the face stimuli in the CFMT was 4 cm.
Participants completed all three CFMT versions in random
order. The experiment took about 45 min to complete.
Results
The analyses of internal reliability, internal consistency and
validity were consistent with Experiment 1 and are available
in Appendix 3. To examine whether accuracy (calculated as
proportion correct scores) differed between the test versions
and between the stages of the test, a 3 (test version: CFMT-MY
vs. CFMT-Chinese vs. CFMT-original) × 3 (stage: learning vs.
novel vs. novel-with-noise) repeated-measures ANOVA was
conducted. When Mauchly's test indicated that the assumption
of sphericity was violated, the degrees of freedom were
corrected using the Greenhouse–Geisser method.
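The Greenhouse–Geisser correction multiplies both degrees of freedom of the F test by a sphericity estimate ε (at most 1). A minimal sketch follows; the function name is hypothetical, and the value ε ≈ 0.8185 is not reported in the paper but is back-calculated here, as an assumption, from the corrected F(1.64, 203.00) reported for the main effect of stage against the uncorrected degrees of freedom (2, 248).

```python
def greenhouse_geisser_df(df_effect, df_error, epsilon):
    """Scale both degrees of freedom by the sphericity estimate epsilon."""
    if not 0.0 < epsilon <= 1.0:
        raise ValueError("epsilon must lie in (0, 1]")
    return epsilon * df_effect, epsilon * df_error

# Assumed epsilon, inferred from the reported corrected degrees of freedom.
df1, df2 = greenhouse_geisser_df(2, 248, 0.8185)
print(round(df1, 2), round(df2, 2))
```

With ε = 1 the degrees of freedom are unchanged, which corresponds to the effects reported with uncorrected values such as F(2, 248).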
Results showed a significant main effect of test version
on accuracy, F(2, 248) = 70.49, p < .001, ηp² = .36. A post
hoc Holm–Bonferroni test demonstrated that the accuracy
of CFMT-original (M = .82, SD = .12) was higher than
CFMT-MY (M = .74, SD = .12), p < .001, d = 0.73.
Similarly, the accuracy of CFMT-original was higher than
CFMT-Chinese (M = .69, SD = .11), p < .001, d = 1.03.
Accuracy for CFMT-MY was also found to be higher than
CFMT-Chinese, p < .001, d = –0.30. To further demonstrate
the other-race effect, we ran an additional analysis comparing
the CFMT-MY scores of Chinese Malaysian participants in
Experiment 1 and the Caucasian participants in Experiment
2. An independent-samples t test revealed that Chinese
Malaysian participants (M = .83, SD = .10) scored higher
than the Caucasian participants (M = .74, SD = .12) on the
CFMT-MY, t(249) = –7.20, p < .001, d = –0.91.
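For an independent-samples t test, the t statistic and Cohen's d are linked through the two group sizes, d = t × sqrt((n1 + n2) / (n1 × n2)), which allows the reported values to be cross-checked. In the sketch below the function names are hypothetical; the Experiment 2 group size (125) is taken from the text, while the Experiment 1 group size (126) is an assumption inferred from the reported degrees of freedom (249 = n1 + n2 – 2).

```python
import math

def d_from_t(t_value, n1, n2):
    """Cohen's d implied by an independent-samples t statistic."""
    return t_value * math.sqrt((n1 + n2) / (n1 * n2))

def t_from_d(d_value, n1, n2):
    """Independent-samples t statistic implied by Cohen's d."""
    return d_value * math.sqrt((n1 * n2) / (n1 + n2))

# n1 = 126 is inferred from t(249); n2 = 125 is reported in the text.
print(round(d_from_t(-7.20, 126, 125), 2))
```

Under these assumed group sizes, t = –7.20 corresponds to a d that rounds to –0.91, consistent with the value reported above.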
Analysis revealed a significant main effect of stage on
accuracy, F(1.64, 203.00) = 628.13, p < .001, ηp² = .84. A
post hoc Holm–Bonferroni test demonstrated that the accuracy
of the learning stage (M = .96, SD = .05) was higher than
the novel stage (M = .71, SD = .12), p < .001, d = 2.35.
Similarly, the accuracy of the learning stage was higher than
the novel-with-noise stage (M = .64, SD = .14), p < .001,
d = 3.02. Accuracy was found to be higher for the novel
compared to the novel-with-noise stage, p < .001, d = 0.66.
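The Holm–Bonferroni procedure used for the post hoc comparisons can be sketched as a step-down rule: the p values are sorted in ascending order, the smallest is compared with α/m, the next with α/(m – 1), and so on, stopping at the first comparison that fails. The function name and the example p values below are hypothetical and are not the exact p values from this study, which are reported only as inequalities.

```python
def holm_bonferroni(p_values, alpha=0.05):
    """Return reject decisions (in input order) under Holm's step-down rule."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    reject = [False] * m
    for rank, idx in enumerate(order):
        # The k-th smallest p value is tested against alpha / (m - k + 1).
        if p_values[idx] <= alpha / (m - rank):
            reject[idx] = True
        else:
            break  # all larger p values are retained as well
    return reject

# Made-up p values for three pairwise comparisons.
print(holm_bonferroni([0.001, 0.04, 0.03]))
```

In this made-up example only the first comparison survives: 0.001 passes the α/3 threshold, but 0.03 exceeds α/2 = .025, so testing stops there.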
A significant interaction effect between test version and
stage was found, F(4, 496) = 38.47, p < .001, ηp² = .24
(Fig. 3). Simple main effects analysis showed a significant effect
of test version on accuracy in the learning stage, F(2, 248) = 17.48,
p < .001, η² = .12. A post hoc Holm–Bonferroni test demonstrated
that the accuracy of CFMT-original (M = .98, SD = .04) was
higher than CFMT-MY (M = .94, SD = .07) in the learning
stage, p < .001, d = 0.52. Similarly, the accuracy of the
CFMT-original was higher than the accuracy in the CFMT-Chinese
(M = .95, SD = .07) in the learning stage, p < .001, d = 0.36.
No difference was found between the accuracy for the CFMT-MY
and the CFMT-Chinese in the learning stage, p = .08, d
= 0.16. Analysis also revealed a significant effect of test
version on accuracy in the novel stage, F(2, 248) = 96.96, p <
.001, η² = .44. A post hoc Holm–Bonferroni test demonstrated
that the accuracy of the CFMT-original (M = .80, SD = .16)
was higher than the CFMT-MY (M = .73, SD = .15) in the
novel stage, p < .001, d = 0.45. Similarly, the accuracy of the
CFMT-original was higher than the CFMT-Chinese (M = .60,
SD = .14) in the novel stage, p < .001, d = 1.23. Accuracy for
the CFMT-MY was also found to be higher than the accuracy
in the CFMT-Chinese in the novel stage, p < .001, d =
–0.78. A significant effect of test version on accuracy in
the novel-with-noise stage was found, F(2, 248) = 30.57, p <
.001, η² = .20. A post hoc Holm–Bonferroni test demonstrated
that the accuracy of the CFMT-original (M = .71, SD = .18)
was higher than the accuracy in the CFMT-MY (M = .60,
SD = .17) in the novel-with-noise stage, p < .001, d = 0.64.
Similarly, the accuracy of the CFMT-original was higher than
the accuracy in the CFMT-Chinese (M = .61, SD = .17) in
the novel-with-noise stage, p < .001, d = 0.56. No difference
was found between the accuracy in the CFMT-MY and the
CFMT-Chinese in the novel-with-noise stage, p = .35, d = 0.09.
Simple main effects analysis also revealed differences
in accuracy across stages in the CFMT-MY, F(2, 248) =
330.81, p < .001, η² = .73. A post hoc Holm–Bonferroni
test demonstrated that the accuracy of the learning stage
(M = .94, SD = .08) was higher than accuracy in the novel
stage (M = .73, SD = .15), p < .001, d = 1.40. Similarly, the
accuracy of the learning stage was higher than the accuracy of
the novel-with-noise stage (M = .60, SD = .17), p < .001, d
= 2.28. Accuracy for the novel stage was found to be higher
than the novel-with-noise stage, p < .001, d = 0.88. Analysis
revealed a significant effect of stage on accuracy
in the CFMT-Chinese, F(1.85, 228.85) = 480.27, p < .001,
η² = .80. A post hoc Holm–Bonferroni test demonstrated that
the accuracy of the learning stage (M = .95, SD = .07) was
higher than the novel stage (M = .60, SD = .14), p < .001,
d = 2.43. Similarly, the accuracy of the learning stage was
higher than the novel-with-noise stage (M = .61, SD = .17),
p < .001, d = 2.37. No difference was found between the accuracy
of the novel stage and the novel-with-noise stage, p = .50, d =
–0.06. A significant effect of stage on accuracy in the
CFMT-original was found, F(1.65, 205.15) = 212.68, p <
.001, η² = .63. A post hoc Holm–Bonferroni test demonstrated
that the accuracy of the learning stage (M = .98, SD
= .04) was higher than the novel stage (M = .80, SD = .16),
p < .001, d = 1.23. Similarly, the accuracy of the learning
stage was higher than the accuracy in the novel-with-noise
stage (M = .71, SD = .18), p < .001, d = 1.81. Accuracy for
the novel stage was found to be higher than the accuracy in
the novel-with-noise stage, p < .001, d = 0.58.
Fig. 3 Proportion correct scores (accuracy) of Caucasian participants
in the three stages of CFMT (learning, novel, novel-with-noise) for
the CFMT-original, CFMT-Chinese and CFMT-MY. Error bars represent
95% confidence intervals
Discussion
Our findings revealed that the CFMT-MY was sensitive to
the other-race effect. Although the accuracy of the CFMT-
original was higher than the CFMT-MY and CFMT-Chinese
in all three stages, this could not adequately demonstrate the
other-race effect since the other-race CFMT (i.e., CFMT-
Chinese and CFMT-MY) may be more difficult compared to
the own-race CFMT (i.e., CFMT-original). Hence, we ran an
additional analysis which showed that Caucasian participants
scored lower compared to the Chinese Malaysian
participants from Experiment 1 on the CFMT-MY. This
indicated that Chinese Malaysian participants had superior
recognition of own-race faces (i.e., Chinese Malaysian faces)
as compared to Caucasian participants, replicating the other-
race effect in face recognition (Meissner & Brigham, 2001).
Our results also showed that the scores of CFMT-MY and
CFMT-original clearly represent the increasing difficulty of
the three stages, with the learning stage achieving
close-to-ceiling scores, followed by the novel stage and then the
novel-with-noise stage, which had the highest difficulty. Interestingly,
while the learning stage had higher accuracy compared to
the novel and novel-with-noise stages in the CFMT-Chinese,
accuracies for the novel and novel-with-noise stages were
similar. Additionally, higher accuracy was found for the
CFMT-MY compared to CFMT-Chinese in our Caucasian
sample, replicating our findings from Experiment 1.
To summarize, our results show that the CFMT-MY is
sensitive to the other-race effect as Caucasian participants
scored lower on the CFMT-MY compared to the Chinese
Malaysian participants from Experiment 1. Interestingly,
with a Caucasian sample, we have shown that the difficulty
of the CFMT-MY increases across stages. This result was,
however, not found in the CFMT-Chinese.
General discussion
The current study aimed to develop a new version of the Asian
CFMT using Chinese Malaysian faces, the CFMT-MY, as a
standardized test of face recognition ability. Overall, results
indicated that the CFMT-MY has high consistency and high
reliability and so is suitable for the diagnosis of individu-
als with difficulty in face recognition in clinical settings and
also for research measuring individual differences in face
recognition ability. The CFMT-MY also showed convergent
validity with the CFMT-Chinese and divergent validity with
the CCMT. Scores for the CFMT-MY corresponded to the
increasing level of difficulty intended for the CFMT stages
(see Duchaine & Nakayama, 2006), where the learning stage
achieved the highest accuracy followed by the novel and
finally the novel-with-noise stage. The CFMT-MY scores
were also normally distributed when all trials were included
in the analysis (72 trials). Thus, the standard method used to
calculate the cut-off score, M – 2SD, can be used for the diagnosis
of impairments related to face recognition. The CFMT-MY
was also sensitive to the other-race effect. Our results
revealed that Caucasian participants scored lower on the
CFMT-MY compared to the Chinese Malaysian participants
from Experiment 1. Chinese Malaysian participants showed
superior recognition of own-race faces compared to Caucasian
participants, supporting the other-race effect in face recognition
(Meissner & Brigham, 2001; Wong et al., 2020, 2021).
The results of both experiments also revealed an inter-
esting pattern: Chinese Malaysian and Caucasian partici-
pants showed higher accuracy in the CFMT-MY compared
to the CFMT-Chinese. This result seems to be explained
by a surprisingly low performance in the novel stage of
the CFMT-Chinese. In fact, performance on this stage was
lower (Experiment 1) or identical (Experiment 2) to that of
the novel-with-noise stage. This pattern of results, which
was only found in the CFMT-Chinese, is problematic as it
shows no linear increment of difficulty across stages. The
increment of difficulty across stages is important for the
assessment of a wide range of face recognition abilities
(Duchaine & Nakayama, 2006). Because in our first experiment
we used a sample of Chinese Malaysian participants,
it could be argued that these results could be explained by
the other-ethnicity effect (McKone et al., 2012). However,
this hypothesis cannot explain why in Experiment 2, with
a Caucasian sample, we found no differences between the
novel and the novel-with-noise stages in the CFMT-Chinese,
but clear differences across these stages in the CFMT-MY.
More importantly, previous studies have also revealed a
similar percentage of correct responses across the novel
and the novel-with-noise stages in the CFMT-Chinese (i.e.,
72.13% and 71.58%, see McKone et al., 2017, Table 3, and
79.24% and 80.11%, see McKone et al., 2012, Table 1). Past
research has suggested that the CFMT could be shortened by
including only the first two stages (i.e., the learning and novel
stages) for diagnosis of prosopagnosia (Corrow et al., 2018;
Murray & Bate, 2020). In this case, including only the first
two stages of the CFMT-Chinese may be problematic, as in
this test the novel stage seems to be as difficult as, or even
more difficult than, the novel-with-noise stage, which could
potentially result in more individuals scoring below the cut-off.
It is important to note here that, compared to the CFMT-
Chinese and CFMT-original, we used a different method to
create the novel stage. Images of the same identity that were
captured with different poses and physical lighting were used
for the CFMT-Chinese novel stage, but such images did not
exist in our face database. We therefore followed the procedure
of the CFMT-Aus (McKone et al., 2011), in which different
cropping templates were used instead of poses and lighting was
added to the images using photo editing software. Despite
these differences, the CFMT-MY showed a clear increment
in difficulty across stages. In addition, these differences in
the stimuli cannot explain the discrepancy in difficulty levels
between the novel stage of the CFMT-Chinese and CFMT-MY
as both the CFMT-original and the CFMT-Chinese use the
same method in the novel stage. In this sense, higher accuracy
for the novel stage compared to the novel-with-noise stage was
found in the CFMT-original while these differences were not
found in the CFMT-Chinese. Thus, we conclude that the lower
accuracy for CFMT-Chinese compared to CFMT-MY among
Chinese Malaysian and Caucasian participants was due to dif-
ferences in test difficulty, specifically in the novel stage.
Malaysia is a multiracial country with Malays constituting
57.93% of the population, followed by Chinese at 22.58%,
Indians at 6.7%, indigenous people (i.e., Orang Asli) at
3.95%, others at 0.64% and non-citizens at 8.2% (Department
of Statistics Malaysia, 2011). In this case, the use of the
CFMT-MY in Malaysia may be limited to Chinese Malaysian
participants due to the presence of the other-race effect
in face recognition (Meissner & Brigham, 2001; Wong et al.,
2020). However, recent research showed no differences in
the recognition of Chinese faces between Chinese Malaysians
and non-Chinese Malaysians (i.e., Malays and Indians)
(Estudillo et al., 2020), suggesting that the CFMT-MY may also be
suitable for the diagnosis of face recognition difficulties in
the non-Chinese population. Because the Chinese are Malaysia's
second most populous race, non-Chinese Malaysians may
have developed greater expertise with Chinese faces due to
extensive experience with the Chinese Malaysian population
(Tanaka et al., 2013; Wan et al., 2015). However, some
of the states in Malaysia have majority Malay populations
(e.g., Kelantan, Terengganu and Perlis) (Saravanamuttu,
2010) and hence, the population in those states may not be as
familiar with Chinese faces, hindering the use of the CFMT-MY
for diagnosis of face recognition difficulties in those regions.
Conclusions
In summary, we report that the CFMT-MY is a highly
consistent and reliable test for diagnosing individuals
with difficulty in face recognition in clinical settings,
measurement of individual differences in face recognition
ability and measurement of the other-race effect. The
standard method to calculate the cut-off score (M – 2SD)
seems to be appropriate for the diagnosis of impairments
related to face recognition. Additionally, the lower end
of the norm scores was far from the chance level (24/72
trials) which permits a range of scores for the diagnosis
of impairments related to face recognition such as
prosopagnosia. Although the psychometric properties of the
CFMT-MY have been shown to be appropriate for diagnosis
of face recognition impairments, future research involving
Asian prosopagnosic participants would be required to
further validate the use of the CFMT-MY for diagnosis of
prosopagnosia. The CFMT-MY scores corresponded to the
increasing level of difficulty intended for the CFMT stages
where the learning stage achieved the highest accuracy
followed by the novel and finally the novel-with-noise
stage. Finally, the current availability of two Asian CFMT
versions could lead to improvement of diagnosis for face
recognition difficulties and is beneficial for use in pre-post
face recognition ability assessments.
Appendix1
Exclusion ofpossible prosopagnosia cases
forExperiment 1
Possible prosopagnosia cases were excluded so that the calculations
represent “norm” participants, allowing the test to be
used for diagnosing prosopagnosia cases (see Bowles
et al., 2009 and McKone et al., 2017 for a similar procedure).
Percentile ranks (Crawford et al., 2009) were calculated to
determine the bottom 2% of the sample using the formula
(m + 0.5k) / N × 100, where m is the number of participants
scoring below a given score, k is the number of participants who
obtained the given score and N is the total sample size.
Using this formula, a CFMT-MY score of 39/72 was equivalent
to a percentile rank of 1.87% of the total sample (N = 134)
and the next score, 42/72, was equivalent to 2.61%. Three
participants (participant ID: 71, 102, and 75) who scored ≤ 39
were excluded. Based on the scores in Appendix Table 1, these
participants scored quite well in the learning stage of
CFMT-Chinese and CFMT-MY, showing that the low scores were not
attributable to lack of effort. Similarly, the scores for
CFMT-Chinese were at the lower end of the normal distribution. The raw
data file showed no indication of repeated same key pressing.
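The percentile-rank formula above can be checked with a short sketch. The function name is hypothetical, and the sample below is only a stand-in: the three lowest CFMT-MY scores (31, 36, 39) come from Appendix Table 1 and the next score (42) from the text, while the remaining 130 scores are placeholders set to an arbitrary higher value, since only their count matters for the ranks of the low scores.

```python
def percentile_rank(scores, score):
    """Percentile rank (m + 0.5k) / N * 100 of `score` within `scores`."""
    m = sum(s < score for s in scores)    # participants scoring below
    k = sum(s == score for s in scores)   # participants with this exact score
    return (m + 0.5 * k) / len(scores) * 100

# Four lowest scores from the paper plus 130 placeholder scores (assumed).
sample = [31, 36, 39, 42] + [50] * 130

print(round(percentile_rank(sample, 39), 2))
print(round(percentile_rank(sample, 42), 2))
```

With N = 134, a score of 39 has m = 2 and k = 1, and a score of 42 has m = 3 and k = 1, which yields the 1.87% and 2.61% quoted above.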
The standard method was used to calculate the cut-off value
of the CFMT-Chinese for prosopagnosia, M – 2SD. The cut-off
score was 36.46/72. Five participants (participant ID: 78, 105,
123, 83 and 45) who scored ≤ 36 were excluded. Based on
the scores in Appendix Table 1, these participants scored quite
well in the learning stage of CFMT-Chinese and CFMT-MY,
showing that the low scores were not attributable to lack of
effort. However, they unexpectedly scored in the average to
high range for CFMT-MY, except for participant 105. The raw
data file showed no indication of repeated same key pressing;
however, participant 123 had 16/72 trials (22.22%) with
response time < 500 ms and four trials with abnormally long
response times (12,194–168,585 ms) in the CFMT-Chinese block,
suggesting that the participant may have been distracted during the
task. Additionally, all five participants completed the
CFMT-Chinese as the last block as per randomization; hence, the low
performance could be attributed to fatigue or loss of attention/
effort towards the end of the experiment.
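The M – 2SD rule can be sketched as follows. The function names are hypothetical, and the use of the sample standard deviation (n – 1 denominator) is an assumption, since the text does not state which estimator was used. The second part applies the reported CFMT-Chinese cut-off (36.46/72) to the CFMT-Chinese scores listed in Appendix Table 1.

```python
import statistics

def cutoff_score(scores, n_sd=2.0):
    """Cut-off defined as mean minus n_sd standard deviations (sample SD assumed)."""
    return statistics.mean(scores) - n_sd * statistics.stdev(scores)

def flag_below(scores_by_id, cutoff):
    """Return the sorted IDs of participants scoring below the cut-off."""
    return sorted(pid for pid, score in scores_by_id.items() if score < cutoff)

# Toy scores to show the calculation itself.
print(cutoff_score([40, 50, 60]))

# CFMT-Chinese scores (/72) of the eight participants in Appendix Table 1.
cfmt_chinese = {71: 37, 102: 40, 75: 43, 78: 30, 105: 33, 123: 33, 83: 34, 45: 35}
print(flag_below(cfmt_chinese, 36.46))
```

Applied to the tabulated scores, the reported cut-off flags participants 45, 78, 83, 105 and 123, the same five participants listed above.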
It is unclear whether these cases represent
prosopagnosia, as some of the participants may have scored
at the lower end due to fatigue or loss of attention towards
the end of the experiment. Other measures such as the
20-item prosopagnosia index (PI20) (Shah et al., 2015), a
famous face test and a clinical interview are needed to
confirm these cases.
Appendix2
Exclusion ofpossible prosopagnosia cases
forExperiment 2
Possible prosopagnosia cases were excluded so that the
calculations represent “norm” participants (see Bowles
et al., 2009 and McKone et al., 2017 for a similar procedure).
The standard method, M – 2SD, was used to calculate
the cut-off value of the CFMT-original for prosopagnosia.
The cut-off score was 37.79/72. Four participants
(ID: 49, 87, 43 and 96) who scored ≤ 38 were excluded.
Based on the scores in Appendix Table 2, these participants
scored quite well in the learning stage of all test
versions of CFMT, showing that the low scores were not
attributable to lack of effort (except for participant 43 in
the CFMT-MY learning stage). Similarly, the scores for
CFMT-Chinese and CFMT-MY were at the lower end of
the normal distribution. The raw data file showed no indication
of repeated same key pressing; however, participant 43 had
32/72 trials (44.44%) for the CFMT-Chinese block, 17/72
trials (23.61%) for the CFMT-MY block and 34/72 trials
(47.22%) for the CFMT-original block with response time
< 500 ms, indicating that the participant may have been pressing
some of the keys randomly during the task.
The standard method to calculate the cut-off value, M
– 2SD, was used to determine the participants whose
scores ranked in the bottom 2% of the sample for
CFMT-Chinese (for a similar procedure, see Wan et al., 2017). The
cut-off value was 31.49/72. Four participants (ID: 129, 115,
37 and 9) who scored ≤ 31 were excluded. Based on the
scores in Appendix Table 2, these participants scored quite
well in the learning stage of all test versions of CFMT,
showing that the low scores were not attributable to lack of effort.
Similarly, the scores for CFMT-MY were at the lower end of
the normal distribution (except for participant 9). The raw data
file showed no indication of repeated same key pressing. It
is unclear whether the participants were possible
prosopagnosia cases or if they were severely affected by the other-race
effect (ORE). For example, participant 37 scored at the lower
end for both CFMT-Chinese and CFMT-MY (other-race) but
scored in the average range for CFMT-original (own-race).
The cut-off value was also calculated using the standard
method, M – 2SD, to determine the participants whose
scores ranked in the bottom 2% of the sample for
CFMT-MY. The cut-off value was 32.8/72. Two participants
(ID: 71 and 104) who scored ≤ 33 were excluded.
Based on the scores in Appendix Table 2, these participants scored
quite well in the learning stage of all test versions of
CFMT, showing that the low scores were not attributable
to lack of effort. Similarly, the scores for CFMT-Chinese
were at the lower end of the normal distribution. The raw data
file showed no indication of repeated same key pressing.
Table 1 Possible prosopagnosia cases based on CFMT-MY and CFMT-Chinese scores
Participant ID | CFMT-MY: all trials (/72) | CFMT-MY: learning stage (/18) | CFMT-Chinese: all trials (/72) | CFMT-Chinese: learning stage (/18) | CCMT: all trials (/72) | CCMT: learning stage (/18) | Order of tasks
71 | 39 | 18 | 37 | 14 | 49 | 14 | CCMT > CFMT-MY > CFMT-Chinese
102 | 31 | 11 | 40 | 17 | 37 | 8 | CCMT > CFMT-Chinese > CFMT-MY
75 | 36 | 15 | 43 | 18 | 41 | 17 | CCMT > CFMT-MY > CFMT-Chinese
78 | 61 | 18 | 30 | 18 | 47 | 13 | CCMT > CFMT-MY > CFMT-Chinese
105 | 44 | 15 | 33 | 17 | 44 | 15 | CFMT-MY > CCMT > CFMT-Chinese
123 | 53 | 17 | 33 | 15 | 32 | 10 | CFMT-MY > CCMT > CFMT-Chinese
83 | 53 | 17 | 34 | 14 | 46 | 17 | CCMT > CFMT-MY > CFMT-Chinese
45 | 56 | 18 | 35 | 17 | 51 | 18 | CCMT > CFMT-MY > CFMT-Chinese
Note. Participants 71, 102 and 75 scored below the percentile rank of 2% for CFMT-MY, and participants 78, 105, 123, 83 and 45 scored below the cut-off
score (M – 2SD) for CFMT-Chinese.
All ten participants were excluded from the data analysis
(except for internal reliability analysis). As in Experiment
1, it is unclear whether the participants scoring below the
cut-off value on the CFMT-original were possible
prosopagnosia cases, as further diagnosis using other measures
is needed to confirm these cases. It is also unclear if
the participants whose scores ranked in the bottom
2% of the sample for CFMT-MY and CFMT-Chinese
were possible prosopagnosia cases, or if they were severely
affected by the other-race effect with average face recognition
ability for own-race faces (Wan et al., 2017).
Appendix3
Internal reliability
The internal reliability of the test was measured using
Cronbach’s α. For all trials, internal reliability was α =
.87 for CFMT-MY. Results showed high internal reliability
for CFMT-MY, which was in line with previous work such
as CFMT-Chinese, α = .86 (McKone etal., 2017).
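For readers unfamiliar with the statistic, Cronbach's α for a test with k trials is α = k / (k – 1) × (1 – Σ item variances / variance of total scores). The sketch below computes it from a participants × trials matrix of 0/1 accuracy scores. The function name and the tiny data set are hypothetical illustrations only and do not come from the study data (the CFMT has 72 trials per participant).

```python
import statistics

def cronbach_alpha(item_scores):
    """Cronbach's alpha for a list of per-participant lists of item scores."""
    k = len(item_scores[0])  # number of items (trials)
    item_variances = [
        statistics.variance([row[i] for row in item_scores]) for i in range(k)
    ]
    total_variance = statistics.variance([sum(row) for row in item_scores])
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)

# Made-up accuracy data: four participants, three trials (1 = correct).
toy_data = [
    [1, 1, 1],
    [1, 1, 0],
    [1, 0, 0],
    [0, 0, 0],
]
print(cronbach_alpha(toy_data))
```

Values closer to 1 indicate that trials rank participants in a consistent way, which is the sense in which the α values above are described as high.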
Internal consistency
The internal consistency of the CFMT-MY at stage level
(i.e., learning, novel, and novel-with-noise) was meas-
ured using Pearson correlation (r). Results showed posi-
tive correlation between the learning and novel stage,
r(133) = .45, p < .001, learning and novel-with-noise
stage, r(133) = .34, p < .001 and novel and novel-with-
noise stage, r(133) = .67, p < .001 showing that the
scores were highly consistent across the different stages
of CFMT-MY.
Validity
Convergent validity was measured using Pearson correla-
tion (r). Convergent validity was measured by examining
the correlation between the CFMT-MY and CFMT-Chinese
and between the CFMT-MY and CFMT-original. Results
showed positive correlation between the scores of CFMT-
MY and CFMT-Chinese, r(123) = .57, p < .001 and of
CFMT-MY and CFMT-original, r(123) = .52, p < .001.
Table 2 Possible prosopagnosia cases based on CFMT-original, CFMT-Chinese, and CFMT-MY scores
Participant ID | CFMT-MY: all trials (/72) | CFMT-MY: learning stage (/18) | CFMT-Chinese: all trials (/72) | CFMT-Chinese: learning stage (/18) | CFMT-original: all trials (/72) | CFMT-original: learning stage (/18) | Order of tasks
49 | 35 | 15 | 41 | 18 | 34 | 17 | CFMT-Chinese > CFMT-original > CFMT-MY
87 | 39 | 15 | 39 | 18 | 34 | 16 | CFMT-MY > CFMT-Chinese > CFMT-original
43 | 26 | 6 | 37 | 18 | 36 | 11 | CFMT-Chinese > CFMT-original > CFMT-MY
96 | 36 | 17 | 37 | 16 | 38 | 18 | CFMT-Chinese > CFMT-original > CFMT-MY
129 | 38 | 15 | 30 | 18 | 39 | 16 | CFMT-MY > CFMT-Chinese > CFMT-original
115 | 34 | 12 | 31 | 14 | 43 | 15 | CFMT-original > CFMT-Chinese > CFMT-MY
37 | 39 | 13 | 31 | 15 | 56 | 18 | CFMT-MY > CFMT-original > CFMT-Chinese
9 | 46 | 17 | 31 | 12 | 41 | 16 | CFMT-Chinese > CFMT-original > CFMT-MY
71 | 30 | 14 | 40 | 13 | 48 | 15 | CFMT-Chinese > CFMT-MY > CFMT-original
104 | 31 | 17 | 44 | 16 | 45 | 18 | CFMT-original > CFMT-Chinese > CFMT-MY
Note. Participants scored below percentile rank of 2% for CFMT-original (49, 87, 43, and 96), CFMT-Chinese (129, 115, 37, and 9) and CFMT-MY (71 and 104).
The difference between the two correlations was further
analyzed by comparing the dependent overlapping correlations
(Diedenhofen & Musch, 2015; Hittner et al., 2003).
The test showed that the correlation between CFMT-MY
and CFMT-Chinese was no different compared to the
correlation between CFMT-MY and CFMT-original (z = 0.7,
p = .49).
Code availability Not applicable.
Authors' contributions The authors confirm contribution to the paper
as follows: study conception and design: Siew Kei Kho, Bryan Qi
Zheng Leong, and Alejandro J. Estudillo; data collection: Siew Kei
Kho and Alejandro J. Estudillo; analysis and interpretation of results:
Siew Kei Kho and Alejandro J. Estudillo; draft manuscript preparation:
Siew Kei Kho, David R. T. Keeble, Hoo Keat Wong, and Alejandro
J. Estudillo. All authors reviewed the results and approved the final
version of the manuscript.
Funding This study was funded by the Fundamental Research Grant
Scheme (FRGS) from the Ministry of Education (MOE) Malaysia
(Grant number: FRGS/1/2018/SS05/UNIM/02/4).
Data availability The datasets generated during the current study and
the CFMT-MY materials (stimuli and trial order) are available in the
Open Science Framework repository, https://osf.io/gu4fy/.
Declarations
Competing interests The authors have no competing interests to
declare that are relevant to the content of this article.
Ethics approval Approval was obtained from the Science and Engi-
neering Research Ethics Committee (SEREC) at the University of
Nottingham Malaysia (approval code: KSK050320). The procedures
used in this study adhere to the tenets of the Declaration of Helsinki.
Consent to participate Informed consent was obtained from all indi-
vidual participants included in the study.
Consent for publication Not applicable.
References
Arrington, M., Elbich, D., Dai, J., Duchaine, B., & Scherf, K. S. (2022).
Introducing the female Cambridge face memory test – long form
(F-CFMT+). Behavior Research Methods, 17(10), 841. https://
doi. org/ 10. 3758/ s13428- 022- 01805-8
Bate, S., & Dudfield, G. (2019). Subjective assessment for super rec-
ognition: An evaluation of self-report methods in civilian and
police participants. PeerJ, 2019(1), 1–17. https:// doi. org/ 10. 7717/
peerj. 6330
Bate, S., Adams, A., & Bennetts, R. J. (2019a). Guess Who? Facial
Identity Discrimination Training Improves Face Memory in Typi-
cally Developing Children. Journal of Experimental Psychology:
General, 1–47. https:// doi. org/ 10. 1037/ xge00 00689
Bate, S., Bennetts, R., Hasshim, N., Portch, E., Murray, E., Burns, E.,
& Dudfield, G. (2019b). The limits of super recognition: An other-
ethnicity effect in individuals with extraordinary face recognition
skills. Journal of Experimental Psychology: Human Perception
and Performance, 45(3), 363–377. https:// doi. org/ 10. 1037/ xhp00
00607
Bobak, A. K., Mileva, V. R., & Hancock, P. J. B. (2019). Facing the
facts: Naive participants have only moderate insight into their
face recognition and face perception abilities. Quarterly Journal
of Experimental Psychology, 72(4), 872–881. https:// doi. org/ 10.
1177/ 17470 21818 776145
Bowles, D. C., McKone, E., Dawel, A., Duchaine, B., Palermo, R.,
Schmalzl, L., Rivolta, D., Wilson, C. E., & Yovel, G. (2009).
Diagnosing prosopagnosia: Effects of ageing, sex, and partici-
pant–stimulus ethnic match on the Cambridge Face Memory Test
and Cambridge Face Perception Test. Cognitive Neuropsychology,
26(5), 423–455. https:// doi. org/ 10. 1080/ 02643 29090 33431 49
Bruce, V. (1982). Changing faces: Visual and non-visual coding processes in face recognition. British Journal of Psychology, 73(1), 105–116. https://doi.org/10.1111/j.2044-8295.1982.tb01795.x
Bruce, V., Bindemann, M., & Lander, K. (2018). Individual differences in face perception and person recognition. Cognitive Research: Principles and Implications, 3(1), 10–12. https://doi.org/10.1186/s41235-018-0109-4
Childs, M. J., Jones, A., Thwaites, P., Zdravković, S., Thorley, C., Suzuki, A., Shen, R., Ding, Q., Burns, E., Xu, H., & Tree, J. J. (2021). Do individual differences in face recognition ability moderate the other ethnicity effect? Journal of Experimental Psychology: Human Perception and Performance, 47(7), 893–907. https://doi.org/10.1037/xhp0000762
Corrow, S. L., Albonico, A., & Barton, J. J. S. (2018). Diagnosing Prosopagnosia: The Utility of Visual Noise in the Cambridge Face Recognition Test. Perception, 47(3), 330–343. https://doi.org/10.1177/0301006617750045
Corrow, S. L., Davies-Thompson, J., Fletcher, K., Hills, C., Corrow, J. C., & Barton, J. J. S. (2019). Training face perception in developmental prosopagnosia through perceptual learning. Neuropsychologia, 134, 107196. https://doi.org/10.1016/j.neuropsychologia.2019.107196
Crawford, J. R., Garthwaite, P. H., & Slick, D. J. (2009). On percentile norms in neuropsychology: Proposed reporting standards and methods for quantifying the uncertainty over the percentile ranks of test scores. The Clinical Neuropsychologist, 23(7), 1173–1195. https://doi.org/10.1080/13854040902795018
Dalrymple, K. A., Fletcher, K., Corrow, S., Barton, J. J. S., Yonas, A., & Duchaine, B. (2014). “A room full of strangers every day”: The psychosocial impact of developmental prosopagnosia on children and their families. Journal of Psychosomatic Research, 77(2), 144–150. https://doi.org/10.1016/j.jpsychores.2014.06.001
Davies-Thompson, J., Fletcher, K., Hills, C., Pancaroglu, R., Corrow, S. L., & Barton, J. J. S. (2017). Perceptual Learning of Faces: A Rehabilitative Study of Acquired Prosopagnosia. Journal of Cognitive Neuroscience, 29(3), 573–591. https://doi.org/10.1162/jocn_a_01063
DeGutis, J., Wilmer, J., Mercado, R. J., & Cohan, S. (2013). Using regression to measure holistic face processing reveals a strong link with face recognition ability. Cognition, 126(1), 87–100. https://doi.org/10.1016/j.cognition.2012.09.004
Dennett, H. W., McKone, E., Tavashmi, R., Hall, A., Pidcock, M., Edwards, M., & Duchaine, B. (2012). The Cambridge Car Memory Test: A task matched in format to the Cambridge Face Memory Test, with norms, reliability, sex differences, dissociations from face memory, and expertise effects. Behavior Research Methods, 44(2), 587–605. https://doi.org/10.3758/s13428-011-0160-2
Department of Statistics Malaysia. (2011). Population Distribution and Basic Demographic Characteristic Report 2010 (Updated: 05/08/2011). Retrieved September 20, 2022, from https://www.dosm.gov.my/v1/index.php?r=column/ctheme&menu_id=L0pheU43NWJwRWVSZklWdzQ4TlhUUT09&bul_id=MDMxdHZjWTk1SjFzTzNkRXYzcVZjdz09
Diedenhofen, B., & Musch, J. (2015). Cocor: A comprehensive solution for the statistical comparison of correlations. PLoS One, 10(4), 1–12. https://doi.org/10.1371/journal.pone.0121945
Dowle, M., & Srinivasan, A. (2021). data.table: Extension of ‘data.frame’. Retrieved July 20, 2022, from https://cran.r-project.org/package=data.table
Duchaine, B. (2000). Developmental prosopagnosia with normal configural processing. NeuroReport, 11(1), 79–83. https://doi.org/10.1097/00001756-200001170-00016
Duchaine, B., & Nakayama, K. (2006). The Cambridge Face Memory Test: Results for neurologically intact individuals and an investigation of its validity using inverted face stimuli and prosopagnosic participants. Neuropsychologia, 44(4), 576–585. https://doi.org/10.1016/j.neuropsychologia.2005.07.001
Estudillo, A. J. (2021). Self-reported face recognition abilities for own and other-race faces. Journal of Criminal Psychology, 11(2), 105–115. https://doi.org/10.1108/JCP-06-2020-0025
Estudillo, A. J., & Bindemann, M. (2014). Generalization across view in face memory and face matching. I-Perception, 5(7), 589–601. https://doi.org/10.1068/i0669
Estudillo, A. J., & Wong, H. K. (2021). Associations between self-reported and objective face recognition abilities are only evident in above- and below-average recognisers. PeerJ, 9, 1–12. https://doi.org/10.7717/peerj.10629
Estudillo, A. J., Lee, J. K. W., Mennie, N., & Burns, E. (2020). No evidence of other-race effect for Chinese faces in Malaysian non-Chinese population. Applied Cognitive Psychology, 34(1), 270–276. https://doi.org/10.1002/acp.3609
Falissard, B. (2012). psy: Various procedures used in psychometry. Retrieved July 20, 2022, from https://cran.r-project.org/package=psy
Faul, F., Erdfelder, E., Buchner, A., & Lang, A. G. (2009). Statistical power analyses using G*Power 3.1: Tests for correlation and regression analyses. Behavior Research Methods, 41(4), 1149–1160. https://doi.org/10.3758/BRM.41.4.1149
Gamaldo, A. A., & Allaire, J. C. (2016). Daily Fluctuations in Everyday Cognition: Is It Meaningful? Journal of Aging and Health, 28(5), 834–849. https://doi.org/10.1177/0898264315611669
George, D., & Mallery, P. (2019). IBM SPSS Statistics 26 Step by Step. Routledge. https://doi.org/10.4324/9780429056765
Germine, L. T., Duchaine, B., & Nakayama, K. (2011). Where cognitive development and aging meet: Face learning ability peaks after age 30. Cognition, 118(2), 201–210. https://doi.org/10.1016/j.cognition.2010.11.002
Hittner, J. B., May, K., & Silver, N. C. (2003). A Monte Carlo evaluation of tests for comparing dependent correlations. Journal of General Psychology, 130(2), 149–168. https://doi.org/10.1080/00221300309601282
JASP Team. (2022). JASP (Version 0.16.3) [Computer software]. Retrieved January 14, 2022, from https://jasp-stats.org/
Kennerknecht, I., Grueter, T., Welling, B., Wentzek, S., Horst, J., Edwards, S., & Grueter, M. (2006). First report of prevalence of non-syndromic hereditary prosopagnosia (HPA). American Journal of Medical Genetics Part A, 140A(15), 1617–1622. https://doi.org/10.1002/ajmg.a.31343
Kennerknecht, I., Ho, N. Y., & Wong, V. C. N. (2008). Prevalence of hereditary prosopagnosia (HPA) in Hong Kong Chinese population. American Journal of Medical Genetics Part A, 146A(22), 2863–2870. https://doi.org/10.1002/ajmg.a.32552
Lakens, D. (2013). Calculating and reporting effect sizes to facilitate cumulative science: a practical primer for t-tests and ANOVAs. Frontiers in Psychology, 4, 863. https://doi.org/10.3389/fpsyg.2013.00863
McCaffrey, R. J., & Westervelt, H. J. (1995). Issues associated with repeated neuropsychological assessments. Neuropsychology Review, 5(3), 203–221. https://doi.org/10.1007/BF02214762
McKone, E., Hall, A., Pidcock, M., Palermo, R., Wilkinson, R. B., Rivolta, D., Yovel, G., Davis, J. M., & O’Connor, K. B. (2011). Face ethnicity and measurement reliability affect face recognition performance in developmental prosopagnosia: Evidence from the Cambridge Face Memory Test–Australian. Cognitive Neuropsychology, 28(2), 109–146. https://doi.org/10.1080/02643294.2011.616880
McKone, E., Stokes, S., Liu, J., Cohan, S., Fiorentini, C., Pidcock, M., Yovel, G., Broughton, M., & Pelleg, M. (2012). A robust method of measuring other-race and other-ethnicity effects: The Cambridge face memory test format. PLoS One, 7(10), e47956. https://doi.org/10.1371/journal.pone.0047956
McKone, E., Wan, L., Robbins, R., Crookes, K., & Liu, J. (2017). Diagnosing prosopagnosia in East Asian individuals: Norms for the Cambridge Face Memory Test–Chinese. Cognitive Neuropsychology, 34(5), 253–268. https://doi.org/10.1080/02643294.2017.1371682
Meissner, C. A., & Brigham, J. C. (2001). Thirty Years of Investigating the Own-Race Bias in Memory for Faces: A Meta-Analytic Review. Psychology, Public Policy, and Law, 7(1), 3–35. https://doi.org/10.1037/1076-8971.7.1.3
Murray, E., & Bate, S. (2020). Diagnosing developmental prosopagnosia: repeat assessment using the Cambridge Face Memory Test. Royal Society Open Science, 7(9), 200884. https://doi.org/10.1098/rsos.200884
Palermo, R., Rossion, B., Rhodes, G., Laguesse, R., Tez, T., Hall, B., Albonico, A., Malaspina, M., Daini, R., Irons, J., Al-Janabi, S., Taylor, L. C., Rivolta, D., & McKone, E. (2017). Do people have insight into their face recognition abilities? Quarterly Journal of Experimental Psychology, 70(2), 218–233. https://doi.org/10.1080/17470218.2016.1161058
R Core Team. (2021). R: A Language and Environment for Statistical Computing. R Foundation for Statistical Computing. Retrieved January 14, 2022, from https://www.r-project.org/
Rezlescu, C., Danaila, I., Miron, A., & Amariei, C. (2020). More time for science: Using Testable to create and share behavioral experiments faster, recruit better participants, and engage students in hands-on research. Progress in Brain Research, 253, 243–262.
Rossion, B. (2014). Understanding face perception by means of prosopagnosia and neuroimaging. Frontiers in Bioscience, 6(2), 706. https://doi.org/10.2741/e706
RStudio Team. (2021). RStudio: Integrated Development Environment for R. Retrieved January 14, 2022, from http://www.rstudio.com/
Russell, R., Duchaine, B., & Nakayama, K. (2009). Super-recognizers: People with extraordinary face recognition ability. Psychonomic Bulletin and Review, 16(2), 252–257. https://doi.org/10.3758/PBR.16.2.252
Saravanamuttu, J. (2010). Malaysia: Multicultural society, Islamic state, or what? State and Secularism: Perspectives from Asia, 279–300. https://doi.org/10.1142/9789814282383_0016
Shah, P., Gaule, A., Sowden, S., Bird, G., & Cook, R. (2015). The 20-item prosopagnosia index (PI20): A self-report instrument for identifying developmental prosopagnosia. Royal Society Open Science, 2(6), 1–11. https://doi.org/10.1098/rsos.140343
Shakeshaft, N. G., & Plomin, R. (2015). Genetic specificity of face recognition. Proceedings of the National Academy of Sciences of the United States of America, 112(41), 12887–12892. https://doi.org/10.1073/pnas.1421881112
Tanaka, J. W., Heptonstall, B., & Hagen, S. (2013). Perceptual expertise and the plasticity of other-race face recognition. Visual Cognition, 21(9–10), 1183–1201. https://doi.org/10.1080/13506285.2013.826315
Wan, L., Crookes, K., Reynolds, K. J., Irons, J. L., & McKone, E. (2015). A cultural setting where the other-race effect on face recognition has no social-motivational component and derives entirely from lifetime perceptual experience. Cognition, 144(0010), 91–115. https://doi.org/10.1016/j.cognition.2015.07.011
Wan, L., Crookes, K., Dawel, A., Pidcock, M., Hall, A., & McKone, E. (2017). Face-blind for other-race faces: Individual differences in other-race recognition impairments. Journal of Experimental Psychology: General, 146(1), 102–122. https://doi.org/10.1037/xge0000249
Wang, R., Li, J., Fang, H., Tian, M., & Liu, J. (2012). Individual differences in holistic processing predict face recognition ability. Psychological Science, 23(2), 169–177. https://doi.org/10.1177/0956797611420575
Wickham, H. (2007). Reshaping data with the reshape package. Journal of Statistical Software, 21(12), 1–20. Retrieved July 20, 2022, from http://www.jstatsoft.org/v21/i12/paper
Wickham, H. (2021). tidyr: Tidy Messy Data. Retrieved July 20, 2022, from https://cran.r-project.org/package=tidyr
Wickham, H., François, R., Henry, L., & Müller, K. (2021). dplyr: A Grammar of Data Manipulation. Retrieved July 20, 2022, from https://cran.r-project.org/package=dplyr
Wilmer, J. B. (2017). Individual Differences in Face Recognition: A Decade of Discovery. Current Directions in Psychological Science, 26(3), 225–230. https://doi.org/10.1177/0963721417710693
Wilmer, J. B., Germine, L., Chabris, C. F., Chatterjee, G., Williams, M., Loken, E., Nakayama, K., & Duchaine, B. (2010). Human face recognition ability is specific and highly heritable. Proceedings of the National Academy of Sciences of the United States of America, 107(11), 5238–5241. https://doi.org/10.1073/pnas.0913053107
Wong, H. K., Stephen, I. D., & Keeble, D. R. T. (2020). The Own-Race Bias for Face Recognition in a Multiracial Society. Frontiers in Psychology, 11, 208. https://doi.org/10.3389/fpsyg.2020.00208
Wong, H. K., Estudillo, A. J., Stephen, I. D., & Keeble, D. R. T. (2021). The other-race effect and holistic processing across racial groups. Scientific Reports, 11(1), 1–15. https://doi.org/10.1038/s41598-021-87933-1
Yardley, L., McDermott, L., Pisarski, S., Duchaine, B., & Nakayama, K. (2008). Psychosocial consequences of developmental prosopagnosia: A problem of recognition. Journal of Psychosomatic Research, 65(5), 445–451. https://doi.org/10.1016/j.jpsychores.2008.03.013
Open practices statements The data for all experiments and the CFMT-MY materials (stimuli and trial order) are available at https://osf.io/gu4fy/. None of the experiments was preregistered.
Publisher’s note Springer Nature remains neutral with regard to
jurisdictional claims in published maps and institutional affiliations.
Springer Nature or its licensor (e.g. a society or other partner) holds
exclusive rights to this article under a publishing agreement with the
author(s) or other rightsholder(s); author self-archiving of the accepted
manuscript version of this article is solely governed by the terms of
such publishing agreement and applicable law.