All content in this area was uploaded by Cansu Malak on Jan 02, 2025
The importance of external features for categorizing ethnicity: Can Koreans identify
Korean, Japanese, and Chinese faces?
Cansu Malak1, Christian Wallraven1,2*
1Department of Brain and Cognitive Engineering, Korea University, Seoul, South Korea
2Department of Artificial Intelligence, Korea University, Seoul, South Korea
* Corresponding author
E-mail: wallraven@korea.ac.kr
Cansu Malak https://orcid.org/0000-0001-6621-0068
Christian Wallraven https://orcid.org/0000-0002-2604-9115
Abstract
To what degree is it possible to recognize a person’s ethnicity just from their face? What
information do you need to do so? Here, we present the first investigation addressing these
questions, and in particular the role of internal and external facial features for fine-grained
ethnicity categorization among closely related East Asian ethnicities. Specifically, we tested
the ability of N=105 Korean participants to categorize male faces into Korean, Japanese, or
Chinese ethnicities - a categorization that according to participants’ own introspective
opinion should be easily doable for them. Participants were split into two groups: one group
(N=53) was shown cropped, grayscale images showing solely internal facial features, and the
other (N=52) was shown faces including external features, including hair and full face
outline. Our findings indicate categorization based on internal features only was barely above
chance level, whereas external features resulted in more reliable above-chance performance -
albeit at a low overall level of around 52%. Our study emphasizes the importance of external
features in the accurate categorization of ethnic backgrounds, and establishes an overall low
ability of Koreans to reliably distinguish between East Asian ethnicities.
Introduction
Face processing is a complex cognitive phenomenon that includes categorization of
various personal characteristics, including racial and ethnic backgrounds [1]. It is important
to clarify the distinction between "race" and "ethnicity" in this context, as the terms are often
used interchangeably but carry different meanings. While race is a socially constructed
concept often referring to broad categories based on physical traits, such as skin color (e.g.,
white, Black), ethnicity pertains to finer distinctions related to cultural, linguistic, and
ancestral factors [2] (see [3]).
Much of the existing research focuses on face recognition across broad racial groups,
particularly in the context of large-scale appearance differences like those between
Caucasians and Asians or Caucasians and Black people (e.g. [4]). However, there has been
less emphasis on fine-grained ethnic groups that share more subtle, distinct facial features.
This study aims to explore these finer distinctions, examining how Korean participants
differentiate between closely related East Asian ethnicities, such as Korean, Japanese, and
Chinese.
One critical aspect of race-related facial recognition research is the phenomenon
known as the 'cross-race effect' or 'other-race effect' (ORE). This effect is well-documented, with
individuals showing a tendency to more accurately recognize and match faces of their own
racial or ethnic group in comparison to those from different groups [5–7]. In contrast, the
other-race advantage (ORA) or other-race classification advantage (ORCA) refers to the
quicker categorization of other-race faces by race compared to own-race faces [4,8]. Levin
[9] and Valentine and Endo [8] highlight that outgroup faces are easier to classify by race
than in-group faces. Although the ORE is well documented, the ORA during the
categorization of other race faces has received less attention and has yielded inconsistent
findings. Research suggests that the accuracy of categorizing faces by gender [10] and
estimating age [11] tends to be higher when applied to faces of the same race as the observer.
However, Zhao and Bentin [4] showed that individuals categorize faces of a different race
more quickly and accurately than those of their own race, with no differences in gender or
age classification. The difference in results from O'Toole et al.'s [10] and Dehon and
Brédart's [11] studies can be explained by the methods employed in Zhao and Bentin's [4]
research, where all images were shown without any head or facial hair. Hair is a significant
factor in remembering a face's gender, race, and identity [12,13].
Expanding upon the understanding of the importance of facial features in recognition,
prior studies have emphasized the role of external facial attributes, such as hairstyles and
accessories, in recognizing unfamiliar faces. Those studies show that internal features are
crucial for identifying familiar faces, whereas external traits are highly significant when
dealing with unfamiliar faces [14,15]. Moreover, the utility of internal (eyes, nose, mouth)
and external (hair, face shape) facial features can differ depending on one's cultural
background. For instance, individuals from the Middle East demonstrate a cognitive
advantage in relation to internal features, potentially due to cultural practices like headscarves
[16,17]. Furthermore, Wong et al. [18] discovered that internal features are more successfully
processed for own-race faces, while external features are significant for other-race faces.
Supporting this, Sporer and Horry [7] demonstrated that the removal of external features had
a greater negative impact on the ability to recognize faces of individuals from outgroups
compared to faces of individuals from ingroups. This suggests that people rely more on
internal characteristics when recognizing faces of individuals from their own race. Similarly,
Havard [19] demonstrated that removing external features increased matching accuracy for
faces of their own “race” for Chinese and U.K. participants, whereas viewing whole faces, including
external features, was important for accurately matching other-race faces.
Nevertheless, the studies mentioned above are mainly concerned with tasks related to
face identification or face matching/sorting. However, the process of perceiving faces
involves more than just being able to recall them; it also involves rapid and often unconscious
categorization of unfamiliar faces by ethnicity, gender, and age. Zhao and Bentin [20]
examined the "other race advantage" (ORA) as well as the function of configural processing
and facial features in race categorization. While they acknowledged the utility of individual
features, their study demonstrated that overall face configuration plays a significant role in
race classification. Similarly, Bülthoff and colleagues [21] aimed to identify which facial
features (eyes, nose, mouth, face contour, and texture) are most effective in race perception,
finding that the eyes and facial texture (surface information) are the two most important
features in determining a face's race. Still, there is a limited number of studies examining the
impact of external and internal facial features on the "other race advantage" during an
ethnicity categorization task. While most research has concentrated on how these features
contribute to face identification and individuals' ability to accurately match faces, our study
aims to expand the literature by investigating the role of internal and external facial features
in processing ethnicity and cultural backgrounds.
Moreover, the traditional assumption in the literature that racial classifications, such
as white and black, exhibit uniformity across regions is not accurate. Research by Chiroro et
al. [22] found that both White and Black South Africans demonstrated a recognition
advantage for faces within their own ethnic group but not for faces from other geographic
regions. This suggests an ethnic-geographically specific bias, highlighting substantial
variations in facial and bodily features across continents and even within them. Such findings
demonstrate that racial classifications cannot be considered perceptually uniform. As a result,
a single "Asian" category is insufficient, and more detailed ethnic face recognition research is
required.
This study aims to address the gap in the existing literature by examining the
importance of internal and external facial features for fine-grained ethnicity categorization. In
this regard, the goal of our research is to respond to the following question from a Korean
participant's point of view: how do internal and external facial features affect how different
fine-grained ethnic groups—like Korean, Japanese, and Chinese—are perceived and
distinguished?
Method
Participants
To determine the appropriate sample size for this study, we conducted an a priori power
analysis using G*Power version 3.1.9.6 software [23] based on the partial eta squared effect
size (ηp² = 0.031) drawn from a previous study on facial recognition [19]. The power
analysis, with an alpha level of 0.05 and 80% power, determined that a minimum sample size
of N = 52 participants was required to detect significant interaction effects between ethnicity
(Korean, Japanese, Chinese) and image presentation type (cropped versus full-face).
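G*Power's ANOVA procedures take Cohen's f rather than partial eta squared as the effect-size input. The following is a minimal sketch of that conversion only; the function name is ours, and the sketch does not reproduce the power computation itself, which requires the noncentral F distribution.

```python
import math

def eta_sq_to_cohens_f(partial_eta_sq: float) -> float:
    """Convert partial eta squared to Cohen's f: f = sqrt(eta^2 / (1 - eta^2))."""
    if not 0.0 <= partial_eta_sq < 1.0:
        raise ValueError("partial eta squared must lie in [0, 1)")
    return math.sqrt(partial_eta_sq / (1.0 - partial_eta_sq))

# Effect size quoted in the text (partial eta squared = 0.031).
cohens_f = eta_sq_to_cohens_f(0.031)
print(round(cohens_f, 3))  # roughly 0.179, a small-to-medium effect
```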
To ensure a more robust statistical setup and allow for additional testing of individual
categorization accuracy levels against chance, we recruited a sample of twice the size,
totalling N=105 participants with an age range of 18-35 (58 female; Mage = 23.68; SD = 2.67)
for our experiment. This sample was randomly split into two groups: Experiment 1 used
N=53 participants (29 female; Mage = 24.17; SD = 2.92) and Experiment 2 used N=52
participants (29 female; Mage = 23.17; SD = 2.32). To avoid familiarity bias, participants
were selected solely based on their lack of interest in football through the Korea University
student portal and compensated around $6 for their time. Prior to the experiment, participants
were fully informed about the study's aims, procedures, and their right to withdraw at any
time without any consequence. Written informed consent was obtained from all participants
before the start of the experiment, and consent forms were securely stored. All procedures
were carried out in compliance with the WMA Declaration of Helsinki, and the study was
approved by the Institutional Review Board (IRB) of Korea University (KUIRB-2024-0104-01).
Materials and Stimuli
Dataset
The lack of large-scale fine-grained datasets [24,25] and the inability to access
datasets used in other studies [25,26] necessitated the creation of our own dataset.
Additionally, to ensure that participants in the human experiment had not been exposed to the
faces used, we were not able to utilize other existing datasets composed of celebrity
photographs, for example [27]. Therefore, our dataset comprises images of male football
players from the third or second leagues in Korea, China, and Japan. To avoid gender
confounding effects suggested by Wang et al. [26], which indicate that differences might arise
from mannerism and fashion, we exclusively scraped pictures of male players from the
publicly-accessible websites of sport clubs.
Our final dataset consists of 600 images, with 200 images from each ethnic group
(Chinese, Japanese, Korean). The images were histogram normalized to ensure equal contrast
range, converted to grayscale and edited for two separate experiments: the first experiment
used images cropped elliptically to exclude external features (hair, hairline, outline of the face
- see Fig 1A), whereas the second experiment used full-face images with external features
visible (Fig 1B). All images were edited using the GNU Image Manipulation Program [28].
Fig 1. Example Stimuli Used in Experiments.
(A) Trial setup for Experiment 1, where participants viewed cropped, grayscale images
focusing on internal facial features. (B) Trial setup for Experiment 2, where participants
viewed full-face grayscale images including both internal and external features. In both
experiments, participants were required to categorize the faces as Korean, Chinese, or
Japanese. For data protection reasons, the face shown here is that of one of the authors.
Confidence Questionnaires
We compiled two questionnaires to assess participants’ confidence in recognizing
Korean, Japanese, and Chinese faces - one pre-experiment, and one post-experiment. Each
questionnaire asked participants to indicate their level of agreement (1 = very strongly
disagree, 7 = very strongly agree) with three statements indicating recognition confidence for
Chinese faces and three indicating recognition confidence for Japanese faces. (1) I am
confident in distinguishing between Korean and Chinese/Japanese faces. (2) It is easy for me
to distinguish between Korean and Chinese/Japanese faces. (3) I often mistake a person's
nationality among Koreans and Chinese/Japanese in my daily life. The post-experiment
questionnaire used the same set of questions and added two extra questions, which were used
to determine if participants recognized any faces during the task.
Since the three questions queried similar introspective notions, we calculated
Cronbach’s α across them to assess consistency. The pre-experiment ratings formed a reliable
scale for the Korean versus Chinese items (Experiment 1, cropped: α = 0.730; Experiment 2, full:
α = 0.807) and for the Korean versus Japanese items (Experiment 1, cropped: α = 0.918;
Experiment 2, full: α = 0.823). The same held for the post-experiment ratings, for both the
Chinese items (Experiment 1, cropped: α = 0.654; Experiment 2, full: α = 0.749) and the
Japanese items (Experiment 1, cropped: α = 0.707; Experiment 2, full: α = 0.854).
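For readers unfamiliar with the consistency measure used above, the following sketch computes Cronbach's α for a three-item scale. The ratings are hypothetical and are not the study's data. The reverse-scoring of the negatively worded item (8 minus the rating on a 7-point scale) is our assumption; the text does not state how item 3 was handled.

```python
from statistics import variance

def cronbach_alpha(items):
    """items: list of k rating lists, one per item, all for the same n respondents."""
    k = len(items)
    if k < 2:
        raise ValueError("need at least two items")
    n = len(items[0])
    if any(len(col) != n for col in items):
        raise ValueError("all items need the same number of respondents")
    item_var_sum = sum(variance(col) for col in items)        # sum of item variances
    totals = [sum(col[i] for col in items) for i in range(n)]  # per-respondent sum score
    return k / (k - 1) * (1 - item_var_sum / variance(totals))

# Hypothetical 7-point ratings from five respondents (illustration only).
item1 = [6, 5, 7, 3, 4]                # "I am confident in distinguishing ..."
item2 = [6, 4, 7, 2, 5]                # "It is easy for me to distinguish ..."
item3_raw = [2, 3, 1, 5, 4]            # "I often mistake ..." (negatively worded)
item3 = [8 - r for r in item3_raw]     # assumed reverse-scoring

alpha = cronbach_alpha([item1, item2, item3])
print(round(alpha, 3))
```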
Procedure
Following the pre-experiment confidence questionnaire, participants completed practice
trials using a separate set of images to familiarize themselves with the task.
The primary experiment then began, in which participants were shown images for 200
milliseconds (ms) each, following a 1000 ms fixation period. The stimuli consisted of 600
images, with 200 images from each ethnic group (Chinese, Japanese, Korean) - images were
shown in fully randomized order. After viewing each image, participants were instructed to
categorize the faces as either Korean, Chinese, or Japanese by using the keyboard (1, 2, 3) -
see Fig 1.
After viewing all stimuli, participants completed the post-experiment assessment. The
entire procedure was exactly the same for both experiments with the only difference being the
editing style of the photographs. The total experiment lasted around 40 minutes on average.
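The trial structure described above can be sketched as a fully randomized trial list. This is an illustration under our own assumptions, not the authors' experiment code: the image identifiers, dictionary keys, and the `build_trials` helper are hypothetical.

```python
import random

ETHNICITIES = ("Korean", "Japanese", "Chinese")
IMAGES_PER_GROUP = 200
FIXATION_MS = 1000   # fixation period before each image
STIMULUS_MS = 200    # image presentation time

def build_trials(seed=None):
    """Return all 600 trials in fully randomized order."""
    rng = random.Random(seed)
    trials = [
        {"ethnicity": eth, "image_id": f"{eth.lower()}_{i:03d}",
         "fixation_ms": FIXATION_MS, "stimulus_ms": STIMULUS_MS}
        for eth in ETHNICITIES
        for i in range(IMAGES_PER_GROUP)
    ]
    rng.shuffle(trials)
    return trials

trials = build_trials(seed=1)

# Fixed display time only (fixation plus image), excluding responses.
fixed_minutes = len(trials) * (FIXATION_MS + STIMULUS_MS) / 60_000
print(len(trials), fixed_minutes)  # 600 12.0
```

The fixed display time is about 12 minutes, so most of the roughly 40-minute session is presumably taken up by responses, practice trials, and the questionnaires.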
Statistical Analyses
Behavioral data were analyzed using a series of statistical tests to determine the
effects of ethnicity and image presentation style on categorization accuracy. First, a two-way
mixed ANOVA was conducted on the dependent variable of categorization accuracy to assess
the main effects of ethnicity (within-participant factor with levels Korean, Japanese, Chinese)
and background (between-participant factor with levels cropped and full-face) as well as their
interaction. Post hoc analyses were conducted using Bonferroni corrections for multiple
comparisons. To further investigate the differences in categorization accuracy across the
ethnic groups, a Chi-Square Goodness of Fit Test was performed, and to further explore the
relationships between categorization accuracy and introspective questionnaire items, Pearson
correlation analyses were performed with Bonferroni correction applied to adjust for multiple
comparisons. All statistical analyses were conducted with open-source JASP software [29].
Study data are available on the Open Science Framework (OSF; https://osf.io/tyu2z/).
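As an illustration of the Bonferroni adjustment mentioned above, the sketch below multiplies each uncorrected p-value by the number of comparisons and caps the result at 1. The input p-values are hypothetical, and the exact adjustment applied by JASP for mixed designs may differ in detail.

```python
from math import comb

def bonferroni(p_values, n_comparisons=None):
    """Bonferroni-adjust p-values: p_adj = min(1, p * m)."""
    m = len(p_values) if n_comparisons is None else n_comparisons
    return [min(1.0, p * m) for p in p_values]

# A 2 (background) x 3 (ethnicity) design has 6 cells,
# hence 15 pairwise comparisons among them.
n_pairs = comb(2 * 3, 2)

# Hypothetical uncorrected p-values, for illustration only.
adjusted = bonferroni([0.0005, 0.02, 0.5], n_comparisons=n_pairs)
print(n_pairs, adjusted)
```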
Results
We first conducted one-sample t-tests against the chance level of 33.3% in each
condition with Bonferroni correction applied (see Table 1). For Experiment 1 and cropped
faces, accuracy was significantly above chance for Korean faces (t(52) = 3.12, p = .001,
d = 0.43), Japanese faces (t(52) = 5.41, p < .001, d = 0.74), and Chinese faces (t(52) = 5.56,
p < .001, d = 0.76). Likewise, for Experiment 2 and full-face images, categorization accuracy was
above chance for all conditions (Korean faces: t(51) = 7.76, p < .001, d = 1.08; Japanese
faces: t(51) = 14.38, p < .001, d = 1.99; and Chinese faces: t(51) = 8.74, p < .001, d = 1.21).
Importantly, Cohen's d values indicated larger effect sizes for full-face conditions, while
cropped conditions showed medium to small effect sizes.
Table 1. Descriptive Statistics.
Ethnicity   Background   N    Mean    Std. Deviation
Korean      Cropped      53   38.0%   11.1%
Korean      Full         52   46.2%   12.0%
Japanese    Cropped      53   40.8%   10.2%
Japanese    Full         52   61.5%   14.1%
Chinese     Cropped      53   41.4%   10.6%
Chinese     Full         52   49.4%   13.3%
Categorization accuracy across different ethnic groups (Korean, Japanese, Chinese) in the
experimental conditions: cropped and full-face images (mean values and standard
deviations).
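The one-sample tests against chance can be approximately recovered from the summary values in Table 1, since d = (M - chance) / SD and t = d * sqrt(N). The sketch below is our own check, not the authors' analysis script. Because Table 1 is rounded to one decimal place, the results differ slightly from the reported statistics.

```python
import math

CHANCE = 100.0 / 3.0  # three response options, in percent

# (mean %, SD %, N) copied from Table 1.
TABLE_1 = {
    ("Korean", "cropped"): (38.0, 11.1, 53),
    ("Japanese", "cropped"): (40.8, 10.2, 53),
    ("Chinese", "cropped"): (41.4, 10.6, 53),
    ("Korean", "full"): (46.2, 12.0, 52),
    ("Japanese", "full"): (61.5, 14.1, 52),
    ("Chinese", "full"): (49.4, 13.3, 52),
}

def one_sample_summary(mean, sd, n, mu0=CHANCE):
    """Return (Cohen's d, t statistic, degrees of freedom) from summary statistics."""
    d = (mean - mu0) / sd
    t = d * math.sqrt(n)
    return d, t, n - 1

results = {cell: one_sample_summary(*stats) for cell, stats in TABLE_1.items()}
for (ethnicity, background), (d, t, df) in results.items():
    print(f"{ethnicity:9s} {background:8s} t({df}) = {t:5.2f}, d = {d:4.2f}")
```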
Next, a two-way mixed ANOVA was performed to evaluate the effects of ethnicity
and background on categorization accuracy. The ANOVA results indicated a significant main
effect for ethnicity (F(2, 206) = 13.445, p < .001, partial η2= 0.115); a significant main effect
for background (F(1, 103) = 118.940, p < .001, partial η2= 0.536); and - most importantly - a
significant interaction between ethnicity and background (F(2, 206) = 8.410, p < .001, partial
η2= 0.075). Results of the Bonferroni-corrected Post Hoc comparisons are shown in Table 2
(see also Figure 2). Specifically, performance in Experiment 1 for cropped faces was similar
across all three ethnicities. However, there were distinct differences in Experiment 2: in
particular, accuracy for full Japanese faces was significantly higher than for both full Korean
faces, with a mean difference of 15.3% (p < .001), and full Chinese faces, with a mean
difference of 12.1% (p < .001), whereas Korean and Chinese faces showed similar
performance. These findings indicate that adding external features (full-face condition)
significantly improves classification accuracy for all three groups, and most strongly for
Japanese faces.
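The partial η² values reported for the ANOVA follow directly from the F statistics and their degrees of freedom, via partial η² = (F × df_effect) / (F × df_effect + df_error). The short sketch below, with a function name of our choosing, reproduces the three reported effect sizes.

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Partial eta squared recovered from an F statistic and its degrees of freedom."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)

# (F, df_effect, df_error) as reported in the text.
reported = {
    "ethnicity": (13.445, 2, 206),
    "background": (118.940, 1, 103),
    "ethnicity x background": (8.410, 2, 206),
}

effect_sizes = {name: partial_eta_squared(*vals) for name, vals in reported.items()}
for name, eta in effect_sizes.items():
    print(f"{name}: partial eta squared = {eta:.3f}")
```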
Table 2. Post Hoc Comparisons - Background * Ethnicity.
Comparison                             Mean Difference   SE     t         p.bonf
cropped, Korean - full, Korean         -8.2%             2.3%   -3.519    0.008**
cropped, Korean - cropped, Japanese    -2.8%             2.5%   -1.13     1.0
cropped, Korean - full, Japanese       -23.5%            2.3%   -10.07    < .001***
cropped, Korean - cropped, Chinese     -3.3%             2.5%   -1.335    1.0
cropped, Korean - full, Chinese        -11.3%            2.3%   -4.864    < .001***
full, Korean - cropped, Japanese       5.4%              2.3%   2.313     0.321
full, Korean - full, Japanese          -15.3%            2.5%   -6.082    < .001***
full, Korean - cropped, Chinese        4.9%              2.3%   2.094     0.556
full, Korean - full, Chinese           -3.1%             2.5%   -1.248    1.0
cropped, Japanese - full, Japanese     -20.7%            2.3%   -8.863    < .001***
cropped, Japanese - cropped, Chinese   -0.5%             2.5%   -0.205    1.0
cropped, Japanese - full, Chinese      -8.5%             2.3%   -3.658    0.005**
full, Japanese - cropped, Chinese      20.2%             2.3%   8.645     < .001***
full, Japanese - full, Chinese         12.1%             2.5%   4.833     < .001***
cropped, Chinese - full, Chinese       -8.0%             2.3%   -3.439    0.01*
Post hoc comparisons for categorization accuracy across different ethnic groups (Korean,
Japanese, Chinese) and image conditions (cropped vs. full-face). The mean difference,
standard error (SE), t-values, and Bonferroni-adjusted p-values (p.bonf) are reported.
Significant differences are marked with asterisks, indicating different levels of statistical
significance (* p < .05, ** p < .01, *** p < .001).
Figure 2. Mean Accuracy and Standard Deviation by Ethnicity and Image
Background. Error bars represent the standard error of the mean.
The confusion matrices (Fig. 3) indicate significant misclassification in the cropped
condition (Experiment 1). Korean faces were frequently misidentified as Japanese (32.83%),
while Japanese faces were often classified as Chinese (35.13%). In contrast, the full-face
condition (Experiment 2) demonstrates a significant improvement in accuracy, particularly
for Japanese faces, and a decrease in misclassifications. In this condition, fewer Korean faces
were categorized as Japanese (23.51%), and fewer Japanese faces were classified as Chinese
(26.42%). This decline emphasizes the importance of external features in improving the
accuracy of ethnic categorization.
Figure 3. Confusion Matrix during Ethnicity Categorization Task - Experiment 1,
cropped (left) and Experiment 2, full-face (right). Each cell shows the number and
percentage of times a particular ethnicity was predicted versus the true ethnicity label.
Higher values along the diagonal indicate better categorization accuracy, while
off-diagonal values represent misclassifications.
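A confusion matrix of this kind can be built by counting (true ethnicity, response) pairs and normalizing each row by the number of trials for that true ethnicity. The sketch below uses a tiny hypothetical response set rather than the study's data, and it assumes that the percentages in Fig. 3 are row-normalized in this way.

```python
from collections import Counter

LABELS = ("Korean", "Japanese", "Chinese")

def confusion_percentages(true_labels, responses, labels=LABELS):
    """Row-normalized confusion matrix: matrix[true][response] in percent."""
    counts = Counter(zip(true_labels, responses))
    matrix = {}
    for true in labels:
        row_total = sum(counts[(true, resp)] for resp in labels)
        matrix[true] = {
            resp: (100.0 * counts[(true, resp)] / row_total) if row_total else 0.0
            for resp in labels
        }
    return matrix

# Hypothetical toy data: four trials per true ethnicity.
true_labels = ["Korean"] * 4 + ["Japanese"] * 4 + ["Chinese"] * 4
responses = [
    "Korean", "Korean", "Japanese", "Chinese",      # responses to Korean faces
    "Japanese", "Japanese", "Japanese", "Chinese",  # responses to Japanese faces
    "Chinese", "Chinese", "Korean", "Japanese",     # responses to Chinese faces
]

matrix = confusion_percentages(true_labels, responses)
for true in LABELS:
    print(true, matrix[true])
```

The diagonal entries (for example `matrix["Korean"]["Korean"]`) correspond to per-ethnicity accuracy, and each row sums to 100%.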
A two-way mixed ANOVA was conducted to compare pre-experiment and
post-experiment confidence scores for distinguishing Korean versus Chinese or Japanese
faces in cropped and full image conditions. The results showed significant main effects of
time (F(1, 103) = 129.75, p < .001, partial η² = 0.56) and of ethnicity (F(1, 103) = 34.33, p <
.001, partial η² = 0.25). The interaction between time and ethnicity was significant (F(1, 103)
= 17.67, p < .001, partial η² = 0.15), suggesting that the decline in confidence from pre-test to
post-test differed based on ethnicity, being more pronounced for Chinese faces. Additionally,
there was a significant interaction between ethnicity and background (F(1, 103) = 10.75, p =
.001, partial η² = 0.09), indicating that the effect of ethnicity on confidence differed
depending on the background condition (cropped or full). However, neither the time x background
interaction (F(1, 103) = 1.72, p = .193, partial η² = 0.02) nor the
three-way interaction between time, ethnicity, and background (F(1, 103) = 1.77, p = .186,
partial η² = 0.02) was significant. Post hoc tests revealed that confidence scores were significantly higher in
the pre-test (M = 1.21, SE = 0.10) than in the post-test, t(103) = 11.39, p < .001. Additionally,
confidence for Japanese faces (M = 0.41, SE = 0.07) was significantly higher than for
Chinese faces, t(103) = -5.86, p < .001. In the interaction between background and ethnicity,
confidence was significantly higher for full Japanese faces compared to cropped Japanese
faces, t(103) = -6.43, p < .001.
Since the questionnaire items asked for distinguishing Korean versus other faces,
comparing the introspective confidence values for this task with the categorization accuracy
results directly may not fully capture their relationship. Nonetheless, we performed Pearson
correlations of the average categorization performance for Chinese and Japanese faces with
the respective questionnaire items. For Chinese faces, a significant positive correlation was
found between categorization accuracy and post-experiment confidence scores for the
full-image condition (r = 0.574, p < .001), suggesting that participants who performed better
in categorizing Chinese faces also reported higher confidence after the experiment.
Additionally, there was a significant positive correlation between the post-experiment
confidence for Japanese faces in the full-image condition and categorization accuracy (r =
0.556, p < .001). This finding indicates that confidence was generally aligned with actual
performance accuracy for Japanese faces in the full-face condition, providing insight into the
relationship between participants' confidence and their categorization success.
Moreover, Pearson correlations were computed across images to examine the relationship between
correct selections for full and cropped images. We found a moderate, positive correlation
between correct selections for full-face images and cropped-face images for each ethnicity.
Specifically, for Korean faces, there was a significant positive correlation, r(199) = 0.52, p <
.001, indicating that Korean face images categorized correctly more often in the full-face
condition also tended to be categorized correctly more often in the cropped-face condition. A similar
positive correlation was found for Japanese faces, r(199) = 0.41, p < .001. Chinese faces also showed a positive
correlation, r(199) = 0.32, p < .001 (see Figure 4).
Figure 4. Item Based Analysis for Each Image: Full-face and Cropped-face Correct
Selections for Korean, Japanese, and Chinese Faces. The scatter plot illustrates the
relationship between correct selections for full-face and cropped-face images across
three ethnic groups: Korean, Chinese, and Japanese. Each dot represents an individual
data point, color-coded by ethnicity. Separate regression lines are fitted for each
ethnicity group, showing a positive correlation between the two variables. The R² values
for each group are as follows: Korean (R² = 0.27), Chinese (R² = 0.10), and Japanese (R²
= 0.17), indicating the strength of the correlation within each ethnic group.
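For a simple linear regression, R² equals the squared Pearson correlation, so the R² values in the Figure 4 caption can be checked against the reported r values. The sketch below does this and also shows a plain Pearson implementation applied to hypothetical per-image counts, which are not the study's data.

```python
import math

def pearson_r(x, y):
    """Pearson correlation between two equally long sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

# Correlations reported in the text; squaring gives the R² of the fitted lines.
REPORTED_R = {"Korean": 0.52, "Japanese": 0.41, "Chinese": 0.32}
r_squared = {group: round(r * r, 2) for group, r in REPORTED_R.items()}
print(r_squared)  # {'Korean': 0.27, 'Japanese': 0.17, 'Chinese': 0.1}

# Hypothetical per-image counts of correct responses (illustration only).
full_correct = [10, 20, 30, 40, 25]
cropped_correct = [12, 18, 33, 37, 20]
print(round(pearson_r(full_correct, cropped_correct), 2))
```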
Discussion
Recognizing subtle differences between faces is something we do every day. But how
accurate are we at identifying someone's ethnicity based on facial features alone? This study
presents a novel investigation into the importance of internal and external facial features
within closely related East Asian groups (Korean, Japanese, and Chinese), using a
fine-grained ethnicity categorization task.
First, our findings showed significant differences in categorization accuracy
depending on both the ethnicity of the faces and the presentation style of the images (cropped
versus full-face). While it may be tempting to assume that internal features are essential to
ethnic categorization, our findings tell a different story. Accuracy based on eyes, nose, and
mouth alone barely exceeded chance level. Although the inclusion of external features like
hair and face shape did improve performance, with average accuracy rising from 40% to
52%, performance was still at an overall low level. This suggests that even with a full facial
view, distinguishing these closely related East Asian ethnicities remains a significant
challenge, highlighting the subtle nature of the visual cues involved. Interestingly, a deeper
analysis of the full-face condition revealed significant differences in classification accuracy
between Japanese faces and the other two groups. Accuracy in categorizing Japanese faces
was substantially higher than for both Korean and Chinese faces, with a mean difference
exceeding 12% in both cases. This suggests that external features associated with Japanese
faces, such as hairstyle or face shape, may provide more distinct visual cues than those
related to Korean and Chinese faces, contributing to greater accuracy in ethnic categorization.
The dependence on external features observed in our study aligns with findings across
various face recognition studies. Studies by Meissner and Brigham [6] and Wong et al. [18]
have demonstrated the importance of external features in tasks involving unfamiliar or
other-race faces, particularly in face-matching and recognition paradigms. Wong et al. [18]
further noted a distinction in feature processing: while own-race faces are processed more
efficiently based on internal features, other-race face recognition appears to rely
predominantly on external features. This pattern aligns with the reliance on external cues
observed in our Korean participants when categorizing Chinese and Japanese faces. This
reliance on external features is supported by the work of Sporer and Horry [7], which
examined the influence of internal and external features on face recognition. Similarly,
Havard [19] highlighted the reliance on cues such as hair and face shape in the recognition of
Asian faces by U.K. participants. Furthermore, MacLin and Malpass [30] found that when the
majority of faces were racially ambiguous, hair did actually function as a distinguishing
characteristic of race. This further supports our assertion that external features, such as face
outline and / or hairstyle, are not just beneficial but crucial in the task of fine-grained ethnic
categorization. This finding further aligns with prior research, such as studies by Ellis et al.
[14] and Young et al. [15], which emphasized the importance of external features in
recognizing unfamiliar faces.
Our study next explored the relationship between participants' confidence in
distinguishing between these ethnicities and their actual performance in the categorization
task. Participants entered the task with high pre-test confidence, reflecting optimistic
assumptions about their ability to differentiate between the faces. The significant drop in
post-test confidence highlights the misalignment between these initial assumptions and the
actual complexity of the task. Interestingly, while pre-test confidence did not predict
performance, post-test confidence aligned significantly with categorization accuracy for both
Chinese and Japanese faces in the full-image condition. This suggests that participants
adjusted their introspective assessments based on their experience with the task, indicating
that such introspection may serve as a learning mechanism when fine-grained visual
distinctions are required. For Japanese faces, confidence remained higher throughout the task,
likely reflecting the salience of distinct external features. This shows that the presence of
these visually distinct cues influenced some of the participants' introspective judgments. In
contrast, the sharp drop in confidence for Chinese faces highlights the difficulty of
identifying groups with subtle cues, suggesting a dependence on assumptions that proved
ineffective. Overall, while initial introspective assumptions were overconfident, post-task
introspections became more aligned with actual performance, mainly when external features
were available. This demonstrates the dynamic interaction between perceived ability, actual
performance, and the availability of visual information in ethnic categorization.
Finally, our item-based analyses offer further support for the consistency of our
findings. Faces correctly categorized in the full-face condition tended to also be more
accurately classified in the cropped-face condition. Altogether, while our findings contribute
to the ongoing discussion around the role of external features in face recognition, they also
highlight the distinct challenges posed by fine-grained ethnic categorization tasks. Unlike
traditional ORE studies (see [6,14]), which often focus on tasks involving other races (such as
face matching or yes/no tasks), our study directly addresses race categorization, as seen in
Zhao and Bentin [4]. Also, our focus shifts toward the specific role of internal and external
facial features in fine-grained ethnic recognition. The findings of Zhao and Bentin [4]
demonstrated that individuals categorize faces of a different race more quickly and accurately
than those of their own race, even in the absence of external cues such as hair. While their
study focused on larger racial groups such as Chinese and Caucasian, our findings for
fine-grained ethnicity categorization suggest that even though quick categorization may occur
without external features, accurate identification relies heavily on these external cues,
particularly in fine-grained ethnic recognition tasks. This finding is further contextualized
when comparing our results to other studies that examined broader racial distinctions. For
example, Zhao and Bentin’s [4] study, which investigated broader racial distinctions between
Chinese and Caucasian faces, reported moderate to large effects for the other-race advantage
(ORA) in categorization tasks (partial η² = 0.26 to 0.33). This highlights the efficiency with
which participants categorize other-race faces, reflecting the perceptual salience of broad
racial categories. In contrast, our study, which focuses on the finer distinctions between East
Asian ethnicities (Korean, Japanese, and Chinese), showed a smaller effect of ethnicity on
categorization accuracy (partial η² = 0.115), suggesting that fine-grained ethnic recognition is
a more cognitively demanding task. Although external features, such as face outline and
hairstyle, improved accuracy (partial η² = 0.536), absolute performance remained lower
than in broader racial categorization tasks. Thus, while external features seem crucial
for fine-grained ethnic recognition, the smaller effect of ethnicity and the low absolute
accuracy in our study suggest that distinguishing between closely related ethnicities remains
challenging, even with the availability of external cues.
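For readers less familiar with the effect-size measure used in this comparison, the following is a minimal sketch of how partial η² is defined and how it maps onto Cohen's f. It is not the analysis code used in this study; the function names are hypothetical and chosen for illustration only, and the only values taken from the text are the four partial η² figures quoted above.

```python
import math


def partial_eta_squared(ss_effect: float, ss_error: float) -> float:
    """Partial eta squared: SS_effect / (SS_effect + SS_error)."""
    total = ss_effect + ss_error
    if ss_effect < 0 or ss_error < 0 or total == 0:
        raise ValueError("sums of squares must be non-negative and not both zero")
    return ss_effect / total


def partial_eta_squared_from_f(f_value: float, df_effect: float, df_error: float) -> float:
    """Equivalent form based on the F statistic and its degrees of freedom."""
    numerator = f_value * df_effect
    return numerator / (numerator + df_error)


def cohens_f(eta_p2: float) -> float:
    """Cohen's f = sqrt(eta_p2 / (1 - eta_p2)); defined for 0 <= eta_p2 < 1."""
    if not 0 <= eta_p2 < 1:
        raise ValueError("partial eta squared must lie in [0, 1)")
    return math.sqrt(eta_p2 / (1 - eta_p2))


# Partial eta squared values quoted in the discussion above (labels are illustrative).
reported = {
    "ORA, lower bound [4]": 0.26,
    "ORA, upper bound [4]": 0.33,
    "ethnicity (this study)": 0.115,
    "external features (this study)": 0.536,
}

for label, value in reported.items():
    print(f"{label}: partial eta^2 = {value:.3f}, Cohen's f = {cohens_f(value):.2f}")
```

Because partial η² depends on the error term of the specific design, values from different experiments are only roughly comparable, so the comparison above is best read as indicative rather than exact.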
Our study has several limitations that will need to be addressed in future research.
First, our participant sample was drawn from an easily accessible student population in
Korea; generalizing our results to other age ranges, as well as to Chinese and
Japanese participants, will be an important next step. Another limitation of the study involves
the use of static photos of male football players, which may not accurately reflect the
dynamic and context-dependent character of ethnicity categorization in real-world scenarios.
Future studies should incorporate female stimuli and extend the paradigm to videos or real-life
interactions to examine how these factors influence the recognition of ethnic and cultural
background. Finally, it would be interesting to explore whether similar patterns are observed
in other parts of the world with different ethnic compositions.
Acknowledgments
This study was supported by the National Research Foundation of Korea under
project BK21 FOUR and grants NRF-2022R1A2C2092118, NRF-2022R1H1A2092007, as
well as by Institute of Information & Communications Technology Planning & Evaluation
(IITP) grants funded by the Korea government (No. RS-2019-II190079, Department of
Artificial Intelligence, Korea University; No. RS-2021-II212068, Artificial Intelligence
Innovation Hub).
References
1. Wong HK, Keeble DRT, Stephen ID. Do they ‘look’ different(ly)? Dynamic face
recognition in Malaysians: Chinese, Malays and Indians compared. Br J Psychol. 2023
May;114(S1):134–49.
2. APA Dictionary of Psychology [Internet]. [cited 2024 Oct 16]. Available from:
https://dictionary.apa.org/
3. Suyemoto KL, Curley M, Mukkamala S. What Do We Mean by “Ethnicity” and “Race”?
A Consensual Qualitative Research Investigation of Colloquial Understandings.
Genealogy. 2020 Sep;4(3):81.
4. Zhao L, Bentin S. Own- and other-race categorization of faces by race, gender, and age.
Psychon Bull Rev. 2008;15(6):1093–9.
5. Malpass RS, Kravitz J. Recognition for faces of own and other race. J Pers Soc Psychol.
1969;13(4):330–4.
6. Meissner CA, Brigham JC. Thirty years of investigating the own-race bias in memory for
faces: A meta-analytic review. Psychol Public Policy Law. 2001;7(1):3–35.
7. Sporer SL, Trinkl B, Guberova E. Matching faces: Differences in processing speed of
out-group faces by different ethnic groups. J Cross-Cult Psychol. 2007;38(4):398–412.
8. Valentine T, Endo M. Towards an exemplar model of face processing: the effects of race
and distinctiveness. Q J Exp Psychol A. 1992 May;44(4):671–703.
9. Levin DT. Classifying faces by race: The structure of face categories. J Exp Psychol
Learn Mem Cogn. 1996;22(6):1364–82.
10. O’Toole AJ, Peterson J, Deffenbacher KA. An “other-race effect” for categorizing faces
by sex. Perception. 1996;25(6):669–76.
11. Dehon H, Brédart S. An “other-race” effect in age estimation from faces. Perception.
2001;30(9):1107–13.
12. Wright DB, Sladden B. An own gender bias and the importance of hair in face
recognition. Acta Psychol (Amst). 2003 Sep;114(1):101–14.
13. MacLin OH, Malpass RS. The ambiguous-race face illusion. Perception.
2003;32(2):249–52.
14. Ellis HD, Shepherd JW, Davies GM. Identification of familiar and unfamiliar faces from
internal and external features: Some implications for theories of face recognition.
Perception. 1979;8(4):431–9.
15. Young AW, Hay DC, McWeeny KH, Flude BM, Ellis AW. Matching familiar and
unfamiliar faces on internal and external features. Perception. 1985;14(6):737–46.
16. Megreya AM, Bindemann M. Revisiting the processing of internal and external features
of unfamiliar faces: The headscarf effect. Perception. 2009;38(12):1831–48.
17. Wang Y, Thomas J, Weissgerber SC, Kazemini S, Ul-Haq I, Quadflieg S. The headscarf
effect revisited: Further evidence for a culture-based internal face processing advantage.
Perception. 2015;44(3):328–36.
18. Wong HK, Stephen ID, Keeble DRT. The Own-Race Bias for Face Recognition in a
Multiracial Society. Front Psychol [Internet]. 2020 Mar 6 [cited 2024 Oct 16];11.
Available from:
https://www.frontiersin.org/journals/psychology/articles/10.3389/fpsyg.2020.00208/full
19. Havard C. The Importance of Internal and External Features in Matching Own and Other
Race Faces. Perception. 2021 Oct 1;50(10):861–75.
20. Zhao L, Bentin S. The role of features and configural processing in face-race
classification. Vision Res. 2011 Dec 8;51(23–24):2462–70.
21. Bülthoff I, Jung W, Armann RGM, Wallraven C. Predominance of eyes and surface
information for face race categorization. Sci Rep. 2021 Jan 21;11(1):1927.
22. Chiroro P, Valentine T. An Investigation of the Contact Hypothesis of the Own-race Bias
in Face Recognition. Q J Exp Psychol Sect A. 1995 Nov 1;48(4):879–94.
23. Faul F, Erdfelder E, Lang AG, Buchner A. G*Power 3: a flexible statistical power
analysis program for the social, behavioral, and biomedical sciences. Behav Res
Methods. 2007 May;39(2):175–91.
24. Fu S, He H, Hou ZG. Learning Race from Face: A Survey. IEEE Trans Pattern Anal
Mach Intell. 2014 Dec;36(12):2483–509.
25. Ng W, Zhou Z, Wang T. Fine-Grained Facial Ethnicity Recognition Based on Dual
Convolutional Autoencoders. 2021. 235 p.
26. Wang Y, Liao H, Feng Y, Xu X, Luo J. Do They All Look the Same? Deciphering
Chinese, Japanese and Koreans by Fine-Grained Deep Learning [Internet]. arXiv; 2016
[cited 2024 Oct 16]. Available from: http://arxiv.org/abs/1610.01854
27. Guo Y, Zhang L, Hu Y, He X, Gao J. MS-Celeb-1M: A Dataset and Benchmark for
Large-Scale Face Recognition. In: Leibe B, Matas J, Sebe N, Welling M, editors.
Computer Vision – ECCV 2016. Cham: Springer International Publishing; 2016. p.
87–102.
28. GIMP - Downloads [Internet]. [cited 2024 Oct 16]. Available from:
https://www.gimp.org/downloads/
29. Love J, Selker R, Marsman M, Jamil T, Dropmann D, Verhagen J, et al. JASP: Graphical
Statistical Software for Common Statistical Designs. J Stat Softw. 2019 Jan 29;88:1–17.
30. MacLin OH, Malpass RS. Racial categorization of faces: The ambiguous race face effect.
Psychol Public Policy Law. 2001;7(1):98–118.