Psychological Science in the Public Interest
2019, Vol. 20(1) 1–68
© The Author(s) 2019
Article reuse guidelines: sagepub.com/journals-permissions
DOI: 10.1177/1529100619832930
www.psychologicalscience.org/PSPI
Association for Psychological Science
Emotional Expressions Reconsidered:
Challenges to Inferring Emotion From
Human Facial Movements
Lisa Feldman Barrett1,2,3, Ralph Adolphs4, Stacy Marsella1,5,6,
Aleix M. Martinez7, and Seth D. Pollak8
1Department of Psychology, Northeastern University; 2Department of Psychiatry, Massachusetts General Hospital,
Boston, Massachusetts; 3Athinoula A. Martinos Center for Biomedical Imaging, Massachusetts General Hospital,
Boston, Massachusetts; 4Division of Humanities and Social Sciences, California Institute of Technology; 5College
of Computer and Information Science, Northeastern University; 6Institute of Neuroscience & Psychology,
University of Glasgow; 7Department of Electrical and Computer Engineering and Center for Cognitive and Brain
Sciences, The Ohio State University; and 8Department of Psychology, University of Wisconsin–Madison
Abstract
It is commonly assumed that a person’s emotional state can be readily inferred from his or her facial movements,
typically called emotional expressions or facial expressions. This assumption influences legal judgments, policy decisions,
national security protocols, and educational practices; guides the diagnosis and treatment of psychiatric illness, as well
as the development of commercial applications; and pervades everyday social interactions as well as research in other
scientific fields such as artificial intelligence, neuroscience, and computer vision. In this article, we survey examples
of this widespread assumption, which we refer to as the common view, and we then examine the scientific evidence
that tests this view, focusing on the six most popular emotion categories used by consumers of emotion research: anger,
disgust, fear, happiness, sadness, and surprise. The available scientific evidence suggests that people do sometimes
smile when happy, frown when sad, scowl when angry, and so on, as proposed by the common view, more than what
would be expected by chance. Yet how people communicate anger, disgust, fear, happiness, sadness, and surprise
varies substantially across cultures, situations, and even across people within a single situation. Furthermore, similar
configurations of facial movements variably express instances of more than one emotion category. In fact, a given
configuration of facial movements, such as a scowl, often communicates something other than an emotional state.
Scientists agree that facial movements convey a range of information and are important for social communication,
emotional or otherwise. But our review suggests an urgent need for research that examines how people actually move
their faces to express emotions and other social information in the variety of contexts that make up everyday life, as
well as careful study of the mechanisms by which people perceive instances of emotion in one another. We make
specific research recommendations that will yield a more valid picture of how people move their faces to express
emotions and how they infer emotional meaning from facial movements in situations of everyday life. This research is
crucial to provide consumers of emotion research with the translational information they require.
Keywords
emotion perception, emotional expression, emotion recognition
Corresponding Author:
Lisa Feldman Barrett, 125 Nightingale Hall, Northeastern University, Boston, MA 02115
E-mail: l.barrett@northeastern.edu

Faces are a ubiquitous part of everyday life for humans. People greet each other with smiles or nods. They have face-to-face conversations on a daily basis, whether in person or via computers. They capture faces with smartphones and tablets, exchanging photos of themselves and of each other on Instagram, Snapchat, and other social-media platforms. The ability to perceive faces is one of the first capacities to emerge after birth: An infant begins to perceive faces within the first few days
of life, equipped with a preference for face-like arrange-
ments that allows the brain to wire itself, with experi-
ence, to become expert at perceiving faces (Arcaro,
Schade, Vincent, Ponce, & Livingstone, 2017; Cassia,
Turati, & Simion, 2004; Gandhi, Singh, Swami, Ganesh,
& Sinha, 2017; Grossmann, 2015; L. B. Smith, Jayaraman,
Clerkin, & Yu, 2018; Turati, 2004; but see Young and
Burton, 2018, for a more qualified claim). Faces offer
a rich, salient source of information for navigating the
social world: They play a role in deciding whom to
love, whom to trust, whom to help, and who is found
guilty of a crime (Todorov, 2017; Zebrowitz, 1997, 2017;
Zhang, Chen, & Yang, 2018). Beginning with the ancient
Greeks (Aristotle, in the 4th century BCE) and Romans
(Cicero), various cultures have viewed the human face
as a window on the mind. But to what extent can a
raised eyebrow, a curled lip, or a narrowed eye reveal
what someone is thinking or feeling, allowing a per-
ceiver’s brain to guess what that someone will do next?1
The answers to these questions have major conse-
quences for human outcomes as they unfold in the
living room, the classroom, the courtroom, and even
on the battlefield. They also powerfully shape the direc-
tion of research in a broad array of scientific fields,
from basic neuroscience to psychiatry.
Understanding what facial movements might reveal
about a person’s emotions is made more urgent by the
fact that many people believe they already know. Spe-
cific configurations of facial-muscle movements2
appear as if they summarily broadcast or display a
person’s emotions, which is why they are routinely
referred to as emotional expressions and facial
expressions. A simple Google search for the phrase
“emotional facial expressions” (see Box 1 in the Supple-
mental Material available online) reveals the ubiquity
with which, at least in certain parts of the world, people
believe that certain emotion categories are reliably sig-
naled or revealed by certain facial-muscle movement
configurations—a set of beliefs we refer to as the common
view (also called the classical view; L. F. Barrett, 2017b).
Likewise, many cultural products testify to the common
view. Here are several examples:
Technology companies are investing tremendous
resources to figure out how to objectively “read”
emotions in people by detecting their presumed
facial expressions, such as scowling faces, frown-
ing faces, and smiling faces, in an automated fash-
ion. Several companies claim to have already
done it (e.g., Affectiva.com, 2018; Microsoft Azure,
2018). For example, Microsoft’s Emotion API
promises to take video images of a person’s face
to detect what that individual is feeling. Micro-
soft’s website states that its software “integrates
emotion recognition, returning the confidence
across a set of emotions . . . such as anger, con-
tempt, disgust, fear, happiness, neutral, sadness,
and surprise. These emotions are understood to
be cross-culturally and universally communicated
with particular facial expressions” (screen 3).
Countless electronic messages are annotated with
emojis or emoticons that are schematized ver-
sions of the proposed facial expressions for vari-
ous emotion categories (Emojipedia.org, 2019).
Putative emotional expressions are taught to pre-
school children by displaying scowling faces,
frowning faces, smiling faces, and so on, in post-
ers (e.g., use “feeling chart for children” in a
Google image search), games (e.g., Miniland emo-
tion games; Miniland Group, 2019), books (e.g.,
Cain, 2000; T. Parr, 2005), and episodes of Sesame
Street (among many examples, see Morenoff,
2014; Pliskin, 2015; Valentine & Lehmann, 2015).3
Television shows (e.g., Lie to Me; Baum & Grazer,
2009), movies (e.g., Inside Out; Docter, Del Carmen,
LeFauve, Cooley, and Lasseter, 2015), and docu-
mentaries (e.g., The Human Face, produced by the
British Broadcasting Corporation; Cleese, Erskine, &
Stewart, 2001) customarily depict certain facial
configurations as universal expressions of
emotions.
Magazine and newspaper articles routinely fea-
ture stories in kind: Facial configurations depict-
ing a scowl are referred to as “expressions of
anger,” facial configurations depicting a smile are
referred to as “expressions of happiness,” facial
configurations depicting a frown are referred to
as “expressions of sadness,” and so on.
Agents of the U.S. Federal Bureau of Investigation
(FBI) and the Transportation Security Administra-
tion (TSA) were trained to detect emotions and
other intentions using these facial configurations,
with the goal of identifying and thwarting terror-
ists (R. Heilig, special agent with the FBI, personal
communication, December 15, 2014; L. F. Barrett,
2017c).4
The facial configurations that supposedly diagnose
emotional states also figure prominently in the
diagnosis and treatment of psychiatric disorders.
One of the most widely used tasks in autism
research, the Reading the Mind in the Eyes Test,
asks test takers to match photos of the upper (eye)
region of a posed facial configuration with specific
mental state words, including emotion words
(Baron-Cohen, Wheelwright, Hill, Raste, & Plumb,
2001). Treatment plans for people living with
autism and other brain disorders often include
learning to recognize these facial configurations
as emotional expressions (Baron-Cohen, Golan,
Wheelwright, & Hill, 2004; Kouo & Egel, 2016).
This training does not generalize well to real-
world skills, however (Berggren et al., 2018; Kouo
& Egel, 2016).
“Reading” the emotions of a defendant—in the
words of Supreme Court Justice Anthony Kennedy,
to “know the heart and mind of the offender”
(Riggins v. Nevada, 1992, p. 142)—is one pillar of
a fair trial in the U.S. legal system and in many
legal systems in the Western world. Legal actors
such as jurors and judges routinely rely on facial
movements to determine the guilt and remorse
of a defendant (e.g., Bandes, 2014; Zebrowitz,
1997). For example, defendants who are per-
ceived as untrustworthy receive harsher sen-
tences than they otherwise would (J. P. Wilson &
Rule, 2015, 2016), and such perceptions are more
likely when a person appears to be angry (i.e.,
the person’s facial structure looks similar to the
hypothesized facial expression of anger, which is
a scowl; Todorov, 2017). An incorrect inference
about defendants’ emotional state can cost them
their children, their freedom, or even their lives
(for recent examples, see L. F. Barrett, 2017b,
beginning on page 183).
But can a person’s emotional state be reasonably
inferred from that person’s facial movements? In this
article, we offer a systematic review of the evidence,
testing the common view that instances of an emotion
category are signaled with a distinctive configuration
of facial movements that has enough reliability and
specificity to serve as a diagnostic marker of those
instances. We focus our review on evidence pertaining
to six emotion categories that have received the lion’s
share of attention in scientific research—anger, disgust,
fear, happiness, sadness, and surprise—and that, cor-
respondingly, are the focus of the common view (as
evidenced by our Google search, summarized in Box
1 in the Supplemental Material). Our conclusions apply,
however, to all emotion categories that have thus far
been scientifically studied. We open the article with a
brief discussion of its scope, approach, and intended
audience. We then summarize evidence on how people
actually move their faces during episodes of emotion,
referred to as studies of expression production, fol-
lowing which we examine evidence on which emotions
are actually inferred from looking at facial movements,
referred to as studies of emotion perception. We iden-
tify three key shortcomings in the scientific research
that have contributed to a general misunderstanding
about how emotions are expressed and perceived in
facial movements and that limit the translation of this
scientific evidence for other uses:
1. Limited reliability (i.e., instances of the same
emotion category are neither reliably expressed
through nor perceived from a common set of
facial movements).
2. Lack of specificity (i.e., there is no unique map-
ping between a configuration of facial move-
ments and instances of an emotion category).
3. Limited generalizability (i.e., the effects of con-
text and culture have not been sufficiently docu-
mented and accounted for).
We then discuss our conclusions, followed by proposals
for consumers on how they might use the existing sci-
entific literature. We also provide recommendations for
future research on emotion production and perception
with consumers of that research in mind. We have
included additional detail on some topics of import or
interest in the Supplemental Material.
Scope, Approach, and Intended
Audience of Article
The common view: reading an
emotional state from a set of facial
movements
In common English parlance, people refer to “an emo-
tion” as if anger, happiness, or any emotion word
referred to an event that is highly similar on most occur-
rences. But an emotion word refers to a category of
instances that vary from one another in their physical
features (e.g., facial movements and bodily changes)
and mental features (e.g., pleasantness, arousal, expe-
rience of the surrounding situation as novel or threaten-
ing, awareness of these properties, and so on). Few
scientists who study emotion, if any, take the view that
every instance of an emotion category, such as anger,
is identical to every other instance, sharing a set of
necessary and sufficient features across situations, peo-
ple, and cultures. For example, Keltner and Cordaro
(2017) recently wrote that “there is no one-to-one cor-
respondence between a specific set of facial muscle
actions or vocal cues and any and every experience of
emotion” (p. 62). Yet there is considerable scientific
debate about the extent of the within-category varia-
tion, the specific features that vary, the causes of the
within-category variation, and implications of this varia-
tion for the nature of emotion (see Fig. 1).

Fig. 1. Explanatory frameworks guiding the science of emotion: the nature of emotion categories and their concepts. [The figure arranges theoretical frameworks (the behavioral ecology view and theory of constructed emotion, descriptive appraisal views, the core affect view, the componential process view, functional views, the basic emotion view in its original and revised forms, and the discrete emotion view) along two dimensions of surface similarity and deep similarity.] The information in the figure is plotted along two dimensions. The horizontal dimension represents hypotheses about the similarities in surface features shared by instances of the same emotion category (e.g., the facial movements that express instances of the same emotion category). The vertical dimension represents hypotheses about the similarities in the mechanisms that cause instances of the same emotion category (e.g., the neural circuits or assemblies that cause instances of the same emotion category). The colors represent the type of emotion categories proposed in each theoretical framework. Approaches in the green area describe ad hoc, abstract categories; those in the yellow area describe prototype or theory-based categories, and those in the red area describe natural-kind categories.
One popular scientific framework, referred to as the
basic-emotion approach, hypothesizes that instances of
an emotion category are expressed with facial move-
ments that vary, to some degree, around a typical set of
movements (referred to as a prototype; for examples,
see Table 1). For example, it is hypothesized that in one
situation or for one person, anger might be expressed
with the facial prototype (e.g., brows furrowed, eyes
wide, lips tightened) plus additional facial movements,
such as a widened mouth, whereas on other occasions,
one facial movement from the prototype might be miss-
ing (e.g., anger might be expressed with narrowed eyes
or without movement in the eyebrow region; for a dis-
cussion, see Box 2 in the Supplemental Material). None-
theless, the basic-emotion approach still assumes that
there is a core facial configuration—the prototype—that
can be used to diagnose a person’s emotional state in
much the same way that a fingerprint can be used to
uniquely recognize a person. More substantial variation
in expressions (e.g., smiling in anger, gasping with wid-
ened eyes in anger, and scowling not in anger but in
confusion or concentration) is typically explained as the
result of processes that are independent of an emotion
itself and that modify its prototypic expression, such as
display rules, emotion-regulation strategies (e.g., sup-
pressing the expression), or culture-specific dialects (as
proposed by various scientists, including Ekman & Cor-
daro, 2011; Elfenbein, 2013, 2017; Matsumoto, 1990;
Matsumoto, Keltner, Shiota, O’Sullivan, & Frank, 2008;
Tracy & Randles, 2011).
By contrast, other scientific frameworks propose that
expressions of the same emotion category, such as
anger, vary substantially across different people and
situations. For example, when the goal of being angry
is to overcome an obstacle, it may be more useful to
scowl during some instances of anger, to smile or laugh during others, or even to stoically widen one's eyes, depending on the
temporospatial context. This variation is thought to be
a meaningful part of an emotional expression because
facial movements are functionally tied to the immediate
context, which includes a person’s internal context
(e.g., the person’s metabolic condition, the past experi-
ences that come to mind) and outward context (e.g.,
whether a person is at work, at school, or at home, who
else is present and the broader cultural conditions),
both of which vary in dynamic ways over time (see Box
2 in the Supplemental Material).
Table 1. A Comparison of the Facial Configurations Listed as the Expressions of Selected Emotion Categories

The proposed expressive configurations are described using Facial Action Coding System (FACS) action units (AUs). For each emotion category, the entries are: Darwin's (1872/1965) description, the configurations observed in research, and the reference configuration used, all as reported by Matsumoto, Keltner, Shiota, O'Sullivan, & Frank (2008); the international core pattern from Cordaro et al. (2018); the configuration listed by Keltner, Sauter, Tracy, & Cowen (2019); and the physical description.

Amusement
Darwin's description: Not listed. Observed in research: Not listed. Reference configuration used: 6, 12, 26 or 27, 55 or 56, a "head bounce" (Shiota, Campos, & Keltner, 2003).
International core pattern (Cordaro et al., 2018): 6, 7, 12, 16, 25, 26 or 27, 53.
Keltner et al. (2019): 6 7 12 25 26 53.
Physical description: Head back, Duchenne smile (AU 6, 7, 12), lips separated, jaw dropped (AU 26, 27).

Anger
Darwin's description: 4 5 24 38. Observed in research: 4 5 or 7, 22 23 24. Reference configuration used: 4 5 7 23 (Ekman, Levenson, & Friesen, 1983).
International core pattern: 4, 7.
Keltner et al. (2019): 4 5 17 23 24.
Physical description: Brows furrowed, eyes wide, lips tightened and pressed together.

Awe
Darwin's description: Not listed. Observed in research: Not listed. Reference configuration used: 1, 5, 26 or 27, 57 and visible inhalation (Shiota et al., 2003).
International core pattern: 1, 2, 5, 12, 25, 26 or 27, 53.
Keltner et al. (2019): Not listed.

Contempt
Darwin's description: 9 10 22 41 61 or 62. Observed in research: 12 (unilateral), 14 (unilateral). Reference configuration used: 12 14 (Ekman et al., 1983).
International core pattern: 4, 14, 25.
Keltner et al. (2019): Not listed.

Disgust
Darwin's description: 10 16 22 25 or 26. Observed in research: 9 or 10, 25 or 26. Reference configuration used: 9 15 16 (Ekman et al., 1983).
International core pattern: 4, 6, 7, 9, 10, 25, 26 or 27.
Keltner et al. (2019): 7 9 19 25 26.
Physical description: Eyes narrowed, nose wrinkled, lips parted, jaw dropped, tongue show.

Embarrassment
Darwin's description: Not listed. Observed in research: Not listed. Reference configuration used: 12, 24, 51, 54, 64 (Keltner & Buswell, 1997).
International core pattern: 6, 7, 12, 25, 54, participant dampens smile with 23, 24, frown, etc.
Keltner et al. (2019): 7 12 15 52 54 64.
Physical description: Eyelids narrowed, controlled smile, head turned and down (not scored with FACS: hand touches face).

Fear
Darwin's description: 1 2 5 20. Observed in research: 1 2 4 5 20, 25 or 26. Reference configuration used: 1 2 4 5 7 20 26 (Ekman et al., 1983).
International core pattern: 1, 2, 5, 7, 25, 26 or 27, participant suddenly shifts entire body backward in chair.
Keltner et al. (2019): 1 2 4 5 7 20 25.
Physical description: Eyebrows raised and pulled together, upper eyelid raised, lower eyelid tense, lips parted and stretched.

Happiness
Darwin's description: 6 12. Observed in research: 6 12. Reference configuration used: 6 12 (Ekman et al., 1983).
International core pattern: 6, 7, 12, 16, 25, 26 or 27.
Keltner et al. (2019): 6 7 12 25 26.
Physical description: Duchenne smile (6, 7, 12).

Pride
Darwin's description: Not listed. Observed in research: Not listed. Reference configuration used: 6, 12, 24, 53, a straightening of the back and pulling back of the shoulders to expose the chest (Shiota et al., 2003).
International core pattern: 7, 12, 53, participant sits up straight.
Keltner et al. (2019): 53 64.
Physical description: Head up, eyes down.

Sadness
Darwin's description: 1 15. Observed in research: 1 15, 4, 17. Reference configuration used: 1 4 5 (Ekman et al., 1983).
International core pattern: 4, 43, 54.
Keltner et al. (2019): 1 4 6 15 17.
Physical description: Brows knitted, eyes slightly tightened, lip corners depressed, lower lip raised.

Shame
Darwin's description: Not listed. Observed in research: Not listed. Reference configuration used: 54, 64 (Keltner & Buswell, 1997).
International core pattern: 4, 17, 54.
Keltner et al. (2019): 54 64.
Physical description: Head down, eyes down.

Surprise
Darwin's description: 1 2 5 25 or 26. Observed in research: 1 2 5 25 or 26. Reference configuration used: 1 2 5 26 (Ekman et al., 1983).
International core pattern: 1, 2, 5, 25, 26 or 27.
Keltner et al. (2019): 1 2 5 25 26.
Physical description: Eyebrows raised, upper eyelid raised, lips parted, jaw dropped.

Note: Descriptions attributed to Darwin are taken from Matsumoto et al. (2008), Table 13.1. Physical descriptions are taken from Keltner et al. (2019). International core patterns refer to expressions of 22 emotion categories that are thought to be conserved across cultures, taken from Cordaro et al. (2018), Tables 4 through 6. A plus sign means that action units would appear simultaneously. A comma means that action units are statistically the most probable to appear but do not necessarily happen simultaneously (D. Cordaro, personal communication, November 11, 2018).
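To make the disagreement summarized in Table 1 concrete, a reader can compare the proposed AU sets for a single emotion category across publications. The following sketch is a reading aid only, not an analysis from this article: the AU sets are transcribed from the anger entries above, with "or" alternatives and nonfacial behaviors omitted, and overlap is scored with the Jaccard index (1.0 = identical sets, 0.0 = no shared AUs).

```python
# Compare the proposed anger AU sets across the sources listed in Table 1.
# AU lists are transcribed from the table above; "or" alternatives are omitted.

ANGER_AUS = {
    "Darwin (per Matsumoto et al., 2008)": {4, 5, 24, 38},
    "Ekman, Levenson, & Friesen (1983)":   {4, 5, 7, 23},
    "Cordaro et al. (2018) core pattern":  {4, 7},
    "Keltner et al. (2019)":               {4, 5, 17, 23, 24},
}

def jaccard(a: set, b: set) -> float:
    """Overlap between two AU sets: 1.0 = identical, 0.0 = disjoint."""
    return len(a & b) / len(a | b)

sources = list(ANGER_AUS)
for i, first in enumerate(sources):
    for second in sources[i + 1:]:
        overlap = jaccard(ANGER_AUS[first], ANGER_AUS[second])
        print(f"{first}  vs  {second}: {overlap:.2f}")
```

No pair of sources proposes the same configuration, which is the point of the table: even within the basic-emotion tradition, the "prototype" for a given emotion category is a moving target.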
These debates—regarding the source and magnitude of variation in the facial movements that express instances of the same emotion category, as well as the
magnitude and meaning of the similarity in the facial
movements that express instances of different emotion
categories—are useful to scientists. But these debates
do not provide clear guidance for consumers of emo-
tion research, who are focused on the practical issue of
whether emotion categories are expressed with facial
configurations of sufficient regularity and distinctiveness
so that it is possible to read emotion in a person’s face.
The common view of emotional expressions persists,
too, because scientists’ actions often do not follow their
claims in a transparent, straightforward way. Many sci-
entists continue to design experiments, use stimuli, and
publish review articles that, ironically, leave readers
with the impression that certain emotion categories
have a unique, prototypic facial expression, even as
those same scientists acknowledge that instances of
every emotion category can be expressed with a vari-
able set of facial movements. Published studies typically
test the hypothesis that there are unique emotion-
expression links (for examples, see the reference lists
in Elfenbein & Ambady, 2002; Keltner, Sauter, Tracy, &
Cowen, 2019; Matsumoto et al., 2008; also see most of
the studies reviewed in this article, e.g., Cordaro et al.,
2018). The exact facial configuration tested for each
emotion category varies slightly from study to study
(for examples, see Table 1), but a core, prototypic facial
configuration for a given emotion category is still
assumed within a single study. Review articles (again,
perhaps unintentionally) reinforce the impression of
unique face-emotion mappings by including tables and
figures that display a single, unique facial configuration
for each emotion category, referred to as the expres-
sion, signal or display for that emotion (Fig. 2 presents
two recent examples).5 This pattern of hypothesis test-
ing and writing—that instances of one emotion cate-
gory are expressed with a single prototypic facial
configuration—reinforces (perhaps unintentionally) the
common view that each emotion category is consis-
tently and uniquely expressed with its own distinctive
configuration of facial movements. Consumers of this
research then assume that a distinctive configuration
can be used to diagnose the presence of the corre-
sponding emotion in everyday life (e.g., that a scowl
indicates the presence of anger with high reliability and
specificity).
The common view of emotional expressions has also
been imported into other scientific disciplines with an
interest in understanding emotions, such as neurosci-
ence and artificial intelligence (AI). For example, from
a published article on AI:
American psychologist Ekman noticed that some
facial expressions corresponding to certain emotions
are common for all the people independently of
their gender, race, education, ethnicity, etc. He
proposed the discrete emotional model using six
universal emotions: happiness, surprise, anger,
disgust, sadness and fear. (Brodny et al., 2016, p. 1;
emphasis in original)
Similar examples come from our own articles. One
series focused on the brain structures involved in per-
ceiving emotions from facial configurations (Adolphs,
2002; Adolphs, Tranel, Damasio, & Damasio, 1994), and
the other focused on early life experiences (Pollak,
Cicchetti, Hornung, & Reed, 2000; Pollak & Kistler,
2002). These articles were framed in terms of “recogniz-
ing facial expressions of emotion” and exclusively pre-
sented participants with specific, posed photographs
of scowling faces (the presumed facial expression for
anger), wide-eyed, gasping faces (the presumed facial
expression for fear), and other presumed prototypical
expressions. Participants were shown faces of different
individuals, and each person posed the same facial
configuration for a given emotion category, ignoring
the importance of individual and contextual variation.
One reason for this flawed approach to investigating
the perception of emotion from faces was that then—at
the time these studies were conducted—as now, pub-
lished experiments, review articles, and stimulus sets
were dominated by the common view that certain emo-
tion categories were signaled with an invariant set of
facial configurations, referred to as “the facial expres-
sions of basic emotions.”
In our review of the scientific evidence, we test two
hypotheses that arise from the common view of emo-
tional expressions: that certain emotion categories are
each routinely expressed by a unique facial configura-
tion and, correspondingly, that people can reliably infer
someone else’s emotional state from a set of facial
movements. Our discussion is written for consumers of
emotion research, whether they be scientists in other
fields or nonscientists, who need not have deep knowl-
edge of the various theories, debates, and broad range
of findings in the science of emotion, with sufficient
pointers to those discussions if they are of interest (see
Box 2 in the Supplemental Material).
In discussing what this article is about—the common
view that a person’s emotional state is revealed in facial
movements—it bears mentioning what this article is not
about: It is not a referendum on the “basic emotion”
view that we mentioned briefly, earlier in this section,
proposed by the psychologist Paul Ekman and his col-
leagues; nor is it a commentary on any other specific
research program or individual psychologist’s view.
Ekman’s theoretical approach has been highly influen-
tial in research on emotion for much of the past 50
years. We often cite studies inspired by the basic-
emotion approach, and Ekman’s work, for this reason.
[Figure 2a: a table adapted from Keltner, Sauter, Tracy, & Cowen (2019) listing, for each of 18 states (amusement, anger, boredom, confusion, contentment, coyness, desire, disgust, embarrassment, fear, happiness, interest, pain, pride, sadness, shame, surprise, and sympathy), an example photo, the proposed action units, and a physical description.]
[Figure 2b: a diagram adapted from Shariff and Tracy (2011) listing, for each of nine expressions (happiness, sadness, anger, fear, surprise, disgust, pride, shame, and embarrassment), a hypothesized physiological function, a hypothesized communicative function, and relevant research citations.]
Fig. 2. Example figures from recently published articles that reinforce the common belief in prototypic facial expressions of emotion. The graphic in (a) was adapted from Table 2 in Keltner, D., Sauter, D., Tracy, J., and Cowen, A. (2019). Emotional expression: Advances in basic emotion theory. Journal of Nonverbal Behavior. Photos originally from Cordaro, D. T., Sun, R., Keltner, D., Kamble, S., Huddar, N., and McNeil, G. (2018). Universals and cultural variations in 22 emotional expressions across five cultures. Emotion, 18, 75–93, with permission from the American Psychological Association. Face photos copyright Dr. Lenny Kristal, used with permission. The graphic in (b) was adapted from Figure 2 in Shariff and Tracy (2011).
In addition, the common view of emotional expressions
is most readily associated with a simplified version of
the basic-emotion approach, as exemplified by the
quotes above. Critiques of Ekman’s basic-emotion view
(and related views) are numerous (e.g., L. F. Barrett,
2006, 2011; L. F. Barrett, Lindquist etal., 2007; Russell,
1991, 1994, 1995; Ortony & Turner, 1990), as are rejoin-
ders that defend it (e.g., Ekman, 1992, 1994; Izard,
2007). Our article steps back from these debates. We
instead focus on the existing research on emotional
expression and emotion perception in general and ask
whether the scientific evidence is sufficiently strong
and clear enough to justify the way it is increasingly
being used by those who consume it.
A systematic approach for evaluating
the scientific evidence
When you see someone smile and infer that the person
is happy, you are making what is known as a reverse
inference: You are assuming that the smile reveals
something about the person’s emotional state that you
cannot access directly (see Fig. 3). Reverse inference
requires calculating a conditional probability: the
probability that a person is in a particular emotion
episode (e.g., happiness) given the observation of a
unique set of facial muscle movements (e.g., a smile).
The conditional probability is written as
p(emotion category | a unique facial configuration),
for example,
p(happiness | a smiling facial configuration).
Reverse inferences about emotion are ubiquitous in
everyday life—whenever you experience someone as
emotional, your brain has performed a reverse inference,
guessing at the cause of a facial movement when you
have access only to the movement itself. Every time an
app on a phone or computer measures someone’s facial
muscle movements, identifies a facial configuration such
as a frowning facial configuration, and proclaims that the target person is sad, that app has engaged in reverse inference, such as

p(sadness | a frowning facial configuration).

Whenever a security agent infers anger from a scowl, the agent has assumed a strong likelihood for

p(anger | a scowling facial configuration).

Fig. 3. Defining reliability and specificity. Anger and fear are used as the example categories. [The figure crosses emotional state (angry vs. afraid) with facial movement (brow lowerer, AU 4, vs. lip stretch, AU 20) to illustrate true positives, false positives, false negatives, and true negatives, alongside definitions of high reliability, high specificity, positive predictive value, and negative predictive value.]
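The reverse inferences written above can be made explicit with Bayes' rule, which combines reliability (how often the configuration occurs during the emotion), specificity (how rarely it occurs otherwise), and the base rate of the emotion. The sketch below is illustrative only; all of the numbers are hypothetical placeholders rather than estimates from the literature.

```python
# Illustrative reverse inference, e.g., p(anger | scowling configuration).
# All inputs are hypothetical placeholders, not empirical estimates.

def reverse_inference(reliability: float, specificity: float, base_rate: float) -> float:
    """p(emotion | configuration) from p(configuration | emotion) = reliability,
    p(no configuration | no emotion) = specificity, and p(emotion) = base rate."""
    false_positive_rate = 1.0 - specificity
    p_configuration = reliability * base_rate + false_positive_rate * (1.0 - base_rate)
    return reliability * base_rate / p_configuration

# Even with moderate reliability, a low base rate of anger in the observed
# population pulls the posterior probability far below certainty.
print(reverse_inference(reliability=0.30, specificity=0.80, base_rate=0.10))  # ~0.14
```

This is why the base rates discussed in Table 2 matter: the same scowl licenses very different inferences depending on how often anger, and scowling, actually occur in the people being observed.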
Four criteria must be met to justify a reverse inference
that a particular facial configuration expresses and therefore
reveals a specific emotional state: reliability, specificity,
generalizability, and validity (explained in Table 2
and Fig. 3). These criteria are commonly encountered in
the field of psychological measurement, and over the
past several decades, there has been an ongoing dia-
logue about thresholds for these criteria as they apply
in production and perception studies, with some con-
sensus emerging for the first three criteria (see Haidt &
Keltner, 1999). Only when a pattern of facial muscle
movements strongly satisfies these four criteria can we
justify calling it an “emotional expression.” If any of these
criteria are not met, then we should instead use neutral,
descriptive terms to refer to a facial configuration with-
out making unwarranted inferences, simply calling it a
smile (rather than an expression of happiness), a frown (rather than an expression of sadness), a scowl (rather than an expression of anger), and so on.6
Table 2. Criteria Used to Evaluate the Empirical Evidence

Cutoffs (applied to both expression production and emotion perception): Reliability and specificity rates between 70% and 90% provide strong evidence for the common view, rates between 40% and 69% provide moderate support for the common view, and rates between 20% and 39% provide weak support (Ekman, 1994; Haidt & Keltner, 1999; Russell, 1994).

Reliability
Expression production: When a person is sad, the proposed expression (a frowning facial configuration) should be observed more frequently than would be expected by chance. This likewise must be true for every other emotion category that is subject to a common belief. Reliability is related to a forward inference: Given that someone is happy, what is the likelihood of observing a smile, p[set of facial muscle movements | emotion category]. Chance means that facial configurations occur randomly with no predictable relationship to a given emotional state. This would mean that the facial configuration in question carries no information about the presence or absence of an emotion category. For example, in an experiment that observes the facial configurations associated with instances of happiness and anger, chance levels of scowling or smiling would be 50%. Reliability also depends on the base rate: how frequently people make a particular facial configuration. For example, if a person frequently makes a scowling facial configuration during an experiment examining the expressions of anger, sadness, and fear, he or she will seem to be consistently scowling in anger when in fact the scowling may be indiscriminate.
Emotion perception: When a person makes a scowling facial configuration, perceivers should consistently infer that the person is angry. This likewise must be true for every facial configuration that has been proposed as the expression of a specific emotion category. That is, perceivers must consistently make a reverse inference: Given that someone is scowling, what is the likelihood that he or she is angry, p[emotion category | set of facial muscle movements]. Chance means that emotional states occur randomly with no predictable relationship to a given facial configuration. This would mean that the presence or absence of an emotion category cannot be inferred from the presence or absence of the facial configuration. For example, in an experiment that observes how people perceive 51 different facial configurations, chance levels for correctly labeling a scowling face as anger would be 2%. Reliability also depends on the base rate: how frequently people use a particular emotion label or make a particular emotional inference. For example, if a person frequently labels facial configurations as "angry" during an experiment examining scowling, smiling, and frowning faces, she will seem to be consistently perceiving anger when in fact she is labeling indiscriminately.

Specificity
Expression production: If a facial configuration is diagnostic of a specific emotion category, then the facial configuration should express instances of one and only one emotion category better than chance; it should not consistently express instances of any other mental event (emotion or otherwise) at better than chance levels. For example, to be considered the expression of anger, a scowling facial configuration must not express sadness, confusion, indigestion, an attempt to socially influence, etc., at better than chance levels. Estimates of specificity, like reliability, depend on base rates and on how chance levels are defined.
Emotion perception: If a frowning facial configuration is perceived as the diagnostic expression of sadness, then a frowning facial configuration should be labeled only as sadness (or sadness should be inferred only from a frowning facial configuration) at above chance levels. And it should not be consistently perceived as an expression of any mental state other than sadness at better than chance levels. Estimates of specificity, like reliability, depend on base rates and on how chance levels are defined.

Generalizability
Expression production: Patterns of reliability and specificity should replicate across studies, particularly when different populations are sampled (e.g., infants, congenitally blind individuals, and individuals sampled from diverse cultural contexts, including small-scale, remote cultures). High reliability and specificity across different circumstances ensures that scientific findings are generalizable.
Emotion perception: Patterns of reliability and specificity should replicate across studies, particularly when different populations are sampled (e.g., infants, congenitally deaf individuals, and individuals sampled from diverse cultural contexts, including small-scale, remote cultures). High reliability and specificity across different circumstances ensures that scientific findings are generalizable.

Validity
Expression production: Even if a facial configuration is consistently and uniquely observed in relation to a specific emotion category across many studies (strong generalizability), it is necessary to demonstrate that the person in question is really in the expected emotional state. This is the only way that a given facial configuration leads to accurate inferences about a person's emotional state. A facial configuration is valid as a display or a signal for emotion if and only if it is strongly associated with other measures of emotion, preferably those that are objective and do not rely on anyone's subjective report (i.e., a facial configuration should be strongly and consistently related to perceiver-independent evidence about the emotional state of the expresser).
Emotion perception: Even if a facial configuration is consistently and uniquely labeled with a specific emotion word across many studies (strong generalizability), it is necessary to demonstrate that the person making the facial configuration is really in the expected emotional state. This is the only way that a given perception or inference of emotion is accurate. A perceiver can be said to be recognizing an emotional expression if and only if the person being perceived is verifiably in the expected emotional state.

Note: Reliability is also related to sensitivity, consistency, informational value, and the true positive rate (for further description, see Fig. 3). Specificity is related to uniqueness, discreteness, the true negative rate, and referential specificity. In principle, we can also ask more parametrically whether there is a link between the intensity of an emotional instance and the intensity of facial muscle contractions, but scientists rarely do.
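Two of the conventions in Table 2 lend themselves to a quick worked example: chance level depends on the number of response options, and observed rates are graded against the published cutoffs. The sketch below simply paraphrases those conventions in code, with an invented observed rate for illustration.

```python
# Chance levels and evidence cutoffs as described in Table 2 (illustrative only).

def chance_level(n_options: int) -> float:
    """Chance rate when responses fall evenly across n options."""
    return 1.0 / n_options

def support_for_common_view(rate: float) -> str:
    """Grade a reliability or specificity rate against the Table 2 cutoffs."""
    if rate >= 0.70:
        return "strong"
    if rate >= 0.40:
        return "moderate"
    if rate >= 0.20:
        return "weak"
    return "below the weak-support cutoff"

print(chance_level(2))    # two-category production study: 0.50
print(chance_level(51))   # labeling one of 51 configurations: ~0.02
print(support_for_common_view(0.35))  # hypothetical observed rate -> "weak"
```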
The null hypothesis and the role
of context
Tests of reliability, specificity, generalizability, and
validity are almost always compared with what would
be expected by sheer chance, if facial configurations
(in studies of expression production) and inferences
about facial configurations (in studies of emotion per-
ception) occurred randomly with no relation to particu-
lar emotional states. In most studies, chance levels
constitute the null hypothesis. An example of the null
hypothesis for reliability is that people do not scowl
when angry more frequently than would be expected
by chance.7 If people are observed to scowl more fre-
quently when angry than they would by chance, then
the null hypothesis is rejected on the basis of the reli-
ability of the findings. We can also test the null hypoth-
esis for specificity: If people scowl more frequently
than they would by chance not only when angry but
also when fearful, sad, confused, hungry, and so forth,
then the null hypothesis for specificity is retained.8
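As a concrete illustration of this null-hypothesis logic, the sketch below runs an exact one-sided binomial test on invented data: scowls observed in 26 of 40 hypothetical anger episodes, against a 50% chance level (as in a two-category design). None of the counts come from actual studies.

```python
# Exact binomial test of the null hypothesis that scowling during anger occurs
# no more often than chance. Counts and the 0.5 chance level are invented.
from math import comb

def binomial_p_upper(k: int, n: int, p: float) -> float:
    """One-sided p value: P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))

scowls, anger_episodes, chance = 26, 40, 0.5
p_value = binomial_p_upper(scowls, anger_episodes, chance)
print(f"scowl rate = {scowls / anger_episodes:.2f}, one-sided p = {p_value:.3f}")

# Rejecting this null speaks only to reliability; it says nothing about
# specificity, because scowls could be just as frequent in fear or confusion.
```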
Tests of generalizability are becoming more common
in the research literature, again using the null hypoth-
esis. Questions about generalizability test whether a
finding in one experiment is reproduced in other exper-
iments in different contexts, using different experimen-
tal methods or sampling people from different
populations. There are two crucial questions about
generalizability when it comes to the production and
perception of emotional expressions: Do the findings
from a laboratory experiment generalize to observa-
tions in the real world? And, do the findings from stud-
ies that sample participants from Westernized, educated,
industrialized, rich, and democratic (WEIRD; Henrich,
Heine, & Norenzayan, 2010) populations generalize to
people who live in small-scale remote communities?
Questions of validity are almost never addressed in
production and perception studies. Even if reliable and
specific facial movements are observed across gener-
alizable circumstances, whether these facial movements
can justify an inference about a person’s emotional state
is a difficult and unresolved question. (We have more
to say about this later.) Consequently, in this article, we
evaluate the common view by reviewing evidence per-
taining to the reliability, specificity, and generalizability
of research findings from production and perception
studies.
When observations allow scientists to reject the null
hypothesis for reliability, defined as observations that
could be expected by chance alone, such evidence
provides necessary but not sufficient support for the
common view of emotional expressions. A slightly
above chance co-occurrence of a facial configuration
and instances of an emotion category, such as scowling
in anger—for example, a correlation coefficient (r) of
about .20 to .39 (adapted from Haidt & Keltner, 1999)—
suggests that a person sometimes scowls in anger, but
not most or even much of the time. Weak evidence for
reliability suggests that other factors not measured in
the experiment are likely causing people to scowl dur-
ing an instance of anger. It also suggests that people
may express anger with facial configurations other than
a scowl, possibly in reliable and predictable ways. Fol-
lowing common usage, we refer to these unmeasured
factors collectively as context. A similar situation can
be described for studies of emotion perception: When
participants label a scowling facial configuration as
“anger” in a weakly reliable way (between 20% and
39% of the time; Haidt & Keltner, 1999), then this sug-
gests the possibility of unmeasured context effects.
In principle, context effects make it possible to test
the common view by comparing it directly with an
alternative hypothesis—that a person’s brain will be
influenced by other causal factors—as opposed to com-
paring the findings with those expected by random
chance. It is possible, for example, that a state of anger
is expressed differently depending on various factors
that can be studied, including the situational context
(e.g., whether a person is at work, at school, or at home),
social factors (e.g., who else is present in the situation
and the relationship between the expresser and the
perceiver), a person’s internal physical context (e.g.,
how much sleep they had, how hungry they are), a
person’s internal mental context (e.g., the past experi-
ences that come to mind or the evaluations they make),
the temporal context (what occurred just a moment
ago), differences between people (e.g., whether some-
one is male or female, warm or distant), and the cultural
context, such as whether the expression is occurring in
a culture that values the rights of individuals (compared
with group cohesion) and is open and allows for a
variety of behaviors in a situation (compared with
closed, having more rigid rules of conduct). Other theo-
retical approaches offer some of these specific alterna-
tive hypotheses (see Box 2 in the Supplemental Material).
In practice, however, experiments almost always test the
common view against the null hypothesis and rarely test
specific alternative hypotheses. When context is
acknowledged and studied, it is usually examined as a
factor that might moderate a common and universal
emotional expression, preserving the core assumptions
of the common view (e.g., Cordaro etal., 2018; for more
discussion, see Box 3 in the Supplemental Material).
A focus on six emotion categories:
anger, disgust, fear, happiness,
sadness, and surprise
Our critical examination of the research literature in
this article focuses primarily on testing the common
view of facial expressions for six emotion categories—
anger, disgust, fear, happiness, sadness, and surprise.
We do not discuss every emotion category ever studied
in the science of emotion. We do not discuss the many
emotion categories that exist in non-English-speaking
cultures, such as gigil (the irresistible urge to pinch or
squeeze something cute) or liget (exuberant, collective
aggression; for discussion of non-English emotion cat-
egories, see Mesquita & Frijda, 1992; Pavlenko, 2014;
Russell, 1991). We do not discuss the various emotion
categories that have been documented throughout his-
tory (e.g., T. W. Smith, 2016). Nor do we discuss every
English emotion category for which a prototypical facial
expression has been suggested. For example, recent
studies motivated primarily by the basic-emotion
approach have suggested that there are “more than six
distinct facial expressions . . . in fact, upwards of 20
multimodal expressions” (Keltner et al., 2019, Introduc-
tion, para. 6), meaning that scientists have proposed a
distinct, prototypic facial configuration as the facial
expression for each of 20 or so emotion categories,
including confusion, embarrassment, pride, sympathy,
awe, and others.
We focus on six emotion categories for two reasons.
First, as we already noted, these categories anchor com-
mon beliefs about emotions and their expressions and
therefore represent the clearest, strongest test of the
common view. They can be traced to Charles Darwin,
who stipulated (rather than discovered) that certain
facial configurations are expressions of certain emotion
categories, inspired by photographs taken by Duchenne
(1862/1990) and drawings made by the Scottish anato-
mist Charles Bell (Darwin, 1872/1965). The proposed
expressive facial configurations for each emotion cat-
egory are presented in Figure 4, and the origin of these
facial configurations is discussed in Box 4 in the Sup-
plemental Material.
Second, these six emotion categories have been the
primary focus of systematic research for almost a cen-
tury and therefore provide the largest corpus of scien-
tific evidence that can be evaluated. Unfortunately, the
same cannot be said for any of the other emotion cat-
egories in question. This is a particularly important
point when considering the more than 20 emotion cat-
egories that are now the focus of research attention. A
PsycInfo search for the term “facial expression” com-
bined with “anger, disgust, fear, happiness, sadness,
surprise” produced over 700 entries, but a similar search
including “love, shame, contempt, hate, interest,
distress, guilt” returned fewer than 70 entries (Duran &
Fernández-Dols, 2018). Almost all cross-cultural studies
of emotion perception have focused on anger, disgust,
fear, happiness, sadness, and surprise (plus or minus a
few), and experiments that measure how people spon-
taneously move their faces to express instances of emo-
tion categories rarely include categories beyond these
six. In particular, too few studies measure spontane-
ous facial movements during episodes of other emo-
tion categories (i.e., production studies) to conclude
anything about reliability and specificity, and there
are too few studies of how these additional emotion
categories are perceived in small-scale, remote cul-
tures to conclude anything about generalizability. In
an era where the generalizability and robustness of
psychological findings are under close scrutiny, it
seemed prudent to focus on the emotion categories
for which there are, by a factor of 10, the largest
number of published experiments. Nonetheless, our
review of the empirical evidence for expressions of
emotion categories beyond anger, disgust, fear, hap-
piness, sadness, and surprise did not reveal any new
information that weakens the conclusions we discuss
in this article. As a consequence, our discussion here,
which is based on a sample of six emotion categories,
generalizes to those other emotion categories that
have been studied.9
Producing Facial Expressions of
Emotion: A Review of the Scientific
Evidence
In this section, we first review the design of a typical
experiment in which emotions are induced and facial
movements are measured. We highlight several obser-
vations to keep in mind as we review the reliability,
specificity, and generalizability for expressions of anger,
disgust, fear, happiness, sadness, and surprise in a
variety of populations, including adults in urban or
small-scale remote cultures, infants and children, and
congenitally blind individuals. Our review is the most
comprehensive to date and allows us to comment on
whether the scientific findings generalize across differ-
ent populations of individuals. The value of doing so
becomes apparent when we observe how similar con-
clusions emerge from these research domains.
The anatomy of a typical experiment
designed to observe people’s facial
movements during episodes of emotion
In the typical expression-production experiment, scien-
tists expose participants to objects, images, or events that
they (the scientists) believe will evoke an instance of
emotion. It is possible, in principle, to evoke a wide
variety of instances for a given emotion category (e.g.,
Wilson-Mendenhall, Barrett, & Barsalou, 2015); in prac-
tice, however, published studies evoke what scientists
believe are the most typical instances of each category,
usually elicited with a stimulus that is presented without
context (e.g., a photograph, a short movie clip separated
from the rest of the film or a simplified description of an
event, such as “your cousin has just died, and you feel
very sad”; Cordaro et al., 2018). Scientists usually include
some measure to verify that participants are in the
expected emotional state (e.g., asking participants to
describe how they feel by rating their experience against
a set of emotion adjectives). They then observe partici-
pants’ facial movements during the emotional episode
and quantify how well the measure of emotion predicts
the observed facial movements. When done properly, this
yields estimates of reliability and specificity and, in prin-
ciple, provides data to assess generalizability. There are
limitations to assessing the validity of a facial configura-
tion as an expression of emotion, as we explain below.
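A minimal sketch of that final quantification step follows, under the simplifying assumption that each emotional episode is coded as showing a single configuration (or none): reliability is the proportion of target-emotion episodes showing the hypothesized configuration, and specificity is the proportion of other episodes not showing it. The trial records below are invented for illustration.

```python
# Toy estimate of reliability and specificity from coded production-study trials.
# Each record is (induced emotion, configuration coded on that trial); data invented.

trials = [
    ("anger", "scowl"), ("anger", "smile"), ("anger", "scowl"), ("anger", "none"),
    ("sadness", "frown"), ("sadness", "scowl"), ("sadness", "none"),
    ("happiness", "smile"), ("happiness", "smile"), ("happiness", "scowl"),
]

def reliability(records, emotion, configuration):
    """Proportion of target-emotion episodes in which the configuration was coded."""
    target = [c for e, c in records if e == emotion]
    return sum(c == configuration for c in target) / len(target)

def specificity(records, emotion, configuration):
    """Proportion of other-emotion episodes in which the configuration was absent."""
    others = [c for e, c in records if e != emotion]
    return sum(c != configuration for c in others) / len(others)

print(reliability(trials, "anger", "scowl"))   # 0.50: moderate by the Table 2 cutoffs
print(specificity(trials, "anger", "scowl"))   # ~0.67: scowls also coded in other episodes
```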
Measuring facial movements. Healthy humans have
a common set of 34 muscle groups, 17 on each side of
the face, that contract and relax in patterns.10 To create
facial movements that are visible to the naked eye, facial
muscles contract, changing the distance between facial
features (Neth & Martinez, 2009) and shaping skin into
folds and wrinkles on an underlying skeletal structure.
Even when facial movements look the same to the naked
eye, there may be differences in their execution under
the skin. There are individual differences in the mechan-
ics of making a facial movement, including variation in
the anatomical details (e.g., muscle configuration and
relative size vary, and some people lack certain muscle
components), in the neural control of those muscles
(Cattaneo & Pavesi, 2014; Hutto & Vattoth, 2015; Müri,
2015), and in the underlying skeletal structure of the face
(discussed in Box 5 in the Supplemental Material).
There are three common procedures for measuring
facial movements in a scientific experiment. The most
sensitive, objective measure of facial movements, called
facial electromyography (EMG), detects the electrical
activity from actual muscular contractions (again, see
Box 5 in the Supplemental Material). This is a perceiver-
independent way of assessing facial movements that
detects muscle contractions that are not necessarily vis-
ible to the naked eye (Tassinary & Cacioppo, 1992). The
utility of facial EMG is unfortunately offset by its imprac-
ticality: It requires placing electrodes on a participant’s
face in a particular configuration. In addition, a person
can typically tolerate only a few electrodes on the face
at one time. At the time this article was written, relatively few
published articles (we identified 123) reported the use
of facial EMG, the overwhelming majority of which
sparsely sampled the face, measuring the electrical sig-
nals for only a small number of muscles (between one
and six); none of the studies measured naturalistic facial
movements as they occur outside the lab, in everyday
life. Consequently, we focus our discussion on two other
measurement methods that describe visible facial movements,
called facial actions: perceiver-dependent coding, in which
human coders indicate the presence or absence of a facial
action while viewing video recordings of participants, and
automated methods that detect facial actions from
photographs or videos.
Measuring facial movements with human coders. The
Facial Action Coding System, or FACS (Ekman, Friesen,
& Hager, 2002), is a systematic approach to describe what
a face looks like when facial muscle movements have
occurred. FACS codes describe the presence and inten-
sity of facial movements. FACS is purely descriptive and
is therefore agnostic about whether those movements
might express emotions or any other mental event.11
Human coders train for many weeks to reliably identify
specific movements called action units (AUs). Each AU is
hypothesized to correspond to the contraction of a dis-
tinct facial muscle or a distinct grouping of muscles that
is visible as a specific facial movement. For example, the
raising of the inner corners of the eyebrows (contracting
the frontalis muscle pars medialis) corresponds to AU 1.
Lowering of the inner corners of the brows (activation
of the corrugator supercilii, depressor glabellae, and
depressor supercilii) corresponds to AU 4. AUs are scored and
analyzed as independent elements, but the underlying
anatomy of many facial muscles constrains them so that
they cannot move independently of one another, which
generates dependencies between AUs (e.g., see Hao,
Wang, Peng, & Ji, 2018). A list of facial AUs and their
corresponding facial muscles can be found in Figure 5.
Expert FACS coders approach interrater reliabilities of
.80 for individual AUs (Jeni, Cohn, & De la Torre, 2013).
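To make the coding scheme concrete, the short Python sketch below (ours, purely illustrative, and not part of FACS or EMFACS) represents a handful of AU codes from Figure 5 and two of the prescribed configurations from Figure 4 as simple data structures, and checks whether the AUs coded for a given video frame contain one of those prescribed configurations.

# Illustrative sketch only: a few FACS action units (AUs) from Figure 5 and
# two prescribed EMFACS configurations described in Figure 4, represented as
# plain Python data structures. This is not an official FACS or EMFACS tool.
ACTION_UNITS = {
    1: ("Inner Brow Raiser", "Frontalis (pars medialis)"),
    4: ("Brow Lowerer", "Corrugator supercilii, depressor supercilii"),
    5: ("Upper-Lid Raiser", "Levator palpebrae superioris"),
    6: ("Cheek Raiser", "Orbicularis oculi (pars orbitalis)"),
    7: ("Lid Tightener", "Orbicularis oculi (pars palpebralis)"),
    12: ("Lip-Corner Puller", "Zygomaticus major"),
    23: ("Lip Tightener", "Orbicularis oris"),
}

PROPOSED_CONFIGURATIONS = {
    "anger": {4, 5, 7, 23},           # EMFACS description cited in Figure 4
    "happiness (Duchenne)": {6, 12},  # AUs 6 and 12
}

def contains_configuration(coded_aus, label):
    # True if every AU of the proposed configuration was coded in the frame.
    return PROPOSED_CONFIGURATIONS[label] <= set(coded_aus)

# Example: a coder records AUs 4, 5, 7, and 23 for one video frame.
print(contains_configuration({4, 5, 7, 23}, "anger"))  # True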
The first version of FACS (Ekman & Friesen, 1978)
was based largely on the work of Swedish anatomist
Carl-Herman Hjortsjö, who catalogued the facial con-
figurations described by Duchenne (Hjortsjö, 1969). In
addition to the updated versions of FACS (Ekman et al.,
2002), other facial coding systems have been devised
for human infants (Izard et al., 1995; Oster, 2007),
chimpanzees (Vick, Waller, Parr, Smith Pasqualini, &
Bard, 2007) and macaque monkeys (L. A. Parr, Waller,
Burrows, Gothard, & Vick, 2010; see also L. F. Barrett,
2017a). Figure 4 displays the common FACS codes for
the configurations of the facial movements that have
been proposed as the prototypic expressions of anger,
disgust, fear, happiness, sadness, and surprise, respec-
tively.
Measuring facial movements with automated algo-
rithms. Human coders require time-consuming, intensive
training and practice before they can reliably assign AU
codes. After training, coding photographs or videos frame
by frame is a slow process, which makes human FACS
coding impractical to use on facial movements as they
occur in everyday life. Large inventories of naturalistic
photographs and videos—which have been curated only
fairly recently (Benitez-Quiroz, Srinivasan, & Martinez,
2016)—would require decades to manually code. This
problem is addressed by automated FACS coding sys-
tems using computer-vision algorithms (Martinez, 2017;
Martinez & Du, 2012; Valstar, Zafeiriou, & Pantic, 2017).12
Recently developed computer vision systems have auto-
mated the coding of some (but not all) facial AUs (e.g.,
Benitez-Quiroz, Srinivasan, & Martinez, 2018; Benitez-
Quiroz, Wang, & Martinez, 2017; Chu, De la Torre, &
Cohn, 2017; Corneanu, Simon, Cohn, & Guerrero, 2016;
Essa & Pentland, 1997; Martinez, 2017a; Martinez & Du,
2012; Valstar et al., 2017; see Box 6 in the Supplemental
Material), making it more feasible to observe facial move-
ments as they occur in everyday life, at least in principle
(see Box 7 in the Supplemental Material).
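As a rough illustration of how such automated coding is typically used, the Python sketch below (ours; the detector is a hypothetical stand-in, not any specific published system) shows the bookkeeping involved in tallying detected AUs over many frames or images, which is the step that makes large naturalistic collections tractable.

from collections import Counter

def au_frequencies(frames, detect_action_units):
    # Tally how often each AU code is detected across a sequence of frames.
    # `detect_action_units` is any callable that maps a single image or video
    # frame to the set of AU codes judged present; in practice it would be a
    # trained computer-vision model, which we do not implement here.
    counts = Counter()
    for frame in frames:
        counts.update(detect_action_units(frame))
    return counts

def fake_detector(frame):
    # Pretend detector that returns a fixed set of AUs per frame, for demonstration.
    return {6, 12} if frame != "frame_003" else {4}

fake_frames = ["frame_001", "frame_002", "frame_003"]
print(au_frequencies(fake_frames, fake_detector))  # e.g. Counter({6: 2, 12: 2, 4: 1})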
Automated FACS coding is accurate (≥90%) compared
with coding from expert human coders, provided
that the images were captured under ideal laboratory
conditions, where faces are viewed from the front, are
AU   Description             Facial Muscles (Type of Activation)
 1   Inner Brow Raiser       Frontalis (pars medialis)
 2   Outer Brow Raiser       Frontalis (pars lateralis)
 4   Brow Lowerer            Corrugator supercilii, depressor supercilii
 5   Upper-Lid Raiser        Levator palpebrae superioris
 6   Cheek Raiser            Orbicularis oculi (pars orbitalis)
 7   Lid Tightener           Orbicularis oculi (pars palpebralis)
 9   Nose Wrinkler           Levator labii superioris alaeque nasi
10   Upper-Lip Raiser        Levator labii superioris
11   Nasolabial Deepener     Zygomaticus minor
12   Lip-Corner Puller       Zygomaticus major
13   Cheek Puffer            Levator anguli oris
14   Dimpler                 Buccinator
15   Lip-Corner Depressor    Depressor anguli oris
16   Lower-Lip Depressor     Depressor labii inferioris
17   Chin Raiser             Mentalis
18   Lip Puckerer            Incisivii labii superioris and incisivii labii inferioris
20   Lip Stretcher           Risorius with platysma
22   Lip Funneler            Orbicularis oris
23   Lip Tightener           Orbicularis oris
24   Lip Pressor             Orbicularis oris
25   Lips Part               Depressor labii inferioris, or relaxation of mentalis or orbicularis oris
26   Jaw Drop                Masseter; relaxed temporalis and internal pterygoid
27   Mouth Stretch           Pterygoids, digastric
28   Lip Suck                Orbicularis oris
41   Lid Droop
42   Slit
43   Eyes Closed
44   Squint
45   Blink
46   Wink
Fig. 5. Facial Action Coding System (FACS; Ekman & Friesen, 1978) codes for adults. AU = action unit.
well illuminated, are not occluded, and are posed in a
controlled way (Benitez-Quiroz et al., 2016). (It is
important to note, however, that “accuracy” here is defined
as the FACS coding produced by human judges—
which may well have errors.) Under ideal conditions,
accuracy is highest (~99%) when algorithms are tested
and trained on images from the same database (Benitez-
Quiroz et al., 2016). The best of these algorithms works
quite well when trained and tested on images from
different databases (~90%), as long as the images are
all taken in ideal conditions (Benitez-Quiroz et al.,
2016). Accuracy (compared with human FACS coding)
decreases substantially when coding facial actions in
still images or in video frames taken in everyday life,
in which conditions are unconstrained and facial con-
figurations are not stereotypical (e.g., Yitzhak et al.,
2017).13 For example, 38 automated FACS coding algo-
rithms were recently trained on 1 million images (the
2017 EmotioNet Challenge; Benitez-Quiroz, Srinivasan,
Feng, Wang, & Martinez, 2017) and evaluated against
separate test images that were FACS coded by experts.14
In these less constrained conditions, accuracy dropped
below 83%, and a combined measure of precision and
recall (a measure called F1, ranging from zero to one)
was below .65 (Benitez-Quiroz, Srinivasan, et al., 2017).15
These results indicate that current algorithms are not
accurate enough in their detection of facial AUs to fully
substitute for expert coders when describing facial
movements in everyday life. Nonetheless, these algo-
rithms offer a distinct practical advantage because they
can be used in conjunction with human coders to speed
up the study of facial configurations in millions of
images in the wild. It is likely that automated methods
will continue to improve as better and more robust
algorithms are developed and as more diverse face
images become available.
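For readers unfamiliar with the F1 metric mentioned above, the toy Python sketch below (ours) shows the arithmetic: precision is the proportion of detected AUs that the expert coders also coded, recall is the proportion of expert-coded AUs that the algorithm detected, and F1 is their harmonic mean. Benchmark evaluations compute these quantities per AU over large image sets, not per frame as in this simplified example.

def f1_score(expert_aus, detected_aus):
    # Harmonic mean of precision and recall for one frame's AU codes.
    expert_aus, detected_aus = set(expert_aus), set(detected_aus)
    true_positives = len(expert_aus & detected_aus)
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(detected_aus)
    recall = true_positives / len(expert_aus)
    return 2 * precision * recall / (precision + recall)

# Experts coded AUs 1, 4, and 15; the algorithm detected AUs 1, 4, and 12.
print(round(f1_score({1, 4, 15}, {1, 4, 12}), 2))  # 0.67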
Measuring an emotional state. Once an approach has
been chosen for measuring facial movements, a clear test
of the common view of emotional expressions depends
on having valid measures that reliably and specifically
characterize, in a generalizable way, the instances of each
emotion category to which the measurements of facial
muscle movements can be compared. The methods that
scientists use to assess people’s emotional states vary in
their dependence on human inference, however, which
raises questions about the validity of the measures.
Relatively objective measures of an emotional instance.
The more objective end of the measurement spectrum
includes assessing emotions with dynamic changes in
the autonomic nervous system (ANS), such as cardiovas-
cular, respiratory, or perspiration changes (measured as
variations in skin conductance), and dynamic changes
in the central nervous system, such as changes in blood
flow or electrical activity in the brain. These measures
are thought to be more objective because the measure-
ments themselves (assigning the numbers) do not require
a human judgment (i.e., the measurements are perceiver-
independent). Only the interpretation of the measure-
ments (their psychological meaning) requires human
inference. For example, a human observer does not judge
whether skin conductance or neural activity increases or
decreases; human judgment comes into play when the
measurements are interpreted for the emotional meaning.
Currently, there are no objective measures, either
singly or as a pattern, that reliably, uniquely, and rep-
licably identify an instance of one emotion category
compared with an instance of another. Statistical sum-
maries of hundreds of experiments (i.e., meta-analyses)
show, for example, that currently there is no reliable
relationship between an emotion category, such as
anger, and a specific set of physical changes in the ANS
that accompany the instances of that category, even
probabilistically (the most comprehensive study pub-
lished to date is Siegel et al., 2018, but for earlier stud-
ies, see Cacioppo, Berntson, Larsen, Poehlmann, & Ito,
2000; Stemmler, 2004; also see Box 8 in the Supplemental
Material). In anger, for example, skin conductance can go
up, go down, or stay the same (i.e., changes in skin con-
ductance are not consistently associated with anger). And
a rise in skin conductance is not unique to instances of
anger; it also can occur during a range of other emotional
episodes (i.e., changes in skin conductance do not specifi-
cally occur in anger and only in anger).16
Individual studies often report patterns of ANS mea-
sures that distinguish an instance of one emotion cat-
egory from another, but those patterns are not replicable
across studies and instead vary across studies, even
when studies (a) use the same methods and stimuli and
(b) sample from the same population of participants
(e.g., compare findings from Kragel & LaBar, 2013, with
those from Stephens, Christie, & Friedman, 2010). Simi-
lar within-category variation is routinely observed for
changes in neural activity measured with brain imaging
(Lindquist, Wager, Kober, Bliss-Moreau, & Barrett, 2012)
and single-neuron recordings (Guillory & Bujarski,
2014). For example, pattern-classification studies dis-
cover multivariate patterns of activity across the brain
for emotion categories such as anger, sadness, fear, and
so on, but these patterns are not replicable from study
to study (e.g., compare Kragel & LaBar, 2015; Saarimäki
et al., 2016; Wager et al., 2015; for a discussion, see
Clark-Polner, Johnson, & Barrett, 2017). This observed
variation does not imply that biological variability dur-
ing emotional episodes is random; rather, it may be
context-dependent (e.g., the yellow and green zones
of Fig. 1). It may also be the case that current biological
measures are simply insufficiently sensitive or compre-
hensive to capture situated variation in a precise way.
If this is so, then such variation should be considered
unexplained rather than random.
It is worth pointing out the difficult circularity built
into these studies that we encounter again a few para-
graphs down: Scientists must use some criterion for
identifying when instances of an emotion category are
present in the first place (so as to draw conclusions
about whether emotion categories can be distinguished
by different patterns of physical measurements).17 In
most studies that attempt to find bodily or neural
“signatures” of emotions, the criterion is subjective—it
is either reported by the participants or provided by
the scientist—which introduces problems of its own,
as we discuss in the next section.
Subjective measures of an emotional instance. With-
out objective measures to identify the emotional state
of a participant, scientists typically rely on the relatively
more subjective measures that anchor the other end of
the measurement spectrum. The subjective judgments
can come from the participants (who complete self-report
measures), from other observers (who infer emotion in the
participants), or from the scientists themselves (who use a
variety of criteria, including common sense, to infer the
presence of an emotional episode). These are all exam-
ples of perceiver-dependent measurements because
the measurements themselves, as well as their interpreta-
tion, rely directly on human inference.
Scientists often rely on their own judgments and
intuitions (as Charles Darwin did) to stipulate when an
emotion is present or absent in participants. For exam-
ple, snakes and spiders are said to evoke fear. So are
situations that involve escaping from a predator. Some-
times scientists stipulate that certain actions indicate
the presence of fear, such as freezing or fleeing or even
attacking in defense. The validity of the conclusions
that scientists draw about emotions depends on the
validity of their initial assumptions.18
Inferences about emotional episodes can also come
from other people—for example, independent samples
of study participants, who categorize the situations in
which facial movements are observed. Scientists can
also ask observers to infer when participants are emo-
tional by having them judge subjects’ behavior or tone
of voice (e.g., see our later discussion of Camras et al.,
2007, in the section on infants and children).
Another common strategy for identifying the emo-
tional state of participants is simply to ask them what
they are experiencing. Their self-reports of emotional
experience then become the criteria for deciding
whether an emotional episode is present or absent.
Self-reports are often considered imperfect measures
of emotion because they depend on subjective judg-
ments and beliefs and require translation into words.
In addition, people can experience an emotional event
yet be unaware of it (i.e., conscious with no self-
awareness) or unable to express emotion with words
(a condition called alexithymia) and therefore unable
to report on it. Despite questions about their validity,
self-reports are the most common measure of emotion
that scientists compare with facial AUs.
Human inference and assessing the presence of an
emotional state. At this point, it should be obvious that
any measure of an emotional state itself requires some
degree of human inference; what varies is the amount
of inference that is required. Herein lies a problem: To
properly test the hypothesis that certain facial move-
ments reliably and specifically express emotion, scien-
tists (ironically) must first make a reverse inference that
an emotional event is occurring—that is, they infer the
emotional instance by observing changes in the body,
brain, and behavior. Or they infer (a reverse inference)
that an event or object evokes an instance of a specific
emotion category (e.g., an electric shock elicits fear but
not irritation, curiosity, or uncertainty). These reverse
inferences are scientifically sound only if measures of
emotion reliably, specifically, and validly characterize the
instances of the emotion category. So, any clear, scientific
test of the common view of emotional expressions rests
on a set of more basic inferences about whether an emo-
tional episode is present or absent, and any conclusions
that come from such a test are only as sound as those
basic inferences. (It is, of course, also possible simply to
stipulate the emotion: For instance, a researcher could
choose to define fear as the set of internal states caused
by electric shock, an approach that becomes tautological
if not further constrained.)
If all measures of emotion rest on human judgment
to some degree, then, in principle, a scientist cannot
be sure that an emotional state is present independently
of that judgment, which in turn limits the observer-
independent validity of any experiment designed to test
whether a facial configuration validly expresses a spe-
cific emotion category. All face–emotion associations
that are observed in an experiment reflect human
consensus—that is, the degree of agreement between
self-judgments (from the participants), expert judg-
ments (from the scientist), and/or judgments from other
observers (perceivers who are asked to infer emotion
in the participants). These types of agreement are often
referred to as accuracy, but this may or may not be valid.
We touch on this point again when we discuss studies
that test whether certain facial configurations are rou-
tinely perceived as expressions of specific emotion
categories.
Testing the common view of emotional expressions:
interpreting the scientific observations. If a specific
facial configuration reliably expresses instances of a cer-
tain emotion category in any given experiment, then we
would expect measurements of the face (e.g., facial AU
codes) to co-occur with other measures that indicate that
participants are in the target emotional state. In principle,
those measures might be more objective (e.g., ANS
changes during an emotional event) or they might be
more subjective (e.g., ratings provided by the participants
themselves). In practice, however, the vast majority of
experiments compare facial movements with subjective
measures of emotion—a scientist’s judgment about which
emotions are likely to be evoked by a particular stimulus,
the judgments of other human observers about partici-
pants’ emotional states, or participants’ self-reports of
emotional experience. For example, in an experiment,
scientists might ask questions like these: Do the AUs that
create a scowling facial configuration co-occur with self-
reports of feeling angry? Do the AUs that create a pouting
facial configuration co-occur with perceiver’s judgments
that participants are sad? Do the AUs that create a wide-
eyed, gasping facial configuration co-occur when people
are exposed to an electric shock? If such observations
suggest that a configuration of muscle movements is reli-
ably observed during episodes of a given emotion cate-
gory, then those movements are said to express the
emotion in question. As we will see, many studies show
that some facial configurations occur more often than
random chance would allow but are not observed with a
high degree of reliability (according to the criteria from
Haidt and Keltner, 1999, explained in Table 2 of the cur-
rent article).
If a facial configuration specifically (i.e., uniquely)
expresses instances of a certain emotion category in
any given experiment, then we would expect to observe
little co-occurrence between measurements of the face
and measurements indicating the presence of emotional
instances from other categories (again, see Table 2 and
Fig. 3).
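Stated as arithmetic, and using deliberately made-up counts, the Python sketch below (ours) shows one simplified way these two quantities could be computed from co-occurrence data: reliability as the proportion of instances of the target emotion category in which the hypothesized configuration is coded, and specificity in terms of how rarely that same configuration is coded during instances of other categories (one minus the false-positive rate).

def reliability(times_config_coded, target_emotion_instances):
    # Proportion of target-emotion instances in which the configuration was coded.
    return times_config_coded / target_emotion_instances

def specificity(times_config_coded_elsewhere, other_emotion_instances):
    # One minus the false-positive rate across instances of other emotion categories.
    return 1 - (times_config_coded_elsewhere / other_emotion_instances)

# Hypothetical counts: scowls coded in 30 of 100 anger instances, and in
# 20 of 200 instances belonging to other emotion categories.
print(reliability(30, 100))   # 0.3 (weak by the criteria summarized in Table 2)
print(specificity(20, 200))   # 0.9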
If a configuration of facial movements is observed
in instances of a certain emotion category in a reliable,
specific way within an experiment, so that we can infer
that the movements are expressing an instance of the
emotion in that study as hypothesized, then scientists
can safely infer that the facial movements in question
are an expression of that emotion category’s instances
in that situation. One more step is required before we
can infer that the facial configuration is the expression
of that emotion: We must observe a similar pattern of
facial configuration–emotion co-occurrences across dif-
ferent experiments, to some extent generalizing across
the specific measures and methods used and the par-
ticipants and contexts sampled. If the facial configuration–
emotion co-occurrences replicate across experiments
that sample people from the same culture, then the
facial configuration in question can reasonably be
referred to as an emotional expression only in that
culture; for example, if a scowling facial configuration
co-occurs with measures of anger (and only anger)
across most studies conducted on adult participants in
the United States who are free from illness, then it is
reasonable to refer to a scowl as an expression of
anger in healthy adults in the United States. If facial
configuration–emotion co-occurrences generalize across
cultures—that is, if they are replicated across experi-
ments that sample a variety of instances of that emotion
category in people from different cultures—then the
facial configuration in question can be said to univer-
sally express the emotion category in question.
Studies of healthy adults from the United
States and other developed nations
We now review the scientific evidence from studies that
document how people spontaneously move their facial
muscles during instances of anger, disgust, fear, happi-
ness, sadness, and surprise, as well as how they pose
their faces when asked to indicate how they express
each emotion category. We examine evidence gathered
in the lab and in naturalistic settings, sampling healthy
adults who live in a variety of cultural contexts. To
evaluate the reliability, specificity, and generalizability
of the scientific findings, we adapted criteria set out by
Haidt and Keltner (1999), as discussed in Table 2.
Spontaneous facial movements in laboratory stud-
ies. A meta-analysis was recently conducted to test the
hypothesis that the facial configurations in Figure 4 co-
occur, as hypothesized, with the instances of specific
emotion categories (Duran, Reisenzein, & Fernández-
Dols, 2017). Thirty-seven published articles reported on
how people moved their faces when exposed to objects
or events that evoke emotion. Most studies included in
the meta-analysis were conducted in the laboratory. The
findings from these experiments were statistically sum-
marized to assess the reliability of facial movements as
expressions of emotion (see Fig. 6). In all emotion cate-
gories tested, other than fear, participants moved their
facial muscles into the expected configuration more reli-
ably than what we would expect by chance. Reliability
levels were weak, however, indicating that the proposed
facial configurations in Figure 4 have limited reliability
(and to some extent, limited generalizability; i.e., a scowl-
ing facial configuration is an expression of anger, but not
the expression of anger). More often than not, people
moved their faces in ways that were not consistent with
the hypotheses of the common view. An expanded ver-
sion of this meta-analysis (Duran & Fernández-Dols,
2018) analyzed 131 effect sizes from 76 studies totaling
4,487 participants, with similar results: The hypothesized
facial configurations were observed with average effect
sizes (r) of .31 for the correlation between the intensity
of a facial configuration and a measure of anger, disgust,
fear, happiness, sadness, or surprise (corresponding to
weak evidence of reliability; individual correlations for
specific emotion categories ranged from .06 to .45, inter-
preted as no evidence of reliability to moderate evi-
dence of reliability). The average proportion of the times
that a facial configuration was observed during an
emotional event (in one of those categories) was .22
(proportions for specific emotion categories ranged from
.11 to .35, interpreted as no evidence to weak evidence
of reliability).19
No overall assessment of specificity was reported in
either the original or the expanded meta-analysis because
most published studies do not report the false-positive
rate (i.e., the frequency with which a facial AU is
observed when an instance of the hypothesized emotion
category was not present; see Fig. 3). Nonetheless, some
striking examples of specificity failures have been docu-
mented in the scientific literature. For example, a certain
smile, called a Duchenne smile, is defined in terms of
facial muscle contractions (i.e., in terms of facial mor-
phology): It involves movement of the orbicularis oculi,
which raises the cheeks and causes wrinkles at the outer
corners of the eyes, in addition to movement of the
zygomaticus major, which raises the corners of the lips
into a smile. A Duchenne smile is thought to be a spon-
taneous expression of authentic happiness. Research
shows, however, that a Duchenne smile can be intention-
ally produced when people are not happy (Gunnery &
Hall, 2014; Gunnery, Hall, & Ruben, 2013; also see
Krumhuber & Manstead, 2009), consistent with evidence
that Duchenne smiles often occur when people are sig-
naling submission or affiliation rather than reflecting
happiness (Rychlowska etal., 2017).
Spontaneous facial movements in naturalistic set-
tings. Studies of facial configuration–emotion category
associations in naturalistic settings tend to yield results
similar to those from studies that were conducted in more
controlled laboratory settings (Fernández-Dols, 2017;
Fernández-Dols & Crivelli, 2013). Some studies observe
that people express emotions in real-world settings by
spontaneously making the facial muscle movements pro-
posed in Figure 4, but such observations are generally not
replicable across studies (e.g., cf. Matsumoto & Willingham,
2006 and Crivelli, Carrera, & Fernández-Dols, 2015; cf.
Rosenberg & Ekman, 1994 and Fernández-Dols, Sanchez,
Carrera, & Ruiz-Belda, 1997). For example, two field
studies of winning judo fighters recently demonstrated
that so-called Duchenne smiles were better predicted by
whether an athlete was interacting with an audience than
the degree of happiness reported after winning their
Fig. 6. Meta-analysis of facial movements during emotional episodes: a summary of effect sizes across studies (Duran,
Reisenzein, & Fernández-Dols, 2017). Effect sizes are computed as correlations or proportions (as reported in the original
experiments). Results include experiments that reported a correspondence between a facial configuration and its hypoth-
esized emotion category as well as those that reported a correspondence between individual AUs of that facial configuration
and the relevant emotion category; meta-analytic effect sizes that summarized only the effects for entire ensembles of AUs
(the facial configurations specified in Fig. 4) were even lower than those reported here.
matches (Crivelli et al., 2015). Only 8 of the 55 winning
fighters produced a Duchenne smile in Study 1; all occurred
during a social interaction. Only 25 of 119 winning fighters
produced a Duchenne smile in Study 2, documenting, at
best, weak evidence for reliability.
Posed facial movements. Another source of evidence
comes from asking participants sampled from various
cultures to deliberately pose the facial configurations that
they believe they use to express emotions. In these stud-
ies, participants are given a single emotion word or a
single, brief statement to describe each emotion category
and are then asked to freely pose the facial configuration
that they believe they make when expressing that emo-
tion. Such research directly examines common beliefs
about emotional expressions. For example, one study
provided college students from Canada and Gabon (in
Central Africa) with dictionary definitions for 10 emotion
categories. After practicing in front of a mirror, partici-
pants posed the facial configurations so that “their friends
would be able to understand easily what they feel”
(Elfenbein, Beaupre, Levesque, & Hess, 2007, p. 134) and
their poses were FACS coded. Likewise, a recent study
asked college students in China, India, Japan, Korea, and
the United States to pose the facial movements they
believe they make when expressing each of 22 emotion
categories (Cordaro et al., 2018). Participants heard a
brief scenario describing an event that might cause anger
(“You have been insulted, and you are very angry about
it”) and then were instructed to pose a facial (and non-
verbal but vocal) expression of emotion, as if the events
in the scenario were happening to them. Experimenters
were present in the testing room as participants posed
their responses. Both studies found moderate to strong
evidence that participants across cultures share common
beliefs about the expressive pose for anger, fear, and sur-
prise categories; there was weak to moderate evidence
for the happiness category, and weak evidence for the
disgust and sadness categories (Fig. 7). Cultural variation
in participants’ beliefs about emotional expressions was
also observed.
Neither study compared participants’ posed expres-
sions (their beliefs about how they move their facial
muscles to express emotions) with observations of how
they actually moved their faces when expressing emo-
tion. Nonetheless, a quick comparison of the findings
from the two studies and the proportions of spontane-
ous facial movements made during emotional events
(from the Duran et al., 2017 meta-analysis) makes it
clear that posed and spontaneous movements differ,
sometimes quite substantially (again, see Fig. 7). When
people pose a facial configuration that they believe
expresses an emotion category, they make facial move-
ments that more reliably agree with the hypothesized
facial configurations in Figure 4. The same cannot be
said of people’s spontaneous facial movements during
actual emotional episodes, however (for convergent evi-
dence, see Motley & Camden, 1988; Namba, Makihara,
Kabir, Miyatani, & Nakao, 2016). One possible interpre-
tation of these findings is that posed and spontaneous
facial-muscle configurations correspond to distinct com-
munication systems. Indeed, there is some evidence
that volitional and involuntary facial movements are
controlled by different neural circuits (Rinn, 1984).
Another factor that may contribute to the discrepancy
between posed and spontaneous facial movements is
that people’s beliefs about their own behavior often
reflect their stereotypes and do not necessarily cor-
respond to how they actually behave in real life (see
Robinson & Clore, 2002). Indeed, if people’s beliefs, as
measured by their facial poses, are influenced directly
by the common view, then any observed relationship
between posed facial expressions and hypothesized
emotion categories is merely evidence of the beliefs
themselves.
Summary. Our review of the available evidence thus
far is summarized in the first through third data rows in
Table 3. The hypothesized facial configurations presented
in Figure 4 spontaneously occur with weak reliability
during instances of the predicted emotion category, sug-
gesting that they sometimes serve to express the pre-
dicted emotion. Furthermore, the specificity of each
facial configuration as an expression of an emotion cat-
egory is largely unknown (because it is typically not
reported in many studies). In our view, this pattern of
findings is most compatible with the interpretation that
hypothesized facial configurations are not observed reli-
ably or specifically enough to justify using them to infer
a person’s emotional state, whether in the lab or in every-
day life. We are not suggesting that facial movements are
meaningless and devoid of information. Instead, the data
suggest that the meaning of any set of facial movements
may be much more variable and context-dependent than
hypothesized by the common view.
Studies of healthy adults living in
small-scale, remote cultures
The emotion categories that are at the heart of the com-
mon view—anger, disgust, fear, happiness, sadness, and
surprise—derive from modern U.S. English (Wierzbicka,
2014), and their proposed expressions (in Fig. 4) derive
from observations of people who live in urbanized,
Western settings. Nonetheless, it is hypothesized that
these facial configurations evolved as emotion-specific
expressions to signal socially relevant emotional infor-
mation (Shariff & Tracy, 2011) in the challenging
situations that originated in our hunting-and-gathering
hominin ancestors who lived on the African savannah
during the Pleistocene era (Pinker, 1997; Tooby &
Cosmides, 1990). It is further hypothesized that these
facial configurations should therefore be observed dur-
ing instances of the predicted emotion categories with
strong reliability and specificity in people around the
world, although the facial movements might be slightly
modified by culture (Cordaro et al., 2018; Ekman, 1972).
The strongest test of these hypotheses would be to
sample participants who live in remote parts of the
world with relatively little exposure to Western cultural
norms, practices, and values (Henrich et al., 2010;
Norenzayan & Heine, 2005) and observe their facial
movements during emotional episodes.20 In our evalu-
ation of the evidence, we continued to use the criteria
summarized by Haidt and Keltner (1999; see Table 2 in
the current article).
Spontaneous facial movements in naturalistic set-
tings. Our review of scientific studies that systematically
measure the spontaneous facial movements in people of
small-scale, remote cultures is brief by necessity: There
are no such studies. At the time of publication, we were
unable to identify even a single published report or man-
uscript registered on open-access, preprint services that
measured facial muscle movements in people of remote
cultures as they experienced emotional events. Scientists
have almost exclusively observed how people from
remote cultures label facial configurations as emotional
expressions (i.e., studying emotion perception, not pro-
duction) to test the hypothesis that certain facial configu-
rations evolved to express certain emotion categories in a
reliable, specific, and generalizable (i.e., universal) man-
ner. Later in this article, we return to this issue and discuss
the findings from these emotion-perception studies.
There are nonetheless several descriptive reports that
provide support for the common view of universal emo-
tional expressions (similar to what Valente, Theurel, &
Gentaz, 2018, refer to as an “observational approach”).
For example, the U.S. psychologist Paul Ekman and
colleagues curated an archive of photographs of the
Fore hunter-gatherers taken during his visits to Papua
New Guinea in the 1960s (Ekman, 1980). The photo-
graphs were taken as people went about their daily
activities in the small hamlets of the eastern highlands
of Papua New Guinea. Ekman used his knowledge of
the situation in which each photograph was taken to
Fig. 7. Comparing posed and spontaneous facial movements. Correlations or proportions are presented for anger,
disgust, fear, happiness, sadness, and surprise, separately for three studies. Data are from Table 6 in Cordaro et al.
(2018), from Elfenbein, Beaupre, Levesque, and Hess (2007; reliability for the anger category is for AU4 + AU5 only),
and from Duran, Reisenzein, and Fernández-Dols (2017; proportion data only).
assign each facial configuration to an emotion category,
leading him to conclude that the Fore expressed emo-
tions with the proposed facial configurations shown in
Figure 4. Yet different scientific methods yielded a con-
trasting conclusion. When Trobriand Islanders living in
Papua New Guinea were asked to infer emotions in
facial configurations by labeling these same photo-
graphs in their native language, both by freely offering
words and by choosing the best fitting emotion word
from a list of nine choices, they did not label the facial
configurations as proposed by Ekman and colleagues,
at above-chance levels (Crivelli, Russell, Jarillo, &
Fernández-Dols, 2017).21 In fact, the proposed fear
expression—the wide-eyed, gasping face—is reliably
interpreted as an expression of threat (intent to harm)
and anger by the Maori of New Zealand and by the
Trobriand Islanders in Papua New Guinea (Crivelli,
Jarillo & Fridlund, 2016).
A compendium of spontaneous human behavior
published by the Austrian ethologist Irenäus Eibl-
Eibesfeldt (1989) is sometimes cited as evidence for the
hypothesis that certain facial movements are universal
signals for specific emotion categories. No systematic
coding procedure was used in his investigations, how-
ever. On close examination, Eibl-Eibesfeldt’s detailed
descriptions appear to be more consistent with results
from the studies of people living in more industrialized
cultures that we reviewed above: People move their
faces in a variety of ways during episodes belonging
to the same emotion category. For example, as reported
by Eibl-Eibesfeldt, a rapid eyebrow raise (called an
eyebrow flash) is thought to express friendly recogni-
tion in some cultures but not all. Likewise, particular
facial muscle movements are not specific expressions
of a given emotion category. For example, an eyebrow
flash would be coded with FACS AU 1 (inner brow
raise) and AU 2 (outer brow raise), which are part of
the proposed expressions for surprise and fear (Ekman,
Levenson, & Friesen, 1983), sympathy (Haidt & Keltner,
1999), and awe (Shiota, Campos, & Keltner, 2003). Even
Eibl-Eibesfeldt acknowledged that eyebrow flashes
were not unique expressions of specific emotion cat-
egories, writing that they also served as a greeting, as
an invitation for social contact, as a sign of thanks, as
an initiation of flirting, and as a general indication of
“yes” in Samoans and other Polynesians, in the Eipo
and Trobriand islanders in Papua New Guinea, and in
the Yanomami of South America. In Japan, eyebrow
flashes are considered an impolite way for adults to
greet one another. In the United States and Europe, an
eyebrow flash was observed when greeting friends but
not when greeting strangers.
Table 3. Reliability and Specificity: A Summary of the Evidence
Study type Reliability Specificity
Expression production
Adults, developed, spontaneous, lab Weak Unknown
Adults, developed, spontaneous, naturalistic Weak Unknown
Adults, developed, posed Weak to strong Unknown
Adults, remote, spontaneous Unknown Unknown
Adults, remote, posed Weak to strong Unknown
Newborns, infants, toddlers Unsupported Unsupported
Congenitally blind Unsupported to weak Unsupported
Emotion perception
Adults, developed, choice-from-array Moderate to strong Unknown
Adults, developed, reverse correlation (with choice-from-array) Moderate Moderate
Adults, developed, free labeling Weak to moderate Weak
Adults, developed, virtual humans Unknown Unknown
Adults, remote, choice-from-array (before 2008) Moderate to strong Unknown
Adults, remote, choice-from-array (after 2008) Weak to moderate Unsupported
Adults, remote, free labeling (before 2008) Unsupported to strong Variable
Adults, remote, free labeling (after 2008) Unsupported Unsupported
Infants, young children Unsupported Unsupported
Note. Criteria were adopted from Haidt and Keltner (1999), who suggest that reliability rates of 70% to 90% are considered
strong evidence for universal emotion perception (following Ekman, 1994); presumably, this would also hold for studies
of expression production. Weak evidence is in the range of 20% to 40% (Haidt & Keltner, 1999, citing Russell, 1994). By
interpolation, reliability between 41% and 69% would be considered moderate evidence. Reliability estimates below 20%
are interpreted as findings that clearly do not support the reliability hypothesis. We also adopted these criteria for specificity
findings. Developed = studies of participants from the U.S. and other more urban countries; spontaneous = spontaneous
facial movements; posed = posed facial configurations; remote = studies of participants from small-scale, remote samples.
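Applied literally, the criteria in this note amount to a simple threshold rule; the Python sketch below (ours, a convenience for readers rather than part of the original criteria) makes the bands explicit, treating rates at or above 70% as strong, 41% to 69% as moderate, 20% to 40% as weak, and below 20% as unsupported.

def evidence_label(reliability_pct):
    # Map a reliability rate (in percent) to the evidence labels used in Table 3,
    # following the criteria adapted from Haidt and Keltner (1999).
    if reliability_pct >= 70:
        return "strong"
    if reliability_pct >= 41:
        return "moderate"
    if reliability_pct >= 20:
        return "weak"
    return "unsupported"

print(evidence_label(85))  # "strong"
print(evidence_label(31))  # "weak"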
Posed facial movements. In the only study of expres-
sion production in a natural environment that we could
find, researchers read a brief emotion story to people
who live in the remote Fore culture of Papua New Guinea
and asked each person to “show how his face would
appear” (Ekman, 1972, p. 273) if he was the person
described in the emotion story (sample size was not
reported). Videotapes of 9 participants were shown to 34
U.S. college students who were asked to judge which
emotion was being expressed. U.S. participants were
asked to infer the emotional meaning of the facial poses
by choosing an emotion word from six choices provided
by the experimenter (called a choice-from-array task;
discussed on page 31 of this article). Participants inferred
the intended emotional meaning at above-chance levels
for smiling (happiness, 73%), frowning (sadness, 68%),
scowling (anger, 51%), and nose-wrinkling (disgust, 46%),
but not for surprise and fear (27% and 18%, respectively).
Summary. Our review of the available evidence from
expression-production studies in small-scale, remote cul-
tures is inconclusive because there are no systematic, con-
trolled observations that examine how people who live in
these cultural contexts spontaneously move their facial
muscles during emotional episodes. The evidence that
does exist suggests that common beliefs about emotion
may share some similarities across urban and small-scale
cultural contexts, but more research is needed before any
interpretations are warranted. These findings are summa-
rized in the fourth and fifth data rows of Table 3.
Studies of healthy infants and children
The facial movements of infants and young children
provide a valuable way to test common beliefs about
emotional expressions because, unlike older children
and adults, babies cannot exert voluntary control over
their spontaneous expressive behaviors, meaning that
they are unable to deliberately mask or portray instances
of emotion in accordance with social demands. As a
general rule, infants understand far more about the
world than they can easily convey through their physi-
cal actions, making it difficult for experiments to dis-
tinguish between what infants understand and what
they can do; the former often exceeds the latter (Pollak,
2009). Experiments must use human inference to deter-
mine when an infant is in an emotional state, as is the
case in studies of adults (see Human Inference and
Assessing the Presence of an Emotional State, above).
The presence (or absence) of an instance of emotion
is inferred (i.e., stipulated), either by a scientist (who
exposes a child to something that is presumed to evoke
an emotion episode) or by adult “raters” who infer the
emotional meaning of the evoking situation or of the
child’s body movements and vocalizations (see Subjec-
tive Measures of an Emotional Instance, above). In the
latter cases, inferences are measured by asking research
participants to label the situation or the child’s emo-
tional state by choosing an emotion word or image from
a small set of options, known as a choice-from-array
task. We address the strengths and weaknesses of
choice-from-array tasks (see Fig. 8) and the potential
risk of confirmatory bias with the use of such methods
(see A Note on Interpreting the Data, below).
There is also a risk, given the strong reliance on human
inference, that scientists will implicitly confound the mea-
surements made in an experiment with their interpretation
of those measurements, in effect overinterpreting infant
behavior as emotional, in part because these young
research participants cannot speak for themselves. Some
early and influential studies confounded the observation
of facial movements with their interpreted emotional
meaning, leading to the conclusions that babies as young
as 7 months old were capable of producing an expression
of anger. In fact, it is more scientifically correct to say that
the babies were scowling. For example, in one study,
infants’ facial movements were coded as they were given
a cookie, and then the cookie was taken away and placed
out of reach, although it was still clearly visible. The
babies appeared to scowl when the cookie was removed
and not when it was in their mouths (Stenberg, Campos,
& Emde, 1983). It is certainly possible that this repeated
giving and taking away of the treat angered the infants,
but the babies might also have been confused or just
generally distressed. Without some independent evidence
to indicate that a state of anger was induced, we cannot
confidently conclude that certain facial movements in an
infant reliably express a specific instance of emotion.
The Stenberg et al. (1983) study illustrates some of
the concerning design issues that have historically
plagued studies with infants. First, emotion-inducing
situations are often defined with common-sense intu-
itions rather than objective evidence (e.g., an infant is
assumed to become angry when a cookie is taken
away). In fact, it is difficult to know how any individual
infant at any point in time will construct and react to
such an event. Second, when an infant produces a facial
movement, a common assumption is used to infer its
emotional meaning without additional measures or con-
trols (e.g., when a scowling facial configuration is
observed, it is assumed to necessarily be an expression
of infant anger, even if there are no data to confirm that
a scowl is specific to instances of anger in an infant).
In fact, years later, as their research program pro-
gressed, Campos and his team revised their earlier inter-
pretation of their findings, later concluding that the
facial movements in question (infants lowering and
drawing together their brows, staring straight ahead, or
pressing their lips together) were more generally associ-
ated with unpleasantness and distress and were not
reliable expressions of anger (e.g., Camras etal., 2007).
The inference problem is particularly poignant when
fetuses are studied. For example, in a 4-D ultrasonog-
raphy study performed with fetuses at 20 gestational
weeks, researchers observed the fetuses knitting their
brows and described the facial movements as expres-
sions of distress (Dondi et al., 2012). Yet the fetuses were
producing these facial movements during situations in
which fetal distress was unlikely. The brow-knitting was
observed during noninvasive ultrasound scanning that
did not involve perturbation of the fetus, and the preg-
nant women were at rest. Furthermore, the scans were
brief, and the facial movements were interspersed with
other movements that are typically not thought to
express negative emotions, such as smiling and mouth-
ing. This is an example of making a scientific inference
Action unit description and associated emotion words (United Kingdom / China):
AUs 6 + 12 + 13 + 14
  United Kingdom: Delighted, Joy, Happy, Cheerful, Contempt, Pride
  China: Joyful, Delighted, Happy, Glad, Feel Well, Pleasantly Surprised, Embarrassed, Pride
AUs 4 + 20 + 24 + 43
  United Kingdom: Fear, Scared, Anxious, Upset, Miserable, Sad, Depressed, Shame, Embarrassed
  China: Afraid, Anxious, Distressed, Broken-Hearted, Sorrow and Sadness, Having a Hard Time, Grief, Dismay, Anguish, Worry, Vexed, Unhappy, Shame, Despise
AUs 2 + 5 + 26 + 27
  United Kingdom: Ecstatic, Excited, Surprised, Frightened, Terrified
  China: Amazed, Greatly Surprised, Alarmed and Panicky, Scared, Fear
AUs 7 + 9 + 16 + 22
  United Kingdom: Hate, Disgust, Fury, Rage, Anger
  China: Disgusted, Bristle with Anger, Furious, Wild Wrath, Storm of Fury, Storm of Anger, Indignant, Rage
Fig. 8. Culturally common facial configurations extracted using reverse correlation from 62 models of facial configurations. Red
coloring indicates stronger action unit (AU) presence and blue indicates weaker AU presence. Some words and phrases that refer
to emotion categories in Chinese are not considered emotion categories in English. Adapted with permission of the American
Psychological Association, from Revealing Culturally Common Facial Expressions of Emotion, by Jack, R. E., Sun, W., Delis, I., Garrod,
O. G., and Schyns, P. G., in the Journal of Experimental Psychology: General, Vol. 145. Copyright © 2016; permission conveyed through
Copyright Clearance Center, Inc.
about the presence of an emotion solely on the basis
of the facial movements without converging evidence
that the organism in question (a fetus) was in a distressed
state. Doing so highlights the common but unsound
assumption that certain facial movements reliably and spe-
cifically index instances of the same emotion category.
The study of expression production in infants and
children must deal with other design challenges—in
addition to the reliance on human inference—that are
shared by experiments with adult participants. In par-
ticular, most experiments observe facial movements in
a restricted range of laboratory settings rather than in
the wide variety of situations that naturally occur in
everyday life. The frequent use of only a single stimulus
or event to observe facial movements for each emotion
category limits the opportunity to discover whether the
expressions of an emotion category vary systematically
with context.
Even with these design considerations, the scientific
findings from studies of infants and children parallel
those that we encountered from studies on adults: Weak
to no reliability and specificity in facial muscle move-
ments is the norm, not the exception (again, using the
criteria from Haidt & Keltner, 1999, that are presented in
Table 2 of the current article). Although some older stud-
ies concluded that infants produce invariant emotional
expressions (e.g., Izard et al., 1995; Izard, Hembree,
Dougherty, & Spirrizi, 1983; Izard, Hembree, & Huebner,
1987; Lewis, Ramsay, & Sullivan, 2006), these conclusions
have been largely superseded by more recent work and
in many cases have been reinterpreted and revised by
the authors themselves (e.g., Lewis et al., 2006).
Facial movement in fetuses, infants, and young chil-
dren. The most detailed research on facial movements
in fetuses and newborns has focused on smiles. Human
fetuses lower their brows (AU4), raise their cheeks (AU6),
wrinkle their noses (AU9), crease their nasolabia (AU11),
pull the corners of their lips (AU12), show their tongues
(AU19), part their lips (AU25), and stretch their mouths
(AU27)—all of which have been implicated, to some
degree, in adult laughter. Infants sometimes produce
facial movements that resemble adult laughter when other
considerations suggest that they are in distress and pain
(Dondi et al., 2012; Hata et al., 2013; Reissland, Francis, &
Mason, 2013; Reissland, Francis, Mason, & Lincoln, 2011;
Yan et al., 2006). Within 24 hr of birth, infants raise their
cheek muscles in response to being touched (Cecchini,
Baroni, Di Vito, & Lai, 2011). But these movements are not
specific to smiling; neonates also raise their cheeks (con-
tract the zygomatic muscle) during rapid eye movement
(REM) sleep, when drowsy, and during active sleep (Dondi
et al., 2007). A neonatal smile with raised cheeks is caused
by brainstem activation (Rinn, 1984), and likely reflects
internally generated arousal rather than expressing or com-
municating an emotion or even a more general feeling of
pleasure (Emde & Koenig, 1969; Sroufe, 1996; Wolff, 1987).
So, it remains unclear whether fetal or neonatal facial mus-
cle movements have any relationship to specific emotional
episodes as well as more generally to pleasant feelings or
to other social meanings (Messinger, 2002).
In fact, it is not clear that fetal and neonatal facial
movements always have a psychological meaning (con-
sistent with a behavioral-ecology view of facial move-
ments; Fridlund, 2017). Newborns appear to produce
some combinations of facial movements for muscular
reasons. For example, infants produce facial movements
associated with the proposed expression for surprise
(open mouth and raised eyebrows) in situations that
are unsurprising, just because opening the mouth nec-
essarily raises their eyebrows; conversely, infants do
not consistently show the proposed expressive configu-
ration for surprise in contexts that are likely to be
surprising (Camras, 1992; Camras, Castro, Halberstadt,
& Shuster, 2017). The facial movement that is part of
the proposed expression for sadness (brows oblique
and drawn together) occurs when infants attempt to lift
their heads to direct their gaze (Michel, Camras, &
Sullivan, 1992).
In addition, newborns produce many facial move-
ments that co-occur with fussiness, distress, focused
attention, and distaste (Oster, 2005). Newborns react to
being given sweet versus sour liquids; for example, when
given a sour liquid, newborns make a nose-wrinkle
movement, which is part of the proposed expressive
configuration for disgust (Granchrow, Steiner, & Daher,
1983). However, other studies show that newborns also
make this facial movement when given sweet, salty, and
bitter tastes (e.g., Rosenstein & Oster, 1988). Still other
studies show that nose-wrinkling does not always occur
when infants taste lemon juice (i.e., when that facial
movement is expected; Bennett, Bendersky, & Lewis,
2002). More generally, infants rarely produce consistent
facial movements that cleanly map onto any single emo-
tion category. Instead, infants produce a variety of facial
configurations that suggest a lack of emotional specific-
ity (Matias & Cohn, 1993).
There are further examples that illustrate how infant
facial movements lack strong reliability and specificity.
In a study of 11-month-old babies from the United
States, China, and Japan, infants saw a toy gorilla head
that growled (to induce fear) or their arms were
restrained (to induce anger; Camras etal., 2007). Observ-
ers judged the infants to be fearful or angry on the basis
of their body movements, yet the infants produced the
same facial movements in the two situations.22 In another
study, 1-year-old infants were videotaped in situations
in which they were tickled (to elicit joy), tasted sour
flavors (to elicit disgust), watched a jack-in-the box (to
elicit surprise), had an arm restrained (to elicit anger),
and were approached by a masked stranger (to elicit
fear; Bennett et al., 2002). Infants whose arms were
restrained (to purportedly induce an instance of anger)
produced the facial actions associated with the pro-
posed facial configuration for an anger expression only
24% of the time (low reliability); instead, 80 infants
(54%) produced the facial actions proposed as the
expression of surprise, 37 infants (25%) produced the
facial actions proposed as the expression of joy, 29
infants (19%) produced the facial actions proposed as
the expression of fear, and 28 (18%) produced the facial
actions proposed as the expression of sadness. This
dramatic lack of specificity was observed for all emotion
categories studied. An equal number of babies produced
facial movements that are proposed as the expressions
of joy, surprise, anger, disgust, and fear categories when
a sour liquid was placed on infants’ tongues to elicit
disgust. When infants faced a masked stranger, only 20
(13%) produced facial movements that corresponded to
the proposed expression for fear, compared with 56
infants (37%) who produced facial actions associated
with the proposed expression for instances of joy.23
Taken together, these findings suggest that infant
facial movements may be associated with affect (i.e.,
the affective features of experience, such as distress or
arousal), as originally described by Bridges (1932), or
may communicate a desire to approach or avoid some-
thing (e.g., Lewis, Sullivan, & Kim, 2015). Affective fea-
tures such as valence (ranging from pleasantness to
distress) and arousal (ranging from activated to quies-
cent) are continuous properties of experience, just as
approach/avoidance is an affective property of action.
These affective features are shared by many instances
of different emotion categories, as well as with mental
events that are not considered emotional (as discussed
in Box 9 in the Supplemental Material), but this does
not diminish their importance or effectiveness for
infants.24 Over time, infants likely learn to differentiate
mental events with simple affective features into epi-
sodes of emotion with additional psychological features
that are specific to their sociocultural contexts, making
them maximally effective at eliciting needed responses
from their caregivers (L.F. Barrett, 2017b; Holodynski &
Friedlmeier, 2006; Weiss & Nurcombe, 1992; Witherington,
Campos, & Hertenstein, 2001).
The affective meaning of an infant’s facial move-
ments may, in fact, be what makes these movements
so salient for adult observers. When infants move their
lips, open their mouths, or constrict their eyes, adults
view infants as feeling more pleasant or unpleasant
depending on the context (Bolzani Dinehart et al.,
2005). Infant expressions thus do have a reliable link
to instrumental effects in the adults who observe
them—playing an important role in parent–infant inter-
action, attachment, and the beginnings of social com-
munication (Atzil, Gao, Fradkin, & Barrett, 2018;
Feldman, 2016). For example, if an infant cries with
narrowed eyes, adults infer that the infant is feeling
negative, is having an unwanted experience, or is in
need of help, but if the infant makes that same eye
movement while smiling, adults infer that the infant is
experiencing more positive emotion. These data con-
sistently point to the usefulness of facial movements in
the communication of arousal and valence, particularly
when combined with other communicative features
such as vocalizations (properties of affect; see Box 9
in the Supplemental Material). Even when episodes of
more specific emotions start to emerge, we do not yet
have evidence that facial movements map reliably and
regularly to a specific emotion category.
Young children begin to produce adult-like facial
configurations after the first year of life. Even then,
however, children’s facial movements continue to lack
strong reliability and specificity (Bennett et al., 2002;
Camras & Shutter, 2010; Matias & Cohn, 1993; Oster,
2005). Examples of a wide-eyed, gasping facial configu-
ration, proposed as the expression of fear (see Fig. 4),
have rarely been observed or reported in young infants
(Witherington, Campos, Harriger, Bryan, & Margett,
2010). Nor do infants reliably produce a scowling facial
configuration, proposed as the expression of anger
(again, see Fig. 4). Infants scowl when they cry or are
about to cry (Camras, Fatani, Fraumeni, & Shuster,
2016). A frown (mouth corner depression, AU15) is not
reliably and specifically observed when infants are frus-
trated (Lewis & Sullivan, 2014; Sullivan & Lewis,
2003). A smile (cheek raising and lip corner pulling,
AU6 and AU12) is not reliably observed when infants
are in visually engaging or mastery situations, or even
when they are in pleasant social interactions (Messinger,
2002).
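To make the notions of reliability and specificity used throughout this section concrete, the sketch below (in Python) shows how FACS-coded episodes might be scored against proposed expressive configurations. It is an illustrative sketch only; the AU sets assigned to each category and the toy episodes are hypothetical placeholders, not the coding schemes or data of the studies reviewed here.

# Illustrative sketch: scoring FACS-coded episodes against proposed configurations.
# The AU sets and toy data below are hypothetical placeholders.

PROPOSED = {
    "anger": {4, 5, 7, 23},    # assumed AU set for a scowling configuration
    "happiness": {6, 12},      # cheek raiser (AU6) plus lip corner puller (AU12)
    "sadness": {1, 4, 15},     # includes the lip corner depressor (AU15)
}

def matches(observed_aus, proposed_aus):
    # An episode "shows" a configuration if every proposed AU is present.
    return proposed_aus.issubset(observed_aus)

def reliability(episodes, category):
    # Proportion of episodes of a category that show its proposed configuration.
    relevant = [aus for label, aus in episodes if label == category]
    if not relevant:
        return float("nan")
    return sum(matches(aus, PROPOSED[category]) for aus in relevant) / len(relevant)

def specificity(episodes, category):
    # Of all episodes showing the configuration, the proportion that belong to the category.
    showing = [label for label, aus in episodes if matches(aus, PROPOSED[category])]
    if not showing:
        return float("nan")
    return sum(label == category for label in showing) / len(showing)

# Toy FACS-coded episodes: (situation label, set of observed AUs).
episodes = [
    ("anger", {4, 7}), ("anger", {4, 5, 7, 23}), ("happiness", {6, 12}),
    ("sadness", {4, 15}), ("sadness", {1, 4, 15}), ("happiness", {12}),
]
print(reliability(episodes, "anger"), specificity(episodes, "anger"))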
Experiments that observe young children’s facial
movements in naturalistic settings find largely the same
results as those conducted in controlled laboratory set-
tings. For example, one study trained ethnographic
videographers to record a family’s daily activities over
4 days (Sears, Repetti, Reynolds, & Sperling, 2014). Cod-
ers judged whether or not the child from each partici-
pating family made a scowling facial configuration
(referred to as an expression of anger), a frowning
facial configuration (referred to as an expression of
sadness), and so on, for the six (presumed) emotion
categories included in the study—happiness, sadness,
surprise, disgust, fear, and anger. During instances that
were coded as anger (defined as situations that included
verbal disagreements or sibling bickering, requests for
compliance and/or reprimands from parents, parent
refusal of child requests, during homework, and sibling
provocation), a variety of facial movements were
observed, including frowns, furrowed brows, and eye-
rolls, as well as a variety of vocalizations, including
shouts and whining, and both nonaggressive and
aggressive physical behaviors.
Perhaps the most telling observation for our pur-
poses is that expressions of anger were more often
vocal than facial. During anger situations, children
raised their voices 42% of the time, followed by whining
about 21% of the time. By contrast, children made
scowling facial configurations only 16.2% of the time.25
Yet even during anger situations, the facial movements
were predominantly frowning, which can be part of
many different proposed facial configurations. The
authors reasoned that children engage in specific
behaviors to obtain specific goals, and that behaviors
such as whining are more likely to attract attention and
possibly change parental behavior than is a facial move-
ment. Indeed, it is easier for parents to ignore a nega-
tive facial expression than a whining child in the room!
Similar findings for low reliability and specificity of the
facial configurations presented in Figure 4 were recently
observed in a naturalistic study that videotaped 7- to
9-year-old children and their mothers discussing a con-
flict during their visit to the laboratory related to home-
work, chores, bedtime, or interactions with siblings
(Castro, Camras, Halberstadt, & Shuster, 2018).
Summary. Newborns and infants react to the world
around them with facial movements. There is not yet suf-
ficient evidence, however, to conclude that these facial
movements reliably and specifically express the instances
of any emotion category (findings are summarized in the
sixth data row of Table 3). When considered alongside
vocalizations and body movements, there is consistent
evidence that infant facial movements reliably signal dis-
tress, interest, and arousal and perhaps serve as a call for
help and comfort. In young children, instances of the
same emotion category appear to be expressed with a
variety of different muscle movements, and the same
muscle movements occur during instances of various
emotion categories, and even during nonemotional
instances. It may be the case that reliability and specificity
emerge through learning and development (see Box 10
in the Supplemental Material), but this remains an open
question that awaits future research.
Studies of congenitally blind individuals
Another source of evidence to test the common view
comes from observations of facial movements in people
who were born blind. The assumption is that people
who are blind cannot learn by watching others which
facial muscles to move when expressing emotion. On
the basis of this assumption, several studies have
claimed to find evidence that congenitally blind indi-
viduals express emotions with the hypothesized facial
configurations in Figure 4 (e.g., blind athletes were
reported to show expressions that are reliably inter-
preted as shame and pride; Tracy & Matsumoto, 2008;
see also Matsumoto & Willingham, 2009). People who
are born blind learn through other sensory modali-
ties, however (for a review, see Bedny & Saxe, 2012),
and therefore can learn whatever regularities exist
between emotional states and facial movements from
hearing descriptions in conversation, in books and
movies, and by direct instruction.26 As an example of
such learning, Olympic athletes who won medals
smiled only when they knew they were watched by
other people, such as when they were on the podium
facing the audience; in other situations, such as while
they waited behind the podium or while they were on
the podium facing away from people but toward a flag,
they did not smile (but presumably were still very
happy; Fernández-Dols & Ruiz-Belda, 1995). Such find-
ings are consistent with the behavioral ecology view of
facial expressions (Fridlund, 1991, 2017) and with more
recent sociological evidence that smiles are social cues
that can communicate different social messages depend-
ing on the cultural context (J. Martin, Rychlowska,
Wood, & Niedenthal, 2017).
The limitations that apply to studies of emotional
expressions in sighted individuals, reviewed throughout
this article, are even more applicable to scientific stud-
ies of emotional expressions in the blind.27 Participants
are given predetermined emotion categories that con-
strain their possible responses, and facial movements
are often quantified by human judges who have their
own biases when inferring the emotional meaning of
facial movements (e.g., Galati, Miceli, & Sini, 2001;
Galati, Scherer, & Ricci-Bitti, 1997; Valente et al., 2018).
In addition, people who are blind make additional,
often unusual movements of the head and the eyes
(Chiesa, Galati, & Schmidt, 2015) to better hear objects
or echoes. These unusual movements might influence
expressive facial movements. More importantly, they
reveal whether a participant is blind or sighted, and
this knowledge can bias human raters who are judging
the presence or absence of facial movements in emo-
tional situations.
Helpful insights about the facial expressions of con-
genitally blind individuals come from a recent review
(Valente et al., 2018) that surveyed 21 studies published
between 1932 and 2015. These studies observed how
blind participants move their faces during instances of
emotion and then compared those movements with
both the proposed expressive forms in Figure 4 and the
facial movements of sighted people. Both spontaneous
facial movements and posed movements were tested.
Eight older studies (published between 1932 and 1977)
reported that congenitally blind individuals spontane-
ously expressed emotions with the proposed facial
configurations in Figure 4, but Valente et al. (correctly)
questioned the objectivity of these studies because the
data were based largely on subjective impressions
offered by researchers or their assistants.
The 13 studies published between 1980 and 2015
were better designed: Researchers videotaped partici-
pants’ facial movements and described them using a
formal facial coding system for adults (e.g., FACS) or a
similar coding system for children. There are too few
of these studies and the sample sizes are insufficient to
conduct a formal meta-analysis, but taken together they
suggest that, in general, congenitally blind individuals
spontaneously moved their faces in ways similar to
sighted individuals during instances of emotion: Both
groups expressed instances of anger, disgust, fear, hap-
piness, sadness, or surprise with the proposed expres-
sive configurations (or their individual AUs) in Figure
4, with either weak reliability or no reliability, and
neither group produced any of the configurations with
any specificity (e.g., Galati et al., 2001; Galati et al.,
1997; Galati, Sini, Schmidt, & Tinti, 2003). The lack of
specificity is not surprising given that, on closer inspec-
tion, several of the studies discussed in Valente et al.
(2018) compared emotion categories that systematically
differ in their prototypical affective properties, contrast-
ing facial movements in pleasant and unpleasant cir-
cumstances (e.g., Cole et al., 1989), or observed facial
movements only in pleasant circumstances without
distinguishing the facial AUs for the happiness category
from other positive emotion categories (e.g., Chiesa
et al., 2015). As a consequence, the findings from these
studies cannot be interpreted unambiguously as evidence
specifically pertaining to emotional expressions, per se.
Congenitally blind and sighted individuals were simi-
lar to one another in the variety of their spontaneous
facial movements, but they differed in their posed facial
configurations. After listening to descriptions of situa-
tions that were assumed to elicit an instance of anger,
sadness, fear, disgust, surprise, and happiness, sighted
participants posed their faces with the proposed expres-
sive forms for the negative emotion categories in Figure
4 at higher levels of reliability and specificity than did
blind participants (Galati et al., 1997; Roch-Levecq,
2006). These findings suggest that sighted individuals
share common beliefs about emotional expressions,
replicating other findings with posed expressions (see
Table 3, third data row), whereas congenitally blind
individuals may share these beliefs to a lesser degree;
their knowledge of social rules for producing those
configurations on command differs from that of
sighted individuals.
Taken together, the evidence from studies of blind
individuals is consistent with the other scientific evidence
reviewed so far (see Table 3). Even in the absence of
visual experience, blind individuals, like sighted individu-
als, develop the ability to spontaneously make a variety
of facial movements to express emotion, but those move-
ments do not reliably and specifically configure in the
manner proposed by the common view of emotion
(depicted in Fig. 4). Learning to voluntarily pose the
proposed expressions in Figure 4 does seem to covary
with vision, however, further emphasizing that posed and
spontaneous expressions should be treated as different
phenomena. Further scientific attention is warranted to
examine how congenitally blind individuals learn, via
other sensory modalities, to express emotions.
Summary of scientific evidence on the
production of facial expressions
The scientific findings we have reviewed thus far—deal-
ing with how people actually move their faces during
emotional events—do not strongly support the com-
mon view that people reliably and specifically express
instances of emotion categories with spontaneous facial
configurations that resemble those proposed in Figure
4. Adults around the world, infants and children, and
congenitally blind individuals all show much more vari-
ability than commonly hypothesized. Studies of posed
expressions further suggest that people believe that
particular facial movements express particular emotions
more reliably and specifically than is warranted by the
scientific evidence. Consequently, it is misleading to
refer to facial movements with commonly used phrases
such as “emotional facial expression,” “emotional
expression” or “emotional display.” More neutral phrases
that assume less, such as “facial configuration” or “pat-
tern of facial movements” or even “facial actions,” are
more scientifically accurate and should be used instead.
We next turn our attention to the question of whether
people reliably and specifically infer certain emotions
from certain patterns of facial movements, shifting
our focus from studies of production to studies of per-
ception. It has long been assumed that emotion percep-
tion provides an indirect way of testing the common
view of expression production, because facial expres-
sions, when they are assumed to be displays of emo-
tional states, are thought to have coevolved with the
ability to recognize and read them (Ekman, Friesen, &
Ellsworth, 1972). For example, Shariff and Tracy
(2011) have suggested that emotional expression and
emotion perception likely coevolved as an integrated
signaling system (for additional discussion, see Jack,
Sun, Delis, Garrod, & Schyns, 2016).28 In the next
section, we review the scientific evidence on emotion
perception.
Perceiving Emotions From Facial
Movements: A Review of the Scientific
Evidence
For over a century, researchers have directly examined
whether people reliably and specifically infer emotional
meaning in the facial configurations presented in Figure
4. Most of these studies are interpreted as evidence for
people’s ability to recognize or decode emotion in facial
configurations, on the assumption that the configura-
tions broadcast or signal emotional information to be
recognized or detected. This is yet another example of
confusing what is known and what is being tested. A
more correct interpretation is that these studies evaluate
whether or not people reliably and specifically infer,
attribute, or judge emotion in those facial configura-
tions. The pervasive tendency to confuse inference and
recognition may explain why very few studies have
actually investigated the processes by which people
detect the onset and offset of facial movements and
infer emotions in those movements (i.e., few studies
consider the mechanisms by which people infer emo-
tional states from detecting and perceiving facial move-
ments; for discussion, see Lynn & Barrett, 2014;
Martinez, 2017a, 2017b). In this section, we first review
the design of typical emotion-perception experiments
that are used to test the common view that emotions can
be reliably and specifically “read out” from facial move-
ments. We also examine the emotions people infer from
the facial movements in dynamic, computer-generated
faces, a class of studies that offers an interesting alterna-
tive way to study emotion perception, and in virtual
humans, which provides the opportunity for a more
implicit approach to studying emotion perception.
The anatomy of a typical experiment
designed to test the common view
For a person—a perceiver—to infer that another person
is in an emotional state by looking at that person’s facial
movements, the perceiver must have many competen-
cies. People move their faces continuously (i.e., real
human faces are never still), so a perceiver must notice
or detect the relevant facial movements in question and
discriminate them from other facial movements (that
is, the perceiver must be able to set a perceptual bound-
ary to know when the movements begin and end, and,
for example, that a scowl is different from a sneer). To
do this, the perceiver must be able to identify (or
segment) the movements as an ensemble or pattern
(i.e., bind them together and distinguish them from
other movements that are normally inferred to be irrel-
evant). And the perceiver must be able to infer similari-
ties and differences between different instances of facial
movements, as specified by the task (e.g., categorize a
group of facial movements as instances expressing
anger). This categorization might involve merely
labeling the facial movements, referred to as action
identification (describing how a face is moving, such
as smiling) or it might involve inferring that a particular
mental state caused the actions, referred to as mental
inference or mentalizing (inferring why the action is
performed, such as a state of happiness; Vallacher &
Wegner, 1987). In principle, the categorization could
also involve inferring a situational cause for the actions,
but in practice, this question is rarely investigated in
studies of emotion perception. The overwhelming
majority of studies ask participants to make mental
inferences, although, as we discuss later in this section,
there appears to be important cultural variation in
whether emotions are perceived as situated actions or
as mental states that cause actions.
The use of posed configurations of facial movements
in assessments of emotion perception. In the majority
of the experiments that study emotion perception, research-
ers ask participants to infer emotion in photographs of
posed facial configurations (such as those in Fig. 4, but
without the FACS codes). In most studies, the configura-
tions have been posed by people who were not in an emo-
tional state when the photos were taken. In a growing
number of studies, the poses are created with computer-
generated humans who have no actual emotional state.
As a consequence, it is not possible to assess the accu-
racy (i.e., validity) of perceivers’ emotional inferences
and, correspondingly, data from emotion-perception
studies should not be interpreted as support for the valid-
ity of the common view of emotional expressions (except
insofar as these are simply stipulated to be the consen-
sus). As is the case in expression-production studies, it is
more appropriate to interpret participants’ responses in
terms of their agreement (or consensus) with common
beliefs (which may vary by language and culture).
Even more serious is the fact that the proposed
expressive facial configurations in Figure 4, which are
routinely used as stimuli in emotion-perception studies,
do not capture the wider range of muscle movements
that are observed when people actually express instances
of these emotion categories in the lab or in everyday life.
A recent study that mined more than 7 million images
from the Internet (Srinivasan & Martinez, 2018; for method,
see Box 7 in the Supplemental Material) identified mul-
tiple facial configurations associated with the same
emotion-category label and its synonyms—17 distinct
facial configurations were associated with the word
happiness, five with anger, four with sadness, four with
surprised, two with fear, and one with disgust. The
different facial configurations associated with each
emotion word were more than mere variations on a uni-
versal core expression—they were distinctive sets of facial
movements.29
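The general logic of discovering several distinct configurations for one emotion word can be sketched as follows. This is a schematic illustration only, not the Srinivasan and Martinez (2018) pipeline; the function names, the detected-AU representation, and the frequency threshold are hypothetical.

# Hypothetical sketch: group automatically detected AU patterns by the emotion
# keyword associated with each image, then count which AU combinations recur
# often enough to count as distinct configurations for that word.
from collections import Counter, defaultdict

def distinct_configurations(images, min_count=100):
    # images: iterable of (emotion_word, frozenset_of_detected_AUs) pairs.
    by_word = defaultdict(Counter)
    for word, aus in images:
        by_word[word][aus] += 1
    return {word: [aus for aus, n in counts.items() if n >= min_count]
            for word, counts in by_word.items()}

# Toy usage: two recurring "happiness" configurations and one for "disgust".
toy = ([("happiness", frozenset({6, 12}))] * 150
       + [("happiness", frozenset({12, 25}))] * 120
       + [("disgust", frozenset({9, 10}))] * 130)
print(distinct_configurations(toy))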
Measuring emotion perception. The typical emotion
perception experiment takes one of several forms, sum-
marized in Table 4. Choice-from-array tasks, in which par-
ticipants are asked to match photos of facial configurations
and emotion words (with or without brief stories), have
dominated the study of emotion perception since the
1970s. For example, a meta-analysis of emotion-perception
studies published in 2002 summarized 87 studies, 83 (95%)
of which exclusively used a choice-from-array response
method (Elfenbein & Ambady, 2002). This method has
been widely criticized for more than 2 decades, however,
because it limits the possibility of observing evidence that
could disconfirm the common view. Choice-from-array
tasks strongly constrain the possible meanings that partici-
pants can infer in a facial configuration, such as a photo-
graph of a scowl, because they can choose only the
options provided in the experiment (usually a small num-
ber of emotion words). In fact, the preponderance of
choice-from-array tasks in the scientific study of emotion
perception has been identified as one important factor
that has helped perpetuate and sustain the common view
(Russell, 1994). Other tasks exist for assessing emotion
perception (see Table 4), including those that use a free-
labeling method, where participants are asked to freely
nominate words to label photographs of posed facial
configurations, rather than choosing a word from a small
set of predefined options. For example, on viewing a
scowling configuration, participants might offer responses
like “angry,” “sad,” “confused,” “hungry,” or even “wanting
to avoid a social interaction.” By allowing participants
more freedom in how they infer meaning in a facial con-
figuration, free labeling makes it equally possible to
observe evidence that could either support or disconfirm
the common view.
Recent innovations in measuring emotion perception
use computer-generated faces or heads rather than pho-
tographs of posed human faces. One method, called
reverse correlation, measures participants’ internal
model of emotional expressions (i.e., their mental rep-
resentations of which facial configurations are likely to
express instances of emotion) by observing how par-
ticipants label an avatar head that displays random com-
binations of animated facial action units (Yu, Garrod, &
Schyns, 2012; for a review, see Jack, Crivelli, & Wheatley,
2018; Jack & Schyns, 2017). As each pattern appears on
the computer screen (on a given test trial), participants
infer its emotional meaning by choosing an emotion
label from a set of options (a choice-from-array
response). After thousands of trials, researchers estimate
the statistical relationship between the dynamic patterns
of facial movements and each emotion word (e.g., dis-
gust) to reveal participants’ beliefs about which facial
configurations are likely to express different emotion
categories.
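The reverse-correlation logic just described can be sketched as follows: random AU combinations are generated, the label chosen on each trial is recorded, and the association between each AU and each label is then estimated. The AU list and the simulated responder below are placeholders; in an actual experiment the labels come from human participants viewing animated avatar heads.

# Illustrative sketch of reverse correlation with random AU combinations.
import random
from collections import defaultdict

ALL_AUS = [1, 2, 4, 5, 6, 7, 9, 10, 12, 15, 20, 23, 25, 26]
LABELS = ["anger", "disgust", "fear", "happiness", "sadness", "surprise"]

def random_configuration(k=4):
    return frozenset(random.sample(ALL_AUS, k))

def estimate_internal_model(label_trial, n_trials=5000):
    # label_trial: function mapping an AU set to the label chosen on that trial.
    counts = defaultdict(lambda: defaultdict(int))
    totals = defaultdict(int)
    for _ in range(n_trials):
        config = random_configuration()
        label = label_trial(config)
        totals[label] += 1
        for au in config:
            counts[label][au] += 1
    # Probability that each AU was present on trials receiving each label.
    return {lab: {au: counts[lab][au] / totals[lab] for au in ALL_AUS}
            for lab in totals}

# Stand-in responder: chooses "happiness" whenever AU12 is present, else guesses.
model = estimate_internal_model(
    lambda aus: "happiness" if 12 in aus else random.choice(LABELS))
print(sorted(model["happiness"].items(), key=lambda kv: -kv[1])[:3])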
A second approach using computer-generated faces
has participants interact with more fully developed vir-
tual humans (Rickel et al., 2002), also known as embod-
ied conversational agents (Cassell et al., 2000).
Software-based virtual humans look like and act like
people (for examples, see Fig. 9). They are similar to
characters in video games in their surface appearance
and are designed to interact face-to-face with humans
using the same verbal and nonverbal behaviors that
people use to interact with one another. The underlying
technologies used to realize virtual humans vary consid-
erably in approach and capability, but most virtual-human
models can be programmed to make context-sensitive,
dynamic facial actions that would, when used by a per-
son, typically communicate emotional information to
other people (see Box 11 in the Supplemental Material
for discussion). The majority of the scientific studies
with virtual humans were not designed to test whether
human participants infer specific emotional meaning in
a virtual human’s facial movements, but their design
makes them useful for studying when and how facial
movements take on meaning as emotional expressions:
Unlike all the other ways of assessing emotion percep-
tion discussed so far, which ask participants to make
explicit inferences about the emotional cause of facial
configurations, interactions with virtual humans offer
the possibility of learning how a participant implicitly
infers emotional meaning during social interactions.
Testing the common view of emotion perception: inter-
preting the scientific observations. Traditionally, in
most experiments, if participants reliably infer the hypo-
thesized emotional state from a facial configuration (e.g.,
inferring anger from a scowling configuration) at levels
that are greater than what would be expected by chance,
then this is taken as evidence that people “recognize that
emotional state in its facial display.” It is more scientifi-
cally correct, however, to interpret such observations as
evidence that people infer an emotional state (i.e., they
consistently make a reverse inference) at greater-than-
chance levels. Only when reverse inferences are observed
in a reliable and specific way within an experiment can
scientists reasonably infer that participants are perceiving
an instance of a certain emotion category in a certain
facial configuration; technically, the inference holds only
for emotion perception as it occurs in the particular situ-
ations contained in the experiment (because situations
are never randomly sampled). If the emotion-perception evidence is replicated across experiments that sample people from the same culture, then the interpretation can be generalized to emotion perceptions in that culture. Only when the findings generalize across cultures—that is, are replicated across experiments that sample people from different cultures—is it reasonable to conclude that people universally infer a specific emotional state when perceiving a specific facial configuration. These findings can be interpreted as evidence about the reliability and specificity of producing emotional expressions if the coevolution assumption is valid (i.e., that emotional expressions and their perception coevolved as an integrated signaling system; Ekman et al., 1972; Jack et al., 2016; Shariff & Tracy, 2011). The findings can be interpreted as evidence about emotion recognition only if the reverse inference has been verified as valid (i.e., if it can be verified that the person in the photograph is, indeed, in the expected emotional state).
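A minimal sketch of this statistical reasoning, assuming a choice-from-array task with a known chance level, is given below. The counts, the number of response options, and the confusion-matrix entries are illustrative placeholders rather than data from any study reviewed here.

# Illustrative sketch: is a reverse inference (e.g., scowl -> "anger") made more
# often than chance, and is it specific to that facial configuration?
from math import comb

def binomial_tail(k, n, p):
    # P(X >= k) for X ~ Binomial(n, p): chance of k or more "hits" by guessing.
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))

def above_chance(hits, trials, n_options):
    # Greater-than-chance reliability for a choice-from-array task.
    return binomial_tail(hits, trials, 1.0 / n_options)

def specificity(confusion, face, target):
    # Of all trials on which "target" was inferred, the share inferred from "face".
    # confusion[f][label] = number of times that label was chosen for face f.
    chosen = sum(row.get(target, 0) for row in confusion.values())
    return confusion[face].get(target, 0) / chosen if chosen else float("nan")

# Toy data: 60 of 100 participants labeled a scowl "anger" given 6 options.
print(above_chance(60, 100, 6))  # tiny p value: reliably above chance
confusion = {"scowl": {"anger": 60, "disgust": 25},
             "nose_wrinkle": {"anger": 30, "disgust": 55}}
print(specificity(confusion, "scowl", "anger"))  # anger is also inferred from other faces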
Table 4. Pros and Cons of Common Tasks for Measuring Explicit Emotion Perception
General considerations
Participants are typically asked to infer emotional meaning in posed, rather than spontaneous, facial configurations. Spontaneous
or candid facial configurations typically produce much lower levels of agreement in emotion-perception studies (e.g., Kayyal &
Russell, 2013; Naab & Russell, 2007).
Participants are typically asked to infer emotional meaning in static, nonmoving facial configurations (i.e., in photographs rather
than movies). This reduces the ecological validity of the findings for how people infer emotional meaning in faces in the
real world. In the real world, people have to infer when a set of movements begin and end; this is called discrimination or
detection. Moreover, there is information in the dynamics of facial movements (Jack & Schyns, 2017; Krumhuber, Kappas, &
Manstead, 2013), but dynamic facial movements, particularly when they are spontaneous, do not always produce higher levels
of agreement in emotion-perception studies. Dynamic movements add realism and intensity and improve levels of agreement
primarily when movements are degraded or are artificial.
Participants are typically asked to infer emotional meaning in exaggerated facial configurations, which are said to have greater
“source clarity” (Ekman, Friesen, & Ellsworth, 1972). They reduce the ecological validity of the findings for how people infer
emotional meaning in faces in the real world. The facial configurations used in most experiments (see Fig. 4) are caricatures—
they are exaggerated to maximally distinguish one from another. Caricatures are easier to label (categorize) than are typical
stimuli, particularly when the categories in question are highly interrelated (Goldstone, Steyvers, & Rogosky, 2003).
Participants are typically asked to infer emotional meaning in highly selected facial configurations. In early studies, a smaller set
of exaggerated facial configurations were culled from much larger sets of posed faces (involving several thousand faces; for a
discussion, see Gendron & Barrett, 2017; Russell, 1994).
Only a single task is used in most experiments (i.e., participants are asked to infer emotion in facial configurations via one
method of responding). Ideally, multiple tasks should be used with the same population of participants to determine whether
convergent results are obtained. This approach is rarely taken, but for examples, see Crivelli, Jarillo, Russell, and Fernández-Dols
(2016); Gendron, Roberson, van der Vyver, and Barrett (2014b); and Gendron, Hoemann, et al. (2018).
Test–retest reliability is rarely evaluated but is critical. A number of contextual factors are known to influence judgments,
including a perceiver’s internal state. Test–retest assessments are rarely done for practical reasons.
Most experiments ask participants to infer emotion in a disembodied face, alone, without context. This reduces the ecological
validity of the findings for how people infer emotional meaning in faces in the real world. In addition, a growing number
of experiments now show that context is an important, and sometimes dominant, source of information when people infer
emotional meaning in a facial configuration. (See Box 3 in the Supplemental Material.) For example, situational information
tends to dominate perception of emotion in faces in common, everyday events (Carrera-Levillain & Fernandez-Dols, 1994),
even when situations are more ambiguous than the exaggerated facial configurations being judged (Carroll & Russell, 1996,
Study 3).
Most studies do not report evidence about the specificity of emotion perceptions or the frequency with which people infer the
nonintended emotional meaning from a facial configuration.
Until recently, the large majority of experiments included only one pleasant emotion category (happiness) among several
unpleasant emotion categories (anger, fear, sadness, etc.). This may be one reason that agreement rates are so high for
smiles. In the past few years, experiments have included a larger variety of pleasant emotion categories (pride, awe, gratitude,
etc.), but there continues to be debate over whether these emotion categories are expressed with consistent, specific facial
configurations.
Choice-From-Array Task: matching photos of facial configurations and emotion words (with or without brief stories).
Response options are limited to those provided in the task.
Participants are (a) shown a facial configuration and asked to infer its emotional meaning by choosing an emotion word from a
small set of words or (b) presented with an emotion word that labels an emotion category (e.g., sadness) or a brief story about
a typical instance of an emotion category (e.g., “the boy’s much loved dog just died and he is sad”) along with two or three
photographs of faces (typically posed into one of the configurations presented in Fig. 4) and then asked to choose the facial
configuration that they judge best matches the emotional episode described in the word or vignette. Typically, each emotion
category is represented by a single scenario.
Words influence how the brain processes visual inputs from faces (e.g., Doyle & Lindquist, 2018; Gendron, Lindquist, Barsalou,
& Barrett, 2012). Stories can prime action perceptions, as well. More generally, choice-from-array tasks have been shown to
encourage biased perceptual responding using a signal detection analysis (e.g., DeCarlo, 2012). Choice-from-array tasks are still
commonly used, however, because they are easy and efficient and straightforward for participants to understand. Choice-from-
array responses are easy for scientists to score. Most studies using continuous judgments (rather than forced choice) find that
participants do not infer emotional meaning in facial configurations in a yes/no or on/off sort of way (Russell, 1994).
The fact that participants are exposed to the same facial configurations and emotion words over and over allows them to learn
the intended pairings even if they do not know them to begin with (N. L. Nelson & Russell, 2016).
An emotion word does not necessarily have a unique correspondence to a single emotion category for all people in a given
culture (i.e., they may differ in emotional granularity; L. F. Barrett, 2004, 2017b; Lindquist & Barrett, 2008) or people from
different cultures. Concerns about individual word meaning are why choice-from-array tasks using stories might be preferable to those
using single words.
A small range of answers is predetermined by the experimenter, making it easier for participants to provide the answers scientists
expect. For example, by constraining which words participants were allowed to choose from, frowns were consensually
labeled as fear, and wide-eyed, gasping faces were labeled as surprise (Russell, 1993). Scowling faces are more likely to be
perceived as fearful when paired with the description of danger (Carroll & Russell, 1996, Study 1) and appear determined or
puzzled depending on the story they are presented with (Carroll & Russell, 1996, Study 2).
Participants are asked to make yes/no decisions when assigning a facial configuration to an emotion category. Multiple emotion
words may apply to a single configuration (i.e., people might infer more than one emotional meaning in a face), but the
option to infer multiple emotional meanings is rarely given to participants. Continuous judgments using, for example, a Likert-
type scale ranging from 1 to 7 would solve both of these problems and also allow analysis of the similarity among facial
configurations (which evidence shows is important; e.g., Jack, Sun, Delis, Garrod, & Schyns, 2016; Kayyal & Russell, 2013).
Similarity allows scientists to discover the emotional meanings that people implicitly assign to a facial configuration, rather than
having people explicitly state them (see further discussion of similarity below).
A participant might decide that no emotion word provided applies to a facial configuration, but the option to respond this way
is rarely given to participants (they are usually forced to choose an emotion word; for discussion, see Frank & Stennett, 2001).
See Cordaro et al. (2017) for an example of this design feature.
If a participant hears a story and is asked to choose between two faces (e.g., a scowl and smile), he or she can give the expected
answer (e.g., scowl) simply by figuring out that “smile” is NOT correct. For example, after hearing a story about anger, a
participant is shown a scowl and a smile and can choose the scowl merely by realizing the smile is not correct (on the basis
of valence). This is similar to getting an answer right on a multiple-choice test by eliminating all the alternatives—you do
not actually know the right answer, but you figured it out because of the structure of the task. A similar point can be made
about showing a single face and asking participants to label it with a word by selecting from among a small set of options.
Participants use a process of elimination strategy: Words that are not chosen on prior trials are selected more frequently,
inflating agreement levels (DiGirolamo & Russell, 2017).
If participants hear a story about anger and must choose between a scowl and a smile, they can figure out that the scowl is
correct merely because they are distinguishing between negative (scowl) and positive (smile). If participants hear a story about
anger and must choose between a scowl and a frown, they can figure out that the scowl is correct merely because they are
distinguishing between high arousal (scowl) and low arousal (frown).
In tasks that involve brief stories or vignettes about emotion, only one typical story is offered for each emotion category, making
it more difficult to observe any variation within a category.
Free-Sorting Task: photos of facial configurations are sorted into groupings, such that each grouping represents
a perceived category. Cue-to-Cue Matching: matching photos of facial configurations to a recording of posed
vocalization.
Most participants still spontaneously use words to guide their sorting and organize their groupings. Free sorting and cue matching
are ideal for preverbal participants or those with semantic deficits (e.g., Lindquist, Gendron, Barrett, & Dickerson, 2014).
Similarity Task: Judgments between pairs of facial configurations. Perceptual Matching: Indicating whether two
photos of facial configurations belong to the same emotion category.
It is inefficient and time-consuming to judge the similarity of all pairs of facial configurations. For a set of 100 faces, this requires
(100 × 99)/2 = 4,950 different similarity judgments. Participants can arrange face stimuli on a computer screen, and all
pairwise similarity judgments can be computed (the SPAM method proposed by Goldstone, 1994; e.g., see Hout, Goldinger,
& Ferguson, 2013). This procedure also solves the problem that the same pair of stimuli will have a different judged similarity
depending on which item is presented first if face pairs are presented sequentially (the judged similarity of two objects, A and
B, can depend on the order in which they are presented; the similarity between A and B is not always judged to be the same
as that between B and A; Tversky, 1977). Other advantages are that categories can be discovered, rather than prescribed, and
verbal associations are minimized. Analyses of similarity judgments typically yield more continuous similarity relations between
emotion categories along affective dimensions (see Russell & Barrett, 1999).
Free-Labeling Task: photos of facial configurations are labeled with words offered by participants (unconstrained by
experimenter).
Forcing people to translate faces into words is not a good idea, because much of the information from faces cannot be easily
captured in words (Ekman, 1994). In addition, facial expressions did not evolve to represent specific verbal labels (Ekman,
1994, p. 270). “Regardless of the language, of whether the culture is Western or Eastern, industrialized or preliterate, these
facial expressions are labeled with the same emotion terms: happiness, sadness, anger, fear, disgust, and surprise” (Ekman,
1972, p. 278). These are not special criticisms of free-labeling studies—it applies to all studies that ask people to label a face
with words, including the choice-from-array tasks.
There is no widely accepted method for scoring (i.e., categorizing) freely provided responses (Ekman, 1994, p. 274). Most scientists
group together similar words (synonyms), so that a variety of words can be used to show evidence of a correct response (e.g., a
frowning face, which is the proposed expression for sadness, could be labeled as sad, grieving, disappointed, blue, despairing,
and so on). Scientists routinely use databases that indicate synonyms (e.g., WordNet; used in Srinivasan & Martinez, 2018). In
addition, it is possible to do data-driven groupings of emotion words into semantic categories (e.g., Jack et al., 2016; Shaver,
Schwartz, Kirson, & O’Connor, 1987). The more serious problem is that early studies using free labeling (e.g., Boucher & Carlson,
1980; Izard, 1971) did not provide enough information in the method sections about how freely provided labels were grouped.
Using freely chosen labels in a study of different cultures is difficult because it may be hard to find adequate translations (Ekman,
1994, p. 274). A given emotion word, such as sadness, can correspond to different emotion concepts (with different features) in
different languages (e.g., Wierzbicka, 1986, 2014). A single emotion word in one language can refer to more than one concept
in another language (e.g., Pavlenko, 2014). Some languages have no one-to-one translation for English emotion words, and
some emotion concepts in other languages are not directly translatable into English emotion words (see L. F. Barrett, 2017b;
Russell, 1991; Jack et al., 2016). This is not a special criticism of free-labeling studies, however; it holds for any experiment
that uses emotion words requiring translation, including choice-from-array tasks. A standard solution to this problem is to use
both forward and backward translation (e.g., a word spoken in Hadzane is translated into English and then back translated
into Hadzane; this process estimates whether or not the translation has fidelity). An even better method is to elicit features for
the emotion words in question, including typicality of those features, to determine the fidelity of translation (e.g., de Mendoza,
Fernández-Dols, Parrott, & Carrera, 2010).
Scientifically, issues with translation are manageable if scientists allow phrases to stand in for specific words.
Using only single words will always fail to capture much of the rich information in faces. Participants often provide multiple
words or even longer descriptions of situations, behaviors, or behaviors in situations (e.g., see Gendron et al., 2014b; Russell,
1994). Such data are time consuming to code and analyze.
Even when participants are told that photographs are of people trying to express an emotion, they often offer nonemotion labels.
For example, Izard (1971) found that people offered labels such as deliberating, clowning, skepticism, pain, and so on (as
reported in Russell, 1994). This is not necessarily evidence that participants did not understand the task asked of them. It might
be evidence that these facial configurations are not specific for expressing emotions.
Note: Response tasks are arrayed in order from those that most constrain participants’ responses (making it difficult to observe evidence that
can disconfirm common beliefs about emotion) to those that least constrain participants’ responses (making it easier to observe variation and
disconfirm common beliefs). For detailed design concerns about choice-from-array tasks, see Russell (1994, 1995).
Fig. 9. Examples of virtual humans. Virtual humans are software-based artifacts that look like and act like people. (a) The system that
used this virtual human is described in Feng, Jeong, Krämer, Miller, and Marsella (2017). (b) This virtual human is reproduced from Zoll,
Enz, Aylett, and Paiva (2006). (c) This virtual human is reproduced from Hoyt, C., Blascovich, J., and Swinth, K. (2003). Social inhibition
in immersive virtual environments. Presence, 12(2), 183–195, courtesy of The MIT Press. (d) The system that was used to create this virtual
human is described in Marsella, Johnson, and LaBore (2000).
Studies of healthy adults from the United
States and other developed nations
Studies that measure emotion perception with choice-
from-array tasks. The most recent meta-analysis of
emotion-perception studies was published by Elfenbein
and Ambady (2002). It statistically summarized 87 experi-
ments in which more than 22,000 participants from more
than 20 cultures around the world inferred emotional mean-
ing in facial configurations and other stimuli (e.g., posed
vocalizations). The majority of participants were sampled
from larger-scale or developed countries, including Argentina,
Brazil, Canada, Chile, China, England, Estonia, Ethiopia,
France, Germany, Greece, Indonesia, Ireland, Israel,
Italy, Japan, Malaysia, Mexico, the Netherlands, Scotland,
Singapore, Sweden, Switzerland, Turkey, the United States,
Zambia, and various Caribbean countries. The majority of
studies (95%) used posed facial configurations; only four
studies had participants label spontaneous facial move-
ments, a dramatic example of the challenges facing valid-
ity that we discussed earlier. All but four studies used a
choice-from-array response method to measure emotion
inferences, a good example of the challenges facing
hypothesis disconfirmation that we discussed earlier.
The results of the meta-analysis, presented in Figure
10, reveal that perceivers inferred emotions in the facial
configurations of Figure 4 in line with the common
view, well above chance levels (using the criteria set
out by Haidt and Keltner, 1999, presented in Table 2 of
the current article). Results provided strong evidence
that, when participants viewed posed facial configura-
tions made by people from their own culture, they
reliably perceived the expected emotion in those
configurations: Scowling facial configurations were per-
ceived as anger expressions, wide-eyed facial configu-
rations were perceived as fear expressions, and so on,
for all six emotion categories. Moderate levels of reli-
ability were observed when perceivers were labeling
facial configurations posed by people from other cul-
tures; this difference in reliability between same- and
cross-culture differences is referred to as an in-group
advantage (see Box 12, in the Supplemental Material).
The majority of emotion-perception studies did not
report whether the hypothesized facial configurations
were perceived with any specificity (e.g., how likely
was a scowl to be perceived as expressing an instance
of emotion categories other than anger, or as an instance
of a mental category that is not considered emotional).
Without information about specificity, no firm con-
clusions can be drawn about the emotional meaning
of the facial configurations in Figure 4, especially
for the translational purpose of inferring someone’s
emotional state from their facial comportment in real
life.
Nonetheless, most of the studies cited in the Elfenbein
and Ambady (2002) meta-analysis interpret their reli-
ability findings alone (i.e., inferring anger from a scowl-
ing face, disgust from a nose-wrinkled face, fear from
a wide-eyed, gasping face, etc.) as evidence of accurate
reverse inferences. Such interpretations may explain
why many scientists who study emotion, when sur-
veyed, indicated that they believe compelling evidence
exists for the hypothesis that certain emotion categories
are each expressed with a unique, universal facial con-
figuration (see Ekman, 2016) and interpret variation in
emotional expressions to be caused by cultural learning
that modifies what are presumed to be inborn universal
expressive patterns (e.g., Cordaro et al., 2018; Ekman,
1972; Elfenbein, 2013). Cultural learning has also been
hypothesized to modify how people “decode” facial
configurations during emotion perception (Buck, 1984).
Studies that measure emotion perception with free-
labeling tasks. Experimental methods that place fewer
constraints on participants’ inferences (Table 4) provide
considerably less support for the common view of emo-
tional expressions. In the least constrained experimental
task, called free labeling, perceivers freely volunteer a
word (emotion or otherwise) that they believe best cap-
tures the meaning in a facial configuration, rather than
choosing from a small set of experimenter-provided
options. In urban samples, participants who freely label
facial configurations produce the expected emotion
labels with weak reliability (when labeling spontaneously
produced facial configurations) to moderate reliability
(when labeling posed facial configurations). Participants’
responses usually reveal weak specificity when specific-
ity is assessed at all (for examples and discussion, see
Russell, 1994; also see Naab & Russell, 2007).
For example, participants in a study by Srinivasan
and Martinez (2018) were sampled from multiple coun-
tries. They were asked to freely provide emotion words
in their native languages (English, Spanish, Mandarin
Chinese, Farsi, Arabic, and Russian) to label each of 35
facial configurations that had been cross-culturally
identified. Their labels provided evidence of a moder-
ately reliable correspondence between facial configura-
tions and emotion categories, but there was no evidence
of specificity (see Fig. 11).30 Multiple facial configura-
tions were associated with the same emotion category
label (e.g., 17 different facial configurations were asso-
ciated with the expression of happiness, five with
anger, four with sadness, four with surprise, two with
fear, and one with disgust). This many-to-many map-
ping is inconsistent with the common view that the
facial configurations in Figure 4 are universally recog-
nized as expressing the hypothesized emotion category,
and it provides evidence of variation that is far beyond
what is proposed by the basic-emotion view. Some of
this variability may come from different cultures and
languages, but there is variability even within a single
culture and language. Evidence of this many-to-many
mapping is also apparent in free-labeling tasks in small-
scale, remote samples as well (Gendron, Crivelli, &
Barrett, 2018), which we discuss in the next section.
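A minimal sketch of how freely offered labels can be scored by grouping synonyms, as in the free-labeling studies described above, follows. The synonym sets and toy responses are hypothetical; actual studies rely on larger lexical resources (such as WordNet) or on data-driven groupings.

# Hypothetical sketch: score free-labeling responses against a synonym grouping.
SYNONYMS = {
    "sadness": {"sad", "grieving", "disappointed", "blue", "despairing"},
    "anger": {"angry", "mad", "furious", "irritated"},
}

def score_free_labels(responses, target):
    # Proportion of freely offered words falling in the target category's synonym set.
    group = SYNONYMS[target]
    hits = sum(1 for word in responses if word.lower() in group)
    return hits / len(responses) if responses else float("nan")

# Toy responses to a frowning configuration (proposed expression of sadness).
responses = ["sad", "confused", "disappointed", "tired", "blue", "angry"]
print(score_free_labels(responses, "sadness"))  # 0.5: moderate agreement, not specific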
Studies that measure emotion perception with the
reverse-correlation method. Using a choice-from-array
response method with the reverse-correlation method is
an inductive way to learn people’s beliefs about which
facial configurations express the instances of an emotion
category (for reviews, see Jack et al., 2018; Jack & Schyns,
2017). In such studies, participants view thousands of
random combinations of AUs that are computer gener-
ated on an avatar head and label each one by choosing
an emotion word from a set of predefined options. All of
the facial configurations labeled with the same emotion
word (e.g., anger) are then statistically combined for
each participant to estimate that person’s belief about
which facial movements express instances of the corre-
sponding emotion category. One recent study using the
reverse correlation method with participants from the
United Kingdom and China found evidence of variation
in the facial movements that were judged to express a
single emotion category as well as similarity in the facial
movements that were judged to express different catego-
ries (Jack et al., 2016). The study first identified group-
ings of emotion words that are widely discussed in the
scientific literature (which, we should note, is dominated
by English): 30 English words grouped into eight emo-
tion categories for the sample from the United Kingdom
(happy/excited/love, pride, surprise, fear, contempt/dis-
gust, anger, sad, and shame/embarrassed) and 52 Chi-
nese words grouped into 12 categories in the Chinese
Fig. 10. Emotion-perception findings. Average effect sizes for perceptions of facial configurations in which 95% of
the articles summarized used choice-from-array to measure participants’ emotion inferences. Data are from Elfenbein
and Ambady (2002). The images presented on the x-axis are for illustrative purposes only and were not necessarily
used in the articles summarized in this meta-analysis. [The figure is a bar graph of average consistency (percentage
agreement) for anger, disgust, fear, happiness, sadness, and surprise: within culture, 73.0, 77.1, 71.7, 93.4, 79.1, and
80.3, respectively; across cultures, 65.2, 62.1, 58.3, 87.6, 68.4, and 69.1; the number of samples per category ranged
from 63 to 70.]
Fig. 11. Free labeling of facial configurations across five language groups. Data are from
Srinivasan and Martinez (2018). The proportions of times participants offered emotion-category labels
(or their synonyms) are reported. The facial configurations presented were chosen by researchers
to represent the best match to the hypothetical facial configurations in Figure 4 on the basis of the
action units (AUs) present. No configuration discovered in this study exactly matches the AU con-
figurations proposed by Darwin or documented in prior research. According to standard scientific
criteria, universal expressions of emotion should elicit agreement rates that are considerably higher
than those reported here, generally in the 70% to 90% range, even when methodological constraints
are relaxed (Haidt & Keltner, 1999). Specificity data are not available for the Elfenbein and Ambady
(2002) meta-analysis. [The figure is a table listing, for each of six facial configurations, the percentage
of responses offering each label: anger, disgust, fear, happiness, sadness, surprise, or a nonaffective
action; the most frequently offered label for each configuration accounted for only about 25% to 49%
of responses.]
sample (joyful/excitement, pleasant surprise, great sur-
prise/amazement, shock/alarm, fear, disgust, anger, sad,
embarrassment, shame, pride, and despise). The reverse-
correlation method revealed 62 separate facial configura-
tions: The same emotion category in a given culture was
associated with multiple models of facial movements
because synonyms of the same emotion category were
associated with distinctive models of facial movements.
Amidst this variability, Jack and colleagues also found
that these 62 separate facial configurations could be
summarized as four prototypes, which are presented in
Figure 8 along with the corresponding emotion words
with which they were frequently associated. Each pro-
totype was described with a unique set of affective
features (combinations of valence, arousal, and domi-
nance). A comparison of the four estimated configura-
tions with the common view presented in Figure 4 and
with the basic-emotion hypotheses listed in Table 1
reveals some striking similarities: Configuration 1 in
Figure 8 most closely resembles the proposed expression
for happiness, Configuration 2 is similar to a combination
of the proposed expressions for fear and anger, Configu-
ration 3 most closely resembles the proposed expression
for surprise, and Configuration 4 is similar to a combina-
tion of the proposed expressions for disgust and anger.31
Taken together, these findings suggest that, at the most
general level of description, participants’ beliefs about
emotional expressions (i.e., their internal models of
which facial movements expressed which emotions)
were consistent with the common view (indeed, they
could be taken to constitute part of the common view);
when examined in finer detail with more granularity,
however, the findings also give evidence of substantial
within-category variation in beliefs about the facial
movements that express instances of the same emotion
category. This observation suggests that the way the
common view is often described in scientific reviews,
depicted in the media, and used in many applications
does not, in fact, do justice to people’s more varied
beliefs about facial expressions of emotion.
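The idea that many variable configuration models can nonetheless be summarized by a few broader prototypes can be illustrated with a simple grouping sketch. This is not the analysis used by Jack and colleagues (2016); the similarity measure, the threshold, and the AU sets below are hypothetical.

# Hypothetical sketch: group AU-set models that overlap strongly (Jaccard similarity).
def jaccard(a, b):
    return len(a & b) / len(a | b)

def group_into_prototypes(models, threshold=0.5):
    # models: list of frozensets of AUs; returns lists of mutually similar models.
    prototypes = []
    for model in models:
        for members in prototypes:
            if jaccard(model, members[0]) >= threshold:
                members.append(model)
                break
        else:
            prototypes.append([model])
    return prototypes

models = [frozenset({6, 12}), frozenset({6, 12, 25}), frozenset({4, 5, 7}),
          frozenset({4, 7, 23}), frozenset({1, 2, 5, 26})]
for proto in group_into_prototypes(models):
    print([sorted(m) for m in proto])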
Studies that implicitly assess emotion perception
during interactions with virtual humans. Design-
ers typically study how a virtual human’s expressive
movements influence an interaction with a human par-
ticipant. Much of the early research modeling expressive
movements in virtual humans focused on endowing
them with the facial expressions proposed in Figure 4. A
number of studies have endowed virtual humans with
blends of these configurations (Arya, DiPaola, & Parush,
2009; Bui, Heylen, Poel, & Nijholt, 2004). Designers are
also inspired by people’s beliefs about how emotions are
expressed. Actors, for example, have been asked to pose
facial configurations that they believe express emotions,
which are then processed by graphical and machine-
learning algorithms to craft the relation between emo-
tional states and expressive movements (Alexander,
Rogers, Lambeth, Chiang, & Debevec, 2009). In another
study, human subjects used a specially designed software
tool to craft animations of facial movements that they
believed express certain mental categories, including
emotion categories. Then, other human subjects judged
the crafted facial configurations (Ochs, Niewiadomski, &
Pelachaud, 2010). Increasingly, data-driven methods are
used that place people in emotion-eliciting conditions,
capture the facial and body motion, and then synthesize
animations from those captured motions (Ding, Prepin,
Huang, Pelachaud, & Artières, 2014; Niewiadomski
et al., 2015; N. Wang, Marsella, & Hawkins, 2008).
In general, studies with virtual humans show nicely
how the situational context influences people’s infer-
ences about the meaning of facial movements (de Melo,
Carnevale, Read, & Gratch, 2014). For example, in a
game that allowed competition and cooperation
(Prisoner’s Dilemma, Pruitt & Kimmel, 1977), a virtual
human who smiled after making a competitive move
evoked more competitive and less cooperative
responses from human participants compared with a
virtual human using an identical strategy in the game
(tit-for-tat) but who smiled after cooperating. Virtual
humans who made a verbal comment about a film that
was inconsistent with their facial movements, such as
saying they enjoyed the film but grimacing, quickly
followed by a smile, were perceived as less reliable,
trustworthy, and credible (Rehm & André, 2005).
The dynamics of the facial actions, including the
relative timing, speed, and duration of the individual
facial actions, as well as the sequence of facial muscle
movements over time, offer information over and above
the mere presence or absence of the movements them-
selves and have an important influence on how human
perceivers interpret facial movements (e.g., Ambadar,
Cohn, & Reed, 2009; Jack & Schyns, 2017; Keltner, 1995;
Krumhuber, Kappas, & Manstead, 2013) and how much
they trust a virtual human during a social interaction
(Krumhuber, Manstead, Cosker, Marshall, & Rosin,
2009). Research with virtual humans has shown that
the dynamics of facial muscle movements are critical
for them to be perceived as emotional expressions
(Niewiadomski etal., 2015; Ochs et al., 2010). These
findings are consistent with research showing that the
temporal dynamics carry information about the emo-
tional meaning of facial movements that are made by
real humans (e.g., Kamachi etal., 2001; Krumhuber &
Kappas, 2005; Sato & Yoshikawa, 2004; for a review,
see Krumhuber etal., 2013).32
Summary. Whether people can reliably perceive emo-
tions in the expressive configurations of Figure 4, as pre-
dicted by the common view, depends on how participants
are asked to report or register their inferences (see Table
3). Hundreds of experiments have asked participants to
infer the emotional meaning of posed, exaggerated facial
configurations (such as those presented in Figure 4) by
choosing a single emotion word from a small number of
options offered by scientists, called choice-from-array
tasks. This experimental approach tends to generate
moderate to strong evidence that people reliably label
scowling facial configurations as angry, frowning facial
configurations as sad, and so on for all six emotion cat-
egories that anchor the common view. Choice-from-array
tasks severely limit the possibility of observing evidence
that can disconfirm the common view of emotional
expressions, however, because they restrict participants’
options for inferring the psychological meaning of facial
configurations by offering them a limited set of emotion
labels. (As we discuss below, when people are provided
with labels other than angry, sad, afraid, and so on,
they routinely choose them; also see Carroll & Russell,
1996; Crivelli et al., 2017). In addition, the specificity of
emotion-perception judgments is largely unreported.
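To make the reliability/specificity distinction concrete, consider the toy calculation below, written in Python. The confusion matrix is entirely invented (rows are posed configurations shown to a hypothetical perceiver, columns are the labels chosen in a six-option choice-from-array task), the variable names are ours, and the column-normalized definition of specificity is one common operationalization; this is a minimal sketch, not a reanalysis of any study reviewed here.

    # Illustrative only: invented counts for a hypothetical choice-from-array task.
    # Rows = posed facial configuration shown; columns = emotion label chosen.
    import numpy as np

    labels = ["anger", "disgust", "fear", "happiness", "sadness", "surprise"]
    counts = np.array([
        [55, 10,  5,  2, 20,  8],   # scowling configuration
        [12, 50,  6,  3, 18, 11],   # nose-wrinkled configuration
        [ 8,  7, 48,  4,  9, 24],   # wide-eyed, gasping configuration
        [ 1,  2,  3, 90,  2,  2],   # smiling configuration
        [22,  9,  6,  3, 52,  8],   # frowning configuration
        [ 6,  5, 25,  4,  7, 53],   # startled configuration
    ])

    row_totals = counts.sum(axis=1, keepdims=True)
    for i, label in enumerate(labels):
        reliability = counts[i, i] / row_totals[i, 0]    # expected label chosen for its face
        specificity = counts[i, i] / counts[:, i].sum()  # label reserved for the expected face
        print(f"{label:9s} reliability = {reliability:.2f}, specificity = {specificity:.2f}")

On these invented numbers, every configuration shows above-chance reliability (chance is about .17 with six options), but only the smiling configuration approaches the level of specificity that a strong reading of the common view would require; reporting only the diagonal (reliability) would hide that difference.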
Scientists often go further and interpret the better-
than-chance reliability findings from these studies as
evidence that scowls are expressions of anger, frowns
are expressions of sadness, and so on. Such inferences
are not sound, however, because most of these studies
ask participants to infer emotion from posed, static
faces that are likely limited in their validity (i.e., people
posing facial configurations such as those depicted in
Figure 4 are unlikely to be experiencing the hypothesized
emotional state). Furthermore, other ways of
assessing emotion perception, such as the reverse-
correlation method and free-labeling tasks, find
much weaker evidence for reliability of emotion
inferences. Instead, they suggest that what people actu-
ally infer and believe about facial movements incorpo-
rates considerable variability: In short, the common
view depicted in many reviews and summaries, portrayed in the media,
and used in numerous applications is not an accurate
reflection of what people believe about facial expres-
sions of emotion, when these beliefs are probed in
more detail (in a way that makes it possible to
observe evidence that could disconfirm the common
view). In the next section, we discuss scientific evi-
dence from studies of emotion perception in small-
scale remote cultures, which further undermines the
common view.
Studies of healthy adults living in
small-scale, remote cultures
A growing number of studies examine emotion percep-
tion in people from remote, nonindustrialized cultural
groupings. A more in-depth review of these studies can
be found in Gendron, Crivelli, and Barrett (2018). Our
goal here is to summarize the trends found in this line
of research (see Table 5).
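The "level of support" column in Table 5 (below) is assigned from the reliability and specificity criteria spelled out in the table's note. As a rough illustration of that decision rule, it could be sketched in Python as follows; the function name, argument names, and the simplification of "strong specificity" to a single yes/no judgment are ours, so treat this as a reading aid rather than the authors' scoring procedure.

    from typing import Optional

    def level_of_support(reliability: float,
                         specificity_above_chance: Optional[bool]) -> str:
        """Classify one study's findings for a category other than happiness.

        reliability: proportion of expected responses (0 to 1).
        specificity_above_chance: True or False if specificity was tested,
        None if it was not reported.
        """
        if specificity_above_chance is False:      # evidence of no specificity
            return "None"
        if reliability > 0.70:
            return "Strong" if specificity_above_chance else "Moderate"
        if 0.41 <= reliability <= 0.70:
            return "Moderate" if specificity_above_chance else "Weak"
        if 0.20 <= reliability <= 0.40 and specificity_above_chance:
            return "Weak"
        return "None"

    print(level_of_support(0.55, None))   # moderate reliability, unknown specificity -> "Weak"
    print(level_of_support(0.80, True))   # high reliability with specificity -> "Strong"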
Studies that measure emotion perception with choice-
from-array tasks. During the period from 1969 to 1975,
between five and eight small-scale samples from remote
cultures in the South Pacific were studied with choice-
from-array tasks; the goal was to investigate whether these
participants perceived emotional expressions in facial
movements in a manner similar to that of people from the
United States and other industrialized countries of the
Western world (see Fig. 12a). Our uncertainty in the num-
ber of samples stems from reporting inconsistencies in the
published record (see note to Table 5). We present the
findings here according to how the original authors
reported their findings, despite the inconsistencies. Five
samples performed choice-from-array tasks, three in which
participants chose a photographed facial configuration to
match one brief vignette that described each emotion cat-
egory (Ekman, 1972; Ekman & Friesen, 1971; Sorenson,
1975) and two in which they chose a photograph to match
an emotion word (Ekman, Sorenson, & Friesen, 1969). All
five samples performed some version of a choice-from-
array task that provided strong evidence in support of
cross-cultural reliability of emotion perception in small-
scale societies. Evidence for specificity was not reported.
Until 2008, all claims that anger, sadness, fear, disgust,
happiness, and surprise are universally recognized (and
therefore are universally expressed) were based largely
on three articles (two of them peer reviewed) reporting
on four samples (Ekman, 1972; Ekman & Friesen, 1971;
Ekman etal., 1969).33
Since 2008, 10 verifiably separate experiments observ-
ing emotional inferences in small-scale societies have
been published or submitted for publication. These
studies involve a greater diversity of social and ecologi-
cal contexts, including sampling five small-scale societ-
ies across Africa and the South Pacific (see Fig. 12b) that
were tested with a greater diversity of research methods
listed in Table 4, including tasks that allow for the pos-
sibility of observing cross-cultural variation in emotion
perception and therefore the possibility of disconfirming
the common view. Six samples registered their emotion
inferences using a choice-from-array task, in which par-
ticipants were given an emotion word and asked to
choose the posed facial configuration that best matched
it or vice versa (Crivelli, Jarillo, Russell, & Fernández-
Dols, 2016; Crivelli, Russell, Jarillo, & Fernández-Dols,
2016; Crivelli etal., 2017, Study 2; Gendron, Hoemann,
etal., 2018, Study 2; Tracy & Robins, 2008).
Only one study (Tracy & Robins, 2008) reported that
participants selected an emotion word to match facial
configurations similar to those in Figure 4 more
reliably than would be expected by chance, and
effects ranged from weak (anger and fear) to strong
(happiness), with surprise and disgust falling in the
moderate range.34 Information about the specificity of
emotion inferences was not reported. A close examina-
tion of the evidence from four studies by Crivelli and
colleagues suggests weak to moderate levels of reliabil-
ity for inferring happiness in smiling facial configura-
tions (all four studies), sadness in frowning facial
configurations (all four studies), fear in gasping, wide-
eyed facial configurations (three studies), anger in scowl-
ing facial configurations (two studies), and disgust in
nose-wrinkled facial configurations (three studies). A
detailed breakdown of findings can be found in Box 13
in the Supplemental Material. None of the studies found
specificity for any facial configuration, however, except
that smiling was reported as unique to happiness, but
that finding was not replicated across samples.35
The final study using a choice-from-array task with
people from a small-scale, remote culture is important
because it involves the Hadza hunter-gatherers of Tan-
zania (Gendron, Hoemann, etal., 2018, Study 2).36 The
Hadza are a high-value sample for two reasons. First,
universal and innate emotional expressions are hypoth-
esized to have evolved to solve the recurring fitness
challenges of hunting and gathering in small groups on
the African savanna (Pinker, 1997; Shariff & Tracy, 2011;
Tooby & Cosmides, 2008); the Hadza offer a rare oppor-
tunity to study hunters and foragers who are currently
living in an ecosystem that is thought to be similar to
that of our Paleolithic ancestors.37 Second, the popula-
tion is rapidly disappearing (Gibbons, 2018). Before
this study, the Hadza had not participated in any studies
of emotion perception, although they have been the
subject of social cognition research more broadly (H.
C. Barrett etal., 2016; Bryant etal., 2016).
After listening to a brief story about a typical instance
of anger, disgust, fear, happiness, sadness, and surprise,
Hadza participants chose the expected facial
Table 5. Summary of Cross-Cultural Emotion Perception in Small-Scale Societies

Task | Culture | N | Citation | Level of support
Free labeling | Fore, PNG^a | 100 | Sorenson (1975), Sample 2^b | None
Free labeling | Bahinemo, PNG | 71 | Sorenson (1975), Sample 3 | None
Free labeling | Hadza, Tanzania | 43 | Gendron, Hoemann, et al. (2018), Study 1 | None
Free labeling | Trobrianders, PNG | 32^c | Crivelli et al. (2017), Study 1 | None
Free labeling | Sadong, Borneo^d | 15 | Sorenson (1975), Sample 4^e | Strong
Cue-to-cue matching | Shuar, Ecuador | 23 | Bryant & Barrett (2008), Study 2 | Weak
Cue-to-cue matching | Himba, Namibia^f | 65 | Gendron et al. (2014) | None
Choice-from-array: matching face and words | Fore, PNG^a | 32 | Ekman, Sorenson, & Friesen (1969)^b | None
Choice-from-array: matching face and words | Mwani, Mozambique | 36^c,g | Crivelli, Jarillo, Russell, & Fernández-Dols (2016), Study 2 | None
Choice-from-array: matching face and words | Trobrianders, PNG | 24^c | Crivelli et al. (2017), Study 2 | None
Choice-from-array: matching face and words | Trobrianders, PNG | 68^c,g | Crivelli, Jarillo, Russell, & Fernández-Dols (2016), Study 1 | None
Choice-from-array: matching face and words | Trobrianders, PNG | 36^c | Crivelli, Russell, et al. (2016), Study 1a | None
Choice-from-array: matching face and words | Dioula, Burkina Faso^a | 39 | Tracy & Robins (2008), Study 2 | Weak
Choice-from-array: matching face and words | Sadong, Borneo^d | 15 | Ekman et al. (1969)^e | Strong
Choice-from-array: matching face and scenario | Hadza, Tanzania | 54 | Gendron, Hoemann, et al. (2018), Study 2 | None
Choice-from-array: matching face and scenario | Dani, New Guinea^a | 34 | Described in Ekman (1972)^h | Moderate
Choice-from-array: matching face and scenario | Fore, PNG^d | 189, 130^g | Sorenson (1975), Sample 1^i | Moderate
Choice-from-array: matching face and scenario | Fore, PNG^d | 189, 130^g | Ekman and Friesen (1971)^i | Strong
Note. Findings for anger, disgust, fear, sadness, and surprise are summarized; happiness is the only pleasant category tested in all studies except for Tracy and Robins (2008). Therefore, in most studies, inferences of happiness in smiling faces do not uniquely reflect emotion perception and may be driven by valence perception (distinguishing pleasant from unpleasant). All studies used photographs of posed facial configurations that are similar to those in Figure 4, except Crivelli, Jarillo, Russell, and Fernández-Dols (2016), Study 2, which used dynamic as well as static posed configurations, and Crivelli et al. (2017), Study 1, which used static spontaneous configurations. The Bryant and Barrett (2008) study was designed to examine emotion perception from vocalizations but is included because perceivers matched them to faces; in addition, participants were tested in a second language (Spanish) in which they received training. Choice-from-array studies (except Gendron et al., 2014a, Study 2, and Gendron, Hoemann, et al., 2018, Study 2) did not carefully control whether target facial configurations and foils could be distinguished by valence and/or arousal. All participants were adults unless otherwise specified. Levels of support: "None" indicates that reliability and specificity were at chance levels or that any level of reliability above chance was combined with evidence of no specificity. "Weak" indicates that reliability was between 20% and 40% (weak) for at least a single emotion category other than happiness combined with above-chance specificity for that category, or reliability between 41% and 70% (moderate reliability) for at least a single category other than happiness with unknown specificity. "Moderate" indicates that reliability was between 41% and 70% combined with any evidence of above-chance specificity for a category other than happiness, or reliability above 70% (strong reliability) for at least a single category other than happiness with unknown specificity. "Strong" indicates strong evidence of reliability (above 70%) and strong evidence of specificity for at least a single emotion category other than happiness. It is questionable whether the Sadong and the Fore subgroup with more other-group contact should be considered isolated (see Sorenson, 1975, pp. 362 and 363), but we include them here to avoid falsely dichotomizing cultures as "isolated from" versus "exposed to" one another (Fridlund, 1994; Gewald, 2010). PNG = Papua New Guinea.
^a Specificity levels were not reported. ^b Sorenson (1975), Sample 2, included three groups of Fore participants (those with little, moderate, and most other-group contact). The pattern of findings is nearly identical for the subgroup with the most contact and the data reported for the Fore in Ekman et al. (1969); Sorenson described using a free-labeling method, whereas Ekman et al. (1969) described using a choice-from-array method. Ekman (1994) indicated, however, that he did not use a free-labeling method, implying that the samples may be distinct. ^c Participants were adolescents. ^d Specificity was inferred from reported results. ^e The sample size, marginal means, and exact pattern of errors reported for the Sadong samples are identical in Sorenson (1975), Sample 4, and Ekman et al. (1969); Sorenson described using a free-labeling method, and Ekman et al. (1969) described using a choice-from-array method in which participants were shown photographs and asked to choose a label from a small list of emotion words. ^f Traditional specificity and consistency tests are inappropriate for this method, but the results are placed here based on the original authors' interpretation of multidimensional scaling and clustering results. ^g Participants were children. ^h The Dani sample reported in Ekman (1972) is likely to be a subset of the data from an unpublished manuscript. ^i Sorenson (1975), Sample 1, and Ekman and Friesen (1971) may be the same sample because the sample sizes and pattern of data are identical for all emotion categories except for the fear category, which is extremely similar, and for the disgust category, which includes responses for contempt in Ekman and Friesen (1971) but was kept separate in Sorenson (1975).
configuration more often than chance only when the
target and foil could be distinguished by the affective
property referred to as valence. The finding that Hadza
participants were successfully inferring pleasantness
and unpleasantness is consistent with anthropological
studies of emotion (Russell, 1991), linguistic studies
(Osgood, May, & Miron, 1975), and findings from other
recent studies of participants from small-scale societies,
[Figure 12 map callouts. Panel (a): Sadong of Borneo^a (Ekman et al., 1969); Sadong of Borneo^a (Sorenson, 1975); Bahinemo of Papua New Guinea (Sorenson, 1975); Fore of Papua New Guinea^d (Ekman & Friesen, 1971); Dani of Indonesia (Western New Guinea)^b (Ekman, 1972); Dani of Indonesia (Western New Guinea)^b (Ekman, Heider, et al., unpublished); Fore of Papua New Guinea (2 samples)^c,d (Sorenson, 1975); Fore of Papua New Guinea^c (Ekman, Sorenson, & Friesen, 1969). Panel (b): Trobrianders of Papua New Guinea (Crivelli, Jarillo, et al., 2016); Mwani of Mozambique (Crivelli, Jarillo, et al., 2016); Trobrianders of Papua New Guinea (2 samples) (Crivelli, Russell, et al., 2016); Trobrianders of Papua New Guinea (2 samples) (Crivelli, Russell, et al., 2017); Himba of Namibia (2 samples) (Gendron et al., 2014); Dioula of Burkina Faso (Tracy & Robins, 2008); Shuar of Amazonian Ecuador (Bryant & Barrett, 2008).]
Fig. 12. Map of cross-cultural studies of emotion perception in small-scale societies. People in small-scale societies typically live in
groupings of several hundred to several thousand people who maintain autonomy in social, political and economic spheres. (a) Epoch
1 studies, published between 1969 and 1975, were geographically constrained to societies in the South Pacific. Studies that share the
same superscript letter may share the same samples. (b) Epoch 2 studies, published between 2008 and 2017, sample from a broader
geographic range including Africa and South America and are more diverse in the ecological and social contexts of the societies tested.
This type of diversity is a necessary condition for discovering the extent of cultural variation in psychological phenomena (Medin,
Ojalehto, Marin, & Bang, 2017). Adapted from Gendron, Crivelli, and Barrett (2018).
such as the Himba (Gendron, Roberson, van der Vyver,
& Barrett, 2014a, 2014b) and the Trobriand Islanders
(Crivelli, Jarillo, etal., 2016; also see Srinivasan &
Martinez, 2018, described in Box 7 in the Supplemental
Material); these studies showed that perceivers can reliably
infer valence but not arousal in facial configurations.
In addition, Hadza participants who had some contact
with people from other cultures—they had some formal
schooling or could speak Swahili, which is not their
native language—were more consistently able to choose
the hypothesized facial configuration than were those
with no formal schooling who spoke minimal Swahili
(for a similar finding with Fore participants in a free-
labeling study, see Table 2 in Sorenson, 1975). Of the
27 Hadza participants who had minimal contact with
other cultures, only 12 reliably chose the wide-eyed,
gasping facial configuration at above chance levels to
match the fear story. (Compare this finding with the
observation that the hypothesized universal expression
for fear—a wide-eyed, gasping facial configuration—is
understood as an aggressive, threatening display by
Trobriand Islanders; Crivelli, Jarillo, & Fridlund, 2016;
Crivelli, Russell, Jarillo, & Fernández-Dols, 2016, 2017).
Studies that measure emotion perception with free-
labeling tasks. During the period from 1969 to 1975,
between one and three small-scale samples from remote
cultures in the South Pacific were studied with free label-
ing to investigate emotion perception (three samples
were reported in Sorenson, 1975; see Table 5 in the cur-
rent article). From 2008 onward, two additional studies
were conducted, one asking participants from the Trobri-
and Islands to infer emotions in photographs of sponta-
neous facial configurations (Crivelli etal., 2017, Study 1)
and the other asking Hadza participants to infer emotions
in photographs of posed facial configurations (Gendron
etal., 2018, Study 2). Taken together, these five studies
provide little evidence that the facial configurations in
Figure 4 are universally judged to specifically express
certain emotion categories. The three free-labeling stud-
ies reported in Sorenson (1975) produced variable results.
The only replicable finding appears to be that partici-
pants labeled smiling facial configurations uniquely as
happiness in all studies (as the only pleasant emotion
category tested). The two newer free-labeling studies
both indicated that participants rarely spontaneously
labeled facial configurations with the expected emotion
labels (or their synonyms) above chance levels. Trobriand
Islanders did not label the proposed facial configurations
for happiness, sadness, anger, surprise, or disgust with the
expected emotion labels (or their synonyms) at above
chance levels (although they did label the faces consis-
tently with other words; Crivelli et al., 2017, Study 1).
Hadza participants labeled smiling and scowling facial
configurations as happiness (44%) and anger (65%),
respectively, at above chance levels (Gendron, Hoemann,
et al., 2018, Study 2). The word anger was not used to
uniquely label scowling facial configurations, however,
and it was frequently applied to frowning, nose-wrinkled,
and gasping facial configurations.
Facial movements carry meaningful information,
even if they do not reliably and specifically display
emotional states. The more recent studies of people
living in small-scale, remote cultures suggest two interest-
ing and noteworthy observations. First, even though peo-
ple may not routinely infer anger from scowls, sadness
from frowns, and so on, they do reliably infer other social
meanings for those facial configurations, because facial
movements often carry important information about
social motives and other psychological features (Crivelli,
Jarillo, Russell, & Fernández-Dols, 2016; Crivelli et al.,
2017; Rychlowska et al., 2015; Wood, Rychlowska, &
Niedenthal, 2016; Yik & Russell, 1999; for a discussion,
see Fridlund, 2017; J. Martin etal., 2017). For example, as
we mentioned earlier, Trobriand Islanders consistently
labeled wide-eyed, gasping faces (the proposed expres-
sive facial configuration for the fear category) as signal-
ing an intent to attack (i.e., a threat; for additional
evidence in carvings and masks in a variety of cultures,
including Maori, !Kung Bushmen, Himba, and Eipo, see
Crivelli, Jarillo, & Fridlund, 2016; Crivelli, Jarillo, Russell,
& Fernández-Dols, 2016).
Second, people do not always infer internal psycho-
logical states (emotions or otherwise) from facial move-
ments. People who live in non-Western cultural
contexts, including Himba and Hadza participants, are
more likely to assume that other people’s minds are not
accessible to them, a phenomenon called opacity of
mind in anthropology (Danziger, 2006; Robbins &
Rumsey, 2008). Instead, facial movements are perceived
as actions that predict future actions in certain situa-
tions (e.g., a wide-eyed, gasping face is labeled as
“looking”; Crivelli etal., 2017; Gendron, Hoemann,
etal., 2018; Gendron etal., 2014b). Similar observations
were unavailable for the earlier studies conducted by
Ekman, Friesen, and Sorenson because, according to
Sorenson (1975), they directed participants to provide
emotion terms. When participants spontaneously
offered an action label (e.g., “she is just looking”) or a
social evaluation (e.g., “he is ugly,” or “he is stupid”),
they were asked to provide an “affect term.” Such find-
ings suggest that there may be profound cultural varia-
tion in the types of inferences human perceivers typically
make when looking at other human faces in general,
an observation that has been raised by a number of
anthropologists and historians.
A note on interpreting the data. To properly inter-
pret the scientific evidence, it is crucial to consider the
constraints placed on participants by the experimental
tasks that they are asked to complete, summarized in Table
4. In most urban and in some remote samples, experiments
using choice-from-array tasks produce evidence support-
ing the common view: Participants reliably label scowling
facial configurations as angry, smiling facial configurations
as happy, and so on. (We do not yet know whether per-
ceivers are uniquely labeling each facial configuration as a
specific emotion because most studies do not report that
information.)
It has been known for almost a century that choice-
from-array tasks help participants obtain a level of reli-
ability in their emotion perceptions that is not routinely
seen in studies using methods that allow participants
to respond more freely, and this is one reason they
were chosen for use in the first place (for a discussion,
see Gendron & Barrett, 2009, 2017; Russell, 1994; Widen
& Russell, 2013). When participants are offered words
for happiness, fear, surprise, anger, sadness, and disgust
to register their inferences for a scowling facial configu-
ration, they are prevented from judging a face as
expressing other emotion categories (e.g., confusion or
embarrassment), nonemotional mental states (e.g., a
social motive, such as rejection or avoidance), or physi-
cal events (e.g., pain, illness, or gas), thus inflating
reliability rates within the task. When people are pro-
vided with other options, they routinely choose them.
For example, participants label scowling faces as “deter-
mined” or “puzzled,” wide-eyed faces as “hopeful,” and
gasping faces as “pained” when they are provided with
stories about those emotions rather than with stories
of anger, surprise, and fear (Carroll & Russell, 1996;
also see Crivelli etal., 2017). The problem is not with
the choice-from-array task per se—it is more with fail-
ing to consider alternative explanations for the observa-
tions in an experiment and therefore drawing
unwarranted conclusions from the data.
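A small simulation makes the concern concrete. Suppose, purely hypothetically, that a perceiver registers only valence (pleasant versus unpleasant) and then picks at random among the offered words that match the perceived valence. In a six-option array containing a single pleasant word, such a perceiver labels a smiling face as "happiness" essentially every time and a scowling face as "anger" about 20% of the time, above the nominal 16.7% chance level, without perceiving happiness or anger at all. The Python sketch below is illustrative only; the names and numbers are ours and do not come from any study reviewed here.

    # Hypothetical valence-only perceiver in a six-option choice-from-array task.
    import random

    random.seed(0)

    OPTIONS = ["anger", "disgust", "fear", "happiness", "sadness", "surprise"]
    PLEASANT = {"happiness"}  # the only pleasant word offered in this hypothetical array

    def valence_only_choice(face_is_pleasant: bool) -> str:
        """Choose a word that merely matches the perceived valence of the face."""
        pool = [w for w in OPTIONS if (w in PLEASANT) == face_is_pleasant]
        return random.choice(pool)

    def agreement(face_is_pleasant: bool, expected_label: str, n: int = 10_000) -> float:
        hits = sum(valence_only_choice(face_is_pleasant) == expected_label for _ in range(n))
        return hits / n

    print("nominal chance level:", round(1 / len(OPTIONS), 3))        # 0.167
    print("smile labeled 'happiness':", agreement(True, "happiness"))  # ~1.0
    print("scowl labeled 'anger':", agreement(False, "anger"))         # ~0.20

The point of the sketch is not that perceivers actually behave this way, but that above-chance agreement in such tasks is compatible with processes far simpler than perceiving discrete emotions.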
Choice-from-array tasks may do more than just limit
response options, making it difficult to disconfirm
common beliefs. The emotion words provided during
the task may actually encourage people to see anger
in scowls, sadness in pouts, and so on, or to learn
associations between a word (e.g., anger) and a facial
configuration (e.g., a scowl) during the experiment
(e.g., Gendron, Roberson, & Barrett, 2015; Hoemann
etal., in press). The potency of words is discussed in
Box 14, in the Supplemental Material.
Summary. The pattern of findings from the studies con-
ducted with remote samples replicates and underscores
the pattern observed in samples of participants from
larger, more urban cultural contexts: Asking perceivers to
infer an emotion by matching a facial configuration to an
emotion word selected from a small array of options, or
telling participants a brief story about a typical instance
of an emotion category and asking them to pick a facial
configuration from an array of two or three photos, gen-
erally inflates agreement rates, producing evidence that is
more likely to support the hypothesis of reliable emotion
perception compared with data coming from less con-
strained response methods, such as free labeling (see
Table 3). This is particularly true for studies that include
only one pleasant emotion category (i.e., happiness) so
that all foils differ from the target in valence. The robust
reliability and specificity for inferring happiness from
smiling observed in these studies may be the result of
participants classifying valence rather than classifying
emotion categories per se. Studies that use less con-
strained tasks that are designed to more freely discover
how people perceive emotion instead yield evidence that
generally fails to find support for the common view. Less
constrained studies suggest that perceivers infer more
than one emotion category from the same facial configu-
ration, infer the same emotion category in a variety of
different configurations, and often disagree about the set
of emotion categories that they infer. Cultural variation in
emotion perception is consistent with the variation we
observed in studies of expression production (again, see
Table 3) and is even consistent with the research on face
perception, which itself is determined by experience and
cultural factors (Caldara, 2017).
Studies of healthy infants and children
Some scientists concur with the common view that
infants can read specific instances of emotion in faces
from birth (Haviland & Lelwica, 1987; Izard, Woodburn,
& Finlon, 2010; Leppänen & Nelson, 2009; Walker-
Andrews, 2005). However, it is difficult to ascertain
whether infants and young children possess the various
capacities required to perceive emotion per se: Simply
detecting and discriminating facial movements is not
the same as categorizing them to infer their emotional
meaning. It is challenging to design well-controlled
experiments that do a good job of distinguishing these
two capacities. Infants are preverbal, so scientists use
other measurement techniques, such as the amount of
time an infant looks at a stimulus, to infer whether
infants can discriminate one facial configuration from
another, and ultimately, whether infants categorize
those configurations as emotionally meaningful (for a
brief explanation, see Box 15 in the Supplemental
Material).
This “looking” approach introduces several possible
confounds because of the stimuli used in the experi-
ments: Infants and children are typically shown photo-
graphs of the proposed expressive forms (similar to
those presented in Figure 4; e.g., Leppänen, Richmond,
Vogel-Farley, Moulson, & Nelson, 2009; Peltola,
Leppänen, Palokangas, & Hietanen, 2008). Infants are
more familiar with some of these configurations than
with others (e.g., most infants are more familiar with
smiling faces than with scowls or frowns), and familiar-
ity is known to influence perception (see Box 15, in
the Supplemental Material), making it difficult to know
which features of a face are holding an infant’s attention
(familiarity or novelty) and which might be the basis
of categorization in terms of emotional meaning. The
configurations proposed for each emotion category also
differ in their perceptual features (e.g., the proposed
expressions for fear and surprise contain widened eyes,
whereas the proposed expression for sadness does
not), contributing more ambiguity to the interpretation
of findings. For example, when an infant discriminates
smiling and scowling facial configurations, it is tempt-
ing to infer that the child is discriminating expressions
of anger and happiness when in fact that target of
discrimination may be the presence or absence of
teeth in a photograph (Caron, Caron, & Myers, 1985).
Moreover, the facial configurations in question are usu-
ally posed as exaggerated facial movements that are
not typical of the expressive variation that children
actually observe in their everyday lives (Grossmann,
2010). Furthermore, unlike adults, infants may have had
little or no experience with viewing photographs of
anything, including heads of people with no bodies
and no context.
The most important and pervasive confound in
developmental studies of emotion perception is that
most studies are not designed to distinguish
whether infants and children (a) discriminate facial con-
figurations according to their emotional meaning or
(b) discriminate affective features (pleas-
ant vs. unpleasant; high arousal vs. low arousal; see
Box 9 in the Supplemental Material). Often, a facial
configuration that is intended to depict a pleasant
instance of emotion (smiling in happiness) is compared
with one that is intended to depict an unpleasant
instance of emotion (e.g., scowling in anger, frowning
in sadness, or gasping in fear), or these configurations
are compared with a neutral face at rest (e.g., Leppänen
etal., 2007, 2009; Montague & Walker-Andrews, 2001).
(This problem is similar to the one encountered earlier
in our discussion of emotion-perception studies in
adults from small-scale societies, in which perceptions
of valence can be confused with perceptions of emo-
tion categories.) For example, in one study, 16- to
18-month-olds preferred toys paired with smiling
faces and avoided toys paired with scowling and
gasping faces (N. G. Martin, Maza, McGrath, & Phelps,
2014); this type of study cannot distinguish whether
infants are differentiating pleasant from unpleasant,
approach versus avoidance, or something about a
specific emotion.
Another study (Soken & Pick, 1999) reported that
7-month-olds distinguished sadness and anger when
looking at faces, but only when the faces were paired
with vocalizations. What is unclear is the extent to
which the level of arousal or activation conveyed in the
acoustic signals was most salient to infants. A recent
study suggested that 10-month-old infants can differ-
entiate between the high arousal, unpleasant scowling
and nose-wrinkled facial configurations that are pro-
posed as expressions of anger and disgust, suggesting
that they can categorize these two facial configurations
separately (Ruba etal., 2017). Yet the scowling and
nose-wrinkled facial configurations also differed in
properties besides their proposed emotional meaning:
scowling faces showed no teeth, but nose-wrinkled
faces were toothy, and it is well known that infants use
perceptual features such as “toothiness” to categorize
faces (see Caron etal., 1985). If an infant looks longer
at a (pleasant) smiling facial configuration after viewing
several (unpleasant) scowling faces, this does not nec-
essarily mean that the infant has discriminated between
and understands “happiness” and “anger”; the infant
might have discriminated positive from negative, affec-
tive from neutral, familiar from novel, the presence of
teeth from the absence, less eye sclera from more, or
even different amounts of contrast in the photographs.
In the future, to provide a sound basis to infer that
infants are processing specific emotional meaning,
experiments must be designed to rule out the possibility
that infants are categorizing facial configurations into
different groupings using factors other than emotion.
As a consequence of these confounds, there is still
much to learn about the developmental course of
emotion-perception abilities. By 3 months of age,
infants can distinguish the facial features (the morphol-
ogy) in the proposed expressive configurations for hap-
piness, surprise, and anger; by 7 months, they can
discriminate the features in proposed expressive con-
figurations for fear, sadness, and interest. Left uncertain
is whether, beyond just discriminating between the
mere appearance of particular facial features, infants
also understand the emotional meaning that is typically
inferred from those features within their culture. By 7
months of age, infants can reliably infer whether some-
one is feeling pleasant or unpleasant when facial con-
figurations are accompanied by sensory information
from the voice (Flom & Bahrick, 2007; Walker-Andrews
& Dickson, 1997). Only a handful of studies have
attempted to test whether infants can infer emotional
meaning in facial configurations rather than just dis-
criminating between faces with different physical
appearances, but they report conflicting results
(Schwartz, Izard, & Ansul, 1985; Serrano, Iglesias, &
Loeches, 1992). One promising future direction involves
measuring the electrical signals (event-related poten-
tials) in infant brains as they view the proposed
expressive configurations for anger and fear categories
(e.g., Hoehl & Striano, 2008; Kobiella, Grossmann, Reid,
& Striano, 2008). Both of these studies reported dif-
ferential brain responses to the proposed facial con-
figurations for anger and fear, but their findings did not
replicate one another (and for certain measurements,
they observed opposing effects; for a broader review,
see Grossmann, 2015).
Studies that measure a child’s ability to use an adult
caregiver’s facial movements to resolve ambiguous or
threatening situations, referred to as social referencing,
have been interpreted as evidence of emotion percep-
tion in infants. One-year-olds use social referencing to
stay in close physical proximity to a caregiver who is
expressing negative affect, whereas infants are more
likely to approach novel objects if the caregiver
expresses positive affect (Carver & Vaccaro, 2007;
Moses, Baldwin, Rosicky, & Tidball, 2001; Saarni,
Campos, Camras, & Witherington, 2006). Similar results
emerge from the caregiver’s tone of voice (Hertenstein
& Campos, 2004; Mumme, Fernald, & Herrera, 1996).
In fact, by 14 months of age, the positive or negative
tone of a caregiver’s voice influences what an infant
will touch even more than will a caregiver’s facial move-
ments or the content of what the adult is actually saying
(Vaish & Striano, 2004; Vaillant-Molina & Bahrick, 2012).
These studies clearly suggest that infants can infer the
valenced meaning of facial movements, at least when
made by live (as opposed to virtual) people with whom
they are familiar. But, again, these data do not help
resolve what, if anything, infants infer about the emo-
tional meaning of facial movements.
Learning to perceive emotions. Children grow up in
emotionally rich social environments, making it difficult
to run experiments that are capable of testing the com-
mon view of emotion perception while also taking into
account the possible roles for learning and social experi-
ence. Nonetheless, several themes have emerged in the
scientific literature, all of which suggest a clear role for
learning and context in children’s developing emotion-
perception capacities.
One hypothesis that continues to be strongly sup-
ported by experiments is that children’s capacity to
infer emotional meaning in facial movements depends
on context (the conditions surrounding the face that
may convey information about a face’s meaning). For
example, emotion-concept learning, as a potent source
of internal context, shapes emotion-perception capacity
(discussed in Boxes 10 and 16 in the Supplemental
Material). There are also developmental changes in how
people use context to shape their emotional inferences
about facial movements. Children as young as 19
months old can detect facial movements that
are emotionally incongruent with a context (Walle &
Campos, 2014). For example, when presented with
adult facial configurations that are placed on bodies
posing an emotional context (e.g., a scowling facial
configuration placed on a body holding a soiled dia-
per), children (ages 4, 8, and 12 years) moved their
eyes back and forth between faces and bodies when
deciding how to label the emotional meaning of the
faces, whereas adult participants directed their gaze
(and overt visual attention) to the face alone, judging
its emotional meaning in a way that was independent
of the bodily context (Leitzke & Pollak, 2016). The
youngest children were equally likely to base their
labeling of the scene on face or context. The results of
this experiment suggest that younger children devote
greater attention to contextual information and actively
cross-reference facial and contextual cues, presumably
to better learn about and understand the emotional
meaning of those cues.38
Another important source of context that shapes the
development of emotion perception in children involves
the broader environment in which children grow. Chil-
dren who grow up in neglectful or abusive environ-
ments in which their emotional interactions with
caregivers are highly atypical have a different develop-
mental trajectory than do those growing up in more
consistently nurturing environments (Bick & Nelson,
2016; Pollak, 2015). Parents from these high-risk fami-
lies produce unclear or context-inconsistent expres-
sions of emotion (Shackman etal., 2010). Neglected
children, who often do not receive sufficient social
feedback, show delays in perceiving emotions in the
ways that adults do (Camras, Perlman, Fries, & Pollak,
2006; Pollak etal., 2000), whereas children who are
physically abused learn to preferentially attend to and
identify facial movements that are associated with
threat, such as a scowling facial configuration (Briggs-
Gowan etal., 2015; Cicchetti & Curtis, 2005; da Silva
Ferreira, Crippa, & de Lima Osório, 2014; Pollak, Vardi,
Bechner, & Curtin, 2005; Shackman &