Academic Editors: Haskel J. Greenfield and Kevin M. Kelly
Received: 5 November 2024
Revised: 20 January 2025
Accepted: 4 February 2025
Published: 12 February 2025
Citation: Bejarano, T. (2025). The Origin of Human Theory-of-Mind. Humans, 5(1), 5. https://doi.org/10.3390/humans5010005
Copyright: © 2025 by the author. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Review
The Origin of Human Theory-of-Mind
Teresa Bejarano
Department of Philosophy, Logic and Philosophy of Science, University of Sevilla, 41018 Sevilla, Spain;
tebefer@us.es
Abstract: Is there a qualitative difference between apes’ and humans’ ‘ability to estimate
others’ mental states’, a.k.a. ‘Theory-of-Mind’? After opting for the idea that expectations
are empty profiles that recognize a particular content when it arrives, I apply the same
description to ‘vicarious expectations’—very probably present in apes. Thus, (empty)
vicarious expectations and one’s (full) contents are distinguished without needing meta-
representation. Then, I propose: First, vicarious expectations are enough to support apes’
Theory-of-Mind (including ‘spontaneous altruism’). Second, since vicarious expectations
require a profile previously built in the subject that activates them, this subject cannot
activate any vicarious expectation of mental states that are intrinsically impossible for
him. Third, your mental states that think of me as a distal individual are intrinsically
impossible states for me, and therefore, to estimate them, I must estimate your mental
contents. This ability (the original nucleus of the human Theory-of-Mind) is essential in
the human lifestyle. It is involved in unpleasant and pleasant self-conscious emotions,
which respectively contribute to ‘social order’ and to cultural innovations. More basically,
it makes possible human (prelinguistic or linguistic) communication, since it originally
made possible the understanding of others’ mental states as states that are addressed to
me, and that are therefore impossible for me.
Keywords: human lifestyle; language evolution; mentalese; self-conscious emotions;
Theory-of-Mind; vicarious expectations
1. Introduction
This article will propose that apes’ Theory-of-Mind (ToM) is supported by vicarious
expectations and that these, like any other expectation, are (to put it metaphorically)
empty profiles that will recognize a particular content when it arrives. Thus, vicarious
expectations, since they are empty profiles, can be automatically separated from the
subject’s own (full) mental contents. By contrast, in the human ToM, the subject estimates
foreign (i.e., others’) contents, which need some meta-representational resource that sepa-
rates them from the subject’s own contents. After having described in this way the contrast
between apes’ and uniquely human ToM, I must try to answer the following question: For
what function was the estimation of foreign contents—that is, the costly duality of one’s
own (full) content and foreign (full) content—originally advantageous?
If it is accepted that vicarious expectations require a previous empty profile in the
subject that activates them, then it must be also accepted that such expectations cannot
correspond to states which are intrinsically impossible for the subject. Thus, I propose that
the ability to estimate foreign contents originally arose when in the human lifestyle mental
states that were intrinsically impossible for the subject needed to be thought. But here it is
necessary to pause very briefly to deal with this lifestyle.
Humans 2025,5, 5 https://doi.org/10.3390/humans5010005
The new—human—lifestyle, which is the key to the co-evolution of genes/culture,
can be characterized by two features. (i) A ‘cultural’ feature. (ii) A ‘social’ one.
(i) An increasing technology: This would have needed some degree of teaching
(Gärdenfors, 2022; Laland, 2017; Tatone & Csibra, 2015), or, at least, parental
approval/disapproval (Castro & Toro, 2004), and, therefore, some increase in communication.
But technological increase also needs self-control, not only to acquire technological
skills but, above all, to surpass previous cultural products and, later, to support creative
innovations (which are the essential factor in achieving cultural advances).1
(ii) A high degree and wide span of collaboration and ‘partner choice’: This would
have required increasing communication (Mussavifard & Csibra, 2023), and also (since
there is “competition to be chosen as a partner in cooperative ventures”; Baumard et al.,
2013) self-control that “refrains from blatantly selfish actions” (Baumard et al., 2013).
Returning to our thread, we must ask ourselves why actions intrinsically impossible
for the subject needed to be thought of in the new, human lifestyle. Self-conscious emotions
(if we opt for the idea that they are originally based on an interpersonal relationship, not
on an innate moral core) are advantageous because they provide the self-control necessary
to care for one’s own reputation. In addition, the subject who experiences those emotions
‘thinks what others think of him’ (of him as a distal, foreign individual), and, therefore, he
thinks a foreign mental state which, being impossible for him in any circumstances, is not
graspable through vicarious expectations.
But, to get to the origin of the matter, let us focus on a basic question: how does
the human subject originally come to think what others think of him? Thus, we will study the
new communicative reception (not production, at the very beginning) that distinguishes
human (even prelinguistic) communication from that of chimpanzees. The human addressee
must think of a foreign mental state as a state addressed to him. By contrast, apes, I
propose, can think of a foreign mental state (not a content, but a vicarious expectation) only
if this mental state is not addressed to them, and can understand that a message is addressed
to them only if there is no need to estimate foreign (i.e., the producer’s) mental states.
After proposing the double identification of ‘apes’ Theory-of-Mind/vicarious expecta-
tions’, and ‘human Theory-of-Mind/foreign mental contents’, I will add two clarifications.
Firstly, the strict condition for the very origin of ‘foreign mental contents’ (that is, the strict
requirement that the mental states that must be thought are impossible for the subject
under any circumstances) is not necessary for the subsequent development of human
Theory-of-Mind. Secondly, it is convenient to focus in a more detailed way on the two
receptions—by apes and by humans—of pointing gestures.
Section 2 briefly reviews the older descriptions (from around 2003) of the primitive
and the advanced Theory-of-Mind and then presents the recent changes. Next, it focuses
on three articles, M. Tomasello (2018), Southgate (2020), and Lurz et al. (2022), that attempt to
accommodate the new data regarding the abilities of Theory-of-Mind in infants or apes
without having to dismiss the qualitative separation of the two modes of the Theory-of-Mind.
I share those authors’ goals, but I do not agree with their proposals. Section 3,
after highlighting the lack of consensus regarding the format of radically non-linguistic
‘expectations’, and after confronting mentalese (in Section “Does the ‘Language of Thought’
Exist?”), chooses to call them ‘well-defined, empty profiles’. Such emptiness, which in
any animal automatically separates goals from perceptions, can also be applied, I
propose, to a special type of expectation, the ‘vicarious expectations’. These special
expectations, very probably present in humans and apes, are processed as ‘belonging
to the other’ in the simpler of the two ways proposed by Ereira et al. (2018), i.e.,
“through an encoding of agent identity intrinsic” to them. The nuclear Section 4 (or rather
Section 4.2) proposes that the estimation of foreign contents originally arose when mental
actions intrinsically impossible for the subject needed to be thought of. Section 5 focuses on
self-conscious emotions, which are essential in ‘the new, human lifestyle’ (Section 5.1)
and require the ability to estimate foreign mental contents (Section 5.2). Section 6 specifies
that the (above-mentioned) strict conditions for the evolutionary emergence of ‘foreign
mental contents’ are not necessary for the subsequent (ontogenetic and historic) functions
of the ‘second line’ of mental contents. Section 7 proposes that the really effective (I will
call it ‘unified’) reception of pointing gestures requires the estimation of the producer’s
mental content, and in that sense is similar to the reception of gestures and gazes that
cause self-conscious emotions in the addressee, and similar to the dialogic nucleus of any
linguistic reception. Section 8 provides a general outline and summarizes the article by
listing all its hypotheses and suggestions without distinguishing the main one from
those that serve it. In fact, the order followed there
aspires to be the evolutionary order in which the different capacities would have arisen.
The outline does not, of course, cite any bibliography, nor does it accompany each idea
with the qualifications ‘according to my proposal’ or ‘I propose’, which were obligatory in
the other sections. Finally, Section 9 deals with the testability of the proposals.
2. The Theory-of-Mind from 2003 Until Now
2.1. A Very Brief Summary
For the authors who accepted Theory-of-Mind around 2003, its primitive mode was
the ability to know what the other sees (/does not see), or has (/has not) seen immediately
before. This ability is possessed not only by children much younger than four years but
also (as M. Tomasello et al., 2003, showed) by chimpanzees. These results, soon extended to
goats or ravens (see Bugnyar & Heinrich, 2005; Bugnyar et al., 2016), were explained by a
very simple mechanism, namely, that the subject both tracks a line from the (visually or
acoustically) perceived location of a conspecific to the relevant object and is aware of the
(possible) opaque barriers obstructing that straight line.
The advanced Theory-of-Mind was linked to the ability to attribute ‘false beliefs’ to
others. The early tests of ‘false belief’ show a video in which a child (Maxi) puts his marble
inside a vase and then leaves; afterward, his mother puts the marble inside his toy box and
leaves. Right then, Maxi comes back, and the experimenter asks the children who have
seen the video, ‘Where will Maxi look for his marble?’ The answers given by children
under 4 reflect not the false belief that Maxi is bound to have, but their own knowledge.
Within this general framework, the implicit knowledge of somebody else’s false beliefs in
some 3-year-olds who gave the wrong explicit answer (Clements & Perner, 1994) did not
seem to disturb the mentioned descriptions of the two modes of Theory-of-Mind.
But nowadays, there is new data. Let’s begin by attending to Karg et al. (2015), which
shows that apes’ ability to estimate what the other sees (or does not see) goes well beyond
its old description. These new experiments investigated whether chimpanzees could use
self-experience to infer what another sees. Subjects first gained self-experience with the
visual properties of an object (either opaque or see-through). In a subsequent test phase,
a human agent interacted with the object, and the authors tested whether chimpanzees
understood that the experimenter experienced the object as opaque or as see-through.
Crucially, in the test phase, the object seemed opaque to the subjects in all cases (while the
experimenter could see through the one that they had experienced as see-through before).
Therefore, the chimpanzees had to use their previous self-experience with the object to
correctly infer whether the experimenter could or could not see when looking at the object.
Chimpanzees in a competitive context (that is, when they were sufficiently motivated)
successfully used their self-experience to infer what the competitor saw.
This experimental design is an ‘ecological’ one. Let’s think of an ape who must
estimate if his peer sees the immobile object that he, the subject-ape, “has previously seen”
(Karg et al., 2016). In the wild, it is probable that the ape-subject must estimate if the foliage
prevents the peer from seeing the object, but note that, since apes can often find themselves
at different heights from each other, ‘the possible foliage that might (or might not) prevent the
peer from seeing the object’ is often hidden from the ape-subject’s eyes. To make such an
estimation, he certainly could move. However, apes (heavy and lacking wings) would take
too long to reach a location that would allow them to see their peer’s visual field. This
problem was, I suggest, solved in apes’ evolution by vicarious expectations (in addition
to the subject’s prior knowledge of the area).
There is also news regarding false-belief tasks. More concretely, since Onishi and
Baillargeon (2005), numerous results in non-verbal tests have been offered in favor of the
estimation of false beliefs by infants. That type of test was later applied to great apes,
who achieved not very different results (Krupenye et al., 2016; Kano et al., 2017). But,
since the rate of success in prelinguistic children, and even more in apes, is lower and
more variable than the rate obtained in verbal tests, we must ask: Are those successes in
non-verbal tests based on the same resource that supports traditional tests?
2.2. Discussing Some Proposals About the Difference Between the Primitive and
Advanced Theory-of-Mind
Before moving on to my proposal, let us look at some articles that try to separate the
new data from what is achieved by the advanced Theory-of-Mind. We will focus on
M. Tomasello (2018), Southgate (2020), and Lurz et al. (2022), the last of which goes in a
different direction. I am close to their goal, but not to their proposals.
According to M. Tomasello (2018), the infant grasps others’ beliefs because he “disregards
his own (diverging) knowledge”. In my view, such a reason is not convincing, since
disregarding the knowledge of the situation in which we find ourselves is, at any age, a
disadvantageous inattention. But it is also true that, as Tomasello argues, if one’s own mental
content, instead of being disregarded, is simultaneously carried with somebody else’s
content in one’s own mind, then the two contents must be distinguished and compared
by the subject, and thus we would be identifying the primitive mode with the advanced
one, an identification that I oppose.
Let us look at Southgate (2020), which, while relatively similar to M. Tomasello (2018),
is more recent and elaborate. Southgate (who, unlike Tomasello, doesn’t mention the
experiments about ‘foreign false beliefs’ in apes) proposes that “human infants have an
altercentric bias, which results from a combination of the value that human cognition places
on others, and an absence of a competing self-perspective”, and that this bias causes
events that are not co-witnessed with the protagonist of the play to be encoded with
less strength. (About the altercentric bias, Southgate cites Bräten, 2004, and we could
add Gallese, 2018.) This is what explains, according to Southgate, infants’ successes in
non-verbal tests of false belief.
I will start by saying that I very much like the idea that for infants, ‘altercentrism’ is
beneficial since it helps them to know what is relevant to others. However, I reject the
alleged “weakness of self-perspective” for the same reason I rejected Tomasello’s proposal
that the infant “disregards his own (diverging) knowledge”. Note that typical perceptions
are evolutionarily much older than altercentrism and are used at any age much more
frequently. Thus, it is unlikely that the degree of conservatism that evolution necessarily
includes fails there. Certainly, while infants typically pay a lot of attention to what people
around them look at, they sometimes do not care—I agree—about changes in the location
of an object. However, in my view, such lack of attention only appears if the object is not
salient enough for the subject, and, therefore, that contrast would not be a consequence of
‘altercentrism in the strong sense’ (i.e., ‘weakness of self-perspective’).
Let’s also focus on Lurz et al. (2022). This article, quite different from Tomasello’s
or Southgate’s, proposes that apes’ success can be explained in “a simple way:
Apes don’t use meta-representations, but they merely simulate (/imagine) to believe what
the other agent believes”. But note that this simulated (/imagined) belief or “low-level
simulation” (as Lurz et al. say) requires dealing with two contents about the same thing
and distinguishing each from the other. Thus, this task, however implicit it may be, is not “a
really simpler model”, as these authors claim, but is still a meta-representation.2
While I reject Tomasello’s and Southgate’s idea that infants and apes “disregard their
own diverging knowledge”, I accept that the union of “inattention to one’s own mental
states” and “attention to somebody else’s mental states” characterizes the primitive Theory-
of-Mind. But I will propose that such inattention and such attention take place, not at the
content level, but at the expectation level.
3. Expectations and Vicarious Expectations
After having criticized those three articles, can we keep the idea of a qualitative
difference between apes’ and humans’ Theories-of-Mind? Let’s focus on Karg et al. (2015),
which, as said above, shows that chimpanzees can use self-experience to infer what another
sees. Probably, on the one hand, they activate their own expectations about what they
would see if they were in the same location and circumstances as their observed peer, but,
on the other hand, they process such expectations as belonging to the peer. These would be
expectations of a special—vicarious—kind. But what exactly does a vicarious expectation
consist of?
3.1. Expectations in General
Let us begin by attending to expectations in general. These, mainly since Bar (2007),
are more often called ‘predictions’ (Latin prae-dictio: said or evoked in advance), a term
that I don’t like to use for non-human animals because of the view presented in the next
lines. I borrow ‘innate or learned expectations’ from Lorenz (1966).3
General expectations, mainly the goals, are a vital resource to guide behavior and, as
‘teaching mechanisms’, also learning in any animal. The question is how expectations act in
radically prelinguistic minds (and possibly also in our most spontaneous mental processes)
while expected things are absent. Probably, instead of proposing that the animal agent has
a simulation (or evocation, or off-line copy) of expected ‘things/events’, it could be helpful
to understand such ‘presence of absent elements’ in a less demanding, non-evocational
mode.4 Therefore, we could describe them as well-defined but empty profiles hierarchically
arranged according to their lesser or greater degree of dependency on learning. These
empty profiles can recognize the appropriate content when it arrives.
Okasha (2022), like other researchers, claims that ‘the mental representations of goal’
in avian and mammal species are objective facts, and he justifies this claim “on grounds of
(their) evolutionary continuity and neuro-physiological similarity (with humans)”. But I,
doubting that those “grounds” are enough of a guarantee, suggest the following alternative.
It was the very beginning of the new lifestyle (that is, the initial strong increase of
cooperation and communication) that made a less empty representation of goals more and
more advantageous: Individuals needed to communicate their displaced goals to their
group so that the group could cooperate towards reaching those goals.5 This need for producing
and understanding such communications probably caused this advance. In short,
‘well-defined, empty profiles’, which had been sufficient in the old lifestyle, no longer
sufficed. But this whole view is opposed to Fodor (2007), i.e., to ‘language-of-thought’ or ‘innate
mentalese’. Therefore, I must now focus on the contrast between this and my emphasis on
the crucial role of communication. In this respect, I am close to the goal pursued, for example,
by Fedorenko et al. (2024), but not to their way of dealing with the relationship between
language and thought. (The next sub-subsection offers some proposals about language, its
origin, and cognitive consequences. However, it is only later that I will describe my nuclear
proposal, that is, ‘the new (even prelinguistic) communicative reception’, which is the
point where primitive Theory-of-Mind had to be transformed.)
Does the ‘Language of Thought’ Exist?
Fodor (1975) postulated that an innate ‘language of thought’ (discrete symbols, and
syntax) supports perceptions (even if “these, unlike discursive representations, lack canonical
decomposition”, as Fodor, 2007, adds) and makes evocations possible. Certainly, this
idea constituted a root of artificial intelligence, and this fact explains its current revival.
However, I lean towards rejecting the existence of the innate ‘language of thought’. More
concretely, in some works, I have opposed not only innate syntax but also innate semantics,
since our semantics is indelibly shaped by syntax: See Bejarano (2008, 2010), and Bejarano
(2011) (Chapters 10–16). Without syntax, there are no nouns/verbs/adjectives, etc. There
are not even nouns: my proposal here goes against a deep-rooted idea that influences Hurford
(2007), for example.6
Fodor’s theory, although not focusing on the origin of human cognition and language,
closely channels the hypotheses about such origin. Therefore, confronting that theory requires
confronting its potential for derivations. Hence this section is going to be longer than
might at first glance seem appropriate.
S. Phillips (2024), trying to explain language-of-thought, proposes that “(perceptual)
data are projected onto a base (conceptual) space in one direction, and in the opposite
direction, these data are referenced by that space”. I agree, of course, that the elements
(objects, qualities, relations) of a perception are recognized by the individual who perceives
it. However, in my view, no independent element is used in genuinely prelinguistic percep-
tions. That is, while in linguistic understanding, each of the meanings receives independent
attention before they are integrated into the total meaning, in those perceptions, on the
contrary, such attentional, non-subpersonal independence of each relevant element would
consume time, and therefore, far from solving any problem, would be a detrimental feature.
(Certainly, we, using other human abilities, can attend slowly and at length to any perception,
let alone an artistic painting. However, perception evolved for survival in a world where
rapid response is crucial.)7 In addition, I think that in preverbal human infants and in
animals, the so-called ‘logical reasoning’ or ‘intuitive logical reasoning’ requires neither
decomposition nor compositional explicitation. Or if (as Durdevic & Call, 2022, propose)
“deductive reasoning, rather than relational or belief reasoning, is so far the best candidate
for a human-unique derived cognitive ability”, it is because deductive reasoning requires
syntax and syntactic semantics.
But if I reject not only the innate language-of-thought but also its innate ‘semantics’,
then I must try to give an alternative account of the emergence of language. Before that
emergence, there were, I propose, pre-syntactic (that is, holophrastic) ‘requests for a certain
material’ or ‘calls to a certain individual’, which would use pre-words, i.e., meanings
always linked to conative function and conative intonation. These holophrases could sometimes
(when the individual was absent or the material was gone) reveal the speaker’s false
or outdated beliefs to the listener, and, therefore, provoke the theme/rheme composition,
which corrects, completes or updates those beliefs (and is ‘meta-communicative’ in the
minimal meaning of Dingemanse & Enfield, 2023).
The emergence of this pre-grammatical syntax would have been helped by a new and
broader intonation pattern that binds the theme and the rheme into a single unit. (This suggestion
fits well with the link between intonation and semantics: “Regarding prosodic cues
that correlate with distinct communicative function, the brain responds very rapidly, but
not in communicative situations without semantic content”, R. Tomasello et al., 2022.)8
Such intonational help (a case of the physical, pre-symbolic embodiment in human
communication) probably facilitated the victory of voice over gesture (Bejarano, 2014),
an evident victory, even if gestures continue to accompany and complement vocal communication.9
Thus, complex management of the two different levels of packaging (the word
level and the intonational level) became necessary. Let us focus on that.
Linguistic structure, including hierarchical structure, is “a special case of structured
action” (e.g., Planer, 2023). In addition, Gallardo et al. (2023) propose: “In Broca’s area, an
action-related region evolved into a bipartite system, with a posterior portion supporting
action and an anterior portion supporting syntactic processes”. Could we then suggest that
the structured action immediately and directly linked to the syntactic structure is the action of
managing the two different levels of packaging? Let’s note that Osiurak et al. (2021) assign
the “technical” dimension (not the “motor” one) of actions to Broca’s area. This hypothesis
is also consistent with the theory (Corballis, 2011) that recursive skills go beyond language.
The new and broader intonation that binds the theme and the rheme into a single unit
resulted in a duality of different sounds for the same sign (with the conative intonation in
holophrases, and with the non-conative one in the genuine word used in pre-grammatical
syntax). With this, perhaps, arose the problem of how to identify the same meaning in two
different vocal patterns. The final solution could be the learning of articulatory-phonetic
sequences, which can be produced with one intonation or another depending on
the circumstances.
The learning of articulatory-phonetic sequences, even if it does not have to face the
problem of perceptual-motor correspondence (one hears oneself), is a difficult type of
imitation. Certainly, as C. Heyes (2021a) says, “I could copy a sound you make by simple
trial-and-error, varying my vocal output until it matches my memory of the sounds you
made”. This perfectly describes babbling. However, note that unitary articulatory-phonetic
sequences of several different steps (i.e., typical words) cannot be reproduced
simultaneously with their hearing, nor can they be easily remembered in an exact way, nor,
unlike bird song dialects or vocal learning in parrots, are they merely an enrichment of an
innate pattern.10
In this way, ‘super-high fidelity copying’ could perhaps arise.11 Obviously, “this
type of imitation makes sense in intransitive or object-free actions” (C. Heyes, 2021b). It is
a “mimicking” (M. Tomasello, 1999) of ‘conventional’ motor sequences. Regarding ‘high-fidelity
copying’, I agree that it was not necessary for the early technologies (see Andersson
& Tennie, 2023; Osiurak et al., 2022; Sterelny, 2023; Tennie et al., 2016). However, if, as I
have just suggested, the strictest motor imitation (i.e., the super-high fidelity in articulatory-phonetic
sequential imitation) was really essential for the deployment of syntactic language
(and, therefore, also of ‘collaborative computation’, Dor, 2023), then such imitation is a very
important cause of human cultural advances.
Regarding a deeper link between language and those advances, I suggest that
predicative, really compositional language (beyond making communication easier) is
likely to strengthen complex innovations, since these may be supported by the same
cognitive resources (of decomposition and recomposition) used in syntactic language.
Note that the primary cause of cultural advancement is not the ability to copy know-how
(see van Leeuwen et al., 2024),12 but the ability to produce innovative solutions, mainly
through creative problem-solving (although more serendipitous processes, e.g., of drift
from copying error, can sometimes lead to improvement of previous results).
I have no intention of trying to substantiate the previous suggestion that ‘syntactic
language, or, more precisely, its cognitive resources of decomposition and recomposition,
help to support creative innovations, even in non-linguistic areas’. Still, I will offer some
quotations. “Members of modern Homo sapiens can mentally combine and recombine symbols,
according to rules, not only to consciously describe the world as it is but to generate new
visions of it as it might be” (Tattersall, 2023). Likewise, Vyshedskiy (2022) highlights ‘the
voluntary imagination component of language’. This imagination must, I would add, be
used even in simple receptions of theme/rheme, since, for the typical, i.e., the non-informed
addressee (vs. the atypical, perfectly informed one), the content provided by the theme
does not yet include the rheme, and thus, the addressee will have to imagine a new situation
that he/she has not perceived. A clear example is the reception of “The blanket turned to
ashes”. In addition, note that, for this communication, the real blanket (or, more exactly, ‘the
blanket for the speaker’) is decomposed in two elements—firstly, ‘the addressee’s false belief
about the blanket’, that is, an inadequate means to reach the producer’s communicative
goal, and secondly, ‘the adequate correction or updating’. Thus, it is communicatively
recomposed. This is an ability to transform others’ mental contents. Returning to the previous
suggestion: Could that ability later—and more creatively—be exercised on one’s own
mental contents and support difficult problem-solving? Thus, in addition to connecting—in
my nuclear proposal—human Theory-of-Mind with human communication, I also suggest
connecting it with creative problem-solving. What relationship does this last skill have
with “human causal cognition” (whose original connection with technology is persuasively
proposed by Gärdenfors & Lombard, 2020)?
But let us return to our thread. What have I achieved in this section? Having said
above that the innate mentalese is incompatible with my proposals, I, unfortunately, have not
offered any strong argument against it. However, I have tried to show that an alternative
hypothesis may also have the potential for derivations. So, if it is accepted that the innate
language-of-thought is not necessary, then expectations can be more easily described as
empty profiles. (If, then: Needless to say, this article, based on data from several disciplines,
uses the hypothetico-deductive method.)
3.2. Vicarious Expectations
3.2.1. Can They Also Be Described as Empty Profiles?
So far, we have dealt with expectation in general, which is inseparable from any animal
life, and involves extremely basic competencies (for instance, the physical understanding
of the effects of gravity, or the daily exposure to the principles of causality). But what is
interesting for us—what can, in my view, connect with apes’ Theory-of-Mind—is only the
vicarious expectation. Thus, we must focus on the following question: Can the metaphorical
description (‘well-defined but empty profile’) also be applied to vicarious expectations?
Such an application seems plausible. Note, for example, that such emptiness can explain
why ‘level II perspective-taking’ is absent in the primitive Theory-of-Mind.13 Rakoczy
(2022) underlines this absence in his general review.
3.2.2. An Argument That Favors That Application: Primates’ Mirror-Neurons
To favor a little more the affirmative answer (that is, vicarious expectations can be
described as empty profiles), I will try to show that vicarious expectations derive quite
directly from a particular non-vicarious expectation. To do this, let us start looking at
non-human primates again. More specifically, let us focus on macaques’ mirror-neurons.
But first I must admit that, as Reviewer 2 rightly points out, the case for mirror
neurons has weakened in recent years. And this is not just because the innatism of the
first cognitive revolution has generally lost its appeal. Beyond this peripheral level of
weakening, there is another more nuclear one, namely, more and more differences are being
found in mirror neurons between the execution and the observation of the same action: It
seems that the alleged ‘mirror’ is becoming increasingly cloudy.
Why, despite that, do I decide to insert this subsection? First, because I really like the
idea that a primate trait—the hand—can, by being perfectly visible to its owner, provide
an initial bridge towards the estimation of another’s interiority, even if it is only the
estimation of a proprioceptive-tactile sensation that another feels. And, second, because the
apparent clouding of the ‘mirror’ could perhaps be the consequence of an initial erroneous
interpretation of mirroring.
But let us leave these preambles and focus on the original cause of mirroring,
according to Keysers and Perrett (2004). These authors pointed to a learning process. Hands
are (together with the forearm) perfectly visible to their owner, who must look at them very
attentively during the actions of grasping. Thus, the proprioceptive and tactile feedback of
any grasping will end up being connected with the visual perception of that movement.14
This hypothesis is attractive. Thus, C. Heyes and Catmur (2022) agree that the abilities of
mirror-neurons are learned “through the correlated experience of seeing and doing the
same actions in the context of self-observation”.
In other paragraphs, C. Heyes and Catmur (2022) and C. Heyes (2021b) also emphasize
that cultural practices—“childrearing practices that encourage adults to imitate infants and
children, or the use of optical mirrors”—solve ‘the problem of visuomotor correspondence’.
I accept, of course, that these cultural factors have a powerful influence on development
(Essler et al., 2023). But, in my view, the (both phylogenetic and ontogenetic) origin of
the visuomotor correspondence is, as Keysers & Perrett propose, the vision of one’s own
hand. More specifically, at the very origin there would be learning, though not yet cultural
learning; rather, it would depend on a bodily trait specific to primates.
But, after accepting the Keysers & Perrett hypothesis, we now have to focus on the
activity of the already trained mirror-neurons. The point is that when the visual perception is given
without the corresponding inner sensations (that is, when it is someone else’s hand), the
subject must disengage from himself the hand that is in sight. That disengagement is
confirmed by the results of all the rubber-hand experiments: See e.g., Pfister et al. (2021):
“A single tactile stimulus applied to the rubber hand—but not to the real hand—triggers
substantial and immediate disembodiment”. But this ‘disembodiment’ (or exclusion from
one’s own body) does not only concern—I propose—the hand at sight, but also the propri-
oceptive and tactile expectations that the observed grasping had activated in the subject,
and which this subject now needs to process as ‘belonging to other’.
This may be when vicarious expectations appear for the first time in evolution; this
may be the very origin of the estimation of another’s interiority, or, in other words, the
most primitive form of the non-human Theory-of-Mind. In short, while it is typically
emphasized that “mirror-neurons map other-related information onto self-related brain
structures” (Bonini et al., 2023), I underline the later, inverse mapping: One’s own failed
proprioceptive expectations become vicarious expectations automatically (see the next
subsection) processed as ‘belonging to another’.
Let us now pay attention to Pomper et al. (2023). “At most time points, mirror
neurons did not encode observed actions with the same code underlying action execution.
However, in about 20% of neurons, there were time periods with a shared code. These time
periods formed a distinct cluster and cannot be considered a product of chance”. These
experimental results might fit with the proposal offered in this subsection: Note that, if
mirroring is explained as vicarious empty expectation, then it is not surprising that it is
different from the ‘fulfillment of expectation’ that movements provoke.
But let us continue reading Pomper et al. (2023): “We propose that mirror neurons rep-
resent the process of a goal pursuit from the observer’s viewpoint. Whether the observer’s
goal pursuit, in which the other’s action goal becomes the observer’s action goal, or the
other’s goal pursuit is represented remains to be clarified. In any case, it may allow the
observer to use expectations associated with a goal pursuit to directly intervene in or learn
from another’s action”.
I would venture to say that “a goal pursuit from the observer’s viewpoint” is relatively
close to a vicarious expectation. Even the question (“Whether”) that “remains to be clarified”
is similar to what we will see in the next subsection with Ereira et al. (2018). Finally, the
last sentence—“In any case, it may allow the observer to use expectations associated with a
goal pursuit to directly intervene in or learn from another’s action”—is unobjectionable,
of course: Hebbian connections (such as those established between the two simultaneous
expectations, visual and proprioceptive, during the period in which mirror neurons learn)
constitute a resource that can serve animals in many different ways.
If (and only if) that proposal about the functioning of mirror-neurons and also the
proposal in the previous subsection about expectations are both correct, then we could
underline that, while the visual/proprioceptive connection is forming in a macaque, it is still
a non-vicarious expectation: It is the grip that the macaque is going to execute that ac-
tivates in him the (general, non-vicarious) expectation of the two versions—visual and
proprioceptive—of the adequate ‘feedback’. In this way, we could deduce the desired
conclusion—i.e., vicarious expectations are directly derived from non-vicarious expecta-
tions, and, therefore, if it is accepted that this latter type is an empty profile, then the same
has to be accepted with respect to vicarious expectations.
Certainly, the vicarious expectations that I propose to attribute to apes concern the
entire body, not only the hand. However, this could be an almost irrelevant difference.
Piaget (1954) showed that it is from hands and (since hands bring food to mouth) also
from mouth that the child builds correspondences between his own body and other bodies.
In addition, Errante et al. (2023) found (in human participants) that “actions-observation
activates specific cortical and subcortical sectors not only during hand actions observation
but also during the observation of mouth and foot actions”.
What do I finally get from all this? If vicarious expectations—instead of requiring
imagined (/simulated/evoked/off-line) contents—are ‘well-defined but empty states’,
then no meta-representational separation between contents and vicarious expectations is
necessary, and also then, the contrast between vicarious expectations and foreign contents
can support the contrast between apes’ and humans’ Theory-of-Mind. But here we must
add some clarifications.
3.2.3. Some Clarifications on Vicarious Expectations
In Section 2.2, I proposed that it is the subject’s own expectation that is absent when
the subject activates vicarious expectations and encodes them as ‘belonging to other’.
Why do I use “absent” (instead of “disregarded”, the term applied by Tomasello to ‘the
subject’s own knowledge’)? Let’s remember that behavioral activity necessarily activates
expectations of goals and subgoals. Therefore, “inattention to one’s own expectations”
can only take place when the subject, being behaviorally inactive, has no general
expectation activated. Thus, confusion is impossible, not only between (empty) vicarious
expectations and one’s own (full) mental contents but also between both types of the
subject’s expectations—(absent) general expectations and (present) vicarious expectations.
My second clarification is that the so-called ‘attribution of ignorance’ in the primi-
tive Theory-of-Mind does not require any resource different from vicarious expectation.
The mere ‘absence of vicarious expectations’—when, for example, the other chimpanzee
has not seen the food—can explain why in that case the (subordinate) subject goes (as
M. Tomasello et al., 2003, showed) to the food. This view is close to Barone et al. (2022), who
studied early implicit measures of false belief understanding: “The results from a new
‘Ignorance’ control condition in which children largely behave like in the ‘False-Belief’
condition, suggest that the epistemic state ascription does not amount to full-fledged belief
attribution. Rather, children probably merely track knowledge vs. ignorance”. In addition,
basic and implicit ToM capacities seem not to be the same ones as those tapped in standard
explicit false-belief tasks, since—as Poulin-Dubois et al. (2023) found—there is no stability
in Theory-of-Mind skills from infancy to early childhood.
From neuroscience, Schüler et al. (2024) say: “While the primitive Theory-of-Mind
is supported by the salience network, it is the default network that supports foreign
false beliefs and, more in general, the processing of internal, perceptually decoupled
representations”. This is compatible with my hypothesis. Note that vicarious expectations
are perceived in the body and movements of another agent, and are really salient perceptions
for the behaviorally inactive subject.15
The third clarification is particularly important. If vicarious expectations are accepted,
then we must accept that the self-other distinction is automatic in the primitive Theory-
of-Mind. Consider Ereira et al. (2018), who worked with human adult subjects: “When
another agent’s mental state is inferred, it can be identified as ‘belonging to other’ in two
different ways”. One way is that “a learning signal (prediction-error or belief) is encoded in
an agent-independent pattern. In this case, the learning signal and the identity of the agent
to whom the signal is attributed would need to be encoded in 2 separate activity patterns”.
This first way, with its meta-representational separation, would be linked to the advanced
Theory-of-Mind (in Ereira et al.’s words, “to standard false belief task”). But these authors
claim that, to identify mental states as ‘belonging to other’, there is another way, which
operates “through an encoding of agent identity intrinsic to fundamental learning signals
(my emphasis)”. This second way of self-other distinction (which in human adults is limited
to the most spontaneous processes) would be, in my view, based on vicarious expectations.
Those two ways might be relevant to solve a repeatedly alleged conundrum—“the
empathy-sharing conundrum, which mainly refers to the self-other differentiation that
empathy entails”, Vincini (2023). In my view, the type of self-other distinction that is based
on vicarious expectations does not involve any clash between self and other. This is the
type that, when it is linked to ‘empathy’, intervenes in spontaneous altruism. On the
contrary, the other type, when it is linked to ‘empathy’, appears, for example, when the
subject receives a request that he/she feels is an obstacle to—or, in other words, as a clash
with—his/her own activated goals.16
In a similar line to that of Ereira et al. (2018) (but focusing on ‘altercentrism’),
Tebbe et al. (2024) report: “A highly specific neural signature of visual object processing
was also present when their view was blocked and only another observer saw the
object”. This, which was found in infants and adults, could perhaps indicate that the visual
vicarious expectation shared the empty profile that served the subject to search for and
recognize the particular object.17 The core of the experimental design of Tebbe et al. (2024)
should be applied to apes. (See the paragraphs above about Karg et al.’s articles.)
Let us recapitulate the previous clarifications. Vicarious expectations can include
what Michael and Székely (2019) call ‘goal slippage’. In summary, the slippage into the
circumstances of the other, or the ‘disembodiment’ of expectations (i.e., the ‘exclusion from
one’s own body’, Pfister et al., 2021) which the subject performs when the observed hand
is a foreign one, or, as in Ereira et al. (2018), the ‘encoding of agent identity intrinsic’ to the
mental state: all these terms describe ‘vicarious expectations’. I would now add that such
easy slippage is abruptly interrupted (both in humans and apes) as soon as the other agent
turns around and looks at the subject. We must not forget that, in a subject, vicarious
expectations are incompatible, not only with behavior but also with a high probability of
immediate behavioral activation.
That rupture of the easy slippage is similar to what happens when, after having
imitated (copying all his turns left or right) someone who walks ahead of me, I realize that
he turns around and faces me. Certainly, humans can continue such ‘bilaterally accurate’
motor imitation, but only if they start doing something different from what they were
doing before (i.e., different from the mere ‘slippage’ into another location). More precisely,
if I want to continue the imitation, I will have to imagine myself in a situation that is as
intrinsically impossible for me as being in a different spatial relationship with myself.18 (Let’s
think of the gesture of two individuals shaking each other’s right hand: Could this gesture
originally involve—or try to provoke—the grasping of foreign mental contents?) All in all,
the similarity of the collapse of the flowing “slippage” in the two mentioned cases—with
imitation and without imitation—is clear.
4. Primitive and Advanced Theory-of-Mind
4.1. Working-Memory and Non-Verbal Tests of False Belief
Indeed, in the (so-called) ‘non-verbal tests of false belief’ there are successes (above
chance), but they are quantitatively limited. Regarding non-human primates, see, for
example, Berke et al. (2023, preprint). In addition, ‘replication findings’ are mixed. This
is why “a large-scale multi-lab collaboration will examine whether 18–27-month-olds
and adults’ anticipatory looks distinguish between knowledgeable and ignorant agents”
(Schuwerk et al., 2024). About those difficulties in replicating successes, Rakoczy (2022)
proposes: “There might be two classes of implicit tasks”.
How could we interpret all this? Certainly, regarding this matter, we must wait for
new data. In addition, evolutionary emergence and ontogenetic development can never be
identified (and even less so in our case, since “infants’ experience is already enlanguaged”,
Dreon, 2024). However, when we are asking a question so difficult to answer, namely when we
are wondering about the evolutionary origin of the human Theory-of-Mind, we should
not rule out anything that can shed even a little light. Thus, I will add a few
brief comments.
In my view, those non-verbal tests require (regarding just Theory-of-Mind) vicarious
expectations, and do not need meta-representation of foreign contents. In other words
(regarding just Theory-of-Mind, I repeat), such tests mainly depend on the primitive, easier
one (even though sometimes, of course, adult humans apply the advanced Theory-of-Mind
to them). However, they require other abilities beyond Theory-of-Mind. Thus, the two
scenes (original and changed place) and the consequent demand on attention and especially
on working memory provoke great difficulty in less motivated subjects.19 Such difficulty
mainly appears if these are at the same time prelinguistic subjects. Note, please, that
developmentally—and, very plausibly, also evolutionarily—the reception of multiple-word
messages causes a great expansion of working memory.
Leaving all this, let’s focus on the nuclear proposal in this article. Certainly, all the other
proposals or suggestions offered in the article can and should be evaluated in themselves.
However, if I have included them here, it has been to support the main one.
4.2. What Made the Estimation of Foreign Mental Contents Originally Advantageous?
My proposal up to this point has been that the contrast ‘primitive vs. advanced Theory-
of-Mind’ equals the contrast ‘(empty, easier) vicarious expectations vs. (full, more difficult)
foreign contents’. Therefore, the following question arises: What made the estimation of
foreign mental contents originally advantageous? As the reader can see, I believe that
only if we explain the difference in function—and not only in features—between vicarious
expectations and foreign mental content can we move forward.
I propose the following three points. First, to support the adaptive advantages pro-
vided by apes’ Theory-of-Mind, vicarious expectations are sufficient resources. Second,
since vicarious and non-vicarious expectations require previous, well-defined profiles in
the subject that activates them, this subject cannot activate any vicarious expectation of
mental states that are impossible for him in any circumstances.20 Third, your mental state
of thinking of me as a foreign, distal individual, since it is a mental state that is impossible for me,
cannot be a vicarious expectation of mine, and therefore, I will only access that state if I am able
to estimate foreign contents. Above I proposed that ‘the situation of being in a different
spatial relationship with myself’ is impossible for me. Now, a new example—your state of
interacting with me as with a foreign, distal individual—would be equally impossible for
me, but much more relevant for human needs.
Thus, we can reformulate our previous question in the following way. For what
function was ‘the ability to estimate the foreign mental states that involve me as a distal, foreign
individual’ originally advantageous? Or, more concretely: In the new lifestyle, were there
problems that such ability could solve?
5. Self-Conscious Emotions
I am now going to separate (as other researchers have done) Theory-of-Mind from
‘false belief’ a bit and focus more on those emotions and other issues. “A developmental
approach that focuses on a plurality of domains makes us able to generate useful insights
that may not be obvious when focusing on a single domain”, A. L. Ruba et al. (2022). Or, in
other words, a puzzle is more difficult if some pieces are missing. But let’s go back to work
step by step.
“The thinking what others think of us” (Darwin, 1872, about blush; my emphasis)
necessarily requires, according to my proposal, the estimation of foreign mental contents,
and therefore, the beginning of the advanced Theory-of-Mind. That phrase can describe,
beyond blush, also self-conscious (or “self-other-conscious”: Reddy, 2010) emotions, which
are “embarrassment, shame, guilt, pride” (e.g., M. Lewis, 2000).
Is there a common neurophysiologic signature for these four emotions? See
Piretti et al. (2023). However, these authors unfortunately did not include pride (the only
pleasant self-conscious emotion) in their study.
I am opting—it is already evident—for the idea that such emotions are originally based,
not on an innate moral core, but on an interpersonal relationship. Thus, in self-conscious
emotions (unlike in basic emotions),21 the subject “thinks what others think of him”. Beyond
Darwin’s phrase, Frith and Frith (2007) is essential: “The appropriate reception of deliberate
social signals depends on the ability to take another person’s point of view. This ability is
critical to reputation management, as this depends on monitoring how our own actions are
perceived by others”.22 Indeed, when we experience self-conscious emotions, the contents
of the foreign mind become more real, more relevant for us than any other reality in our
surroundings. Cf. Peeters et al. (2023): “[O]bserver-memories are often associated with
events where the memorizer experienced a high degree of self-awareness, such as during
public speaking. This could be explained by appealing to the context of encoding, where
the relatively intense emotions guide encoding towards an observer perspective”.
In this Section, firstly, I will argue that self-conscious emotions relate to the new, human
lifestyle. They are “survival circuits” (as LeDoux, 2012, 2023, describes the function of any
emotion), but survival circuits of a very special type that evolved linked to the human
lifestyle. Secondly, I will propose that self-conscious emotions require the estimation of
somebody else’s mental contents.
5.1. Self-Conscious Emotions Are Useful in the Human Lifestyle
The new, human lifestyle is based on special cooperation and communication. Conse-
quently, the care of one’s own reputation, and therefore, also an enhancement of self-control
became crucial: Leary (2004) and Sznycer (2019). (All this did not replace “the old dynamics
of social dominance, which are based on aggressive and submissive interactions” (Royo
et al., 2024), but was added to them. Hence, prestige is associated with evolutionarily new
nonverbal displays: Witkower et al., 2020.) In this way, Baumard et al. (2013), who focus on
“competition to be chosen as a partner in cooperative ventures”, practically identify the care
of reputation with the habit of refraining from “blatantly selfish actions”.23 This refraining
is certainly essential in the care of reputation. However, even in “cooperative ventures”
other aspects are important—e.g., the reputation concerning good communicative abilities.
In addition, beyond cooperative ventures, there are—see Crespi et al. (2022)—other “arenas
of runaway social selection” where reputation is equally crucial. We must also consider
that when narrative language and thereafter (negative or positive) gossip arose, the
care of reputation became more intense.24
But let us pay attention to a different usefulness of self-conscious emotions. ‘The new,
human lifestyle’ also requires the “deliberate practice” (Ericsson, 2002; Rossano, 2003)
that is necessary to achieve any kind of cultural expertise. Here, a self-conscious emotion,
pride, intervenes. Experts arouse admiration. (About the two types of admiration, for
skill and for moral virtue, see Algoe & Haidt, 2009. About admiration, vs. envy, for
experts: Onu et al., 2016.) Therefore, experts experience the only pleasant self-conscious
emotion, pride. See Sznycer and Cohen (2021), and Sznycer et al. (2017).25 The search
for those attractive rewards can support, at least in some of the admirers, prolonged,
effortful acquisitions, not only of the admired level of expertise but also of a better one.
This role of pride could become even stronger in “collaborative computation, which is the
foundation of our cumulative cultures” (Dor, 2023) (and is very different from the
so-called ‘collective mind’ that leads many animal species to, for example, efficiently organize
their group movements).
In addition to providing motivation in that way, pride could exert influence in an indirect
but still effective way: progress towards a goal → higher ‘self-efficacy’ and pride → more
difficult goals are perceived as possible. A goal that is perceived as both difficult and possible
(that is, a goal in Vygotsky’s zone of proximal development; Vygotsky & Cole, 1978) can
improve the subject’s level. In other words, “there may be a positive relationship between
difficulty and progress when self-efficacy is high”, as Thorne et al. (2023, preprint) try
to confirm.
Certainly, children at first are concerned with learning by observing their parents.
However, from about age 8, they switch to copying the local expert instead.26 This tendency
is probably universal (Henrich & Broesch, 2011). Expertise, despite not influencing automatic
imitation (Nevejans & Cracco, 2022), can cause the desire to acquire such expertise,
and in that causality, “admiration is more decisive than prestige bias” (Chellappoo, 2021).
In addition, let us look at experimental results by Brinums et al. (2023): “Children that
were asked to imagine succeeding in the test and to focus on what they will be feeling
(Emotional Condition) practiced longer than those in the Non-emotional Condition”. More
in general, Shimoni et al. (2022) report that a strong link between delay of gratification
and pride has been found among preschool-aged children, an age at which self-regulation
abilities are still developing.
Thus, pride can, I propose, support cultural advances. Pride is a reward that subjects
get when they see the admiration with which they are looked at by the group—a reward
that the subject, of course, will seek to obtain again. Certainly, there are other rewards
for an outstanding skill. E.g., André et al. (2023)—who do not underline the causal role
of pride, or, more concretely, of its pleasant nature—focus on “reputational and material
benefits to the recognized artists”. However, the pleasure that others’ admiration and
consequent pride provide, being less deferred and more easily evocable than those benefits,
could originally be the best resource to support the prolonged effort that an outstanding
skill requires. “Regarding, for example, the learning of post-Acheulean shaped stone tools,
we should be concerned to explain the hours of effort with little or no short-run return”
(Spurrett, 2024). On this point, Castro and Toro (2004) and Castro et al. (2024) discuss the
reward that a parental positive evaluation involves, and also Sterelny and Hiscock (2024)
(in their reply to Spurrett) focus on children. All this is certainly true, but such focus is
useful only to support the basic acquisition of skills, not to sustain the attempt to surpass
the previous level of the group. Therefore, pride—I propose—could be an important cause
of the innovations that gave rise to our cultural advances. (Mere serendipity, in my view,
would have had in general only a small influence). Thus, the two features of the new,
human lifestyle described above (in the Introduction) would be supported by self-conscious
emotions. In other words, not only are the negative self-conscious emotions partially
responsible for its ‘social’ feature (as is generally admitted), but the only pleasant
self-conscious emotion also has a strong influence on its ‘cultural’ feature.27
In short, self-conscious emotions support self-control, which is necessary in different
aspects of the new lifestyle.28 Certainly, self-control will be bolstered later by ‘speech directed
to oneself’ or, even later, by ‘inner speech’ (see Bejarano, 2022, Section 4), and
can be put at the service of any type of goal (even the goal of exercising what I call—see
previous note 16—the most demanding moral capacity). Probably, those very special types
of speech originally arose when gossip (which “gives gossipers an evolutionary advantage”,
X. Pan et al., 2024) spread more and more. However, before ‘self-directed speech’
began, self-conscious emotions were crucial for the growth of self-control in humans.
5.2. Self-Conscious Emotions and the Estimation of Foreign Contents: The Two Connections Between Both Traits
Now, let us move on to the link between self-conscious emotions and the ability to
estimate foreign content. I propose that if the human being can experience self-conscious
emotions, it is because he is capable of imagining a situation as impossible for him as
that of seeing himself as a distal, foreign element. (An earlier, more embodied version
of that imagining was offered in Section 3.2.3). Thinking what others think of oneself
requires the ability to estimate other people’s mental contents: Vicarious expectations
would have been useless there. This is the first of the two connections mentioned in the
title of this subsection.
Let’s move on to the second connection. Having opted for the idea that originally
such emotions were based on an interpersonal relationship, I suggest that, very likely, such
interpersonal relationship originally occurred as a prelinguistic intentional communication,
that is, as expressive ‘gestures or vocalizations’ accompanied by gazes. (I agree with, for
example, Bohn et al. (2022) that the main link between the kinds of signals our human
ancestors used and human language “is the interaction engine”. In general, I accept
Tomasello’s claims that human uniqueness precedes language.)29 Such prelinguistic
intentional communications—for example, ‘gesture or vocalization of disgust (/happy
surprise) + eye contact with the addressee’—could have caused unpleasant (/pleasant)
self-conscious emotions in the addressee.30
Such productions are “simultaneous multilevel communications”: Lipschits and Geva
(2024) (who also underline the decisive role of the adult receiver). More concretely, in
such communications, the intentional level would control and use the behavioral and even
autonomic ones, i.e., those movements or expressions that originally were not intentionally
communicative. This transformation of the old levels makes ‘the dissociation between expression
and intentional communication’ “murky” (Warren et al., 2023), or, more concretely,
there may not be any such dissociation at all in the intentionally communicative production
of the great apes.31 The proposal that “in such communications, the intentional level would
control and use the behavioral one” (Lipschits & Geva, 2024) is similar to ‘the recruitment
view’ about the origin of great ape gestures—“Great ape gestures recruit features of their
existing behavioral repertoire for communicative purposes”, Graham et al. (2024).
Certainly, the prelinguistic intentionally communicative messages that caused self-
conscious emotions in the addressee stand out due to their special importance (focused on
in Section 5.1) for the development of the new, human lifestyle. However, as communicative
productions, they are examples just like any other within apes’ and infants’ abilities. Despite
this, we need to underline such messages: Note that, while the above-cited phrase of
Darwin perfectly serves, with its “of us”, to distinguish what vicarious expectations cannot
do, it nevertheless ignores a basic question (how the human subject originally comes to
think what others think of him), and therefore the cited phrase cannot get us to human
communicative reception, which is an (or the?) essential root of human uniqueness. So, henceforth
this subsection will focus on that root, and in this way will give a second argument in favor
of the link between self-conscious emotions and the estimation of foreign contents.
As seen just above, in non-human primates the intentional control of the behavioral
and autonomic levels can occur in production. Thus, in the very beginning, human
communicative uniqueness only happens at the reception: In other words, according to
my proposal, it is the recipient who originally needs to strive—and to estimate foreign
mental contents.
This proposal (it is the recipient who originally needs to strive) may seem like a
way to escape from the controversy between, on the one hand, Scott-Phillips and Heintz
(2023), who agree with Grice that “the communicative producer typically intends that
the recipients recognize his/her communicative intention” and, on the other hand, R.
Moore (2015) or Geurts (2019), who reply that it is only to hide his/her communicative
intention that the producer must strive. However, that ‘second Gricean requisite’ is not
the best terrain to focus on the very origin: Note that, while Grice starts from a clear
contrast between natural and non-natural signs, I propose that it is the transformation
of a ‘natural’ (or rather, returning to Lipschits and Geva (2024), merely “behavioral” or
even “autonomic”) sign into a communicative, ‘non-natural’ one that must be recognized
by the addressee.32 This transformation can be called the “behavior of marking entities
(e.g., objects and actions) as communicative” (Mussavifard, 2023, preprint). However, at
the very origin, the recognition of such ‘marking’ (i.e., its understanding by the addressee)
required an evolutionary, probably genetic transition: This is my point.
In other words, what I really propose is (as in Section 4.2) that, if an addressee identifies
through vicarious expectations the outcome that is intended by the producer, then this
addressee will not be able to perceive the producer’s behavior as a communicative behavior
towards him/her—i.e., towards the addressee. Therefore, the eye contact that typically
accompanies chimpanzees’ intentional communications with an addressee will be, of
course, understood by the ape addressee as a communicative resource, but it will not be
applied to the behavior that activates vicarious expectations. This non-unified reception is
certainly more hazardous and less effective than human, unified reception. However, if, as
I believe, the non-unified one exists, then it sometimes must produce the result wanted by
the producer.
Therefore, my proposal can only be defended if we find the condition that
allows some intentional communications of that type to be successful—i.e., allows them
to get the addressee to satisfy the producer’s desire. The proposal makes the following
prediction: In such successes, the behavior with which the ape-producer tries to manipulate
the addressee’s attention toward evidence of the intended outcome—that behavior or
resource—may be well understood even if it is not perceived as communicative. If that
were so, then we could hypothesize that failures do not derive mainly from a deficient
ability for pragmatic interpretation (even if interpretation, “in a novel situation, requires
the integration and assimilation of multiple pieces of information to guess at outcomes”,
Warren & Call, 2022), but above all from the limitations of non-unified reception.
Melis and Rossano (2022)—as others had done before—claim that monkeys’ and
apes’ communicative production is better than reception. These primates can intentionally
produce request messages for an addressee.33 We can even see that “a female adult baboon
tries to draw the attention of her offspring toward the piece of fruit that she waves between
her fingers” (Meguerditchian, 2022).34 However, when the non-human primate receives the
message that is addressed to him, he cannot—I propose—grasp that ‘such action of trying
to draw attention’ is simultaneously a ‘foreign mental state’ and ‘addressed to him, i.e., to
the recipient’.
Returning to the task of finding the condition that allows some intentional
communications of that type to be successful, I will start by recognizing that it
is a difficult one. Firstly, D. A. Leavens et al. (2005), studying their captive but
untrained chimpanzees, have found that ape producers no longer use pointing gestures
as soon as the recipients leave, and confirmed, therefore, that those communications are
intentionally targeted at the addressee, but nothing is said about reception, because the
addressee is human. Secondly, in Hobaiter et al. (2014), the addressee of the pointing
gesture is the chimpanzee-producer’s mother, but in the case observed, the mother did not
satisfy the desire. Presumably she did not because it would have been risky,
but, anyway, this case cannot be used as an example of successful communication. Thirdly,
loud scratch, despite its great relevance, doesn’t seem to help us enough either, since it
has typically been regarded as ritualized. However, in this third case, we can remember
‘the recruitment view’—Graham et al. (2024)—, and also a suggestion that was offered
above, in Section 3.2.2—“Probably, only the primates possess vicarious expectations”.
If these views were correct, then the producer of the loud scratch could intentionally
activate in the addressee vicarious expectations instead of the general expectations that are
activated by the overwhelming majority of animal ritual signals, which are not recruited
for communicative purposes.
Regarding the first two situations, do their circumstances constitute an insurmountable
obstacle to considering both as indicative of a possible reception by chimpanzees? I believe
they do not. We must again consider that, if those gestures or behaviors could never be
understood by apes, then they would not be produced by wild (Hobaiter et al., 2014) or
captive but non-trained (D. A. Leavens et al., 2005) chimpanzees either.
We can see that those productions occur when a very conspicuous obstacle (the cage in
Leavens, or the dominating individual in Hobaiter) prevents the producer from satisfying
his/her goal. Therefore, the behaviors that try to signal the purpose of the producer can be
understood by the ape-addressee as behavior that merely responds to the producer’s goal
(although, due to the obstacles, he, the producer, was unable to achieve such a goal). Or,
describing it according to my proposal: Those behaviors easily raised vicarious expectations
in the chimpanzee addressee and did not need to be understood as communicative by
that addressee.
The non-unified reception may seem surprisingly inappropriate. However, I suggest
that it was kept in apes for two interrelated reasons. One, in apes’ lifestyle, the non-unified
reception, despite being suboptimal, is a sufficiently useful resource. Two, the change to
unified reception requires a new ability and probably also brain modifications that allow
the duality of contents.
A clarification may be useful here about the unified, human reception of that type
of prelinguistic communication (i.e., the unification between ‘gaze towards the addressee’
and ‘behavior that tries to signal the outcome that is intended by the producer’). While
such reception already must be supported by the estimation of foreign mental contents (or,
more concretely, of a foreign thought that interacts with the addressee-subject, i.e., with
the recipient, as with a distal individual), it is still different from the predicative language.
Note that, on the one hand, only predicative communications are primarily used to correct
(or complete or update) the addressee’s (incorrect, according to the speaker) beliefs. On the
other hand, the role that the gaze towards the addressee fulfills in those human prelinguistic
communications is dispensable in linguistic communication: The non-natural feature of
linguistic signs is sufficient to reveal that they have an intentional communicative function.
(See above, in this same subsection, the debate about the second Gricean requisite).
Therefore, the predicative language (the only communicative function that absolutely
requires syntax and syntactic semantics) could mark a new stage, which would be
characterized by more working memory (see above, Section 4.1, and, first of all, Coolidge,
2023), and—I suggest—also by constituting an interpersonal, easy precedent for creative
problem-solving (see above, the end of Section “Does the ‘Language of Thought’ Exist?”).
Certainly, the role I have proposed above for pride would have begun before creative
problem-solving and continued afterward. However, creative problem-solving, which
transforms the subject’s own mental contents so that they become adequate for solving the
problem, could correlate with the emergence of more decisive innovations.
In humans, the non-unified communicative reception is practically absent. The
addressee who possesses human Theory-of-Mind can not only activate vicarious expectations
but also estimate foreign mental contents. Let’s apply this—if only to close the argument—
to self-conscious emotions. I have accepted that, for communication to cause self-conscious
emotions, the recipient must estimate the interiority (the emotional mental content) of the
producer, i.e., a foreign interiority that is communicating with him, the recipient.35 But if
the recipient’s ability to estimate foreign interiority is reduced to the activation of vicarious
expectations, then, that ability—I repeat—will not be able to apply to a foreign interiority
which is at that very moment communicating with the recipient as with a distal individual.
In conclusion, self-conscious emotions (1) support the ‘cultural’ and ‘social’ features
of the new, human lifestyle, and (2) are linked to one of its most basic and crucial
features, namely, the new, advanced type of communicative reception. In the Introduction
(when I focused on the question, ‘What is “the new, human lifestyle”?’), it was stressed that
this lifestyle needed increasing communication. But now we can say that prior to that
quantitative increase, the new lifestyle needed a deep change in communicative reception.
6. The Human Theory-of-Mind Beyond Its Origin
‘Thinking of foreign mental states that involve us as their distal addressees’ is,
in my view, a requirement only for the very origin of the human Theory-of-Mind. In
fact, I propose that, once the ability to think ‘two lines’ of content becomes strong, this
Theory-of-Mind can carry complex functions that do not fulfill that requirement. Such
complex functions are varied.
Sometimes they use foreign but non-interactive contents, as in verbal false-belief tests,
which involve “a non-dialogic capacity of mind-reading” (Dor,2016) in relation to the
believer. Note that in those verbal tests, the communicative interaction, instead of being
between the subject who attributes the mental content and the ‘attributee’, is reduced to
that which is established between the child and experimenter. Regarding this feature of
verbal tests of false belief, Gallagher (2015) states that “given the specific attraction of the
second-person interaction (vs. third-person perspective), the saliency of the interaction with
the experimenter takes precedence over the third-person task”. Elaborating that contrast,
Barone and Gomila (2019) conclude that second-person attributions of false belief (unlike
third-person attributions—for example ‘The Ancients believed that p’) “are transparent,
extensional, non-propositional and implicit”.
By way of a parenthetical digression, I will comment on first-person beliefs. Regarding
current first-person beliefs, if it is required that they possess the meaning of ‘believe’ that
is habitually activated in second- or third-person attributions (‘He, mistakenly, believes
that p’ vs. ‘he knows that p’), then we must say that originally, such first-person beliefs
did not exist. In the beginning, for human subjects, their non-outdated beliefs are just the
reality (and—in the beginning, again—their outdated beliefs are immediately replaced in
an automatic way by the new perceptions, and so, the origin of the predicative negation
was probably not intrapersonal but interpersonal). In short, the ‘believer’ cannot have
first-person beliefs in the above-described sense, but only ‘knowledge’: On this point, I agree
with J. Phillips et al. (2020) (at least, for a primitive, prelinguistic sense of ‘knowledge’, as
Rakoczy & Proft, 2022 specify). The concept of belief (and of some traits of character:
remember what Ross, 1977 called ‘fundamental attribution error’) emerged, I suggest, in
an interpersonal way. In my view, the so-called ‘animal meta-cognition in great apes’
(summarized in M. Tomasello, 2022; see also Tomonaga et al., 2023) is not a judgment on
one’s own contents, but a mere hesitation about one’s own general expectations, or (as
Edwards-Lowe et al., 2024, preprint say) “subpersonal uncertainty estimates”.
Thus (according to this added, parenthetical sub-proposal) the intrapersonal
meta-cognition or intrapersonal ‘cognitive humility’ (i.e., a cognitive humility not primarily
understood as “moral interpersonal virtue” à la Priest, 2017, or “as reputation
management” à la Karabegović & Mercier, 2023) would be a very late human ability. I agree with
Li (2023) that it is both interpersonally originated (since the subject during a dialogue
sometimes grasps that the knowledge of the other is more complete than his) and very
necessary. Such cognitive humility is necessary perhaps because (see the suggestion at the
end of Section “Does the ‘Language of Thought’ Exist?”) it is required by the
transformation that any creative problem-solving involves, i.e., by the process of transforming our
initially inadequate resource (i.e., our incomplete or incorrect mental content of reality) into
one capable of achieving the solution. That type of humility (that, so to speak,
‘culmination/intrapersonalisation’ of Theory-of-Mind) may be enhanced by the least social (and
ontogenetically the latest) type of laughter, namely, the laughter caused (e.g., after a
punchline) by one’s own interpretive failure. In fact, all kinds of laughter are caused by
failures or deficiencies in some expectation—either general, vicarious, or narrative—of the
subject.36
Once the digression is over, let us return to “second-person attributions”. According
to my proposal, this type of attribution is included within ‘the advanced (or uniquely
human) Theory-of-Mind’. However, I fully accept its great simplicity. (As said above in
Section “Does the ‘Language of Thought’ Exist?”, even pre-syntactic ‘requests for a certain
object’ or ‘calls to a certain individual’ could reveal the speaker’s false beliefs to the listener:
Therefore, those easy, second-person attributions of mental contents could provoke the
origin of syntax). Needless to say, what I have just said is entirely compatible with the fact
that second- and third-person attributions of mental content can become very complex.
Other times, non-original functions of the human Theory-of-Mind are not only
non-dialogical. Indeed, these functions can even connect with non-foreign content. These
contents (not far from ‘mental time travel’) are either the subject’s beliefs/perceptions
which he no longer holds or ‘possible’ contents, in any of the senses of ‘possible’.
However, according to my proposal, the human Theory-of-Mind originally arose
from a directly relational, interpersonal process, which requires neither language nor
experience with narratives. In my view, the linguistic modeling of Theory-of-Mind
(C. M. Heyes and Frith, 2014, and R. Moore, 2020) is a much later step, which requires
new linguistic discoveries. Among those new linguistic discoveries, it is worth highlighting
above all others the irreducibly hypotactic ‘referred speech’, and the verbs ‘say’, ‘believe’,
or ‘imagine’ (see Bejarano, 2011, Chapter 21).37 A later and highly decisive discovery was
literacy or ‘the externalization of memory’, as Merlin Donald called it.38 But leaving all
these late human advances aside, I return to the core of the proposal.
The original ‘estimation of foreign mental contents’ is what cognitive archeologists
recommend looking for, namely, a “component attribute” (vs. ‘compound concept’): See
Foley and Mirazón (2020). Likewise, my proposal on the origin of the human
Theory-of-Mind fits with the suggestion that “a priority for future research is to identify the genetic
‘start-up kit’ for the cultural inheritance of mind-reading” (Uta Frith, cited with approval by
C. M. Heyes & Frith, 2014; my emphasis).39 In my view, the rejection of ‘innate universal
grammar’ or of ‘innate mentalese’ (a rejection that I obviously share) should not prevent
us from proposing this genetic ‘start-up kit’, and searching for it with the current resources
of Genomics.
Here it is necessary to mention the subject of autism. (It was Reviewer 2 who pointed
out this crucial issue to me, which I had omitted). For quite a few years now (especially
since Happé, 1993), autism has been related to the Theory-of-Mind, and this
relationship is very interesting with a view to finding the “genetic ‘start-up kit’ for the
cultural inheritance of mind-reading”. Note that it is much easier to find the genetic basis
for a rare disease than for a universal trait.
7. The Advanced Reception of Pointing
In children’s acquisition of language, pointing gestures are important
(Southgate et al., 2007; Kishimoto et al., 2007). Since the child’s pointing gestures may often provoke linguistic
comments from the adult about the signaled object, it is evident that those gestures create
the ideal context for learning words. Note that, although the words that appear in the
adult’s comments may be unknown to the child, the child can rely on the trick of knowing which
object such comments refer to. But in the evolutionary origin of language, I propose,
pointing gestures may have been even more important.
This section, although it adds new arguments and data, repeats the
hypothesis applied above. More concretely, in Section 5.2, I applied it to the reception of
communications that cause self-conscious emotions, and now I apply it to the reception of pointing
gestures. However, I have considered it appropriate to delay dealing with pointing
gestures, since, while self-conscious emotions are almost unanimously considered uniquely
human, regarding pointing gestures, however, things are very different.
In addition, at the end of this section, I return to ‘the cooperative eye hypothesis’
(M. Tomasello et al., 2007, built on Kobayashi & Kohshima, 2001). Certainly, my proposal
will put the evolutionary transition (i.e., my proposed transition to the human, unified
reception of pointing) precisely in the process that unifies the two gazes—or, in other
words, extends the communicative function of ‘the gaze towards the addressee’ to ‘the
gaze towards the object’: Therefore, it fits well with the fact that human eyes make the
horizontal traveling of the iris conspicuous. Likewise, such conspicuity is certainly an
embodied resource, like that of the broad intonational pattern that was proposed above
regarding the origin of syntax. However, despite all that, I’m not convinced that the human
type of eye emerged in synchrony with the unified reception of pointing gestures (or, in
other words, with the beginning of the human Theory-of-Mind). In other words, while I am
fully convinced that human eyes are very effective facilitators of the advanced, or ‘unified’,
reception of pointing gestures, I have only a faint hope about that synchrony. Anyway,
since the problem of when the transition occurred is so difficult, I strongly recommend
that researchers in Paleogenomics try to answer the question of when the human-type eye
appeared in evolution. As said above, we should not rule out anything that offers any
possibility of shedding light.40
7.1. Apes and Pointing Gestures
7.1.1. Responding to a Possible Objection: Pointing in Apes
On the one hand, I have proposed that the advanced Theory-of-Mind is uniquely
human. On the other hand, we know that many chimpanzees raised by humans have been
taught to produce pointing gestures and to understand them (even the declarative type of
pointing: Lyn et al., 2011). What answer can I give to all this?
I will begin by admitting two indisputable facts. One, “human children display
this ability to use communicative cues only after many months of intensive exposure to
cultural environments characterized by frequent referential signaling, both verbally and
nonverbally” (Clark et al., 2019). Two, the absence of pointing is not at all harmful in
“apes’ lifestyle”.
From those statements, some authors conclude that in non-human primates that ability
would be present, although scarcely exercised or developed. See Vasilieva (2019): “Not
only the presence/absence of a trait but whether it manifests in animals to the same degree
as in humans is equally important for our understanding of trait evolution”. The following
example is offered by Heintz and Scott-Phillips (2022): “Human bodies are not especially
well-suited to swing from trees. However, there is no absolute barrier”. In that same line,
Berio and Moore (2023) recommend resuming great ape enculturation studies.
But, according to my proposal, it is only the effective, ‘unified’ reception of pointing
gestures that is uniquely human. Certainly, in this way, I place as a vital criterion a process
that is still unobservable, which may seem like a withdrawal towards “untestability with
scientific methods” (D. Leavens, 2021). However, as can be seen, the proposal relates to
some facts and several potential experiments and research.
7.1.2. Authors Who, When Dealing with Pointing in Apes, Have Focused on Reception
The focus on reception is not new. R. Moore (2013) focuses on the receptive failure
of apes and proposes that “since pointing gestures provide poor evidence for a speaker’s
message, they exceed the pragmatic capacity of apes”. Likewise, Morrison (2020)
emphasizes the ambiguity and necessary disambiguation of pointing gestures. I agree with these
claims. But, in my view, ‘poor evidence for the message’ and ‘poor pragmatic ability’ are
insufficient to explain the frequency of receptive failures in apes.
Lyn and Christopher (2018) list three conditions in which the experimenter may point
and in which reception by apes is successful to different degrees: “(i) Proximal-Proximal: The
choice items are close together and the point is close to the correct item. (ii) Proximal-Distal:
The choice items are close together, but the point is further away. (iii) Distal-Distal: The
choice items are further apart, and the point is therefore necessarily further away”.
According to that work, in Proximal-Proximal and Distal-Distal, point-following can
be achieved by simple mechanisms. However, “in Proximal-Distal, the best predictor of
success is ontogenetically previous human social contact”. I would underline the fact that
it is just in Proximal-Distal where the direction of the head of the producer (that is, the cue
that chimpanzees use to estimate what others can see: M. Tomasello et al., 2007) is unable
to signal the object.
7.1.3. Unlearned Production in Apes
Before focusing on the contrast between the two receptions, it is useful to go over
unlearned production in apes again and in more detail. “Unlearned (i.e.,
with no explicit training whatsoever) captive chimpanzees frequently point to unreachable
foods. These are communicative signals because apes will not reach towards obviously
unreachable food if there is nobody around to see them do it” (D. A. Leavens et al., 2005).
In addition, in those chimpanzees, a repeated gaze alternation between the food and the
experimenter was significantly associated with their pointing gestures.
Since then, Leavens and other authors have asked whether conditions
like those considered decisive in the mentioned observations (cage and benevolent recipient)
also appear in wild chimpanzees. Hobaiter et al. (2014) offer the
following proposal: “Wild chimpanzees experience few physical barriers, but the presence
of a dominant, unrelated chimpanzee monopolizing a particular resource may be a greater
barrier to a young chimpanzee’s access than bars on a cage. To overcome this challenge,
a juvenile’s only resource is another chimpanzee, mainly its mother”. Thus, they found
a case in the jungle that they classified as “possibly deictic”. A possible conclusion: Wild
chimpanzees that use this type of production with their conspecifics can thus achieve (at
least sometimes) their goals.
Nevertheless, for such production to be a useful resource in the wild, it is necessary for
recipients to deliver (at least sometimes) the desired object. Is it possible? Animal altruism
is a controversial matter: see, e.g., Rendall et al. (2009) vs. De Waal (2010). But I do not
rule it out, provided it does not cross the (always narrow) limits of ‘spontaneous altruism’.41
7.2. Reception of Pointing Gestures in Chimpanzees and in Humans
Regarding the reception of pointing gestures in chimpanzees, I begin by highlighting
that they understand the communicative value of gazes toward the addressee. Indeed, “the
sensitivity to being watched is both innate and shared by most vertebrates” (Klein et al., 2009).
Thus, in the species that are able to perform ‘recipient-directed’ communication, recipients
of that gaze understand that they are the addressees of this innate communicative resource.
(But, while in gorillas, eye contact communicates mild threat, in chimpanzees, by contrast,
it is a friendly communicative resource).
However, in the chimpanzee-recipient such communicative value is not applied—
this was proposed above in Section 5.2—to the other element produced by Leavens’ or
Hobaiter’s untrained chimpanzees, that is, to gazes towards the object and to hand/arm
movements. ‘The gaze towards the object and hand/arm movements’ is, for an
ape-addressee, a non-communicative behavior that can sometimes activate vicarious
expectations in him (in the addressee). It is fair to acknowledge how implausible this description
of non-human reception of pointing gestures seems to human intuition. The
producer, both before and after making movements in a certain direction with his arm and
head, communicates with the recipient by means of eye contact. Why would the recipient
not understand that the producer’s movements are communicative, or, in other words, that
the communicative value of eye contact is applied to those movements and gives them a
communicative function? For humans, that unification of the two consecutive instants is
obligatory and unstoppable, I acknowledge. But is such unification present in chimpanzees?
As said above, the cage (D. A. Leavens et al., 2005) or the dominating individual
(Hobaiter et al., 2014) makes the chimpanzee’s gesture non-absurd for conspecifics even if it is not
interpreted as communicative. On the contrary, our human reception of pointing gestures
can be considered closer to that of communicative pantomimes.42 M. Tomasello (2008)
stresses how strange any pantomime can be for a recipient if the gestures involved are not
interpreted as being communicative (“the recipient will see my iconic gestures as some kind
of strangely misplaced instrumental action”43), but he does not say it about our pointing.
However, according to my proposal, in both cases, the same problem arises for apes. As
said above in Section 4.2, vicarious expectations—the only resource that, according to my
proposal, apes have to estimate the interiority of others—cannot involve any action that
is impossible for the subject in which they are activated. Therefore, vicarious
expectations cannot be understood by the subject (that is, by the ape-addressee) as involving
communicative actions directed by the producer to him.
Now, let’s pay attention to the alternation between gazing at the object and gazing
at the addressee. This alternation appears in apes’ and humans’ production of pointing
gestures. In D. A. Leavens et al. (2005) we already read that in those captive but untrained
chimpanzees, the repeated gaze alternation between the food and the experimenter was
significantly associated with their pointing gestures. Even more important (of the utmost
importance, really): Paulus and Fikkert (2013) show that the necessary and sufficient element
for human babies to first understand pointing gestures is not the hand movement (or its
situational/cultural variations; see Cooperrider & Slotta, 2018), but the alternation between the
two gazes. (The movements of the arm/hand/finger would be, therefore, a later strategy
to make more precise the function of the gaze to the object). Thus, we must focus on the
two gazes.
On the one hand, the ‘gaze towards the object’ causes the recipient to estimate what
the producer sees. On the other hand, the ‘gaze towards the recipient’ (a.k.a. ‘eye contact’)
informs the recipient that he is the addressee.44 In addition, inter-brain consequences
of eye contact in humans are increasingly studied. Y. Pan et al. (2020) mainly focus on
teaching. Di Bernardi Luft et al. (2022) stress that “inter-brain synchronization mainly flows
from leader to follower”, and thus, from the producer of pointing gestures to the addressee.
In general, second-person approaches underline eye contact: Cañigueral et al. (2022).
But what must be highlighted is that in our human communicative reception, those
two instants (‘gaze towards the addressee’ and ‘gaze towards the object’) cannot in any
way remain separate, but they must be unified. The addressee has (1) to estimate what the
producer from his place and in his circumstances is looking at, and (2) to understand that
what the producer is looking for by looking at the object is to point at the object for him,
for the addressee. According to my proposal, it is—as the reader already knows—in that
unification where the problem arises for the ape-recipient. Let’s return one more time to the
nuclear subsection (i.e., to Section 4.2). Certainly, vicarious expectations are automatically
processed by the subject as belonging to the observed individual. However, since there
can be no vicarious expectation of the results of an action intrinsically impossible for the
subject, the recipient-subject will be unable to apply to such expectations an interpersonal
communicative function towards himself.
Therefore, the unified, fully effective reception of pointing gestures will only be
possible by the estimation of the mental contents of the producer. Thus, a single capacity
would underlie that reception, the reception of prelinguistic messages that cause self-conscious
emotions, and any linguistic reception, since the latter always implies that the thought involved
comes to the receiver from someone else.45 That ability can be colloquially described
as the one of ‘remaining in your shoes when you look at me’ (a description that highlights
the similarity to a more embodied version—see above, near the end of Section 3.2.3—of
the ability).
A preliminary test of these proposals could investigate in humans whether there
is some relevant neurophysiologic similarity between the interpersonal activation of all
(negative and positive) self-conscious emotions and the unified communicative reception
of pointing gestures. If such similarity is found in the future, then the plausibility of
the general proposal would increase. But it is worth specifying that the proposed
explanations of self-conscious emotions and of the effective, unified reception of pointing
might be evaluated differently by future discoveries.
In other words, in addition to total success and total failure, there are two other
possibilities: partial results. Thus, it might be discovered that, while the proposal about
the advanced reception of pointing can be maintained, the explanation of self-conscious
emotions, however, must be transformed—for example, rejecting their interpersonal origin
and deriving their ontogenetic and evolutionary emergence from ‘an innate core’ of moral
norms. Or, conversely, the result might be that, while the proposal about self-conscious
emotions can be maintained, the effective, non-hazardous reception of points, however,
does not require any process of unification between ‘gaze towards the addressee’ and
‘gaze towards the object’—because, for example, their mere succession might be enough
for full effectiveness to be achieved through “the human pragmatic competence, which
is greater than that of apes” (R. Moore, 2013) or, alternatively, because human beings are
much more inclined to gaze-following (an inclination that might either derive from the
salience of human eyes or connect with a supposedly prior, not subsequent, type of what
Csibra & György, 2006 called “Natural Pedagogy”). Anyway, for now, I bet on my proposal
in the most ambitious way (or rather the most self-reinforcing one: to give a recent example,
see ‘causal-association inferences’ in Currie et al., 2024), that is, applying it to both abilities.
Of course, at the beginning of ‘the new lifestyle’, several behaviors (not very different
from the ones carried out by Leavens’ and Hobaiter’s untrained or wild chimpanzees)
could achieve some degree of reception and could be useful for both producer and recipient.
Let’s consider, for instance, the action of pushing a conspecific until we place him so
that he can see a relevant object. These types of communicative production would have
been multiplied at the beginning of the ‘new, cooperative lifestyle’, without the recipient
grasping the simultaneously mental and communicative nature of the behavior yet. But
this problem finally became accessible to gene-culture coevolution. And so, the effective,
unified reception of pointing gestures appeared, together with the estimation of foreign
contents.46 Now, I will propose that the unified reception of pointing gestures is strongly
facilitated by a small anatomical feature.
7.3. The Human Eye and the Unified Reception of Pointing Gestures
M. Tomasello et al. (2007) (that is, six years after Kobayashi & Kohshima, 2001)
focused on the universally human white sclera, or, more precisely, on both its horizontal
enlargement and its depigmentation and proposed that these human peculiarities enhance
“the visibility of eye-gaze orientation”. But gaze-following, a phylogenetically old ability,
is—an objector might say—carried out without the help of the white-of-eye. Indeed,
M. Tomasello et al. (2007) showed in apes the reliance on head (vs. eyes) in gaze-following.
Likewise, C. Moore (2008) concluded from his experiments that when infants first start to
follow gaze (at that age—note, please—they are still unable to receive pointing gestures),
“they do so on the basis of head direction, not eye direction”.
Despite those possible objections, M. Tomasello et al. (2007), putting ‘the enhancement
of the visibility of eye-orientation’ in the evolutionary context of human special cooperativeness,
hypothesized that humans evolved such unique eye morphology to facilitate joint
attentional and communicative interactions among conspecifics. See also Wolf et al. (2023),
or Yáñez and Gomila (2018), who, after underlining ‘the interactional importance of gazes’,
add: “especially when oneself is the focus of that attention”, i.e., during eye contact. I will
specify this emphasis on cooperation and interaction to connect it with my proposal of the
‘unified’, effective reception of pointing gestures. Let’s start by describing “the enhanced
visibility of eye orientation” in more detail.
Mayhew and Gómez (2015), Perea-García et al. (2019) (but see Mearing & Koops,
2021) and Caspar et al. (2021) have proposed that the chromatic contrast in the human
eye is not unique among ape species. But let’s focus on horizontal elongation. This
feature may have evolved to allow non-arboreal primates to scan their environment widely.
Nevertheless, such elongation together with the universal “totally/bilaterally white sclera”
make the location of the iris conspicuous not only in averted but also in a direct gaze. In
addition, “the eye outline is easier to see in humans (than in apes) irrespective of skin
color” (Kano et al., 2022) and this makes the location of the iris even more conspicuous. See
also Prein et al. (2024, preprint), who conclude that human ‘gaze understanding’ is “based
on the pupil location within the eye”. Thus, human eyes—this is my point—make the
successive locations (that is, the horizontal traveling) of the iris conspicuous.
In this way, the continuity of the two gazes in pointing (or, in other words, the crucial—
remember Paulus and Fikkert (2013)—alternation between gazes) is enhanced. It might be
said that when the producer moves his iris from the ‘gaze towards the object’ to the ‘gaze
towards the recipient’, that movement is perceived by human recipients as if it were injecting
the ‘gaze towards the object’—and, consequently, also the vicarious expectations activated
by recipients—into the ‘gaze towards the addressee’, that is, into the communication. So, the
human eye would lead the human recipient of pointing gestures to unify the two instants—
and, therefore, to estimate the producer’s mental states that, involving himself, i.e., the
recipient, as their distal addressee, are intrinsically impossible for this addressee—and,
therefore again, to estimate ‘foreign mental contents’.
In short, in my view, the human sclera is an anatomical, universal ‘facilitator resource’ of
a mental process—the unified communicative reception of pointing gestures, of course—in
the addressee. It is also a strong ‘facilitator resource’. These qualifications could maybe
raise the suspicion (1) that the ‘unified’ communicative reception of pointing gestures
was the evolutionary first function of the ability to estimate foreign mental contents, and
(2) that this estimation—and the consequent ‘duality of mental contents’—was originally
difficult and demanding. However, such deductions (let us not forget!) would require us to
choose the option of the synchronic or quasi-synchronic emergence between human eyes
and human Theory-of-Mind. (And, as said above, the decisive ‘horizontal elongation of
eyes’ may have emerged much earlier, only to allow non-arboreal primates to scan their
environment widely).
The depigmented sclera could become universal in an evolutionarily very short time,
and therefore (if there was such synchronic emergence) the human sclera could arise in the
same species in which the effective, unified reception of pointing gestures was beginning
to emerge. But did it happen in Sapiens? And if so, did it happen at the very beginning
of our species? Or later?47 Or did it emerge in Neanderthals/Denisovans? This can be
a crucial question. I hope that Paleogenomics and Genomics specialists will answer it
soon. Certainly, the depigmented sclera is a quite simple feature. However, its universality
makes, of course, their task difficult.
If we follow the option—the faint hope, as I said in Section 7.1—that the peculiarity of
human eyes emerged in relative synchrony with human Theory-of-Mind, then we could
propose that this facilitator is an essential basis for any human communicative reception
(i.e., our ability to understand messages as foreign mental states and, simultaneously, as
addressed to ourselves). But such a proposal can accept either that such a basis (such an estimation
of foreign contents) emerged in Sapiens, or that, on the contrary, in Sapiens, only its
derivations emerged (see above, in Section 5.2, the separation between the human reception
of prelinguistic messages, on the one hand, and predicative language, on the other hand,
and see also Section 6), while the estimation itself had emerged in Neanderthals. In short,
that option, in addition to being based on a ‘faint hope’, could predict only that the human
type of eye will not be found in earlier hominins. Therefore, regarding Neanderthals, it
is not strictly falsifiable. This is an extremely unfortunate fact since it is precisely
the Neanderthal genome that is being studied. Anyway (and returning again to Section 7.1,
but now to the recommendation that “we should not rule out anything that can provide us
even a little bit of light”), the question of whether Neanderthals—or even, as suggested in
note 47, our species in its beginning—possessed eyes like ours should be answered. If such
an answer is negative, then it could give us a useful supply of light. But now all this is just
a very faint hope.
I do not want this last paragraph, with its lack of confidence and pessimistic tone, to
mark readers’ final impression of my hopes. Please remember that such a tone has not been
the norm for this article. Indeed, as said above, I’m much more convinced of my general
proposal than of the synchrony between the mentioned emergences.
8. General Outline
(I) Animals do not evoke their goals. Expectation—an empty profile—is enough to
guide their behavior.
(II) The primate hand (which its owner can see, and needs to see, during his grasping action)
gives rise to a first novelty. When a movement is to be executed with the hand,
there is not only kinesthetic and proprioceptive expectation but also a visual expectation.
Thus, the sight of a foreign hand can activate the expectation of the normally concomitant
kinesthetic and proprioceptive sensation. When at the very next moment, this error is
corrected, those kinesthetic and proprioceptive expectations are automatically processed
as belonging to the individual whose manual movement was observed by the subject.
Vicarious expectations have appeared.
(III) Vicarious expectation can, perhaps only in great apes, extend beyond the hand. In
this more complex vicarious expectation the subject, having established a correspondence
between his own body (felt but not seen) and the other’s body (seen but not felt), gains the
highly adaptive ability to activate vicarious expectations about what the other sees from
his position and orientation, even though at that moment the subject does not have access
to such a visual field. (Of course, such evolved vicarious expectations will only be possible
if the subject knows the area very well and has often been in the place where the observed
individual is now).
(IV) All vicarious expectations, both original and other, remain what all expectations
in general are, that is, empty profiles.
(V) But with the human way of life, new skills become necessary, which vicarious
expectations are unable to sustain. It is now necessary that communicative messages,
though still prelinguistic, be understood by the recipient simultaneously as mental states
of the producer and as addressed to him, to the recipient. It is communicative reception,
not production, that originally required a great change. (In the communicative production
of the great apes, a merely communicative use had already been given to behavior and
movements that were not originally communicative). Apes can estimate the mental states of
another individual, since, as already said above, a particular type among all the expectations
they activate in themselves, i.e., vicarious expectations, are automatically processed as
those of the other individual. But such an incipient Theory-of-Mind is not sufficient now.
In human communication, the recipient has to grasp a thought that could never be his own
under any circumstances, and could never, consequently, be an expectation of his: Note that
the content that is thought by the communicative producer necessarily includes the feature
of being addressed to him, the recipient, as to a distal individual. Human estimation of the
mind of others must therefore capture (full) contents and not mere (empty) expectations.
(VI) Part of this human communicative reception was applied to understanding
messages that would trigger self-conscious emotions in the recipient. These very particular
emotions emerged in large groups, that is, among individuals who were not permanently
together, and where the behavior of one could surprise another. (Hominids who lived in
small groups probably evolved only in another direction, that is, in the direction of greater
social cohesion and greater spontaneous altruism). In addition to the well-known role of
social control played by the three self-conscious emotions that are unpleasant, I want to
emphasize that the pleasant one—i.e., pride—could lead to ‘improvements in the group
culture’ by an individual.
(VII) With the emergence of this uniquely human type of communicative reception,
communication becomes much more useful, and many more meanings are created. The
first meanings in human communication had nothing to do with our semantics since this is
intrinsically shaped by syntax. The first meanings were only calls to someone in particular
or requests for something specific, and they could not have any other intonation than that
of a request or call. The message was made up of only one of these pre-words.
(VIII) But these primitive messages, despite their limitations, were capable of designating
concrete realities such as an individual or an object. And this, together with the already
acquired ability to capture other people’s mental contents (in this case, the previous
speaker’s false beliefs about the nearby presence of the individual called, or about the
availability of the object requested), soon gave rise to syntax. Note that syntax is only needed in
language with a predicative function, and this communicative function seeks—except in
lies, of course—to correct, complete, or update the mental content of the addressee. (Of
course, this syntax was pre-grammatical, that is, ‘theme, rheme’, and probably remained so
for a long time. Complex grammatical devices, such as subordination or deictics converted
into anaphoras, originated only with ‘reported speech’ or with long interventions by a
single speaker).
(IX) But let us stop focusing only on the evolution of language. The great transition,
the cerebral change that the new communicative reception entailed, had effects beyond
communication. If humans can simultaneously think about their own mental content and
the mental content of others, they will also be able to evoke, as (full) mental content, their
past perceptions, or possible future perceptions. One key to the difficulty is in all cases
the same. The brain has to prevent any content other than ‘its own at the moment’ from
directing its behavior. This difficulty had no precedent. Note that dreams present
two differences that remove all difficulty: one, the dream situation is the only one that the
subject pays attention to, and two, there is motor paralysis (except in sleepwalkers).
(X) Creative problem-solving also had to do with the great transition. To try to
connect the two, we have to go back to language. Creative problem-solving consists of
the transformation of our mental contents, which initially seem inadequate to achieve the
solution, into ones that do solve the problem. This is, of course, much more difficult and,
both in evolution and development, much later than the predicative communication. But
in predicative communication, in the mere ‘theme, rheme’, there is also a transformation
of an inadequate element (the false belief of the addressee, which is, of course, the only
thing the addressee can grasp in the theme) into one that is adequate to communicate to
the listener what the speaker judges to be the reality of the matter. The difference lies in the
fact that the operation in creative problem-solving is intrapersonal, not interpersonal. But
that difference, that enormous distance, could be bridged during human genetic-cultural
co-evolution. Thus one should distinguish between cultural innovations that occurred
only through pride and other, later and more crucial ones, that were based not only on
pride but also on creative problem-solving. (The connections accumulated by any fully
linguistic individual throughout not only his years of language acquisition but throughout
his entire life would facilitate the search for a way to transform initially inadequate content
and achieve problem resolution. One might suspect that such connections are stored
not only in language but among the resources an individual learns in music, painting,
science, and other areas). But in the linguistic area we might perhaps find a slightly less
distant precedent for creative problem-solving than predicative syntax: Note that in partial
interrogations the speaker has to communicate what he does not know.
Before moving on to another section, I feel it is necessary to say something about the
strong charge of speculation that there is in these views on evolution. At the end of the
article, I will return to the speculative character more generally. But I will deal with it here
as well, as a grateful response to Reviewer 4.
I will now focus, then, on what some authors have said about proposals on evolution.
Lotem et al. (2017) point out that “the typical reductionist appeal to parsimony—that is,
Morgan’s Canon—is somewhat misleading in evolutionary contexts and time scales, where
changes are actually to be expected”. Thus, the only thing that is then indispensable is that
the new proposals—that is, the ‘non-parsimonious’ ones, according to Morgan—follow the
slow pace of evolution (van Woerkum & Barrett, 2024). But the ‘search for this balance’ is
risky and, to a greater or lesser degree, speculative: Needless to say, the set of canons, Dicta
(Buckner, 2013), and general truths does not give us concrete solutions. However, in my
view, that search can be a useful task, as long as we recognize that more research is needed
to verify if the hypotheses are on the right path or not.
9. Summarizing, and Looking Towards the Future
This article has hypothesized that the contrast ‘vicarious expectations vs. foreign
mental contents’ is a genomic, brain novelty that appeared in gene-culture coevolution.
Therefore, I have also made another proposal, namely, that such novelty was required by
‘the new, human lifestyle’, which was increasingly technological (humans are ‘obligatory’
users and producers of tools) and cooperative (with a way of cooperating that is based
on a particular type of communication). More concretely, in the origin of this lifestyle,
two extremely important abilities (self-conscious emotions and, more basically, the new
communicative reception of even prelinguistic messages) required, according to my proposal,
the ability to estimate foreign contents. The key to my argument has been that only in
human communication does the addressee have to think foreign (i.e., others’) thoughts as mental
states addressed to him. As the reader already knows, my hypothesis is above all dialogical,
and of course, also embodied and deeply embedded in evolution.48
I have proposed that the human Theory-of-Mind and human (even prelinguistic)
communication are inextricably linked. Or more precisely: On the one hand, the set of
those two abilities and, on the other hand (and more initially), the new lifestyle, feed off
each other in a growing spiral. Therefore, while there is absolutely no suggestion on my
part that all uniquely human capacities evolutionarily arose at the same time, I maintain
that one of them—namely, the estimation of foreign contents, and not only of vicarious
expectations—underlies the rest.49
The contrast ‘extinct species of Homo vs. us’, if it finally becomes an area of Comparative
Neuroscience, might especially fulfill the promise to help us to ‘know ourselves’,
as classical philosophy wanted.50 Such a result could perhaps be achieved with the help
of Genomics/Paleogenomics, as said above. Also with “the use of evolution to identify
meaningful categories of mental activity” (Cisek, 2019, 2021, which apply this resource
to animals). However, the use of evolution (or rather, gene-culture coevolution) is also
necessary to identify categories of human mental activity. In other words, the nuclear
categories of human mental activity will be more easily found the more we seek their link
with the emergence of the human lifestyle.
Returning to the nuclear proposal, this article has not offered any new empirical
results. However, the main proposal and each sub-proposal raise questions. Let us mention
some of those questions. My view of expectations? Apes’ vicarious expectations? The
anti-intuitive ‘non-unified reception of pointing’ in chimpanzees? Interpersonal origin of
syntax and syntactic semantics? Is there genuine metacognition in great apes, or, on the
contrary, only ‘subpersonal uncertainty estimates’? These questions can lead to different
experiments and research in Neuroscience or Genetics, whose results will have an impact
on my proposal, in one way or another. But I have already dealt with this above.
Therefore, I will add only a more personal comment. I am looking forward to those
results that can make my hypothesis testable. Even if those results discarded my proposals,
I would feel that my effort has been useful: Obviously, the hypotheses are most useful
when they point out a correct path, but if an apparent road leads nowhere, then the
task of promoting its testability is also a service to the community. In short, I ardently
wish that these tests are conducted. However, since such empirical research is out of my
reach, I can only request them. This is what this article would want to do now and in the
medium-term future.
Funding: This research received no external funding.
Institutional Review Board Statement: Not applicable.
Conflicts of Interest: The author declares no conflicts of interest.
Notes
1
Even if population size and connectivity have been strong drivers of the cultural advances and also (mainly in the African
Middle Stone Age) of cultural losses (Scerri & Will, 2023).
2
However, I agree that apes’ ability in those tests is related to “affective empathy” (Lurz et al., 2022). Or, in my words (Bejarano,
2022), ‘vicarious expectations’ are related to ‘spontaneous altruism’.
3
However, the methodological, more particular matter of the violation-of-expectation paradigm (see the general review by Margoni
et al., 2023) will not be discussed here.
4
Nowadays it is known that unexpected events can only be connected to superficial layers of the visual primary area, while
expected events are also connected to the deeper levels of that area (Thomas et al., 2024), and, thus, it is possible to suspect
that expectations are coded in the brain in a different format than perceptions. (This publication studied human adults. That does
not conflict at all with my proposal. Humans, although we can evoke absent things, also have our empty expectations).
5
Such communications would already use non-innate resources (based not only on iconicity but, perhaps even more, as Cartmill
et al. (2024, preprint) suggest, on ‘past conditioned associations known by the group’). However, most likely, these cultural
gestures or calls still lacked ‘super-high fidelity’ transmission (which supports articulatory-phonetic imitation). In addition, let’s
note that in the reception of these messages, the principle of “Teleology, first” in Theory-of-Mind (Perner et al., 2018) was, of
course, obeyed. We could even suppose that such type of individual message attempted, firstly, to become more and more
choral to, finally, influence group behavior: In other words, it would not be ‘dialogic’. All these features would place this type of
message far from even prelinguistic human communication. Despite this, such messages would go beyond empty expectations
of goals.
6
“The first words ever spoken is a key issue for the research in the evolution of language” (Gasparri, 2023). I agree with the
importance of such an issue.
7
Planer (2019) (an article defending languages-of-thought) understands perfectly that “if the brains of many animals instantiate
languages of thought, then we face a serious explanatory challenge. That challenge is to explain how languages-of-thought might
have evolved”. But I am not persuaded by his explanation.
8
Or, more precisely, without a semantic content either produced simultaneously with the prosodic cue, or immediately previous in
a dialogue—I add. This second type can be produced with a minimal articulation originally empty of meaning (e.g., the ‘huh?’ of
Dingemanse et al., 2013).
9
“In human infants, shoulder movements, controlled by ipsilateral motor pathways from the right hemisphere, precede the
left-hemisphere control of the right hand” (Rönnqvist, 2003) and also of culturally learned motor sequences. Nowadays it is
also known that in humans, certain muscles that are mainly associated with shoulder movement (and, therefore, also with the
expressive gestures that involve arm movement) are likely to interact with the voice (Pouw et al., 2023). Thus, the superiority
of arm-gestures over vocal resources that is observed in intentionally addressed communications of non-human primates, that
indisputable (even if relative, Lameira et al., 2024) superiority, could perhaps be conserved in multimodal communication of
human infants as the anteriority of arm-gestures—less complex than hand-movements—over cultural vocal learning. If that were
so, then we could suspect that such anteriority, interacting with the voice, caused the new, broader intonational unit, and, in
this way, paradoxically ended up giving rise to the mentioned ‘victory of voice on gestural communication’. We must take into
account that “in apes, communicative gestures, unlike manipulative movements, are controlled by areas that in the human brain
are responsible for human language”: Becker et al. (2021), Becker et al. (2022), Meguerditchian et al. (2011). In short, I wonder if
the following similarity has a basis in the ontogenesis and phylogenesis of our brain: Culturally learned movements of the right
hand (controlled, of course, by the left hemisphere) are embedded in a previous, simpler arm movement (right hemisphere), and,
similarly, culturally learned articulatory-phonetic signifiers (left hemisphere) are embedded in an intonational pattern (perhaps
right hemisphere: Gainotti (2024) again vindicates the recently challenged “graded, right-hemisphere dominance for emotions”).
10
Bejarano (2011, p. 126) underlined three points: First, “the imitation of complex motor patterns (or ‘kinetic melodies’, as Luria
calls them) which are new to the subject requires, according to Piaget (1954), that there to have been, during the observation of
the model, a latent imitation”. Second, “since ‘any body representation which is used for action continuously tracks the positions
of our body parts as we move’ (Haggard & Wolpert, 2005), why would the same thing not occur in a motor sequence that occurs
latently?” Third, “we can conclude that the latent learning of kinetic melodies requires that any step other than the first in the
sequence be imitated through fictionalization of the posture derived from the unexecuted previous motor step”. Certainly, this is
too narrow a framework to support the difficulty of learning those sequences: It is clearly a far cry from Lind’s extensive and
updated argumentation (see its most recent summary in Lind & Jon-And, 2024) against sequence learning by animals. However,
my old framework led me to suspect that super-high fidelity imitation was a fairly late skill.
11
So, I am wondering about the possibility that the early language did not depend on ‘super-high fidelity copying’. Planer et al.
(2024, preprint) focus on a similar puzzle, “an early language previous to know-how copying”, although these authors perhaps
do not sufficiently emphasize the difference between ‘know-how copying’ and vocal ‘super-high fidelity copying’, and thus
they choose as a solution to the puzzle the idea of a merely gestural-iconic origin of early language. For my part, I prefer to
suggest the possibility that the vocal component of the syntactic multimodal language did not originally involve our current
imitation of articulatory-phonetic sequences. Note, please, that the delay in the appearance of articulatory-phonetic sequences is
a reliable fact in the first manifestations of writing. Could the same thing have happened in oral language? This suggestion,
already put forward by Hockett (1960), has been defended by Fleming (2017), in the context of studying the ‘clicks’ of South
African languages.
12
That article shows that chimpanzees used ‘know-how social learning’ (from a chimpanzee that experimenters had taught) to acquire
a skill they failed to innovate. Thus, we can think that if wild chimpanzees use such type of learning only very infrequently, it is
because they don’t produce complex innovations.
13
Certainly, recent research—Steven et al. (2022)—points to perspective-taking as a flexible and context-specific suite of abilities.
However, here we can continue with Flavell’s dichotomy.
14
If this hypothesis turns out correct, then we could deduce that the so-called ‘audio-motor mirror neurons of birds’ cannot be
mirror neurons. Note that, while learning the song-dialect, the bird does not sing yet. Therefore, the externally perceived dialect
(that is, the dialectal enrichment of the innate template) is stored without any connection with proprioceptive expectations. Thus,
if the proposal of Keysers & Perrett is accepted, the research about ‘the mirroring’ would have to refocus on primates, without this
implying any undervaluing of ‘analogous similarities’ (underlined, for instance, by De Waal & Ferrari, 2010).
15
But, beyond that compatibility, the contrast shown by Schüler et al. (2024) puts a very interesting need at the center of the scene.
The human Theory-of-Mind (which will be fully deployed in Section 6) must prevent all those internal, perceptually decoupled
representations from influencing our behavior. Such prevention—I add—is a much more difficult task than the one required
in nightmares, for example. While in this latter case, there is only one line of mental content (nightmare situations), in the
human Theory-of-Mind, however, there are ‘two lines’ of content, and, therefore, in the default network (in this peculiar, human
‘resting-state’) the prevention must be much more subtle and complex than mere muscle paralysis.
16
In Bejarano (2022) I have focused on that second type, and differentiated it from both spontaneous altruism and caring for one’s
own reputation. The proposal of that article is that, while the ‘(ultimately perceptual) estimation of foreign mental contents’ is an
adaptively very advantageous resource in human lifestyle, it nevertheless caused the two typical features of perceptions (one,
that of informing about the surroundings, i.e., of being true, and the other, that of being useful to the subject’s interests) to become,
for the first time in evolution, dissociated from each other. Thus, the perception of foreign mental contents—which include, of
course, another individual’s needs and interests—is the basis of a demanding moral capacity that, not being adaptive either for
the entire group (as spontaneous altruism is) or for the individual (as happens with the care of one’s own reputation), is—almost
paradoxically and therefore more wonderfully—built by evolution. More concretely, while in Joyce (2007) or Wilkins and Griffiths
(2013) (see also Levy & Weinshtock-Saadon, 2023) moral beliefs are “debunked” by evolution (since they, having been selected for
adaptive reasons, are epistemically suspect), I propose how a base (however poor and weak) for that most demanding moral
capacity could really arise in evolution. From the recommendations made to me by Reviewer 1, it can be inferred that he/she
asks me to pay more attention to soul-body dualism. In my view, a key core of that issue is whether outside of the acceptance
of such dualism there is still a base for the capacity to choose between egoism (which intervenes even in the refined self-control
that cares for one’s reputation) and truth, and that is precisely the question—“Could a base (however poor and weak) for this
capacity arise in evolution?”—that Bejarano (2022) attempts to answer.
17
Thornton and Tamir (2024) (who use the term ‘affordances’) may perhaps make us see the very different ways in which general
expectations are activated in humans.
18
Corballis (2000) and Corballis (2001) claimed that we interpret the ‘images in the mirror’ as the left-right reversal of the original
objects, and that, while a reflection’s reversal is a product of optics, such particular interpretation comes from neuroscience. This
link with neuroscience could be lengthened: The sudden acknowledgment of standing before a mirror and not before a peer (/or
conversely, the sudden acknowledgment of standing before a peer and not before a mirror) inhibits (/activates) the mentioned
high-level resource.
19
L. Lewis and Krupenye (2022), for example, underline apes’ competitive motivation. About infants’ motivation, see an interesting
proposal in Woo et al. (2022) and Woo and Spelke (2022), who apply to this question (infants’ estimation of others’ false belief) an
idea relatively similar to the link between “look for cheaters” and reasoning (Cheng & Holyoak, 1985, or Cosmides, 1989). In short,
the mentioned proposal underlines that, since in some contexts “the estimation of others’ false beliefs may facilitate the ability to
morally evaluate others’ actions”, such estimation is an adaptive task even in toddlers. But, according to my hypothesis, even if
that interesting proposal is eventually discarded, children’s curiosity about the interiority of others would still be
extremely adaptive.
20
Any mammal or bird has expectations about the behavior of animals that are vastly different from him. But those are general,
non-vicarious expectations.
21
Thus, it is not surprising that, for example, pride, when it is compared to joy, involves what Bornstein et al. (2023) call “a relatively
more distant perspective”.
22
We could also remember Baader’s anti-Cartesian formulation (“Cogitor, ergo sum”), even if Baader (who lived from 1765 to 1841)
interpreted it “more theologically than interpersonally” (Geldhof, 2005). I would reformulate it in the following way: ‘If I grasp
foreign (i.e., others’) thoughts that involve me, I am human’.
23
Baumard et al. (2013) propose: “The best care of reputation (the most adaptively advantageous one, since the error of mistakenly
assuming that no one is paying attention to a blatantly selfish action may compromise an agent’s reputation) is the genuinely
moral habit”. This, of course, is also proposed by many other authors, for example, Boileau (“Pour paraître honnête homme, il
faut l’être”). I shall not comment on such a proposal here, but see Bejarano (2022).
24
This more intense care could relate to what, on a higher, later level, Di Francesco et al. (2021) said: “People’s self-defining life
stories have an intrinsically defensive nature; the description-narration of one’s own inner life is organized on the basis of the
fundamental need to construct and defend a self-image endowed with an at least minimal solidity”.
25
According to my option, pride originally arose interpersonally: The “hubristic, narcissist pride” that is mentioned by Tracy et al.
(2024) would have been a late (“evolved”) intrapersonal derivation.
26
As said above, while none of the earliest technological abilities implied high-fidelity transmission, this type of transmission
not only supported later technologies, but also what I called (in Section “Does the ‘Language of Thought’ Exist?”) the set of
all ‘super-high fidelity copying’—the articulatory-phonetic copying, and the learning of songs or dances. (Obviously, in these
skillful tasks the conscious activity of memorizing and copying the model gives way, after multiple repetitions, to subconsciously
memorized actions, and this allows attention to be focused on a higher level).
27
The underlining of pride is also useful to prevent the concept of self-control from being incorrectly narrowed. See Bermúdez
et al. (2024): “Apathy is a normally overlooked kind of self-control problem. However, compared to negative self-control (i.e.,
self-control against temptations), which relies more on situational strategies, positive self-control requires more intrapsychic work
to get motivation (my emphasis)”.
28
‘Self-control’ (Shilton et al., 2020)? Or ‘self-domestication’ (Benítez-Burraco & Nikolsky, 2023, to choose a recent example)? I can
only say that the connotations of the term ‘self-domestication’ (even if this is very different from ‘submission’—the evolutionary
precedent of shame, according to Maibom, 2010) are less suitable for a capacity that, “even when it takes us to meekness, means
the strength and power to use one’s energy” for one’s previously chosen purposes: Roszak (2022). (This author, instead of
“self-control”, uses the traditionally moral term “fortitude”. But I cannot adopt such a use, since in my view—Bejarano (2022)—,
self-control is not necessarily moral).
29
Could Bryant et al. (2024) reinforce that claim? They state: “Our findings support a two-step evolutionary process, in which
changes in prefrontal cortex organization emerge prior to changes in temporal areas”.
30
Certainly, I’m not really proposing these examples, but just putting them here to facilitate the exposition. However, I want to
mention Breil et al. (2022), who investigated the unified reception (in their words, “the early temporal integration”) of gaze and
emotion cues, and “suggest a processing benefit when emotional expression (happy/disgusted) and gaze (direct/averted) are
congruent in terms of approach- or avoidance-orientation”.
31
Remember that, much later in development, also our current narrative speech uses gestural ‘theatricalization’ (whose effects
Rühlemann & Trujillo, 2024, have studied in detail) and affective prosody. Likewise, ‘symbolic play’—or ‘pretense’—might train
this ‘intentional control and use’ of behavioral and even ‘autonomic’ levels.
32
This capacity of recognition is so adaptive that ‘the possibility of false positives’ (i.e., the currently much-discussed ‘overextension
of Theory-of-Mind’; see, e.g., Bering, 2011) doesn’t matter, especially since exercising that capacity makes it stronger. This is a
repetition of what happened at a much earlier point in evolution with the detection of agency.
33
Obviously, there is an easier type of communication that is present in many more animal species: In it, individuals accumulate
evidence through ‘many pairs of eyes’, for example. Thus, “cues and signals from other individuals (e.g., fleeing movements and
alarm calls) reduce uncertainty about predator risk” (Hahn et al., 2024, preprint).
34
Likewise, human infants produce “ostensive gestures with an object” months before making pointing gestures: Rodríguez et al.
(2015) and Guevara et al. (2024).
35
Ontogenetically that estimation is a difficult process, even in its previous requisite: Note that caregivers may naturally express their
emotions in ways that maximize learning possibilities—e.g., “emotionese”: see Benders (2013), or A. Ruba and Repacholi (2020).
36
Thus, the pleasure of laughter (a pleasure not entirely exclusive to humans, but certainly a universally human characteristic)
arose in evolution because it might (I choose this explanation) prevent frequent failures from discouraging primate brains from
making ever more complex expectations. The infant and the chimpanzee know when they are going to be tickled, but they fail to
predict the exact point or the exact instant. Likewise, we laugh when, after activating the vicarious expectation that the observed
individual will sit down, we see him fall over. However, the predictive failure of the continuation of the narrative after the
punchline is not a failure of prediction directly, but one of inadequate and incomplete understanding of the preceding part. Thus
it is only this kind of laughter that fosters the cognitive humility that is necessary for creativity.
37
‘Say’ was even later used in ‘first person + present + affirmative’, an apparently tautological use which came to fulfill a new
function, but still originally related, in my view, to ‘referred speech’. With these uses the speaker communicates that he is aware
of how his speech looks—and could be referred—from the outside. This may have been the ‘interpersonal’ origin of the (later,
more culturally and institutionally supported) ‘performatives’: Let’s compare ‘I say that . . .’ with ‘I swear that . . .’ (which was the
example chosen by Benveniste, 1958/1966).
38
In grateful response to Reviewer 1, I want to add that I highly value Donald (1991) (of which I published in 1996 “Recensión de
Donald, 1991 y 1993”), especially the idea that beyond animal memory (which probably only stores the—so to speak—‘moral of
the story’ of past events, that is, only what may ever be immediately useful), there are three memory transitions (in my view,
supported respectively by non-syntactic multimodal communications, syntactic language, and writing).
39
The appeal to such a ‘genetic start-kit’ is, unsurprisingly, rejected in writings in the behaviorist tradition dealing with Theory-of-
Mind. One such paper is Schlinger (2009) (which was recommended to me by Reviewer 1). As for Schlinger, I, while not accepting
his rejection of the genetic basis of Theory-of-Mind, do share his criticism that (sometimes, I would qualify) “discussions of ToM
focus almost exclusively on inferred cognitive structures and processes and shed little light on the actual behaviors involved”
(See, for example, in Section 5.2, my question about how the experiencer of self-conscious emotions is aware of what others think
of him/her).
40
In the words of Uomini and Ruck (2019) (who exemplify this attitude in their study of the emergence of human handedness):
“The paucity of data is an obstacle in studying cognitive evolution, but this has not stopped researchers from trying”. I love
that “but”.
41
About ‘spontaneous altruism’: See M. Tomasello (2012), Rand et al. (2012), and, especially, “self-other merging” (Miyazono &
Inarimori, 2021) and “goal slippage” (Michael & Székely, 2019). Let us also focus on the unquestionable footprints of caring for
the ill or the wounded that have been found in Neanderthals: At least we cannot doubt “the selective advantages of reducing
the risk of mortality of other group members in small groups whose members are highly interdependent” (Spikins et al., 2019,
my emphasis). Spontaneous altruism is ontogenetically earlier than the motivation to improve one’s reputation by helping: See
Hepach et al. (2022). About the (probably, very primitive) type of spontaneous altruism that, “connected to reactive, non-cognitive
fear circuits, helps others under threat” (for instance, in social hunters): See J. B. Vieira et al. (2020), J. Vieira and Olsson (2022).
42
According to M. Tomasello and Call (2019), “attention-getters, since they manipulate attention of addressees, evolutionarily
precede pointing gestures, while intention-movements, since they manipulate the imagination, precede pantomimes”. I agree
with such a difference, but my interest is now in the similarity of both receptions.
43
See also Bohn et al. (2020), who report that apes do not learn from iconic gestures.
44
When infants first understand pointing in a unified way, do they understand it only when the producer addresses it to them?
Clark (1996) claimed: “The basic arena for social interaction is the dyad”. Certainly, some findings might seem to challenge
that claim. (Thiele et al., 2023, report that “observed joint attention” already modulates 9-month-old infants’ object encoding.
Likewise, according to Goupil et al. (2024), both humans and macaques show spontaneous preference to look at two bodies
facing towards each other). However, those findings do not seem to me to pose that challenge. People’s movements are
always salient stimuli, of course, but, in my view, the ‘ability to capture other people’s mental contents’ is not required in those
experimental situations. Thus, according to my proposal, “the dyad” can be maintained for the very origin of the human mode of
receiving pointing gestures.
45
Bejarano (2011), Chapter 6: My argumentation started by focusing on the reception (see Rubio-Fernandez, 2020) of the most
egocentric deictics (here vs. there; this vs. that; I vs. you), i.e., of the words that the addressee has to understand in a different
way than the way he, the now addressee, uses them when he is the speaker. But I extended it to any linguistic reception.
46
What about dogs? Eye contact—i.e., the communicator making eye contact with the dog—is the major cue that dogs use to
determine when a human pointing is intended for them. (See Kaminski & Nitzschner, 2013; Téglás et al., 2012). However,
Lyn et al. (2024, preprint) may have slightly lowered the initial triumphalism: Since dogs have more difficulty in following
contralateral pointing, these authors suggest that ipsilateral points are learned through associative mechanisms. In general,
Project MANYDOGS will try to replicate previous findings. But it is worth remembering Zuberbühler (2008): “Social carnivores
must decide on one particular prey individual prior to group hunting”. Thus, if the dominant wolf remains for a few moments
looking at—or making some movement towards—a particular prey, this could be an innately communicative signal, which
would pre-activate in the members of the pack a plan of attack in the signaled direction. So, when, shortly after, the wolf-recipient
feels that he is being looked at by the dominant individual, he starts his previously pre-activated attack plan. In this way, dogs
would merely enrich their innate expectation of the first signal, i.e., they would learn to associate their innate expectation with
some other features (hand or finger).
47
This possibility is not at all an absurd suggestion. Firstly, within the lineage of Sapiens and even in dates totally within the
(formerly so-called) ‘anatomically modern humans’, there is a marked evolution in the shape of the cranium: See Neubauer et al.
(2018) (although, at least since 160,000 B.P., these differences with living humans would mainly affect, according to Zollikofer
et al., 2022, the face and cranial base). See also Freidline et al. (2024): “The unique facial growth pattern of Homo sapiens
post-dated the Middle Stone Age”. Secondly, regarding our progressive absence of prominent brow ridges (which were very
prominent in Neanderthals), Godinho et al. (2018) reject the old hypotheses on such absence and suggest “its potential role in
social communication”. (See Siposova et al., 2018, who underline the role of raised and highly mobile eyebrows in “the reception
of communicative looks”. Likewise, Gast (2023) focuses on the link between linguistic prosody and eyebrow movement). In
addition, I ask: Could the chin, whose absence in Neanderthals has been so studied (cf. Meneganzin et al., 2024), strengthen the
gestural, emotional expressivity of the mouth? (Remember Section 5.2 above).
48
‘Embodied’ is a term that I have decided to use, although it does not really make sense in a position (such as mine) that
opposes dualisms, both the body-mind dualism of the cognitive revolution (about this debate, see an excellent summary in
Barrett & Stout, 2024) and the various body-soul dualisms. Indeed, I believe not only that animal consciousness emanates from
the evolved complexity of the animal body, but also that the most spiritual capacities of human beings (see previous note 16) are
the product of the extremely, wonderfully evolved matter that forms our bodies.
49
Regarding such later rest, I would underline: (1) creative (technical, artistic, or scientific) problem-solving, that is, the ability to
transform one’s insufficient mental contents into sufficient ones to solve the problem, and (2) what I called in previous note 16
‘the most demanding moral capacity’.
50
Bejarano (2022): “The current focus on hominids and Neanderthals opens a new door for us which was undreamt of for previous
philosophers and scholars”. Or, much more precisely, Currie et al. (2024): “Philosophical methodology can benefit greatly from
interaction with cognitive paleoanthropology. [. . .] Coherent evolutionary narratives is a means of readmitting synthesis to the
philosophical toolkit”.
References
Algoe, S. B., & Haidt, J. (2009). Witnessing excellence in action: The ‘other-praising’ emotions of elevation, gratitude, and admiration.
The Journal of Positive Psychology,4(2), 105–127. [CrossRef] [PubMed]
Andersson, C., & Tennie, C. (2023). Zooming out the microscope on cumulative cultural evolution: ’Trajectory B’ from animal to human
culture. Humanities and Social Sciences Communications,10, 1–20. [CrossRef]
André, J., Baumard, N., & Boyer, P. (2023). Cultural Evolution from the Producers’ Standpoint. Evolutionary Human Sciences,5, 1–24.
[CrossRef]
Bar, M. (2007). The proactive brain: Using analogies and associations to generate predictions. Trends in Cognitive Sciences,11(7), 280–289.
[CrossRef] [PubMed]
Barone, P., & Gomila, A. (2019). Infants’ performance in the indirect false belief tasks: A second-person interpretation. Cognitive Science,
12(3), e1551. [CrossRef]
Barone, P., Wenzel, L., Proft, M., & Rakoczy, H. (2022). Do young children track other’s beliefs, or merely their perceptual access?
An interactive, anticipatory measure of early theory of mind. Royal Society Open Science,9(10), 211278. Available online:
https://royalsocietypublishing.org/doi/10.1098/rsos.211278 (accessed on 4 November 2024).
Barrett, L., & Stout, D. (2024). Minds in movement: Embodied cognition in the age of artificial intelligence. Philosophical Transactions B,
379, 20230144. [CrossRef]
Baumard, N., André, J., & Sperber, D. (2013). A mutualistic approach to morality. The evolution of fairness by partner choice. Behavioral
and Brain Sciences,36, 59–78. [CrossRef]
Becker, Y., Claidière, N., Margiotoudi, K., Marie, D., Roth, M., Nazarian, B., Anton, J., Coulon, O., & Meguerditchian, A. (2022).
Broca’s cerebral asymmetry reflects gestural communication’s lateralisation in monkeys (Papio anubis). eLife,11. Available online:
https://elifesciences.org/articles/70521 (accessed on 4 November 2024).
Becker, Y., Sein, J., Velly, L., Giacomino, L., Renaud, L., Lacoste, R., Anton, J., Nazarian, B., Berne, C., & Meguerditchian, A. (2021). Early
Left-Planum Temporale Asymmetry in Newborn Monkeys (Papio anubis): A Longitudinal Structural MRI Study at Two Stages of
Development. NeuroImage,227, 117575. [CrossRef] [PubMed]
Bejarano, T. (2008, March 12–15). Pragmatics and theory-of-mind: A problem exportable to the origins of language. Conference ‘Evolang 7’,
Barcelona, Spain. Available online: https://www.worldscientific.com/doi/abs/10.1142/9789812776129_0003 (accessed on 4
November 2024).
Bejarano, T. (2010). Review of Hurford, James, 2007, The origins of meaning. Teorema,29, 157–164. Available online:
http://www.lel.ed.ac.uk/~jim/origins.revu.bejarano.html (accessed on 4 November 2024).
Bejarano, T. (2011). Becoming Human: From pointing gestures to syntax. Benjamins. Available online:
https://benjamins.com/catalog/aicr.81 (accessed on 4 November 2024).
Bejarano, T. (2014). From holophrase to syntax: Intonation and the victory of voice over gesture. HUMANA. MENTE. Journal of
Philosophical Studies,27, 21–37. Available online: https://www.humanamente.eu/index.php/HM/article/view/95 (accessed on
4 November 2024).
Bejarano, T. (2022). The most demanding moral capacity: Could evolution provide any base? Isidorianum,31(2), 91–126. Available online:
https://www.sanisidoro.net/publicaciones/index.php/isidorianum/article/view/Bejarano (accessed on 4 November 2024).
Benders, T. (2013). Mommy is only happy! Dutch mothers’ realisation of speech sounds in infant-directed speech expresses emotion,
not didactic intent. Infant Behavior and Development,36(4), 847–862. [CrossRef]
Benítez-Burraco, A., & Nikolsky, A. (2023). The (Co)evolution of language and music under human self-domestication. Human Nature,
34(2), 229–275. [CrossRef]
Benveniste, E. (1966). De la subjectivité dans le langage. In Problèmes de linguistique générale. Gallimard. Original work published 1958.
Bering, J. (2011). The belief instinct: The psychology of souls, destiny, and the meaning of life. W.W. Norton.
Berio, L., & Moore, R. (2023). Great ape enculturation studies: A neglected resource in cognitive development research. Biology &
Philosophy,38, 1–24. [CrossRef]
Berke, M., Horschler, D., Jara-Ettinger, J., & Santos, L. (2023). Differences between human and non-human primate theory of mind:
Evidence from computational modeling. bioRxiv. [CrossRef]
Bermúdez, J. P., Berthelette, S., Anaya, A., Fernández-Miranda, G., & Téllez, D. R. (2024). Temptation and apathy. Oxford Studies in
Agency and Responsibility Volume 8: Non-Ideal Agency and Responsibility,8, 10.
Bohn, M., Kordt, C., Braun, M., Call, J., & Tomasello, M. (2020). Learning novel skills from iconic gestures: A developmental and
evolutionary perspective. Psychological Science,31(7), 873–880. [CrossRef]
Bohn, M., Liebal, K., Oña, L., & Tessler, M. H. (2022). Great ape communication as contextual social inference: A computational
modelling perspective. Philosophical Transactions of the Royal Society B: Biological Sciences,377, 20210096. [CrossRef] [PubMed]
Bonini, L., Rotunno, C., Arcuri, E., & Gallese, V. (2023). The mirror mechanism: Linking perception and social interaction. Trends in
Cognitive Sciences,27(3), 220–221. [CrossRef]
Bornstein, O., Moran, T., Simchon, A., & Eyal, T. (2023). The effect of psychological distance on the experience of joy versus pride.
Social Cognition,41(4), 341–364. [CrossRef]
Bräten, S. (2004). Hominin Infant Decentration Hypothesis: Mirror neurons system adapted to subserve mother-centered participation.
Behavioral and Brain Sciences,27, 508–509. [CrossRef]
Breil, C., Raettig, T., Pittig, R., van der Wel, R. P. R. D., Welsh, T., & Böckler, A. (2022). Don’t Look at Me Like That: Integration of Gaze
Direction and Facial Expression. Journal of Experimental Psychology. Human Perception and Performance,48, 1083–1098. [CrossRef]
Brinums, M., Franco, C., Kang, J., Suddendorf, T., & Imuta, K. (2023). Driven by emotion: Anticipated feelings motivate children’s
deliberate practice. Cognitive Development,66, 101340. [CrossRef]
Bryant, K., Camilleri, J., Warrington, S., Blazquez Freches, G., Sotiropoulos, S., Jbabdi, S., Eickhoff, S., & Mars, R. (2024). Connectivity
profile and function of uniquely human cortical areas. bioRxiv, 2024-06. [CrossRef]
Buckner, C. (2013). Morgan’s canon, meet Hume’s dictum: Avoiding anthropofabulation in cross-species comparisons. Biology &
Philosophy,28, 853–871. [CrossRef]
Bugnyar, T., & Heinrich, B. (2005). Ravens differentiate between knowledgeable and ignorant competitors. Proceedings of the Royal
Society B,272, 1641–1646. [CrossRef] [PubMed]
Bugnyar, T., Reber, S. A., & Buckner, C. (2016). Ravens attribute visual access to unseen competitors. Nature Communications,7(1), 10506.
Available online: https://www.nature.com/articles/ncomms10506 (accessed on 4 November 2024). [CrossRef] [PubMed]
Cañigueral, R., Krishnan-Barman, S., & Hamilton, A. F. d. C. (2022). Social signalling as a framework for second-person neuroscience.
Psychonomic Bulletin & Review,29, 2083–2095. [CrossRef]
Cartmill, E., Cartmill, M., Brown, K., & Foster, J. (2024). Which came first—Iconicity or symbolism? Evolang XV. Available online:
https://evolang2024.github.io/proceedings/schedule.html (accessed on 4 November 2024).
Caspar, K. R., Biggemann, M., Geissmann, T., & Begall, S. (2021). Ocular pigmentation in humans, great apes, and gibbons is not
suggestive of communicative functions. Scientific Reports,11(1), 12994. [CrossRef]
Castro, L., & Toro, M. A. (2004). The evolution of culture: From primate social learning to human culture. Proceedings of the National
Academy of Sciences USA,101, 10235–10240. [CrossRef]
Castro, L., Castro-Nogueira, M. Á., & Toro, M. Á. (2024). Teaching and the origin of the normativity. Biology & Philosophy,39, 23.
[CrossRef]
Chellappoo, A. (2021). Rethinking Prestige Bias. Synthese,198, 8191–8212. [CrossRef]
Cheng, P. W., & Holyoak, K. J. (1985). Pragmatic reasoning schemas. Cognitive Psychology,17, 391–416. [CrossRef] [PubMed]
Cisek, P. (2019). Resynthesizing behavior through phylogenetic refinement. Attention, Perception & Psychophysics,81(7), 2265–2287.
[CrossRef]
Cisek, P. (2021). Evolution of behavioural control from chordates to primates. Philosophical Transactions of the Royal Society B: Biological
Sciences,377, 20200522. [CrossRef] [PubMed]
Clark, H. (1996). Using language. Cambridge U. P.
Clark, H., Elsherif, M. M., & Leavens, D. A. (2019). Ontogeny versus phylogeny in primate/canid comparisons: A metaanalysis of the
object choice task. Neuroscience and Biobehavioral Reviews,105, 178–189. [CrossRef] [PubMed]
Clements, W. A., & Perner, J. (1994). Implicit understanding of belief. Cognitive Development,9(4), 377–395. [CrossRef]
Coolidge, F. (2023). Parietal lobe expansion, its consequences for working memory, and the evolution of modern thinking. Cognitive
Archaeology, Body Cognition, and the Evolution of Visuospatial Perception,2023, 181–194. [CrossRef]
Cooperrider, K., & Slotta, J. (2018). The preference for pointing with the hand is not universal. Cognitive Science,42(1), 1375–1390.
[CrossRef]
Corballis, M. (2000). Much ado about mirrors. Psychonomic Bulletin & Review,7, 163–169.
Corballis, M. (2001). Why Mirrors Reverse Left and Right. Psycoloquy,12, 1–4. Available online:
https://www.cogsci.ecs.soton.ac.uk/cgi/psyc/newpsy?12.032 (accessed on 4 November 2024).
Corballis, M. (2011). The recursive mind: The origins of human language, thought, and civilization. Princeton University Press.
Cosmides, L. (1989). The logic of social exchange: Has natural selection shaped how humans reason? Cognition,31(3), 187–276.
[CrossRef] [PubMed]
Crespi, B. J., Flinn, M. V., & Summers, K. (2022). Runaway social selection in human evolution. Frontiers in Ecology and Evolution,
10, 894506. [CrossRef]
Csibra, G., & Gergely, G. (2006). Social learning and social cognition: The case for pedagogy. In Y. Munakata, & M. H. Johnson (Eds.),
Processes of change in brain and cognitive development (pp. 249–274). Academia. [CrossRef]
Currie, A., Killin, A., Lequin, M., Meneganzin, A., & Pain, R. (2024). Past materials, past minds: The philosophy of cognitive
paleoanthropology. Philosophy Compass,19, e13001. [CrossRef]
Darwin, C. (1872). The expression of the emotions in man and animals. John Murray.
De Waal, F. (2010). The age of empathy. Three Rivers Press.
De Waal, F., & Ferrari, P. (2010). Toward a bottom-up perspective on animal and human cognition. Trends in Cognitive Sciences,14,
201–207. [CrossRef] [PubMed]
Di Bernardi Luft, C., Zioga, I., Giannopoulos, A., Di Bona, G., Binetti, N., Civilini, A., Latora, V., & Mareschal, I. (2022). Social
synchronization of brain activity increases during eye-contact. Communications Biology,5(1), 412. [CrossRef] [PubMed]
Di Francesco, M., Marraffa, M., & Paternoster, A. (2021). A self properly embodied. In The Jamesian mind. Routledge. [CrossRef]
Dingemanse, M., & Enfield, N. (2023). Interactive repair and the foundations of language. Trends in Cognitive Sciences,28(1), 30–42.
[CrossRef]
Dingemanse, M., Torreira, F., & Enfield, N. J. (2013). Is “Huh?” a universal word? Conversational infrastructure and the convergent
evolution of linguistic items. PLoS ONE,8(11), e78273. [CrossRef] [PubMed]
Donald, M. (1991). Origins of the modern mind: Three stages in the evolution of culture and cognition. Harvard University Press.
Dor, D. (2016). From experience to imagination: Language and its evolution as a social communication technology. Journal of
Neurolinguistics,43, 107–119. [CrossRef]
Dor, D. (2023). Communication for collaborative computation: Two major transitions in human evolution. Philosophical Transactions of
the Royal Society of London. Series B, Biological Sciences,378(1872), 20210404. [CrossRef]
Dreon, R. (2024). Enlanguaged experience. Pragmatist contributions to the continuity between experience and language. Phenomenology
and the Cognitive Sciences,24(1), 63–83. [CrossRef]
Durdevic, K., & Call, J. (2022). On the origins of mind: A comparative perspective. Annual Review of Developmental Psychology,4,
63–87. [CrossRef]
Edwards-Lowe, G., La Chiusa, E., Olawole-Scott, H., & Yon, D. (2024). Information seeking without metacognition. Available online:
https://osf.io/preprints/psyarxiv/cf4a7_v1 (accessed on 4 November 2024).
Ereira, S., Dolan, R., & Kurth-Nelson, Z. (2018). Agent-specific learning signals for self—Other distinction during mentalising. PLoS
Biology,16(4), e2004752. [CrossRef] [PubMed]
Ericsson, K. A. (2002). Attaining excellence through deliberate practice: Insights from the study of expert performance. In M. Ferrari
(Ed.), The pursuit of excellence through education (pp. 21–55). Lawrence Erlbaum Associates Publishers.
Errante, A., Gerbella, M., Mingolla, G. P., & Fogassi, L. (2023). Activation of cerebellum, basal ganglia and thalamus during observation
and execution of mouth, hand, and foot actions. Brain Topography,36(4), 476–499. [CrossRef] [PubMed]
Essler, S., Becher, T., Pletti, C., Gniewosz, B., & Paulus, M. (2023). Longitudinal evidence that infants develop their imitation abilities by
being imitated. Current Biology,33(21), 4674–4678.e3. [CrossRef] [PubMed]
Fedorenko, E., Piantadosi, S. T., & Gibson, E. A. F. (2024). Language is primarily a tool for communication rather than thought. Nature,
630, 575–586. [CrossRef] [PubMed]
Fleming, L. (2017). Phoneme inventory size and the transition from monoplanar to dually patterned speech. Journal of Language
Evolution,2(1), 52–56. [CrossRef]
Fodor, J. (1975). The language of thought. Harvard University Press.
Fodor, J. (2007). The revenge of the given. In B. McLaughlin, & J. Cohen (Eds.), Contemporary debates in philosophy of mind (pp. 105–116).
Blackwell.
Foley, R., & Mirazón, L. (2020). Variable cognition in the evolution of Homo: Biology and behaviour in the African Middle Stone Age. In
Landscapes of human evolution (pp. 125–141). Archaeopress Publishing Ltd.
Freidline, S. E., Gunz, P., Alichane, H., Oujaa, A., Ben-Ncer, A., El Hajraoui, M. A., & Hublin, J. (2024). The undescribed juvenile maxilla
from Contrebandiers Cave, Morocco – A study on Middle Stone Age facial growth. Journal of Paleolithic Archaeology,7, 15. [CrossRef]
Frith, C. D., & Frith, U. (2007). Social cognition in humans. Current Biology,17(16), R724–R732. [CrossRef] [PubMed]
Gainotti, G. (2024). Emotions related to threatening events are mainly linked to the right hemisphere. Journal of Psychiatry & Neuroscience,
49(3), E208–E211. [CrossRef]
Gallagher, S. (2015). The problem with 3-year-olds. Journal of Consciousness Studies: Controversies in Science and the Humanities,22(1–2),
160–182.
Gallardo, G., Eichner, C., Sherwood, C. C., Hopkins, W. D., Anwander, A., & Friederici, A. D. (2023). Morphological evolution of
language-relevant brain areas. PLoS Biology,21, e3002266. [CrossRef] [PubMed]
Gallese, V. (2018). The Problem of Images: A view from the brain-body. Phenomenology and Mind,14, 70–79. [CrossRef]
Gärdenfors, P. (2022). Teaching as evolutionary precursor to language. Frontiers in Communication,7, 970069. [CrossRef]
Gärdenfors, P., & Lombard, M. (2020). Technology led to more abstract causal reasoning. Biology & Philosophy,35, 40. [CrossRef]
Gasparri, L. (2023). The first words ever spoken. Synthese,201, 174. [CrossRef]
Gast, V. (2023). The temporal alignment of speech-accompanying eyebrow movement and voice pitch. Behavioral Sciences,13(1), 52.
[CrossRef] [PubMed]
Geldhof, J. (2005). ‘Cogitor ergo sum’: On the meaning and relevance of Baader’s theological critique of Descartes. Modern Theology,
21(2), 237–251. [CrossRef]
Geurts, B. (2019, September 25–27). What’s wrong with Gricean pragmatics? 10th International Conference of Experimental Linguistics,
Lisbon, Portugal. [CrossRef]
Godinho, R. M., Spikins, P., & O’Higgins, P. (2018). Supraorbital morphology and social dynamics in human evolution. Nature Ecology
& Evolution,2(6), 956–961. [CrossRef]
Goupil, N., Rayson, H., Serraille, É., Massera, A., Ferrari, P. F., Hochmann, J., & Papeo, L. (2024). Visual preference for socially relevant
spatial relations in humans and monkeys. Psychological Science,35(6), 681–693. [CrossRef]
Graham, K. E., Rossano, F., & Moore, R. T. (2024). The origin of great ape gestural forms. Biological Reviews of the Cambridge Philosophical
Society,100(1), 190–204. [CrossRef]
Guevara, I., Rodríguez, C., & Núñez, M. (2024). Developing gestures in the infant classroom: From showing and giving to pointing.
European Journal of Psychology of Education,39, 4671–4702. [CrossRef]
Haggard, P., & Wolpert, D. (2005). Disorders of Body Scheme. In Higher-Order motor disorders (pp. 261–271). Oxford University Press.
Hahn, L., Sergiou, A., Arbon, J., Fuertbauer, I., King, A., & Thornton, A. (2024). The co-evolution of cognition and sociality. Available
online: https://osf.io/preprints/osf/n2z4a_v1 (accessed on 4 November 2024).
Happé, F. (1993). Communicative competence and theory of mind in autism: A test of relevance theory. Cognition,48, 101–119.
[CrossRef]
Heintz, C., & Scott-Phillips, T. (2022). Expression unleashed: The evolutionary & cognitive foundations of human communication.
Behavioral and Brain Sciences,46, 1–46. [CrossRef]
Henrich, J., & Broesch, J. (2011). On the nature of cultural transmission networks: Evidence from Fijian villages for adaptive learning
biases. Philosophical Transactions of the Royal Society Biological Sciences,366, 1139–1148. [CrossRef] [PubMed]
Hepach, R., Engelmann, J. M., Herrmann, E., Gerdemann, S. C., & Tomasello, M. (2022). Evidence for a developmental shift in the
motivation underlying helping in early childhood. Developmental Science,26, e13253. [CrossRef]
Heyes, C. (2021a). Imitation. Current Biology,31(5), R228–R232. [CrossRef] [PubMed]
Heyes, C. (2021b). Imitation and culture: What gives? Mind and Language,38(1), 42–63. [CrossRef]
Heyes, C., & Catmur, C. (2022). What happened to mirror neurons? Perspectives on Psychological Science,17(1), 153–168. [CrossRef]
[PubMed]
Heyes, C. M., & Frith, C. D. (2014). The cultural evolution of mind reading. Science,344, 1243091. [CrossRef] [PubMed]
Hobaiter, C., Leavens, D. A., & Byrne, R. W. (2014). Deictic gesturing in wild chimpanzees? Journal of Comparative Psychology,128, 82–87.
[CrossRef]
Hockett, C. (1960). The origin of speech. Scientific American,203, 88–111. [CrossRef]
Hurford, J. (2007). The origins of meaning. Oxford University Press.
Joyce, R. (2007). The evolution of morality. MIT Press.
Kaminski, J., & Nitzschner, M. (2013). Do dogs get the point? A review of dog–human communication ability. Learning and Motivation,
44(4), 294–302. [CrossRef]
Kano, F., Furuichi, T., Hashimoto, C., Krupenye, C., Leinwand, J. G., Hopper, L. M., Martin, C. F., Otsuka, R., & Tajima, T. (2022). What
is unique about the human eye? Comparative image analysis on the external eye morphology of human and nonhuman great
apes. Evolution and Human Behavior,43(3), 169–180. [CrossRef]
Kano, F., Krupenye, C., Hirata, S., Call, J., & Tomasello, M. (2017). Submentalizing cannot explain belief-based action anticipation in
apes. Trends in Cognitive Sciences,21(9), 633–634. [CrossRef] [PubMed]
Karabegović, M., & Mercier, H. (2023). The reputational benefits of intellectual humility. Review of Philosophy and Psychology,15(2),
483–498. [CrossRef]
Karg, K., Schmelz, M., Call, J., & Tomasello, M. (2015). The goggles experiment: Can chimpanzees use self-experience to infer what a
competitor can see? Animal Behavior,105, 211–221. [CrossRef]
Karg, K., Schmelz, M., Call, J., & Tomasello, M. (2016). Differing views: Can chimpanzees do level 2 perspective-taking? Animal
Cognition,19, 555–564. [CrossRef]
Keysers, C., & Perrett, D. (2004). Demystifying social cognition: A Hebbian perspective. Trends in Cognitive Sciences,8, 501–507.
[CrossRef] [PubMed]
Kishimoto, T., Shizawa, Y., Yasuda, J., Hinobayashi, T., & Minami, T. (2007). Do pointing gestures by infants provoke comments from
adults? Infant Behavior and Development,30, 562–567. [CrossRef] [PubMed]
Klein, J. T., Shepherd, S. V., & Platt, M. L. (2009). Social attention and the brain. Current Biology,19, R958–R962. [CrossRef] [PubMed]
Kobayashi, H., & Kohshima, S. (2001). Unique morphology of the human eye and its adaptive meaning. Journal of Human Evolution,40,
419–435. [CrossRef]
Krupenye, C., Kano, F., Hirata, S., Call, J., & Tomasello, M. (2016). Great apes anticipate that other individuals will act according to
false beliefs. Science,354(6308), 110–114. [CrossRef] [PubMed]
Laland, K. (2017). The origins of language in teaching. Psychonomic Bulletin & Review,24(1), 225–231. [CrossRef]
Lameira, A. R., E Hardus, M., Ravignani, A., Raimondi, T., & Gamba, M. (2024). Recursive self-embedded vocal motifs in wild
orangutans. eLife,12, RP88348. [CrossRef]
Leary, M. (2004). The sociometer. In R. Baumeister, & K. Vohs (Eds.), Handbook of self-regulation (pp. 373–391). Guilford.
Leavens, D. (2021). The referential problem space revisited: An ecological hypothesis of the evolutionary and developmental origins of
pointing. Cognitive Science,12(4), e1554. [CrossRef] [PubMed]
Leavens, D. A., Hopkins, W. D., & Bard, K. A. (2005). Understanding the point of chimpanzee pointing: Epigenesis and ecological validity.
Current Directions in Psychological Science,14(4), 185–189. [CrossRef] [PubMed]
LeDoux, J. (2012). Rethinking the emotional brain. Neuron,73, 653–676. [CrossRef] [PubMed]
LeDoux, J. (2023). The deep history of ourselves: The four-billion-year story of how we got conscious brains. Philosophical Psychology,
36(4), 704–715. [CrossRef]
Levy, A., & Weinshtock-Saadon, I. (2023). Evolutionary debunking of (arguments for) moral realism. Synthese,201(5), 1–22. [CrossRef]
Lewis, L., & Krupenye, C. (2022). Theory of mind in nonhuman primates. In Primate cognitive studies. Cambridge University Press.
[CrossRef]
Lewis, M. (2000). The emergence of human emotions. In M. Lewis, & J. Haviland-Jones (Eds.), Handbook of emotions (pp. 265–280).
Guilford.
Li, L. (2023). The other side of false belief: Constructing the objectivity of reality. Infant and Child Development,32, e2416. [CrossRef]
Lind, J., & Jon-And, A. (2024). A sequence bottleneck for animal intelligence and language? Trends in Cognitive Sciences. Online ahead of
print. [CrossRef]
Lipschits, O., & Geva, R. (2024). An integrative model of parent-infant communication development. Child Development Perspectives,
18(3), 137–144. [CrossRef]
Lorenz, K. (1966). Evolution and modification of behaviour. Methuen.
Lotem, A., Halpern, J. Y., Edelman, S., & Kolodny, O. (2017). The Evolution of Cognitive Mechanisms in Response to Cultural
Innovations. Proceedings of the National Academy of Sciences USA,114, 7915–7922. [CrossRef]
Lurz, R. W., Krachun, C., Mareno, M. C., & Hopkins, W. D. (2022). Do chimpanzees predict others’ behavior by simulating their beliefs?
Animal Behavior and Cognition,9, 153–175. [CrossRef]
Lyn, H., & Christopher, J. (2018). A point is not a point is not a point: Reinterpreting three basic kinds of pointing comprehension. Proceedings of
Evolang 2018 (pp. 260–263). Available online: https://pure.mpg.de/rest/items/item_3190925_17/component/file_3260022/content (accessed on 4 November 2024).
Lyn, H., Greenfield, P. M., Savage-Rumbaugh, S., Gillespie-Lynch, K., & Hopkins, W. D. (2011). Nonhuman primates do declare! A
comparison of declarative symbol and gesture use in children, bonobos, and chimpanzees. Language & Communication,31(1),
63–74. [CrossRef]
Lyn, H., West, K., Villegas, J., Bass, C., & Baker, S. (2024). Pointing on the other side: Do dogs follow contralateral points? Available online:
https://www.preprints.org/manuscript/202401.1896 (accessed on 4 November 2024).
Maibom, H. (2010). The descent of shame. Philosophy and Phenomenological Research,80(3), 566–594. [CrossRef]
Margoni, F., Surian, L., & Baillargeon, R. (2023). The violation-of-expectation paradigm: A conceptual overview. Psychological Review,
131, 716–748. [CrossRef] [PubMed]
Mayhew, J., & Gómez, J. C. (2015). Gorillas with white sclera. American Journal of Primatology,77, 869–887. [CrossRef] [PubMed]
Mearing, A. S., & Koops, K. (2021). Quantifying gaze conspicuousness: Are humans distinct from chimpanzees and bonobos? Journal of
Human Evolution,157, 103043. [CrossRef] [PubMed]
Meguerditchian, A. (2022). On the gestural origins of language: What baboons’ gestures and brain have told us after 15 years
of research. Ethology Ecology & Evolution,34, 288–302. Available online: https://www.tandfonline.com/doi/full/10.1080/
03949370.2022.2044388 (accessed on 4 November 2024).
Meguerditchian, A., Molesti, S., & Vauclair, J. (2011). Right-handedness predominance in 162 baboons for gestural communication:
Consistency across time and groups. Behavioral Neuroscience,125, 653–660. [CrossRef] [PubMed]
Melis, A., & Rossano, F. (2022). When and how do non-human great apes communicate to support cooperation? Philosophical
Transactions of the Royal Society B: Biological Sciences,377, 20210109. [CrossRef]
Meneganzin, A., Ramsey, G., & DiFrisco, J. (2024). What is a trait? Lessons from the human chin. Journal of Experimental Zoology. Part B,
Molecular and Developmental Evolution,342, 65–75. [CrossRef] [PubMed]
Michael, J., & Székely, M. (2019). Goal slippage: A mechanism for spontaneous instrumental helping in infancy? Topoi,38, 173–183.
[CrossRef]
Miyazono, K., & Inarimori, K. (2021). Empathy, altruism, and group identification. Frontiers in Psychology,12, 749315. [CrossRef]
[PubMed]
Moore, C. (2008). The development of gaze following. Child Development Perspectives,2, 66–70. [CrossRef]
Moore, R. (2013). Evidence and interpretation in great ape gestural communication. Humana.Mente Journal of Philosophical Studies,
6, 27–51. Available online: https://pure.mpg.de/rest/items/item_1838343_2/component/file_1838342/content (accessed on 4
November 2024).
Moore, R. (2015). A common intentional framework for ape and human communication. Current Anthropology,56(1), 56–80.
Moore, R. (2020). The cultural evolution of mind-modelling. Synthese,199(1), 1751–1776. [CrossRef]
Morrison, D. (2020). Disambiguated indexical pointing as a tipping point for the explosive emergence of language among human
ancestors. Biological Theory,15, 196–211. [CrossRef]
Mussavifard, N. (2023). Ostensive marking as a distinctive feature of human communication. Available online: https://www.researchgate.net/
publication/372788891_Ostensive_Marking_as_a_Distinctive_Feature_of_Human_Communication (accessed on 4 November 2024).
Mussavifard, N., & Csibra, G. (2023). The co-evolution of cooperation and communication: Alternative accounts. Behavioral and Brain
Sciences,46, e11. [CrossRef] [PubMed]
Neubauer, S., Hublin, J., & Gunz, P. (2018). The evolution of modern human brain shape. Science Advances,4, eaao5961. [CrossRef]
[PubMed]
Nevejans, M., & Cracco, E. (2022). Model expertise does not influence automatic imitation. Experimental Brain Research,240(4),
1267–1277. [CrossRef]
Okasha, S. (2022). Goal attributions in biology: Objective fact, anthropomorphic bias, or valuable heuristic? Available online: https://
philsci-archive.pitt.edu/id/eprint/20701 (accessed on 4 November 2024).
Onishi, K. H., & Baillargeon, R. (2005). Do 15-month-old infants understand false beliefs? Science,308, 255–258. [CrossRef] [PubMed]
Onu, D., Kessler, T., & Smith, J. R. (2016). Admiration: A conceptual review of the knowns and unknowns. Emotion Review,8, 218–230.
[CrossRef]
Osiurak, F., Claidière, N., & Federico, G. (2022). Bringing cumulative technological culture beyond copying versus reasoning. Trends in
Cognitive Sciences,27, 30–42. [CrossRef] [PubMed]
Osiurak, F., Crétel, C., Uomini, N., Bryche, C., Lesourd, M., & Reynaud, E. (2021). On the neurocognitive co-evolution of tool behavior
and language: Insights from the massive redeployment framework. Topics in Cognitive Science,13(4), 684–707. [CrossRef]
Pan, X., Hsiao, V., Nau, D. S., & Gelfand, M. J. (2024). Explaining the evolution of gossip. Proceedings of the National Academy of Sciences
USA,121, e2214160121. [CrossRef]
Pan, Y., Dikker, S., Goldstein, P., Zhu, Y., Yang, C., & Hu, Y. (2020). Instructor-learner brain coupling discriminates between instructional
approaches and predicts learning. NeuroImage,211, 116657. [CrossRef]
Paulus, M., & Fikkert, P. (2013). Conflicting social cues: Infants’ reliance on gaze and pointing cues in word learning. Journal of Cognition
and Development,15, 43–59. [CrossRef]
Peeters, A., Cosentino, E., & Werning, M. (2023). Constructing a wider view on memory: Beyond the dichotomy of field and observer
perspectives. In A. Berninger, & Í. Vendrell Ferran (Eds.), Philosophical perspectives on memory and imagination (pp. 165–190).
Routledge.
Perea-García, J. O., Kret, M. E., Monteiro, A., & Hobaiter, C. (2019). Scleral pigmentation leads to conspicuous, not cryptic, eye
morphology in chimpanzees. Proceedings of the National Academy of Sciences USA,116(39), 19248–19250. [CrossRef]
Perner, J., Priewasser, B., & Roessler, J. (2018). The practical other: Teleology and its development. Interdisciplinary Science Reviews,43,
99–114. [CrossRef]
Pfister, R., Klaffehn, A., Kalckert, A., Kunde, W., & Dignath, D. (2021). How to lose a hand: Sensory updating drives disembodiment.
Psychonomic Bulletin & Review,28, 827–833. Available online: https://link.springer.com/article/10.3758/s13423-020-01854-0
(accessed on 4 November 2024).
Phillips, J., Buckwalter, W., Cushman, F., Friedman, O., Martin, A., Turri, J., Santos, L., & Knobe, J. (2020). Knowledge before belief.
Behavioral and Brain Sciences,44, 1–37. [CrossRef]
Phillips, S. (2024). A category theory perspective on the language of thought. Frontiers in Psychology,15, 1361580. [CrossRef] [PubMed]
Piaget, J. (1954). La formation du symbole chez l’enfant. Delachaux & Niestlé.
Piretti, L., Pappaianni, E., Garbin, C., Rumiati, R. I., Job, R., & Grecucci, A. (2023). The neural signatures of shame, embarrassment, and
guilt: A voxel-based meta-analysis on functional neuroimaging studies. Brain Sciences,13, 559. [CrossRef] [PubMed]
Planer, R. (2019). The evolution of languages of thought. Biology and Philosophy,34, 47. [CrossRef]
Planer, R. (2023). The evolution of hierarchically structured communication. Frontiers in Psychology,14, 1224324. [CrossRef] [PubMed]
Planer, R., Bandini, E., & Tennie, C. (2024). Hominin tool evolution and its (surprising) relation to language origins. Available online: https://www.academia.edu/105665796/Hominin_Tool_Evolution_and_Its_Surprising_Relation_to_Language_Origins (accessed on 4 November 2024).
Pomper, J. K., Shams, M., Wen, S., Bunjes, F., & Thier, P. (2023). Non-shared coding of observed and executed actions prevails in
macaque ventral premotor mirror neurons. eLife,12, e77513. [CrossRef] [PubMed]
Poulin-Dubois, D., Goldman, E. J., Meltzer, A., & Psaradellis, E. (2023). Discontinuity from implicit to explicit theory of mind from
infancy to preschool age. Cognitive Development,65, 101273. [CrossRef]
Pouw, W., Werner, R., Burchardt, L., & Selen, L. (2023). The human voice aligns with whole-body kinetics. bioRxiv. [CrossRef]
Prein, J., Maurits, L., Werwach, A., Haun, D., & Bohn, M. (2024). Variation in gaze understanding across the life span: A process-level
perspective. Available online: https://osf.io/preprints/psyarxiv/dy73a_v1 (accessed on 4 November 2024).
Priest, M. (2017). Intellectual humility: An interpersonal theory. Ergo,4, 463–480. [CrossRef]
Rakoczy, H. (2022). Foundations of theory of mind and its development in early childhood. Nature Reviews Psychology,1(4), 223–235.
Available online: https://www.nature.com/articles/s44159-022-00037-z (accessed on 4 November 2024). [CrossRef]
Rakoczy, H., & Proft, M. (2022). Knowledge before belief ascription? Yes and no (depending on the type of “knowledge” under
consideration). Frontiers in Psychology,13, 988754. [CrossRef] [PubMed]
Rand, D., Greene, J., & Nowak, M. (2012). Spontaneous giving and calculated greed. Nature,489, 427–430. [CrossRef]
Reddy, V. (2010). How infants know minds. Harvard University Press.
Rendall, D., Owren, M., & Ryan, M. (2009). What do animal signals mean? Animal Behaviour,78, 233–240. [CrossRef]
Rodríguez, C., Moreno-Núñez, A., Basilio, M., & Sosa, N. (2015). Ostensive gestures come first: Their role in the beginning of shared
reference. Cognitive Development,36, 142–149. [CrossRef]
Rönnqvist, L. (2003). Developmentally, the arm preference precedes handedness. Behavioral and Brain Sciences,26, 238–239. [CrossRef]
Ross, L. (1977). The intuitive psychologist and his shortcomings: Distortions in the attribution process. In L. Berkowitz (Ed.), Advances
in experimental social psychology (Vol. 10). Academic Press.
Rossano, M. (2003). Expertise and the evolution of consciousness. Cognition,89, 207–236. [CrossRef]
Roszak, P. (2022). Not only coping: Resilience and its sources from a thomistic perspective. Journal of Religion and Health,62(4),
2734–2745. [CrossRef]
Royo, J., Orset, T., Catani, M., Pouget, P., & Thiebaut de Schotten, M. (2024). Evidence for an evolutionary continuity in social dominance: Insights from non-human primates tractography. Available online: https://www.researchsquare.com/article/rs-4772053/v1 (accessed on 4 November 2024).
Ruba, A., & Repacholi, B. (2020). Beyond language in infant emotion concept development. Emotion Review,12(4), 255–258. [CrossRef]
Ruba, A. L., Pollak, S. D., & Saffran, J. R. (2022). Acquiring complex communicative systems: Statistical learning of language and
emotion. Topics in Cognitive Science,14(3), 432–450. [CrossRef]
Rubio-Fernandez, P. (2020). Pragmatic markers: The missing link between language and theory of mind. Synthese,199, 1125–1158.
[CrossRef]
Rühlemann, C., & Trujillo, J. (2024). The effect of gesture expressivity on emotional resonance in storytelling interaction. Frontiers in
Psychology,15, 1477263. [CrossRef]
Scerri, E. M., & Will, M. (2023). The revolution that still isn’t: The origins of behavioral complexity in Homo sapiens. Journal of Human
Evolution,179, 103358. [CrossRef]
Schlinger, H. (2009). Theory of mind: An overview and behavioral perspective. The Psychological Record,59, 435–448. Available online:
https://link.springer.com/content/pdf/10.1007/BF03395673.pdf (accessed on 4 November 2024). [CrossRef]
Schüler, C., Berger, P., & Grosse Wiesmann, C. (2024). A dorsal versus ventral network for understanding others in the developing
brain. bioRxiv. [CrossRef]
Schuwerk, T., Kampis, D., Baillargeon, R., Biro, S., Bohn, M., Byers-Heinlein, K., & Rakoczy, H. (2024). Project MANYBABIES. Registered
report. Action anticipation based on an agent’s epistemic state in toddlers and adults. Available online: https://osf.io/preprints/
psyarxiv/x4jbm_v1 (accessed on 4 November 2024).
Scott-Phillips, T., & Heintz, C. (2023). Great ape interaction: Ladyginian but not Gricean. Proceedings of the National Academy of Sciences
USA,120, e2300243120. [CrossRef] [PubMed]
Shilton, D., Breski, M., Dor, D., & Jablonka, E. (2020). Human social evolution: Self-domestication or self-control? Frontiers in Psychology,
11, 134. [CrossRef]
Shimoni, E., Berger, A., & Eyal, T. (2022). Your pride is my goal: How the exposure to others’ positive emotional experience influences
preschoolers’ delay of gratification. Journal of Experimental Child Psychology,217, 105356. [CrossRef] [PubMed]
Siposova, B., Tomasello, M., & Carpenter, M. (2018). Communicative eye contact signals a commitment to cooperate for young children.
Cognition,179, 192–201. [CrossRef] [PubMed]
Southgate, V. (2020). Are infants altercentric? The other and the self in early social cognition. Psychological Review,127(4), 505–523.
[CrossRef]
Southgate, V., Van Maanen, C., & Csibra, G. (2007). Infant pointing: Communication to cooperate or communication to learn?
Child Development,78(3), 735–740. Available online: https://srcd.onlinelibrary.wiley.com/doi/10.1111/j.1467-8624.2007.01028.x
(accessed on 4 November 2024). [CrossRef] [PubMed]
Spikins, P., Needham, A., Wright, B., Dytham, C., Gatta, M., & Hitchens, G. (2019). Living to fight another day: The ecological and
evolutionary significance of Neanderthal healthcare. Quaternary Science Reviews,217, 98–118. [CrossRef]
Spurrett, D. (2024). Motivation and cumulative culture. Commentary on Sterelny and Hiscock, cumulative culture, archaeology, and the
zone of latent solutions. Current Anthropology,65(1), 23–48. Available online: https://www.journals.uchicago.edu/doi/10.1086/
728723 (accessed on 4 November 2024).
Sterelny, K. (2023). Niche construction, cumulative culture and the social transmission of expertise. PaleoAnthropology. Available online:
https://paleoanthropology.org/ojs/index.php/paleo/article/view/119 (accessed on 4 November 2024).
Sterelny, K., & Hiscock, P. (2024). Cumulative culture, archaeology, and the zone of latent solutions. Current Anthropology,65(1), 23–48.
[CrossRef]
Steven, S., Cole, G., & Eacott, M. (2022). It’s not you, it’s me: A review of individual differences in visuospatial perspective taking.
Perspectives on Psychological Science,18(2), 293–308. [CrossRef]
Sznycer, D. (2019). Forms and functions of the self-conscious emotions. Trends in Cognitive Sciences,23(2), 143–157. [CrossRef] [PubMed]
Sznycer, D., & Cohen, A. (2021). How pride works. Evolutionary Human Sciences,3, 1–39. [CrossRef]
Sznycer, D., Al-Shawaf, L., Bereby-Meyer, Y., Curry, O. S., De Smet, D., Ermer, E., Kim, S., Kim, S., Li, N. P., Seal, M. F. L., McClung, J.,
O, J., Ohtsubo, Y., Quillien, T., Schaub, M., Sell, A., van Leeuwen, F., Cosmides, L., & Tooby, J. (2017). Cross-cultural regularities in
the cognitive architecture of pride. Proceedings of the National Academy of Sciences USA,114, 1874–1879. [CrossRef]
Tatone, D., & Csibra, G. (2015). Learning in and about opaque worlds. Behavioral and Brain Sciences,38, e68. [CrossRef]
Tattersall, I. (2023). Let sleeping syntheses lie. Special issue: Niche construction, plasticity, and inclusive inheritance: Rethinking
human origins with the extended evolutionary synthesis, part 1. PaleoAnthropology,2023, 258–265. [CrossRef]
Tebbe, A. L., Rothmaler, K., Koester, M., & Wiesmann, C. (2024). Infants and adults neurally represent the perspective of others like
their own perception. bioRxiv. [CrossRef]
Téglás, E., Gergely, A., Kupán, K., Miklósi, Á., & Topál, J. (2012). Dogs’ gaze following is tuned to human communicative signals.
Current Biology,22(3), 209–212. [CrossRef]
Tennie, C., Braun, D. R., Premo, L. S., & McPherron, S. P. (2016). The island test for cumulative culture in the Paleolithic. In M. Haidle,
N. Conard, & M. Bolus (Eds.), The nature of culture. Springer Press. [CrossRef]
Thiele, M., Kalinke, S., Michel, C., & Haun, D. B. M. (2023). Direct and observed joint attention modulate 9-month-old infants’ object
encoding. Open Mind,7, 917–946. [CrossRef] [PubMed]
Thomas, E. R., Haarsma, J., Nicholson, J., Yon, D., Kok, P., & Press, C. (2024). Predictions and errors are distinctly represented across V1
layers. Current Biology,34, 2265–2271.e4. [CrossRef]
Thorne, T. N., Milyavskaya, M., Werner, K., Leduc-Cummings, I., Saunders, B., & Inzlicht, M. (2023). The personal goal difficulty-progress paradox: Unraveling the role of self-efficacy on perceptions of goal difficulty. Available online: https://www.researchgate.net/publication/376174835_The_Personal_Goal_Difficulty_-_Progress_Paradox_Unraveling_the_Role_of_Self-Efficacy_on_Perceptions_of_Goal_Difficulty (accessed on 4 November 2024).
Thornton, M., & Tamir, D. (2024). Neural representations of situations and mental states are composed of sums of representations of
the actions they afford. Nature Communications,15(1), 620. [CrossRef] [PubMed]
Tomasello, M. (1999). The Human Adaptation for Culture. Annual Review of Anthropology,28, 509–529. [CrossRef]
Tomasello, M. (2008). Origins of human communication. MIT Press.
Tomasello, M. (2012). Why be nice? Better not think about it. Trends in Cognitive Sciences,16, 580–581. [CrossRef] [PubMed]
Tomasello, M. (2018). How children come to understand false beliefs: A shared intentionality account. Proceedings of the National
Academy of Sciences USA,115, 8491–8498. [CrossRef] [PubMed]
Tomasello, M. (2022). Social cognition and metacognition in great apes: A theory. Animal Cognition,26(1), 25–35. [CrossRef] [PubMed]
Tomasello, M., & Call, J. (2019). Thirty years of great ape gestures. Animal Cognition,22(4), 461–469. [CrossRef]
Tomasello, M., Call, J., & Hare, B. (2003). Chimpanzees understand psychological states—The question is which ones and to what
extent. Trends in Cognitive Sciences,7, 153–156. [CrossRef]
Tomasello, M., Hare, B., Lehmann, H., & Call, J. (2007). Reliance on head versus eyes in the gaze following of great apes and human
infants: The cooperative eye hypothesis. Journal of Human Evolution,52, 314–320. [CrossRef] [PubMed]
Tomasello, R., Grisoni, L., Boux, I., Sammler, D., & Pulvermüller, F. (2022). Instantaneous neural processing of communicative functions
conveyed by speech prosody. Cerebral Cortex,32, 4885–4901. [CrossRef] [PubMed]
Tomonaga, M., Kurosawa, Y., Kawaguchi, Y., & Takiyama, H. (2023). Don’t look back on failure: Spontaneous uncertainty monitoring
in chimpanzees. Learning & Behavior,51(4), 402–412. [CrossRef]
Tracy, J. L., Mercadante, E., & Witkower, Z. (2024). The evolved nature of pride. In The Oxford handbook of evolution and the emotions
(pp. 203–218). Oxford University Press. [CrossRef]
Uomini, N., & Ruck, L. (2019). Testing models of handedness in stone tools. In Squeezing minds from stones. Oxford University Press.
[CrossRef]
van Leeuwen, E., Detroy, S., Haun, D., & Call, J. (2024). Chimpanzees use social information to acquire a skill they fail to innovate.
Nature Human Behaviour,8(5), 891–902. [CrossRef] [PubMed]
van Woerkum, B., & Barrett, L. (2024). Anthropofabrication and the redressing of memory: An embodied approach to comparative
cognition. Philosophical Transactions B,379, 20230145. [CrossRef]
Vasilieva, O. (2019). Beyond “Uniqueness”: Habitual traits in the context of cognitive-communicative continuity. Theoria et Historia
Scientiarum,16, 129. [CrossRef]
Vieira, J., & Olsson, A. (2022). Help or flight: Neural defensive circuits promote helping under threat in humans. eLife,11, e78162.
[CrossRef] [PubMed]
Vieira, J. B., Schellhaas, S., Enström, E., & Olsson, A. (2020). Help or flight? Increased threat imminence promotes defensive helping in
humans. Proceedings of the Royal Society B: Biological Sciences,287, 20201473. [CrossRef] [PubMed]
Vincini, S. (2023). Can interactionist approaches solve the empathy-sharing conundrum? In Empathy’s role in understanding persons,
literature, and art (pp. 44–64). Routledge. [CrossRef]
Vygotsky, L., & Cole, M. (1978). Mind in society: Development of higher psychological processes. Harvard University Press.
Vyshedskiy, A. (2022). Language evolution is not limited to speech acquisition: A large study of language development in children with
language deficits highlights the importance of the voluntary imagination component of language. Research Ideas and Outcomes,8,
e86401. [CrossRef]
Warren, E., & Call, J. (2022). Inferential communication: Bridging the gap between intentional and ostensive communication in
non-human primates. Frontiers in Psychology,12, 718251. [CrossRef] [PubMed]
Warren, E., Call, J., & György, G. (2023). On the murky dissociation between expression and communication. Behavioral and Brain
Sciences,46, e19. [CrossRef]
Wilkins, J., & Griffiths, P. (2013). Evolutionary debunking arguments in three domains: Fact, value, and religion. In A new science of
religion (pp. 136–146). University of Chicago Press.
Witkower, Z., Tracy, J., Cheng, J., & Henrich, J. (2020). Two signals of social rank: Prestige and dominance are associated with distinct
nonverbal displays. Journal of Personality and Social Psychology,118, 89–120. [CrossRef] [PubMed]
Wolf, W., Thielhelm, J., & Tomasello, M. (2023). Five-year-old children show cooperative preferences for faces with white sclera. Journal
of Experimental Child Psychology,225, 105532. [CrossRef]
Woo, B. M., Tan, E., Yuen, F. L., & Hamlin, J. K. (2022). Socially evaluative contexts facilitate mentalizing. Trends in Cognitive Sciences,
27(1), 17–29. [CrossRef] [PubMed]
Woo, B., & Spelke, E. (2022). Toddlers’ social evaluations of agents who act on false beliefs. Developmental Science,26(2), e13314.
[CrossRef]
Yáñez, B., & Gomila, A. (2018). Evolución de la esclerótica del ojo humano: Una hipótesis social. Ludus Vitalis,26, 119–132.
Zollikofer, C. P. E., Bienvenu, T., Beyene, Y., Suwa, G., Asfaw, B., White, T. D., & de León, M. S. P. (2022). Endocranial ontogeny and
evolution in early Homo sapiens: The evidence from Herto, Ethiopia. Proceedings of the National Academy of Sciences USA,119(32),
e2123553119. [CrossRef]
Zuberbühler, K. (2008). Gaze following. Current Biology,18(11), R453–R455. [CrossRef]
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual
author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to
people or property resulting from any ideas, methods, instructions or products referred to in the content.
ResearchGate has not been able to resolve any citations for this publication.
Article
Full-text available
The key function of storytelling is a meeting of hearts: a resonance in the recipient(s) of the story narrator’s emotion toward the story events. This paper focuses on the role of gestures in engendering emotional resonance in conversational storytelling. The paper asks three questions: Does story narrators’ gesture expressivity increase from story onset to climax offset (RQ #1)? Does gesture expressivity predict specific EDA responses in story participants (RQ #2)? How important is the contribution of gesture expressivity to emotional resonance compared to the contribution of other predictors of resonance (RQ #3)? 53 conversational stories were annotated for a large number of variables including Protagonist, Recency, Group composition, Group size, Sentiment, and co-occurrence with quotation. The gestures in the stories were coded for gesture phases and gesture kinematics including Size, Force, Character view-point, Silence during gesture, Presence of hold phase, Co-articulation with other bodily organs, and Nucleus duration. The Gesture Expressivity Index (GEI) provides an average of these parameters. Resonating gestures were identified, i.e., gestures exhibiting concurrent specific EDA responses by two or more participants. The first statistical model, which addresses RQ #1, suggested that story narrators’ gestures become more expressive from story onset to climax offset. The model constructed to adress RQ #2 suggested that increased gesture expressivity increases the probability of specific EDA responses. To address RQ #3 a Random Forest for emotional resonance as outcome variable and the seven GEI parameters as well as six more variables as predictors was constructed. All predictors were found to impact Eemotional resonance. Analysis of variable importance showed Group composition to be the most impactful predictor. 
Inspection of ICE plots clearly indicated combined effects of individual GEI parameters and other factors, including Group size and Group composition. This study shows that more expressive gestures are more likely to elicit physiological resonance between individuals, suggesting an important role for gestures in connecting people during conversational storytelling. Methodologically, this study opens up new avenues of multimodal corpus linguistic research by examining the interplay of emotion-related measurements and gesture at micro-analytic kinematic levels and using advanced machine-learning methods to deal with the inherent collinearity of multimodal variables.
Article
Norms play a crucial role in governing human societies. From an early age, humans possess an innate understanding of norms, recognizing certain behaviours, contexts, and roles as being governed by them. The evolution of normativity has been linked to its contribution to the promotion of cooperation in large groups and is intertwined with the development of joint intentionality. However, there is no evolutionary consensus on what normatively differentiated our hominin ancestors from the phylogenetic lineage leading to chimpanzees and bonobos. Here we propose that the development of teaching through a process of evaluative feedback between parent and offspring functioned as a prerequisite for the later development of normativity. Parents approve or disapprove of offspring’s behaviours based on their own learned knowledge of what is appropriate or inappropriate. We argue our proposition using a simple model of cultural transmission, which shows the adaptive advantage offered by these elementary forms of teaching. We show that an important part of this adaptive advantage can arise from the benefits derived from guidance about which behaviours to adopt or reject. We propose that this type of guidance has fundamental elements that characterise the normative world. We complete our argument by reviewing several studies that examine the emergence of normativity in young children without prior exposure to a normative framework with respect to the behaviours under analysis. We suggest that this normativity is best interpreted as manifestations of teaching among young children rather than as norm recognition among early normative children.
Article
Two views claim to account for the origins of great ape gestural forms. On the Leipzig view, gestural forms are ontogenetically ritualised from action sequences between pairs of individuals. On the St Andrews view, gestures are the product of natural selection for shared gestural forms. The Leipzig view predicts within- and between-group differences between gestural forms that arise as a product of learning in ontogeny. The St Andrews view predicts universal gestural forms comprehensible within and between species that arise because gestural forms were a target of natural selection. We reject both accounts and propose an alternative "recruitment view" of the origins of great ape gestures. According to the recruitment view, great ape gestures recruit features of their existing behavioural repertoire for communicative purposes. Their gestures inherit their communicative functions from visual (and sometimes tactile) presentations of familiar and easily recognisable action schemas and states and parts of the body. To the extent that great ape species possess similar bodies, this predicts mutual comprehensibility within and between species, but without supposing that gestural forms were themselves targets of natural selection. Additionally, we locate great ape gestural communication within a pragmatic framework that is continuous with human communication, and make testable predictions for adjudicating between the three alternative views. We propose that the recruitment view best explains existing data, and does so within a mechanistic framework that emphasises continuity between human and non-human great ape communication.
Article
This theme issue brings together researchers from diverse fields to assess the current status and future prospects of embodied cognition in the age of generative artificial intelligence. In this introduction, we first clarify our view of embodiment as a potentially unifying concept in the study of cognition, characterizing this as a perspective that questions mind–body dualism and recognizes a profound continuity between sensorimotor action in the world and more abstract forms of cognition. We then consider how this unifying concept is developed and elaborated by the other contributions to this issue, identifying the following two key themes: (i) the role of language in cognition and its entanglement with the body and (ii) bodily mechanisms of interpersonal perception and alignment across the domains of social affiliation, teaching and learning. On balance, we consider that embodied approaches to the study of cognition, culture and evolution remain promising, but will require greater integration across disciplines to fully realize their potential. We conclude by suggesting that researchers will need to be ready and able to meet the various methodological, theoretical and practical challenges this will entail and remain open to encountering markedly different viewpoints about how and why embodiment matters. This article is part of the theme issue ‘Minds in movement: embodied cognition in the age of artificial intelligence’.
Article
On what basis do researchers posit that humans and other animals share cognitive capacities? We argue that such claims are not based on inherent, pre-existing similarities, but rather emerge through a two-step process, which we will call ‘anthropofabrication’. In the initial stage, embodied action-based strategies and environmental context in human studies are ignored owing to the need for measurement and quantification. Consequently, cognitive terms become disconnected from the context to which we apply them, and human classificatory cognitive terms are transformed into broad explanatory terms, assumed to be ‘species-neutral’. The second phase entails translating and applying these generalized explanatory terms to specific nonverbal animals in ways that serve to further cloak differences between animals and other species. Here, again, researchers selectively discard contextual information to facilitate the comparison with humans. To limit anthropofabrication, we should (re)acknowledge that cognitive abilities are not species-neutral and cannot be detached from embodied action, perception and their context of occurrence. We illustrate our points about anthropofabrication using the example of memory research. This article is part of the theme issue ‘Minds in movement: embodied cognition in the age of artificial intelligence’.
Preprint
The dynamics of social dominance play a significant role in regulating access to resources and influencing reproductive success and survival in non-human primates. These dynamics are based on aggressive and submissive interactions which create distinct, hierarchically organized social structures. In humans, whose social behavior is similarly organized, the use of brain imaging based on tractography has identified key neuronal networks of the limbic system underlying social behaviour. Among them are the uncinate fasciculus and the cingulum bundle, which have been associated with conduct disorder and psychopathy. In this study, we have used advanced tractography to study the anatomy of connections underlying social dominance in a colony of 15 squirrel monkeys (Saimiri sciureus). We correlated biostructural properties of the uncinate fasciculus and cingulum with behavioral hierarchy measures while controlling for factors such as age, weight, handedness, brain size, and hormonal influences. The fornix, a limbic connection involved in memory, was also included as a control tract. Our findings indicate a significant correlation between the integrity of the right uncinate fasciculus and social dominance measures, including normalised David’s scores, aggressive behaviors, and withdrawal behaviors. Trends observed in the left uncinate fasciculus hint at potential bilateral involvement with a right hemispheric lateralisation. These results are consistent with human studies linking the uncinate fasciculus to social disorders, suggesting an evolutionary continuity in the neuro-anatomical substrates of social dominance back to at least 35 million years.
Preprint
Cognition serves to resolve uncertainty. Living in social groups is widely seen as a source of uncertainty driving cognitive evolution, but sociality can also mitigate sources of uncertainty, reducing the need for cognition. Moreover, social systems are not simply external selection pressures, but rather arise from the decisions individuals make regarding who to interact with and how to behave. Thus, an understanding of how and why cognition evolves requires careful consideration of the co-evolutionary feedback loop between cognition and sociality. Here, we adopt ideas from information theory to evaluate how potential sources of uncertainty differ across species and social systems. Whereas cognitive research often focuses on identifying human-like abilities in other animals, we instead emphasise that animals need to make adaptive decisions to navigate socio-ecological trade-offs. These decisions can be viewed as feedback loops between perceiving and acting on information, which shape individuals’ immediate social interactions, and scale up to generate the structure of societies. Emerging group-level characteristics such as social structure, communication networks, and culture in turn produce the context in which decisions are made and so shape selection on the underlying cognitive processes. Thus, minds shape societies and societies shape minds.
Article
Research on gesture development has mostly focused on home environments. Little is known about early communicative development in other relevant contexts, such as early-years-schools. These settings, rich in diverse educative situations, objects, and communicative partners, provide a contrast to parent–child interactions, complementing our understanding of gesture development. This study aims to describe the development of the first gestures in the infant classrooms of early-years-schools, focusing on ostensive gestures of showing and giving: their emergence, communicative functions, and relation to the subsequent emergence of pointing. We conducted a longitudinal, observational investigation analyzing the gestures of 21 children (7–13 months). Over 7 months, we observed and registered children’s daily interactions in the classroom, employing a mixed quantitative and qualitative approach to analyze the types and functions of their gestures. We found a significant increase and diversification of gesture types and functions with age. Gestures followed a proximal–distal developmental course. Ostensive gestures were the earliest and most prevalent gestures observed. There was a correlation between the frequency of these gestures, with ostensive gestures fulfilling communicative functions later observed in pointing. Our qualitative analysis revealed the progressive construction of ostensive gestures into spontaneous, complex, and conventional forms of communication. These results highlight the important role of ostensive gestures in early communicative development, paving the way for distal communication through pointing and relating to the origin of intentional communication. More broadly, these findings have significant implications for early educational practices and show the value of conducting research on developmental processes in early education. Full paper available at: https://rdcu.be/dRJ1A
Preprint
Preverbal infants already seem to consider the perspective of others, even when it differs from their own. Similarly, adults take the perspective of others very quickly, in parallel to other cognitively demanding tasks. This raises the question of how multiple perspectives are processed efficiently, and even before higher cognitive capacities develop. To test whether and how others' perspectives are neurally represented, we presented 12- to 14-month-old infants and adults with objects flickering at 4 Hz, which evoked neural oscillations at the exact same frequency. Remarkably, both in infants and adults, this same highly specific neural signature of visual object processing was also present when their view was blocked and only another observer saw the object. These results provide strong evidence that we process what others see as if we saw it ourselves, revealing a neural mechanism for efficient perspective taking, present from infancy.