Darwin's mistake: Explaining the discontinuity between human and nonhuman minds

Article in Behavioral and Brain Sciences 31(2):109-30; discussion 130-178 · May 2008
DOI: 10.1017/S0140525X08003543 · Source: PubMed
Derek C. Penn
Department of Psychology, University of California, Los Angeles, Los Angeles,
CA 90095; Cognitive Evolution Group, University of Louisiana, Lafayette, LA
Keith J. Holyoak
Department of Psychology, University of California, Los Angeles, Los Angeles,
CA 90095
Daniel J. Povinelli
Cognitive Evolution Group, University of Louisiana, Lafayette, LA 70504
Abstract: Over the last quarter century, the dominant tendency in comparative cognitive psychology has been to emphasize the
similarities between human and nonhuman minds and to downplay the differences as “one of degree and not of kind” (Darwin
1871). In the present target article, we argue that Darwin was mistaken: the profound biological continuity between human and
nonhuman animals masks an equally profound discontinuity between human and nonhuman minds. To wit, there is a significant
discontinuity in the degree to which human and nonhuman animals are able to approximate the higher-order, systematic,
relational capabilities of a physical symbol system (PSS) (Newell 1980). We show that this symbolic-relational discontinuity
pervades nearly every domain of cognition and runs much deeper than even the spectacular scaffolding provided by language
or culture alone can explain. We propose a representational-level specification as to where human and nonhuman animals’
abilities to approximate a PSS are similar and where they differ. We conclude by suggesting that recent symbolic-connectionist
models of cognition shed new light on the mechanisms that underlie the gap between human and nonhuman minds.
Keywords: analogy; animal cognition; causal learning; connectionism; Darwin; discontinuity; evolution; human mind; language;
language of thought; physical symbol system; reasoning; same-different; theory of mind
1. Introduction
Human animals, and no other, build fires and
wheels, diagnose each other’s illnesses, communicate
using symbols, navigate with maps, risk their lives for
ideals, collaborate with each other, explain the world
in terms of hypothetical causes, punish strangers for
breaking rules, imagine impossible scenarios, and
teach each other how to do all of the above. At first
blush, it might appear obvious that human minds are
qualitatively different from those of every other
animal on the planet. Ever since Darwin, however,
the dominant tendency in comparative cognitive
psychology has been to emphasize the continuity
between human and nonhuman minds and to downplay
the differences as “one of degree and not of kind”
(Darwin 1871). Particularly in the last quarter century,
many prominent comparative researchers have claimed
that the traditional hallmarks of human cognition (for
example, complex tool use, grammatically structured
language, causal-logical reasoning, mental state attribution,
metacognition, analogical inferences, mental time
travel, culture, and so on) are not nearly as unique as
we once thought (see, e.g., Bekoff et al. 2002; Call
2006; Clayton et al. 2003; de Waal & Tyack 2003;
Matsuzawa 2001; Pepperberg 2002; Rendell &
Whitehead 2001; Savage-Rumbaugh et al. 1998; Smith
et al. 2003; Tomasello et al. 2003a). Pepperberg (2005,
p. 469) aptly sums up the comparative consensus as
follows: “for over 35 years, researchers have been
demonstrating through tests both in the field and in the laboratory
that the capacities of nonhuman animals to solve
complex problems form a continuum with those of humans.”
Printed in the United States of America
doi: 10.1017/S0140525X08003543
© 2008 Cambridge University Press 0140-525X/08 $40.00
Of course, many scholars continue to claim that there is
something qualitatively different about at least some
human faculties, particularly those associated with language
and a representational theory of mind (see, e.g., Bermudez
2003; Carruthers 2002; Donald 2001; Mithen 1996;
Premack 2007; Suddendorf & Corballis 2007a). Nearly
everyone agrees that there is something uniquely human
about our ability to represent and reason about our own
and others’ mental states (e.g., Tomasello et al. 2005).
And most linguists and psycho-linguists argue that there is
a fundamental discontinuity between human and nonhu-
man forms of communication (e.g., Chomsky 1980;
Jackendoff 2002; Pinker 1994). But the trend among com-
parative researchers is to construe the uniquely human
aspect of these faculties in increasingly narrow terms.
Hauser et al. (2002a), for example, continue to claim that
grammatically structured languages are unique to the
human species, but suggest that the only component of
the human language faculty that is, in fact, uniquely
human is the computational mechanism of recursion. The
rest of our “conceptual-intentional” system, they argue,
differs from that of nonhuman animals only in “quantity
rather than kind” (Hauser et al. 2002a, p. 1573). Similarly,
Tomasello and Rakoczy (2003, p. 121) argue that the
ability to participate in cultural activities with shared goals
and intentions is uniquely human, but claim that the cogni-
tive skills of a human child born on a desert island and
somehow magically kept alive by itself until adulthood
“would not differ very much, perhaps a little, but not
very much” from the cognitive skills of other great apes
(see also Tomasello et al. 2003a; Tomasello et al. 2005).
Notwithstanding the broad comparative consensus
arrayed against us, the hypothesis we will be proposing
in the present paper is that Darwin was mistaken: The pro-
found biological continuity between human and nonhu-
man animals masks an equally profound functional
discontinuity between the human and nonhuman mind.
Indeed, we will argue that the functional discontinuity
between human and nonhuman minds pervades nearly
every domain of cognition, from reasoning about
spatial relations to deceiving conspecifics, and runs
much deeper than even the spectacular scaffolding pro-
vided by language or culture alone can explain.
At the same time, we know from Darwin’s more well-
grounded principles that there are no unbridgeable gaps
in evolution. Therefore, one of the most important
challenges confronting cognitive scientists of all stripes,
in our view, is to explain how the manifest functional
discontinuity between extant human and nonhuman
minds could have evolved in a biologically plausible manner.
The first and probably most important step in
answering this question is to clearly identify the simi-
larities and the dissimilarities between human and nonhu-
man cognition from a purely functional point of view. We
therefore spend the bulk of the paper reexamining the evi-
dence for “human-like” cognitive abilities among nonhu-
man animals at a functional level, before speculating as
to how these processes might be implemented. We cover
a wide variety of domains, species, and experimental pro-
tocols ranging from spatial relations and mental state
reasoning in the lab to dominance relations and transitive
inferences in the wild. Across all these disparate cases,
a consistent pattern emerges: Although there is a profound
similarity between human and nonhuman animals’ abil-
ities to learn about and act on the perceptual relations
between events, properties, and objects in the world,
only humans appear capable of reinterpreting the
higher-order relation between these perceptual relations
in a structurally systematic and inferentially productive
fashion. In particular, only humans form general cat-
egories based on structural rather than perceptual criteria,
find analogies between perceptually disparate relations,
draw inferences based on the hierarchical or logical
relation between relations, cognize the abstract functional
role played by constituents in a relation as distinct from
the constituents’ perceptual characteristics, or postulate
relations involving unobservable causes such as mental
states and hypothetical physical forces. There is not
simply a consistent absence of evidence for any of these
higher-order relational operations in nonhuman animals;
there is compelling evidence of an absence.
DEREK C. PENN received a Masters Degree from Boston
University in Philosophy and Literary Semiotics in 1987.
He spent the next 15 years on the trading floors of various
Wall Street investment firms and in Silicon Valley as a
software entrepreneur. In 2002 he retired from the
business world to pursue his life-long interest in comparative
psychology. He is currently affiliated with the Cognitive
Evolution Group, University of Louisiana, and the
University of California, Los Angeles and is working on
a trade book with Daniel J. Povinelli based on the hypotheses
proposed in the present article.
KEITH J. HOLYOAK is a Distinguished Professor of Psychology
at the University of California, Los Angeles. The
author of more than 180 research articles, his books
include Mental Leaps: Analogy in Creative Thought
(co-authored with Paul Thagard, MIT Press, 1995) and
The Cambridge Handbook of Thinking and Reasoning
(co-edited with Robert Morrison, Cambridge University
Press, 2005). A past recipient of a Guggenheim Fellowship,
Holyoak is a Fellow of the American Association
for the Advancement of Science, the Association for
Psychological Science, the Cognitive Science Society,
and the Society for Experimental Psychology.
DANIEL J. POVINELLI is a Professor of Biology at the
University of Louisiana. He is the recipient of an American
Psychological Association Award for an Early Career
Contribution to Psychology, a National Science Foundation
Young Investigator Award, and a Centennial Fellowship
from the James S. McDonnell Foundation. He
is a Fellow of the Association for Psychological Science
and was named one of “20 Scientists to Watch in the
Next 20 Years” by Discover magazine. Povinelli is also
Project Director for the National Chimpanzee Observatory
Working Group, a group of scientists, policy
makers, and concerned citizens dedicated to creating a
network of naturalistic observatories to prevent the imminent
extinction of chimpanzees in captivity and preserve
them for future behavioral and cognitive study.
In the last part of the article, we argue for the representational-level
implications of our analysis. Povinelli and
colleagues have previously proposed that humans alone
are able to “reinterpret” the world in terms of unobservable,
hypothetical entities such as mental states and causal
forces and that our ability to do so relies on a unique representational
system that has been grafted onto the cognitive
architecture we inherited from our nonhuman
ancestors (Povinelli 2000; 2004; Povinelli & Giambrone
2001; Povinelli & Preuss 1995; Povinelli & Vonk 2003;
2004; Vonk & Povinelli 2006). Independently, Holyoak,
Hummel, and colleagues have argued that the ability to
reason about higher-order relations in a structurally sys-
tematic and inferentially productive fashion is a defining
feature of the human mind and requires the distinctive
representational capabilities of a “biological symbol
system” (Holyoak & Hummel 2000; 2001; Hummel &
Holyoak 1997; 2001; 2003; Kroger et al. 2004; Robin &
Holyoak 1995). Herein we combine, revise, and substan-
tially expand on the hypotheses proposed by these two
research groups.
We argue that most of the salient functional discontinuities
between human and nonhuman minds (including
our species’ unique linguistic, mentalistic, cultural,
logical, and causal reasoning abilities) result in part
from the difference in degree to which human and nonhu-
man cognitive architectures are able to approximate the
higher-order, systematic, relational capabilities of a phys-
ical symbol system (Newell 1980; Newell & Simon
1976). Although human and nonhuman animals share
many similar cognitive mechanisms, our relational reinter-
pretation hypothesis (RR) is that only human animals
possess the representational processes necessary for sys-
tematically reinterpreting first-order perceptual relations
in terms of higher-order, role-governed relational struc-
tures akin to those found in a physical symbol system
(PSS). We conclude by suggesting that recent advances
in symbolic-connectionist models of cognition provide
one possible explanation for how our species’ unique
ability to approximate the higher-order relational capabili-
ties of a physical symbol system might have been grafted
onto the proto-symbolic cognitive architecture we inher-
ited from our nonhuman ancestors in a biologically plaus-
ible manner.
2. Similarity
2.1. Perceptual versus relational similarity
We begin our review of the similarities and differences
between human and nonhuman cognition with what
William James (1890/1950) called “the very keel and back-
bone of our thinking”: sameness. The ability to evaluate the
perceptual similarity between stimuli is clearly the sine
qua non of biological cognition, subserving nearly every
cognitive process from stimulus generalization and Pavlo-
vian conditioning to object recognition, categorization,
and inductive reasoning. Humans, however, are not
limited to evaluating the similarity between objects
based on perceptual regularities alone. Humans not only
recognize when two physical stimuli are perceptually
similar, they can also recognize that two ideas, two
mental states, two grammatical constructions, or two
causal-logical relations are similar as well. Even pre-
school-age children understand that the relation between
a bird and its nest is similar to the relation between a
dog and its doghouse despite the fact that there is little
“surface” or “object” similarity between the relations’ con-
stituents (Goswami & Brown 1989; 1990). Indeed, as
numerous researchers have shown, the propensity to
evaluate the similarity between states of affairs based on
the causal-logical and structural characteristics of the
underlying relations rather than on their shared perceptual
features appears quite early and spontaneously in
all normal humans, as early as 2 to 5 years of age, depending
on the domain and complexity of the task (Gentner
1977; Goswami 2001; Halford 1993; Holyoak et al. 1984;
Namy & Gentner 2002; Rattermann & Gentner 1998a;
Richland et al. 2006).
In short, there appear to be at least two kinds of simi-
larity judgments at work in human thought: judgments
of perceptual similarity based on the relation between
observed features of stimuli; and judgments of non-
perceptual relational similarity based on logical, func-
tional, and/or structural similarities between relations
and systematic correspondences between the abstract
roles that elements play in those relations (Gentner
1983; Gick & Holyoak 1980; 1983; Goswami 2001;
Markman & Gentner 2000). The question we are
interested in here is whether or not there is any evidence
for non-perceptual relational similarity judgments in
nonhuman animals as well.
2.2. Same-different relations
Among comparative researchers, the most widely repli-
cated test of relational concept learning over the last
quarter century has been the simultaneous same-different
(S/D) task, in which the subject is trained to respond one
way if two simultaneously presented stimuli are the same
and to respond a different way if the two stimuli are differ-
ent. In the purportedly more challenging relational match-
to-sample (RMTS) task, the subject must select the choice
display in which the perceptual similarity among elements
in the display is the same as the perceptual similarity
among elements in the sample stimulus. For example, pre-
sented with a pair of identical objects, AA, as a sample
stimulus, the subject should select BB rather than CD;
presented with a pair of dissimilar objects, EF, as the
sample stimulus, the subject should select GH rather
than JJ (see Thompson & Oden 2000 for a seminal review).
Although Premack (1983a; 1983b) initially reported that
only language-trained chimpanzees passed S/D and
RMTS tasks, success on two-item S/D tasks has since
been demonstrated in parrots (Pepperberg 1987), dol-
phins (Herman et al. 1993b; Mercado et al. 2000),
baboons (Bovet & Vauclair 2001), and pigeons (Blaisdell
& Cook 2005; Katz & Wright 2006), among others.
Thompson et al. (1997) showed that language-naive chim-
panzees with some exposure to token-based symbol
systems are able to pass a two-item RMTS task
(cf. Premack 1988). Vonk (2003) has reported that three
orangutans and one gorilla were able to pass a complex
two-item RMTS task without any explicit symbol or
language training at all. Fagot et al. (2001) have shown
that language-naive baboons can pass an RMTS task involving
arrays of elements (see discussion below); and Cook
and Wasserman (in press) have reported successful
results on an array-based RMTS task with pigeons. So
passing S/D and RMTS tasks does not appear to be
limited to language-trained apes or even primates.
Regardless of which nonhuman species are capable of
passing S/D and RMTS tasks, the more critical and
largely overlooked point is this: Both of these experimental
protocols lack the power, even in principle, of demonstrat-
ing that a subject cognizes sameness and difference as
abstract, relational concepts which are (1) independent
of any particular source of stimulus control, and (2) avail-
able to serve in a variety of further higher-order inferences
in a systematic fashion. A functional decomposition of the
S/D and RMTS protocols reveals that the minimum cog-
nitive capabilities necessary to pass these tests are much
more modest.
The fundamental problem is that the same-different
relation at stake in the classic S/D task can be reduced
to a continuous, analog estimate of the degree of percep-
tual variability between the elements in each display.
Halford et al. (1998a) refer to this type of cognitive trick
as “conceptual chunking.” Chunking reduces the complex-
ity of processing a relation at the cost of losing the original
structure and components of the relation itself, but suffices
when the task does not require the structure of the relation
itself to be taken into account. A cognizer could pass a
classic S/D task by calculating an analog estimate of the
variability between items in the sample display and then
employ a simple conditional discrimination to select the
appropriate behavioral response to this chunked result.
Hence, success on an S/D task may imply that a subject
can generalize a rule-like discrimination beyond any par-
ticular feature in the training stimuli; but it cannot be
taken as evidence that the subject has understood same-
ness and difference as structured relations that are
mutually exclusive or that can be freely generalized
beyond the modality-specific rule the subject used in a
particular learning context.
The same deflationary functional analysis applies,
mutatis mutandis, to the RMTS task. The apparent rela-
tional complexity of the RMTS task can be significantly
reduced by segmenting the task into separate chunked
operations that are evaluated sequentially. First, the
subject can evaluate the variability within the first-order
relations by chunking them into analog variables.
Second, the subject can employ a straightforward conditional
discrimination to select the appropriate choice
display: for example, “if the variability of the sample
display is low, select the choice display with a low
variability.” Although this may qualify as a “higher-order”
operation, it does not qualify as a higher-order relational
operation since the constituent structures of the
first-order relations are no longer relevant or available to
the higher-order process (see again, Halford et al.
1998a). At best, the RMTS task demonstrates that nonhu-
man animals can select the choice display that has the
same degree of between-item variability as the sample
display. But the task says nothing about nonhuman
animals’ ability to evaluate the non-perceptual relational
similarity between those relations.
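The two-step decomposition described above can be made concrete with a minimal sketch. The variability measure used here (the fraction of item pairs that differ) and the function names are illustrative assumptions, not a model proposed in the literature cited; the point is only that a chunked analog value plus a conditional discrimination suffices to pass the two-item RMTS trials described in section 2.2.

```python
from itertools import combinations


def variability(display):
    """Chunk a display into one analog number.

    Hypothetical measure: the fraction of item pairs that differ.
    The identity and arrangement of the items are discarded.
    """
    pairs = list(combinations(display, 2))
    return sum(a != b for a, b in pairs) / len(pairs)


def rmts_choice(sample, choices):
    """Conditional discrimination on the chunked values.

    Select the choice display whose variability is closest to the
    sample's; no relation between relations is ever represented.
    """
    target = variability(sample)
    return min(choices, key=lambda d: abs(variability(d) - target))


# Two-item trials from the text: AA -> BB (not CD); EF -> GH (not JJ).
same_pick = rmts_choice("AA", ["BB", "CD"])
diff_pick = rmts_choice("EF", ["GH", "JJ"])
print(same_pick, diff_pick)
```

Nothing in `rmts_choice` has access to the structure of the first-order relations; it compares two scalars, which is the sense in which the operation is higher-order but not relational.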
The preceding functional decomposition of the S/D
and RMTS tasks is not merely a hypothetical possibility.
There is now good experimental evidence that chunking
and segmentation are precisely the tactics that nonhuman
animals employ when they succeed at S/D and RMTS
tasks. Wasserman and colleagues, for example, have
shown that both pigeons and baboons have much less dif-
ficulty passing S/D tasks when there are 16 items in each
set than when there are only 2 items in each set (Wasser-
man et al. 2001; Young & Wasserman 1997). Wasserman
et al. showed that a simple measure of item variability,
based on Shannon and Weaver’s (1949) measure of infor-
mational entropy, nicely captures the functional pattern of
nonhuman subjects’ discriminations across a variety of
experimental conditions (reviewed in Wasserman et al.
2004). Nonhuman animals’ performance on S/D tasks
differs markedly from the categorical, logical distinction
that humans make between sameness and difference.
Human subjects’ responses to S/D tasks are also influ-
enced by the degree of variability in the stimuli (Castro
et al. 2007; Young & Wasserman 2001); but most human
subjects exhibit a categorical distinction between displays
with no item variability (i.e., same) and those with any item
variability at all (i.e., different).
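The contrast between a graded, entropy-driven response and a categorical one can be sketched as follows. The entropy function is Shannon's standard formula; the two response functions, the linear mapping from entropy to response probability, and the 4-bit ceiling are hypothetical simplifications for illustration, not the fitted models reported by Wasserman and colleagues.

```python
from collections import Counter
from math import log2


def entropy_bits(display):
    """Shannon entropy of the item types in a display, in bits."""
    counts = Counter(display)
    total = len(display)
    return sum(-(n / total) * log2(n / total) for n in counts.values())


def graded_response(display, max_bits=4.0):
    """Hypothetical graded responder: tendency to report "different"
    rises in proportion to entropy (assumed linear, capped at 1)."""
    return min(entropy_bits(display) / max_bits, 1.0)


def categorical_response(display):
    """Hypothetical categorical responder: any variability at all
    counts as "different"."""
    return 1.0 if len(set(display)) > 1 else 0.0


all_same = "A" * 16
one_odd = "A" * 15 + "B"
all_different = "ABCDEFGHIJKLMNOP"

for name, display in [("same", all_same), ("one odd", one_odd),
                      ("different", all_different)]:
    print(name, round(graded_response(display), 3),
          categorical_response(display))
```

On the display with a single odd item, the graded responder stays close to its "same" response while the categorical responder switches fully to "different"; this is the qualitative pattern the paragraph above attributes to nonhuman and human subjects, respectively.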
An analogous discontinuity between human and nonhu-
man judgments of similarity has also been documented on
RMTS tasks. Fagot et al. (2001) presented two adult
baboons and two adult human subjects with an RMTS
task using arrays of 16 visual icons that were either all
alike or all different. Both baboon and human subjects
learned to pass the RMTS test and successfully general-
ized to novel sets of stimuli. When the authors reduced
the number of items in the sample set from 16 to 2
icons, the difference between the two species, however,
was notable. The impact on the human subjects’ responses
was insignificant. The baboons’ performance, however, fell
to chance on different trials, whereas their performance on
same trials remained unchanged. This markedly asym-
metric effect is exactly what one would expect if the
baboons were discriminating between second-order
same and different relations by comparing the amount of
variability (e.g., entropy) in the two displays. That is,
same trials with 2 icons continue to yield zero entropy,
but different trials now yield a small entropy value that is
more difficult to discriminate from zero.
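The asymmetry can be checked with a short worked calculation, assuming (as a simplification) that display variability is measured by Shannon's entropy over item types, where $p_i$ is the proportion of items of type $i$ in a display:

```latex
\[
H = -\sum_{i} p_i \log_2 p_i
\]
\begin{align*}
\text{same, 16 or 2 icons:}\quad & H = -(1)\log_2 1 = 0 \text{ bits} \\
\text{different, 16 icons:}\quad & H = -16 \cdot \tfrac{1}{16}\log_2 \tfrac{1}{16} = 4 \text{ bits} \\
\text{different, 2 icons:}\quad & H = -2 \cdot \tfrac{1}{2}\log_2 \tfrac{1}{2} = 1 \text{ bit}
\end{align*}
```

Under this assumption, reducing the arrays from 16 to 2 icons leaves same displays at 0 bits but shrinks the gap separating different displays from same displays from 4 bits to 1 bit, which is consistent with a selective drop in performance on different trials.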
Entropy is certainly not the only factor modulating
nonhuman subjects’ judgments of sameness and differ-
ence. Stimulus oddity as well as spatial organization and
degree of similarity also play an important role (see
Cook & Wasserman 2006 for an important review).
Vonk (2003) has shown that language-naive apes can
judge variability along specific perceptual dimensions
(e.g., color rather than size or shape). And Bovet and
Vauclair (2001) have shown that baboons can pass a “con-
ceptual” S/D task in which pairs of objects are to be
treated as same if they share a similar learning history
or biological significance (e.g., objects-I-have-eaten vs.
objects-I-have-not-eaten). These results demonstrate
that nonhuman animals (and not just language-trained
chimpanzees) are capable of learning novel, sophisticated,
rule-governed discriminations that generalize
beyond any specific perceptual cue. But in all of the
results reported to date, the relevant discriminations
are bound to a particular source of stimulus control
(e.g., entropy, oddity, edibility). There is no evidence
that nonhuman animals understand what “sameness” in
one task has in common with “sameness” in another.
For example, after passing a “perceptual” S/D task and
having been trained to categorize objects as either
“food” or “not food,” the baboons in Bovet and Vauclair’s
(2001) study nevertheless required an average of 14,576
additional trials on the “conceptual” S/D task before
their responses were correct 80% of the time on trials
involving novel pairs of objects.
The available evidence therefore suggests that the for-
mative discontinuity in same-different reasoning lies not
between monkeys and apes, as Thompson and Oden
(2000) proposed, but between nonhumans and humans.
Chimpanzees and other nonhuman apes can pass
RMTS tasks with only 2 items in the sample display
(e.g., Thompson et al. 1997; Vonk 2003). Baboons can
pass RMTS tasks with as few as 3 to 4 items in each
sample (Fagot et al. 2001); and pigeons can pass RMTS
tasks with 16 items in each sample (Cook & Wasserman,
in press). The difference between the performance of
language-naive pigeons and language-trained chimps on
these tasks often comes down to a question of the
number of items in each set and the number of trials
necessary to reach criterion. As Katz and colleagues
point out (see Katz & Wright 2006; Katz et al. 2002),
this strongly suggests that there is a difference in degree
between various nonhuman species’ sensitivity to simi-
larity discriminations (influenced by training regimen),
not a difference in kind between their conceptual abilities
to predicate same-different relations.
The performance of human subjects, on the other hand,
contrasts sharply with the performance of all other animal
species. Humans manifest an abrupt, categorical distinc-
tion between displays in which there is no variability and
displays in which there is any variability at all (Cook &
Wasserman 2006; Wasserman et al. 2004). More impor-
tantly, contra Castro et al. (2007), we believe that human
subjects possess a qualitatively distinct system for reinter-
preting sameness and difference in a logical and abstract
fashion that generalizes beyond any particular source of
stimulus control. In short, even with respect to the most
basic and ubiquitous of all cognitive phenomena, judgments
of similarity, there is already a distinctive seam
between human and nonhuman minds.
2.3. Analogical relations
Premack (1983a, p. 357) suggested that the RMTS task is
an implicit form of analogy and claimed that “animals that
can make same/different judgments should be able to do
analogies.” Indeed, it is still widely accepted that the
ability to pass an RMTS task is the “cognitive primitive”
for analogical reasoning (see, e.g., Thompson & Oden
2000, p. 378). We disagree. While recognizing perceptual
similarities is certainly a necessary condition for making
analogical inferences (inter alia), there is a qualitative
difference between the kind of cognitive processes necess-
ary to pass an S/D or RMTS task and the kind of cognitive
processes necessary to reason in an analogical fashion. The
relations at issue in S/D and RMTS tasks are based solely
on the perceptual features of the constituents; and the
constituents play undifferentiated and symmetrical roles
in those relations (e.g., two objects are symmetrically
either the same or different).
Most true analogies, on the other hand, are based on
relations in which the constituents play asymmetrical,
causal-logical roles (e.g., the role that John plays in
forming the relation, John loves Mary, is not equivalent
to the role that Mary plays, perhaps to John’s dismay).
Furthermore, genuine analogical inferences are made by
finding systematic structural similarities between
perceptually disparate relations, allowing the cognizer to
draw novel inferences about the target domain indepen-
dently from the perceptual similarity between the
relations’ constituents (Gentner 1983; Gentner &
Markman 1997; Holyoak & Thagard 1995). Accordingly,
analogical relations sensu stricto cannot be reduced via
chunking and segmentation, but require the cognizer to
evaluate the abstract, higher-order relations at stake in a
structurally systematic and inferentially productive fashion.
Analogical reasoning is a fundamental and ubiquitous
aspect of human thought. It is at the core of creative
problem solving, scientific heuristics, causal reasoning,
and poetic metaphor (Gentner 2003; Gentner et al.
2001; Holyoak & Thagard 1995; 1997; Lien & Cheng
2000). And it is also central to the more prosaic ways
that typical human children learn about the world and
each other (Goswami 1992; 2001; Halford 1993; Holyoak
et al. 1984). To date, however, the only evidence that
any nonhuman animal is capable of analogical reasoning
sensu stricto comes from the unreplicated feats of a
single chimpanzee, Sarah, reported more than 25 years
ago by Gillan et al. (1981). Sarah reportedly constructed
and completed two distinct kinds of analogies. The first
was based on judging whether or not two geometric
relationships were the same or different (e.g., large blue
triangle is to small blue triangle as large yellow crescent
is to small yellow crescent). The second was based on
judging the similarity between two “functional” relation-
ships (e.g., padlock is to key as tin can is to can opener).
Gillan et al. (1981) reported that Sarah was successful on
both tests.
Savage-Rumbaugh was the first to point out that Sarah’s
performance on the geometric version of the original tests
could have been the result of a simple, feature-matching
heuristic (cited by Oden et al. 2001). In response, Oden
et al. (2001) followed up Gillan et al.’s original experiment
on geometric analogies with a series of more carefully con-
structed tests designed to flesh out Sarah’s actual cognitive
strategy. These new experiments used geometric forms
that varied along one or more featural dimensions (e.g.,
size, color, shape, and/or fill). After extensive testing,
Oden et al. showed that Sarah was actually tracking the
number of within-pair featural differences rather than
the kind of relation between pairs of figures. For
example, whereas a human would see a color plus a
shape change as differing from a size plus a fill change,
Sarah saw these two transformations as equivalent
because they both entailed two featural changes.
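The difference between the two strategies can be illustrated with a small sketch. The feature names follow the dimensions listed above; the figure encoding, the function names, and the "same set of changed features" criterion for the relational strategy are illustrative assumptions, and the latter is deliberately a simplification of what a human analogical judgment involves.

```python
FEATURES = ("size", "color", "shape", "fill")


def changed_features(first, second):
    """Return the set of feature names on which two figures differ."""
    return {f for f in FEATURES if first[f] != second[f]}


def same_by_count(pair_a, pair_b):
    """Feature-count heuristic: two pairs match whenever the NUMBER
    of within-pair featural changes is equal."""
    return len(changed_features(*pair_a)) == len(changed_features(*pair_b))


def same_by_kind(pair_a, pair_b):
    """Simplified relational criterion: two pairs match only when the
    same KIND of transformation (same changed features) applies."""
    return changed_features(*pair_a) == changed_features(*pair_b)


base = {"size": "large", "color": "blue", "shape": "triangle", "fill": "solid"}

# Pair 1: color plus shape change. Pair 2: size plus fill change.
color_shape = (base, {**base, "color": "yellow", "shape": "crescent"})
size_fill = (base, {**base, "size": "small", "fill": "hollow"})

count_verdict = same_by_count(color_shape, size_fill)
kind_verdict = same_by_kind(color_shape, size_fill)
print(count_verdict, kind_verdict)
```

The count-based function treats the two transformations as equivalent because each involves two changes, whereas the kind-based function distinguishes them; only the former is needed to reproduce the pattern Oden et al. describe.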
Oden et al. (2001) argued that this strategy still demon-
strates Sarah’s ability to reason about the “relation
between relations.” But there is a profound difference
between the feature-based heuristic Sarah apparently
adopted and the role-based structural operations that are
the basis of analogical inference sensu stricto. To be
sure, keeping track of the number of within-pair featural
changes certainly requires quite sophisticated represen-
tational processes. But the fact that Sarah apparently
ignored the structure of the relation between pairs of
figures suggests that she represented any featural change
as an undifferentiated chunk for the purposes of this
task. Therefore, her strategy on this task appears to be
computationally equivalent to the kind of chunking and
segmentation strategies other nonhuman primates use
to solve RMTS tasks. According to Oden et al.’s (2001)
own analysis, Sarah failed to demonstrate a systematic sen-
sitivity to the higher-order structural relation between
relations. It is this systematic sensitivity to higher-order
structural relations which is, as Gentner (1983) has long
argued, the hallmark of analogical reasoning in humans.
Therefore, the claim that nonhuman animals are
capable of analogical inferences rests solely on Sarah’s per-
formance in the test of functional analogies reported by
Gillan et al. (1981). There are many reasons to be skeptical
of these results as well. For one, Sarah’s performance on
these analogies has never been replicated either by
Sarah herself or by any other nonhuman subject.
Second, of the two experiments (3A and 3B) devoted to
functional analogies, the authors themselves admit that
the first, 3A, is open to an alternative feature-based
account. Furthermore, the second experiment, 3B, did
not require Sarah to complete or construct analogies. It
merely required her to respond to the relation between
two pairs of objects with one of two plastic tokens that
her experimenters interpreted as meaning same and differ-
ent. Sarah’s extensive prior exposure to the objects used in
this experiment, however, makes it very difficult to judge
how she learned to cognize the relation between these
objects (e.g., how exactly did Sarah understand that the
relation between “torn cloth” and “needle and thread” is
the same as the relation between “marked, torn paper”
and “tape”?). Indeed, the authors themselves admit that
Sarah’s “unique experimental history” may have contribu-
ted to her success on these tasks (Gillan et al. 1981, p. 11).
In short, what is sorely needed is a more extensive series
of tests, like those carried out by Oden et al. (2001), to sys-
tematically tease apart the salient parameters in Sarah’s
cognitive strategy. Until then, Sarah’s remarkable and
unreplicated success on experiment 3B as reported by
Gillan et al. (1981) constitutes thin support for claiming
that nonhuman animals are capable of analogical inference.
3. Rules
One of the hallmarks of human cognition is our ability to
freely generalize abstract relational operations to novel
cases beyond the scope in which the relation was originally
learned (see Marcus 2001 for a lucid exposition). It is widely
recognized, for example, that the ability to freely generalize
relational operations over role-based variables is a necessary
condition for using human languages (Gomez & Gerken
2000). Furthermore, experiments in artificial grammar
learning (AGL) have shown that human subjects’ ability
to learn and generalize abstract relations over role-based
abstractions is not limited to natural languages (e.g.,
Altmann et al. 1995; Gomez 1997; Marcus et al. 1999;
Reber 1967). Although it is quite controversial how the
human cognitive architecture performs these rule-like
feats (see, e.g., Marcus 1999; McClelland & Plaut 1999; Sei-
denberg & Elman 1999), the fact that human subjects
manifest these rule-like generalizations is “undisputed”
(Perruchet & Pacton 2006). The question we want to
focus on here is whether or not this undisputable behavioral
“fact” also holds for nonhuman animals.
To date, the strongest positive evidence that nonhuman
animals are able to generalize novel rules in a systematic
fashion comes from an experiment with tamarin
monkeys (Hauser et al. 2002b), which replicated an AGL
experiment that Marcus et al. (1999) had previously per-
formed on 7-month-old children. In this “ga ti ga” proto-
col, subjects were habituated to sequences of nonsense
syllables in one of two patterns (e.g., AAB vs. ABB). Fol-
lowing habituation, the subjects were presented with test
sequences drawn from an entirely novel set of syllables.
Some of the test sequences followed the grammatical
pattern presented during habituation and some did not.
Hauser et al. (2002b) showed that tamarin monkeys, like
human children, were more likely to dishabituate to the
novel, “ungrammatical” pattern.
In our view, the claim that this experiment provides
evidence for “rule learning” in a nonhuman species is
not entirely unfounded; but it needs to be carefully qua-
lified, as the kind of rules that tamarin monkeys learned
in this experiment is qualitatively different from the
kind of rules that is characteristic of human language
and thought. Many early AGL experiments failed to dis-
tinguish between tasks that required subjects to learn
perceptually bound relations from tasks that required
subjects to learn non-perceptual structural relations
over role-based variables (for a critical review, see
Redington & Chater 1996). Tunney and Altmann
(1999), for example, point out that there are at least
two forms of sequential dependencies that might be
learned in an AGL experiment: “repeating” dependen-
cies in which the occurrence of an element in one pos-
ition determines the occurrence of the same element in
a subsequent position, and “nonrepeating” dependencies
in which the occurrence of an element in one position
determines the occurrence of a different element in a
subsequent position. Repeating elements share a
higher-order perceptual regularity (i.e., perceptual simi-
larity), whereas purely structural dependencies between
non-repeating elements do not. Therefore, sensitivity to
sequential dependencies between repeating elements
does not necessarily imply sensitivity to sequential
dependencies between nonrepeating elements. Indeed,
Tunney and Altmann (2001) demonstrate that adult
human subjects appear to have distinct and dissociable
mechanisms for learning each kind of dependency. At
best, Hauser et al.’s (2002b) results demonstrate that
tamarin monkeys possess the ability to learn repeating,
perceptually based dependencies.
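A minimal sketch may help show why repeating dependencies can be learned from perceptual similarity alone. The detector below is a hypothetical illustration, not the procedure of Hauser et al. (2002b) or Tunney and Altmann; the syllables and function names are invented. It records only whether adjacent items are identical, which is enough to generalize AAB versus ABB to novel syllables but gives no purchase on dependencies between non-repeating elements.

```python
def repetition_signature(sequence):
    """Mark, for each adjacent pair of items, whether the second repeats the first."""
    return tuple(a == b for a, b in zip(sequence, sequence[1:]))

# Habituation on an AAB pattern (hypothetical syllables).
habituation = [("ga", "ga", "ti"), ("li", "li", "na")]
learned = {repetition_signature(seq) for seq in habituation}

def is_familiar(sequence):
    """A test sequence counts as 'grammatical' if its repetition pattern was seen."""
    return repetition_signature(sequence) in learned

print(is_familiar(("wo", "wo", "fe")))  # True: novel syllables, same AAB pattern
print(is_familiar(("wo", "fe", "fe")))  # False: ABB pattern

# Two sequences that differ only in a non-repeating dependency look identical
# to this detector, because neither contains any repetition.
print(repetition_signature(("pel", "wa", "rud")) ==
      repetition_signature(("pel", "wa", "jic")))  # True
```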
Similarly, Gomez and Gerken (2000) distinguish
between “pattern-based” and “category-based” rules. In
the former case, the rule is abstracted from the sequence
of perceptual relations between elements in a given array
of training stimuli; in the latter case, the rule is based on
the structural relation between abstract functional roles.
The AAB and ABB patterns learned by tamarin monkeys
in Hauser et al.’s (2002b) study are an example of the
former, pattern-based type of rule; the noun-verb-noun
pattern learned by human language users is an example
of the latter, role-based type of rule. Both kinds of oper-
ations may qualify as “rule-like” in the sense that they gen-
eralize a given relation beyond the feature set on which it
was originally trained. But it is role-based (i.e., “algebraic”)
rules, as Marcus (2001) points out, that are the hallmarks
of human thought and language. To date, there is no evidence
for this kind of rule learning in any nonhuman animal.
4. Higher-order spatial relations
All normal adult humans are capable of using allocentric
representations of spatial relations and of reasoning
about the higher-order relation between spatial relations
at different scales. The ubiquity of maps, diagrams,
graphs, gestures, and artificial spatial representations of
all sorts in human culture speaks for itself. Indeed, by
the age of 3, all normal humans are able to reason about
the higher-order relation between small-scale artificial
spatial models and large-scale spatial relations in the real
world (see Gattis 2005 for a review). DeLoache (2004)
has argued that this ability represents a crucial step in chil-
dren’s progress towards becoming “symbol minded.” The
question at hand is whether there is any evidence that non-
human animals can reason about the higher-order relation
between spatial relations in a similar fashion.
The best evidence to date for higher-order spatial
reasoning in a nonhuman animal comes from the work
of Kuhlmeier and colleagues (Kuhlmeier & Boysen
2001; 2002; Kuhlmeier et al. 1999). Kuhlmeier et al.
(1999) first instructed seven captive chimpanzees to
associate the miniature and the full-sized versions of
four distinct objects by drawing their attention to the
association “verbally and gesturally” (p. 397). After this
initial training, the chimpanzees watched as the exper-
imenter hid a miniature can of soda behind a miniature
version of one of the four objects within a 1:7 scale
model of a full-sized room or outdoor enclosure. Then
the chimpanzees were given the opportunity to find the
real can of soda in the adjacent full-sized space. When
the chimpanzees were tested on a version of the task in
which they were rewarded only if they retrieved the can
of soda on the first search attempt (Kuhlmeier & Boysen
2001), six out of the seven subjects performed above chance.
These results demonstrate that chimpanzees are able to
learn to associate two objects (the real object and its min-
iature) that are highly similar perceptually and to locate a
reward based on this association. But this is a far cry from
being able to reason about the higher-order relation
between a scale model and its real-world referent.
Indeed, Kuhlmeier et al. (1999, p. 397) reported that
one chimpanzee was able to locate the food rewards
simply upon being shown the miniature version of the
hiding place without referring to the scale model at all.
In short, this first protocol did not require the chimpan-
zees to reason about the higher-order spatial relation
between the scale model and full-sized room. A simple,
learned association between two arbitrary cues sufficed.
In a follow-up experiment designed to eliminate purely
associative cues, Kuhlmeier and Boysen (2002) varied the
congruency of the color, shape, or position of the minia-
tures relative to the full-sized version of the hiding site.
As a group, the chimpanzees were successful when pos-
itional cues were absent. However, when all the hiding
sites were visually identical and the correct one had to
be found based on its relative location within the scale
model alone, only two of the seven chimpanzees per-
formed above chance.
It is clear from these results that reasoning in terms of
relative spatial locations alone is significantly more difficult
for chimpanzees than is reasoning in terms of object-based
cues alone. But it must be noted that even the successful
performance of two out of the seven subjects does not
demonstrate higher-order relational abilities, since the
four locations in which the hiding sites were placed
remained constant across all of these experiments (Kuhl-
meier, personal communication). Hence, it is impossible
to know whether the two successful chimpanzees were
reasoning on the basis of a general, systematic under-
standing of the analogy between spatial locations in the
scale model and spatial locations in the outdoor enclosure,
or whether, more modestly, they had simply learned over
the course of their long experimental history with this par-
ticular protocol to associate a particular location in the
scale model with a particular location in the enclosure.
It remains to be seen whether chimpanzees, or any
other nonhuman animal, could succeed in this protocol
if the hiding sites were randomly relocated on each trial.
In the meantime, there is a conspicuous absence of evi-
dence that any nonhuman animal can reason about scale
models, maps, or higher-order spatial relations in a
human-like fashion.
5. Transitive inference
Ever since Piaget (1928; 1955), the ability to make sys-
tematic inferences about unobserved transitive relations
has been taken as a litmus test of logical-relational reason-
ing (but see Wright 2001). For example, told that “Bill is
taller than Charles” and “Abe is taller than Bill,” human
children can infer that “Abe is taller than Charles”
without being given any information about the absolute
heights of Abe, Bill, or Charles (Halford 1984). Over the
last quarter century, comparative researchers have persist-
ently claimed that nonhuman animals are capable of
making transitive inferences in a purely logical-relational
fashion, as well. Upon closer examination of the evidence,
however, it becomes apparent that the kinds of transitive
inferences that are made by nonhuman animals do not
require a systematic, domain-general logical-relational
competence, but rather, can be made using much more
prosaic, domain-specific, and egocentric information-
processing mechanisms.
5.1. Transitive choices in the lab
For many decades now, the classic comparative test of
transitive inference has been a nonverbal five-item task
developed by Bryant and Trabasso (1971) in which sub-
jects are incrementally trained on pairs of stimuli (i.e.,
A+B-, B+C-, C+D-, D+E-) and then tested on non-adjacent
untrained pairs. The discriminative relation
between the stimuli used in most of these studies is not,
in fact, transitive; it is the subjects’ choices that become
transitive as a result of the pattern of differential reinforce-
ment: that is, repeated reinforcement of the choice of A
over B and of B over C eventually leads to the subject pre-
ferring A over C. As Halford et al. (1998b) pointed out, a
subject’s preferences can become transitive through incre-
mental reinforcement without there being a transitive
relation between the underlying task elements themselves,
and therefore without requiring the subject to understand
anything about transitivity as a logical property. Indeed,
many researchers have shown that successfully selecting
B over D in the traditional five-item incremental protocol
can be achieved using purely associative operations (De
Lillo et al. 2001; Wynne 1995).
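To illustrate how choice transitivity can emerge from reinforcement alone, consider the following minimal sketch. It assumes a hypothetical error-driven value learner with a logistic choice rule; it is not the specific model of De Lillo et al. (2001) or Wynne (1995), and the learning rate and number of epochs are arbitrary. The learner is trained only on the adjacent pairs and stores nothing but one scalar value per stimulus.

```python
import math

def choice_probability(values, x, y):
    """Probability of choosing x over y, given the current associative values."""
    return 1.0 / (1.0 + math.exp(-(values[x] - values[y])))

def train(pairs, epochs=200, rate=0.1):
    """Error-driven updating on reinforced (winner, loser) pairs."""
    values = {stimulus: 0.0 for pair in pairs for stimulus in pair}
    for _ in range(epochs):
        for winner, loser in pairs:
            # Prediction error: how unexpected the reinforced choice still is.
            error = 1.0 - choice_probability(values, winner, loser)
            values[winner] += rate * error
            values[loser] -= rate * error
    return values

# Adjacent training pairs only: A+B-, B+C-, C+D-, D+E-.
adjacent_pairs = [("A", "B"), ("B", "C"), ("C", "D"), ("D", "E")]
values = train(adjacent_pairs)

# The untrained BD pair: the learner prefers B without representing transitivity.
print(choice_probability(values, "B", "D") > 0.5)
```

The preference for B over D arises here because the end items A and E anchor the value scale and their influence propagates inward through the shared middle items, not because the learner encodes any logical relation among the stimuli.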
To be sure, reinforcement history cannot be the whole
story, as Lazareva et al. (2004) have recently demon-
strated. Lazareva et al. (2004) trained eight hooded
crows in a clever variation on Bryant and Trabasso’s five-
item protocol. Five colored cards were used to represent
the elements in the series, A through E. The color on
one side of the card served as the choice stimulus, and a
circle of the same color on the underside of the card
served as the post-choice feedback stimulus. The crows
were asked to choose one of two simultaneously presented
cards. Importantly, the colored circles on the underside of
the cards were displayed to the crows only after they had
selected one of the two choice stimuli. The crows were
divided into two experimental groups. In the ordered-
feedback group, the diameter of the circles associated
with the choice stimuli became progressively smaller
from A to E. In the constant-feedback group, the diameter
of the feedback circles did not change. After initial train-
ing, Lazareva et al. (2004) overexposed both groups of
crows to D+E- pairings. Under traditional associative
models, massive overexposure to D+E- pairings should
lead to preferentially selecting D over B. Nevertheless,
the crows in the ordered-feedback group selected B over
D in the BD pairings, whereas the crows in the con-
stant-feedback group either chose at random or preferred
D over B.
Lazareva et al.’s (2004) results show that reinforcement
history alone cannot account for the emergence of choice
transitivity among nonhuman animals. Moreover, we
agree with Lazareva et al. (2004) that these results are con-
sistent with some kind of “spatial representation” hypoth-
esis (Gillan 1981). But what is not often noted by
comparative researchers is that evidence for an integrated
representation of an ordered series is not in and of itself
evidence for transitive reasoning or relational integration
in a logical-deductive sense. There is more to making logi-
cally underpinned transitive inferences than constructing
an ordered representation of one’s choices.
As Lazareva et al. (2004) themselves point out, in order
to claim evidence for logically underpinned transitive
inferences, one must show that the organism can, in fact,
distinguish between transitive and non-transitive relations
and that it makes its choices on the basis of this logical
relation independently of other non-logical factors such
as reinforcement history and training regime (see also
Halford et al. 1998a; Wright 2001). The results reported
by Lazareva et al. (2004) do not provide evidence for
either of these criteria.
In a follow-up experiment, Lazareva and Wasserman
(2006) showed that pigeons select B over D stimuli in
the same protocol employed by Lazareva et al. (2004)
even when the size of the post-choice cues is constant,
which demonstrates that the transitive perceptual relation
between the post-choice cues is not, in fact, computationally
necessary for successfully passing this particular protocol.
It is unclear why crows, but not pigeons, were
unable to pass the test in the constant-feedback condition.
There are many possible explanations. For example,
Lazareva et al. (2004) did not rule out the possibility that
it was simply the variability between post-choice cues
that encouraged the crows’ successful responses rather
than their transitivity per se. In any case, in order to
warrant the claim that the crows were reasoning on the
basis of the logical relation between post-choice stimuli
independently of other non-logical factors, it would be
necessary to show that the crows could systematically gen-
eralize to novel stimuli on a first trial basis: For example,
trained to associate a novel choice stimulus, X, with a
colored circle of a given diameter, could the crows cor-
rectly choose between X and any stimulus from the set,
A through E, on a first-trial basis in a systematic
manner? To date, there is no evidence that crows, or any
other nonhuman animal, could pass such a test.
5.2. Transitive inferences in the wild
Many researchers have argued that animals’ full transi-
tive reasoning capabilities are most likely to manifest
themselves in inferences involving social relations (e.g.,
Bond et al. 2003; Grosenick et al. 2007; Kamil 2004;
Paz et al. 2004). Much of the early fieldwork focused on
nonhuman primates (see Tomasello & Call 1997 for a
review). The strongest evidence to date for transitive
social inferences in a nonhuman animal comes not from
primates, however, but from birds (see review by Kamil
2004) and fish (see Grosenick et al. 2007). Paz et al.
(2004), for example, showed that male pinyon jays can
anticipate their own subordinance relation to a stranger
after having witnessed the stranger win a series of con-
frontations with a familiar but dominant conspecific.
Similarly, Grosenick et al. (2007) allowed territorial A.
burtoni male fish to observe pairwise fights between
five rivals (i.e., AB, BC, CD, DE), with the outcomes
implying a dominance ordering of A > B > C > D > E.
When subsequently given a choice between B and
D, observers preferred to spend more time adjacent to
D rather than B.
Results such as these demonstrate that the ability to keep
track of the dominance relations between tertiary dyads is
not limited to nonhuman primates or even to mammals
(cf. Tomasello & Call 1997). Furthermore, fish and birds,
in addition to nonhuman primates, can apparently use
this information to make rational (i.e., ecologically adaptive)
choices about how to respond to potential rivals (see also
Bergman et al. 2003; Bond et al. 2003; Hogue et al. 1996;
Silk 1999). The accumulated evidence therefore rules out
a traditional associative explanation and strongly supports
a more complex, information-processing account of how
nonhuman animals keep track of and respond to dominance
relations among conspecifics.
But none of the available comparative evidence suggests
that nonhuman animals are able to process transitive infer-
ences in a systematic or logical fashion, even in the social
domain. The experiments reported by Paz et al. (2004) and
Grosenick et al. (2007) provide evidence for only one par-
ticular kind of transitive inference: an inference from
watching a series of agonistic interactions between conspe-
cifics to an egocentric prediction about how to respond to
a potentially dominant rival. Neither experiment provides
any evidence that these subjects would also be able to
systematically predict the relation between unobserved
third-party dyads or could use their own interactions with
a conspecific to predict that conspecific’s relation to other
rivals, let alone answer the kind of omni-directional
queries of which humans are manifestly capable: For
example, what individuals are dominant to B? What is the
relation between C and A? Is A dominant to C to a greater
or lesser extent than B is dominant to C? (Goodwin &
Johnson-Laird 2005; Halford et al. 1998a).
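For contrast, the sketch below shows what a systematic representation of a transitive relation would permit. It is purely illustrative, not a model of any organism's cognition; the function names are hypothetical, and the observed pairs simply reuse the A through E ordering from the fish experiment. Once the observed dyads are closed under transitivity, queries can be answered in any direction, including about third-party dyads that were never observed.

```python
def transitive_closure(observed):
    """Add every (a, c) implied by some (a, b) and (b, c) until nothing new appears."""
    closure = set(observed)
    changed = True
    while changed:
        changed = False
        for a, b in list(closure):
            for c, d in list(closure):
                if b == c and (a, d) not in closure:
                    closure.add((a, d))
                    changed = True
    return closure

# Observed dyads, each read as (dominant, subordinate).
observed = {("A", "B"), ("B", "C"), ("C", "D"), ("D", "E")}
dominates = transitive_closure(observed)

def dominant_to(individual):
    """Which individuals are dominant to the given one?"""
    return sorted(a for a, b in dominates if b == individual)

def relation(x, y):
    """Describe the relation between any two individuals, observed or not."""
    if (x, y) in dominates:
        return f"{x} dominates {y}"
    if (y, x) in dominates:
        return f"{y} dominates {x}"
    return "unknown"

print(dominant_to("B"))    # ['A']
print(relation("C", "A"))  # A dominates C
```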
In short, whereas at least some nonhuman animals
clearly are able to make transitive inferences about their
own relation to potential rivals to a degree that rules out
purely associative learning mechanisms, the comparative
evidence accumulated to date is nevertheless consistent
with the hypothesis that nonhuman animals’ under-
standing of transitive relations is punctate, egocentric,
non-logical, and context-specific.
6. Hierarchical relations
Being able to process recursive operations over hierarchi-
cal relations is unarguably a key prerequisite for using a
human language (Hauser et al. 2002a). And most normal
human children are capable of reasoning about hierarchi-
cal class relations in a systematic and combinatorial fashion
by the age of five (Andrews & Halford 2002; cf. Inhelder &
Piaget 1964). Given the ubiquity and importance of hier-
archical relations in human thought, the lack of any
similar ability in nonhuman animals would therefore con-
stitute a marked discontinuity between human and nonhu-
man minds.
6.1. Seriated cups and hierarchical reasoning
A number of comparative researchers have reinterpreted
the behavior of nonhuman animals in hierarchical terms
(e.g., Byrne & Russon 1998; Greenfield 1991; Matsuzawa
1996). In each of these cases, however, there is no evi-
dence that the nonhuman animals themselves cognized
the task in hierarchical terms or employed hierarchically
structured mental representations to do so. The most
widely cited case of hierarchical reasoning among nonhu-
man animals, for example, has come from experiments
involving seriated cups. It has been claimed that “subas-
sembly” (i.e., combining two or more cups as a subunit
with one or more other cups) requires the subject to rep-
resent these nested relations in a combinatorial and
“reversible” fashion (Greenfield 1991; Westergaard &
Suomi 1994). Indeed, Greenfield (1991) argued that chil-
dren’s ability to nest cups develops in parallel with their
ability to employ hierarchical phonological and grammati-
cal constructions, and therefore, that the ability of nonhu-
man primates to seriate cups is the precursor to
comprehending hierarchical grammars (see Matsuzawa
1996 for claims of a similar “isomorphism” between tool
and symbol use).
But is it actually necessary to cognize hierarchically
structured relations in order to assemble nested cups?
To date, Johnson-Pynn, Fragaszy, and colleagues have
provided the most convincing evidence that a nonhuman
animal can use subassembly to assemble seriated cups
(Fragaszy et al. 2002; Johnson-Pynn & Fragaszy 2001;
Johnson-Pynn et al. 1999). Yet, Johnson-Pynn and Fra-
gaszy themselves dispute the claim that this behavior
requires hierarchical relational operations of the kind
suggested by Greenfield (1991).
Fragaszy et al. (2002), for example, presented seriated
cups to adult capuchin monkeys, chimpanzees, and 11-,
16-, and 21-month-old children. Children of all three
ages created five-cup sets less consistently than the nonhu-
man subjects did, and they were rarely able to place a sixth
cup into a seriated set. Bizarrely, at least for a purely rela-
tional interpretation of the results, monkeys were more
successful than either apes or human children on the
more challenging six-cup trials, yet were also the most inefficient
(in terms of number of moves) of the three species.
Fragaszy et al.’s (2002) explanation for these anomalous
results is quite sensible (see also Fragaszy & Cummins-
Sebree 2005): They hypothesize that the seriation task
does not, in fact, require the subject to reason about com-
binatorial, hierarchical relations per se, but depends more
simply on situated, embodied sensory-motor skills that are
experientially, rather than conceptually, driven. Apes and
monkeys do better than children because they are more
physically adept than 11- to 21-month-old children
are, not because they have a more sophisticated representation
of the combinatorial and hierarchical relations
involved. Although subassembly may be a more physically
“complex” strategy than other methods of seriation, it does
not necessarily require the subject to cognize the spatial-
physical relations involved as hierarchical; and therefore
there is no reason to claim an isomorphism between the
embodied manipulation of nested cups and the cognitive
manipulation of symbolic-relational representations
(cf. Greenfield 1991; Matsuzawa 1996).
6.2. Hierarchical relations in the wild
The strongest evidence to date in support of the claim that
nonhuman animals can reason about hierarchically struc-
tured relations in the social domain comes from
Bergman et al.’s (2003) study of free-ranging baboons.
Bergman et al. designed an elegant playback experiment
in which female baboons heard a sequence of recorded
calls mimicking a fight between two other females. Mock
agonistic confrontations were created by playing the
“threat-grunt” of one individual followed by the subordi-
nate screams of another. On separate days, the same
subject heard one of three different call sequences: (1)
an anomalous sequence mimicking a rank reversal
between members of the same matrilineal family (i.e.,
sisters, mothers, daughters, or nieces); (2) an anomalous
sequence mimicking a between-family rank reversal (i.e.,
between members of two different matrilineal families in
which one of the families is dominant to the other); or
(3) a control sequence replicating an existing dominant-
subordinate relationship (i.e., no rank reversal) using
between-family or within-family dyads. As predicted,
there was a significant difference in the focal subjects’
responses to the three different kinds of call sequences.
Subjects looked longest at between-family rank reversals.
There was no significant difference between within-
family reversals and no-reversal control sequences.
According to Bergman et al., the reason the baboons
responded more strongly to between-family rank reversals
than within-family sequences is because the baboons
recognized that the former imply a superordinate reorgan-
ization of matrilineal subgroups. Bergman et al. (2003,
p. 1236) conclude: “Our results suggest that baboons
organize their companions into a hierarchical, rule-
governed structure based simultaneously on kinship and
rank” (see also Seyfarth et al. 2005).
In our view, the evidence reported by Bergman et al.
(2003) does not support this conclusion. Even if baboons
do make a categorical distinction between kin and non-
kin dyads based on interaction history, familiarity, spatial
proximity, phenotypic cues, or some other observable
regularity (see Silk 2002a for a review of the possibilities),
this does not necessarily mean that they represent the
entire matrilineal social structure as an integrated rela-
tional schema in which non-kin relations are logically
superordinate to between-kin relations. As Bergman
et al. (2003) themselves point out, between-family rank
reversals are much more disruptive to baboon social life
than within-family rank reversals. Therefore, Bergman
et al.’s (2003) results are consistent with the hypothesis
that female baboons have learned that rank reversals
among non-kin are more salient (i.e., associated with
greater social turmoil and personal risk) than are within-
kin rank reversals occurring in someone else’s family
(notably, Bergman et al. did not test rank reversals
within the focal subject’s own family). While baboons
clearly recognize particular conspecifics’ vocalizations
and represent dominance and kin relations in a combina-
torial manner, there is nothing in Bergman et al.’s data that
remotely suggests a higher-order, hierarchical relation
among these representations.
Once again, there is not simply an absence of evidence;
there is evidence of an absence. Bergman et al. (2003) note
that the subjects’ responses to apparent rank reversals
were unrelated to the rank distance separating the two sig-
nalers: that is, subjects paid as much attention to mock
rank reversals involving closely ranked opponents as
those involving more distantly ranked opponents.
Bergman et al. use this fact to rebut the hypothesis that
the baboons were responding more strongly to between-
family rank reversals simply because the individuals
involved had more disparate ranks. However, the data
cut both ways: If the baboons did cognize the relation
between female conspecifics as an integrated matrilineal
dominance hierarchy, ceteris paribus, they should have
been more surprised at a rank reversal between a very
low ranking and a very high ranking individual than by a
rank reversal between two individuals of adjacent ranks.
Ironically, Bergman et al.’s results provide some of the
strongest evidence to date that female baboons do not,
in fact, cognize the structure of their conspecifics’ matrilineal
social relationships in a systematic or hierarchical fashion.
7. Causal relations
There is ample evidence that traditional associationist
models are inadequate to account for nonhuman causal
cognition; but the available comparative evidence also
suggests that there is a critical and qualitative difference
between the ways that human and nonhuman animals
reason about causal relations (see Penn & Povinelli
2007a for a more extensive review and discussion).
Humans explicitly reason in terms of unobservable and/
or hidden causes (Hagmayer & Waldmann 2004;
Kushnir et al. 2005; Saxe et al. 2005), distinguish
between “genuine” and “spurious” causes (Lien &
Cheng 2000), reason diagnostically from effects to their
possible causes (Waldmann & Holyoak 1992), and plan
their own interventions in a quasi-experimental fashion
to elucidate ambiguous causal relations (Hagmayer et al.
2007). Numerous researchers have argued that normal
humans, not just scientists or philosophers, form
“intuitive theories” or “mental models” about the unobser-
vable principles and causal forces that shape relations in a
specific domain (e.g., Carey 1985; Gopnik & Meltzoff
1997; Keil 1989; Murphy & Medin 1985). These tacit
systems of higher-order relations at various levels of gen-
erality modulate how human subjects judge and discover
novel relations within those domains by a process akin to
analogical inference (Goldvarg & Johnson-Laird 2001;
Lee & Holyoak 2007; Lien & Cheng 2000; Tenenbaum
et al. 2007). In short, the ability to reason about higher-
order, analogical relations in a systematic and productive
fashion appears to be an integral aspect of human causal cognition.
In stark contrast to the human case, there is no compel-
ling evidence that nonhuman animals form tacit theories
about the unobservable causal mechanisms at work in
the world, seek out explanations for anomalous causal
relations, reason diagnostically about unobserved causes,
or distinguish between genuine and spurious causal
relations on the basis of their prior knowledge of abstract
causal mechanisms.
Indeed, there is consistent evidence
of an absence across a variety of protocols (see, e.g.,
Penn & Povinelli 2007a; Povinelli 2000; Povinelli &
Dunphy-Lelii 2001; Visalberghi & Tomasello 1998).
A variety of nonhuman animal species, and certainly
not primates alone (Emery & Clayton 2004b), are able
to construct and use tools in a flexible and adaptive
fashion. But a series of seminal experiments, initiated by
Visalberghi and colleagues (see Visalberghi & Limongelli
1996 for a review), provides a particularly compelling
example of how nonhuman animals’ remarkable use of
tools nevertheless belies a fundamental discontinuity
with our human understanding of causal relations.
Visalberghi and Limongelli (1994) tested capuchin
monkeys’ ability to retrieve a piece of food placed inside
a transparent tube using a straight stick. In the middle of
the tube, there was a highly visible hole with a small trans-
parent cup attached. If the subject pushed the food over
the hole, the food fell into the cup and was inaccessible
(“trap-down” condition). Visalberghi and Limongelli
(1994) tested four capuchin monkeys to see whether
they would understand that they needed to push the
food out the end of the tube away from the hole. After
about 90 trials, only one out of the four capuchin
monkeys learned to push the food away from the hole,
and even this one learned the correct behavior through
trial and error. Worse, once the experimenters rotated
the tube so that the trap hole was now facing up and cau-
sally irrelevant (“trap-up” condition), the one successful
capuchin still persisted in treating the hole as if it
needed to be avoided, making it obvious that even this
subject misunderstood the causal relation between the
trap hole and the retrieval of the reward.
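The logic of the trap-up control can be made concrete with a small sketch. It is purely illustrative and is not a model proposed by the experimenters: the names `Trial`, `cue_based_policy`, `causal_policy`, and `food_retrieved` are hypothetical, and the tube is reduced to a single hole with the food starting on one side of it. Under these assumptions, a perceptual rule ("keep the food away from the hole") and a causal rule ("avoid the hole only when it can capture the food") are indistinguishable in the trap-down condition and come apart only when the tube is rotated.

```python
from dataclasses import dataclass

DIRECTIONS = frozenset({"left", "right"})


@dataclass(frozen=True)
class Trial:
    food_side: str   # side of the hole on which the food starts: "left" or "right"
    trap_down: bool  # True in the trap-down condition, False in trap-up


def cue_based_policy(trial):
    """Perceptual rule: always move the food away from the visible hole."""
    return frozenset({trial.food_side})


def causal_policy(trial):
    """Causal rule: the hole matters only when it can actually capture the food."""
    if trial.trap_down:
        return frozenset({trial.food_side})
    return DIRECTIONS


def food_retrieved(trial, push_direction):
    """The food is lost only if it crosses a hole that faces down."""
    crosses_hole = push_direction != trial.food_side
    return not (trial.trap_down and crosses_hole)


trap_down = Trial(food_side="left", trap_down=True)
trap_up = Trial(food_side="left", trap_down=False)

# Both rules permit the same pushes in the trap-down condition.
print(cue_based_policy(trap_down) == causal_policy(trap_down))
# Only the causal rule stops avoiding the hole once it faces up.
print(cue_based_policy(trap_up) == causal_policy(trap_up))
```

In this sketch both rules retrieve the food in every condition, so success rates alone cannot separate them. Only the persistence of hole-avoidance in the trap-up condition is diagnostic, which is why the rotated tube serves as the critical test.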
Povinelli (2000) and colleagues subsequently replicated
Visalberghi’s trap-tube protocol with seven chimpanzees.
Povinelli performed the experiments once when the chim-
panzees were juveniles (5 to 6 years old) and again when
they were young adults (10 years old). Three out of the
seven chimps learned to solve the trap-down version of
the task as adults, with one chimp, Megan, learning to
solve the task within 100 trials. However, none of the
chimps showed any evidence of distinguishing between
the trap-up and trap-down versions of the task. By way
of comparison, it should be noted that children as young
as 3 years of age successfully solve the trap-tube task
after only a few trials (see Limongelli et al. 1995).
Recently, Mulcahy and Call (2006b) tested ten great
apes on a version of the trap-tube task that allowed sub-
jects to choose whether to pull or push the reward
through the tube. Three out of the ten subjects learned
to avoid the trap when pulling rather than pushing.
However, the majority of subjects still failed the task.
Indeed, even the three successful subjects took an
average of 44 trials to achieve above-chance performance,
and then continued to fail Visalberghi and Limongelli’s
(1994) push-only version of the task. Therefore, these
latest results seem to confirm two earlier hypotheses: (1)
nonhuman apes are more adept at pulling than pushing
in tool-use tasks such as these (see, e.g., Povinelli 2000,
Ch. 5); and (2) nonhuman primates’ causal knowledge is
tightly coupled to specific task parameters and bodily
movements: in particular, they do not appear to grasp
the abstract, analogical similarity between perceptually
disparate but functionally equivalent tasks (Penn & Povinelli
2007a; Povinelli 2000; Visalberghi & Tomasello 1998).
Nonhuman primates are not the only animals that seem
to be incapable of cognizing the general causal principles
at issue in the trap-tube task. Seed et al. (2006) recently
presented eight rooks with a clever modification to Visal-
berghi’s trap-tube task in which each tube contained two
traps, one which was functional and one which was not.
Seven out of eight rooks rapidly learned to pull the food
away from the functional trap and successfully transferred
this solution to a novel but perceptually similar version of
the task. Nevertheless, when presented with transfer tasks
in which the visual cues that were associated with success
in the initial tasks were absent or confounded, only one of
the seven subjects passed. In a follow-up experiment
(Tebbich et al. 2007), none of the rooks passed the transfer
tasks.
Seed et al.’s (2006) results add to the growing evidence
that corvids are quite adept at using stick-like tools (see,
e.g., Weir & Kacelnik 2007). But as Seed et al. (2006)
point out, these results also suggest that rooks share a
common cognitive limitation with nonhuman primates:
they do not understand “unobservable causal properties”
such as gravity and support; nor do they reason about
the higher-order relation between causal relations in an
analogical or theory-like fashion. Instead, rooks, like
other nonhuman animals, appear to solve tool-use pro-
blems based on evolved, domain-specific expectations
about what perceptual features are likely to be most
salient in a given context and a general ability to reason
about the causal relation between observable contingen-
cies in a flexible, goal-directed but task-specific fashion
(see also Penn & Povinelli 2007a).
8. Theory of mind
Nonhuman animals certainly manifest many sophisticated
social-cognitive abilities. But having a theory of mind
(ToM) sensu Premack and Woodruff (1978) means
something more specific than being a socially savvy
animal: it means being able to impute unobservable, con-
tentful mental states to other agents and then to reason in
a theory-like fashion about the causal relation between
these unobservable mental states and the agents’ sub-
sequent behavior (see Penn & Povinelli 2007b for a
more extensive discussion of this point). Of course,
theory-like inferences are not the only way in which a cog-
nizer might reason about other agents’ mental states (see
Carruthers & Smith 1996 for a review of the possibilities).
Mentalistic simulation, for example, provides an alterna-
tive and popular explanation. However, all but the most
radical simulation-oriented theories do not deny that
humans represent causal relations involving other agents’
unobservable mental states. They simply propose an
alternative, analogical mechanism for how humans do so.
Whiten (1996; 2000) has proposed another, influential
hypothesis about how nonhuman apes (and young chil-
dren) might represent the mental states of their conspeci-
fics without relying on theory-like metarepresentations.
Whiten proposed that nonhuman apes use “intervening
variables” to stand in for generalizations about the causal
role played by a given mental state in a set of disparate
behavioral patterns. For example, a chimpanzee that
encodes the observable patterns “X saw Y put food in
bin A,” “X hid food in bin A,” and “X sees Y glancing at
bin A” as members of the same abstract equivalence
class could be said, on Whiten’s account, to recognize
that “X knows food is in bin A” and, therefore, be
capable of “explicit mindreading” (Whiten 1996).
Notice that Whiten’s example of “explicit mindreading”
is a textbook example of analogical reasoning: Whiten’s
hypothetical chimpanzee must infer a systematic higher-
order relation among disparate behavioral patterns that
have nothing in common other than a shared but unobser-
vable causal mechanism: that is, what X “knows.” If this is
an “intervening variable,” it is an intervening variable that
requires reasoning about the higher-order, role-governed
relational similarity between perceptually disparate
causal relations in order to be produced.
We believe Whiten is right in this sense: If a nonhuman
animal were capable of inferring that these disparate beha-
vioral patterns were actually instances of the same super-
ordinate causal relation, then the animal would surely
have demonstrated that it possessed a ToM and the
ability to reason analogically, as well. There is, however,
no such evidence on offer. Indeed, until recently, there
has been a fragile consensus that nonhuman animals lack
anything even remotely resembling a ToM (Cheney &
Seyfarth 1998; Heyes 1998; Tomasello & Call 1997; Visal-
berghi & Tomasello 1998).
A few years ago, however, Hare et al. (2000; 2001)
reported “breakthrough” evidence that chimpanzees do,
in fact, reason about certain psychological states in their
conspecifics (see, particularly, Tomasello et al. 2003a;
2003b). And since then, there has been a flurry of
similar claims on behalf of corvids and monkeys based
on similar protocols (Bugnyar & Heinrich 2005; 2006;
Dally et al. 2006; Emery & Clayton 2001; in press; Flom-
baum & Santos 2005; Santos et al. 2006). Because Povi-
nelli and colleagues have provided detailed critiques of
Hare et al.’s (2000; 2001) protocol and results elsewhere
(see Penn & Povinelli 2007b; Povinelli 2004; Povinelli &
Vonk 2003; 2004), here we will focus on the best available
evidence for a ToM system among non-primates. As will
become apparent, our original critique of Hare et al.’s
(2000; 2001) protocol applies, mutatis mutandis, to the
new claims being made on behalf of corvids, as well.
The best evidence for a ToM system in a non-primate
comes from the work of Emery, Clayton and colleagues
(Emery & Clayton 2001; 2004b; in press). Dally et al.
(2006), for example, had scrub-jays cache food items
under one of four conditions: (1) in the presence of a
dominant conspecific, (2) in the presence of a subordinate,
(3) in the presence of the storer’s preferred partner, or (4)
in private. The storers were allowed to cache the food in
two trays, one nearer and one farther away from the obser-
ver, and then they were allowed to recover their caches in
private three hours later. Dally et al. (2006) showed that
birds that had stored food in the presence of a dominant
or subordinate competitor tended to re-cache food predo-
minantly from the near tray, and that the proportion of
food that was re-cached was greatest for birds that had
stored food in the presence of a dominant competitor. In
a follow-up experiment, scrub-jays were given the
chance to cache successively in two trays, each in view of
a different observer. After three hours, storers were
allowed to recover their caches. Dally et al. (2006)
reported that significantly more food caches were re-
cached when a previous observer was present than when
the storers retrieved their caches in private or in view of
a control bird that had not witnessed the original
caching. Furthermore, if a previous observer was
present, storers tended to re-cache from the tray that
the previous observer had actually observed.
Results such as these leave no doubt that corvids are
remarkably intelligent creatures, able to keep track of
the social context of specific past events, as well as the
what, when, and where information associated with
those events (Clayton et al. 2001). But nothing in the
results reported to date suggests that corvids actually
reason about their conspecifics’ mental states or even
understand that their conspecifics have mental states at
all, as distinct from their conspecifics’ past and occurrent
behaviors and the subjects’ own knowledge of past and
current states of affairs (Penn & Povinelli 2007b; Povinelli
et al. 2000; Povinelli & Vonk 2003; 2004).
In the case of Dally et al.’s (2006) experiment, for
example, it suffices for the subjects to keep track of
which competitor was present during which caching
event and to formulate strategies on the basis of observable
features of the task alone: for example, ⟨Re-cache
food if a competitor has oriented towards it in the
past⟩, ⟨Try to cache food in sites that are farther
away from potential competitors⟩, ⟨Attempt to pilfer
food if the competitor that cached it is not present⟩,
and so on. Since none of the protocols required the sub-
jects to reason in terms of the specific contents of the com-
petitor’s epistemic mental states, the additional inference
that the subjects acted the way they did because they
understood that ⟨The competitor knows where the
food is located⟩ does no additional cognitive or explanatory
work. This additional mentalistic claim merely
satisfies our all-too-human need to posit an explicit, con-
scious, propositional reason for the birds’ behaviors. But
it is obvious that animals, including humans, do not
necessarily need to “know” why they are acting the way
they are acting in order for a behavior to be flexible,
effective, and (biologically) rational (see lucid discussions
by Heyes & Papineau 2006; Kacelnik 2006).
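A minimal sketch may help show how little such a strategy needs to represent. It is an illustration only, not a model taken from Dally et al. (2006): the names `CacheEvent` and `trays_to_recache`, and the bird and tray labels, are hypothetical. The subject's memory holds only observable facts (which tray was used and which birds were oriented towards it at the time), and the rule consults only who is present now. No variable stands for what any competitor "knows."

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CacheEvent:
    tray: str             # where the food was cached
    observers: frozenset  # birds seen oriented towards the tray during caching


def trays_to_recache(events, present_birds):
    """Re-cache food if a bird that is present now oriented towards it in the past."""
    present = frozenset(present_birds)
    return sorted({e.tray for e in events if e.observers & present})


# Two successive caching events, each in view of a different observer.
events = [
    CacheEvent(tray="A", observers=frozenset({"observer_1"})),
    CacheEvent(tray="B", observers=frozenset({"observer_2"})),
]

print(trays_to_recache(events, ["observer_1"]))    # previous observer present
print(trays_to_recache(events, ["control_bird"]))  # bird that saw no caching
print(trays_to_recache(events, []))                # recovery in private
```

Under these assumptions the rule selects only the tray that the returning observer had watched, and selects nothing in private or in front of a naive control bird. That is the qualitative pattern described above, produced without any representation of the competitor's epistemic state.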
Indeed, many of the same researchers who claim evi-
dence for ToM abilities in corvids explicitly acknowledge
that an explanation based on responding to observed
cues alone would be sufficient to account for the existing
data. Dally et al. (2006, p. 1665), for example, point out
that scrub-jays’ ability to keep track of which competitors
have observed which cache sites “need not require a
human-like ‘theory of mind’ in terms of unobservable
mental states, but [...] may result from behavioral predis-
positions in combination with specific learning algorithms
or from reasoning about future risk.” Similarly, Bugnyar
and Heinrich (2006, p. 374) acknowledge that a represen-
tation of “states in the physical world” and “responses to
subtle behavioral cues given by the competitor” would
be sufficient to explain the available evidence concerning
the manipulative behaviors of ravens and, we would add,
all the other comparative evidence claiming to
show ToM-like abilities in nonhuman animals to date
(for examples of the kind of protocols that could, in prin-
ciple, provide evidence for a ToM system in a nonhuman
animal, see Penn & Povinelli 2007b).
9. Explaining the discontinuity
Up to this point in the article, we have focused solely on
showing that there is, in fact, a pervasive functional discon-
tinuity between human and nonhuman minds, and that
this discontinuity is located specifically in the way that
human and nonhuman animals reason about relations.
Now we turn to the daunting question of how to account
for this pervasive discontinuity. Let us first consider the
three most influential hypotheses that have been proposed
in recent years.
9.1. The massive modularity hypothesis
A “modular” explanation for the evolution of human cogni-
tion is popular among many evolutionary-minded theorists
(e.g., Barkow et al. 1992). Certainly, many central cognitive
processes, including almost all of the cognitive
mechanisms we share with nonhuman animals, are at
least moderately modular once the notion of modularity
has been defined in a purely functional sense (see
Barrett 2006). But the modular story alone does not
provide a satisfying explanation for the disparity between
human and nonhuman minds.
As we have seen in our review of the comparative evi-
dence, the pattern of similarities and differences
between human and nonhuman relational reasoning is
remarkably consistent across every domain of cognition,
from same-different reasoning and spatial relations to
tool use and ToM. Therefore, it seems highly implausible
that the disparities in each domain are the result of inde-
pendent, module-specific adaptations. It seems much
more likely (not to mention more parsimonious) that a
common set of specializations, perhaps in some more
general “supermodule,” is responsible for augmenting
the relational capabilities of all of the cognitive modules
we inherited from our nonhuman ancestors. Unfortu-
nately, the two most popular supermodules that have
been proposed to date, ToM and language, do not do a
good job of accounting for the comparative evidence.
9.2. The ToM hypothesis
A number of comparative researchers believe that the dis-
continuity between human and nonhuman minds can be
traced back to some limitation in nonhuman animals’
social-cognitive abilities (e.g., Cheney & Seyfarth 1998;
Terrace 2005a; Tomasello et al. 2005). Although we cer-
tainly agree that nonhuman animals do not appear to
possess anything remotely resembling a ToM, the hypoth-
esis that some aspect of our ToM alone is responsible for
the disparity between human and nonhuman cognition
seems difficult to sustain. For example, it is very hard to
see how a discontinuity in social-cognitive abilities alone
could explain the profound differences between human
and nonhuman animals’ abilities to reason about causal
relations in the physical world or nonhuman animals’
inability to reason about higher-order spatial relations.
Even Tomasello and his colleagues have admitted that
trying to explain all the differences between human and
nonhuman cognition in terms of a difference in ToM
skills is “highly speculative” at best (Tomasello & Call
1997, p. 418). Indeed, in a different context, Tomasello
has himself argued (e.g., Tomasello 2000) that human
language learners rely on cognitive capacities such as ana-
logical reasoning and abstract rule learning that are inde-
pendent from ToM and absent in nonhuman animals. So
while our ability to participate in collaborative activities
and to take each others’ mental states into account may
be a distinctive feature of the human lineage, it is clearly
not the only or even the most basic one.
9.3. The language-only hypothesis
The oldest and still most popular explanation for the wide-
ranging disparity between human and nonhuman animals’
cognitive abilities is language (for recent examples of this
venerable argument, see Bermudez 2003; Carruthers
2002; Clark 2006). Dennett (1996, p. 17) described the
extreme version of this hypothesis in characteristically
pithy terms: “Perhaps the kind of mind you get when
you add language to it is so different from the kind of
mind you can have without language that calling them
both minds is a mistake.”
To be sure, language clearly plays an enormous and
crucial role in subserving the differences between
human and nonhuman cognition. But we believe that
language alone is not sufficient to account for the discon-
tinuity between human and nonhuman minds. In order
to make our case, we need to distinguish between three
distinct versions of the language-only hypothesis: (1) that
verbalized (or imaged) natural language sentences are
responsible for the disparity between human and nonhu-
man cognition; (2) that some aspect of our internal
“language faculty” is responsible for the disparity; and
(3) that the communicative and/or cognitive function of
language served as the prime mover in the evolution of
the uniquely human features of the human mind.
9.3.1. Are natural language sentences what makes the human mind human?
Natural language tokens clearly
play an enormous role in “extending” and even in “rewir-
ing” the human mind (Bermudez 2005; Clark 2006;
Dennett 1996). Gentner and colleagues, for example,
have shown that relational labels play an instrumental
role in facilitating young human learners’ sensitivity to
relational similarities and potential analogies (Gentner &
Rattermann 1991; Loewenstein & Gentner 2005). Our
ability to reason about large quantities of countable
objects in a generative and systematic fashion seems to
require the acquisition of numeric symbols and a linguistic
counting system (Bloom & Wynn 1997). Numerous
studies have shown that subjects with language impair-
ment exhibit a variety of cognitive deficits (e.g., Baldo
et al. 2005) and that deaf children from hearing families
(i.e., “late signers”) show persistent deficits in ToM tasks
(see Siegal et al. 2001 for a review). Furthermore, there
is good evidence that a child’s ability to pass certain
kinds of ToM tests is intricately tied to the acquisition of
specific sentential structures (de Villiers 2000). Normal
human cognition clearly depends on normal linguistic
But although natural language clearly subserves and cat-
alyzes normal human cognition, there is compelling evi-
dence that the human mind is distinctively human even
in the absence of normal natural language sentences (see
Bloom 2000; Garfield et al. 2001; Siegal et al. 2001).
Varley and Siegal (2000), for example, studied the
higher-order reasoning abilities of an agrammatic
aphasic man who was incapable of producing or compre-
hending sentences and whose vocabulary was essentially
limited to perceptual nouns. In particular, he had lost all
his vocabulary for mentalistic entities such as “beliefs”
and “wants.” Yet this patient continued to take care of
the family finances and passed a battery of causal reason-
ing and ToM tests (see also Varley et al. 2001; 2005).
Although late-signing deaf children’s cognitive abilities
may not be “normal,” they nevertheless manifest gramma-
tical, logical, and causal reasoning abilities far beyond
those of any nonhuman subject (Peterson & Siegal
2000). And the many remarkable cases of congenitally deaf
children spontaneously “inventing” gestural languages
with hierarchical and compositional structure provide
further confirmation that the human mind is indomitably
human even in the absence of normal linguistic encultura-
tion (see, e.g., Goldin-Meadow 2003; Sandler et al. 2005;
Senghas et al. 2004).
Of course, the process of learning a language may
“rewire” the human brain in ways that make certain
kinds of cognition possible that would not be possible
otherwise, even if the subject subsequently loses the
ability to use language later in life. But this ontogenetic
version of the “rewiring hypothesis” (Bermudez 2005)
begs the question of what allows language to so profoundly
rewire the human mind, but no other.
Over the last 35 years, comparative researchers have
invested considerable effort in teaching nonhuman
animals of a variety of taxa to use and/or comprehend
language-like symbol systems. Many of these animals
have experienced protracted periods of enculturation
that rival those of modern (coddled) human children.
The stars of these animal language projects have indeed
been able to approximate certain superficial aspects of
human language, including the ability to associate arbi-
trary sounds, tokens, and gestures with external objects,
properties, and actions and a rudimentary sensitivity to
the order in which these “symbols” appear when
interpreting novel “sentences” (Herman et al. 1984; Pep-
perberg 2002; Savage-Rumbaugh & Lewin 1994; Schuster-
man & Krieger 1986). But even after decades of exhaustive
training, no nonhuman animal has demonstrated a clear
mastery of abstract grammatical categories, closed-class
items, hierarchical syntactic structures, or any of the other
defining features of a human language (cf. Kako 1999). Fur-
thermore, there is still no evidence that symbol-trained
animals are any more adept than symbol-naive ones at