PS62CH17-Gigerenzer ARI 3 November 2010 6:43
Heuristic Decision Making
Gerd Gigerenzer and Wolfgang Gaissmaier
Center for Adaptive Behavior and Cognition, Max Planck Institute for Human
Development, 14195 Berlin, Germany; email: gigerenzer@mpib-berlin.mpg.de
Annu. Rev. Psychol. 2011. 62:451–82
The Annual Review of Psychology is online at
psych.annualreviews.org
This article’s doi:
10.1146/annurev-psych-120709-145346
Copyright © 2011 by Annual Reviews.
All rights reserved
0066-4308/11/0110-0451$20.00
Key Words
accuracy-effort trade-off, business decisions, ecological rationality,
legal decision making, medical decision making, social intelligence
Abstract
As reflected in the amount of controversy, few areas in psychology have
undergone such dramatic conceptual changes in the past decade as the
emerging science of heuristics. Heuristics are efficient cognitive pro-
cesses, conscious or unconscious, that ignore part of the information.
Because using heuristics saves effort, the classical view has been that
heuristic decisions imply greater errors than do “rational” decisions as
defined by logic or statistical models. However, for many decisions, the
assumptions of rational models are not met, and it is an empirical rather
than an a priori issue how well cognitive heuristics function in an uncer-
tain world. To answer both the descriptive question (“Which heuris-
tics do people use in which situations?”) and the prescriptive question
(“When should people rely on a given heuristic rather than a complex
strategy to make better judgments?”), formal models are indispensable.
We review research that tests formal models of heuristic inference, in-
cluding in business organizations, health care, and legal institutions.
This research indicates that (a) individuals and organizations often rely
on simple heuristics in an adaptive way, and (b) ignoring part of the infor-
mation can lead to more accurate judgments than weighting and adding
all information, for instance for low predictability and small samples.
The big future challenge is to develop a systematic theory of the build-
ing blocks of heuristics as well as the core capacities and environmental
structures these exploit.
Annu. Rev. Psychol. 2011.62:451-482. Downloaded from www.annualreviews.org
by WIB6417 - Max-Planck-Gesellschaft on 01/17/11. For personal use only.
Contents
INTRODUCTION
  Scope of Review
WHAT IS A HEURISTIC?
  Definition
  Less-Can-Be-More: Managers’ One-Good-Reason Decisions
  The Adaptive Toolbox
WHY HEURISTICS?
  Accuracy-Effort Trade-Off
  Ecological Rationality
METHODOLOGICAL PRINCIPLES
  Comparative Versus Singular Tests
  Test of Individuals Versus Group Means
  Testing the Adaptive Versus Universal Use of Heuristics
  Prediction Versus Fitting
RECOGNITION-BASED DECISION MAKING
  Recognition Heuristic
  Fluency Heuristic
  Neural Basis of Recognition and Evaluation
ONE-REASON DECISION MAKING
  One-Clever-Cue Heuristics
  Take-the-Best
  Fast-and-Frugal Trees
TRADE-OFF HEURISTICS
  Tallying
  Mapping Model
  1/N Rule
SOCIAL INTELLIGENCE
  Recognition-Based Decisions
  One-Reason Decision Making
  Trade-Off Heuristics
  Social Heuristics
  Moral Behavior
CONCLUSIONS
INTRODUCTION
How are decisions made? Three major answers
have been proposed: The mind applies logic,
statistics, or heuristics. Yet these mental tools
have not been treated as equals, each suited to
a particular kind of problem, as we believe they
should be. Rather, rules of logic and statistics
have been linked to rational reasoning and
heuristics linked to error-prone intuitions or
even irrationality. Since the 1970s, this oppo-
sition has been entrenched in psychological
research, from the heuristics-and-biases pro-
gram (Tversky & Kahneman 1974) to various
two-system theories of reasoning (Evans 2008).
Deviations from logical or statistical principles
became routinely interpreted as judgmental
biases and attributed to cognitive heuristics
such as “representativeness” or to an intuitive
“System 1.” The bottom line was that people
often rely on heuristics, but they would be bet-
ter off in terms of accuracy if they did not. As
Kahneman (2003) explained in his Nobel
Memorial Lecture: “Our research attempted
to obtain a map of bounded rationality, by
exploring the systematic biases that separate
the beliefs that people have and the choices
they make from the optimal beliefs and choices
assumed in rational-agent models” (p. 1449). In
this research, it is assumed that the conditions
for rational models hold and can thus define
optimal reasoning. The “father” of bounded
rationality, Simon (1989), however, asked a
fundamentally different question, leading to a
different research program.
Simon’s question: “How do human beings rea-
son when the conditions for rationality postu-
lated by the model of neoclassical economics
are not met?” (p. 377)
As Simon (1979, p. 500) stressed in his No-
bel Memorial Lecture, the classical model of ra-
tionality requires knowledge of all the relevant
alternatives, their consequences and probabili-
ties, and a predictable world without surprises.
These conditions, however, are rarely met for
the problems that individuals and organizations
face. Savage (1954), known as the founder of
modern Bayesian decision theory, called such
perfect knowledge small worlds, to be distin-
guished from large worlds. In large worlds, part
of the relevant information is unknown or has
to be estimated from small samples, so that the
conditions for rational decision theory are not
met, making it an inappropriate norm for opti-
mal reasoning (Binmore 2009). In a large world,
as emphasized by both Savage and Simon, one
can no longer assume that “rational” models
automatically provide the correct answer. Even
small deviations from the model conditions can
matter. In fact, small-world theories can lead
to disaster when applied to the large world,
as Stiglitz (2010) noted with respect to the fi-
nancial crash of 2008: “It simply wasn’t true
that a world with almost perfect information
was very similar to one in which there was per-
fect information” (p. 243, emphasis added). And
Soros (2009) concluded that “rational expectations
theory is no longer taken seriously outside
academic circles” (p. 6).
In recent years, research has moved beyond
small worlds such as the ultimatum game and
choice between monetary gambles. To test how
well heuristics perform in large worlds, one
needs formal models of heuristics. Such tests
are not possible as long as heuristics are only
vaguely characterized by general labels, because
labels cannot make the precise predictions that
statistical techniques can.
When heuristics were formalized, a surpris-
ing discovery was made. In a number of large
worlds, simple heuristics were more accurate
than standard statistical methods that have the
same or more information. These results be-
came known as less-is-more effects: There is
an inverse-U-shaped relation between level of
accuracy and amount of information, computa-
tion, or time. In other words, there is a point
where more is not better, but harmful. Starting
in the late 1990s, it was shown for the first time
that relying on one good reason (and ignoring
the rest) can lead to higher predictive accuracy
than achieved by a linear multiple regression
(Czerlinski et al. 1999, Gigerenzer & Goldstein
1996) and a three-layer feedforward connec-
tionist network trained using the back propa-
gation algorithm (Brighton 2006, Chater et al.
2003, Gigerenzer & Brighton 2009). These
results put heuristics on par with standard
statistical models of “rational” cognition (see
Gigerenzer 2008). Simon (1999) spoke of a
“revolution in cognitive science, striking a great
blow for sanity in the approach to human
rationality.”
Heuristics: strategies that ignore information to make decisions faster, more frugally, and/or more accurately than more complex methods
Small world: a situation in which all relevant alternatives, their consequences, and probabilities are known, and where the future is certain, so that the optimal solution to a problem can be determined
Large world: a situation in which some relevant information is unknown or must be estimated from samples, and the future is uncertain, violating the conditions for rational decision theory
Less-is-more effects: when less information or computation leads to more accurate judgments than more information or computation
The revolution Simon referred to could not
have happened without formal models and the
power of modern computers. Moreover, it is a
“revolution” in the original sense of the term,
building on earlier demonstrations of the robust
beauty of simple models. These include Dawes
& Corrigan (1974) and Einhorn & Hogarth
(1975), who showed that simple equal weights
predict about as well as—and sometimes better
than—multiple regression with “optimal” beta
weights. Their important work has not received
the recognition it deserves and is not even men-
tioned in standard textbooks in econometrics
(Hogarth 2011).
Although the study of heuristics has been
typically considered as purely descriptive, less-
is-more effects open up a prescriptive role for
heuristics, resulting in two research questions:
Description: Which heuristics do people use
in which situations?
Prescription: When should people rely on a
given heuristic rather than a complex strategy
to make more accurate judgments?
Scope of Review
We review a field that is in a fundamental tran-
sition, focusing on the major new ideas. The
literature on heuristics does not speak with one
voice, and we do not attempt to cover it ex-
haustively. Rather than presenting a patchwork
of ideas to the reader, we organize this review
within a theoretical framework and restrict it
to (a) formal models of heuristics and (b) inferences
rather than preferences.
The first restriction excludes explanation by
mere labels but also by verbally stated pro-
cesses that have not been formalized, such as
the tools-to-theories heuristic in scientific dis-
covery (Gigerenzer 1991). Formal models al-
low rigorous tests of both descriptive and pre-
scriptive questions. “Inference” refers to tasks
for which a unique criterion exists, whereas
“preference” (or preferential choice) refers to
tasks where no such criteria exist, as in matters
of taste. The advantage of studying inference
is that the accuracy of a strategy can be deter-
mined. At the same time, we agree with Weber
& Johnson (2009) that inferences and prefer-
ences draw on the same cognitive processes; in
fact, most heuristics covered in this review can
be used for preferential choice as well, as illus-
trated with examples from consumer choice and
health. Note that the general term “decision
making” is used here to cover both inferences
and preferences.
We begin with a brief, incomplete history
(for more, see Groner et al. 1983), define the
term heuristic, and provide an illustration of
the use of heuristics in organizations, including
an empirical demonstration of a less-is-more
effect.
WHAT IS A HEURISTIC?
The term heuristic is of Greek origin and
means, “serving to find out or discover.” Ein-
stein included the term in the title of his Nobel
prize–winning paper from 1905 on quantum
physics, indicating that the view he presented
was incomplete but highly useful (Holton
1988, pp. 360–361). Max Wertheimer, who
was a close friend of Einstein, and his fellow
Gestalt psychologists spoke of heuristic meth-
ods such as “looking around” to guide search
for information. The mathematician George
Polya distinguished heuristics from analytical
methods: For instance, heuristics are needed
to find a proof, whereas analysis is for checking
a proof. Simon and Allen Newell, a student of
Polya, developed formal models of heuristics to
limit large search spaces. Luce (1956), Tversky
(1972), Dawes (1979), and others studied mod-
els of heuristics, such as lexicographic rules,
elimination-by-aspect, and equal-weight rules.
Payne and colleagues (1993) provided evidence
for the adaptive use of these and other heuristics
in their seminal research. Similarly, behavioral
biologists studied experimentally the rules of
thumb (their term for heuristics) that animals
use for choosing food sites, nest sites, or mates
(Hutchinson & Gigerenzer 2005). After an
initial phase dominated by logic, researchers
in artificial intelligence (AI) began to study
heuristics that can solve problems that logic
and probability cannot, such as NP-complete
(computationally intractable) problems. While
AI researchers began to study how heuristics
make computers smart, psychologists in the
1970s became interested in demonstrating hu-
man reasoning errors, and they used the term
heuristic to explain why people make errors.
This change in the evaluation of heuristics
went hand-in-hand with replacing models of
heuristics by general labels, such as “availabil-
ity” and, later, “affect.” Unlike in biology and
AI, heuristics became tied to biases, whereas
the content-free laws of logic and probability
became identified with the principles of
sound thinking (Kahneman 2003, Tversky &
Kahneman 1974). The resulting heuristics-
and-biases program has had immense influence,
contributing to the emergence of behavioral
economics and behavioral law and economics.
Definition
Many definitions of heuristics exist. Kahneman
& Frederick (2002) proposed that a heuristic
assesses a target attribute by another property
(attribute substitution) that comes more read-
ily to mind. Shah & Oppenheimer (2008) pro-
posed that all heuristics rely on effort reduction
by one or more of the following: (a) examin-
ing fewer cues, (b) reducing the effort of re-
trieving cue values, (c) simplifying the weight-
ing of cues, (d) integrating less information, and
(e) examining fewer alternatives. Although both
attribute substitution and effort reduction are
involved, attribute substitution is less specific
because most inference methods, including
multiple regression, entail it: An unknown cri-
terion is estimated by cues. For the purpose of
this review, we adopt the following definition:
A heuristic is a strategy that ignores part of the
information, with the goal of making decisions
more quickly, frugally, and/or accurately than
more complex methods.
Let us explain the terms. Heuristics are a
subset of strategies; strategies also include com-
plex regression or Bayesian models. The part
of the information that is ignored is covered
by Shah and Oppenheimer’s list of five aspects.
The goal of making judgments more quickly
and frugally is consistent with the goal of effort
reduction, where “frugal” is often measured by
the number of cues that a heuristic searches. Of
course, there is no strict dichotomy between
heuristic and nonheuristic, as strategies can
ignore more or less information. The goal of
making judgments more accurately by ignoring
information is new. It goes beyond the classical
assumption that a heuristic trades off some
accuracy for less effort. Unlike the two-system
models of reasoning that link heuristics to
unconscious, associative, and error-prone
processes, no such link is made in this review.
Every heuristic reviewed in this article can
also be relied upon consciously and is defined
as a rule. The amount of error it generates
can be measured and compared to other
strategies.
Consider the following illustration.
Less-Can-Be-More: Managers’
One-Good-Reason Decisions
Commercial retailers need to distinguish those
customers who are likely to purchase again in a
given time frame (active customers) from those
who are not (inactive customers). These com-
panies have a large database containing the
amount, kind, and date of every customer’s pre-
vious purchases. Based on this information, how
can an executive predict which customers will
be active in the future?
Statistically sophisticated academics might
opt for a Bayesian analysis, regression analy-
sis, or some other optimizing strategy to pre-
dict the probability that a customer with a
given purchase history is active at some fu-
ture time. Researchers in business share this vi-
sion, and the state-of-the-art approach is the
Pareto/NBD model (negative binomial distri-
bution; Schmittlein & Peterson 1994). This
model assumes that purchases follow a Poisson
process with a purchase parameter λ, that customer
lifetimes follow an exponential distribution
with a dropout rate parameter μ, and that,
across customers, purchase and dropout rates
are distributed according to a gamma distri-
bution. However, most managers in Europe,
North America, Japan, Brazil, and India rely
on “intuitive” heuristics rather than on this or
similar statistical forecasting methods (Parikh
1994). Wübben & Wangenheim (2008) reported
that experienced managers use a simple
recency-of-last-purchase rule:
Hiatus heuristic: If a customer has not pur-
chased within a certain number of months (the
hiatus), the customer is classified as inactive;
otherwise, the customer is classified as active.
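The rule can be written down in a few lines, which is what makes it testable. The following is a minimal sketch, not the procedure used in the study: the function names, the use of whole calendar months, and the treatment of the boundary case (a gap of exactly the hiatus length counts as inactive) are illustrative assumptions.

```python
from datetime import date


def months_between(earlier: date, later: date) -> int:
    """Whole calendar months elapsed from `earlier` to `later` (assumed convention)."""
    months = (later.year - earlier.year) * 12 + (later.month - earlier.month)
    if later.day < earlier.day:
        months -= 1  # the current month has not been completed yet
    return months


def hiatus_classify(last_purchase: date, today: date, hiatus_months: int = 9) -> str:
    """Hiatus heuristic: only the recency of the last purchase is consulted.

    Frequency and spacing of earlier purchases are deliberately ignored.
    The boundary rule (>=) is an assumption made for this sketch.
    """
    if months_between(last_purchase, today) >= hiatus_months:
        return "inactive"
    return "active"


# Hypothetical customers, evaluated on the same day.
print(hiatus_classify(date(2010, 1, 15), date(2010, 11, 3)))  # long gap
print(hiatus_classify(date(2010, 6, 1), date(2010, 11, 3)))   # recent purchase
```

Note that the hiatus length is set in advance rather than estimated from data, so the rule has no free parameters.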
The managers of an apparel retailer and an
airline relied on nine months as the hiatus,
whereas the hiatus of an online CD retailer
was six months. Note that by relying on re-
cency only, the managers ignore information
such as the frequency and the spacing of previ-
ous purchases. Yet how accurate is the heuristic
compared to the Pareto/NBD model? To in-
vestigate this question, the Pareto/NBD model
was allowed to estimate its parameters from
40 weeks of data and was tested over the follow-
ing 40 weeks. The hiatus heuristic does not need
to estimate any parameters. For the apparel re-
tailer, the hiatus heuristic correctly classified
83% of customers, whereas the Pareto/NBD
model classified only 75% correctly. For the
airline, the score was 77% versus 74%, and for
the online CD business, the two methods tied at
77% (Wübben & Wangenheim 2008). Similar
results were found for forecasting future best
customers and for a second complex statistical
model.
This study demonstrated empirically a less-
is-more effect: The complex model had all the
information the simple heuristic used and more,
performed extensive estimations and computa-
tions, but nevertheless made more errors. The
study also showed how important it is to for-
malize a heuristic so that its predictions can be
tested and compared to competing models.
Adaptive toolbox: the cognitive heuristics, their building blocks (e.g., rules for search, stopping, decision), and the core capacities (e.g., recognition memory) they exploit
Ecological rationality: the study of ecological rationality investigates in which environments a given strategy is better than other strategies (better, not best, because in large worlds the optimal strategy is unknown)
Accuracy-effort trade-off: the traditional explanation why people use heuristics, assuming that effort is traded against accuracy. Not generally true (see less-is-more effects)
The Adaptive Toolbox
Formal models of heuristics represent progress
over labels, but precision alone is not enough to
build a science of heuristics. For instance, be-
havioral biology has experimentally identified
various rules of thumb that animals use, which
often look like curiosities in the absence of an
overarching theory (Hutchinson & Gigeren-
zer 2005). Further progress requires a theo-
retical framework that reaches beyond a list
of heuristics. One step toward such a theory
is to look for common building blocks, from
which the various heuristics are constructed as
an organizing principle. This would allow reducing
the larger number of heuristics to a
smaller number of components, similar to how
the chemical elements in the periodic
table are built from a small number of particles.
Three building blocks have been proposed
(Gigerenzer et al. 1999):
1. Search rules specify in what direction the
search extends in the search space.
2. Stopping rules specify when the search is
stopped.
3. Decision rules specify how the final deci-
sion is reached.
For instance, the hiatus heuristic searches
for recency-of-last-purchase information; stops
when it is found, ignoring further information;
and uses a nine-month threshold to make the
decision. Similarly, Simon’s (1955) satisficing
heuristic searches through options in any order,
stops as soon as the first option exceeds an aspiration
level, and chooses this option. Many but
not all heuristics are composed of these three
building blocks; thus, the list of building blocks
is incomplete.
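To make the three building blocks concrete, satisficing can be sketched as follows. This is an illustrative reading of the verbal description above, not a published implementation; the function name, the `value` callback, and the choice to return `None` when no option exceeds the aspiration level are assumptions.

```python
from typing import Callable, Iterable, Optional, TypeVar

T = TypeVar("T")


def satisfice(
    options: Iterable[T],
    value: Callable[[T], float],
    aspiration: float,
) -> Optional[T]:
    """Satisficing expressed through search, stopping, and decision rules."""
    for option in options:              # search rule: take options in the order given
        if value(option) > aspiration:  # stopping rule: first option exceeding the aspiration level
            return option               # decision rule: choose that option
    return None                         # assumption: no choice if nothing exceeds the level


# Hypothetical job offers valued by salary; search ends at the first acceptable one.
offers = [("A", 48_000), ("B", 55_000), ("C", 70_000)]
print(satisfice(offers, value=lambda offer: offer[1], aspiration=50_000))
```

Because search stops at the first acceptable option, later and possibly better options are never examined; the order of search therefore matters.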
The collection of heuristics and building
blocks an individual or a species has at its dis-
posal for constructing heuristics, together with
the core mental capacities that building blocks
exploit, has been called the adaptive toolbox
(Gigerenzer et al. 1999). Core capacities in-
clude recognition memory, frequency monitor-
ing, object tracking, and the ability to imitate.
These vary systematically between species and
individuals. Heuristics can be fast and frugal
only because the core capacities are already in
place.
How are heuristics selected for a given
problem? Although some authors implied that
the selection problem is unique to heuristics
(Glöckner et al. 2010, Newell 2005), it equally
applies to statistical models of mind. There are
many such models. Even if one proposes that
the mind has only one tool in its statistical
toolbox, such as Bayes, regression, or neural
network, the strategy selection problem trans-
lates into the question of how parameter values
are selected for each new problem (Marewski
2010).
Several principles appear to guide learning
which strategy to select. First, heuristics and
their underlying core capacities can be (partly)
hardwired by evolution, as it appears to be in
bees’ collective decision about the location of
a new hive (Seeley 2001) and in perceptual
mechanisms for inferring the extension of ob-
jects in three-dimensional space (Kleffner &
Ramachandran 1992). The second selection
principle is based on individual learning; a for-
mal model is Rieskamp & Otto’s (2006) strategy
selection learning theory. Third, heuristics can
be selected and learned by social processes, as
in imitation and explicit teaching of heuristics
(e.g., Snook et al. 2004). Finally, the content of
individual memory determines in the first place
which heuristics can be used, and some heuris-
tics’ very applicability appears to be correlated
with their “ecological rationality” (see below).
For instance, the fluency heuristic is most likely
to be applicable in situations where it is also
likely to succeed (Marewski & Schooler 2010).
WHY HEURISTICS?
Two answers have been proposed to the ques-
tion of why heuristics are useful: the accuracy-
effort trade-off, and the ecological rationality of
heuristics.
Accuracy-Effort Trade-Off
The classical explanation is that people save ef-
fort with heuristics, but at the cost of accuracy
(Payne et al. 1993, Shah & Oppenheimer 2008).
In this view, humans and other animals rely on
heuristics because information search and computation
cost time and effort; heuristics trade off
some loss in accuracy for faster and more
frugal cognition.
There are two interpretations of this trade-
off: (a) Rational trade-offs. Not every decision is
important enough to warrant spending the time
to find the best course of action; thus, people
choose shortcuts that save effort. The program
on the adaptive decision maker (Payne et al.
1993) is built on the assumption that heuris-
tics achieve a beneficial trade-off between ac-
curacy and effort. Here, relying on heuristics
can be rational in the sense that costs of effort
are higher than the gain in accuracy. (b) Cog-
nitive limitations. Capacity limitations prevent
us from acting rationally and force us to rely
on heuristics, which are considered a source of
judgmental errors.
The accuracy-effort trade-off is regularly
touted as a potentially universal law of cogni-
tion. Yet the study on the hiatus heuristic il-
lustrated that this assumption is not generally
correct. The hiatus heuristic saves effort com-
pared to the sophisticated Pareto/NBD model,
but is also more accurate: a less-is-more effect.
Ecological Rationality
Less-is-more effects require a new conception
of why people rely on heuristics. The study of
the ecological rationality of heuristics, or strate-
gies in general, is such a new framework: “A
heuristic is ecologically rational to the degree
that it is adapted to the structure of the envi-
ronment” (Gigerenzer et al. 1999, p. 13). Smith
(2003) used this definition in his Nobel lecture
and generalized it from heuristics to markets
and institutions. The study of ecological ra-
tionality fleshes out Simon’s scissors analogy:
“Human rational behavior (and the rational be-
havior of all physical symbol systems) is shaped
by a scissors whose two blades are the structure
of task environments and the computational ca-
pabilities of the actor” (Simon 1990, p. 7). If one
looks only at one blade, cognition, one cannot
understand why and when it succeeds or fails.
The study of ecological rationality addresses
two related questions: How does cognition ex-
ploit environmental structures, and how does it
deal with error?
Exploiting environmental structure. In
which environments will a given heuristic suc-
ceed, and in which will it fail? Environmental
structures that have been identified include
(Todd et al. 2011):
1. Uncertainty: how well a criterion can be
predicted.
2. Redundancy: the correlation between
cues.
3. Sample size: number of observations (rel-
ative to number of cues).
4. Variability in weights: the distribution of
the cue weights (e.g., skewed or uniform).
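Some of these structures can be quantified directly from a data set. The sketch below is one possible operationalization and rests on several assumptions that are not part of the original text: redundancy is taken as the mean absolute pairwise Pearson correlation between cues, cue-criterion correlations serve as a crude indicator of predictability and of how cue weights are distributed, and all names are hypothetical. It requires at least two cues and no constant columns.

```python
from itertools import combinations
from math import sqrt
from typing import Dict, List


def pearson(x: List[float], y: List[float]) -> float:
    """Pearson correlation; assumes equal lengths and nonconstant inputs."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def describe_environment(cues: Dict[str, List[float]], criterion: List[float]) -> dict:
    """Summarize an environment in terms loosely matching the list above."""
    names = list(cues)
    pairwise = [abs(pearson(cues[a], cues[b])) for a, b in combinations(names, 2)]
    return {
        # redundancy: how strongly the cues correlate with one another
        "redundancy": sum(pairwise) / len(pairwise),
        # sample size relative to the number of cues
        "observations_per_cue": len(criterion) / len(names),
        # per-cue correlation with the criterion (proxy for weights and predictability)
        "cue_criterion_r": {name: pearson(cues[name], criterion) for name in names},
    }


# Hypothetical data: two perfectly redundant cues and four observations.
example = describe_environment(
    {"recency": [1, 2, 3, 4], "spacing": [2, 4, 6, 8]},
    criterion=[1, 2, 3, 5],
)
print(example)
```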
For instance, heuristics that rely on only one
reason, such as the hiatus heuristic and take-
the-best heuristic (see below), tend to succeed
(relative to strategies that rely on more rea-
sons) in environments with (a) moderate to high
uncertainty (Hogarth & Karelaia 2007) and
(b) moderate to high redundancy (Dieckmann
& Rieskamp 2007). For customer activity, un-
certainty means that it is difficult to predict fu-
ture purchases, and redundancy might be re-
flected in a high correlation between length of
hiatus and spacing of previous purchases. The
study of ecological rationality results in comparative
statements of the kind “strategy X is
more accurate (frugal, fast) than Y in environment
E” or in quantitative relations between the
performance of strategy X when the structure
of an environment changes (e.g., Baucells et al.
2008, Karelaia 2006, Martignon & Hoffrage
2002). Specific findings are introduced below.
Dealing with error. In much research on rea-
soning, a bias typically refers to ignoring part
of the information, as in the base rate fallacy.
This can be captured by the equation:
Error = bias + ε, (1)
where ε is an irreducible random error. In
this view, if the bias is eliminated, good
inferences are obtained. In statistical theory
(Geman et al. 1992), however, there are three
sources of errors:
Error = bias + variance + ε, (2)
where bias refers to a systematic deviation be-
tween a model and the true state, as in Equation
1. To define the meaning of variance, consider
100 people who rely on the same strategy, but
each one has a different sample of observations
from the same population. Because of sampling
error, the 100 inferences may not be the same.
Across samples, bias is the difference between
the mean prediction and the true state of nature,
and variance is the expected squared deviation
around this mean. To illustrate, the nine-month
hiatus heuristic has a bias but zero variance,
because it has no free parameters to adjust to
specific samples. In contrast, the Pareto/NBD
model has free parameters and is likely to suf-
fer from both variance and bias. Variance de-
creases with increasing sample size, but also
with simpler strategies that have fewer free pa-
rameters (and less flexible functional forms; Pitt
et al. 2002). Thus, a cognitive system needs to
draw a balance between being biased and flexi-
ble (variance) rather than simply trying to elim-
inate bias. In the extreme, as illustrated by the
nine-month hiatus, the total elimination of vari-
ance at the price of higher bias can lead to bet-
ter inferences. This “bias-variance dilemma”
helps to explicate the rationality of simple
heuristics and how less can be more (Brighton
& Gigerenzer 2008, Gigerenzer & Brighton
2009).
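The thought experiment with 100 people can be simulated directly. The sketch below is a toy illustration under stated assumptions, not a reanalysis of the customer data: the true value, the noise level, the sample size, and the fixed guess are arbitrary choices, and the "fitted" strategy is simply a sample mean standing in for any model with a free parameter.

```python
import random
import statistics


def simulate(n_samples=100, sample_size=5, true_value=10.0,
             noise_sd=4.0, fixed_guess=9.0, seed=0):
    """Compare a parameter-free rule with a fitted one across many samples."""
    rng = random.Random(seed)
    fixed_preds, fitted_preds = [], []
    for _ in range(n_samples):
        # each "person" observes a different small sample from the same population
        sample = [rng.gauss(true_value, noise_sd) for _ in range(sample_size)]
        fixed_preds.append(fixed_guess)                # ignores the sample entirely
        fitted_preds.append(statistics.fmean(sample))  # one parameter estimated from the sample

    def summarize(preds):
        mean_pred = statistics.fmean(preds)
        bias = mean_pred - true_value  # mean prediction minus true state
        variance = statistics.fmean([(p - mean_pred) ** 2 for p in preds])
        return bias, variance

    return {"fixed": summarize(fixed_preds), "fitted": summarize(fitted_preds)}


result = simulate()
print("fixed  (bias, variance):", result["fixed"])
print("fitted (bias, variance):", result["fitted"])
```

By construction, the fixed rule has a bias of -1.0 and zero variance, whereas the fitted rule is unbiased in expectation but varies from sample to sample; with these settings its expected variance is noise_sd² / sample_size = 3.2. Increasing `sample_size` shrinks that variance, consistent with the point that variance decreases with sample size.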
The study of ecological rationality is re-
lated to the view that human cognition is
adapted to its past environment (Cosmides &
Tooby 2006), yet it should not be confused
with the biological concept of adaptation. A
match between a heuristic and an environmen-
tal structure does not imply that the heuristic
evolved because of that environment (Hutchin-
son & Gigerenzer 2005). The distinction be-
tween ecological and logical rationality is linked
to that between correspondence and coher-
ence (Hammond 2007), but it is not identi-
cal. If correspondence means achieving a goal
in the world rather than cohering to a rule of
logic, correspondence and ecological rational-
ity refer to similar goals—although the study
of the latter adds a mathematical analysis of
the relation between heuristic and environ-
ment. If correspondence, however, means that
the mental representation corresponds to the
world, as in a fairly accurate mental model or in
Shepard’s (2001) view of the mind as a mirror
reflecting the world, then ecological rational-
ity is different. A heuristic is functional, not a
veridical copy of the world.
Ecological rationality does not mean that all
people are perfectly adapted to their environ-
ment. As Simon (1992) noted, if that were the
case, one would only need to study the environ-
ment to predict behavior; the study of heuristics
would be obsolete.
METHODOLOGICAL
PRINCIPLES
Formal models of heuristics are indispensable
for progress, yet remain the exception in psy-
chology. Much of the research first documents
an error of judgment and thereafter attributes
it to a heuristic. In a widely cited experiment,
Tversky & Kahneman (1973) reported that cer-
tain letters were falsely judged to occur more
frequently in the first than the third position
in English words. They attributed this error to
the availability heuristic: Words with a letter
in the first position come to mind more eas-
ily. Note that availability was introduced after
the fact, without any independent measure or
test. Once the heuristic is formalized, conclu-
sions change. Sedlmeier and colleagues (1998)
defined and modeled the two most common
meanings of availability—the speed of retrieval
of the first word and the number of retrieved
words within a constant time period. Neither
version of the availability heuristic could pre-
dict participants’ frequency estimates. Instead,
estimated frequencies were best predicted by
actual frequencies, consistent with the classi-
cal findings by Attneave (1953). Formal models
protect against the seductive power of general
labels.
We are concerned about the replacement
of formal models by general labels in parts
of psychology. For instance, Tversky’s (1977)
seminal model of similarity makes testable
predictions (e.g., the asymmetry of similarity),
whereas the widely cited label “representative-
ness” can predict little but is so flexible that it
is consistent with many judgments, including
opposite intuitions (Ayton & Fischer 2004).
Similarly, research on the adaptive decision
maker (Payne et al. 1993) and the adaptive tool-
box (Gigerenzer et al. 1999) has studied formal
models of heuristics, which have been ignored
in two-system theories of reasoning in favor of
a “System 1” (Evans 2008). The problem with
two-system theories “is the lack of any predic-
tive power and the tendency to employ them as
an after-the-fact explanation” (Keren & Schul
2009, p. 544). Moving backward from existing
models to labels is a rare event in science, which
typically proceeds in the opposite direction.
The study of formal models entails four
methodological principles.
Comparative Versus Singular Tests
All models are wrong. But some predict better
than others and lead to novel questions. There-
fore, tests of cognitive strategies need to be
comparative, that is, test several models. This
differs from the widespread practice of null hy-
pothesis testing, where only one model (the
null) is specified.
Test of Individuals Versus
Group Means
Numerous studies have documented system-
atic individual differences in the use of heuris-
tics (e.g., Lee & Cummins 2004, Nosofsky &
Bergert 2007), including in old age (Mata et al.
2007). In the presence of individual differences,
tests of group mean differences can be highly
misleading (see Pachur et al. 2008).
Testing the Adaptive Versus Universal
Use of Heuristics
Research has shifted from asking whether
people use one heuristic in all situations to
asking whether heuristics are applied in situa-
tions where these are ecologically rational. For
instance, Bröder began by asking whether all
people use the take-the-best heuristic all the
time, but soon asked whether people rely on
take-the-best in situations where it is ecologically
rational, for instance, when cue validities
are highly skewed (Bröder & Schiffer 2003,
2006).
Prediction Versus Fitting
Prediction takes place when the data have not
yet been observed and a model with fixed pa-
rameter values is used to predict them; fitting
takes place when the data have already been ob-
served and the parameters of a model are chosen
so that they maximize the fit (such as R²).
In general, the more free parameters a model
has, the better the fit, but this does not hold for
predictions. In a large world where parameters
need to be estimated from small or unreliable
samples, the function between predictive accu-
racy and the flexibility of a model (e.g., num-
ber of free parameters) is typically inversely U-
shaped. Both too few and too many parameters
can hurt performance (Pitt et al. 2002). Com-
peting models of strategies should be tested for
their predictive ability, not their ability to fit
already known data.
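The contrast between fitting and prediction can be illustrated with a minimal sketch. It assumes a hypothetical environment in which every item has the same true value plus independent noise, and it compares a model with one free parameter (the sample mean) against a model with one free parameter per item (which memorizes the observed sample). The function names, data, and parameter values are illustrative assumptions, not taken from the studies cited above, and the sketch shows only the "too many parameters" side of the inverse-U function.

```python
import random


def mse(predictions, observations):
    """Mean squared error between two equally long sequences."""
    return sum((p - o) ** 2 for p, o in zip(predictions, observations)) / len(observations)


def fit_vs_prediction(n_items=1000, true_value=5.0, noise_sd=1.0, seed=0):
    """Compare fit (on observed data) with prediction (on new data) for two models."""
    rng = random.Random(seed)
    # Two independent noisy samples of the same items (hypothetical data).
    sample_a = [true_value + rng.gauss(0, noise_sd) for _ in range(n_items)]  # already observed
    sample_b = [true_value + rng.gauss(0, noise_sd) for _ in range(n_items)]  # not yet observed

    # Model 1: a single free parameter, the mean of the observed sample.
    mean_a = sum(sample_a) / n_items
    simple = [mean_a] * n_items

    # Model 2: one free parameter per item; it reproduces the observed sample exactly.
    flexible = list(sample_a)

    return {
        "simple_fit": mse(simple, sample_a),
        "flexible_fit": mse(flexible, sample_a),
        "simple_prediction": mse(simple, sample_b),
        "flexible_prediction": mse(flexible, sample_b),
    }


results = fit_vs_prediction()
for name, value in results.items():
    print(f"{name}: {value:.3f}")
```

Under these assumptions the flexible model fits the observed sample perfectly but predicts the new sample worse than the one-parameter model, because it treats noise as signal.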
In the next sections, we review four classes
of heuristics. The first class exploits recognition
memory, the second relies on one good reason
only (and ignores all other reasons), the third
weights all cues or alternatives equally, and the
fourth relies on social information. As men-
tioned in the introduction, formal models of
heuristics allow asking two questions: whether
they can describe decisions, and whether they
can prescribe how to make better decisions
than, say, a complex statistical method. The
prescriptive question is particularly relevant for
organizations, from business to health care.
Organizations seem ideally suited to the appli-
cation of heuristics because of the inherent un-
certainty and the pressure to act quickly. One
might therefore presume that plenty of studies
have investigated fast-and-frugal heuristics in
organizations. Not so, as Hodgkinson & Healey
(2008) pointed out. Our working hypothesis
is that heuristic decision making in individu-
als and organizations can be modeled by the
same cognitive building blocks: rules of search,
stopping, and decision.
RECOGNITION-BASED
DECISION MAKING
The recognition memory literature indicates
that a sense of recognition (often called famil-
iarity) appears in consciousness earlier than rec-
ollection (Ratcliff & McKoon 1989). The first
class of heuristics exploits this core capacity.
Recognition Heuristic
The goal is to make inferences about a crite-
rion that is not directly accessible to the de-
cision maker, based on recognition retrieved
from memory. This is possible in an environ-
ment (reference class) R where the recognition
of alternatives a, b ∈ R positively correlates with
their criterion values. For two alternatives, the
heuristic is defined as (Goldstein & Gigerenzer
2002):
Recognition heuristic: If one of two alterna-
tives is recognized and the other is not, then
infer that the recognized alternative has the
higher value with respect to the criterion.
The higher the recognition validity α for a
given criterion, the more ecologically rational
it is to rely on this heuristic and the more likely
people will rely on it. For each individual, α can
be computed by
α = C/(C + W),
where C is the number of correct inferences the
recognition heuristic would make, computed
across all pairs in which one alternative is recognized
and the other is not, and W is the number
of wrong inferences.
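The definition of the recognition heuristic and of its validity α can be written out as a short sketch. The object names, criterion values, and recognized set below are hypothetical; skipping pairs with tied criterion values is an added assumption that the definition above does not address.

```python
from itertools import combinations


def recognition_heuristic(a, b, recognized):
    """Return the alternative inferred to have the higher criterion value,
    or None when the heuristic does not apply (both or neither recognized)."""
    a_known, b_known = a in recognized, b in recognized
    if a_known and not b_known:
        return a
    if b_known and not a_known:
        return b
    return None


def recognition_validity(criterion, recognized):
    """alpha = C / (C + W) over all pairs in which exactly one alternative is recognized."""
    correct = wrong = 0
    for a, b in combinations(criterion, 2):
        choice = recognition_heuristic(a, b, recognized)
        if choice is None or criterion[a] == criterion[b]:
            continue  # heuristic not applicable, or tie (assumption: ties are skipped)
        other = b if choice == a else a
        if criterion[choice] > criterion[other]:
            correct += 1
        else:
            wrong += 1
    return correct / (correct + wrong) if correct + wrong else None


# Hypothetical reference class: five objects with made-up criterion values.
criterion = {"A": 900, "B": 700, "C": 400, "D": 300, "E": 100}
recognized = {"A", "B", "D"}  # what one hypothetical individual recognizes

alpha = recognition_validity(criterion, recognized)
print(alpha)  # 5 correct and 1 wrong inference among the 6 applicable pairs
```

Because α is computed from one person's recognized set, it is an individual-level quantity, which is why the text speaks of computing it "for each individual".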
A number of studies addressed the ques-
tion of whether people rely on the recognition
heuristic in an ecologically rational way. For in-
stance, name recognition of Swiss cities is a valid
predictor of their population (α = 0.86) but
not their distance from the center of Switzerland
(α = 0.51). Pohl (2006) reported that 89%
of inferences accorded with the model in judgments
of population, compared to only 54%
in judgments of the distance. More generally,
there is a positive correlation of r = 0.64 between
the recognition validity and the propor-
tion of judgments consistent with the recogni-
tion heuristic across 11 studies (Pachur et al.
2011). Similarly, old and young people alike
adjust their reliance on the recognition heuris-
tic between environments with high versus low
recognition validities, even though old people
have poorer recognition memory (Pachur et al.
2009).
The recognition heuristic is a model that
relies on recognition only. This leads to the
testable prediction that people who rely on it
will ignore strong, contradicting cues (so-called
noncompensatory inferences). Several studies
that taught participants between one and three
contradicting cues, typically of higher validity
than α (Newell & Fernandez 2006; Pachur et al.
2008; Richter & Späth 2006, experiment 3), re-
ported that mean accordance rates decreased. A
reanalysis of these studies at an individual level,
however, showed that typically about half of the
participants consistently followed the recog-
nition heuristic in every single trial, even in
the presence of up to three contradicting cues
(Pachur et al. 2008).
The model of the recognition heuristic does
not distinguish between pairs where the model
leads to a correct inference and pairs where it
leads to a wrong inference. However, the mean
accordance rates were 90% and 74%, respec-
tively (Pohl 2006, Hilbig & Pohl 2008). To-
gether with the effect of contradicting cues, this
result indicated that some people did not follow
the recognition heuristic, although the overall
accordance rates remain high. Various authors
concluded that people relied on a compensatory
strategy, such as weighting and adding of all
cues (e.g., Hilbig & Pohl 2008, Oppenheimer
2003). None of the studies above, however,
formulated and tested a compensatory strategy
against the recognition heuristic, leaving the
strategies that participants relied on unknown.
One study since tested five compensatory mod-
els and found that none could predict judgments
better than the simple model of the recognition
heuristic (Marewski et al. 2010).
The recognition heuristic model also makes
another bold prediction:
If α > β, and α, β are independent of n,
then a less-is-more effect will be observed.
Here, β is the knowledge validity, measured
as C/(C + W) for all pairs in which both alternatives
are recognized, and n is the number
of alternatives an individual recognizes. A
less-is-more effect means that the function between
accuracy and n is inversely U-shaped
rather than monotonically increasing. Some
studies reported less-is-more effects empirically
among two, three, or four alternatives (Frosch
et al. 2007, Goldstein & Gigerenzer 2002) and
in group decisions (Reimer & Katsikopoulos
2004), whereas others failed to do so (Pachur
& Biele 2007, Pohl 2006), possibly because the
effect is predicted to be small [see Katsikopoulos
(2010) for an excellent analysis of the evidence].
Using a signal detection analysis, Pleskac (2007)
showed how the less-is-more effect depends on
the false alarms and miss rates in the recogni-
tion judgments.
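The less-is-more prediction can be made concrete by counting pair types. The sketch below assumes exhaustive paired comparisons among N objects, of which n are recognized: pairs with exactly one recognized object are decided with accuracy α, pairs with both recognized with accuracy β, and pairs with neither recognized by guessing (accuracy 0.5). The guessing assumption and the parameter values are illustrative and are not stated in the text above.

```python
def expected_accuracy(n, N, alpha, beta):
    """Expected proportion correct over all pairs of N objects when n are recognized."""
    total_pairs = N * (N - 1) / 2
    one_recognized = n * (N - n)                   # recognition heuristic applies
    none_recognized = (N - n) * (N - n - 1) / 2    # guessing (assumed accuracy 0.5)
    both_recognized = n * (n - 1) / 2              # knowledge beyond recognition
    return (one_recognized * alpha
            + none_recognized * 0.5
            + both_recognized * beta) / total_pairs


# Illustrative values with alpha > beta, both held constant across n.
N, alpha, beta = 100, 0.8, 0.6
curve = [expected_accuracy(n, N, alpha, beta) for n in range(N + 1)]
best_n = max(range(N + 1), key=lambda n: curve[n])

print(best_n, round(curve[best_n], 3), round(curve[N], 3))
```

With these values, accuracy peaks when only part of the objects are recognized and declines to β when all are recognized, which is the inverse-U shape described above. If α is set at or below β, the same function increases monotonically in n.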
Dougherty et al. (2008) criticized the model
of the recognition heuristic for treating recog-
nition as binary input (threshold model) rather
than continuously. In contrast, Bröder &
Schütz (2009) argued that the widespread critique
of threshold models is largely invalid. In
a reanalysis of 59 published studies, they con-
cluded that threshold models in fact fit the data
better in about half of the cases.
Predicting Wimbledon. Although much of
the work has addressed the descriptive ques-
tion of what proportion of people rely on the
heuristic when it is ecologically rational, the
prescriptive question is how well the heuristic
can compete with well-established forecasting
instruments (Goldstein & Gigerenzer 2009).
For instance, Serwe & Frings (2006) reported
that collective recognition by amateur players
(who knew only half of the contestants)
turned out to be a better predictor of the 2004
Wimbledon tennis match outcomes (72% correct)
than the Association of Tennis Professionals
(ATP) Entry Ranking (66%), ATP
Champions Race (68%), and the seeding of
the Wimbledon experts (69%). Scheibehenne
& Bröder (2007) found the same surprising result
for Wimbledon 2006.
Predicting elections. Gaissmaier &
Marewski (2010) put the recognition heuristic
to a test in predicting federal and state elections
in Germany. Surprisingly, forecasts based on
name recognition were as accurate as inter-
viewing voters about their voting intentions.
This particularly holds true when predicting
the success of small parties, for which no polls
are usually available because those polls would
require huge samples. In contrast to surveys of
voting intentions, recognition-based forecasts
can be computed from small, “lousy” samples.
Investment. In three studies on predicting
the stock market, Ortmann et al. (2008) re-
ported that recognition-based portfolios (the
set of most-recognized options), on average,
outperformed managed funds such as the Fi-
delity Growth Fund, the market (Dow or Dax),
chance portfolios, and stock experts. In con-
trast, Boyd (2001) found no such advantage
when he used college students’ recognition of
stocks rather than that of the general public. It is
imperative to understand why and under what
conditions this simple heuristic can survive in
financial markets without making a systematic
market analysis. This remains an open question.
Consumer choice. The recognition heuris-
tic could be a first step in consideration set
formation (Marewski et al. 2010), as it allows
the choice set to be quickly reduced. This idea
is consistent with research that suggests that
priming a familiar brand increases the probabil-
ity that it will be considered for purchase (e.g.,
Coates et al. 2004). Brand recognition can be
even more important than attributes that are a
more direct reflection of quality. For instance,
in a blind test, most people preferred a jar of
high-quality peanut butter to two alternative
jars of low-quality peanut butter. Yet when a
familiar brand label was attached to one of the
low-quality jars, the preferences changed. Most
(73%) now preferred the jar with the label they
recognized, and only 20% preferred the unla-
beled jar with the high-quality peanut butter
(Hoyer & Brown 1990). Brand recognition may
well dominate the taste cues, or the taste cues
themselves might even be changed by brand
recognition—people “taste” the brand name.
Fluency Heuristic
The recognition heuristic is mute about the
underlying recognition process, just as Bayes’
rule is mute about the source of prior probabilities.
Dougherty et al. (2008) argued that it needs to
be embedded in a theory of the recognition pro-
cess. Schooler & Hertwig (2005) implemented
the heuristic based on the ACT-R (Adaptive
Control of Thought-Rational) model of mem-
ory, which showed how forgetting—a process
often seen as nuisance and handicap—can be
functional in the context of inference, gener-
ating less-is-more effects. In this same work,
the fluency heuristic was formulated for situ-
ations when both alternatives are recognized,
that is, when the recognition heuristic cannot be
applied:
Fluency heuristic: If both alternatives are rec-
ognized but one is recognized faster, then infer
that this alternative has the higher value with
respect to the criterion.
The fluency heuristic builds on earlier work
on fluency ( Jacoby & Dallas 1981). For in-
stance, fluent processing that stems from previ-
ous exposure can increase the perceived truth
of repeated assertions (Hertwig et al. 1997)
and the perceived fame of names ( Jacoby et al.
1989), and it is related to the mere exposure
effect (Zajonc 1968). People’s sense of fluency
has been reported to predict the performance
of stocks (Alter & Oppenheimer 2006).
By formalizing the fluency heuristic,
Schooler & Hertwig (2005) clearly defined the
difference between the recognition and fluency
heuristics and contributed to the progress in
replacing verbal labels with computational
models. The fluency heuristic is ecologically
rational if the speed of recognition is correlated
with the criterion, that is, if the fluency validity
exceeds 0.5. Hertwig et al. (2008) reported that the
validity of fluency for predicting variables such
as sales figures and wealth was always lower
than recognition validity, although always
above chance. Subsequently, they showed
that people can accurately tell the difference
between two recognition latencies if the
difference exceeded 100 ms, and that across
three environments, the mean proportions of
inferences consistent with the fluency heuristic
were 74%, 63%, and 68%, respectively.
Accordance rates were as high as 82% when
differences in recognition latencies were large.
Deriving the fluency heuristic’s prediction
for individual people and individual items is
a strong test. Yet it is not how the impact
of fluency is commonly tested in social and
cognitive psychology, where researchers tend
to manipulate fluency experimentally and
observe the consequences.
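A prediction of the fluency heuristic for a single pair of items can be sketched as follows. The 100 ms value comes from the discrimination finding reported above; treating it as a cutoff below which the heuristic makes no prediction is an assumption of this sketch, as are the function name and the example latencies.

```python
def fluency_heuristic(latency_a_ms, latency_b_ms, threshold_ms=100.0):
    """Both alternatives are assumed to be recognized.

    Returns "a" or "b" for the alternative recognized faster, or None when the
    latency difference does not exceed the threshold (assumed cutoff: the
    difference is then treated as not discriminable)."""
    if abs(latency_a_ms - latency_b_ms) <= threshold_ms:
        return None
    return "a" if latency_a_ms < latency_b_ms else "b"


# Hypothetical recognition latencies in milliseconds.
print(fluency_heuristic(450, 700))  # a
print(fluency_heuristic(700, 450))  # b
print(fluency_heuristic(500, 560))  # None
```

Applied to one person's measured latencies, such a function yields item-by-item predictions that can be compared with that person's actual inferences.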
Fluency also plays a role when alternatives
are not given (as in a two-alternative choice) but
need to be generated from memory. Johnson &
Raab (2003) proposed a variant of the fluency
heuristic when alternatives are sequentially re-
trieved rather than simultaneously perceived:
Take-the-first heuristic: Choose the first al-
ternative that comes to mind.
Johnson & Raab (2003) showed experienced
handball players video sequences from a pro-
fessional game and asked what they would have
done—e.g., pass the ball to the player at the
left or take a shot. On average, the first option
that came to mind was better than later
options, and better than the option chosen when
more time was given to inspect
the situation. This result was replicated
for basketball players (Hepler 2008). Klein’s
(2004) recognition-primed decision model for
expertise appears to be closely related to the
take-the-first heuristic.
Neural Basis of Recognition
and Evaluation
Although a number of studies have shown that
people do not automatically use the recogni-
tion heuristic when it can be applied, it is less
clear how this evaluation process can be mod-
eled. A functional magnetic resonance imag-
ing study tested whether the two processes,
recognition and evaluation, can be separated on
a neural basis (Volz et al. 2006). Participants
were given two tasks: The first involved only a
recognition judgment (“Have you ever heard of
Modena? Milan?”), while the second involved
an inference in which participants could rely
on the recognition heuristic (“Which city has
the larger population: Milan or Modena?”). For
mere recognition judgments, activation in the
precuneus, an area that is known from inde-
pendent studies to respond to recognition con-
fidence (Yonelinas et al. 2005), was reported.
In the inference task, precuneus activation was
also observed, as predicted, and activation was
detected in the anterior frontomedian cortex
(aFMC), which has been linked in earlier stud-
ies to evaluative judgments and self-referential
processing. The aFMC activation could repre-
sent the neural basis of this evaluation of eco-
logical rationality. Furthermore, the neural ev-
idence suggests that the recognition heuristic
may be relied upon by default, consistent with
the finding that response times were consid-
erably faster when participants’ inferences fol-
lowed the recognition heuristic than when they
did not (Pachur & Hertwig 2006, Volz et al.
2006).
ONE-REASON DECISION
MAKING
One-reason decisions: a class of heuristics that
bases judgments on one good reason only,
ignoring other cues (e.g., take-the-best and
hiatus heuristic).
Whereas the recognition and fluency heuristics
base decisions on recognition information,
other heuristics rely on recall. One class looks
for only one “clever” cue and bases its decision
on that cue alone. The hiatus heuristic is one
example. A second class involves sequential
search through cues, and it may search for
more than one cue but also bases its deci-
sion on only one. Examples include lexico-
graphic rules (Fishburn 1974, Luce 1956) and
elimination-by-aspect (Tversky 1972). These
heuristics were originally developed for prefer-
ences; here, we focus on models of inferences.
One-Clever-Cue Heuristics
Many animal species appear to rely on a sin-
gle “clever” cue for locating food, nest sites, or
mates. For instance, in order to pursue a prey or
a mate, bats, birds, and fish do not compute tra-
jectories in three-dimensional space, but sim-
ply maintain a constant optical angle between
their target and themselves—a strategy called
the gaze heuristic (Gigerenzer 2007, Shaffer
et al. 2004). In order to catch a fly ball, baseball
outfielders and cricket players rely on the same
kind of heuristics rather than trying to compute
the ball’s trajectory (McLeod & Dienes 1996).
Similarly, to choose a mate, a peahen investi-
gates only three or four of the peacocks dis-
playing in a lek and chooses the one with the
largest number of eyespots (Petrie & Halliday
1994).
When are one-clever-cue heuristics ecolog-
ically rational? The answer is not entirely clear
at this point in time, but candidates are envi-
ronments where the variability of cue weights
and redundancy is moderate to high and sam-
ple size is small (see Hogarth & Karelaia 2007,
Katsikopoulos et al. 2010, McGrath 2008).
Geographic profiling. The task of geographic
profiling is to predict where a serial criminal is
most likely to live given the sites of the crimes.
Typically, geographical profiling is performed
by sophisticated statistical software programs,
such as CrimeStat, that calculate a probabil-
ity distribution across possible locations. Snook
and colleagues (2005) were among the first to
challenge the “complexity equals accuracy” as-
sumptions in the field of profiling. They tested
the circle heuristic, which predicts the crim-
inal’s most likely location in the center of a
circle drawn through the two most distant sites
of crime. It relies on one cue only, the largest
distance. In a comparison with 10 other profil-
ing strategies, the heuristic predicted the loca-
tions best. Complex profiling strategies appear
to become more accurate if the number of crime
locations known is nine or higher. Snook et al.
(2004) taught two heuristics (including the cir-
cle heuristic) to laypeople in criminology and
reported that after a single session, laypeople
became about as accurate in predicting offender
locations as the CrimeStat algorithm. These re-
sults led to a heated debate with proponents
of optimization algorithms in profiling (e.g.,
Rossmo 2005).
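The circle heuristic can be sketched in a few lines. The sketch assumes that crime sites are given as planar (x, y) coordinates and that the circle is the one whose diameter connects the two most distant sites, so that its center is their midpoint; both are interpretive assumptions, and the coordinates are hypothetical.

```python
from itertools import combinations
from math import dist


def circle_heuristic(crime_sites):
    """Return (center, radius) of the circle whose diameter joins the two
    most distant crime sites; the center is the predicted home location."""
    a, b = max(combinations(crime_sites, 2), key=lambda pair: dist(*pair))
    center = ((a[0] + b[0]) / 2, (a[1] + b[1]) / 2)
    return center, dist(a, b) / 2


# Hypothetical crime sites on a planar map (arbitrary units).
sites = [(0, 0), (6, 8), (1, 2), (3, 3)]
center, radius = circle_heuristic(sites)
print(center, radius)
```

Only the single largest distance enters the prediction; all other sites are ignored, which is what makes this a one-cue strategy.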
Take-the-Best
The take-the-best heuristic is a model of how
people infer which of two alternatives has a
higher value on a criterion, based on binary
cue values retrieved from memory. For conve-
nience, the cue value that signals a higher cri-
terion value is 1, and the other cue value is 0.
Take-the-best consists of three building blocks:
1. Search rule: Search through cues in order
of their validity.
2. Stopping rule: Stop on finding the first
cue that discriminates between the alter-
natives (i.e., cue values are 1 and 0).
3. Decision rule: Infer that the alternative
with the positive cue value (1) has the
higher criterion value.
Take-the-best simplifies decision making by
stopping after the first discriminating cue and
ordering cues unconditionally according to
validity v, which is given by:
v = C/(C + W),
where C is the number of correct inferences
when a cue discriminates, and W is the number
of wrong inferences. Alternative search
rules such as success (Martignon & Hoffrage
2002, Newell et al. 2004) and discrimination
(Gigerenzer & Goldstein 1996) have been in-
vestigated. Todd & Dieckmann (2005) studied
alternative simple principles for learning cue
orders. Karelaia (2006) showed that a “confir-
matory” stopping rule—stop after two cues are
found that point to the same alternative—leads
to remarkably robust results across varying cue
orders, which is ecologically rational in situ-
ations where the decision maker knows little
about the validity of the cues.
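The three building blocks of take-the-best and the validity measure v can be sketched as follows. The objects, cue names, and criterion values are hypothetical; returning None when no cue discriminates (leaving guessing to the caller), assigning v = 0.5 to a cue that never discriminates, and skipping pairs with tied criterion values are assumptions of this sketch.

```python
from itertools import combinations


def cue_validity(cue, objects, criterion):
    """v = C / (C + W) over all pairs in which the cue discriminates."""
    correct = wrong = 0
    for a, b in combinations(objects, 2):
        if objects[a][cue] == objects[b][cue] or criterion[a] == criterion[b]:
            continue  # cue does not discriminate, or tie (assumption: skipped)
        favored, other = (a, b) if objects[a][cue] == 1 else (b, a)
        if criterion[favored] > criterion[other]:
            correct += 1
        else:
            wrong += 1
    return correct / (correct + wrong) if correct + wrong else 0.5


def take_the_best(a, b, objects, cue_order):
    """Infer which of a and b has the higher criterion value."""
    for cue in cue_order:                               # search rule: by validity
        if objects[a][cue] != objects[b][cue]:          # stopping rule: first discriminating cue
            return a if objects[a][cue] == 1 else b     # decision rule: positive cue value wins
    return None  # no cue discriminates


# Hypothetical objects with binary cue values and made-up criterion values.
objects = {
    "A": {"capital": 1, "airport": 1, "university": 1},
    "B": {"capital": 0, "airport": 1, "university": 0},
    "C": {"capital": 0, "airport": 0, "university": 1},
    "D": {"capital": 0, "airport": 1, "university": 0},
}
criterion = {"A": 900, "B": 500, "C": 300, "D": 100}

cues = ["airport", "capital", "university"]
validities = {cue: cue_validity(cue, objects, criterion) for cue in cues}
cue_order = sorted(cues, key=validities.get, reverse=True)

print(cue_order)
print(take_the_best("A", "B", objects, cue_order))
print(take_the_best("B", "C", objects, cue_order))
```

In this toy example the inference for B versus C is wrong, since the first discriminating cue favors C; a heuristic with a validity below 1 is expected to err on some pairs. Validities here are computed unconditionally, that is, each cue is evaluated on all pairs regardless of the other cues.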
A striking discovery was that take-the-
best can predict more accurately than linear
multiple regression models (Czerlinski et al.
1999). It can even predict more accurately than
complex nonlinear strategies. Figure 1 shows
the predictive accuracy of an exemplar-based
model (nearest-neighbor classifier), Quinlan’s
decision-tree induction algorithm C4.5, and
classification and regression trees (CARTs),
compared to take-the-best. In both tasks,
and across most sample sizes, take-the-best
achieves higher predictive accuracy than each
of the three complex strategies (Brighton &
Gigerenzer 2011). This is not to say that re-
lying on one good reason is always better, but
the result in Figure 1 is the most frequently ob-
tained in a total of 20 environments. Note that
CARTs have been designed to be robust against
estimation error (variance) due to small samples
and other factors. These complex algorithms
can mimic the outcome of take-the-best in the
sense that they are models that include take-
the-best as a special case. Yet, although their
greater flexibility leads to better fit of known
data, more general models do not necessarily
lead to better predictions of unknown data.
As noted above, take-the-best orders cues
unconditionally, unlike the other models in
Figure 1. Ordering cues conditionally, that
is, taking their interdependencies into account,
may seem a more rational strategy. In fact, in a
small world where all cue validities are perfectly
known, conditional validity leads to higher (or
at least equal) accuracy than unconditional va-
lidity (Schmitt & Martignon 2006). However,
in large worlds as in Figure 1, where the
cue order needs to be estimated from samples,
this no longer holds. If one makes take-the-best
more sophisticated by ordering cues conditionally
(greedy take-the-best), the predictive
accuracy drops to the level of the complex
strategies (Figure 1). This suggests that the
predictive power of take-the-best stems mostly
from the search rule rather than the stopping
rule.
[Figure 1 plot. Panel A: City Populations; panel B: Mammal Lifespans. x-axis: sample size, n; y-axis: mean predictive accuracy (axis range 50 to 75). Strategies shown: take-the-best, nearest neighbor, C4.5, CART, greedy TTB.]
Figure 1
A competition between take-the-best and three well-known learning algorithms [nearest neighbor classifier, Quinlan’s decision-tree
induction algorithm C4.5, and classification and regression tree (CART)], also including a greedy version of take-the-best (TTB) that
orders cues by conditional validity instead of unconditional validity. Mean predictive accuracy in cross-validation is plotted as a function
of the sample size of the training set. The two tasks were deciding (A) which of two German cities has more inhabitants, and (B) which
of two mammal species lives longer on average (Brighton & Gigerenzer 2011).
The ecological rationality of take-the-best
has been studied in three different situations:
(a) when the cue order is known (Katsikopoulos
& Martignon 2006, Martignon & Hoffrage
2002), (b) when error is introduced in that
knowledge (Hogarth & Karelaia 2007), and
(c) when the order of cues needs to be inferred
from samples (Brighton 2006, Gigerenzer &
Brighton 2009). Taken together, these results
suggest two structures of environments that
take-the-best can exploit: high cue redundancy
and high variability in cue weights.
Many experimental studies asked the
descriptive question whether take-the-best can
predict people’s inferences (e.g., Bröder 2003,
Bröder & Gaissmaier 2007, Bröder & Schiffer
2006, Newell & Shanks 2003, Rieskamp &
Hoffrage 1999). Dieckmann & Rieskamp
(2007) first showed that in environments
with high redundancy, take-the-best is as
accurate as and more frugal than naïve Bayes
(a strategy that integrates all cues), and then
experimentally demonstrated that in high-
redundancy environments, take-the-best pre-
dicted participants’ judgments best, whereas in
low-redundancy environments, compensatory
strategies predicted best, indicating adaptive
strategy selection. Rieskamp & Otto (2006)
showed that in an environment with high vari-
ability of cue validities, judgments consistent
with take-the-best increased over experimental
trials from 28% to 71%, whereas in an envi-
ronment with low variability, they decreased to
12%. Bröder (2003) reported similar selection
of take-the-best dependent on the variability
of cue validities. In several experiments,
individuals classified as take-the-best users for
tasks where the heuristic is ecologically rational
showed higher IQs than those who were classi-
fied as compensatory decision makers, suggest-
ing that cognitive capacity as measured by IQ “is
not consumed by strategy execution, but rather
by strategy selection” (Bröder & Newell 2008,
p. 209).
Bergert & Nosofsky (2007) formulated
a stochastic version of take-the-best, tested
it against a weighted additive model at the
individual level, and concluded that the vast
majority of participants adopted the take-the-
best heuristic. Comparing take-the-best with
both weighted additive and exemplar models
of categorization, Nosofsky & Bergert (2007)
found that most participants did not use an
exemplar-based strategy but instead followed
the response time predictions of take-the-best.
Bröder & Gaissmaier (2007) analyzed five pub-
lished experiments and one new experiment,
and reported that in all instances when decision
outcomes indicated the use of take-the-best,
decision times increased monotonically with
the number of cues that had to be searched in
memory, as predicted by take-the-best’s search
and stopping rules. Taken together, these stud-
ies indicate systematic individual differences in
strategy use and adaptive use of take-the-best.
García-Retamero & Dhami (2009) tested
how policemen, professional burglars, and
laypeople infer which of two residential prop-
erties is more likely to be burgled. Both ex-
pert groups’ inferences were best modeled by
take-the-best, and laypeople’s inferences by a
weighted additive rule. The latter may reflect
that laypeople need to explore all the infor-
mation, whereas experts know what is relevant,
consistent with findings of the literature on ex-
pertise (Ericsson et al. 2007, Reyna & Lloyd
2006, Shanteau 1992).
Concerns were raised by Juslin & Persson
(2002) that take-the-best is not so simple after
all but requires complex computations for
ordering the cues; Dougherty et al. (2008)
and Newell (2005) voiced similar concerns.
First, it is true that estimating validity order
can sometimes be nontrivial, yet it is simpler
than estimating other kinds of weights such
as regression weights. Second, people estimate
order from samples rather than by calculating
the “true” order from perfect knowledge about
the entire population, as Juslin and Persson as-
sumed. Even with minute sample sizes of two to
ten—resulting in estimated orders that deviate
from the true order—take-the-best predicted
more accurately than multiple regression when
both were provided with continuous cue values
(Katsikopoulos et al. 2010). Finally, a person
does not need to learn cue orders individually
but instead can learn from others, as through
teaching and imitation (Gigerenzer et al. 2008).
Consumer choice. How do consumers decide
which product to buy among an ever-increasing
assortment on the Internet or on supermarket
shelves? The classical methodology to answer
this question has been conjoint analysis, which
assumes a weighted linear combination of fea-
tures or cues. When John Hauser, a propo-
nent of conjoint analysis, began to test mod-
els of heuristics, he found to his surprise that
sequential heuristics predict consumer choices
well (Hauser et al. 2009). Examples are deci-
sions between computers (Kohli & Jedidi 2007)
and smartphones (Yee et al. 2007). In particu-
lar, heuristics are important early in the deci-
sion process to form a consideration set, which
consists of eliminating most products from fur-
ther consideration. Once the consideration set
is formed, consumers evaluate the remaining
options more carefully (Gaskin et al. 2007; see
also Reisen et al. 2008). Within their considera-
tion set of potential suppliers, they then appear
to trade off price and reliability to reach their
final choice.
Literature search. How should an organi-
zation design a search algorithm for priori-
tizing literature searches from the PsycINFO
database? Lee and colleagues (2002) engineered
two methods for identifying articles relevant
to a given topic of interest (e.g., eyewitness
testimony), one a variant of take-the-best, the
other a Bayesian model using all available in-
formation. Lee et al. tested both methods on
ten actual literature searches and measured the
methods’ performances against effort (i.e., the
proportion of the articles read by the user)
and accuracy (i.e., proportion of relevant arti-
cles found). The variant of take-the-best was as
good as or better than the Bayesian model, par-
ticularly in searches in which the proportion of
relevant articles was small.
Fast-and-Frugal Trees
One way to model classification is in terms
of trees. For instance, Bayes’ rule can be represented as a tree with 2^m leaves, where m is the number of binary cues or attributes.
Natural frequencies provide such a represen-
tation. Yet when the number of cues grows,
a Bayesian analysis—with or without natu-
ral frequencies—becomes computationally in-
tractable or fraught with estimation error be-
cause one typically has too few data points for
the thousands of leaves of such a gigantic tree.
A fast-and-frugal tree has only m+1 leaves and
thus is likely more robust. It has building blocks
similar to take-the-best (Martignon et al. 2003):
1. Search rule: Search through cues in a pre-
determined order.
2. Stopping rule: Stop search as soon as a
cue leads to an exit.
3. Decision rule: Classify the object
accordingly.
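The three building blocks can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not an implementation from the cited work: the names `Node`, `classify`, and `n_exits`, the cue names, and the class labels are all hypothetical. Each cue offers one exit, and the last cue offers a second one, which yields m + 1 exits.

```python
from dataclasses import dataclass
from typing import Dict, List

Case = Dict[str, bool]  # binary cue values for one object


@dataclass
class Node:
    cue: str        # cue inspected at this level of the tree
    exit_on: bool   # cue value that triggers the exit at this level
    label: str      # class assigned when that exit is taken


def classify(tree: List[Node], case: Case, default: str) -> str:
    for node in tree:                        # search rule: fixed cue order
        if case[node.cue] == node.exit_on:   # stopping rule: first exit reached
            return node.label                # decision rule: class of that exit
    return default                           # second exit of the last cue


def n_exits(tree: List[Node]) -> int:
    # One exit per cue plus the final default exit: m + 1.
    return len(tree) + 1


# Hypothetical three-cue tree; cue names and labels are made up.
tree = [
    Node("cue_1", True, "high risk"),
    Node("cue_2", False, "low risk"),
    Node("cue_3", True, "high risk"),
]
```

With three cues this tree has four exits, whereas a full tree over the same cues would have 2^3 = 8 leaves.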
Fast-and-frugal trees are used by experts in
many fields, from cancer screening to bail de-
cisions (see Figure 2). Martignon et al. (2008)
tested the accuracy of fast-and-frugal trees in
30 classification problems from fields such as
medicine, sports, and economics. They re-
ported that complex benchmark strategies in-
cluding logistic regression excelled in data fit-
ting, but fast-and-frugal trees were close or
identical to these strategies in their predictive
accuracy.
Figure 2
Fast-and-frugal trees for medical and legal decisions. The tree on the left prescribes how emergency physicians can detect acute ischemic heart disease. It only asks up to three yes/no questions, namely whether the patient’s electrocardiogram shows a certain anomaly (“ST segment changes”), whether chest pain is the patient’s primary complaint, and whether there is any other factor (Green & Mehr 1997). The tree on the right describes how magistrates at a London court decided whether to bail a defendant or to react punitively by imposing conditions such as curfew or imprisonment. The logic is defensive and “passes the buck.” The tree predicted 92% of bail decisions correctly (Dhami 2003). Abbreviations: MI, myocardial infarction; N.A., not applicable; NTG, nitroglycerin; T, T-waves with peaking or inversion.
Emergency medicine. When patients arrive at the hospital with severe chest pain, emergency physicians have to decide quickly whether they suffer from acute ischemic heart disease and should be assigned to the intensive coronary care unit (ICU). In a Michigan hospital, doctors preferred to err on what they believed was the safe side by sending about 90% of the patients to the ICU, although only about 25% of these actually had a myocardial infarction (Green & Mehr 1997). The result
was an overly crowded ICU, a decrease in qual-
ity of care, an increase in cost, and a risk of seri-
ous infection among those who were incorrectly
assigned. Something had to be done. Green
& Mehr (1997) tried two solutions: (a) a lo-
gistic regression, the Heart Disease Predictive
Instrument (HDPI), and (b) a fast-and-frugal
tree. To use the HDPI, doctors received a chart
with some 50 probabilities, checked the pres-
ence and absence of symptoms, and inserted the
relevant probabilities into a pocket calculator.
The fast-and-frugal tree ignored all probabil-
ities and asked only a few yes-or-no questions
(Figure 2). Ultimately, the tree was more accu-
rate in predicting actual heart attacks than the
HDPI: It sent fewer patients who suffered from
a heart attack wrongly into a regular bed and
also nearly halved physicians’ high false-alarm
rate. Last but not least, the tree was transparent,
easy to memorize, and easy to modify, and was
accepted by physicians who disliked relying on
a logistic regression they barely understood.
Easy memorization is an important feature
of fast-and-frugal trees, particularly in emer-
gency situations. After the terrorist attacks on
September 11, 2001, START (Simple Triage
and Rapid Treatment; Cook 2001) helped
paramedics to quickly classify victims into two
major categories: those who needed medical
treatment immediately and those whose treat-
ment could be delayed. A tree with only two
cues—age and duration of fever—was devel-
oped to decide upon macrolide prescription
in young children with community-acquired
pneumonia (Fischer et al. 2002). This tree was
slightly less accurate than a scoring system
based on logistic regression (72% versus 75%),
but it does not require any expensive technol-
ogy and thus can be applied to millions of chil-
dren worldwide who would otherwise not have
access to healthcare.
How to model physicians’ thinking? Tak-
ing for granted that physicians use heuristics
for diagnosing patients, the medical commu-
nity quickly adopted the heuristics-and-biases
view and has left it mainly unrevised as of today
(Croskerry 2009). For instance, Elstein (1999)
described heuristics as “mental shortcuts com-
monly used in decision making that can lead to
faulty reasoning or conclusions” (p. 791) and
blamed them for many errors in clinical rea-
soning. Some researchers, however, recognize
their potential to improve decisions. McDonald
(1996), for one, wrote, “admitting the role of
heuristics confers no shame” (p. 56). Rather,
the goal should be to formalize and understand
heuristics so that their use can be effectively
taught, which could lead to less practice varia-
tion and more efficient medical care. “The next
frontier will involve fast and frugal heuristics;
rules for patients and clinicians alike” (Elwyn
et al. 2001, p. 358).
For diagnosis, which is a form of classifi-
cation, fast-and-frugal trees potentially model
how physicians make decisions. For treatment
choice, all heuristics described above are po-
tential models. Both fast-and-frugal trees and
other heuristics differ from traditional models
of medical decision making, such as logistic
regression for classification and expected utility
maximization for choice. Dhami & Harries
(2001) compared a fast-and-frugal tree
(“matching heuristic”) to a linear regression
model on general practitioners’ decisions
to prescribe lipid-lowering drugs for a set
of hypothetical patients. Both models fitted
prescription decisions equally well, but the
simple tree relied on less information. Similar
results were obtained by Smith & Gilhooly
(2006) and Backlund et al. (2009). These
studies reported only fitting—not predicting—
physicians’ judgments, which is a limitation.
More direct evidence comes from the routine
use of fast-and-frugal trees by physicians in
cancer screening and HIV tests.
Bail decisions. Heuristics matter in the law in
multiple respects. They play a role in the mak-
ing of law (Haidt et al. 2006) as well as in lit-
igation (Hastie & Wittenbrink 2006). In both
domains, there has been debate whether heuris-
tics are a problem or a solution (Gigerenzer &
Engel 2006).
One of the initial decisions of the legal system is whether to bail a defendant unconditionally or to react punitively by imposing conditions
such as curfew or imprisonment. In England
and Wales, around two million bail decisions
are made every year by benches of two or
three magistrates, 99.9% of whom are mem-
bers of the local community without legal
training. How do they make these decisions?
When magistrates were interviewed, they gen-
erally responded that they thoroughly exam-
ined and weighed all information in a complex
way (Dhami & Ayton 2001). However, when
Dhami (2003) observed several hundred trials in two London courts, she found that the
average time magistrates spent on a case was
6 to 10 minutes and that their decisions could
be predicted better with a fast-and-frugal tree
(“matching heuristic”) than with weighting and
adding all information (Figure 2). The logic
of the tree appears to be to “pass the buck,”
because it copies the punitive decisions of the
prosecution, a previous court, or the police.
It violates due process because it ignores rel-
evant information about the defendant. In the
two courts, the fast-and-frugal trees predicted
92% and 85% of all decisions correctly (cross-
validation), compared to 86% and 73% by a
weighted additive model that would correspond
to due process and what magistrates responded
in the interviews.
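The “pass the buck” logic can be written as a short sketch. It assumes, as the description above suggests, that the tree exits with a punitive decision as soon as the prosecution, a previous court, or the police acted punitively. The cue order is taken from that sentence and is an assumption, and the function name `bail_decision` is hypothetical.

```python
from typing import Tuple


def bail_decision(prosecution_punitive: bool,
                  previous_court_punitive: bool,
                  police_punitive: bool) -> Tuple[str, int]:
    """Return the decision and the number of cues inspected."""
    # Assumed cue order: prosecution, previous court, police.
    cues = (prosecution_punitive, previous_court_punitive, police_punitive)
    for inspected, cue in enumerate(cues, start=1):
        if cue:                       # first punitive cue ends the search
            return "punitive", inspected
    return "unconditional bail", len(cues)
```

No information about the defendant enters the function, which is the sense in which the tree ignores what due process would require.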
TRADE-OFF HEURISTICS
Unlike recognition-based and one-reason deci-
sions, the third class of heuristics weights cues
or alternatives equally and thus makes trade-offs
(compensatory strategies).
Tallying
Whereas take-the-best ignores cues (but in-
cludes a simple form of weighting cues by or-
dering them), tallying ignores weights, weight-
ing all cues equally. It entails simply counting
the number of cues favoring one alternative in
comparison to others.
Trade-offs: a class of heuristics that weights all cues or alternatives equally and thus makes trade-offs (e.g., tallying and 1/N)
1. Search rule: Search through cues in any
order.
2. Stopping rule: Stop search after m out of
a total of M cues (with 1 < m ≤ M). If the
number of positive cues is the same for
both alternatives, search for another cue.
If no more cues are found, guess.
3. Decision rule: Decide for the alternative
that is favored by more cues.
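The three rules can be sketched as follows for a choice between two alternatives. This is an illustrative reading, not the authors' code: cue values are coded as 1 (positive) or 0, the function name `tally` is hypothetical, and the cues are searched in list order, which is one arbitrary choice among the orders that the search rule permits.

```python
import random
from typing import Optional, Sequence


def tally(cues_a: Sequence[int], cues_b: Sequence[int], m: int,
          rng: Optional[random.Random] = None) -> str:
    """Choose 'a' or 'b' by counting positive cues (1 = positive, 0 = not)."""
    M = len(cues_a)
    if len(cues_b) != M or not 1 < m <= M:
        raise ValueError("need equally long cue lists and 1 < m <= M")
    rng = rng or random.Random()
    score_a = score_b = 0
    for i in range(M):                       # search rule: any order
        score_a += cues_a[i]
        score_b += cues_b[i]
        # Stopping rule: stop after m cues unless the counts are tied,
        # in which case another cue is searched.
        if i + 1 >= m and score_a != score_b:
            break
    if score_a == score_b:                   # no more cues left: guess
        return rng.choice(["a", "b"])
    return "a" if score_a > score_b else "b" # decision rule
```

For example, `tally([1, 1, 0, 1], [1, 0, 0, 0], m=2)` stops after two cues and chooses `"a"`.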
Dawes (1979; Dawes & Corrigan 1974)
showed that tallying was about as accurate as
multiple regression and sometimes even better.
In a more extensive test across 20 environments,
Czerlinski et al. (1999) demonstrated that tal-
lying had, on average, a higher predictive accu-
racy. The challenge is to figure out when this is
the case. Einhorn & Hogarth (1975) found that
unit-weight models were successful in com-
parison to multiple regression when the ratio
of alternatives to cues was 10 or smaller, the
linear predictability of the criterion was small
(R^2 ≤ 0.5), and cues were highly redundant.
Relatively few studies have identified condi-
tions under which people would use a tally-
ing strategy. Interestingly, it seems that more
people prefer to dispense with particular cues
(as in one-reason decision making) than with
cue order or weights (Bröder & Schiffer 2003,
Rieskamp & Hoffrage 2008; but see Wang
2008). One reason for the relatively low preva-
lence of tallying could be that these studies used
only a few cues, typically four or five. Below we
provide two illustrations of the prescriptive use
of tallying in institutions (for more, see Astebro
& Elhedhli 2006, Graefe & Armstrong 2009,
Lichtman 2008, Wang 2008).
Magnetic Resonance Imaging (MRI) or
simple bedside rules? There are about
2.6 million emergency room visits for dizzi-
ness or vertigo in the United States every year
(Kattah et al. 2009). The challenging task for
the emergency physician is to detect the rare
cases where dizziness is due to a dangerous
brainstem or cerebellar stroke. Frontline mis-
diagnosis of strokes happens in about 35%
of the cases. One solution to this challenge
could be technology. Getting an early MRI with
diffusion-weighted imaging takes 5 to 10 min-
utes plus several hours of waiting time, costs
more than $1,000, and is not readily available
everywhere. However, Kattah et al. (2009) de-
veloped a simple bedside eye exam that actu-
ally outperforms MRI and takes only about one
minute: It consists of three tests and raises an
alarm if at least one indicates a stroke. This
simple tallying rule correctly detected 100% of
those patients who actually had a stroke (sen-
sitivity), whereas an early MRI only detected
88%. Out of 25 patients who did not have a
stroke, the bedside exam raised a false alarm in
only one case (i.e., 4% false positive rate = 96%
specificity). Even though the MRI did not raise
any false alarms, the bedside exam seems prefer-
able in total, given that misses are more severe
than false alarms and that it is faster, cheaper,
and universally applicable.
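The alarm rule and the specificity arithmetic can be checked with a small sketch. The function names are hypothetical. Only the counts stated above are used (25 patients without stroke, one false alarm). The number of stroke patients is not given here, so sensitivity is not recomputed.

```python
from typing import Sequence


def raises_alarm(test_indicates_stroke: Sequence[bool]) -> bool:
    # Alarm if at least one of the bedside tests indicates a stroke.
    return sum(test_indicates_stroke) >= 1


def specificity(true_negatives: int, false_positives: int) -> float:
    return true_negatives / (true_negatives + false_positives)


# 25 patients without stroke, one false alarm, hence 24 true negatives.
spec = specificity(true_negatives=24, false_positives=1)
false_positive_rate = 1 - spec
```

Here `spec` comes out at 0.96 and `false_positive_rate` at 0.04, matching the 96% specificity and 4% false positive rate reported in the text.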
Avoiding avalanche accidents. Hikers and
skiers need to know when avalanches could oc-
cur. The obvious clues method is a tallying
heuristic that checks how many out of seven
cues have been observed en route or on the
slope that is evaluated (McCammon & Hägeli
2007). These cues include whether there has
been an avalanche in the past 48 hours and
whether there is liquid water present on the
snow surface as a result of recent sudden warm-
ing. When more than three of these cues are
present on a given slope, the situation should
be considered dangerous. With this simple tal-
lying strategy, 92% of the historical accidents
(where the method would have been applicable)
could have been prevented.
Mapping Model
How do people arrive at quantitative estimates
based on cues? The mapping model assumes
that people tally the number of relevant cues
with an object’s positive values (von Helversen
& Rieskamp 2008). The estimate is the median
criterion value of objects with the same number
of positive cues. The mapping model captured
people’s judgment better than a linear regres-
sion and an exemplar model when the criterion
values followed a skewed distribution.
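A minimal sketch of the mapping model as described here: objects are grouped by their number of positive cues, and the estimate for a new object is the median criterion value of its group. The function names and the toy data are hypothetical. How to handle a cue sum that never occurred in the known objects is left open in this sketch.

```python
from collections import defaultdict
from statistics import median
from typing import Dict, List, Sequence, Tuple


def fit_mapping_model(
    known: Sequence[Tuple[Sequence[int], float]]
) -> Dict[int, float]:
    """Map each number of positive cues to the median criterion value."""
    groups: Dict[int, List[float]] = defaultdict(list)
    for cues, criterion in known:
        groups[sum(cues)].append(criterion)   # tally positive cue values
    return {k: median(v) for k, v in groups.items()}


def estimate(model: Dict[int, float], cues: Sequence[int]) -> float:
    # Raises KeyError for a cue sum not seen among the known objects.
    return model[sum(cues)]


# Toy data (made up): binary cue profiles with criterion values.
known = [
    ([0, 0, 0], 5.0),
    ([1, 0, 0], 10.0),
    ([0, 1, 0], 20.0),
    ([0, 0, 1], 60.0),
    ([1, 1, 0], 100.0),
]
model = fit_mapping_model(known)
```

In the toy data, every object with exactly one positive cue receives the estimate 20.0, the median of 10, 20, and 60. Using the median rather than the mean makes the estimate insensitive to the skew introduced by the value 60.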
Sentencing decision. In the adversarial U.S.
legal system, the vast majority of cases are closed
by plea bargaining, where the prosecution and
defense negotiate a sentence, which is then rat-
ified by a judge. In contrast, in Germany and
many other countries, plea bargaining before a
case goes to court is an exception rather than the
rule. Here, the judge has to determine an appro-
priate sentence proportional to the offender’s
guilt, within the range of the minimum and
maximum sentence specified for each offense.
The single most important factor influencing
judges’ decisions is the prosecution’s sentenc-
ing recommendation. How should the prose-
cution make its recommendation? The German
penal code lists over 20 factors to consider. The
legal literature recommends a three-step strat-
egy: Determine first all relevant factors and the
direction of their effect on the sentence (aggra-
vating or mitigating), then weight these by their
importance, and add them up to determine the
sentence. Von Helversen & Rieskamp (2009)
analyzed trial records of sentencing and tested
five models of how sentencing decisions have
been made in theft, fraud, and forgery, includ-
ing a linear regression model. The best predic-
tions of actual sentences were obtained by the
mapping model, a heuristic model of quantita-
tive estimation, based on a simple tallying rule
described above. As von Helversen & Rieskamp
(2009) pointed out, this result “provides further
evidence that legal decision makers rely heav-
ily on simple decision heuristics...and suggests
that eliciting these employed heuristics is an im-
portant step in understanding and improving
legal decision making” (pp. 389–390).
1/N Rule
Another variant of the equal weighting principle is the 1/N rule, which is a simple heuristic for the allocation of resources (time, money) to N alternatives:
1/N rule: Allocate resources equally to each of N alternatives.
This rule is also known as the equality
heuristic (Messick 1993). Sharing an amount
of money equally is the modal response in the
one-shot ultimatum game for adults and also
the most frequent split in children’s group de-
cisions, contrary to the predictions of game the-
ory (Takezawa et al. 2006).
Investment. When deciding how to allocate
financial resources among N options, some individuals rely on the 1/N rule (Benartzi & Thaler 2001), which allocates financial resources equally across all alternatives. The 1/N
rule was compared to 14 optimizing mod-
els, including a Nobel Prize–winning model,
Markowitz’s mean-variance portfolio, in seven
investment problems (DeMiguel et al. 2009).
To estimate the models’ parameters, each opti-
mizing strategy received 10 years of stock data
and then had to predict the next month’s per-
formance on this basis. The same procedure was
repeated, with a moving window, for the next
month, and so forth, until no data were left.
Note that 1/N does not have any free parameters that need to be estimated. Nevertheless, it
came out first on certainty equivalent returns,
second on turnover, and fifth on the Sharpe ra-
tio. None of the complex optimizing models
could consistently beat it.
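The absence of free parameters is easy to see in code. The sketch below is illustrative only: the function names and the return figures are made up, and it is not the evaluation procedure of DeMiguel et al. (2009). The weights depend on N alone, so no past data are needed to set them.

```python
from typing import List, Sequence


def one_over_n_weights(n: int) -> List[float]:
    if n < 1:
        raise ValueError("need at least one alternative")
    return [1.0 / n] * n          # nothing is estimated from data


def portfolio_return(weights: Sequence[float],
                     asset_returns: Sequence[float]) -> float:
    # Weighted sum of the assets' returns over one period.
    return sum(w * r for w, r in zip(weights, asset_returns))


weights = one_over_n_weights(4)
# Hypothetical one-month returns of four assets.
monthly = portfolio_return(weights, [0.08, -0.04, 0.02, 0.06])
```

With equal weights the portfolio return is simply the mean of the asset returns, here 0.03.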
SOCIAL INTELLIGENCE
According to the social intelligence hypothesis,
also called the Machiavellian intelligence hy-
pothesis (Whiten & Byrne 1997), highly so-
cial species such as humans and other social
primates should be intellectually superior to
less social ones because the social environ-
ment is more complex, less predictable, and
more intellectually challenging. In Humphrey’s
(1976/1988) words, social primates “must be
able to calculate the consequences of their
own behavior, to calculate the likely behaviours
of others, to calculate the balance of advan-
tage and loss” (p. 19). For the sake of argu-
ment, let us assume that the social world is
indeed more complex and unpredictable than
the nonsocial one. Would social intelligence
therefore require more complex cognition? Not
necessarily, according to the following two hy-
potheses (Hertwig & Herzog 2009):
1. Social intelligence does not require com-
plex mental calculation; it also works with
heuristics.
2. The same heuristics that underlie nonso-
cial decision making also apply to social
decisions (but not vice versa).
The justification for hypothesis 1 is the same
as for complex nonsocial problems: The more
unpredictable a situation is, the more infor-
mation needs to be ignored to predict the fu-
ture. One reason for hypothesis 2 is that the
distinction between social and nonsocial is a
common oversimplification in the first place.
Nevertheless, for the purpose of this review,
we distinguish two meanings of social: whether
the input into a strategy is social informa-
tion (e.g., when imitating the behavior of a
peer) or not (e.g., features of digital cameras),
and whether the task is a game against na-
ture or a social game involving other humans
(Hertwig et al. 2011). The goals of social intel-
ligence go beyond accuracy, frugality, and mak-
ing fast decisions. They include transparency,
group loyalty, and accountability (Lerner &
Tetlock 1999). Consistent with hypothesis 2,
heuristics from all three classes discussed above
have been investigated in social situations.
Below are a few examples.
Recognition-Based Decisions
Reimer & Katsikopoulos (2004) first showed
analytically that less-is-more effects are larger
in group decisions than in individual decisions
and subsequently demonstrated this effect em-
pirically in an experiment in which another
fascinating phenomenon emerged. Consider a
group of three in which one member recognized only city a while the other two members recognized both cities a and b and individually chose b as the larger one. The majority rule predicts that b would always be selected, yet
in 59% of the cases, the final group decision
was a, following the one who had not heard
of b.
One-Reason Decision Making
As mentioned above, the behavior of most peo-
ple in the one-shot ultimatum game is in-
consistent with the classical economic predic-
tions. Most researchers nevertheless retained
the utility-maximizing framework and added
free parameters for other-regarding disposi-
tions (e.g., Fehr & Schmidt 1999). In contrast,
Rubinstein (2003) called for a radical change,
“to open the black box of decision making, and
come up with some completely new and fresh
modeling devices” (p. 1215). Fischbacher and
colleagues (2011) did so and modeled the in-
dividual differences observed in the ultimatum
game by fast-and-frugal trees of different sizes,
involving one to four cues. The number of cues
predicted how long decisions took.
Trade-Off Heuristics
Many parents try to divide their time every day
between their N children equally by 1/N. If parents have only two children, 1/N will attain the
long-term goal of providing each child with as
much time as the other. But if there are three or
more children (excepting multiple births), only
the first-born and last-born have exclusive time
with the parents, while the middle-borns have
to share with their siblings throughout their
childhood and thus end up receiving less time in
total. The simple 1/N rule predicts a complex
pattern of care time for each child, a pattern
observed in a survey of 1,296 families (Hertwig
et al. 2002). This result illustrates that a heuris-
tic and its goal (fair division during childhood)
are not the same—the environment has the last
word. The majority rule is a second example
of a tallying rule applied to group decisions; it
also defines democratic voting systems (Hastie
& Kameda 2005).
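The birth-order pattern follows from simple arithmetic, which the sketch below works through. It rests on simplifying assumptions that are not in the source: time runs in yearly steps, parents have one unit of time per year, each child is at home for 18 years, and the unit is split equally among the children at home in that year. The function name `total_care_time` is hypothetical.

```python
from fractions import Fraction
from typing import List, Sequence


def total_care_time(birth_years: Sequence[int],
                    childhood: int = 18) -> List[Fraction]:
    """Total parental time per child under a yearly 1/N split."""
    totals = [Fraction(0)] * len(birth_years)
    for year in range(min(birth_years), max(birth_years) + childhood):
        at_home = [i for i, b in enumerate(birth_years)
                   if b <= year < b + childhood]
        for i in at_home:
            totals[i] += Fraction(1, len(at_home))   # equal share this year
    return totals


two_children = total_care_time([0, 3])       # births three years apart
three_children = total_care_time([0, 3, 6])
```

Under these assumptions, two children each receive 21/2 units. With three children, the first-born and last-born each receive 17/2 units and the middle-born only 7, although every single year was divided equally.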
Social Heuristics
Although the heuristics discussed so far can be
fed with both social and nonsocial information,
there are genuinely social heuristics designed
exclusively for social information. Examples
include imitation heuristics, tit-for-tat, the
social-circle heuristic, and averaging the judg-
ments of others to exploit the “wisdom of
crowds” (Hertwig & Herzog 2009). Imitate-
the-successful, for instance, speeds up learn-
ing of cue orders and can find orders that
outperform take-the-best’s validity order (García-Retamero et al. 2009). Social heuristics prove
particularly helpful in situations in which the
actor has little knowledge. The classic example
is that of Francis Galton, who visited a livestock
fair where villagers estimated the weight of an
ox and was surprised to find that their median
and mean estimates were only 9 and
1 pounds, respectively, off the actual weight of
1198 pounds (Galton 1907).
A peculiar social rule is the default heuristic:
“If there is a default, do nothing about it.”
Defaults are set by institutions and act as implicit recommendations (Johnson & Goldstein
2003). Every year, an estimated 5,000
Americans and 1,000 Germans die while wait-
ing for a suitable organ donor. Although most
citizens profess that they approve of organ
donation, relatively few sign a donor card: only
about 28% and 12% in the United States and
Germany, respectively. In contrast, 99.9% of
the French and Austrians are potential donors.
These striking differences can be explained
by the default heuristic. In explicit-consent
societies such as Germany, the law prescribes
that nobody is a donor unless one opts in. In
presumed-consent societies such as France, the
default is that everyone is a donor unless one
opts out. Although most people appear to fol-
low the same heuristic, the result is drastically
different because the legal environment differs.
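The point that one heuristic produces opposite outcomes in different legal environments can be made explicit in a toy sketch. The function name and its parameters are hypothetical. A person following the default heuristic never acts against the default, which corresponds to `acts_against_default=False`.

```python
def potential_donor(default_is_donor: bool,
                    acts_against_default: bool = False) -> bool:
    # Status equals the default unless the person opts in or opts out.
    return default_is_donor != acts_against_default


explicit_consent = potential_donor(default_is_donor=False)  # opt-in law
presumed_consent = potential_donor(default_is_donor=True)   # opt-out law
```

The same inaction yields a non-donor under explicit consent and a potential donor under presumed consent.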
Very few studies use large-scale demo-
graphic data to test social heuristics. For in-
stance, marriage patterns are studied by demog-
raphers without much attention to the social
heuristics that generate these, and vice versa.
Todd and colleagues (2005) had the ingenious
methodological insight that the aggregate de-
mographic data rule out certain heuristics and
can be used to test various satisficing strategies
for mate choice.
Moral Behavior
Although moral behavior has long been at-
tributed to conscious reflection, Haidt &
Bjorklund (2008) argued that reasons are typi-
cally used to justify behavior after the fact and
that the causes are mostly unconscious or in-
tuitive. Gigerenzer (2010) proposed that these
unconscious causes are often social heuristics,
such as imitating the behavior of peers in or-
der to gain acceptance by the group. Note that
one and the same social heuristic can lead to be-
havior evaluated as moral or immoral, such as
when imitating benevolent or malevolent peer
behavior. This perspective on moral behavior is
different from assuming that people have inter-
nalized specific moral rules such as don’t steal
and don’t kill.
Moral behavior has been related to “sacred
values” (Fiske & Tetlock 1997). If one under-
stands sacred values as top cues in lexicographic
heuristics, decisions between alternatives where
a sacred value conflicts with a secular value (e.g.,
life versus money) should be faster and eas-
ier than when two sacred values (e.g., one life
versus another) conflict with each other, as re-
ported by Hanselmann & Tanner (2008). Baron
& Ritov (2009) argued that, from a utilitarian
perspective, this form of one-reason decision
making can cause great problems for policy de-
cisions as it could prevent trade-offs for the
greater good. In the same vein, Sunstein (2005)
asserted that moral heuristics can lead to great
error, but added that we would not necessarily
“be better off without them. On the contrary,
such heuristics might well produce better re-
sults, from the moral point of view, than the fea-
sible alternatives” (p. 535). Cosmides & Tooby
(2006) located the origins of moral heuristics in
our ancestral world of tiny bands of individu-
als. An example of a moral heuristic is an intu-
itive search rule that looks for information that
could reveal whether one has been cheated in
a social contract. This heuristic correctly pre-
dicts when information search in the Wason
selection task contradicts propositional logic
(Gigerenzer 2000).
CONCLUSIONS
We began this review with the observation that
the three major tools for modeling decision
making—logic, statistics, and heuristics—have
not been treated equally, with each suited to a
particular kind of problem. Instead, in psychol-
ogy, heuristics became associated with errors
and contrasted with logical and statistical rules
that were believed to define rational thinking in
all situations. Yet this view has been questioned
for uncertain, large worlds where the assump-
tions of rational models are not met. We re-
viewed studies on decisions by individuals and
institutions, including business, medical, and
legal decision making, that show that heuristics
can often be more accurate than complex “rational” strategies. This puts heuristics on a par with
statistical methods and emphasizes a new eco-
logical question: In what environment does a
given strategy (heuristic or otherwise) succeed?
This insight adds a prescriptive research pro-
gram to that of the existing descriptive research
program on heuristics. Pioneers such as Dawes,
Hogarth, and Makridakis demonstrated years
ago that simple forecasting methods can often
predict better than standard statistical proce-
dures; as James March, one of the most influen-
tial researchers in organizational decision mak-
ing, put it more than 30 years ago, “If behavior
that apparently deviates from standard proce-
dures of calculated rationality can be shown to
be intelligent, then it can plausibly be argued
that models of calculated rationality are defi-
cient not only as descriptors of human behavior
but also as guides to intelligent choice” (1978,
p. 593).
Nonetheless, a large community contin-
ues to routinely model behavior with com-
plex statistical procedures without testing these
against simple rules. Yet a fundamental change
in thinking about human and animal behav-
ior seems to be occurring. Mathematical biol-
ogists McNamara & Houston (2009) described
this shift: “Although behavioral ecologists have
built complex models of optimal behavior in
simple environments, we argue that they need
to focus on simple mechanisms that perform
well in complex environments” (p. 675).
Formal models also help to answer the de-
scriptive question of when people rely on what
heuristic. As for the prescriptive question, a
similar conflict is waged between those who ar-
gue in favor of classical statistical techniques
as models of the mind (typically weighting
and adding of all information) and those who
argue that many people consistently rely on
heuristics. The best way to decide is compar-
ative testing; the difficulty is to understand
the individual differences reported in most
experiments.
With all these new insights, we are left
with big challenges. How should we develop
a systematic theory of the building blocks of
heuristics and the core capacities as well as
environmental structures that these exploit?
To what extent can the emerging science of
heuristic decision making provide a unifying
framework for the study of the mind? One
way to proceed is theory integration, that is,
to connect the simple heuristics framework
with other theoretical frameworks in psychol-
ogy. This is already happening with ACT-R
(Adaptive Control of Thought-Rational)
theory (Schooler & Hertwig 2005), signal
detection theory (Luan et al. 2010, Pleskac
2007), and the heuristics-and-biases program
(Read & Grushka-Cockayne 2010). In physics,
theory integration, such as that of quantum theory
and relativity theory, is a primary goal. In
psychology, theory integration is not accorded
the importance it deserves; instead, the field
still resembles a colorful loose patchwork. We
envision that the study of cognitive heuristics
may help to sew some of the pieces together.
SUMMARY POINTS
1. Heuristics can be more accurate than more complex strategies even though they process
less information (less-is-more effects).
2. A heuristic is not good or bad, rational or irrational; its accuracy depends on the structure
of the environment (ecological rationality).
3. Heuristics are embodied and situated in the sense that they exploit core capacities of
the brain and their success depends on the structure of the environment. They provide
an alternative to stable traits, attitudes, preferences, and other internal explanations of
behavior.
4. With sufficient experience, people learn to select proper heuristics from their adaptive
toolbox.
5. Usually, the same heuristic can be used both consciously and unconsciously, for inferences
and preferences, and underlies social as well as nonsocial intelligence.
6. Decision making in organizations typically involves heuristics because the conditions for
rational models rarely hold in an uncertain world.
FUTURE ISSUES
1. How do people learn, individually or socially, to use heuristics in an adaptive way? And
what prevents them from doing so? (For a start: Rieskamp & Otto 2006)
2. Does intelligence mean knowing when to select which strategy from the adaptive toolbox?
(For a start: Bröder & Newell 2008)
3. Are gut feelings based on heuristics, and if so, on which? (For a start: Gigerenzer 2007)
4. To what extent is moral (and immoral) behavior guided by social heuristics? (For a start:
Gigerenzer 2010, Sunstein 2005)
5. How does the content of the adaptive toolbox change over the life span and between
cultures? (For a start: Mata et al. 2007)
6. Can people adapt the use of heuristics to idiosyncrasies in their core capacities, such as
differences in memory span, but also differences in knowledge? (For a start: Bröder &
Gaissmaier 2007)
7. Which heuristics do humans share with which animals, and why? (For a start: Hutchinson
& Gigerenzer 2005 and commentaries)
8. Finally, the overarching goal: Develop a systematic theory of the building blocks of cog-
nitive heuristics (such as search, stopping, and decision rules) anchored in core capacities
and the social and physical structures they exploit.
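The three building blocks named in the last issue (search, stopping, and decision rules) can be made concrete with a minimal sketch of take-the-best for a two-alternative inference. The function name, the binary cue coding (1 = positive, 0 = negative), and the assumption that the caller supplies cues already ordered by validity are illustrative choices, not part of any published implementation.

```python
from typing import Sequence


def take_the_best(cues_a: Sequence[int],
                  cues_b: Sequence[int],
                  cue_order: Sequence[int]) -> str:
    """Infer which of two alternatives has the higher criterion value.

    cues_a, cues_b: binary cue values (1 = positive, 0 = negative).
    cue_order: cue indices, assumed to be sorted by descending validity.
    Returns "a", "b", or "guess" if no cue discriminates.
    """
    # Search rule: look up cues one at a time in the given order.
    for cue in cue_order:
        a, b = cues_a[cue], cues_b[cue]
        # Stopping rule: stop at the first cue that discriminates.
        if a != b:
            # Decision rule: choose the alternative with the positive value.
            return "a" if a > b else "b"
    # No cue discriminates between the alternatives.
    return "guess"


if __name__ == "__main__":
    # Hypothetical cue profiles for two alternatives on three cues.
    print(take_the_best([1, 0, 1], [1, 1, 0], cue_order=[0, 1, 2]))
```

In this sketch, all cues after the first discriminating one are ignored, which is how the heuristic bases its inference on only part of the available information.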
ACKNOWLEDGMENTS
We thank Mirta Galesic, Sebastian Hafenbrädl, Ralph Hertwig, Ulrich Hoffrage, Konstantinos
Katsikopoulos, Julian N. Marewski, and Lael J. Schooler for helpful comments, and Rona Unrau
for editing the manuscript.
DISCLOSURE STATEMENT
The authors are not aware of any affiliations, memberships, funding, or financial holdings that
might be perceived as affecting the objectivity of this review.
LITERATURE CITED
Alter AL, Oppenheimer DM. 2006. Predicting short-term stock fluctuations by using processing fluency. Proc.
Natl. Acad. Sci. USA 103:9369–72
Astebro T, Elhedhli S. 2006. The effectiveness of simple decision heuristics: forecasting commercial success
for early-stage ventures. Manage. Sci. 52:395–409
Attneave F. 1953. Psychological probability as a function of experienced frequency. J. Exp. Psychol. 46:81–86
Ayton P, Fischer I. 2004. The hot hand fallacy and the gambler’s fallacy: two faces of subjective randomness?
Mem. Cogn. 32:1369–78
Backlund LG, Bring J, Skaner Y, Strender L-E, Montgomery H. 2009. Improving fast and frugal in relation
to regression analysis: test of 3 models for medical decision making. Med. Decis. Making 29:140–48
Baron J, Ritov I. 2009. Protected values and omission bias as deontological judgments. In The Psychology of
Learning and Motivation. Vol. 50: Moral Judgment and Decision Making, ed. DM Bartels, CW Bauman, LJ
Skitka, DL Medin, pp. 133–67. San Diego, CA: Academic
Baucells M, Carrasco JA, Hogarth RM. 2008. Cumulative dominance and heuristic performance in binary
multiattribute choice. Oper. Res. 56:1289–304
Benartzi S, Thaler RH. 2001. Naïve diversification strategies in defined contribution saving plans. Am. Econ.
Rev. 91:79–98
Bergert FB, Nosofsky RM. 2007. A response-time approach to comparing generalized rational and take-the-
best models of decision making. J. Exp. Psychol.: Learn. Mem. Cogn. 33:107–29
Binmore K. 2009. Rational Decisions. Princeton, NJ: Princeton Univ. Press
Boyd M. 2001. On ignorance, intuition and investing: a bear market test of the recognition heuristic. J. Psychol.
Finan. Market. 2:150–56
Brighton H. 2006. Robust inference with simple cognitive models. In Between a Rock and a Hard Place: Cognitive
Science Principles Meet AI-Hard Problems. Papers from the AAAI Spring Symposium (AAAI Tech. Rep. No.
SS-06-03), ed. C Lebiere, B Wray, pp. 17–22. Menlo Park, CA: AAAI Press
Brighton H, Gigerenzer G. 2008. Bayesian brains and cognitive mechanisms: harmony or dissonance? In The
Probabilistic Mind: Prospects for Bayesian Cognitive Science, ed. N Chater, M Oaksford, pp. 189–208. New
York: Oxford Univ. Press
Brighton H, Gigerenzer G. 2011. How heuristics exploit uncertainty. In Ecological Rationality: Intelligence in
the World, ed. PM Todd, G Gigerenzer, ABC Res. Group. New York: Oxford Univ. Press. In press
Bröder A. 2003. Decision making with