9
On Second Thought
Reflections on the Reflection Defense
Markus Kneer, University of Zurich
David Colaço, Tulane University
Joshua Alexander, Siena College
Edouard Machery, University of Pittsburgh
1. The Restrictionist Challenge
This much should be uncontroversial: The method of cases plays an import-
ant role in contemporary philosophy. While there is disagreement about
how best to interpret this method,¹ there is little doubt that philosophers
often proceed by considering actual or hypothetical situations, and use
intuitions about such situations to assess philosophical theories. Despite its
central role in philosophical practice, the method of cases has recently come
under pressure: A series of experimental studies suggests that judgments
regarding classic philosophical thought experiments (aka “cases”) are sen-
sitive to factors such as culture, gender, affect, framing, and presentation
order, factors, that is, that are not standardly thought to be of philosophical
relevance (for review and discussion, see Alexander, 2012; Machery, 2017;
Stich and Machery, forthcoming).
Critics of experimental philosophy have responded to this challenge in
various ways.² Our goal in this chapter is to shed light on one response that
has not yet received enough attention: the reflection defense (for previous
discussion, see Weinberg et al., 2012). The reflection defense targets features
¹ See Williamson (2007); Malmgren (2011); Alexander (2012); Cappelen (2012); Deutsch
(2015); Nado (2016); Colaço and Machery (2017); Machery (2017); Strevens (2019).
² For discussion, see, e.g., Alexander and Weinberg (2007); Weinberg et al. (2010); Machery
(2011, 2012, 2017); Alexander (2012, 2016); Cappelen (2012); Schwitzgebel and Cushman
(2012, 2015); Deutsch (2015); Mizrahi (2015).
of the deliberative process invoked in experimental studies of ordinary
judgments about philosophical cases: According to proponents of this
defense, judgments about philosophical cases are relevant only when they
are the product of careful, nuanced, and conceptually rigorous reflection,
while, they hold, the judgments elicited in experimental studies are swift
shots from the hip that lack the necessary deliberative care; as such, they are
easily distorted by irrelevant factors. Proponents of the reflection defense
conclude that, since these kinds of judgments are unfit to serve as input for
responsible philosophical inquiry, experimental studies that reveal their
vagaries can be safely ignored.
We suspect that the reflection defense is misguided, and this chapter is an
attempt to defend this suspicion. The reflection defense assumes that reflec-
tion (i) influences how people think about philosophical cases and (ii) brings
their judgments more into alignment with philosophical orthodoxy (where it
exists). We call this the “Influence and Alignment Assumption.” To illus-
trate the point, take Gettier cases, invoked, for instance, in Kauppinen’s
exposition of the reflection defense: The assumption is that increased reflec-
tion not only occasions a different rate of knowledge ascriptions in Gettier
cases than the standardly high rates of folk ascriptions, but—in line with
textbook epistemology—a lower rate of knowledge ascriptions. The idea is
thus that increased reflection influences and—from the point of view of
philosophical orthodoxy—improves the responses to the cases at hand.
In order to examine the Influence and Alignment Assumption, we present
studies that explore how the folk think about four philosophical cases, or
pairs of cases, that have generated a great deal of attention from both
traditional and experimental philosophers across various areas of philoso-
phy: Cases used to challenge the idea that knowledge is justified true belief,
the idea that reference is fixed by description, the idea that knowledge
depends only on epistemic considerations, and the idea that knowledge
entails belief. For each of these cases, we attempted to manipulate reflective
care using four common tools from social psychology and behavioral eco-
nomics: A standard delay manipulation, a standard incentivization procedure,
a standard manipulation for increased accountability, and a standard prime
for analytic thinking. We also examined whether people who are primed to
give more reflective responses actually respond differently to philosophical
cases than people who are not so primed. Finally, we explored correlations
between how people responded to the cases at hand and individual differences
in preference for slow, careful deliberation using the Rational-Experiential
Inventory (Epstein et al., 1996). Nothing mattered. People seem to make the
same judgments when they are primed to engage in careful reflection as they
do in the conditions standardly used by experimental philosophers. The
reflection defense thus seems unwarranted in presuming that reflection rele-
vantly changes how people think about philosophical cases.
We proceed as follows: In Section 2 we discuss the reflection defense in
more detail, and describe our strategy for addressing it in Section 3. After
setting the stage for our empirical research, we present five experimental
studies and their results in Sections 4–8, and conclude in Section 9 by
explaining what we take these results to mean for the reflection defense.
2. The Reflection Defense
Let’s begin with a few particularly clear examples of the reflection defense,
starting with Ludwig’s influential formulation:
We should not expect that in every case in which we are called on to make a
judgment we are at the outset equipped to make correct judgments without
much reflection. Our concepts generally have places in a family of related
concepts, and these families of concepts will have places in larger families
of concepts. How to think correctly about some cases we are presented can
be a matter that requires considerable reflection. When a concept, like that
of justification, is interconnected with our thinking in a wide variety of
domains, it becomes an extremely complex matter to map out the concep-
tual connections and at the same time sidestep all the confusing factors.
(2007, 149)
Kauppinen largely concurs:
When philosophers claim that according to our intuitions, Gettier cases are
not knowledge, they are not presenting a hypothesis about gut reactions to
counterfactual scenarios but, more narrowly, staking a claim of how
competent and careful users of the ordinary concept of knowledge would
pre-theoretically classify the case in suitable conditions. The claim, then, is
not about what I will call surface intuitions but about robust intuitions.
(2007, 97)
Liao presents “the argument from robust intuitions” (without embracing it)
as follows:
[S]ome might think that one should distinguish between surface intuitions,
which are “first-off” intuitions that may be little better than mere guesses;
and robust intuitions, which are intuitions that a competent speaker might
have under sufficiently ideal conditions such as when they are not biased.
(2008, 256)
Horvath presents the reflection defense (without embracing it) as follows:
[T]he existing studies only aim at spontaneous responses to hypothetical cases
( . . . ). The opposing claim ( . . . ) is that what we actually rely on in philosophy
are reflective intuitions, which are, it is suggested, of a much better epistemic
quality than the typically spontaneous—and unreflective—intuitive responses
of the folk ( . . . ). But if the “intuitions” ( . . . ) really have to be understood as
“reflective intuitions,” then the available experimental studies do not contrib-
ute much to its support, or so the objection goes. (2010, 453)
Finally, Nado (2015) also discusses the reflection defense, connecting it to
the place of expertise in philosophical methodology (see also Swain et al.,
2008, section 3; Bengson, 2013; Gerken and Beebe, 2016).
The basic idea contained in these passages is rather straightforward:
Philosophers who use the method of cases are only interested in judgments
generated by careful reflection about the cases themselves and the concepts
we deploy in response to these cases, and whatever it is that experimental
philosophers have been studying, they have not been studying those kinds of
things. Thus, experimental studies revealing that unreflective judgments are
susceptible to a host of irrelevant factors do nothing to disqualify reflective
judgments from playing a role in philosophical argumentation.
In more detail, the reflection defense begins with a necessary condition
for the philosophical relevance of judgments about thought experiments:
These judgments are philosophically relevant only when they result from
careful reflection (Premise 1). It then makes a claim about experimental-
philosophy studies: These studies do not examine judgments that result
from careful reflection (Premise 2). It concludes that experimental philoso-
phy findings are not philosophically relevant.
The two premises of the reflection defense call for clarification. First, and
least important, Premise 2 can be formulated in several different ways. The
weakest formulation would merely assert that experimental philosophers
have not clearly demonstrated that their studies examine the right kind of
judgment; for all experimental philosophers have shown, their studies could
bear on the vagaries of unreflective judgments. A stronger formulation
would assert that extant experimental philosophy studies fail to examine
the right kind of judgment, while leaving open the possibility that improved
studies would get at the right kind of judgment. The strongest formulation
would assert that experimental-philosophy studies are necessarily unable to
examine the judgments that result from careful reflection, perhaps because
of what careful reflection involves. Kauppinen comes close to embracing the
strongest reading, asserting that experimental-philosophy studies are neces-
sarily unable to elicit reflective judgments:
Testing for ideal conditions and careful consideration does not seem to be
possible without engaging in dialogue with the test subjects, and that, again,
violates the spirit and letter of experimentalist quasi-observation. ( . . . )
We can imagine a researcher going through a test subject’s answers together
with her, asking for the reasons why she answered one way rather than
another ( . . . ). But this is no longer merely ‘probing’ the test subjects. It is not
doing experimental philosophy in the new and distinct sense, but rather a
return to the good old Socratic method. (2007, 106)
The content and plausibility of Premises 1 and 2 also depend on how the
distinction between robust and surface judgments, or, as we will say in the
remainder of this chapter, reflective and unreflective judgments, is charac-
terized. It is useful to tease apart thin and thick characterizations of this
distinction.³ One end of this continuum is anchored by what we will call the
“thin characterization of reflective judgment.” A judgment is thinly reflect-
ive just in case it results from a deliberation process involving attention,
focus, cognitive effort, and so on—the type of domain-general psychological
resources that careful and attentive thinking requires—and unreflective
otherwise. We suspect that the thin characterization of reflection is similar
to both lay and psychological conceptions of reflection (e.g., Paxton et al.,
2012). Horvath (2010) and Nado (2015) also seem to understand reflection
thinly.
Thicker conceptions of reflective judgments add requirements to the thin
conception of reflective judgments. For Kauppinen, for example, reflective
³ Weinberg and Alexander (2014) provide an overview of the different conceptions of
“intuition” used in current metaphilosophical debates. Both the thin and thicker characteriza-
tions of reflective judgment discussed below count as thick conceptions on their way of carving
up the landscape.
judgments are the products of the kind of dialogical activity central to the
Socratic method:
[T]here is no way for a philosopher to ascertain how people would respond
in such a situation without (. . .) entering into dialogue with them, varying
examples, teasing out implications, presenting alternative interpretations
to choose from to separate the semantic and the pragmatic, and so on. I will
call this approach the Dialogue Model of the epistemology of folk concepts.
(2007, 109)
Ludwig is also interested only in thick reflective judgments, and on his view
reflective judgments are based solely on conceptual competence:
Conducting and being the subject of a thought experiment is a reflective
exercise. It requires that both the experimenter and the subject understand
what its point is. As it is a reflective exercise, it also presupposes that the
subject of the thought experiment is able to distinguish between judgments
solely based on competence (or recognition of the limits of competence) in
deploying concepts in response to the described scenario. (2007, 135)
Ludwig clarifies what he means by “conceptual competence” in a footnote:
Failing to draw a distinction between unreflective judgments based on
empirical beliefs and judgments based solely on competence in the deploy-
ment of concepts not uncommonly leads to a failure to appreciate the
special epistemic status of the latter, the special role that first person
investigation of them plays in the acquisition of a priori knowledge, and
the stability of the judgments which are reached on this basis. (2007, 136)
That is, on his view, reflective judgments are epistemically analytic (that is,
entertaining the propositions they express can be sufficient for their
justification).
These are just some ways that we might think about reflective judgment
(for another proposal, see Hannon, 2018); what’s important for our pur-
poses is just that the reflection defense will take different forms depending
on which characterization of reflection is involved. It is beyond the scope of
a single chapter to address all the possible variants in depth, and so we will
focus here only on a version of the reflection defense that appeals to the thin
conception of reflection presented above. While this means that we will be
leaving thicker versions of the defense to the side for now, we think that
there are good reasons for focusing on a thin version of the reflection
defense. First, this version is the most easily tractable by means of experi-
mental tools—the tools we intend to deploy in what follows. There is a
wealth of tools in psychology and behavioral economics to single out judg-
ments that result from reflective deliberation thinly understood, and we
can use these to assess the reflection defense. Second, versions of the
reflection defense that appeal to thick characterizations of reflection face
problems of their own.⁴ First, thicker versions of the reflection defense
face what we will call a “descriptive-inadequacy” problem: the thicker the
notion of reflection appealed to, the less likely it is that philosophers’
judgments in usual philosophical debates result from a reflective deliber-
ation so understood. To illustrate, consider Ludwig’s claim that answers to
philosophical cases must be “judgments based solely on competence.”
Although we will not argue for this claim here, we doubt that the judg-
ments elicited by cases in philosophy are typically of this kind; many of
them do not seem to express analytic propositions at all (Williamson, 2007;
Cappelen, 2012; Machery, 2017). Second, thicker versions of the reflection
defense face what we will call a “stipulation” problem: Characterizations of
reflective judgments should not make it the case by stipulation that experi-
mental philosophers’ findings happen to bear only on unreflective judgments.
Stipulative victories are no victories at all, and it should be an empirical
question whether reflective judgments suffer from the vagaries evidenced by
fifteen years of experimental philosophy. To illustrate, when Ludwig proposes
that reflective judgments are solely based on conceptual competence, he
makes it the case by sheer stipulation that a large part of experimental
philosophy, which examines the influence of pragmatic factors on judgments
about thought experiments, happens to be studying unreflective judgments.
A more satisfying strategy, we propose, would specify “reflection” so as to
allow for the empirical study of whether reflective judgments are immune to
the influence of pragmatic considerations.
⁴ Weinberg and Alexander (2014) also propose a set of conditions that must be met by
anyone attempting to argue that experimental philosophers simply have not been studying the
right kind of judgments or “intuitions.” Among those conditions is one that they call the
“current practice condition”; failure to meet their current practice condition is very similar to
what we will call the “descriptive-inadequacy” problem below.
3. Addressing the Reflection Defense
Our goal in this chapter is to assess a presupposition of the reflection
defense: the Influence and Alignment Assumption, that is, the idea that,
when people consider a thought experiment reflectively, they would tend to
judge differently than in the conditions standardly employed by experimen-
tal philosophers, and their responses would be more in line with what
philosophical orthodoxy considers correct. For instance, while many people
may judge that, in a fake barn case, the character knows that she is seeing a
real barn under standard experimental-philosophy conditions (Colaço et al.,
2014), they would come to the opposite conclusion if they considered the
fake barn case reflectively, or so proponents of the reflection defense assume.
To determine whether the judgments made in response to a case result
from the process of careful reflection (thinly understood), we looked at two
distinct types of properties: dispositional qualities pertaining to the subject
responding to thought experiments and circumstances pertaining to the
process of deliberation itself; careful reflection can either be fostered by an
inherent inclination to engage in careful analytic thinking or else by appro-
priate conditions of deliberation. So, our strategy was to examine whether
the judgments of people disposed to make reflective judgments or the
judgments of people primed to engage in deliberation differ from the
judgments made under conditions standardly employed by experimental
philosophers.
One way to measure people’s disposition to reflection is the Need for
Cognition (NFC) test (Cacioppo and Petty, 1982; Cacioppo et al., 1986).
Some individuals are naturally drawn to complex analytic-thinking tasks
and might thus manifest the necessary care and reflection required for
reflective judgments. An alternative measure that targets much the same
dispositional quality is the Cognitive Reflection Test (Frederick, 2005; Toplak
et al., 2011) or CRT for short. Previous empirical studies using the NFC and
CRT (Weinberg et al., 2012; Gerken and Beebe, 2016) found little support for
the reflection defense; neither high NFC nor high CRT scores correlated
with decreased sensitivity to distortive factors such as contextual priming,
print font, or presentation order.⁵
⁵ Pinillos et al. (2011) use the CRT as a proxy to measure “general intelligence,” and find that
“those who display higher general intelligence are less likely to exhibit the Knobe Effect” (124).
However, as long as one does not defend a bias account of the Knobe Effect, according to which
people’s judgments of intentionality are systematically distorted by outcome valence, the
findings of Pinillos and colleagues do not constitute evidence for the reflection defense.
In our experiments, we employed a third standard psychological questionnaire
to measure people’s disposition to reflection, namely the Rational-Experiential
Inventory or REI (Epstein et al., 1996; Pacini and Epstein, 1999). Subjects on
the “rational” end of the spectrum typically manifest an increased “ability to
think logically and analytically”; those on the “experiential” end of the spec-
trum manifest a stronger “reliance on and enjoyment of feelings and intu-
itions in making decisions” (Pacini and Epstein, 1999, 974). Differently put,
“rational” subjects are more prone to analytic cognition, “experiential” sub-
jects to more intuition-driven, cognitively less effortful cognition.
Consistent with the studies cited above, we proposed to operationalize the
distinction between reflective and unreflective judgments in terms of the
rational/experiential distinction developed by Epstein and colleagues. If
people who are reflective as measured by yet a third standard psychological
measure (in addition to the NFC scale and the CRT already used by
Gonnerman et al., 2011; Weinberg et al., 2012; and Gerken and Beebe,
2016) do not differ from people who are unreflective, then this would be
evidence that reflection does not change the judgments people make in
response to cases.
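To make this operationalization concrete, here is a minimal sketch (in Python, on simulated data) of how an REI-based analysis of this kind can be run. The item-to-subscale assignment and the rational-minus-experiential composite are illustrative assumptions, not the authors’ scoring key, and the data are randomly generated.

```python
import numpy as np
import pandas as pd
from scipy import stats

rng = np.random.default_rng(0)
n = 200

# Simulated data: ten 5-point REI items plus a binary case judgment.
# Which items load on which subscale is a hypothetical assignment here.
df = pd.DataFrame(rng.integers(1, 6, size=(n, 10)),
                  columns=[f"rei_{i}" for i in range(1, 11)])
df["knows"] = rng.integers(0, 2, size=n)  # 1 = ascribes knowledge

rational = df[[f"rei_{i}" for i in range(1, 6)]].mean(axis=1)
experiential = df[[f"rei_{i}" for i in range(6, 11)]].mean(axis=1)
df["rei"] = rational - experiential  # higher = more analytic

# Correlate the disposition measure with the binary judgment
# (a point-biserial correlation, computed as a Pearson r).
r, p = stats.pearsonr(df["rei"], df["knows"])
print(f"r({n - 2}) = {r:.2f}, p = {p:.2f}")

# Median split: compare "reflective" and "unreflective" respondents.
high = df.loc[df["rei"] >= df["rei"].median(), "knows"].mean()
low = df.loc[df["rei"] < df["rei"].median(), "knows"].mean()
print(f"knowledge ascription: high REI {high:.2f}, low REI {low:.2f}")
```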
Naturally, it could be that the Rational-Experiential Inventory fails to
really measure people’s tendency to engage in reflective deliberation, even
thinly understood, or that these people fail to act on their tendency in our
studies. To address these concerns, we needed to look at other ways of
determining whether people are reporting reflective judgments. A second
way to distinguish reflective from unreflective judgments draws on the
circumstances that lead people to engage in reflection when making judg-
ments about philosophical cases. Kauppinen (2007, 104) highlights the
importance of such circumstances: Reflective judgment “can take hard
thinking and time, and the attempt could be thwarted by passions or
loss of interest”, while “there is a general requirement to think through the
implications of individual judgements—a hasty judgement . . . will not count
as one’s robust intuition about the case.”⁶ We attempted to encourage
careful reflective processes by means of four standard experimental manipu-
lations familiar from social psychology and experimental economics: forced
delay, financial incentive, response justification via provision of reasons, and
priming of analytic cognition.
⁶ Discussion of the circumstances that lead to reflection is also present in Dewey’s (1910)
five steps of reflective thought: He notes that one must take time to deliberate on a case, rather
than prematurely accepting the conclusion at which one arrives (73–4).
In the forced-delay condition, participants were encouraged to read the
vignette slowly, carefully, and to think about possible variations of the
scenario. They could only proceed to a screen registering their answer
after a certain delay, which varied from 40 to 60 seconds depending on the
word count of the vignettes. Delay manipulations are frequently used in
social-psychological research; the speed/accuracy trade-off is one of the
most well-studied and pervasive effects in human judgment, perception,
and decision making. Slower responses tend to correlate positively with
improved accuracy and are less susceptible to biases or other distorting
factors.⁷ Forced delay has been applied in various kinds of experiments.
Rand et al. (2012), for example, compare people’s level of altruistic behavior
in a one-shot public good game with and without a time delay, stating that in
the former condition “decisions are expected to be driven more by reflec-
tion” (428). Rand and colleagues find that people become less altruistic in
the former condition, and conclude that “intuition supports cooperation in
social dilemmas, and that reflection can undermine these cooperative
impulses” (427). In their fourth experiment, Pizarro et al. (2003) did not
impose a delay on participants’ answers, but they asked participants in
the rational-instructions condition to “make these judgments from ( . . . )
a deliberative perspective (i.e., ‘my most rational, objective judgment is
that . . . ’)” (657), which is similar to the instructions we used. Pizarro and
colleagues found that the moral assessment of causally deviant acts
changes when people are asked to judge from this “rational, objective”
perspective.
In the financial-incentive condition, participants were promised double
compensation in case they got the answer “right,” which was intended to
encourage careful reflection. All participants in this condition received extra
compensation independently of the answer chosen. Hertwig and Ortmann
(2001) survey studies in experimental economics that invoke financial
incentives and conclude that in certain areas—“in particular, research on
judgement and decision making” (395)—such incentives lead to “conver-
gence of the data toward the performance criterion and reduction of the
data’s variance.”The authors recommend that financial incentives, which
are common practice in experimental economics, be used more widely in
psychological studies so as to obtain more reliable and robust data. Camerer
and Hogarth’s (1999) literature review also shows that while incentivizing
⁷ See Garrett (1922); Hick (1952); Ollman (1966); Schouten and Bekker (1967); Pachella
(1973); Wickelgren (1977); Ratcliff and Rouder (1998); Forstmann et al. (2008).
participants financially does not always improve the rational standing of
their decision and judgment, it actually improves it in tasks similar to
making a judgment in response to a thought experiment.
In the reasons condition, the vignette and questions were preceded by a
screen which instructed participants that they would have to provide
detailed explanations of their answers. The aim of this manipulation con-
sisted in fostering an increased sensitivity to rational justification of the
chosen response. A large literature suggests that, in many cases, increasing
accountability by asking participants to justify their judgment or decision
improves the rational standing of these. For instance, Koriat et al. (1980)
find that requiring participants “to list reasons for and against each of the
alternatives prior to choosing an answer” reduces the overconfidence bias
(see also, e.g., the reduction of the sunk-cost fallacy in Simonson and Nye
1992). In their important literature review, Lerner and Tetlock (1999)
conclude that “[w]hen participants expect to justify their judgments [. . .]
[they tend to] (a) survey a wider range of conceivably relevant cues; (b) pay
greater attention to the cues they use; (c) anticipate counter arguments,
weigh their merits relatively impartially, and factor those that pass some
threshold of plausibility into their overall opinion or assessment of the
situation; and (d) gain greater awareness of their cognitive processes by
regularly monitoring the cues that are allowed to influence judgment and
choice” (263).⁸
A final condition made use of analytic priming: Before receiving the
vignettes and questions, participants had to solve a simple mathematical
puzzle—a standard procedure to trigger analytic cognition. To our know-
ledge, the puzzle we used has not been employed in the social-psychological
literature, but the procedure of triggering analytic cognition by means of a
mathematical problem is standard practice. Paxton et al. (2012) and Pinillos
et al. (2011) use the CRT, which consists of three simple mathematical
puzzles with counterintuitive answers, to prime reflection and reasoning in
their participants before giving them some trolley-style moral dilemmas. In a
study regarding different explanations of the contrast-sensitivity of knowledge
ascriptions, Gerken and Beebe (2016) also employ the CRT. On their view, the
contrast effect of knowledge ascription is due to a bias in focus on selective bits
of evidence. High CRT scores, they hypothesized in ways consistent with the
⁸ Increasing accountability, e.g., by reason giving, can also aggravate, rather than attenuate,
certain biases in judgment and decision making (Lerner and Tetlock, 1999). It would be
interesting to see how advocates of the reflection defense respond to such findings.
reflection defense, should correlate with lesser susceptibility to the bias, but
they failed to find any such correlation.
All four manipulations were independent ways to elicit reflection during
the process of deliberation. The control condition, in which vignette and
questions were presented without further ado, was intended to be similar to
the characteristics of past empirical research of experimental philosophers,
which has allegedly failed to elicit reflective judgments.
Finally, reaction time was measured for all five conditions to explore
whether people who answered more slowly, presumably because they
reflected more before reporting a judgment, answered differently from
those who answered more quickly (and probably unreflectively).
Our choice of cases was guided by the following four considerations. First,
they should have received widespread attention. Second, there should be
relatively little controversy among professional philosophers about what the
“correct”response is. As the advocates of the reflection defense make plain,
not just any reflection-occasioned change in judgment is welcome: They expect
reflection to foster increased alignment with the responses favored by
professional philosophers, at least if there is a consensus. Third, in order
to assess whether encouraging reflection leads people to give responses
aligned to philosophers’, the cases must have elicited some disagreement
among lay people (in light of past research). Finally, the cases must be drawn
from several areas of philosophy. Overall, we chose four scenarios compris-
ing influential classics and more recent cases. Since all of them are rather
well-known, we will confine ourselves to brief summaries here.
The first vignette was a Gettier case, an adaptation of Russell’s (1948)
well-known Clock scenario: Wanda reads the time off a clock at the train
station. This clock has been broken for days, yet happens to display the
correct time when Wanda looks at it. Philosophers by and large agree that
Wanda does not know what time it is (Sartwell, 1992 is an exception), and
Machery et al. (2018) show that lay people are divided about this case:
A surprisingly large proportion ascribe knowledge.
A second vignette focused on the thesis that knowledge entails belief.
Myers-Schulz and Schwitzgebel (2013) have reported astonishing evidence
according to which people are sometimes willing to ascribe knowledge
without ascribing belief (see also Murray et al., 2013). We used Myers-
Schulz and Schwitzgebel’s scenario, which is an adaptation of Radford’s
(1966) famous Queen Elizabeth example. Kate has studied hard for her
history exam; when she faces a question about the year of Queen
Elizabeth’s death, she blanks, despite the fact that she has prepared the
answer and recited it to a friend. Eventually, Kate settles on a precise year
without much conviction—1603—which is the correct response. In a
between-subjects design, participants receiving the first condition were
asked whether Kate believed Elizabeth died in 1603; in the second condition,
they were asked whether Kate knew Elizabeth died in 1603. Most philo-
sophers hold that knowledge entails belief (but see Radford, 1966; Williams,
1973), but Myers-Schulz and Schwitzgebel (2013) as well as Murray et al.
(2013) suggest that many lay people are willing in some circumstances to
ascribe knowledge while denying belief.
The third vignette explored the “epistemic side-effect effect” or
ESEE. Beebe and Buckwalter (2010) report that knowledge ascriptions
regarding side effects are sensitive to the latter’s general desirability.⁹
Beebe (2013) has produced similar data for belief ascriptions.
We used a scenario from Beebe and Jensen (2012) inspired by Knobe’s
(2003) influential case: The CEO of a movie studio is approached by his vice-
president who suggests implementing a new policy. The new policy would
increase profits and make the movies better or worse from an artistic
standpoint. The CEO replies that he does not care about the artistic qualities
of the movies; the policy is implemented and the vice-president’s predictions
are borne out. The question asked whether the CEO knew or believed the
new policy would make the films better or worse from an artistic standpoint.
To our knowledge, few philosophers, if any, think that the proper applica-
tion of the concepts of knowledge and belief is sensitive to desirability; by
contrast, the extensive body of research on the ESEE suggests that for many
lay people the ascription of knowledge and belief is sensitive to this factor.
The fourth and final vignette was an adaptation of Kripke’s (1972) Gödel
case, drawn from Machery et al. (2004). John has learned in school that a
man called “Gödel” proved the incompleteness theorem, but it turns out that
the proof was in fact accomplished by Gödel’s friend, Schmidt. The question
asked whether the name “Gödel” refers to the man who proved the incom-
pleteness theorem or the man who got hold of the manuscript and claimed
credit for it. Nearly all philosophers share Kripke’s judgment that “Gödel”
refers to the man originally called “Gödel” in the scenario, but extensive
research suggests that many Americans report the opposite judgment
(Machery et al., 2004, 2010, 2017).
⁹ See also Beebe and Jensen (2012); Beebe and Shea (2013); Dalbauer and Hergovich (2013);
Buckwalter (2014); Turri (2014); Beebe (2016); Kneer (2018).
4. Experiment 1
4.1 Participants and materials
Participants were recruited on Amazon Mechanical Turk in exchange for a
small compensation (0.2 USD).¹⁰ Data sets from 60 participants failing a
general attention test or a vignette-specific comprehension test were dis-
carded. The final sample consisted of 179 respondents (male: 44.7%; age
M=35 years, SD=12 years; age range: 18–69).
The first experiment used the Clock vignette; participants were randomly
assigned to one of the five conditions described above: Control, Delay (40
seconds), Incentive, Reasons, Priming. Response times were collected for all
five conditions. Having responded to the target questions, all participants
completed a 10-item version of Epstein’s Rational-Experiential Inventory
and a demographic questionnaire.
4.2 Results and discussion
4.2.1 Main results
A logistic regression was performed to ascertain the effect of condition on
the likelihood that participants judge that the character does not have know-
ledge. The logistic regression model was not statistically significant,
χ²(4)=4.35, p=.36. The model only explained 3.2% (Nagelkerke R²) of the variance
in participants’ answers and correctly classified only 58.7% of the data points.
With standard assumptions of α=.05 and a moderate effect size (w=.3), the
power of our χ²-test is very high (>.91); power remains high (>.7) for smaller
effect sizes (w ≥ .23), but is low for small effect sizes (Faul et al., 2007).
Figure 9.1 presents the proportion of “does not know” answers for the five
conditions. Hence, we failed to find any evidence that encouraging careful
reflection makes a difference to people’s judgments about the Clock case.
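For readers who want to see the shape of this analysis, the sketch below (Python, simulated data) fits a logistic regression of the binary response on condition and computes the power of a df = 4 chi-square test for w = .3, as in the sensitivity analysis above. The simulated data are made up, and note that statsmodels reports McFadden’s rather than Nagelkerke’s pseudo-R²; nothing here is the authors’ own analysis code.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf
from statsmodels.stats.power import GofChisquarePower

rng = np.random.default_rng(1)
conditions = ["Control", "Delay", "Incentive", "Reasons", "Priming"]

# Simulated stand-in for the Experiment 1 data (N = 179).
df = pd.DataFrame({"condition": rng.choice(conditions, size=179)})
df["denies"] = rng.integers(0, 2, size=len(df))  # 1 = "does not know"

# Logistic regression of the binary response on condition (4 dummy terms).
fit = smf.logit("denies ~ C(condition, Treatment('Control'))",
                data=df).fit(disp=0)
lr_chi2 = 2 * (fit.llf - fit.llnull)  # likelihood-ratio chi-square, df = 4
print(f"chi2(4) = {lr_chi2:.2f}, pseudo-R2 (McFadden) = {fit.prsquared:.3f}")

# Power of a chi-square test with w = .3, alpha = .05, df = 5 - 1 = 4.
power = GofChisquarePower().power(effect_size=0.3, nobs=179,
                                  alpha=0.05, n_bins=5)
print(f"power = {power:.2f}")  # around .9 for N = 179
```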
4.2.2 Response time
We also examined whether people who answer more slowly answer differ-
ently, excluding participants in the Delay condition. Averaging across the
¹⁰ For all five experiments, the compensation was doubled for those participants in the
financial incentive condition (independently of whether their response actually fit the philo-
sophical consensus or not).
four other conditions, we did not find any evidence that slower participants
answer differently (r(141)=.02, p=.85).¹¹ Figure 9.2 reports the proportion of
answers in line with philosophers’ consensual judgment for the 50% faster
and 50% slower participants in the Clock case (“Does not know”) and two
other cases with categorical data from the following experiments.
Figure 9.1 Percentages of participants who deny knowledge in the 5 conditions
of Experiment 1
Note: bars: 95% confidence intervals.
Figure 9.2 Percentage of participants responding “She does not know” in
Experiment 1 (Gettier case), “She knows” in the knowledge condition of
Experiment 3, “She believes” in the belief condition of Experiment 3, and giving
a Kripkean response in Experiment 5
Note: bars: 95% confidence intervals.
¹¹ The results are similar if one excludes the reaction times two standard deviations below
and above the mean RT (r(141)=.07, p=.55).
The results are similar when one looks at each condition (including
Delay) separately (Control: r(39)=.04, p=.80; Delay: r(38)=.24, p=.15;
Incentive: r(36)=.16, p=.36; Reasons: r(32)=.04, p=.83; Priming: r(34)=.05,
p=.79). Thus, we failed to find any evidence that people who answer
more slowly, possibly because they reflect about the case, answer differently.
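These response-time analyses are simple point-biserial correlations; a minimal sketch on simulated data follows (the lognormal response times are an illustrative assumption, while the 2 SD trimming rule mirrors footnote 11):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)
n = 143
rt = rng.lognormal(mean=3.0, sigma=0.5, size=n)  # response times (s), simulated
answer = rng.integers(0, 2, size=n)              # 1 = "does not know"

# Point-biserial correlation between the binary answer and response time.
r, p = stats.pointbiserialr(answer, rt)
print(f"r({n - 2}) = {r:.2f}, p = {p:.2f}")

# Robustness check: drop RTs more than 2 SD from the mean (cf. footnote 11).
keep = np.abs(rt - rt.mean()) <= 2 * rt.std()
r2, p2 = stats.pointbiserialr(answer[keep], rt[keep])
print(f"trimmed: r({keep.sum() - 2}) = {r2:.2f}, p = {p2:.2f}")
```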
4.2.3 Analytic thinking
In addition, we examined whether people who report a preference for
analytic thinking answer differently. Averaging across the five conditions,
we did not find any evidence that REI scores predict participants’ response
to the Clock case (r(179)=.12, p=.12). Figure 9.3 reports the proportion of
answers in line with philosophers’judgment for the 50% most reflective and
50% least reflective participants for the Clock case, as well as two cases used
in the other experiments with categorical data. The results are similar when
one looks at each condition separately (Control: r(39)=.18, p=.27; Delay:
r(38)=.31, p=.06; Incentive: r(36)=.05, p=.78; Reasons: r(32)=.16, p=.37;
Priming: r(34)=.03, p=.85). So, there is no evidence that people who have a
preference for analytic thinking answer differently from people who don’t
have such a preference.
Figure 9.3 Comparison of more reflective and less reflective participants
responding “She does not know” in Experiment 1 (Gettier case), “She knows” in
the knowledge condition of Experiment 3, “She believes” in the belief condition
of Experiment 3, and giving a Kripkean response in Experiment 5
Note: bars: 95% confidence intervals.
Note that in contrast to other Gettier cases (Machery et al., 2017), lay
people tend not to share philosophers’ judgment that the protagonist in a
Clock case does not know the relevant proposition. This is in line with
previous studies examining the Clock case (Machery et al., 2018).
5. Experiment 2: Follow-Up to Experiment 1
While we failed to find any significant result with our manipulations, two of
the manipulations of Experiment 1 seemed to lead participants to agree
more with philosophers: asking participants to provide reasons for their
answer and providing monetary incentives to think things through in detail.
To explore these results further, we replicated the Incentive, Reasons, and
Control conditions of Experiment 1 with a larger sample size.
5.1 Participants and materials
Participants were recruited on Amazon Mechanical Turk in exchange for a
small compensation (0.3 USD). Datasets from 16 participants who failed the
attention check or answered the comprehension question incorrectly were
removed. Our final sample consisted of 264 respondents (male: 39.0%; age
M=40 years, SD=13 years; age range: 19–73). Participants were randomly
assigned to the Control, Incentive, or Reasons conditions. The vignette,
instructions, and procedure were otherwise identical to those of Experiment 1.
5.2 Results and discussion
5.2.1 Main results
Participants in the three conditions answered differently (χ²(2, 264)=8.2,
p=.017),¹² but the two manipulations did not lead participants to agree with
philosophers about the Clock case; rather, they led them to judge that the character
in the Clock case knows that it is 3:00 p.m. Figure 9.4 visualizes the results.
One may be surprised by the difference between the results in the
Incentive and Reasons conditions in this study and in Experiment 1. We
¹² Control vs. Incentive: χ²(1, 186)=8.1, p=.004; Control vs. Reasons: χ²(1, 171)=2.1, p=.15.
do not have a ready explanation, except that it may simply be random
sampling variation.
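The omnibus test here is a chi-square test of independence on a 2 × 3 contingency table; a minimal sketch follows. The cell counts below are hypothetical, invented only to illustrate the computation, and do not reproduce the reported values.

```python
import numpy as np
from scipy.stats import chi2_contingency

# Hypothetical counts: rows = "does not know" / "knows",
# columns = Control, Incentive, Reasons (N = 264 overall).
table = np.array([[50, 45, 40],
                  [41, 48, 40]])

chi2, p, dof, expected = chi2_contingency(table, correction=False)
print(f"chi2({dof}, N = {table.sum()}) = {chi2:.1f}, p = {p:.3f}")
```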
5.2.2 Response time
We also examined whether people who answer more slowly answer differ-
ently. Averaging across the three conditions, we did not find any evidence
that slower participants do so (r(264)=.02, p=.77; see Figure 9.2).¹³ The
results are similar when one looks at each condition separately (Control:
r(91)=.05, p=.64; Incentive: r(93)=0.0, p=.99; Reasons: r(78)=.10, p=.38).
Thus, as was the case in Experiment 1, we failed to find any evidence that
people who answer more slowly answer differently.
5.2.3 Analytic thinking
In addition, we examined again whether people who report a preference for
thinking answer differently. Averaging across the three conditions, we
did not find any evidence that REI scores predict participants’ response to
the Clock case (r(264)=.09, p=.16; see Figure 9.3). The results are similar
in the Incentive condition (r(93)=.06, p=.56) and the Reasons condition
(r(78)=.11, p=.32). By contrast, participants in the control condition with
higher REI scores (participants who report a taste for thinking) were more
likely to disagree with philosophers and judge that the character in the
Gettier case knows that it is 3:00 p.m. (r(93)=.21, p=.05).
Figure 9.4 Percentages of participants who judge that Wanda does not know the
time in the 3 conditions of Experiment 2
Note: bars: 95% confidence intervals.
¹³ The results are similar if one excludes the reaction times two standard deviations below
and above the mean RT (r(258)=.01, p=.92).
Again, there is
little evidence that people who have a preference for thinking answer
differently from people who don’t have such preference; and when they
do, the evidence suggests they tend to disagree more with philosophers.
Differently put, the influence of increased reflection is limited, and where it
does produce a difference, it decreases alignment with textbook epistemology.
6. Experiment 3: Knowledge and Belief
So far, the results suggest that reflective judgments do not foster increased
alignment with philosophical orthodoxy. In fact, reflection does not have
much of an influence in the first place. But our results so far are limited to a
single case drawn from one area of philosophy (the Clock case in epistem-
ology). The following studies examine whether our findings generalize to
other thought experiments and other areas of philosophy, starting with
another case in epistemology. Experiment 3 focuses on the question of
whether knowledge entails belief. The issue was first raised by Radford
(1966), whose central thought experiment, Queen Elizabeth (described in
Section 3), was previously tested by Myers-Schulz and Schwitzgebel (2013).
6.1 Participants and materials
Participants were recruited on Amazon Mechanical Turk in exchange for a
small compensation (0.2 USD). Datasets from 233 participants who failed
the attention check or answered the comprehension question incorrectly
were removed. Our final sample consisted of 385 respondents (male: 35.6%;
age M=35 years, SD=16 years; age range: 18–83).
Our study had a 5×2 between-subjects design. Participants were randomly
assigned to one of the ten conditions invoking five manipulations (Control,
Delay, Incentive, Reasons, and Priming) and two epistemic states (know-
ledge and belief). Participants in the Knowledge conditions had to decide
whether the character in the vignette knew that Queen Elizabeth died in
1603, participants in the Belief conditions whether she believed it. The
instructions and procedures were identical to those of Experiment 1. The
only difference consisted in the delay in the Delay condition. Participants
had to wait 60 seconds before they could register their response, which we
estimated was twice as long as it would take to read the case leisurely.
6.2 Results and discussion
6.2.1 Main results
A logistic regression was performed to ascertain the effect of our manipu-
lations and of the Knowledge vs. Belief factor on the probability that
participants judge that the character knows or believes that Queen Elizabeth
died in 1603. The logistic regression model was statistically significant,
χ²(4)=112.5, p<.001. It explained 33.8% (Nagelkerke R²) of the variance in partici-
pants’ answers and correctly classified 74.5% of the data points. The Knowledge
vs. Belief factor was statistically significant: Participants were significantly
less likely to answer that the character believes that Queen Elizabeth died in
1603 than to answer that she knows that Queen Elizabeth died in 1603
(Wald=85.9, p<.001). By contrast, the manipulations were not
statistically significant (Wald=1.1, p=.86). With standard assumptions of α=.05
and a moderate effect size (w=.3), the power of our χ²-test is very high (>.99);
power remains high (>.7) for small to moderate effect sizes (w ≥ .16), but is low
for small effect sizes (Faul et al., 2007). Figure 9.5 presents the proportion of
“knows” and “believes” answers for the five conditions.
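A sketch of the two-factor analysis, on simulated data, is below; the simulated asymmetry between knowledge and belief ascriptions is an illustrative assumption, and the Wald statistics printed are per-term tests, as reported in the text.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(3)
n = 385
df = pd.DataFrame({
    "manipulation": rng.choice(["Control", "Delay", "Incentive",
                                "Reasons", "Priming"], size=n),
    "state": rng.choice(["Knowledge", "Belief"], size=n),
})
# Simulate the reported asymmetry: "knows" endorsed more than "believes"
# (the 0.75 / 0.35 rates are made up for illustration).
p_yes = np.where(df["state"] == "Knowledge", 0.75, 0.35)
df["ascribes"] = rng.binomial(1, p_yes)

fit = smf.logit("ascribes ~ C(state) + C(manipulation)", data=df).fit(disp=0)
# Wald chi-square tests for each model term.
print(fit.wald_test_terms())
```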
In our experiment, we replicated the results reported by Myers-Schulz
and Schwitzgebel (2013), which cast doubt on the entailment thesis (but see
Rose and Schaffer, 2013; Buckwalter et al., 2015). We failed to find any
evidence that compelling people to take their time in answering, telling
them in advance that they will have to justify their answers, paying them
to be accurate, or priming them to embrace an analytic cognitive style make
any difference in their ascription of either knowledge or belief to the
character in Myers-Schulz and Schwitzgebel’s case.
Figure 9.5 Percentages of knowledge ascription and belief ascription in the 5
conditions of Experiment 3
Note: bars: 95% confidence intervals.
6.2.2 Response time
We also examined whether people who answer more slowly answer differ-
ently, excluding participants in the Delay condition. Averaging across the
four other conditions, we did not find any evidence that in the Knowledge
condition or in the Belief condition slower participants answer differently
(respectively, r(125)=.02, p=.81 and r(191)=.02, p=.75; see Figure 9.2).¹⁴ The
results are largely similar when one looks at each condition separately (Belief
condition: Control: r(45)=.112, p=.44; Delay: r(44)=.26, p=.09; Incentive:
r(52)=.14, p=.33; Reasons: r(44)=.17, p=.28; Priming: r(50)=.001, p=.99;
Knowledge condition: Control: r(38)=.18, p=.28; Delay: r(25)=.26, p=.21;
Incentive: r(28)=.29, p=.14; Reasons: r(31)=.06, p=.77; Priming:
r(28)=.16, p=.41). Thus, we failed to find any evidence that people answer
differently when, on their own, they take their time in considering
Myers-Schulz and Schwitzgebel’s case and in providing an answer.
6.2.3 Analytic thinking
Finally, we examined whether people who report a preference for thinking
analytically answer differently. In the Knowledge condition, averaging across
the five conditions, we did not find any evidence that REI scores predict
participants’ response to the target question (r(150)=.07, p=.42; see
Figure 9.3). In the Belief condition, averaging across the five conditions,
we did not find any evidence that REI scores predict participants’ response
(r(235)=.06, p=.37; see Figure 9.3). For the individual conditions, none of the
Bonferroni-corrected p-values attained significance (all ps>.1). The results are
largely similar when one looks at uncorrected p-values. Belief condition:
Control: r(45)=.01, p=.93; Delay: r(44)=.33, p=.03; Incentive: r(52)=.02,
p=.90; Reasons: r(44)=.08, p=.62; Priming: r(50)=.01, p=.92. Knowledge
condition: Control: r(38)=.09, p=.61; Delay: r(25)=.20, p=.34; Incentive:
r(28)=.37, p=.05; Reasons: r(31)=.02, p=.91; Priming: r(28)=.17, p=.40.
In two sub-conditions—Incentive/Knowledge and Delay/Belief—the
uncorrected p-values just about reach significance. Given that none of the
overall p-values, or any of the individual Bonferroni-corrected p-values
attain significance, this clearly does not constitute systematic evidence that
people drawn to more analytic thinking answer differently from those who
are not so disposed.
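The Bonferroni check described above can be reproduced directly from the ten uncorrected p-values reported in the previous paragraph; a minimal sketch:

```python
import numpy as np
from statsmodels.stats.multitest import multipletests

# Uncorrected p-values for the ten sub-conditions reported above
# (Belief: Control, Delay, Incentive, Reasons, Priming; then Knowledge).
pvals = np.array([.93, .03, .90, .62, .92,
                  .61, .34, .05, .91, .40])

reject, p_corrected, _, _ = multipletests(pvals, alpha=0.05,
                                          method="bonferroni")
print(p_corrected.round(2))  # .03 -> .30, .05 -> .50: nothing survives
print(bool(reject.any()))    # False
```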
¹⁴ The results are similar if one excludes the reaction times two standard deviations below
and above the mean RT (r(145)=.07, p=.40 and r(288)=.09, p=.19).
7. Experiment 4: Epistemic Side-Effect Effect
Experiment 4 investigates further whether lay people’s reflective judgments
in reaction to epistemological cases vary from the judgments reported so far
by experimental philosophers. In this case we focused on the asymmetric
ascriptions of knowledge and belief regarding differently desirable side
effects. The findings of previous studies by Beebe and Buckwalter (2010)
and Beebe (2013) are just as much at odds with standard epistemological
doctrine as those reported by Myers-Schulz and Schwitzgebel (2013).
7.1 Participants and materials
Participants were recruited on Amazon Mechanical Turk in exchange for a
small compensation (0.2 USD). The datasets of 237 participants who failed
the attention check or answered the comprehension question incorrectly
were removed. Our final sample consisted of 701 respondents (male: 41.7%;
age M=35 years, SD=12 years; age range: 18–73).
Our study used the Movie Studio scenario (described in Section 3), and
was a 5×2×2 between-subjects design. Each participant was assigned to one
of the 20 conditions differing with respect to manipulation (Control, Delay,
Incentive, Reasons, Priming), desirability of the side effect (better movies,
worse movies), and epistemic state (knowledge, belief). Answers were col-
lected on a 7-point Likert scale; participants reported to what extent they
agreed or disagreed that the protagonist believed or knew that the newly
adopted policy would make the movies better or worse from an artistic
standpoint. The instructions and procedures were identical to those of
Experiment 1. The only difference consisted in the delay in the Delay
condition. Participants had to wait 40 seconds before they could register
their response, which we estimated was twice as long as it would take to read
the case leisurely.
7.2 Results and discussion
7.2.1 Main results
An ANOVA with the five manipulations, the Better vs. Worse factor, and
the Knowledge vs. Belief factor was performed to ascertain the effect of our
manipulations on the extent to which participants judge that the character
knows or believes that the movies were made worse or better. The Better vs.
Worse factor was significant (F(1, 681)=262.76, p<.001, η²=.28) as was the
Knowledge vs. Belief factor (F(1, 681)=159.87, p<.001, η²=.07). Participants
are more likely to ascribe knowledge and belief in the Worse condition than
in the Better condition, and they are more likely to ascribe knowledge than
belief. By contrast, our manipulations did not produce any significant effect
(F(1, 681)=.35, p=.85). With standard assumptions of α=.05 and a moderate
effect size (f=.25), the power of an F-test is very high (>.99); assuming a
small effect size (f=.10), power (.75) is still high (Faul et al., 2007). Figure 9.6
presents the means of the “knows” and “believes” answers for the five
conditions for the worse and better conditions.
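A sketch of this 5 × 2 × 2 ANOVA on simulated Likert ratings is below. The full factorial model with interactions is an illustrative choice (the text reports only main effects), and the simulated data carry no real signal.

```python
import numpy as np
import pandas as pd
import statsmodels.api as sm
import statsmodels.formula.api as smf

rng = np.random.default_rng(4)
n = 701
df = pd.DataFrame({
    "manipulation": rng.choice(["Control", "Delay", "Incentive",
                                "Reasons", "Priming"], size=n),
    "outcome": rng.choice(["Better", "Worse"], size=n),
    "state": rng.choice(["Knowledge", "Belief"], size=n),
})
df["rating"] = rng.integers(1, 8, size=n)  # 7-point agreement, simulated

ols_fit = smf.ols("rating ~ C(manipulation) * C(outcome) * C(state)",
                  data=df).fit()
print(sm.stats.anova_lm(ols_fit, typ=2))  # F-test for each term
```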
Thus, we replicated Beebe and Buckwalter’s (2010) and Beebe’s (2013)
findings: The desirability of an action influences the ascription of knowledge
and belief. In addition, we failed to find any evidence that compelling people
to take their time in answering, telling them in advance that they will have to
justify their answers, paying them to be accurate, or priming them to
embrace a reflective cognitive style make any difference in their answers,
and it is likely that if there were small or moderate effects to be found, we
would have found them.
Figure 9.6 Mean agreement with the claim that the director knew or believed
the movies would become better or worse from an artistic standpoint in the 20
conditions of Experiment 4
Note: bars: 95% confidence intervals.
7.2.2 Response time
We also examined whether people who answer more slowly answer differ-
ently, excluding participants in the Delay condition. Averaging across the
four other conditions, we did not find any evidence that they do (Harm and
Knowledge conditions: r(142)=.06, p=.51; Help and Knowledge conditions:
r(136)=.02, p=.80; Harm and Belief conditions: r(145)=.05, p=.56; Help
and Belief conditions: r(140)=.07, p=.43).¹⁵ Figures 9.7a (belief) and 9.7b
(knowledge) report the scatterplots for these four conditions.
Figure 9.7a Scatterplot and regression lines for the ascription of belief in
experiment 4 as a function of participants’ response time
Figure 9.7b Scatterplot and regression lines for the ascription of knowledge in
experiment 4 as a function of participants’ response time
Note: data points larger than 2 SD above the mean are excluded.
¹⁵ The results are similar if one excludes the reaction times two standard deviations below
and above the mean RT (Help and Knowledge conditions: r(134)=.07, p=.42; Harm and Belief
conditions: r(140)=.07, p=.39; Help and Belief conditions: r(139)=.07, p=.39), except for the
Harm and Knowledge conditions: r(139)=.20, p=.02.
In sum, we failed to find any evidence that people who, on their own, take
their time in considering the relevant case and in answering judge differently
from people who don’t.
7.2.3 Analytic thinking
Finally, we examined whether people who report a preference for thinking
answer differently. Averaging across the four other conditions, we did not
find any evidence that participants with higher REI scores answer differently
(Harm and Knowledge conditions: r(231)=.09, p=.23; Help and
Knowledge conditions: r(119)=.02, p=.77; Harm and Belief conditions:
r(180)=.02, p=.77; Help and Belief conditions: r(177)=.05, p=.53).
Figures 9.8a (belief) and 9.8b (knowledge) report the scatterplots for these
four conditions.
Figure 9.8a Scatterplot and regression lines for the ascription of belief in
experiment 4 as a function of participants’ REI scores
Figure 9.8b Scatterplot and regression lines for the ascription of knowledge in
experiment 4 as a function of participants’ REI scores
So, there is no evidence that people who have a preference for analytic
thinking answer differently from people who don’t have such a preference.
8. Experiment 5: The Gödel Case
Our final experiment examined whether the findings reported so far gener-
alize to another area of philosophy: the philosophy of language. Following
Kripke (1972), most philosophers assume that in the Gödel case the proper
name “Gödel” refers to the man who stole the theorem. Previous work
suggests, however, that for a substantial proportion of Americans (between
25% and 40%), the Gödel case elicits judgments more in line with the
descriptivist theory of reference (Machery et al., 2004, 2010, 2017).
Experiment 5 examined whether lay people agree more with philosophers
when they report their reflective judgment.
[Figure 9.8a Scatterplot and regression lines for the ascription of belief in Experiment 4 as a function of participants’ REI scores. Y-axis: belief ascription (1–7); x-axis: REI (1–5); separate regression lines for better and worse outcomes.]
8.1 Participants and materials
Participants were recruited on Amazon Mechanical Turk in exchange for a
small compensation (0.5 USD). The datasets of 80 participants who failed
the attention check, answered the comprehension question incorrectly, or
attempted to complete the survey multiple times (as evidenced by their IP
address) were removed. Our final sample consisted of 274 respondents
(male: 54.4%; age M=43 years, SD=14 years; age range: 21–79).
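As an aside for readers curious about the mechanics, the exclusion procedure described above can be sketched as follows; the data frame and column names are hypothetical, not the study’s actual file.

import pandas as pd

# Hypothetical raw data; columns are illustrative.
raw = pd.DataFrame({
    "ip_address": ["a", "b", "b", "c", "d"],
    "attention_pass": [1, 1, 1, 0, 1],
    "comprehension_pass": [1, 1, 1, 1, 0],
})

# Keep participants who passed the attention and comprehension checks...
clean = raw[(raw["attention_pass"] == 1) & (raw["comprehension_pass"] == 1)]
# ...and drop every submission from an IP address that occurs more than once.
clean = clean.drop_duplicates(subset="ip_address", keep=False)
print(f"{len(raw) - len(clean)} of {len(raw)} rows excluded")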
All participants were randomly assigned to one of five conditions:
Control, Delay, Incentive, Reasons, or Priming. The instructions and pro-
cedures were identical to those of Experiment 1. Participants had to decide
whether the protagonist in the case is talking about the man who stole the
theorem (Kripkean answer) or about the man who discovered the theorem
(descriptivist answer). The only difference consisted in the delay in the Delay condition. Participants had to wait 60 seconds, which we estimated was twice as long as it would take to read the Gödel case leisurely.
[Figure 9.8b Scatterplot and regression lines for the ascription of knowledge in Experiment 4 as a function of participants’ REI scores. Y-axis: knowledge ascription (1–7); x-axis: REI (1–5); separate regression lines for better and worse outcomes.]
8.2 Results and discussion
8.2.1 Main results
A logistic regression was performed to ascertain the effect of our manipulations on the probability that participants judge that the protagonist is talking about the man originally called “Gödel” when he uses the proper name “Gödel.” The logistic regression model was not statistically significant, χ²(4)=5.10, p=.28. The model explained 2.5% (Nagelkerke R²) of the variance in participants’ answers and correctly classified 60.9% of the data points. The power of the χ² test, assuming a moderate effect size (w=.3), was very high (.99); power remains high (>.7) for small to moderate effect sizes (w ≥ .19), but is low for small effect sizes (Faul et al., 2007). We also note that all the manipulations decreased the proportion of Kripkean responses, although not significantly so. Figure 9.9 presents the proportion of the “Gödel” answer in the five conditions.
Thus, we failed to find any evidence that compelling people to take their time in answering, telling people in advance that they will have to justify their answers, paying them to be accurate, or priming them to embrace a reflective analytic style improved people’s responses to the Gödel case, and it is likely that if there were a moderate or even a small to moderate effect to be found, we would have found it.
[Figure 9.9 Percentages of Kripkean responses in the 5 conditions of Experiment 5 (Control, Delay, Incentive, Reasons, Priming). Y-axis: percentage of Kripkean responses (0–100). Note: bars: 95% confidence intervals.]
8.2.2 Response time
We also examined whether people answer differently when they answer more slowly, excluding participants in the Delay condition. Averaging across the four other conditions, we did not find any evidence that they do (r(219)=.007, p=.92; see Figure 9.2).¹⁶ None of the uncorrected (and a fortiori Bonferroni-corrected) p-values for the individual conditions attained significance: Control: r(54)=.10, p=.48; Delay: r(55)=.12, p=.39; Incentive: r(59)=.07, p=.62; Reasons: r(53)=.045, p=.76; Priming: r(53)=.08, p=.57. Thus, we failed to find any evidence that people who, of their own accord, take more time to consider the Gödel case and to answer respond differently from people who don’t.
8.2.3 Analytic thinking
In addition, we examined whether people who report a preference for analytic thinking are more likely to agree with philosophers. Averaging across the five conditions, we did not find any evidence that REI scores predict participants’ response to the Gödel case (r(274)=.04, p=.512; see Figure 9.3). None of the Bonferroni-corrected p-values for the individual conditions attained significance (all ps>.05). Except for the Priming condition, the same held for uncorrected p-values: Control: r(54)=.14, p=.32; Delay: r(55)=.23, p=.10; Incentive: r(59)=.04, p=.75; Reasons: r(53)=.11, p=.45; Priming: r(53)=.33, p=.01. So, there is no systematic evidence that people who have a preference for analytic thinking agree more with philosophers about the Gödel case than people who don’t have such a preference. While suggestive, the correlation in the Priming condition should not give solace to philosophers since, if it isn’t a mere accident, it goes in the wrong direction: People who are less reflective are more likely to give the response in line with philosophers’ judgments.
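The Bonferroni correction at work here is simple enough to spell out: each uncorrected p-value is multiplied by the number of comparisons (capped at 1) before being compared to α = .05. Here is a sketch using the rounded per-condition p-values just reported; because the Priming p-value of .01 is rounded, its corrected value sits exactly at the .05 boundary, which is consistent with the statement that all corrected p-values exceed .05 once the exact value is used.

import numpy as np

labels = ["Control", "Delay", "Incentive", "Reasons", "Priming"]
p_uncorrected = np.array([0.32, 0.10, 0.75, 0.45, 0.01])  # rounded values from the text

# Bonferroni: multiply each p-value by the number of comparisons, cap at 1.
p_corrected = np.minimum(p_uncorrected * len(p_uncorrected), 1.0)
for label, p, pc in zip(labels, p_uncorrected, p_corrected):
    print(f"{label:9s} p = {p:.2f} -> corrected p = {pc:.2f}")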
¹⁶ The results are similar if one excludes reaction times more than two standard deviations below or above the mean RT (r(213)=.003, p=.97).
9. Discussion
9.1 Meta-philosophical implications
of the experimental studies
The reflection defense assumes that increased reflection influences people’s responses to philosophical cases and improves them, in the sense of bringing them more into alignment with philosophical orthodoxy. Since experimental philosophers do not ordinarily encourage the extensive reflection characteristic of the philosophical method, the data thus collected—or so the argument goes—are of no use. We put the empirical adequacy of the reflection defense to the test with respect to four well-known thought experiments.¹⁷ For each, philosophers agree as to what constitutes the correct response.
¹⁷ We also note, though we do not elaborate on this point, that we replicated all the original experimental-philosophy studies, in line with Cova et al. (2021).
Focusing on a thin conception of reflection and reflective judgment, we
have examined two types of factors that might be conducive to reflective
deliberation: the circumstances under which the deliberative process takes
place, and individual dispositions to engage in careful reflection. As regards
the former, we adapted a variety of standard manipulations from social
psychology and experimental economics to encourage diligent reflection:
time delay, financial incentives, reason specification, and analytic priming.
Out of the 18 conditions with manipulations contrasted with the respective
control conditions across five experiments, we could only detect a significant
difference—or some sign of “influence”—in two comparisons: In Experiment
2, financial incentives and reason specification somewhat changed responses
vis-à-vis the control condition, yet in both cases the increased reflection
produced results that were less in alignment with philosophical theory. As
regards circumstances, then, the Influence and Alignment Assumption has
proven a nonstarter: In nearly all conditions we failed to detect an influence of
reflection in the first place, and in the few cases where an influence was
detected, it decreased alignment.
Concerning individual dispositions to engage in reflection, we first contrasted the responses of participants who, of their own accord, spent more time on the task with those of participants who responded quickly. We couldn’t detect a significant difference between slow and fast responses in a single condition of any of the five experiments. We also explored whether people
with a penchant for more analytic thinking (i.e., subjects on the “rational” end of the REI) respond differently from those who tend toward a more intuitive thinking style (those on the “experiential” end of the REI spectrum). Averaging across conditions, we did not find a significant difference in any of the five experiments. The Bonferroni-corrected p-values for each of the 23 conditions were also nonsignificant. In short, participants who have a natural disposition to engage in more analytic thinking responded no differently than those who do not. These results are consistent with the findings reported in previous studies, which attempted to measure the disposition to engage in reflection by means of the NFC inventory or the CRT.
Taking stock: In a series of five experimental studies with a total of over
1800 individual subjects, we found that neither a disposition to engage in
reflection nor circumstantial factors conducive to reflective judgment bring
folk judgments into alignment with philosophical orthodoxy. In nearly all cases tested, they have no impact at all. Our studies had sufficient
power to detect medium-sized effects with a very high probability, and small
to medium effects with high probability. We cannot exclude the possibility
of small effects induced by reflection. Note, however, that even if those were
to be found, it’s far from clear that this would make the reflection defense
any more convincing. Take, for instance, Radford’s unconfident examinee
case, where in the control condition we found more than 80% of the
participants to ascribe knowledge, while a mere 30% ascribed belief. The
difference, defying orthodox epistemology, constitutes a large effect
(h=1.15). Or consider the well-documented epistemic side-effect effect: In
the control conditions, the effect size of the divergence between positively
and negatively valenced outcomes was very large for both belief (d=1.01)
and knowledge (d=1.02). Now assume it could be shown that extensive
reflection produces small effects in line with philosophical orthodoxy, e.g.,
increasing epistemic state ascriptions somewhat in the positively valenced
Knobe-type cases. The epistemic side-effect effect would still be of at least moderate size, and more likely than not it would remain large. The overall
conclusion—that folk judgments frequently differ strongly from philosoph-
ical consensus and that extensive reflection does not bring the two into
alignment—remains the same.
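For readers who want to check the arithmetic: Cohen’s h for a difference between two proportions is h = 2 arcsin(√p₁) − 2 arcsin(√p₂), and Cohen’s d for two means uses a pooled standard deviation. The sketch below plugs in the rounded proportions from the text; the chapter’s h = 1.15 is based on the exact, unrounded proportions, so the rounded inputs land slightly lower, still a large effect by Cohen’s conventions (h ≥ .8).

import math

def cohens_h(p1: float, p2: float) -> float:
    # Cohen's h: difference between arcsine-transformed proportions.
    return 2 * math.asin(math.sqrt(p1)) - 2 * math.asin(math.sqrt(p2))

# Rounded control-condition proportions: >80% knowledge vs. 30% belief ascriptions.
print(f"h = {cohens_h(0.80, 0.30):.2f}")  # ~1.06 with rounded inputs

def cohens_d(m1, sd1, n1, m2, sd2, n2):
    # Cohen's d with a pooled standard deviation, the measure behind the
    # d = 1.01 (belief) and d = 1.02 (knowledge) contrasts cited above.
    pooled = math.sqrt(((n1 - 1) * sd1**2 + (n2 - 1) * sd2**2) / (n1 + n2 - 2))
    return (m1 - m2) / pooled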
At this point, there are two responses available to proponents of the
reflection defense: First, they could argue that the reason we did not find
any effect is that our manipulations are poor means of leading people to
engage in sufficiently reflective deliberation about philosophical cases even
when reflection is thinly construed. Second, they could argue that the thin
conception of reflective judgment that we have been working with here is
not what they had in mind. We do not find either response compelling. Let’s
start with the first kind of response. Combined with earlier studies, we now
have eight different ways of inducing reflection thinly construed, and none
of them seem to support the central presuppositions of the reflection
defense, which are either that reflection sufficiently immunizes judgments
about philosophical cases from sensitivity to allegedly irrelevant factors, or
at least that it changes our judgments about those cases.
The second kind of response is no more compelling than the first. Of
course, it is perfectly possible that reflection more thickly construed might
lead people to change their judgments about philosophical cases, and we
happily admit that we have done nothing to address this possibility. Having
said that, it is obviously unacceptable to attempt to rebut an
empirical challenge to the way philosophers standardly use the method of
cases by appealing to some unspecified account of reflection. A convincing
response to the experimental challenge must explain not only what proper-
ties the judgments studied by experimental philosophers apparently lack,
but also why these properties are important to the way philosophers stand-
ardly employ the method of cases. And here it is important that proponents
of the reflection defense do not rely on a bait-and-switch strategy. If
the reflection defense is deemed plausible and appealing at all, it is
largely, we submit, because the notion of reflection is characterized thinly:
Philosophical arguments, we agree, do not appeal to “gut reactions” to cases or to “shots from the hip” in response to these, but rather to careful and
reflective judgments. However, the intuitive plausibility and appeal of the
reflection defense thinly understood do not transfer to versions of the
argument that appeal to thicker characterizations of reflection. If it’s plaus-
ible that only careful, slow, reflective judgments about cases are philosophic-
ally relevant, is it equally plausible that only epistemically analytic judgments
about cases are philosophically relevant? Surely not. For one, many deny that
any judgment is epistemically analytic (e.g., Williamson, 2007; Machery,
2017). And even if some are, it is not evident at all that the judgments made
in response to cases are epistemically analytic. What’s the upshot? Proponents
of the reflection defense who appeal to thicker characterizations of reflection
can’t simply trade upon the initial plausibility and appeal of the reflection
defense when reflection is characterized thinly on pain of engaging in bait-
and-switch. What is required is a detailed characterization of reflection
understood in some more substantial way, and clear arguments to the effect
that the method of cases demands this type of thick reflection.
Until now, proponents of the reflection defense have not provided
compelling arguments for thicker characterizations of the notion of reflection.
Furthermore, we submit, compelling arguments will be hard to find. As noted
in Section 2, an adequate characterization of reflection must be consistent
with the way philosophers use thought experiments (the descriptive-
inadequacy problem), but the thicker the characterization, the more likely it
is that it will fall prey to the descriptive-inadequacy problem (Cappelen, 2012;
Machery, 2017, chapter 1). For instance, Kauppinen’s dialogical conception
of reflection fails to capture how philosophers usually judge in response to
cases. There is no doubt that, as Kauppinen insists, philosophers compare
cases and try to identify ways in which particular cases are alike or differ, but
they typically do not do this in the process of making a judgment about these
cases. Rather, having made judgments about several cases, they try to identify
potential reasons that explain their pattern of judgments. When Gettier (1963)
proposes his ten-coin case, he does not compare it to other cases to conclude
that the agent does not know that he has ten coins in his pocket. Nor do his
readers. Rather, once we have judged in response to several cases, we compare
those judgments to identify the reasons that explain the relevant pattern of
judgments. It is thus implausible that only judgments understood along
Kauppinen’s dialogical conception of reflection are philosophically relevant.
Furthermore, a thick conception of reflection must not imply by stipula-
tion that the research done by experimental philosophers is not relevant for
philosophical methodology (the stipulation problem). Differently put, sim-
ply stipulating a concept of reflection according to which the kinds of biases
identified by experimental philosophers cannot arise is unhelpful. Instead, it
must be shown by means of arguments or empirical evidence that reflection,
understood in some thicker manner, results in judgments that do not fall
prey to such biases, or at least do so at a much lower rate.
9.2 Why doesn’t reflection influence judgments about cases?
It is surprising that people who are disposed to engage in careful analytic
thinking and people who are primed to engage in reflective deliberation do
not judge differently in response to cases than people who read cases under
conditions that are standardly used by experimental philosophers. Why is
that? There are at least two answers to this question.
First, it could be that, contrary to what proponents of the reflection defense
assume (Premise 2 of the argument sketched in Section 2), participants in
experimental-philosophy studies already engage, of their own accord, in reflective deliberation when they respond to cases under conditions standardly used by experimental philosophers. If this were the case, then it would be unsurprising that priming people to be reflective or looking at people disposed to careful thinking would not make any difference, as we found. Some subjects may indeed engage in reflective deliberation on their own under such conditions, but we doubt that this is the case for all subjects; so, while this explanation may be partly correct, it is incomplete. The reason is that many participants respond rather quickly to cases, too quickly for them to have had the time to engage in careful, reflective deliberation.
The second explanation of our surprising result is more radical: Typically,
reflection does not change the judgments made in response to cases (for a
similar view about moral judgments, see Haidt 2001, and for a discussion of
the limits of reflection, see Kornblith 2010). Rather, it merely leads people to
find reasons for the judgments they made unreflectively in response to these
cases. People who are inclined to engage in analytic thinking are merely
better at finding arguments for the judgments they make about cases; people
who are primed to engage in reflection are primed to find reasons for their
judgments about cases. If finding arguments or justification is the product of
reflection, then it is unsurprising that priming people to be reflective or
looking at people disposed to careful thinking would not make any differ-
ence, as we found.
One may wonder why reflection does not change judgments about
philosophical cases much, while it appears to influence judgment in the social-psychological and behavioral-economic literatures, allowing people
to overcome their spontaneous answers. We propose to explain the differ-
ence between our findings and the past research on reflection as follows.
The judgments people make in the types of situations examined by social
psychologists and behavioral economists are frequently mistaken by
people’s own lights; when this happens, they tend to change their responses
when given an opportunity to reflect. By contrast, when judgments are made
with confidence and are not erroneous by participants’own lights, reflection
does not influence judgment much; rather, it leads people to think of
arguments for the judgments independently made.
The explanation proposed in this section, we submit, is the deepest
reason why the reflection defense fails. It misunderstands the role of reflec-
tion. It assumes that reflection would lead us to judge differently in response
to cases instead of, as our results suggest, prompting us to explore reasons,
arguments, and justifications for them, while leaving the judgments
themselves unchanged.
10. Conclusion
In this chapter, we have addressed the reflection defense put forward in
response to the challenge against the use of cases inspired by experimental
philosophy. We have shown experimentally that there is no systematic evidence that reflection, thinly understood, leads people to respond differently to cases than they do under standard experimental-philosophy conditions. This finding undermines the view according to which reflective and unreflective judgments about cases differ. Instead, we would like to suggest, people’s responses to thought experiments typically express deep-seated judgments, and reflection merely bolsters these judgments by pushing people to explore potential reasons for them. While it is possible that the reflection defense might appeal to some thicker conception of reflection, we see little reason for optimism. Given that both the expertise defense and the reflection
defense have so far proven inadequate, we conclude that philosophers should
take the experimentalist’s challenge against the use of cases seriously.
Acknowledgments
For very helpful feedback, we would like to thank Justin Sytsma, Joe Ulatowski,
Jonathan Weinberg, the editors, several anonymous referees and the audiences at the
Society for Philosophy and Psychology Meeting (2016), the Buffalo Annual
Experimental Philosophy Conference (2016), XPhi under Quarantine (2020) and
the Guilty Minds Lab Zurich (2020). While working on this chapter, Markus Kneer
was supported by an SNSF Ambizione Grant (# PZ00P1_179912); Joshua Alexander
was supported by a Templeton Foundation Grant (# 15628).
References
Alexander, J. (2012). Experimental Philosophy: An Introduction. Cambridge: Polity.
Alexander, J. (2016). Philosophical expertise. In J. Sytsma and W. Buckwalter (eds.), A Companion to Experimental Philosophy (pp. 555–67). Malden, MA: Wiley-Blackwell.
Alexander, J. and Weinberg, J. M. (2007). Analytic epistemology and experimental philosophy. Philosophy Compass, 2(1), 56–80.
Beebe, J. R. (2013). A Knobe effect for belief ascriptions. Review of Philosophy and Psychology, 4(2), 235–58.
Beebe, J. R. (2016). Do bad people know more? Interactions between attributions of knowledge and blame. Synthese, 193(8), 2633–57.
Beebe, J. R. and Buckwalter, W. (2010). The epistemic side-effect effect. Mind & Language, 25(4), 474–98.
Beebe, J. R. and Jensen, M. (2012). Surprising connections between knowledge and action: the robustness of the epistemic side-effect effect. Philosophical Psychology, 25(5), 689–715.
Beebe, J. R. and Shea, J. (2013). Gettierized Knobe effects. Episteme, 10(3), 219–40.
Bengson, J. (2013). Experimental attacks on intuitions and answers. Philosophy and Phenomenological Research, 86(3), 495–532.
Buckwalter, W. (2014). The mystery of stakes and error in ascriber intuitions. In J. Beebe (ed.), Advances in Experimental Epistemology (pp. 145–74). New York: Bloomsbury Academic.
Buckwalter, W., Rose, D., and Turri, J. (2015). Belief through thick and thin. Noûs, 49(4), 748–75.
Cacioppo, J. T. and Petty, R. E. (1982). The need for cognition. Journal of Personality and Social Psychology, 42(1), 116–31.
Cacioppo, J. T., Petty, R. E., Kao, C. F., and Rodriguez, R. (1986). Central and peripheral routes to persuasion: an individual difference perspective. Journal of Personality and Social Psychology, 51(5), 1032–43.
Camerer, C. F. and Hogarth, R. M. (1999). The effects of financial incentives in experiments: a review and capital-labor-production framework. Journal of Risk and Uncertainty, 19(1–3), 7–42.
Cappelen, H. (2012). Philosophy Without Intuitions. Oxford: Oxford University Press.
Colaço, D., Buckwalter, W., Stich, S., and Machery, E. (2014). Epistemic intuitions in fake-barn thought experiments. Episteme, 11(2), 199–212.
Colaço, D. and Machery, E. (2017). The intuitive is a red herring. Inquiry, 60(4), 403–19.
Cova, F. et al. (2021). Estimating the reproducibility of experimental philosophy. Review of Philosophy and Psychology, 12, 9–44.
Dalbauer, N. and Hergovich, A. (2013). Is what is worse more likely?—the probabilistic explanation of the epistemic side-effect effect. Review of Philosophy and Psychology, 4(4), 639–57.
Deutsch, M. E. (2015). The Myth of the Intuitive: Experimental Philosophy and Philosophical Method. Cambridge, MA: MIT Press.
Dewey, J. (1910). How We Think. Boston: D.C. Heath and Company.
Epstein, S., Pacini, R., Denes-Raj, V., and Heier, H. (1996). Individual differences in intuitive–experiential and analytical–rational thinking styles. Journal of Personality and Social Psychology, 71(2), 390–405.
Faul, F., Erdfelder, E., Lang, A. G., and Buchner, A. (2007). G*Power 3: a flexible statistical power analysis program for the social, behavioral, and biomedical sciences. Behavior Research Methods, 39(2), 175–91.
Forstmann, B. U., Dutilh, G., Brown, S., Neumann, J., Von Cramon, D. Y., Ridderinkhof, K. R., and Wagenmakers, E.-J. (2008). Striatum and pre-SMA facilitate decision-making under time pressure. Proceedings of the National Academy of Sciences, 105(45), 17538–42.
Frederick, S. (2005). Cognitive reflection and decision making. The Journal of Economic Perspectives, 19(4), 25–42.
Garrett, H. E. (1922). A Study of the Relation of Accuracy to Speed (Vol. 8). New York: Columbia University.
Gerken, M. and Beebe, J. R. (2016). Knowledge in and out of contrast. Noûs, 50(1), 133–65.
Gettier, E. L. (1963). Is justified true belief knowledge? Analysis, 23(6), 121–3.
Gonnerman, C., Reuter, S., and Weinberg, J. M. (2011). More oversensitive intuitions: print fonts and could choose otherwise. Paper presented at the 108th Annual Meeting of the American Philosophical Association, Central Division, Minneapolis, MN.
Haidt, J. (2001). The emotional dog and its rational tail: a social intuitionist approach to moral judgment. Psychological Review, 108(4), 814–34.
Hannon, M. (2018). Intuitions, reflective judgments, and experimental philosophy. Synthese, 195(9), 4147–68.
Hertwig, R. and Ortmann, A. (2001). Experimental practices in economics: a methodological challenge for psychologists? Behavioral and Brain Sciences, 24(3), 383–403.
Hick, W. E. (1952). On the rate of gain of information. Quarterly Journal of Experimental Psychology, 4(1), 11–26.
Horvath, J. (2010). How (not) to react to experimental philosophy. Philosophical Psychology, 23(4), 447–80.
Kauppinen, A. (2007). The rise and fall of experimental philosophy. Philosophical Explorations, 10(2), 95–118.
Kneer, M. (2018). Perspective and epistemic state ascriptions. Review of Philosophy and Psychology, 9(2), 313–41.
Knobe, J. (2003). Intentional action and side effects in ordinary language. Analysis, 63(279), 190–4.
Koriat, A., Lichtenstein, S., and Fischhoff, B. (1980). Reasons for confidence. Journal of Experimental Psychology: Human Learning and Memory, 6(2), 107–18.
Kornblith, H. (2010). What reflective endorsement cannot do. Philosophy and Phenomenological Research, 80(1), 1–19.
Kripke, S. A. (1972). Naming and necessity. In D. Davidson and G. Harman (eds.), Semantics of Natural Language (pp. 253–355). Dordrecht: Springer.
Lerner, J. S. and Tetlock, P. E. (1999). Accounting for the effects of accountability. Psychological Bulletin, 125(2), 255–75.
Liao, S. M. (2008). A defense of intuitions. Philosophical Studies, 140(2), 247–62.
Ludwig, K. (2007). The epistemology of thought experiments: first person versus third person approaches. Midwest Studies in Philosophy, 31(1), 128–59.
Machery, E. (2011). Thought experiments and philosophical knowledge. Metaphilosophy, 42(3), 191–214.
Machery, E. (2012). Expertise and intuitions about reference. THEORIA. Revista de Teoría, Historia y Fundamentos de la Ciencia, 27(1), 37–54.
Machery, E. (2017). Philosophy Within Its Proper Bounds. Oxford: Oxford University Press.
Machery, E., Deutsch, M., Sytsma, J., Mallon, R., Nichols, S., and Stich, S. P. (2010). Semantic intuitions: reply to Lam. Cognition, 117, 361–6.
Machery, E., Mallon, R., Nichols, S., and Stich, S. P. (2004). Semantics, cross-cultural style. Cognition, 92(3), B1–B12.
Machery, E., Stich, S., Rose, D., Chatterjee, A., Karasawa, K., Struchiner, N., Usui, N., and Hashimoto, T. (2017). Gettier across cultures. Noûs, 51(3), 645–64.
Machery, E., Stich, S., Rose, D., Chatterjee, A., Karasawa, K., Struchiner, N., Sirker, S., Usui, N., and Hashimoto, T. (2018). Gettier was framed! In S. Stich, M. Mizumoto, and E. McCready (eds.), Epistemology for the Rest of the World (pp. 123–48). Oxford: Oxford University Press.
Malmgren, A. S. (2011). Rationalism and the content of intuitive judgements. Mind, 120(478), 263–327.
Mizrahi, M. (2015). Three arguments against the expertise defense. Metaphilosophy, 46(1), 52–64.
Murray, D., Sytsma, J., and Livengood, J. (2013). God knows (but does God believe?). Philosophical Studies, 166(1), 83–107.
Myers-Schulz, B. and Schwitzgebel, E. (2013). Knowing that P without believing that P. Noûs, 47(2), 371–84.
Nado, J. (2015). Intuition, philosophical theorizing, and the threat of skepticism. In E. Fischer and J. Collins (eds.), Experimental Philosophy, Rationalism, and Naturalism: Rethinking Philosophical Method (chapter 9). New York: Routledge.
Nado, J. (2016). The intuition deniers. Philosophical Studies, 173(3), 781–800.
Ollman, R. (1966). Fast guesses in choice reaction time. Psychonomic Science, 6(4), 155–6.
Pachella, R. G. (1973). The interpretation of reaction time in information processing research. Michigan University Ann Arbor Human Performance Center (No. TR-45).
Pacini, R. and Epstein, S. (1999). The relation of rational and experiential information processing styles to personality, basic beliefs, and the ratio-bias phenomenon. Journal of Personality and Social Psychology, 76(6), 972–87.
Paxton, J. M., Ungar, L., and Greene, J. D. (2012). Reflection and reasoning in moral judgment. Cognitive Science, 36(1), 163–77.
Pinillos, N. Á., Smith, N., Nair, G. S., Marchetto, P., and Mun, C. (2011). Philosophy’s new challenge: experiments and intentional action. Mind & Language, 26(1), 115–39.
Pizarro, D. A., Uhlmann, E., and Bloom, P. (2003). Causal deviance and the attribution of moral responsibility. Journal of Experimental Social Psychology, 39(6), 653–60.
Radford, C. (1966). Knowledge: by examples. Analysis, 27(1), 1–11.
Rand, D. G., Greene, J. D., and Nowak, M. A. (2012). Spontaneous giving and calculated greed. Nature, 489(7416), 427–30.
Ratcliff, R. and Rouder, J. N. (1998). Modeling response times for two-choice decisions. Psychological Science, 9(5), 347–56.
Rose, D. and Schaffer, J. (2013). Knowledge entails dispositional belief. Philosophical Studies, 166(1), 19–50.
Russell, B. (1948). Human Knowledge: Its Scope and Its Limits. London: George Allen & Unwin.
Sartwell, C. (1992). Why knowledge is merely true belief. The Journal of Philosophy, 89(4), 167–80.
Schouten, J. and Bekker, J. (1967). Reaction time and accuracy. Acta Psychologica, 27, 143–53.
Schwitzgebel, E. and Cushman, F. (2012). Expertise in moral reasoning? Order effects on moral judgment in professional philosophers and non-philosophers. Mind & Language, 27(2), 135–53.
Schwitzgebel, E. and Cushman, F. (2015). Philosophers’ biased judgments persist despite training, expertise and reflection. Cognition, 141, 127–37.
Simonson, I. and Nye, P. (1992). The effect of accountability on susceptibility to decision errors. Organizational Behavior and Human Decision Processes, 51(3), 416–46.
Stich, S. P. and Machery, E. (forthcoming). Demographic differences in philosophical intuition: a reply to Joshua Knobe. Review of Philosophy and Psychology.
Strevens, M. (2019). Thinking Off Your Feet: How Empirical Psychology Vindicates Armchair Philosophy. Cambridge, MA: Harvard University Press.
Swain, S., Alexander, J., and Weinberg, J. M. (2008). The instability of philosophical intuitions: running hot and cold on Truetemp. Philosophy and Phenomenological Research, 76(1), 138–55.
Toplak, M. E., West, R. F., and Stanovich, K. E. (2011). The Cognitive Reflection Test as a predictor of performance on heuristics-and-biases tasks. Memory & Cognition, 39(7), 1275–89.
Turri, J. (2014). Knowledge and suberogatory assertion. Philosophical Studies, 167(3), 557–67.
Weinberg, J. M. and Alexander, J. (2014). Intuitions through thick and thin. In A. R. Booth and D. P. Rowbottom (eds.), Intuitions (pp. 187–231). Oxford: Oxford University Press.
Weinberg, J. M., Alexander, J., Gonnerman, C., and Reuter, S. (2012). Restrictionism and reflection: challenge deflected, or simply redirected? The Monist, 95(2), 200–22.
Weinberg, J. M., Gonnerman, C., Buckner, C., and Alexander, J. (2010). Are philosophers expert intuiters? Philosophical Psychology, 23(3), 331–55.
Wickelgren, W. A. (1977). Speed-accuracy tradeoff and information processing dynamics. Acta Psychologica, 41(1), 67–85.
Williams, B. (1973). Deciding to believe. In B. Williams (ed.), Problems of the Self (pp. 136–51). Cambridge: Cambridge University Press.
Williamson, T. (2007). The Philosophy of Philosophy. Oxford: Blackwell.