# Overcoming Intuition: Metacognitive Difficulty Activates Analytic Reasoning

## Abstract

Humans appear to reason using two processing styles: System 1 processes that are quick, intuitive, and effortless and System 2 processes that are slow, analytical, and deliberate that occasionally correct the output of System 1. Four experiments suggest that System 2 processes are activated by metacognitive experiences of difficulty or disfluency during the process of reasoning. Incidental experiences of difficulty or disfluency--receiving information in a degraded font (Experiments 1 and 4), in difficult-to-read lettering (Experiment 2), or while furrowing one's brow (Experiment 3)--reduced the impact of heuristics and defaults in judgment (Experiments 1 and 3), reduced reliance on peripheral cues in persuasion (Experiment 2), and improved syllogistic reasoning (Experiment 4). Metacognitive experiences of difficulty or disfluency appear to serve as an alarm that activates analytic forms of reasoning that assess and sometimes correct the output of more intuitive forms of reasoning.
Keywords: fluency, disfluency, dual-system processing, reasoning, judgment

Few psychological theories enjoy the longevity of William James's (1890/1950) suggestion that human reasoning involves two distinct processing systems: one that is quick, effortless, associative, and intuitive and another that is slow, effortful, analytic, and deliberate. When deciding whether it is more dangerous to travel by car or airplane, for instance, people may quickly generate horrific images of airline disasters and (erroneously) conclude that it is more dangerous to fly than to drive. Alternatively, they may think more analytically about the total number of automobile versus airline accidents, the number of miles driven versus flown per accident, or the possibility that automobile accidents are underreported whereas airline accidents command media headlines and conclude (accurately) that they are safer in an airplane.

Although not without controversy (see Kruglanski & Thompson, 1999; Osman, 2004),¹ dual-process theories have been used widely by developmental, cognitive, and social psychologists to explain such diverse phenomena as persuasion (e.g., Chaiken, 1980; Petty & Cacioppo, 1986), social cognition (Epley, Keysar, Van Boven, & Gilovich, 2004; Keysar & Barr, 2002), self-perception (Schwarz, 1998), causal attribution (Gilbert, 1989), stereotyping (Bodenhausen, Macrae, & Sherman, 1999), overconfidence (Griffin & Tversky, 1992), higher order reasoning (Evans, 2003; Evans & Over, 1996; Sloman, 1996), various memory phenomena (Jacoby, Kelley, & McElree, 1999; Jones & Jacoby, 2001; Whittlesea & Leboe, 2003), and a long list of non-Bayesian biases in judgment and decision making (e.g., Kahneman & Frederick, 2002). These dual-process theories enable understanding of diverse phenomena because they predict qualitatively different judgments depending on which reasoning system is used. In particular, deliberate and analytical systems of reasoning (System 2) can override or undo intuitive and associative (System 1) responses. Understanding when System 2 reasoning is likely to be used is therefore critical for understanding human judgment and decision making.

Although most dual-process models provide an extensive description of each system, few devote much attention to exactly when people will adopt each approach to information processing. Those that do typically argue that System 2 will be activated when people have both the capacity and the motivation to engage in effortful processing. Existing research demonstrates that errors in System 1 reasoning are less likely to be corrected when people are under cognitive load or respond quickly (e.g., Bless & Schwarz, 1999; Chaiken, 1980; Petty & Cacioppo, 1986), but they are more likely to be corrected when people are accountable for their decisions (Tetlock & Lerner, 1999) and when the outcome is personally relevant (Ajzen & Sexton, 1999; Chaiken, 1980; Petty & Cacioppo, 1986). This existing research does not address, however, when people will recognize that System 1 processes might be producing faulty output that requires more analytical thought.

Accordingly, we investigated the novel question of when people are compelled to use System 2 processes in the first place. We predicted that people's use of more elaborate reasoning processes would be based on experiential cues that such reasoning is required and that System 2 processes would therefore be activated by cues that suggest a simple System 1 judgment might be faulty. Confidence in the accuracy of intuitive judgment appears to depend in large part on the ease or difficulty with which information comes to mind (Gill, Swann, & Silvera, 1998; Kelley & Lindsay, 1993) and the perceived difficulty of the judgment at hand. If information is processed easily or fluently, intuitive (System 1) processes will guide judgment. If information is processed with difficulty or disfluently, however, this experience will serve as a cue that the task is difficult or that one's intuitive response is likely to be wrong, thereby activating more elaborate (System 2) processing. The ease or difficulty experienced while processing information is therefore used as a cue to guide one's subsequent processing style.

¹ Osman (2004) proposed a unitary process model that nonetheless acknowledges that processing occurs along a gradient from implicit to explicit, which mirrors System 1 and System 2 processing, respectively. Kruglanski and Thompson (1999; the unimodel) adopted a subtly different approach. They recognized that people make use of different types of information but subsume them under a single umbrella of "persuasive cues." They emphasized that people are ultimately trying to make sense of the world, regardless of which type of cue they use in a given situation. However, like Osman's unitary process model, the unimodel acknowledges that some forms of processing are more complex than others.

Adam L. Alter, Psychology Department, Princeton University; Daniel M. Oppenheimer, Psychology Department and the Woodrow Wilson School of Public and International Affairs, Princeton University; Nicholas Eyre, Department of Psychology, Harvard University.

This research was funded by National Science Foundation Grants 051811 and SES0241544. We thank Sara Etchison, Shane Frederick, Geoff Goodwin, Leif Holtzmann, Daniel Kahneman, Eugenia Mamikonyan, Melissa Miller, Manish Pakrashi, Cordaro Rodriguez, Yuval Rottenstreich, Joe Simmons, Alex Todorov, and Erin Whitchurch for helpful assistance.

Alter, Psychology Department, Princeton University, Princeton, NJ 08540. E-mail: aalter@princeton.edu

Journal of Experimental Psychology: General. Copyright 2007 by the American Psychological Association.
This document is copyrighted by the American Psychological Association or one of its allied publishers.
This article is intended solely for the personal use of the individual user and is not to be disseminated broadly.

We thus predicted that experienced difficulty or disfluency would function as a signal that a simple and intuitive judgment was insufficient and that more elaborate cognitive processing would be necessary, thereby increasing System 2 processing. Indeed, neuroscientific evidence suggests that disfluency triggers the anterior cingulate cortex (Boksman et al., 2005), an alarm that activates the prefrontal cortex responsible for deliberative and effortful thought (Botvinick, Braver, Carter, Barch, & Cohen, 2001; Lieberman, Gaunt, Gilbert, & Trope, 2002; see also Goel, Buchel, Frith, & Dolan, 2000).

Previous research has shown that people rate disfluent stimuli more negatively than fluent stimuli across a range of domains. For example, people believe that disfluently named stocks will perform more poorly than will fluently named stocks (Alter & Oppenheimer, 2006), that disfluent prose is written by an author who is less intelligent than an author of fluent prose (Oppenheimer, 2006), and that disfluent aphorisms are less likely to be true than fluent aphorisms (McGlone & Tofighbakhsh, 2000). However, the role of disfluency in the current research is novel. Unlike existing research in which disfluency served as a direct cue to judgment, we investigate disfluency as an indirect cue that serves as a metacognitive signal to prompt more systematic processing. We predicted that experiencing difficulty or disfluency during the course of reasoning would trigger System 2 processes and decrease the frequency of responses consistent with System 1 processes. We present four experiments, across a range of domains, that are consistent with this hypothesis.

## Experiment 1: Intuitive Defaults

Experiment 1 was designed to provide initial evidence that people adopt a systematic approach to reasoning when they experience cognitive disfluency. Participants completed the Cognitive Reflection Test (CRT; Frederick, 2005). This test consists of three items for which the gut reaction, or intuitive default, is incorrect but that respondents can correctly answer through deliberate reconsideration. A correct answer on each item suggests that the respondent engaged systematic processing to correct the intuitive response. Frederick (2005) administered the CRT to 3,400 participants across 35 studies and 11 samples and showed that scores on the CRT were highly correlated with a variety of measures associated with analytic thinking, including intelligence (Stanovich & West, 2000).

If disfluency initiates systematic processing, then participants should perform better on the test if they experience disfluency while generating their answers. We manipulated disfluency in this experiment by printing the questions in either a difficult-to-read font (disfluent condition) or an easy-to-read font (fluent condition). We predicted that participants would answer more of the CRT items correctly when they were printed in a difficult-to-read font than when they were printed in an easy-to-read font.

### Method

We recruited 40 Princeton University undergraduate volunteers at the student campus center to complete the three-item CRT (Frederick, 2005). Participants were seated either alone or in small groups, and the experimenter ensured that they completed the questionnaire individually. Participants were randomly assigned to complete either the fluent or the disfluent version of the CRT. Those in the fluent condition completed a version of the CRT written in easy-to-read black Myriad Web 12-point font, whereas participants in the disfluent condition completed a version of the CRT printed in difficult-to-read 10% gray italicized Myriad Web 10-point font.

Previous research has shown that similar font manipulations effectively influence fluency (e.g., Oppenheimer, 2006; Werth & Strack, 2003). Consistent with these studies, a separate sample of 13 participants rated (on a 5-point scale) the disfluent font (M = 3.08, SD = 0.76) as being more difficult to read than the fluent font (M = 1.54, SD = 0.87), t(12) = 3.55, p < .01, η² = .51.

### Results and Discussion

As predicted, participants answered more items on the CRT correctly in the disfluent font condition (M = 2.45, SD = 0.64) than in the fluent font condition (M = 1.90, SD = 0.89), t(38) = 2.25, p = .03, η² = .12. Whereas 90% of participants in the fluent condition answered at least one question incorrectly, only 35% did so in the disfluent condition, χ²(1, N = 40) = 12.91, p < .001, Cramér's V = .57. Finally, participants in the fluent condition provided the incorrect and intuitive response more often (23% of responses) than did participants in the disfluent condition (10% of responses), Z = 1.96, p = .05, η² = .07.

When the CRT was difficult to read, participants appeared to engage in systematic processing and overcame their invalid intuitions to answer more questions correctly. These results provide preliminary evidence that disfluency initiates systematic processing. We sought converging evidence for this hypothesis in the subsequent experiments and also addressed a variety of alternative interpretations for the results of Experiment 1.

The most obvious alternative interpretation of Experiment 1 is that presenting the test items in a disfluent font simply slowed participants down, thereby forcing them to process the information more carefully. This exogenous, task-specific mechanism is somewhat less interesting than our proposed endogenous mechanism, so we attempted in the remaining experiments to rule out the possibility that these effects merely reflected task-imposed constraints on processing speed.

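As a rough illustration, and not part of the original analysis, the chi-square statistic and Cramér's V reported for Experiment 1 can be approximately recovered from the reported percentages. The sketch below assumes that the 40 participants were split evenly (20 per condition), which the text does not state explicitly; under that assumption, 90% and 35% correspond to 18 and 7 participants who made at least one error. The function names are illustrative.

```python
import math


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for the table [[a, b], [c, d]]."""
    n = a + b + c + d
    return n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))


def cramers_v(chi2, n, min_dim=2):
    """Cramer's V; for a 2 x 2 table, min_dim - 1 equals 1."""
    return math.sqrt(chi2 / (n * (min_dim - 1)))


# ASSUMPTION: 20 participants per condition (40 in total, evenly split).
# 90% of 20 = 18 and 35% of 20 = 7 participants made at least one error.
fluent_error, fluent_perfect = 18, 2
disfluent_error, disfluent_perfect = 7, 13

chi2 = chi_square_2x2(fluent_error, fluent_perfect, disfluent_error, disfluent_perfect)
v = cramers_v(chi2, n=40)
print(f"chi2 = {chi2:.2f}, Cramer's V = {v:.2f}")  # chi2 = 12.91, Cramer's V = 0.57
```

Under the equal-split assumption, the recomputed values match the statistics reported in the text, which suggests that the reported test was an uncorrected Pearson chi-square.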
In Experiment 2, we adopted a fluency manipulation that did not force participants to process the information slowly and sought evidence for our proposed mechanism that people interpret disfluency as a signal to exert extra cognitive effort to adequately complete a task. In addition, although the CRT is a domain-general test of systematic reasoning, we wanted to ensure that the effects of disfluency on processing depth also applied to other concrete domains of judgment. Accordingly, we examined whether disfluency would lead people to focus on systematic processing cues when reading a review of a new MP3 player.

## Experiment 2: Persuasion

The dominant models of persuasion, the heuristic–systematic model (Chaiken, 1980) and the elaboration likelihood model (Petty & Cacioppo, 1986), propose two distinct types of processing cues. Systematic or central cues generally involve the evidentiary quality of an argument, whereas heuristic or peripheral cues generally involve contextual factors irrelevant to an argument's quality (e.g., a target's appearance of competence). In Experiment 2, we tested whether experienced disfluency would increase people's reliance on systematic processing cues when evaluating a persuasive communication.

Participants read a fabricated review of a new MP3 player accompanied by a picture of either a competent-looking person discussing unimportant features (positive heuristic–negative systematic condition) or an incompetent-looking person discussing important features (negative heuristic–positive systematic condition). We manipulated processing fluency by presenting the masthead of the review in either difficult-to-read (disfluent condition) or easy-to-read (fluent condition) type. We predicted that, consistent with the results of Experiment 1, participants in the disfluent condition would rely more heavily on the systematic cue than on the heuristic cue.

### Method

#### Pilot Experiments

*Stimulus selection: Heuristic cues.* A separate sample of 42 participants (Willis & Todorov, 2006) judged the apparent competence of a series of faces taken from a database of faces (Lundqvist, Flykt, & Öhman, 1998). The faces judged, on average, to be the most competent and the most incompetent from the database served as the targets of our heuristic-cue manipulation. Given that the physical appearance of competence is considered a heuristic cue, the competent-looking face constituted a strong (convincing) heuristic cue, and the incompetent-looking face constituted a weak (unconvincing) heuristic cue.

*Stimulus selection: Systematic cues.* In partial fulfillment of a course requirement, 10 Princeton University undergraduates reported the three most important and the three least important features of an MP3 player. We manipulated the strength of the systematic cue by using either the three most commonly mentioned important features (strong systematic cue: price, storage capacity, and battery life) or the three most commonly mentioned unimportant features (weak systematic cue: variety of colors, popular with celebrities, used by "everyone"). The resulting reviews are depicted in Figure 1.

questionnaire began with the fluent masthead for 10 participants and the disfluent masthead for the remaining 10 participants (see Figure 1 for mastheads). We asked participants to read the masthead and to imagine that they were going to read the remainder of the review. They rated how much effort they expected to have to expend to understand the contents of the review and the difficulty of reading the masthead (both on 5-point scales). As we expected, the fluency manipulation check showed that the disfluent masthead was considered more difficult to read (M = 2.30, SD = 0.67) than the fluent masthead (M = 1.20, SD = 0.42), t(18) = 4.37, p < .001, η² = .52.

Perhaps more important, participants who read the disfluent masthead expected to need to expend more cognitive effort to understand the remaining contents of the review (M = 2.30, SD = 0.67) than did those who read the fluent masthead (M = 1.40, SD = 0.70), t(18) = 2.93, p < .01, η² = .32. This preliminary result suggests that people rely on the ease with which they process initial information to determine how much effort they will require to process subsequent information.

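For readers who want to see how the reported summary statistics relate to one another, the following is a minimal sketch (not the authors' analysis code) that recomputes the expected-effort comparison from the published means, standard deviations, and group sizes. It assumes a pooled-variance independent-samples t test and the conversion η² = t² / (t² + df); both are standard formulas but are not named in the text, and the function names are illustrative.

```python
import math


def pooled_t_from_summary(m1, sd1, n1, m2, sd2, n2):
    """Independent-samples t (pooled variance) from group means, SDs, and sizes."""
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / df
    se = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (m1 - m2) / se, df


def eta_squared_from_t(t, df):
    """Proportion of variance explained implied by a t statistic."""
    return t ** 2 / (t ** 2 + df)


# Expected effort: disfluent masthead (M = 2.30, SD = 0.67, n = 10)
# versus fluent masthead (M = 1.40, SD = 0.70, n = 10), as reported above.
t_effort, df_effort = pooled_t_from_summary(2.30, 0.67, 10, 1.40, 0.70, 10)
eta_sq = eta_squared_from_t(t_effort, df_effort)
print(f"t({df_effort}) = {t_effort:.2f}, eta squared = {eta_sq:.2f}")
```

Because the inputs are rounded to two decimals, the recomputed t can differ from the reported value in the second decimal place; the point of the sketch is only that the reported t, degrees of freedom, and effect size are mutually consistent.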
#### Main Experiment

Having shown that participants interpret disfluency as a cue to engage greater cognitive resources, we investigated whether disfluency also increases participants' reliance on systematic cues. Forty Princeton University volunteers at the student campus center read a short review of a new MP3 player, ostensibly printed from a Web site called Techbiz.com. Participants were seated alone or in small groups, and the experimenter ensured that participants in groups completed the questionnaire individually. The masthead on the review was written either in easy-to-read typeset (fluent condition) or in a difficult-to-read combination of letter-resembling symbols (disfluent condition; see Figure 1). Participants in the positive heuristic/negative systematic condition saw the highly competent-looking face used for the reviewer's photo paired with a review praising unimportant features of the MP3 player. Participants in the negative heuristic/positive systematic condition saw the opposite: the incompetent-looking face paired with a review praising important features of the MP3 player. In both conditions, the overall review of the MP3 player was positive. We randomly assigned participants to read one of the four versions of the review.

Note that, in contrast to the fluency manipulation in Experiment 1, this fluency manipulation did not alter the information on which participants based their judgments. It altered the fluency of the masthead at the top of the review rather than the fluency of the information in the review itself. The actual content and format of the reviews were identical between conditions, so participants in the disfluent condition were not compelled to read the review itself more slowly. Although it is possible that participants in the disfluent condition read the masthead more slowly and then maintained this slower processing speed through the rest of the materials, we suspect that the thousands of hours of practice our participants spent reading text in normal fonts outside our experimental context render this particular alternative quite unlikely. In this respect, the manipulation improves on many fluency manipulations.

After reading the review, participants rated the competence of the reviewer, rated the quality of the MP3 player, and estimated how much they would like the experience of owning the MP3 player on separate 7-point scales. Scores on the three scales were highly correlated (Cronbach's α = .82), so we averaged scores to create a composite favorability rating.

### Results and Discussion

As predicted, participants' favorability ratings were more heavily influenced by the systematic cue (the quality of the arguments) in the disfluent condition than in the fluent condition. Specifically, participants in the disfluent condition preferred the MP3 player when the systematic cue was persuasive (strong systematic cue: M = 4.50, SD = 0.82; strong heuristic cue: M = 3.47, SD = 1.62), whereas those in the fluent condition preferred the MP3 player when the heuristic cue was persuasive (strong heuristic cue: M = 3.63, SD = 0.72; strong systematic cue: M = 3.00, SD = 1.14), interaction F(1, 39) = 5.44, p = .03, η² = .13 (see Table 1 for results of each component of the composite favorability rating).² Neither follow-up simple effect comparison reached significance, although participants' ratings were marginally more favorable toward the strong systematic cue review than the strong heuristic cue review when the masthead was disfluent, F(1, 19) = 3.24, p < .10, η² = .15.

These results suggest that disfluency induces a more systematic processing style. It is important to note that this study used a manipulation that did not require participants to spend more time reading information in the disfluent condition than in the fluent condition. Furthermore, the pilot study shows that people expect to need additional cognitive resources to process information after the experience of disfluency. This strongly suggests that the apparent need for more systematic processing activated by the disfluent cues was at least a contributing if not the primary mechanism that led to System 2 processing.

Nonetheless, it is still possible that participants in Experiment 2 who read the review with the disfluent masthead also processed subsequent information more slowly. We therefore adopted a fluency manipulation in Experiment 3 that did not alter the properties of the stimuli at all, thereby eliminating the possibility that participants processed information more slowly because of changes in processing speed that originated in the stimuli. We also conducted Experiment 3 in a third domain of judgment to further demonstrate that the effects of disfluency on processing depth

² On closer inspection, the interaction in this experiment was driven by the strong systematic cue condition, in which participants gave significantly higher favorability ratings when the masthead was disfluent than when it was fluent, F(1, 19) = 18.89, p < .001, η² = .51. In contrast, the masthead had little effect on favorability ratings in the strong heuristic cue condition, F < 1. One possible explanation for this effect is that in the strong systematic cue condition, the disheveled appearance of the reviewer acted as a negative cue that directly contrasted with the compelling content of the actual review. Thus, when participants paid relatively more attention to the content of the review, their ratings were commensurately more favorable. In contrast, although the review in the strong heuristic condition was not particularly compelling, it still praised the MP3 player. As such, in the strong heuristic condition, the two cues did not run in opposition to one another: they were both positive. Thus, in the strong heuristic condition, participants' favorability ratings might not have been swayed strongly by whether they attended to the reviewer or the content of his review.

## Experiment 3: Representativeness Heuristic

In reviewing dual-process accounts in judgment and decision making, Kahneman and Frederick (2002) discussed the representativeness heuristic (Kahneman & Tversky, 1973) as a prototypical case of System 1 reasoning. Using this heuristic when making judgments of category inclusion entails categorizing exemplars primarily on the basis of the extent to which they are representative of a given category, ignoring more cognitively demanding cues such as base rates. Because representativeness is so strongly associated with the heuristic branch of dual-process reasoning, in Experiment 3, we examined whether the experience of disfluency might reduce people's reliance on the representativeness heuristic.

Extending the generality of the effect, we manipulated fluency by asking participants to adopt a facial expression consistent with the exertion of cognitive effort (furrowed brow) or a control expression that did not imply the exertion of effort (puffed cheeks; see Stepper & Strack, 1993). As in Experiment 2, this fluency manipulation did not necessarily slow participants down, which eliminated the possibility that participants in the disfluent condition were merely forced to process the stimuli more slowly.

### Method

#### Pilot Study

As in Experiment 2, we first investigated whether our manipulation of fluency (in this case, one's facial expression) did indeed serve as a cue that greater cognitive effort would or would not be needed in the task. Because heuristic reasoning represents a default intuitive judgment that systematic reasoning would overcome, we investigated whether furrowing one's brow influenced confidence in one's judgment. Low confidence in judgment would serve as a cue that more cognitive processing would be necessary to arrive at the correct answer, and we expected that those in the disfluent condition (who were furrowing their brows) would be less confident in their judgments than those in the fluent condition (who were puffing their cheeks).

Twenty Harvard University undergraduates participated in a lab study in which they answered a series of trivia questions and indicated their confidence that each of their answers was correct (from 0% to 100% confident). The experimenter told participants that the study was designed to examine how people answer questions under distracting conditions, and participants learned that they were going to answer questions while making various potentially distracting body movements. Participants were randomly assigned to adopt one of two facial expressions. Half of the participants were instructed to puff out their cheeks, whereas the other half were instructed to furrow their brows (Stepper & Strack, 1993). The experimenter demonstrated each expression until the participant held it correctly. Furrowing one's brow is associated with difficult mental effort and concentration, whereas puffing one's cheeks is a neutral expression that is equally difficult to maintain but is not associated with mental effort (Tourangeau & Ellsworth, 1979).

As predicted, participants who puffed their cheeks were significantly more confident in their responses (M = 65%, SD = 14%) than were participants who furrowed their brows (M = 52%, SD = 12%), t(18) = 2.07, p = .05, η² = .21. It is important to note that this difference in confidence did not reflect any difference in the actual accuracy of participants' responses among those who puffed their cheeks (M = 36% correct, SD = 13%) versus those who furrowed their brows (M = 38% correct, SD = 10%), t(18) = 0.37, ns; for the interaction between confidence and accuracy, F(1, 18) = 4.85, p < .05, η² = .22. These findings suggest that people are less confident in their judgments when they adopt facial expressions commonly associated with cognitive effort.

#### Main Experiment

One hundred fifty Harvard University undergraduates participated in a lab study in exchange for course credit. The materials for this experiment were taken from one of the original demonstrations of the representativeness heuristic: the Tom W. scenario (Kahneman & Tversky, 1973). Participants in this original demonstration read a description of a fictitious person, Tom W., who was described in a way intended to seem similar to the widely recognized stereotype of an engineer.

As in the original experiment, one group of participants (n = 51) read a description of Tom W. and estimated on an 11-point scale how similar he was to a typical student with one of nine undergraduate majors (library science, social science and social work, business administration, computer science, humanities and education, law, medicine, engineering, and physical and life sciences). The mean similarity rating for each major functioned as a measure of how representative Tom W. was of a typical student with each major.

A second group of participants (n = 55) did not read a description of Tom W. but instead estimated the percentage of students on campus who were studying each of the nine majors. The mean proportion represented an estimate of the base rates associated with each of the nine majors.

Finally, a third group of students (n = 44) ranked the likelihood (from 1 = most likely to 9 = least likely) that Tom W. actually studied each of the nine majors. As in the pilot test, we randomly allocated half of these participants to puff their cheeks (fluent condition) and the other half to furrow their brows (disfluent condition) while rendering their judgments.

Note on the Experiment 2 favorability components: Although the data adhere to similar patterns across the three measures, the interaction is significant on the competence dimension, F(1, 39) = 7.50, p < .01, η² = .17; marginally significant on the quality dimension, F(1, 39) = 3.75, p = .06, η² = .09; but nonsignificant on the liking dimension, F(1, 39) = 1.94, p = .17, η² = .05.

### Results and Discussion

We correlated each participant's nine likelihood ratings with the mean representativeness and base-rate evaluations from the first two groups of participants. These two correlations provide an estimate of how strongly each participant in the third group relied on representativeness and base-rate information when assessing the likelihood that Tom W. studied each of the nine majors. We began by reverse scoring participants' rankings so that the most likely major was scored a 9 and the least likely was scored a 1.

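The scoring procedure just described can be sketched as follows. This is an illustrative reconstruction under stated assumptions, not the authors' code: it assumes a Pearson correlation (the text says only that the ratings were "correlated"), and every number in the example is a made-up placeholder rather than data from the study. The function and variable names are hypothetical.

```python
import math


def reverse_score(ranks, n_items=9):
    """Recode ranks so that rank 1 (most likely) becomes n_items and rank n_items becomes 1."""
    return [n_items + 1 - r for r in ranks]


def pearson_r(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# PLACEHOLDER VALUES (not from the study): one participant's likelihood ranks for the
# nine majors, plus group-mean similarity ratings and base-rate estimates per major.
example_ranks = [8, 7, 5, 2, 9, 6, 4, 1, 3]
example_similarity = [3.1, 2.4, 4.0, 7.5, 2.0, 3.3, 4.8, 8.2, 6.1]
example_base_rates = [2.0, 17.0, 14.0, 7.0, 20.0, 9.0, 8.0, 9.0, 14.0]

recoded = reverse_score(example_ranks)
r_representativeness = pearson_r(recoded, example_similarity)
r_base_rate = pearson_r(recoded, example_base_rates)
print(f"representativeness r = {r_representativeness:.2f}, base-rate r = {r_base_rate:.2f}")
```

Computing the two correlations separately for each participant in the third group yields the per-participant scores that are then compared across the furrowed-brow and puffed-cheeks conditions.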
A positive correlation between this recoded ranking variable and representativeness scores therefore indicates that the participant relied heavily on the representativeness heuristic, whereas a neg- ative correlation indicates that the participant strongly neglected base rates. As predicted, participants who furrowed their brows relied less on the representativeness heuristic (mean r.43, SD 0.46) than did participants who puffed their cheeks (mean r.74, SD 0.19), t(42) 2.84, p.01, 2 .16. Similarly, participants who furrowed their brows exhibited less base-rate neglect (mean r .12, SD 0.30) than did participants who puffed their cheeks (mean r⫽⫺.36, SD 0.33), t(42) 2.60, p.01, 2 .14. Thus, participants whose facial expressions induced a feeling of disfluency were less likely to rely on System 1 processes and were more inclined to use System 2. As the pilot study suggests, participants who furrowed their brows felt less confident in their judgments. These results suggest that participants dealt with the experience of lowered confidence by adopting a more careful, systematic approach to the task. Although Experiments 1–3 demonstrate that experienced disflu- ency leads to more systematic processing, none of the experiments address the potential influence of mood. People in negative mood states— especially sadness—tend to process information more sys- tematically (Schwarz, Bless, & Bohner, 1991), and it is at least possible that experienced disfluency worsens people’s transient mood states. We therefore designed Experiment 4 to extend Ex- periments 1–3 into a new domain of reasoning while simulta- neously investigating the possible role of mood in producing the observed results. Experiment 4 —Syllogistic Reasoning Participants in Experiment 4 attempted to deduce logical con- clusions from a series of two-statement syllogisms printed in either easy-to-read or difficult-to-read font. 
Syllogistic reasoning is one of the most widely studied processes in cognitive psychology (e.g., Johnson-Laird & Bara, 1984; Rips, 1994) and is often used as a case study in higher order reasoning for dual-process models of cognition (e.g., Evans & Over, 1996; Sloman, 1996). We expected people to answer more questions correctly when the syllogisms were written in difficult-to-read font, which would be consistent with the results of Experiments 1–3. We also sought to investigate whether these effects were driven by differential mood effects of the fluency conditions.

### Method

#### Pilot Study

As in Experiments 2 and 3, we first investigated whether our manipulation of fluency (in this case, the fonts of the items) could serve as a cue that greater cognitive effort would or would not be needed in the task. Accordingly, 69 Princeton University undergraduate volunteers at the student campus center read two syllogism questions printed in either an easy-to-read (fluent) font or a difficult-to-read (disfluent) font. The fonts were identical to those used in Experiment 1. Without actually answering the questions, participants rated how confident they were that they would be able to answer them correctly and estimated their difficulty (each on a 5-point scale). As we expected, participants who read the syllogisms printed in fluent font were more confident that they would be able to answer the questions correctly (M = 4.86, SD = 1.39, vs. M = 4.00, SD = 1.39), t(67) = 3.03, p < .01, η² = .12, and believed the task was less difficult (M = 2.54, SD = 1.17, vs. M = 3.29, SD = 1.27) than did participants who read the syllogisms printed in disfluent font, t(67) = 2.58, p < .05, η² = .09. Thus, the experience of reading the syllogisms printed in disfluent font lowered participants’ confidence in their ability to answer the questions correctly and led them to believe that the questions were more difficult.
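
As a rough check on the effect sizes reported for the pilot study, the sketch below recovers η² from each t statistic using the standard conversion η² = t² / (t² + df). The function name `eta_squared_from_t` is illustrative and does not come from the article, and the conversion is an assumption: the article does not state which formula the authors used.

```python
def eta_squared_from_t(t_value, df):
    """Convert a t statistic and its degrees of freedom to eta squared.

    Assumes the standard identity eta^2 = t^2 / (t^2 + df); the article
    does not state which formula the authors used.
    """
    t_squared = t_value ** 2
    return t_squared / (t_squared + df)


# t statistics and degrees of freedom reported for the two pilot measures
pilot_tests = {
    "confidence": (3.03, 67),
    "difficulty": (2.58, 67),
}

for measure, (t_value, df) in pilot_tests.items():
    eta_sq = eta_squared_from_t(t_value, df)
    print(f"{measure}: eta squared = {eta_sq:.2f}")
```

Rounded to two decimals, the two results are .12 and .09, which agree with the effect sizes reported for the confidence and difficulty measures.
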
Notice that, in contrast to Experiment 3, these measures of confidence and difficulty were taken before participants answered the questions rather than after, thereby providing convergent evidence for our proposed mechanism.

#### Main Study

Forty-one Princeton University undergraduates at the student campus center volunteered to complete a questionnaire that contained six syllogistic reasoning problems. The experimenter approached participants individually or in small groups but ensured that they completed the questionnaire without the help of other participants. The syllogisms were selected on the basis of accuracy base rates established in prior research (Johnson-Laird & Bara, 1984; Zielinski, Goodwin, & Halford, 2006). Two were easy (answered correctly by 85% of respondents), two were moderately difficult (50% correct response rate), and two were very difficult (20% correct response rate). The easy and very difficult items were omitted from further analyses because the ceiling and floor effects obscured the effects of fluency on processing depth. Shallow heuristic processing enabled participants to answer the easy items correctly, whereas systematic reasoning was insufficient to guarantee accuracy on the difficult questions. Participants were randomly assigned to read the questionnaire printed in either an easy-to-read (fluent) or a difficult-to-read (disfluent) font, the same fonts that were used in Experiment 1. Finally, participants indicated how happy or sad they felt on a 7-point scale (1 = very sad; 4 = neither happy nor sad; 7 = very happy). This is a standard method for measuring transient mood states (e.g., Forgas, 1995).

### Results and Discussion

As expected, participants in the disfluent condition answered a greater proportion of the questions correctly (M = 64%) than did participants in the fluent condition (M = 43%), t(39) = 2.01, p = .05, η² = .09. This fluency manipulation had no impact on participants' reported mood state (M = 4.50 in the fluent condition vs. M = 4.29 in the disfluent condition), t < 1, η² = .01; mood was not correlated with performance, r(39) = .18, p = .25; and including participants' mood as a covariate did not diminish the impact of fluency on performance, t(39) = 2.15, p < .05, η² = .11. The performance boost associated with disfluent processing is therefore unlikely to be explained by differences in incidental mood states.

## General Discussion

Dual-process models of human judgment are fundamental to much research in cognitive psychology, social psychology, and judgment and decision making. People make different decisions depending on whether they adopt systematic processing or rely on intuitive, heuristic processing. Understanding the output of human judgment and decision making therefore hinges on the ability to predict when people will activate these reasoning systems.

This research provides evidence that experienced difficulty or disfluency is one cue that leads people to adopt a systematic approach to information processing. This effect emerged with three manipulations of fluency across four domains of reasoning (a domain-general test of systematic reasoning, persuasion, person perception, and higher order reasoning). Our results suggest that participants in each experiment who experienced difficulty or disfluency while reasoning believed the tasks were more difficult and therefore engaged in more analytical processing than did those who experienced no such difficulty. Pilot tests conducted for Experiments 2, 3, and 4 confirmed that our manipulations of fluency increased the apparent difficulty of the task and decreased participants’ confidence in the accuracy of their judgments. Although people are usually content to rely on heuristic processing, experienced difficulty or disfluency appears to act as a cue that the problem in question may require more elaborate thought and that one’s simple or intuitive response is likely to be wrong.
These results are consistent with research demonstrating that people are less likely to choose a default option when their confidence is weakened (Simmons & Nelson, 2006).

In addition to describing the effect of fluency on processing depth, we also ruled out the alternative possibilities that stimulus-driven delays in processing speed (Experiments 2 and 3) and negative mood (Experiment 4), rather than disfluency per se, induced systematic processing. Attending to stimuli more carefully and processing them more slowly are both hallmarks of System 2 processes, but our results were not created merely by constraints inherent in our stimuli that required more careful attention or slower processing. Instead, our results appear to have been produced because disfluency acts as a cue that more deliberate processing is required.

Existing fluency research has shown that people interpret stimuli depending on how easy those stimuli are to process (for a review, see Schwarz, 2004), but our research makes the distinct theoretical point that processing fluency can also influence judgment indirectly by serving as a cue to engage in deeper reasoning. Until now, fluency researchers have not provided participants with cues at a shallow, heuristic level that systematically contradict cues at a deeper, systematic level (e.g., an MP3 player reviewer who looks incompetent but writes a compelling review and vice versa). As a result, it has not been possible to determine whether fluency affects how deeply people process available information. Our findings are therefore novel because they show that fluency is used not only directly as a cue for judgment but also indirectly as a mechanism for strategy selection.

Moreover, these findings suggest a resolution for inconsistencies in the fluency literature.
For instance, some research suggests that fluent stimuli are perceived as being more familiar than disfluent stimuli (Monin, 2003), whereas other research suggests that fluent stimuli are judged as being less familiar than disfluent stimuli (Guttentag & Dunn, 2003). In addition, although disfluency implies low quality at a superficial level (e.g., Reber, Winkielman, & Schwarz, 1998), it also activates System 2 processing that might lead people to look beyond a target’s superficial characteristics. Fluency might exert opposing influences when people use cognitive difficulty as a direct cue (e.g., “I don’t like the target”) or as an indirect cue (e.g., “I should think more carefully about my judgment”). Thus, the direct effects of fluency on evaluation and the indirect effects of fluency on subsequent processing might, in certain contexts, have opposing influences on judgment.

More generally, these findings bring us closer to determining what activates System 2 processing. To be viable accounts of human reasoning, dual-system theories need to explain not only what the two systems are but also what activates the use of each system.