All content in this area was uploaded by Michał Białek on May 30, 2023
Cite as:
Muda, R., Pennycook, G., Hamerski, D., & Białek, M. (2023). People are worse at detecting
fake news in their foreign language. Journal of Experimental Psychology: Applied. Advance
online publication. https://doi.org/10.1037/xap0000475
People are worse at detecting fake news in their foreign language.
Rafał Muda¹, Gordon Pennycook²,³, Damian Hamerski¹, and Michał Białek⁴
¹ Faculty of Economics, Maria Curie-Skłodowska University, Poland
² Hill/Levene Schools of Business, University of Regina, Canada
³ Department of Psychology, University of Regina, Canada
⁴ Institute of Psychology, University of Wroclaw, Poland
Author Note:
Correspondence to: michal.bialek3@uwr.edu.pl, Institute of Psychology, University of
Wrocław, Dawida 1, 50-529 Wrocław, Poland
The reported experiments were preregistered: https://aspredicted.org/ux4ja.pdf
(Experiment 1), and https://aspredicted.org/ni2sd.pdf (Experiment 2). Data, code, and
materials can be found at https://osf.io/km4eu/.
Acknowledgments
The current project was financed by the resources of the Polish National Science
Centre (NCN) assigned by the decision no. PRELUDIUM 2018/29/N/HS6/02058 to RM.
Work done by MB was supported by the National Science Centre, Poland (NCN) under Grant
no. SONATA 2017/26/D/HS6/01159. The funders had no role in study design, data
collection and analysis, decision to publish, or preparation of the manuscript.
Authorship contributions
Rafał Muda: Conceptualization, Methodology, Investigation, Validation, Statistical
analysis, Writing - original draft, Writing - review & editing, Funding acquisition, Project
administration
Gordon Pennycook: Conceptualization, Methodology, Writing - review & editing
Damian Hamerski: Methodology, Investigation, Writing - review & editing
Michał Białek: Conceptualization, Methodology, Validation, Statistical analysis,
Writing - original draft, Writing - review & editing, Funding acquisition, Supervision
ABSTRACT
Across two preregistered within-subject experiments (N = 570), we found that when
using their foreign language, proficient bilinguals discerned true from false news less
accurately. This was the case for international news (Experiment 1) and more local news
(Experiment 2). When using a foreign (as opposed to native) language, false news headlines
were always judged more believable, while true news headlines were judged equally
(Experiment 2) or less believable (Experiment 1). In contrast to past theorizing, the foreign
language effect interacted neither with perceived arousal of news (Experiment 1) nor with
individual differences in cognitive reflection (Experiments 1 and 2). Finally, using signal
detection theory modeling, we showed that the negative effects of using a foreign language
were not caused by adopting different responding strategies (e.g., preferring omissions to
false alarms) but rather by decreased sensitivity to the truth.
Significance statement:
Proficient bilinguals who read news headlines in their foreign language (and not in their
native language) are less accurate in discerning true from false news. This harm caused by
fake news targets a vulnerable bilingual news audience that consists of immigrants (whose
prospects in their adopted country are harmed), and people suffering severe propaganda in
their national media (who may be further misled by false news).
Keywords: fake news, bilingualism, foreign language effect, cognitive reflection, arousal
Word count: 11920 (total)
Abstract: 139
Main text: 9755
People are worse at detecting fake news in their foreign language
The spread of falsehoods on social media is emerging as a global challenge that may
threaten democracies (Lee, 2019). The internet has provided unprecedented access to
information; unfortunately, it has also increased the reach of misinformation. An element of
this that has not received much attention is the fact that many people get their news in a
foreign language, e.g., immigrants who need to navigate their new homelands or people
suffering severe propaganda in their homeland who seek more reliable news in international
media. Therefore, this is a context in which misinformation and “fake news” (i.e., fabricated
news headlines that are presented as legitimate and spread on social media; Lazer et al., 2018)
can be particularly harmful.
In this project, we empirically tested whether reading news headlines in a foreign
language affects readers' ability to discern true from fake news. This addresses a problem that
is critically important for everyday life. Modern war is fought not only on a battlefield but
also in the media. To illustrate, as a result of severe propaganda in their media, more than
70% of Russians believed that government-owned media covered Ukrainian events (e.g.,
“atrocities” against the Russian-speaking population) truthfully and without bias (Khaldarova
& Pantti, 2016). In this way, the Russian government built support for a planned war against
Ukraine. Another example of using false news is that some countries' official response to the
COVID pandemic was to deny the threat. In these cases, the ability of citizens to find reliable
information about the actual risks of contracting COVID and the best methods to avoid
getting infected could save their lives.
Propaganda can be best fought if confronted with external news sources that are less
biased and can help people to make more accurate decisions (e.g., whether to evacuate and, if
so, where to). These external sources are often available in English or some other
foreign language. To date, only one study has tested how people deal with news when using a
foreign language, finding no effect of using a foreign language on the believability of false
news (Fernández-López & Perea, 2020). Because the study did not include true news as a
baseline, the critical question of whether people are more or less able to discern true from
false news was still unanswered.
Foreign language use may improve news discernment
Perhaps counterintuitively, there are reasons to expect that reading news in a foreign
language can be beneficial for proficient bilinguals. Initial research on the foreign language
effect (FLe) reported that using a second language helps bilinguals make better, less biased
decisions (Costa, Foucart, Arnon, et al., 2014; Keysar et al., 2012). Although FLe is thought
to change decision-making, its exact mechanism remains an open question. Several proposed
mechanisms include lower emotional engagement, greater cognitive engagement, or more
abstract construals (Hayakawa et al., 2016); in each case, the underlying mechanism is that
thinking in a foreign language helps bilinguals base their judgments less on gut feelings.
Biased decisions are often rooted in automatically produced, intuitively appealing, affect-rich
responses (Loewenstein & Lerner, 2003). Better decisions, such as not believing in fake news,
can be achieved through reflection. Yet, because people are cognitive misers, they tend to
default to intuitions when judging and deciding (Stanovich, 2018). Deliberative processing
that (often but not always) corrects biased intuitions is engaged due to metacognitive factors,
such as when confidence in intuitions is low, or there is some sort of conflict detected
between intuitions (Ackerman & Thompson, 2017; De Neys & Pennycook, 2019; Evans &
Stanovich, 2013; Pennycook et al., 2015). Put differently, metacognitive processes are
responsible for triggering reflection, whose output in turn depends on cognitive abilities. The
sensitivity to metacognitive cues when deciding whether to reflect (vs. to defer to intuitions)
is called cognitive reflection.
Let us illustrate the difference between cognitive abilities and cognitive reflection. For
a chess player, cognitive abilities (e.g., intelligence, working memory capacity, processing
speed) are needed when solving a chess puzzle that one knows has a winning move. However,
identifying a position in which there is a winning move during a game requires metacognitive
insight: one has to have a hunch that there is something in the position that may be worth
searching for. Hence, finding a moment to stop and reflect on a move requires cognitive
reflection. Consistent with this, top chess players' “fast” (under strict time constraints) and “slow”
chess play (in hours-long chess games) were more closely related than those of untitled or
intermediate players (Blanch et al., 2020). What this means is that expertise may help people
to improve their “fast” play so it matches their “slow” play through metacognitive gains rather
than improvement in cognitive abilities. This analogy works for people dealing with fake
news. Here, people typically believe in the news they read, especially when they find it
intuitive (Pennycook & Rand, 2019, 2020; Ross et al., 2019) or emotionally evocative
(Fernández-López & Perea, 2020; Martel et al., 2020). Being cognitive misers, people default
to their intuitions when thinking and deciding (Stanovich, 2018). To verify news as being
potentially fake, one has to first detect that news may be false (via metacognition) and then
employ reasoning to ultimately evaluate its truth status. Consistent with this account, people
who are dispositionally more reflective and less intuitive are better able to discern between
true and false news (Ecker et al., 2022).
What we argue is that the tendency to engage in reflection is critical for high-quality
decision-making in general, and for dealing with fake news specifically. Cognitive abilities may be of no use if
one has not even started reflecting. If using a foreign language makes news appear less
arousing or causes people to generally reflect more, reasoners should be more able to detect
that reflection is needed, and, in turn, increase their consistency and normative accuracy. In
other words, using a foreign language can attenuate the intuitive appeal of fake news, increase
deliberation, and in turn, support the ability to discriminate true from false news.
Foreign language use may hurt news discernment
More recent research found that decision-makers do not always benefit from using
their foreign language: using a foreign language does not affect decision-making in gambling
(Muda, Walker, et al., 2020) or intertemporal choice (Białek et al., 2022; Xu et al., 2021), and
may not increase reflective thinking (Mækelæ & Pfuhl, 2019; Milczarski et al., 2022). There
is also evidence that using a foreign language does not make people think more (that is, using
a foreign language does not result in increased cognitive reflection or greater task
engagement, Milczarski et al., 2022), and may even distort the ability to detect when a salient
intuition might be wrong (Białek et al., 2020). That is, people may have limited access to their
intuition-based interpretation of information. Consistent with this account, when
information is presented to bilinguals in their foreign language, related brain potentials
suggest that more effort is required to compare new information to prior knowledge (Romero-
Rivas et al., 2017).
Why would limited access to intuitions be bad, when we just argued that reduced
reliance on intuitions can be beneficial? This latter claim comes from performance in tasks
designed to produce an intuitively appealing but incorrect response. Participants solve such
tasks fairly inaccurately, with only about 20-30% of responses being correct (Stanovich, 2009). In such
tasks, even random responding can appear to be an improvement. Hence, the benefit of
over-reflecting on a problem heavily depends on the quality of intuitions. If intuitions are fairly
accurate, overriding them with reflection may not improve the ultimate performance, and
risks depleting cognitive resources.
To reiterate, recent research suggests that logical intuitions can be muted when
reasoning in a foreign language (Białek et al., 2020). Extending these findings to fake news,
using a foreign language can mute the intuitive believability judgments of news because it is
harder to compare news to previously held beliefs and because the potential conflict between
what is being read and what was previously known is less likely to trigger corrective
reflective processes. People have limited time and cognitive resources that they can normally
invest in judging news headlines, so they cannot reflect on all news they encounter. Given
these constraints, prior beliefs are an important cue in forming news believability judgments
(Tappin et al., 2020). For example, to evaluate whether an event is likely to happen, one can
consider the overall credibility of the source (i.e., the baseline level of truth of any news) and
the likelihood of an event (e.g., a base rate probability of government engaging in a secret
scheme). Using a foreign language can hamper access to such cues, forcing people to
additionally use other cues to judge the believability of news or to adopt a more conservative
strategy of handling new information and trusting all information less.
Of course, drivers of false beliefs are not restricted to intuitive thinking (e.g., thinking
driven by arousing news). Among others, we can list factors such as source cues (e.g., the
source's attractiveness), worldview (e.g., personal views), cognitive failures (e.g., neglect of source
cues and/or knowledge), or illusory truth (e.g., familiarity) (Ecker et al., 2022). There is no
strong evidence to claim that the foreign language effect can reduce these sources of belief in
fake news. Finally, people may be inaccurate in assessing the believability of news simply
because they adopt a conservative strategy when dealing with potentially false news: they
may over-focus on avoiding belief in false news (i.e., avoiding a Type 1 error), which is then
likely to result in a failure to accept true news (a Type 2 error). Using a foreign language can affect
responding strategy rather than reflection. If so, even maintaining a high level of fake news
detection may be harmful to the overall quality of acquired news, because one fails to believe
the much more numerous true news (Guess et al., 2019). In every classification analysis,
discriminability and responding strategy can be confounded. To mitigate this issue, it is useful
to adopt the signal detection approach that allows us to study the accuracy and response
strategy separately (Batailler et al., 2022).
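To make the distinction between discriminability and responding strategy concrete, the sketch below computes the two standard signal detection indices, sensitivity (d') and criterion (c), from response counts. It is a minimal illustration, not the analysis used in this paper: the function name, the choice to treat true headlines as the "signal", the log-linear correction, and the example counts are all assumptions made for this example.

```python
from statistics import NormalDist


def sdt_indices(hits, misses, false_alarms, correct_rejections):
    """Return (d_prime, criterion_c) from the four response counts.

    Assumed mapping: a "hit" is a true headline judged true, and a
    "false alarm" is a false headline judged true.
    """
    z = NormalDist().inv_cdf  # inverse of the standard normal CDF
    # Log-linear correction (assumption): keeps rates away from 0 and 1,
    # where the z-transform would be undefined.
    hit_rate = (hits + 0.5) / (hits + misses + 1)
    fa_rate = (false_alarms + 0.5) / (false_alarms + correct_rejections + 1)
    d_prime = z(hit_rate) - z(fa_rate)             # sensitivity to the truth
    criterion = -0.5 * (z(hit_rate) + z(fa_rate))  # > 0 means a bias toward "false"
    return d_prime, criterion


# Hypothetical responder who accepts 6 of 8 true and 2 of 8 false headlines.
balanced = sdt_indices(hits=6, misses=2, false_alarms=2, correct_rejections=6)
# Hypothetical responder who answers "false" much more often overall.
conservative = sdt_indices(hits=3, misses=5, false_alarms=0, correct_rejections=8)
```

Under this mapping, a drop in d' with an unchanged c would indicate poorer discrimination, whereas a shift in c with an unchanged d' would indicate a changed responding strategy.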
Hypotheses and predictions
To summarize our predictions, much of the reason people believe false news is that
such news is intuitively appealing (Pennycook & Rand, 2019, 2020; Ross et al., 2019),
familiar (Pennycook et al., 2018), and emotionally evocative (Martel et al., 2020). Belief in
false news is amplified in people who tend to rely on their gut feelings (Ecker et al., 2022) and
is reduced when people are prompted to reflect on its believability (Pennycook et al., 2020;
Pennycook, Epstein, et al., 2021; Pennycook & Rand, 2021b; Roozenbeek et al., 2021). Using
a foreign language is claimed to reduce the affective response to stimuli (Caldwell-Harris,
2015), reduce reliance on intuitions, and promote more reflection (Hayakawa et al., 2016). All
of this suggests that using a foreign language can be beneficial when dealing with fake news.
However, if using a foreign language distorts metacognition, people may not benefit
from using it when dealing with fake news (Białek et al., 2020; Niszczota et al., 2022).
Having their sensitivity to metacognitive cues distorted, people would be less accurate in
deciding when to reflect, and thus could overthink the believability of true news but skip
thinking about false news.
Finally, using a foreign language can affect people’s responding strategy instead. In
such a case, people using a foreign language can simply adopt a more conservative evaluation
strategy and believe all news less. This is possible because the strength of the intuitive appeal
of a response predicts confidence in it, so less vivid intuitions can result in overall lower
confidence in one's own judgment (Ackerman & Thompson, 2017; Thompson & Morsanyi, 2012).
And this, in turn, can result in a responding strategy that favors responding “false” instead of
“true”.
Such an effect of using a foreign language would not be reflected in the relative belief
in true vs. false news. However, such a strategy could be harmful to them because false news
is only a small fraction of the news consumed by the average person (Guess et al., 2019), and
thus decreased trust in all news will prevent people from believing in false news at the cost of
not believing in much more common true news. In other words, correcting false beliefs can be
achieved more efficiently by promoting belief in credible news rather than by fighting
misinformation (Acerbi et al., 2022). Using a foreign language can achieve exactly the
opposite.
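The base-rate argument above can be illustrated with a small calculation. Every number in this sketch is hypothetical and chosen only to show the logic (the share of false news and the acceptance rates are not taken from Guess et al., 2019, or from our data): when false news is rare, an across-the-board drop in trust removes far more true beliefs than false ones.

```python
def believed_counts(n_true, n_false, p_believe_true, p_believe_false):
    """Expected number of true and false items a reader ends up believing."""
    return n_true * p_believe_true, n_false * p_believe_false


# Hypothetical news diet: 950 true and 50 false items (illustrative only).
N_TRUE, N_FALSE = 950, 50

# Hypothetical acceptance rates before and after a blanket drop in trust.
baseline = believed_counts(N_TRUE, N_FALSE, p_believe_true=0.80, p_believe_false=0.40)
skeptical = believed_counts(N_TRUE, N_FALSE, p_believe_true=0.70, p_believe_false=0.30)

true_beliefs_lost = baseline[0] - skeptical[0]
false_beliefs_avoided = baseline[1] - skeptical[1]
```

With these made-up inputs, the same ten-point drop in acceptance costs many more true beliefs than it prevents false ones, simply because true items dominate the mix.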
Overview of the experiments
In two preregistered experiments, we presented proficient Polish bilinguals with a
mixture of true and false news headlines collected from a foreign source (snopes.com for false
news; nbcnews.com, foxnews.com, news.yahoo.com, nytimes.com for true news, Experiment
1) or national sources (fakenews.pl for false news; polsatnews.pl, tvn24.pl for true news,
Experiment 2). As in other research on fake news, each piece of news was presented as a
news snippet typically found on social media platforms (e.g., Pennycook et al., 2018;
Pennycook & Rand, 2019), here pretending to be from the BBC. The participant’s task was to
judge the believability of the news using a continuous 7-point scale (Experiment 1) and
true/false classifications followed by 3-point confidence ratings (Experiment 2). We
additionally tested moderators related to a participant (i.e., their foreign language proficiency
and cognitive reflection) and to properties of the news (i.e., their ability to evoke emotional
response and familiarity). In the subsequent section, we provide the rationale for each of
these moderators. Experiment 2 also tested the intent to share the news as the
dependent variable. In both experiments, we found that people were less able to differentiate
between true and false news when using their foreign language, while cognitively reflective
individuals were better at discerning between true and false news in Experiment 1 but not in
Experiment 2. This effect was robust to all considered moderators. In Experiment 2, using
Signal Detection Theory Analysis, we found that the distorted ability to discern true from
false news is attributable to decreased sensitivity to the truth rather than a systematic response
bias (e.g., judging all news as less believable).
Experiment 1
To test the effects of using a foreign language on the ability to discern true from false
news, we designed an experiment in which bilingual participants were asked to judge the
believability of news headlines. Half of the news was false. The manipulation was that half of
the true and half of the false news was randomly selected to be displayed in a foreign
language. The goal of the experiment was to compare the (within-subject) ability to discern
true from false news presented to bilingual participants in their native language (Polish) vs. in
their foreign language (English).
We included several other factors responsible for belief in fake news: familiarity,
arousal, and participant's cognitive reflection. We did so because these factors can parse out
some of the variance in believability ratings, allowing for a clearer test of the unique
contribution of using a foreign language. For example, people trust news more when they find it
more arousing (Fernández-López & Perea, 2020) or more familiar (Pennycook & Rand,
2020). Critically, however, these measures are potential moderating variables that can help us
better understand the foreign language effect. For example, familiarity with the news may require
participants to retrieve information about the truth status of the news from memory rather than
reasoning about the content of the news. There is no expectation that FLe will affect memory
retrieval, but it should affect how people reason about the content. Hence, FLe should be
much weaker in the news that is classified as known to a participant. Next, the foreign
language effect can depend on cognitive reflection. Hypothetically, those whose thinking
mostly depends on their gut feelings may benefit more from thinking in a foreign language,
because they may start reflecting more. However, those who tend to reflect by default may be
harmed by using a foreign language because they may reflect less or misallocate their
cognitive effort. The foreign language effect can also be observed more strongly in more
arousing news because if using a foreign language decreases perceived arousal, then the
foreign language effect has the most space to operate in that news. Lastly, the foreign
language effect might be reduced when it comes to highly proficient bilinguals – in their case,
the use of a foreign language might be as fluent as their native language, creating no space for
the cognitive or emotional mechanism to work.
Methods
Transparency and Openness
We preregistered our experiments: https://aspredicted.org/ux4ja.pdf. Data, code, and
materials can be found at https://osf.io/km4eu/. The data were analyzed using JAMOVI
2.2.5 (The Jamovi Project, 2021), which lets readers see with a click how each analysis was
conducted. We posted a working Jamovi file with all analyses reported in the manuscript in
the Supplementary Materials. We also report an earlier experiment in the Supplementary
Materials (preregistration: https://aspredicted.org/976pr.pdf) that did not offer a good test of
the hypothesis because the true and false news that were used were indistinguishable from
each other in terms of their believability. Experiments 1 and 2 received approval from the
Ethics Committee at the University of Wroclaw.
Participants
We analyzed data from N = 300 participants (188 females, MAge = 22, SD = 3.4; N =
177 native language condition; N = 123 foreign language condition). Participants were
recruited through an external research firm. Participants were deemed ineligible if they
declared that Polish is not their native language, had a parent who speaks English as their
native language, lived more than 10 months in an English-speaking country, or self-rated their
global English proficiency as 4 or lower on a scale from 1 to 10. These selection criteria were
preregistered and roughly follow selection criteria in past work (Białek, Muda, et al., 2020;
Costa, Foucart, Arnon, et al., 2014; Keysar et al., 2012; Muda, Pieńkosz, et al., 2020).
We initially recruited 301 participants but excluded 1 participant who failed the
translation task that served as a proficiency verification. Subjects received 20 PLN (about 5
USD) for participation in the 15-minute laboratory experiment. Our sample size provided us
with 1 − β = .80 power to detect correlations as small as r = .14 and differences between two
within-subject conditions as small as d = .14 (Faul et al., 2009). We believe that for this
particular problem, smaller effects can be considered practically insignificant (Anvari &
Lakens, 2021). To put this into context, if one manipulated IQ scores (SD = 15) and the effect
were d = .14, the change in IQ score would be on average 2.1 points.
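The sensitivity figures above can be roughly cross-checked with a normal approximation. This is only a sketch: the text does not state the alpha level or whether the tests were directional, so alpha = .05 and a one-tailed test are assumptions here, and the function names are illustrative. A normal approximation is also not the exact computation a dedicated power program performs.

```python
import math
from statistics import NormalDist

_Z = NormalDist()  # standard normal distribution


def approx_power(ncp, alpha=0.05, one_tailed=True):
    """Normal-approximation power for a test with noncentrality `ncp`.

    For the two-tailed case the negligible far tail is ignored.
    """
    tail_alpha = alpha if one_tailed else alpha / 2
    critical = _Z.inv_cdf(1 - tail_alpha)
    return 1 - _Z.cdf(critical - ncp)


def power_paired_d(d, n, **kwargs):
    """Within-subject standardized mean difference d with n participants."""
    return approx_power(d * math.sqrt(n), **kwargs)


def power_correlation(r, n, **kwargs):
    """Correlation r with n participants, via Fisher's z transformation."""
    return approx_power(math.atanh(r) * math.sqrt(n - 3), **kwargs)


def d_to_iq_points(d, iq_sd=15):
    """Convert a standardized effect into IQ points (IQ scales use SD = 15)."""
    return d * iq_sd


print(round(d_to_iq_points(0.14), 1))  # 2.1
```

Under these assumptions, `power_paired_d(0.14, 300)` and `power_correlation(0.14, 300)` both land in the neighborhood of the reported .80; a two-tailed test would yield lower power for the same effect size.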
Our sample size is however still potentially too small to detect complex interactions,
such as attenuated interactions in which effects in two groups point in the same direction but
differ only slightly in strength (Sommet et al., 2022) or any three-way interactions. Hence, if
an interaction was non-significant in our studies, it should not be considered as evidence
against its existence; a non-significant result only informs us that such interaction is likely
weak or context-dependent.
Materials and Procedure
Participants of this in-person lab experiment evaluated fake news in two language
conditions in counterbalanced order: native language (NL: Polish) vs. foreign language (FL:
English). Their task was to read 16 news headlines (8 true and 8 fake, see Supplementary
Materials for exact wording) and to rate their believability (1 – Extremely unbelievable; 7 –
Extremely believable), willingness to share the news (I never share anything online; No;
Maybe; Yes), familiarity with the news (No; Unsure; Yes), and how emotionally arousing the
news was (1 – Not at all; 7 – Extremely). We do not report an analysis of the intent to share
because the scale we used was flawed: the option indicating that a person never shares
anything online is qualitatively different from the other responses, so the scale is not suitable
for statistical analysis. Experiment 2 reports such an analysis with an improved scale.
Furthermore, participants self-rated their English proficiency (1 – Not at all; 10 – Perfectly).
Except for using a foreign
language in some news, the procedure for selecting and presenting fake news closely mirrored
previous studies (Pennycook, Binnendyk, et al., 2021). Fake news headlines were selected
from Snopes.com, an independent fact-checking website. The real headlines were
contemporary stories from mainstream news outlets.¹ False news was found on
https://snopes.com/; true news was found on nbcnews.com, foxnews.com, news.yahoo.com,
and nytimes.com. The news was presented as screenshots in the same format as if it were
shared on Facebook, albeit with manipulation (i.e., the news pretended to be from the BBC
but its language was manipulated to be English or Polish). The news was translated into
Polish and back-translated by the bilingual authors of the study. For each participant, we
randomly selected four headlines to be presented in their native language and four headlines
to be presented in a foreign language. The order of presentation of the languages was
counterbalanced.
¹ Of course, the classification of news as true vs. false is only superficial, based on a single fact-checking
platform. Moreover, many of the true news items can be assessed as unbelievable because they report true but
unlikely events. Hence, the task at hand is difficult, and we do not expect participants to produce a fully accurate
classification of the news headlines. This problem was visible in our pilot experiment, in which we presented
participants with true and false headlines from snopes.com that our participants found equally (un)believable.
The results of that experiment showed increased believability of such news in a foreign language, being roughly
consistent with the findings from Experiments 1 and 2: false news was believed more when presented in a
foreign language. We report this experiment in the Supplementary Materials.
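The per-participant randomization described above could be implemented along the following lines. This is a sketch under stated assumptions, not the software used in the experiment: it assumes that half of the true and half of the false headlines go to each language (as described in the introduction to Experiment 1), the headline identifiers are placeholders, and the simple alternation used for counterbalancing is hypothetical.

```python
import random


def assign_languages(true_ids, false_ids, rng):
    """Send a random half of the true and of the false headlines to FL, the rest to NL."""
    assignment = {}
    for ids in (true_ids, false_ids):
        shuffled = list(ids)
        rng.shuffle(shuffled)
        half = len(shuffled) // 2
        for headline in shuffled[:half]:
            assignment[headline] = "FL"  # foreign language (English)
        for headline in shuffled[half:]:
            assignment[headline] = "NL"  # native language (Polish)
    return assignment


def block_order(participant_index):
    """Hypothetical counterbalancing rule: alternate which language block comes first."""
    return ("NL", "FL") if participant_index % 2 == 0 else ("FL", "NL")


rng = random.Random(2023)  # fixed seed only to make the example reproducible
true_ids = [f"T{i}" for i in range(1, 9)]    # placeholder IDs for the 8 true headlines
false_ids = [f"F{i}" for i in range(1, 9)]   # placeholder IDs for the 8 false headlines
assignment = assign_languages(true_ids, false_ids, rng)
```

Shuffling within news type, rather than across all 16 headlines at once, guarantees that each language block contains the same number of true and false items.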
Between the two language trials, participants completed a filler task (i.e., delay
discounting) presented in the same language as the first trial. As their last task, participants
solved the Cognitive Reflection Test (CRT, Frederick, 2005), presented to them in their native
language. CRT consists of three mathematical tasks that have strongly appealing intuitive but
incorrect responses. The sum of correct responses serves as a measure of cognitive reflection.
The three CRT questions were:
(1) A bat and a ball cost $1.10 in total. The bat costs $1.00 more than the ball. How much
does the ball cost? _____ cents
(2) If it takes 5 machines 5 minutes to make 5 widgets, how long would it take 100 machines
to make 100 widgets? _____ minutes
(3) In a lake, there is a patch of lily pads. Every day, the patch doubles in size. If it takes 48
days for the patch to cover the entire lake, how long would it take for the patch to cover half
of the lake? _____ days
No other data was collected.
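Scoring the CRT as the sum of correct responses can be sketched as follows. The answer key (5 cents, 5 minutes, 47 days) and the typical intuitive responses (10, 100, 24) are the standard ones for these three items; the item labels and the function name are illustrative assumptions.

```python
# Correct answers to the three CRT items listed above.
CRT_KEY = {"bat_ball_cents": 5, "machines_minutes": 5, "lily_pads_days": 47}

# The intuitively appealing but incorrect answers.
CRT_INTUITIVE = {"bat_ball_cents": 10, "machines_minutes": 100, "lily_pads_days": 24}


def crt_score(answers):
    """Number of correct responses (0-3); missing items count as incorrect."""
    return sum(1 for item, correct in CRT_KEY.items() if answers.get(item) == correct)


# A participant who resists the intuitive answer only on the lily-pad item.
example = {"bat_ball_cents": 10, "machines_minutes": 100, "lily_pads_days": 47}
print(crt_score(example))  # 1
```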
Results
The believability ratings were fairly high, considering that half of the news was fake,
M = 4.08 (SD = 1.78). Our preregistered linear mixed-effects analysis tested the effects of using a
foreign language on these believability ratings. In the model, news_ID and participants_ID
were random effects, and the type of news (true/false) and language (NL/FL) were defined as
fixed effects. We extended this baseline model by additionally including covariates: CRT
score and its interactions (Model 1), perceived arousal of news and its interactions (Model 2),
familiarity with news and its interactions (Model 3), and general English proficiency and its
interactions (Model 4). We contrast-coded the factorial predictors as -0.5 and +0.5 and
mean-centered the continuous predictors.²
² As one of the reviewers noticed, our materials included a typo in one of the news headlines. We report
analyses excluding this headline in the Supplementary Materials and found that the results are almost identical
to the ones reported here.
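Written out, the baseline model described above corresponds to something like the following. This is a sketch that assumes random intercepts only (the text names the random effects but not their structure); the symbols are illustrative and not taken from the analysis files, with i indexing participants and j indexing headlines.

```latex
\[
y_{ij} = \beta_0 + \beta_1 T_j + \beta_2 L_{ij} + \beta_3 T_j L_{ij} + u_i + w_j + \varepsilon_{ij},
\]
\[
u_i \sim \mathcal{N}(0, \sigma_u^2), \qquad
w_j \sim \mathcal{N}(0, \sigma_w^2), \qquad
\varepsilon_{ij} \sim \mathcal{N}(0, \sigma^2),
\]
\[
T_j, L_{ij} \in \{-0.5, +0.5\}.
\]
```

Here y is the believability rating, T the news type, and L the language. With the ±0.5 coding, β1 and β2 are differences between marginal means, and β3 is the difference between the two languages in the true-versus-false gap, that is, the change in discernment.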
Foreign language effects. Our main interest was in the main effect of news type, which
informed us whether people were able to discern true from false news, and in the news type by
language interaction, which informed us whether this ability was increased or decreased in FL
users. The results, means, and post-hoc comparisons for the interaction are presented in Tables 1-3.
In all models, using a foreign language had no direct effect on the believability ratings.
As hypothesized, in all models we observed a language-by-news type interaction. False news
was believed more and true news was believed less in the foreign language condition than in
the native language condition (Table 4). Or, looking at the interaction differently, the
difference in the believability of true vs. false news was larger in the native language
condition than in the foreign language condition (baseline model: diffNL = 1.55 [1.21, 1.90],
diffFL = 1.11 [0.77, 1.46]; see Figure 1, Panel A). To conclude, we observed a decrease in the
ability to differentiate true from false news when using a foreign language. This language-by-
news type interaction did not further interact with any other factors so that the distorted
discernment of true and false news was similar regardless of the perceived arousal of news,
familiarity with news, cognitive reflection of participants, or foreign language proficiency.
Note, however, that our experiment may still be insufficiently powered to detect three-way
interactions, so the nonsignificant findings are not evidence against models including such
complex interactions.
Secondary effects. True news was judged as more believable than false news. Consistent
with previous findings, we observed several other factors that influenced whether a news
headline was rated as believable: more arousing news was perceived as more believable
(Martel et al., 2020). Higher cognitive reflection was not significantly associated with higher
believability ratings. Critically, the CRT score (M = 1.13, SD = 1.12)³ interacted with the
news type, so that the increase in believability with increasing CRT scores was only observed
in true news, B = 0.14 [0.05, 0.23], but not in false news, B = 0.00 [-0.09, 0.09]; see Figure 1,
Panel B. In general, cognitive reflection has beneficial effects on the ability to discern true
from false news (Pennycook & Rand, 2019). Foreign language proficiency was positively
related to belief in true news, but not in false news. Critically, this effect was independent of
the language in which the news was presented. Finally, despite previous findings suggesting
that news that seems familiar is perceived as more believable (Pennycook et al., 2018), we
found no such effect in our dataset. This could be because we successfully selected unfamiliar
news.
³ The distribution of CRT scores closely mirrored the results observed by Frederick (2005): 0 points – 41% of
participants; 1 point – 21%; 2 points – 22.3%; 3 points – 15.7%.
Figure 1. Effects of language and news type on believability ratings. Panel A presents the effects of
language (Baseline Model), and Panel B of cognitive reflection (standardized) on
believability ratings of news (Model 1).
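Table 1. Effects on believability ratings in Experiment 1.
| Predictor | Mean (SD) | Baseline B | Baseline p | Model 1* B | Model 1* p | Model 2* B | Model 2* p | Model 3 B | Model 3 p | Model 4 B | Model 4 p |
|---|---|---|---|---|---|---|---|---|---|---|---|
| News type (false, true) | | -1.33 | < .001 | -1.33 | < .001 | -1.41 | < .001 | -1.34 | < .001 | -1.33 | < .001 |
| Language (foreign, native) | | 0.06 | 0.185 | 0.06 | 0.185 | 0.04 | 0.281 | 0.06 | 0.171 | 0.06 | 0.185 |
| Language x News type | | 0.44 | < .001 | 0.44 | < .001 | 0.48 | < .001 | 0.44 | < .001 | 0.44 | < .001 |
| CRT (0-3) (α = .66) | 1.13 (1.12) | | | 0.07 | 0.091 | | | | | | |
| CRT x News type | | | | -0.14 | < .001 | | | | | | |
| CRT x Language | | | | 0.02 | 0.576 | | | | | | |
| CRT x News type x Language | | | | -0.06 | 0.417 | | | | | | |
| Arousal (1-7) | 4.06 (1.81) | | | | | 0.26 | < .001 | | | | |
| News type x Arousal | | | | | | 0.04 | .070 | | | | |
| Language x Arousal | | | | | | 0.00 | .979 | | | | |
| News type x Language x Arousal | | | | | | 0.00 | .959 | | | | |
| Familiarity (0-2) | 0.45 (0.72) | | | | | | | -0.03 | .355 | | |
| News type x Familiarity | | | | | | | | -0.03 | .651 | | |
| Language x Familiarity | | | | | | | | 0.02 | .792 | | |
| News type x Language x Familiarity | | | | | | | | 0.06 | .634 | | |
| English proficiency (self-rated, 5-10) | 7.08 (1.27) | | | | | | | | | 0.11 | 0.001 |
| News type x Proficiency | | | | | | | | | | -0.13 | < .001 |
| Language x Proficiency | | | | | | | | | | 0.04 | 0.270 |
| News type x Language x Proficiency | | | | | | | | | | -0.10 | 0.118 |

Random Components
| Component | Baseline SD | Baseline ICC | Model 1* SD | Model 1* ICC | Model 2* SD | Model 2* ICC | Model 3 SD | Model 3 ICC | Model 4 SD | Model 4 ICC |
|---|---|---|---|---|---|---|---|---|---|---|
| ID | 0.70 | 0.19 | 0.70 | 0.18 | 0.6 | 0.15 | 0.70 | 0.18 | 0.69 | 0.18 |
| News | 0.30 | 0.04 | 0.30 | 0.04 | 0.31 | 0.04 | 0.31 | 0.04 | 0.30 | 0.04 |

Model
| | Baseline Model | Model 1* | Model 2* | Model 3 | Model 4 |
|---|---|---|---|---|---|
| Observations | 4800 | 4800 | 4800 | 4800 | 4800 |
| N ID | 300 | 300 | 300 | 300 | 300 |
| N News | 16 | 16 | 16 | 16 | 16 |
| Marginal R2 / Conditional R2 | 0.132 / 0.325 | 0.147 / 0.328 | 0.215 / 0.358 | 0.144 / 0.325 | 0.152 / 0.328 |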
Note: A * denotes preregistered models. Brackets next to each predictor describe factor levels or scale width. Factors were coded as -0.5 or +0.5, and the scales were centered.
Table 2. Means for believability depending on the condition for Experiment 1 (baseline model).
| Language | News type | Mean | SE | 95% CI Lower | 95% CI Upper |
|---|---|---|---|---|---|
| native | true | 4.83 | 0.122 | 4.58 | 5.09 |
| foreign | true | 4.67 | 0.122 | 4.42 | 4.93 |
| native | false | 3.28 | 0.122 | 3.03 | 3.53 |
| foreign | false | 3.55 | 0.122 | 3.30 | 3.81 |
Table 3. Post-hoc comparisons for news type x language interaction for Experiment 1 (baseline model).
| News type | Language | | News type | Language | Difference | SE | t | df | p (Holm) |
|---|---|---|---|---|---|---|---|---|---|
| false | native | - | false | foreign | -0.275 | 0.0600 | -4.59 | 4484.3 | < .001 |
| false | native | - | true | foreign | -1.390 | 0.1626 | -8.55 | 16.1 | < .001 |
| true | foreign | - | false | foreign | 1.115 | 0.1626 | 6.86 | 16.1 | < .001 |
| true | native | - | false | foreign | 1.278 | 0.1626 | 7.86 | 16.1 | < .001 |
| true | native | - | false | native | 1.553 | 0.1626 | 9.55 | 16.1 | < .001 |
| true | native | - | true | foreign | 0.163 | 0.0599 | 2.72 | 4484.3 | 0.007 |
Note. P-values are Holm-corrected.
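As a rough cross-check of how the ±0.5 contrast coding in Table 1 relates to the cell means in Table 2, the sketch below recomputes the effects as simple differences of means. It assumes a balanced design and ignores the random effects, so it is only an approximation of the mixed-model estimates. The variable and function names are illustrative, and the sign of each quantity depends on which factor level is coded as +0.5.

```python
# Cell means copied from Table 2 (Experiment 1, baseline model).
means = {
    ("native", "true"): 4.83,
    ("foreign", "true"): 4.67,
    ("native", "false"): 3.28,
    ("foreign", "false"): 3.55,
}


def discernment(language):
    """True-minus-false believability gap within one language."""
    return means[(language, "true")] - means[(language, "false")]


# Difference of differences: size of the language x news type interaction.
interaction = discernment("native") - discernment("foreign")

# Average true-minus-false gap: size of the news type main effect.
news_type_effect = (discernment("native") + discernment("foreign")) / 2

# Average native-minus-foreign gap: size of the language main effect.
language_effect = (
    (means[("native", "true")] + means[("native", "false")]) / 2
    - (means[("foreign", "true")] + means[("foreign", "false")]) / 2
)

print(interaction, news_type_effect, language_effect)
```

The three quantities come out at approximately 0.43, 1.335, and -0.055, which agree in magnitude, up to rounding of the cell means, with the baseline-model coefficients of 0.44, 1.33, and 0.06 in Table 1.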
Experiment 2
In Experiment 1, despite additionally testing different factors influencing the
believability ratings, we observed a robust and significant language by news type interaction.
Contrary to what could be inferred from the literature on the FLe, this interaction indicates
that people are worse at discerning between true and false news in their foreign language. In
Experiment 2, we address the possibility that the results could instead be attributed to foreign
language causing decreased confidence in responding (Geipel et al., 2015a; Montero-Melis et
al., 2020) or, simply, a more conservative responding strategy that results in fewer false
alarms (i.e., believing in false news) at the cost of more omissions (i.e., not believing in true
news). To this end, instead of having participants judge believability using a continuous
7-point scale, we asked participants to make yes/no classifications followed by confidence
ratings. Among other improvements, we used a better scale to investigate the intention to
share news and presented participants with local (Polish) news rather than the US news used in
Experiment 1. To shorten and simplify the experiment, we dropped the question on prior
familiarity with the presented news.
Methods
Transparency and Openness
We preregistered our online experiment: https://aspredicted.org/ni2sd.pdf. Data and
materials for this research are posted in a public repository: https://osf.io/km4eu/. As in
Experiment 1, the repository includes a JAMOVI file with all NHST analyses reported in the
manuscript and a ROCToolbox output file for the Receiver Operating Characteristic analysis.
Participants
Data from N = 270 participants were analyzed (140 females, 130 males, MAge = 25.1,
SD = 8.27, N = 141 native language condition; N = 129 foreign language condition).
Participants were recruited through Prolific Academic. We initially recruited 274 participants
but excluded 4 participants who failed the translation task that served as a proficiency
verification (i.e., they were asked to translate the text of the instruction). Compared to
Experiment 1, we included an additional exclusion criterion, i.e., self-rated level of stimulus
understanding, but no one was excluded due to this criterion.
Materials and Procedure
The procedure regarding the selection and presentation of fake news mirrored
Experiment 1. However, (1) we conducted this study online and (2) instead of US-based
news, we used Poland-based news, which we then translated into English and back-translated.
Participants evaluated the believability of news headlines in one language condition: native
language (Polish) or foreign language (English). Their task was to read 16 news headlines (8 true
and 8 fake; the exact stimuli are available at https://osf.io/km4eu/) and to judge their believability
(YES/NO), followed by a confidence rating for each judgment (3 points: Not at all, Moderately,
Very). For each news headline, participants also declared their willingness to share the news.
Because the scale used in Experiment 1 was problematic (e.g., the answer '0' described
general unwillingness to share any news online), we now implemented a 0–100 scale,
with its ends labeled as 'not at all' and 'very'. The last task was to solve the 3-item
Cognitive Reflection Test. In contrast to Experiment 1, no data on arousal or familiarity was
collected.
The news headlines were contemporary stories from major Polish news outlets that
highlighted current international and Polish issues. Fake news was found at
https://fakenews.pl/; true news was found at https://www.polsatnews.pl/ and https://tvn24.pl/.
All news was presented in random order. No other data was collected.
Results
We preregistered a linear mixed model (LME) that controls for item-level effects (i.e.,
variability of the believability of given news or item-specific FLe). Additionally, as
recommended by Batailler et al. (2022), we preregistered the Receiver Operating
Characteristic Analysis (ROC) to dissociate sensitivity (ability to differentiate true from fake
news) from response bias (strategy to prefer type 1 or type 2 errors). By analogy to the
research incorporating ROCs to study belief bias (Białek et al., 2020; Dube et al., 2010;
Trippas et al., 2014), instead of simply using yes/no responses, we also collected participants’
confidence ratings (1 to 3), allowing for a more precise estimation of the parameters.
LME analysis
As in Experiment 1, our preregistered analysis tested the effects of using a foreign
language on the believability of true and false news. We report analyses on believability
measured in two ways: as dichotomous believability classification (true/false) and as
continuous believability ratings (combined true/false classifications with 3-point confidence
ratings into a 1-6 scale, where 1 is “false” with the greatest confidence, and 6 is “true” with
the greatest confidence).
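The recoding into a 1-6 scale can be sketched as follows. This is a minimal illustration, not the authors' code: the text fixes only the endpoints (1 = "false" with the greatest confidence, 6 = "true" with the greatest confidence), so the ordering of the intermediate values and the function name `believability_score` are assumptions.

```python
def believability_score(judged_true: bool, confidence: int) -> int:
    """Combine a yes/no judgment and a 1-3 confidence rating into a 1-6 score.

    Assumed mapping: "false" answers occupy 1-3 and "true" answers occupy 4-6,
    with more confident answers lying further from the scale midpoint.
    """
    if confidence not in (1, 2, 3):
        raise ValueError("confidence must be 1, 2, or 3")
    return 3 + confidence if judged_true else 4 - confidence


# All six combinations, from "false, very confident" to "true, very confident".
scores = [believability_score(False, c) for c in (3, 2, 1)]
scores += [believability_score(True, c) for c in (1, 2, 3)]
print(scores)
```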
We contrast-coded our factorial predictors as -0.5 and +0.5 and mean-centered the
continuous predictors. The clustering variables were participant IDs and news numbers. The
results, means, and post-hoc comparisons for language by news type interaction are presented
in Tables 4-6.
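Table 4. Results of Mixed Linear Models conducted for Experiment 2.
| Predictor | Truth ratings (Yes/No) Exp(B) | p | Confidence (1-3) B | p | Believability* (1-6) B | p | Intent to share (1-100) B | p |
|---|---|---|---|---|---|---|---|---|
| Intercept | 1.26 | .064 | 2.31 | <.001 | 3.60 | <.001 | 19.49 | <.001 |
| News type (true, false) | 6.15 | <.001 | -0.06 | .337 | 1.55 | <.001 | 0.53 | .812 |
| Language (FL, NL) | 1.10 | .301 | -0.09 | .004 | 0.05 | .541 | 5.26 | .014 |
| CRT (0-3) (α = 0.81) | 1.05 | .281 | -0.03 | .064 | 0.04 | .262 | -0.77 | .432 |
| News type x Language | 0.73 | .023 | -0.01 | .844 | -0.25 | .012 | 1.48 | .239 |
| News type x CRT | 0.94 | .346 | 0.01 | .496 | -0.07 | .117 | -0.38 | .510 |
| Language x CRT | 1.02 | .779 | 0.03 | .324 | 0.00 | .990 | -1.23 | .530 |
| News type x Language x CRT | 1.05 | .682 | -0.04 | .247 | 0.06 | .524 | -1.25 | .278 |

Random Components
| Component | Truth ratings SD | ICC | Confidence SD | ICC | Believability* SD | ICC | Intent to share SD | ICC |
|---|---|---|---|---|---|---|---|---|
| ID | 0.51 | 0.07 | 2.15 | 0.12 | 0.46 | 0.07 | 16.59 | 0.39 |
| News | 0.47 | 0.06 | 0.11 | 0.04 | 0.40 | 0.06 | 4.24 | 0.04 |

Model
| | Truth ratings (Yes/No) | Confidence (1-3) | Believability* (1-6) | Intent to share (1-100) |
|---|---|---|---|---|
| Observations | 4320 | 4320 | 4320 | 4320 |
| N ID | 270 | 270 | 270 | 270 |
| N News | 16 | 16 | 16 | 16 |
| Marginal R2 / Conditional R2 | 0.182 / 0.286 | 0.001 / 0.160 | 0.166 / 0.267 | 0.013 / 0.417 |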
Note: * denotes a preregistered model.
Table 5. Means for believability depending on the condition for Experiment 2 (believability coded
yes/no).
| Language | News type | Prob. | SE | 95% CI Lower | 95% CI Upper |
|---|---|---|---|---|---|
| native | false | 0.310 | 0.0393 | 0.238 | 0.391 |
| foreign | false | 0.367 | 0.0428 | 0.288 | 0.454 |
| native | true | 0.764 | 0.0336 | 0.692 | 0.823 |
| foreign | true | 0.752 | 0.0349 | 0.678 | 0.814 |
Table 6. Post-hoc comparisons for news type x language interaction for Experiment 2.
| News type | Language | | News type | Language | exp(B) | SE | z | p (Holm) |
|---|---|---|---|---|---|---|---|---|
| false | native | - | false | foreign | 0.774 | 0.0880 | -2.255 | 0.048 |
| false | native | - | true | foreign | 0.148 | 0.0388 | -7.286 | < .001 |
| true | foreign | - | false | foreign | 5.240 | 1.3357 | 6.497 | < .001 |
| true | native | - | false | foreign | 5.576 | 1.4603 | 6.562 | < .001 |
| true | native | - | false | native | 7.206 | 1.8335 | 7.762 | < .001 |
| true | native | - | true | foreign | 1.064 | 0.1285 | 0.516 | 0.606 |
Note. Positive values of the 'difference' column suggest that the probability of news being
classified as false is higher for the right-hand side of the variables compared.
Effects on the believability of news. Fifty-four percent of the news was classified as true,
and the mean believability of the news was 3.60 (SD = 1.91). The main effect of using a foreign
language was nonsignificant, but as in Experiment 1, we found a robust news type by language
interaction (Table 4). The interaction showed that the difference in believability between true
and false news was larger in the native language (NL) than in the foreign language (FL; Figure 2,
panel A). This conclusion holds regardless of whether we looked at the yes/no classifications,
OR_NL = 7.21 [4.38, 11.87]; OR_FL = 5.24 [3.18, 8.64], or continuous believability ratings,
Diff_NL = 1.68 [1.28, 2.13]; Diff_FL = 1.43 [0.97, 1.88]. Unlike previous research and
Experiment 1, CRT (M = 2.10, SD = 1.10)⁴ did not produce a main effect or interact with the
type of news (Figure 2, panel B). There was also no three-way interaction with the language in
which the news was presented.
⁴ CRT scores were significantly higher than those observed in Experiment 1 (p < .001). Observed
distribution: 0 points – 14.1% of participants; 1 point – 13.3%; 2 points – 21.1%; 3 points – 51.5%,
which closely mirrored the scores of MIT students observed by Frederick (2005). The extremely high
scores (mean above 2 points) could be the reason why we did not observe a significant interaction
of CRT with news type or language.
Figure 2. Effects of language and news type on believability ratings (yes/no). Panel A
presents the effects of language and Panel B of cognitive reflection (standardized) on
believability ratings of news headlines.
Effects on the intent to share. The mean intent to share was 19.5 (SD = 26.8) on a scale of 1
to 100, suggesting a rather low intent to share the news that our participants read. The only
significant effect predicting it was that of language, with greater intent to share
when headlines were presented in FL (Table 4). Surprisingly, the most straightforward
prediction that people would be less willing to share false news was not confirmed here.
People may have decided to share false news even when they thought it was false, either for its
entertainment value or because they thought that such news would be interesting if true
(Altay et al., 2022). The foreign language effect on intent to share news is important because a
greater willingness to share news combined with lower discriminability of news in a foreign
language magnifies the potential problem with fake news: people can disseminate more fake
news.
Using the opportunity provided by our data, we also tested whether believability
ratings predict willingness to share (see Supplementary Materials for details). To this end, we
built an LME, where language, cognitive reflection, type of news, 6-point believability ratings
and their first-order interactions were fixed effects, and news ID and participant ID were
random effects. We observed significant main effects: (1) news being shared more in a
foreign language than in a native language, B = 4.89 [3.32, 6.46], (2) false news being shared
more than true news, B = 5.79 [1.72, 9.86]; (3) believable news being shared more than
unbelievable news, B = 4.09 [3.69, 4.49]; (4) higher cognitive reflection predicted less intent
to share news, B = -2.77 [-3.72, -1.82]. We also observed a language by believability
interaction, B = 0.96 [0.18, 1.74], with believability a stronger predictor of sharing intent
when the news was presented in FL, B = 4.50 [3.98, 5.03], compared to NL, B = 3.22 [2.73,
3.72]. Among other interactions, we observed news type x believability interaction, B = -0.99
[-1.77, -0.21], and a language by cognitive reflection interaction, B = 2.36 [0.36, 4.35]. The
language x news type x cognitive reflection interaction was non-significant at p = 0.051,
B = -2.67 [-5.35, 0.01]. Details of this analysis can be found in the Supplementary Materials.
Effects on confidence. On average, people were quite confident in their responses, M = 2.31,
SD = 0.62. We found a slight decrease in confidence in the classifications made in FL (Table
4). Lower confidence in FL could be confounded with a genuine difference in the ability to
distinguish between true and false news. If all judgments are made with lower confidence, the
difference in continuous ratings can be reduced because, for example, instead of judging false
news as '1' and true news as '6' on a six-point believability scale, lower confidence would
transform these ratings to '2' and '5', respectively. In this way, the difference in believability
would decrease from 5 to 3 points, suggesting a lower discriminability of news. However, the
reverse causal order between believability ratings and confidence is also possible: participants
could metacognitively realize that their classification of news in FL is less accurate, and thus
report lower confidence.
Because we found that language affects both classification accuracy and confidence
ratings, we decided to test whether the two effects can be confounded with each other. We did
this using the Signal Detection Theory analysis (Green & Swets, 1966; Wixted, 2020).
Signal Detection Theory Analysis
Signal detection theory tests the accuracy of binary classifications by analyzing hits
(correctly detecting that a signal is present; here, accurately classifying true news as true) and
false alarms (mistaking noise for a signal; here, classifying false news as true)⁵. Signal
detection theory allows us to estimate two components of classification: sensitivity (i.e., the
ability to discern signal from noise) and response criterion (i.e., relative preference for Type 1
vs. Type 2 error). Without the Signal Detection Theory analysis, changes in response criterion
can be mistaken for a change in sensitivity (Dube et al., 2010). To illustrate the difference
between response criterion and sensitivity, consider an airport machine that detects people
smuggling drugs. A machine that would always beep, regardless of whether or not a person
has drugs, would be 100% accurate in detecting true smugglers. Yet, such a machine would
be practically useless because of the liberal response criterion that causes an enormous false
alarm rate. Hence, the difference between sensitivity and response bias is more pronounced
with different proportions of signal to noise. This is especially important for fake news
research, where experiments use an equal proportion of true to false news, but where
real-world news is mostly true.
⁵ Misses (rejecting signal as noise; here classifying true news as false) and correct rejections (correctly
classifying noise as noise; here classifying false news as false) are mathematically redundant (see Swets, 1988
for a more detailed explanation).
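To make the distinction between sensitivity and response criterion concrete, the sketch below computes the textbook equal-variance Gaussian indices d' = z(H) - z(FA) and c = -(z(H) + z(FA)) / 2 from a hit rate H and a false alarm rate FA. This is an illustration only: it is not the dual-process model fitted with ROCToolbox in this paper, and the two detectors and their rates are hypothetical.

```python
from statistics import NormalDist


def sdt_indices(hit_rate: float, false_alarm_rate: float) -> tuple[float, float]:
    """Return (d_prime, criterion_c) under the equal-variance Gaussian model.

    Both rates must lie strictly between 0 and 1, because the inverse normal
    CDF is undefined at exactly 0 or 1.
    """
    z = NormalDist().inv_cdf
    z_hit, z_fa = z(hit_rate), z(false_alarm_rate)
    d_prime = z_hit - z_fa               # sensitivity
    criterion_c = -0.5 * (z_hit + z_fa)  # negative = liberal, positive = conservative
    return d_prime, criterion_c


# Hypothetical detector that beeps for almost everyone: many hits, many false alarms.
always_beeps = sdt_indices(0.99, 0.99)

# Hypothetical detector that separates smugglers from other travelers fairly well.
selective = sdt_indices(0.90, 0.10)

print(always_beeps, selective)
```

The near-always-beeping detector has a very high hit rate but zero sensitivity and a strongly liberal (negative) criterion, whereas the selective detector has high sensitivity and a neutral criterion.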
In the context of this project, sensitivity tracks whether people accurately classify true
news as true and false news as false. Response bias tests whether, when making such
decisions, people prefer to err by accepting false news as true (Type 1 error) or to err by
rejecting true news as false (Type 2 error). News discernment tests for relative belief in true
vs. false news. However, this can be misleading. Consider a case in which, after manipulation,
the belief in false news decreased from 40 to 30%, but the belief in true news decreased from
90 to 80%. An analysis looking at the belief in false news would suggest that people believe
it less, and thus the manipulation is beneficial. An analysis looking at the relative news
belief scores would suggest the difference is constant at 50 percentage points, and
thus the manipulation had no effect on news discernment. Yet, it is clear that all news was
believed less, and thus the manipulation was potentially harmful in cases where true news
accounts for over 90% of all news consumed (Guess et al., 2019). This change will be
reflected in the response bias parameter.
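The arithmetic of this example can be sketched as follows. The belief rates are the hypothetical ones from the paragraph above, and the criterion c is computed with the equal-variance Gaussian formula, which is an assumption made for illustration rather than the model fitted later in this paper.

```python
from statistics import NormalDist

z = NormalDist().inv_cdf


def summarize(belief_true: float, belief_false: float) -> dict:
    """Raw discernment and response criterion for one condition."""
    return {
        # Belief in true news minus belief in false news (proportion scale).
        "discernment": belief_true - belief_false,
        # Equal-variance criterion; larger values mean more conservative responding.
        "criterion_c": -0.5 * (z(belief_true) + z(belief_false)),
    }


before = summarize(0.90, 0.40)  # before the hypothetical manipulation
after = summarize(0.80, 0.30)   # after the hypothetical manipulation

print(before, after)
```

Discernment is 50 percentage points in both conditions, while the criterion moves in the conservative direction after the manipulation, which is the shift that the raw difference score hides.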
Using a foreign language could affect both the sensitivity and the response bias. That
is, when using a foreign language, people may be less or more accurate when assessing the
believability of news and thus show changes in the sensitivity index. On the other hand,
people using a foreign language may simply process news with lower fluency or have less
access to metacognitive cues of its believability. This would lead them to adopt a conservative
response strategy (preferring to reject true news as false rather than accept false news as true)
and show changes in the response criterion index. Finally, both effects can occur
simultaneously.
Sensitivity and response bias can be estimated using ROCToolbox (Koen et al., 2017).
To perform such an analysis, we need binary classification data followed by its confidence
ratings. Signal Detection Theory analysis with ROCToolbox fits all participants’ responses in
a given condition to a receiver operating characteristic (ROC) curve. Because the ROC
estimates are not accompanied by any measure of standard error, the comparisons between
conditions are made descriptively. Our method is more advanced than traditional ROC
analysis because it not only plots hits vs. false alarms of classification decisions but also
incorporates information about the confidence with which each classification was made.
How to read the ROC plot (Figure 3)? Starting from the bottom-left corner, the first
point of the curve corresponds to a proportion of hits to false alarms of responses “true” made
with the highest confidence; point 2 to responses “true” made with the highest confidence and
with moderate confidence; point 3 to responses “true” with all levels of confidence; point 4 to
response “false” with the lowest confidence and “true” with all confidence levels, point 5 to
“false” with low and moderate confidence and all “true” responses with any confidence
levels. Point 6 is always in the top-right corner. The size of the area under the curve (AUC)
represents the accuracy of the classification. The larger the AUC, the greater the estimated
sensitivity. With perfect responding, hits = 1 and false alarms = 0. Therefore, the AUC = 1.
The diagonal represents a 'guessing line', where hits are as frequent as false alarms, and the
AUC = .5. If the AUC is smaller than .5, participants simply reverse-classified the news,
which is typically attributable to not following the instruction properly. The response bias is
visualized by positioning the points on the curve. For example, if the points on the ROC are
shifted to the top right corner, people have shown more hits and false alarms. They tended to
accept most of the news as true: a liberal responding strategy. If points are clustered at the
bottom left corner, then people have made fewer hits and fewer false alarms. That is, they
tended to reject most of the news as false.
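The construction of the ROC points described above can be sketched as follows. The response counts are hypothetical, the function names are illustrative, and the trapezoidal area is a simple empirical estimate rather than the model-based AUC that ROCToolbox reports.

```python
def roc_points(true_counts, false_counts):
    """Cumulative (false alarm rate, hit rate) pairs, starting at (0, 0).

    Counts are ordered from the most confident "true" response (rating 6)
    down to the most confident "false" response (rating 1).
    """
    n_true, n_false = sum(true_counts), sum(false_counts)
    points = [(0.0, 0.0)]
    hits = false_alarms = 0
    for t, f in zip(true_counts, false_counts):
        hits += t          # true headlines rated at or above this level
        false_alarms += f  # false headlines rated at or above this level
        points.append((false_alarms / n_false, hits / n_true))
    return points


def auc(points):
    """Trapezoidal area under the ROC points."""
    return sum(
        (x1 - x0) * (y0 + y1) / 2
        for (x0, y0), (x1, y1) in zip(points, points[1:])
    )


# Hypothetical counts of ratings 6, 5, 4, 3, 2, 1 for true and false headlines.
true_counts = [40, 25, 15, 10, 6, 4]
false_counts = [5, 10, 15, 20, 25, 25]

points = roc_points(true_counts, false_counts)
area = auc(points)
print(points, area)
```

The last point always lands in the top-right corner (1, 1), as noted above for point 6. Identical rating distributions for true and false headlines give an area of 0.5, and perfectly separated distributions give an area of 1.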
Figure 3. Receiver operating characteristics (ROC) curves for discriminating between true
and false news headlines in native vs. foreign language.
We fitted our data to a ROC curve using a dual-process signal detection model
(DPSD; Yonelinas, 1994). A G-test indicated that the DPSD model fits both the NL and FL
data well, both G's ≥ 9.02, p's ≤ .029 (Fig. 3), with the ROC curves adj. R²_NL = 0.958 and
adj. R²_FL = 0.987. Therefore, we can proceed to analyze the model indices.
One of the simplest measures of accuracy in ROCs is captured by hit rates minus false
alarm rates. Since both are within the 0-1 range, a perfectly accurate participant would make
100% hits and 0% false alarms. Thus, this parameter would be 1.0. A guessing participant
would have similar hits and false alarm rates; thus, this parameter would be close to 0. In our
data, the parameter was 0.41 for NL condition participants and 0.35 for FL participants⁶.
Therefore, FL users were less accurate in their classifications. Alternative measures of
accuracy suggest similar conclusions: AUC_NL = 0.73, AUC_FL = 0.71, d'_NL = 1.09, and d'_FL =
0.91. So, alleviating our concerns about confounding decreased classification accuracy with
lower confidence in responding, we see that accuracy decreased when headlines were
presented in FL on all metrics.
ROC analysis also allows us to investigate whether participants in FL adopted
different response strategies. We found that the response biases were very similar, β_NL = 0.90,
β_FL = 0.87; and c_NL = -0.10, c_FL = -0.15. The bias parameters suggest that participants did not
generally adopt a radically liberal or conservative response criterion. Because the confidence
interval of the difference between the c-parameters includes zero, d = -0.64 [-1.80, 0.52]⁷, the
difference is
statistically non-significant. Therefore, changes in accuracy caused by reading news headlines
in a foreign language were not convincingly accompanied by more conservative response
bias.
⁶ The probability of endorsing true news (hit) vs. false news (false alarm) is already compared in the LME
analysis (Table 6). This analysis shows that the difference between hit and false alarm rates is statistically
larger in NL than in FL (i.e., the language by news type interaction).
⁷ The ROCToolbox can estimate the overall response criterion c without a standard error attached, but also c1-5
parameters estimated for each confidence level. These parameters are estimated with the standard error metric
attached. We decided to conservatively compare the aggregated c parameters using the largest SE attached to a
c1-5 parameter (Białek et al., 2020).
General Discussion
Across two experiments with 570 participants (one in-lab, one online), we found that
using a foreign language could disrupt the ability to discern true from fake news. When using a
foreign versus native language, true news was judged as equally (Experiment 2) or less
believable (Experiment 1), while false news was judged as more believable (Experiments 1
and 2). We observed that FLe was robust to changes in affective processing (i.e., the effect
was not moderated by perceived arousal of news, Experiment 1) and in reflective processing
(i.e., the effect was not moderated by individual differences in CRT, Experiments 1 and 2).
Foreign language proficiency of participants also did not moderate the effect (although we
removed those with inadequate proficiency, i.e., those who self-reported their proficiency to
be under 5 out of 10). Finally, using signal detection theory, we showed that the negative
effects of using a foreign language were not caused by adopting different responding
strategies (e.g., preferring omissions to false alarms in FL).
The difference in the believability ratings between true and false news was smaller in
the foreign language condition by approximately 0.4 points on a 7-point scale in Experiment 1
(p < .001) and by 0.2 points on a 6-point scale in Experiment 2 (p = .012). An effect of this
size may not completely mislead a single individual reading fake news in their
foreign language. However, Funder and Ozer (2019) noted that even smaller effects could be
meaningful when affecting many people. And this is exactly the case we study: millions of
people consume news in their foreign language. If only some of them were misled, we would
end up with thousands of people affected by fake news. This may be enough to win elections
in a swing state or to cause protests against a just cause such as COVID-19 vaccination
programs.
Practical implications
Although many news consumers may acquire news in their foreign language⁸, past
research has focused almost exclusively on native language consumers (and particularly on US
samples) (Grinberg et al., 2019; Pennycook & Rand, 2021a). Our research suggests that the
problem of fake news may be larger in previously unstudied bilingual populations, which are
less able to discern true from false news when using their foreign language. Our data suggest
that, when presented to people in their foreign language, false news may be believed more,
and true news may be believed less. As false news is only a small fraction of the news
consumed by the average person (Guess et al., 2019), the latter effect may cause the most
damage: disbelieving the much more numerous true news may matter more than believing the
less common false news (Acerbi et al., 2022).
⁸ English is the most commonly used language on the internet, with a 25.9% share
(https://www.statista.com/statistics/262946/share-of-the-most-common-languages-on-the-internet/), whereas
only about 360 million people are native English speakers (about 5% of the 7.6 billion people on Earth;
https://datatopics.worldbank.org/world-development-indicators/).
We currently do not know whether these results hold and are observable in the
real world. If they do, two groups of bilinguals can be most harmed by their inability
to detect fake news: (1) people interested in international news or who seek to validate their
local, potentially biased news sources internationally; and (2) immigrants. The former group
may have additional difficulty in accurately contrasting their local news with international
news, potentially being misled by fake news outlets or internet trolls to support harmful
policies, e.g., war or oppression of civilians in other countries. The latter group of people is in
danger of being misinformed, decreasing their prospects in their adopted country. Because
these groups are already vulnerable, we need to consider whether there are methods that
reduce the potential negative effect of using a foreign language on news discernment.
Theoretical implications
Our results have implications for the applicability of the foreign language effect (FLe)
in aiding human decision-making. Specifically, we find evidence that thinking in a foreign
language may not be as beneficial (at least not in all cases) to the decision-maker as initially
suggested (Costa et al., 2014; Keysar et al., 2012); in some cases, it can be detrimental. This
adds to the growing literature casting doubt on the FLe as a debiasing tool (Białek et al., 2020,
2022; Caldwell-Harris & Ayçiçeği-Dinn, 2021; Mækelæ & Pfuhl, 2019; Muda, Walker, et al.,
2020; Vives et al., 2018; Xu et al., 2021).
We also provide some novel insights into the mechanisms that explain FLe. Two
initially proposed models suggested that foreign language use (1) decreases intuitiveness via
weaker affective processing or (2) increases deliberation via stronger processing difficulty
(Hayakawa et al., 2016). We tested both accounts, showing that the detrimental effects of
using a foreign language were independent of the perceived arousal of news and individual
differences in cognitive reflection. In other words, we found that the language x news type
interaction was not moderated by cognitive reflection (Experiments 1 and 2), familiarity with
news, or perceived arousal (Experiment 1). Of course, our study could be underpowered to
detect such effects; however, our results are consistent with studies showing that FLe is robust
to emotions (Chan Yuen-Lai et al., 2016; Geipel et al., 2015b; Muda, Walker, et al., 2020)
and does not affect cognitive reflection scores (Milczarski et al., 2022). Hence, our data do
not support the decreased-affect and increased-reflection models.
Finally, our findings confirmed that people tend to believe the news more when they
find it emotionally evocative (Martel et al., 2020). However, we found no support for the
claim that people tend to believe the news they find familiar (Pennycook et al., 2018). We
also found mixed evidence that people higher in cognitive reflection tend to better discern true
news from false news (Pennycook & Rand, 2019; Ross et al., 2019): CRT scores and news
type interacted significantly in Experiment 1 but not in Experiment 2. Lastly,
believability scores predicted participants' intent to share news, more so when the news was
presented in a foreign language.
Limitations and future directions
The best practice in language research is to use counterbalanced bilinguals, e.g., ask
Spanish native speakers who know English to answer questions in Spanish (NL) or English
(FL), and then English native speakers who know Spanish to answer questions in English
(NL) or Spanish (FL). The observed effect of using a foreign language should be visible to all
FL users, which would deconfound the effect from semantic issues in the translation or
cultural context. In this research, we only recruited Polish-English bilinguals, and thus we do
not know whether the reported negative effect of using FL is unique to this language pair.
Moreover, we cannot determine whether our effects are attributable to language processing
per se or to the foreign-language context in which the news was placed (see Costa, Foucart,
Arnon, et al., 2014, for a discussion of this issue).
The practical difficulty of such a design is that native English speakers rarely know a
foreign language well enough to be included in such experiments. If they do, they are often
first- or second-generation immigrants with live contact with both languages (Portes &
Schauffler, 1994). Next, most bilinguals learned English as their second language. Therefore,
their native language must be paired with English, circling back to the lack of the
corresponding sample in English native speakers. Another way of avoiding such a confound is
to use several pairs of languages always paired with English, as done in some prior research
(Białek et al., 2019, 2022; Costa et al., 2014; Hayakawa et al., 2019). Finally, for
methodological reasons, we rejected bilingual participants who had lived abroad for a long
time. Hence, we believe that extending our findings to such a group is problematic but not
necessarily wrong – the evidence for acculturation that reduces the foreign language effect is
questionable (Białek & Fugelsang, 2019).
We found the foreign language effect to be robust to all considered moderators. Yet,
moderators are difficult to identify, especially when the moderator simply attenuates an effect
(i.e., makes it slightly weaker or stronger) rather than knocking out an effect or
reversing its sign (Sommet et al., 2022). Moreover, our main effect of interest was already an
interaction (believability in true vs. fake news), so testing an additional level of interaction
requires very large samples. Hence, even with the relatively large samples we collected, our
experiments could be underpowered in testing the moderators considered here.
Conclusions
We show that proficient bilinguals using their second language are less able to discern
true news from fake news because they judge false news as more believable (Experiments 1
and 2) and potentially also because they judge true news headlines as less believable
(Experiment 1 but not Experiment 2). Experiment 1 showed that this detrimental effect of
using a foreign language did not interact with the arousal of the news, previous familiarity
with the news, or the cognitive reflection of the participants. Experiment 2 additionally
showed that using a foreign language decreased confidence in believability judgments but
increased the intent to share the news. These results highlight that people using their second
language, potentially including refugees and economic immigrants, who are especially
vulnerable groups, can be less able to gain accurate information about their environment
through online news sources partially populated by fake news. If further studies (preferably
field studies on social media; Mosleh et al., 2021) confirm this phenomenon, methods that
help bilinguals navigate news media in their foreign language should be developed.
References
Acerbi, A., Altay, S., & Mercier, H. (2022). Research note: Fighting misinformation or
fighting for information? Harvard Kennedy School Misinformation Review.
https://doi.org/10.37016/mr-2020-87
Ackerman, R., & Thompson, V. A. (2017). Meta-Reasoning: Monitoring and Control of
Thinking and Reasoning. Trends in Cognitive Sciences, 21(8), 607–617.
https://doi.org/10.1016/j.tics.2017.05.004
Altay, S., de Araujo, E., & Mercier, H. (2022). “If this account is true, it is most enormously
wonderful”: Interestingness-if-true and the sharing of true and false news. Digital
Journalism, 10(3), 373–394.
Anvari, F., & Lakens, D. (2021). Using anchor-based methods to determine the smallest effect
size of interest. Journal of Experimental Social Psychology, 96, 104159.
Batailler, C., Brannon, S. M., Teas, P. E., & Gawronski, B. (2022). A Signal Detection
Approach to Understanding the Identification of Fake News. Perspectives on
Psychological Science, 17(1), 78–98. https://doi.org/10.1177/1745691620986135
Białek, M., Domurat, A., Paruzel-Czachura, M., & Muda, R. (2022). Limits of the foreign
language effect: Intertemporal choice. Thinking & Reasoning, 28(1), 97–124.
https://doi.org/10.1080/13546783.2021.1934899
Białek, M., & Fugelsang, J. (2019). No evidence for decreased foreign language effect in
highly proficient and acculturated bilinguals: A commentary on Čavar and Tytus
(2018). Journal of Multilingual and Multicultural Development, 40(8), 679–686.
https://doi.org/10.1080/01434632.2018.1547072
Białek, M., Muda, R., Stewart, K., Niszczota, P., & Pieńkosz, D. (2020). Thinking in a
foreign language distorts allocation of cognitive effort: Evidence from reasoning.
Cognition, 205, 104420. https://doi.org/10.1016/j.cognition.2020.104420
Białek, M., Paruzel-Czachura, M., & Gawronski, B. (2019). Foreign language effects on
moral dilemma judgments: An analysis using the CNI model. Journal of Experimental
Social Psychology, 85, 103855. https://doi.org/10.1016/j.jesp.2019.103855
Blanch, A., Ayats, A., & Cornadó, M. P. (2020). Slow and fast chess performance across
three expert levels. Psychology of Sport and Exercise, 50, 101749.
https://doi.org/10.1016/j.psychsport.2020.101749
Caldwell-Harris, C. L. (2015). Emotionality differences between a native and foreign
language: Implications for everyday life. Current Directions in Psychological Science,
24(3), 214–219.
Caldwell-Harris, C. L., & Ayçiçeği-Dinn, A. (2021). When using the native language leads to
more ethical choices: Integrating ratings and electrodermal monitoring. Language,
Cognition and Neuroscience, 36(7), 885–901.
https://doi.org/10.1080/23273798.2020.1818266
Chan, Y.-L., Xuan, G., Ng, J. C., & Tse, C.-S. (2016). Effects of dilemma type,
language, and emotion arousal on utilitarian vs deontological choice to moral
dilemmas in Chinese-English bilinguals. Asian Journal of Social Psychology, 19(1),
55–65. https://doi.org/10.1111/ajsp.12123
Costa, A., Foucart, A., Arnon, I., Aparici, M., & Apesteguia, J. (2014). “Piensa” twice: On the
foreign language effect in decision making. Cognition, 130(2), 236–254.
De Neys, W., & Pennycook, G. (2019). Logic, fast and slow: Advances in dual-process
theorizing. Current Directions in Psychological Science, 28(5), 503–509.
Dube, C., Rotello, C. M., & Heit, E. (2010). Assessing the belief bias effect with ROCs: It’s a
response bias effect. Psychological Review, 117(3), 831–863.
https://doi.org/10.1037/a0019634
Ecker, U. K., Lewandowsky, S., Cook, J., Schmid, P., Fazio, L. K., Brashier, N., Kendeou, P.,
Vraga, E. K., & Amazeen, M. A. (2022). The psychological drivers of misinformation
belief and its resistance to correction. Nature Reviews Psychology, 1(1), 13–29.
Evans, J. S. B. T., & Stanovich, K. E. (2013). Dual-Process Theories of Higher Cognition:
Advancing the Debate. Perspectives on Psychological Science, 8(3), 223–241.
https://doi.org/10.1177/1745691612460685
Fernández-López, M., & Perea, M. (2020). Language does not modulate fake news
credibility, but emotion does. Psicológica Journal, 41(2), 84–102.
https://doi.org/10.2478/psicolj-2020-0005
Frederick, S. (2005). Cognitive reflection and decision making. Journal of Economic
Perspectives, 19(4), 25–42.
Funder, D. C., & Ozer, D. J. (2019). Evaluating Effect Size in Psychological Research: Sense
and Nonsense. Advances in Methods and Practices in Psychological Science, 2(2),
156–168. https://doi.org/10.1177/2515245919847202
Geipel, J., Hadjichristidis, C., & Surian, L. (2015a). How foreign language shapes moral
judgment. Journal of Experimental Social Psychology, 59, 8–17.
Geipel, J., Hadjichristidis, C., & Surian, L. (2015b). The foreign language effect on moral
judgment: The role of emotions and norms. PloS One, 10(7), e0131529.
Green, D. M., & Swets, J. A. (1966). Signal detection theory and psychophysics (Vol. 1).
Wiley.
Grinberg, N., Joseph, K., Friedland, L., Swire-Thompson, B., & Lazer, D. (2019). Fake news
on Twitter during the 2016 U.S. presidential election. Science, 363(6425), 374–378.
https://doi.org/10.1126/science.aau2706
Guess, A., Nagler, J., & Tucker, J. (2019). Less than you think: Prevalence and predictors of
fake news dissemination on Facebook. Science Advances, 5(1), eaau4586.
https://doi.org/10.1126/sciadv.aau4586
Hayakawa, S., Costa, A., Foucart, A., & Keysar, B. (2016). Using a Foreign Language
Changes Our Choices. Trends in Cognitive Sciences, 20(11), 791–793.
Hayakawa, S., Lau, B. K. Y., Holtzmann, S., Costa, A., & Keysar, B. (2019). On the
reliability of the foreign language effect on risk-taking. Quarterly Journal of
Experimental Psychology, 72(1), 29–40. https://doi.org/10.1177/1747021817742242
Keysar, B., Hayakawa, S. L., & An, S. G. (2012). The foreign-language effect: Thinking in a
foreign tongue reduces decision biases. Psychological Science, 23(6), 661–668.
Khaldarova, I., & Pantti, M. (2016). Fake News. Journalism Practice, 10(7), 891–901.
https://doi.org/10.1080/17512786.2016.1163237
Koen, J. D., Barrett, F. S., Harlow, I. M., & Yonelinas, A. P. (2017). The ROC Toolbox: A
toolbox for analyzing receiver-operating characteristics derived from confidence
ratings. Behavior Research Methods, 49(4), 1399–1406.
https://doi.org/10.3758/s13428-016-0796-z
Lazer, D. M. J., Baum, M. A., Benkler, Y., Berinsky, A. J., Greenhill, K. M., Menczer, F.,
Metzger, M. J., Nyhan, B., Pennycook, G., Rothschild, D., Schudson, M., Sloman, S.
A., Sunstein, C. R., Thorson, E. A., Watts, D. J., & Zittrain, J. L. (2018). The science
of fake news. Science, 359(6380), 1094–1096.
https://doi.org/10.1126/science.aao2998
Lee, T. (2019). The global rise of “fake news” and the threat to democratic elections in the
USA. Public Administration and Policy, 22(1), 15–24.
https://doi.org/10.1108/PAP-04-2019-0008
Loewenstein, G. E., & Lerner, J. S. (2003). The role of affect in decision making. In
Handbook of affective sciences. Oxford University Press.
Mækelæ, M., & Pfuhl, G. (2019). Deliberate reasoning is not affected by language. PLOS
ONE, 14(1), e0211428. https://doi.org/10.1371/journal.pone.0211428
Martel, C., Pennycook, G., & Rand, D. G. (2020). Reliance on emotion promotes belief in
fake news. Cognitive Research: Principles and Implications, 5(1), 1–20.
Milczarski, W., Borkowska, A., Paruzel-Czachura, M., & Białek, M. (2022). Using a foreign
language does not make you think more: Null effects of using a foreign language on
cognitive reflection and numeracy. PsyArXiv. https://doi.org/10.31234/osf.io/wbjyc
Montero-Melis, G., Isaksson, P., van Paridon, J., & Ostarek, M. (2020). Does using a foreign
language reduce mental imagery? Cognition, 196, 104134.
https://doi.org/10.1016/j.cognition.2019.104134
Mosleh, M., Pennycook, G., & Rand, D. G. (2021). Field experiments on social media.
Current Directions in Psychological Science, 09637214211054761.
Muda, R., Pennycook, G., Hamerski, D., & Białek, M. (2023). People are worse at detecting
fake news in their foreign language [Data set]. PsyArXiv. https://osf.io/km4eu/
Muda, R., Pieńkosz, D., Francis, K., & Białek, M. (2020). The moral foreign language effect
is stable across presentation modalities. Quarterly Journal of Experimental
Psychology, 174702182093507. https://doi.org/10.1177/1747021820935072
Muda, R., Walker, A. C., Pieńkosz, D., Fugelsang, J. A., & Białek, M. (2020). Foreign
Language does not Affect Gambling-Related Judgments. Journal of Gambling Studies,
36(2), 633–652. https://doi.org/10.1007/s10899-020-09933-6
Niszczota, P., Pawlak, M., & Białek, M. (2022). Bilinguals are less susceptible to the bias
blind spot in their second language. International Journal of Bilingualism,
13670069221110384. https://doi.org/10.1177/13670069221110383
Pennycook, G., Binnendyk, J., Newton, C., & Rand, D. G. (2021). A Practical Guide to Doing
Behavioral Research on Fake News and Misinformation. Collabra: Psychology, 7(1),
25293. https://doi.org/10.1525/collabra.25293
Pennycook, G., Cannon, T. D., & Rand, D. G. (2018). Prior exposure increases perceived
accuracy of fake news. Journal of Experimental Psychology: General, 147(12), 1865–1880.
https://doi.org/10.1037/xge0000465
Pennycook, G., Epstein, Z., Mosleh, M., Arechar, A. A., Eckles, D., & Rand, D. G. (2021).
Shifting attention to accuracy can reduce misinformation online. Nature, 592(7855),
Article 7855. https://doi.org/10.1038/s41586-021-03344-2
Pennycook, G., Fugelsang, J. A., & Koehler, D. J. (2015). Everyday consequences of analytic
thinking. Current Directions in Psychological Science, 24(6), 425–432.
Pennycook, G., McPhetres, J., Zhang, Y., Lu, J. G., & Rand, D. G. (2020). Fighting COVID-19
misinformation on social media: Experimental evidence for a scalable accuracy
nudge intervention. Psychological Science. https://doi.org/10.31234/osf.io/uhbk9
Pennycook, G., & Rand, D. G. (2019). Lazy, not biased: Susceptibility to partisan fake news
is better explained by lack of reasoning than by motivated reasoning. Cognition, 188,
39–50. https://doi.org/10.1016/j.cognition.2018.06.011
Pennycook, G., & Rand, D. G. (2020). Who falls for fake news? The roles of bullshit
receptivity, overclaiming, familiarity, and analytic thinking. Journal of Personality,
88(2), 185–200. https://doi.org/10.1111/jopy.12476
Pennycook, G., & Rand, D. G. (2021a). The psychology of fake news. Trends in Cognitive
Sciences, in press.
Pennycook, G., & Rand, D. G. (2021b). Accuracy prompts are a replicable and generalizable
approach for reducing the spread of misinformation [Preprint]. PsyArXiv.
https://doi.org/10.31234/osf.io/v8ruj
Portes, A., & Schauffler, R. (1994). Language and the Second Generation: Bilingualism
Yesterday and Today. International Migration Review, 28(4), 640–661.
https://doi.org/10.1177/019791839402800402
Romero-Rivas, C., Corey, J. D., Garcia, X., Thierry, G., Martin, C. D., & Costa, A. (2017).
World knowledge and novel information integration during L2 speech
comprehension. Bilingualism: Language and Cognition, 20(3), 576–587.
https://doi.org/10.1017/S1366728915000905
Roozenbeek, J., Freeman, A. L., & van der Linden, S. (2021). How accurate are accuracy-nudge
interventions? A preregistered direct replication of Pennycook et al. (2020).
Psychological Science, 32(7), 1169–1178.
Ross, R. M., Rand, D. G., & Pennycook, G. (2019). Beyond “fake news”: The role of analytic
thinking in the detection of inaccuracy and partisan bias in news headlines [Preprint].
PsyArXiv. https://doi.org/10.31234/osf.io/cgsx6
Sommet, N., Weissman, D. L., Cheutin, N., & Elliot, A. (2022). How many participants do I
need to test an interaction? Conducting an appropriate power analysis and achieving
sufficient power to detect an interaction. PsyArXiv.
https://doi.org/10.31219/osf.io/xhe3u
Stanovich, K. E. (2009). What intelligence tests miss: The psychology of rational thought.
Yale University Press.
Stanovich, K. E. (2018). Miserliness in human cognition: The interaction of detection,
override and mindware. Thinking & Reasoning, 24(4), 423–444.
Swets, J. A. (1988). Measuring the Accuracy of Diagnostic Systems. Science, 240(4857),
1285–1293. https://doi.org/10.1126/science.3287615
Tappin, B. M., Pennycook, G., & Rand, D. G. (2020). Bayesian or biased? Analytic thinking
and political belief updating. Cognition, 204, 104375.
https://doi.org/10.1016/j.cognition.2020.104375
The Jamovi Project. (2021). Jamovi 2.2.5. https://www.jamovi.org/
Thompson, V., & Morsanyi, K. (2012). Analytic thinking: Do you feel like it? Mind &
Society, 11(1), 93–105. https://doi.org/10.1007/s11299-012-0100-6
Trippas, D., Verde, M. F., & Handley, S. J. (2014). Using forced choice to test belief bias in
syllogistic reasoning. Cognition, 133(3), 586–600.
https://doi.org/10.1016/j.cognition.2014.08.009
Vives, M.-L., Aparici, M., & Costa, A. (2018). The limits of the foreign language effect on
decision-making: The case of the outcome bias and the representativeness heuristic.
PLOS ONE, 13(9), e0203528. https://doi.org/10.1371/journal.pone.0203528
Wixted, J. T. (2020). The forgotten history of signal detection theory. Journal of
Experimental Psychology: Learning, Memory, and Cognition, 46(2), 201–233.
Xu, Y., Zhang, Y.-Y., & Liang, Z.-Y. (2021). The foreign-language discount effect: Using
English increases intertemporal discount rates through more distant future perception
[Preprint]. Open Science Framework. https://doi.org/10.31219/osf.io/vrtf4
Yonelinas, A. P. (1994). Receiver-operating characteristics in recognition memory: Evidence
for a dual-process model. Journal of Experimental Psychology: Learning, Memory,
and Cognition, 20(6), 1341–1354. https://doi.org/10.1037/0278-7393.20.6.1341