No Replication, no Trust? How Low Replicability Influences Trust in Psychology
Tobias Wingen
University of Cologne
Jana B. Berkessel
University of Mannheim
Birte Englich
University of Cologne
This paper is currently in press in Social Psychological and Personality Science
(SPPS). The present post-print may not replicate the final version published in this SAGE
journal. It is not the copy of record and may differ from the final version.
Please find the published version at:
https://journals.sagepub.com/doi/10.1177/1948550619877412
Author Note.
Tobias Wingen is a PhD candidate at the University of Cologne. His research focuses
on how novel research methods can contribute to answering classical social-psychological
questions. Moreover, he is interested in the open science movement, replicability, social
hierarchies, and implicit theories.
Jana Berkessel is a PhD candidate at the Mannheim Centre for European Social
Research at the University of Mannheim. Her work revolves around cross-cultural social and
personality psychology, specifically culture fit, social class, and well-being. She is further
interested in the social-psychological aspects of replicability and open science.
Birte Englich is a full professor at the University of Cologne. Her research, which
currently focuses on topics including heuristics and biases, bias correction, social power,
expertise, and indecisiveness, combines an applied perspective with research on the relevant
underlying socio-cognitive processes.
We thank Nicolas Alef, Amelie Conrad, Elisabeth Jackson, and Estella Umbach for
their support with the preparation of materials. We thank Alexandra Fleischmann and Oscar
Lecuona for their helpful comments. Finally, we thank the Graduate School of the Faculty of
Human Sciences at the University of Cologne for providing a travel grant to present this work
at an international conference. We would like to dedicate this paper to the memory and
friendship of Prof. Birte Englich (co-author of this article), who passed away in September
2019, shortly after the manuscript had been accepted for publication. The authors are grateful
for her enduring trust and support, during this research project and way beyond.
Correspondence concerning this article should be addressed to Tobias Wingen,
Applied Social Psychology and Decision Making, Social Cognition Center Cologne,
University of Cologne, Herbert-Lewin-Str. 10, 50931 Köln, Germany. E-Mail:
tobias.wingen@uni-koeln.de
Abstract
In the current psychological debate, low replicability of psychological findings is a
central topic. While the discussion about the replication crisis has a huge impact on
psychological research, we know less about how it impacts public trust in psychology. In this
paper, we examine whether low replicability damages public trust and how this damage can
be repaired. Studies 1, 2 and 3 provide correlational and experimental evidence that low
replicability reduces public trust in psychology. Additionally, Studies 3, 4, and 5 evaluate the
effectiveness of commonly used trust-repair strategies, such as information about increased
transparency (Study 3), explanations for low replicability (Study 4), or recovered replicability
(Study 5). We found no evidence that these strategies significantly repair trust. However, it
remains possible that they have small but potentially meaningful effects, which could be
detected with larger samples. Overall, our studies highlight the importance of replicability for
public trust in psychology.
Keywords: replicability, public trust, replication crisis, open science
No Replication, no Trust? How Low Replicability Influences Trust in Psychology
A trustworthy reputation is crucial for psychology. Researchers in psychology aim to
have societal impact and to inform practitioners and policymakers. Additionally,
psychological research relies on public funding and participation. In turn, when facing
complex scientific questions, decision-makers seek advice from the (psychological) science
community (Bromme & Thomm, 2016). If based on robust evidence, these well-informed
decisions can lead to improved individual and societal outcomes (Ruggeri et al., 2019). Thus,
public trust is not only crucial for psychology itself, but also for the public. What happens to
this trust when psychological findings often fail to replicate?
This is a relevant question since many prominent studies indeed suggest a low
replicability of psychological findings. For example, the Reproducibility Project: Psychology
attempted to replicate 100 psychological studies, and only about one-third to one-half of the original
findings were successfully replicated (Open Science Collaboration, 2015). This low replication rate often
serves as an illustration of a “replication crisis” (e.g., Anderson & Maxwell, 2017).
Low Replicability Might Damage Public Trust in Psychology
Many researchers see crises positively since they play a central role in the
advancement of science and show that science self-corrects (Kuhn, 1970; Vazire, 2018).
Indeed, in the course of the replication crisis, psychology has gone through major changes.
Currently, journals and scientific societies encourage open science practices (e.g. Schönbrodt,
Gollwitzer, & Abele-Brehm, 2017) and many researchers implement open science (Lindsay &
Nosek, 2018). These changes are often seen as a major advancement of psychological science:
Researchers argue that the rate of scientific progress is likely to increase (Vazire, 2018), that
the widespread use of preregistration will increase the interpretability and credibility of
research findings (Nosek, Ebersole, DeHaven, & Mellor, 2018), and that open science will
liberate researchers and foster creativity (Frankenhuis & Nettle, 2018).
Nevertheless, there are reasons to assume that information about low replicability
might damage public trust in psychology. While findings regarding the effects of (scientific)
uncertainty on audience reactions are rather inconclusive (for a review, see van der Bles et al.,
2019), some studies suggest that non-scientists react negatively to scientific uncertainty. For
example, non-scientists who perceive scientific evidence as uncertain further perceive the
corresponding research field as less valuable (Broomell & Kane, 2017). Likewise, even
modest amounts of scientific dissent reduce public support for government policies and lead
to disagreement with the scientific consensus (Aklin & Urpelainen, 2014). Similarly, low
replicability might also result in reputational damage and diminished public trust (Białek,
2018; Chopik, Bremner, Defever, & Keller, 2018; Fanelli, 2018). We test this hypothesis in
the present article.
How can Public Trust be Repaired?
If low replicability damages public trust, an important question for the psychological
science community is if and how this damage can be repaired. We tested the following three
theory-based and commonly used approaches to repair public trust.
Repairing trust through increased transparency. Transparency signals that there is
nothing to hide and thus repairs trust (Bachmann, Gillespie, & Priem, 2015). Indeed, one
major response to the replication crisis is the open science movement (Frankenhuis & Nettle,
2018; LeBel, Campbell, & Loving, 2017). Central aspects of this movement, such as
preregistrations, open data, and open materials, aim to increase the transparency of
psychological research (Miguel et al., 2014; Nosek et al., 2015). Thus, building on the idea
that transparency can repair trust, information about the open science movement might help to
repair public trust in psychology.
Repairing trust through explanations. The causes and responsibilities of a
transgression are often not evident (Bachmann et al., 2015). Explanations of the causes of a
transgression can help to repair trust by establishing a shared understanding of what happened and why
(Bachmann et al., 2015; Dirks, Lewicki, & Zaheer, 2009). If low
replicability violates the public's expectation of reliable published findings, the public may
perceive low replicability as a transgression. In this case, explanations for low replicability
could be an effective trust repair strategy.
Considering the replication crisis, two major explanations emerged. Some scholars
argue that questionable research practices (QRPs) are the main reason for the replication crisis
(e.g. Sijtsma, 2016; Simmons, Nelson, & Simonsohn, 2011). Other scholars attribute failed
replications to hidden moderators and the high context-sensitivity of psychological effects
(e.g. Stroebe & Strack, 2014; Van Bavel, Mende-Siedlecki, Brady, & Reinero, 2016). While
this debate has not been settled, it is an additional open question whether any of those
explanations (QRPs vs. hidden moderators) would repair public trust damaged by low
replicability.
Repairing trust by restoring the status quo. Trust can further be repaired by
restoring the status quo, as norms and expectations are also restored (Dirks et al., 2009).
Before the replication crisis, the majority of replications in psychology journals reported
similar findings to their original studies (Makel, Plucker, & Hegarty, 2012) and it was thus
likely assumed that psychology is highly replicable. To restore this status quo, psychological
science would thus need to achieve high levels of replicability. Indeed, many new
methodological standards in psychology aim at increasing replicability (Cook, Lloyd, Mellor,
Nosek, & Therrien, 2018; Van Bavel et al., 2016). If those standards succeed, the status quo
belief that psychology is a highly replicable science might be restored. Eventually, this
increase in replicability might also lead to a restoration of public trust.
Overview of Studies
We conducted five studies to test whether low replicability damages public trust and if
this damage can be repaired. Study 1 examined whether trust in psychology correlates with
expected replicability. Study 2 experimentally tested whether low replicability causes reduced
trust, which we replicated in Study 3. Moreover, Studies 3 to 5 tested different commonly
used trust-repair strategies: information about increased transparency (Study 3), explanations
for low replicability (Study 4), and information about increased replicability (Study 5).
Participants who took part in one of our studies were not allowed to participate in subsequent
studies. We relied on MTurk workers in all studies, since they are significantly more socio-
economically and ethnically diverse, and presumably less likely to have prior knowledge of
the replication crisis, compared with a student sample (Casler, Bickel, & Hackett, 2013).
We include all studies we conducted, and report all collected variables and all
conditions included in the study designs across all studies. We preregistered all analyses
presented in the manuscript (except for specifically highlighted correlations presented in
Table 1), and we report all preregistered analyses in either the manuscript or the supplemental
materials. We discuss the central preregistered hypotheses when introducing each study. All
analyses with a preregistered hypothesis are accompanied by one-sided p-values. All
participants who completed our studies were included in the analyses except if they met
preregistered exclusion criteria. All materials, data, analysis scripts, and preregistrations are
shared on https://osf.io/9ba28/?view_only=7f2edfc9b5f143beb5f86dfdc657d73d.
Study 1
Study 1 investigated which replication rate non-scientists assume for psychological
studies and whether their expected replication rate correlates with their trust in psychology
and their perceived value of psychological science. We expected positive correlations.
Method
Participants and design. Participants completed a short online study on Amazon’s
Mechanical Turk (MTurk) website for $0.50. The sample size was set to 266, based on an a
priori power analysis for 95% power (one-sided α of .05) to detect a small to moderate effect
of r = .2 that would be typical for similar social-psychological research (Richard, Bond Jr., &
Stokes-Zoota, 2003). The final sample was slightly larger, as is often the case in online studies,
and consisted of 271 participants (54.3% male; age: M = 33.7 years, SD = 8.9). No
participants were excluded from the analyses. A sensitivity analysis showed that our final
sample had a high chance (1 − β = 0.80, one-sided α = 0.05) to detect a correlation of r = .15
and a very high chance (1 − β = 0.95, one-sided α = 0.05) to detect r = .20.
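A power analysis of this form can be sketched in R with the pwr package; the calls below are an illustrative reconstruction (the software actually used for the original planning is not documented here), but they reproduce the figures reported above:

```r
library(pwr)

# A priori sample size for detecting r = .20 with 95% power, one-sided alpha = .05
pwr.r.test(r = 0.20, sig.level = 0.05, power = 0.95, alternative = "greater")

# Sensitivity: smallest correlation detectable with 80% power in the final sample (N = 271)
pwr.r.test(n = 271, sig.level = 0.05, power = 0.80, alternative = "greater")
```

The first call returns a required n of roughly 266, and the second returns r of about .15, closely matching the values reported above.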
Procedure. Participants read a short, jargon-free description of the Reproducibility
Project: Psychology (for details, see
https://osf.io/9ba28/?view_only=7f2edfc9b5f143beb5f86dfdc657d73d). Participants then
guessed how many of these 100 original findings were successfully replicated.
Afterward, participants indicated their trust in psychology with five items (e.g., “I trust
the psychological science community to do what is right”; 1 = strongly disagree, 7 = strongly
agree; α = .90; adapted from Nisbet et al., 2015). Although we conveniently call this measure
“trust in psychology” throughout this manuscript, it is important to note that it was designed
to measure institutional trust in the (psychological) science community (Nisbet et al., 2015).
Alternatively, trust in psychology could, for example, also be conceptualized as trust in
psychological findings (Anvari & Lakens, 2019) or in the scientific methods used by
psychologists. However, for non-scientists, the scientific community might be the most vivid
aspect of psychology. Moreover, prior research showed that this “trust in psychology”
measure is affected by dissonant science communication (Nisbet et al., 2015), so it could be
particularly suitable for capturing potential effects of (expected) low replicability.
Although this measure showed acceptable to excellent reliability in Studies 1-5, it
showed a poor confirmatory model fit across most indices and studies. We believe this is
likely due to the reverse-coding of some items; accounting for it drastically improves the
model fit and does not change the pattern of our results (see supplemental materials).
As an additional dependent variable, we measured participants’ perceived value of
psychological science with four items (e.g., “Please rate the societal benefit of research
produced by psychological science.”; α = .80; 1 = very low, 5 = very high; adapted from
Broomell & Kane, 2017). In all studies, perceived value showed a similar result pattern to
trust in psychology. All preregistered analyses regarding perceived value can be found in the
supplemental material. Finally, participants indicated whether they knew the results of the
Reproducibility Project: Psychology and completed demographics.
Results
Eleven participants (4.0%) said they had heard of the Reproducibility Project:
Psychology, but only one participant reported knowing the results. This participant, however,
indicated an incorrect replication rate of 14 out of 100 studies. On average, participants
estimated that 60.9 out of 100 studies could be replicated (SD = 22.9). Descriptive statistics
for the “trust in psychology”-measure across all studies and conditions are presented in Table
1. As predicted, the higher participants estimated the replication rate, the more they trusted
psychology, r(268) = .329, one-sided p < .001, 95% CI [.218, .431] (see Figure 1). Perceived
value showed similar results to trust in psychology (see supplemental materials).
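The preregistered correlational test can be expressed in a few lines of base R; the column names estimated_rate and trust below are placeholders for illustration, not necessarily the variable names in the shared data set:

```r
# One-sided test of the preregistered positive correlation
cor.test(dat$estimated_rate, dat$trust, alternative = "greater")

# A second, two-sided call yields the conventional 95% confidence interval
# reported in the text (a one-sided call returns a one-sided interval instead)
cor.test(dat$estimated_rate, dat$trust)
```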
Figure 1. Relationship between the estimated replication rate and trust in psychology in Study
1. Histograms show the distribution of each measure.
Study 2
Study 1 provided correlational evidence that expected replicability is related to public
trust in psychology. Building on Study 1, we employed an experimental approach to test
causality. We expected low replicability (compared with high replicability) to reduce trust in
psychology and to reduce the perceived value of psychological science.
Method
Participants and design. Participants completed a short online study on MTurk for
$0.50. We randomly assigned participants to three conditions (low replicability, medium
replicability, high replicability). We set the sample size to 264, based on an a priori power
analysis for 95% power (one-sided α of .05) to find a moderate effect (d = 0.5) that would be
typical for similar social-psychological research (Richard et al., 2003), requiring 88
participants per cell (the same power analysis was applied to Studies 3, 4, and 5). The final
sample consisted of 269 participants (59.9% male; age: M = 34.59 years, SD = 10.74). No
participants were excluded from the analyses. A sensitivity analysis showed that our final
sample had a high chance (1 − β = 0.80, one-sided α = 0.05) to detect a difference of d = 0.37
between the low replicability and any of the two other conditions and a very high chance
(1 − β = 0.95, one-sided α = 0.05) to detect d = 0.50.
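The corresponding group-comparison power calculations can be sketched with the same pwr package; again, this is an illustrative reconstruction rather than the documented planning procedure:

```r
library(pwr)

# A priori: participants per cell for d = 0.50, one-sided alpha = .05, power = .95
pwr.t.test(d = 0.50, sig.level = 0.05, power = 0.95,
           type = "two.sample", alternative = "greater")

# Sensitivity: smallest d detectable with 80% power given the realized cell sizes
pwr.t2n.test(n1 = 88, n2 = 90, sig.level = 0.05, power = 0.80, alternative = "greater")
```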
Procedure. Participants read the same description of the Reproducibility Project:
Psychology as in Study 1. This time, however, participants received information about the
results. Depending on their condition, participants were told that out of the 100 investigated
studies, 39 (low replicability condition), 61 (medium replicability condition) or 83 (high
replicability condition) could be successfully replicated. These values were based on the
estimated replication rates found in Study 1: 61 represents the mean estimated replication rate
in Study 1, and 39 and 83 represent the mean plus/minus one standard deviation in Study 1.
Afterward, participants responded to three text-understanding items and a manipulation check
(“Psychological research is replicable”; 1 = strongly disagree, 7 = strongly agree). Then, they
filled out the five items from Study 1 to measure trust in psychology (α = .92). We also
measured participants’ perceived value of psychological science and various individual
differences as preregistered potential moderators (beliefs about science, error culture, error
attribution style; see supplemental materials for details). Finally, participants completed a
brief demographic questionnaire and were debriefed.
Results
The manipulation check suggested that the manipulation was successful (see
supplemental materials). A one-way analysis of variance revealed significantly different
levels of trust in psychology between the three conditions, F(2, 265) = 4.86, p = .008, η² =
.04, 90% CI [.01, .07]; see Figure 2. Participants in the low replicability condition indicated
significantly lower trust in psychology than participants in the high replicability condition,
t(176) = 3.25, one-sided p < .001, d = 0.49, 95% CI [0.19, 0.79].
Further analyses indicated that participants in the exploratory medium replicability
condition tended to be more trustful than participants in the low replicability condition,
t(176) = 1.48, one-sided p = .070, d = 0.22, 95% CI [-0.07, 0.52], and less trustful than
participants in the high replicability condition, t(176) = 1.59, one-sided p = .057, d = 0.24,
95% CI [-0.06, 0.53], but these differences were not significant. Perceived value showed
similar results to trust in psychology (see supplemental materials).
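The omnibus and pairwise analyses reported here follow a standard pattern that can be reproduced with base R; trust and condition are assumed column names, and Cohen's d is computed from the pooled standard deviation:

```r
# Omnibus test across the three replicability conditions
summary(aov(trust ~ condition, data = dat))

low  <- dat$trust[dat$condition == "low"]
high <- dat$trust[dat$condition == "high"]

# Preregistered directional comparison (pooled-variance t test, one-sided)
t.test(high, low, alternative = "greater", var.equal = TRUE)

# Cohen's d from the pooled standard deviation
sd_pooled <- sqrt(((length(low) - 1) * var(low) + (length(high) - 1) * var(high)) /
                  (length(low) + length(high) - 2))
(mean(high) - mean(low)) / sd_pooled
```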
Figure 2. Pirate plot (Phillips, 2017) showing trust in psychology in the different replicability
conditions in Study 2. The black dots represent the raw data, which are shown with smoothed
densities indicating the distribution in each condition. The central tendency is the mean, and
the intervals represent two standard errors around the mean.
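A pirate plot of this kind can be produced with a single call to the yarrr package (Phillips, 2017); the column names below are assumed for illustration:

```r
library(yarrr)

# Raw data, smoothed densities, means, and inference bands in one call
pirateplot(formula = trust ~ condition, data = dat)
```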
Study 3
As expected, Studies 1 and 2 provided evidence that low replicability, compared with
high replicability, reduces trust in psychology. In Study 3, we replicated the trust-damaging
effect of low replicability and tested whether informing participants about the open science
movement and increased transparency of psychological science would repair trust damaged
by low replicability. We expected low replicability (compared with high replicability) to
reduce trust in psychology (as in Study 2). Crucially, we expected information about
increased transparency to repair public trust, compared with information about low
replicability only.
Method
Participants and design. Three hundred and four participants were recruited to
complete a short online study on MTurk for $0.60 each. Compared with Study 2, we
increased the target sample size to 300 to compensate for potential exclusions. Indeed, seven
participants were excluded for meeting the preregistered exclusion criteria (failing more than
one text-understanding question). We randomly assigned participants to three conditions
(low replicability, low replicability but transparency, high replicability). The final sample
consisted of 297 participants (56.9% male; age: M = 35.7 years, SD = 11.6). A sensitivity
analysis showed that our final sample had a high chance (1 − β = 0.80, one-sided α = 0.05) to
detect a difference of d = 0.36 between the low replicability and the low replicability but
transparency condition and a very high chance (1 − β = 0.95, one-sided α = 0.05) to detect d =
0.47.
Procedure. Participants read the same description of the Reproducibility Project:
Psychology as in Study 2. Once again, participants were told that out of the investigated 100
studies, 39 (low replicability condition) or 83 (high replicability condition) could be
successfully replicated. In a third condition (low replicability but transparency condition),
participants were also told that 39 studies could be replicated, but that psychology had since
become much more open and transparent. This jargon-free information described
major aspects of the open science movement, including preregistration, open data, and open
materials, and highlighted that those measures contribute to increased transparency (for
details see https://osf.io/9ba28/?view_only=7f2edfc9b5f143beb5f86dfdc657d73d).
Afterward, participants responded to three text-understanding items and two
manipulation checks (“Psychological research is replicable.”; “Psychological research is
transparent.”; 1 = strongly disagree, 7 = strongly agree). Then, they filled out the five items
from Studies 1 and 2 to measure their trust in psychology (α = .86). Participants also
completed a brief demographic questionnaire and were debriefed.
Results
The manipulation checks suggested that the manipulations were successful (see
supplemental materials).
As predicted, and replicating our prior findings, participants in the low replicability
condition indicated a significantly lower trust in psychology than participants in the high
replicability condition, t(196) = 3.36, one-sided p < .001, d = 0.48, 95% CI [0.19, 0.76]; see
Figure 3. However, contrary to our prediction, participants in the low replicability but
transparency condition did not indicate a significantly higher trust in psychology than
participants in the low replicability condition, t(194) = 0.74, one-sided p = .231, d = 0.11,
95% CI [-0.18, 0.39].
Finally, participants in the low replicability but transparency condition indicated a
significantly lower trust in psychology compared with participants in the high replicability
condition, t(192) = 2.60, p = .010, d = 0.37, 95% CI [0.09, 0.66].
Figure 3. Pirate plot showing trust in psychology in the different replicability and
transparency conditions in Study 3.
Study 4
Study 3 found no evidence that increased transparency can repair trust. While this
approach focused on the consequences of the replication crisis, another approach to repair
public trust might be to explain the causes of low replicability. Thus, in Study 4, we tested the
effectiveness of the two most common explanatory strategies: hidden moderators and QRPs.
We expected the hidden moderator explanation to lead to a higher trust in psychology than the
QRPs explanation. However, given the limited effectiveness of our trust-repair strategy in Study
3, we had no clear hypotheses about whether either explanation would be able to repair trust
compared with providing no explanation, so we preregistered these analyses as exploratory.
Method
Participants and design. Three hundred and three participants were recruited to
complete a short online study on MTurk for $0.60. Twenty participants were excluded for meeting
the preregistered exclusion criteria (failing more than one text-understanding question). We
randomly assigned participants to three conditions (low replicability condition, hidden
moderator condition, QRPs condition). The final sample consisted of 283 participants (55.5%
male; age: M = 36.5 years, SD = 12.0). Sensitivity analyses showed that our final sample had
a high chance (1 − β = 0.80, α = 0.05) to detect a difference of d = 0.41 between the low
replicability and any of the two explanation conditions and a very high chance (1 − β = 0.95,
α = 0.05) to detect d = 0.53.
Procedure. Participants read the same description of the Reproducibility Project:
Psychology as in Studies 2 and 3. All participants were told that out of the 100 investigated
studies, 39 were successfully replicated. Depending on their condition, participants received
no explanation (low replicability condition), an explanation stating that QRPs caused the low
replication rate (QRPs condition) or an explanation stating that hidden moderators caused the
low replication rate (hidden moderator condition). In the QRPs condition, participants read
that researchers “primarily look for new and spectacular results which can lead to bad
research practices, for example repeating an experiment until a surprising effect emerges.
Often researchers only publish the spectacular results, while less spectacular but potentially
more reliable results end up in a drawer somewhere.” In contrast, participants in the
hidden moderator condition learned that: “When studying humans, unknown or hidden factors
such as individual differences between participants, participants' current state, or minimal
differences in the experimental procedure can affect the results. It is very difficult to always
have absolute control over these conditions and keep possible influencing factors constant.”
(for details see https://osf.io/9ba28/?view_only=7f2edfc9b5f143beb5f86dfdc657d73d).
Afterward, participants responded to three text-understanding items and two manipulation
checks (1. “Unknown or hidden factors explain the low replication rate”, 2. “Questionable
research practices explain the low replication rate”; 1 = strongly disagree, 7 = strongly agree).
Participants did not answer the manipulation checks in the low replicability condition, to avoid
highlighting these explanations to control participants. Then, participants filled out the same
five items used in Studies 1 - 3 to measure their trust in psychology (α = .80). Participants
also completed a brief demographic questionnaire and were debriefed.
Results
Manipulation checks indicated that the manipulation was successful (see supplemental
materials).
Participants in the QRPs condition showed significantly lower trust in psychology
than participants in the hidden moderator condition, t(185) = 2.11, one-sided p = .018, d =
0.31, 95% CI [0.02, 0.60]; see Figure 4. However, the low replicability condition, which
served as a control condition, did not differ significantly from the hidden moderator
condition, t(188) = 0.20, p = .839, d = 0.03, 95% CI [-0.27, 0.32], nor from the QRPs
condition, t(183) = -1.68, p = .094, d = 0.25, 95% CI [0.04, 0.54], although trust in the QRPs
condition was descriptively even lower than in the low replicability condition.
Figure 4. Pirate plot showing trust in psychology in the different explanation conditions in
Study 4.
Study 5
Neither increased transparency (Study 3) nor explanations (Study 4) significantly
repaired trust. One reason for this might be that we did not provide information about both
the causes of and adequate solutions to the crisis in one study. If non-scientists intuitively do not
believe that nontransparent practices (e.g., QRPs) cause low replicability, increasing
transparency would not be a sensible response to low replicability. Thus, it might be
necessary to inform non-scientists about both QRPs as a cause of low replicability and
transparency as an adequate solution. Whereas a QRP explanation on its own might even
damage public trust (see Study 4), such an explanation could be especially effective when
combined with information about increased transparency.
Moreover, we did not provide information about whether increased transparency was
indeed effective in increasing replicability. Thus, we conducted Study 5 to address these
concerns. In this final study, we tested whether public trust can be repaired by providing
participants with both, information about the causes of, and adequate solutions for low
replicability, and by informing them that these solutions successfully restored high
replicability. We expected successfully restored replicability to lead to increased trust in
psychology.
Method
Participants and design. Three hundred and four participants were recruited to
complete a short online study on MTurk for $0.50 each. We again used an increased target
sample size of 300 to compensate for potential exclusions. Twenty-six participants were excluded
for meeting the preregistered exclusion criteria (failing more than one text-understanding
question). We randomly assigned participants to three conditions (low replicability
condition, “now high” replicability condition, “still low” replicability condition). The final
sample consisted of 278 participants (64.7% male; age: M = 33.8 years, SD = 11.0).
Sensitivity analyses showed that our final sample had a high chance (1 − β = 0.80, α = 0.05) to
detect a difference of d = 0.36 between the low replicability and any of the two other
conditions and a very high chance (1 − β = 0.95, α = 0.05) to detect d = 0.48.
Procedure. Participants read the same description of the Reproducibility Project:
Psychology as in Studies 2, 3, and 4, and additionally learned that the Reproducibility Project:
Psychology was published in 2015. All participants read that out of the 100 investigated
studies, 39 were successfully replicated. In the low replicability condition, participants
received no further information. In the “still low” replicability and “now high” replicability
conditions, participants received an explanation that QRPs caused the low replication rate, but
that this issue was now addressed through the open science movement and the increased
transparency of psychological science. In the “still low” condition, which served as an
additional control group, participants learned that these measures were not successful.
Concretely, they were informed that an (alleged) new systematic replication project in 2018
revealed that out of 100 studies conducted under the new transparency guidelines, still only 41
could be successfully replicated. In contrast, in the “now high” replicability condition,
participants learned that those measures were very successful since the alleged new
replication project in 2018 revealed that now 83 out of 100 recent studies could be
successfully replicated (for details see
https://osf.io/9ba28/?view_only=7f2edfc9b5f143beb5f86dfdc657d73d). Afterward,
participants responded to three text-understanding items and to the manipulation check
(“Psychological research is now more replicable”). Participants did not fill out the
manipulation check in the low replicability condition, which received no information about
the change in replicability. Then, participants answered the five items from Studies 1 - 4 to
measure their trust in psychology (α = .73). Participants also completed a brief demographic
questionnaire and were debriefed.
Results
According to our manipulation checks, the manipulation was successful (see
supplemental materials).
Participants in the “now high” replicability condition did not show significantly higher
trust in psychology than participants in the “still low” replicability condition, t(178) = 1.29,
one-sided p = .099, d = 0.19, 95% CI [-0.10, 0.49], or participants in the low replicability
condition (M = 4.35, SD = 1.30), t(186) = 1.04, one-sided p = .149, d = 0.15, 95% CI [-0.14,
0.44]; see Figure 5.
Figure 5. Pirate plot showing trust in psychology in the different replicability conditions in
Study 5.
Table 1
Descriptive Statistics for the “Trust in Psychology” Measure Across Studies

Study / Condition             Sample Sizea   Mean   Standard Deviation   Correlation with Replicabilityb
Study 1                       270            5.06   1.27                 r(268) = .329***
Study 2                       268            4.94   1.36
  1. Low Replicability        88             4.63   1.34                 r(86) = .205
  2. Medium Replicability     90             4.94   1.44                 r(88) = .424***
  3. High Replicability       90             5.26   1.24                 r(88) = .239*
Study 3                       294            4.91   1.32
  1. Low Replicability        100            4.66   1.36                 r(98) = .218*
  2. Transparency             96             4.80   1.33                 r(94) = .295**
  3. High Replicability       98             5.27   1.20                 r(96) = .373***
Study 4                       281            4.67   1.21
  1. Low Replicability        94             4.76   1.34                 -
  2. Hidden Moderators        96             4.80   1.09                 -
  3. QRPs                     91             4.45   1.16                 -
Study 5                       278            4.40   1.20
  1. Low Replicability        98             4.35   1.30                 -
  2. Still Low Replicability  90             4.32   1.14                 -
  3. Now High Replicability   90             4.54   1.14                 -

Note. * p < .05, ** p < .01, *** p < .001.
aNumber of participants who completed the “trust in psychology” measure.
bIn Study 1, this correlation refers to the preregistered correlation of the “trust in
psychology” measure (ranging from 1 to 7) with the estimated replication rate. In Studies 2
and 3, it refers to the non-preregistered correlation with the manipulation check
“Psychological research is replicable”. This manipulation check was not administered in
Studies 4 and 5.
General Discussion
Our results show that concerns about reduced public trust in light of the replication
crisis are justified. Across three studies (Studies 1 to 3), we found correlational and experimental
evidence that low replicability reduces trust in psychology. Studies 1 and 2 suggest that not
only public trust but also the perceived value of psychological science is damaged by low
replicability. Moreover, Studies 3 to 5 found no evidence that commonly used trust-repair
strategies significantly repair this damaged trust in psychology.
So does low replicability damage public trust beyond repair? Although sensitivity
analyses showed that it is unlikely that the tested strategies have large trust-repairing effects,
they also suggest that we did not have sufficient power to rule out small but potentially meaningful
effects, which could only be detected with larger samples (equivalence tests and Bayes factors
in line with this argumentation are presented in the supplemental materials). Our findings thus
do not allow us to conclude that the tested strategies are certainly ineffective. However, given
the non-significant observed effects of the trust-repair strategies, our findings also do not provide
evidence that the tested strategies improve trust in psychology.
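To illustrate, an equivalence test for one trust-repair contrast can be built from two one-sided t tests against a smallest effect size of interest, and a default Bayes factor can be obtained with the BayesFactor package; the vectors repair and control and the bound of d = 0.3 below are placeholders for illustration, not necessarily the bounds used in our supplemental analyses:

```r
# Equivalence bounds expressed in raw scale units from a smallest effect of interest of d = 0.3
sesoi_d   <- 0.3
sd_pooled <- sqrt(((length(repair) - 1) * var(repair) + (length(control) - 1) * var(control)) /
                  (length(repair) + length(control) - 2))
bound <- sesoi_d * sd_pooled

# TOST: both one-sided tests must be significant to conclude equivalence
t.test(repair, control, mu = -bound, alternative = "greater", var.equal = TRUE)
t.test(repair, control, mu =  bound, alternative = "less",    var.equal = TRUE)

# Default Bayes factor for the same contrast
library(BayesFactor)
ttestBF(x = repair, y = control)
```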
Hence, the critical question is: What should psychological researchers do if they
encounter low replicability? Considering that replication studies have limitations and that
there is often no consensus about their interpretation (Gilbert, King, Pettigrew, & Wilson,
2016), one could potentially argue that psychologists should avoid informing the public about
low replicability. However, this non-transparent approach would be ethically problematic and
would violate, for example, the APA Ethics Code (see APA, 2017, pp. 3–4). Moreover, failed
attempts to cover up problematic research findings might reduce public trust even more
(Leiserowitz, Maibach, Roser-Renouf, Smith, & Dawson, 2013). Therefore, covering up low
replicability is neither an ethical nor an effective way to handle the problem.
A more promising approach to maintaining public trust might be to substantially
improve the replicability of psychological research findings. Although Study 5 remains
inconclusive about whether this is an effective strategy to repair public trust directly after
a replication crisis, Studies 1 to 3 provide evidence that high replicability in the first place
results in greater trust in psychology. Thus, if replicability is constantly high, public trust in
psychology might rise. Currently, there is considerable debate about whether constantly high
replicability is a worthwhile goal for psychological science. For example, Baumeister (2016)
discussed whether a strong focus on replicability could potentially reduce the likelihood of
discoveries and the progress and influence of the field. Moreover, scholars debate whether
conducting direct replications is meaningful at all (Stroebe & Strack, 2014; cf. Simons,
2014). Although we do not directly speak to these arguments, our work suggests that the
debate should also consider the reputational benefits associated with high replicability.
However, it is important to note that we communicated information about low
replicability in the form of very short texts, inspired by brief news reports. Potentially, an in-
depth explanation of the replication crisis and the open science movement might lead to less
negative, or even positive, audience reactions. This is especially likely for highly science-
interested audiences that would be willing to engage with such a detailed explanation.
Indeed, recent research suggests less negative consequences in such a situation: After a 1-hr
lecture on the replication crisis, psychology students' attitudes toward psychology remained
relatively stable (Chopik et al., 2018).
Moreover, we conceptualized trust in psychology as trust in the psychological science
community. Trust in psychology could, however, also refer to trust in psychological findings.
Since low replicability typically refers to past findings, it seems possible that low replicability
of past findings does not necessarily damage trust in future findings (Anvari & Lakens, 2019).
Likewise, it is possible that the damaged trust in the psychological science community does
not generalize to future generations of psychological researchers educated under new, more
rigorous methodological guidelines.
Overall, our studies highlight the crucial importance of replicability for public trust in
psychology. Thus, the immense effort of the psychological science community to increase
replicability is not only scientifically important but also highly relevant to psychology’s
public reputation. This is especially important in the current political climate, where the
credibility of scientific evidence is questioned and science is threatened by defunding (Fanelli,
2018; Yong, 2017).
References
Aklin, M., & Urpelainen, J. (2014). Perceptions of scientific dissent undermine public support
for environmental policy. Environmental Science & Policy, 38, 173–177.
https://doi.org/10.1016/j.envsci.2013.10.006
Anderson, S. F., & Maxwell, S. E. (2017). Addressing the “Replication Crisis”: Using
original studies to design replication studies with appropriate statistical power.
Multivariate Behavioral Research, 52, 305–324.
https://doi.org/10.1080/00273171.2017.1289361
Anvari, F., & Lakens, D. (2019, September 9). The replicability crisis and public trust in
psychological science. https://doi.org/10.31234/osf.io/vtmpc
APA. (2017). Ethical principles of psychologists and code of conduct. Washington, DC.
Bachmann, R., Gillespie, N., & Priem, R. (2015). Repairing trust in organizations and
institutions: Toward a conceptual framework. Organization Studies, 36, 1123–1142.
Baumeister, R. F. (2016). Charting the future of social psychology on stormy seas: Winners,
losers, and recommendations. Journal of Experimental Social Psychology, 66, 153–158.
https://doi.org/10.1016/j.jesp.2016.02.003
Białek, M. (2018). Replications can cause distorted belief in scientific progress: BBS
commentary by Białek on Zwaan et al. PsyArXiv.
https://doi.org/10.31234/osf.io/8a4h6
Bromme, R., & Thomm, E. (2016). Knowing who knows: Laypersons’ capabilities to judge
experts’ pertinence for science topics. Cognitive Science, 40, 241–252.
Broomell, S. B., & Kane, P. B. (2017). Public perception and communication of scientific
uncertainty. Journal of Experimental Psychology: General, 146, 286–304.
https://doi.org/10.1037/xge0000260
Casler, K., Bickel, L., & Hackett, E. (2013). Separate but equal? A comparison of participants
and data gathered via Amazon’s MTurk, social media, and face-to-face behavioral
testing. Computers in Human Behavior, 29, 2156–2160.
Chopik, W. J., Bremner, R. H., Defever, A. M., & Keller, V. N. (2018). How (and whether) to
teach undergraduates about the replication crisis in psychological science. Teaching of
Psychology, 45, 158–163.
Cook, B. G., Lloyd, J. W., Mellor, D., Nosek, B. A., & Therrien, W. J. (2018). Promoting
open science to increase the trustworthiness of evidence in special education.
Exceptional Children, 85, 104–118.
Dirks, K. T., Lewicki, R. J., & Zaheer, A. (2009). Repairing relationships within and between
organizations: Building a conceptual foundation. The Academy of Management
Review, 34, 68–84.
Fanelli, D. (2018). Opinion: Is science really facing a reproducibility crisis, and do we need it
to? Proceedings of the National Academy of Sciences, 115, 2628–2631.
Frankenhuis, W. E., & Nettle, D. (2018). Open science is liberating and can foster creativity.
Perspectives on Psychological Science, 13, 439–447.
Gilbert, D. T., King, G., Pettigrew, S., & Wilson, T. D. (2016). Comment on “Estimating the
reproducibility of psychological science.” Science, 351, 1037.
Kuhn, T. S. (1970). The structure of scientific revolutions (2nd ed., enlarged). Chicago: University
of Chicago Press.
LeBel, E. P., Campbell, L., & Loving, T. J. (2017). Benefits of open and high-powered
research outweigh costs. Journal of Personality and Social Psychology, 113, 230–243.
https://doi.org/10.1037/pspi0000049
Leiserowitz, A. A., Maibach, E. W., Roser-Renouf, C., Smith, N., & Dawson, E. (2013).
Climategate, public opinion, and the loss of trust. American Behavioral Scientist, 57,
818–837.
Lindsay, D. S., & Nosek, B. A. (2018). Preregistration becoming the norm in psychological
science. APS Observer, 31. Retrieved from
https://www.psychologicalscience.org/observer/preregistration-becoming-the-norm-
in-psychological-science
Makel, M. C., Plucker, J. A., & Hegarty, B. (2012). Replications in psychology research: How
often do they really occur? Perspectives on Psychological Science, 7, 537–542.
https://doi.org/10.1177/1745691612460688
Miguel, E., Camerer, C., Casey, K., Cohen, J., Esterling, K. M., Gerber, A., … Imbens, G.
(2014). Promoting transparency in social science research. Science, 343, 30–31.
Nisbet, E. C., Cooper, K. E., & Garrett, R. K. (2015). The partisan brain: How dissonant
science messages lead conservatives and liberals to (dis)trust science. The ANNALS of
the American Academy of Political and Social Science, 658, 36–66.
https://doi.org/10.1177/0002716214555474
Nosek, B. A., Alter, G., Banks, G. C., Borsboom, D., Bowman, S. D., Breckler, S. J., …
Christensen, G. (2015). Promoting an open research culture. Science, 348, 1422–1425.
Nosek, B. A., Ebersole, C. R., DeHaven, A. C., & Mellor, D. T. (2018). The preregistration
revolution. Proceedings of the National Academy of Sciences, 115, 2600–2606.
https://doi.org/10.1073/pnas.1708274114
Open Science Collaboration. (2015). Estimating the reproducibility of psychological science.
Science, 349, aac4716. https://doi.org/10.1126/science.aac4716
Phillips, N. (2017). yarrr: A companion to the e-book “YaRrr!: The pirate’s guide to R.”
Retrieved from https://CRAN.R-project.org/package=yarrr
Richard, F. D., Bond Jr., C. F., & Stokes-Zoota, J. J. (2003). One hundred years of social
psychology quantitatively described. Review of General Psychology, 7, 331–363.
Ruggeri, K., Ojinaga-Alfageme, O., Benzerga, A., Berkessel, J., Hlavová, R., Kunz, M., …
Sampat, B. (2019). Evidence-based policy. In Behavioral Insights for Public Policy:
Concepts and Cases (pp. 17–40). London: Routledge.
Schönbrodt, F., Gollwitzer, M., & Abele-Brehm, A. (2017). Data management in
psychological science: Specification of the DFG guidelines.
Sijtsma, K. (2016). Playing with data – Or how to discourage questionable research practices
and stimulate researchers to do things right. Psychometrika, 81, 1–15.
https://doi.org/10.1007/s11336-015-9446-0
Simmons, J. P., Nelson, L. D., & Simonsohn, U. (2011). False-positive psychology:
Undisclosed flexibility in data collection and analysis allows presenting anything as
significant. Psychological Science, 22, 1359–1366.
https://doi.org/10.1177/0956797611417632
Simons, D. J. (2014). The value of direct replication. Perspectives on Psychological Science,
9(1), 76–80.
Stroebe, W., & Strack, F. (2014). The alleged crisis and the illusion of exact replication.
Perspectives on Psychological Science, 9, 59–71.
Van Bavel, J. J., Mende-Siedlecki, P., Brady, W. J., & Reinero, D. A. (2016). Contextual
sensitivity in scientific reproducibility. Proceedings of the National Academy of
Sciences, 113, 6454–6459. https://doi.org/10.1073/pnas.1521897113
van der Bles, A. M., van der Linden, S., Freeman, A. L., Mitchell, J., Galvao, A. B., Zaval, L.,
& Spiegelhalter, D. J. (2019). Communicating uncertainty about facts, numbers and
science. Royal Society Open Science, 6(5), 181870.
Vazire, S. (2018). Implications of the credibility revolution for productivity, creativity, and
progress. Perspectives on Psychological Science, 13, 411–417.
Yong, E. (2017). How the GOP could use science’s reform movement against it. The Atlantic.
Retrieved from https://www.theatlantic.com/science/archive/2017/04/reproducibility-
science-open-judoflip/521952/
Supplemental Materials for
No Replication, no Trust? How Low Replicability Influences Trust in Psychology
Analyses with Additional Variables (Perceived Value of Psychological Science
and Potential Moderators)
Study 1
Additional variables and analyses. We measured participants’ perceived value of
psychological science as an additional dependent variable with four items (e.g., “Please rate
the societal benefit of research produced by psychological science.”; α = .80; 1 = very low, 5
= very high; adapted from Broomell & Kane, 2017). The estimated replication rate was significantly
significantly correlated with the perceived value of psychological science, r(267) = .309, 95%
CI [0.20, 0.41], one-sided p < .001. The results regarding the perceived value of psychological
science were thus parallel to our results regarding the public trust in the psychological science
community.
Study 2
Additional variables. The same four items from Study 1 were used to measure the
perceived value of psychological science (α = .83) as an additional dependent variable. To
explore potential moderators, participants responded to three items to measure beliefs about
whether science is an absolute truth or a debate (Rabinovich & Morton, 2012; α = .72; e.g.,
“There may be more than one correct answer to most scientific questions.”), and finally two
subscales of the error orientation questionnaire (Rybowiak, Garst, Frese, & Batinic, 1999),
namely “learning from errors” (α = .92; e.g., “My mistakes help me to improve my work.”)
and “error communication” (α = .82; e.g., “When I make a mistake, I tell others about it in
order that they do not make the same mistake.”). We also created two items to measure
internal (“Mistakes are often caused by internal factors, such as lack of skill or effort.”) vs.
external error attribution style (“Mistakes are often caused by external factors, such as other
persons or circumstances”). However, these self-created items seem to lack validity, as they
correlated positively with each other, r(264) = .18, p = .003. Agreement to all items, except
for the perceived value of psychological science, was measured on a scale from 1 = “very
low” to 7 = “very high”.
Additional analyses. Regarding the perceived value of psychological science, a one-
way analysis of variance revealed a significant difference between the three conditions, F(2,
264) = 3.04, p = .049, η² = .02, 95% CI [0.0003, 0.41].
Participants in the low replicability condition indicated a significantly lower perceived
value of psychological science (M = 3.36, SD = 0.85) than participants in the high
replicability condition (M = 3.66, SD = 0.72), t(175) = 2.51, one-sided p = .006, d = 0.38,
95% CI [0.08, 0.68]. The results regarding the perceived value of psychological science were
thus again parallel to our results regarding the public trust in the psychological science
community.
Further analyses indicated that participants in the medium replicability condition (M =
3.45, SD = 0.90) perceived psychological science as not significantly more valuable than
participants in the low replicability condition, t(175) = 0.67, one-sided p = .25, d = 0.10, 95%
CI [-.20, 0.40], and as less valuable than participants in the high replicability condition, t(178)
= 1.72, one-sided p = .043, d = 0.26, 95% CI [-0.04, 0.55].
We conducted multiple linear regression analyses to test whether any of our potential
moderator variables moderated the relationship between replicability (low vs. high) and
public trust. However, neither participants’ centered beliefs about science, B = -0.10, t(173) =
0.74, p = .458, nor their centered error attribution style, B = -0.02, t(173) = 0.20, p = .839,
nor their centered score on the learning from errors-subscale, B = -0.28, t(172) = 1.55, p = .122,
nor their centered score on the error communication-subscale, B = -0.23, t(173) = 1.31, p = .193,
significantly moderated this relationship.
The relationship between replicability (low vs. high) and the perceived value of
psychological science was also not significantly moderated by participants’ centered beliefs
about science, B = -0.16, t(173) = 1.94, p = .054, their centered error attribution style, B =
0.06, t(173) = 0.92, p = .359, their centered score on the learning from errors-subscale, B =
0.04, t(172) = 0.31, p = .756, or their centered score on the error communication-subscale, B =
0.06, t(173) = 0.55, p = .582.
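Each of these moderation tests follows the same template: the moderator is mean-centered and interacted with the (low vs. high) condition factor. A minimal sketch with assumed column names is:

```r
# Mean-center the moderator and test its interaction with condition
dat$mod_c <- dat$beliefs_science - mean(dat$beliefs_science, na.rm = TRUE)
summary(lm(trust ~ condition * mod_c, data = dat))
# The coefficient of the condition:mod_c term is the moderation effect (B) reported above
```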
Manipulation Checks
Study 2.
Participants in the low replicability condition indicated a significantly lower
agreement to the manipulation check “Psychological research is replicable” (M = 5.02, SD =
1.69) than participants in the high replicability condition (M = 5.72, SD = 1.08), t(177) = 3.30,
p < .001, d = 0.49, 95% CI [0.19, 0.79], so we deemed our manipulation successful.
In the exploratory medium replicability condition participants (M = 4.99, SD = 1.34)
also agreed less to the manipulation check than in the high replicability condition, t(178) =
4.03, one-sided p < .001, d = 0.60, 95% CI [0.30, 0.90]. However, the medium and the low
replicability condition did not differ significantly, t(177) = 0.15, one-sided p = .558, d = 0.02,
95% CI [-0.27, 0.32].
Study 3.
Participants in the high replicability condition indicated a significantly higher
agreement to the manipulation check “psychological research is replicable” (M = 5.78, SD =
1.00) than participants in the low replicability condition (M = 4.92, SD = 1.59), t(198) = 4.55,
one-sided p < .001, d = 0.64, 95% CI [0.36, 0.93], and participants in the low replicability but
transparency condition (M = 5.04, SD = 1.55), t(194) = 3.96, one-sided p < .001, d = 0.57,
95% CI [0.28, 0.85]. Participants in the low replicability but transparency condition indicated
a significantly higher agreement to the manipulation check “psychological research is
transparent” (M = 5.05, SD = 1.70) than participants in the low replicability condition (M =
4.59, SD = 1.57), t(196) = 1.97, one-sided p = .025, d = 0.28, 95% CI [-0.002, 0.56].
Interestingly, a non-preregistered t-test showed that participants in the low replicability but
transparency condition did not differ in their agreement to the transparency manipulation
check compared to participants in the high replicability condition (M = 4.91, SD = 1.39),
t(194) = 0.64, p = .521, d = 0.09, 95% CI [-0.19, 0.37].
Study 4.
Participants in the QRPs condition indicated a significantly lower agreement to the
manipulation check “Unknown or hidden factors explain the low replication rate.” (M = 4.14,
SD = 1.96) than participants in the hidden moderator condition (M = 5.94, SD = 1.28), t(186)
= 7.46, one-sided p < .001, d = 1.09, 95% CI [0.78, 1.40]. In contrast, participants in the
QRPs condition indicated a significantly higher agreement to the manipulation check
“Questionable research practices explain the low replication rate.” (M = 5.95, SD = 1.37) than
participants in the hidden moderator condition (M = 3.30, SD = 1.96), t(186) = 10.68, one-
sided p < .001, d = 1.56, 95% CI [1.23, 1.89], indicating that our manipulations were
successful.
Study 5.
Participants in the “still low” condition indicated a significantly lower agreement to
the statement “Psychological research is now more replicable” (M = 4.39, SD = 2.05) than
participants in the “now high” condition (M = 6.02, SD = 0.97), t(178) = 6.82, one-sided p <
.001, d = 1.02, 95% CI [0.70, 1.33], indicating that our manipulation was successful.
Psychometric Properties of the Trust in Psychology Scale
In all studies, we measured participants’ trust in psychology by using a scale for
institutional trust in the psychological science community, which contains five items (e.g., “I
trust the psychological science community to do what is right”; 1 = strongly disagree, 7 =
strongly agree; adapted from Nisbet, Cooper, & Garrett, 2015). Importantly, three of the items
are reverse coded (i.e., measure distrust in the psychological science community), while only
two of the items are positively coded. The scale showed acceptable to excellent reliability
across all studies (all αs ≥ .73). However, a closer inspection of the fit indices in a
confirmatory factor analysis (CFA) using the R package lavaan (Rosseel, 2012) showed poor
fit of a simple one-factor solution (see Table 1).
To address this issue, we examined modification indices (MI) and expected parameter
changes (EPC) to gain insight into how to improve the low fit. Across all five studies, the
highest modification indices were found for the correlation of the residual variances of the
second and third item of the scale (Study 1, MI = 20.0, EPC = 0.249; Study 2, MI = 50.7,
EPC = 0.440; Study 3, MI = 137.5, EPC = 1.06; Study 4, MI = 126.9, EPC = 1.09; Study 5,
MI = 140.5, EPC = 1.31). A closer inspection revealed that these two items indeed have
something in common that is not captured by the latent variable, namely that both are coded
positively while all other items are reverse-coded. To improve the model fit, we thus
modified the model to allow the residual variances of the second and third item to be
correlated. This dramatically improved the model fit, especially in Studies 3–5, as displayed
in Table 1.
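For illustration, the following sketch shows how such a respecification might be carried out in lavaan. The item names (trust1–trust5) and the data frame name (dat) are hypothetical placeholders, not the variable names in our data files; the code is a minimal sketch of the procedure rather than our exact analysis script.

```r
library(lavaan)

# One-factor model for the five trust items (item names are placeholders)
model_simple <- '
  trust =~ trust1 + trust2 + trust3 + trust4 + trust5
'
fit_simple <- cfa(model_simple, data = dat)
fitMeasures(fit_simple, c("cfi", "tli", "rmsea", "srmr"))

# Inspect modification indices; the largest MI pointed to the residual
# correlation between the two positively coded items (items 2 and 3)
modindices(fit_simple, sort. = TRUE)[1:5, ]

# Modified model: allow the residuals of items 2 and 3 to correlate
model_modified <- '
  trust =~ trust1 + trust2 + trust3 + trust4 + trust5
  trust2 ~~ trust3
'
fit_modified <- cfa(model_modified, data = dat)
fitMeasures(fit_modified, c("cfi", "tli", "rmsea", "srmr"))
```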
Table 1.
Fit indices for a simple one-factor solution and for the modified model across studies.

Study     Model        CFI    TLI    RMSEA   SRMR   Lowest standardized loading
Study 1   One-factor   .97    .94    .14     .03    .73
Study 1   Modified     .99    .98    .08     .02    .75
Study 2   One-factor   .94    .87    .22     .05    .72
Study 2   Modified     .99    .97    .11     .02    .68
Study 3   One-factor   .79    .58    .35     .13    .46
Study 3   Modified     .99    .97    .09     .03    .42
Study 4   One-factor   .78    .55    .35     .16    .28
Study 4   Modified     .99    .98    .08     .02    .26
Study 5   One-factor   .73    .46    .39     .18    .02
Study 5   Modified     .98    .95    .12     .02    .02

Note. Modified model refers to a model that allows the residual variances of the second and
third item to be correlated.
To test whether this modification would affect our results, we conducted a robustness
check by rerunning all our central analyses from the manuscript. However, this time we
modeled trust in psychology by using factor scores obtained from the modified CFA (i.e., the
CFA which allows the residual variances of the second and third item to be correlated) instead
of simply calculating the mean.
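As a rough illustration of this robustness check (again with hypothetical item, data frame, and condition names), factor scores from the modified CFA can be extracted with lavPredict() and then compared across conditions in place of the scale mean:

```r
library(lavaan)

# Modified model from the sketch above (item and variable names are placeholders)
model_modified <- '
  trust =~ trust1 + trust2 + trust3 + trust4 + trust5
  trust2 ~~ trust3
'
fit_modified <- cfa(model_modified, data = dat)

# Factor scores for each participant (assumes complete data, so that the
# scores line up with the rows of dat)
dat$trust_score <- as.numeric(lavPredict(fit_modified))

# Example robustness check: one-sided two-sample t-test on the factor scores;
# the direction tested depends on the ordering of the factor levels of 'condition'
t.test(trust_score ~ condition, data = dat,
       var.equal = TRUE, alternative = "greater")
```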
Robustness Checks Using Factor Scores Instead of Means
Robustness Check: Study 1. In Study 1, a higher estimated replication rate still
correlated significantly with a higher trust factor score, r(268) = .321, one-sided p < .001,
95% CI [.210, .425].
Robustness Check: Study 2. Participants in the low replicability condition indicated a
significantly lower trust factor score (M = -0.35, SD = 1.39) than participants in the high
replicability condition (M = 0.30, SD = 1.24), t(176) = 3.31, one-sided p < .001, d = 0.50,
95% CI [0.20, 0.80].
Robustness Check: Study 3. Again, participants in the low replicability condition
indicated a significantly lower trust factor score (M = -0.21, SD = 1.46) than participants in
the high replicability condition (M = 0.30, SD = 1.39), t(196) = 2.54, one-sided p = .006, d =
0.36, 95% CI [0.08, 0.64].
Participants in the low replicability but transparency condition did not indicate a
significantly higher trust factor score (M = -0.09, SD = 1.50) than participants in the low
replicability condition, t(194) = 0.59, one-sided p = .279, d = 0.08, 95% CI [-0.19, 0.37].
Robustness Check: Study 4. Participants in the QRPs condition showed a
significantly lower trust factor score (M = -0.33, SD = 1.43) than participants in the hidden
moderator condition (M = 0.27, SD = 1.39), t(185) = 2.91, one-sided p = .002, d = 0.43, 95%
CI [0.13, 0.72].
However, the low replicability condition (M = 0.03, SD = 1.52), which served as a
control condition and only reported on the low replication rate, did not differ significantly
from the hidden moderator condition, t(188) = 1.15, p = .254, d = 0.17, 95% CI [-0.12, 0.45],
or from the QRPs condition, t(183) = 1.65, p = .100, d = 0.24, 95% CI [-0.05, 0.53].
Robustness Check: Study 5. Participants in the “now high” replicability condition
did not show significantly higher trust factor scores (M = 0.13, SD = 1.69) than participants in
the low replicability condition (M = -0.13, SD = 1.75), t(186) = 1.01, one-sided p = .157, d =
0.15, 95% CI [-0.14, 0.44].
Does Low Replicability Reduce Public Trust in Psychology Beyond Repair? Assessing
Evidence for the Null Hypothesis in Studies 3 to 5
Studies 3 to 5 found no evidence that commonly used trust-repair strategies
significantly repair this damaged trust. So does low replicability damage public trust beyond
repair? It is important to note that non-significant results do not allow us to conclude that the
null hypothesis is true (i.e., that the tested strategies have absolutely no effect). Thus, in
addition to the sensitivity analyses reported in the manuscript, we also conducted equivalence
tests (Lakens, 2017; Lakens, Scheel, & Isager, 2018) to examine whether our observed effect
sizes are statistically equivalent to an interval containing only very small effects (|d| < 0.2).
Moreover, we computed Bayes factors (BF) to quantify the support the data provide for the
null hypothesis relative to the alternative hypothesis (Jarosz & Wiley, 2014). Results for these
analyses are presented in Table 2.
The results from the equivalence tests indicate that our observed effects are not
statistically equivalent to the interval covered by the equivalence bounds [d = -0.2, d = 0.2].
Following the logic of equivalence tests, we thus cannot declare the absence of meaningful
effects. This is especially noteworthy because we used a rather liberal threshold for what
constitutes a meaningful effect (d = 0.2). While d = 0.2 is conventionally considered a small
effect (Cohen, 1988), it could still represent a meaningful effect of a trust-repair intervention
after a replication crisis: Given that such an intervention would ideally reach large parts of
the population, even very small effects could be meaningful (Matz, Gladstone, & Stillwell,
2017). However, even with this potentially very liberal criterion of d = 0.2, the equivalence
tests were not significant.
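To make the procedure concrete, here is a minimal base-R sketch of the two one-sided tests (TOST) computed from summary statistics, assuming pooled variances; the means, SDs, and group sizes in the example call are placeholders, not the actual study values. In practice, the TOSTER package accompanying Lakens (2017) can be used instead of hand-coding the test.

```r
# Two one-sided tests (TOST) against equivalence bounds of d = -0.2 and d = 0.2,
# computed from summary statistics
tost_two <- function(m1, sd1, n1, m2, sd2, n2, d_bound = 0.2) {
  sd_pooled <- sqrt(((n1 - 1) * sd1^2 + (n2 - 1) * sd2^2) / (n1 + n2 - 2))
  se   <- sd_pooled * sqrt(1 / n1 + 1 / n2)
  df   <- n1 + n2 - 2
  diff <- m1 - m2
  # Test 1: is the effect larger than the lower bound (-0.2 * sd_pooled)?
  p_lower <- 1 - pt((diff - (-d_bound) * sd_pooled) / se, df)
  # Test 2: is the effect smaller than the upper bound (0.2 * sd_pooled)?
  p_upper <- pt((diff - d_bound * sd_pooled) / se, df)
  # Equivalence is declared only if both one-sided tests are significant,
  # so the TOST p value is the larger of the two
  max(p_lower, p_upper)
}

# Hypothetical example call (placeholder values, not the actual study results)
tost_two(m1 = 0.10, sd1 = 1.4, n1 = 98, m2 = 0.00, sd2 = 1.5, n2 = 98)
```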
In contrast, the default Bayes factor for an unpaired t-test (calculated using JASP
version 0.8.4) shows that the data favor the null hypothesis over the alternative hypothesis in
all three studies. However, given that all Bayes factors are < 10, this would conventionally
not be considered strong support for the null hypothesis (Jarosz & Wiley, 2014).
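For readers working in R rather than JASP, a comparable default Bayes factor can be obtained from a t statistic and the group sizes with the BayesFactor package. This is only a sketch under the assumption of the default Cauchy prior (rscale = "medium", width of about 0.707, matching the JASP default); the t value and sample sizes below are placeholders, and the one-sided alternatives preregistered for Studies 3 and 5 would additionally require restricting the prior (e.g., via nullInterval), which the sketch omits.

```r
library(BayesFactor)

# Default Bayes factor for a two-sample t-test from the t statistic and group
# sizes (placeholder values, not the actual study results)
bf10 <- ttest.tstat(t = 0.67, n1 = 98, n2 = 98, rscale = "medium", simple = TRUE)

# Table 2 reports BF01, i.e., evidence for the null relative to the alternative,
# which is the reciprocal of BF10
bf01 <- 1 / bf10
bf01
```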
Table 2.
Observed Effect Sizes, Equivalence Test Results, and Bayes Factors for the Tested Trust-
Repair Strategies in Studies 3 to 5.

Study                               Observed d     Equivalence test            BF (H0/H1)
Study 3 (Transparency)              d = 0.11       t(194) = 0.671, p = .251    3.27
Study 4 (QRPs)                      d = -0.25 a    t(183) = 0.320, p = .625    1.68
Study 4 (Hidden Moderators)         d = 0.03       t(188) = 1.152, p = .125    6.22
Study 5 (Increased Replicability)   d = 0.15       t(186) = 0.308, p = .379    2.26

Note. The presented Bayes factors indicate the likelihood of the obtained data under the null
hypothesis, divided by the likelihood of the data under the alternative hypothesis. Alternative
hypotheses are one-sided in Studies 3 and 5 and two-sided in Study 4 (in line with our
preregistered hypotheses).
a Information about QRPs has a negative effect on trust in psychology, indicated by the
negative value.
Overall, neither the equivalence tests nor the Bayes factors provide conclusive evidence for
the null hypothesis. This is in line with our interpretation of the data presented in the
manuscript.
Supplemental References
Broomell, S. B., & Kane, P. B. (2017). Public perception and communication of scientific
uncertainty. Journal of Experimental Psychology: General, 146, 286–304.
https://doi.org/10.1037/xge0000260
Cohen, J. (1988). Statistical power analysis for the behavioral sciences. New York:
Routledge.
Jarosz, A. F., & Wiley, J. (2014). What are the odds? A practical guide to computing and
reporting Bayes factors. The Journal of Problem Solving, 7, 2.
Lakens, D. (2017). Equivalence tests: A practical primer for t tests, correlations, and meta-
analyses. Social Psychological and Personality Science, 8, 355–362.
Lakens, D., Scheel, A. M., & Isager, P. M. (2018). Equivalence testing for psychological
research: A tutorial. Advances in Methods and Practices in Psychological Science, 1,
259–269.
Matz, S. C., Gladstone, J. J., & Stillwell, D. (2017). In a world of big data, small effects can
still matter: A reply to Boyce, Daly, Hounkpatin, and Wood (2017). Psychological
Science, 28, 547–550.
Nisbet, E. C., Cooper, K. E., & Garrett, R. K. (2015). The partisan brain: How dissonant
science messages lead conservatives and liberals to (dis)trust science. The ANNALS of
the American Academy of Political and Social Science, 658, 36–66.
https://doi.org/10.1177/0002716214555474
Rabinovich, A., & Morton, T. A. (2012). Unquestioned answers or unanswered questions:
Beliefs about science guide responses to uncertainty in climate change risk
communication. Risk Analysis: An International Journal, 32, 992–1002.
Rosseel, Y. (2012). Lavaan: An R package for structural equation modeling and more.
Version 0.5–12 (BETA). Journal of Statistical Software, 48, 1–36.
Rybowiak, V., Garst, H., Frese, M., & Batinic, B. (1999). Error Orientation Questionnaire
(EOQ): Reliability, validity, and different language equivalence. Journal of
Organizational Behavior: The International Journal of Industrial, Occupational and
Organizational Psychology and Behavior, 20, 527–547.