Kinds of Replication: Examining the Meanings of “Conceptual Replication” and “Direct Replication”

Abstract

Although psychology’s recent crisis has been attributed to various scientific practices, it has come to be called a “replication crisis,” prompting extensive appraisals of this putatively crucial scientific practice. These have yielded disagreements over what kind of replication is to be preferred and what phenomena are being explored, yet the proposals are all grounded in a conventional philosophy of science. This article proposes another avenue that invites moving beyond a discovery metaphor of science to rethink research as enabling realities and to consider how empirical findings enact or perform a reality. An enactment perspective appreciates multiple, dynamic realities and science as producing different entities, enactments that ever encounter differences, uncertainties, and precariousness. The axioms of an enactment perspective are described and employed to more fully understand the two kinds of replication that predominate in the crisis disputes. Although the enactment perspective described here is a relatively recent development in philosophy of science and science studies, some of its core axioms are not new to psychology, and the article concludes by revisiting psychologists’ previous calls to apprehend the dynamism of psychological reality to appreciate how scientific practices actively and unavoidably participate in performativity of reality.
https://doi.org/10.1177/17456916211041116
Perspectives on Psychological Science
2022, Vol. 17(5) 1490–1505
© The Author(s) 2022
Article reuse guidelines:
sagepub.com/journals-permissions
DOI: 10.1177/17456916211041116
www.psychologicalscience.org/PPS
ASSOCIATION FOR
PSYCHOLOGICAL SCIENCE
Since 2011, psychology has been experiencing a period
of turmoil that is often referred to as a “crisis.” Meth-
odology, statistics, theory, publication practices, and
incentive structures have all become topics of often
heated debate. Replication in particular is cast as a
central issue, one shared by other sciences. It is note-
worthy that the current troubles are referred to as both
a “crisis of confidence” (e.g., Pashler & Wagenmakers,
2012) and as a “replication crisis” (e.g., Pashler & Harris,
2012). Failed replications have been a major factor in
denting trust in the solidity of the discipline’s accumu-
lated findings, which have been referred to as “a vast
graveyard of undead theories” (Ferguson & Heene,
2012). What has been reported in psychology journals
is thought to consist to a significant extent of false
positives: the product of sloppy methods (e.g., low-
powered studies) in combination with selective report-
ing and publication bias and/or other questionable
research practices, such as hypothesizing after the
results are known (i.e., HARKing; Kerr, 1998), and in
some cases outright fraud. Although most subfields of
psychology have been the subject of these reports, some,
like social psychology, are receiving greater attention.
But replication also figures prominently in the solu-
tions to the problems that have been proposed: Many
researchers hold that only when replication is made a
standard element of the research process can confi-
dence be restored and will psychology live up to its
status as a science. Objectivity entails reproducibility,
and testing the reproducibility of an effect in a replica-
tion study is a crucial part of science, an idea that is
often attributed to Karl Popper (e.g., Srivastava, 2014a).
What psychology needs, therefore, is more attention to
Corresponding Author:
Maarten Derksen, Department of Theory & History of Psychology,
Faculty of Behavioural and Social Sciences, University of Groningen
Email: m.derksen@rug.nl
Maarten Derksen¹ and Jill Morawski²
¹Department of Theory & History of Psychology, Faculty of Behavioural and Social Sciences,
University of Groningen, and ²Psychology Department, Wesleyan University
Keywords
replication, crisis, performativity, multiplicity, epistemology
replication and specifically to what is usually termed
“direct replication”: repeating the experimental proce-
dure of the original experiment as closely as possible
to test whether it produces the same result. Over the
past 10 years, many psychologists have taken up this
challenge, often collaborating in large-scale replication
projects involving dozens of researchers (Nosek et al.,
2021).
There are some, however, who are critical of the
emphasis on direct replication in the current debate
and believe that it is misguided to strive for the repro-
ducibility of effects. These authors argue that replica-
tion is not as rare as it is made out to be by the alarmist
critics, but that it usually takes the shape of so-called
“conceptual” replications, in which the same hypothesis
or theory is tested but in a different way.¹ This practice
is defended with two related arguments. First, it is
pointed out that psychology is not about behavioral
phenomena, per se, but about the psychological pro-
cesses underlying them. It is these processes that psy-
chology’s theories describe. Second, it is argued that
behavior is sensitive to context, and this context is
socially, culturally, and historically highly variable. We
therefore cannot expect the same experimental manipu-
lation to have the same effect in different circumstances.
Because of this context sensitivity, failure of a direct
replication is not informative. The proper way to bolster
and extend a theory is by conceptual replication.
Whereas proponents of direct replication present the-
ory as constrained (by evidence), advocates of concep-
tual replication accord theory a more central place in
research.
In this article, we offer a perspective on psychologi-
cal research, replication in particular, in which research
is understood primarily as the production of effects,
phenomena, and events rather than as the discovery of
underlying mechanisms. We invite readers to envision
how the epistemic premises of this perspective depart
from conventional philosophy of science but retain a
commitment to realism. Research can fruitfully be
regarded as performative, in the sense that it creates
(multiple) realities rather than that it discovers (a single)
reality. We argue that such a perspective suggests a way
forward beyond the increasingly unhelpful dichotomy
of direct and conceptual replication and the disputes
over what constitutes the “good-enough” replication it
engenders. It also brings into focus the political dimen-
sion of psychological research and its “real-world”
applications, both relatively underrepresented issues in
the current debate in psychology.
Although performativity and, specifically, enactment
theory may seem an outlandish or even “postmodern”
perspective to some, we will also note that there are
similarities with ideas and proposals put forward earlier
by reputable psychologists such as William McGuire,
Anthony Greenwald, and Paul Rozin. Notwithstanding
the diversity of their views, they shared an emphasis on
multiplicity and on research as a process primarily of
making things happen, of producing effects. Each warned
against an exclusive focus on science as theory testing,
instead favoring a result-centered approach. In the discus-
sion we will explore these connections further.
From Representing to Enacting
Science is conventionally understood as generating accu-
rate representations of an ordered, singular world; thus,
psychological science aims to provide accurate repre-
sentations of ordered patterns of thought, feeling, and
behavior. This commonly held epistemic premise that
the world is singular, ordered, and relatively stable—that
reality is “out there” and that it can be discovered and
represented through science—has motivated substantial
research in science studies (also referred to as science
and technology studies [STS]). Over the past 4 decades,
STS researchers have examined the practices of science, not
simply its theories. In so doing, they departed from the
conventional view of science as “above all, a body of
representations of reality” and moved toward “an under-
standing of science as a mode of performative engage-
ment with the world” (Pickering, 2010, p. 19). In other
words, representation is relocated “from the theoretical
to the practical side of science” and thus is “no longer
regarded as a propositional account of the world but as
the activity of instrumentally producing traces, images
and artifacts, which enable scientists to better grasp and
handle the objects of their investigations” (Langlitz, 2015,
p. 20). This epistemic move marks an analytic shift from
scientific theories (representations) to scientific practices,
from science as contemplation of the world to science
as activity, and from science as writing (as texts) to sci-
ence as doing. By approaching science as practice,
researchers have closely investigated the extensive and
difficult scientific work that culminates in representations
of objects, concepts, and facts (Latour & Woolgar, 1986;
Lynch & Woolgar, 1990; Pickering, 1994).
The focus on scientific practices and the ways they
actively engage with and intervene in the world has
been extended to consider how scientific entities are
coextensive with scientific practices. Accordingly,
objects, processes, or entities are understood as enacted,
not found. As Latour and Woolgar proposed, “this bun-
dle of out-thereness can be understood as an accom-
plishment rather than something that defines and sets
limits to the ways in which we can properly know the
world” (quoted in Law, 2004, p. 37). This is a more
comprehensive and consequential view through which
science produces the very things it studies. Facts,
“bundles of out-thereness,” are the result of research,
rather than discovered in research.
A simple but striking illustration of enactment is a
recent study of the effects of analytical variability. The
“many analysts, one data set” project was a collabora-
tive effort of 29 teams that each investigated the same
research question (do soccer players with a dark skin
tone get more red card bookings from referees than
players with a light skin tone?) with the same data set
(Silberzahn et al., 2018). The teams used a variety of
analytic strategies to answer this question, and despite
two rounds of (online) discussion between the teams,
there remained variability in the strategies. The differ-
ences made a difference: Although most teams found
a relation between skin color and red cards, some did
not; not all relationships were statistically significant,
and the effect sizes varied. Depending on the analytical
strategy one chooses, therefore, there is racism in soc-
cer or there is not. The fact is coextensive with the
analysis of the data.² This is not to say that the fact is
made up or that “anything goes.” Every analytic strat-
egy has to find justification in the field of statistical
methodology, a field known for its robust debates. Nor
does it mean that race is not an issue in soccer. It
clearly is: The fact that this particular research question
is considered to be worth investigating is itself testi-
mony to that. But turning this issue into a scientific
fact is not so much a matter of discovering something
already out there (racism in soccer) but of mobilizing
particular statistical methods to detect patterns in the
data, which are then sent out into the world of soccer
as facts about racism. And these results can go on to
set in motion changes in soccer: diversity initiatives,
perhaps.
Alexandra Rutherford’s (2017) analysis of sexual
assault surveys provides another example. These surveys
are performative in that they “materialize experiences
in new ways,” making certain experiences—and not
others—“real” (2017, p. 115). Yet not just any measure of
sexual assault can materialize experiences, for such practices
necessarily perform “within a complex assemblage
of implicit and explicit beliefs, attitudes, institutions,
communities, and politics (including, importantly, femi-
nist politics)” (2017, p. 116). Neither does Rutherford
mean that, as American conservatives like to claim, date
rape is a fictitious phenomenon, created by feminist
social scientists. Instead she describes how rape was
given a particular kind of reality by the surveys that were
central to the debate. The experiences of rape survivors
became entangled with numbers collected to measure
their prevalence. The surveys “materialized and remate-
rialized—via numbers and statistics—experiences that
had been individual, private, unarticulated and—before
the 1980s—unmeasured” (Rutherford, 2017, p. 114). It
was in this quantified form—particularly the one-in-five
statistic—that date rape became a topic in American cul-
ture, mediated by bureaucrats, policy-makers, and media.
An enactment perspective on science, then, holds to
realism regarding the world and entities in the world
while at the same time maintaining that scientific practices
“bring (aspects of) the world into existence”: Certain
realities are being enacted.³ From this follows a
second feature of the enactment perspective. Given that
entities are enacted in the course of scientific investiga-
tions, it is possible, even likely, that different practices
can yield different entities. Entities and objects thus
are never finished or complete things but rather are
the effects of practices (Law, 2004; Law & Lien, 2013;
Woolgar & Lezaun, 2013). Mol’s (2002) influential study
of the medical diagnosis and treatment as well as
patients’ lived experiences of atherosclerosis reveals
that the numerous sites of diagnosis and treatment iden-
tify and engage different entities, not just different
representations of or perspectives on the entity “athero-
sclerosis.” In these different sites atherosclerosis is dif-
ferent things. To connect them into different instances
of a single entity requires work; it is not a straightfor-
ward “reflection of any innate commonality or charac-
teristic” (Woolgar & Lezaun, 2013, p. 325). That work
may be practical (connecting diagnostic results into a
single dossier) as well as theoretical.
Third, given that the entities are generative effects
of specific scientific practices, variations in scientific
practices typically result in differences, uncertainties,
and precariousness (Pickering, 2010). Scientific work
involves ongoing efforts to reduce these uncertainties
and multiplicities through techniques created and
deployed to align data and to stabilize and produce or
reproduce the entity (Guenther & Hess, 2016; Hoeppe,
2014). These efforts do not always succeed. For instance,
analyzing the conceptions of “antisociality” and “psy-
chopathy” and the proposed biomarkers for these dis-
orders developed in British psychiatry between 1950
and 2010, Pickersgill (2014) found enduring uncertain-
ties and diversity of both theories and methods; 60
years of investigations yielded neither scientific con-
sensus nor even stable referents for the disorders. Con-
trary to conventional science’s epistemology, which
would deem this a situation of problematic disunity, if
not a crisis that warrants resolution, researchers and
clinicians pursued these varied conceptions, undertak-
ing “practical uncertainty work” but also remaining
aware of the absence of consensus or clarity. Such a
state of “ontological anarchy” actually proves to be
generative, providing intellectual resources and degrees
of freedom as well as entities for both researchers and
clinicians to proceed with their practical endeavors.
Pickersgill concluded, “Ontological anarchy is thus—to
a degree—autopoietic: it is a response to the uncertain-
ties inherent to dealing with antisociality in a psychi-
atric context, as well as an engine powering the
generation of yet more ambiguity” (p. 147).
Fourth, although scientific entities are the effects of
elaborate technical practices, that work can be trans-
ported and taken up outside investigative arenas and
applied to social life, with the possible eventual effect
of changing human thought and behavior (Hacking,
1995, 2007; MacIntyre, 1985; Richards, 2002; Stam,
2015). For example, through the uptake of the sciences
and social sciences, the application of economic models
to the financial world has performative effects “and
among these effects is to alter economic processes to
make them more like their depiction by economics”
(MacKenzie etal., 2007, p. 67). Taking economics as
the test case, MacKenzie (2007) identifies three variants
of performativity. “Generic performativity” is the basic
use of an economic idea. Effective performativity is the
use of an economic idea that “makes a difference” in
the world: Using the idea changes economic processes
and realities. Barnesian performativity is a stronger
version of effective performativity that results in the
altering of actual economic processes to make them
more like the model or theory of economics. By contrast,
counterperformativity is the use of an economic idea that
changes economic processes so that they conform less
well with the depiction provided by economics. The
case of economics offers a tool for thinking about the
ways that psychology travels and its entities are per-
formed beyond scientific spaces. Evidence closely
resembling Barnesian performativity was found by
Haslam (2016) who tracked, both quantitatively and
qualitatively, the circulation of psychological concepts
in North American society. Haslam (2016) described the
expansion of psychological concepts as “concept
creep,” noting how they take shape and mutate in
response not merely to scientific evidence but also
psychologists’ political inclinations and changing social
conditions.
Finally, the enactment perspective can be general-
ized to encompass not only scientific practice but also
the world as a whole, thus turning this perspective into
a metaphysics, albeit a different one than assumed by
most scientists. The variability and multiplicity that are
shown in studies of scientific practice are then taken
to be characteristics of reality as such, whether science
is involved or not. Instead of the usual Western meta-
physical conception of reality as fundamentally singular,
stable, and determinate, such an alternative metaphys-
ics pictures reality as “an ultimately undecidable flux”
(Law, 2004, p. 144).⁴ This is obviously a radical and
potentially controversial idea, but it is directly relevant
to the ideas of the proponents of conceptual replication.
For them, flux is a fundamental character of the social
world.
Conceptual Replication as Enactment
A coterie of researchers has responded to the recent
calls for more direct replication in psychology by con-
testing its efficacy and scientific value. Some have even
challenged the very possibility of direct replications.
Instead, they promote conceptual replications, which
use different operationalizations, variables, experimen-
tal designs, and participants to test the theory of the
original study. (It warrants note that although the two
forms of replication—direct and conceptual—typically
are considered distinct kinds according to “standard
discourse” [Crandall & Sherman, 2016, p. 93], some
authors describe fuzzy boundaries between them or
give a more elaborate taxonomy of replication kinds.)
Proponents of conceptualism put forward a nuanced understanding
of Popper’s work on scientific epistemology,
especially regarding falsifiability and confirmation,
and also cite other philosophers of science who report
on the ambiguity and even logical impossibility of
direct replication or emphasize how science is a col-
lective, accumulating activity (Cesario, 2014; Crandall
& Sherman, 2016; Stroebe & Strack, 2014). Supported
by these philosophical positions, conceptualists argue
that the primary function of replication is not falsifica-
tion but exploration and development of theory, sub-
mitting that replication of a concept across different
experimental situations is more robust than replications
of exact situations. Advocates of conceptual replication
generally prioritize basic over applied research, discov-
ery over intervention, and exploration over confirma-
tion. Beyond maintaining that direct replications
(whether they disconfirm or confirm the original study)
provide ambiguous evidence, they advance three fun-
damental and substantive claims: the context sensitivity
of psychological phenomena and processes, the pre-
eminent scientific goal of theory development, and the
special expertise required of psychological scientists.
For conceptualists, failures of direct replication do
not indicate that the hypothesis or theory is necessarily
wrong but rather that many psychological phenomena
and their experimental effects are highly sensitive to
context and, therefore, often cannot be replicated
exactly. They observe that psychology’s phenomena are
affected by situation, culture, language, politics, and
personal experiences. Thus, the effects observed via
empirical investigations can vary over time and across
situations. Such is the mutability and flux of (social
psychological) phenomena that “one can never step in
the same river twice” (Crandall & Sherman, 2016, p. 94).
Extreme context-sensitivity is “the reality of our subject”
(Dijksterhuis, 2014, p. 73). Conceptualists draw atten-
tion to two context-sensitive domains: the local context
of the investigative situation and the larger one of cul-
tural and worldly events. The identification of variations
in investigative settings (typically experiments) echoes
and amplifies a number of methodological concerns
raised in discussions of reproducibility: participant
populations and sampling, time, location, variations in
instruments and stimuli, experimenter effects, and the
like. Experimental manipulations in social psychology,
for instance, might have “different psychological prop-
erties and effects if used in contexts or populations
different from the original experiments”; exact replica-
tions, therefore, “can never be achieved” (Fabrigar &
Wegener, 2016, p. 72) and “are fundamentally impos-
sible in social-personality psychology” (Reis & Lee,
2016, p. 149). Experiments can never be repeated
because “effect sizes are not determined in a universe
that is purified of all other influences, observed strength
is determined by both the systematic variance between
and the error within the experimental conditions”
(Strack, 2017, p. 2).
Variability stems not only from unavoidable micro-
level differences across investigative situations but also
from variations in culture, history, politics, and climate
that can affect behavior, cognitions, and emotions. Con-
ceptualists find this macro-level sensitivity to be unre-
markable given that, per Bavel (2016b, p. 4936), “the
notion that human psychology is shaped by the social
context has been the central premise of the field (social
psychology) for nearly a century.” Likewise, our knowledge
about the evolved complexity of the human mind confirms
cultural sensitivity (Cesario, 2014). Taking
seriously the cultural, environmental, and historical influ-
ences on psychological processes presents implications
that extend beyond reproducibility of science to core
questions about the very nature of psychological phe-
nomena: They are matters of ontology. Many phenomena
are moderated by cultural and historical conditions and
sometimes might even be “culture dependent” (Stroebe
et al., 2012, p. 679). To Iso-Ahola (2017), “all of this
means that there are no static phenomenon particles,
unlike the Higgs Boson particle in physics” (p. 2).
Some have suggested that such heightened concern
about the nature of psychological phenomena is more
serious than earlier crises in social psychology. Whereas
researchers once questioned whether a phenomenon
existed outside the lab, “the question being asked today
is much more unsettling: ‘does this phenomenon exist
at all’?” (Hales, 2016, p. 40). However, most conceptual-
ists maintain that there are limits to the flux or change-
ability of psychological phenomena. They hold either
that not all phenomena are dependent on cultural fac-
tors or that “the brain, behavior, and society are orderly
in their complexity rather than lawful in their simplicity”
(Bavel etal., 2016a, p. 6458), or that despite variability,
essential psychological phenomena can be located
through theory-guided research. Behaviors and thoughts
are the effects of numerous, sometimes imperceptible,
unseen and mediating factors—“underlying” or “inter-
nal” mechanisms. The path toward discovering these
essential mechanisms is not via refinement of direct-
replication techniques, although their improvement is
important, but through intensified theory development.
“Confidence in theory” is valued over the “confidence
in operationalizations” of researchers conducting direct
replications (Crandall & Sherman, 2016, p. 93).
Prioritizing theory over a concerted project to repli-
cate experiments and reproduce effects is warranted by
epistemological claims. Most basic of these claims is that
reproducibility of an empirical finding is less valuable
than evidence of validity of a theory (Greenfield, 2017;
S. B. Klein, 2014; Stroebe & Strack, 2014). (Given this
spotlighting of theory, Zwaan et al. [2018] suggest that
the term “conceptual” is a misnomer and that a more
appropriate designation would be “extension” to refer
to testing and extending theory.) Advocates of concep-
tual replication stress the cumulative nature of science
not as amassing countless empirical findings but as pro-
gressing through theory development, refinement, and
sometimes replacement. Crandall and Sherman ask fel-
low researchers to “trade higher confidence in a single
set of operations for higher confidence in theory” (2016,
p. 98). They note that not data but “ideas are the unit
of analysis in conceptual replication” (2016, p. 95). Sci-
ence is understood as a collective enterprise composed
of research programs for which the goal is creating valid
theories (Reis & Lee, 2016; Stroebe, 2016).
According to Strack and Stroebe (2018), the goal is
to understand underlying mechanisms, which requires
not only experimenting but also working at “the theo-
retical level” (para. 5). So relying on theory entails
appreciation that theories “are formulated on a level
that transcends the concrete evidence; and their validity
does not rest on the outcome of one specific experi-
mental paradigm” (p. 39). In contrast to reformers, con-
ceptualists engage the psychological world not at the
ground level of objects, behaviors, or effects. Hacking
(1999) finds use of what Willard Van Orman Quine
called “semantic ascent” (as quoted by Hacking, 1999,
p. 21), shifting attention from ground-level talk about
objects to the abstract level of talk about what those
objects mean. To Hacking, such ascent entails the use
of “elevator words” (p. 21): The conceptualists fore-
ground theory (over data), special expertise (over rou-
tine science training), and concepts (over behaviors and
effects). Without making this ascent, conceptualists
intimate, staying at ground-level absorption with direct
replications risks ambiguous outcomes and perhaps
more importantly produces effects specific to the exper-
iment’s unique conditions. However, along with this
ascent to theory and expertise is an expectation of
ultimately locating basic psychological mechanisms,
presumably grounded in neurological processes.
Along with promoting a theory-driven enterprise on
epistemic grounds, conceptualists also champion theory
on ontological grounds. An observed phenomenon or
effect is not necessarily evidence of the “underlying
mechanisms” (Stroebe & Strack, 2014, p. 59). The “col-
lection of effects and phenomena” deters researchers
from exploring basic laws (Strack, 2017, p. 3). In other
words, the cultural and contextual sensitivity of psy-
chological processes and the unseen moderators pose
formidable challenges to ambitions regarding direct
replication. Conceptual replications instead aim to
“operationalize the underlying theoretical variables
using different manipulations and/or different mea-
sures” (Stroebe & Strack, 2014, p. 60). It is precisely
through variations on an earlier study (i.e., through
conceptual replications) that the underlying stable real-
ity can be brought into view. Foolishly replicating the
same procedure (a direct replication) only risks failure:
An experiment that once produced an effect may never
do so again because of the ever-changing context
(Crandall & Sherman, 2016). Social psychologists are
condemned to continuous variation if they want to keep
a hold on the stable psychological reality.
Another contention is that researchers conducting
direct replications underappreciate scientific expertise.
Whereas the projects initiated to foster direct replication
assume that well-trained researchers can proficiently con-
duct replications, the conceptualists often mention the
necessity of special “expertise and diligence to generate
a new result in a reliable fashion” (Strack, 2017, p. 3; Bavel
et al., 2016a). Some are remarkably critical, suggesting
that “the replication crisis can even be seen as rewarding
incompetence” through reformers’ supposition that any
researcher can undertake replications; in contrast, Bau-
meister avers, competence requires “years of specialized
training and skill cultivation” (2016, p. 156). Replications
depend on expertise in the specific subject area, and
without this extensive experiential proficiency, “replica-
tion experts” “may train a big telescope with a dirty lens
on the wrong planet” (Schwarz & Clore, 2016, p. 1409).
Indeed, sometimes researchers have responded to failed
replications of their original studies with comments on
the replication researcher’s lack of necessary expertise
(e.g., Schnall, 2014). Juxtaposed against reformers’ unease
about researchers’ “degrees of freedom” or nonstandard
research decisions is conceptualists’ valuing of researchers’
expert judgment. Using terms from Daston and
Galison’s (2007) history of objectivity, the conceptualists
emphasize “trained judgment” (the crucial value of special
expertise) over the faith in “mechanical objectivity” (rigor-
ous, routine procedures) of researchers conducting direct
replications.
There are two ways that one might appreciate the
conceptualist position via enactment. From one angle,
conceptualists practically mirror enactment theory with
their strong emphasis on chance, variability, and flux in
social behavior. Such an enactment perspective is illus-
trated in the recognition that “even a technically identi-
cal manipulation does not guarantee an equivalent test
of psychology phenomenon when the context changes”
(Schwarz & Clore, 2016, p. 1408). However, this is not
the reality of significance to conceptualists, and their
enactments transpire in a different place and quite
differently—as internal or underlying mechanisms. An
underlying psychological reality, dynamic yet lawful,
consists of sometimes imperceptible, unseen factors.
Conceptualists hold to a belief in the lawfulness and
stability of reality, a reality that can be known through
crafting good theory: “Empirical outcomes are meaning-
ful only with respect to the theory being tested” (Stroebe
& Strack, 2014, p. 60). Theory building is a crucial form
of the work required to make entities singular. A theory
connects different effects and different results from stud-
ies run by various researchers working at various sites
and at different times, and makes them evidence of a
single mechanism, process, or disposition.
An example of the work of singularization is Barsalou’s
(2016) assessment of the varying forms and findings of
social priming, including evidence of individual differ-
ences. He found that “simple direct pathways from
primes to primed responses rarely, if ever, exist” (p. 9).
Given such cognitive and behavioral complexity, he
proposes a theory of “situated conceptualization” that
provides a “natural” and “principled account of the
knowledge structures that develop” in the form of indi-
viduals’ multimodal inferences (p. 9). The theory holds
that the brain processes different elements of a situation
(e.g., place, agents, objects, self, action) and multiple
experiences and, over time, integrates these experi-
ences and produces conceptual interpretations. These
“situated conceptualizations” are activated later when
the individual encounters a situation containing ele-
ments of earlier ones; there ensue pattern completion
inferences that are implemented through multimodal
(of place, agents, objects, etc.) simulations in the brain.
This multistage theory explains the inadequacy of direct
replications in the study of social priming: “Because
any aspect of these situated conceptualizations can trig-
ger this process, or be the outcome of it, social priming
takes infinitely many forms” (p. 8). Yet the theoretical
framework gives an account of the fundamental pro-
cesses that can yield highly variable effects.
Another project aiming toward singularization, but
of a different kind, is Greenfield’s (2017) proposal to
move beyond the issue of whether a phenomenon is
replicable to study the effects of sociodynamic changes
on culture and behavior. Her “theory of social change
and development” (p. 763) offers a description of
changes on several levels, from sociodemographic
down to behavioral, and the causal influences going
from the higher to the lower levels. Greenfield discusses
two failed replications of social psychological experi-
ments and argues that her theory explains both failures.
They were not failures to replicate but rather demon-
strations “of the effect of culture change on behavior”
(p. 768).
The theory work being promoted is not abstract
theorizing; it works to bring associated enacted entities
together to be understood as the same thing. According
to Woolgar and Lezaun (2013) and others, the produc-
tion of singularity is always a fragile achievement and
often a source of tension. In fact, some conceptualists
admit that researchers cannot always attain consensus
about the meaning of their conceptualizations. To prevent
such problems, Crandall and Sherman advise “careful
pilot testing” and “robust manipulation checks” (2016,
p. 98); Strack and Stroebe (2018) advise using “theoreti-
cally grounded hypotheses that generate specific pre-
dictions” (para. 7). The conceptualists’ attempts to attain
consensus or realize singularity regarding the entities
being investigated remain fragile, however, because
direct replications, following enhanced methodological
guidelines, keep yielding findings that differ from the
original results.
Direct Replication as Enactment
Ever since the reform movement started to gather steam
in 2011, replication has been a central concern. There
are several aspects related to the role of (direct) repli-
cation in science according to the reformers. First, the
reproducibility of events is often presented, following
Popper, as a precondition of falsifiability and thus of
science. Conceptual replication is important (for the
refinement and further development of a theory), but
only after the reproducibility of the effect under the
same investigative conditions has been determined. “If
a phenomenon is not replicable (i.e., it cannot be consistently
observed), it is simply not possible to empirically
pursue the other goals of science” (LeBel et al.,
2017, pp. 8–9; Earp & Trafimow, 2015). Second, replica-
tion must be possible by following explicit instructions,
given sufficient expertise—by “anyone who has learned
the relevant technique,” as Popper (2002, p. 81) put it.
Reformers are very skeptical about appeals to the need
for more than standard technical skills (Neuroskeptic,
2014; Srivastava, 2014b; Wilson, 2014). Third, (direct)
replication is seen as providing evidence for the reality
of a phenomenon. Reproducibility shows the robust-
ness and reliability of a phenomenon.
The reformers’ efforts to create a scientific practice
based on direct replication are characterized by atten-
tion to statistical and methodological detail; by an
emphasis on rules, regulations, and administration; and
by the important role of infrastructure. The statistical
and methodological inadequacies and errors of the cur-
rent practice have been listed in impressive detail
(Forstmeier etal., 2017; Wicherts etal., 2016) and are
generally seen to consist in researchers’ exploitation of
so-called researcher degrees of freedom (Simmons
etal., 2011) to arrive at the desired result. In every
scientific study, many decisions have to be made (e.g.,
regarding sample size, which comparisons to test,
which tests to report). Such decisions can have a great
influence on the study’s results, as shown, for example,
by Simmons et al. (2011), and opportunistic use of this
flexibility increases the chance of false positives. The
solution that is most commonly proposed is to constrain
this freedom by directing the researcher to make these
choices before data collection and publicly register the
study design and data analysis plan, with an electronic
date stamp as validation. This is called preregistration
(Wagenmakers et al., 2012).5 In the related
Registered Report (RR) format, a journal editor guaran-
tees publication of a study if the preregistered study
plan is reviewed positively, regardless of the eventual
results of the study (Chambers, 2013). Thus, in an RR,
both researchers and editors constrain their freedom in
the interest of falsifiability, giving space to negative
results and their publication.
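The inflation of false positives through flexible analysis can be made concrete with a small simulation. The sketch below is illustrative only and does not reconstruct the analyses of Simmons et al. (2011). It assumes two groups drawn from the same normal population with known unit variance (so that no true effect exists), a simple z test, and a hypothetical researcher who measures several independent outcomes and reports whichever yields the smallest p value. The function names, sample sizes, and number of outcomes are assumptions chosen for the example.

```python
import math
import random


def z_test_p(group_a, group_b):
    """Two-sided p value for a mean difference, assuming known unit variance."""
    n_a, n_b = len(group_a), len(group_b)
    diff = sum(group_a) / n_a - sum(group_b) / n_b
    standard_error = math.sqrt(1 / n_a + 1 / n_b)
    z = abs(diff) / standard_error
    # For a standard normal, 2 * (1 - Phi(z)) equals erfc(z / sqrt(2)).
    return math.erfc(z / math.sqrt(2))


def false_positive_rate(n_outcomes, n_per_group=20, n_sims=4000,
                        alpha=0.05, seed=1):
    """Share of simulated null studies in which at least one outcome is 'significant'.

    n_outcomes = 1 mimics a single prespecified test; larger values mimic a
    (hypothetical) researcher who picks the best of several outcomes.
    """
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_sims):
        p_values = []
        for _ in range(n_outcomes):
            # Both groups come from the same population: any "effect" is noise.
            group_a = [rng.gauss(0, 1) for _ in range(n_per_group)]
            group_b = [rng.gauss(0, 1) for _ in range(n_per_group)]
            p_values.append(z_test_p(group_a, group_b))
        if min(p_values) < alpha:
            hits += 1
    return hits / n_sims


if __name__ == "__main__":
    print("one prespecified outcome:", false_positive_rate(1))
    print("best of three outcomes:  ", false_positive_rate(3))
```

Under these assumptions, a single prespecified outcome should keep the false-positive rate near the nominal 5%, whereas choosing among three independent outcomes should push it toward 1 − .95³ ≈ 14%. This is the kind of flexibility that preregistration is meant to remove or make visible.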
Preregistration is an administrative procedure for
reducing or making transparent the liberties researchers
might take in data analysis. An administrative gesture
with a similar purpose was proposed by Simmons et al.
(2012). Their “21 word solution” requires authors to
state that “We report how we determined our sample
size, all data exclusions (if any), all manipulations, and
all measures in the study” (2012, p. 4). Another kind of
“statement” was proposed by Simons et al. (2017). The
“constraints on generality” statement would have
researchers declare to which population they claim
their findings can be generalized, so that researchers
conducting replication studies can take this into
account. Finally, related declaration-type gestures
are the open science badges and the associated standards
introduced by OSF (Blohowiak et al., 2013).
Whereas the 21-word solution and the constraints-on-
generality statement have not (yet) found wide use, the
badges are implemented by an increasing number of
journals and have been claimed to be effective in
encouraging preregistration, data sharing, and other
open practices (although this claim has been contested
by Bastian, 2017).
Online infrastructure is an important part of this
research practice. OSF (https://osf.io) facilitates a trans-
parent, collaborative research process from inception
to publication, including preregistration of the study plan.
There are online inventories of replication results
(http://curatescience.org and the older
http://www.psychfiledrawer.org). There is a preprint archive for
psychology (https://psyarxiv.com, modeled on
https://arxiv.org) allowing the quick dissemination of
manuscripts and their discussion by the community. Social
media—Twitter and Facebook, in particular—is a forum
where developments are discussed almost instanta-
neously, by the widely dispersed community, without
much hierarchy or gatekeeping. Finally, many reformers
have blogs, where they formulate opinions, present
results, comment on others’ work, continue discussions
that started on Twitter, or start debates that spill over
onto Twitter.
Together, the statistical and methodological rules and
strictures, the registration and archiving of decisions
and designs, the statements, the badges with their
standards, the repositories of results and manuscripts,
the online collaborative spaces, and the social-media
communication infrastructure, form a large, heteroge-
neous device—a “method assemblage” as Law (2004)
calls it—for the production of “reproducible science”
(Munafò etal., 2017, p. 1). The operation of this device
has resulted in serious doubt being cast upon accepted
theories and effects in social psychology; several social
priming effects (Doyen et al., 2012; Shanks et al., 2013)
and power posing (Ranehill et al., 2015), among others,
have been thrown into doubt by failed direct replica-
tions. Proponents have lauded this corrective role of
direct replications and see it as falsification in action,
whereas others have been critical of failed replications
of their work (Bargh, 2012; Schnall, 2014) and/or have
condemned what they see as the negative and hostile
attitude of some reformers (Baumeister, 2016; Fiske,
2016; Hamlin, 2017).
There has been much debate over whether direct
replication and its proponents play a corrective or a
destructive role. However, in our view, the science of
the reform movement is better seen as performative
and productive of reality—as enacting a reality. Reform-
ers take the production of phenomena very seriously.
An example of the performativity of their approach to
research is the replication by Wagenmakers et al. (2016)
of Strack et al.’s (1988) facial-feedback experiment. The
facial-feedback hypothesis states that the facial expres-
sion of an emotion will intensify or even bring about
the experience of that emotion itself. Strack et al. tested
the more specific hypothesis that this effect occurs even
without cognitive mediation (i.e., when people are not
aware they are expressing a certain emotion). To this
end, they devised a bogus experimental task that made
participants unwittingly create an expression (a smile
or a pout). Specifically, participants were asked to rate
the funniness of cartoons on a paper questionnaire with
a pen that they held either between their teeth (smile),
between their lips (pout), or in their nondominant
hands (neutral) as part of what they were told was “an
experiment investigating people’s ability to perform dif-
ferent tasks with parts of their body not normally used
for those tasks, as injured or handicapped persons often
have to do” (Strack et al., 1988, p. 770). In the smile
condition, cartoons were rated funnier than in the other
two conditions.6
Note that Strack et al.’s experiment is itself a study of
enactment, revolving as it does around the question of
whether enacting an emotion by expressing it creates the
reality of that emotion. Correspondingly, the report dwells
extensively on how to direct the performance of the
subjects. The description of the experimental procedure
is lengthy, including the precise wording of the cover
story, the instructions that the subjects received about
how to hold the pen (illustrated with two photographs),
what type of pen was used, and the fact that the four
Gary Larson cartoons that were used had been “prerated
as being moderately funny” (Strack et al., 1988, p. 771).
There was also a pretesting procedure to make sure that
the instructions produced the kind of spontaneous per-
formance that was intended: one in which participants
were not aware of the purpose of the experiment.
These performative aspects of the facial-feedback
experiment, already prominent in the original study,
are further emphasized in the replication. First of all,
great care was taken so that the script of the experiment
reproduced the proper performance of the participants.
Strack provided the original experimental materials and
gave feedback but declined to review the protocol.
Ultimately it was vetted by another researcher with
experience with this experimental task, and it was then
preregistered on OSF. Second, because this replication
study consisted of 17 separate replication experiments
in different labs, the coordinators of the collaboration
made sure that the participating labs received identical,
detailed instructions accompanied by a video of “the
complete 24-step procedure” (Wagenmakers et al.,
2016, p. 919; video available at https://osf.io/spf95/).
Care was taken that translations of the research materi-
als were accurate by having “a separate bilingual
speaker independently translate them back to the
original language” (Wagenmakers et al., 2016, p. 919). Third,
the replication study included several enhancements
of the original experiment intended to improve the
participants’ performance. The participants received
part of their instruction by video (to prevent experimenter-
expectancy effects), and they were filmed while they
were doing the experimental task to check that they
held the pen correctly. Moreover, the researchers took
care to select participants who were unlikely to be
familiar with the original study, so that their perfor-
mance would be spontaneous.
This meticulously staged, precisely choreographed,
17-experiment study produced no statistically discern-
ible difference between the smile condition and the
pout condition. The 17 effect sizes (mean rating differ-
ences between the two conditions) were small, having
a meta-analytic effect size of .03 (Wagenmakers et al.,
2016). It would be a mistake, however, to think that
“nothing” came out of this study. Not only is a null
result still a result (as every statistician would empha-
size), but in terms of performativity, something real was
enacted here, meticulously and abundantly. The 1,894
participants all took a pen in their mouths in either of
two very specific ways, looked at a set of Gary Larson
cartoons, and indicated their level of amusement on a
piece of paper with that pen. On average, these par-
ticipants were moderately amused, whichever way they
held the pen in their mouths. Superficially this reality
is nothing new: The manipulation did not affect amuse-
ment, reality was not transformed. But it was a perfor-
mance that was both richer than the original in terms
of number of actors and their geographical spread, as
well as more homogeneous: Regardless of condition,
everyone acted the same on average. It may not be very
interesting at face value, but it is powerful in its over-
whelming uniformity.
Yet the uniformity was not perfect. Although Wagen-
makers and colleagues had connected the 17 replication
efforts into one singular null result, Strack pointed out
cracks in the uniform facade. He argued that the studies
that had employed nonpsychology students as partici-
pants collectively did have a significant effect, in the
expected direction, possibly because these students
were unaware of the existence of the facial-feedback
effect (Strack, 2016). In general, it seemed significant
to him that nine teams found an effect in one direction
and eight teams an effect in the opposite direction
(Strack, 2017). He also pointed out that filming the
participants during the replication experiments might
have made a difference: The camera could have made
the participants self-conscious about their performance,
inhibiting their amusement (Strack, 2016). This hypoth-
esis has subsequently been tested by Noah et al. (2018),
who found that the presence of a camera indeed elimi-
nated the facial-feedback effect. Wagenmakers & Gronau
(2018) and Gelman (2018), however, expressed reserva-
tions about Noah et al.’s replication study.
To get more clarity, a meta-analysis of 138 facial-
feedback studies was conducted that examined the
overall effect of facial feedback and the influence of 12
moderating variables. There were effects of facial feed-
back on emotional experience, but they tended to be
small and highly variable, for reasons that the meta-
analysis could not elucidate (Coles, Larsen, & Lench,
2019). Contrary to what Noah et al. (2018) found, video
recording the participants hardly made a difference.
Another multilab replication project is now under way
to shed more light on facial feedback and determine
when it should have a reliable effect on emotion (Coles,
March, etal., 2019). The performance of the participants
gets even more attention than in the original study and
the replication by Wagenmakers et al. (2016): Partici-
pants produce facial expressions in three different ways
(including mimicking the expressions of actors “display-
ing prototypical expressions of happiness”; Coles,
March, etal., 2019, p. 7), and rating their own perfor-
mance in four different ways. In three pilot studies,
facial-feedback effects could be reliably produced, but
not with the pen-in-mouth task. It is not clear why. Thus,
the multilab replication effort by Wagenmakers et al. of
the pen-in-mouth study set in motion further discussion
and research that have produced a view of the connec-
tion between facial feedback and emotion that is con-
siderably messier than was the case before 2016. Whereas
Strack saw one general facial-feedback hypothesis con-
firmed by many different studies, the current state of the
field is one of multiple effects that vary in strength for
reasons that are largely unknown, loosely connected by
the fact that they show that facial feedback generally
seems to have a small effect on emotion.
Enacting Variability
Other multilab replication efforts have had a similar
effect of creating “mess.” Many Labs 2, for example,
conducted replications of 28 original findings, using
125 samples with a total of 15,305 participants in 36
countries (R. A. Klein et al., 2018). Only 15 findings could
be replicated. Contrary to the conceptualists’ common
explanation that nonreplications may be due to vari-
ability in the cultural context of the participants or the
expertise of the researchers, for the most part, effects
could either be reproduced or not; lab or sample hardly
mattered. There was, however, some heterogeneity in
the effect sizes, particularly among the effects that were
larger on average. Thus there was still some variability,
but not where conceptualists would expect it, in differ-
ences between labs or cultural contexts. The concep-
tualists’ argument, that “manipulations and measures
often derive their meaning from the historical, social,
and cultural context at a given time” (Stroebe, 2019,
p. 95) and a failure to reproduce an effect in a direct
replication is therefore uninformative, is problematic in
light of these results.
Olsson-Collentine et al. (2020) determined the het-
erogeneity in the sizes of 68 effects produced in pre-
registered, multilab, direct-replication studies and found
it to be small or zero in most cases. In other words, if
you maximize the similarity in procedure, remove
researcher degrees of freedom, but conduct the study
in different labs (or online), in different places and
countries, with different samples, effect sizes tend to
be quite similar. But Olsson-Collentine et al. also note
that for 12 out of 68 effects, heterogeneity was large,
particularly for large effects. Moreover, variability is
restricted here to sample and settings, but most of the
samples were undergraduates, and the (immediate) set-
tings were university labs. There are other potential
sources of variability. Commenting on Many Labs 2,
Srivastava (2018) has argued that that study did not
prove that social behavior is not contextually (histori-
cally, culturally) variable. It shows that there usually
are no hidden moderators lurking in experiments. Psy-
chologists’ efforts at experimental control are usually
successful. That means, Srivastava concluded, that if
you believe in contextual variability, you have to pur-
posely study it, rather than merely draw on it as a pos-
sible explanation of replication failures. Forscher has
similarly stressed that social psychologists need to do
more than their usual “small-ish one-shot experiments
using pallid manipulations of dubious validity” (2018b)
to produce situational influences on behavior. Instead
that may require going out of the lab to “leverage natu-
rally occurring experiments” (2018a) and doing longi-
tudinal studies.
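The heterogeneity reported in such multilab studies is commonly summarized with Cochran’s Q and the I² statistic. As a minimal sketch, assuming fixed-effect inverse-variance pooling and entirely hypothetical effect sizes and sampling variances (these are not the data analyzed by Olsson-Collentine et al., 2020, and their exact estimation procedure is not reproduced here), the calculation might look as follows.

```python
def heterogeneity(effects, variances):
    """Return (pooled effect, Cochran's Q, I-squared) for a set of lab estimates.

    effects   : effect size estimated in each lab (hypothetical values here)
    variances : sampling variance of each estimate, all strictly positive
    """
    if len(effects) != len(variances) or len(effects) < 2:
        raise ValueError("need matching lists with at least two labs")
    if any(v <= 0 for v in variances):
        raise ValueError("sampling variances must be positive")

    # Inverse-variance weights: more precise labs count for more.
    weights = [1.0 / v for v in variances]
    pooled = sum(w * e for w, e in zip(weights, effects)) / sum(weights)

    # Q sums the weighted squared deviations of each lab from the pooled effect.
    q = sum(w * (e - pooled) ** 2 for w, e in zip(weights, effects))
    df = len(effects) - 1

    # I-squared: share of total variation beyond what sampling error predicts,
    # truncated at zero when Q falls below its degrees of freedom.
    i_squared = max(0.0, (q - df) / q) if q > 0 else 0.0
    return pooled, q, i_squared


if __name__ == "__main__":
    same_variance = [0.01] * 5
    similar_labs = [0.10, 0.12, 0.08, 0.11, 0.09]      # labs agree closely
    divergent_labs = [-0.30, 0.50, 0.10, 0.60, -0.40]  # labs disagree
    print(heterogeneity(similar_labs, same_variance))
    print(heterogeneity(divergent_labs, same_variance))
```

In the first hypothetical set the labs differ by no more than sampling error would predict, so I² is truncated to zero. In the second set the pooled effect is the same, but most of the variation between labs exceeds what sampling error alone would produce. A pooled estimate near zero and low heterogeneity are therefore separate questions, which is why the studies discussed above report both.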
Congruent points of view have been put forward
earlier in response to fundamental problems in the
discipline. Consider for example Greenwald et al.’s
(1986) article “Under What Conditions Does Theory
Obstruct Research Progress?” Their diagnosis of the
state of the discipline in the mid-1980s resembles that
put forward by current reformers. They noted that the
academic incentive structure and the publication prac-
tices that psychologists have to work with encourage a
strong confirmation bias in their research practices,
which in turn leads to methodological problems.
“Researchers’ dispositions to confirm hypotheses sup-
port their use of methods that are demonstrably prone
to misinterpretation and, because of that, obstruct sci-
entific progress” (1986, p. 222). Their solution to these
problems was to shift the aim of research from theory
to results. Psychological research should be “condition-
seeking”: Rather than testing theory (or, in practice,
seeking its confirmation), it should look for the condi-
tions under which a psychological phenomenon occurs
(1986, p. 223). In such an approach, theory is an instru-
ment rather than a goal in itself. It gives direction to
the condition-seeking process and keeps it from devolv-
ing into the simple, unstructured accumulation of quali-
fications of the general theory.
Greenwald et al. (1986) were inspired by McGuire’s
contextualism (later renamed perspectivism), according
to which every conceivable hypothesis in psychology
is true in some context, and the research process con-
sists of discovering that context and describing it in
detail. Sharing a contextualist premise that knowledge
emerges in contexts that are dynamic, McGuire’s per-
spectivism then reasons that “all hypotheses are true in
the sense that a reasonably ingenious and persistent
scientist with sufficient resources can always finally
create or find some special context in which the hypoth-
esized relationship obtains” (McGuire, 1986, p. 284).
Thus, any empirical claim “has potential for simulating
its referent adequately in some contexts and from some
perspectives” and “any hypothesis adequately repre-
sents the known from some viewpoints but not from
others” (p. 281). His epistemic guide for expanding and
clarifying hypotheses understands research as a “cre-
ative performance” (p. 293) that exploits rather than
constrains the “revelatory power” (p. 297) of both
empirical and theoretical work. Empirical research does
not test the truth of a theory but aims to develop the
theory by exploring the conditions in which a phenom-
enon occurs. Greenwald et al. (1986) distinguish their
proposal from McGuire’s by saying they go beyond his
ideas “primarily in concluding that theory testing should
often be displaced from its status as a central goal of
research” (p. 226).
A similar emphasis on phenomena and their context
can be found in Paul Rozin’s critique of social psychol-
ogy. Following Solomon Asch, Rozin contended that
social psychology’s attempt to emulate the rigor and
precision of the natural sciences has remained fruitless
because it was not preceded by “an extensive examina-
tion and collection of relevant phenomena and the
description of universal or contingent invariances”
(Rozin, 2001, p. 3). It is useless to test a hypothesis,
however rigorously, if it is not informed by a thorough
exploration of the phenomena of interest.7 Social psy-
chology tries to ascend toward theoretical abstraction
and formalization without a solid grounding in real-
world phenomena.8 Instead, experiments in social psy-
chology are usually oblivious to context, seemingly
transcending “time, location, culture, race, religion, and
social class” (2001, p. 4). Their results are often difficult
to generalize and have no obvious bearing on practical,
everyday problems.
An elaborate call for contextualism was forwarded
in the edited volume Contextualism and Understanding
in the Behavioral Sciences (Rosnow & Georgoudi,
1986a). The editors ground contextualism with the
premise that social reality is active and ongoing; there-
fore, “all knowledge is perennially conceptual and con-
jectural and no method can conclusively demonstrate
the ‘truth’” (Rosnow & Georgoudi, 1986b, p. 4). That
psychology’s facts are indeterminate, however, does not
preclude their empirical scrutiny. Further and impor-
tantly, in this contextualist perspective, context is not
an “independent ontological entity” for context and act
are integral to each other. And the editors take meth-
odological pluralism as necessary to investigate “the
wider context that ‘allows’ or ‘invites’ the occurrence
of that event and renders it socially intelligible” (p. 5).
Scientific method does not stand outside this contextual
web to detect entities but is itself an active and produc-
tive process. Thus, “Both the products of this process,
as well as the process itself, will reflect the contextual
boundaries in which they operate or develop” (p. 18).
The enactment perspective goes beyond these pro-
posals in its rejection of the discovery metaphor, instead
seeing research as productive of reality. In our opinion,
the current crisis discussion is pointing in precisely this
direction, despite the generally rather traditional philo-
sophical assumptions of both conceptualists and
reformers. The emphasis of the proponents of concep-
tual replication on the variability of human behavior
and on the multiple constituents of psychological phe-
nomena is not incompatible with the attention to pro-
cedural detail of the advocates of direct replication. Our
proposal is not merely to do away with the dichotomy
of direct versus conceptual replication. We agree with
Nosek and Errington (2020) that this distinction is
unhelpful. Because no two studies can be identical, no
replication “exact,” the claim that one study’s methods
replicate another study’s methods requires criteria for
the relevance of differences (Nosek & Errington,
2020). A study replicates another study in some sense,
and that sense is supplied by theory.9 We believe that
social psychology requires a broad spectrum of replica-
tion studies, and that spectrum cannot be neatly divided
into “direct” versus “conceptual.” Most of all, however,
we think social psychology needs to be geared to pro-
ducing multiple psychological realities rather than dis-
covering a single psychological truth. It is a shift from
discovery to technology, from “mirroring to world-
making” (Gergen, 2015, p. 287). It is a shift away from the
seemingly endless proliferation of “functional entities”
that researchers produce and eventually discard as new
ones are introduced (Stam, 2010). Such a scientific pro-
gram would combine an interest in variability with a
focus on concrete effects and the minutiae of their pro-
duction. As the previous few years have made abun-
dantly clear, it is precisely through paying close attention
to whether, when, and how effects are replicated that
the reality performed in social psychology becomes
fragile, variable, and messy. That in turn invites the
consideration of other approaches to research, beyond
the traditional laboratory experiment, and beyond the
search for basic principles of social behavior.
Such a shift suggests the need to reflect on the politi-
cal as well as the pragmatic aspects of psychological
research and the realities it produces. It calls into question
the binary of “basic” and “applied” research that is gen-
erally presumed by both reformers and conceptualists.
If research is no longer conceived of as the discovery
of an objective reality but rather as the generation of
diverse realities, then what realities we choose to bring
into being is a political and ethical as well as a scientific
matter (Law, 2004; MacIntyre, 1985; Stam, 2010). In the
case of the facial-feedback controversy, for example, it
is remarkable that the practical relevance of the effect,
if there is any, is largely ignored in the discussion.10 In
general, we need to pay more attention to the reality
we are making as we are doing our research, talking
about it in TED talks, writing about it in the newspaper,
and using it in our profession, and pay less attention
to the search for a theory that will represent reality.
Conclusion
The crisis literature is densely populated with charges
of bad science, reports of one or another methodologi-
cal deficiency, and multiple, technologically instituted
directives for realizing robust psychological science,
which, in turn, ultimately yield stronger truth claims.
The various debates have produced a bifurcation of
perspectives and the emergence of two prominent
camps. One notably vocal group advocates direct rep-
lication (along with a host of other regulatory mea-
sures) as means to discover phenomena the existence
of which is confirmed through their reproducibility. The
other group advocates what can be understood as plu-
ral methods as a necessary means to discover psycho-
logical phenomena that they take to be dynamic and
highly dependent on context. The focused, ongoing
attention to what constitutes proper methods has often
overshadowed the different ways in which these two
camps think about ontology—about the nature of psy-
chological entities. We suspect that the trenchant meth-
odological disputes and differences in underlying
ontological commitments will not be resolved solely
through empirical work. Instead, a generative and
reparative approach is to understand how research
enacts realities that are generated through rigorous
thinking, technical operations, instruments, trained
judgment, and tact. Experiments perform certain reali-
ties that can be supported or challenged in subsequent
empirical work. So understanding the enactment of
psychological realities underscores the importance of
plural methods and invites reconciliation by providing
a set of questions (what reality is performed here, to
what end, etc.) on which both camps can focus and
that can constructively move them beyond the direct
versus conceptual discussion.
That opposition of direct and conceptual replication,
and the way they are commonly associated with emphases
on permanence and variability, respectively, are
unhelpful. Direct replication is necessary not only to
detect flexibility in methods but also to demonstrate
variability. To determine whether a phenomenon is con-
text sensitive, one must try to produce it using the same
procedure in different contexts. If it does vary, one can
proceed to study this variability with studies that inten-
tionally change this or that aspect of the original study.11
There is no inherent contradiction between rigorous,
precise replication that seeks to control for flexibility
and an interest in the variability of social behavior. We
do think that that variability calls for methodological
pluralism and, above all, for an awareness of the per-
formativity of psychological research. Rather than per-
severing in a quest for stable mechanisms underlying
the variability, it is better to embrace the variable phe-
nomena psychology produces and take responsibility
for them. With this enactment perspective and the con-
sequent understanding of the roles of direct and con-
ceptual replication, psychology’s future would be more
phenomenon-centered and better able to determine
under what conditions phenomena are enacted.
This has implications for the politics and ethics of
psychology. These implications complicate even as they
expand upon Miller’s (1969) long-revered call for “giving
psychology away” (p. 1071) to improve “human wel-
fare.” Psychology’s part in the making of the world, its
ethical and political effects, has been long noticed if
rarely acted upon. Reflecting on the ways that psychol-
ogy makes its objects true (or false), MacIntyre (1985)
called for psychologists’ attention to how “psychology
has changed the human world in the course of interpret-
ing it and created new phenomena in the course of
trying to understand old ones” (p. 902). Psychology’s
effect on culture “has been to foster types of character
and modes of action,” an enormous effect that MacIntyre
suggests raises and extends psychologists’ responsibili-
ties. The ethics attending the realities that psychologists
produce were recently examined in Stam’s (2015) call
for an “ethics of shared understandings” (p. 117) and
Haslam’s (2016) study of “concept creep” (p. 1). As
Haslam concludes his analysis, understanding the driv-
ers of concept creep “and evaluating its costs and ben-
efits are important goals for people who care about
psychology’s place in our cultures. Equally important is
the task of deciding whether the trend should be encour-
aged, ignored, or resisted” (p. 15).
The stakes of electing one ontological perspective
over the other (or others) and thus privileging one
method over others are high. Alternatively, appreciating
psychological research as enacting realities, and appreciating
different methods as potentially producing different
realities, makes way for a genuinely open science,
generative research programs, expanded reflection on
ethics, and ultimately more richly informed, construc-
tive scientific exchanges about the nature of psycho-
logical entities.
Transparency
Action Editor: Adam Cohen
Editor: Laura A. King
Declaration of Conflicting Interests
The author(s) declared that there were no conflicts of
interest with respect to the authorship or the publication
of this article.
ORCID iD
Maarten Derksen https://orcid.org/0000-0003-1572-4709
Acknowledgments
We thank Sara Kamens and Jonna Brenninkmeijer for their
helpful comments on an earlier version of this article and
Brian Nosek and Simine Vazire for their thorough and con-
structive reviews.
Notes
1. Nosek and Errington (2020) have argued that many “concep-
tual” replications are not in fact replications, but generaliza-
tions. They define a replication as a study the outcome of which
is diagnostic with respect to an earlier study, both when the
results confirm the claims of that earlier study and when they
disconfirm them. According to Nosek and Errington (2020),
however, conceptual replications are usually “not designed
such that a failure to replicate would revise confidence in the
original claim” (p. 7), and they are therefore not replications
at all. Crandall and Sherman (2016) are strong proponents of
conceptual replication but do think that failed conceptual rep-
lications should receive more attention than they currently do.
2. To which one could add that the “many analysts” project
itself was also performative because it enacted the variability of
analytical strategies and outcomes in a particular way. A differ-
ent procedure in the project, or a different research question or
data set, might all have resulted in different kinds and levels of
variation between the teams.
3. These realities are not constructed, for there is an important
difference between the notion of social construction and that
of enactment: “the former describes social processes that result
in durable realities, while the latter describes practices in the
here and now that produce ephemeral effects—effects essen-
tially coextensive with the practices that create them” (Woolgar
& Lezaun, 2015, p. 463).
4. For a similar metaphysics, see Barad (2003).
5. Another solution is a so-called multiverse analysis, in which
all raw data are processed in all possible, reasonable ways, and
the resulting set of data sets is statistically analyzed (Steegen
etal., 2016).
6. The study consisted of two experiments; the second tested
several additional hypotheses. Strack et al.’s Experiment 2 is not
discussed here because Wagenmakers et al. did not attempt to
replicate it.
7. The same point was made by Eronen and Bringmann (2021):
“In psychological science, there is not enough knowledge of
robust phenomena to impose sufficient constraints” (p. 780) on
theory development.
8. Van Rooij and Baggio (2021) contend that the real-world
capacities that psychology is about are basically known already
and that what remains is to explain them. We agree with the focus
on the real world, but not that all “capacities” are known.
9. Since the theory is at the same time being tested, this leads
to the “experimenter’s regress,” formulated by Collins (1985):
The theory is both under investigation and is a criterion for a
proper investigation.
10. Strack, however, has mentioned that research into facial feed-
back has led to the development of a treatment for depression
with Botox to suppress frowning. This seems to us an application
of this research that is important to discuss in terms of perfor-
mativity. Coles and Larsen (2021), moreover, contest the quality
of the evidence for the efficacy of Botox in treating depression.
11. See also Nosek et al. (2021), who write that “replications
foster unplanned discovery of potential invalidity when an
apparent replication produces a different result and stimulates
theorizing about why the original and replication studies dif-
fered” (p. 7).
References
Barad, K. (2003). Posthumanist performativity: Toward an
understanding of how matter comes to matter. Signs:
Journal of Women in Culture and Society, 28(3), 801–831.
https://doi.org/10.1086/345321
Bargh, J. A. (2012, March 5). Nothing in their heads. The
natural unconscious blog. Psychology Today. https://rep
licationindex.com/wp-content/uploads/2020/07/bargh-
nothingintheirheads.pdf
Barsalou, L. W. (2016). Situated conceptualization offers a
theoretical account of social priming. Current Opinion
in Psychology, 12, 6–11. https://doi.org/10.1016/j.copsyc
.2016.04.009
Bastian, H. (2017, August 29). Bias in open science advocacy:
The case of article badges for data sharing. Absolutely Maybe.
http://blogs.plos.org/absolutely-maybe/2017/08/29/bias-
in-open-science-advocacy-the-case-of-article-badges-
for-data-sharing/
Baumeister, R. (2016). Charting the future of social psychology
on stormy seas: Winners, losers, and recommendations.
Journal of Experimental Social Psychology, 66, 153–158.
Bavel, J. J. V., Mende-Siedlecki, P., Brady, W. J., & Reinero, D. A.
(2016a). Contextual sensitivity in scientific reproducibility.
Proceedings of the National Academy of Sciences, USA,
113(23), 6454–6459. https://doi.org/10.1073/pnas.15218
97113
Bavel, J. J. V., Mende-Siedlecki, P., Brady, W. J., & Reinero,
D. A. (2016b). Reply to Inbar: Contextual sensitivity helps
explain the reproducibility gap between social and cog-
nitive psychology. Proceedings of the National Academy
of Sciences, USA, 113(34), E4935–E4936. https://doi
.org/10.1073/pnas.1609700113
Blohowiak, B. B., Cohoon, J., de Wit, L., Eich, E., Farach, F. J.,
Hasselman, F., Holcombe, A. O., Humphreys, M., Lewis, M.,
Nosek, B. A., Peirce, J., Spies, J. R., Seto, C., Bowman, S.,
Green, D., Nilsonne, G., Grahe, J., Wykstra, S., Mohr,
A. Hofelich, . . . Lowrey, O. (2013). Badges to acknowl-
edge open practices. https://osf.io/tvyxz/
Cesario, J. (2014). Priming, replication, and the hardest sci-
ence. Perspectives on Psychological Science, 9(1), 40–48.
https://doi.org/10.1177/1745691613513470
Chambers, C. D. (2013). Registered reports: A new publishing
initiative at Cortex. Cortex, 49(3), 609–610. https://doi
.org/10.1016/j.cortex.2012.12.016
Coles, N. A., & Larsen, J. T. (2021). Letter to the editor:
Claims about the effects of botulinum toxin on depres-
sion should raise some eyebrows. Journal of Psychiatric
Research, 140, 551–552. https://doi.org/10.1016/j.jpsy
chires.2021.05.021
Coles, N. A., Larsen, J. T., & Lench, H. C. (2019). A meta-
analysis of the facial feedback literature: Effects of facial
feedback on emotional experience are small and vari-
able. Psychological Bulletin, 145, 610–651. https://doi
.org/10.1037/bul0000194
Coles, N. A., March, D. S., Marmolejo-Ramos, F., Arinze, N. C.,
Ndukaihe, I., Ozdogru, A., Aczel, B., Hajdu, N., Nagy, T.,
Basnight-Brown, D., Ricaurte, D. Z., Francesco, F., Willis, M.,
Pfuhl, G., Gwenaël, K., IJzerman, H., Vezirian, K.,
Banaruee, H., Suarez, I., . . . Liuzza, M. T. (2019). A multi-
lab test of the facial feedback hypothesis by the many
smiles collaboration. PsyArXiv. https://doi.org/10.31234/
osf.io/cvpuw
Collins, H. M. (1985). Changing order: Replication and induc-
tion in scientific practice. Sage.
Crandall, C. S., & Sherman, J. W. (2016). On the scientific
superiority of conceptual replications for scientific prog-
ress. Journal of Experimental Social Psychology, 66, 93–
99. https://doi.org/10.1016/j.jesp.2015.10.002
Daston, L., & Galison, P. (2007). Objectivity. Zone Books.
Dijksterhuis, A. (2014). Welcome back theory! Perspectives
on Psychological Science, 9(1), 72–75. https://doi.org/
10.1177/1745691613513472
Doyen, S., Klein, O., Pichon, C.-L., & Cleeremans, A. (2012).
Behavioral priming: It’s all in the mind, but whose mind?
PLOS ONE, 7(1), Article e29081. https://doi.org/10.1371/
journal.pone.0029081
Earp, B., & Trafimow, D. (2015). Replication, falsification, and
the crisis of confidence in social psychology. Frontiers
in Psychology, 6, Article 621. https://doi.org/10.3389/
fpsyg.2015.00621
Eronen, M. I., & Bringmann, L. F. (2021). The theory cri-
sis in psychology: How to move forward. Perspectives
on Psychological Science, 16(4), 779–788. https://doi
.org/10.1177/1745691620970586
Fabrigar, L. R., & Wegener, D. T. (2016). Conceptualizing and
evaluating the replication of research results. Journal of
Experimental Social Psychology, 66, 68–80. https://doi
.org/10.1016/j.jesp.2015.07.009
Ferguson, C. J., & Heene, M. (2012). A vast graveyard of
undead theories: Publication bias and psychological sci-
ence’s aversion to the null. Perspectives on Psychological
Science, 7(6), 555–561. https://doi.org/10.1177/
1745691612459059
Fiske, S. T. (2016, October 31). A call to change science’s
culture of shaming. APS Observer, 29(9), 5–6. http://www
.psychologicalscience.org/publications/observer/2016/
nov-16/a-call-to-change-sciences-culture-of-shaming.html
Forscher, P. S. [@psforscher] (2018a, November 19). If we
truly want to understand the situational forces, I think
social psychologists need to be willing to leverage natu-
rally [Tweet]. Twitter. https://twitter.com/psforscher/
status/1064738399146393600
Forscher, P. S. [@psforscher] (2018b, November 19). This
vicious combination of an emphasis on situational influ-
ences, a desire for clean inference, and a refusal to
conduct intensive [Tweet]. Twitter. https://twitter.com/
psforscher/status/1064738397569331200
Forstmeier, W., Wagenmakers, E.-J., & Parker, T. H. (2017).
Detecting and avoiding likely false-positive findings–
a practical guide. Biological Reviews, 92, 1941–1968.
https://doi.org/10.1111/brv.12315
Gelman, A. (2018, November 1). Facial feedback: “These find-
ings suggest that minute differences in the experimental
protocol might lead to theoretically meaningful changes in
the outcomes.” Statistical Modeling, Causal Inference, and
Social Science. https://andrewgelman.com/2018/11/01/
facial-feedback-findings-suggest-minute-differences-
experimental-protocol-might-lead-theoretically-meaning-
ful-changes-outcomes/
Gergen, K. J. (2015). From mirroring to world-making:
Research as future forming. Journal for the Theory of
Social Behaviour, 45(3), 287–310. https://doi.org/10.1111/
jtsb.12075
Greenfield, P. M. (2017). Cultural change over time: Why rep-
licability should not be the gold standard in psychologi-
cal science. Perspectives on Psychological Science, 12(5),
762–771. https://doi.org/10.1177/1745691617707314
Greenwald, A. G., Pratkanis, A. R., Leippe, M. R., &
Baumgardner, M. H. (1986). Under what conditions does
theory obstruct research progress? Psychological Review,
93(2), 216–229.
Guenther, K., & Hess, V. (2016). Soul catchers: The mate-
rial culture of the mind sciences. Medical History, 60(3),
301–307. https://doi.org/10.1017/mdh.2016.24
Hacking, I. (1995). The looping effects of human kinds. In
D. Sperber, D. Premack, & A. J. Premack (Eds.), Causal
cognition. A multidisciplinary debate (pp. 351–383).
Clarendon.
Hacking, I. (1999). Social construction of what? Harvard
University Press.
Hacking, I. (2000). How inevitable are the results of success-
ful science? Philosophy of Science, 67, S58–S71. https://
doi.org/10.1086/392809
Hacking, I. (2007). Kinds of people: Moving targets. Proceed-
ings of the British Academy, 151, 285–318. https://doi
.org/10.5871/bacad/9780197264249.003.0010
Hales, A. H. (2016). Does the conclusion follow from the evi-
dence? Recommendations for improving research. Journal
of Experimental Social Psychology, 66, 39–46. https://doi
.org/10.1016/j.jesp.2015.09.011
Hamlin, J. K. (2017). Is psychology moving in the right direc-
tion? An analysis of the evidentiary value movement.
Perspectives on Psychological Science, 12(4), 690–693.
https://doi.org/10.1177/1745691616689062
Haslam, N. (2016). Concept creep: Psychology’s expand-
ing concepts of harm and pathology [Target article].
Psychological Inquiry, 27(1), 1–17. https://doi.org/10
.1080/1047840X.2016.1082418
Hoeppe, G. (2014). Working data together: The accountabil-
ity and reflexivity of digital astronomical practice. Social
Studies of Science, 44(2), 243–270.
Iso-Ahola, S. E. (2017). Reproducibility in psychological sci-
ence: When do psychological phenomena exist? Frontiers
in Psychology, 8, Article 879. https://doi.org/10.3389/
fpsyg.2017.00879
Kerr, N. L. (1998). HARKing: Hypothesizing after the results are
known. Personality and Social Psychology Review, 2(3),
196–217. https://doi.org/10.1207/s15327957pspr0203_4
Klein, R. A., Vianello, M., Hasselman, F., Adams, B. G., Adams,
R. B., Jr., Alper, S., Aveyard, M., Axt, J. R., Babalola, M. T.,
Bahník, Š., Batra, R., Berkics, M., Bernstein, M. J., Berry,
D. R., Bialobrzeska, O., Binan, E. Dami, Bocian, K.,
Brandt, M. J., Busching, R., . . . Nosek, B. A. (2018). Many
Labs 2: Investigating variation in replicability across
samples and settings. Advances in Methods and Practices
in Psychological Science, 1(4), 443–490. https://doi
.org/10.1177/2515245918810225
Klein, S. B. (2014). What can recent replication failures tell
us about the theoretical commitments of psychology?
Theory & Psychology, 24(3), 326–338. https://doi.org/
10.1177/0959354314529616
Langlitz, N. (2015). On a not so chance encounter of neu-
rophilosophy and science studies in a sleep laboratory.
History of the Human Sciences, 28(4), 3–24. https://doi
.org/10.1177/0952695115581576
Latour, B., & Woolgar, S. (1986). Laboratory life: The construc-
tion of scientific facts. Princeton University Press.
Law, J. (2004). After method: Mess in social science research.
Routledge.
Law, J., & Lien, M. E. (2013). Slippery: Field notes in empiri-
cal ontology. Social Studies of Science, 43(3), 363–378.
https://doi.org/10.1177/0306312712456947
LeBel, E. P., Berger, D., Campbell, L., & Loving, T. J. (2017).
Falsifiability is not optional. Journal of Personality and
Social Psychology, 113(2), 254–261. https://doi.org/
10.1037/pspi0000106
Lynch, M., & Woolgar, S. (1990). Representation in scientific
practice (1st ed.). MIT Press.
MacIntyre, A. (1985). How psychology makes itself true-
or false. In S. Koch & D. E. Leary (Eds.), A century of
psychology as science (pp. 897–903). American Psycho-
logical Association.
MacKenzie, D. (2007). Is economics performative? Option
theory and the construction of derivative markets. In
D. MacKenzie, F. Muniesa, & L. Siu (Eds.), Do economists
make markets? (pp. 54–86). Princeton University Press.
MacKenzie, D. A., Muniesa, F., & Siu, L. (Eds.). (2007). Do
economists make markets? On the performativity of eco-
nomics. Princeton University Press.
McGuire, W. J. (1986). A perspectivist looks at contextualism
and the future of behavioral science. In R. Rosnow &
M. Georgoudi (Eds.), Contextualism and understanding
in behavioral science: Implications for research and prac-
tice (pp. 271–302). Praeger.
Miller, G. A. (1969). Psychology as a means of promoting
human welfare. American Psychologist, 24(12), 1063–
1075. https://doi.org/10.1037/h0028988
Mol, A. (2002). The body multiple: Ontology in medical
practice. Duke University Press. http://site.ebrary.com/
id/10198353
Munafò, M. R., Nosek, B. A., Bishop, D. V. M., Button, K. S.,
Chambers, C. D., Sert, N. P. du, Simonsohn, U.,
Wagenmakers, E.-J., Ware, J. J., & Ioannidis, J. P. A. (2017).
A manifesto for reproducible science. Nature Human
Behaviour, 1, 1–9. https://doi.org/10.1038/s41562-016-
0021
Neuroskeptic. (2014, August 31). The replication crisis:
Response to Lieberman. Discover Magazine. http://blogs.
discovermagazine.com/neuroskeptic/2014/08/31/replica
tion-crisis-response-lieberman/
Noah, T., Schul, Y., & Mayo, R. (2018). When both the origi-
nal study and its failed replication are correct: Feeling
observed eliminates the facial-feedback effect. Journal
of Personality and Social Psychology, 114(5), 657–664.
https://doi.org/10.1037/pspa0000121
Nosek, B. A., & Errington, T. M. (2020). What is replica-
tion? PLOS Biology, 18(3), Article e3000691. https://doi
.org/10.1371/journal.pbio.3000691
Nosek, B. A., Hardwicke, T. E., Moshontz, H., Allard, A.,
Corker, K. S., Almenberg, A. D., Fidler, F., Hilgard, J.,
Kline, M., Nuijten, M. B., Rohrer, J. M., Romero, F., Scheel,
A. M., Scherer, L., Schönbrodt, F., & Vazire, S. (2021).
Replicability, robustness, and reproducibility in psycho-
logical science. PsyArXiv. https://doi.org/10.31234/osf
.io/ksfvq
Olsson-Collentine, A., Wicherts, J. M., & van Assen, M. A. L. M.
(2020). Heterogeneity in direct replications in psychol-
ogy and its association with effect size. Psychological
Bulletin, 146(10), 922–940. https://doi.org/10.1037/bul
0000294
Pashler, H., & Harris, C. R. (2012). Is the replicability cri-
sis overblown? Three arguments examined. Perspectives
on Psychological Science, 7(6), 531–536. https://doi.org/
10.1177/1745691612463401
Pashler, H., & Wagenmakers, E.-J. (2012). Editors’ introduc-
tion to the special section on replicability in psycho-
logical science: A crisis of confidence? Perspectives on
Psychological Science, 7(6), 528–530. https://doi.org/
10.1177/1745691612465253
Pickering, A. (1994). Objectivity and the mangle of practice.
In A. Megill (Ed.), Rethinking objectivity (pp. 109–125).
Duke University Press.
Pickering, A. (2010). The cybernetic brain: Sketches of another
future. University of Chicago Press.
Pickersgill, M. (2014). The endurance of uncertainty: Anti-
sociality and ontological anarchy in British psychiatry,
1950–2010. Science in Context, 27(1), 143–175. https://
doi.org/10.1017/S0269889713000410
Popper, K. R. (2002). The logic of scientific discovery (2nd
ed.). Taylor & Francis.
Ranehill, E., Dreber, A., Johannesson, M., Leiberg, S., Sul, S.,
& Weber, R. A. (2015). Assessing the robustness of power
posing: No effect on hormones and risk tolerance in a large
sample of men and women. Psychological Science, 26(5),
653–656. https://doi.org/10.1177/0956797614553946
Reis, H. T., & Lee, K. Y. (2016). Promise, peril, and per-
spective: Addressing concerns about reproducibility in
social–personality psychology. Journal of Experimental
Social Psychology, 66, 148–152. https://doi.org/10.1016/
j.jesp.2016.01.005
Richards, G. (2002). The psychology of psychology. Theory
& Psychology, 12, 7–36.
Rosnow, R. L., & Georgoudi, M. (1986a). Contextualism and
understanding in behavioral science: Implications for
research and theory. Praeger.
Rosnow, R. L., & Georgoudi, M. (1986b). The spirit of con-
textualism. In Contextualism and understanding in
behavioral science: Implications for research and theory
(pp. 3–22). Praeger.
Rozin, P. (2001). Social psychology and science: Some lessons
from Solomon Asch. Personality and Social Psychology
Review, 5(1), 2–14.
Rutherford, A. (2017). Surveying rape: Feminist social sci-
ence and the ontological politics of sexual assault. History
of the Human Sciences, 30(4), 100–123. https://doi.org/
10.1177/0952695117722715
Schnall, S. (2014, May 22). An experience with a regis-
tered replication project. Department of Psychology,
University of Cambridge. https://web.archive.org/web/
20140528045642/http://www.psychol.cam.ac.uk/cece/
blog/
Schwarz, N., & Clore, G. L. (2016). Evaluating psychologi-
cal research requires more than attention to the N: A
comment on Simonsohn’s (2015) “Small Telescopes.”
Psychological Science, 27(10), 1407–1409. https://doi
.org/10.1177/0956797616653102
Shanks, D. R., Newell, B. R., Lee, E. H., Balakrishnan, D.,
Ekelund, L., Cenac, Z., Kavvadia, F., & Moore, C. (2013).
Priming intelligent behavior: An elusive phenomenon.
PLOS ONE, 8(4), Article e56515. https://doi.org/10.1371/
journal.pone.0056515
Silberzahn, R., Uhlmann, E. L., Martin, D. P., Anselmi, P., Aust, F.,
Awtrey, E., Bahník, Š., Bai, F., Bannard, C., Bonnier, E.,
Carlsson, R., Cheung, F., Christensen, G., Clay, R.,
Craig, M. A., Rosa, A. Dalla, Dam, L., Evans, M. H.,
Cervantes, I. Flores, . . . Nosek, B. A. (2018). Many ana-
lysts, one data set: Making transparent how variations in
analytic choices affect results. Advances in Methods and
Practices in Psychological Science, 1(3), 337–356. https://
doi.org/10.1177/2515245917747646
Simmons, J. P., Nelson, L. D., & Simonsohn, U. (2011). False-
positive psychology. Psychological Science, 22(11), 1359–
1366. https://doi.org/10.1177/0956797611417632
Simmons, J. P., Nelson, L. D., & Simonsohn, U. (2012). A 21
word solution. Dialogue. The Official Newsletter of the
Society of Personality and Social Psychology, 26(2), 4–7.
Simons, D. J., Shoda, Y., & Lindsay, D. S. (2017). Constraints
on generality (COG): A proposed addition to all empiri-
cal papers. Perspectives on Psychological Science, 12(6),
1123–1128. https://doi.org/10.1177/1745691617708630
Srivastava, S. (2014a, November 19). Popper on direct replica-
tion, tacit knowledge, and theory construction. The Hardest
Science. https://hardsci.wordpress.com/2014/11/19/
popper-on-direct-replication-tacit-knowledge-and-theory-
construction/
Srivastava, S. (2014b, July 1). Some thoughts on replication
and falsifiability: Is this a chance to do better? The Hardest
Science. https://hardsci.wordpress.com/2014/07/01/
some-thoughts-on-replication-and-falsifiability-is-this-a-
chance-to-do-better/
Srivastava, S. [@hardsci] (2018, November 19). Many Labs 2
looked for evidence of hidden moderators, found vanish-
ingly little. HMs have been suggested as an explanation for
[Thumbnail with link attached] [Tweet]. Twitter. https://
twitter.com/hardsci/status/1064593555690323971
Stam, H. J. (2010). The tradition of personalism and its rela-
tionship to contemporary indeterminate functionalism.
New Ideas in Psychology, 28(2), 143–150. https://doi
.org/10.1016/j.newideapsych.2009.02.004
Stam, H. J. (2015). The historical boundedness of psychologi-
cal knowledge and the ethics of shared understandings.
Journal of Theoretical and Philosophical Psychology,
35(2), 117–127. https://doi.org/10.1037/teo0000018
Steegen, S., Tuerlinckx, F., Gelman, A., & Vanpaemel, W.
(2016). Increasing transparency through a multiverse
analysis. Perspectives on Psychological Science, 11(5),
702–712. https://doi.org/10.1177/1745691616658637
Strack, F. (2016). Reflection on the smiling registered replica-
tion report. Perspectives on Psychological Science, 11(6),
929–930. https://doi.org/10.1177/1745691616674460
Strack, F. (2017). From data to truth in psychological science.
A personal perspective. Frontiers in Psychology, 8, Article
702. https://doi.org/10.3389/fpsyg.2017.00702
Strack, F., Martin, L. L., & Stepper, S. (1988). Inhibiting and
facilitating conditions of the human smile: A nonob-
trusive test of the facial feedback hypothesis. Journal
of Personality and Social Psychology, 54(5), 768–777.
https://doi.org/10.1037/0022-3514.54.5.768
Strack, F., & Stroebe, W. (2018). What have we learned? What
can we learn? Behavioral and Brain Sciences, 41, Article
E151. https://doi.org/10.1017/S0140525X18000870
Stroebe, W. (2016). Are most published social psychological
findings false? Journal of Experimental Social Psycho-
logy, 66, 134–144. https://doi.org/10.1016/j.jesp.2015
.09.017
Stroebe, W. (2019). What can we learn from Many Labs rep-
lications? Basic and Applied Social Psychology, 41(2),
91–103. https://doi.org/10.1080/01973533.2019.1577736
Stroebe, W., Postmes, T., & Spears, R. (2012). Scientific
misconduct and the myth of self-correction in science.
Perspectives on Psychological Science, 7(6), 670–688.
https://doi.org/10.1177/1745691612460687
Stroebe, W., & Strack, F. (2014). The alleged crisis and the
illusion of exact replication. Perspectives on Psychological
Science, 9(1), 59–71. https://doi.org/10.1177/17456916
13514450
van Rooij, I., & Baggio, G. (2021). Theory before the test: How
to build high-verisimilitude explanatory theories in psy-
chological science. Perspectives on Psychological Science,
16(4), 682–697. https://doi.org/10.1177/1745691620970604
Wagenmakers, E.-J., Beek, T., Dijkhoff, L., Gronau, Q. F.,
Acosta, A., Adams, R. B., Albohn, D. N., Allard, E. S.,
Benning, S. D., Blouin-Hudon, E.-M., Bulnes, L. C.,
Caldwell, T. L., Calin-Jageman, R. J., Capaldi, C. A.,
Carfagno, N. S., Chasten, K. T., Cleeremans, A., Connell, L.,
DeCicco, J. M., . . . Zwaan, R. A. (2016). Registered replica-
tion report: Strack, Martin, & Stepper (1988). Perspectives
on Psychological Science, 11(6), 917–928. https://doi.org/
10.1177/1745691616674458
Wagenmakers, E.-J., & Gronau, Q. (2018, May 10). Musings
on preregistration: The case of the facial feedback effect.
Bayesian Spectacles. https://www.bayesianspectacles.org/
musings-on-preregistration/
Wagenmakers, E.-J., Wetzels, R., Borsboom, D., van der Maas,
H. L. J., & Kievit, R. A. (2012). An agenda for purely
confirmatory research. Perspectives on Psychological
Science, 7(6), 632–638. https://doi.org/10.1177/1745691612
463078
Wicherts, J. M., Veldkamp, C. L. S., Augusteijn, H. E. M.,
Bakker, M., van Aert, R. C. M., & van Assen, M. A. L. M.
(2016). Degrees of freedom in planning, running, analyzing,
and reporting psychological studies: A checklist to avoid
p-hacking. Quantitative Psychology and Measurement, 7,
Article 1832. https://doi.org/10.3389/
fpsyg.2016.01832
Wilson, A. (2014, May 26). Psychology’s real replication prob-
lem: Our methods sections. Notes From Two Scientific
Psychologists. http://psychsciencenotes.blogspot.co
.uk/2014/05/psychologys-real-replication-problem.html
Woolgar, S., & Lezaun, J. (2013). The wrong bin bag: A
turn to ontology in science and technology studies?
Social Studies of Science, 43(3), 321–340. https://doi
.org/10.1177/0306312713488820
Woolgar, S., & Lezaun, J. (2015). Missing the (question) mark?
What is a turn to ontology? Social Studies of Science, 45(3),
462–467. https://doi.org/10.1177/0306312715584010
Zwaan, R. A., Etz, A., Lucas, R. E., & Donnellan, M. B.
(2018). Making replication mainstream. Behavioral and
Brain Sciences, 41, Article E120. https://doi.org/10.1017/
S0140525X17001972
... There is ongoing debate about the preference and suitability of conceptual and direct replication attempts (cf. Derksen & Morawski, 2022). Those in favour of conceptual replications tend to argue that the true goal of a replication attempt is to examine a theoretical hypothesis rather than a specific experimental procedure (Stroebe & Strack, 2014) and that a direct replication is often not possible due to contextual sensitivitythat variability in local and/or cultural context can impact an experiment even if procedurally identical (Fabrigar & Wegener, 2016;Van Bavel et al., 2016a, 2016b). ...
Full-text available
Article
Technology education research is a growing field, with the rate of growth increasing over the last 2 decades. As the field grows, it is paramount that credibility is maintained in published findings. To date there is no evidence to suggest a lack trust is warranted, however in the midst of the replication crisis there is need to ensure continued rigour. This article presents a z-curve analysis of the replicability of quantitative research in technology education since 1983 using statcheck for automated data extraction. The results indicate that authors often mis-report p-values, typically due to rounding errors, with a small percentage (1.59%) of inconsistently reported p-values leading to decision errors in terms of statistical inference. With respect to replicability, overall it is estimated that 55.7% of reported quantitative results in technology education would replicate, however since 2020 this estimate appears to be increasing. These results do not indicate specific findings which are likely or unlikely to replicate, but do suggest a need to invest effort in identifying studies which would have a high value in being replicated, particularly in the timeframe of work published from 2010 to 2020.
... Researchers do, however, communicate particular ontological assumptions via their methodological and empirical practices, as well as their linguistic norms (Levy, 2019). These philosophical commitments to specific ontologies -or "what actually exists" -are thus enacted through our research actions (Derksen & Morawski, 2022). The implied ontological commitments can be brought to the fore via a discursive approach (Edwards, 2004). ...
Full-text available
Article
Methodological and empirical questions concerning state self-esteem are contingent upon very specific underlying commitments to “what” state self-esteem and its dynamics actually are. These are questions concerning ontology. These underlying commitments or views about “what actually exists” are not explicit, but enacted through our research actions. It is vital to bring these implicit underlying ontologies to the surface, so that we as researchers can reflect upon them, and on the assumptions that we are communicating and reinforcing with our methodological and empirical practices. In service of a conceptually solid and unambiguous framework of theoretical and methodological approaches to state self-esteem, I aim to lay bare the ontological commitments enacted in current research on state self-esteem. I show that state-self-esteem research forms two different assemblages of practices, which are repertoires of conceptual assumptions, discourse norms, methods of analysis, and operationalizations. One assemblage sketches a narrative of daily self-esteem in mechanistic terms, the other sketches a narrative of daily self-esteem in processual terms. After analyzing how concrete practices enact these ontological commitments, I reflect on how the two research assemblages might converge to benefit research on state self-esteem in the future, emphasizing the need for reflexivity from researchers.
... Nor does this futurist sketch contemplate a rapprochement between the two positions on psychological science. A compromise of some sort certainly is feasible, yet given the robustness of the assemblages of objects, methods, and scientists and given the incommensurability of some of their epistemological tenets, a satisfying rapprochement requires substantive re-imagining (Derksen & Morawski, 2020). More likely is a mapping of separate states of science that selectively permit the challengers' science vision in selective specialty areas while maintaining open science regulations in most. ...
Article
Psychology’s current crisis attends most visibly to perceived problems with statistical models, methods, publication practices, and career incentives. Rarely is close attention given to the objects of inquiry (to ontological matters), yet the crisis-related literature does feature statements about the nature of psychology’s objects. Close analysis of the ontological claims reveals discrepant understandings: some researchers assume objects to be stable and singular while others posit them to be dynamic and complex. Nevertheless, both views presume the objects under scrutiny to be real. The analysis also finds each of these ontological claims to be associated not only with particular method prescriptions but also with distinct notions of the scientific self. Though both take the scientific self to be objective, one figures the scientist as not always a rational actor and, therefore, requiring some behavior regulation, while the other sees the scientist as largely capable of self-governing sustained through painstakingly acquired expertise and self-control. The fate of these prevalent assemblages of object, method, and scientific self remains to be determined, yet as conditions of possibility they portend quite different futures. Following description of the assemblages, the article ventures a futuristic portrayal of the scientific practices they each might engender.
Article
An already pressing need to evidence the effectiveness of futures and foresight tools has been further amplified by the coronavirus pandemic, which highlighted more mainstream tools' difficulty with uncertainty. In light of this, the recent discussion in this journal on providing futures and foresight science with a stronger scientific basis is welcome. In this discussion critical realism has been proffered as a useful philosophical foundation and experiments a useful method for improving this field's scientific basis. Yet, experiments seek to isolate specific causal effects through closure (i.e., by controlling for all extraneous factors) and this may cause it to jar with critical realism's emphasis on uncertainty and openness. We therefore extend the recent discussion on improving the scientific basis of futures and foresight science by doing three things. First, we elaborate on critical realism and why the experimental method may jar with it. Second, we explain why the distinction between a conceptual and a direct replication can help overcome this jarring, meaning experiments can still be a valuable research tool for a futures and foresight science underpinned by critical realism. Third, we consider the appropriate unit of analysis for experiments on futures and foresight tools. In so doing, we situate the recent discussion on improving the scientific basis of futures and foresight science within the much longer running one on improving the scientific basis of business, management and strategy research more broadly. We use the case of scenario planning to illustrate our argument in relation to futures and foresight science.
Article
Political and cultural polarisation are leading explanations for climate change denial and inaction as seen in the Cultural Cognition Thesis (CCT). In this view, individuals hold positions on contested issues to conform to their ideological groups: people subscribe to certain beliefs, not to express what they know but to show their group identity. We present a conceptual test of the CCT using high-quality cross-national data from 21 European countries, Russia, and Israel (total N = 44,378). Climate change concern was correlated with identification with the political left (rs = 0.04–.13), egalitarianism (rs = 0.04–.13) and communitarianism (rs = 0.01–.07), but in a broad definition cultural cognition was a weak predictor of climate change beliefs (R² = 3.82%), policy preferences (R² = 2.09%), and actions (R² = 0.62%). Moreover, climate change polarisation was not greatest among the highly educated as predicted by the CCT. Education was positively associated with climate beliefs (rs = 0.07–.17), irrespective of political affiliation. Non-linear regressions indicated little evidence that the CCT's predictions held better for more extreme ideological groups. These results suggest cultural cognition may not be central to thoughts about climate change in Europe.
Article
What factors influence how people perceive the risk of getting COVID-19? Extending beyond features of general health conditions, media coverage, and genetic susceptibility to disease, the present research investigates whether the immediacy of experience with temperature, a subtle yet pervasive environmental factor, can affect people's estimation of contagion probability. According to the attribute substitution model, people may rely on the visceral experience of coldness, a far easier quantity to evaluate, to estimate the contagion probability of the new coronavirus disease. Study 1 found that Chinese university students who perceived the indoor temperature to be lower believed that the coronavirus was more infectious. To provide causal evidence for the effect, Study 2 randomly assigned participants to different conditions. The results showed that participants in the cold condition reported a higher likelihood of contracting the coronavirus than participants in the control condition. Overall, these findings are consistent with the attribute substitution model: people tend to recruit simpler and more accessible information (e.g., local temperature) in place of more diagnostic but less tangible information (e.g., scientific data) in assessing the risk of disease transmission. Theoretical contributions and the significance of this research for policy makers are discussed.
Article
Meehl argued in 1978 that theories in psychology come and go, with little cumulative progress. We believe that this assessment still holds, as also evidenced by increasingly common claims that psychology is facing a “theory crisis” and that psychologists should invest more in theory building. In this article, we argue that the root cause of the theory crisis is that developing good psychological theories is extremely difficult and that understanding the reasons why it is so difficult is crucial for moving forward in the theory crisis. We discuss three key reasons based on philosophy of science for why developing good psychological theories is so hard: the relative lack of robust phenomena that impose constraints on possible theories, problems of validity of psychological constructs, and obstacles to discovering causal relationships between psychological variables. We conclude with recommendations on how to move past the theory crisis.
Article
Drawing on the philosophy of psychological explanation, we suggest that psychological science, by focusing on effects, may lose sight of its primary explananda: psychological capacities. We revisit Marr’s levels-of-analysis framework, which has been remarkably productive and useful for cognitive psychological explanation. We discuss ways in which Marr’s framework may be extended to other areas of psychology, such as social, developmental, and evolutionary psychology, bringing new benefits to these fields. We then show how theoretical analyses can endow a theory with minimal plausibility even before contact with empirical data: We call this the theoretical cycle. Finally, we explain how our proposal may contribute to addressing critical issues in psychological science, including how to leverage effects to understand capacities better.
Article
We examined the evidence for heterogeneity (of effect sizes) when only minor changes to sample population and settings were made between studies and explored the association between heterogeneity and average effect size in a sample of 68 meta-analyses from 13 preregistered multilab direct replication projects in social and cognitive psychology. Among the many examined effects, examples include the Stroop effect, the "verbal overshadowing" effect, and various priming effects such as "anchoring" effects. We found limited heterogeneity; 48/68 (71%) meta-analyses had nonsignificant heterogeneity, and most (49/68; 72%) were most likely to have zero to small heterogeneity. Power to detect small heterogeneity (as defined by Higgins, Thompson, Deeks, & Altman, 2003) was low for all projects (mean 43%), but good to excellent for medium and large heterogeneity. Our findings thus show little evidence of widespread heterogeneity in direct replication studies in social and cognitive psychology, suggesting that minor changes in sample population and settings are unlikely to affect research outcomes in these fields of psychology. We also found strong correlations between observed average effect sizes (standardized mean differences and log odds ratios) and heterogeneity in our sample. Our results suggest that heterogeneity and moderation of effects are unlikely for a 0 average true effect size, but increasingly likely for larger average true effect sizes.
Article
Credibility of scientific claims is established with evidence for their replicability using new data. According to common understanding, replication is repeating a study’s procedure and observing whether the prior finding recurs. This definition is intuitive, easy to apply, and incorrect. We propose that replication is a study for which any outcome would be considered diagnostic evidence about a claim from prior research. This definition reduces emphasis on operational characteristics of the study and increases emphasis on the interpretation of possible outcomes. The purpose of replication is to advance theory by confronting existing understanding with new evidence. Ironically, the value of replication may be strongest when existing understanding is weakest. Successful replication provides evidence of generalizability across the conditions that inevitably differ from the original study; unsuccessful replication indicates that the reliability of the finding may be more constrained than recognized previously. Defining replication as a confrontation of current theoretical expectations clarifies its important, exciting, and generative role in scientific progress.
Article
Several hundred research groups attempted replications of published effects in so-called Many Labs studies involving thousands of research participants. Given this enormous investment, it seems timely to assess what has been learned and what can be learned from this type of project. My evaluation addresses four questions: First, do these replication studies inform us about the replicability of social psychological research? Second, can replications detect fraud? Third, does the failure to replicate a finding indicate that the original result was wrong? Finally, do these replications help to support or disprove any social psychological theories? Although evidence of replication failures resulted in important methodological changes, the 2015 Open Science Collaboration findings sufficed to make the point. To assess the state of social psychology, we have to evaluate theories rather than randomly selected research findings.
Article
Replication—an important, uncommon, and misunderstood practice—is gaining appreciation in psychology. Achieving replicability is important for making research progress. If findings are not replicable, then prediction and theory development are stifled. If findings are replicable, then interrogation of their meaning and validity can advance knowledge. Assessing replicability can be productive for generating and testing hypotheses by actively confronting current understandings to identify weaknesses and spur innovation. For psychology, the 2010s might be characterized as a decade of active confrontation. Systematic and multi-site replication projects assessed current understandings and observed surprising failures to replicate many published findings. Replication efforts highlighted sociocultural challenges such as disincentives to conduct replications and a tendency to frame replication as a personal attack rather than a healthy scientific practice, and they raised awareness that replication contributes to self-correction. Nevertheless, innovation in doing and understanding replication and its cousins, reproducibility and robustness, has positioned psychology to improve research practices and accelerate progress. Expected final online publication date for the Annual Review of Psychology, Volume 73 is January 2022. Please see http://www.annualreviews.org/page/journal/pubdates for revised estimates.
Preprint
Replication, an important, uncommon, and misunderstood practice, is making a comeback in psychology. Achieving replicability is a necessary but not sufficient condition for making research progress. If findings are not replicable, then prediction and theory development are stifled. If findings are replicable, then interrogation of their meaning and validity can advance knowledge. Assessing replicability can be productive for generating and testing hypotheses by actively confronting current understanding to identify weaknesses and spur innovation. For psychology, the 2010s might be characterized as a decade of active confrontation. Systematic and multi-site replication projects assessed current understanding and observed surprising failures to replicate many published findings. Replication efforts also highlighted sociocultural challenges, such as disincentives to conduct replications, framing of replication as personal attack rather than healthy scientific practice, and headwinds for replication contributing to self-correction. Nevertheless, innovation in doing and understanding replication, and its cousins, reproducibility and robustness, have positioned psychology to improve research practices and accelerate progress.