The Road Less Travelled:
Understanding Adversaries Is Hard but Smarter than Ignoring Them
Cory J Clark
University of Pennsylvania
University of Virginia
Philip E Tetlock
University of Pennsylvania
Name: Cory Clark
Address: 425 S. University Ave, Stephen A. Levin Bldg.
Philadelphia, PA, 19104-6241
Word count: 2,067
Forthcoming in Journal of Applied Research in Memory and Cognition
© 2022, American Psychological Association. This paper is not the copy of record and may
not exactly replicate the final, authoritative version of the article. Please do not copy or cite
without authors' permission. The final article will be available, upon publication, via its DOI.
Our target article proposed that normalizing adversarial collaborations (ACs) will
catalyze progress in the behavioral sciences (Clark et al., 2022). ACs require scholars to state
their own positions precisely, address the real (not caricatured) version of their opponents’
claims, and work with their adversary to design studies that all parties agree constitute fair tests
(rather than carefully crafting studies likely to confirm their preferred hypotheses).
We welcome this opportunity to respond to seven commentaries by distinguished
scholars, who mostly agreed that ACs are a good idea in principle but highlighted the practical
difficulties of changing norms. They also provided numerous recommendations for how to
change norms in the behavioral sciences and better incentivize ACs. We can respond to only a
fraction of the many insightful points made in these commentaries, but we encourage curious
scholars to read all of them. Below, we identify themes running through the discussions—and
our grounds for optimism that, although ACs are challenging, the tipping point may be closer
than we think given the likely benefits from ACs.
Epistemic Cost-Benefit Analysis: Adversarial Collaborations Are (Usually) Worth the Costs
As all commentators note, some ACs will be a lot more arduous than others, but nearly
all will be more challenging than the traditional approach of ignoring, marginalizing, or
derogating critics (Ceci & Williams, 2022; Cowan, 2022; Marsh, 2022; Melloni, 2022; Tatlidil &
Sloman, 2022; Tullett, 2022; Vlasceanu et al., 2022). So, do the benefits of ACs outweigh the costs?
Given the resources required and the potential for interpersonal awkwardness, ACs will
only pass the cost-benefit test if they accelerate scientific progress. Focusing on these costs,
Vlasceanu and colleagues (2022) predict that ACs will remain niche exercises, absent big
changes in professional incentives. Cowan (2022), an experienced adversarial collaborator,
commented that clear resolutions of disputes are rare and that fellow scholars may prefer cleaner and simpler reports over complex and nuanced (though more accurate) ones. Tatlidil and
Sloman (2022, p. XX) went further: “It is already exceedingly difficult to separate the academic
wheat from the chaff, and adding a torrent of inconclusive adversarial collaborations will not
help.” These are legitimate criticisms. Most editors, reviewers, and readers may well prefer tidy
results. But in an ideal world, science would prioritize publishing the most reliable and valid
conclusions, regardless of their complexity.
Truth in the behavioral sciences is always provisional, with updates to our understanding
hard-won and rarely accepted by all (Ceci & Williams, 2022; Melloni, 2022; Tullett, 2022).
Traditional and social media, as well as agenda-driven elites, already package tidy, one-sided
science. Why encourage scholars to do the same? Our concern is that there is already too much
encouragement in this direction. Top journals dislike messy results, institutions evaluate faculty
by their ability to publish in those journals, and literature reviews often downplay complexities
in favor of neat and simple narratives.
Vlasceanu and colleagues (2022) recommend that adversarial collaborators use registered
reports to lock in publications before ACs uncover messy results that might trigger rejection.
This recommendation will work well for ACs testing straightforward differences of opinion, but
some ACs require numerous iterations of negotiations as data come in. Registered reports may
lack the requisite flexibility. Cowan (2022) urges editors not to expect bold conclusions from
ACs as a rule. We agree—but this expectation should apply to all research that engages scholarly
debates. If well-crafted ACs rarely yield bold conclusions, should we not worry all the more
about bold conclusions emerging from studies that side-step critics?
In their commentary, Ceci and Williams (2022) describe their own adversarial efforts,
some successful and others less so. The collaboration on the validity of repressed memories
dissolved into separate, conflicting reports and rebuttals, plus accusations of selective reporting and interpretation (Alpert et al., 1998a, 1998b; Ornstein et al., 1998a, 1998b, 1998c). Although
outsiders might see repressed memories as a technical issue, the science has been applied in
high-profile courtroom cases of childhood abuse. Errors in either direction are of deep moral
significance: either discounting testimony of true survivors or incarcerating innocent people.
Although the topic’s applied significance has rendered it woefully contentious, had the
adversarial collaboration succeeded, it could have been one of the most impactful papers of its
time. This story highlights both the potential value of ACs and a key challenge: ACs may be
least feasible in domains with direct applications to morally and politically charged issues, but these
domains are also those that would most benefit from rigorous dispute resolution.
Given the difficulty of initiating and carrying out ACs on contentious topics, scholars
might understandably gravitate toward ACs focused on lower-controversy issues (e.g., Tetlock &
Mitchell, 2009), which would be unfortunate. Controversial research often entails high stakes
and is particularly important to get right. Tullett (2022) describes the challenges of using ACs to
resolve contentious moral-political debates. We agree that ACs, like other social science
methodologies, are ill-equipped to resolve disputes over values, but ACs are useful for advancing
empirical debates over facts relevant to moral issues. For example, debates over the propriety of
using base rates in decision-making can be informed by data on the costs and benefits of using or not using them, and on who incurs the costs and receives the benefits (a toy numerical illustration follows this paragraph). Such empirical information
can lead to more informed and honest policy debates in which each side acknowledges the
tradeoffs (Hammond & Adelman, 1976). And we suspect adversarial collaborators are more
likely to acknowledge the full range of costs and benefits than to selectively highlight the most convenient ones (see Ceci and Williams's, 2022, discussion of framing problems). Given
that the adversaries jointly write up successful ACs, conclusions drawn about policy implications
are likely to be circumspect and respectful of the limitations of the data. ACs should, in
principle, be especially useful for contentious topics.
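To make the base-rate point concrete, consider a minimal, hypothetical calculation (our illustration, not an analysis from Hammond and Adelman, 1976, or from any of the commentaries): a screening judgment with a 2% base rate, a 90% hit rate, and a 5% false-alarm rate. The short Python sketch below shows how sharply the warranted conclusion changes when the base rate is used versus ignored; all numbers are invented for illustration.

# Illustrative only: a toy Bayes-rule calculation showing how much a judgment
# changes when a base rate is used versus ignored. All numbers are hypothetical.

def posterior(base_rate, hit_rate, false_alarm_rate):
    """P(condition | positive signal) via Bayes' theorem."""
    true_positives = hit_rate * base_rate
    false_positives = false_alarm_rate * (1.0 - base_rate)
    return true_positives / (true_positives + false_positives)

# Using the 2% base rate: roughly a 27% chance the positive signal is correct.
with_base_rate = posterior(base_rate=0.02, hit_rate=0.90, false_alarm_rate=0.05)

# Ignoring the base rate amounts to treating both possibilities as equally likely
# a priori, which inflates the apparent accuracy to roughly 95%.
ignoring_base_rate = posterior(base_rate=0.50, hit_rate=0.90, false_alarm_rate=0.05)

print(f"With the 2% base rate:  {with_base_rate:.2f}")
print(f"Ignoring the base rate: {ignoring_base_rate:.2f}")

Whether the extra false alarms implied by ignoring the base rate are an acceptable cost, and who bears that cost, is exactly the kind of empirical question an AC can put squarely on the table.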
Tullett (2022) and Cowan (2022) suggest that many ACs in Table 2 of our target article
(Clark et al., 2022, p. XX) may be too expansively defined. We see a trade-off. One benefit of an
AC is that it forces the parties to reduce abstract theoretical disputes to their empirically
manageable cores. Indeed, some scientific disagreements might even vanish if participants
avoided making broader claims than their data allow. More generally, as stated by Tullett (p.
XX), “if we are to avoid this slippage, we need to judiciously constrain our expectations of what
adversarial collaborations can tell us. The most politically contentious issues will systematically
be ones where data is only a piece of the puzzle.” We agree and hope these expectations apply to
all research on contentious issues, not only ACs, and that the peer review process winnows out claims that overreach the data.
As stated in our target article, and as reiterated by our experienced commentators (e.g.,
Cowan, 2022; Melloni, 2022), ACs are likelier to uncover boundary conditions than to overthrow
entire frameworks. Some might find this kind of progress too modest to justify the higher costs of ACs.
But in our view, this kind of progress is large in relation to what traditional research often
produces. We prefer small, tentative steps toward convergence over large, confident steps in opposite directions.
Nearly all commentators mentioned that existing research incentives are not aligned with
ACs. One barrier to ACs in the tenure process (pointed out by Marsh, 2022, and Melloni, 2022) is that departments often look for first-authored publications, disincentivizing
the teamwork necessary for asking big questions with complicated answers. And Melloni (2022),
another veteran adversarial collaborator, discusses the need to normalize team science, as other
disciplines, such as physics, have successfully done. We see many benefits. First, a greater
appreciation for team science could allow for more specialization. Currently, behavioral
scientists are expected to have broad knowledge of numerous literatures and methodologies,
formulate important research questions, creatively design procedures for testing those questions,
program studies using relevant software, coordinate and carry out the data collection, perform
sophisticated statistical analyses, and then write a compelling article. Greater specialization
would allow scholars to spend more time doing what they do best. Even specialized roles for
skilled scientific diplomats to facilitate ACs might rise to prominence.
Specialization alone should improve the quality of research by having the best
theoreticians developing theories, the best methodologists designing procedures to test those
theories, and the best statisticians analyzing the resultant data. But team science also makes the
use of questionable research practices (QRPs) more difficult. If scholar A formulates the hypothesis, scholar B collects the data,
and scholar C analyzes the data and writes the results, scholar A has less influence over the final
product. To encourage team research, we must change how we credit scholars for authorship. As
Marsh (2022, p. XX) notes, “adversarial collaborations invite psychologists to rethink how we
personally conceptualize success in our field. What makes for an academic star: number of high
impact publications or finding a scientific truth?... Valuing an academic not for what they can do
alone, but for their ability to work with their greatest skeptics is a new definition of success.”
Tatlidil and Sloman (2022) and Cowan (2022) both expressed concern that incentivizing
ACs could cause scholars to game the system, turning ACs into performative displays or causing
scholars to describe minimally adversarial research as ACs to obtain unearned benefits. These
are legitimate concerns, but properly refereed ACs put more constraints on performative displays
than does research that shuts out critics likely to object to theoretical grandstanding.
Vlasceanu and colleagues (2022) question the severity of certain problems we raised
about peer review, arguing that QRPs and file-drawer practices are less threatening to science than
many believe. The article they cite for the view that “the file drawer problem does not actually
produce significant biases in estimating effects” (Vlasceanu et al., 2022, p. XX) found that plenty
of non-significant relationships are reported in the published literature (Dalton et al., 2012). But
this analysis does not reveal how the proportions of hypothesis-confirming versus disconfirming analyses that were performed compare with their proportions in the published literature. Null
results are not always threats to theories: some analyses are orthogonal to researchers’ main
hypotheses and sometimes researchers hypothesize null results. In a working adversarial
collaboration meta-analysis of 281 meta-analyses published between 2012 and 2021, ~41% of
meta-analyses reported evidence of publication bias (Lu et al., 2022). This could mean that 41%
of conclusions surrounding broad research areas are biased by file drawering and reviewer
rejections of certain conclusions. And, incidentally, differing views on the file-drawer problem
are yet another opportunity for AC.
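For readers unfamiliar with how meta-analyses typically report "evidence of publication bias," one common diagnostic is funnel-plot asymmetry, often assessed with Egger's regression test: standardized effect sizes are regressed on their precision, and an intercept that departs from zero suggests that small, imprecise studies report systematically larger effects. The Python sketch below, using invented study-level effects, illustrates the logic; we are not claiming this is the procedure used by Lu et al. (2022).

# A minimal sketch of Egger's regression test for funnel-plot asymmetry,
# one common way meta-analyses quantify "evidence of publication bias."
# The study-level effects and standard errors below are invented for illustration.
import numpy as np
from scipy import stats

def eggers_test(effects, standard_errors):
    """Regress standardized effects on precision; a nonzero intercept suggests asymmetry."""
    effects = np.asarray(effects, dtype=float)
    se = np.asarray(standard_errors, dtype=float)
    z = effects / se                   # standardized effect sizes
    precision = 1.0 / se               # inverse standard errors
    X = np.column_stack([np.ones_like(precision), precision])
    beta, *_ = np.linalg.lstsq(X, z, rcond=None)   # ordinary least squares
    residuals = z - X @ beta
    n, k = X.shape
    sigma2 = residuals @ residuals / (n - k)
    cov = sigma2 * np.linalg.inv(X.T @ X)
    intercept, intercept_se = beta[0], np.sqrt(cov[0, 0])
    t_stat = intercept / intercept_se
    p_value = 2 * stats.t.sf(abs(t_stat), df=n - k)
    return intercept, t_stat, p_value

effects = [0.42, 0.35, 0.51, 0.28, 0.60, 0.33, 0.47]   # hypothetical effect sizes
ses = [0.10, 0.08, 0.20, 0.05, 0.25, 0.07, 0.15]       # hypothetical standard errors

intercept, t_stat, p_value = eggers_test(effects, ses)
print(f"Egger intercept = {intercept:.2f}, t = {t_stat:.2f}, p = {p_value:.3f}")

When a literature's least precise studies systematically report the largest effects, the intercept drifts away from zero; reports of publication bias in the meta-analyses counted above typically reflect patterns of this general kind.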
Of course, not all QRPs are acknowledged as QRPs. As Ceci and Williams (2022)
pointed out, many accurate interpretations of findings are still misleading because scholars are
free to highlight or ignore different parts of the same information in their framing of the
findings. Misleading-but-not-technically-inaccurate interpretations of findings are not
uncommon (for discussions of examples see Blanton et al., 2009; Clark et al., 2021; Clark et al.,
2022; Clark & Tetlock, 2021; Clark & Winegard, 2020; Dawson & Arkes, 2009; Mitchell &
Tetlock, 2009; Purser & Harper, 2020; Sniderman & Tetlock, 1986; Wright et al., 2021), but
these are not widely regarded as QRPs, nor would we expect many scholars to detect this
tendency in themselves. Whereas Open Science practices can constrain QRPs that are easily
detected with increased transparency, such as unplanned data exclusions and abuses of analyst
degrees of freedom, ACs can help constrain subtler practices such as refusals to run certain tests,
rigging methods, file drawering, and tendentious framing of conclusions.
In a recent talk, Kahneman (2022) described his first experience of adversarial
collaboration with Anne Treisman about forty years ago. Each designed critical tests agreed upon
by the opposing spouse. When confronted with findings they had not predicted a priori,
Kahneman was astonished by how they suddenly gained “15 IQ points” and could now see so
clearly why the studies were flawed from the start. Prior to the AC, they were walking down
Lakatos’s (1976) degenerative path where scholars add auxiliary assumptions to protect their
core hypotheses from unexpected results. Making accurate predictions a priori is much harder
than forming post hoc reasons to dismiss undesirable results. By requiring public commitment to
predictions and nudging adversaries toward a shared understanding, ACs can reroute scholarly
debates onto the progressive Lakatosian path, where modifications improve explanatory
power over time (Melloni, 2022). Our best bet is that ACs will accelerate convergence in
debates—and save resources that would be wasted on post hoc posturing.
Author Contributions
Cory Clark wrote the original draft. Thomas Costello, Gregory Mitchell, and Philip Tetlock
provided many helpful comments and changes.
Funding
This research was funded in part by the Searle Freedom Trust (PD 10080850). The funding
source had no involvement in the research or preparation of the manuscript.
References
Alpert, J. L., Brown, L. S., Ceci, S. J., Courtois, C. A., Loftus, E. F., & Ornstein, P. A. (1998a).
Final conclusions of the American Psychological Association working group on
investigation of memories of childhood abuse. Psychology, Public Policy, and Law, 4(4).
Alpert, J. L., Brown, L. S., & Courtois, C. A. (1998b). Comment on Ornstein, Ceci, and Loftus
(1998): Adult recollections of childhood abuse. Psychology, Public Policy, and Law,
Blanton, H., Jaccard, J., Klick, J., Mellers, B., Mitchell, G., & Tetlock, P. E. (2009). Strong
claims and weak evidence: reassessing the predictive validity of the IAT. Journal of
Applied Psychology, 94(3), 567.
Ceci, S. J. & Williams, W. M. (2022). Viewpoint diversity is essential for scientific teams.
Journal of Applied Research in Memory and Cognition.
Clark, C. J., Costello, T., Mitchell, G., & Tetlock, P. E. (2022). Keep your enemies close:
Adversarial collaborations will improve behavioral science. Journal of Applied Research
in Memory and Cognition.
Clark, C. J., Honeycutt, N., & Jussim, L. (2021). Replicability and the psychology of science. In
S. Lilienfeld, A. Masuda, & W. O’Donohue (Eds.), Questionable Research Practices in
Psychology. New York: Springer.
Clark, C. J., & Tetlock, P. E. (2021). Adversarial collaboration: The next science reform. In C. L.
Frisby, R. E. Redding, W. T. O’Donohue, & S. O. Lilienfeld (Eds.), Political Bias in
Psychology: Nature, Scope, and Solutions. New York: Springer.
Clark, C. J., & Winegard, B. M. (2020). Tribalism in war and peace: The nature and evolution of
ideological epistemology and its significance for modern social science. Psychological
Inquiry, 31(1), 1-22.
Cowan, N. (2022). The Adversarial Collaboration within each of us. Journal of Applied
Research in Memory and Cognition.
Dalton, D. R., Aguinis, H., Dalton, C. M., Bosco, F. A., & Pierce, C. A. (2012). Revisiting the
file drawer problem in meta‐analysis: An assessment of published and nonpublished
correlation matrices. Personnel Psychology, 65(2), 221-249.
Dawson, N. V., & Arkes, H. R. (2009). Implicit bias among physicians. Journal of General
Internal Medicine, 24(1), 137-140.
Hammond, K. R., & Adelman, L. (1976). Science, Values, and Human Judgment: Integration of
facts and values requires the scientific study of human judgment. Science, 194(4263),
Kahneman, D. (2022). Anecdotes of Adversarial Collaboration. Talk presented January 5, 2022
Lakatos, I. (1976). Falsification and the Methodology of Scientific Research Programmes. In S.
G. Harding (Ed.), Can Theories be Refuted? Essays on the Duhem-Quine Thesis (pp.
205-259). Dordrecht: Springer Netherlands.
Lu, L., Crawford, J., Van Bavel, J., Clark, C., & Tetlock, P. (2022, Feb 18). Is the psychology
literature politically biased? An Adversarial Collaboration meta-analysis [Conference
presentation]. SPSP 2022 Annual Convention, San Francisco, CA, United States.
Marsh, J. (2022). Clearing the obstacles to adversarial collaborations for early career researchers.
Journal of Applied Research in Memory and Cognition.
Melloni, L. (2022). On keeping our adversaries close, preventing collateral damage, and
changing our minds. Journal of Applied Research in Memory and Cognition.
Ornstein, P. A., Ceci, S. J., & Loftus, E. F. (1998a). Adult recollections of childhood abuse: Cognitive and developmental perspectives. Psychology, Public Policy, and Law, 4(4).
Ornstein, P. A., Ceci, S. J., & Loftus, E. F. (1998b). Comment on Alpert, Brown, and Courtois
(1998): The science of memory and the practice of psychotherapy. Psychology, Public
Policy, and Law, 4(4), 996-1010.
Ornstein, P. A., Ceci, S. J., & Loftus, E. F. (1998c). More on the repressed memory debate: A
reply to Alpert, Brown, and Courtois (1998). Psychology, Public Policy, and Law, 4(4),
Purser, H., & Harper, C. A. (2020). Low system justification drives ideological differences in
joke perception: A critical commentary and re-analysis of Baltiansky et al. (2020).
Serra-Garcia, M., & Gneezy, U. (2021). Nonreplicable publications are cited more than
replicable ones. Science Advances, 7(21), eabd1705.
Sniderman, P. M., & Tetlock, P. E. (1986). Symbolic racism: Problems of motive attribution in
political analysis. Journal of Social Issues, 42(2), 129-150.
Tatlidil, S. & Sloman, S. (2022). Some collaborations just aren’t worth it. Journal of Applied
Research in Memory and Cognition.
Tetlock, P. E., & Mitchell, G. (2009). Implicit bias and accountability systems: What must
organizations do to prevent discrimination?. Research in Organizational Behavior, 29, 3-
Tullett, A. M. (2022). Adversarial collaborations won’t solve society’s moral debates. Journal of
Applied Research in Memory and Cognition.
Vlasceanu, M., Reinero, D. A., & Van Bavel, J. J. (2022). Adversarial Collaborations in
behavioral science: Benefits and boundary conditions. Journal of Applied Research in
Memory and Cognition.
Wright, J. D., Goldberg, Z., Cheung, I., & Esses, V. M. (2021). Clarifying the meaning of
symbolic racism. Unpublished manuscript.