
Abstract

Open Science initiatives such as preregistration, publicly available procedures and data, and power analyses have rightly been lauded for increasing the reliability of findings. However, a potentially equally important initiative—aimed at increasing the validity of science—has largely been ignored. Adversarial collaborations (ACs) refer to team science in which members are chosen to represent diverse (and even contradictory) perspectives and hypotheses, with or without a neutral team member to referee disputes. Here, we provide background about ACs and argue that they are effective, essential, and underutilized. We explain how and why ACs can enhance both the reliability and validity of science and why their benefit extends beyond the realm of team science to include venues such as fact-checking, wisdom of crowds, journal reviewing, and sequential editing. Improving scientific validity would increase the efficacy of policy and interventions stemming from behavioral science research, and over time, it could help salvage the reputation of our discipline because its products would be perceived as resulting from a serious, open-minded consideration of diverse views.
American Psychologist
Adversarial Collaboration: An Undervalued Approach in Behavioral Science
Stephen J. Ceci, Cory J. Clark, Lee Jussim, and Wendy M. Williams
Online First Publication, August 15, 2024. https://dx.doi.org/10.1037/amp0001391
CITATION
Ceci, S. J., Clark, C. J., Jussim, L., & Williams, W. M. (2024). Adversarial collaboration: An undervalued approach in behavioral science. American Psychologist. Advance online publication. https://dx.doi.org/10.1037/amp0001391
REFERENCES
Article
We identify points of conflict and consensus regarding (a) controversial empirical claims and (b) normative preferences for how controversial scholarship—and scholars—should be treated. In 2021, we conducted qualitative interviews (n = 41) to generate a quantitative survey (N = 470) of U.S. psychology professors’ beliefs and values. Professors strongly disagreed on the truth status of 10 candidate taboo conclusions: For each conclusion, some professors reported 100% certainty in its veracity and others 100% certainty in its falsehood. Professors more confident in the truth of the taboo conclusions reported more self-censorship, a pattern that could bias perceived scientific consensus regarding the inaccuracy of controversial conclusions. Almost all professors worried about social sanctions if they were to express their own empirical beliefs. Tenured professors reported as much self-censorship and as much fear of consequences as untenured professors, including fear of getting fired. Most professors opposed suppressing scholarship and punishing peers on the basis of moral concerns about research conclusions and reported contempt for peers who petition to retract papers on moral grounds. Younger, more left-leaning, and female faculty were generally more opposed to controversial scholarship. These results do not resolve empirical or normative disagreements among psychology professors, but they may provide an empirical context for their discussion.
Article
We propose a friendly amendment to integrative experiment design (IED), adversarial-collaboration IED, that incentivizes research teams from competing theoretical perspectives to identify zones of the design space where they possess an explanatory edge. This amendment is especially critical in debates that have high policy stakes and carry a strong normative-political charge that might otherwise prevent the free exchange of ideas.
Article
A preregistered meta-analysis, including 244 effect sizes from 85 field audits and 361,645 individual job applications, tested for gender bias in hiring practices in female-stereotypical, gender-balanced, and male-stereotypical jobs from 1976 to 2020. A “red team” of independent experts was recruited to increase the rigor and robustness of our meta-analytic approach. A forecasting survey further examined whether laypeople (n = 499 nationally representative adults) and scientists (n = 312) could predict the results. Forecasters correctly anticipated reductions in discrimination against female candidates over time. However, both scientists and laypeople overestimated the continuation of bias against female candidates. Instead, selection bias in favor of male over female candidates was eliminated and, if anything, slightly reversed in sign starting in 2009 for mixed-gender and male-stereotypical jobs in our sample. Forecasters further failed to anticipate that discrimination against male candidates for stereotypically female jobs would remain stable across the decades.
Article
Fazio, Sanbonmatsu, Powell, and Kardes (1986) demonstrated that subjects were able to evaluate adjectives more quickly when these adjectives were immediately preceded (primed) by attitude objects of similar valence, compared with when these adjectives were primed by attitude objects of opposite valence. Moreover, this effect obtained primarily for attitude objects toward which subjects were presumed to hold highly accessible attitudes, as indexed by evaluation latency. The present research explored the generality of these findings across attitude objects and across procedural variations. The results of 3 experiments indicated that the automatic activation effect is a pervasive and relatively unconditional phenomenon. It appears that most evaluations stored in memory, for social and nonsocial objects alike, become active automatically on the mere presence or mention of the object in the environment.
Article
We synthesized the vast, contradictory scholarly literature on gender bias in academic science from 2000 to 2020. In the most prestigious journals and media outlets, which influence many people's opinions about sexism, bias is frequently portrayed as an omnipresent factor limiting women's progress in the tenure-track academy. Claims and counterclaims regarding the presence or absence of sexism span a range of evaluation contexts. Our approach relied on a combination of meta-analysis and analytic dissection. We evaluated the empirical evidence for gender bias in six key contexts in the tenure-track academy: (a) tenure-track hiring, (b) grant funding, (c) teaching ratings, (d) journal acceptances, (e) salaries, and (f) recommendation letters. We also explored the gender gap in a seventh area, journal productivity, because it can moderate bias in other contexts. We focused on these specific domains, in which sexism has most often been alleged to be pervasive, because they represent important types of evaluation, and the extensive research corpus within these domains provides sufficient quantitative data for comprehensive analysis. Contrary to the omnipresent claims of sexism in these domains appearing in top journals and the media, our findings show that tenure-track women are at parity with tenure-track men in three domains (grant funding, journal acceptances, and recommendation letters) and are advantaged over men in a fourth domain (hiring). For teaching ratings and salaries, we found evidence of bias against women; although gender gaps in salary were much smaller than often claimed, they were nevertheless concerning. Even in the four domains in which we failed to find evidence of sexism disadvantaging women, we nevertheless acknowledge that broad societal structural factors may still impede women's advancement in academic science. Given the substantial resources directed toward reducing gender bias in academic science, it is imperative to develop a clear understanding of when and where such efforts are justified and of how resources can best be directed to mitigate sexism when and where it exists.
Article
As the practice of preregistration becomes more common, researchers need guidance on how to report deviations from their preregistered statistical analysis plan. A principled approach to the use of preregistration should not treat all deviations as problematic. Deviations from a preregistered analysis plan can both reduce and increase the severity of a test, as well as increase the validity of inferences. I provide examples of how researchers can present deviations from preregistrations and evaluate the consequences of the deviation when encountering 1) unforeseen events, 2) errors in the preregistration, 3) missing information, 4) violations of untested assumptions, and 5) falsification of auxiliary hypotheses. The current manuscript aims to provide a principled approach to deciding when to deviate from a preregistration and how to report such deviations, drawing on an error-statistical philosophy grounded in methodological falsificationism. The goal is to help researchers reflect on the consequences of deviations from preregistrations by evaluating the test’s severity and the validity of the inference.
Article
The spread of misinformation is a pressing societal challenge. Prior work shows that shifting attention to accuracy increases the quality of people’s news-sharing decisions. However, researchers disagree on whether accuracy-prompt interventions work for U.S. Republicans/conservatives and whether partisanship moderates the effect. In this preregistered adversarial collaboration, we tested this question using a multiverse meta-analysis (k = 21; N = 27,828). In all 70 models, accuracy prompts improved sharing discernment among Republicans/conservatives. We observed significant partisan moderation for single-headline “evaluation” treatments (a critical test for one research team) such that the effect was stronger among Democrats than Republicans. However, this moderation was not consistently robust across different operationalizations of ideology/partisanship, exclusion criteria, or treatment type. Overall, we observed significant partisan moderation in 50% of specifications (all of which were considered critical for the other team). We discuss the conditions under which moderation is observed and offer interpretations.
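For readers unfamiliar with the multiverse approach, the minimal Python sketch below illustrates its core logic: enumerate every combination of defensible analytic choices, re-estimate the pooled effect under each specification, and count how often the result is significant. The data, specification names, and simple fixed-effect pooling here are hypothetical simplifications for illustration, not the authors' actual pipeline.

    # Illustrative multiverse meta-analysis: enumerate analytic choices,
    # pool the effect under each specification, tally significant results.
    # All data and specification names are hypothetical placeholders.
    import itertools
    import math
    import random

    random.seed(1)

    # Hypothetical per-study records: an effect size, its sampling variance,
    # and study-level attributes the analytic choices below condition on.
    studies = [
        {
            "g": random.gauss(0.10, 0.15),  # effect size (e.g., Hedges' g)
            "var": 0.01,                    # sampling variance
            "ideology_measure": random.choice(["party_id", "lib_con_scale"]),
            "treatment": random.choice(["evaluation", "importance"]),
            "passed_attention_check": random.choice([True, False]),
        }
        for _ in range(21)  # k = 21 studies, matching the abstract's k
    ]

    # The "multiverse": every combination of defensible analytic choices.
    specifications = list(itertools.product(
        ["party_id", "lib_con_scale"],        # how partisanship is coded
        ["evaluation", "importance", "any"],  # which treatment type counts
        [True, False],                        # drop failed attention checks?
    ))

    def pool_fixed_effect(subset):
        """Inverse-variance-weighted pooled effect and its z statistic."""
        weights = [1.0 / s["var"] for s in subset]
        pooled = sum(w * s["g"] for w, s in zip(weights, subset)) / sum(weights)
        se = math.sqrt(1.0 / sum(weights))
        return pooled, pooled / se

    n_significant = 0
    for measure, treatment, drop_failed in specifications:
        subset = [
            s for s in studies
            if s["ideology_measure"] == measure
            and (treatment == "any" or s["treatment"] == treatment)
            and (s["passed_attention_check"] or not drop_failed)
        ]
        if len(subset) < 2:
            continue  # too few studies under this specification
        pooled, z = pool_fixed_effect(subset)
        if abs(z) > 1.96:
            n_significant += 1

    print(f"{n_significant} of {len(specifications)} specifications significant")

The abstract's 70 models imply a much richer grid of analytic choices than the 12 combinations sketched here; only the enumerate-and-count logic carries over.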
Article
Using a sample of 228 psychology papers that failed to replicate, we tested whether citation trajectories change after a failure to replicate is published. Across models, we found consistent evidence that failing to replicate predicted lower future citations and that the size of this reduction increased over time. In a 14-year postpublication period, we estimated that the publication of a failed replication was associated with an average citation decline of 14% for original papers. These findings suggest that the publication of failed replications may contribute to a self-correcting science by decreasing scholars' reliance on unreplicable original findings.
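As an illustration only (the abstract does not specify the authors' models), one common way to estimate such a trajectory is a Poisson regression of yearly citation counts on an indicator for the post-failure period and the years elapsed since the failed replication. The sketch below simulates data with a built-in compounding decline and recovers it; the 14% figure is borrowed from the abstract purely as an illustrative rate, and every other name and number is hypothetical.

    # Hedged sketch of a citation-trajectory model around a failed
    # replication. Synthetic data; not the original study's specification.
    import numpy as np
    import pandas as pd
    import statsmodels.formula.api as smf

    # Hypothetical setup: 200 papers tracked for 14 years; half receive a
    # failed replication in year 5. None of these choices come from the paper.
    rng = np.random.default_rng(0)
    rows = []
    for paper in range(200):
        failed = paper < 100
        for year in range(14):
            post = failed and year >= 5
            years_since = year - 5 if post else 0
            lam = 10.0 * (0.86 ** years_since)  # illustrative 14%/yr decline
            rows.append({
                "post_failure": int(post),
                "years_since_failure": years_since,
                "citations": rng.poisson(lam),
            })
    df = pd.DataFrame(rows)

    # Poisson regression: 1 - exp(coefficient) on years_since_failure
    # recovers the simulated per-year citation decline after the failure.
    fit = smf.poisson(
        "citations ~ post_failure + years_since_failure", data=df
    ).fit(disp=0)
    decline = 1.0 - np.exp(fit.params["years_since_failure"])
    print(f"estimated yearly decline after failed replication: {decline:.0%}")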
Article
Across four preregistered experiments on American adults (total N = 968) and five supplemental experiments (total N = 869), we examined four accounts that might explain people’s aversion to “dirty money” (i.e., money earned in immoral ways): (a) they think it is morally tainted, (b) they care about illicit ownership, (c) they do not wish to profit from moral transgressions, and (d) accepting dirty money might imply an endorsement of the immoral means by which the money was acquired. Participants were unwilling to accept or touch dirty money, but they were relatively willing to take dirty money when it was lost and found. Together, these findings suggest that people’s aversion to dirty money stems from concerns about both moral taint and endorsing the way in which dirty money was acquired.