Note: This is an unpublished manuscript. We welcome any feedback or comments.
Guidance for Multi-Analyst Studies
Authors:
Balazs Aczel1, Barnabas Szaszi1, Gustav Nilsonne2,3, Olmo R. van den Akker4, Casper J.
Albers5, Marcel A. L. M. van Assen4,6, Jojanneke A. Bastiaansen7,8, Dan Benjamin9,10, Udo
Boehm11, Rotem Botvinik-Nezer12, Laura F. Bringmann5, Niko A. Busch13, Emmanuel
Caruyer14, Andrea M. Cataldo15,16, Nelson Cowan17, Andrew Delios18, Noah N. N. van
Dongen11, Chris Donkin19, Johnny B. van Doorn11, Anna Dreber20,21, Gilles Dutilh22, Gary F.
Egan23, Morton Ann Gernsbacher24, Rink Hoekstra5, Sabine Hoffmann25, Felix
Holzmeister21, Juergen Huber21, Magnus Johannesson20, Kai J. Jonas26, Alexander T.
Kindel27, Michael Kirchler21, Yoram K. Kunkels7, D. Stephen Lindsay28, Jan-Francois
Mangin29,30, Dora Matzke11, Marcus R. Munafò31, Ben R. Newell19, Brian A. Nosek32,33,
Russell A. Poldrack34, Don van Ravenzwaaij5, Jörg Rieskamp35, Matthew J. Salganik27,
Alexandra Sarafoglou11, Tom Schonberg36, Martin Schweinsberg37, David Shanks38, Raphael
Silberzahn39, Daniel J. Simons40, Barbara A. Spellman33, Samuel St-Jean41,42, Jeffrey J.
Starns43, Eric L. Uhlmann44, Jelte Wicherts4, Eric-Jan Wagenmakers11
Affiliations:
1ELTE Eötvös Loránd University, Budapest, Hungary, 2Karolinska Institutet, Stockholm,
Sweden, 3Stockholm University, Stockholm, Sweden, 4Tilburg University, Tilburg, The
Netherlands, 5University of Groningen, Groningen, The Netherlands, 6Utrecht University,
Utrecht, The Netherlands, 7University of Groningen, University Medical Center Groningen,
Groningen, The Netherlands, 8Friesland Mental Health Care Services, Leeuwarden, The
Netherlands, 9University of California Los Angeles, Los Angeles, CA, USA, 10National
Bureau of Economic Research, Cambridge, MA, USA, 11University of Amsterdam,
Amsterdam, The Netherlands, 12Dartmouth College, Hanover, NH, USA, 13University of
Münster, Münster, Germany, 14University of Rennes, CNRS, Inria, Inserm, Rennes, France,
15McLean Hospital, Belmont, MA, USA, 16Harvard Medical School, Boston, MA, USA,
17Department of Psychological Sciences, University of Missouri, MO, USA, 18National
University of Singapore, Singapore, 19University of New South Wales, Sydney, Australia,
20Stockholm School of Economics, Stockholm, Sweden, 21University of Innsbruck,
Innsbruck, Austria, 22University Hospital Basel, Basel, Switzerland, 23Monash University,
Melbourne, Victoria, Australia, 24University of Wisconsin-Madison, Madison, WI, USA,
25Ludwig-Maximilians-University, Munich, Germany, 26Maastricht University, Maastricht,
The Netherlands, 27Princeton University, Princeton, NJ, USA, 28University of Victoria,
Victoria, Canada, 29Université Paris-Saclay, Paris, France, 30Neurospin, CEA, France,
31University of Bristol, Bristol, UK, 32Center for Open Science, USA, 33University of
Virginia, Charlottesville, USA, 34Stanford University, Stanford, USA, 35University of Basel,
Basel, Switzerland, 36Tel Aviv University, Tel Aviv, Israel, 37ESMT Berlin, Germany,
38University College London, London, UK, 39University of Sussex, Brighton, UK,
40University of Illinois at Urbana-Champaign, USA, 41University of Alberta, Edmonton,
Canada, 42Lund University, Lund, Sweden, 43University of Massachusetts Amherst, USA,
44INSEAD, Singapore
*Correspondence: aczel.balazs@ppk.elte.hu and szaszi.barnabas@ppk.elte.hu
Standfirst:
We present consensus-based guidance for conducting and documenting multi-analyst
studies. We discuss why broader adoption of the multi-analyst approach will strengthen the
robustness of results and conclusions in empirical sciences.
The Unknown Fragility of Reported Conclusions
Typically, empirical results hinge on analytical choices made by a single data analyst
or team of authors, with limited independent, external input. This makes it uncertain whether
the reported conclusions are robust to justifiable alternative analytical strategies (Fig. 1).
Studies in the social and behavioural sciences lend themselves to a multitude of justifiable
analyses. Empirical investigations require many analytical decisions, and the underlying
theoretical framework rarely imposes strong restrictions on how the data should be
preprocessed and modelled.
Fig. 1 Example of a reported sequence of analysis choices (black line, leading to conclusion
A) shown as a subset of alternative plausible analysis paths (grey lines). In the left panel, all
plausible paths support conclusion A; in the right panel, most plausible paths support
conclusion B. This illustrates that, without reporting the outcomes of alternative paths, it
remains unknown whether the conclusion is robust to justifiable alternative analytical
strategies.
As an example, the journal Surgery published two articles1,2 a few months apart that
used the same dataset and answered the same question: Does the use of a retrieval bag during
laparoscopic appendectomy reduce surgical site infections? Two reasonable, but different,
analyses were applied (with notable differences in analytical choices including inclusion and
exclusion criteria, outcome measures, sample sizes, and covariates). As a result of the
different analytical choices, the two articles reached opposite conclusions, one finding that
the use of a retrieval bag reduced infections, and the other that it did not3. This example
illustrates how independent analyses of the same data (in this case, unplanned) can reach
different, yet justifiable, conclusions.
In this article, we describe how a multi-analyst approach can evaluate the impact of
alternative analyses on reported results and conclusions. In addition, we provide consensus-
based guidance to help researchers prepare, conduct, and report multi-analyst studies.
Exploring Analytical Robustness with the Multi-Analyst Method
The robustness of results and conclusions can be studied by evaluating many distinct
analysis options simultaneously (e.g., vibration of effects4 or multiverse analysis5) or by
involving multiple analysts who independently analyse the same data6–13. Rather than
exhaustively evaluating all plausible analyses, the multi-analyst method examines analytical
choices that are deemed most appropriate by independent analysts.
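To make this contrast concrete, the following minimal sketch (in Python, using pandas and statsmodels) runs a small multiverse of justifiable analysis paths over a simulated dataset and reports how the estimated effect varies across them. All variable names, analytical choices, and effect sizes below are invented for illustration; they are not part of our guidance.

# A minimal multiverse-style robustness check on simulated data.
# Everything here (variables, choices, effects) is hypothetical.
import itertools
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(0)
n = 500
df = pd.DataFrame({
    "treatment": rng.integers(0, 2, n),
    "age": rng.normal(50, 10, n),
    "severity": rng.normal(0, 1, n),
})
df["outcome"] = 0.3 * df["treatment"] + 0.02 * df["age"] + rng.normal(0, 1, n)

# Two justifiable exclusion rules crossed with three covariate sets
# give six analysis paths.
exclusion_rules = {
    "none": lambda d: d,
    "trim_age": lambda d: d[d["age"].between(30, 70)],
}
covariate_sets = [[], ["age"], ["age", "severity"]]

results = []
for (rule_name, rule), covs in itertools.product(exclusion_rules.items(),
                                                 covariate_sets):
    sub = rule(df)
    formula = "outcome ~ treatment" + "".join(f" + {c}" for c in covs)
    fit = smf.ols(formula, data=sub).fit()
    results.append({"exclusion": rule_name,
                    "covariates": "+".join(covs) or "none",
                    "estimate": fit.params["treatment"],
                    "p": fit.pvalues["treatment"]})

paths = pd.DataFrame(results)
print(paths)
print("Share of paths with p < .05:", (paths["p"] < 0.05).mean())

In this toy example the robustness of the conclusion can be read off from the share of paths crossing a significance threshold; the multi-analyst method replaces such mechanical enumeration of paths with the considered choices of independent analysts.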
Botvinik-Nezer et al.10, for example, asked 70 teams to test the same hypotheses using
the same functional magnetic resonance imaging dataset. They found that no two teams
followed the same data preprocessing and analysis strategy, resulting in substantial
variability in their conclusions. This and other multi-analyst initiatives6–13
highlight how findings can vary depending on the judgment of the analyst.
Use and Benefits of the Multi-Analyst Method
Although the multi-analyst approach is new to many researchers, it has been in use
since the 19th century. A prominent example is the cuneiform competition14, which may be
viewed as a precursor to the modern multi-analyst method. In 1857, the Royal Asiatic Society
asked four scholars to independently translate a previously unseen inscription to verify that
the ancient Assyrian language had been deciphered correctly. The almost perfect overlap
between the solutions indicated that “they have Truth for their basis” (p. 4)14.
The central idea from this cuneiform competition can be applied to 21st century data
analysis with several benefits (Box 1). With even a few co-analysts, the multi-analyst
approach can be informative about the robustness of results and conclusions. When the
results of independent data analyses converge, more confidence in the conclusions is
warranted. When the results diverge, confidence falters, and scientists can examine the
reasons for these discrepancies. With enough co-analysts, it is possible to estimate the
variability among analysis strategies and identify factors explaining this variability.
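As a sketch of what processing such results might look like, the toy example below (Python/pandas) tallies how often independent analyses converge on the same conclusion and estimates the between-analyst spread of effect estimates. The submissions are invented, and the simple variance decomposition is only one possible analogue of the tau-squared used in random-effects meta-analysis.

# Hypothetical co-analyst submissions: a point estimate, its standard
# error, and a categorical conclusion per analyst.
import numpy as np
import pandas as pd

submissions = pd.DataFrame({
    "analyst": ["A", "B", "C", "D", "E"],
    "estimate": [0.31, 0.28, 0.05, 0.33, 0.22],
    "se": [0.10, 0.12, 0.11, 0.09, 0.10],
    "conclusion": ["effect", "effect", "no effect", "effect", "effect"],
})

# Convergence of conclusions across the independent analyses.
print(submissions["conclusion"].value_counts(normalize=True))

# Between-analyst variability: total variance of estimates minus the
# average within-analysis sampling variance (floored at zero).
var_total = submissions["estimate"].var(ddof=1)
var_within = (submissions["se"] ** 2).mean()
tau2 = max(var_total - var_within, 0.0)
print(f"Between-analyst SD of estimates: {np.sqrt(tau2):.3f}")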
Box 1
Benefits of the Multi-Analyst Approach
● Converging conclusions increase confidence in the analytical robustness of a finding
● Diverging conclusions decrease confidence in the analytical robustness of a finding and
prompt an examination of the reasons for the divergence
● Identifies a key source of uncertainty, namely the extent to which the results and
conclusions depend on the analytic preferences of the analyst
● Establishes the variability of results as a function of analytical choices
● With analysts from multiple disciplines, the approach stimulates cross-pollination of
analysis strategies that otherwise might remain isolated within research subcultures
● Diminishes or eliminates the influence of any individual analyst's potential preference
regarding the hypotheses, since no single analysis is likely to determine the conclusions
Multi-Analyst Guidance
The multi-analyst approach is rarely used in empirical research, but many disciplines
could benefit from its broader adoption. Implementing a multi-analyst study involves
practical challenges that might discourage researchers from pursuing it. To help researchers
surmount these challenges, we provide consensus-based guidance for preparing, conducting,
and reporting multi-analyst studies.
To develop this guidance, we recruited an expert panel of 50 social and behavioural
scientists (all co-authors on this paper) with experience in organising multi-analyst projects or
expertise in research methodology. In a first survey, we gathered their conceptual and
practical insights about this approach. These responses were used to create a draft of our
guidance. Next, the draft was iteratively improved by the expert panel, following a
preregistered ‘reactive-Delphi’ expert consensus procedure. The final draft was
independently rated by the members of the panel to ensure that each approved item
satisfied our preset criteria for a sufficiently high level of support. The consensus procedure
concluded in a single round, resulting in a guide that represents a consensus among the
experts. Of course, other experts might have different views, and we welcome feedback. For
the survey materials, a list of panel members, and the details of the consensus procedure, see
the Supplementary Information.
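As an illustration of the final rating step, the sketch below (Python) computes per-item panel support against an approval threshold. The rating scale and the 80% criterion are hypothetical stand-ins; the actual preset criteria are described in the Supplementary Information.

# Hypothetical panel ratings (1 = do not support ... 5 = fully support);
# an item is approved if at least 80% of members rate it 4 or 5.
import pandas as pd

ratings = pd.DataFrame({  # rows: panel members, columns: guidance items
    "item_1": [5, 4, 5, 5, 3, 4],
    "item_2": [4, 5, 5, 4, 5, 5],
    "item_3": [3, 3, 4, 2, 4, 3],
})

support = (ratings >= 4).mean()  # share of ratings of 4 or 5, per item
print(support)
print("Approved items:", list(support[support >= 0.80].index))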
The guidance includes 10 Recommended Practices (Table 1) and a Practical
Considerations document that supports these practices (see Supplementary Information). Both
the practices and considerations address the five main stages of a multi-analyst project: (1)
Recruiting co-analysts; (2) Providing the dataset, research questions, and research tasks; (3)
Conducting the independent analyses; (4) Processing the results; and (5) Reporting the
methods and results. To further assist researchers in documenting multi-analyst projects, we
also provide a modifiable Reporting Template that incorporates the elements of our guide.
Table 1
Recommended Practices for the Main Stages of the Multi-Analyst Method
Recruiting Co-analysts
1. Determine a minimum target number of co-analysts and outline clear
eligibility criteria before recruiting co-analysts. We recommend that the
final report justifies why these choices are adequate to achieve the study
goals.
2. When recruiting co-analysts, inform them about (a) their tasks and
responsibilities; (b) the project code of conduct (e.g., confidentiality/ non-
disclosure agreements); (c) the plans for publishing the research report and
presenting the data, analyses, and conclusion; (d) the conditions for an
analysis to be included or excluded from the study; (e) whether their names
will be publicly linked to the analyses; (f) the co-analysts’ rights to update
or revise their analyses; (g) the project time schedule; and (h) the nature
and criteria of compensation (e.g., authorship).
Providing the Dataset, Research Questions, and Research Tasks
3. Provide the datasets accompanied by a codebook that contains a
comprehensive explanation of the variables and the datafile structure (a
minimal codebook sketch follows the table).
4. Ensure that co-analysts understand any restrictions on the use of the data,
including issues of ethics, privacy, confidentiality, or ownership.
5. Provide the research questions (and potential theoretically derived
hypotheses that should be tested) without communicating the lead team’s
preferred analysis choices or expectations about the conclusions.
Conducting the Independent Analyses
6. To ensure independence, we recommend that co-analysts do not
communicate with each other about their analyses until after all initial
reports have been submitted. In general, it should be clearly explained why
and at what stage co-analysts are allowed to communicate about the
analyses (e.g., to detect errors or call attention to outlying data points).
Processing the Results
7. Require co-analysts to share with the lead team their results, the analysis
code with explanatory comments (or a detailed description of their point-
and-click analyses), their conclusions, and an explanation of how their
conclusions follow from their results.
8. The lead team makes the commented code, results, and conclusions of all
non-withdrawn analyses publicly available before or at the same time as
submitting the research report.
Reporting the Methods and Results
9. The lead team should report the multi-analyst process of the study,
including (a) the justification for the number of co-analysts; (b) the
eligibility criteria and recruitment of co-analysts; (c) how co-analysts were
given the data sets and research questions; (d) how the independence of
analyses was ensured; (e) the numbers of and reasons for withdrawals and
omissions of analyses; (f) whether the lead team conducted an independent
analysis; (g) how the results were processed; (h) the summary of the
results of co-analysts; (i) and the limitations and potential biases of the
study.
10. Data management should follow the FAIR principles15, and the research
report should be transparent about access to the data and code for all
analyses16.
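To illustrate Practice 3, here is a minimal sketch (Python) of a machine-readable codebook together with a check of a datafile against it. The variables loosely echo the appendectomy example above, but every name, value, and check is hypothetical.

# A hypothetical codebook: one entry per variable, with type, meaning,
# and (where applicable) the documented value labels.
import pandas as pd

codebook = {
    "patient_id": {"type": "int",
                   "description": "Anonymised patient identifier"},
    "bag_used": {"type": "int",
                 "values": {0: "no retrieval bag", 1: "retrieval bag"},
                 "description": "Retrieval bag used during appendectomy"},
    "ssi_30d": {"type": "int",
                "values": {0: "no infection", 1: "infection"},
                "description": "Surgical site infection within 30 days"},
}

def check_against_codebook(df, codebook):
    """Return a list of discrepancies between a datafile and its codebook."""
    problems = [f"undocumented column: {c}" for c in df.columns
                if c not in codebook]
    problems += [f"missing column: {c}" for c in codebook
                 if c not in df.columns]
    for col, spec in codebook.items():
        if col in df.columns and "values" in spec:
            bad = set(df[col].dropna().unique()) - set(spec["values"])
            if bad:
                problems.append(
                    f"{col}: undocumented values {sorted(int(v) for v in bad)}")
    return problems

df = pd.DataFrame({"patient_id": [1, 2], "bag_used": [1, 0],
                   "ssi_30d": [0, 2]})
print(check_against_codebook(df, codebook))  # flags ssi_30d value 2

Shipping such a codebook alongside the dataset lets every co-analyst verify that they are reading the datafile as intended before committing to an analysis.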
Caveats and Conclusions
The present work does not cover all aspects of multi-analyst projects. For instance,
the multi-analyst approach outlined here entails the independent analysis of one or more
datasets, but it should be acknowledged that other crowdsourced analysis approaches might
not require such independence of the analyses. Also, we emphasize that this consensus-based
guidance is a first step towards the broader adoption of the multi-analyst approach in
empirical research; we hope and expect that our recommendations will be developed further.
This guidance document aims to facilitate adoption of the multi-analyst approach in
empirical research. We believe that the scientific benefits greatly outweigh the extra
logistical effort required, especially for projects with considerable scientific or societal
impact. A systematic exploration of the analytical space allows us to assess whether the
reported results and conclusions depend on the chosen analytical strategy. Such exploration
takes us beyond the tip of
the epistemic iceberg that results from a single data analyst executing a single statistical
analysis.
References
1. Fields, A. C. et al. Does retrieval bag use during laparoscopic appendectomy reduce
postoperative infection? Surgery 165, 953–957 (2019).
2. Turner, S. A., Jung, H. S. & Scarborough, J. E. Utilization of a specimen retrieval bag
during laparoscopic appendectomy for both uncomplicated and complicated appendicitis is
not associated with a decrease in postoperative surgical site infection rates. Surgery 165,
1199–1202 (2019).
3. Childers, C. P. & Maggard-Gibbons, M. Same Data, Opposite Results?: A Call to
Improve Surgical Database Research. JAMA Surg. 156, 219–220 (2021).
4. Patel, C. J., Burford, B. & Ioannidis, J. P. Assessment of vibration of effects due to
model specification can demonstrate the instability of observational associations. J. Clin.
Epidemiol. 68, 1046–1058 (2015).
5. Steegen, S., Tuerlinckx, F., Gelman, A. & Vanpaemel, W. Increasing Transparency
Through a Multiverse Analysis. Perspect. Psychol. Sci. 11, 702–712 (2016).
6. Bastiaansen, J. A. et al. Time to get personal? The impact of researchers’ choices on
the selection of treatment targets using the experience sampling methodology. J.
Psychosom. Res. 137, 110211 (2020).
7. van Dongen, N. N. N. et al. Multiple Perspectives on Inference for Two Simple
Statistical Scenarios. Am. Stat. 73, 328–339 (2019).
8. Salganik, M. J. et al. Measuring the predictability of life outcomes with a scientific
mass collaboration. Proc. Natl. Acad. Sci. 117, 8398–8403 (2020).
9. Silberzahn, R. et al. Many Analysts, One Data Set: Making Transparent How
Variations in Analytic Choices Affect Results. Adv. Methods Pract. Psychol. Sci. 1, 337–
356 (2018).
10. Botvinik-Nezer, R. et al. Variability in the analysis of a single neuroimaging dataset
by many teams. Nature 582, 84–88 (2020).
11. Dutilh, G. et al. The Quality of Response Time Data Inference: A Blinded,
Collaborative Assessment of the Validity of Cognitive Models. Psychon. Bull. Rev. 26,
1051–1069 (2019).
12. Fillard, P. et al. Quantitative evaluation of 10 tractography algorithms on a realistic
diffusion MR phantom. NeuroImage 56, 220–234 (2011).
13. Starns, J. J. et al. Assessing theoretical conclusions with blinded inference to
investigate a potential inference crisis. Adv. Methods Pract. Psychol. Sci. 2, 335–349
(2019).
14. Rawlinson, H. C., Talbot, F., Hincks, E. & Oppert, J. Inscription of Tiglath Pileser I,
King of Assyria, BC 1150, as Translated by H. Rawlinson, Fox Talbot, Dr Hincks and Dr
Oppert (Royal Asiatic Society of London: J. W. Parker & Son, 1857).
15. Wilkinson, M. D. et al. The FAIR Guiding Principles for scientific data management
and stewardship. Sci. Data 3, 160018 (2016).
16. Aczel, B. et al. A consensus-based transparency checklist. Nat. Hum. Behav. 4, 4–6
(2020).
Competing interests
B.A.N. is Executive Director of the Center for Open Science, a nonprofit technology
and culture change organization with a mission to increase openness, integrity, and
reproducibility of research. The other authors declare no competing interests.
Data and materials availability
All anonymized data as well as the survey materials are publicly shared on the Open
Science Framework page of the project: https://osf.io/4zvst/. Our methodology and data-
analysis plan were preregistered. The preregistration document can be accessed at:
https://osf.io/dgrua.
Funding
This research was not funded. A.S. was supported by a talent grant from the
Netherlands Organisation for Scientific Research (NWO; 406-17-568). R.B.-N. is an
Awardee of the Weizmann Institute of Science – Israel National Postdoctoral Award Program
for Advancing Women in Science. B.A.N. was supported by grants from the John Templeton
Foundation, Templeton World Charity Foundation, Templeton Religion Trust, and Arnold
Ventures. S.St-J. is supported by the Natural Sciences and Engineering Research Council of
Canada (NSERC) [funding reference number BP–546283–2020] and the Fonds de recherche
du Québec - Nature et technologies (FRQNT) [Dossier 290978]. J.M.W. and O.R.v.d.A. were
supported by a Consolidator Grant (IMPROVE) from the European Research Council (ERC;
grant no. 726361). Y.K.K. was supported by a grant from the European Research Council
(ERC) under the European Union’s Horizon 2020 research and innovation programme (ERC-
CoG-2015; No 681466 to M. Wichers). D.v.R. was supported by a Dutch scientific
organization VIDI fellowship grant (016.Vidi.188.001). L.F.B. was supported by a Dutch
scientific organization VENI fellowship grant (Veni 191G.037). M.J.S. was supported by the
U.S. National Science Foundation (1760052).
Author contributions
Conceptualization: B.A., B.S., G.N., and E.-J.W.; Methodology: B.A., B.S., G.N., and E.-
J.W.; Project Administration: B.A.; Supervision: E.-J.W.; Writing - Original Draft
Preparation: B.A., B.S., G.N., and E.-J.W.; Writing - Review & Editing: B.A., B.S., G.N.,
O.R.v.d.A., C.J.A., M.A.L.M.v.A., J.A.B., D.B., U.B., R.B.-N., L.F.B., N.B., E.C., A.M.C.,
N.C., A. Delios, N.N.N.v.D., C.D., J.B.v.D., A. Dreber, G.D., G.F.E., M.A.G., R.H., S.H.,
F.H., J.H., M.J., K.J.J., A.T.K., M.K., Y.K.K., D.S.L., J.-F.M., D.M., M.R.M., B.R.N.,
B.A.N., R.A.P., D.v.R., J.R., M.J.S., A.S., T.S., M.S., D.S., R.S., D.J.S., B.A.S., S.St-J.,
J.J.S., E.L.U., J.W., and E.-J.W.
Supplementary Information
Recommended Practices and Practical Considerations for Multi-Analyst Projects:
https://osf.io/uvwgy/
Reporting Template: https://osf.io/h9mgy/
Supplementary Methods: https://osf.io/gjz2r/