Figure 1 - uploaded by Doris Pischedda
Voxels overlap. Maps showing at each voxel the proportion of teams reporting significant activations in their thresholded statistical map, for each hypothesis (labeled H1-H9), thresholded at 10% (i.e., voxels with no color were significant in fewer than 10% of teams). +/- refers to the direction of effect, gain/loss refers to the effect being tested, and equal indifference (EI) / equal range (ER) refers to the group being examined or compared. Hypotheses #1 and #3, as well as hypotheses #2 and #4, share the same statistical maps because they concern the same contrast and experimental group but different regions (see Table 1). Images can be viewed at https://identifiers.org/neurovault.collection:6047
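
The caption's overlap computation can be illustrated with a minimal sketch. It assumes each team's thresholded map has already been binarized and flattened into a list of voxels on a common grid. The function name `overlap_proportions` and the input layout are hypothetical and are not taken from the source publication's code.

```python
def overlap_proportions(team_masks, min_proportion=0.10):
    """Per-voxel proportion of teams reporting a significant activation.

    team_masks: list with one entry per team; each entry is a flat list of
    0/1 values (1 = voxel significant in that team's thresholded map).
    This layout is an assumption made for illustration only.
    Voxels significant in fewer than `min_proportion` of teams are returned
    as None, mirroring the "no color" voxels described in the caption.
    """
    n_teams = len(team_masks)
    n_voxels = len(team_masks[0])
    result = []
    for v in range(n_voxels):
        # Count the teams whose map marks voxel v as significant
        count = sum(1 for mask in team_masks if mask[v])
        proportion = count / n_teams
        result.append(proportion if proportion >= min_proportion else None)
    return result
```

A real analysis would operate on 3D images registered to a common space; the flat-list form here only shows the counting and the 10% cutoff.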

Source publication
Preprint
Full-text available
Data analysis workflows in many scientific domains have become increasingly complex and flexible. To assess the impact of this flexibility on functional magnetic resonance imaging (fMRI) results, the same dataset was independently analyzed by 70 teams, testing nine ex-ante hypotheses. The flexibility of analytic approaches is exemplified by the fac...

Contexts in source publication

Context 1
... of overlap between activated voxels was consistent with the variability in the reported hypothesis results, with most voxels in the thresholded maps showing inconsistent binary values. The maximum proportion of teams with activation in any single voxel for a given hypothesis was 0.77 (range 0.23-0.77; Figure 1). ...
Context 2
... a coordinate-based meta-analysis using activation likelihood estimation (ALE) 16,17 across teams, which imposes additional smoothing, demonstrated convergent patterns of activation for all hypotheses (Supplementary Figure 1). Altogether, analysis of the similarity between thresholded statistical images suggests that these maps are substantially diverse, but aggregating across analyses can yield more consistent results. ...
Context 3
... approaches that aggregate information across analyses are one potential solution to the issue of analytic variability. We assessed the consistency across teams using an image-based meta-analysis (accounting for correlations due to common data), which demonstrated significant active voxels for all hypotheses except for #9 after false discovery rate correction (see Supplementary Figure 10 ...
Context 4
... bioRxiv preprint first posted online Nov. 15, 2019; 0.449 (sd = 0.146, min = 0.157, max = 0.652) for "non-team members". The result in the "team members" prediction market was not driven by over-representation of teams reporting significant results (see Supplementary Materials: Supplementary Figure 11 and Prediction markets results/exploratory analyses). Market's predictions in the "team members" prediction market did not significantly differ from those of the "non-team members" prediction markets (Wilcoxon signed-rank test, z = 1.035, p = 0.301, n = 9), but as mentioned above, the statistical power for this test was limited. ...
Context 5
... image-based meta-analysis was used to quantify the evidence for each hypothesis across analysis teams (see Supplementary Figure 10), accounting for the lack of independence due to the use of a common dataset across teams. While there are different meta-analysis-inspired approaches that could be taken (e.g. a random effects meta-analysis that penalizes for inter-team variation), we sought an approach that would preserve the typical characteristics of the teams' maps. ...
Context 6
... a dedicated online market platform (see Supplementary Figure 12). Each hypothesis constitutes one asset in the market, with asset prices predicting the fraction of teams reporting significant whole-brain corrected results for the corresponding ex-ante hypothesis examined by the analysis teams using the same dataset. ...
Context 7
... market overview showed the nine assets (i.e., one corresponding to each hypothesis) in tabular format, including information on the (approximate) current price for buying a share and the number of shares held (separated for long and short positions) for each of the nine hypotheses. Via the trading interface, which was shown after clicking on any of the hypotheses, the participant could make investment decisions and view price developments for the particular asset (see Supplementary Figure 12). ...

Similar publications

Article
Full-text available
Data analysis workflows in many scientific domains have become increasingly complex and flexible. Here we assess the effect of this flexibility on the results of functional magnetic resonance imaging by asking 70 independent teams to analyse the same dataset, testing the same 9 ex-ante hypotheses¹. The flexibility of analytical approaches is exempl...

Citations

... We now present our re-analysis of the NARPS dataset (Botvinik-Nezer et al., 2020b). This dataset includes two studies, each of which is composed of a group of 54 participants who make a series of risky decisions. ...
Article
Full-text available
    Is irrational behavior the incidental outcome of biological constraints imposed on neural information processing? In this work, we consider the paradigmatic case of gamble decisions, where gamble values integrate prospective gains and losses. Under the assumption that neurons have a limited firing response range, we show that mitigating the ensuing information loss within artificial neural networks that synthetize value involves a specific form of self-organized plasticity. We demonstrate that the ensuing efficient value synthesis mechanism induces value range adaptation. We also reveal how the ranges of prospective gains and/or losses eventually determine both the behavioral sensitivity to gains and losses and the information content of the network. We test these predictions on two fMRI datasets from the OpenNeuro.org initiative that probe gamble decision-making but differ in terms of the range of gain prospects. First, we show that people's loss aversion eventually adapts to the range of gain prospects they are exposed to. Second, we show that the strength with which the orbitofrontal cortex (in particular: Brodmann area 11) encodes gains and expected value also depends upon the range of gain prospects. Third, we show that, when fitted to participants' gambling choices, self-organizing artificial neural networks generalize across gain range contexts and predict the geometry of information content within the orbitofrontal cortex. Our results demonstrate how self-organizing plasticity aiming at mitigating information loss induced by neurons' limited response range may result in value range adaptation, eventually yielding irrational behavior.
    ... While novel approaches allow the investigation of large multiverses (Dafflon et al., 2022), the inclusion of indefensible choices may bias results (Del Giudice & Gangestad, 2021). The processing steps and combinations included in a multiverse analysis may be chosen by the researcher a priori (Clayson et al., 2021; Sadus et al., 2023; Schubert et al., 2023), empirically sampled with a many-analysts approach (Botvinik-Nezer et al., 2019; Trübutschek et al., 2022), or determined by a literature review (Šoškić et al., 2021). For instance, a systematic literature review of 132 stationary EEG publications revealed that no two publications chose the same approach to data recording, processing, and analysis and that most omitted at least some details, even though they all examined the same event-related potentials (ERPs) (Šoškić et al., 2021). ...
    Preprint
    Full-text available
    Preprocessing is necessary to extract meaningful results from electroencephalography (EEG) data. With many possible preprocessing choices, their impact on outcomes is fundamental. While previous studies have explored the effects of preprocessing on stationary EEG data, this research delves into mobile EEG, where complex processing is necessary to address motion artifacts. Specifically, we describe the preprocessing choices studies reported for analyzing the P3 event-related potential (ERP) during walking and standing. A systematic review of 258 studies of the P3 during walking, identified 27 studies meeting the inclusion criteria. Two independent coders extracted preprocessing choices reported in each study. Analysis of preprocessing choices revealed commonalities and differences, such as the widespread use of offline filters but limited application of line noise correction (3 of 27 studies). Notably, 59% of studies involved manual processing steps, and 56% omitted reporting critical parameters for at least one step. All studies employed unique preprocessing strategies. These findings align with stationary EEG preprocessing results, emphasizing the necessity for standardized reporting in mobile EEG research. We implemented an interactive visualization tool (Shiny app) to aid the exploration of the preprocessing landscape. The app allows users to structure the literature regarding different processing steps, enter planned processing methods, and compare them with the literature. The app could be utilized to examine how these choices impact P3 results and understand the robustness of various processing options. We hope to increase awareness regarding the potential influence of preprocessing decisions and advocate for comprehensive reporting standards to foster reproducibility in mobile EEG research.
    ... Whether with the same population, setting, and materials (the replicability challenge; Klein et al., 2014; Open Science Collaboration, 2015) or after a change to one or more of these features (the generalizability challenge; Henrich et al., 2010; Tiokhin et al., 2019; Yarkoni, 2019), replicated results often differ meaningfully from original results. Meaningful differences also occur in other forms of replication, such as when separate teams develop research strategies to address the same research question (the strategy selection challenge; Landy et al., 2020), when separate teams develop analysis plans for the same dataset (the inferential reproducibility challenge; Botvinik-Nezer et al., 2019; Silberzahn et al., 2018), and even when separate teams write code to execute the same analysis (the computational reproducibility challenge; Donoho et al., 2008; Hardwicke et al., 2018; Obels et al., 2019). ...
    Article
    Full-text available
    Progress in psychology has been frustrated by challenges concerning replicability, generalizability, strategy selection, inferential reproducibility, and computational reproducibility. Although often discussed separately, these five challenges may share a common cause: insufficient investment of intellectual and nonintellectual resources into the typical psychology study. We suggest that the emerging emphasis on big-team science can help address these challenges by allowing researchers to pool their resources together to increase the amount available for a single study. However, the current incentives, infrastructure, and institutions in academic science have all developed under the assumption that science is conducted by solo principal investigators and their dependent trainees, an assumption that creates barriers to sustainable big-team science. We also anticipate that big-team science carries unique risks, such as the potential for big-team-science organizations to be co-opted by unaccountable leaders, become overly conservative, and make mistakes at a grand scale. Big-team-science organizations must also acquire personnel who are properly compensated and have clear roles. Not doing so raises risks related to mismanagement and a lack of financial sustainability. If researchers can manage its unique barriers and risks, big-team science has the potential to spur great progress in psychology and beyond.
    ... Whether with the same population, setting, and materials (the replicability challenge; Klein et al., 2014; Open Science Collaboration, 2015) or after a change to one or more of these features (the generalizability challenge; Henrich et al., 2010; Tiokhin et al., 2019; Yarkoni, 2019), replicated results often differ meaningfully from original results. Meaningful differences also occur in other forms of replication, such as when separate teams develop research strategies to address the same research question (the strategy-selection challenge; Landy et al., 2020), when separate teams develop analysis plans for the same data set (the inferential-reproducibility challenge; Botvinik-Nezer et al., 2020; Silberzahn et al., 2018), and even when separate teams write code to execute the same analysis (the computational-reproducibility challenge; ...
    Article
    Full-text available
    The facial feedback hypothesis suggests that an individual's facial expressions can influence their emotional experience (e.g., that smiling can make one feel happier). However, a reoccurring concern is that supposed facial feedback effects are merely methodological artifacts. Six experiments conducted across 29 countries (N = 995) examined the extent to which the effects of posed facial expressions on emotion reports were moderated by (a) the hypothesis communicated to participants (i.e., demand characteristics) and (b) participants' beliefs about facial feedback effects. Results indicated that these methodological artifacts moderated, but did not fully account for, the effects of posed facial expressions on emotion reports. Even when participants were explicitly told or personally believed that facial poses do not influence emotions, they still exhibited facial feedback effects. These results indicate that facial feedback effects are not solely driven by demand or placebo effects. (PsycInfo Database Record (c) 2022 APA, all rights reserved).
    ... This is because we used a partially data-driven approach when fitting neural networks and did not have a second, similar data set available for cross-validation. Indeed, recent evidence in fMRI demonstrates that brain parcellations (Bryce et al., 2021), analytic pipelines (Botvinik-Nezer et al., 2020; Li et al., 2021), and other potentially subjective researcher decisions (Bloom et al., 2021; Steegen et al., 2016) impact results; hence, it is imperative that future work replicates these results in other adolescent samples, with other tasks that probe motivational processing, and using other preprocessing pipelines. Second, associations between self-reported sensation seeking and real-world risk-taking are often small-to-medium in adolescent samples (Demidenko et al., 2019). ...
    Article
    Adolescent risk-taking, including sensation seeking (SS), is often attributed to developmental changes in connectivity among brain regions implicated in cognitive control and reward processing. Despite considerable scientific and popular interest in this neurodevelopmental framework, there are few empirical investigations of adolescent functional connectivity, let alone examinations of its links to SS behavior. The studies that have been done focus on mean-based approaches and leave unanswered questions about individual differences in neurodevelopment and behavior. The goal of this paper is to take a person-specific approach to the study of adolescent functional connectivity during a continuous motivational state, and to examine links between connectivity and self-reported SS behavior in 104 adolescents (MAge = 19.3; SDAge = 1.3). Using Group Iterative Multiple Model Estimation (GIMME), person-specific connectivity during two neuroimaging runs of a monetary incentive delay task was estimated among 12 a priori brain regions of interest representing reward, cognitive, and salience networks. Two data-driven subgroups were detected, a finding that was consistent between both neuroimaging runs, but associations with SS were only found in the first run, potentially reflecting neural habituation in the second run. Specifically, the subgroup that had unique connections between reward-related regions had greater SS and showed a distinctive relation between connectivity strength in the reward regions and SS. These findings provide novel evidence for heterogeneity in adolescent brain-behavior relations by showing that subsets of adolescents have unique associations between neural motivational processing and SS. Findings have broader implications for future work on reward processing, as they demonstrate that brain-behavior relations may attenuate across runs.
    ... Multiverse analysis seems to be especially well suited for neuroscience research, given the multitude of preprocessing and data analysis choices that result in a complex 'garden of forking paths' (Gelman and Loken, 2014). As most neuroscience studies test only one analysis variant, it is difficult to assess the robustness of any individual effect, and it seems that at least some neuroscientific findings are sensitive to the choice of signal analysis steps (Cohen, 2015; Cohen and Gulbinaite, 2014; see also Botvinik-Nezer et al., 2019). ...
    Article
    Full-text available
    For decades, the frontal alpha asymmetry (FAA) – a disproportion in EEG alpha oscillations power between right and left frontal channels – has been one of the most popular measures of depressive disorders (DD) in electrophysiology studies. Patients with DD often manifest a left-sided FAA: relatively higher alpha power in the left versus right frontal lobe. Recently, however, multiple studies failed to confirm this effect, questioning its reproducibility. Our purpose is to thoroughly test the validity of FAA in depression by conducting a multiverse analysis – running many related analyses and testing the sensitivity of the effect to changes in the analytical approach – on data from five independent studies. Only 13 of the 270 analyses revealed significant results. We conclude the paper by discussing theoretical assumptions underlying the FAA and suggest a list of guidelines for improving and expanding the EEG data analysis in future FAA studies.
    ... The discrepancy between the strong theoretical prediction of a common neural basis for beauty and lack of empirical evidence calls for more rigorous studies. Recently, the field of cognitive neuroscience has been challenged for its reproducibility (Botvinik-Nezer et al., 2020; Hu, Jiang, Jeffrey, & Zuo, 2018; Poldrack et al., 2017). Small sample size (which results in low statistical power) (Button et al., 2013), flexibility in data analysis (Botvinik-Nezer et al., 2020; Carp, 2012), and errors in implementing software (Eklund et al., 2016), accompanied by publication bias (Jennings & Horn, 2012), all threatened the reliability and reproducibility of the field (Hu et al., 2018; Poldrack et al., 2017). ...
    ... Recently, the field of cognitive neuroscience has been challenged for its reproducibility (Botvinik-Nezer et al., 2020; Hu, Jiang, Jeffrey, & Zuo, 2018; Poldrack et al., 2017). Small sample size (which results in low statistical power) (Button et al., 2013), flexibility in data analysis (Botvinik-Nezer et al., 2020; Carp, 2012), and errors in implementing software (Eklund et al., 2016), accompanied by publication bias (Jennings & Horn, 2012), all threatened the reliability and reproducibility of the field (Hu et al., 2018; Poldrack et al., 2017). Direct replication, albeit very few, found that pessimistic results (Boekel et al., 2015). ...
    ... In these cases, the final choice might be arbitrary. Studies have shown that flexibility in research practice can inflate the false-positive rate (Botvinik-Nezer et al., 2020; Carp, 2012; Simmons, Nelson, & Simonsohn, 2011). It also might be true for meta-analysis. ...
    Article
    During the past two decades, cognitive neuroscientists have sought to elucidate the common neural basis of the experience of beauty. Still, empirical evidence for such common neural basis of different forms of beauty is not conclusive. To address this question, we performed an activation likelihood estimation (ALE) meta-analysis on the existing neuroimaging studies of beauty appreciation of faces and visual art by nonexpert adults (49 studies, 982 participants, meta-data are available at https://osf.io/s9xds/). We observed that perceiving these two forms of beauty activated distinct brain regions: While the beauty of faces convergently activated the left ventral striatum, the beauty of visual art convergently activated the anterior medial prefrontal cortex (aMPFC). However, a conjunction analysis failed to reveal any common brain regions for the beauty of visual art and faces. The implications of these results are discussed.
    ... Future studies could change the design by balancing the congruency and incongruency to better address the effort of conflict resolving; or even adding the feedback (emoji) after the item purchase, to further reflect the real lives of couple interactions; (b) the sample size might be relatively small (N = 30). By making the raw and processed data, stimuli, and codes public available, future researchers may include any one of them for advanced analyses 81 or, meta-analyses 82 , or any other type of combined endeavor to better understand the complex social exchanges in humans. Stimuli and procedure. ...
    Article
    Full-text available
    One of the typical campus scenes is the social interaction between college couples, and the lesson couples must keep learning is to adapt to each other. This fMRI study investigated the shopping interactions of 30 college couples, one lying inside and the other outside the scanner, beholding the same item from two connected PCs, making preference ratings and subsequent buy/not-buy decisions. The behavioral results showed the clear modulation of significant others’ preferences onto one’s own decisions, and the contrast of the “shop-together vs. shop-alone”, and the “congruent (both liked or disliked the item, 68%) vs. incongruent (one liked but the other disliked, and vice versa)” together trials, both revealed bilateral temporal parietal junction (TPJ) among other reward-related regions, likely reflecting mentalizing during preference harmony. Moreover, when contrasting “own-high/other-low vs. own-low/other-high” incongruent trials, left anterior inferior parietal lobule (l-aIPL) was parametrically mapped, and the “yield (e.g., own-high/not-buy) vs. insist (e.g., own-low/not-buy)” modulation further revealed left lateral-IPL (l-lIPL), together with left TPJ forming a local social decision network that was further constrained by the mediation analysis among left TPJ–lIPL–aIPL. In sum, these results exemplify, via the two-person fMRI, the neural substrate of shopping interactions between couples.
    ... Although we have focused on the use of validation criteria to make decisions about experiments, they can also be used to make decisions about which data to analyze. In fields such as electrophysiology or functional neuroimaging, for example, data typically pass through preprocessing pipelines before analysis: the use of predefined validation criteria could thus prevent the introduction of bias by researchers when exploring these pipelines (Phillips, 2004; Carp, 2012; Botvinik-Nezer et al., 2019). Genomics and a number of other high-throughput fields have also developed standard evaluation criteria to avoid bias in analysis (Kang et al., 2012). ...
    Article
    Full-text available
    The pressure for every research article to tell a clear story often leads researchers in the life sciences to exclude experiments that 'did not work' when they write up their results. However, this practice can lead to reporting bias if the decisions about which experiments to exclude are taken after data have been collected and analyzed. Here we discuss how to balance clarity and thoroughness when reporting the results of research, and suggest that predefining the criteria for excluding experiments might help researchers to achieve this balance.