Figure 10 - available via license: Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International

# Flowchart of study procedure

The present study consisted of five main sections (a, b, c, g, h). Sections and subsections colored teal (b, e, g, h) produced data that were analyzed for this study. Subsections on the right (i, j, k, l, m) are an expansion of section c of the flowchart, showing (i) the drawing page, (j) the drawing instructions, (k) one of the four stimulus graphs, (l) the graph caption, and (m) the upload instructions. See Methods for further procedural details.

Source publication

How do viewers interpret graphs that abstract away from individual-level data to present only summaries of data such as means, intervals, distribution shapes, or effect sizes? Here, focusing on the mean bar graph as a prototypical example of such an abstracted presentation, we contribute three advances to the study of graph interpretation. First, w...

## Citations

... The present study extends these findings to many other fields, while showing that this pattern was consistently observed from 2010 to 2020. Recent research indicates that some readers conflate bar graphs that show counts or proportions with bar graphs that show means of continuous data [21], even though these two data types have very different properties. When examining a bar graph of continuous data, one in five people incorrectly interpret the bar end as the maximum of the underlying data points, rather than the center of the data points [21]. ...

... Recent research indicates that some readers conflate bar graphs that show counts or proportions with bar graphs that show means of continuous data [21], even though these two data types have very different properties. When examining a bar graph of continuous data, one in five people incorrectly interpret the bar end as the maximum of the underlying data points, rather than the center of the data points [21]. This misinterpretation was found across general education levels [21]. ...

... When examining a bar graph of continuous data, one in five people incorrectly interpret the bar end as the maximum of the underlying data points, rather than the center of the data points [21]. This misinterpretation was found across general education levels [21]. An earlier study found that when comparing equidistant data points above and below the bar tip, viewers rated points within the bar as being more likely than points above the bar [22]. ...

Recent work has raised awareness about the need to replace bar graphs of continuous data with informative graphs showing the data distribution. The impact of these efforts is not known. This observational meta-research study examined how often scientists in different fields use various graph types, and assessed whether visualization practices have changed between 2010 and 2020. We developed and validated an automated screening tool, designed to identify bar graphs of counts or proportions, bar graphs of continuous data, bar graphs with dot plots, dot plots, box plots, violin plots, histograms, pie charts, and flow charts. Papers from 23 fields (approximately 1,000 papers/field/year) were randomly selected from PubMed Central and screened (n=227,998). F1 scores for different graphs ranged between 0.83 and 0.95 in the internal validation set. While the tool also performed well in external validation sets, F1 scores were lower for uncommon graphs. Bar graphs are more often used incorrectly to display continuous data than they are used correctly to display counts or proportions. The proportion of papers that use bar graphs of continuous data varies markedly across fields (range in 2020: 4%-58%), with high rates in biochemistry and cell biology, complementary and alternative medicine, physiology, genetics, oncology and carcinogenesis, pharmacology, microbiology and immunology. Visualization practices have improved in some fields in recent years. Fewer than 25% of papers use flow charts, which provide information about attrition and the risk of bias. This study highlights the need for continued interventions to improve visualization and identifies fields that would benefit most.
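For readers unfamiliar with the F1 score reported above, the following is a minimal sketch of how it is computed for a single graph type from classification counts. The function name `f1_score` and the example counts are illustrative assumptions; they are not taken from the study's screening tool or its validation data.

```python
def f1_score(true_pos: int, false_pos: int, false_neg: int) -> float:
    """Harmonic mean of precision and recall for one graph class.

    Hypothetical helper for illustration; not the study's code.
    """
    # With no true positives, precision and recall are zero or undefined,
    # so F1 is conventionally reported as 0.
    if true_pos == 0:
        return 0.0
    precision = true_pos / (true_pos + false_pos)  # share of flagged papers that are correct
    recall = true_pos / (true_pos + false_neg)     # share of actual cases that were flagged
    return 2 * precision * recall / (precision + recall)


# Made-up counts: 90 papers correctly flagged, 10 flagged in error, 10 missed.
example_f1 = f1_score(true_pos=90, false_pos=10, false_neg=10)
print(round(example_f1, 2))
```

Because F1 ignores true negatives, it stays informative for graph types that appear in only a small share of papers. With few positive examples, however, each misclassification shifts the score more, which is consistent with the lower F1 scores reported for uncommon graphs.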
