Getting Messy With Data

GETTING MESSY WITH DATA:
Tools and Strategies to Help Students Analyze
and Interpret Complex Data Sources
Joshua Rosenberg1, Alex Edwards2, and Bodong Chen3
1University of Tennessee, Knoxville
2Tate’s School Knoxville
3University of Minnesota
This is a preprint.
APA Citation: Rosenberg, J., Edwards, A., & Chen, B. (2020).
Getting messy with data: Tools and strategies to help students
analyze and interpret complex data sources. The Science Teacher,
87(5), 30–34.
Abstract
Analyzing and interpreting data is essential to the practice of scientists
and is also an essential science and engineering practice for science teaching
and learning. Although working with data has benefits for student learning,
it is also challenging, particularly with respect to aspects of work with
data that are not yet very common in schools, such as developing quantitative
models, understanding variation in data, and using larger, complex data
sources. In this article, we describe tools for engaging students in work
with data in your class as well as three general strategies: understanding
how data are collected, experiencing the transformation of messy data sets
in preparation for analysis, and modeling the data to answer a question. We
show how these strategies can be employed using the freely available,
browser-based Common Online Data Analysis Platform, and outline connections
to curricular standards.
GETTING MESSY WITH DATA The Science Teacher, 87(5)
1 Introduction
Analyzing and interpreting data may present challenges to teachers and
students because the Next Generation Science Standards (NGSS) emphasize data
analysis–related capabilities that are not often studied in the classroom,
such as developing quantitative models (Kastens 2015). Even making “simple”
observations, such as the height of the school’s flagpole, requires knowing
what, how, and how many times students should measure and record observations
in light of variation in the data (Lehrer, Kim, and Schauble 2007). Due to the
shift called for in the NGSS from knowing about scientific theories and ideas
to figuring out how the world works (Schwarz, Passmore, and Reiser 2017),
students can now learn about data not only as interpreters of quantitative
models, but also as creators of those models themselves (Lehrer, Kim, and
Jones 2011).
In addition, analyzing and interpreting newer sources of data, such as the
“big” data sets collected and created by scientists and engineers, presents ad-
ditional opportunities and challenges for science educators (Finzer, Busey, and
Kochevar 2018; Kastens, Krumhansl, and Baker 2015; Lee and Wilkerson 2018).
Traditionally, as part of the data-modeling approach, students use the data they
have collected themselves. In the context of larger sources of data, students often
use data originally collected for some other purpose, such as data from the city
in which they live (Wilkerson and Laina 2018). With large data sources, students
also need to deal with analytical challenges, such as the importance of
structuring hierarchical data (Konold et al. 2017) and using technological
tools (Finzer 2013).
In addition to what the educational tool can do, it is also essential to consider
how it aligns with particular pedagogical aims, content area, and context (Mishra
and Koehler 2006). Thus, we selected tools that we think exhibit some of the
characteristics of effective data analysis platforms for learners (see McNamara
2015). Additionally, the tools for working with data we identify are those that are
freely available (and do not require purchasing a license), browser-based (and so
can easily be used across computer operating systems), and relatively easy to use,
especially for students. This article also describes three strategies for analyzing
and interpreting complex data using the Common Online Data Analysis Platform
(CODAP).
2 Tools for working with data
2.1 Desmos
Mathematics and science teachers commonly use Desmos (see “On the web”) with,
or in place of, graphing calculators. Like graphing calculators, Desmos works
well with functions that do not require a data table, such as using the
function f(x) = sin(x) to display the form of that function. It also works well
with datasets. Data can be typed directly into Desmos or can be copied from
Google Sheets or other spreadsheet software. Then, functions, such as a sine,
linear, or quadratic function, can be estimated based on the data and added to
a graph. Even those who are not accustomed to writing equations can easily
write complex functions.
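To make concrete what "estimating a function based on the data" involves, the following is a minimal Python sketch of a least-squares quadratic fit. It is not how Desmos is implemented; the function name `fit_quadratic` and the ball-height data are hypothetical and chosen only for illustration.

```python
def fit_quadratic(xs, ys):
    """Least-squares fit of y = a*x**2 + b*x + c using the normal equations."""
    # s[k] is the sum of x**k; t[k] is the sum of (x**k) * y
    s = [sum(x**k for x in xs) for k in range(5)]
    t = [sum((x**k) * y for x, y in zip(xs, ys)) for k in range(3)]
    # Augmented matrix for the unknowns (a, b, c)
    m = [
        [s[4], s[3], s[2], t[2]],
        [s[3], s[2], s[1], t[1]],
        [s[2], s[1], s[0], t[0]],
    ]
    # Gauss-Jordan elimination with partial pivoting
    n = 3
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [v - factor * p for v, p in zip(m[r], m[col])]
    return tuple(m[i][3] / m[i][i] for i in range(n))


# Hypothetical data: height (m) of a tossed ball at each second
times = [0, 1, 2, 3, 4]
heights = [1, 16, 21, 16, 1]

a, b, c = fit_quadratic(times, heights)
print(f"height = {a:.2f}*t^2 + {b:.2f}*t + {c:.2f}")
```

With real classroom measurements the points would not fall exactly on a curve, and the fitted coefficients would summarize the trend rather than reproduce every point.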
2.2 Google Sheets
Google Sheets (see “On the web”) is widely used by science teachers and students,
especially in school districts using Google Suite. A benefit of Google Sheets is
that it bears similarities to other widely used tools, namely Microsoft Excel. This
may make it easy for students to begin to use this tool. Unlike Excel, Google
Sheets is browser-based, making it easy for students to collaborate through
a single Google Sheet. While many high school students may be familiar with
Google Sheets, its advanced functionality, such as writing commands to populate
cells with values that rely on other cells (e.g., to calculate the mean of
multiple variables) or fitting functions to data, likely requires additional
support. Finally, Google Sheets can sometimes make it so easy to create a figure
that students may not have the opportunity to think carefully about what each
part of the figure represents.
2.3 JASP and R
JASP is a statistical software program, based on R (R Core Team, 2019), that
students can use for data analysis (see “On the web”). R is a programming lan-
guage designed for data analysis. Unlike R, JASP has a point-and-click interface,
through which it is possible to perform a wide array of statistical tests. JASP may
be most useful for teachers who want students to conduct complicated analyses,
such as t-tests for how two means (or averages) differ or multiple regression
analyses. In addition, R is most commonly used via RStudio software, which
executes R and provides additional functionality for enhanced data-analysis
workflows. While R is challenging to use, in some advanced applications (such
as methodologies for analyzing phylogenetic data) it may be useful to turn to
it. JASP has both a desktop version and a browser-based version. There is also
a browser-based version of RStudio, known as RStudio Cloud.
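For readers curious about what a t-test for the difference between two means computes, here is a minimal Python sketch of Welch's t statistic using only the standard library. It is an illustration rather than the procedure JASP or R runs internally; the function name `welch_t` and the plant-height data are hypothetical.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(sample_a, sample_b):
    """Return Welch's t statistic and its approximate degrees of freedom."""
    na, nb = len(sample_a), len(sample_b)
    # Squared standard error of each sample mean
    va, vb = variance(sample_a) / na, variance(sample_b) / nb
    t = (mean(sample_a) - mean(sample_b)) / sqrt(va + vb)
    # Welch-Satterthwaite approximation for the degrees of freedom
    df = (va + vb) ** 2 / (va**2 / (na - 1) + vb**2 / (nb - 1))
    return t, df


# Hypothetical plant heights (cm) under two light conditions
sun = [12.1, 13.4, 11.8, 14.0, 12.9]
shade = [10.2, 9.8, 11.1, 10.5, 9.9]

t_stat, df = welch_t(sun, shade)
print(f"t = {t_stat:.2f}, df = {df:.1f}")
```

The statistic compares the difference between the two means to the variability within each group, which is why the same difference in means can be convincing in one data set and not in another.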
2.4 The Common Online Data Analysis Platform (CODAP)
The Common Online Data Analysis Platform (CODAP) provides a distinctive
interface to view, transform, and analyze data and create and interpret graphs.
Developed by the Concord Consortium, CODAP (see “On the web”) draws upon
past research and development of TinkerPlots and Fathom statistical software.
One distinctive feature, related to how both data and graphs can be viewed
together, is that elements of graphs, such as dots on a scatterplot, can be
clicked on to view the data to which they correspond (Figure 1). Another
distinctive feature of CODAP is its drag-and-drop interface. For example, to
create a graph, columns from a data table can be dragged to the x- or y-axis,
or onto the grid of the graph to color points based on the values in the
column. It is also easy to load data, as long as you can save the data as a
.CSV file (which can be done in Google Sheets or other software): a .CSV file
can be dragged into the window to load the file as a table. In addition to
being easy to use, CODAP has more advanced functionality, such as the ability
to fit quantitative models (e.g., linear models, or models for simple linear
regression). Additional resources include tutorials, example data sets, and
activities.
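As an illustration of the .CSV format mentioned above, the following Python sketch writes a small table as CSV text, with one header row followed by one row per case. The column names and values are hypothetical; to produce a file that could be dragged into CODAP, the in-memory buffer would be swapped for a file opened for writing.

```python
import csv
import io

# Hypothetical cases; the column names are illustrative only
rows = [
    {"species": "otter", "mass_kg": 8.5, "habitat": "both"},
    {"species": "lion", "mass_kg": 190.0, "habitat": "land"},
]

# Write a header row and then one line per case
buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=["species", "mass_kg", "habitat"])
writer.writeheader()
writer.writerows(rows)
csv_text = buffer.getvalue()
print(csv_text)

# Reading the text back shows the table structure a tool would load
parsed = list(csv.DictReader(io.StringIO(csv_text)))
```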
3 CODAP strategies for analyzing and interpreting complex data sources
We have been engaged in research about how tools such as those described above
(and in particular CODAP, because of its distinctive features) can be used to
support student learning in the context of the NGSS. We have identified several
research-based strategies that align with past research and that can be used as
a part of longer investigations across a lesson sequence, over a unit, or as a
part of a single lesson or class.
Figure 1: A screenshot of the freely available, browser-based CODAP software.
While we focus on how these strategies can be employed using CODAP, each
could also be employed using the other tools described in this article or tools
not described here. Finally, while the strategies can be considered on their own,
they may best be considered as a part of a cycle, where students first explore how
the data were collected, then prepare a data set for analysis, and finally model
the data in order to answer a driving question.
3.1 Strategy 1: Explore how the data were collected or created
Creating or collecting data is an essential step in the data analysis process
(Hancock, Kaput, and Goldsmith 1992). This step can also serve as an introduction
to working with data, particularly for students who are familiar with the
practice. When students record observations themselves, they have the
opportunity to consider how the data get created and may be more confident when
analyzing them later on. When students use already-collected data or secondary
data, there
are still benefits to considering how the data came to be: it remains important
for students to have the chance to think about how the data were originally
collected or created. These discussions may lead students to question how and
why the data were collected and to consider sources of bias (deliberate or
unintentional) that change the nature of the data, which can be seen as an
example of critical data literacy (Hautea, Dasgupta, and Hill 2017).
Figure 2: Which mammal eats meat and lives on both land and water? Corresponding
data points and their place in figures.
To help students explore how the data were collected, start with data that
represents a single case. Often, the data that students are analyzing are
aggregates of individual cases of data, such as when a data set includes a
column representing the mean of a measurement collected multiple times. In
CODAP, this is supported by the connections between the data points and the
figure (see Figure 2). Another way is to talk through, with students, what the
data collection process was like, or what it could have been like, as
facilitated through a discussion of a description of a study associated with
the data, a codebook describing what the variables are, or a data collection
instrument (or a description of one).
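The point about aggregates can be illustrated with a short Python sketch. It assumes two hypothetical groups who each measured the height of a flagpole several times; the values are invented for illustration. The aggregated table looks identical for both groups, while the individual cases reveal that the measurements were produced quite differently.

```python
from statistics import mean, stdev

# Hypothetical repeated measurements of a flagpole's height (m)
measurements = {
    "group_a": [9.8, 10.1, 10.0, 10.1],
    "group_b": [8.0, 12.0, 9.0, 11.0],
}

# The aggregated view students often receive: one mean per group
aggregated = {group: mean(values) for group, values in measurements.items()}

# Returning to the individual cases shows how much each group's values vary
spread = {group: stdev(values) for group, values in measurements.items()}

for group in measurements:
    print(f"{group}: mean = {aggregated[group]:.2f}, sd = {spread[group]:.2f}")
```

A discussion of why one group's measurements vary so much more than the other's leads naturally to questions about the data collection process.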
3.2 Strategy 2: Analyze complex data
Working with clean, tidy data makes it easier for students to reach conclusions;
however, particularly with complex sources of data, the need to think about
and work through the messier parts of the process—such as renaming and se-
lecting variables and joining together multiple datasets—can also be beneficial
(Kjelvik and Schultheis 2019; Konold, Finzer, and Kreetong 2017; Schultheis and
Kjelvik 2015; Wilkerson-Jerde et al. 2017). In CODAP, it is easy to include data
sets that are hierarchically structured or to create nested data structures. In this
way, students can see and explore connections between data at multiple levels.
Figure 3 depicts how all of the observations associated with one elephant seal
are grouped.
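The messier steps named above (renaming and selecting variables, joining data sets, and nesting observations) can be sketched in a few lines of Python. This is only an illustration of the ideas, not of how CODAP works internally; the two tables, their column names, and their values are hypothetical.

```python
from collections import defaultdict

# Hypothetical tables: one row per seal, and one row per dive observation
seals = [
    {"seal_id": "S1", "sex": "F"},
    {"seal_id": "S2", "sex": "M"},
]
dives = [
    {"seal": "S1", "Depth (m)": 410},
    {"seal": "S2", "Depth (m)": 530},
    {"seal": "S1", "Depth (m)": 385},
]

# Rename and select variables so both tables share a tidy key
tidy_dives = [{"seal_id": d["seal"], "depth_m": d["Depth (m)"]} for d in dives]

# Join: attach each seal's attributes to its dive observations
seal_lookup = {s["seal_id"]: s for s in seals}
joined = [{**d, "sex": seal_lookup[d["seal_id"]]["sex"]} for d in tidy_dives]

# Nest: group all observations under the seal they belong to
by_seal = defaultdict(list)
for row in joined:
    by_seal[row["seal_id"]].append(row["depth_m"])

print(dict(by_seal))
```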
Figure 3: Analyzing hierarchical data in CODAP.
Another way to engage students in the messier parts of data analysis is even
simpler: allow some time for students to explore the data and to generate their
own ideas about the data. This can be an especially useful way to expose
students to raw, messy data, akin to the kinds that scientists create and use,
but which may also require greater time and effort than is required in typical
data analyses (see Data Nuggets in “On the web” for structured activities that
involve students in analyzing complex data from scientists).
3.3 Strategy 3: Model and explain variability in the data to answer a question and solve a problem
Finally, a central goal of statistical models—and statistics—is to understand what
is going on in light of variation in the data (Aridor and Ben-Zvi 2019; Lehrer, Kim,
and Schauble 2007). Importantly, explaining variability does not need to involve
highly complex models: even a mean or a median can be an important summary.
A key part of using this strategy is recognizing that it is not essential for
students to learn about the mean or the median as ends in themselves; rather, it
is important that students have the opportunity to use statistics that are
useful for determining what is going on with something concrete: a phenomenon.
Figure 4: Adding a regression (linear model) in CODAP.
When using this strategy, it is important to push students to reach and to
defend their conclusions in light of variability to answer an authentic
question, such as a driving question that allows students the opportunity to
answer the question in multiple ways, as well as share and revise their
answers. In CODAP, modeling and explaining variability is easy to do by
clicking on an already-created figure, as demonstrated in Figure 4. In addition
to adding a model, such as the linear model depicted in Figure 4, students can
also add statistics, such as the mean and median, to a graph (and to groups
depicted within a graph). Students can also represent how spread out a variable
is through the inclusion of statistics, such as the standard deviation or the
range, and through adding graphical representations of these statistics to a
graph.
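The quantities described above can be sketched in Python to show what a tool computes when a linear model or a summary statistic is added to a graph. This is a minimal illustration under stated assumptions, not CODAP's implementation; the helper `fit_line` and the sunlight and growth data are hypothetical.

```python
from statistics import mean, median, stdev


def fit_line(xs, ys):
    """Ordinary least-squares slope and intercept for y = slope*x + intercept."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar


# Hypothetical data: daily hours of sunlight and plant growth (cm)
hours = [1, 2, 3, 4, 5]
growth = [2.1, 3.9, 6.2, 7.8, 10.0]

slope, intercept = fit_line(hours, growth)

# Residuals: the variation in growth that the line does not explain
residuals = [y - (slope * x + intercept) for x, y in zip(hours, growth)]

# Summaries of center and spread for the growth variable
summary = {
    "mean": mean(growth),
    "median": median(growth),
    "sd": stdev(growth),
    "range": max(growth) - min(growth),
}
print(f"growth = {slope:.2f} * hours + {intercept:.2f}")
print(summary)
```

Asking students what the residuals represent, and why they are not all zero, is one way to keep the focus on explaining variability rather than on the formula itself.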
4 Conclusion
Working with data is essential to the practice of scientists as well as to
science teaching and learning. As you consider these tools and strategies, we
encourage you to think creatively: data do not have to be about something
separate from students’ investigations of the world. In many cases, data can
come directly from students’ experiences in your classroom or their lives. We
encourage you to seek out complex data sources that not only help your students
demonstrate a performance expectation, or standard, but also provide chances
for students to investigate, critique, and share what they find about topics
that are of interest, personal investment, or relevance to them. Doing so can
support a shift away from students learning about the world toward figuring out
how and why the world works in the ways it does.
On the Web
CODAP: https://codap.concord.org/releases/latest/static/dg/en/cert/index.html
CODAP for educators: https://codap.concord.org/for-educators/
Data Nuggets: http://datanuggets.org
Desmos: www.desmos.com/calculator
Google Sheets: www.google.com/sheets
Fathom: https://fathom.concord.org
JASP: https://jasp-stats.org
JASP (browser-based): www.rollapp.com/app/jasp
RStudio Cloud: https://rstudio.cloud
TinkerPlots: www.tinkerplots.com
5 References
Aridor, K., and D. Ben-Zvi. 2019. Students’ aggregate reasoning with covariation.
In Topics and Trends in Current Statistics Education Research, 71–94. New
York: Springer.
Finzer, W. 2013. The data science education dilemma. Technology Innovations in
Statistics Education 7 (2): 1–9.
Finzer, W., A. Busey, and R. Kochevar. 2018. Data-driven inquiry in the PBL
classroom. The Science Teacher 86 (1): 28–34.
Hancock, C., J.J. Kaput, and L.T. Goldsmith. 1992. Authentic inquiry with data:
Critical barriers to classroom implementation. Educational Psychologist 27 (3):
337–364.
Hautea, S., S. Dasgupta, and B.M. Hill. 2017. Youth perspectives on critical data
literacies. Proceedings of the 2017 CHI Conference on Human Factors in
Computing Systems, 919–930. https://doi.org/10.1145/3025453.3025823
Kastens, K. 2015. Data use in the Next Generation Science Standards (revised
edition) [White paper]. Waltham, MA: Oceans of Data Institute, Education
Development Center, Inc. Retrieved from
http://oceansofdata.edc.org/our-work/data-next-generation-science-standards
Kjelvik, M.K., and E.H. Schultheis. 2019. Getting messy with authentic data:
Exploring the potential of using data from scientific research to support student
data literacy. CBE—Life Sciences Education 18 (2): 1–8.
Konold, C., W. Finzer, and K. Kreetong. 2017. Modeling as a core component of
structuring data. Statistics Education Research Journal 16 (2): 191–212.
Lee, V.R., and M. Wilkerson. 2018. Data use by middle and secondary students
in the digital age: A status report and future prospects. Commissioned paper
for the National Academies of Sciences, Engineering, and Medicine, Board
on Science Education, Committee on Science Investigations and Engineering
Design for Grades 6–12. Washington, DC.
Lehrer, R., M.J. Kim, and L. Schauble. 2007. Supporting the development of
conceptions of statistics by engaging students in measuring and modeling
variability. International Journal of Computers for Mathematical Learning 12 (3):
195–216.
Lehrer, R., M.J. Kim, and R.S. Jones. 2011. Developing conceptions of statistics by
designing measures of distribution. ZDM 43 (5): 723–736.
McNamara, A. 2015. Bridging the gap between tools for learning and for doing
statistics [doctoral dissertation]. Retrieved from
https://cloudfront.escholarship.org/dist/prd/content/qt1mm9303x/qt1mm9303x.pdf
Mishra, P., and M.J. Koehler. 2006. Technological pedagogical content
knowledge: A framework for teacher knowledge. Teachers College Record 108 (6):
1017–1054.
National Research Council. 2012. A framework for K–12 science education:
Practices, crosscutting concepts, and core ideas. Washington, DC: National
Academies Press.
NGSS Lead States. 2013. Next generation science standards: For states, by states.
Washington, DC: National Academies Press.
R Core Team. 2019. R: A language and environment for statistical computing.
Vienna, Austria: R Foundation for Statistical Computing.
https://www.R-project.org/
Schultheis, E.H., and M.K. Kjelvik. 2015. Data nuggets: Bringing real data into the
classroom to unearth students’ quantitative & inquiry skills. The American
Biology Teacher 77 (1): 19–26.
Wilkerson, M.H., and V. Laina. 2018. Middle school students’ reasoning about
data and context through storytelling with repurposed local data. ZDM 50
(7): 1223–1235.
Wilkerson, M.H., K.A. Lanouette, R.L. Shareff, T. Erickson, N. Bulalacao, J. Heller,
and F. Reichsman. 2018. Data moves: Restructuring data for inquiry in a
simulation and data analysis environment. In Making the Learning Sciences
Count, 13th International Conference of the Learning Sciences, eds. J. Kay
and R. Luckin, 1383–1384. London, UK: International Society of the Learning
Sciences.
8CAM:<J J
F11
... The above educational efforts and tools are considered in light of recent scholarship (theoretical and empirical) on issue-based science instruction, teaching and learning of atmospheric science content, data visualization, and modeling (Finzer, 2013;Konold et al., 2017). More specifically, we review research on student development of analytic and modeling skills such as analyzing "messy" data sets and developing quantitative models (Rosenberg et al., 2020) as well as student development of visualization abilities; for example, we will explore the ability of students to visualize actual meteorological conditions behind symbolic representations on a weather map or situational awareness (Wilson, 2020). Our ultimate goal is to articulate a theory-based, research-informed perspective on the pedagogical potential of the Mesonet to promote data-intensive, issue-based instruction of atmospheric science at the school level. ...
... A good example is the traditional meteograms, which typically display abstract representations of the values of the different weather parameters (numbers, lines, etc.) (see Fig. 2). Although these platforms are laudable for their potential to help students develop data analytic and modeling skills (e.g., analyze "messy" data sets, develop quantitative models) (Rosenberg et al., 2020), their highly abstract nature can pose interpretive challenges to younger students who often are concrete thinkers (Leppink et al., 2013;Mayer, 2005) and may be unable to visualize the actual meteorological conditions behind the symbolic representations on the meteogram. As research shows, computer-based graphs are not always "transparent" to students as commonly assumed (Ainley, 2000;Aydın-Güç et al., 2022). ...
Article
Full-text available
This theoretical article proposes using statewide weather-observing networks (Mesonets) to support data-intensive, issue-based teaching of atmospheric topics in middle and high school science. It is argued that the incorporation of this new technology and its affordances into the school curriculum can drastically change the ways that atmospheric topics are taught and learned in classroom settings, from dull lectures to engaging explorations of weather phenomena with potential not only to spark in-the-moment curiosity but also long-term interest in STEM. However, this educational revolution is contingent upon the availability of instructional materials that are pedagogically sound and developmentally appropriate. School-aged students require strategic instructional design and supportive pedagogic scaffolding to pursue their curiosity feelings and develop a motivational profile that is conducive to interest in STEM (self-efficacy, outcome expectation, etc.) as well as situational awareness. In addition to articulating the theoretical underpinnings of this proposition, an account is provided of ongoing efforts to turn this cutting-edge scientific technology into a curriculum space for students to explore weather phenomena, conduct map-based inquiries, and engage in data-based deliberation in the context of real-world issues. Centered on the provision of investigative cases that are locally situated and relevant to students’ lifeworld (place), the Backyard Weather Curriculum is presented to illustrate how this can be accomplished through the adoption of a place-based approach wherein relevance serves as an essential design principle for curricular development and enactment. Such curriculum, it is argued, can help promote student development from curious explorers to inquirers with a deep epistemic interest in STEM.
Article
We propose a conceptual framework for STEM education that is centered around justice for minoritized groups. Justice‐centered STEM education engages all students in multiple STEM subjects, including data science and computer science, to explain and design solutions to societal challenges disproportionately impacting minoritized groups. We articulate the affordances of justice‐centered STEM education for one minoritized student group that has been traditionally denied meaningful STEM learning: multilingual learners (MLs). Justice‐centered STEM education with MLs leverages the assets they bring to STEM learning, including their transnational experiences and knowledge as well as their rich repertoire of meaning‐making resources. In this position paper, we propose our conceptual framework to chart a new research agenda on justice‐centered STEM education to address societal challenges with all students, especially MLs. Our conceptual framework incorporates four interrelated components by leveraging the convergence of multiple STEM disciplines to promote justice‐centered STEM education with MLs: (a) societal challenges in science education, (b) justice‐centered data science education, (c) justice‐centered computer science education, and (d) justice‐centered engineering education. The article illustrates our conceptual framework using the case of the COVID‐19 pandemic, which has presented an unprecedented societal challenge but also an unprecedented opportunity to cultivate MLs' assets toward promoting justice in STEM education. Finally, we describe how our conceptual framework establishes the foundation for a new research agenda that addresses increasingly complex, prevalent, and intractable societal challenges disproportionately impacting minoritized groups. 
We also consider broader issues pertinent to our conceptual framework, including the social and emotional impacts of societal challenges; the growth of science denial and misinformation; and factors associated with politics, ideology, and religion. Justice‐centered STEM education contributes to solving societal challenges that K‐12 students currently face while preparing them to shape a more just society.
Article
Full-text available
This snapshot illustrates my use of the Common Online Data Analysis Platform (CODAP), a web-based tool, to perform a sampling data task embedded within a real-world phenomenon. The aim is to identify the optimal sampling land areas on the map for estimating the population. I utilized a public dataset containing densely located alternative fuel stations and expansive vacant regions spread throughout a rectangular area. I sampled rectangular areas on the map and discussed the sampling results with a focus on the variability between and across the samples, primarily using mean absolute deviation as a measure of variation. The sampling task in CODAP is shared with the reader.
Article
Integrating data literacy into K-12 education in an increasingly data-driven society is imperative. Data literacy is conceptualized as an interdisciplinary competence that extends beyond traditional statistical understanding, encompassing skills in accessing, analyzing, interpreting, and effectively communicating insights derived from data. The paper argues for a paradigm shift in educational approaches, advocating for incorporating contextual, inquiry-based methodologies over the traditional formalisms-first approach. This shift is essential for enhancing students' ability to apply data literacy skills in real-world contexts. The limitations of a formalisms-first pedagogical approach are discussed, highlighting its potential to restrict students' practical application of theoretical knowledge. In contrast, the article advocates for inquiry-driven educational strategies like project-based and problem-based learning to foster deeper engagement and understanding of data literacy. These strategies may be more effective in connecting theoretical concepts with students' lived experiences and real-world applications. Additionally, the paper argues that data literacy should be framed as language. Designers of data literacy learning progressions should draw on examples from mathematics and science domains and research to build students' understanding of the transformation processes from data to evidence and subsequently to models and explanations. Further, the article explores the integration of technology in data literacy education. It underscores the role of digital tools and platforms in facilitating interactive, hands-on experiences with complex data sets, enriching the learning process, and preparing students for the challenges of the digital era. In conclusion, the article calls for a comprehensive, interdisciplinary approach to data literacy education underpinned by technology-enhanced learning environments. 
This approach is essential for developing both the technical skills for data manipulation and a critical mindset for data evaluation and interpretation, thereby cultivating a responsible, data-literate citizenry capable of informed decision-making in a data-rich world.
Article
Full-text available
This volume is largely about nontraditional data; this paper is about a nontraditional visualization: classification trees. Using trees with data will be new to many students, so rather than beginning with a computer algorithm that produces optimal trees, we suggest that students first construct their own trees, one node at a time, to explore how they work, and how well. This build‐it‐yourself process is more transparent than using algorithms such as CART; we believe it will help students not only understand the fundamentals of trees, but also better understand tree‐building algorithms when they do encounter them. And because classification is an important task in machine learning, a good foundation in trees can prepare students to better understand that emerging and important field. We also describe a free online tool—Arbor—that students can use to do this, and note some implications for instruction.
Article
As the field of data science evolves with advancing technology and methods for working with data, so do the opportunities for re‐conceptualizing how we teach undergraduate statistics and data science courses for majors and non‐majors alike. In this paper, we focus on three crucial components for this re‐conceptualization: Developing research questions, professional ethics, and team collaborations. We share vignettes from two teams of undergraduate statistics or data science majors at two different stages of their development (novice and expert) while they worked on a DataFest data challenge. These vignettes shed light on opportunities for re‐conceptualizing introductory courses to give more attention to issues of the process of developing focused research questions when given a complex data set, professional ethics and bias, and how to collaborate effectively with others. We provide some implications for teaching and learning as well as an example activity for educators to use in their courses.
Article
Full-text available
With improving technology and monitoring efforts, the availability of scientific data is rapidly expanding. The tools that scientists and engineers use to analyse data are changing in response. At the same time, science education standards have shifted to emphasize the importance of students making sense of data in science classrooms. However, it is not yet known whether these exciting new datasets and tools are used in science classrooms, and what it would take to facilitate their use. To identify opportunities, research is needed to capture the data practices currently performed in classrooms, and the roles of technology for student learning. Here, we report findings from a survey conducted in the United States of 330 science teachers on the data sources, practices and technologies common to their classroom. We found that teachers predominantly involve their students in analysing relatively small data sets that they collect. In support of this work, teachers tend to use the technologies that are available to them, namely calculators and spreadsheets. In addition, we found that a subset of teachers used a wide variety of data sources of varying complexity. We discuss what these findings suggest for practice, research and policy, with an emphasis on supporting teachers based on their needs.
Practitioner notes. What is already known about this topic: Collecting and analysing data are central to the practice of science, and these skills are taught in many science classrooms at the pre‐collegiate (grades K‐12) level. Data are increasingly important in society and STEM, and types and sources of data are rapidly expanding. These changes have implications for science teachers and students.
What this paper adds: We found that the predominant data source science teachers use is student‐collected, small data sets. Teachers use digital tools familiar and available to them: spreadsheets and calculators. Teachers perceive the cost and time it would take to learn to use digital tools to analyse data with their students as key barriers to adopting new tools. Despite the predominance of small, student‐collected data analysed using spreadsheets or calculators, we also found notable variability in the data sources and digital tools some teachers used with their students.
Implications for practice and/or policy: Many of the changes called for in science education standards and reform documents, regarding how students should collect and analyse data, have not yet been fully realized in pre‐collegiate classrooms. Science teacher educators and science education researchers should build curricula and develop digital tools based on which kinds of data sources and digital tools teachers presently use, while encouraging more complex data usage in the future.
Article
Data are becoming increasingly important in science and society, and thus data literacy is a vital asset to students as they prepare for careers in and outside science, technology, engineering, and mathematics and go on to lead productive lives. In this paper, we discuss why the strongest learning experiences surrounding data literacy may arise when students are given opportunities to work with authentic data from scientific research. First, we explore the overlap between the fields of quantitative reasoning, data science, and data literacy, specifically focusing on how data literacy results from practicing quantitative reasoning and data science in the context of authentic data. Next, we identify and describe features that influence the complexity of authentic data sets (selection, curation, scope, size, and messiness) and implications for data-literacy instruction. Finally, we discuss areas for future research with the aim of identifying the impact that authentic data may have on student learning. These include defining desired learning outcomes surrounding data use in the classroom and identifying best teaching practices for using data in the classroom to develop students' data-literacy abilities.
Article
Publicly-available datasets, though useful for education, are often constructed for purposes that are quite different from students' own. To investigate and model phenomena, then, students must learn how to repurpose the data. This paper reports on an emerging line of research that builds on work in data modeling, exploratory data analysis, and storytelling to examine and support students' data repurposing. We ask: What opportunities emerge for students to reason about the relationship between data, context, and uncertainty when they repurpose public data to explore questions about their local communities? And, How can these opportunities be supported in classroom instruction and activity design? In two exploratory studies, students were asked to pose questions about their communities, use publicly-available data to explore those questions, and create visual displays and written stories about their findings. Across both enactments, opportunities for reasoning emerged especially when students worked to reconcile (1) their own knowledge and experiences of the context from which data were collected with details of the data provided; and (2) their different emerging stories about the data with one another. We review how these opportunities unfolded within each enactment at the level of group and classroom, with attention to facilitator support.
Poster
We explore data transformations, actions investigators take to make datasets more useful for intended inquiries. Fourteen young adults were interviewed while they interacted with online science and engineering games and were provided their own gameplay log data to improve their scores. We investigate the conditions under which data transformations are likely to emerge; provide examples of data moves as enacted by participants; and propose an initial taxonomy of data transformations and potential developmental supports.
Technical Report
What it means to work with data has changed significantly since the preparation and publication of America’s Lab Report (Singer, Hilton, & Schweingruber, 2006) in ways that are impacting students, educators, and the very practice of science. This change is expressing itself most obviously in the abundance of data that can be collected and accessed by students and teachers. There are also notable changes in the types of data (e.g., GPS data, network data, qualitative/verbal data) that are now readily available, and the purposes for which data are collected and analyzed. These shifts have both generated enthusiasm and raised a number of questions for K-12 science educators as new science standards are being adopted across the United States. The questions driving this paper are: In this age of data abundance, what is the state of research on data use to support middle and secondary students’ learning? And, how might science and engineering education and educational research for those grade levels adapt to the changes in data availability and use observed in the past 10 years?
Article
We gave participants diagrams of traffic on two roads with information about eight attributes, including the type of each vehicle, its speed, direction and the width of the road. Their task was to record and organize the data to assist city planners in its analysis. Successfully encoding the information required the creation of a case, a physical record of one repetition of a repeatable observational process. We analyzed data sheets participants created including the methods they used to bind information together into cases. Overall, 79% of their data sheets successfully encoded the data. Even 62% of the middle school students were able to create a bound structure that could hold the critical information from the diagrams. A majority of these structures involved a hierarchy of cases rather than the "flat" case-by-attribute structure that virtually all statistical software requires. Our sense is that participants strove to create a structure that modeled the real world as closely as they could, constructing cases that corresponded to the different sorts of objects they perceived: vehicles with their characteristics nested within road segments with their characteristics. © International Association for Statistical Education (IASE/ISI), November, 2017.
Chapter
Helping students interpret and evaluate the relations between two variables is challenging. This chapter examines how students’ aggregate reasoning with covariation (ARwC) emerged while they modeled a real phenomenon and drew informal statistical inferences in an inquiry-based learning environment using TinkerPlots™. We focus in this illustrative case study on the emergent ARwC of two fifth-graders (aged 11) involved in statistical data analysis and modelling activities and in growing samples investigations. We elucidate four aspects of the students’ articulations of ARwC as they explored the relations between two variables in a small real sample and constructed and improved a model of the predicted relations in the population. We finally discuss implications and limitations of the results. This article contributes to the study of young students’ aggregate reasoning and the role of models in developing such reasoning.
Book
Helping Students Make Sense of the World through Next Generation Science and Engineering Practices provides educators an understanding of the practices strand of A Framework for K-12 Science Education (Framework) and the Next Generation Science Standards (NGSS). It is written in clear, nontechnical language using real-world examples to show what's different about practice-centered teaching and learning at all grade levels. It unpacks each practice and provides information about what is important about practices, how to expand relationships among teachers, students and practices for more equitable learning, how to support practices through classroom talk, and how to get started using practices.