Coding and Analyzing Scientific Observations from Middle School Students in Minecraft

Copyright 2020 International Society of the Learning Sciences. Presented at the
International Conference of the Learning Sciences (ICLS) 2020. Reproduced by
permission.
Sherry Yi, Matt Gadbury, & H. Chad Lane
fangyi1@illinois.edu; gadbury2@illinois.edu; hclane@illinois.edu
University of Illinois at Urbana-Champaign
Abstract: Promoting interest in STEM is crucial to the recruitment and retention of
underrepresented populations in STEM fields. We created a one-week summer camp
program centered on astronomy, using Minecraft to help promote interest in science.
We capitalize on data logs collected from two 1-week camps in summer 2019 and code
the scientific observations made by campers by type and level of quality, measuring
inter-rater agreement with Cohen’s Kappa. Results showed that the majority of observations are
descriptive, comparative, inferential, or analogous, as opposed to being off-topic or factual.
We discuss possible reasons for this distribution and design implications for future
iterations.
Introduction
Our aim is to recruit underrepresented populations into STEM, and in particular, we focus on triggering
interest and promoting engagement with scientific content. Importantly, interest and engagement are known to
be critical for informal learning (Renninger, 2007). We are particularly focused on triggering interest for those
who typically show little to no interest in STEM. Triggering interest early and subsequently supporting that
interest is important if long-term individual interest is to develop (Hidi & Renninger, 2006). The gender
disparity in the STEM workforce has vastly improved since the 1990s; however, women’s presence varies
widely across occupations, and women continue to be underrepresented in computer and engineering jobs (Funk &
Parker, 2018). The same report found that Black workers represent 9% and Hispanic workers 7% of all
STEM workers, although together Black and Hispanic workers make up 27% of the entire
U.S. workforce. This gap is troubling, as employment in STEM occupations is expected to grow by 8.8% by
2028, compared with the 5% projected growth of non-STEM occupations (U.S. Bureau of Labor Statistics, 2019),
and it illustrates a need for additional research investigating the design of interest triggers that function across
underrepresented groups.
The context for our research is the popular game Minecraft, which allows players to exercise free
choice and engage with a simulation of the natural world (Lane & Yi, 2017). Minecraft is a sandbox game,
meaning it is open-ended and driven by player choice. Videogame consoles are a suitable platform for
generating STEM interest due to their high rate of ownership among minorities (Leith & Cotten, 2014).
Commercial videogame use has been linked to higher levels of information and communication technology
self-efficacy (Ball, Huang, Cotten, & Rikard, 2018) and has the power to increase technology-related skills
(Admiraal, 2015; Ting, 2010). Notably, interest in STEM tends to wane prior to entering high school (Maltese
& Tai, 2011); hence, we target our camp recruitment at early adolescents. Ball, Huang, Rikard, & Cotten (2019)
generalized that STEM interventions should increase students’ academic-related expectancies and values while
minimizing their emotional costs to address digital and STEM inequalities. We supplied all technical equipment
and implemented the camps at no cost to our nonprofit partner. We are building on our previous work (Lane,
Yi, Guerrero, & Comins, 2017; Yi, Lane, & Delialioğlu, 2019) using a one-week intervention consisting of
lectures on a hypothetical version of Earth (e.g., what if the Earth had a tilted axis?) and the exploration of such
worlds within Minecraft. Making scientific observations is one of the daily activities we ask campers to perform
using signs that can be planted anywhere in the Minecraft world (see Figure 1). In this paper, we discuss the
observations collected over two weeks of camp and describe our coding scheme that sorts scientific observations
by the quality of the observation to shed light on what campers were attending to and how we may improve our
intervention. The ability to identify the quality of scientific observations from campers will allow detection of
interest-triggering instances and enable a digital roadmap of interest development (e.g., as the quality of a camper’s
observations increases, we can examine whether their interest in science increases as well).
Methods
Sample
We partnered with a nonprofit community center and makerspace in the Midwest to host a series of summer
camps themed around astronomy using Minecraft. We hosted 21 participants in total (8 female) between the
ages of 11 and 14, all of whom were enrolled in the government program Teen REACH and qualified for free and
reduced lunch. All were African American except one Caucasian American. We required consent from both the
parent or guardian and the camper, and they were informed that the study was voluntary. Our sample mostly
communicated in their vernacular, and we were aware of the importance of language in understanding science
while coding our data (Finkelstein et al., 2013). We were not looking for observations in standard English or
comparing campers’ observations against it, but rather evaluating the quality of scientific content.
Data collection
We conducted two 1-week summer camps in the summer of 2019 (refer to Yi, Lane, & Delialioğlu, 2019).
Tables were set up in rows facing a television monitor where brief lectures on scientific concepts took place.
Each camper was provided a laptop with Minecraft installed and a mouse to use during the camp, and we
provided lab accounts at log-in to keep their identities anonymous. Campers were given brief 10-minute
lectures on hypothetical versions of Earth before exploring such worlds within Minecraft. The instructors
encouraged campers to make scientific observations and prompted them with additional
questions (e.g., “How did the moon form?”, “Why should we care about other planets?”). Following the camps,
we generated data logs containing: 1) the scientific observations made by campers through
Minecraft signs, which allow users to write messages for others to see, 2) the exact location of the user,
including the version of Earth occupied and the exact coordinates, and 3) the date and time each
observation was made (Figure 1).
Figure 1. An example of signs planted by campers stating their scientific observations.
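The three logged fields described above (sign text, location, and timestamp) can be pictured as one record per observation. The following is a minimal sketch only: the paper does not specify the log format, so the pipe-delimited layout, the field names, and the world name `no_moon` are hypothetical choices made for illustration.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class ObservationRecord:
    """One sign-based observation (field names are hypothetical)."""
    text: str            # message the camper wrote on the sign
    world: str           # which version of Earth the camper occupied
    x: int               # in-game coordinates of the sign
    y: int
    z: int
    timestamp: datetime  # date and time the observation was made


def parse_record(line: str) -> ObservationRecord:
    """Parse a line in the assumed 'timestamp|world|x|y|z|text' layout."""
    # Split at most 5 times so a '|' inside the observation text is preserved.
    stamp, world, x, y, z, text = line.split("|", 5)
    return ObservationRecord(
        text=text.strip(),
        world=world.strip(),
        x=int(x),
        y=int(y),
        z=int(z),
        timestamp=datetime.fromisoformat(stamp.strip()),
    )


# Illustrative line only; the observation text is borrowed from Table 1.
record = parse_record("2019-07-15T10:32:00|no_moon|120|64|-35|lots of coral")
print(record.world, record.text)
```

Keeping the location and time alongside the text is what would later allow observations to be related to exploration patterns, as the authors propose in their discussion of future work.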
Coding
The authors decided that the five initial codes developed for a similar camp implementation in 2018 did
not adequately capture the nuance in the data and expanded them to ten codes for the 2019 data used in this paper,
refining categorical definitions and adding an evaluation of the quality of the observation (whether the
connection between the scientific observation and its category was clear or ambiguous). We organized categories
by their level of sensemaking and reliance on external knowledge: factual observations and off-topic comments
are the lowest level, followed by descriptive and comparative as mid-level, and inference and analogy as the
highest level of active sensemaking (see Table 1). Two of the authors coded 200 lines of observation data, and
the measured Cohen’s Kappa was .87, indicating strong agreement.
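For readers unfamiliar with the statistic, Cohen’s Kappa corrects the raw agreement rate between two coders for the agreement expected by chance: kappa = (p_o - p_e) / (1 - p_e), where p_o is the observed proportion of matching codes and p_e is the chance agreement computed from each coder’s category frequencies. The sketch below is a generic implementation of that standard formula, not the authors’ analysis script; the function name and the six toy labels are assumptions made for illustration.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two equal-length sequences of categorical codes."""
    if len(rater_a) != len(rater_b) or len(rater_a) == 0:
        raise ValueError("both raters must code the same non-empty set of items")
    n = len(rater_a)

    # Observed agreement: share of items where both raters chose the same code.
    p_o = sum(a == b for a, b in zip(rater_a, rater_b)) / n

    # Chance agreement: for each code, the product of the two raters' marginal
    # proportions, summed over all codes.
    count_a = Counter(rater_a)
    count_b = Counter(rater_b)
    p_e = sum(count_a[code] * count_b[code] for code in count_a) / (n * n)

    if p_e == 1.0:
        raise ValueError("kappa is undefined when chance agreement equals 1")
    return (p_o - p_e) / (1 - p_e)


# Toy data (not from the study): the raters disagree on one of six observations.
coder_1 = ["factual", "factual", "descriptive", "descriptive", "comparative", "inference"]
coder_2 = ["factual", "factual", "descriptive", "comparative", "comparative", "inference"]
print(round(cohens_kappa(coder_1, coder_2), 2))  # 0.78
```

In the toy data the raw agreement is 5/6, but the chance agreement is 0.25, so kappa is 7/9, or about 0.78. This is why kappa is always lower than the simple percentage of matching codes unless agreement is perfect.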
Table 1: Codes used to describe scientific observations

| Category | Definition | Quality |
|---|---|---|
| Factual | Stating nouns without any elaboration. | N/A, direct observation (“carrots”) |
| Off-topic | Technology-related, conversational. | N/A, irrelevant to task (“smoking is good”) |
| Descriptive | Related to color, temperature, quantity, and other physical attributes such as weight or size. | Clear (“lots of coral”); Ambiguous (“the water is very nice”) |
| Comparative | Comparing one natural phenomenon to another; expectations are violated. | Clear (“different color grass”); Ambiguous (“the trees are different”) |
| Analogy | Comparing natural phenomena with another similar structure or object; an advanced form of comparative. | Clear (“tumble weed looking plant”); Ambiguous (“the trees look like animals”) |
| Inference | A hypothesis or explanation is proposed. | Clear (“trees grow because probably facing the sun a lot”); Ambiguous (“climate change”) |
Results
We found that 36% of observations were low level, comprising factual statements and off-topic comments.
Descriptive observations made up 28% of the data set, with the majority (about 92%) making clear rather than
ambiguous scientific connections. Comparative observations made up 22% of the data, with about 73% making clear
scientific connections and 27% ambiguous ones. We expected inferences and analogies to have the lowest
counts due to the open nature of the initial prompt, and our findings confirmed this. Inferences were 5% of
the data set; of those, 89% made clear scientific connections and 11% were ambiguous. Analogies were 3% of
the data set, with 60% clear scientific connections and 40% ambiguous. The remaining 6% of observations were
inter-rater disagreements, often in situations where it was difficult to decipher the camper’s intent (e.g., is
“ICEY HOT” referring to what the camper is seeing, or is the camper making a cultural reference to the Icy Hot
lidocaine patch?).
Table 2: Summary of scientific observations from 200 lines of data excluding inter-rater disagreements

| Category | Factual | Off-topic | Descriptive | Comparative | Analogy |
|---|---|---|---|---|---|
| Percentage | 34.95 | 3.76 | 30.11 | 23.66 | 7.53 |
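A distribution like the one in Table 2 can be obtained by dropping every line on which the two coders disagreed and then computing each category’s share of the remaining lines. The sketch below illustrates that calculation under stated assumptions: the function name is hypothetical, the five toy labels are not the study’s data, and the paper does not describe the authors’ actual tallying procedure.

```python
from collections import Counter


def category_percentages(codes_a, codes_b):
    """Percentage of each category among items where both coders agree."""
    if len(codes_a) != len(codes_b):
        raise ValueError("both coders must label the same items")

    # Keep only the items on which the two coders assigned the same category.
    agreed = [a for a, b in zip(codes_a, codes_b) if a == b]
    if not agreed:
        raise ValueError("no agreed items to summarize")

    counts = Counter(agreed)
    # Percentages are taken over agreed items only, so they sum to about 100.
    return {category: round(100 * n / len(agreed), 2) for category, n in counts.items()}


# Toy data (not from the study): the coders disagree on the last item.
coder_1 = ["factual", "factual", "descriptive", "comparative", "analogy"]
coder_2 = ["factual", "factual", "descriptive", "comparative", "inference"]
print(category_percentages(coder_1, coder_2))
```

Because disagreements are excluded from the denominator, percentages computed this way are slightly higher than the corresponding shares of all 200 coded lines, which is one reason the figures in Table 2 differ from those reported in the running text.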
Discussion
We consider this intervention with an underrepresented population a successful pilot, with more than half of the
observations in the data set belonging to a higher level of active sensemaking, including descriptive,
comparative, inference, and analogy. The process of analogical reasoning draws from multiple resources,
including long-term memory and working memory, to make sense of how the new inferences and abstract
schemas operate (Holyoak, 2012). These conversations can serve as a starting point for generating a personal
value of science and a well-developed interest in science, which is possible given the proper support from one’s
social network (e.g., family, friends, instructors). The majority of descriptive observations and inferences were
clear, with more ambiguity in comparative and analogous statements. The ambiguity in comparative and
analogous statements may be due to campers recognizing differences between the maps but not being sure of
the reason for the differences. Campers were mostly prompted to make observations upon first arriving on a
map, and explanations were given after they had been on the map for around 10 minutes. For example, campers were told
that wind speed would be much higher without the moon (Earth would spin faster without the
moon’s gravitational pull) and that trees would have to adapt to such conditions by having a stronger base. We did
not expect campers to know this information, but the fact that they frequently made comparative statements,
whether ambiguous or clear, shows a deeper level of processing through recalling previous maps or real-life
observations on Earth.
In terms of descriptive observations, the worlds are designed to closely resemble Earth, and the habitats
are rich in detail, giving the campers plenty of input for making observations. We hope certain features “pop
out” to campers, such as quantity or color. We also added a feature for campers to check the temperature at their
current location, which lets campers make descriptive observations of hot or cold. The more campers
observe about the biomes, the more aware they become of the features, which may lead to asking questions
and the triggering of interest. Although 33% of scientific observations were factual, we currently do not have
evidence that this necessarily prevents campers from making higher-level observations. For instance, does
the act of making factual statements lay a foundation that eventually leads to inferences or analogies? We
also credit the success of this pilot to the flexibility of sandbox games. It was Minecraft’s ability to
handle custom modifications that allowed campers to compare and contrast specific features within our worlds. We
consulted astrophysicists on our team about observable differences in hypothetical scenarios and the scientific
significance behind these differences, and then implemented the changes within Minecraft (e.g., making every
night on an Earth with no Moon equally dark, because without a cycle of lunar phases to change the brightness of the
night sky, there is no variation in how bright stars need to be to be visible).
While our curriculum included a level of instruction (e.g., sectioning camps to allow ample exploration
time of each planet), a majority of the camp’s design relied on free choice and the pursuit of one’s personal
interests. Campers were encouraged to actively explore their environment and their curiosity was fostered by
instructors and by peers, an experience that parallels that of a museum. Museum visitors are able to interact with
the workings of the natural world (Feher, 1990) and exercise free choice (Bamberger & Tal, 2007; Falk,
Storksdieck, & Dierking, 2007), and both behaviors tend to lead to deeper levels of science learning. Arguably,
sandbox games go a step further in experimentation with the natural world by granting the player the
power of unlimited resources and, in turn, a limitless possibility to learn from successes, mistakes, and
improvements made across the time span in which the player has access to the game and is interested in using it. The
importance of this research is to show that engagement with natural phenomena is possible in an open-space
digital environment and that sandbox games have the potential to help spark interest in STEM topics for
underrepresented adolescents. We plan to continue replicating our Minecraft worlds with the same conditions of
hypothetical versions of Earth that were proposed by Dr. Neil Comins (Comins, 2010). Future camp iterations
will track the progression of comments made by each individual camper. By
analyzing these data, we should be able to capture whether campers generated more high-quality scientific observations
as they progressed through the camp. Additionally, we hope to combine these types of scientific observations
made by learners with other sources of data, such as exploration patterns on our server maps, survey data
pertaining to STEM interest and Minecraft play patterns, and interview content to help inform our intervention
design. Lastly, we are working on the development of pedagogical agents that will help scaffold the camp
experience in hopes of increasing the overall quality and quantity of observations.
References
Admiraal, W. (2015). A role-play game to facilitate the development of students’ reflective internet skills.
International Forum of Educational Technology & Society, 18(3), 301–308.
Ball, C., Huang, K. T., Cotten, S. R., & Rikard, R. V. (2018). Gaming the SySTEM: The relationship between
video games and the digital and stem divides. Games and Culture, 1–28.
https://doi.org/10.1177/1555412018812513
Ball, C., Huang, K. T., Rikard, R. V., & Cotten, S. R. (2019). The emotional costs of computers: an expectancy-
value theory analysis of predominantly low-socioeconomic status minority students’ STEM attitudes.
Information Communication and Society, 22(1), 105–128.
https://doi.org/10.1080/1369118X.2017.1355403
Bamberger, Y., & Tal, T. (2007). Learning in a personal context: Levels of choice in a free choice learning
environment in science and natural history museums. Science Education, 91(1), 75–95.
https://doi.org/10.1002/sce.20174
Comins, N. F. (2010).What if the Earth had two Moons: and nine other thought-provoking speculations on
the solar system. Macmillan.
Falk, J. H., Storksdieck, M., & Dierking, L. D. (2007). Investigating public science interest and understanding:
Evidence for the importance of free-choice learning. Public Understanding of Science, 16(4), 455–469.
https://doi.org/10.1177/0963662506064240
Feher, E. (1990). Interactive museum exhibits as tools for learning: Explorations with light. International
Journal of Science Education, 12(1), 35–49. https://doi.org/10.1080/0950069900120104
Finkelstein, S., Yarzebinski, E., Vaughn, C., Ogan, A., & Cassell, J. (2013, July). The effects of culturally
congruent educational technologies on student achievement. In International Conference on Artificial
Intelligence in Education (pp. 493–502). Springer, Berlin, Heidelberg.
Hidi, S., & Renninger, K. A. (2006). The four-phase model of interest development. Educational Psychologist,
41(2), 111–127. https://doi.org/10.1207/s15326985ep4102
Holyoak, K. J. (2012). Analogy and relational reasoning. In K. J. Holyoak & R. G. Morrison (Eds.), The Oxford
handbook of thinking and reasoning (pp. 234–259). New York: Oxford University Press.
Lane, H. C., & Yi, S. (2017). Playing with virtual blocks: Minecraft as a learning environment for practice and
research. In Cognitive development in digital contexts. https://doi.org/10.1016/B978-0-12-809481-5.00007-9
Lane, H. C., Yi, S., Guerrero, B., & Comins, N. (2017). Minecraft as a sandbox for STEM interest development:
Preliminary results. Retrieved from http://www.wired.com/2015/05/data-effect-minecraft/
Renninger, K. A. (2007). Interest and motivation in informal science learning. Commissioned Paper for
Learning Science in Informal Environments Committee. IEEE Computer Society Press. Retrieved from
http://www.informalscience.com/researches/Renninger_Commissioned_Paper.pdf
Maltese, A. V., & Tai, R. H. (2011). Pipeline persistence: Examining the association of educational experiences
with earned degrees in STEM among U.S. students. Science Education, 95(5), 877–907.
https://doi.org/10.1002/sce.20441
Ting, Y. (2010). Using Mainstream Game to Teach Technology through an Interest Framework. International
Forum of Educational Technology & Society, 13(2), 141–152.
Yi, S., Lane, H. C., & Delialioğlu, O. (2019, August). What if we were twice as close to the Sun? Interview
findings from a science summer camp serving underrepresented youth. In F. Khosmood et al. (Eds.),
Proceedings of the 14th International Conference on the Foundations of Digital Games. San Luis Obispo,
CA, USA: Foundations of Digital Games.
U.S. Bureau of Labor Statistics. (2019, September 4). Employment in STEM occupations. Retrieved from
https://www.bls.gov/emp/tables/stem-employment.htm
Acknowledgments
We would like to express our gratitude to our technical team, composed of Dr. Jeff Ginger, Jack Henhapl, and
Aidan Rivera-Rogers, for their immense contributions to this project. This material is based upon work
supported by the National Science Foundation under Grant No. 1713609. Any opinions, findings, and
conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily
reflect the views of the National Science Foundation.