
The Effects of Adaptive Sequencing Algorithms on Player Engagement within an Online Game


S.A. Cerri and B. Clancey (Eds.): ITS 2012, LNCS 7315, pp. 591–593, 2012.
© Springer-Verlag Berlin Heidelberg 2012
Derek Lomas1, John Stamper1, Ryan Muller1, Kishan Patel2, and Kenneth Koedinger1
1 Carnegie Mellon University
2 Dhirubhai Ambani Institute of Information and Communication, Gujarat, India
Abstract. Using the online educational game Battleship Numberline, we have
collected over 8 million number line estimates from hundreds of thousands of
players. Using random assignment, we evaluate the effects of various adaptive
sequencing algorithms on player engagement and learning.
Keywords: games, number sense, engagement, adaptive sequencing.
1 Introduction
Number line estimation accuracy is highly correlated with math achievement scores in
grades K-8 [4]. To promote practice with number line estimation, we have developed
Battleship Numberline, a game that involves estimating the location of ships on a
number line. Using this game, we have collected over 8 million number line estimates
from several hundred thousand online players. Instructional items in the game are
typically presented in random order, but we hypothesize that an adaptive sequence
will result in an improved learning experience.
Adaptive instructional sequences are best known for increasing the efficiency of
learning [2]. However, Pavlik et al. [3] reported that students tended to choose an
adaptive sequence of foreign language instructional items over a random sequence of
items. We further explore this phenomenon by investigating whether adaptive
sequences can increase motivation to engage in a learning activity.
2 Adaptive Sequences
Conati and Zhao [1] describe using Bayesian Knowledge Tracing (BKT) to promote
learning in an educational game. However, many games use far simpler algorithms to
promote learning and player interest; for instance, they may require a player to perform
flawlessly on a level before progressing to the next. Could simpler adaptive
algorithms achieve comparable performance to Bayesian Knowledge Tracing?
Specifically, could they produce comparable learning (pre-post test gain) and player
engagement (duration of intrinsically motivated play)?
In our implementation of BKT, we developed a knowledge component model with
five knowledge components (KCs). The parameters for the model were derived from
data collected in a prior classroom study involving 150 students in 4th-6th
grade. These parameters included the probability of existing knowledge (L0), the
learning rate (T), and the probabilities of slipping (S) and guessing (G). The sequencing
algorithm worked by randomly choosing an item belonging to the KC with the highest
probability of being known, so long as that probability was below the threshold of .9.
When a KC exceeded .9, it was removed from the sequence. Once all
KCs in the level exceeded .9, the level was over.
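The selection rule described above can be sketched in a few lines of Python. This is a minimal illustration rather than the authors' implementation: it assumes the standard BKT update equations from Corbett and Anderson [2], and the names `KnowledgeComponent`, `bkt_update`, and `next_item` are hypothetical. Parameter values used with it would also be placeholders, since the fitted values are not reported here.

```python
import random
from dataclasses import dataclass

MASTERY_THRESHOLD = 0.9  # threshold stated in the text


@dataclass
class KnowledgeComponent:
    name: str
    p_known: float  # current estimate, initialised to L0
    p_learn: float  # T
    p_slip: float   # S
    p_guess: float  # G
    items: list     # items that exercise this KC


def bkt_update(kc, correct):
    """Standard BKT step (assumed): Bayesian posterior, then learning transition."""
    p = kc.p_known
    if correct:
        evidence = p * (1 - kc.p_slip)
        posterior = evidence / (evidence + (1 - p) * kc.p_guess)
    else:
        evidence = p * kc.p_slip
        posterior = evidence / (evidence + (1 - p) * (1 - kc.p_guess))
    kc.p_known = posterior + (1 - posterior) * kc.p_learn


def next_item(kcs, rng=random):
    """Return (kc, item) for the next trial, or None once every KC exceeds .9."""
    active = [kc for kc in kcs if kc.p_known <= MASTERY_THRESHOLD]
    if not active:
        return None  # the level is over
    # Pick the KC closest to mastery, then a random item belonging to it.
    target = max(active, key=lambda kc: kc.p_known)
    return target, rng.choice(target.items)
```

A game loop would call `next_item`, present the item, and pass the outcome to `bkt_update` until `next_item` returns `None`.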
The Difficulty Ladder (dLadder) is an adaptive sequencing algorithm that requires
mastery of easier items before allowing progress to more difficult items. Based on the
same dataset from which the BKT parameters were derived, the items in the
instructional sequence were divided into 5 bins of difficulty, each with 4 items. Players
began in the easiest bin; if they were correct twice in a row, they advanced to the next
more difficult bin. If they were incorrect twice in a row, they went back to the
previous, less difficult bin. When the player completed the hardest bin, the level was
over. A high-performing player could complete the ladder in only 10 trials.
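The ladder rule can be replayed over a list of responses to see where the 10-trial minimum comes from (5 bins, 2 consecutive correct answers each). The sketch below is illustrative only. It assumes that streak counters reset whenever the player changes bins, that a player cannot drop below the easiest bin, and that "completing" the hardest bin means two consecutive correct answers there. The function name `run_ladder` is hypothetical, and the choice of item within a bin is omitted.

```python
def run_ladder(responses, n_bins=5, streak_needed=2):
    """Replay correct/incorrect responses; return (trials_used, completed)."""
    bin_index = 0          # 0 is the easiest bin
    correct_streak = 0
    incorrect_streak = 0
    for trial, correct in enumerate(responses, start=1):
        if correct:
            correct_streak += 1
            incorrect_streak = 0
        else:
            incorrect_streak += 1
            correct_streak = 0
        if correct_streak == streak_needed:
            correct_streak = 0
            bin_index += 1                     # advance to a harder bin
            if bin_index == n_bins:
                return trial, True             # hardest bin completed
        elif incorrect_streak == streak_needed:
            incorrect_streak = 0
            bin_index = max(0, bin_index - 1)  # fall back, but not below bin 0
    return len(responses), False
```

Under these assumptions, `run_ladder([True] * 10)` returns `(10, True)`, matching the minimum stated above.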
Naïve ITS is based on the idea that a successful response tends to generate more
learning than an unsuccessful response. To promote success, if a player gets an item
incorrect, they are given another opportunity to attempt the item after a delay of one
other item. The delay of one trial facilitates working memory retrieval without making
the task trivially easy (as it might be if there were no delay). Once the player gets
every item correct at least once, the level is over.
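One way to picture the retry rule is as a queue in which a missed item is reinserted one position back. The sketch below rests on assumptions not stated in the text: the initial item order is taken as given, a missed item is retried immediately when no other item remains, and `max_trials` is a safety cap added for the example. The name `naive_its_order` and the `is_correct` callback are hypothetical.

```python
from collections import deque


def naive_its_order(items, is_correct, max_trials=1000):
    """Return the order in which items are presented under the retry rule."""
    queue = deque(items)
    presented = []
    while queue and len(presented) < max_trials:
        item = queue.popleft()
        presented.append(item)
        if not is_correct(item):
            # Retry after one other item; if nothing else is queued,
            # the retry comes straight away.
            queue.insert(min(1, len(queue)), item)
    return presented
```

The level ends when the queue is empty, which happens only after every item has been answered correctly at least once.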
The random sequence randomly presents (without replacement) one of 20 different
fractions. Unlike the adaptive sequences, the random sequence is not affected by the
player’s prior performance.
3 Experiment 1: Structure, Participants and Metrics
The adaptive sequencing experiment involved randomly assigning 1087 players to
one of sixteen different level sequences representing four different experimental
conditions (BKT, Difficulty Ladder, Naïve ITS, and Random) with four different
pretest/posttest form combinations (A-B, B-C, C-D, D-A). Each level sequence consisted
of a pretest level, a level with one of four sequencing algorithms, a posttest level, and
then additional levels of the same sequencing algorithm (so that patterns of extended
play could be compared over the different algorithms). The pretests and posttests each
involve four fraction estimation problems, presented fully within the context of the game.
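The sixteen level sequences are the cross product of the four conditions and the four form combinations. The following sketch shows that structure; it assumes uniform random assignment and an arbitrary number of additional levels (`extra_levels`), neither of which is specified above. All identifiers are hypothetical.

```python
import itertools
import random

CONDITIONS = ("BKT", "Difficulty Ladder", "Naive ITS", "Random")
FORM_PAIRS = (("A", "B"), ("B", "C"), ("C", "D"), ("D", "A"))

# 4 conditions x 4 pretest/posttest form combinations = 16 level sequences
LEVEL_SEQUENCES = list(itertools.product(CONDITIONS, FORM_PAIRS))


def build_level_sequence(condition, forms, extra_levels=3):
    """Pretest, one algorithm level, posttest, then more levels of the same algorithm."""
    pre, post = forms
    levels = [f"pretest form {pre}", f"{condition} level", f"posttest form {post}"]
    # extra_levels is a placeholder; the text does not state how many followed.
    return levels + [f"{condition} level"] * extra_levels


def assign_player(rng):
    """Assign one player to a level sequence (uniform choice is an assumption)."""
    condition, forms = rng.choice(LEVEL_SEQUENCES)
    return condition, forms, build_level_sequence(condition, forms)
```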
Our participants are anonymous online players who freely access our game through
an educational portal. Despite this anonymity, we can infer from the portal's
demographics that our users are likely to be third to eighth grade students, probably
playing in a classroom setting. The portal offers a number of different educational
games. We assume that students are free to stop playing Battleship Numberline at any
time; indeed, over 50% of students play fewer than 10 trials.
In this study, we define engagement as the number of trials that a player chooses to
play, as this is believed to reflect the player's intrinsic motivation to participate in the
gameplay sequence. We measure learning as the gain from pretest to posttest.
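Both metrics are simple to compute from a trial log. The sketch below assumes a log of `(player_id, correct)` tuples and expresses gain as a change in proportion correct over the four test items; the log format and function names are hypothetical.

```python
from collections import Counter


def trials_per_player(trial_log):
    """Engagement: number of trials each player chose to play."""
    return Counter(player_id for player_id, _ in trial_log)


def share_playing_fewer_than(trial_counts, cutoff):
    """Fraction of players whose trial count is below the cutoff."""
    below = sum(1 for n in trial_counts.values() if n < cutoff)
    return below / len(trial_counts)


def learning_gain(pretest_correct, posttest_correct, n_items=4):
    """Learning: change in proportion correct from pretest to posttest."""
    return (posttest_correct - pretest_correct) / n_items
```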
Table 1. Initial conditions of experiment
              Pretest Av.   Av. # of   # of   % playing > 40 Trials
BKT    265    23% (.42)     25 (30)    14     20%
23% (.42)
26% (.44)
23% (.42)
Table 2. Here, data is presented only for the players that completed the posttest. Gain is
significant from pre to post test over all conditions (p<.02, p<.01, p<.001) using a paired t-test.
              Pretest Av.   Posttest Av.   Median # of Trials
BKT    0      n/a           n/a            n/a
46% (.50)
31% (.46)
25% (.43)
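For reference, the paired t-test named in the Table 2 caption compares each player's pretest and posttest scores through their differences. The sketch below computes only the t statistic and degrees of freedom using the standard formula; obtaining a p-value would require a t-distribution CDF from a statistics library. The function name `paired_t` is hypothetical, and any scores passed to it in examples are illustrative, not data from this study.

```python
import math
from statistics import mean, stdev


def paired_t(pre_scores, post_scores):
    """Return (t statistic, degrees of freedom) for paired pre/post scores."""
    if len(pre_scores) != len(post_scores):
        raise ValueError("paired samples must have equal length")
    diffs = [post - pre for pre, post in zip(pre_scores, post_scores)]
    n = len(diffs)
    # t = mean difference divided by its standard error
    standard_error = stdev(diffs) / math.sqrt(n)
    return mean(diffs) / standard_error, n - 1
```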
4 Discussion
The data presented here suggest a modest effect of the sequencing algorithms.
Unfortunately, learning gains cannot be compared directly without statistically
correcting for the substantial rates of attrition. Our BKT algorithm apparently set the
bar too high: no players in this sample actually completed the level, despite some
players completing more than 100 trials. Future work will involve tuning the
parameters of the BKT algorithm, developing more comparable measures of learning, and
validating our online engagement construct in a classroom setting.
References
1. Conati, C., Zhao, X.: Building and evaluating an intelligent pedagogical agent to improve
the effectiveness of an educational game. In: Proceedings of the 9th International
Conference on Intelligent User Interfaces, pp. 6–13. ACM (2004)
2. Corbett, A.T., Anderson, J.R.: Knowledge tracing: Modeling the acquisition of procedural
knowledge. User Modeling and User-Adapted Interaction 4(4), 253–278 (1995)
3. Pavlik Jr., P., Bolster, T., Wu, S.-M., Koedinger, K., MacWhinney, B.: Using Optimally
Selected Drill Practice to Train Basic Facts. In: Woolf, B.P., Aïmeur, E., Nkambou, R., Lajoie,
S. (eds.) ITS 2008. LNCS, vol. 5091, pp. 593–602. Springer, Heidelberg (2008)
4. Siegler, R.S., Thompson, C.A., Schneider, M.: An integrated theory of whole number and
fractions development. Cognitive Psychology 62(4), 273–296 (2011)