Available via license: CC BY 4.0
A Nim-like game and a machine that plays it:
a learning situation at the interface of
mathematics and computer science
PIERRE ESCLAFIT, SIMON MODESTE AND NICOLAS SABY
Abstract. The purpose of this work is to take a didactic look at a learning situation located
at the interface between mathematics and computer science. This situation offers a first
approach to the concept of artificial intelligence through the study of a reinforcement
learning device. The learning situation, inspired by the Computer Science Unplugged
approach, is based on a combinatorial game, along with a device that learns how to play
this game. We studied the learning potential when the human players face the machine.
After an a priori analysis using the Theory of Didactic Situations (TDS), we conducted a
pre-experiment in order to strengthen our hypotheses. In this article, we will focus on the
analysis of the didactic variables, the values we have chosen for these variables and their
effects on students’ strategies.
Key words and phrases: Mathematics and Computing links, Learning Situation, Nim Game,
Theory of Didactic Situations.
MSC Subject Classification: 97D99, 97K99, 97P80.
18/4 (2020), 317-326
DOI: 10.5485/TMCS.2020.0481
tmcs@science.unideb.hu
http://tmcs.math.unideb.hu
Context, Research Questions, Framework and Methodology
Our work, based on a Master's thesis in didactics of science, is part of a project called
“DEMaIn” (Didactics and Epistemology of interactions between Mathematics and
Computer Science). This project aims to better understand the existing relationships
between mathematics and computer science, from an epistemological point of view, to
address didactic questions (Modeste, 2016). The objective of this project is to design,
suggest, observe and analyse learning situations related to mathematical and computer
science concepts.
Our main motivation is to study a Computer Science Unplugged (CSU) learning
situation (Bell, Alexander, Freeman, & Grimley, 2009) concerning Artificial
Intelligence (AI), and to question its potential for learning computer science and
mathematics. Based on the hypothesis that combinatorial games are relevant for providing
learning situations at the interface between computer science and mathematics, we
formulated the following questions: How can we implement combinatorial games in a
CSU approach to highlight the links between mathematics and computer science? Under
what conditions can notions of mathematics and computer science be grasped by
students in a situation based on AI?
In this paper, we study a learning situation involving a combinatorial game and a
machine that “learns” how to play the game. We specify our approach by focusing on the
following questions: How does an “unplugged” approach help students become familiar
with the machine? How do students explain the wins and losses of the machine? What
are the learning goals of this situation?
To answer these questions, we used the framework of the Theory of Didactic
Situations (TDS) (Artigue, Haspekian, & Corblin-Lenfant, 2014). The TDS allows us to
identify, through the a priori analysis, the didactic variables of the situation. This
first analysis allows us to set up the environment the students will face during our
experiment. We follow a classic methodology. We first carry out a general analysis of
the situation by studying the mathematical and computer science knowledge involved. We
then carry out an a priori analysis of our situation: the framework of the TDS allows us
to identify the relevant didactic variables and the potential effects of their values.
Finally, we present a pre-experiment which strengthens the hypotheses resulting from the
a priori analysis, with a view to later carrying out a more structured experiment.
Throughout these stages, we do not lose sight of the specific objective of studying the
interactions between mathematics and computer science.
Presentation of the CSU Learning Situation: a Machine Playing a
Nim-like Game
Computer Science Unplugged (CSU) is a scientific field that appeared in the 1990s
in New Zealand. It has been growing since the 2000s, when several reports continued the
initial work of Fellows, Bell & Witten (1998). As the name suggests, the goal is to
introduce the concepts of computer science without using a computer. Table 1 sums up
the main objectives and means of the field.
Objectives of the field:
- Introduce computer science using its concepts rather than through programming
- Initiate to the methods and concepts of computing
Means of the field:
- The activities must have an attractive and engaging form
- The activities should promote cooperative rather than individual approaches
- The activities must be permissive to errors
Table 1. Computer Science Unplugged: main objectives and means.
We hypothesise that an “unplugged approach” will be relevant for our purpose.
We have chosen to explore a learning situation based on reinforcement learning.
Reinforcement learning is a paradigm of Artificial Intelligence. Its principle is that an
agent (a machine) acts in an environment, and gets rewarded or penalised depending on the
effects of its actions. The objective is that, after a large number of phases of
penalty/reward, the behaviour of the agent becomes more and more relevant. In particular,
reinforcement learning has proven to be efficient at making machines good at
games. In our case, the learning situation is based on a combinatorial game.
Combinatorial games always oppose two players. These two players take turns
from an initial position. A player is not allowed to pass or to play twice in a row.
The rules of the game define which moves are allowed and which moves are prohibited.
In a combinatorial game, there cannot be any chance (throwing dice, drawing cards
randomly, etc.) throughout the game. Both players must know all of the available
information provided by the game (the information is said to be complete). When one of
the two players can no longer play, a final position is reached. In the “normal version”,
the first player who can no longer play has lost; in the “misère version”, it is the
opposite.
For our learning situation, we used a subtraction game. Subtraction games are
particular combinatorial games where the players must, taking turns, remove a number
of objects (for example matches) from an initial pile. A subtraction set is defined, which
describes the numbers of objects that a player can remove on their turn. To set up a
subtraction game, an initial number of objects must be defined. In our case we will start
with eight matches and the subtraction set will be {1, 3, 4}. In this way, during their turn
the players can remove one, three or four matches (when it is possible). We will explain
this choice later. We will use the “normal” convention, so the player who takes the last
match wins.
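The choice of eight matches and the set {1, 3, 4} can be explored with a short computation. The following Python sketch is an illustration added here, not part of the original study; the function names `losing_positions` and `winning_moves` are hypothetical. It labels a position as losing for the player about to move when no legal move leads to a losing position (normal convention).

```python
def losing_positions(max_matches, subtraction_set):
    """Positions (numbers of matches) from which the player to move loses."""
    losing = set()
    for n in range(max_matches + 1):
        # n is losing if no legal move reaches a position already known to be losing.
        # With n = 0 there is no legal move at all, so 0 is losing (normal convention).
        if not any(n - s in losing for s in subtraction_set if s <= n):
            losing.add(n)
    return losing


def winning_moves(n, subtraction_set, losing):
    """Legal moves from n that leave the opponent in a losing position."""
    return [s for s in sorted(subtraction_set) if s <= n and n - s in losing]


losing = losing_positions(8, {1, 3, 4})
print(sorted(losing))                       # [0, 2, 7]
print(winning_moves(8, {1, 3, 4}, losing))  # [1]
```

Under these assumptions, the losing positions up to eight matches are 0, 2 and 7, so the player who moves first from eight matches can only secure a win by removing one match.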
To explore this game with reinforcement learning in a CSU context, we will use the
following device, which will be called “the machine”. As shown in Figure 1, we
place a cup filled with balls of three different colours in front of each match. To play a
move in a given position, the person who plays the machine must randomly draw a ball
from the cup corresponding to the number of matches remaining. If the person playing
the machine draws a yellow ball, they remove one match; a red ball, three matches;
and a blue ball, four matches. At the beginning,
some settings of the device must be chosen: the number of balls per cup, the number of
balls per colour and the penalty/reward rules. After playing a game, the drawn balls are
put back in their cup, and:
• If the machine has won: to each cup used by the machine during the game, the
player adds balls of the same colour as the drawn ball, in a number defined in
the settings of the device (in our case, one ball is added, in addition to the
drawn ball).
• If the machine has lost: from each cup used by the machine during the game, the
player removes balls of the same colour as the drawn ball, in a number
defined in the settings of the device (in our case the drawn ball is removed).
When a cup is completely empty, it must be reset as it was at the beginning of
the game.
This device is inspired by the work of Duchêne and Parreau (2019), also developed
by the French group "InfoSansOrdi" (“ComputerScienceWithoutComputer”).
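The device described above can be simulated in a few lines. The following Python sketch is only an illustration under stated assumptions, not the simulation used by the authors: it assumes that each cup only contains colours whose move is legal in that position, that the machine plays first, and that the opponent plays uniformly at random. All function names (`new_cups`, `play_game`, `update`, `train`) are hypothetical.

```python
import random

MOVES = {"yellow": 1, "red": 3, "blue": 4}  # colour -> matches removed (from the text)


def new_cups(n_matches=8, balls_per_colour=2):
    # Assumption: a cup only holds colours whose move is legal at that position.
    return {n: {c: balls_per_colour for c, m in MOVES.items() if m <= n}
            for n in range(1, n_matches + 1)}


def play_game(cups, rng, machine_first=True, n_matches=8):
    """Play one game; return (machine_won, list of (position, colour) draws)."""
    remaining, machine_turn, draws = n_matches, machine_first, []
    while remaining > 0:
        if machine_turn:
            cup = cups[remaining]
            # Draw a ball at random: each colour is weighted by its number of balls.
            colour = rng.choices(list(cup), weights=list(cup.values()))[0]
            draws.append((remaining, colour))
            remaining -= MOVES[colour]
        else:
            # Assumption: the opponent picks a legal move uniformly at random.
            remaining -= rng.choice([m for m in MOVES.values() if m <= remaining])
        machine_turn = not machine_turn
    # Normal convention: whoever took the last match wins. The turn flag has
    # already flipped, so the machine won exactly when it is NOT its turn now.
    return (not machine_turn), draws


def update(cups, draws, machine_won, initial):
    """Apply the reward/penalty rule to every cup used during the game."""
    for position, colour in draws:
        cup = cups[position]
        if machine_won:
            cup[colour] += 1          # reward: one extra ball of the drawn colour
        else:
            cup[colour] -= 1          # penalty: the drawn ball is removed
            if sum(cup.values()) == 0:
                cups[position] = dict(initial[position])  # empty cup: reset


def train(n_games=200, balls_per_colour=2, seed=0):
    rng = random.Random(seed)
    initial = new_cups(balls_per_colour=balls_per_colour)
    cups = {n: dict(c) for n, c in initial.items()}
    wins = 0
    for _ in range(n_games):
        machine_won, draws = play_game(cups, rng)
        update(cups, draws, machine_won, initial)
        wins += machine_won
    return cups, wins


cups, wins = train()
print("machine wins:", wins, "out of 200")
print("cup in front of the 8th match:", cups[8])
```

Changing `balls_per_colour`, the number of games or the opponent's behaviour in this sketch is one way to observe how the contents of the cups evolve; the actual outcome depends on the random draws.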
Figure 1. The game and the device
The study of this game and this device is a learning situation that involves
mathematics and computer science, as shown in the following.
Analysis of the Learning Situation and Didactic Variables
We have identified several pieces of knowledge at stake in this situation. One is
related to mathematical content and more particularly to game theory: students have
to study the game and the strategies to be used to win. Another piece of knowledge
concerns the understanding of the functioning of a simple reinforcement learning device.
To organize this situation in relation to these pieces of knowledge, we conducted a study
of the didactic variables involved.
The concept of didactic variable was introduced by Brousseau (1982). A didactic
variable is a variable at the disposal of the person that is planning the activity. It can be
related to the knowledge involved or the organisation of the session. In both cases, the
choice of one value or another of a didactic variable will influence the strategies that the
students can implement.
We have identified various didactic variables in our situation. We will develop three
typical examples. The first example of a didactic variable refers to mathematical
knowledge. We called this variable "consecutivity of the subtraction set". It can take two
values: either the set is consecutive (for example {1, 2, 3}), or it is not (for example {1,
3, 4}). This is a didactic variable because the choice of its value affects the
students’ strategies. With a consecutive set (let us say {1, … , 𝑛}), it is
easier to discover a winning strategy: it is well known that the possibility of winning,
and the strategy, depend on the remainder of the Euclidean division of the number of
matches by 𝑛 + 1 (under the normal convention, the player to move can win exactly when
this remainder is non-zero, by removing as many matches as the remainder). A
non-consecutive set is less common, and it also involves more complex resolution
strategies. In particular, in order to find winning strategies, it becomes very useful
to introduce the mathematical notions of winning and losing positions. In order to make
students "accept" the use of the machine, we had to make sure that the winning strategy
was not too easy to discover, so we chose the non-consecutive set. This choice also
facilitated the devolution of the problem: the students had to deal with an original
game, which increased their interest in the situation.
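The difference between the two values of this variable can be illustrated by computing the losing positions for each kind of set. The Python sketch below is an added illustration, not taken from the study; the helper name `losing_positions` is hypothetical and the bound of 20 matches is an arbitrary choice.

```python
def losing_positions(max_matches, subtraction_set):
    """Positions from which the player to move loses (normal convention)."""
    losing = set()
    for n in range(max_matches + 1):
        # A position is losing when no legal move reaches a losing position.
        if not any(n - s in losing for s in subtraction_set if s <= n):
            losing.add(n)
    return losing


consecutive = losing_positions(20, {1, 2, 3})
non_consecutive = losing_positions(20, {1, 3, 4})

# Consecutive set {1, ..., n} with n = 3: the losing positions are exactly
# the multiples of n + 1 = 4, as stated by the remainder rule.
assert consecutive == {m for m in range(21) if m % 4 == 0}

print(sorted(consecutive))      # [0, 4, 8, 12, 16, 20]
print(sorted(non_consecutive))  # [0, 2, 7, 9, 14, 16]
```

In this range, the losing positions for {1, 3, 4} are the numbers whose remainder modulo 7 is 0 or 2, a pattern that is less immediate to guess than the multiples of 𝑛 + 1 obtained with a consecutive set.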
Our second example concerns a didactic variable that refers to computer science
knowledge. This variable is linked to the device; we called it "initial number of balls in
each cup". We distinguish two values for this variable: either the initial number of balls
of each colour in each cup is small (relative to the number of balls added or removed by
the penalty/reward rules), or it is large. These variations affect the students' strategies.
Indeed, a small number of balls leads to frequent resets of the device; it allows
"errors" and it also allows rapid convergence. With a large number of balls, it is
necessary to draw many times from each cup to "stabilise" it: intuitively, the device
“learns slower”. Our objective was to allow the students to understand the convergence of
the machine, so we put a small number of balls of each colour in each cup. The students'
machines were thus able to converge after a relatively small number of games (around
15). Note that on a longer time scale with students, it could be interesting to choose the
other value of this didactic variable.
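The intuition that a larger initial number of balls makes the device “learn slower” can be checked with simple arithmetic. The sketch below is an added, deliberately simplified illustration: it assumes a cup with three colours, a reward of one ball per win, and it only counts rewards given to a single colour (penalties and resets are ignored). The function name `p_after_rewards` is hypothetical.

```python
from fractions import Fraction


def p_after_rewards(balls_per_colour, rewards, n_colours=3, added_per_reward=1):
    """Probability of drawing the rewarded colour after a number of rewards."""
    favoured = balls_per_colour + rewards * added_per_reward
    total = n_colours * balls_per_colour + rewards * added_per_reward
    return Fraction(favoured, total)


for k in (2, 20):
    print(k, [str(p_after_rewards(k, r)) for r in (0, 1, 5)])
# 2 ['1/3', '3/7', '7/11']
# 20 ['1/3', '21/61', '5/13']
```

With two balls per colour, five rewards raise the probability of the rewarded colour from 1/3 to 7/11; with twenty balls per colour, the same five rewards only raise it to 5/13, which is consistent with the slower stabilisation described above.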
The third example is a didactic variable related to the situation: the choice of its
values is not directly linked to the knowledge at stake but rather to the organisation of
the situation. This third didactic variable concerns the way in which students’ work is
organised; it specifically describes whether the work is done individually or in groups
(including the number of students in a group). We are interested in the potential effects
of the value "students are in groups of two". When playing with the machines, the
advantage of having students in pairs is that they can share the work: one plays the
machine and the other faces the machine. This distribution has several advantages.
First, the student who manipulates the machine is "relieved" of any other task: they can
focus on the device, will make fewer mistakes in the manipulation, and will be able to
make observations on the probabilistic behaviour of the machine. Second, the player
facing the machine can focus on the strategy the group wants to implement:
observing the winning positions and thinking about a strategy that could help them win.
Finally, having two students work on the same problem will foster discussions about the
situation. For these reasons, we chose to group the students in pairs.
These three examples, illustrating three categories of didactic variables (linked to
mathematical knowledge, linked to computer science knowledge, linked to the
organisation of the situation), clearly show the importance of the choices that must be
made by the instructor. These choices must be linked to teaching objectives. For our
experimentation, we made our choices after having made our objectives explicit and
after having carried out a complete analysis of the different values of the variables.
Indeed, according to A. Bessot (2003) in her lecture introducing the TDS, it is important
to describe the values of the didactic variables that were not selected in order to
understand the meaning of knowledge in the particular situation.
Pre-experimentation and First Results
To carry out our experiment, we benefited from the support of the “Maison des
Mathématiques et de l’Informatique” (House of Mathematics and Computer Science) in
Lyon, France. We conducted this pre-experiment with a group of 30 first-year
high school students (14-15 years old). We had two hours to carry out our experiment. We
acted as observers during the experiment, while the instructors were experts in the fields
(mathematicians and computer scientists).
The analysis of the session, linked to our research questions, allowed us to make
several observations. We tried to identify whether the choices made on the didactic
variables produced the expected effects. We were also attentive to the learning potential
of the situation, both that linked to the understanding of the device and that linked to
the mathematical content.
First, it seems that the joint study of the game and the device favoured the
establishment of a winning strategy by the groups of students. Indeed, before the
introduction of the device, only two groups had formalised a winning strategy, whereas at
the end of the session, after all the students had played with the machine, all 10 groups
were able to explain and put in place a winning strategy. Regarding the didactic potential
of the situation, it seems that the organisation of the activity promoted the devolution of
the problem (Artigue, Haspekian, & Corblin-Lenfant, 2014), that is, the fact that the
students accepted responsibility for solving the problem. This
organisation also allowed students to understand how the device works. Indeed,
designing our situation in connection with the TDS allowed us to organize
the environment, particularly the interaction between the student and
the device. Our choices in relation to the didactic variables seem to have allowed the
students to appropriate the different aspects of the problem: finding a winning strategy
and understanding how the machine works. Finally, we think that the situation is a good
introduction to the concept of "reinforcement learning". Indeed, we note that the students
thought about the setup of the device, the evolution of the colours in the cups and the
ability of the machine to win regularly.
These first results are based on the answers given by the students to the questions of
the instructors. They are also based on the interactions that took place among the
students and with the instructors at the end of the session, during which we presented a
computer simulation of the device in which we could vary the rules of the game and the
initial configuration of the machine, and carry out a large number of games quickly.
Conclusion
The analysis of the situation, the identification of the didactic variables and the
choice of their values allowed us to organise our pre-experimentation according to our
objectives as researchers and as instructors. The use of CSU seems to be relevant to
work on notions at the interface between mathematics and computer science. These
choices allow us to set up a structured experiment which will help answer our research
questions and in particular validate the hypotheses.
However, the use of an unplugged approach raises questions about the "connection"
between the proposed activities and the reality of using reinforcement learning systems.
Starting from a situation like this one, it would be very useful to plan a specific moment to
"reconnect" the knowledge, that is to say, to think about the transition between the
unplugged activity and an activity on a machine (such as simulation or modelling activities).
Following this a priori analysis and this pre-experiment, our perspective is to test our
hypotheses through a structured experiment. In this way, we could show the relevance of
the use of didactic tools for the study of such a situation as well as the didactic potential
of the latter.
Acknowledgement
Communication supported by French National Research Agency <ANR-16-CE38-0006-01>.
References
Artigue, M. (2014). Didactic Engineering in Mathematics Education. In S. Lerman (Ed.),
Encyclopedia of Mathematics Education (pp. 159-162). Springer Netherlands.
Artigue, M., Haspekian, M., & Corblin-Lenfant, A. (2014). Introduction to the Theory of
Didactical Situations (TDS). In A. Bikner-Ahsbahs, & S. Prediger (Eds.), Networking
of Theories as a Research Practice in Mathematics Education (pp. 47-65). Springer
International Publishing.
Bell, T., Alexander, J., Freeman, I., & Grimley, M. (2009). Computer Science
Unplugged. School students doing real computing without computers. The New
Zealand Journal of Applied Computing and Information Technology, 13(1), 20-29.
Bessot, A. (2003). Une introduction à la théorie des situations didactiques. Les cahiers
du laboratoire Leibniz (91).
Brousseau, G. (1982). Ingénierie didactique : d'un problème à l'étude à priori d'une
situation didactique. Actes de la Deuxième école d’été de didactique des
mathématiques. Olivet, IREM d'Orléans.
Duchêne, E., & Parreau, A. (2019, June 7). La machine qui apprend à (bien) jouer toute
seule. Retrieved January 26, 2020, from LIRIS Médiation:
https://projet.liris.cnrs.fr/lirismed/index.php?id=la-machine-qui-apprend-a-jouer-toute-seule
Fellows, M. R., Bell, T., & Witten, I. (1998, June 9). Computer Science Unplugged...
Offline activities and games for all ages. Retrieved from
https://classic.csunplugged.org/wp-content/uploads/2015/01/unplugged-book-v1.pdf
Modeste, S. (2016). Impact of Informatics on Mathematics and Its Teaching. In F.
Gadducci, & M. Tavosanis (Eds.), History and Philosophy of Computing (pp. 243-
255). Springer International Publishing.
PIERRE ESCLAFIT, SIMON MODESTE AND NICOLAS SABY
IMAG, UNIVERSITY OF MONTPELLIER, CNRS, MONTPELLIER, FRANCE
E-mail: esclafit.pierre@yahoo.fr
E-mail: simon.modeste@umontpellier.fr
E-mail: nicolas.saby@umontpellier.fr