Speech-based Learning with Amazon Alexa
Michael Weiss
Graz University of Technology
Austria
michael.weiss@student.tugraz.at
Markus Ebner
Educational Technology
Graz University of Technology
Austria
markus.ebner@tugraz.at
Martin Ebner
Educational Technology
Graz University of Technology
Austria
martin.ebner@tugraz.at
Abstract: Almost every child and adult today has access to the Internet and to technologies such as
smartphones and tablets. Researchers recognized this quickly and began to develop new ways of
learning, including learning applications intended to increase the interest of students and adults in
studying and training. This paper describes the development of a speech-based Alexa Skill for
educational purposes. Its main goal is to help students as well as adults learn and practice the basic
arithmetic operations. The skill was designed so that young students as well as adults can interact
with Alexa in their own words. To meet the needs of both young and old users, the skill
“Mathe Rätsel” (Eng.: “Math riddle”) offers two difficulty levels.
Introduction
The so-called “Generation Z” is growing up with different kinds of technologies, from smartphones to
tablets (Nagler et al., 2019). In recent years, voice-controlled devices such as Amazon Echo have been launched
and have become popular in many households. Voice has, moreover, always been an omnipresent method of
communication (Pearl, 2017). By 2019, more than 100 million devices with support for Amazon’s Alexa assistant
had been sold (Matney, 2019). In this context, Underwood states:
“Researchers have pointed to well-crafted use of technology benefiting increased learner effectiveness or
performance gains, increased learner efficiency, greater learner engagement or satisfaction and more positive student
attitudes to learning” (Underwood, 2009).
“[Intelligent] Personal Assistants (IPAs) are speech-enabled technologies in mobile platforms which have become one
of the fundamental devices of learning online” (Goksel Canbek, 2016). Zhao et al. state that people’s lives are
facilitated by smart speakers such as Amazon Echo and its associated IPA Alexa: users can issue voice commands
to perform tasks like playing music, ordering dinner, shopping and answering questions by searching for the answer
on the internet (Zhao, 2020). Dizon (2017), for example, used Alexa as a training tool for second language learning
and suggests that it might be useful in classrooms.
Today, people can acquire knowledge in different multimodal ways. Students learn and think differently
than they did in the past, whereas much of the educational material still used in schools has not changed over
the last decades. This does not reflect the spirit of the times, as students bring with them other experiences
than they did some decades ago, and their way of thinking and working has changed too. Schools that want to
adapt to the way younger generations think and learn therefore need to integrate new learning models
(Underwood, 2009). “Imagine being able to create technology and not
needing to instruct customers on how to use it because they already know: they can simply ask. Humans learn the
rules of conversation from a very young age, and designers can take advantage of that, bypassing clunky GUIs and
unintuitive menus.” (Pearl, 2017). Pearl also states that Voice User Interfaces offer advantages such as
speed, hands-free use, intuitiveness and empathy.
EdMedia + Innovate Learning 2021 - Online, United States, July 6-8, 2021
Originally published in: Weiss, M., Ebner, M. & Ebner, M. (2021). Speech-based Learning with Amazon Alexa. In T. Bastiaens (Ed.), Proceedings of EdMedia + Innovate Learning (pp. 156-163).
United States: Association for the Advancement of Computing in Education (AACE). Retrieved July 27, 2021 from https://www.learntechlib.org/primary/p/219651/.
This publication describes the development of an Alexa Skill that supports learners of all ages in
learning the basic arithmetic operations. With the help of a captivating story narrated by Alexa, people can
practice the four basic arithmetic operations. The skill was developed so that the user has to
answer riddles that are based on real-life situations. This should open up a new and intuitive way of learning for
young and old. The following research questions are to be addressed with the prototype:
Is it possible to build an easily expandable skill prototype that tells stories with the help of the Alexa Skills Kit?
Is the skill suitable for both children and adults?
Can the skill be designed in such a way that the story can be played through repeatedly with a different experience each time?
Which problems occur during the development of the skill?
Alexa Skill: prototype
State of the art
The first step in building the Alexa skill was to research and test the German-language mathematics
learning skills already available on the market, in order to establish the current state of the art and to obtain
an overview of the existing mathematics skills. To this end, we tested seven of the most popular mathematics
learning skills on the market and identified their strengths and weaknesses.
Testing the skills “Mathe Monster, Kopfrechnen, Rechenkönig, Multiplikation Mathe, Blitz
Rechnen, Mathedreier, Rechenwürfel” (Eng.: “Maths monster, mental arithmetic, arithmetic king, multiplication
maths, flash arithmetic, maths threesome, arithmetic cube”), all of which can be found in the Amazon Skill Store,
showed that most of them only offer simple calculations following a “question-answer” scheme. On
the one hand, this can be seen as an advantage because one can practice specific types of calculation very easily; on
the other hand, players might get bored quickly.
In contrast to training programs, learning through play can be more effective and can bring overall higher
learning gains, as Vogt et al. showed in their study of different learning approaches (Vogt, 2018).
Preliminary Considerations
For the development of the skill, some important points had to be considered. First, Amazon's design
guidelines indicate that the age group of the schoolchildren must be determined, since, for example, young people
aged 10 to 14 have different language habits from those aged 15 to 18. The skill had
to be designed as flexibly as possible in order to respond to the individual language and interaction habits
of the schoolchildren when talking to Alexa. Second, it is important that schoolchildren are guided through the
interaction with Alexa. Pupils often find it difficult to set their own goals and therefore quickly lose motivation
for this learning method. For this reason, the skill must be easy to use and must offer
enough support. Another important point was that the skill
must give the students a sense of achievement. It should not be dull and boring; rather, a playful
approach should be chosen that the pupils will remember. Finally, it was important to take the children's
perspective and think about how they interact with Alexa.
Skill Design
The skill was designed to help both children and adults learn and practice the basic arithmetic operations
of addition, subtraction, multiplication, and division in a fun and exciting way. The skill is divided into two "level
areas", Level 1 and Level 2, which represent different levels of difficulty. In both levels, the knowledge is conveyed
with the help of stories told by the Alexa-enabled device. The users must listen carefully to the stories,
because questions in the form of math riddles have to be answered along the way. The name of the main character is
chosen by the users themselves. The main character is a 14-year-old pupil who tells the story of different
situations in everyday life. The story begins at 7 AM and ends at 9 PM, when the character
goes to bed. In the process, users are supposed to solve a total of ten math riddles per story. Only when a
riddle has been solved correctly does Alexa continue to tell the story, after which the users can solve the next riddle. Once
all the riddles have been solved, the story ends. In this way, users can learn and practice basic arithmetic in a
playful way. For each level we developed a separate story to meet the needs of young and old users.
In the story of Level 1, which is called “Ein lustiger Tag am Bauernhof” (Eng.: “A fun day at the farm”), the player
visits the grandparents on their farm and helps them with their daily tasks. This story is suitable for children who
have just started learning the basic arithmetic operations.
A sample riddle is: “Bereits am Vormittag grasen auf der großen saftigen Wiese mehrere Tiere, darunter <RN1>
Ziegen, <RN2> Schafe und <RN3> Kühe. Wie viele Tiere gibt es insgesamt auf der Weide?” (Eng.: "Already in the
morning, several animals are grazing on the large lush meadow, including <RN1> goats, <RN2> sheep and <RN3>
cows. How many animals are there in total in the pasture?"), where RN1, RN2 and RN3 each denote a random
number between 1 and 10. The user has to give the correct answer to continue.
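The placeholder mechanism of this riddle can be sketched as follows. This is a minimal illustration, not the skill's actual source code: the names `randomInt`, `PASTURE_TEMPLATE` and `buildPastureRiddle`, as well as the injectable `rng` parameter, are assumptions made for this example.

```javascript
// Hypothetical helper: random integer in [min, max], both inclusive.
// `rng` defaults to Math.random and can be replaced to get repeatable output.
function randomInt(min, max, rng = Math.random) {
  return min + Math.floor(rng() * (max - min + 1));
}

// Riddle text with the <RN1>..<RN3> placeholders from the sample above.
const PASTURE_TEMPLATE =
  "Bereits am Vormittag grasen auf der großen saftigen Wiese mehrere Tiere, " +
  "darunter <RN1> Ziegen, <RN2> Schafe und <RN3> Kühe. " +
  "Wie viele Tiere gibt es insgesamt auf der Weide?";

// Draws three numbers between 1 and 10, fills the template and returns
// the text to be spoken together with the expected answer (the sum).
function buildPastureRiddle(rng = Math.random) {
  const numbers = [1, 2, 3].map(() => randomInt(1, 10, rng));
  const text = PASTURE_TEMPLATE.replace(
    /<RN(\d)>/g,
    (_, index) => String(numbers[Number(index) - 1])
  );
  const answer = numbers.reduce((sum, n) => sum + n, 0);
  return { text, answer, numbers };
}
```

Because the numbers are drawn anew on every call, the expected answer has to be computed together with the text rather than stored with the story.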
In the story of Level 2, which is called “Endlich Urlaub” (Eng.: “Finally Holidays”), the player takes the train to visit
a best friend in the city and spends a day there. This story is suitable for older users who want to practice their
calculation skills. A sample riddle is: “Juhu, endlich Urlaub! Heute geht es mit dem Zug in die Großstadt. <X>
blickt nervös auf die Uhr, hoffentlich verpasst <X> den Zug nicht. Der Zug wird am Bahnhof planmäßig um
vierzehn Uhr <RN1> abfahren, jetzt ist es vierzehn Uhr <RN2>. Wie viele Minuten bleiben <X> noch bis zur
Abfahrt des Zuges?” (Eng.: "Yay, finally a holiday! Today we're going to the big city by train. <X> glances
nervously at the clock, hopefully <X> won't miss the train. The train is scheduled to leave the station at
14:<RN1>; it is now 14:<RN2>. How many minutes does <X> have left before the train leaves?"), where X
denotes the name of the player and RN1 and RN2 are random numbers between 0 and 60.
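A hedged sketch of how such a riddle could be instantiated is given below. The function and constant names are hypothetical, the template is shortened, and two details are assumptions that the text does not state: minutes are drawn from 0 to 59, and the departure minute is always drawn after the current minute so that the answer is positive.

```javascript
// Hypothetical helper: random integer in [min, max], both inclusive.
function randomInt(min, max, rng = Math.random) {
  return min + Math.floor(rng() * (max - min + 1));
}

// Shortened version of the Level 2 sample riddle.
const TRAIN_TEMPLATE =
  "<X> blickt nervös auf die Uhr, hoffentlich verpasst <X> den Zug nicht. " +
  "Der Zug wird am Bahnhof planmäßig um vierzehn Uhr <RN1> abfahren, " +
  "jetzt ist es vierzehn Uhr <RN2>. " +
  "Wie viele Minuten bleiben <X> noch bis zur Abfahrt des Zuges?";

// Assumption: the current minute (RN2) is drawn first, then the departure
// minute (RN1) is drawn from the rest of the same hour.
function buildTrainRiddle(playerName, rng = Math.random) {
  const currentMinute = randomInt(0, 58, rng);
  const departureMinute = randomInt(currentMinute + 1, 59, rng);
  // Replacer functions avoid special handling of "$" in the player's name.
  const text = TRAIN_TEMPLATE
    .replace(/<X>/g, () => playerName)
    .replace(/<RN1>/g, () => String(departureMinute))
    .replace(/<RN2>/g, () => String(currentMinute));
  return { text, answer: departureMinute - currentMinute };
}
```

Using the stored name for every `<X>` also sidesteps the choice of a gendered pronoun in the narration.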
If the user gives a wrong answer, Alexa tells the user that the answer is incorrect and
asks for a new solution. The user can also ask Alexa to repeat the current question by saying
“Alexa, wiederholen” (Eng.: “Alexa, repeat”) or simply “Wiederholen” (Eng.: “repeat”). Furthermore, we
have implemented a help function that gives the user assistance with the calculation task. It is invoked by
asking Alexa for help.
Architecture
An overview of all the different technologies working together can be seen in Figure 1.
Figure 1. Overview of different technologies
Skill Flow
The skill itself is started with "Alexa, starte Matherätsel" (Eng.: “Alexa, start Math riddle”). Afterwards,
Alexa reads out a welcome message and asks for the name of the character. The name can be chosen freely by the
user, whether it is that of a fictitious person or character or the user's own name. This name is stored by the skill for the entire
game.
Once a suitable name has been chosen and communicated to Alexa, the user is asked for the desired level.
The user has the option of choosing “Level 1” or “Level 2”. The corresponding story is then started, and
the first riddle begins. In accordance with the events in the story, various mathematical riddles must be solved. Each
riddle must be answered correctly to get to the next one. The users have a total of ten seconds to answer. If the
question is answered incorrectly, a corresponding message is played, and the user is given another ten seconds to
answer. If no answer is given, the skill ends automatically. In addition, the user can use the command
“Help” to receive assistance with the current riddle. With the command “Repeat”, the current riddle can be heard
again. In order to finish the story and consequently also the skill, all ten riddles in the story must be solved. The
story can also be ended prematurely with the command "Alexa, stop".
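The riddle loop described above can be sketched as a small state machine. The following is only an illustration under assumptions: the session object, the turn types and the English placeholder phrases are hypothetical and do not reproduce the skill's actual code or prompts.

```javascript
// Hypothetical session state for one story.
function createSession(riddles) {
  return { riddles, index: 0, finished: false };
}

// Processes one user turn and returns what Alexa should say next.
// `turn` is one of: { type: "answer", value }, { type: "repeat" },
// { type: "help" } or { type: "stop" }.
function handleTurn(session, turn) {
  if (session.finished) {
    return { speech: "The story has already ended.", endSession: true };
  }
  const riddle = session.riddles[session.index];
  switch (turn.type) {
    case "repeat": // read the current riddle again
      return { speech: riddle.text, endSession: false };
    case "help": // give assistance for the current riddle
      return { speech: riddle.hint, endSession: false };
    case "stop": // premature end of the story
      session.finished = true;
      return { speech: "Goodbye!", endSession: true };
    case "answer":
      if (turn.value !== riddle.answer) {
        // Wrong answer: stay on the same riddle and ask again.
        return { speech: "That is not correct. Please try again.", endSession: false };
      }
      session.index += 1;
      if (session.index === session.riddles.length) {
        session.finished = true;
        return { speech: "Correct! The story is over.", endSession: true };
      }
      return { speech: "Correct! " + session.riddles[session.index].text, endSession: false };
    default: // unknown input: repeat the riddle
      return { speech: riddle.text, endSession: false };
  }
}
```

The timeout after which the skill ends without an answer is enforced by the Alexa service itself and is therefore not modelled here.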
Amazon Echo
In order to run the Skill, an Alexa-enabled device such as Amazon Echo is required. The device is
connected to Wi-Fi and is activated with the wake word "Alexa". An Amazon account must be linked to the
Amazon Echo and the Skill must be enabled.
AWS Lambda
AWS Lambda is the compute service provided by Amazon that executes the code supplied by the
programmer whenever a voice request is made to the skill. The Alexa service converts the spoken command into a
request, Lambda runs the skill code that handles this request, and the result is returned to Alexa as the spoken
output. In addition, the computing resources needed to run the skill are provided by Lambda.
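To make the request and response cycle concrete, the following sketch shows a Lambda entry point written without any SDK. The JSON envelope (`version`, `response.outputSpeech`, `shouldEndSession`) follows the custom skill format of the Alexa service; the intent name `AnswerIntent`, the slot name `number` and the English phrases are assumptions made for illustration and are not taken from the skill.

```javascript
// Builds a response in the JSON format returned by a custom skill backend.
function buildResponse(speechText, shouldEndSession) {
  return {
    version: "1.0",
    response: {
      outputSpeech: { type: "PlainText", text: speechText },
      shouldEndSession,
    },
  };
}

// Entry point: receives the request JSON produced by the Alexa service
// and returns the response JSON that Alexa reads out.
// In a Lambda deployment it would be exported, e.g. exports.handler = handler.
async function handler(event) {
  const request = event.request;
  if (request.type === "LaunchRequest") {
    // The user has just started the skill.
    return buildResponse("Welcome! What is the name of your character?", false);
  }
  if (request.type === "IntentRequest" && request.intent.name === "AnswerIntent") {
    // Hypothetical intent carrying the spoken number in a slot.
    const value = Number(request.intent.slots.number.value);
    return buildResponse(`You said ${value}.`, false);
  }
  // Anything else ends the session in this sketch.
  return buildResponse("Goodbye!", true);
}
```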
Alexa Skills Kit
The Alexa Skills Kit is a collection of APIs, tools and documentation required to create an Alexa Skill. The
Developer Console provided by Amazon, which is the core of the Skills Kit, only requires a web browser. All the
settings and steps that are required to develop and publish the skill can be made through the console.
The voice interaction model describes how the Skill handles user commands. For the skill “Mathe Rätsel”
(Eng.: “Math riddle”) we used a custom model, which means that the whole skill flow is handled and controlled by
the programmer.
Intents determine which code is executed in response to which voice commands from the user. Intents consist
of so-called "sample utterances", which represent text phrases spoken by users. If such a phrase is spoken by the
user, it triggers the "intent handler" defined for the intent, which executes the appropriate code.
A "sample utterance" can contain one or more slots. A slot can be understood as a variable that is
extracted by Alexa and can be used in the code. Slots are written in curly brackets within sample utterances and
can be of different data types, ranging from numbers to strings. Custom slot
types can also be defined.
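The relationship between intents, sample utterances and slots can be illustrated with a small excerpt of an interaction model, written here as a JavaScript object. The custom intent names, slot names and utterances are assumptions for this sketch and are not taken from the skill; `AMAZON.NUMBER` and the `AMAZON.HelpIntent`, `AMAZON.RepeatIntent` and `AMAZON.StopIntent` entries are built-in types of the Alexa Skills Kit. The two helper functions are likewise hypothetical and only check that every slot used in curly brackets is declared.

```javascript
// Hypothetical excerpt of a language model (names are illustrative).
const languageModel = {
  intents: [
    {
      name: "AnswerIntent",
      slots: [{ name: "number", type: "AMAZON.NUMBER" }],
      samples: ["die Antwort ist {number}", "es sind {number}"],
    },
    {
      name: "LevelIntent",
      slots: [{ name: "level", type: "AMAZON.NUMBER" }],
      samples: ["Level {level}", "ich möchte Level {level} spielen"],
    },
    { name: "AMAZON.HelpIntent", samples: [] },
    { name: "AMAZON.RepeatIntent", samples: [] },
    { name: "AMAZON.StopIntent", samples: [] },
  ],
};

// Returns the slot names referenced with curly brackets in an utterance.
function slotsInUtterance(utterance) {
  return [...utterance.matchAll(/\{(\w+)\}/g)].map((match) => match[1]);
}

// Lists slots that appear in a sample utterance but are not declared
// on the corresponding intent; an empty list means the model is consistent.
function undeclaredSlots(model) {
  const problems = [];
  for (const intent of model.intents) {
    const declared = new Set((intent.slots || []).map((slot) => slot.name));
    for (const sample of intent.samples) {
      for (const name of slotsInUtterance(sample)) {
        if (!declared.has(name)) problems.push(`${intent.name}: ${name}`);
      }
    }
  }
  return problems;
}
```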
Usability Test
Due to the current COVID-19 pandemic, we did not have the opportunity to conduct a meaningful field study
with multiple participants in different learning environments. However, we found a way to test our skill with a few
volunteers.
We prepared a quiet test room with a table and chairs around it and placed the Amazon Echo in the
middle of the table. After connecting it to the internet and installing our skill, we gave the test person instructions
on how to start the skill and which level should be played. From this point on, the test person had to solve the riddles
without any help from the instructors. During the test, the instructors took notes about the test session (duration, use of
the help function, problems that occurred).
With the help of a math teacher, we were able to test the skill and the story of level 1 in different ways. The teacher
was instructed to start the skill by saying “Alexa, starte Matherätsel”
(Eng.: “Alexa, start Math Riddle”) and to play the story of level 1. He was able to finish the game within 5 minutes,
had no problems understanding the questions and did not make use of the help function of the skill.
Based on his experience in teaching young pupils and on playing the story, we received the following
feedback:
For the lower school level, he was satisfied with the questions of level 1. Given the time restriction and the
connection to real events, they are neither too difficult nor too easy. He confirmed that the numbers from 1 to 10
used in level 1 are suitable for children between five and seven who are starting to learn the basic arithmetic
operations. As far as the story line is concerned, level 1 is also suitable for children between eight and ten, but with
larger numbers, for example numbers between 0 and 50. He judges the story of level 2 suitable for children
aged twelve and older and for adults who want to practice the basic arithmetic operations. This was also confirmed when
playing the game with two children aged 6 and 10.
For the testing of level 2 we managed to find four adults from different social backgrounds to test the
skill. They were also instructed to start the skill by saying “Alexa, starte Matherätsel” (Eng.: “Alexa, start
Math Riddle”) and to play the game of level 2.
Person 1, aged 28, managed to play the game without the help of the “repeat” or “help”
command. He did not answer any question incorrectly and managed to complete the game within 6 minutes.
Person 2, aged 81, had some difficulty following the story; in her opinion, Alexa was talking too
fast. She criticized that one cannot change Alexa's talking speed during the game. In order to
change the speed, one must quit the game, ask Alexa to reduce the speed and then start the game
again. She had a lot of trouble answering the questions and had to use the repeat and help functions 5 times.
5 out of 10 riddles were not answered correctly at the first attempt. This session took about 10 minutes.
Person 3, aged 33, and person 4, aged 15, explained that they did not have problems with the
talking speed but had trouble finding an answer to the riddles at the first attempt and had to ask Alexa to
repeat the question. They also made use of the “help” command 3 times. Person 4 explained
that it would be helpful to be able to pause the story sometimes and added that one cannot think out loud because
Alexa takes the first word she hears as the solution. Both test persons were able to finish the story within 8
minutes.
To sum up, every adult was able to complete the story sooner or later.
Discussion
During the usability tests and the development of the skill, we gained the following experience:
The first task when writing the stories was to find a suitable register: on the one hand, the story for level 1
had to be easily understandable for children; on the other hand, the story for level 2 had to be challenging
enough for teenagers and adults not to become boring. Furthermore, the stories have to be realistic and
accurate and not too far-fetched, so that the players can benefit from them in their everyday life. This
can be seen as an advantage over the other tested skills, since they only offer abstract
calculations. When creating the story lines, we also had to consider the different social and ethical
backgrounds of our players. The stories had to be suitable for all our users, who come from all social
classes. As far as the story in level 2 is concerned, the narrated day can be seen as a typical day
in anybody’s life.
The second thing we found was that, even with exciting stories, after playing them for the
second or third time one can easily remember the answers, and the story is no longer challenging.
At this point we decided to make use of random variables. For the story in level 1 we chose random
variables between one and ten. For the story in level 2 we chose random variables from one to one
hundred. This solved the problem of remembered answers, and the story can now be played multiple
times with different calculations. However, this led to further problems, as described in the following.
Even though one can play the story multiple times with different random variables, the story gets boring
after one has heard it several times. Therefore, we rewrote the code in such a way that the skill is easily
expandable with only a little coding knowledge, without the need to understand the whole skill source
code. It is now possible to add a new story by using our JavaScript template file,
with very little programming effort. A completely new story with different story lines can be added,
and the story writer is free to write whatever comes to mind. Furthermore, the two existing stories in
level 1 and level 2 can be extended, for example by adding another day on the farm or in the city.
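The paper does not reproduce the template file itself. Purely as an illustration of what a data-driven story definition might look like, the sketch below describes a story as a plain object and checks it before it is used. All field names (`title`, `level`, `riddles`, `text`, `range`, `solve`, `hint`), the example riddle and the validator are assumptions, not the authors' format.

```javascript
// Hypothetical story definition that a story writer would fill in.
const exampleStory = {
  title: "Another day at the farm",
  level: 1,
  riddles: [
    {
      // <RN1>, <RN2> are replaced by random numbers when the riddle is asked.
      text: "There are <RN1> apples and <RN2> pears on the table. " +
            "How many pieces of fruit are there in total?",
      range: [1, 10],            // bounds for the random numbers
      solve: ([a, b]) => a + b,  // expected answer for the drawn numbers
      hint: "Add the two numbers.",
    },
  ],
};

// Counts the distinct <RNk> placeholders in a riddle text.
function countPlaceholders(text) {
  return new Set(text.match(/<RN\d+>/g) || []).size;
}

// Returns a list of problems; an empty list means the story can be loaded.
function validateStory(story) {
  const problems = [];
  if (!story.title) problems.push("missing title");
  if (![1, 2].includes(story.level)) problems.push("level must be 1 or 2");
  story.riddles.forEach((riddle, i) => {
    if (countPlaceholders(riddle.text) === 0) {
      problems.push(`riddle ${i}: no <RN> placeholder`);
    }
    if (typeof riddle.solve !== "function") {
      problems.push(`riddle ${i}: missing solve function`);
    }
    if (!riddle.hint) problems.push(`riddle ${i}: missing hint`);
  });
  return problems;
}
```

Keeping the story text, number range, solution rule and hint together in one object is one way to let a writer add riddles without touching the handler code.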
When implementing the division tasks, we faced the problem of results with decimal places. Since
our skill should only ask questions whose answer is a whole number without a remainder, we had to find a
solution for odd numbers produced by our random variable function. As a solution, we took the odd number
and used the next higher even number for the questions in the story.
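The rounding step can be sketched as follows. The function names are hypothetical, and the first riddle builder assumes a divisor of 2, since an even dividend only guarantees a whole-number result in that case. The second builder shows a different technique that is not described in the paper: constructing the dividend from a chosen divisor and quotient.

```javascript
// If the number is odd, returns the next higher even number.
function nextEven(n) {
  return n % 2 === 0 ? n : n + 1;
}

// Assumption: the divisor is 2, so an even dividend always yields
// a whole-number answer.
function buildHalvingRiddle(randomNumber) {
  const dividend = nextEven(randomNumber);
  return { dividend, divisor: 2, answer: dividend / 2 };
}

// Alternative for arbitrary divisors (not from the paper): pick the
// divisor and the quotient first and multiply them to get the dividend.
function buildDivisionRiddle(divisor, quotient) {
  return { dividend: divisor * quotient, divisor, answer: quotient };
}
```

For example, a drawn value of 7 becomes the task 8 divided by 2, while `buildDivisionRiddle(3, 4)` yields the task 12 divided by 3.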
Another problem was the selection of the name. When starting the skill, the user chooses a name,
which can be of any gender. As a result, we were not able to write the sentences with “he,
she, it”. Consequently, we had to store the name for the whole story session and use it instead of “he,
she, it”.
Starting the skill and choosing a name and a level was not a problem at all. During the first questions,
however, we faced some problems. First, one only has six seconds to find the solution before Alexa asks
whether one already knows the answer, followed by another six seconds before the skill ends. This means that there
is only a total of twelve seconds to answer a question. This is a security measure by
Amazon to prevent a skill from listening to the user for a longer time. As a workaround, we implemented
the “repeat” command, with which Alexa repeats the current task and the user gets another twelve seconds to
give an answer. As an alternative, the user can give a wrong answer, which also results in twelve more
seconds to answer the riddle correctly. This can be continued indefinitely.
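The two listening windows correspond to the main prompt and the reprompt of a skill response. A minimal sketch of such a response, and of a "repeat" handler that opens a fresh pair of windows, is shown below. The JSON fields follow the custom skill response format; the function names and the English reprompt phrase are assumptions for illustration.

```javascript
// Response with a reprompt: if the user stays silent after `speechText`,
// Alexa speaks `repromptText` and listens once more before the session ends.
function buildAskResponse(speechText, repromptText) {
  return {
    version: "1.0",
    response: {
      outputSpeech: { type: "PlainText", text: speechText },
      reprompt: { outputSpeech: { type: "PlainText", text: repromptText } },
      shouldEndSession: false,
    },
  };
}

// Hypothetical handler for the "repeat" command: reading the riddle again
// starts a new listening window, which gives the user more time to think.
function repeatRiddle(riddleText) {
  return buildAskResponse(riddleText, "Do you already know the answer?");
}
```

The same response shape would be returned after a wrong answer, which is why both paths extend the time available to the user.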
Furthermore, we noticed that if one calculates by talking and thinking out loud, Alexa already
interprets one’s words as the solution, resulting in a wrong answer. This problem also appeared in noisy
environments: if, for example, someone is talking in the background, Alexa sometimes interpreted some of the
words as the solution.
Finally, we faced some difficulties with users speaking unclearly. For example, Alexa was not able to
differentiate between the German words “hundert” (Eng.: hundred) and “einhundert” (Eng.: one-hundred).
Conclusion
In this research we created an Alexa Skill intended for children and adults to learn and practice the basic
arithmetic operations. Because Alexa is becoming more and more popular in many households, we wanted to create a
new learning experience with the help of a speech-based device. The skill was developed with the Alexa
Skills Kit. People wishing to play the game can choose between two levels, which are suitable for young and old users
or, respectively, beginners and advanced players. This created a new and fun way of learning: Alexa tells stories,
and users must listen carefully to answer mathematical riddles during the game. The small usability test that we
conducted was successful in many ways. On the one hand, we got positive feedback from our testers concerning
the creativity of the story line and the idea of story-based learning. On the other hand, we also got a lot of input on
how we could improve our skill in future development to make it even more functional. We also gained many
new insights into how to create and implement different features of the skill. In the future, we hope that many
children and adults will make use of our skill, and we want to further improve it by integrating new features and
stories.
References
Dean, B. (2016). Rapidly Create Your Alexa Skill Backend with AWS CloudFormation.
https://developer.amazon.com/de/blogs/alexa/post/Tx27NAUCY0KQ34D/rapidly-create-your-alexa-skill-backend-with-aws-cloudformation; visited Dec 2020
Dizon, G. (2017), Using Intelligent Personal Assistants for Second Language Learning: A Case Study of Alexa.
TESOL J, 8: 811-830. https://doi.org/10.1002/tesj.353
Gong, Li. (2020) "Intelligent personal assistants." U.S. Patent Application No. 10/158,213.
Goksel Canbek, N. &. (2016). On the track of Artificial Intelligence: Learning with Intelligent Personal Assistants.
Journal of Human Sciences, 13(1), 592-601.
Matney, L. (2019). More than 100 million Alexa devices have been sold. techcrunch.com.
Nagler, W., Haas, M., Schön, M. & Ebner, M. (2019). Professor YouTube and Their Interactive Colleagues How
Enhanced Videos and Online Courses Change the Way of Learning. In J. Theo Bastiaens (Ed.), Proceedings of
EdMedia + Innovate Learning (pp. 641-650). Amsterdam, Netherlands: Association for the Advancement of
Computing in Education (AACE).
Pearl, C. (2017). “Is a VUI right for you and your app?”. Retrieved from
https://www.oreilly.com/content/is-a-vui-right-for-you-and-your-app/; visited Dec 2020
Underwood, J. D. (2009). "The impact of digital technology: A review of the evidence of the impact of digital
technologies on formal education.". Becta.
Vogt, F., Hauser, B., Stebler, R., Rechsteiner, K. & Urech, C. (2018). Learning through play - pedagogy and learning
outcomes in early childhood mathematics. European Early Childhood Education Research Journal, 26(4), 589-603.
DOI: 10.1080/1350293X.2018.1487160
Zhao, J., Bhatt, S., Thille, C., Zimmaro, D., Gattani, N. & Walker, J. (2020). Introducing
Alexa for E-learning. Proceedings of the Seventh ACM Conference on Learning @ Scale, 427-428.