Designing Towards Maximum
Motivation and Engagement in an
Interactive Speech Therapy Game
Abstract
Children with speech impairments often find speech
curriculums tedious, limiting how often children are
motivated to practice. A speech therapy game has the
potential to make practice fun, may help facilitate
increased time and quality of at-home speech therapy,
and lead to improved speech. We explore using
conversational real-time speech recognition, game
methodologies theorized to improve immersion and
flow, and user centered approaches to design an
immersive interactive speech therapy solution. Our
preliminary user evaluation showed that compared to
traditional methods, children were more motivated to
practice speech using our system.
Author Keywords
Speech Processing; Speech Therapy; Human Computer
Interaction; Games; Motivation
ACM Classification Keywords
H. Information Systems; H.5. Information interfaces
and presentation (e.g., HCI); H.5.2. Voice I/O; J.3.
Life and Medical Sciences: Health
Permission to make digital or hard copies of part or all of
this work for personal or classroom use is granted without
fee provided that copies are not made or distributed for
profit or commercial advantage and that copies bear this
notice and the full citation on the first page. Copyrights for
third-party components of this work must be honored. For
all other uses, contact the Owner/Author.
IDC '17, June 27-30, 2017, Stanford, CA, USA
© 2017 Copyright is held by the owner/author(s).
ACM ISBN 978-1-4503-4921-5/17/06.
http://dx.doi.org/10.1145/3078072.3084329
Jared Duval
Zachary Rubin
Elizabeth Goldman
Nick Antrilli
Yu Zhang
Su-Hua Wang
Sri Kurniawan
University of California Santa Cruz
Santa Cruz, CA 95064 USA
jduval@ucsc.edu
zarubin@ucsc.edu
eljgoldm@ucsc.edu
nantrill@ucsc.edu
yzhan105@ucsc.edu
suhua@ucsc.edu
skurnia@ucsc.edu
Introduction
The prevalence of speech sound disorders (phonological
and articulation disorders) in young children is 8 to 9 percent
[13]. Most speech impairments can be corrected with
practice and speech therapy. Speech therapy consists
of two components: in-office sessions with a speech
language pathologist (SLP) and at-home practice
curriculums [6, 11]. Orofacial cleft is one of the most
common causes of speech impairments.
Orofacial cleft is a common birth defect that results in a
gap in the lip or mouth, shown in Figures 1 and 2.
Children born with orofacial cleft undergo multiple
surgical procedures and often require long-term speech
therapy. Speech impairment is extremely common with
this birth defect because it allows air to escape through
the nasal cavity, which results in the use of
compensatory glottal attacks to mimic sounds, lisps,
and difficulty forming plosives.
During in-office therapy sessions, SLPs evaluate the
child’s speech to find specific targets that the child has
difficulty producing. Once the SLP knows the child’s
unique impairments, they work with the child on techniques
to improve the detected targets. The SLP gives children
and parents resources to take home to continue
practicing until the next session, where the child is
reassessed.
SLPs instruct parents to have their child practice for ten
minutes per day, but according to our interview with an
SLP, it is suspected that children practice little, if at all,
outside of speech therapy sessions. There are many
contributing factors as to why children do not practice
enough at home: children find the curriculums tedious
and repetitive, parents are not qualified to assess the
small nuances in leading a speech curriculum, and
children may not be intrinsically motivated to improve
their speech.
Many children do not have enough access to SLPs and
could benefit from a supplemental speech therapy
game. A speech therapy game may address these
challenges in numerous ways [12]:
• Moves responsibility for leading practice from the parent to the game
• Can assign appropriate words and phrases from an expansive knowledge base for each child’s unique speech goals (a minimal sketch of such an assignment follows this list)
• Adjusts difficulty as the child plays using a dynamic curriculum to balance challenge and boredom [3]
• Adds intrinsic motivation to practicing speech through gameplay and mechanics
• Allows the SLP to track practice and performance in between in-office therapy sessions to make better use of their time
• Allows children with less access to an SLP to continue improving their speech
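To make the second and third points above concrete, here is a minimal sketch of how a per-child curriculum might draw practice words from a knowledge base. The word bank, the ChildProfile fields, and select_practice_words are illustrative assumptions, not SpokeIt's actual implementation.

```python
# Minimal sketch of a per-child curriculum: a knowledge base maps target
# phonemes to candidate practice words, and the game draws words that match
# the child's current speech goals. All names here are illustrative.
from dataclasses import dataclass, field
import random

# Hypothetical knowledge base: target phoneme -> candidate practice words.
WORD_BANK = {
    "p": ["pop", "apple", "puppy", "cap"],
    "k": ["cake", "rocket", "duck", "cookie"],
    "s": ["sun", "bus", "song", "grass"],
}

@dataclass
class ChildProfile:
    name: str
    target_phonemes: list          # assigned by the SLP at the in-office session
    mastered_words: set = field(default_factory=set)

def select_practice_words(child: ChildProfile, count: int = 5) -> list:
    """Pick practice words matching this child's goals, skipping mastered ones."""
    candidates = [
        word
        for phoneme in child.target_phonemes
        for word in WORD_BANK.get(phoneme, [])
        if word not in child.mastered_words
    ]
    random.shuffle(candidates)
    return candidates[:count]

# Example: a child working on /k/ and /s/ targets.
child = ChildProfile(name="demo", target_phonemes=["k", "s"], mastered_words={"duck"})
print(select_practice_words(child))
```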
A high-fidelity prototype of our speech therapy game
named SpokeIt has been developed and tested. Many
of the design considerations mentioned below are the
result of feedback we received from the original high-
fidelity prototype of SpokeIt. We first discuss the new
design motivations and then the results of our user
study that led to these decisions.
Motivation
The design of the game content, mechanics, and
interactions is central to the success of improving
Figure 1: An image that shows a baby with cleft lip. Image CC-BY Centers for Disease Control and Prevention
Figure 2: An image that shows a baby with cleft palate. Image CC-BY Centers for Disease Control and Prevention
speech and motivating practice. As Isbister warns, many
are “[p]lacing high hopes on games designed for the public
good - as many nonprofits, health organizations, and social
enterprises are doing - without realizing that bad game design
can undermine the most noble of ambitions. It's quite
possible to make terrible, dull, and unappealing games
for learning or training or health” [9].
To improve speech, one must practice speaking, but
saying a word or phrase repeatedly for the sake of
practice is often found to be tedious and boring. During
our observations of interactions between SLPs and their
patients, we noticed that jaded children were not
interested in practicing speech.
Games turn repetition from something that stifles
motivation and induces boredom into an element that is
recognized as useful for progress in the game [10].
Hiding information about gameplay and maintaining a
sense of uncertainty about the outcome can make
games fun [10]. Repeating target words and phrases
to advance in a game is much more motivating than
repeating them to complete a speech curriculum.
Educational software has traditionally attempted to
harness games as extrinsic motivation by using them
as sugar coating for learning content, but children learn
more from intrinsic approaches [8]. The learning
objectives should be part of the gameplay itself [8].
Practicing speech should be a game mechanic that
induces immersion and flow rather than a disruption to
play. The narrative and plot should align with the
learning content [5].
If done correctly, games can lead to improved learning,
enhanced motivation, greater attention, and increased
retention [2]. The end state we desire is the motivated
learner, which is someone who is enthusiastic, focused,
and engaged [7].
Design
In order to understand the needs of the relevant
stakeholders, we held a focus group with cleft lip
specialists, researchers, and game developers. Cleft lip
specialists emphasized the need for virtual incentives
that would remain effective for months or even years.
Researchers and game developers agreed that
maximizing immersion through emotionally motivating
elements would best hide the repetitive speech
exercises. We designed and validated that the following
aspects of our game are emotionally motivating:
• Characters that are relatable and have the capacity to create empathy
• An overarching plot for our characters that is interesting and defines the player’s goals
• Narrators that engage and encourage the player to continue helping the game characters
• A Procedural Content Background Art Generator that creates rich environments to experience
• Mechanics that seamlessly incorporate conversational speech recognition into completing game objectives
Characters
The characters in a game are often the single most
motivating factor to continue play [9]. We become
invested in their wellbeing, life, and goals [9]. The
Figure 3: Blue game character
Figure 4: Yellow game character
children who play our speech therapy game should be
able to relate to the characters and be motivated to
help them.
Our speech therapy game follows a race known as the
Migs, shown in Figures 3 and 4. The Migs do not speak;
the player is their voice. The game includes Migs of six
different colors, each with its own personality and unique
relationships with the other Migs.
We animate the Migs’ facial expressions to display the
primary emotions as a starting point for creating
natural and realistic experiences within the game [1].
Using Adobe’s Character Animator, we are able to
employ actors to capture and record animations in real
time that are mapped directly to the Migs. By
combining the animations created by graphics artists,
performances by actors in Character Animator, and
physics, we can make believable and natural facial
expressions so that children can easily understand our
game characters’ emotions and hopefully empathize
with them.
Plot
The story of the Migs is a hopeful one. Before the game
is played for the first time, the player is shown an
introductory cinematic that describes the Migs’ history.
The story begins in the Migs’
beautiful world, where the narrator describes their
utopian lives. As the background music gets more
dramatic, the Migs are whisked up by a storm into a
new, unfamiliar two-dimensional world.
Procedural Content Generation
Speech therapy is an ongoing process that takes each
child a different amount of time to complete. Therefore,
our speech therapy game should remain playable and
engaging for as long as each child needs it. Previous iterations of our work employed
mini-games that would play for an allotted amount of
time before switching to the next game. Games could
be randomly ordered and could be played multiple
times. We observed that the mini-games lacked
replayability and would not sustain long-term use.
Fresh procedurally generated content should solve
replayability issues.
Our PCG employs an evolutionary algorithm that learns
how to make art with our collection of over 750 image
assets. The evolutionary algorithm borrows from
biology to develop art that is believable and
aesthetically pleasing, shown in Figure 6. It actively
learns to match selected themes, materials, terrain
settings, lighting, and time of day to produce visually
striking worlds, shown in Figure 5.
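As a rough illustration of this idea, the sketch below evolves a scene composed of tagged image assets toward user-selected theme and time-of-day settings. The asset tags, fitness function, and population sizes are simplified assumptions and do not reflect SpokeIt's actual generator.

```python
# Minimal sketch of an evolutionary loop for composing background scenes from
# tagged image assets. The asset tags, fitness weights, and parameter values
# are illustrative assumptions, not SpokeIt's actual generator.
import random

# Hypothetical asset library: each asset carries tags the fitness can match.
ASSETS = [
    {"name": "pine_tree", "theme": "forest", "time": "day"},
    {"name": "cactus", "theme": "desert", "time": "day"},
    {"name": "glow_mushroom", "theme": "forest", "time": "night"},
    {"name": "dune", "theme": "desert", "time": "night"},
]

TARGET = {"theme": "forest", "time": "night"}   # settings chosen in the UI (Figure 5)
SCENE_SIZE = 6                                   # assets per generated background

def random_scene():
    return [random.choice(ASSETS) for _ in range(SCENE_SIZE)]

def fitness(scene):
    """Score how well the scene's assets match the selected theme and time of day."""
    return sum(
        (asset["theme"] == TARGET["theme"]) + (asset["time"] == TARGET["time"])
        for asset in scene
    )

def mutate(scene):
    """Swap one asset for a random alternative."""
    child = list(scene)
    child[random.randrange(SCENE_SIZE)] = random.choice(ASSETS)
    return child

population = [random_scene() for _ in range(20)]
for generation in range(50):
    population.sort(key=fitness, reverse=True)
    survivors = population[:10]                          # keep the fittest half
    population = survivors + [mutate(random.choice(survivors)) for _ in range(10)]

best = max(population, key=fitness)
print([asset["name"] for asset in best], fitness(best))
```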
As development continues, the plot and game
objectives will also employ PCG so that our speech
therapy game can always deliver fresh new content
that will keep the children motivated and engaged.
Mechanics
The elements of engaged learning are focused goals,
challenging tasks, clear and compelling standards,
protection from adverse consequences for initial
failures, affirmation of performance, affiliation with
others, novelty and variety, choice, and authenticity
[4].
Games have demonstrated perceptual and cognitive
impacts and can help players acquire
new skills [2]. We can adjust difficulty by assigning
target words that are more difficult for the child,
Figure 5: Settings available to the system to procedurally generate background art using an evolutionary algorithm
changing the number of repetitions needed to move
past a challenge, adjusting the rhythm and timing
required to say the targets, and by tweaking the
number of targets required per interaction.
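A minimal sketch of these difficulty levers, assuming a simple rule that reacts to the child's recent success rate, might look like the following; the field names and thresholds are illustrative rather than SpokeIt's tuned values.

```python
# Minimal sketch of the difficulty levers described above, plus one possible
# adjustment rule based on the child's recent success rate. The thresholds
# and field names are illustrative assumptions.
from dataclasses import dataclass

@dataclass
class Difficulty:
    repetitions_required: int = 2    # times the target must be said per challenge
    targets_per_challenge: int = 1   # how many target words each interaction asks for
    timing_window_s: float = 4.0     # how long the game listens for each attempt

def adjust(difficulty: Difficulty, recent_success_rate: float) -> Difficulty:
    """Nudge difficulty up when the child is cruising, down when they struggle."""
    if recent_success_rate > 0.8:                      # too easy: raise the challenge
        difficulty.repetitions_required += 1
        difficulty.timing_window_s = max(2.0, difficulty.timing_window_s - 0.5)
    elif recent_success_rate < 0.4:                    # too hard: back off to avoid frustration
        difficulty.repetitions_required = max(1, difficulty.repetitions_required - 1)
        difficulty.timing_window_s += 0.5
    return difficulty

# Example: after a run of easy successes, the next challenge gets slightly harder.
print(adjust(Difficulty(), recent_success_rate=0.9))
```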
Volume is a challenge for many children. To
successfully complete a challenge, the child must say
the correct target at an appropriate volume. An on-screen
indicator displays the loudness of the child’s
speech and changes color when the volume moves in
or out of the accepted range. It should be clear to the
player when the game is listening and when they
should be speaking; the child should only speak
when the narrators are not. To make this clear,
we include an ear icon in the top-right corner of the game
that indicates when the child should speak and is toggled off
when the game is not listening, shown in Figure 7.
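One way to realize such an indicator is to compute a root-mean-square level for each audio frame and map it to an indicator state, as in the sketch below; the threshold values and function names are assumptions for illustration only.

```python
# Minimal sketch of the loudness indicator: compute an RMS level for each
# audio frame and map it to an indicator state. The threshold values are
# illustrative assumptions, not SpokeIt's tuned numbers.
import math

QUIET_THRESHOLD = 0.02   # below this RMS the child is too quiet
LOUD_THRESHOLD = 0.60    # above this the input is clipping or shouting

def rms(samples):
    """Root-mean-square level of one frame of audio samples in [-1.0, 1.0]."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))

def indicator_state(samples, listening: bool) -> str:
    """Map the current frame to the ear indicator shown in Figure 7."""
    if not listening:                 # narrators are speaking; ear toggled off
        return "off"
    level = rms(samples)
    if level < QUIET_THRESHOLD:
        return "too_quiet"            # prompt the child to speak louder
    if level > LOUD_THRESHOLD:
        return "too_loud"
    return "good"                     # volume is inside the accepted band

# Example: a quiet frame while the game is listening.
print(indicator_state([0.005, -0.004, 0.006, -0.003], listening=True))
```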
Results
SpokeIt (first iteration shown in Figure 8 and second
iteration shown in Figure 9) has been tested on children
with cleft speech as well as with expert medical
professionals who work with children with cleft speech.
Four participants with cleft speech played SpokeIt: three
children under 10 (mean age 5.33 years, σ = 2.3) and one teenager.
One child in particular engaged much more with the
game than the shyer children. The children under 10
had speech barely above the detection threshold,
resulting in play difficulties. We addressed this by adding a
volume indicator and narrator voice prompts
instructing the player to speak louder.
The child life specialist said she loved watching the
eight-year-old play the game. She noticed that he
immediately put much more effort into his speech. She
demonstrated how he overemphasized each spoken
syllable in the target sentence to make the game
advance. The SLP had trouble keeping his attention,
but when he was given the opportunity to try SpokeIt,
he completed the entire prototype and asked for more
content.
The SLP who watched her patients play SpokeIt loved
the storybook style of the game. She is skeptical that
the game will ever be able to make the distinction
between correct and incorrect speech, but looks
forward to using it to motivate practice nonetheless.
The plastic surgeon who debriefed with us wants to add
a feature to the game that would display a mouth and
demonstrate correct speech when the child is
struggling.
The eldest participant spoke primarily using sign language. It was
very difficult for the SLP to convince her to use her
voice. At first, she was also too shy to try SpokeIt, but
once the game began, she started speaking
without the SLP’s encouragement.
Discussion and Future Work
The accuracy of the speech recognition system can be
greatly improved. We are curious about how effectively
it can differentiate between correct and incorrect
speech. If the system could accurately diagnose the details
of a speech impairment, it would save SLPs considerable time that
could instead be spent helping the child improve.
A speech therapy game has the potential to track
progress for each child. It may be able to grade
children’s speech and report progress to SLPs. The data
can be studied to look for demographic patterns and
Figure 6: Sample output of PCG
while it continues learning how to
create background art
Figure 7: Indicator that displays
when the game is listening for
input. Indicator changes colors
when loudness thresholds are
met by the player
correlations to other aspects of the child’s
development.
SpokeIt may make it easier for SLPs to assign tailored
curriculums for each child’s unique speech goals. It can
assign target words and phrases that will have the
most impact on the child.
Acknowledgements
This material is based in part upon work supported by
the National Science Foundation under Grant No.
1617253. We also thank Doctor Travis Tollefson and
SLP Christina Roth for their aid in conducting user
evaluations. Any opinions, findings, and conclusions or
recommendations expressed in this material are those
of the authors and do not necessarily reflect the views
of the National Science Foundation.
References
1. Becker-Asano, C., & Wachsmuth, I. (2010).
Affective computing with primary and
secondary emotions in a virtual
human. Autonomous Agents and Multi-Agent
Systems, 20(1), 32.
2. Boyle, E., Connolly, T. M., & Hainey, T. (2011).
The role of psychology in understanding the
impact of computer games. Entertainment
Computing, 2(2), 69-74.
3. Conati, C., & Zhou, X. (2002, June). Modeling
students’ emotions from cognitive appraisal in
educational games. In International Conference
on Intelligent Tutoring Systems (pp. 944-954).
Springer Berlin Heidelberg.
4. Dickey, M. D. (2005). Engaging by design: How
engagement strategies in popular computer
and video games can inform instructional
design. Educational Technology Research and
Development, 53(2), 67-83.
5. Dondlinger, M. J. (2007). Educational video
game design: A review of the literature.
Journal of applied educational technology,
4(1), 21-31.
6. Facts about Cleft Lip and Cleft Palate. (2015,
November 12). Retrieved March 25, 2017, from
https://www.cdc.gov/ncbddd/birthdefects/cleftlip.html
7. Garris, R., Ahlers, R., & Driskell, J. E. (2002).
Games, motivation, and learning: A research
and practice model. Simulation & gaming,
33(4), 441-467.
8. Habgood, M. J., & Ainsworth, S. E. (2011).
Motivating children to learn effectively:
Exploring the value of intrinsic integration in
educational games. The Journal of the Learning
Sciences, 20(2), 169-206.
9. Isbister, K. (2016). How Games Move Us:
Emotion by Design. MIT Press.
10. Sauvé, L. (2010). Effective educational games.
Educational gameplay and simulation
environments: Case studies and lessons
learned, 27-50.
11. Speech Development – Cleft Palate Foundation.
(n.d.). Retrieved March 25, 2017, from
http://www.cleftline.org/parents-individuals/publications/speech-development/
12. Sri Kurniawan, Su-hua Wang, Christina Roth,
Travis Tollefson. (2016). CHS: Small: Game for
Cleft Speech Therapy (No. 1617253). National
Science Foundation.
13. Quick Statistics About Voice, Speech,
Language. (2016, May 19). Retrieved April 01,
2017, from
https://www.nidcd.nih.gov/health/statistics/quick-statistics-voice-speech-language
Figure 8: Original high fidelity
prototype of SpokeIt actively
using conversational speech
recognition
Figure 9: Example challenge
where child must say “Fire”