Parakeet: A Demonstration of Speech Recognition on a
Mobile Touch-Screen Device
Keith Vertanen and Per Ola Kristensson
Cavendish Laboratory, University of Cambridge
JJ Thomson Avenue, Cambridge UK
We demonstrate Parakeet, a continuous speech recognition system for mobile touch-screen devices. Parakeet’s interface is designed to make correcting errors easy on a handheld device while on the move. Users correct errors on a touch-screen either by selecting alternative words from a word confusion network or by typing on a predictive software keyboard. Our interface design was guided by computational experiments. We conducted a user study to validate our design. We found novices entered text at 18 WPM while seated indoors and 13 WPM while walking outdoors.
Mobile continuous speech recognition, touch-screen interface, error correction, speech input, word confusion network
ACM Classification Keywords
H.5.2 User Interfaces: Voice I/O
This is a demonstration companion paper to . In this paper, we describe our work on a system called Parakeet. Parakeet allows users to dictate text while on the move. Our
system consists of a speech recognition engine (based on
PocketSphinx) and a novel interface for performing corrections. Parakeet is designed to make mobile continuous
speech recognition pleasant and efficient.
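The dictation loop, streaming microphone audio to a recognizer running on the device itself, can be sketched as follows. This is a simplified illustration, not Parakeet's actual code: `StubRecognizer` and `dictate` are hypothetical stand-ins, and the real system's PocketSphinx API differs.

```python
# Simplified sketch of streaming audio to an on-device recognizer.
# StubRecognizer is a hypothetical stand-in; the real system decodes
# each audio buffer incrementally with PocketSphinx.

class StubRecognizer:
    def __init__(self):
        self.buffers = []

    def start_utterance(self):
        """Reset state when the user begins speaking."""
        self.buffers = []

    def process_audio(self, chunk):
        """Accept one audio buffer (a real engine decodes it here)."""
        self.buffers.append(chunk)

    def end_utterance(self):
        """Return a final hypothesis once the utterance is complete."""
        return "decoded %d audio chunks" % len(self.buffers)


def dictate(recognizer, audio_stream, chunk_size=1024):
    """Stream fixed-size chunks from the microphone to the recognizer."""
    recognizer.start_utterance()
    for start in range(0, len(audio_stream), chunk_size):
        recognizer.process_audio(audio_stream[start:start + chunk_size])
    return recognizer.end_utterance()


print(dictate(StubRecognizer(), bytes(4096)))  # -> "decoded 4 audio chunks"
```

The point of the chunked loop is that decoding overlaps with speaking, so the recognition result is available shortly after the utterance ends.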
Parakeet runs on mobile Linux devices, such as the Nokia
N800 (figure 1). To enter text, users speak into a wireless
microphone. While the user is speaking, audio is streamed
to a continuous speech recognizer which is running on the
actual device. Once recognition is complete, the result is displayed in the correction interface (figure 2).

Copyright is held by the author/owner(s).
IUI’09, February 8-11, 2009, Sanibel Island, Florida, USA.

Figure 1. The Parakeet system running on a Nokia N800 device.

Figure 2. Parakeet’s main correction interface. The recognition result is shown at the top. Likely alternative words are displayed in each column. In this example, the user is changing several words and deleting another word in a single crossing action.

The best recognition hypothesis is shown at the top. Each column below it contains likely alternative words. At the bottom, a series of delete buttons allows words to be removed. The user can scroll left or right by touching the arrow buttons on either side of the screen. Users make corrections using a number of different actions:
• Tapping – An alternate word can be chosen by simply
tapping on it. The selected word is displayed in green.
• Crossing – Multiple words can be corrected in a single
continuous crossing gesture (figure 2).
• Deleting and inserting – Words can be deleted, or new words inserted between columns (figure 3).
• Replacing with variant – By double-tapping a word, a
morphological variant can be chosen (figure 4).
• Typing – Arbitrary corrections can be made using a predictive software keyboard (figure 5). As a user types, word completion predictions are offered.
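The correction actions above operate on a word confusion network: a sequence of columns, each holding candidate words ranked by recognizer confidence, where the top entry of every column forms the best hypothesis. A minimal sketch of this structure and the tap, delete, and type actions follows; the class, its methods, and all words and scores are invented for illustration and are not Parakeet's actual implementation.

```python
# Hypothetical sketch of correction on a word confusion network (WCN).
# Each column lists (word, probability) candidates, best first; the
# selected word of each non-deleted column forms the current text.

class ConfusionNetwork:
    def __init__(self, columns):
        self.columns = [list(col) for col in columns]
        self.choice = [0] * len(columns)       # selected index per column
        self.deleted = [False] * len(columns)  # set by the delete buttons

    def hypothesis(self):
        """Current text: the selected word of every non-deleted column."""
        return " ".join(
            self.columns[i][self.choice[i]][0]
            for i in range(len(self.columns))
            if not self.deleted[i]
        )

    def tap(self, col, word):
        """Tapping an alternative selects it within its column."""
        words = [w for w, _ in self.columns[col]]
        self.choice[col] = words.index(word)

    def delete(self, col):
        """The delete button at the bottom of a column removes its word."""
        self.deleted[col] = True

    def type_replace(self, col, word):
        """Typing an arbitrary word on the software keyboard."""
        self.columns[col].insert(0, (word, 0.0))
        self.choice[col] = 0


# Example: the recognizer misheard "the cat sat" as "the hat sat".
net = ConfusionNetwork([
    [("the", 0.9), ("a", 0.1)],
    [("hat", 0.6), ("cat", 0.3), ("hats", 0.1)],
    [("sat", 0.8), ("sad", 0.2)],
])
net.tap(1, "cat")          # one tap fixes the misrecognized word
print(net.hypothesis())    # -> "the cat sat"
```

Exposing the columns directly is what makes one-tap correction possible: the user picks from alternatives the recognizer already considered, rather than retyping the word.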