Artificial Intelligence Review (2023) 56:6039–6060
https://doi.org/10.1007/s10462-022-10326-x
Detecting Persian speaker-independent voice commands
based on LSTM and ontology in communicating
with the smart home appliances
Leila Safarpoor Kalkhoran1 · Shima Tabibian1 · Elaheh Homayounvala2
Accepted: 7 November 2022 / Published online: 19 November 2022
© The Author(s), under exclusive licence to Springer Nature B.V. 2022
Abstract
Nowadays, various interfaces are used to control smart home appliances. Interaction between
humans and smart home appliances may rely on input devices such as a mouse, keyboard,
microphone, or webcam. With a microphone as the input device, this interaction can be
established via speech. Speech-based human-machine interaction is a more natural way of
communicating than other types of interfaces. Existing speech-based interfaces in the smart
home domain suffer from several problems, such as limiting users to a fixed set of pre-defined
commands, not supporting indirect commands, requiring a large training set, or depending on
specific speakers. To address these challenges, this paper proposes several approaches.
We exploited an ontology as a knowledge base to support indirect commands and to free users
from having to express a specific set of commands. Moreover, Long Short-Term Memory (LSTM)
was exploited to detect spoken commands more accurately. Additionally, due to the lack of
Persian voice command data for interacting with smart home appliances, a dataset of
speaker-independent Persian voice commands for communicating with a TV, media player, and
lighting system was designed, recorded, and evaluated in this research. The experimental
results show that the LSTM-based voice command detection system performed almost 1.5% and
13% more accurately than the Hidden Markov Model-based one in the scenarios ‘with’ and
‘without ontology’, respectively. Furthermore, using the ontology in the LSTM-based method
improved system performance by about 40%.
Keywords Voice commands detection · Ontology · Smart home appliances · Long short-term
memory · Accessibility
* Shima Tabibian
sh_tabibian@sbu.ac.ir
Leila Safarpoor Kalkhoran
l.safarpour@mail.sbu.ac.ir
Elaheh Homayounvala
e.homayounvala@londonmet.ac.uk
1 Cyberspace Research Institute, Shahid Beheshti University, Velenjak, Tehran, Iran
2 School of Computing and Digital Media, London Metropolitan University, London, UK