
Detecting Persian speaker-independent voice commands based on LSTM and ontology in communicating with the smart home appliances


Abstract

Various interfaces are now used to control smart home appliances. Interaction between humans and smart home appliances may rely on input devices such as a mouse, keyboard, microphone, or webcam. Speech, captured through a microphone, is one such input mode, and speech-based human-machine interaction is a more natural way of communicating than other types of interfaces. Existing speech-based interfaces in the smart home domain suffer from several problems: they limit users to a fixed set of pre-defined commands, do not support indirect commands, require a large training set, or depend on specific speakers. To address these challenges, we propose several approaches in this paper. We use an ontology as a knowledge base to support indirect commands and to remove the restriction that users express only a specific set of commands. Moreover, Long Short-Term Memory (LSTM) is used to detect spoken commands more accurately. Additionally, because of the lack of Persian voice commands for interacting with smart home appliances, a dataset of speaker-independent Persian voice commands for communicating with a TV, media player, and lighting system has been designed, recorded, and evaluated in this research. The experimental results show that the LSTM-based voice command detection system performed almost 1.5% and 13% more accurately than the Hidden Markov Model-based one in the scenarios ‘with’ and ‘without ontology’, respectively. Furthermore, using the ontology in the LSTM-based method improved system performance by about 40%.
Artificial Intelligence Review (2023) 56:6039–6060
https://doi.org/10.1007/s10462-022-10326-x
LeilaSafarpoorKalkhoran1· ShimaTabibian1· ElahehHomayounvala2
Accepted: 7 November 2022 / Published online: 19 November 2022
© The Author(s), under exclusive licence to Springer Nature B.V. 2022
Keywords: Voice commands detection · Ontology · Smart home appliances · Long short-term memory · Accessibility
* Shima Tabibian
sh_tabibian@sbu.ac.ir
Leila Safarpoor Kalkhoran
l.safarpour@mail.sbu.ac.ir
Elaheh Homayounvala
e.homayounvala@londonmet.ac.uk
1 Cyberspace Research Institute, Shahid Beheshti University, Velenjak, Tehran, Iran
2 School ofComputing andDigital Media, London Metropolitan University, London, UK
Article
The main objective of this work is to develop a complete energy-saving system for a home in southern Algeria. The previously proposed solution (OSEIM) is an intelligent solution based on a domain ontology for electrical energy management in the home; homes are designed, built, and operated with increasingly complex technologies. The ontology used in this work brings together a body of knowledge representing all the objects that can positively or negatively influence electrical energy consumption, and all of this information is linked by meaningful and useful relationships. In addition, the NewOSEIM solution, which comprises an OWL (Web Ontology Language) knowledge base and SWRL (Semantic Web Rule Language) rules that provide a reasoning mechanism, is developed to offer an optimal energy-saving solution. This work depends mainly on importing the OSEIM ontology and other ontologies in the field of energy management. Finally, the NewOSEIM solution provides an additional energy saving of 2.11%.
Article
The long time required to deploy cloud resources in Network Function Virtualization network architectures has led to the proposal and investigation of algorithms for predicting traffic or the necessary processing and memory resources. However, whatever approach is taken, a prediction error is inevitable. Two types of prediction error can occur, and they affect network operational costs differently. When the predicted values are higher than the real ones, the resource allocation algorithms allocate more resources than necessary, which introduces an over-provisioning cost. Conversely, when the predicted values are lower than the real values, too few resources are allocated, which degrades QoS and introduces an under-provisioning cost. When the over-provisioning and under-provisioning costs differ, most of the prediction algorithms proposed in the literature are not adequate, because they minimize the mean square error or other symmetric cost functions. For this reason, we propose and investigate a forecasting methodology that introduces an asymmetric cost function capable of weighing the costs of over-provisioning and under-provisioning differently. We have applied the proposed forecasting methodology to resource allocation in a Network Function Virtualization architecture where the Network Function Virtualization Infrastructure Points-of-Presence are interconnected by an elastic optical network. We have verified a cost saving of 40% compared to solutions that minimize the mean square error.
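The difference between a symmetric and an asymmetric cost function can be illustrated with a minimal Python sketch. The abstract does not give the exact form of the cost function, so the piecewise-linear form below, the function name `asymmetric_cost`, and the weights `over_cost` and `under_cost` are assumptions made purely for illustration.

```python
def asymmetric_cost(predicted, actual, over_cost=1.0, under_cost=3.0):
    """Mean piecewise-linear cost of a forecast (illustrative assumption).

    Over-prediction (predicted > actual) is charged `over_cost` per unit,
    under-prediction (predicted < actual) is charged `under_cost` per unit.
    With over_cost == under_cost == 1 this reduces to the mean absolute error.
    """
    if len(predicted) != len(actual) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    total = 0.0
    for p, a in zip(predicted, actual):
        error = p - a
        if error >= 0:
            total += over_cost * error      # over-provisioning
        else:
            total += under_cost * (-error)  # under-provisioning
    return total / len(actual)


# Two forecasts with the same absolute error but opposite sign.
demand = [10.0, 10.0]
too_high = [12.0, 12.0]
too_low = [8.0, 8.0]
cost_high = asymmetric_cost(too_high, demand)
cost_low = asymmetric_cost(too_low, demand)
```

Under a symmetric cost the two forecasts would score the same; with the assumed weights, the under-predicting forecast is penalised three times as heavily, so a model trained against such a cost would tend to over-provision.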
Article
The Internet of Things (IoT) is an emerging Internet-based architecture that enables the exchange of data and services in a global network. With the advent of the Internet of Things, more and more devices are connecting to the Internet to help people get and share data or program actions. In this paper, we introduce an IoT Agent, a Web application for monitoring and controlling a smart home remotely. The IoT Agent integrates a chat bot that can understand text or voice commands using natural language processing (NLP). With NLP, home devices are more user-friendly and easier to control, since even when a command or question/command differs from the presets, the system understands the user’s wishes and responds accordingly. Our solution exploits several available Application Programming Interfaces (APIs), namely: the Dialogflow API for the efficient integration of NLP into our IoT system, the Web Speech API for enriching the user experience with voice recognition and synthesis features, MQTT (Message Queuing Telemetry Transport) for the lightweight control of actuators, and Firebase for dynamic data storage. The most significant innovation is the integration of several third-party APIs and open source technologies into one mash-up, highlighting how a new IoT application can be built today using a multi-tier architecture. We believe that such a tiered architecture can be very useful for the rapid development of smart home applications.
Article
Traffic and cloud resource prediction methodologies have recently been used in Network Function Virtualization environments for cloud and bandwidth resource allocation. Both traditional and innovative prediction methodologies have been proposed for use in allocation procedures. For instance, Long Short-Term Memory-based prediction techniques have been shown to be very effective for allocating resources. All of these techniques minimize a symmetric cost function, such as the Root Mean Square Error, that weights positive and negative prediction errors equally. However, the sign of the error can affect the cost increase due to prediction errors differently. For instance, when the Quality of Service (QoS) degradation cost due to traffic loss dominates the cloud resource allocation cost, an algorithm that overestimates the offered traffic is preferable; conversely, underestimating the traffic is preferable when the cloud allocation cost is higher than the QoS degradation cost. For this reason, we propose an Asymmetric LSTM traffic prediction procedure in which the cost function takes into account both the QoS degradation and cloud resource allocation costs. In a typical network and traffic scenario, we show that the proposed solution decreases cost by 40% with respect to the classical LSTM prediction methodology based on the Root Mean Square Error.
Conference Paper
Today, smart home appliances are controlled using different user interfaces and various input devices. Speech is a natural and easy way for humans and machines to communicate. However, smart device manufacturers use a limited set of words to control their devices, and users must be familiar with these control words. If the user's words differ from the device control words, the device cannot execute the user's query correctly. To solve this problem, this paper proposes an ontology-based method to improve the accuracy of the Hidden Markov Model (HMM)-based voice command detection system. The experimental results show that using the ontology alongside the HMM-based voice command detection system improved its performance by about 54.5 percent in comparison to the "without ontology" case.
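The mismatch between user words and device control words can be sketched in a few lines of Python. This is only a toy stand-in, not the method from the paper: a real ontology would model devices, actions, and their relations (for example in OWL), whereas the flat table `TOY_ONTOLOGY`, the English phrasings in it, and the function `resolve_command` are all hypothetical names and data introduced here for illustration.

```python
from typing import Optional, Tuple

# Hypothetical toy "ontology": each canonical (device, action) command lists
# direct and indirect phrasings a user might say. All entries are invented.
TOY_ONTOLOGY = {
    ("lighting", "turn_on"): ["turn on the light", "it is dark", "i cannot see"],
    ("lighting", "turn_off"): ["turn off the light", "it is too bright"],
    ("tv", "turn_on"): ["turn on the tv", "i want to watch the news"],
}


def resolve_command(utterance: str) -> Optional[Tuple[str, str]]:
    """Map recognised text to a canonical (device, action) pair, or None.

    Matching is plain substring search over the known phrasings, which is far
    simpler than reasoning over a real ontology.
    """
    text = utterance.lower().strip()
    for command, phrasings in TOY_ONTOLOGY.items():
        if any(phrase in text for phrase in phrasings):
            return command
    return None


direct = resolve_command("Please turn on the light")
indirect = resolve_command("It is dark in here")
unknown = resolve_command("Open the window")
```

Here the indirect utterance resolves to the same canonical command as the direct one, while an utterance outside the table is rejected instead of being forced onto a fixed control word.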