Conference Paper

Audio Signal Recognition Based on Intervals' Numbers (INs) Classification Techniques


Abstract

Speech is one of the major human-machine interaction modalities and it is especially important in the case of special education using social robots. Although modern speech recognition engines can effectively deal with normal human-robot conversations, there are instances in special education where additional word detection and word comparison capabilities are needed to run in parallel with the typical conversation flow. This paper investigates the efficiency of a word detection method based on intervals' numbers.


... This process resulted in six linguistic cues presented in Table 4 that were used to control the robot's behaviour, when the normal flow has to be changed. The trigger words-recognition implemented algorithm was based on Interval Numbers (INs) [18]. An IN is an established mathematical object that may represent either a fuzzy interval or a distribution of numbers. ...
Article
Over the last decade, autism research has been enhanced by the use of social robots, and several studies suggest that children with Autism Spectrum Disorders (ASD) could benefit from robot-assisted interventions. In order to design and implement therapeutic interventions with robots without interrupting the intervention’s flow, one should take into account possible technical issues that could arise. The main objective of this study was to gather information from experts in the field of autism and develop linguistic cues to which the robot would respond automatically. A qualitative approach was used to explore specialists’ preferences. Online surveys were completed by 33 professionals from different backgrounds to select the vocabulary most often used in psychosocial interventions with ASD children in specific situations. Six linguistic cues were identified, and specific phrases were used to program the robot to show empathy and respond when a crisis emerges. The session’s flow in robot-enhanced interventions could benefit from controlling the robot’s behaviour with linguistic cues phrased by the therapist. The implications of these findings are discussed in relation to pilot implementation. This work consists of a qualitative study aiming at strengthening the application of a larger research intervention protocol to explore the interaction of children with Autism Spectrum Disorder (ASD) with a social robot.
... During GENETIC optimization, the parameter ranges v_1 ∈ [0,10], v_0 ∈ [−1000,1000], θ_1 ∈ [0,10], θ_0 ∈ [−1000,1000], k ∈ [4,10] were used; moreover, n_ch = 1. Table 5 shows the classification results by the WINkNN classifier compared to the authors' previous results [50] (in brackets), where a time-series was represented by a single IN. For all aforementioned three class groups, the WINkNN classifier resulted in a superior improvement. ...
Article
Our interest is in time-series classification regarding cyber–physical systems (CPSs) with emphasis on human-robot interaction. We propose an extension of the k nearest neighbor (kNN) classifier to time-series classification using intervals’ numbers (INs). More specifically, we partition a time-series into windows of equal length and from each window's data we induce a distribution which is represented by an IN. This preserves the time dimension in the representation. All-order data statistics, represented by an IN, are employed implicitly as features; moreover, parametric non-linearities are introduced in order to tune the geometrical relationship (i.e., the distance) between signals and consequently tune classification performance. In conclusion, we introduce the windowed IN kNN (WINkNN) classifier whose application is demonstrated comparatively on two benchmark datasets regarding, first, electroencephalography (EEG) signals and, second, audio signals. The results by WINkNN are superior in both problems; in addition, no ad-hoc data preprocessing is required. Potential future work is discussed.
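The windowing idea in this abstract can be illustrated with a minimal Python sketch. It is not the authors' WINkNN implementation. Several things are assumptions made here for illustration: each IN is approximated by a few nested quantile intervals, the distance between two INs is a plain average of endpoint differences (the tunable parametric non-linearities mentioned in the abstract are omitted), and all function names (`quantile`, `induce_in`, `windowed_ins`, `in_distance`, `knn_classify`) are hypothetical.

```python
from collections import Counter


def quantile(sorted_xs, q):
    """Linearly interpolated quantile (0 <= q <= 1) of an already sorted list."""
    pos = q * (len(sorted_xs) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(sorted_xs) - 1)
    frac = pos - lo
    return sorted_xs[lo] * (1 - frac) + sorted_xs[hi] * frac


def induce_in(samples, levels=4):
    """Approximate an IN by nested intervals [a_h, b_h] for h = 1/levels, ..., 1.

    Assumption: the interval at level h spans the h/2 and 1 - h/2 quantiles,
    so the top level (h = 1) collapses to the median.
    """
    xs = sorted(samples)
    return [(quantile(xs, (i / levels) / 2), quantile(xs, 1 - (i / levels) / 2))
            for i in range(1, levels + 1)]


def windowed_ins(series, n_windows, levels=4):
    """Split a series into equal-length windows and induce one IN per window.

    Trailing samples that do not fill a whole window are dropped.
    """
    size = len(series) // n_windows
    return [induce_in(series[i * size:(i + 1) * size], levels)
            for i in range(n_windows)]


def in_distance(f, g):
    """Simplified (assumed) IN distance: mean endpoint difference over levels."""
    return sum(abs(a - c) + abs(b - d) for (a, b), (c, d) in zip(f, g)) / len(f)


def series_distance(ins_a, ins_b):
    """Sum of window-wise IN distances between two windowed representations."""
    return sum(in_distance(f, g) for f, g in zip(ins_a, ins_b))


def knn_classify(train, query, k=3, n_windows=4):
    """Majority vote among the k training series closest to the query."""
    q = windowed_ins(query, n_windows)
    scored = sorted(
        ((series_distance(windowed_ins(x, n_windows), q), label)
         for x, label in train),
        key=lambda pair: pair[0],
    )
    votes = Counter(label for _, label in scored[:k])
    return votes.most_common(1)[0][0]


# Toy data (made up for this sketch): rising versus falling ramps.
rising = [float(i) for i in range(16)]
falling = rising[::-1]
train = [
    (rising, "rising"),
    ([x + 0.5 for x in rising], "rising"),
    (falling, "falling"),
    ([x + 0.5 for x in falling], "falling"),
]
query = [x + 0.2 for x in rising]
print(knn_classify(train, query, k=3))
```

In this toy data the rising and falling ramps contain exactly the same values, so a single IN computed over a whole series cannot tell them apart. The per-window INs can, which is one way to read the statement that windowing preserves the time dimension.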
... For the intervention scenarios to be performed smoothly, the robot should respond physically or verbally to some basic orders coming from the T/E, to resolve potential problems arising during the intervention. A methodology was developed and presented in [14] to serve the latter purpose, namely to contribute towards the design of therapeutic interventions for children with ASD where speech is the basic communication channel. Trigger words to regulate a conversation and guide the therapeutic activities are represented by Interval Numbers (INs). ...
Chapter
The effectiveness of social robots in education is typically demonstrated, circumstantially, involving small samples of students [1]. Our interest here is in special education in Greece regarding Autism Spectrum Disorder (ASD) involving large samples of children students. Following a recent work review, this paper reports the specifications of a protocol for testing the effectiveness of robot (NAO)-based treatment of ASD children compared to conventional human (therapist)-based treatment. The proposed protocol has been developed by the collaboration of a clinical scientific team with a technical scientific team. The modular structure of the aforementioned protocol allows for implementing parametrically a number of tools and/or theories such as the theory-of-mind account from psychology; moreover, the engagement of the innovative Lattice Computing (LC) information processing paradigm is considered here toward making the robot more autonomous. This paper focuses on the methodological and design details of the proposed intervention protocol that is underway; the corresponding results will be reported in a future publication.
Article
In the context of accelerating globalization, intercultural communication competence has become a crucial element for the success of individuals and organizations. The frequent international interactions have underscored the importance of enhancing methods for improving intercultural communication skills within the educational sector. The rapid advancement of mobile technology offers unprecedented opportunities for educational innovation. Its convenience and widespread use have made mobile device-based learning models highly attractive. Particularly, advancements in real-time speech processing technologies have provided new tools and methods for intercultural communication education. By leveraging mobile real-time speech detection and synthesis technologies, a more interactive and personalized learning experience can be achieved, thereby enhancing the efficiency and effectiveness of language learning. This study aims to explore a mobile technology-based model for intercultural communication education and is divided into three main parts: firstly, the investigation of mobile real-time speech detection technologies aimed at intercultural communication education to provide instant feedback and improvement suggestions; secondly, the exploration of mobile real-time speech synthesis technologies to generate high-quality speech samples for learners to practice against; and thirdly, the integration of the aforementioned technologies to develop a flexible, efficient, and highly interactive learning system based on mobile technology. This study is expected to not only improve the effectiveness of intercultural communication education but also provide significant references for the innovative application of educational technologies.
Conference Paper
Human-robot interaction has been a significant area of research with the widespread use of social robots. Many modalities can be used to achieve interaction, including vision. For each modality, many methodologies have been proposed, with varying degrees of effectiveness and efficiency in terms of the computational power needed. The varied nature of these algorithms makes data fusion a complex and application-specific task. This paper introduces a novel Lattice Computing-based methodology to interpret visual stimuli for head pose estimation. An investigation of the various parameters involved and initial results are presented. The aim is to determine head pose in robot-assisted therapy settings and use it in decision making. This work is part of a broader effort to use the Lattice Computing (LC) paradigm as a unified methodology for sensory data interpretation in human-robot interaction.
Conference Paper
Visual stimuli are essential in many applications in human robot interaction. However, such tasks are usually computationally intensive. Also, data received from the various sensors on a robot require different data representation and processing techniques, which increases the complexity and makes the fusion of sensory data for decision making more difficult. An alternative approach is the use of the Lattice Computing (LC) paradigm for hybrid mathematical modelling based on mathematical lattice theory that unifies rigorously numerical data and non-numerical data. This paper presents an application of this approach, and more specifically a novel method for head pose estimation using LC techniques, as an initial step towards using LC as a unified methodology in social robot interaction applications.
Article
This paper describes the recognition of image patterns based on novel representation learning techniques by considering higher-level (meta-)representations of numerical data in a mathematical lattice. In particular, the interest here focuses on lattices of (Type-1) Intervals' Numbers (INs), where an IN represents a distribution of image features including orthogonal moments. A neural classifier, namely fuzzy lattice reasoning (flr) fuzzy-ARTMAP (FAM), or flrFAM for short, is described for learning distributions of INs; hence, Type-2 INs emerge. Four benchmark image pattern recognition applications are demonstrated. The results obtained by the proposed techniques compare well with the results obtained by alternative methods from the literature. Furthermore, due to the isomorphism between the lattice of INs and the lattice of fuzzy numbers, the proposed techniques are straightforwardly applicable to Type-1 and/or Type-2 fuzzy systems. The far-reaching potential for deep learning in big data applications is also discussed.
Conference Paper
An Intervals' Number (IN) is a mathematical object known to represent either a probability distribution or a possibility distribution. The space of INs has been studied in recent years. After summarizing some instrumental mathematical results, this work comparatively demonstrates novel schemes for tunable fuzzy rule interpolation and extrapolation. Extensions to Type-2 fuzzy sets are straightforward. Finally, this work demonstrates a preliminary application, regarding the reconstruction of partially occluded human facial expressions, based on a neural network that may predict a data distribution from other ones. Far-reaching extensions of the proposed techniques are discussed.
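To make the probabilistic reading of an IN concrete, the following Python sketch builds a stack of nested intervals from a sample of numbers. The construction is an assumption chosen for illustration and is not taken from this abstract: the interval at level h runs from the h/2 quantile to the 1 - h/2 quantile of the data. The helper names `quantile` and `interval_representation` are hypothetical.

```python
def quantile(sorted_xs, q):
    """Linearly interpolated quantile (0 <= q <= 1) of an already sorted list."""
    pos = q * (len(sorted_xs) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(sorted_xs) - 1)
    frac = pos - lo
    return sorted_xs[lo] * (1 - frac) + sorted_xs[hi] * frac


def interval_representation(samples, levels=4):
    """Return [(h, a_h, b_h), ...] for h = 1/levels, ..., 1.

    Assumed construction: lower levels give wide intervals covering most of
    the data; the top level (h = 1) shrinks to the median.
    """
    xs = sorted(samples)
    rows = []
    for i in range(1, levels + 1):
        h = i / levels
        rows.append((h, quantile(xs, h / 2), quantile(xs, 1 - h / 2)))
    return rows


# Made-up sample, only for demonstration.
data = [7, 3, 9, 1, 5, 2, 8, 4, 6]
for h, a, b in interval_representation(data):
    print(f"h = {h:.2f}: [{a}, {b}]")
```

Read from the lowest level to the highest, the intervals are nested, so the stack summarizes where the bulk of the data lies at several levels of detail rather than through a single mean or variance.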
Article
This paper proposes a fundamentally novel extension, namely, flrFAM, of the fuzzy ARTMAP (FAM) neural classifier for incremental real-time learning and generalization based on fuzzy lattice reasoning techniques. FAM is enhanced first by a parameter optimization training (sub)phase, and then by a capacity to process partially ordered (non)numeric data including information granules. The interest here focuses on intervals' numbers (INs) data, where an IN represents a distribution of data samples. We describe the proposed flrFAM classifier as a fuzzy neural network that can induce descriptive as well as flexible (i.e., tunable) decision-making knowledge (rules) from the data. We demonstrate the capacity of the flrFAM classifier for human facial expression recognition on benchmark datasets. The novel feature extraction as well as knowledge-representation is based on orthogonal moments. The reported experimental results compare well with the results by alternative classifiers from the literature. The far-reaching potential of fuzzy lattice reasoning in human-machine interaction applications is discussed.
Article
In this paper, an overview of human-robot interactive communication is presented, covering verbal as well as non-verbal aspects of human-robot interaction. Following a historical introduction and a motivation towards fluid human-robot communication, ten desiderata are proposed, which provide an organizational axis for both recent and future research on human-robot communication. Then, the ten desiderata are examined in detail, culminating in a unifying discussion and a forward-looking conclusion.
Article
Socially intelligent robotics is the pursuit of creating robots capable of exhibiting natural-appearing social qualities. Beyond the basic capabilities of moving and acting autonomously, the field has focused on the use of the robot's physical embodiment to communicate and interact with users in a social and engaging manner. One of its components, socially assistive robotics, focuses on helping human users through social rather than physical interaction. Early results already demonstrate the promises of socially assistive robotics, a new interdisciplinary research area with large horizons of fascinating and much needed research. Even as socially assistive robotic technology is still in its early stages of development, the next decade promises systems that will be used in hospitals, schools, and homes in therapeutic programs that monitor, encourage, and assist their users. This is an important time in the development of the field, when the broad technical community and the beneficiary populations must work together to shape the field toward its intended impact on improved human quality of life.
Book
By ‘model’ we mean a mathematical description of a world aspect. With the proliferation of computers a variety of modeling paradigms emerged under computational intelligence and soft computing. An advancing technology is currently fragmented due, as well, to the need to cope with different types of data in different application domains. This research monograph proposes a unified, cross-fertilizing approach for knowledge-representation and modeling based on lattice theory. The emphasis is on clustering, classification, and regression applications. It is shown how rigorous analysis and design can be pursued in soft computing using conventional (hard computing) methods. Moreover, non-Turing computation can be pursued. The material here is multi-disciplinary based on our on-going research published in major scientific journals and conferences. Experimental results by various algorithms are demonstrated extensively. Relevant work by other authors is also presented both extensively and comparatively.
Article
As the field of HRI evolves, it is important to understand how users interact with robots over long periods. This paper reviews the current research on long-term interaction between users and social robots. We describe the main features of these robots and highlight the main findings of the existing long-term studies. We also present a set of directions for future research and discuss some open issues that should be addressed in this field.
Article
This paper reviews “socially interactive robots”: robots for which social human–robot interaction is important. We begin by discussing the context for socially interactive robots, emphasizing the relationship to other research fields and the different forms of “social robots”. We then present a taxonomy of design methods and system components used to build socially interactive robots. Finally, we describe the impact of these robots on humans and discuss open issues. An expanded version of this paper, which contains a survey and taxonomy of current applications, is available as a technical report [T. Fong, I. Nourbakhsh, K. Dautenhahn, A survey of socially interactive robots: concepts, design and applications, Technical Report No. CMU-RI-TR-02-29, Robotics Institute, Carnegie Mellon University, 2002].
Article
Linear models are preferable due to simplicity. Nevertheless, non-linear models often emerge in practice. A popular approach for modeling nonlinearities is by piecewise-linear approximation. Inspired by fuzzy inference systems (FISs) of Takagi–Sugeno–Kang (TSK) type as well as by Kohonen’s self-organizing map (KSOM), this work introduces a genetically optimized synergy based on intervals’ numbers, or INs for short. The latter (INs) are interpreted here either probabilistically or possibilistically. The employment of mathematical lattice theory is instrumental. Advantages include accommodation of granular data, introduction of tunable nonlinearities, and induction of descriptive decision-making knowledge (rules) from the data. Both efficiency and effectiveness are demonstrated in three benchmark problems. The proposed computational method invariably demonstrates a better capacity for generalization; moreover, it learns orders of magnitude faster than alternative methods while inducing clearly fewer rules.
Z. Jackson, "Free Spoken Digit Dataset (FSDD)," 2017. [Online].