Artificial intelligence (AI) is progressively changing techniques of teaching and learning. In the past, the objective was to provide an intelligent tutoring system that works without intervention from a human teacher to enhance skills, control, knowledge construction, and intellectual engagement. This paper proposes a definition of AI focusing on enhancing the humanoid agent Nao’s learning capabilities and interactions. The aim is to increase Nao’s intelligence using big data by activating multisensory perception modules, including visual, auditory, and speech-related stimuli, together with various movements. The method is to develop a toolkit that enables Arabic speech recognition and implements the Haar algorithm for robust image recognition, improving Nao’s capabilities during interactions with a child in a mixed reality system using big data. The experiment design and testing processes were conducted by implementing an AI design principle, namely, the three-constituent principle. Four experiments were conducted to boost Nao’s intelligence level, involving 100 children and different environments (class, lab, home, and mixed reality with the Leap Motion Controller (LMC)). An objective function and an operational time cost function were developed to improve Nao’s learning experience in different environments, achieving the best result of 4.2 seconds for each number recognition. The experiments’ results showed an increase in Nao’s intelligence from that of a 3-year-old to that of a 7-year-old child in learning simple mathematics, with the best communication reaching a kappa ratio value of 90.8%, a corpus that exceeded 390,000 segments, and a 93% success rate when both the auditory and vision modules of the agent Nao were activated.
The developed toolkit uses Arabic speech recognition and the Haar algorithm in a mixed reality system using big data, enabling Nao to achieve a 94% learning success rate at a distance of 0.09 m; when using the LMC in mixed reality, hand sign gestures recorded the highest accuracy, 98.50%, using the Haar algorithm. The results show that Nao gradually achieves a higher learning success rate as the environment changes and multisensory perception increases. This paper also proposes a cutting-edge research direction for fostering child-robot education in real time.
1. Introduction
Artificial intelligence (AI) was introduced half a century ago. Researchers initially wanted to build an electronic brain equipped with a natural form of intelligence. The concept of AI was heralded by Alan Turing in the 1950s, who proposed the Turing test to measure a form of natural language (symbolic) communication between humans and machines. In the 1960s, Lotfi Zadeh proposed fuzzy logic with dominant knowledge representation and mobile robots [1]. Stanford University created the Automated Mathematician to explore new mathematical theories based on a heuristic algorithm. However, AI became unpopular in the 1970s because it could not meet unrealistic expectations. The 1980s offered a promise for AI as sales of AI-based hardware and software for decision support applications exceeded $400 million [2]. By the 1990s, AI had entered a new era by integrating intelligent agent (IA) applications into different fields, such as games (Deep Blue, the IBM chess program that originated in a project at Carnegie Mellon and defeated the world champion Garry Kasparov in 1997), spacecraft control, security (credit card fraud detection, face recognition), and transportation (automated scheduling systems) [3–7]. The beginning of the 21st century witnessed significant advances in AI in industry, business, and government services with several initiatives, such as intelligent cities, intelligent economy, intelligent industry, and intelligent robots [3].
A unified definition of AI has not yet been offered; however, the concept of AI can be built from different definitions:
(i) It is an interdisciplinary science because it interacts with cognitive science
(ii) It uses creative techniques in modeling and mapping to improve average performance when solving complex problems
(iii) It implements different processes to imitate intelligent human or animal behavior. The developed system is either a virtual or a physical system with intelligent characteristics
(iv) It attempts to duplicate human mental and sensory systems to model aspects of human thoughts and behaviors
(v) It passes the intelligence test if it interacts completely with other systems or creatures worldwide and in real time
(vi) It follows a defined cycle of sense–plan–act
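The sense–plan–act cycle named in the last item of the list above can be illustrated with a minimal sketch. The class `CountingAgent` and its methods are hypothetical names chosen for illustration; they are not part of Nao's software or of the toolkit described in this paper. The agent simply senses its distance to a target value, plans a unit step, and acts on it until the target is reached.

```python
from dataclasses import dataclass, field


@dataclass
class CountingAgent:
    """Toy agent that repeats the sense-plan-act cycle until it reaches a target.

    All names are hypothetical; this is not Nao's control software.
    """
    target: int
    position: int = 0
    log: list = field(default_factory=list)

    def sense(self) -> int:
        # Sense: observe the signed distance to the target.
        return self.target - self.position

    def plan(self, error: int) -> int:
        # Plan: choose a unit step toward the target, or 0 when already there.
        if error == 0:
            return 0
        return 1 if error > 0 else -1

    def act(self, step: int) -> None:
        # Act: apply the chosen step and record the new position.
        self.position += step
        self.log.append(self.position)

    def run(self, max_cycles: int = 100) -> int:
        # Repeat the cycle; return the number of cycles that performed an action.
        for cycle in range(max_cycles):
            error = self.sense()
            if error == 0:
                return cycle
            self.act(self.plan(error))
        return max_cycles


agent = CountingAgent(target=3)
cycles = agent.run()
```

Keeping the three phases as separate methods makes it straightforward to swap in a different sensor or planner without touching the loop itself.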
The present study proposes the definition of AI as follows: “AI is an interdisciplinary science suitable for implementation in any domain that uses heuristic techniques, modeling, and AI-based design principles to solve complex problems. Single or combined processes in perceiving, reasoning, learning, understanding, and collaborating can improve system behavior and decision-making. The goal of AI is to enable virtual and physical intelligent agents, including humans and/or systems that continuously upgrade their intelligence to attain superintelligence. Agents should be able to integrate with one another in fully learning, teaching, adapting themselves to dynamic environments, communicating logically, and functioning efficiently with one another or with other creatures in the world and in real time through sense–plan–act–react cycles.” The three-constituent principle for an agent suggests that “designing an intelligent agent involves three constituents: the definition of the ecological niche, the definition of the desired behaviors and tasks, and the design of the agent [8, 9].” Therefore, an agent’s intelligence can grow over time using the “here and now” perspective during interactions in different dynamic environments. In the present study, the design of the robot agent Nao is not among the required tasks, but the other two constituents are related to the environment and involve interactions with a human agent. Therefore, this work defines the ecological niche using different environments (a classroom, a lab, and a home), focusing on a mixed reality environment. Nao’s functions are presented according to the desired behavior, namely teaching simple mathematics to a child. The objective is to improve Nao’s learning ability and increase its intelligence.
Thus, the study shows that “the three-constituents principle, the definition of the ecological niche, and the definition of the desired behaviors and tasks [2]” are sufficient to increase Nao’s intelligence.
(i) The “here and now” perspective: relates to three time frames and shows that the behavior of any agent’s system matures over a certain period and is associated with three states
(ii) State-oriented: describes the actual mechanism of the agent at any instant of time
(iii) Learning and development: relates to learning and development from state-oriented action
(iv) Evolutionary: explains the emergence of a higher level of cognition through a phylogenetic perspective by emphasizing the power of artificial evolution and the performance of more complex tasks
The mixed reality system TouchMe provides a third-person camera view of the system instead of a view from human eyes [10]. The third-person camera view is considered more efficient for inexperienced users interacting with the robot [11]. Leutert et al. [12] reported using spatial augmented reality, a form of mixed reality, to relay information from the robot to the user’s workspace. They used a fixed-mobile projector. Socially aware interactive playgrounds [13] use various actuators, including projectors, speakers, and lights, to provide feedback to children. These “interactive playgrounds can be placed at different locations, such as schools, streets, and gyms. Humans produce, interpret, and detect social signals (a communicative or informative signal conveyed directly or indirectly) [13].” Thus, these social signals can be used to enhance interactions with others. Various studies have been conducted on teaching humans to use robots in various environmental settings. The RoboStage module supports learning among junior high school students through mixed reality systems [14–20]. Its creators compared the use of physical and virtual characters in a learning environment. RoboStage lets students interact with robots through voice and physical objects across three stages of events: learning, situatedness, and blended. These events help students learn and practice activities, understand an environment, and execute an event. GENTORO uses a robot and a handheld projector to interact with children and perform a storytelling activity [21–27]. Its creators studied the effect of using a small handheld projector on the storytelling process. They also discussed the effects of using audio interactions instead of text and of using a wide-angle lens.
The agent matures into an adult through a process in which each state is affected by the previous state. The present study focuses on the state-oriented and the learning and development states to observe their outcome in association with the evolutionary state [28–30]. The proposed definition enhances research at the experimental design level by using multisensory technologies to improve intelligent interaction and growth through the AI design principle [31–33]. Enhanced interaction between humans and robots improves learning, especially in the case of a child. Motion and speech sensor nodes are fused to this end. Contemporary children are familiar with handheld devices such as mobile phones, tablets, pads, and virtual reality cameras. Therefore, the toolkit developed in this study uses a mixed reality system featuring different ways of interaction between a child and a robot agent. This study makes the following contributions.
(i) Enhancing the humanoid robot Nao’s learning capabilities with the objective of increasing the robot’s intelligence, using multisensory perception of vision, hearing, speech, and gestures for human-robot interaction (HRI)
(ii) Implementing an Arabic Speech Agent for Nao using phonological knowledge and HMM to eventually activate child-robot communication [34]
(iii) Developing a toolkit using Arabic speech recognition and the Haar algorithm for robust image recognition in a mixed reality system architecture using big data, enabling Nao to achieve a 94% learning success rate across different environments and, for the LMC, the highest accuracy of 98.50% using the Haar algorithm
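The text above does not specify which variant of the Haar algorithm the toolkit uses. As a minimal sketch, assuming it refers to Haar-like features of the kind used in cascade-based object detectors, the example below computes one two-rectangle feature with an integral image (summed-area table). The function names and the 4×4 toy image are hypothetical and serve only to illustrate the idea; they are not taken from the toolkit.

```python
def integral_image(img):
    """Return a summed-area table with one extra row and column of zeros."""
    h, w = len(img), len(img[0])
    ii = [[0] * (w + 1) for _ in range(h + 1)]
    for y in range(h):
        row_sum = 0
        for x in range(w):
            row_sum += img[y][x]
            ii[y + 1][x + 1] = ii[y][x + 1] + row_sum
    return ii


def rect_sum(ii, top, left, height, width):
    """Sum of the pixels in a rectangle, using four table lookups."""
    bottom, right = top + height, left + width
    return ii[bottom][right] - ii[top][right] - ii[bottom][left] + ii[top][left]


def two_rect_haar_feature(ii, top, left, height, width):
    """Left-half sum minus right-half sum of a window (width assumed even)."""
    half = width // 2
    left_sum = rect_sum(ii, top, left, height, half)
    right_sum = rect_sum(ii, top, left + half, height, half)
    return left_sum - right_sum


# Hypothetical 4x4 image: bright left half, dark right half (a vertical edge).
image = [
    [9, 9, 1, 1],
    [9, 9, 1, 1],
    [9, 9, 1, 1],
    [9, 9, 1, 1],
]
table = integral_image(image)
edge_response = two_rect_haar_feature(table, 0, 0, 4, 4)
```

A large absolute response indicates a strong vertical edge inside the window. Because each rectangle sum costs only four lookups regardless of its size, many such features can be evaluated quickly, which is what makes this family of detectors practical for real-time use.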
The remainder of the study is organized as follows. The Materials, Data, and Methods section describes the architecture and experiment design. The Discussion and Results section covers the intelligent big data management system using Haar algorithm-based Nao Agent Multisensory Communication in mixed reality and using the LMC. Finally, the Conclusion and Future Work section closes the study.
2. Materials, Data, and Methods
The experiment, initiated at King Abdul-Aziz University with an Aldebaran representative, involved a Nao robot with the intelligence of a three-year-old, which could not speak Arabic or solve simple mathematics. The study analysis began by selecting a suitable AI design principle for the study. The experiment’s goals and tasks were defined precisely: to increase Nao’s intelligence to at least that of a seven-year-old. The measurements of Nao’s mathematics intelligence were based on solving 100 children’s exercises in basic addition, subtraction, and multiplication with the help of human agents. Nao also reached the level of understanding simple sentences in Arabic speech recognition. The experiment time scale was set to a total of two years. The study aims to involve the robot Nao in the learning-teaching process through interaction and multisensory perception by exposing Nao to different environments (see Figure 1), enabling communication concept design. However, the present work focuses more on the mixed reality environment.