Fig 3 - uploaded by Grégoire Milliez
Robotic system architecture 


Source publication
Conference Paper
Full-text available
Many robotic projects use simulation as a faster and easier way to develop, evaluate and validate software components compared with real-world, on-board settings. In the human-robot interaction field, some recent works have attempted to integrate humans into the simulation loop. In this paper we investigate how such robotic simulation software...

Context in source publication

Context 1
... To achieve geometric reasoning and to build the environment through robot perception, SPARK [13] (SPAtial Reasoning and Knowledge) was used on our robot. To do the same in our simulated environment, we need to get data from MORSE. We briefly explain here how we linked SPARK with MORSE and what this integration provides. SPARK takes three kinds of input: object identifiers and positions, human position and posture, and robot position and posture. To obtain object positions in SPARK, we use the semantic camera on the robot's head. This sensor exports the position and name of the objects in its field of view. These data are sent over a middleware and read by SPARK, which places the objects in its representation. For the human and the robot, we attach a pose sensor to each and export their armature configurations. In this way SPARK can read the position and posture of the robot and the human through the middleware, requiring only a mapping between the MORSE joint representation and the SPARK one. SPARK uses the robot's perception data to build the environment as seen by the robot. It also computes geometric facts such as topological descriptions of object positions (Book isOn Table), agent affordances (Book is visibleBy Human) and agent knowledge (Human hasKnownLocation Book). These high-level data are then used to enrich the dialogue context. By using SPARK as a link in the system, we are able to run a full robotic architecture on top of MORSE. This architecture, shown in Figure 3, is composed of several modules:
– Supervision System: the component in charge of commanding the other components of the system in order to complete a task.
– HATP: the Human-Aware Task Planner [14], based on Hierarchical Task Network (HTN) refinement [15]. HATP is able to produce plans for the robot's actions as well as for the other participants (humans or robots).
– Collaboration Planners: a set of planners used in joint actions such as handovers to estimate the user's intentions and select an action to perform.
– SPARK: the Spatial Reasoning and Knowledge component, as explained in 3.1.
– Knowledge Base: the facts produced by the geometric and temporal reasoning component are stored in a central symbolic knowledge base. This base maintains a different model for each agent, allowing it to represent divergent beliefs.
– Human Planners: a set of human-aware motion, placement and manipulation planners [16].
Using SPARK, our system is able to create different representations of the world for itself and for the other agents, which are then stored in the Knowledge Base. In this way the robot can take into account what each agent can see, reach and know when creating plans. Using HATP, the robot can create a plan made up of different execution streams, one for each agent present. This allows it to handle complex goals and to adapt its plan to user actions. Human actions are monitored using SPARK by creating Monitor Spheres associated with items deemed interesting in a given context. A Monitor Sphere is a spherical area surrounding a point that can be associated with different events, such as a human's hand entering it. The system is explained in more detail in [17]. In this study the robot is dedicated to helping a human achieve a specific object manipulation task. Multimodal dialogues are employed to resolve ambiguities and to request missing information until task completion (i.e. full command execution) or failure (i.e. explicit user disengagement or wrong command execution). In this setup, the robot, more precisely the Dialogue Manager (DM), is responsible for taking appropriate multimodal dialogue decisions to fulfil the user's goal based on uncertain dialogue contexts. To do so, the dialogue management problem is cast as a Partially Observable Markov Decision Process (POMDP).
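To make the geometric-fact computation concrete, here is a minimal sketch of how SPARK-style facts such as (Book isOn Table) could be derived from the object names and positions exported by the semantic camera. All names, thresholds and data shapes are illustrative assumptions, not the actual SPARK or MORSE API.

```python
from dataclasses import dataclass

@dataclass
class VisibleObject:
    # One entry as exported by a semantic camera: the object's
    # name and its (x, y, z) position in metres. (Hypothetical
    # structure, for illustration only.)
    name: str
    position: tuple

def is_on(obj: VisibleObject, support: VisibleObject,
          xy_tol: float = 0.3, z_tol: float = 0.1) -> bool:
    """Toy geometric test for the 'isOn' fact: the object lies
    within xy_tol of the support horizontally and just above it."""
    dx = abs(obj.position[0] - support.position[0])
    dy = abs(obj.position[1] - support.position[1])
    dz = obj.position[2] - support.position[2]
    return dx < xy_tol and dy < xy_tol and 0.0 < dz < z_tol

def spatial_facts(objects):
    """Derive symbolic (subject, predicate, object) facts from
    the perceived positions, SPARK-style."""
    facts = []
    for obj in objects:
        for support in objects:
            if obj is not support and is_on(obj, support):
                facts.append((obj.name, "isOn", support.name))
    return facts

# A tiny scene: a book resting 3 cm above a table top.
scene = [
    VisibleObject("table", (1.0, 0.5, 0.75)),
    VisibleObject("book", (1.1, 0.45, 0.78)),
]
print(spatial_facts(scene))  # [('book', 'isOn', 'table')]
```

In the real system such facts are computed from full 3D models rather than point positions, but the principle, geometry in, symbols out, is the same.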
In this setup, the agent maintains a distribution over possible dialogue states, called the belief state in the literature, and interacts with its perceived environment using a dialogue policy learned by means of a Reinforcement Learning (RL) algorithm [18]. This mathematical framework has been successfully employed in the Spoken Dialogue System (SDS) field (e.g. [19,20,21]) as well as for managing dialogue in HRI contexts (e.g. [22,1]). Indeed, the framework explicitly handles part of the inherent uncertainty in the information the DM has to deal with (erroneous speech recognition, misrecognized gestures, etc.). Recent work in SDS has shown that a dialogue policy can be learned from scratch with a limited number (several hundred) of interactions [23,24,25], and has highlighted the potential benefit of this technique compared with the classical use of a WoZ setup or the development of a well-calibrated user simulator [23]. Following the same idea, we employ a sample-efficient learning algorithm, namely the Kalman Temporal Differences (KTD) framework [26,25], which enables us to learn and adapt the robot's behaviour in an online setup, that is, while interacting with users. The main shortcoming of the chosen method is its very poor initial performance. However, solutions such as those proposed in [27,28] can easily be adopted to alleviate this limitation. Although objectively artificial, the presented robotic simulation platform provides a very interesting test-bed for online dialogue learning. Indeed, better control over the global experimental conditions can be achieved (e.g. environment instantiation, sensors mounted on the robot), so comparisons between different approaches and configurations are facilitated. Furthermore, this solution reduces subject recruitment costs without strongly hampering the subjects' natural expressiveness (thanks to the capabilities offered by the simulator).
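The belief-state bookkeeping described above can be sketched as a simple Bayesian update. This is an illustrative toy, not the system's actual DM or the KTD algorithm: the state-transition term of the full POMDP update is deliberately omitted (we assume the dialogue state does not change within the turn), and the observation model is a made-up example.

```python
def belief_update(belief, observation, obs_model):
    """One simplified belief-state update, as used in POMDP
    dialogue management: b'(s) is proportional to P(o|s) * b(s),
    then re-normalised over all candidate states."""
    new_belief = {s: obs_model(observation, s) * p
                  for s, p in belief.items()}
    z = sum(new_belief.values())
    if z > 0:
        new_belief = {s: p / z for s, p in new_belief.items()}
    return new_belief

# Hypothetical example: the robot is unsure which object the user
# means; an ASR hypothesis "book" is folded into the belief via a
# crude likelihood model (invented numbers).
def obs_model(obs, state):
    return 0.8 if obs == state else 0.2

belief = {"book": 0.5, "mug": 0.5}
belief = belief_update(belief, "book", obs_model)
print(belief)  # {'book': 0.8, 'mug': 0.2}
```

An RL algorithm such as KTD then learns a policy mapping this belief (or features of it) to dialogue actions, e.g. asking a clarification question while the belief is still flat.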
The multimodal dialogue architecture considered in our experiments is presented in Figure 4. Twelve components are responsible for the overall functioning of this dialogue system. The four orange ones handle the user's input, i.e. the speech and gesture modalities in our case. The combination of the Google Web Speech API 3 for Automatic Speech Recognition (ASR) and a custom-defined grammar parser for Spoken Language Understanding (SLU) is used to perform speech recognition and understanding. The Gesture Recognition and Understanding (GRU) module simply catches the gesture events generated by our spatial reasoner during the course of the interaction. The Fusion module then temporally aligns the monomodal inputs and merges them with custom-defined rules. Finally, the result of the fusion (i.e. an N-best list of interpretation hypotheses with their related confidence scores) becomes the input of the multimodal DM. The three blue components are responsible for context modelling. SPARK, previously presented in 3, both detects the user's gestures and generates the per-agent spatial facts (perspective taking) that dynamically feed the contextual knowledge base. These two modules perform per-agent knowledge modelling, which allows the robot to reason over different perspectives on the world. Furthermore, we also make use of a static knowledge base containing the list of all available objects (even those not perceived) and their static properties (e.g. color). The four yellow components are dedicated to output generation. The Fission module splits the abstract system action into verbal and non-verbal parts. The spoken output is produced by chaining a template-based Natural Language Generation (NLG) module with a Text-To-Speech Synthesis (TTS) component based on the commercial Acapela TTS system 4 . The Non-verbal Behaviour ...
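As a rough illustration of what the Fusion module's rule-based merging might look like, the sketch below boosts speech hypotheses that agree with a pointing gesture and re-normalises the N-best list. The function name, scores and boosting rule are invented for illustration and do not reflect the system's actual custom-defined rules.

```python
def fuse_nbest(speech_nbest, gesture_event, boost=0.2):
    """Toy fusion rule: merge a speech N-best list of
    (hypothesis, score) pairs with a pointing-gesture event.
    Hypotheses mentioning the pointed-at object get a confidence
    boost; the list is then re-normalised and re-sorted."""
    fused = []
    for hyp, score in speech_nbest:
        if gesture_event and gesture_event["target"] in hyp:
            score = min(1.0, score + boost)
        fused.append((hyp, score))
    total = sum(s for _, s in fused)
    fused = [(h, s / total) for h, s in fused]
    return sorted(fused, key=lambda x: -x[1])

# Hypothetical turn: the ASR slightly prefers "mug", but the user
# points at the book, so the fused list ranks "book" first.
speech = [("take the mug", 0.5), ("take the book", 0.4)]
gesture = {"type": "point", "target": "book"}
print(fuse_nbest(speech, gesture))
```

A real fusion component would also check that the speech and gesture events overlap in time before merging them; that temporal-alignment step is left out here for brevity.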

Citations

... In the interaction, the employee must convince the supervisor of the abilities he needs in the education plan. Modern firms use this method to boost employee performance and meet the organization's goals (Milliez, Ferreira, Fiore, Alami, & Lefèvre, 2014). ...
Article
Full-text available
This essay highlights creative thinking and implementation strategies from conventional and Islamic perspectives. Since the birth of the Islamic religion more than 1400 years ago, Islam has attended to creative thinking strategies by ensuring that they are applied in life. The Holy Qur'an and the Sunnah of the Prophet, peace and blessings be upon him, encouraged creative thinking at that time. This paper addresses the feasibility of deriving creative thinking strategies from the Holy Qur'an and the Sunnah, and notes that these techniques have been employed in modern businesses for diverse reasons. When the application of these strategies in the Qur'an and Sunnah was reviewed and researched, they were found to be remarkably analogous to the creative thinking strategies that first drew interest barely seventy years ago. In addition, the adoption of creative thinking strategies in businesses from a conventional standpoint was examined. The paper illustrates the processes for adopting these strategies, their importance, and their results in organizations. These strategies stretch the mind, exercise it, and develop diverse talents. The strategies of creative thinking in Islam were also explored, and the means of adopting each one were outlined. The essay features three strategies, brainstorming, the dialogue strategy and the encouragement strategy, from conventional and Islamic perspectives. Each strategy in Islam is detailed in terms of why it was employed, who used it, the situation that necessitated its use, and the repercussions of employing it. The essay outlines the basics of employing creative thinking processes in Islam for individual development and character formation.
... We used the Robot Operating System (ROS) and the Modular OpenRobots Simulation Engine (MORSE) as the communication protocol and simulation platform, respectively. This configuration has been widely adopted for testing and evaluating robot software in several missions (Echeverria et al. 2011; Milliez 2014; Albore et al. 2015b; Degroote et al. 2016; Zhou et al. 2016; Park et al. 2017; Mulgaonkar and Kumar 2014). ROS is a robotic meta-operating system that provides hardware abstraction, low-level device control, implementations of commonly used functionality, message passing between processes, and package management (Quigley et al. 2009). ...
Article
Full-text available
Eucalyptus represents one of the main sources of raw material in Brazil, and each year substantial losses estimated at $400 million occur due to diseases. The active monitoring of eucalyptus crops can help getting accurate information about contaminated areas, in order to improve response time. Unmanned aerial vehicles (UAVs) provide low-cost data acquisition and fast scanning of large areas, however the success of the data acquisition process depends on an efficient planning of the flight route, particularly due to traditionally small autonomy times. This paper proposes a single framework for efficient visual data acquisition using UAVs that combines perception, environment representation and route planning. A probabilistic model of the surveyed environment, containing diseased eucalyptus, soil and healthy trees, is incrementally built using images acquired by the vehicle, in combination with GPS and inertial information for positioning. This incomplete map is then used in the estimation of the next point to be explored according to a certain objective function, aiming to maximize the amount of information collected within a certain traveled distance. Experimental results show that the proposed approach compares favorably to other traditionally used route planning methods.
... [3,6]) and the dialog manager (e.g. [4,5,7]). NLG is still mainly based on template-based models, which turn out to produce good results for a specific task. ...
Conference Paper
In this paper, we present a brief overview of our ongoing work about artificial interactive agents and their adaptation to users. Several possibilities to introduce humorous productions in a spoken dialog system are investigated in order to enhance naturalness during social interactions between the agent and the user. We finally describe our plan on how neuroscience will help to better evaluate the proposed systems, both objectively and subjectively.
... To do so, we rely on the Partially Observable Markov Decision Process (POMDP) framework. The latter is becoming a reference in the Spoken Dialogue System (SDS) field [21,17,14] as well as in HRI contexts [15,11,12], owing to its capacity to explicitly handle part of the inherent uncertainty in the information the system (the robot) has to deal with (erroneous speech recognition, falsely recognised gestures, etc.). In the POMDP setup, the agent maintains a distribution over possible dialogue states, the belief state, throughout the course of the dialogue and interacts with its perceived environment using a Reinforcement Learning (RL) algorithm so as to maximise some expected cumulative discounted reward [16]. ...
... Concerning the simulation, the setup of [12] is applied to enable rich multimodal HRI. Thus, the open-source robotics simulator MORSE [5] is used, which provides realistic rendering through the Blender Game Engine and support for a wide range of middleware (e.g.
Conference Paper
Full-text available
Others can have a different perception of the world than ours. Understanding this divergence is an ability, known as perspective taking in developmental psychology, that humans exploit in daily social interactions. A recent trend in robotics aims at endowing robots with similar mental mechanisms. The goal then is to enable them to naturally and efficiently plan tasks and communicate about them. In this paper we address this challenge extending a state-of-the-art goal-oriented dialogue management framework, the Hidden Information State (HIS). The new version makes use of the robot’s awareness of the users’ belief in a reinforcement learning-based situated dialogue management optimisation procedure. Thus the proposed solution enables the system to cope with the communication ambiguities due to noisy channel but also with the possible misunderstandings due to some divergence among the beliefs of the robot and its interlocutor in a Human-Robot Interaction (HRI) context. We show the relevance of the approach by comparing different handcrafted and learnt dialogue policies with and without divergent belief reasoning in an in-house Pick-Place-Carry scenario by mean of user trials in a simulated 3D environment.
... Also, relying on MORSE effectively supports collaboration between the partners involved in this project (MaRDi project 8 ): our partners are also using MORSE simulation to test their software and collect data with the same environment in their laboratory, where they focus on dialog processing. They can train their dialog system using MORSE feedback to test the robot behaviors [16]. ...
Conference Paper
Full-text available
Simulation in robotics is often a love-hate relationship: while simulators save us a lot of time and effort compared with regular deployment of complex software architectures on complex hardware, they are also known to evade many (if not most) of the real issues that robots must manage when they enter the real world. Because humans are the paragon of dynamic, unpredictable, complex, real-world entities, simulation of human-robot interactions may look condemned to fail or, at best, to be mostly useless. This collective article reports on five independent applications of the MORSE simulator in the field of human-robot interaction. It appears that simulation is already useful, if not essential, for successfully carrying out research in the field of HRI, sometimes in scenarios we did not anticipate.
Chapter
Choosing the best interaction modalities and protocols in Human-Robot Interaction (HRI) is far from straightforward, as it strictly depends on the application domain, the tasks to be executed, and the types of robots and sensors involved. In recent years, a growing number of HRI researchers have exploited Virtual Reality (VR) as a means to evaluate proposed solutions, focusing in particular on the safety and correctness of collaborative tasks. This makes it possible to prove the effectiveness and robustness of an approach in a simulated environment, converging more easily to the best solution while avoiding potentially harmful actions in a real scenario. In this paper, we review existing VR-based approaches targeting or embodying HRI.
Conference Paper
Full-text available
A simulator is a software application that imitates an experimental environment and controls a process or an instrument. Projects in the human-robot interaction (HRI) field face various challenges during real-world experiments, including improper robot behavior, mistakes in algorithm logic, software and hardware failures, and lay participants on the user side. Simulators can help researchers avoid a number of problems they would otherwise face in real-world experiments at the algorithm-testing stage. This study presents an experimental validation, in the Gazebo simulator and in real-world experiments, of English language lesson scenarios with the small humanoid Robotis DARwin OP2. We investigated the advantages of using the Gazebo simulator to construct modular generic HRI blocks and global HRI scenarios in order to optimize a humanoid-robot-assisted process of studying English. The simulator saved a significant amount of effort and time at the preparatory stages, allowing robot programmers and HRI developers to obtain quick feedback from users and gather their requirements for necessary adjustments.