Article
To read the full-text of this research, you can request a copy directly from the authors.

Abstract

The study of social learning in robotics has been motivated by both scientific interest in the learning process and practical desires to produce machines that are useful, flexible, and easy to use. In this review, we introduce the social and task-oriented aspects of robot imitation. We focus on methodologies for addressing two fundamental problems. First, how does the robot know what to imitate? And second, how does the robot map that perception onto its own action repertoire to replicate it? In the future, programming humanoid robots to perform new tasks might be as simple as showing them.


... • Imitation learning offers an implicit means of training a machine, such that explicit and tedious programming of a task can be minimized or even eliminated. Imitation learning is thus a natural means of endowing robotic machines with new capabilities [10]. ...
... Others follow a more cognitive science approach and build conceptual models of imitation learning in animals. Surveys of this area can be found in [32], [10]. ...
... More specifically, each time interval is mapped onto a trapezoidal fuzzy number that is represented by the quadruple (p, m, n, q). For example, following this formulation, Fig. 4.3 depicts the time interval "approximately 6 to 9 moments" represented as the fuzzy number T_B(3, 6, 9, 10). In the same figure, negative values represent past time moments. ...
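The trapezoidal representation described by the quadruple (p, m, n, q) can be sketched in a few lines; the function name below is illustrative, with the example values taken from the snippet's "approximately 6 to 9 moments":

```python
def trapezoidal_membership(t, p, m, n, q):
    """Membership degree of time t in the trapezoidal fuzzy number (p, m, n, q).

    Full membership (1.0) on the core [m, n], linear ramps on [p, m] and
    [n, q], and zero outside the support [p, q].
    """
    if t < p or t > q:
        return 0.0
    if m <= t <= n:
        return 1.0
    if t < m:                      # rising edge
        return (t - p) / (m - p)
    return (q - t) / (q - n)       # falling edge

# "approximately 6 to 9 moments" as the fuzzy number (3, 6, 9, 10)
print(trapezoidal_membership(7, 3, 6, 9, 10))    # inside the core -> 1.0
print(trapezoidal_membership(4.5, 3, 6, 9, 10))  # halfway up the rising edge -> 0.5
```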
Thesis
Full-text available
The current PhD thesis addresses the formulation and implementation of a methodological framework for robot Learning from Demonstration (LfD). The latter refers to methodologies that develop behavioral policies from example state-to-action mappings. To this end, we study the reciprocal interaction of perception and action, in order to teach robots a repertoire of novel action behaviors. Based on that, we design, develop and implement a robust imitation framework, termed IMFO (IMitation Framework by Observation), that facilitates imitation learning and relevant applications in human-robot interaction (HRI) tasks. IMFO can cope with the reproduction of learned (i.e. previously observed) actions, as well as novel ones. Mapping of human actions to the respective robotic ones is achieved via an indeterminate depiction, termed latent space representation. The latter accomplishes a compact, yet precise abstraction of action trajectories, effectively representing high-dimensional raw actions in a low-dimensional space. Moreover, throughout this thesis, we examine the role of time in LfD by enhancing the aforementioned framework with the notion of learning both the spatial and temporal characteristics of human motions. Accordingly, learned actions can subsequently be reproduced in the context of more complex time-informed HRI scenarios. Unlike previous LfD methods that cope only with the spatial traits of an action, the formulated scheme effectively encompasses spatial and temporal aspects. Extensive experimentation with a variety of real robotic platforms demonstrates the robustness and applicability of the introduced integrated LfD scheme. Learned actions are reproduced under the high-level control of a time-informed task planner. During the implementation of the studied scenarios, temporal and physical constraints may impose speed adaptations in the performed actions.
The employed latent space representation readily supports such variations, giving rise to novel actions in the temporal domain. Experimental results demonstrate the effectiveness of the proposed enhanced imitation scheme in the implementation of HRI scenarios. Additionally, a set of well-defined evaluation metrics is introduced to assess the validity of the proposed approach, considering the temporal and spatial consistency of the reproduced behaviors. A noteworthy extension of the above regards force-based object grasping for executing sensitive manipulation tasks. This is also treated in the current thesis via a novel supervised learning scheme, termed SLF (Supervised Learning for Force-based manipulation). SLF is formulated as a three-stage process: (a) supervised trial-execution in simulation to acquire sufficient training data; (b) training to facilitate grasp learning with suitable robot-arm pose and lifting force; (c) grasp execution in simulation. Subsequently, following sim-to-real transfer, operation in real environments is achieved in addition to simulated ones, generalizing also to objects not included in the trial sessions. The proposed learning scheme is demonstrated in object-lifting tasks where the applied force varies for different objects with similar contact friction coefficients, and likewise the grasping pose. Experimental results on the YuMi manipulator show that the robot is able to effectively reproduce demanding lifting and manipulation tasks after learning is accomplished. In summary, our thesis has studied LfD and has contributed a novel approach that introduces latent space representations to encode the action characteristics. A framework implementation (IMFO) of our approach allowed extensive experimentation and also the conduct of HRI scenarios. The inclusion of temporal aspects in our approach enhanced it to cope with complex, real-life interactions.
Finally, the extension of IMFO with force-based grasping facilitated manipulation tasks with sensitive objects.
... Affective social robots are gaining increasing interest in research and social applications. However, achieving smooth human-robot interaction still faces significant challenges, such as making robots trustworthy through the incorporation of emotional compatibility in their interactions [1]-[4]. Humanoid social robots provide means to investigate social cognition, engage with, and support human mental health. ...
... Humans respond better to robots that behave empathetically toward them, by recognizing emotion and responding accordingly [5]-[11]. The fundamental work by Breazeal and Ishiguro [4], [12]-[14] grounded this field of affective human-robot communication (see [7], [8], [15], [16] for recent surveys). Industry translation of social robots has also begun in service and hospitality sectors, though challenges in reliability and acceptance by humans remain unresolved [17]. ...
Article
Full-text available
We present the conceptual formulation, design, fabrication, control and commercial translation of an IoT enabled social robot as mapped through validation of human emotional response to its affective interactions. The robot design centres on a humanoid hybrid-face that integrates a rigid faceplate with a digital display to simplify conveyance of complex facial movements while providing the impression of three-dimensional depth. We map the emotions of the robot to specific facial feature parameters, characterise recognisability of archetypical facial expressions, and introduce pupil dilation as an additional degree of freedom for emotion conveyance. Human interaction experiments demonstrate the ability to effectively convey emotion from the hybrid-robot face to humans. Conveyance is quantified by studying neurophysiological electroencephalography (EEG) response to perceived emotional information as well as through qualitative interviews. Results demonstrate core hybrid-face robotic expressions can be discriminated by humans (80%+ recognition) and invoke face-sensitive neurophysiological event-related potentials such as N170 and Vertex Positive Potentials in EEG. The hybrid-face robot concept has been modified, implemented, and released by Emotix Inc in the commercial IoT robotic platform Miko (‘My Companion’), an affective robot currently in use for human-robot interaction with children. We demonstrate that human EEG responses to Miko emotions are comparative to that of the hybrid-face robot validating design modifications implemented for large scale distribution. Finally, interviews show above 90% expression recognition rates in our commercial robot. We conclude that simplified hybrid-face abstraction conveys emotions effectively and enhances human-robot interaction.
... Besides, there are recommendations for using modeling for the inputs received or the data captured. The modeling ranges from the human moral learning process (Froese and Di Paolo, 2010) to human socialization (Breazeal and Scassellati, 2002; Fong et al., 2002). However, in the bottom-up approach, it is difficult to question, interpret, explain, supervise, and control AI systems "because deep-learning systems cannot easily track their own 'reasoning'" (Ciupa, 2017). ...
Article
Humans rely on machines in accomplishing missions while machines need humans to make them more intelligent and more powerful. Neither side can go without the other, especially in complex environments when autonomous mode is initiated. Things are becoming more complicated when law and ethical principles should be applied in these complex environments. One of the solutions is human-machine teaming, as it takes advantage of both the best humans can offer and the best that machines can provide. This article intends to explore ways of implementing law and ethical principles in artificial intelligence (AI) systems using human-machine teaming. It examines the existing approaches, reveals their limitations, and calls for the establishment of accountability and the use of a checks-and-balances framework in AI systems. It also discusses the legal and ethical implications of this solution.
... Many works have shown that within humans, the study of exhibited behavior (performance evolution for example) can lead to understanding the implicit knowledge behind that behavior (Smith and Yu, 2008;Mix et al., 2022;Siegel et al., 2021;Schulz, 2012). Other works (Breazeal and Scassellati, 2002;Hussein et al., 2017;Schillaci et al., 2016) have focused on mimicking the behavior of natural agents using artificial ones (Oudeyer et al., 2007;Colin, 2020). Interpretable and explainable techniques were used to get hints about the reason behind a specific behavior. ...
Article
Full-text available
During the learning process, a child develops a mental representation of the task he or she is learning. A machine learning algorithm likewise develops a latent representation of the task it learns. We investigate the development of the knowledge construction of an artificial agent through the analysis of its behavior, i.e., its sequences of moves while learning to perform the Tower of Hanoï (TOH) task. The TOH is a well-known task in experimental contexts for studying problem-solving processes and one of the fundamental processes of children's knowledge construction about their world. We position ourselves in the field of explainable reinforcement learning for developmental robotics, at the crossroads of cognitive modeling and explainable AI. Our main contribution proposes a 3-step methodology named Implicit Knowledge Extraction with eXplainable Artificial Intelligence (IKE-XAI) to extract the implicit knowledge, in the form of an automaton, encoded by an artificial agent during its learning. We showcase this technique to solve and explain the TOH task when researchers have access only to moves that represent observational behavior, as in human-machine interaction. Therefore, to extract the agent's acquired knowledge at different stages of its training, our approach combines: first, a Q-learning agent that learns to perform the TOH task; second, a trained recurrent neural network that encodes an implicit representation of the TOH task; and third, an XAI process using a post-hoc implicit rule extraction algorithm to extract finite state automata. We propose using graph representations as visual and explicit explanations of the behavior of the Q-learning agent. Our experiments show that the IKE-XAI approach helps in understanding the development of the Q-learning agent's behavior by providing a global explanation of its knowledge evolution during learning. IKE-XAI also allows researchers to identify the agent's Aha! moment by determining from what moment the knowledge representation stabilizes and the agent no longer learns.
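The first stage of the IKE-XAI pipeline, a tabular Q-learning agent, can be sketched on a toy chain MDP (the actual work uses the Tower of Hanoï state space; the states, rewards, and hyperparameters below are illustrative assumptions):

```python
import random

def q_learning_chain(n_states=4, episodes=300, alpha=0.5, gamma=0.9, eps=0.2):
    """Tabular Q-learning on a toy chain MDP (actions: move left/right,
    reward 1 at the rightmost state) -- a minimal stand-in for the TOH agent
    in stage one of the IKE-XAI methodology."""
    random.seed(0)
    Q = [[0.0, 0.0] for _ in range(n_states)]  # actions: 0 = left, 1 = right
    for _ in range(episodes):
        s = 0
        while s != n_states - 1:
            # epsilon-greedy action selection
            if random.random() < eps:
                a = random.randrange(2)
            else:
                a = max((0, 1), key=lambda act: Q[s][act])
            s2 = max(0, s - 1) if a == 0 else s + 1
            r = 1.0 if s2 == n_states - 1 else 0.0
            # standard Q-learning temporal-difference update
            Q[s][a] += alpha * (r + gamma * max(Q[s2]) - Q[s][a])
            s = s2
    return Q

Q = q_learning_chain()
# greedy policy should prefer "right" in every non-terminal state
print(all(Q[s][1] > Q[s][0] for s in range(3)))
```

In the paper, the sequences of moves such an agent produces at different training stages are what the recurrent network and rule-extraction steps consume.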
... In 1992, Mahadevan and Connell built a dynamic robot based on RL named OBELIX that learned how to push boxes [7]. In 1996, the Sarcos humanoid DB was constructed by Schaal to learn the pole-balancing task [8]. An RL method was proposed to control the dynamic walking of a robot without prior knowledge of the environment [9]. ...
Article
Full-text available
The world has seen major developments in the field of e-learning and distance learning, especially during the COVID-19 crisis, which revealed the importance of these two types of education and the fruitful benefits they have offered in a group of countries, especially those with excellent infrastructure. At the Faculty of Sciences Semlalia, Cadi Ayyad University Marrakech, Morocco, we have created a simple electronic platform for remote practical work (RPW), and its results have been good in terms of student interaction and in easing the professor's work. The objective of this work is to propose a recommendation system based on deep Q-learning networks (DQNs) to recommend and direct students in advance of doing the RPW according to their skills, inferred from each mouse or keyboard click per student. We focus on this technology because of its strong problem-solving ability, which we demonstrate in the results section. Our platform has enabled us to collect a range of students' and teachers' information and their interactions with the learning content, which we rely on as inputs (a large number of images per second for each mouse or keyboard click per student) to our new system for output (doing the RPW). This technique is reflected in an attempt to embody a virtual teacher within the platform, adequately trained with DQN technology to perform the RPW.
... Nevertheless, Table 4.3 lists and describes a large number of parameters used for coalition formation in transport tasks. Since robots were designed to imitate humans, the described parameters remain valid for both (robots and humans) [22]. Note that, in the context of Industry 4.0, humans working in factories as operators communicate with the other elements through devices (tablet, headset. . ...
Thesis
Mobile manipulators composed of a mobile base (AGV: Automated Guided Vehicle) and a manipulator arm have been developed and introduced into flexible manufacturing systems (FMS) to ensure the complete and autonomous execution of transport tasks. The supervision of these intelligent entities, their control architecture, their reconfigurability, and the scheduling of their tasks pose substantial scientific and technological problems that have attracted many researchers. However, few works have studied these issues with transport systems that are heterogeneous (made up of several types of resources: mobile manipulators, AGVs, and manipulator arms) and/or modular. This thesis implements, tests, and validates a distributed supervision architecture that includes a task-scheduling approach. This approach considers both the modularity and intelligence of mobile manipulators and the heterogeneity of transport systems. The task-scheduling method relies on a hybrid mechanism combining an auction-based global decision algorithm with a local decision model based on integer linear programming (ILP). This takes full advantage of the intelligence embedded in the robots by giving them the ability to execute high-level orders locally while respecting global objectives through a coalition principle. Two case studies on flexible manufacturing systems were conducted to validate the proposed method.
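The auction-based global decision layer of such a hybrid mechanism can be illustrated with a minimal single-round sequential auction; the local ILP refinement described in the thesis is omitted, and the robot names and costs below are invented for the example:

```python
def auction_assign(tasks, robots, cost):
    """Single-round sequential auction: each transport task goes to the robot
    bidding the lowest cost (travel cost plus current workload). A minimal
    sketch of an auction-based global decision layer; the local ILP step is
    not modelled here."""
    assignment = {}
    load = {r: 0 for r in robots}
    for t in tasks:
        # each robot's bid is its cost for the task plus its accumulated load
        winner = min(robots, key=lambda r: cost[r][t] + load[r])
        assignment[t] = winner
        load[winner] += cost[winner][t]
    return assignment

# hypothetical heterogeneous fleet: one AGV and one mobile manipulator
cost = {"agv": {"t1": 2, "t2": 5}, "mobile_manipulator": {"t1": 4, "t2": 1}}
print(auction_assign(["t1", "t2"], ["agv", "mobile_manipulator"], cost))
# -> {'t1': 'agv', 't2': 'mobile_manipulator'}
```

Including the current load in each bid is a simple way to spread work across a heterogeneous fleet rather than always picking the globally cheapest robot.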
... In this article, we survey existing work targeted towards this approach, which, unlike LfD, seeks to enable end-user programming of the structure, logic, and characteristics of the desired robot behavior rather than having end-users demonstrate several instantiations of it. Our survey of end-user robot programming may subsume works on end-user programming using demonstrations, but we focus on demonstrations as a tool for program specification rather than on the techniques to enable robot learning from end-user demonstrations, which have been covered by numerous surveys (e.g., [5,12,15,22,26,56,71,106,121]). Furthermore, we only survey papers on end-user program specification, as opposed to papers on direct control or instruction of robots (e.g., [48]). ...
Preprint
Full-text available
As robots interact with a broader range of end-users, end-user robot programming has helped democratize robot programming by empowering end-users who may not have experience in robot programming to customize robots to meet their individual contextual needs. This article surveys work on end-user robot programming, with a focus on end-user program specification. It describes the primary domains, programming phases, and design choices represented by the end-user robot programming literature. The survey concludes by highlighting open directions for further investigation to enhance and widen the reach of end-user robot programming systems.
... Yes. Developmental robotics, long before machine learning, has for more than two decades been focusing on these problems of how machines can adapt and learn interaction with social peers [4,16,30]. People have studied over those two decades, for example, how children can learn basic social interaction skills such as joint attention [3,14,21]. Also, the problem of language grounding was studied already 20 or 30 years ago, even before developmental robotics started as a field [28,29,27]. ...
Preprint
This paper outlines a perspective on the future of AI, discussing directions for machine models of human-like intelligence. We explain how developmental and evolutionary theories of human cognition should further inform artificial intelligence. We emphasize the role of ecological niches in sculpting intelligent behavior, and in particular that human intelligence was fundamentally shaped to adapt to a constantly changing socio-cultural environment. We argue that a major limit of current work in AI is that it is missing this perspective, both theoretically and experimentally. Finally, we discuss the promising approach of developmental artificial intelligence, modeling infant development through multi-scale interaction between intrinsically motivated learning, embodiment and a fast-changing socio-cultural environment. This paper takes the form of an interview of Pierre-Yves Oudeyer by Manfred Eppe, organized within the context of a KI - Künstliche Intelligenz special issue on developmental robotics.
... Imitation of human behaviors is one of the effective ways to develop artificial intelligence [5][6][7]. Human dancers always present dance motions before a mirror, visually observe the mirror reflections of their own dance motions, and finally make aesthetic judgments about those motions. Similarly, if a robot perceives the aesthetics of its own dance motions just like this, it expresses more autonomous, humanoid behavior [3] and, to a certain extent, develops machine consciousness [8]. ...
Article
Full-text available
Imitation of human behaviors is one of the effective ways to develop artificial intelligence. Human dancers, standing in front of a mirror, always achieve autonomous aesthetics evaluation on their own dance motions, which are observed from the mirror. Meanwhile, in the visual aesthetics cognition of human brains, space and shape are two important visual elements perceived from motions. Inspired by the above facts, this paper proposes a novel mechanism of automatic aesthetics evaluation of robotic dance motions based on multiple visual feature integration. In the mechanism, a video of robotic dance motion is firstly converted into several kinds of motion history images, and then a spatial feature (ripple space coding) and shape features (Zernike moment and curvature-based Fourier descriptors) are extracted from the optimized motion history images. Based on feature integration, a homogeneous ensemble classifier, which uses three different random forests, is deployed to build a machine aesthetics model, aiming to make the machine possess human aesthetic ability. The feasibility of the proposed mechanism has been verified by simulation experiments, and the experimental results show that our ensemble classifier can achieve a high correct ratio of aesthetics evaluation of 75%. The performance of our mechanism is superior to those of the existing approaches.
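The aggregation step of a homogeneous ensemble classifier such as the three-random-forest aesthetics model reduces to majority voting over the member predictions; the classifier labels below are illustrative:

```python
from collections import Counter

def ensemble_vote(classifier_outputs):
    """Majority vote over labels predicted by several classifiers -- the
    aggregation step behind a homogeneous ensemble such as one built from
    three random forests."""
    votes = Counter(classifier_outputs)
    # most_common breaks ties by first-seen (insertion) order
    return votes.most_common(1)[0][0]

# three hypothetical forests judging one dance motion
print(ensemble_vote(["aesthetic", "aesthetic", "not_aesthetic"]))  # -> aesthetic
```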
... Naturally, a robot should also establish a similar mechanism to implement automatic aesthetics evaluation of its own dance poses so as to serve robotic choreography creation. This idea of developing artificial intelligence is feasible because imitating human behaviours is one of the effective learning approaches in robotics [16][17][18]. Considering its similarity with the human body, a biped humanoid robot is the best choice to reproduce the above human dance behaviour. For convenience of description, the NAO robot is used as the prototype of a biped humanoid robot in this paper. ...
Article
Full-text available
Vision plays an important role in the aesthetic cognition of human beings. When creating dance choreography, human dancers, who always observe their own dance poses in a mirror, understand the aesthetics of those poses and aim to improve their dancing performance. In order to develop artificial intelligence, a robot should establish a similar mechanism to imitate the above human dance behaviour. Inspired by this, this paper designs a way for a robot to visually perceive its own dance poses and constructs a novel dataset of dance poses based on real NAO robots. On this basis, this paper proposes a hierarchical processing network-based approach to automatic aesthetics evaluation of robotic dance poses. The hierarchical processing network first extracts the primary visual features by using three parallel CNNs, then uses a synthesis CNN to achieve high-level association and comprehensive processing on the basis of multi-modal feature fusion, and finally makes an automatic aesthetics decision. Notably, the design of this hierarchical processing network is inspired by the research findings in neuroaesthetics. Experimental results show that our approach can achieve a high correct ratio of aesthetic evaluation at 82.3%, which is superior to the existing methods.
... Nowadays, consumer social robots are emerging to serve the service, rehabilitation, caregiving and education industries (Muthugala et al., 2013). Robots can be understood as human-like social agents when they are able to imitate human body shapes as well as mimic human facial expressions (Breazeal and Scassellati, 2002). For proper responses, the robots need to understand human gestures, cues and emotions. ...
Thesis
Full-text available
Today’s mobile communication technologies have increased verbal and text-based communication with other humans, social robots and intelligent virtual assistants. On the other hand, the technologies reduce face-to-face communication. This social issue is critical because decreasing direct interactions may cause difficulty in reading social and environmental cues, thereby impeding the development of overall social skills. Recently, scientists have studied the importance of nonverbal interpersonal activities to social skills, by measuring human behavioral and neurophysiological patterns. These interdisciplinary approaches are in line with the European Union research project, “Socializing sensorimotor contingencies” (socSMCs), which aims to improve the capability of social robots and properly deal with autism spectrum disorder (ASD). Therefore, modelling and benchmarking healthy humans’ social behavior are fundamental to establish a foundation for research on emergence and enhancement of interpersonal coordination. In this research project, two different experimental settings were categorized depending on interactants’ distance: distal and proximal settings, where the structure of engaged cognitive systems changes, and the level of socSMCs differs. As a part of the project, this dissertation work referred to this spatial framework. Additionally, single-sensor solutions were developed to reduce costs and efforts in measuring human behaviors, recognizing the social behaviors, and enhancing interpersonal coordination. First of all, algorithms using a head worn inertial measurement unit (H-IMU) were developed to measure human kinematics, as a baseline for social behaviors. The results confirmed that the H-IMU can measure individual gait parameters by analyzing only head kinematics. 
Secondly, as a distal sensorimotor contingency, interpersonal relationship was considered with respect to a dynamic structure of three interacting components: positivity, mutual attentiveness, and coordination. The H-IMUs monitored the social behavioral events relying on kinematics of the head orientation and oscillation during walk and talk, which can contribute to estimate the level of rapport. Finally, in a new collaborative task with the proposed IMU-based tablet application, results verified effects of different auditory-motor feedbacks on the enhancement of interpersonal coordination in a proximal setting. This dissertation has an intensive interdisciplinary character: Technological development, in the areas of sensor and software engineering, was required to apply to or solve issues in direct relation to predefined behavioral scientific questions in two different settings (distal and proximal). The given frame served as a reference in the development of the methods and settings in this dissertation. The proposed IMU-based solutions are also promising for various future applications due to widespread wearable devices with IMUs.
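A minimal sketch of extracting gait events from a single head-worn IMU, in the spirit of the H-IMU analysis (the thresholding scheme and values below are assumptions for illustration, not the dissertation's algorithm):

```python
def count_steps(vertical_acc, threshold=1.5):
    """Count steps as rising-edge threshold crossings of head-mounted
    vertical acceleration (gravity-removed, in m/s^2; the threshold is an
    assumed tuning parameter)."""
    steps = 0
    above = False
    for a in vertical_acc:
        if a > threshold and not above:
            steps += 1  # rising edge = candidate heel-strike impact
        above = a > threshold
    return steps

# synthetic head acceleration trace with three impact peaks
signal = [0.2, 1.8, 2.1, 0.3, 0.1, 1.9, 0.4, 1.7, 0.2]
print(count_steps(signal))  # -> 3
```

Cadence and step-time variability, two of the gait parameters such a sensor can deliver, follow directly from the detected event times.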
... In a typical imitation (Breazeal & Scassellati, 2002), a demonstrator (e.g., parent) shows the imitator (e.g., child) an action with an intention. In this letter, we employ the definition of intention and action described by Bernstein (1996): intention means either motor planning or motor control to ... [Figure 1: Two types of failure illustrated using the reaching task.]
Article
Full-text available
To help another person, we need to infer his or her goal and intention and then perform the action that he or she was unable to perform to meet the intended goal. In this study, we investigate a computational mechanism for inferring someone's intention and goal from that person's incomplete action to enable the action to be completed on his or her behalf. As a minimal and idealized motor control task of this type, we analyzed single-link pendulum control tasks by manipulating the underlying goals. By analyzing behaviors generated by multiple types of these tasks, we found that a type of fractal dimension of movements is characteristic of the difference in the underlying motor controllers, which reflect the difference in the underlying goals. To test whether an incomplete action can be completed using this property of the action trajectory, we demonstrated that the simulated pendulum controller can perform an action in the direction of the underlying goal by using the fractal dimension as a criterion for similarity in movements.
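One simple fractal dimension of a movement trace is the Katz estimator; whether this matches the paper's exact estimator is an assumption, but it illustrates how the "jaggedness" of a trajectory can be quantified as a signature of the underlying controller:

```python
import math

def katz_fd(x):
    """Katz fractal dimension of a 1-D movement trace: compares total path
    length against maximum excursion from the starting point. Smooth traces
    score near 1; rougher traces score higher."""
    n = len(x) - 1                                        # number of steps
    L = sum(abs(x[i + 1] - x[i]) for i in range(n))       # total path length
    d = max(abs(x[i] - x[0]) for i in range(1, len(x)))   # max excursion
    return math.log10(n) / (math.log10(n) + math.log10(d / L))

smooth = [i / 100 for i in range(101)]                 # straight ramp
jagged = [i / 100 + 0.05 * (i % 2) for i in range(101)]  # ramp with oscillation
print(katz_fd(smooth) < katz_fd(jagged))               # jagged trace is "rougher"
```

Two controllers pursuing different goals would, on this view, leave different dimension values in their trajectories, which is what lets an incomplete action be matched to a goal.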
... It is now well-accepted that humans respond better to robots that behave empathetically towards them, which involves the capacity to recognise emotion and respond accordingly [5][6][7][8][9][10][11]. Pioneering work, in particular by Breazeal and Ishiguro [4,[12][13][14], grounded this field of study with a very strong body of literature now available on affective human-robot communication (see [7,8,15,16] for recent surveys). Industry translation has also begun in areas such as service and hospitality, highlighted by the opening of the Henn-na Hotel in Nagasaki in 2015, though challenges in reliability and acceptance by humans remain unresolved [17]. ...
Preprint
Full-text available
We introduce the conceptual formulation, design, fabrication, control and commercial translation with IoT connection of a hybrid-face social robot and validation of human emotional response to its affective interactions. The hybrid-face robot integrates a 3D printed faceplate and a digital display to simplify conveyance of complex facial movements while providing the impression of three-dimensional depth for natural interaction. We map the space of potential emotions of the robot to specific facial feature parameters and characterise the recognisability of the humanoid hybrid-face robot's archetypal facial expressions. We introduce pupil dilation as an additional degree of freedom for conveyance of emotive states. Human interaction experiments demonstrate the ability to effectively convey emotion from the hybrid-robot face to human observers by mapping their neurophysiological electroencephalography (EEG) response to perceived emotional information and through interviews. Results show main hybrid-face robotic expressions can be discriminated with recognition rates above 80% and invoke human emotive response similar to that of actual human faces as measured by the face-specific N170 event-related potentials in EEG. The hybrid-face robot concept has been modified, implemented, and released in the commercial IoT robotic platform Miko (My Companion), an affective robot with facial and conversational features currently in use for human-robot interaction in children by Emotix Inc. We demonstrate that human EEG responses to Miko emotions are comparative to neurophysiological responses for actual human facial recognition. Finally, interviews show above 90% expression recognition rates in our commercial robot. We conclude that simplified hybrid-face abstraction conveys emotions effectively and enhances human-robot interaction.
... Research in humanoid robotics has been thriving in recent years (Hirai et al. (1998), Kaneko et al. (2008)), both due to their predicted relevance as personal and assistive robotics (Tapus et al. (2007), Oztop et al. (2005)), and because of the scientific challenges raised by robotics with regards to cognition (Asada et al., 2001), natural communication (Stiefelhagen et al. (2004), Breazeal and Scassellati (2002)), bipedal locomotion (Yamaguchi et al. (1999), Chestnutt et al. (2005), Collins and Ruina (2005)) and full-body physical interaction with the environment (Ude et al., 2004). ...
Thesis
This thesis suggests novel approaches and design processes to create and produce robotic platforms whose control and morphology can be freely explored through experimentation in the real world, and that are easy to diffuse and reproduce in the research community. This alternative design methodology is driven by the desire to:
• freely explore morphological properties,
• reduce the amount of time required between an idea and its experimentation on an actual robotic platform in the real world,
• make experiments that should be easy to do actually easy to do,
• make the work easily reproducible in any other lab,
• keep the work modular and free to use in accordance with open source principles, so it can be reused and extended for other projects.
Our approach follows novel design methods for both design and production, for all technological aspects of the robot (i.e. mechanics, actuation, electronics, software, distribution). In particular these methods rely on 3D printing for all mechanical parts, the Arduino electronic architecture for sensor acquisition, an easy-to-use Python API called pypot for control, and finally the distribution of all our work under open source licenses. Using this methodology, we created the Poppy Humanoid robot, a fully modular robot allowing free exploration of the role of morphology and adaptation of its body to specific experimental setups. This robot has been released under an open source license and all files are easily accessible on the GitHub repository: https://www.github.com/poppy_project/. We experiment with the use of this robot in several applications. First, as a scientific tool, we show that Poppy can be easily and quickly modified to either explore the role of morphology or be adapted to different experimental setups. Based on this work, but from another perspective, we investigate the potential impact of such a platform for educational and artistic applications.
... However, the absence of visual cues does hinder proper articulation of phonemes such as /u/, which has been found to be less rounded in the blind [5]. Still, it remains a mystery how infants use auditory cues to generate matching vocalisations in light of the differences in the size of their vocal apparatus [6,7,8]. ...
Preprint
Full-text available
The way infants use auditory cues to learn to speak despite the acoustic mismatch of their vocal apparatus is a hot topic of scientific debate. The simulation of early vocal learning using articulatory speech synthesis offers a way towards gaining a deeper understanding of this process. One of the crucial parameters in these simulations is the choice of features and a metric to evaluate the acoustic error between the synthesised sound and the reference target. We contribute by evaluating the performance of a set of 40 feature-metric combinations for the task of optimising the production of static vowels with a high-quality articulatory synthesiser. Towards this end, we assess the usability of formant error and the projection of the feature-metric error surface in the normalised F1-F2 formant space. We show that this approach can be used to evaluate the impact of features and metrics and also to offer insight into perceptual results.
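As a toy illustration of the kind of acoustic error such optimisation loops minimise, a normalised formant distance between a synthesised and a target vowel can be sketched as follows (the /a/ formant values and the per-formant normalisation are illustrative assumptions, not the paper's actual feature-metric combinations):

```python
def formant_error(synth, target):
    """Relative formant error between a synthesised and a target vowel.

    Each vowel is given as (F1, F2) in Hz; the error per formant is
    normalised by the target value so F1 and F2 contribute on a
    comparable scale, then averaged.
    """
    return sum(abs(s - t) / t for s, t in zip(synth, target)) / len(target)

# Rough adult /a/ target vs. a slightly off synthesised attempt (illustrative values).
target_a = (700.0, 1200.0)
synth_a = (730.0, 1150.0)
err = formant_error(synth_a, target_a)
```

In an articulatory-synthesis loop, an optimiser would adjust vocal-tract parameters to drive this error toward zero.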
... Robotic learning by imitation of sequential grasping and manipulation involves two processes [199]: (i) imitation through replication of the results of observed actions rather than replication of the actions themselves; (ii) understanding the intentions of the observed agent based on motor simulations. Robots can learn and imitate movements of a human being by combining motor schema primitives which are activated similarly to mirror neurons [200]. The correspondence problem involves converting perceptual observations into the robot's own motor responses; this requires the existence of predictive forward models. ...
Article
Full-text available
We present a comprehensive tutorial review that explores the application of bio-inspired approaches to robot control systems for grappling and manipulating a wide range of space debris targets. Current robot manipulator control systems exploit limited techniques which can be supplemented by additional bio-inspired methods to provide a robust suite of robot manipulation technologies. In doing so, we review bio-inspired control methods because this will be the key to enabling such capabilities. In particular, force feedback control may be supplemented with predictive forward models and software emulation of viscoelastic preflexive joint behaviour. This models human manipulation capabilities as implemented by the cerebellum and muscles/joints respectively. In effect, we are proposing a three-level control strategy based on biomimetic forward models for predictive estimation, traditional feedback control and biomimetic muscle-like preflexes. We place emphasis on bio-inspired forward modelling suggesting that all roads lead to this solution for robust and adaptive manipulator control. This promises robust and adaptive manipulation for complex tasks in salvaging space debris.
... In this paper, we therefore study whether humans are willing to imitate robots as they imitate other humans. In addition to having consequences for future societal roles for artificial agents -from assistive and medical applications to military roles and autonomous driving [e.g. 8, 14, 37] -this might lead to a novel measure of robot acceptance: while studies in HRI have previously addressed imitation, they have done so almost exclusively with a focus on the robot capability to imitate humans [48], such as whether and when to imitate people [12]. Although this has resulted in robots with increased capabilities and functionalities, the measurement of their acceptance is still confined to collecting people's judgements and opinions [e.g. ...
Conference Paper
Full-text available
Theories on social learning indicate that imitative choices are usually performed whenever copying the others' behaviour has no additional cost. Here, we extended such investigations of social learning to Human-Robot Interaction (HRI). Participants played the Economic Investment Game with a robot banker while observing another robot player also investing in the robot banker. By manipulating the robot banker payoff, three conditions of unfairness were created: (1) unfair payoff for the participants, (2) unfair payoff for the robot player and (3) unfair payoff for both. Results showed that when the payoff was low for the participants and high for the robot player, participants invested more money in the robot banker than when both parties received a low return. Also, for this specific condition, participants' investments increased further with a more interactive robot player (defined as demonstrating increased attention, congruent movements and speech). This suggests that social and cognitive human competencies can be used and transposed to non-human agents. Further, imitation can potentially be extended to HRI, with interactivity likely having a key role in increasing this effect.
... Appearance information, whether positive or negative, acts as an important factor in human-to-human interactions as well as in human interactions with robots [1]. For robots to have social relationships with people, they need to mimic the way people have social relationships [2]. However, a strand of hair is very flexible and thin; it can take a variety of shapes, can be similar in color to skin, and is heavily influenced by external lighting, making it difficult to segment skin, hair, and background areas. ...
Article
Full-text available
We describe a real-time hair segmentation method based on a fully convolutional network with a basic encoder–decoder structure. Among traditional computer vision techniques for hair segmentation, the mean shift and watershed methodologies suffer from inaccuracy and slow execution due to multi-step, complex image processing, and it is difficult to run them in real time unless an optimization technique is applied to the partition. To solve this problem, we exploited Mobile-Unet, a U-Net segmentation model that incorporates the optimization techniques of MobileNetV2. In experiments, hair segmentation accuracy was evaluated across genders and races, and the average accuracy was 89.9%. By comparing the accuracy and execution speed of our model with those of other models in related studies, we confirmed that the proposed model achieved the same or better performance. The results of hair segmentation thus yield hair information (style, color, length), which has a significant impact on human-robot interaction with people.
... For instance, equipping cognitive robots with the ability to process and integrate cross-modal information streams ensures that they will interact with the environment more efficiently, even under conditions of sensory uncertainty (Parisi et al., 2019). Similarly, developmental robotics, which is motivated by human cognitive and behavioral development, aims to provide a better understanding of the development of cognitive processes using robots with rich sensory and motor capabilities as testing platforms (Breazeal and Scassellati, 2002;Lungarella et al., 2003;Prince, 2008;Schlesinger, 2015, 2018). ...
Article
Full-text available
The emergence of cross-modal learning capabilities requires the interaction of neural areas accounting for sensory and cognitive processing. Convergence of multiple sensory inputs is observed in low-level sensory cortices including primary somatosensory (S1), visual (V1), and auditory cortex (A1), as well as in high-level areas such as prefrontal cortex (PFC). Evidence shows that local neural activity and functional connectivity between sensory cortices participate in cross-modal processing. However, little is known about the functional interplay between neural areas underlying sensory and cognitive processing required for cross-modal learning capabilities across life. Here we review our current knowledge on the interdependence of low- and high-level cortices for the emergence of cross-modal processing in rodents. First, we summarize the mechanisms underlying the integration of multiple senses and how cross-modal processing in primary sensory cortices might be modified by top-down modulation of the PFC. Second, we examine the critical factors and developmental mechanisms that account for the interaction between neuronal networks involved in sensory and cognitive processing. Finally, we discuss the applicability and relevance of cross-modal processing for brain-inspired intelligent robotics. An in-depth understanding of the factors and mechanisms controlling cross-modal processing might inspire the refinement of robotic systems by better mimicking neural computations.
... Imitation plays a prominent role in this process, but how does a robot know which behavior is to be imitated? They can do so by employing the teleological stance and detecting intentions-the behavior to be imitated is determined by the goal being pursued or the focus of attention (indicated by pointing or gaze) (Breazeal & Scassellati, 2002). Once robots acquire the cognitive ability to track goals, signaling functions on the modeler's side can arise to shape robot behavior. ...
Article
Full-text available
Recent work in the cognitive sciences has argued that beliefs sometimes acquire signaling functions in virtue of their ability to reveal information that manipulates “mindreaders.” This paper sketches some of the evolutionary and design considerations that could take agents from solipsistic goal pursuit to beliefs that serve as social signals. Such beliefs will be governed by norms besides just the traditional norms of epistemology (e.g., truth and rational support). As agents become better at detecting the agency of others, either through evolutionary history or individual learning, the candidate pool for signaling expands. This logic holds for natural and artificial agents that find themselves in recurring social situations that reward the sharing of one’s thoughts.
... Robot imitation is useful for many reasons, whether to help robots learn through demonstration (Argall et al., 2009) or to make robots more persuasive (Bailenson and Yee, 2005). Robots can imitate humans in many ways, but usually in ways that are very different from how humans imitate each other (Breazeal and Scassellati, 2002). Having robots use oscillators to resonate or synchronize with people is a more limited but often useful approach. ...
Article
Full-text available
Resonance, a powerful and pervasive phenomenon, appears to play a major role in human interactions. This article investigates the relationship between the physical mechanism of resonance and the human experience of resonance, and considers possibilities for enhancing the experience of resonance within human–robot interactions. We first introduce resonance as a widespread cultural and scientific metaphor. Then, we review the nature of “sympathetic resonance” as a physical mechanism. Following this introduction, the remainder of the article is organized in two parts. In part one, we review the role of resonance (including synchronization and rhythmic entrainment) in human cognition and social interactions. Then, in part two, we review resonance-related phenomena in robotics and artificial intelligence (AI). These two reviews serve as ground for the introduction of a design strategy and combinatorial design space for shaping resonant interactions with robots and AI. We conclude by posing hypotheses and research questions for future empirical studies and discuss a range of ethical and aesthetic issues associated with resonance in human–robot interactions.
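Synchronization and rhythmic entrainment of the kind reviewed above are commonly modelled with coupled oscillators. A minimal sketch using the classic Kuramoto model (a standard textbook model assumed here purely for illustration, not a system from the article) shows how globally coupled oscillators drift into phase:

```python
import math

def kuramoto_step(phases, coupling, dt=0.01, omega=1.0):
    """One Euler step of identical Kuramoto oscillators with global coupling."""
    n = len(phases)
    new = []
    for p in phases:
        interaction = sum(math.sin(q - p) for q in phases) / n
        new.append(p + dt * (omega + coupling * interaction))
    return new

def coherence(phases):
    """Kuramoto order parameter r in [0, 1]; 1 means full synchrony."""
    n = len(phases)
    re = sum(math.cos(p) for p in phases) / n
    im = sum(math.sin(p) for p in phases) / n
    return math.hypot(re, im)

phases = [0.0, 2.0, 4.0]     # initially spread-out phases
r0 = coherence(phases)
for _ in range(2000):        # simulate 20 time units
    phases = kuramoto_step(phases, coupling=2.0)
r1 = coherence(phases)       # coherence rises toward 1 as oscillators entrain
```

With identical natural frequencies and positive coupling, the order parameter climbs from near zero toward one, the mathematical analogue of two interacting partners falling into a shared rhythm.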
... A broad range of social robot scenarios can be defined [161], from ant-like robots to potential socially-intelligent agents (the latter within the domain of speculation). An especially relevant development in this area deals with the design of human-shaped robots able to learn facial expressions and react to them in meaningful ways [162,163]. ...
Article
Full-text available
When computers started to become a dominant part of technology around the 1950s, fundamental questions about reliable designs and robustness were of great relevance. Their development gave rise to the exploration of new questions, such as what made brains reliable (since neurons can die) and how computers could get inspiration from neural systems. In parallel, the first artificial neural networks came to life. Since then, the comparative view between brains and computers has been developed in new, sometimes unexpected directions. With the rise of deep learning and the development of connectomics, an evolutionary look at how both hardware and neural complexity have evolved or been designed is required. In this paper, we argue that important similarities have resulted both from convergent evolution (the inevitable outcome of architectural constraints) and inspiration of hardware and software principles guided by toy pictures of neurobiology. Moreover, dissimilarities and gaps originate from the lack of major innovations that have paved the way to biological computing (including brains) that are completely absent within the artificial domain. As it occurs within synthetic biocomputation, we can also ask whether alternative minds can emerge from A.I. designs. Here, we take an evolutionary view of the problem and discuss the remarkable convergences between living and artificial designs and what the pre-conditions are for achieving artificial intelligence.
... The proposed test procedure is based on the 'robots-imitating-humans' approach [35][36][37] and existing statistical tools. The resources required are: ...
Article
Full-text available
Future service robots mass-produced for practical applications may benefit from having personalities. To engineer robot personalities in significant quantities for practical applications, we first need to identify the personality dimensions on which personality traits can be effectively optimised by minimising the distances between engineering targets and the corresponding robots under construction, since not all personality dimensions are applicable and equally prominent. Whether optimisation is possible on a personality dimension depends on how specific users consider the personalities of a type of robot, especially whether they can provide effective feedback to guide the optimisation of certain traits on a personality dimension. The dimensions may vary from user group to user group since not all people consider a type of trait to be relevant to a type of robot, which our results corroborate. Therefore, we propose a test procedure as an engineering tool to identify, with the help of a user group, personality dimensions for engineering robot personalities out of a type of robot, knowing its typical usage. It applies to robots that can imitate human behaviour and to small user groups with at least eight people. We confirmed its effectiveness in limited-scope tests.
... In such a human-based knowledge economy, the winners would be creative engineers who master coding and create new ideas and knowledge. However, critiques have been levelled at such hierarchical relationships, in which knowledge flows narrowly only from humans to robots (Breazeal & Scassellati, 2002). Here, robots contribute to knowledge creation only narrowly, by repeating specified human-designed activities with exactly foreseen outcomes, while only humans bring knowledge through its novelty, value, and critical justification. ...
Article
Full-text available
In the contemporary robotizing knowledge economy, robots take increasing responsibility for accomplishing knowledge-related tasks that so far have been in the human domain. This profoundly changes the knowledge-creation processes that are at the core of the knowledge economy. Knowledge creation is an interactive spatial process through which ideas are transformed into new and justified outcomes, such as novel knowledge and innovations. However, knowledge-creation processes have rarely been studied in the context of human–robot co-creation. In this article, we take the perspective of key actors who create the future of robotics, namely, robotics-related students and researchers. Their thoughts and actions construct the knowledge co-creation processes that emerge between humans and robots. We ask whether robots can have and create knowledge, what kind of knowledge, and what kind of spatialities connect to interactive human–robot knowledge-creation processes. The article’s empirical material consists of interviews with 34 robotics-related researchers and students at universities in Finland and Singapore as well as observations of human–robot interactions there. Robots and humans form top-down systems, interactive syntheses, and integrated symbioses in spatial knowledge co-creation processes. Most interviewees considered that robots can have knowledge. Some perceived robots as machines and passive agents with rational knowledge created in hierarchical systems. Others saw robots as active actors and learning co-workers having constructionist knowledge created in syntheses. Symbioses integrated humans and robots and allowed robots and human–robot cyborgs access to embodied knowledge.
... An intelligent agent can formulate these based on knowledge from perception, which can differ from others' in terms of content and weight. Moreover, this ability can help intelligent agents assess actions when mapping perception onto their own action repertoire in the taxonomy of social learning [43]. In the future, intelligent agents can use this ability to assess collaborators during interaction in collaborative environments where control over physical states, emotions, and feelings is required [44]. ...
Article
Full-text available
Personal semantic memory is a way of inducing subjectivity in intelligent agents. Personal semantic memory has knowledge related to personal beliefs, self-knowledge, preferences, and perspectives in humans. Modeling this cognitive feature in the intelligent agent can help them in perception, learning, reasoning, and judgments. This paper presents a methodology for the development of personal semantic memory in response to external information. The main contribution of the work is to propose and implement the computational version of personal semantic memory. The proposed model has modules for perception, learning, sentiment analysis, knowledge representation, and personal semantic construction. These modules work in synergy for personal semantic knowledge formulation, learning, and storage. Personal semantics are added to the existing body of knowledge qualitatively and quantitatively. We performed multiple experiments where the agent had conversations with the humans. Results show an increase in personal semantic knowledge in the agent’s memory during conversations with an F1 score of 0.86. These personal semantics evolved qualitatively and quantitatively with time during experiments. Results demonstrated that agents with the given personal semantics architecture possessed personal semantics that can help the agent to produce some sort of subjectivity in the future.
... As to what is required for imitation, there are debates in the literature ranging from the distinctions between program-level and production-level imitation (Byrne, 2002) to the necessity of pairing Theory of Mind (ToM) with behavioral imitation to obtain "true" imitation (Call et al., 2005). We refer the reader to Breazeal and Scassellati (2002) for a more detailed discussion of imitation in robots. ...
Article
Full-text available
This article introduces a three-axis framework indicating how AI can be informed by biological examples of social learning mechanisms. We argue that the complex human cognitive architecture owes a large portion of its expressive power to its ability to engage in social and cultural learning. However, the field of AI has mostly embraced a solipsistic perspective on intelligence. We thus argue that social interactions not only are largely unexplored in this field but also are an essential element of advanced cognitive ability, and therefore constitute metaphorically the “dark matter” of AI. In the first section, we discuss how social learning plays a key role in the development of intelligence. We do so by discussing social and cultural learning theories and empirical findings from social neuroscience. Then, we discuss three lines of research that fall under the umbrella of Social NeuroAI and can contribute to developing socially intelligent embodied agents in complex environments. First, neuroscientific theories of cognitive architecture, such as the global workspace theory and the attention schema theory, can enhance biological plausibility and help us understand how we could bridge individual and social theories of intelligence. Second, intelligence occurs in time as opposed to over time, and this is naturally incorporated by dynamical systems. Third, embodiment has been demonstrated to provide a more sophisticated array of communicative signals. To conclude, we discuss the example of active inference, which offers powerful insights for developing agents that possess biological realism, can self-organize in time, and are socially embodied.
... In choosing whether to use humanlike or machinelike agents in fundraising appeals, marketing managers can draw from many studies examining the effects of anthropomorphic versus robotic AI agents (Breazeal, 2004; Breazeal & Scassellati, 2002; Duffy, 2003; Ekman, 1999; Goudey & Bonnin, 2016; Mende et al., 2019; Sciutti et al., 2018; Zeng et al., 2009). Many have suggested that smiling human visages (Martin & Rovira, 1982) signal psychological proximity (Bogodistov & Dost, 2017). ...
Article
Anthropomorphism and construal level theories provide the bases for two studies showing that when nonprofit charity marketers design artificial intelligence (AI) agents to resemble humans and to smile like humans, potential donors feel greater psychological closeness to the agents and are motivated to increase charitable giving. Study 1 demonstrates that participants feel greater psychological closeness and willingness to donate in response to appeals from smiling AI agents that look like humans rather than like robots. Study 2 demonstrates that participants tend to donate more in reaction to appeals from humanlike (vs. machinelike) AI agents that smile broadly rather than slightly or not at all. The article concludes with a discussion of theoretical insights and practical implications for using AI representatives in nonprofit charity appeals.
... Since the robot and the demonstrating person are likely to have different embodiments, demonstrated motions need to be mapped to the robot's configuration space so that they can be executed on the robot's platform [8], [9]. Related to this, a demonstration can be observed using different sensory modalities, such that the modality determines the complexity of the demonstration setup and the difficulty of converting demonstrations to executable robot commands [7], [10]. For instance, using dedicated motion-capture sensors on the person's joints may simplify the data recording, but increases the complexity of the setup; on the other hand, using kinaesthetic teaching simplifies the demonstration mapping problem, but requires the robot platform to support this type of demonstration [11]. ...
Preprint
Robots applied in therapeutic scenarios, for instance in the therapy of individuals with Autism Spectrum Disorder, are sometimes used for imitation learning activities in which a person needs to repeat motions performed by the robot. To simplify the task of incorporating new types of motions that a robot can perform, it is desirable that the robot be able to learn motions by observing demonstrations from a human, such as a therapist. In this paper, we investigate an approach for acquiring motions from skeleton observations of a human, which are collected by a robot-centric RGB-D camera. Given a sequence of observations of various joints, the joint positions are mapped to match the configuration of a robot before being executed by a PID position controller. We evaluate the method, in particular the reproduction error, in a study with QTrobot in which the robot acquired different upper-body dance moves from multiple participants. The results indicate the method's overall feasibility, but also show that the reproduction quality is affected by noise in the skeleton observations.
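The final step described above, driving a joint toward a position mapped from the observed human skeleton with a PID position controller, can be sketched as follows (the gains, time step, and toy velocity-controlled joint model are illustrative assumptions, not the paper's actual parameters):

```python
class PIDPositionController:
    """Minimal discrete PID controller driving a joint toward a target angle."""

    def __init__(self, kp=2.0, ki=0.1, kd=0.05, dt=0.01):
        self.kp, self.ki, self.kd, self.dt = kp, ki, kd, dt
        self.integral = 0.0
        self.prev_error = 0.0

    def step(self, target, measured):
        """Return a control command from the current position error."""
        error = target - measured
        self.integral += error * self.dt
        derivative = (error - self.prev_error) / self.dt
        self.prev_error = error
        return self.kp * error + self.ki * self.integral + self.kd * derivative

# Toy joint that integrates velocity commands (a stand-in for real dynamics).
controller = PIDPositionController()
angle = 0.0      # current joint angle (rad)
target = 0.8     # angle mapped from the observed human skeleton
for _ in range(5000):
    command = controller.step(target, angle)
    angle += command * controller.dt
```

After the loop, the simulated joint has settled close to the target angle; on a real platform the same controller structure runs per joint, with gains tuned for the hardware.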
... For users, an automatic programming system hides the programming language. Automatic programming methods include LfD [23], robots that imitate human behavior [24], and programming by demonstration [25]. Currently, the four major robot manufacturers, which account for more than 60% of the world's industrial robots (ABB, Kuka, Fanuc and Yaskawa), all offer demonstration programming. ...
Article
Full-text available
Cyber-Physical Systems (CPS), a part of Industry 4.0, suggest that physical systems such as robots can be controlled by automation systems to minimize human workload. With the rise of automatic programming systems, programmers are no longer required to have a thorough understanding of the code: the automatic programming system generates a robot program from the interaction between the robot and the human. Many processing and recognition technologies for the human-computer interaction interface are required to let the robot achieve more natural interaction, and disruptive technologies must be considered in order to provide innovation and change the way we program. In this paper, we propose an approach to automatic programming of industrial robots with natural language. First, we use a multi-attention mechanism to measure the matching probability of natural language instructions against objects in the environment. Then, using a modular programming method, we generate code for robots by combining the prediction results. We extend an existing dataset for evaluation to make it more suitable for describing an actual manufacturing environment, taking position, attributes, and constraints into account. The experimental results show that our model achieves a 20% higher recognition rate than existing methods in accurately locating object positions, and that the similarity between code written by experienced engineers and code generated by our model reaches 80%.
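The first step described above, scoring how well a natural-language instruction matches each object in the environment, can be caricatured with a simple word-overlap matcher (a deliberately simplified stand-in for the paper's multi-attention mechanism; the object descriptions are made up for illustration):

```python
import math

def match_probabilities(instruction, object_descriptions):
    """Toy matcher: softmax over per-object word-overlap scores."""
    inst = set(instruction.lower().split())
    scores = [len(inst & set(desc.lower().split())) for desc in object_descriptions]
    exps = [math.exp(s) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

objects = ["red cube on table", "blue ball on shelf", "green cube in box"]
probs = match_probabilities("pick up the red cube", objects)
best = max(range(len(probs)), key=probs.__getitem__)   # index of the best match
```

A learned attention mechanism replaces the crude word-overlap score with context-sensitive similarities, but the output, a probability distribution over candidate objects that downstream code generation consumes, has the same shape.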
Article
Human dancers can understand and judge the aesthetics of their own dance motions from their movement perception. Inspired by this, we propose a novel mechanism for the automatic aesthetics assessment of robotic dance motions, based on ensemble learning and aimed at developing the autonomous judgment ability of robots. In the proposed mechanism, higher-order clustering features based on key pose descriptors are designed to characterize robotic dance motion. Then, an ensemble classifier is built to train a machine aesthetics model for the automatic aesthetics assessment of robotic dance motions. The proposed mechanism has been implemented in a simulated robot environment, and experimental results show its feasibility and good performance.
Thesis
Full-text available
With the introduction of computer-aided methods for diagnosis and intervention, patient outcomes of many clinical procedures have improved tremendously in recent decades. An essential task thereby is the registration of inter- or intra-patient images that are acquired using a single imaging modality or multiple modalities. In principle, vasculature permeates all organs of the human body. As it is spatially embedded, it reflects pathological changes of the surrounding tissue. Thus, accurate and robust registration methods that align vessel structures or images have the capability to benefit a large variety of clinical procedures. Numerous novel vessel registration algorithms have been published to date. Both intensity- and feature-based methods have been proposed, with a clear dominance of feature-based, especially point-based, methods in the state of the art. Although the methods proposed in the literature demonstrate promising results for dedicated clinical setups, a generalized approach for vessel registration regardless of the region of interest has not been proposed and evaluated. Furthermore, most state-of-the-art methods require cumbersome hyperparameter tuning, which consequently reduces their clinical applicability. Therefore, a primary goal of this thesis is to develop, implement and evaluate an accurate and inherently robust vessel registration framework that generalizes and applies to various clinical applications. Recent progress in machine learning and deep learning opens new perspectives for improving the efficiency of conventional registration methods for vasculature. Promising registration results have been achieved with different learning paradigms, such as reinforcement learning and supervised and unsupervised learning with synthetic and clinical data.
Hence, a secondary goal of this thesis is to investigate the potential of other machine learning and deep learning techniques to solve vessel image registration problems. One of the two point-based registration frameworks investigated in this thesis utilizes mixture models to align the centerlines of vasculature. A hybrid mixture model is proposed as a key part of the framework; it models the spatial and topological information of the vasculature simultaneously and is consequently equipped with significant discriminative capacity. Moreover, an automatic refinement mechanism to identify regions with missing data is formulated. The final transformation to the target image can be estimated with different methods such as Thin-Plate-Spline, B-Spline or Gaussian Process. The evaluation with synthetic, phantom and clinical data acquired from different clinical procedures demonstrates the accuracy and inherent robustness of the entire vessel registration framework regardless of the clinical setup. The other approach formulated in this thesis makes use of the imitation learning paradigm to overcome the weaknesses and challenges of adopting reinforcement learning for the image registration problem. Under the guidance of a demonstrator, an agent is trained to find the optimal displacement of landmarks. The network architecture is inspired by PointNet, which is able to consume raw point data as input. The proposed framework is evaluated retrospectively with clinical fundoscopic images. Particular attention in the conducted experiments is devoted to understanding the principles of our imitation network, where the influences of the model parameters are analyzed in detail. The evaluation results demonstrate, for the first time, the potential and effectiveness of the imitation network for registering vascular images.
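As a much-simplified illustration of point-based registration of the kind the thesis builds on, the rigid (rotation plus translation) case with known point correspondences has a closed-form least-squares solution, the Kabsch/Procrustes method (a generic textbook technique, not the thesis' hybrid mixture-model framework, which handles unknown correspondences and non-rigid deformation):

```python
import numpy as np

def rigid_align(source, target):
    """Least-squares rigid alignment (Kabsch) of paired point sets.

    Returns (R, t) such that R @ p + t maps source points onto target
    points in the least-squares sense; correspondences are assumed known.
    """
    src_mean = source.mean(axis=0)
    tgt_mean = target.mean(axis=0)
    H = (source - src_mean).T @ (target - tgt_mean)   # cross-covariance
    U, _, Vt = np.linalg.svd(H)
    d = np.sign(np.linalg.det(Vt.T @ U.T))            # guard against reflection
    D = np.diag([1.0] * (source.shape[1] - 1) + [d])
    R = Vt.T @ D @ U.T
    t = tgt_mean - R @ src_mean
    return R, t

# Synthetic check: rotate and translate random "centerline" points, then recover.
rng = np.random.default_rng(0)
pts = rng.normal(size=(20, 3))
theta = 0.4
R_true = np.array([[np.cos(theta), -np.sin(theta), 0.0],
                   [np.sin(theta),  np.cos(theta), 0.0],
                   [0.0, 0.0, 1.0]])
t_true = np.array([1.0, -2.0, 0.5])
moved = pts @ R_true.T + t_true
R_est, t_est = rigid_align(pts, moved)
err = np.abs(pts @ R_est.T + t_est - moved).max()
```

Real vessel registration must additionally estimate the correspondences themselves (e.g. via mixture models) and allow non-rigid transformations, which is precisely where the thesis' contributions lie.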
Article
Full-text available
Theory of Mind and Age-Associated Egocentricity Abstract: Theory of Mind (ToM) is a demanding, composite process relying on several cognitive skills. It is examined with sophisticated paradigms in psychological laboratories. Different factors contribute to increasing difficulties in coping with such tasks at older ages. Cardiovascular disease, visual and hearing impairment, and specific brain diseases such as neurodegenerative or vascular dementia (e.g. frontotemporal lobar degeneration) contribute to the challenges in finding the right answers to complex social questions. ToM needs constant training, which may suffer during social lay-offs such as forced retirement, a major factor behind age-associated egocentricity (AAE).
Conference Paper
Full-text available
In this paper, we present a method for programming robust, reusable and hardware-abstracted robot skills. The goal of this work is to supply mobile robot manipulators with a library of skills that incorporate both sensing and action, which permit robot novices to easily reprogram the robots to perform new tasks. Critical to the success of this approach is the notion of hardware abstraction, that separates the skill level from the primitive level on specific systems. Leveraging a previously proposed architecture, we construct two complex skills by instantiating the necessary skill primitives on two very different mobile manipulators. The skills are parameterized by task level variables, such as object labels and environment locations, making re-tasking the skills by operators feasible.
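The hardware-abstraction idea can be sketched as a two-layer structure: a primitive interface that each robot implements, and skills written only against that interface and parameterized by task-level variables (object labels, locations). The class and method names below are illustrative, not the paper's actual architecture.

```python
from abc import ABC, abstractmethod

class PrimitiveLayer(ABC):
    """Hardware-specific layer: each robot supplies its own primitives."""
    @abstractmethod
    def move_to(self, location): ...
    @abstractmethod
    def grasp(self, label): ...
    @abstractmethod
    def release(self): ...

class PickAndPlace:
    """Hardware-abstracted skill: the same code re-tasks to any robot
    that implements PrimitiveLayer, parameterized by task-level
    variables (object label, source and target locations)."""
    def __init__(self, hw: PrimitiveLayer):
        self.hw = hw

    def run(self, label, src, dst):
        self.hw.move_to(src)
        self.hw.grasp(label)
        self.hw.move_to(dst)
        self.hw.release()
```

Re-tasking then amounts to calling `run` with new task-level arguments, while porting to a new manipulator only requires a new `PrimitiveLayer` subclass.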
Chapter
Dilemmas do occur time and again. How to efficiently and effectively handle a dilemma is a big challenge for artificial intelligence (AI) systems. With the current approaches, AI systems are not able to accomplish a task should a dilemma, especially an ethical one, be encountered. Utilizing a hypothetical dilemma case, this paper analyzes the limitations of the current approaches in dealing with dilemmas and points out the root cause of the problem. It then proposes a novel layered framework, which fully utilizes both the unique strengths of humans and the unique strengths of AI systems. This layered framework actively explores different angles, areas, aspects, dimensions, domains, facets, fields, and/or layers in an attempt to find tipping points so that ethical dilemmas can be successfully tamed and handled. This exploration will thus lead to a paradigm shift.
Thesis
A long-standing goal of Machine Learning (ML) and AI at large is to design autonomous agents able to efficiently interact with our world. Towards this, taking inspiration from the interactive nature of human and animal learning, several lines of work have focused on building decision-making agents embodied in real or virtual environments. In less than a decade, Deep Reinforcement Learning (DRL) established itself as one of the most powerful sets of techniques to train such autonomous agents. DRL is based on the maximization of expert-defined reward functions that guide an agent's learning towards a predefined target task or task set. In parallel, the Developmental Robotics field has been working on modelling cognitive development theories and integrating them into real or simulated robots. A core concept developed in this literature is the notion of intrinsic motivation: developmental robots explore and interact with their environment according to self-selected objectives in an open-ended learning fashion. Recently, similar ideas of self-motivation and open-ended learning started to grow within the DRL community, while the Developmental Robotics community started to consider DRL methods for their developmental systems. We propose to refer to this convergence of works as Developmental Machine Learning. Developmental ML regroups works on building embodied autonomous agents equipped with intrinsic-motivation mechanisms shaping open-ended learning trajectories. The present research aims to contribute to this emerging field. More specifically, it focuses on proposing and assessing the performance of a core algorithmic block of such developmental machine learners: Automatic Curriculum Learning (ACL) methods. ACL algorithms shape the learning trajectories of agents by challenging them with tasks adapted to their capacities.
In recent years, they have been used to improve sample efficiency and asymptotic performance, to organize exploration, to encourage generalization or to solve sparse reward problems, among others. Despite impressive success in traditional supervised learning scenarios (e.g. image classification), large-scale and real-world applications of embodied machine learners are yet to come. The present research aims to contribute towards the creation of such agents by studying how to autonomously and efficiently scaffold them up to proficiency.
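The task-sampling core of an ACL method can be sketched with one common heuristic, absolute learning progress: sample tasks in proportion to the recent change in measured competence on each task. The class name, task names, and constants are illustrative; real ACL systems use richer competence models.

```python
import random

class LPCurriculum:
    """Automatic curriculum by absolute learning progress (ALP):
    tasks are sampled with probability proportional to the recent
    change in competence, plus an epsilon of uniform exploration.
    One common ACL heuristic; constants are illustrative."""
    def __init__(self, tasks, eps=0.1):
        self.tasks, self.eps = list(tasks), eps
        self.hist = {t: [0.0, 0.0] for t in self.tasks}  # [older, latest]

    def update(self, task, competence):
        self.hist[task] = [self.hist[task][1], competence]

    def sample(self):
        lp = {t: abs(new - old) for t, (old, new) in self.hist.items()}
        total = sum(lp.values())
        if total == 0.0 or random.random() < self.eps:
            return random.choice(self.tasks)      # explore uniformly
        r = random.uniform(0.0, total)
        for task, v in lp.items():                # roulette-wheel selection
            r -= v
            if r <= 0.0:
                return task
        return self.tasks[-1]
```

Tasks where competence is changing fastest (improving or collapsing) get sampled most, which concentrates training at the frontier of the agent's current abilities.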
Article
Full-text available
In on-line control and decision-making systems, brain emotional learning is a preferred methodology (compared to stochastic gradient-based and evolutionary algorithms) due to its low computational complexity and fast, robust learning. To describe the emotional learning of the brain, a mathematical model was created: the brain emotional learning controller (BELC). The design of intelligent systems based on emotional signals builds on soft-computing control methods: artificial neural networks, fuzzy control, and genetic algorithms. Based on the simulated mathematical model of mammalian brain emotional learning (BEL), a controller architecture has been developed. The applied approach is called the "Brain Emotional Learning Based Intelligent Controller" (BELBIC): a neurobiologically motivated intelligent controller based on a computational model of emotional learning in the mammalian limbic system. The article describes applied models of intelligent regulators based on emotional learning of the brain. BELBIC's learning capabilities, versatility, and low computational complexity make it a very promising toolkit for on-line applications.
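A minimal version of the BEL update for a single sensory channel can be sketched as follows. The exact update rules and gains vary across the BELBIC literature, so treat the constants and rule shapes here as one illustrative formulation rather than the canonical model.

```python
class BELModel:
    """Minimal brain-emotional-learning (BEL) unit for one sensory
    channel. The amygdala gain G only grows (fast, monotone excitatory
    learning); the orbitofrontal gain W is signed and learns to inhibit
    overshoot. One BELBIC-style formulation with illustrative constants."""
    def __init__(self, alpha=0.2, beta=0.1):
        self.alpha, self.beta = alpha, beta
        self.G, self.W = 0.0, 0.0

    def step(self, s, reward):
        A = self.G * s                 # amygdala response
        O = self.W * s                 # orbitofrontal inhibition
        E = A - O                      # emotional (controller) output
        self.G += self.alpha * s * max(0.0, reward - A)  # cannot unlearn
        self.W += self.beta * s * (E - reward)
        return E

# The orbitofrontal branch lets the unit track a *reduced* reward level
# even though the amygdala gain can never decrease.
m = BELModel()
for _ in range(200):
    out = m.step(1.0, 1.0)    # learn reward level 1.0
for _ in range(300):
    out = m.step(1.0, 0.4)    # reward drops; OFC inhibition compensates
```

The asymmetry (monotone amygdala, corrective OFC) is the signature of the model: fast acquisition with a slower inhibitory pathway for re-evaluation.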
Article
Children can imitate adults' actions with ease. An imitator who observes a demonstrator's action can produce a bodily movement that is supposed to be similar to it. Any theory or hypothesis of imitation is required to describe how the imitator recognizes the similarity or identity of actions, i.e., in what sense, from the imitator's point of view, the produced and the observed action can be similar or identical. In this paper, we review four existing hypotheses on the mechanism of imitation. One of our criteria for this review is the possibility of implementing and validating the hypotheses in computer simulations or physical robots. A straightforward implementation without additional ad-hoc assumptions would be difficult if the hypothesis to be implemented were flawed. Thus, computer-based and/or physical implementation tests whether a hypothesis accounts for the imitation mechanism sufficiently. This motivates us to critically review and clarify the existing literature on imitation in terms of the technical and theoretical plausibility of the existing hypotheses. We review the four existing hypotheses on imitation, some of which have been implemented in the form of computer simulations and/or robots. By pointing out the additional assumptions required for implementation, this review reveals latent requirements for an account of imitation that have not been addressed well. Lastly, we briefly propose our own account of imitation, in which bodily movements are characterized and identified on the basis of dynamical invariants under smooth transformations.
Article
In an era of rapid advances in artificial intelligence, the deployment of robots in organizations is accelerating. Further, robotic capabilities are expanding to serve a broader range of leadership behaviors related to task accomplishment and relationship support. Despite the increasing use of robots in various roles across different industries, research on human-robot collaboration in the workplace is lagging behind. As such, the current research aims to provide a state-of-the-science review and directions for future work in this underdeveloped area. Drawing on current leadership paradigms, we review human-robot collaboration studies from four academic disciplines with a history of publishing such work (i.e., management, economics, psychology, engineering) and propose that the research trajectory of human-robot collaboration parallels the evolution of leadership research paradigms (i.e., leader centric, relational view, and follower centric). Given that leadership is an inherently multilevel phenomenon, we apply a levels-of-analysis framework to integrate and synthesize human-robot collaboration studies from cross-disciplinary research areas. Based on our findings, we offer suggestions for future research in terms of conceptualization, theory building and testing, practical implications, and ethical considerations.
Article
Productivity and flexibility are antagonists with opposite goals. Until now, optimal productivity could only be achieved in fully automated, inflexible serial production. Increased demand for flexibility due to individualized products, e.g. in most SMEs, requires an increased level of flexibility, also for robots, which should provide improved skills as well as improved means of interaction to simplify programming. Instrumented tools based on 3D tracking technology can be used as interfaces in Programming by Demonstration (PbD) scenarios but are prone to inaccuracies introduced by human demonstration. This paper presents a programming paradigm that combines semantic-model-based geometric reasoning with an instrumented-tool-based PbD approach in order to compensate for these introduced inaccuracies, and investigates the basic accuracy of demonstrating point-based operations.
Article
In industrial manufacturing, the deployment of dual-arm robots in assembly tasks has become a trend. However, making dual-arm robots more intelligent in such applications is still an open, challenging issue. This paper proposes a novel framework that combines task-oriented motion planning with visual perception to facilitate robot deployment from perception to execution and to solve assembly problems with dual-arm robots. In this framework, visual perception is first employed to track the effects of the robot's behaviors and observe the states of the workpieces, so that task performance can be abstracted as a high-level state for intelligent reasoning. The assembly task and manipulation sequences can be obtained by analyzing and reasoning over the state-transition trajectory of the environment and the workpieces. Next, the corresponding assembly manipulation can be generated and parameterized according to the differences between adjacent states, combined with prebuilt knowledge of the scenarios. Experiments are set up with a dual-arm robotic system (ABB YuMi and an RGB-D camera) to validate the proposed framework. Experimental results demonstrate the effectiveness of the proposed framework and the promising value of its practical application.
Article
Full-text available
Social robots are increasingly penetrating our daily lives. They are used in various domains, such as healthcare, education, business, industry, and culture. However, introducing this technology for use in conventional environments is not trivial. For users to accept social robots, a positive user experience is vital, and it should be considered a critical part of the robots' development process. This may potentially lead to more extensive use of social robots and strengthen their diffusion in society. The goal of this study is to summarize the extant literature focused on user experience in social robots, and to identify the challenges and benefits of UX evaluation in social robots. To achieve this goal, the authors carried out a systematic literature review that relies on PRISMA guidelines. Our findings revealed that the most common methods to evaluate UX in social robots are questionnaires and interviews. UX evaluations were found to be beneficial in providing early feedback and, consequently, in handling errors at an early stage. However, despite the importance of UX in social robots, robot developers often neglect to set UX goals due to lack of knowledge or lack of time. This study emphasizes the need for robot developers to acquire the required theoretical and practical knowledge on how to perform a successful UX evaluation.
Article
Full-text available
If we are to build human-like robots that can interact naturally with people, our robots must know not only about the properties of objects but also the properties of animate agents in the world. One of the fundamental social skills for humans is the attribution of beliefs, goals, and desires to other people. This set of skills has often been called a theory of mind. This paper presents the theories of Leslie (1994) and Baron-Cohen (1995) on the development of theory of mind in human children and discusses the potential application of both of these theories to building robots with similar capabilities. Initial implementation details and basic skills (such as finding faces and eyes and distinguishing animate from inanimate stimuli) are introduced. I further speculate on the usefulness of a robotic implementation in evaluating and comparing these two models.
Article
Full-text available
This team is working on easier ways to program behavior in humanoid robots, and potentially in other machines and computer systems, based on how we "program" behavior in our fellow human beings. Their current robot is DB, a hydraulic anthropomorphic robot with legs, arms (with palms but no fingers), a jointed torso, and a head. It has 30 degrees of freedom: three in the neck, two in each eye, seven in each arm, three in each leg, and three in the trunk. The robot is currently mounted at the pelvis so the researchers can focus on upper-body movement and avoid dealing with balance. The work described here discusses trajectory formation and planning, learning from demonstration, oculomotor control, and interactive behaviors. The team has already demonstrated paddling a single ball on a racket, learning a folk dance by observing a human perform it, drumming synchronized to sounds the robot hears (karaoke drumming), juggling three balls, performing a T'ai Chi exercise in contact with a human, and various oculomotor behaviors.
Article
Full-text available
Humans demonstrate a remarkable ability to generate accurate and appropriate motor behavior under many different and often uncertain environmental conditions. In this paper, we propose a modular approach to such motor learning and control. We review the behavioral evidence and benefits of modularity, and propose a new architecture based on multiple pairs of inverse (controller) and forward (predictor) models. Within each pair, the inverse and forward models are tightly coupled both during their acquisition, through motor learning, and use, during which the forward models determine the contribution of each inverse model's output to the final motor command. This architecture can simultaneously learn the multiple inverse models necessary for control as well as how to select the inverse models appropriate for a given environment. Finally, we describe specific predictions of the model, which can be tested experimentally.
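The pairing of forward (predictor) and inverse (controller) models can be sketched on a toy one-dimensional plant: each module pair assumes a different object mass, forward-model prediction errors yield soft "responsibilities", and the motor command is the responsibility-weighted blend of the inverse models' outputs. The plant dynamics and the gain `lam` are illustrative, not the paper's model.

```python
import numpy as np

def softmax(z):
    z = z - np.max(z)
    e = np.exp(z)
    return e / e.sum()

def mosaic_step(x, u_prev, x_prev, x_target, masses, lam=50.0):
    """One step of a modular controller on the toy plant x' = x + u / m.
    Each module pair assumes one candidate mass: its forward model
    scores the last observed transition, and its inverse model proposes
    the command it believes reaches the target."""
    # Responsibilities: softmax over negative squared prediction errors
    pred = np.array([x_prev + u_prev / m for m in masses])
    resp = softmax(-lam * (pred - x) ** 2)
    # Inverse models: one proposed command per assumed mass
    u_each = np.array([(x_target - x) * m for m in masses])
    return float(resp @ u_each), resp

# Roll out in an environment whose true mass matches module 1
masses, true_mass = [1.0, 5.0], 5.0
x, u, x_prev = 0.0, 0.0, 0.0
resps = []
for _ in range(10):
    u_new, resp = mosaic_step(x, u, x_prev, x_target=1.0, masses=masses)
    resps.append(resp)
    x_prev, u = x, u_new
    x += u / true_mass        # real plant response
```

After a single informative transition the responsibility collapses onto the module whose forward model explains the observed dynamics, and that module's inverse model dominates the command, so control becomes accurate without ever identifying the mass explicitly.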
Article
Full-text available
This paper addresses the role of imitation as a means to enhance the learning of communication skills in autonomous robots. A series of robotic experiments are presented in which autonomous mobile robots are taught a synthetic proto-language. Learning of the language occurs through an imitative scenario where the robot replicates the teacher's movements. Imitation is here an implicit attentional mechanism which allows the robot imitator to share a similar set of proprio- and exteroceptions with the teacher. The robot grounds its understanding of the teacher's words, which describe the teacher's current observations, upon its own perceptions which are similar to those of the teacher. Learning of the robot is based on a dynamical recurrent associative memory architecture (DRAMA). Learning is unsupervised and results from the self-organization of the robot's connectionist architecture. Results show that the imitative behavior greatly improves the efficiency and speed of the learning.
Article
Full-text available
To execute voluntary movements, the central nervous system must transform the neural representation of the direction, amplitude, and velocity of the limb, represented by the activity of cortical and subcortical neurons, into signals that activate the muscles that move the limb. This task is equivalent to solving an "ill-posed" computational problem because the number of degrees of freedom of the musculoskeletal apparatus is much larger than that specified in the plan of action. Some of the mechanisms and circuitry underlying the transformation of motor plans into motor commands are described. A central feature of this transformation is a coarse map of limb postures in the premotor areas of the spinal cord. Vectorial combination of motor outputs among different areas of the spinal map may produce a large repertoire of motor behaviors.
Article
Full-text available
While it is generally assumed that complex movements consist of a sequence of simpler units, the quest to define these units of action, or movement primitives, remains an open question. In this context, two hypotheses of movement segmentation of endpoint trajectories in three-dimensional human drawing movements are reexamined: (1) the stroke-based segmentation hypothesis based on the results that the proportionality coefficient of the two-thirds power law changes discontinuously with each new "stroke," and (2) the segmentation hypothesis inferred from the observation of piecewise planar endpoint trajectories of three-dimensional drawing movements. In two experiments human subjects performed a set of elliptical and figure eight patterns of different sizes and orientations using their whole arm in three dimensions. The kinematic characteristics of the endpoint trajectories and the seven joint angles of the arm were analyzed. While the endpoint trajectories produced similar segmentation features to those reported in the literature, analyses of the joint angles show no obvious segmentation but rather continuous oscillatory patterns. By approximating the joint angle data of human subjects with sinusoidal trajectories, and by implementing this model on a 7-degree-of-freedom (DOF) anthropomorphic robot arm, it is shown that such a continuous movement strategy can produce exactly the same features as observed by the above segmentation hypotheses. The origin of this apparent segmentation of endpoint trajectories is traced back to the nonlinear transformations of the forward kinematics of human arms. The presented results demonstrate that principles of discrete movement generation may not be reconciled with those of rhythmic movement as easily as has been previously suggested, while the generalization of nonlinear pattern generators to arm movements can offer an interesting alternative to approach the question of units of action.
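The two-thirds power law mentioned above relates angular velocity A to curvature C as A(t) = k · C(t)^(2/3), equivalently speed v = k · κ^(-1/3). For an ellipse traced at a constant parameter rate the relation holds exactly, which a short numerical check confirms (the parameters a, b, omega are arbitrary choices):

```python
import numpy as np

# Ellipse traced at constant parameter rate omega; check that
# speed * curvature^(1/3) is constant (the two-thirds power law).
a, b, omega = 2.0, 1.0, 1.0
t = np.linspace(0.0, 2.0 * np.pi, 1000, endpoint=False)

# Analytic derivatives of x = a*cos(omega*t), y = b*sin(omega*t)
dx = -a * omega * np.sin(omega * t)
dy = b * omega * np.cos(omega * t)
ddx = -a * omega**2 * np.cos(omega * t)
ddy = -b * omega**2 * np.sin(omega * t)

speed = np.hypot(dx, dy)
kappa = np.abs(dx * ddy - dy * ddx) / speed**3   # planar curvature
gain = speed * kappa ** (1.0 / 3.0)              # should be constant
```

Here `gain` equals omega * (a*b)^(1/3) at every sample, so a segmentation rule keyed to changes in the power-law coefficient finds no segment boundaries in such a trajectory — consistent with the paper's argument that apparent endpoint segmentation can arise without discrete movement units.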
Conference Paper
Full-text available
This paper considers humanoid research as an approach to understanding and realizing complex real-world interactions among robot, environment, and human. As a first step towards extracting a common principle underlying these three-term interactions, the concept of action-oriented control has been investigated with a simulation example. The complex-interaction view casts unique constraints on the design of a humanoid, such as a whole-body, smooth shape and a non-functional-modular design. A brief description of the ongoing design of the ETL-humanoid, which conforms to the above constraints, is presented.
Article
Full-text available
In this paper, we present methods that give machines the ability to see people, interpret their actions, and interact with them. We present the motivating factors behind this work, examples of how such computational methods are developed, and their applications.
Article
Full-text available
Motor control is a complex problem, and imitation is a powerful mechanism for acquiring new motor skills. In this paper, we describe perceptuo-motor primitives, a biologically inspired notion for a basis set of perceptual and motor routines. Primitives serve as a vocabulary for classifying and imitating observed human movements, and are derived from the imitator's motor repertoire. We describe a model of imitation based on such primitives and demonstrate the feasibility of the model in a constrained implementation. We present approximate motion reconstructions generated from visually captured data of typically imitated tasks taken from aerobics, dancing, and athletics.
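The "vocabulary" role of primitives can be sketched as nearest-neighbour classification of an observed trajectory against a small primitive library, after normalizing away position and overall size. This is an illustrative stand-in for the classification step, not the authors' implementation; trajectories are assumed to share the same number of samples.

```python
import numpy as np

def normalize(traj):
    """Make a trajectory invariant to position and overall size:
    subtract the start point and scale to unit path length."""
    traj = traj - traj[0]
    path_len = np.linalg.norm(np.diff(traj, axis=0), axis=1).sum()
    return traj / (path_len + 1e-12)

def classify(observed, primitives):
    """Label an observed movement by its nearest primitive: the
    'vocabulary lookup' step of primitive-based imitation.
    All trajectories must have the same number of samples."""
    obs = normalize(observed)
    dists = {name: np.linalg.norm(normalize(p) - obs)
             for name, p in primitives.items()}
    return min(dists, key=dists.get)
```

Imitation then proceeds by executing the motor program attached to the winning primitive, rather than reproducing the raw observed trajectory point by point.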
Article
Full-text available
Adults are extremely adept at recognizing social cues, such as eye direction or pointing gestures, that establish the basis of joint attention. These skills serve as the developmental basis for more complex forms of metaphor and analogy by allowing an infant to ground shared experiences and by assisting in the development of more complex communication skills. In this chapter, we review some of the evidence for the developmental course of these joint attention skills from developmental psychology, from disorders of social development such as autism, and from the evolutionary development of these social skills. We also describe an on-going research program aimed at testing existing models of joint attention development by building a human-like robot which communicates naturally with humans using joint attention. Our group has constructed an upper-torso humanoid robot, called Cog, in part to investigate how to build intelligent robotic systems by following a developmental progression of skills similar to that observed in human development. Just as a child learns social skills and conventions through interactions with its parents, our robot will learn to interact with people using natural social communication. We further consider the critical role that imitation plays in bootstrapping a system from simple visual behaviors to more complex social skills. We will present data from a face and eye finding system that serves as the basis of this developmental chain, and an example of how this system can imitate the head movements of an individual.
Article
Full-text available
A novel task instruction method for future intelligent robots is presented. In our method, a robot learns reusable task plans by watching a human perform assembly tasks. Functional units and working algorithms for visual recognition and analysis of human action sequences are presented. The overall system is model based and integrated at the symbolic level. Temporal segmentation of a continuous task performance into meaningful units and identification of each operation is processed in real time by concurrent recognition processes under active attention control. Dependency among assembly operations in the recognized action sequence is analyzed, which results in a hierarchical task plan describing the higher level structure of the task. In another workspace with a different initial state, the system re-instantiates and executes the task plan to accomplish an equivalent goal. The effectiveness of our method is supported by experimental results with block assembly tasks.
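The dependency-analysis step — turning one observed operation sequence into a partial-order task plan — can be sketched simply: an operation depends on the earlier operation that produced a part it consumes, and any topological order of the resulting graph is a valid re-execution. This is a simplified stand-in for the paper's symbolic analysis, and the `(action, inputs, output)` record format is hypothetical.

```python
from collections import defaultdict

def build_plan(observed):
    """Derive a partial-order plan from one observed sequence of
    (action, input_parts, output_part) records: an operation depends
    on the most recent operation that produced a part it consumes."""
    producer, deps = {}, defaultdict(set)
    for i, (action, inputs, output) in enumerate(observed):
        for part in inputs:
            if part in producer:
                deps[i].add(producer[part])
        producer[output] = i
    return deps

def linearize(observed, deps):
    """Any topological order of the (acyclic) dependency graph is a
    valid re-execution of the task plan in a new workspace."""
    done, order = set(), []
    while len(order) < len(observed):
        for i in range(len(observed)):
            if i not in done and deps[i] <= done:
                done.add(i)
                order.append(i)
    return order
```

Because only genuine part-flow constraints are kept, independent operations (e.g. placing two unrelated blocks) can be reordered or parallelized when the plan is re-instantiated under a different initial state.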
Article
Full-text available
Roboticists have already invested considerable energy in building robot controllers which model the learning capacities of single animals. In this paper we present a new type of controller which draws upon insight from the field of imitative learning: one agent learns from perceiving and imitating the behaviour of another. We describe the architecture of an imitative learning controller and two implementations, a simulator and a robot. The learner robot follows a teacher robot through a maze and learns to associate its perceptions at locations where the teacher carries out a significant action with the action it subsequently undertakes as a result of its innate teacher-following behaviour. Such a controller limits the learning task to bouts of learning when there is something useful to be learnt. It allows a robot to learn in terms of its own perceptions, makes programming many nominally identical robots simpler, and opens the possibilities for cross-modal learning.
Article
Full-text available
Imitation is a powerful mechanism for learning new skills. However, it involves an intricate interaction between perceptual and motor mechanisms, both of which are complex in themselves. Recently, findings from neuroscience have discovered so-called "mirror neurons", which directly couple the observation of certain movements with their motor execution [6]. We combine the notions of perceptual and motor routines with the function of mirror neurons into primitives. The primitives are used as a basis set of motion, serving as a vocabulary for classifying and imitating observed movements. For a detailed discussion of the biological inspirations for our model, see [4]. In the imitation model we envisage (Figure 1) and are continually developing, perceptuo-motor primitives encode movements invariant to exact Cartesian position, rate of motion, size, and perspective. The primitives represent a basis set of motion.
Article
Full-text available
To explore issues of developmental structure, physical embodiment, integration of multiple sensory and motor systems, and social interaction, we have constructed an upper-torso humanoid robot called Cog. The robot has twenty-one degrees of freedom and a variety of sensory systems, including visual, auditory, vestibular, kinesthetic, and tactile senses. This chapter gives a background on the methodology that we have used in our investigations, highlights the research issues that have been raised during this project, and provides a summary of both the current state of the project and our long-term goals. We report on a variety of implemented visual-motor routines (smooth-pursuit tracking, saccades, binocular vergence, and vestibular-ocular and opto-kinetic reflexes), orientation behaviors, motor control techniques, and social behaviors (pointing to a visual target, recognizing joint attention through face and eye finding, imitation of head nods, and regulating interaction through expressive feedback). We further outline a number of areas for future research that will be necessary to build a complete embodied system.
Article
Infants between 12 and 21 days of age can imitate both facial and manual gestures; this behavior cannot be explained in terms of either conditioning or innate releasing mechanisms. Such imitation implies that human neonates can equate their own unseen behaviors with gestures they see others perform.
Article
This paper describes experiments performed with 40 subjects wearing an eye-tracker while watching and imitating videos of finger, hand, and arm movements. For all types of stimuli, the subjects tended to fixate on the hand, regardless of whether they were imitating or just watching. The results lend insight into the connection between visual perception and motor control, suggesting that: (1) people analyze human arm movements largely by tracking the hand or the end-point, even if the movement is performed with the entire arm, and (2) when imitating, people use internal innate and learned models of movement, possibly in the form of motor primitives, to recreate the details of whole-arm posture and movement from end-point trajectories.
Article
We derive a simple operational definition of teaching that distinguishes it from other forms of social learning where there is no active participation of instructors, and then discuss the constituent parts of the definition in detail. From a functional perspective, it is argued that the instructor's sensitivity to the pupil's changing skills or knowledge, and the instructor's ability to attribute mental states to others, are not necessary conditions of teaching in nonhuman animals, as assumed by previous work, because guided instruction without these prerequisites could still be favored by natural selection. A number of cases of social interaction in several orders of mammals and birds that have been interpreted as evidence of teaching are then reviewed. These cases fall into two categories: situations where offspring are provided with opportunities to practice skills ("opportunity teaching"), and instances where the behavior of young is either encouraged or punished by adults ("coaching"). Although certain taxonomic orders appear to use one form of teaching more often than the other, this may have more to do with the quality of the current data set than with inherent species-specific constraints. We suggest several directions for future research on teaching in nonhuman animals that will lead to a more thorough understanding of this poorly documented phenomenon. We argue throughout that adherence to conventional, narrow definitions of teaching, generally derived from observations of human adult-infant interactions, has caused many related but simpler phenomena in other species to go unstudied or unrecorded, and severely limits further exploration of this topic.
Article
Investigated whether children would re-enact what an adult actually did or what the adult intended to do. In Experiment 1, children were shown an adult who tried, but failed, to perform certain target acts. Completed target acts were thus not observed. Children in comparison groups either saw the full target act or appropriate controls. Results showed that children could infer the adult's intended act by watching the failed attempts. Experiment 2 tested children's understanding of an inanimate object that traced the same movements as the person had followed. Children showed a completely different reaction to the mechanical device than to the person: they did not produce the target acts in this case. Eighteen-month-olds situate people within a psychological framework that differentiates between the surface behavior of people and a deeper level involving goals and intentions. They have already adopted a fundamental aspect of folk psychology: persons (but not inanimate objects) are understood within a framework involving goals and intentions.
Article
A theory of imitation is proposed, string parsing, which separates the copying of behavioural organization by observation from an understanding of the cause of its effectiveness. In string parsing, recurring patterns in the visible stream of behaviour are detected and used to build a statistical sketch of the underlying hierarchical structure. This statistical sketch may in turn aid the subsequent comprehension of cause and effect. Three cases of social learning of relatively complex skills are examined, as potential cases of imitation by string parsing. Understanding the basic requirements for successful string parsing helps to resolve the conflict between mainly negative reports of imitation in experiments and more positive evidence from natural conditions. Since string parsing does not depend on comprehension of the intentions of other agents or the everyday physics of objects, separate tests of these abilities are needed even in animals shown to learn by imitation.
Article
This paper proposes a research direction to study the development of ‘artificial social intelligence’ of autonomous robots which should result in ‘individualized robot societies’. The approach is highly inspired by the ‘social intelligence hypothesis’, derived from the investigation of primate societies, suggesting that primate intelligence originally evolved to solve social problems and was only later extended to problems outside the social domain. We suggest that it might be a general principle in the evolution of intelligence, applicable to both natural and artificial systems. Arguments are presented why the investigation of social intelligence for artifacts is not only an interesting research issue for the study of biological principles, but may be a necessary prerequisite for those scenarios in which autonomous robots are integrated into human societies, interacting and communicating both with humans and with each other. As a starting point to study experimentally the development of robots' ‘social relationships’, the investigation of collection and use of body images by means of imitation is proposed. A specific experimental setup which we use to test the theoretical considerations is described. The paper outlines in what kind of applications and for what kind of robot group structures social intelligence might be advantageous.
Article
This paper reports on experiments in which a physical autonomous robot is taught a basic vocabulary concerning objects in its close environment. Teaching is provided in one case by a second teacher robot and in another case by a human teacher. An imitative strategy, namely mutual following, is used to create a common perceptual context to learner and teacher agents, upon which the learner grounds its understanding of the teacher's words. Learning results from multiple associations between simultaneous and consecutive sensor stimuli and is performed by a Dynamical Recurrent Associative Memory Architecture. Successes and failures of the learning are investigated under different environmental constraints and by varying parameters internal to the agent's control system. The experiments are realised both in simulated and real environments. We observe correlations between environmental and internal parameters, namely that the duration of short-term memory of sensor stimuli has to be fixed in relation to the objects' relative dispersion and featural descriptions. We quantify our analysis by determining bounds on these parameters within which learning is successful.
Conference Paper
To build smart human interfaces, it is necessary for a system to know a user's intention and point of attention. Since a person's head pose and gaze direction are deeply related to his/her intention and attention, detection of such information can be utilized to build natural and intuitive interfaces. We describe our real-time stereo face tracking and gaze detection system to measure head pose and gaze direction simultaneously. The key aspect of our system is the use of real-time stereo vision together with a simple algorithm which is suitable for real-time processing. Since the 3D coordinates of the features on a face can be directly measured in our system, we can significantly simplify the algorithm for 3D model fitting to obtain the full 3D pose of the head compared with conventional systems that use a monocular camera. Consequently we achieved a non-contact, passive, real-time, robust, accurate and compact measurement system for head pose and gaze direction.
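The abstract's point that directly measured 3D feature coordinates simplify model fitting can be illustrated with the standard Kabsch/SVD rigid-alignment step: given the 3D positions of facial features and a stored face model, the head pose follows in closed form. This is a generic sketch, not the paper's implementation; the face-feature model below is hypothetical.

```python
import numpy as np

def rigid_pose(model_pts, observed_pts):
    """Least-squares rigid transform (R, t) mapping 3D model feature
    points onto their stereo-measured positions (Kabsch algorithm)."""
    mc, oc = model_pts.mean(axis=0), observed_pts.mean(axis=0)
    H = (model_pts - mc).T @ (observed_pts - oc)   # cross-covariance
    U, _, Vt = np.linalg.svd(H)
    d = np.sign(np.linalg.det(Vt.T @ U.T))         # guard against reflection
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    t = oc - R @ mc
    return R, t

# hypothetical face-feature model (eye corners, nose tip, mouth corners), metres
model = np.array([[-0.03, 0.03, 0.0], [0.03, 0.03, 0.0],
                  [0.0, 0.0, 0.04], [-0.02, -0.03, 0.0], [0.02, -0.03, 0.0]])

# simulate a head turned 20 degrees (yaw) and 0.6 m from the camera
yaw = np.deg2rad(20.0)
R_true = np.array([[np.cos(yaw), 0.0, np.sin(yaw)],
                   [0.0, 1.0, 0.0],
                   [-np.sin(yaw), 0.0, np.cos(yaw)]])
observed = model @ R_true.T + np.array([0.0, 0.0, 0.6])

R, t = rigid_pose(model, observed)   # recovers the simulated pose
```

With monocular input the same fit would require iterative projection-based optimisation; measured 3D points reduce it to one SVD.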
Conference Paper
We describe and compare two implemented controllers for Adonis, a physically simulated humanoid torso, one based on joint-space torques and the other on convergent force-fields applied to the hands. The two come from different application domains: the former is a common approach in manipulator robotics and graphics, while the latter is inspired by biological limb control. Both avoid explicit inverse kinematic calculations found in standard Cartesian control, trading generality of motion for programming efficiency. The two approaches are compared on a common sequential task, the familiar dance 'Macarena', and evaluated based on ease of generating new behaviors, flexibility, and naturalness of movement; we also compare them against human performance on the same task. Finally, we discuss the tradeoffs and present a more general framework for addressing complex motor control of simulated agents.
Article
Human motion capture is a promising technique for the generation of humanoid robot motions. To convert human motion into humanoid robot motion, we need to relate the humanoid robot kinematics to the kinematics of a human performer. In this paper we propose an automatic approach for scaling of humanoid robot kinematic parameters to the kinematic parameters of a human performer. The kinematic model is constructed directly from the motion capture data without manual measurements. We discuss the use of the resulting kinematic model for the generation of humanoid robot motions based on the observed human motions. The results of the proposed technique on real human motion capture data are presented.
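The idea of building a kinematic model directly from capture data can be sketched as follows: the distance between two markers on the same rigid segment should be constant over time, so its median over a trajectory is a robust segment-length estimate, from which per-segment scale factors to a robot's links follow. This is a simplified illustration; the marker names and robot link length are invented, not the paper's procedure.

```python
import numpy as np

def segment_lengths(marker_traj, chains):
    """Estimate limb segment lengths from motion-capture data.

    marker_traj: dict name -> (T, 3) array of marker positions over time.
    chains: list of (proximal, distal) marker-name pairs.
    The median over frames suppresses measurement noise.
    """
    lengths = {}
    for prox, dist in chains:
        d = np.linalg.norm(marker_traj[dist] - marker_traj[prox], axis=1)
        lengths[(prox, dist)] = float(np.median(d))
    return lengths

def scale_factors(human_lengths, robot_lengths):
    """Per-segment ratios for mapping human joint trajectories onto the
    robot's kinematic chain (robot link lengths are hypothetical)."""
    return {seg: robot_lengths[seg] / human_lengths[seg] for seg in robot_lengths}

# synthetic demonstration: an upper arm of true length 0.30 m swinging,
# with 2 mm measurement noise on the elbow marker
rng = np.random.default_rng(0)
T = 200
theta = np.linspace(0.0, np.pi, T)
shoulder = np.zeros((T, 3))
elbow = 0.30 * np.stack([np.cos(theta), np.sin(theta), np.zeros(T)], axis=1)
elbow = elbow + rng.normal(0.0, 0.002, elbow.shape)

human = segment_lengths({"shoulder": shoulder, "elbow": elbow},
                        [("shoulder", "elbow")])
robot = {("shoulder", "elbow"): 0.25}   # hypothetical robot upper-arm length
scale = scale_factors(human, robot)
```

The recovered length stays close to 0.30 m despite the noise, without any manual measurement of the performer.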
Article
Infants between 12 and 21 days of age can imitate both facial and manual gestures; this behavior cannot be explained in terms of either conditioning or innate releasing mechanisms. Such imitation implies that human neonates can equate their own unseen behaviors with gestures they see others perform.
Article
The functional properties of neurons located in the rostral part of inferior area 6 were studied in awake, partially restrained macaque monkeys. The most interesting property of these neurons was that their firing correlated with specific goal-related motor acts rather than with single movements made by the animal. Using the motor acts as the classification criterion we subdivided the neurons into six classes, four related to distal motor acts and two related to proximal motor acts. The distal classes are: "Grasping-with-the-hand-and-the-mouth neurons", "Grasping-with-the-hand neurons", "Holding neurons" and "Tearing neurons". The proximal classes are: "Reaching neurons" and "Bringing-to-the-mouth-or-to-the-body neurons". The vast majority of the cells belonged to the distal classes. A particularly interesting aspect of distal class neurons was that the discharge of many of them depended on the way in which the hand was shaped during the motor act. Three main groups of neurons were distinguished: "Precision grip neurons", "Finger prehension neurons", "Whole hand prehension neurons". Almost the totality of neurons fired during motor acts performed with either hand. About 50% of the recorded neurons responded to somatosensory stimuli and about 20% to visual stimuli. Visual neurons were more difficult to trigger than the corresponding neurons located in the caudal part of inferior area 6 (area F4). They required motivationally meaningful stimuli and for some of them the size of the stimulus was also critical. In the case of distal neurons there was a relationship between the type of prehension coded by the cells and the size of the stimulus effective in triggering the neurons. It is proposed that the different classes of neurons form a vocabulary of motor acts and that this vocabulary can be assessed by somatosensory and visual stimuli.
Article
The neurotrophin family of survival factors is distinguished by a unique receptor-signaling system that is composed of two transmembrane receptor proteins. Nerve growth factor (NGF), brain-derived neurotrophic factor, neurotrophin-3 (NT-3) and NT-4/5 share similar protein structures and biological functions and interact with two different types of cell-surface proteins, the Trk family of receptor tyrosine kinases, and the p75, or low-affinity neurotrophin receptor. An important question is whether a dual receptor system is necessary for neurotrophin action. Evidence indicates that co-expression of the two genes for the p75 receptor and the Trk NGF receptor can potentially lead to greater responsiveness to NGF, and suggests additional levels of regulation for the family of neurotrophin factors.
Article
Grasping requires coding of the object's intrinsic properties (size and shape), and the transformation of these properties into a pattern of distal (finger and wrist) movements. Computational models address this behavior through the interaction of perceptual and motor schemas. In monkeys, the transformation of an object's intrinsic properties into specific grips takes place in a circuit that is formed by the inferior parietal lobule and the inferior premotor area (area F5). Neurons in both these areas code size, shape and orientation of objects, and specific types of grip that are necessary to grasp them. Grasping movements are coded more globally in the inferior parietal lobule, whereas they are more segmented in area F5. In humans, neuropsychological studies of patients with lesions to the parietal lobule confirm that primitive shape characteristics of an object for grasping are analyzed in the parietal lobe, and also demonstrate that this 'pragmatic' analysis of objects is separated from the 'semantic' analysis performed in the temporal lobe.
Article
Visual and motor properties of single neurons of monkey ventral premotor cortex (area F5) were studied in a behavioral paradigm consisting of four conditions: object grasping in light, object grasping in dark, object fixation, and fixation of a spot of light. The employed objects were six different three-dimensional (3-D) geometric solids. Two main types of neurons were distinguished: motor neurons (n = 25) and visuomotor neurons (n = 24). Motor neurons discharged in association with grasping movements. Most of them (n = 17) discharged selectively during a particular type of grip. Different objects, if grasped in similar way, determined similar neuronal motor responses. Visuomotor neurons also discharged during active movements, but, in addition, they fired also in response to the presentation of 3-D objects. The majority of visuomotor neurons (n = 16) showed selectivity for one or few objects. The response was present both in object grasping in light and in object fixation conditions. Visuomotor neurons that selectively discharged to the presentation of a given object discharged also selectively during grasping of that object. In conclusion, object shape is coded in F5 even when a response to that object is not required. The possible visual or motor nature of this object coding is discussed.
Article
This review investigates two recent developments in artificial intelligence and neural computation: learning from imitation and the development of humanoid robots. It is postulated that the study of imitation learning offers a promising route to gain new insights into mechanisms of perceptual motor control that could ultimately lead to the creation of autonomous humanoid robots. Imitation learning focuses on three important issues: efficient motor learning, the connection between action and perception, and modular motor control in the form of movement primitives. It is reviewed here how research on representations of, and functional connections between, action and perception have contributed to our understanding of motor acts of other beings. The recent discovery that some areas in the primate brain are active during both movement perception and execution has provided a hypothetical neural basis of imitation. Computational approaches to imitation learning are also described, initially from the perspective of traditional AI and robotics, but also from the perspective of neural network models and statistical-learning research. Parallels and differences between biological and computational approaches to imitation are highlighted and an overview of current projects that actually employ imitation learning for humanoid robots is given.
Article
Movement provides the only means we have to interact with both the world and other people. Such interactions can be hard-wired or learned through experience with the environment. Learning allows us to adapt to a changing physical environment as well as to novel conventions developed by society. Here we review motor learning from a computational perspective, exploring the need for motor learning, what is learned and how it is represented, and the mechanisms of learning. We relate these computational issues to empirical studies on motor learning in humans.
Article
Humans demonstrate a remarkable ability to generate accurate and appropriate motor behavior under many different and often uncertain environmental conditions. In this paper, we propose a modular approach to such motor learning and control. We review the behavioral evidence and benefits of modularity, and propose a new architecture based on multiple pairs of inverse (controller) and forward (predictor) models. Within each pair, the inverse and forward models are tightly coupled both during their acquisition, through motor learning, and use, during which the forward models determine the contribution of each inverse model's output to the final motor command. This architecture can simultaneously learn the multiple inverse models necessary for control as well as how to select the inverse models appropriate for a given environment. Finally, we describe specific predictions of the model, which can be tested experimentally.
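The selection rule in this multiple paired inverse/forward model architecture can be sketched as a soft-max over forward-model prediction errors: the forward model that predicts the current sensory consequences best earns the largest "responsibility", and its paired inverse model contributes most to the final motor command. This is a minimal illustration of the idea, not the authors' full model; the numbers are invented.

```python
import numpy as np

def responsibilities(pred_errors, sigma=1.0):
    """Soft-max responsibility of each forward model given its
    prediction error (smaller error -> larger weight)."""
    likelihood = np.exp(-np.asarray(pred_errors) ** 2 / (2.0 * sigma ** 2))
    return likelihood / likelihood.sum()

def blended_command(inverse_cmds, pred_errors, sigma=1.0):
    """Combine each paired inverse model's command, weighted by its
    forward model's responsibility."""
    w = responsibilities(pred_errors, sigma)
    return w @ np.asarray(inverse_cmds), w

# two hypothetical contexts, e.g. manipulating a light vs a heavy object
cmds = [0.2, 0.8]        # torque proposed by each inverse model
errors = [0.05, 0.60]    # each forward model's prediction error
u, w = blended_command(cmds, errors, sigma=0.2)
```

Because the first forward model predicts well, its responsibility dominates and the blended command stays close to its inverse model's proposal; as the environment changes, the weights shift automatically.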
Article
A general theory of movement-pattern perception based on bi-directional theory for sensory-motor integration can be used for motion capture and learning by watching in robotics. We demonstrate our methods using the game of Kendama, executed by the SARCOS Dextrous Slave Arm, which has a very similar kinematic structure to the human arm. Three ingredients have to be integrated for the successful execution of this task. The ingredients are (1) to extract via-points from a human movement trajectory using a forward-inverse relaxation model, (2) to treat via-points as a control variable while reconstructing the desired trajectory from all the via-points, and (3) to modify the via-points for successful execution. In order to test the validity of the via-point representation, we utilized a numerical model of the SARCOS arm, and examined the behavior of the system under several conditions. Copyright 1996 Elsevier Science Ltd.
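The via-point representation can be illustrated with a simpler stand-in for the forward-inverse relaxation model used in the paper: recursively keep the sample that deviates most from a straight line between the current endpoints, then reconstruct the trajectory by interpolating through the retained via-points. This is an assumption-laden sketch of the representation only, not the paper's extraction algorithm.

```python
import numpy as np

def extract_via_points(t, y, tol=0.05):
    """Recursively select via-points so that the piecewise-linear
    reconstruction stays within `tol` of the demonstrated trajectory."""
    idx = {0, len(t) - 1}

    def split(i, j):
        if j <= i + 1:
            return
        # vertical deviation of interior samples from the chord i--j
        chord = y[i] + (y[j] - y[i]) * (t[i + 1:j] - t[i]) / (t[j] - t[i])
        dev = np.abs(y[i + 1:j] - chord)
        k = int(np.argmax(dev))
        if dev[k] > tol:
            m = i + 1 + k
            idx.add(m)        # keep the worst-approximated sample
            split(i, m)
            split(m, j)

    split(0, len(t) - 1)
    return sorted(idx)

def reconstruct(t, y, via_idx, t_query):
    """Rebuild the trajectory by interpolating through the via-points."""
    return np.interp(t_query, t[via_idx], y[via_idx])

t = np.linspace(0.0, 1.0, 101)
y = np.sin(2.0 * np.pi * t)              # demonstrated 1-D trajectory
via = extract_via_points(t, y, tol=0.02)
y_hat = reconstruct(t, y, via, t)
```

A handful of via-points suffices to reproduce the full trajectory within tolerance, which is what makes them usable as control variables for modification and re-execution.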
Conference Paper
We present a robust implementation of stereo-based head tracking designed for interactive environments with uncontrolled lighting. We integrate fast face detection and drift reduction algorithms with a gradient-based stereo rigid motion tracking technique. Our system can automatically segment and track a user's head under large rotation and illumination variations. Precision and usability of our approach are compared with previous tracking methods for cursor control and target selection in both desktop and interactive room environments.
Conference Paper
This paper proposes a developmental approach to social intelligence, especially communication ability, for robots and other artificial systems. Any social being has to have two essential features: naturalistic embodiment, i.e., having a body similar to others; and socio-cultural situatedness, i.e., being able to communicate with others and to participate in the social activity. However, we still have an open question: how does the body become situated in the social environment? Our answer is epigenesis, where (1) we create a humanoid with minimum innate abilities, namely a primordial form of joint attention and indirect experience, then (2) through the attentional and imitative interaction with human caregivers, the humanoid autonomously explores how to interact socially with people. As an epigenetic embodiment, the authors are building an upper-torso humanoid, Infanoid, which is to acquire social situatedness in the human community.
Conference Paper
This paper addresses the problem of estimating human body motion from video. Its main contribution is the introduction of a robust optimization framework that leads to reliable and accurate body tracking and posture recovery. The proposed approach is resistant to occlusions and demonstrates that it is possible to treat different problems arising in human motion analysis in a unified way without using many decision thresholds. The implemented system requires only a standard CCD camera and no special markers on the body. We present experimental results showing the reliability of the implemented tracker.
Conference Paper
Designing a mechanism that will allow a robot to imitate the actions of a human, apart from being interesting for opening the possibilities for efficient social learning through observation and imitation, is challenging since it requires the integration of information from the visual, memory and motor systems. This paper deals with the implementation of an imitation architecture on an active, stereo vision head, and describes our experiments on the deferred imitation of human head movements.
Conference Paper
Learning a complex dynamic robot manoeuvre from a single human demonstration is difficult. This paper explores an approach to learning from demonstration based on learning an optimization criterion from the demonstration and a task model from repeated attempts to perform the task, and using the learned criterion and model to compute an appropriate robot movement. A preliminary version of the approach has been implemented on an anthropomorphic robot arm using a pendulum swing up task as an example.
Conference Paper
We are attempting to introduce a 3-dimensional realistic human-like animate face robot to interactive communication modality. The face robot can recognize human facial expressions as well as produce realistic facial expressions. For the face robot to communicate interactively, we propose a new concept of "Active Human Interface", and we investigate the performance of real-time recognition of facial expressions by neural network (NN) and the expressionability of facial messages on the face robot. We find that the NN recognition of facial expressions and the face robot's performance in generating facial expressions are at almost the same level as in humans. This implies a high potential for the animate face robot to undertake interactive communication with humans.
Conference Paper
A new approach to skill acquisition in assembly is proposed. An assembly skill is represented by a hybrid dynamic system where a discrete event controller models the skill at the task level. The output of the discrete event controller provides the reference commands for the underlying robot controller. This structure is naturally encoded by a hidden Markov model (HMM). The HMM parameters are obtained by training on sensory data from human demonstrations of the skill. Currently, assembly tasks have to be performed by human operators or by robots using expensive fixtures. Our approach transfers the assembly skill from an expert human operator to the robot, thus making it possible for a robot to perform assembly tasks without the use of expensive fixtures.
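The scoring step of such an HMM-based skill recogniser can be sketched with the forward algorithm: each candidate skill's HMM assigns a likelihood to the quantised sensor sequence, and the best-scoring skill is selected. This is a generic sketch with invented model parameters and skill names, not the trained models from the paper.

```python
import numpy as np

def forward_loglik(obs, pi, A, B):
    """Log-likelihood of a discrete observation sequence under an HMM
    (forward algorithm with per-step normalisation for stability)."""
    alpha = pi * B[:, obs[0]]
    loglik = np.log(alpha.sum())
    alpha = alpha / alpha.sum()
    for o in obs[1:]:
        alpha = (alpha @ A) * B[:, o]   # predict, then weight by emission
        loglik += np.log(alpha.sum())
        alpha = alpha / alpha.sum()
    return loglik

def classify(obs, models):
    """Return the skill whose HMM explains the sensor sequence best."""
    return max(models, key=lambda name: forward_loglik(obs, *models[name]))

# two hypothetical skill models over 3 quantised sensor symbols:
# (initial distribution pi, transition matrix A, emission matrix B)
models = {
    "insert": (np.array([0.9, 0.1]),
               np.array([[0.8, 0.2], [0.1, 0.9]]),
               np.array([[0.7, 0.2, 0.1], [0.1, 0.2, 0.7]])),
    "slide":  (np.array([0.5, 0.5]),
               np.array([[0.5, 0.5], [0.5, 0.5]]),
               np.array([[0.1, 0.8, 0.1], [0.1, 0.8, 0.1]])),
}

obs = [0, 0, 2, 2, 2]        # quantised force/position readings
skill = classify(obs, models)
```

In a full system the parameters would come from Baum-Welch training on human demonstration data; here only the recognition step is shown.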
Article
To successfully interact with and learn from humans in cooperative modes, robots need a mechanism for recognizing, characterizing, and emulating human skills. In particular, it is our interest to develop the mechanism for recognizing and emulating simple human actions, i.e., a simple activity in a manual operation where no sensory feedback is available. To this end, we have developed a method to model such actions using a hidden Markov model (HMM) representation. We proposed an approach to address two critical problems in action modeling: classifying human action-intent, and learning human skill, for which we elaborated on the method, procedure, and implementation issues in this paper. This work provides a framework for modeling and learning human actions from observations. The approach can be applied to intelligent recognition of manual actions and high-level programming of control input within a supervisory control paradigm, as well as automatic transfer of human skills to robotic systems.