Article (PDF available)

Abstract and Figures

We have developed a robot called "Robovie" that has unique mechanisms designed for communication with humans. Robovie can generate human-like behaviors by using its human-like actuators and its vision and audio sensors. In this development, the software is the key. Through research from the viewpoint of cognitive science, we have obtained two important ideas about human-robot communication: one is the importance of physical expressions using the body, and the other is the effectiveness of the robot's autonomy for humans' recognition of the robot's utterances. Based on these psychological experiments, we have developed a new architecture that generates episode chains in interactions with humans. The basic structure of the architecture is a network of situated modules. Each module consists of elemental behaviors that entrain humans and a behavior for communicating with humans.
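To make the architecture description concrete, here is a minimal sketch of one way a situated module and an episode chain could be organized; the names (SituatedModule, run_episode) and the chaining rule are illustrative assumptions, not the paper's actual implementation.

```python
# Minimal sketch of a situated module and an episode chain, assuming
# illustrative names; this is only one way to organize the idea, not the
# paper's implementation.
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class SituatedModule:
    """One unit of interactive behavior in the network."""
    name: str
    entrainment: List[Callable[[], None]]   # body movements that draw the human in
    communicate: Callable[[], None]         # the communicative behavior itself
    next_on_success: str = ""               # module to chain to if the human reacts
    next_on_failure: str = ""               # fallback module otherwise

    def run(self, human_reacted: Callable[[], bool]) -> str:
        for behavior in self.entrainment:   # e.g. turn toward the human, make eye contact
            behavior()
        self.communicate()                  # e.g. speak an utterance, offer a handshake
        return self.next_on_success if human_reacted() else self.next_on_failure


def run_episode(modules: Dict[str, SituatedModule], start: str,
                human_reacted: Callable[[], bool]) -> List[str]:
    """Chain situated modules until no successor is defined; the visited
    sequence of module names forms the episode."""
    episode, current = [], start
    while current:
        episode.append(current)
        current = modules[current].run(human_reacted)
    return episode
```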
... The video recordings were made using the humanoid robot Robovie2, which was designed and constructed by the Advanced Telecommunications Research Institute (ATR) in Kyoto, Japan (Ishiguro et al. 2001; Kanda et al. 2002). Robovie2 was selected for its anthropomorphic yet mechanical design, which positions it in the middle of the mechanical-humanoid continuum (MacDorman 2006; Manzi et al. 2020c; Phillips et al. 2018). ...
Article
The study aims to investigate the effect of cultural background between Italy and Japan on 5-year-old children's moral judgements and emotion attributions towards a human and a robot. The children watched videos drawn from classic 'happy victimizer' stories, in which the transgressors were either a child or a robot, violating rules against stealing or not sharing. We assessed the children's attribution of emotions both to the transgressors and to themselves as victimizers (i.e. first-person perspective), as well as the moral judgement of the violations. The results showed that children from both cultures do not significantly discriminate between human and robot in their moral judgement and emotion attributions. Concerning moral emotions, Italian children tend to attribute fewer negative emotions to the transgressor than Japanese children, especially in the not-sharing scenario. Furthermore, adopting a first-person perspective to evaluate moral transgressions reduces cultural differences in emotion attributions. The study highlights how culture, rather than the transgressor's agency (human or robot), influences early moral reasoning.
... One of the earliest and widely used rule-based interaction managers was Kismet [13], which responded to social cues using predefined rules. Similarly, the Robovie robot series [14,15] leveraged rule-based architectures to manage interactions, utilizing episode chains to dynamically adapt behaviors based on human responses. While effective for structured interactions, these approaches were rigid and struggled with off-script scenarios. ...
Preprint
Recent advances in large language models (LLMs) have demonstrated their potential as planners in human-robot collaboration (HRC) scenarios, offering a promising alternative to traditional planning methods. LLMs, which can generate structured plans by reasoning over natural language inputs, have the ability to generalize across diverse tasks and adapt to human instructions. This paper investigates the potential of LLMs to facilitate planning in the context of human-robot collaborative tasks, with a focus on their ability to reason from high-level, vague human inputs and to fine-tune plans based on real-time feedback. We propose a novel hybrid framework that combines LLMs with human feedback to create dynamic, context-aware task plans. Our work also highlights how a single, concise prompt can be used for a wide range of tasks and environments, overcoming the limitations of the long, detailed structured prompts typically used in prior studies. By integrating user preferences into the planning loop, we ensure that the generated plans are not only effective but also aligned with human intentions.
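As a rough illustration of the kind of planning loop this abstract describes, the following sketch assumes a generic call_llm(prompt) interface and a get_feedback(plan) hook; the prompt wording and all names are assumptions, not the paper's code.

```python
# Sketch of an LLM planning loop refined by human feedback, under the
# assumption of a generic call_llm(prompt) -> str interface and a
# get_feedback(plan) -> str hook; names and prompts are illustrative only.
from typing import Callable, List


def plan_with_feedback(task: str,
                       call_llm: Callable[[str], str],
                       get_feedback: Callable[[List[str]], str],
                       max_rounds: int = 3) -> List[str]:
    """Generate a step list from a vague instruction, then refine it with feedback."""
    prompt = f"Break the following task into short, ordered robot actions:\n{task}"
    plan = call_llm(prompt).splitlines()
    for _ in range(max_rounds):
        feedback = get_feedback(plan)        # e.g. "hand me the cup before wiping"
        if not feedback:                     # empty feedback means the plan is accepted
            break
        prompt = (f"Task: {task}\nCurrent plan:\n" + "\n".join(plan)
                  + f"\nRevise the plan to satisfy this feedback: {feedback}")
        plan = call_llm(prompt).splitlines()
    return plan
```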
... Several robots, initially intended for high-performance dynamic locomotion, have been used to perform expressive motions [24], particularly in the context of dancing [3,1]. On the other hand, various robots have been purposefully designed for social interaction with humans [18,20]. Notably, humanoids such as iCub [27], NAO [9], and Pepper [31], have been widely used in the research community, serving both as platforms for motion control and human-robot interaction (HRI) research. ...
Preprint
Legged robots have achieved impressive feats in dynamic locomotion in challenging unstructured terrain. However, in entertainment applications, the design and control of these robots face additional challenges in appealing to human audiences. This work aims to unify expressive, artist-directed motions and robust dynamic mobility for legged robots. To this end, we introduce a new bipedal robot, designed with a focus on character-driven mechanical features. We present a reinforcement learning-based control architecture to robustly execute artistic motions conditioned on command signals. During runtime, these command signals are generated by an animation engine which composes and blends between multiple animation sources. Finally, an intuitive operator interface enables real-time show performances with the robot. The complete system results in a believable robotic character, and paves the way for enhanced human-robot engagement in various contexts, in entertainment robotics and beyond.
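For intuition about what composing and blending between multiple animation sources can mean at the command-signal level, here is a minimal sketch of a linear joint-space blend; the blending rule is assumed purely for illustration and is not necessarily how the paper's animation engine composes clips.

```python
# Minimal sketch of blending per-joint command signals from several animation
# sources; a linear joint-space blend is an assumption for illustration only.
import numpy as np


def blend_animations(frames: list, weights: list) -> np.ndarray:
    """Convex combination of joint-target vectors from several animation sources."""
    w = np.asarray(weights, dtype=float)
    w = w / w.sum()                          # normalize so targets stay in range
    stacked = np.stack(frames)               # shape: (num_sources, num_joints)
    return w @ stacked                       # weighted blend per joint


# Example: crossfade from an idle clip to a wave clip at blend factor 0.3.
idle = np.zeros(12)                          # 12 joint targets, all neutral
wave = np.full(12, 0.5)                      # exaggerated pose from the wave clip
command = blend_animations([idle, wave], [0.7, 0.3])
```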
... In recent years, with advancements in artificial intelligence and human-computer interaction technology, automatic facial expression analysis has become a crucial research tool in fields such as clinical psychology, psychiatry, and cognitive science. It has demonstrated promising results on specific test databases and holds significant commercial prospects in various domains, including human-computer interaction [2,7], virtual reality [26,29], augmented reality [1], smart driving [20], and depression recognition [30]. Companies like Affectiva and Kairos provide real-time assessment and prediction services, such as intelligent advertising and safe driving, by analyzing facial expressions along with other human behaviors such as language, gaze, body movements, and responses in human-computer interaction. ...
Article
Full-text available
In Relevance: Communication and Cognition, we outline a new approach to the study of human communication, one based on a general view of human cognition. Attention and thought processes, we argue, automatically turn toward information that seems relevant: that is, capable of yielding cognitive effects – the more, and the more economically, the greater the relevance. We analyse both the nature of cognitive effects and the inferential processes by which they are derived. Communication can be achieved by two different means: by encoding and decoding messages or by providing evidence for an intended inference about the communicator's informative intention. Verbal communication, we argue, exploits both types of process. The linguistic meaning of an utterance, recovered by specialised decoding processes, serves as the input to unspecialised central inferential processes by which the speaker's intentions are recognised. Fundamental to our account of inferential communication is the fact that to communicate is to claim someone's attention, and hence to imply that the information communicated is relevant. We call this idea, that communicated information comes with a guarantee of relevance, the principle of relevance. We show that every utterance has at most a single interpretation consistent with the principle of relevance, which is thus enough on its own to account for the interaction of linguistic meaning with contextual factors in disambiguation, reference assignment, the recovery of implicatures, the interpretation of metaphor and irony, the recovery of illocutionary force, and other linguistically underdetermined aspects of utterance interpretation.
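The core relation stated in prose above, that relevance grows with cognitive effects and shrinks with processing effort, can be caricatured in a few lines; the effects-to-effort ratio below is only an illustrative stand-in, not the authors' formalization.

```python
# Caricature of the relevance relation: relevance grows with cognitive effects
# and shrinks with processing effort. The ratio is an illustrative stand-in.
from dataclasses import dataclass
from typing import List


@dataclass
class Interpretation:
    gloss: str
    cognitive_effects: float    # how many worthwhile inferences it licenses
    processing_effort: float    # how costly it is to derive


def most_relevant(interpretations: List[Interpretation]) -> Interpretation:
    """Select the interpretation with the best effects-to-effort balance."""
    return max(interpretations,
               key=lambda i: i.cognitive_effects / i.processing_effort)
```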
Conference Paper
Full-text available
Proposes a robot architecture that enables a robot to be developed progressively. The architecture, which consists of situated modules, has the merits of both traditional function-based and behavior-based architectures, in addition to this merit for development. We have developed a robot based on the architecture. By reporting the development process, the paper discusses the advantages of the proposed architecture.
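As a hypothetical illustration of the "progressive development" claim, a new situated module could be spliced into an existing network without editing the modules already there; this builds on the SituatedModule sketch given earlier, and all names remain assumptions rather than the paper's implementation.

```python
# Hypothetical illustration of progressive development: a new situated module
# is spliced into the existing network without editing existing modules.
# Builds on the SituatedModule sketch given earlier; names are assumptions.
def add_module(modules: dict, module: "SituatedModule",
               after: str = "", on: str = "success") -> None:
    """Register a module and splice it after an existing one in the chain."""
    modules[module.name] = module
    if not after:
        return
    prev = modules[after]
    if on == "success":
        module.next_on_success = prev.next_on_success
        prev.next_on_success = module.name
    else:
        module.next_on_failure = prev.next_on_failure
        prev.next_on_failure = module.name
```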
Article
The "new approach" I mention in my title is a synthesis of approaches that seem, superficially, in conflict: Saussure's conception of language as a dynamic process on the individual level. This conflict can be resolved by introducing the dimension of time. The Saussurian and Vygotskyian conceptions of language have validity at different points in the temporal development of the linguistic acts of speaking or understanding. The time in question is not the time it takes to say or listen to speech, but internal, developmental time. If speaking time is called "surface time," then the synthesis of the Saussurian and Vygotskyian approaches takes place in "deep time." The theme unifying this book is an extended analysis of the processes taking place in "deep time." (PsycINFO Database Record (c) 2012 APA, all rights reserved)
Article
As infants approach their 1st birthday, they begin to display an increased interest in various external objects and events during interactions with their caregivers. Previously established dyadic (infant–other) interactional structures are gradually transformed into a triadic (infant–object–other) social system. The present volume was developed to provide the reader with an overview of the rapidly growing literature concerned with the origins of these triadic joint attentional episodes and their potential role in early social, cognitive, and emotional development. The volume has been designed to occupy an important niche in social development libraries that currently exists between texts concerned primarily with early infant–caregiver dyadic interactions . . . and more recent texts concerned with the preschool child's emerging "theory of mind." (PsycINFO Database Record (c) 2012 APA, all rights reserved)
Article
The purpose of this paper is to construct a methodology for smooth communications between humans and robots. Here, focus is on a mindreading mechanism, which is indispensable in human-human communications. We propose a model of utterance understanding based on this mechanism. Concretely speaking, we apply the model of a mindreading system (Baron-Cohen 1996) to a model of human-robot communications. Moreover, we implement a robot interface system that applies our proposed model. Psychological experiments were carried out to explore the validity of the following hypothesis: by reading a robot's mind, a human can estimate the robot's intention with ease, and, moreover, the person can even understand the robot's unclear utterances made by synthesized speech sounds. The results of the experiments statistically supported our hypothesis.
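A toy version of this hypothesis, that an estimate of the robot's intention helps a listener disambiguate an unclear utterance, might rescore candidate interpretations by how well they fit the inferred intention; the scoring scheme and example data below are assumptions, not the paper's model.

```python
# Toy version of the hypothesis: candidate interpretations of an unclear
# utterance are rescored by compatibility with the inferred intention.
# Scoring scheme and example data are assumptions, not the paper's model.
def disambiguate(acoustic_scores: dict, intention_fit: dict) -> str:
    """Pick the candidate maximizing acoustic score times intention compatibility."""
    return max(acoustic_scores,
               key=lambda c: acoustic_scores[c] * intention_fit.get(c, 0.0))


# Example: the synthesized speech is acoustically ambiguous, but the robot's
# gaze toward the door suggests an intention to go outside.
acoustic = {"let's go outside": 0.48, "let's eat outright": 0.52}
fit = {"let's go outside": 0.9, "let's eat outright": 0.1}
print(disambiguate(acoustic, fit))           # prints "let's go outside"
```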
Fig. 7: All behavior modules and their relationships
Fig. 8: Interactions between a human and Robovie