Article

Language games for autonomous robots

Authors:
Luc Steels

Abstract

Integration and grounding are key AI challenges for human-robot dialogue. The author and his team are tackling these issues using language games and have experimented with them on progressively more complex platforms. A language game is a sequence of verbal interactions between two agents situated in a specific environment. Language games both integrate the various activities required for dialogue and ground unknown words or phrases in a specific context, which helps constrain possible meanings.
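The interaction pattern described in the abstract can be sketched in a few lines of code. The following is a minimal, hypothetical illustration of a single naming-game-style round; the `Agent` class, its dict-based lexicon, and the word-invention scheme are assumptions of this sketch, not the paper's implementation.

```python
import random

class Agent:
    """Toy language-game agent: the lexicon maps meanings to words."""
    def __init__(self, rng):
        self.rng = rng
        self.lexicon = {}  # meaning -> word

    def name(self, meaning):
        # A speaker invents a new word for a meaning it cannot name yet.
        if meaning not in self.lexicon:
            self.lexicon[meaning] = f"w{self.rng.randrange(10_000):04d}"
        return self.lexicon[meaning]

def play_round(speaker, hearer, meaning):
    """One verbal interaction grounded in a shared context: the round
    succeeds if the hearer associates the same word with the topic;
    otherwise the hearer adopts the speaker's word."""
    word = speaker.name(meaning)
    success = hearer.lexicon.get(meaning) == word
    if not success:
        hearer.lexicon[meaning] = word
    return success
```

A failed round teaches the hearer the speaker's word, so repeating the game on the same topic succeeds; iterated across a population, such rounds push the vocabulary toward consensus.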


... For the interaction to be successful, the robot first has to be put in the right conditions by specific keywords. Reproduced from [Steels, 2001]. [Caption residue from the citing thesis: Table 4.1, after [Correll and Martinoli, 2007]; Fig. 5.2, mean (dot) and standard deviation (bars) of the number of aggregates in vanilla/NG aggregation (left/middle) and of words (right) in stabilised swarms of 20/100 robots.] ...
... BCTs can also appear by self-organisation, as shown by the category game. The category game is a variation of the guessing game in which only the speaker knows the topic and the hearer has to guess the topic amongst various objects [Steels, 2001]. The guessing game is relatively similar to the imitation game except that, in this instance, the topic is exterior to the agents and the second agent, rather than imitating the first, has to point at the topic it believes to be correct. ...
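A guessing-game round of the kind described above might be sketched as follows; the dict-based lexicons, the feedback rule, and the word-invention scheme are illustrative assumptions, not the original implementation.

```python
import random

def guessing_game(speaker_lex, hearer_lex, topic, rng):
    """One guessing-game round: only the speaker knows `topic`.
    The hearer 'points' at whatever meaning its own lexicon maps the
    heard word back to; on failure it adopts the word after feedback.
    Lexicons are plain dicts mapping meaning -> word."""
    if topic not in speaker_lex:
        # The speaker invents a word for an unnamed topic.
        speaker_lex[topic] = f"w{rng.randrange(10_000):04d}"
    word = speaker_lex[topic]
    inverse = {w: m for m, w in hearer_lex.items()}
    guess = inverse.get(word)      # the hearer's pointing act
    success = guess == topic
    if not success:
        hearer_lex[topic] = word   # corrective feedback grounds the word
    return success
```

As in the imitation game, the interaction either confirms a shared association or repairs the hearer's lexicon for the next round.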
... For the interaction to be successful, the robot first has to be put in the right conditions by specific keywords. Reproduced from [Steels, 2001] c 2001 IEEE. ...
Thesis
Automatically-controlled artificial systems have recently been used with great success in numerous settings, including environmental monitoring and exploration. In such cases, the use of multiple robots could increase efficiency, provided that their communication and organisation strategies are robust, flexible, and scalable. These qualities can be ensured through decentralisation, redundancy (many/all robots perform the same task), local interaction, and simple rules, as is the case in swarm robotics. One of the key components of swarm robotics is local interaction or communication. The latter has, so far, only been used for relatively simple tasks such as signalling a robot's preference or state. However, communication has more potential, because the emergence of meaning, as it exists in human language, could allow robot swarms to tackle novel situations in ways that may not be a priori obvious to the experimenter. This is a necessary feature for swarms that are fully autonomous, especially in unknown environments. In this thesis, we propose a framework for the emergence of meaningful communication in swarm robotics, using language games as a communication protocol and probabilistic aggregation as a case study. Probabilistic aggregation can be a prerequisite to many other swarm behaviours but, unfortunately, it is extremely sensitive to experimental conditions, and thus requires specific parameter tuning for any setting, such as population size or density. With our framework, we show that the concurrent execution of the naming game and of probabilistic aggregation leads, in certain conditions, to a new clustering and labelling behaviour that is controllable via the parameters of the aggregation controller. Pushing this interplay forward, we demonstrate that the social dynamics of the naming game can select efficient aggregation parameters through environmental pressure.
This creates resilient controllers as the aggregation behaviour is dynamically evolved online according to the current environmental setting.
... Thus, the representation of the meanings of these first words, especially verbs and nouns, is intrinsically defined in terms of the action repertoire of children, and the way they perceive visually, haptically, auditorily or spatially the objects around them. From this perspective, robotic models of bodies and their physical perception and action on the environment are a prerequisite for modeling the acquisition of the meaning of these early words, such as shown in models of learning the names of shapes and colors (Steels, 2001;Roy, 2005), simple manipulative actions (Cangelosi et al., 2010) or spatial relationships among objects (Spranger & Steels, 2012). ...
... Language Games and horizontal transmission. Language Games are a set of computational models in which a population of individuals have pairwise interactions while trying to build or learn a common communication system (Steels, 2001; Loreto, Baronchelli, Mukherjee, Puglisi, & Tria, 2011). For each interaction of a typical Language Game, the corresponding pair of simulated agents, randomly selected from the population, are assigned roles: one is the speaker, uttering a word to refer to a selected meaning or scene, and the other is the hearer, trying to guess what the speaker was referring to. ...
... This model also fits real data, with the average number of color categories produced by the model matching what is observed in the World Color Survey (Kay et al., 2009). Language Games have been used to model many other parts of language, including spatial representation (Spranger, 2012) and grammatical structures (Van Trijp, 2012), and many times the simulated agents are made to interact using real robotic bodies (Spranger, 2012; Steels, 2001). ...
Preprint
Full-text available
We review computational and robotics models of early language learning and development. We first explain why and how these models are used to understand better how children learn language. We argue that they provide concrete theories of language learning as a complex dynamic system, complementing traditional methods in psychology and linguistics. We review different modeling formalisms, grounded in techniques from machine learning and artificial intelligence such as Bayesian and neural network approaches. We then discuss their role in understanding several key mechanisms of language development: cross-situational statistical learning, embodiment, situated social interaction, intrinsically motivated learning, and cultural evolution. We conclude by discussing future challenges for research, including modeling of large-scale empirical data about language acquisition in real-world environments. Keywords: Early language learning, Computational and robotic models, machine learning, development, embodiment, social interaction, intrinsic motivation, self-organization, dynamical systems, complexity.
... Phenomenally conscious machine. Table 2 shows that the computational models for conscious machines falling into CAS1, CAS2, CAS3 and CAS4 include Cog [4], [5], CRONOS [13], Cyberchild [15], Khepera [13], [14], global workspace models [6], agent-based conscious architecture [7]-[9], Haikonen's [10], Schema-based model [11], Cicerobot [12], CogAff Schema [16], LIDA [17] and CODAM [18]. ...
Chapter
Full-text available
The major aim of artificial general intelligence (AGI) is to allow a machine to perform general intelligence tasks in the same way as its human counterparts. Hypothetically, this general intelligence in a machine can be achieved by establishing cross-domain optimization and learning approaches. However, contemporary artificial intelligence (AI) capabilities are limited to narrow, specific domains utilizing machine learning. The concept of consciousness is a particularly interesting route to such approaches, because consciousness simultaneously encodes and processes all types of information and seamlessly integrates them. Over the last several years, there has been a resurgence of interest in testing theories of consciousness using computer models. The studies of these models are classified into four categories: external behavior associated with consciousness, cognitive characteristics associated with consciousness, a computational architecture that correlates with human consciousness, and phenomenally conscious machines. The critical challenge is to determine whether these artificial systems are capable of conscious states, by providing a measurement of the extent to which the systems succeed in realizing consciousness in a machine. Several tests for machine consciousness have been proposed, yet their formulation is based on extrinsic measurement of consciousness. Extrinsic measurement is not inclusive, because many conscious artificial systems behave implicitly. This research proposes a new framework to test machine consciousness based on intrinsic measurement, the so-called Pak Pandir test. The framework leverages three quantum double-slit settings and information integration theory as the consciousness definition of choice.
... Artificial intelligence research has also emphasized the importance of learning to communicate through interaction for developing agents that can coordinate with other, possibly human agents in a goal-directed and intelligent way [33]. It has been shown that by playing communication games, artificial (robotic) agents can self-organize symbolic systems that are grounded in sensorimotor interactions with the world and other agents [34][35][36][37]. For example, in a case study with color stimuli, simulated agents established color categories and labels by playing a (perceptual) discrimination game, paired with a color reference game [36]. ...
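The perceptual discrimination game mentioned in this excerpt can be illustrated with a toy one-dimensional version. Representing color categories as intervals on a single 'hue' axis, and the simple refinement rule, are assumptions of this sketch; the cited work operates on full perceptual color spaces.

```python
def discrimination_game(categories, topic, context):
    """Toy discrimination game on a 1-D 'hue' axis: the agent succeeds
    if some category (an interval) contains the topic stimulus and no
    context stimulus. On failure, a new category is created around the
    topic, gradually refining the repertoire."""
    for lo, hi in categories:
        if lo <= topic < hi and not any(lo <= c < hi for c in context):
            return True
    categories.append((topic - 0.05, topic + 0.05))  # refine repertoire
    return False
```

Paired with a reference game that attaches labels to successfully discriminating categories, this is the kind of loop through which color terms can self-organize.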
... In our data set, there are 64 different object classes. The first sixteen classes comprise red objects (classes 1-16), followed by yellow objects (classes 17-32), turquoise objects (classes 33-48), and purple objects (classes 49-64). For example, if the input image belongs to class 2 ("tiny red cylinder"), the usual target label, y 0 , is a one-hot vector where the entire weight lies on the true class index. ...
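The one-hot target described in this excerpt is straightforward to write out; the function name and the plain-list representation are assumptions of this illustration.

```python
def one_hot(true_class, n_classes=64):
    """Target label y0: all weight on the true class index.
    Classes are numbered from 1 in the excerpt, so class 2 -> index 1."""
    y = [0.0] * n_classes
    y[true_class - 1] = 1.0
    return y

y0 = one_hot(2)  # class 2: "tiny red cylinder"
```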
Article
Full-text available
Language interfaces with many other cognitive domains. This paper explores how interactions at these interfaces can be studied with deep learning methods, focusing on the relation between language emergence and visual perception. To model the emergence of language, a sender and a receiver agent are trained on a reference game. The agents are implemented as deep neural networks, with dedicated vision and language modules. Motivated by the mutual influence between language and perception in cognition, we apply systematic manipulations to the agents' (i) visual representations, to analyze the effects on emergent communication, and (ii) communication protocols, to analyze the effects on visual representations. Our analyses show that perceptual biases shape semantic categorization and communicative content. Conversely, if the communication protocol partitions object space along certain attributes, agents learn to represent visual information about these attributes more accurately, and the representations of communication partners align. Finally, an evolutionary analysis suggests that visual representations may be shaped in part to facilitate the communication of environmentally relevant distinctions. Aside from accounting for co-adaptation effects between language and perception, our results point out ways to modulate and improve visual representation learning and emergent communication in artificial agents.
... As for the interactional dimension of the frame, two cognitive levels are at play: the off-line cognitive level, in which the interaction is triggered linguistically as a discourse category stored in memory (PEDIDO_DE_INFORMAÇÃO, CUMPRIMENTO, JULGAMENTO, etc.); and the on-line cognitive level, in which the interaction is a "language game" (WITTGENSTEIN, 1953; STEELS, 2001; DUQUE, 2018) in execution. Building on the studies of Vereza (2007, 2010, 2013 and 2016) on situated metaphors (at the on-line cognitive level) and conceptual metaphors (at the off-line cognitive level), in association with the notion of frames adopted here, we seek to develop analyses that address the place and role of frames (source and target) in investigations of discourse and cognition. ...
... interactional frames are cognitive structures that emerge from the routinization of certain forms of social interaction. One must distinguish the interactional frame, an off-line structure indexed linguistically, such as CUMPRIMENTO, NOTÍCIA, PEDIDO_DE_INFORMAÇÃO, EXPLICAÇÃO, etc., from the "language game" (WITTGENSTEIN, 1953; STEELS, 2001; DUQUE, 2018), the execution of tasks by at least two agents aiming to reach a goal they share. ...
Article
Full-text available
In this article, we aim to describe some discursive mechanisms employed in the framing of an unfavourable economic situation. To this end, we show, through text excerpts, that the lexical choice of terms such as "crise" (crisis), "arrocho" (squeeze), and "recessão" (recession), as well as metaphorical mappings, constrain how the economic situation can be conceived. As for the theoretical bases underpinning this work, we centre our discussion on the notions of frame (DUQUE, 2015, 2017); conceptual metaphor (LAKOFF; JOHNSON, 1980 [2002]); and situated metaphor and metaphorical niche (VEREZA, 2007, 2010, 2013, 2016). Regarding methodology, we adopt frame analysis (DUQUE, 2015) as our analytical tool and use, as corpus, the online versions of news items, reports and interviews about the economy. Finally, we seek to demonstrate that mechanisms such as frames and metaphors have, beyond their cognitive nature, a potentially discursive character.
... In extension of the results presented recently by Breazeal and colleagues [2], our goal was to engage children in longer communicative exchanges. The design of the dialogue is motivated by language games as suggested by Steels [7]. These consist of structured interaction protocols between the dialogue partners [7]. ...
... The design of the dialogue is motivated by language games as suggested by Steels [7]. These consist of structured interaction protocols between the dialogue partners [7]. In contrast to previous studies that tested learning performance in children, the focus of our provided dialogue was to elicit interaction behavior in children. ...
Conference Paper
In this paper, we present an empirical study with children aged 4 and 5 years to reveal whether they engage in an interaction with a robot. For our analysis, we developed a score assessing the interaction level, consisting of the child's emotional involvement, engagement, and independence in the interaction. For the interaction, an autonomous system was equipped with a designed dialogic structure: a repertoire of (pre-recorded) interaction protocols that a robot can apply in an interaction. The results from the evaluation suggest that the implementation was successful, as most of the studied children engaged in the interaction with the robot. The results also reveal some gender differences. The long-term aim of the study is to develop an autonomous system that can be applied in interaction with young children.
... Regarding robotics applied to language learning, there is work on educational robotics aimed at conceiving, designing and developing robots that support the teaching/learning process (Sykes, 2009), (Sykes, 2013). In the specific field of language didactics, some educational robots have been designed, such as the work of Steels (Steels, 2001), (Steels, 2012), which combines artificial intelligence and language games through sequences of verbal interaction between two agents situated in an environment. In fact, most applications for language learning with robots have been conceived for implementation on the Nao robot (Ishida et al., 2016), (Raessens, 2006). ...
Article
Full-text available
This article presents the theoretical and conceptual formulation of ludo-educational robotics from a multidisciplinary perspective. Its main contribution is the definition of the concepts of robotics, robot and ludo-educational application through a deliberate balance between robotics, game sciences, and the didactics of languages and cultures. A practical example developed for learning purposes is also presented, and the various strands and contributions of the proposed discipline are discussed, drawing on each discipline's perspective on play. The practical case of a ludo-educational robot is used for the playful learning of the parts of the body in French.
... Early language game implementations (Steels, 1995; 2001) achieve communication convergence by using contrastive methods to update association tables between object referents and utterances. While recent works use deep learning methods to target high-dimensional signals, they do not explore contrastive approaches. ...
Preprint
Full-text available
The framework of Language Games studies the emergence of languages in populations of agents. Recent contributions relying on deep learning methods focused on agents communicating via an idealized communication channel, where utterances produced by a speaker are directly perceived by a listener. This comes in contrast with human communication, which instead relies on a sensory-motor channel, where motor commands produced by the speaker (e.g. vocal or gestural articulators) result in sensory effects perceived by the listener (e.g. audio or visual). Here, we investigate whether agents can evolve a shared language when they are equipped with a continuous sensory-motor system to produce and perceive signs, e.g. drawings. To this end, we introduce the Graphical Referential Game (GREG), where a speaker must produce a graphical utterance to name a visual referent object consisting of combinations of MNIST digits, while a listener has to select the corresponding object among distractor referents, given the produced message. The utterances are drawing images produced using dynamical motor primitives combined with a sketching library. To tackle GREG, we present CURVES: a multimodal contrastive deep learning mechanism that represents the energy (alignment) between named referents and utterances generated through gradient ascent on the learned energy landscape. We then present a set of experiments and metrics based on a systematic compositional dataset to evaluate the resulting language. We show that our method allows the emergence of a shared, graphical language with compositional properties.
... The joint importance of search and coordination, as we have argued, is pervasive in organizational adaptation. Even processes that can be modeled and created in the laboratory as pure coordination dynamics, such as the emergence of communication conventions (e.g., Steels 2001, Centola and Baronchelli 2015, Spike et al. 2017), can involve an element of search when created against a background of preexisting conventions (e.g., Fay et al. 2008, Guilbeault et al. 2021) and create frictions because of the need for unlearning (Koçak and Puranam 2022). Conversely, processes of search that do not require any coordination when carried out outside organizations may require coordination when carried out by agents in organizations whose choices are interdependent because of complementarities or common constraints such as budgets. ...
Article
Organizations increasingly need to adapt to challenges in which search and coordination cannot be decoupled. In response, many have experimented with “agile” and “flat” designs that dismantle traditional forms of hierarchy to harness the distributed knowledge of specialized individuals. Despite the popularity of such practices, there is considerable variation in their implementation as well as conceptual ambiguity about the underlying premise. Does effective rapid experimentation necessarily imply the repudiation of hierarchical structures of influence? We use computational models of multiagent reinforcement learning to study the effectiveness of coordinated search in groups that vary in how they influence each other’s beliefs. We compare the behavior of flat and hierarchical teams with a baseline structure without any influence on beliefs (a “crowd”) when all three are placed in the same task environments. We find that influence on beliefs—whether it is hierarchical or not—makes it less likely that agents stabilize prematurely around their own experiences. However, flat teams can engage in excessive exploration, finding it difficult to converge on good alternatives, whereas hierarchical influence on beliefs reduces simultaneous uncoordinated exploration, introducing a degree of rapid exploitation. As a result, teams that need to achieve agility (i.e., rapid satisfactory results) in environments that require coordinated search may benefit from a hierarchical structure of influence—even when the apex actor has no superior knowledge, foresight, or capacity to control subordinates’ actions.
... A popular approach to the study of language dynamics is represented by language games played by a population of agents/robots, with the purpose of mimicking real-world linguistic interactions leading to the emergence of a structured language. Various kinds of language games have been proposed to date, from imitation games (Billard & Hayes, 1997) to guessing games (Steels, 2001) and category games (Puglisi et al., 2008; Baronchelli et al., 2010). One game in particular has received a lot of attention: the naming game (Steels, 1995; 2003). ...
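The naming game's popularity owes much to its simplicity. The following sketch of a minimal naming game for a single object shows a population converging on one shared word; the population size, round limit, and consensus check are assumptions of this sketch, not any particular paper's setup.

```python
import random

def minimal_naming_game(n_agents=20, max_rounds=50_000, seed=0):
    """Minimal naming game for a single object: on success both agents
    collapse their word inventories to the winning word; on failure the
    hearer adds the word. Returns the number of rounds to consensus,
    or None if the round limit is hit first."""
    rng = random.Random(seed)
    inventories = [set() for _ in range(n_agents)]
    for t in range(max_rounds):
        speaker, hearer = rng.sample(range(n_agents), 2)
        if not inventories[speaker]:
            inventories[speaker].add(f"w{t}")       # invent a new word
        word = rng.choice(sorted(inventories[speaker]))
        if word in inventories[hearer]:             # success: both align
            inventories[speaker] = {word}
            inventories[hearer] = {word}
        else:                                       # failure: hearer adopts
            inventories[hearer].add(word)
        if all(inv == inventories[0] for inv in inventories):
            return t + 1
    return None
```

Despite competing invented words early on, the align-on-success rule reliably drives the population to a single shared name.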
Article
Full-text available
In this study, we investigate the emergence of naming conventions within a swarm of robots that collectively forage, that is, collect resources from multiple sources in the environment. While foraging, the swarm explores the environment and makes a collective decision on how to exploit the available resources, either by selecting a single source or concurrently exploiting more than one. At the same time, the robots locally exchange messages in order to agree on how to name each source. Here, we study the correlation between the task-induced interaction network and the emergent naming conventions. In particular, our goal is to determine whether the dynamics of the interaction network are sufficient to determine an emergent vocabulary that is potentially useful to the robot swarm. To be useful, linguistic conventions need to be compact and meaningful, that is, to be the minimal description of the relevant features of the environment and of the made collective decision. We show that, in order to obtain a useful vocabulary, the task-dependent interaction network alone is not sufficient, but it must be combined with a correlation between language and foraging dynamics. On the basis of these results, we propose a decentralised algorithm for collective categorisation which enables the swarm to achieve a useful—compact and meaningful—naming of all the available sources. Understanding how useful linguistic conventions emerge contributes to the design of robot swarms with potentially improved autonomy, flexibility, and self-awareness.
... objects' names) using mutual information criteria. Other robotic applications were developed subsequently, such as the robotic language games by Steels et al. (1995; 2001), used to teach robots the meaning of words in a simple static world, or Needham et al. (2005), to teach artificial agents table-top games. Further, Steels (2002), Spranger (2015), and Bleys (2015) designed systems capable of learning objects' names and spatial relations by interacting with human or robot teachers. ...
Article
Full-text available
We present a cognitively plausible novel framework capable of learning the grounding in visual semantics and the grammar of natural language commands given to a robot in a table top environment. The input to the system consists of video clips of a manually controlled robot arm, paired with natural language commands describing the action. No prior knowledge is assumed about the meaning of words, or the structure of the language, except that there are different classes of words (corresponding to observable actions, spatial relations, and objects and their observable properties). The learning process automatically clusters the continuous perceptual spaces into concepts corresponding to linguistic input. A novel relational graph representation is used to build connections between language and vision. As well as the grounding of language to perception, the system also induces a set of probabilistic grammar rules. The knowledge learned is used to parse new commands involving previously unseen objects.
... The seminal work of Steels on language games [62,63] shows how robots could actually engage in a process that converges to a shared vocabulary of grounded words. When the set of symbols is closed and known beforehand, symbol grounding is not a challenge anymore, but it still is if the robot has to build it autonomously [64]. ...
Article
Full-text available
Robotics has a special place in AI, as robots are connected to the real world and increasingly appear in humans' everyday environment, from home to industry. Apart from cases where robots are expected to completely replace them, humans will largely benefit from real interactions with such robots. This is not only true for complex interaction scenarios, like robots serving as guides, companions or members of a team, but also for more predefined functions, like the autonomous transport of people or goods. More and more, robots need suitable interfaces to interact with humans in a way that makes humans feel comfortable and that takes into account the need for a certain transparency about the actions taken. The paper describes the requirements and state of the art for human-centered robotics research and development, including verbal and non-verbal interaction, understanding and learning from each other, as well as ethical questions that have to be dealt with if robots are to be included in our everyday environment, influencing human life and societies.
... Artificial intelligence (AI) research has also emphasized the importance of learning to communicate through interaction for developing agents that can coordinate with other, possibly human agents in a goal-directed and intelligent way (e.g., Mikolov, Joulin, & Baroni, 2015). It has been shown that by playing communication games, artificial (robotic) agents can self-organize symbolic systems that are grounded in sensorimotor interactions with the world and other agents (e.g., Steels, 1998, 2001; Steels & Belpaeme, 2005; Bleys, Loetzsch, Spranger, & Steels, 2009). For example, in a case study with color stimuli, simulated agents established color categories and labels by playing a (perceptual) discrimination game, paired with a color reference game (Steels & Belpaeme, 2005). ...
Preprint
Full-text available
Language interfaces with many other cognitive domains. This paper explores how interactions at these interfaces can be studied with deep learning methods, focusing on the relation between language emergence and visual perception. To model the emergence of language, a sender and a receiver agent are trained on a reference game. The agents are implemented as deep neural networks, with dedicated vision and language modules. Motivated by the mutual influence between language and perception in cognition, we apply systematic manipulations to the agents' (i) visual representations, to analyze the effects on emergent communication, and (ii) communication protocols, to analyze the effects on visual representations. Our analyses show that perceptual biases shape semantic categorization and communicative content. Conversely, if the communication protocol partitions object space along certain attributes, agents learn to represent visual information about these attributes more accurately. Finally, an evolutionary analysis suggests that visual representations may have evolved in part to facilitate the communication of environmentally relevant distinctions. Aside from accounting for co-adaptation effects between language and perception, our results point out ways to modulate and improve visual representation learning and emergent communication in artificial agents.
... Additionally, when fused with other fields, unexpected outcomes can occur. Take, for instance, the "Talking Heads" experiment by Luc Steels [63,64], which showed that a common vocabulary emerges through the interaction of agents with each other and their environment via a language game. ...
Chapter
Full-text available
An undeniable part of a smart city is its use of smart agents. These agents can vary a lot in sizes, shapes, and functionalities. Embodied artificial intelligence is the field of study that takes a deeper look into these agents and explores how they can fit into the real-world and how they can eventually act as our future community workers, personal assistants, robocops, and many more. In the shift from Internet AI to embodied AI, simulators take the role that was previously played by traditional datasets. This chapter focuses on MINOS and Habitat since they provide more customization abilities and are implemented in a loosely coupled manner to generalize well to new multisensory tasks and environments. It shows numerous task definitions and how they each can be tackled by the agents. The chapter provides information on the three main goal-directed navigation tasks, namely, PointGoal Navigation, ObjectGoal Navigation, and RoomGoal Navigation.
... This new approach builds on Language Games (LGs) to create language-like communication, which would, eventually, give swarms rich vocabularies to define their environment [44,46], as well as their task, dynamically. PA-CE, by its use of an LG, is inscribed in this proposition and can be conceived of as a first step towards verbalizing actions and creating 'language interwoven with an activity' [75], whereas existing implementations of LGs still focus on very 'linguistic' tasks such as naming or categorizing [76]. ...
Article
Local interactions and communication are key features in swarm robotics, but they are most often fixed at design time, limiting flexibility and causing a stiff and inefficient response to changing environments. Motivated by the need for higher adaptation abilities, we propose that information about emergent collective structures should percolate onto the individual behavior, modifying it in a way that determines suitable responses in the face of new working conditions and organizational challenges. Indeed, complex societies are driven by an evolving set of individual and social norms subject to cultural propagation, which contribute to determining the individual behaviors. We leverage ideas from the evolution of natural language—an undoubtedly efficient cultural trait—and exploit the resulting social dynamics to select and propagate microscopic behavioral parameters that adapt continuously to macroscopic conditions, which in turn affect the agents’ communication topography, and, therefore, feed back onto the social dynamics. This concept is demonstrated on a self-organized aggregation behavior, which is a building block for most swarm robotics behaviors and a striking example of how collective dynamics are sensitive to experimental parameters. By means of experiments with simulated and real robots, we show that the cultural evolution of aggregation rules outperforms conventional approaches in terms of adaptivity to multiple experimental settings.
... Additionally, when fused with other fields, unexpected outcomes can occur. Take for instance the "Talking Heads" experiment by Luc Steels [63,64], which showed that a common vocabulary emerges through the interaction of agents with each other and their environment via a language game. ...
Preprint
Full-text available
A smart city can be seen as a framework comprised of Information and Communication Technologies (ICT). An intelligent network of connected devices that collect data with their sensors and transmit them using wireless and cloud technologies in order to communicate with other assets in the ecosystem plays a pivotal role in this framework. Maximizing the quality of life of citizens, making better use of available resources, cutting costs, and improving sustainability are the ultimate goals that a smart city is after. Hence, data collected from these connected devices are continuously and thoroughly analyzed to gain better insights into the services being offered across the city, with the goal that they can be used to make the whole system more efficient. Robots and physical machines are inseparable parts of a smart city. Embodied AI is the field of study that takes a deeper look into these machines and explores how they can fit into real-world environments. It focuses on learning through interaction with the surrounding environment, as opposed to Internet AI, which tries to learn from static datasets. Embodied AI aims to train an agent that can See (Computer Vision), Talk (NLP), Navigate and Interact with its environment (Reinforcement Learning), and Reason (General Intelligence), all at the same time. Autonomous driving cars and personal companions are some of the examples that benefit from Embodied AI nowadays. In this paper, we attempt to do a concise review of this field. We will go through its definitions, its characteristics, and its current achievements, along with different algorithms, approaches, and solutions that are being used in different components of it (e.g. Vision, NLP, RL). We will then explore all the available simulators and 3D interactable databases that will make the research in this area feasible. Finally, we will address its challenges and identify its potentials for future research.
... People have studied for those two decades, for example, how children can learn basic social interaction skills such as joint attention [3,14,21]. Also, the problem of language grounding has been studied already 20 or 30 years ago, even before developmental robotics started as a field [28,29,27]. ...
Preprint
This paper outlines a perspective on the future of AI, discussing directions for machine models of human-like intelligence. We explain how developmental and evolutionary theories of human cognition should further inform artificial intelligence. We emphasize the role of ecological niches in sculpting intelligent behavior, and in particular that human intelligence was fundamentally shaped to adapt to a constantly changing socio-cultural environment. We argue that a major limit of current work in AI is that it is missing this perspective, both theoretically and experimentally. Finally, we discuss the promising approach of developmental artificial intelligence, modeling infant development through multi-scale interaction between intrinsically motivated learning, embodiment, and a rapidly changing socio-cultural environment. This paper takes the form of an interview of Pierre-Yves Oudeyer by Manfred Eppe, organized within the context of a KI - Künstliche Intelligenz special issue on developmental robotics.
... The emergence of signaling systems (or a shared lexicon between multiple individuals) has been extensively studied before in the domain of language evolution by means of language games, see for example (Steels, 2001). ...
Article
Full-text available
Lewis signaling games are a standard model to study the emergence of language. We introduce win-stay/lose-inaction, a random process that only updates behavior on success and never deviates from what was once successful, prove that it always ends up in a state of optimal communication in all Lewis signaling games, and predict the number of interactions it needs to do so: N³ interactions for Lewis signaling games with N equiprobable types. We show three reinforcement learning algorithms (Roth-Erev learning, Q-learning, and Learning Automata) that can imitate win-stay/lose-inaction and can even cope with errors in Lewis signaling games.
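The win-stay/lose-inaction process summarized above lends itself to a compact simulation. A minimal sketch, assuming a toy Lewis game with N equiprobable types, N signals, and N acts; the function name and setup are our own illustration, not the paper's code:

```python
import random

def win_stay_lose_inaction(n, rng, max_steps=200000):
    """Win-stay/lose-inaction in a Lewis signaling game with n
    equiprobable types, n signals, and n acts. Mappings are fixed
    permanently on first success and never changed afterwards."""
    sender = {}    # type -> signal, fixed once successful
    receiver = {}  # signal -> act, fixed once successful
    for step in range(1, max_steps + 1):
        t = rng.randrange(n)                  # Nature draws a type
        s = sender.get(t, rng.randrange(n))   # random unless already fixed
        a = receiver.get(s, rng.randrange(n)) # random unless already fixed
        if a == t:                            # success: stay forever
            sender[t] = s
            receiver[s] = a
        if len(sender) == n and len(receiver) == n:
            return step                       # optimal signaling reached
    return None                               # did not converge in time
```

Because a type is only fixed jointly with the matching signal on the receiver's side, the fixed mappings always form a partial bijection, so the process converges to optimal communication with probability one.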
... We pay special attention to non-verbal grounding in languages beyond English, including German (Han and Schlangen, 2018), Swedish (Kontogiorgos, 2017), Japanese (Endrass et al., 2013;Nakano et al., 2003), French (Lemaignan and Alami, 2013;Steels, 2001), Italian (Borghi and Cangelosi, 2014;Taylor et al., 1986), Spanish (Kery et al., 2019), Russian (Janda, 1988), and American sign language (Emmorey and Casey, 1995). These investigations often describe important language-dependent characteristics and cultural differences in studying non-verbal grounding. ...
... We could extend our model by adding signaling behaviors to agents and testing them in experimental setups similar to the seminal sender-receiver games proposed by Lewis [8]. One could also follow a more robot-centric approach such as that of [66,67]. These approaches enable one to study the emergence of complex communicative systems embedding a proto-syntax [49,68]. ...
Article
Full-text available
What is the role of real-time control and learning in the formation of social conventions? To answer this question, we propose a computational model that matches human behavioral data in a social decision-making game that was analyzed both in discrete-time and continuous-time setups. Furthermore, unlike previous approaches, our model takes into account the role of sensorimotor control loops in embodied decision-making scenarios. For this purpose, we introduce the Control-based Reinforcement Learning (CRL) model. CRL is grounded in the Distributed Adaptive Control (DAC) theory of mind and brain, where low-level sensorimotor control is modulated through perceptual and behavioral learning in a layered structure. CRL follows these principles by implementing a feedback control loop handling the agent’s reactive behaviors (pre-wired reflexes), along with an Adaptive Layer that uses reinforcement learning to maximize long-term reward. We test our model in a multi-agent game-theoretic task in which coordination must be achieved to find an optimal solution. We show that CRL is able to reach human-level performance on standard game-theoretic metrics such as efficiency in acquiring rewards and fairness in reward distribution.
... Language games are games played between agents/robots with the purpose of mimicking real-world linguistic interactions, leading to the emergence of a structured language. Various kinds of language games have been proposed to date, from imitation games (Billard and Hayes, 1997) to guessing games (Steels, 2001) and category games (Puglisi et al., 2008; Baronchelli et al., 2010). One in particular has received strong attention: the naming game (Steels, 1995a, 2003). ...
Preprint
Full-text available
We investigate the emergence of language convention within a swarm of robots foraging in an open environment from two identical resources. While foraging, the swarm needs to explore and decide which resource to exploit, moving through complex transitory dynamics towards different possible equilibria, such as selection of a single resource or spreading across the two. Our point of interest is the understanding of possible correlations between the emergent, evolving, task-induced interaction network and the language dynamics. In particular, our goal is to determine whether the dynamics of the interaction network are sufficient to determine emergent naming conventions that represent features of the task execution (e.g., choice of one or the other resource) and of the environment. In other words, we look for an emergent vocabulary that is both complete (a word for each resource) and correct (no misnomer) for as long as each resource is relevant to the swarm. In this study, robots play two variants of the minimal language game: the classic one, where words are created when needed, and a new variant we introduce in this article, the spatial minimal naming game, where the creation of words is linked with the discovery of resources by exploring robots. We end the article by proposing a proof-of-concept extension of the spatial minimal naming game that assures the completeness and correctness of the swarm's vocabulary.
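The classic minimal naming game referred to above (words invented when needed, inventories collapsed to the winning word on success) can be sketched in a few lines. The single-object setup, agent count, and integer word encoding are illustrative assumptions, not the article's implementation:

```python
import random

def minimal_naming_game(n_agents=20, rng=None, max_rounds=100000):
    """Minimal naming game for a single shared object. Each round a
    random speaker utters a word, inventing one if its inventory is
    empty; on success both agents keep only the winning word."""
    rng = rng or random.Random(0)
    inventories = [set() for _ in range(n_agents)]
    next_word = 0
    for round_number in range(1, max_rounds + 1):
        speaker, hearer = rng.sample(range(n_agents), 2)
        if not inventories[speaker]:
            inventories[speaker].add(next_word)  # invent a new word
            next_word += 1
        word = rng.choice(sorted(inventories[speaker]))
        if word in inventories[hearer]:          # success: align on word
            inventories[speaker] = {word}
            inventories[hearer] = {word}
        else:                                    # failure: hearer adopts word
            inventories[hearer].add(word)
        if all(inv == inventories[0] and len(inv) == 1
               for inv in inventories):
            return round_number                  # global consensus reached
    return None
```

Running this with a few dozen agents shows the typical naming-game trajectory: a burst of competing synonyms followed by convergence on a single shared word.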
... Social robots being built today are obedient, golem-like servants, but an increasing number of them are speech-enabled. Some of these real-world robots are also capable of learning human language (e.g., Connell 2014 this volume and Steels 2001), conversing (e.g., Dufty 2014 this volume), and communicating with each other in a language they may have developed themselves (Heath et al. 2013). These linguistic abilities and the complex intelligence that makes them possible are transforming real-world golems into entities that are more akin to the golems of modern fiction than to those of yore. ...
Preprint
Full-text available
Speech-enabled, social robots are being designed to provide a range of services to humans, but humans are not passive recipients of this technology. We have preconceptions and expectations about robots as well as deeply ingrained emotional responses to the concept of robots sharing our world. We examine three roles that humans expect robots to play: killer, servant, and lover. These roles are embodied by cultural icons that function as springboards for understanding important, potential human-robot relationships. Fears about a rampaging robot produced by misguided science are bonded to the image of the Frankenstein monster, and the idea that we could be annihilated by our own intelligent creations originated with the androids in the play R.U.R. The pleasing appearance of many Japanese social robots and their fluid motions perpetuate the spirit of karakuri ningyo, which also continues in Japanese theatre, manga, and anime. Obedient Hebrew golems provide a pleasing model of dutiful servants. The concept of robot-human love is personified by the Greek myth about Pygmalion and his beloved statue. This chapter provides an overview of the three roles and the icons that embody them. We discuss how spoken language supports the image of robots in those roles and we touch on the social impact of robots actually filling those roles.
... This taxonomy has informed recent developmental robotics studies of the role of affective behaviour in the acquisition of negation. Steels and his group [70,71] have used hybrid populations of robots, Internet agents, and humans engaged in language games. In their work, relevance is given to the social aspects of symbol grounding, as well as the perceptual grounding of categories [72]. ...
Article
Full-text available
Recent advances in behavioural and computational neuroscience, cognitive robotics, and in the hardware implementation of large-scale neural networks, provide the opportunity for an accelerated understanding of brain functions and for the design of interactive robotic systems based on brain-inspired control systems. This is especially the case in the domain of action and language learning, given the significant scientific and technological developments in this field. In this work we describe how a neuroanatomically grounded spiking neural network for visual attention has been extended with a word learning capability and integrated with the iCub humanoid robot to demonstrate attention-led object naming. Experiments were carried out with both a simulated and a real iCub robot platform with successful results. The iCub robot is capable of associating a label to an object with a ‘preferred’ orientation when visual and word stimuli are presented concurrently in the scene, as well as attending to said object, thus naming it. After learning is complete, the name of the object can be recalled successfully when only the visual input is present, even when the object has been moved from its original position or when other objects are present as distractors.
... We could extend our model by adding signaling behaviors to agents and testing them in experimental setups similar to the seminal sender-receiver games proposed by Lewis [3]. One could also follow a more robot-centric approach such as that of [55,56]. These approaches enable one to study the emergence of complex communicative systems embedding a proto-syntax [41,57]. ...
Article
Full-text available
In order to understand the formation of social conventions we need to know the specific role of control and learning in multi-agent systems. To advance in this direction, we propose, within the framework of the Distributed Adaptive Control (DAC) theory, a novel Control-based Reinforcement Learning architecture (CRL) that can account for the acquisition of social conventions in multi-agent populations that are solving a benchmark social decision-making problem. Our new CRL architecture, as a concrete realization of DAC multi-agent theory, implements a low-level sensorimotor control loop handling the agent's reactive behaviors (pre-wired reflexes), along with a layer based on model-free reinforcement learning that maximizes long-term reward. We apply CRL in a multi-agent game-theoretic task in which coordination must be achieved in order to find an optimal solution. We show that our CRL architecture is able to both find optimal solutions in discrete and continuous time and reproduce human experimental data on standard game-theoretic metrics such as efficiency in acquiring rewards, fairness in reward distribution and stability of convention formation.
... Those routinized formats, also called 'pragmatic frames' [23], help children not only to participate in an interaction but also to identify a clear role that s/he can fulfill. They are routinized because their structures have become familiar to the participants, an aspect that should be considered when designing dialogue in HRI, as suggested by Steels [24]. In fact, routinized activities can become an 'educational game' [25] in which social interaction serves educational purposes because it systematically elicits children's verbal knowledge. ...
... In terms of speech production, there have also been a variety of models showing how the system of native speech sounds can be learned in interaction with linguistically proficient caregivers without imposing strong a priori constraints on the number and type of categories that exist in the language (e.g., [35,36]; see also [37] for a review). A number of robotic studies also demonstrate how new functional communicative systems and representations can emerge from situated interaction, internal drive for exploration, and capability to observe and act upon the environment beyond "linguistic" messaging (e.g., [38][39][40]). Finally, the referential nature of communication is also becoming utilized in models attempting to explain the structuring of sound systems [41]. ...
... objects' names) using a mutual information criterion was presented. Several robotic applications were developed subsequently, such as Steels (2001), where language games for autonomous robots are used to teach the meaning of words in a simple static world. Further, researchers developed systems capable of learning objects' names and spatial relations by interacting with a human or robot teacher, as in Steels (2002), Bleys (2009) and Spranger (2015). ...
Conference Paper
Full-text available
We present a cognitively plausible system capable of acquiring knowledge in language and vision from pairs of short video clips and linguistic descriptions. The aim of this work is to teach a robot manipulator how to execute natural language commands by demonstration. This is achieved by first learning a set of visual `concepts' that abstract the visual feature spaces into concepts that have human-level meaning. Second, learning the mapping/grounding between words and the extracted visual concepts. Third, inducing grammar rules via a semantic representation known as Robot Control Language (RCL). We evaluate our approach against state-of-the-art supervised and unsupervised grounding and grammar induction systems, and show that a robot can learn to execute never seen-before commands from pairs of unlabelled linguistic and visual inputs.
... Interactive learning models have been proposed for the learning of spatial language [14], color [15] and other domains. They are also a prime paradigm for models of lexicon evolution [16]- [18]. ...
Article
Full-text available
This paper investigates the role of tutor feedback in language learning using computational models. We compare two dominant paradigms in language learning: interactive learning and cross-situational learning, which differ primarily in the role of social feedback such as gaze or pointing. We analyze the relationship between these two paradigms and propose a new mixed paradigm that combines the two and allows testing algorithms in experiments that combine no feedback and social feedback. To deal with mixed feedback experiments, we develop new algorithms and show how they perform with respect to traditional k-NN and prototype approaches.
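A bare-bones co-occurrence counter conveys the cross-situational idea discussed above: word meanings are inferred by averaging over ambiguous exposures rather than from social feedback such as pointing. The class and example referents below are our own illustrative sketch, not the paper's algorithm:

```python
from collections import defaultdict

class CrossSituationalLearner:
    """Toy cross-situational word learner. Each exposure pairs a word
    with a set of candidate referents; the learner counts co-occurrences
    and takes the most frequent candidate as the word's meaning."""
    def __init__(self):
        self.counts = defaultdict(lambda: defaultdict(int))

    def observe(self, word, candidate_referents):
        # Credit every candidate present in the scene.
        for ref in candidate_referents:
            self.counts[word][ref] += 1

    def meaning(self, word):
        refs = self.counts[word]
        return max(refs, key=refs.get) if refs else None

learner = CrossSituationalLearner()
# "ball" always co-occurs with the ball; distractors vary per scene.
learner.observe("ball", {"ball", "cup"})
learner.observe("ball", {"ball", "dog"})
learner.observe("ball", {"ball", "cup"})
```

After three ambiguous scenes the true referent dominates the counts, which is exactly the statistical signal that cross-situational learners exploit in the absence of social feedback.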
... Signaling games share many analogies with naming games as presented by Steels (2001). The main difference is that in signaling games players use payoffs to coordinate, whereas in naming games the sender explicitly transmits to the receiver Nature's state or "signal meaning". ...
Article
Many models explain the evolution of signalling in repeated stage games on social networks; in contrast, in this study each signalling game evolves a communication strategy to transmit information across the network. Specifically, I formalise signalling chain games as a generalisation of Lewis' signalling games, where a number of players are placed on a chain network and play a signalling game in which they have to propagate information across the network. I show that probe-and-adjust learning allows the system to develop communication conventions, but it may temporarily perturb the system out of conventions. Through simulations, I evaluate how long the system takes to evolve a signalling convention and the amount of time it stays in it. This discussion presents a mechanism by which simple players can evolve signalling across a social network without necessarily understanding the entire system.
... A guessing game comprises an initiator-agent, a recipient-agent and a context including several topics such as shapes, colours and objects. In the game, the initiator (i.e., speaker) selects a topic from the context, represents it with an utterance, and sends the utterance to the recipient (i.e., listener), who attempts to identify the topic of the utterance based on its experience of the previous utterances (Steels, 2001a). In other ...
Thesis
Full-text available
The aim of this research is to investigate the mechanisms of creative design within the context of an evolving language through computational modelling. Computational Creativity is a subfield of Artificial Intelligence that focuses on modelling creative behaviours. Typically, research in Computational Creativity has treated language as a medium, e.g., poetry, rather than an active component of the creative process. Previous research studying the role of language in creative design has relied on interviewing human participants, limiting opportunities for computational modelling. This thesis explores the potential for language to play an active role in computational creativity by connecting computational models of the evolution of artificial languages and creative design processes. Multi-agent simulations based on the Domain-Individual-Field-Interaction framework are employed to evolve artificial languages with features that may support creative designing including ambiguity, incongruity, exaggeration and elaboration. The simulation process consists of three steps: (1) constructing representations associating topics, meanings and utterances; (2) structured communication of utterances and meanings through the playing of “language games”; and (3) evaluation of design briefs and works. The use of individual agents with different evaluation criteria, preferences and roles enriches the scope and diversity of the simulations. The results of the experiments conducted with artificial creative language systems demonstrate the expansion of design spaces by generating compositional utterances representing novel concepts among design agents using language features and weighted context free grammars. They can be used to computationally explore the roles of language in creative design, and possibly point to computational applications. Understanding the evolution of artificial languages may provide insights into human languages, especially those features that support creativity.
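The guessing game described in the excerpt above (the speaker selects a topic from a shared context, utters a word, and the hearer guesses the topic) can be sketched with two agents and corrective pointing on failure. The agent class, lexicon update rule, and object names are illustrative assumptions rather than any cited implementation:

```python
import random

class GameAgent:
    """Agent for a schematic guessing game; names and updates are
    illustrative only."""
    def __init__(self):
        self.lexicon = {}                  # word -> meaning

    def word_for(self, topic):
        for word, meaning in self.lexicon.items():
            if meaning == topic:
                return word
        word = f"w{len(self.lexicon)}"     # invent a new word
        self.lexicon[word] = topic
        return word

    def guess(self, word, context, rng):
        # Known words map to meanings; unknown words force a random guess.
        return self.lexicon.get(word, rng.choice(context))

def guessing_game(rounds=200, rng=None):
    rng = rng or random.Random(1)
    context = ["red-ball", "blue-box", "green-cone"]
    speaker, hearer = GameAgent(), GameAgent()
    successes = 0
    for _ in range(rounds):
        topic = rng.choice(context)        # only the speaker knows it
        word = speaker.word_for(topic)
        if hearer.guess(word, context, rng) == topic:
            successes += 1
        else:
            hearer.lexicon[word] = topic   # speaker points to the topic
    return successes / rounds
```

Because every failure is repaired by pointing, the hearer's lexicon aligns with the speaker's after a handful of rounds and the success rate climbs towards 1.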
... objects' names) using mutual information criteria. Several robotic applications were developed subsequently, such as Steels et al. (2001) where language games for autonomous robots are used to teach the meaning of words in a simple static world. Researchers have since developed systems capable of learning objects' names and spatial relations by interacting with a human or robot teacher, as in Steels (2002) and Spranger (2015). ...
Conference Paper
Full-text available
We present a system that enables embodied agents to learn about different components of the perceived world, such as object properties, spatial relations, and actions. The system learns a semantic representation and the linguistic description of such components by connecting two different sensory inputs: language and vision. The learning is achieved by mapping observed words to extracted visual features from video clips. We evaluate our approach against state-of-the-art supervised and unsupervised systems that each learn from a single modality, and we show that an improvement can be obtained by using both language and vision as inputs.
... But it is clearly off the mark to assume that each and every speaker represents in some way every possible intensional formal language (this is questionable even for a single intensional formal language); that there is common knowledge that all speakers use the same language; that there is some kind of telepathy; and that there is a homogeneous knowledge of the mappings from terms to meanings in linguistic communities such as ours that feature DLL. As for the framework of self-organization, several works sought to show that a social language can emerge out of simple interactions between speakers (Barr, 2004;Beckner et al., 2009;Briscoe, 2002;Cangelosi, 2007;Gärdenfors, 1993;Hutchins & Hazlehurst, 1995;Lyon et al., 2007;Smith, 2005;Steels & Kaplan, 1999;Steels, 2001;Wellens et al., 2008). Some researchers have used real robots; others have used simulated agents. ...
... The language games of Steels [6], [7], [8] consider the origin and use of language. Steels suggests that the key to successful language grounding is to tightly couple it with sensory-motor features and feedback. ...
Article
Full-text available
Successful human-robot cooperation hinges on each agent's ability to process and exchange information about the shared environment and the task at hand. Human communication is primarily based on symbolic abstractions of object properties, rather than precise quantitative measures. A comprehensive robotic framework thus requires an integrated communication module which is able to establish a link and convert between perceptual and abstract information. The ability to interpret composite symbolic descriptions enables an autonomous agent to a) operate in unstructured and cluttered environments, in tasks which involve unmodeled or never seen before objects; and b) exploit the aggregation of multiple symbolic properties as an instance of ensemble learning, to improve identification performance even when the individual predicates encode generic information or are imprecisely grounded. We propose a discriminative probabilistic model which interprets symbolic descriptions to identify the referent object contextually with respect to the structure of the environment and other objects. The model is trained using a collected dataset of identifications, and its performance is evaluated by quantitative measures and a live demo developed on the PR2 robot platform, which integrates elements of perception, object extraction, object identification and grasping.
Article
Full-text available
The objective of this research is to shed light on Industry 4.0 and the future of work based on evidence and probable scenarios drawn from cinema. Just as industry 1.0 brought about radical changes in production with the emergence of factories and employees, industry 4.0 portrays the latest stage of this ongoing change. Historically speaking, it can be asserted that industry 4.0 signals a novel epoch, given its transformative effect on everything from smart technologies, labor markets, wage philosophy, immigration, robotics, artificial intelligence, the internet of things, and climate change to digitalization. Considering that movies are a vivid way to visualize stories or future scenarios, they can readily be utilized for forecasting studies or for current case analysis. Content analysis was chosen as the research method of the study.
Preprint
Full-text available
Effective communication is an important skill for enabling information exchange in multi-agent settings, and emergent communication is now a vibrant field of research, with common settings involving discrete cheap-talk channels. Since, by definition, these settings involve arbitrary encoding of information, they typically do not allow the learned protocols to generalize beyond training partners. In contrast, in this work, we present a novel problem setting and the Quasi-Equivalence Discovery (QED) algorithm that allows for zero-shot coordination (ZSC), i.e., discovering protocols that can generalize to independently trained agents. Real-world problem settings often contain costly communication channels, e.g., robots have to physically move their limbs, and a non-uniform distribution over intents. We show that these two factors lead to unique optimal ZSC policies in referential games, where agents use the energy cost of the messages to communicate intent. Other-Play was recently introduced for learning optimal ZSC policies, but requires prior access to the symmetries of the problem. Instead, QED iteratively discovers the symmetries in this setting and converges to the optimal ZSC policy.
Preprint
Effective communication is an important skill for enabling information exchange and cooperation in multi-agent settings. Indeed, emergent communication is now a vibrant field of research, with common settings involving discrete cheap-talk channels. One limitation of this setting is that it does not allow for the emergent protocols to generalize beyond the training partners. Furthermore, so far emergent communication has primarily focused on the use of symbolic channels. In this work, we extend this line of work to a new modality, by studying agents that learn to communicate via actuating their joints in a 3D environment. We show that under realistic assumptions, a non-uniform distribution of intents and a common-knowledge energy cost, these agents can find protocols that generalize to novel partners. We also explore and analyze specific difficulties associated with finding these solutions in practice. Finally, we propose and evaluate initial training improvements to address these challenges, involving both specific training curricula and providing the latent feature that can be coordinated on during training.
Presentation
Full-text available
During the past forty years, astounding advances have been made in the field of quantum information theory. Several reasons have led to this development. First, components are shrinking to the point where their behavior will soon be dominated by quantum physics. Second, the classical computer faces physical limitations. Third, exploiting quantum information theory offers attractive characteristics, speed being one of them. As an example, a quantum algorithm can find an item in a set in a time proportional to the square root of the size of the set, which is considerably faster than classical methods that take time proportional to the size of the set. In this work we show how a quantum algorithm can be faster than a classical algorithm. As a case study, we compare the Classical Search Algorithm (CSA) with Grover's Algorithm (G'sA). In computer science, CSA is a search algorithm suitable for searching a set of data for a particular value. It operates by checking every element of a list, one at a time, in sequence until a match is found. The pseudocode describing CSA is: «For each item in the list, check to see if the item you're looking for matches the item in the list. If it matches, return the location where you found it. If it does not match, continue searching until you reach the end of the list. If we get to the last element, we know that the item does not exist in the list.» As a result, the algorithm operates by checking every element of a list, one at a time, in sequence until a match is found. CSA runs in O(N). If the data are distributed randomly, on average (N+1)/2 comparisons will be needed. The best case is that the value is equal to the first element tested, in which case only one comparison is needed. The worst case is that the value is not in the list or is the last item in the list, in which case N comparisons are needed.
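The sequential-search pseudocode quoted above translates directly into a linear search routine; a minimal version for illustration (not taken from the presentation itself):

```python
def linear_search(items, target):
    """Classical sequential search: check each element in turn.
    Worst case N comparisons, average (N + 1) / 2 for a random hit;
    by contrast, Grover's quantum algorithm needs only O(sqrt(N))
    oracle queries on an unstructured set."""
    for index, item in enumerate(items):
        if item == target:
            return index    # location of the match
    return None             # target does not exist in the list
```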
Chapter
The distribution of colors in the environment shapes local peoples’ perceptions of those colors, a phenomenon observable across all types of environments. We analyzed color categorization data from each of the 107 languages in the World Color Survey (WCS) database. Next, we grouped the WCS languages according to their geographic location, with reference to the seven terrestrial habitats (biomes) classified by the World Wildlife Fund (WWF). We developed a computer algorithm to establish the most frequently occurring colors in each environment based on the color distribution extracted from National Geographic natural images of the respective biomes. We then compared the average standardized value of the mode (i.e., most frequently occurring answers) for each group of WCS languages; we followed the same procedure for the most frequently occurring colors as well as the remaining colors. Results indicated statistically significant lower values of the average mode answers for the most frequently occurring colors. These results support our hypothesis that the environment type shapes color category boundaries. Further, we follow Steels and Belpaeme’s (Behav Brain Sci 28:469–489, 2005) model, which allows for computer simulations of the cultural emergence of color categories. An agent-based model of the cultural emergence of color categories shows that boundaries might be seen as a product of agents’ communication in a given environment. We propose the extension of this generic agent-based modeling framework to include a culturally driven emergence of color categories. We therefore underscore external constraints on cognition: the structure of the environment in which a system evolves and learns, and the learning capacities of individual agents. Finally, we discuss the methodological issues related to real data characterization (World Color Survey), as well as to the process of modeling the emergence of perceptual categories in human subjects.
Article
Full-text available
Some “traditional” issues in language emergence and development are viewed through the prism of the interaction of autonomous robots with their environment and through their communicative skills based on the signaling system which emerges as a result of the robots’ own evolution. The main goal of the paper is to present initial conditions necessary for the emergence of communication in a group of robots. First, the paper discusses, in relation to the general faculty of language, the change that has taken place within cognitive science, particularly within computational modelling and Artificial Intelligence. Then a number of basic, individual cognitive mechanisms (pre-adaptations) are suggested, including the robots’ ability to distinguish signals, associate them with particular situations and imitate signaling behavior. These basic individual abilities may develop in the context of a community of interacting agents as well as in the changing communicative environment. In order to practice and develop the cognitive capacities, robotic agents are expected to engage in a number of activities (“language games”), including the imitation of actions, the negotiation of reference and the use of signals in the absence of referents. Inquiries into the emergence of communication in natural and artificial systems can help isolate the possible stages of the development of the robots’ communicative abilities.
Conference Paper
This paper investigates the role of tutor feedback in grounded language learning experiments. We compare the two dominant paradigms in language learning, interactive learning and cross-situational learning, using robot-robot experiments in the real world. In particular, we study a mixed paradigm, introduced in earlier work, only now in real-world interaction. Our experiments quantify the potential impact of social feedback in language learning. Since the grounded world is a more structured environment than random worlds, we also quantify whether algorithms can make use of that structure.
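For readers unfamiliar with the cross-situational paradigm mentioned above, the core idea can be sketched in a few lines: the learner accumulates word/meaning co-occurrence counts across many situations and, without any explicit tutor feedback, settles on the meaning most consistently present when a word is heard. This is a generic illustration of the paradigm, not the algorithm of the paper; all names are ours.

```python
from collections import defaultdict

class CrossSituationalLearner:
    """Minimal cross-situational learner: every meaning visible in a
    situation counts as a candidate for each word heard in it."""
    def __init__(self):
        self.counts = defaultdict(lambda: defaultdict(int))

    def observe(self, word, candidate_meanings):
        # No social feedback: all co-present meanings are reinforced.
        for meaning in candidate_meanings:
            self.counts[word][meaning] += 1

    def best_meaning(self, word):
        meanings = self.counts[word]
        return max(meanings, key=meanings.get) if meanings else None

learner = CrossSituationalLearner()
learner.observe("ball", {"ball", "table"})
learner.observe("ball", {"ball", "box"})
learner.observe("ball", {"ball", "table"})
# "ball" co-occurs in all three situations, distractors in fewer.
```

Interactive learning, by contrast, would prune candidates immediately using the tutor's success/failure signal; the mixed paradigm studied in the paper combines both sources of information.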
Conference Paper
A shared understanding of language will assist natural interactions between humans and artificial agents or robots undertaking collaborative tasks. An important domain for collaborative armed robots is interacting with humans and objects on a table, for example, picking, placing, or handing over a variety of objects. Such tasks combine object representation and movement planning in the geometric domain with abstract reasoning about symbolic spatial representations. This paper presents an initial study in which a human partner teaches the robot words for spatial relationships by providing exemplars and indicating where words may be used over the surface. This study demonstrates how robots can be taught the words required for these tasks in a quick and simple manner that allows the concepts to be generalizable over different surfaces, objects, and object placements.
Article
The present paper proposes an operational semantic model of natural language quantifiers (e.g., many, some, three) and their use in quantified noun phrases. To this end we use embodied artificial agents that communicate in and interact with the physical world. We argue that existing paradigms such as Generalized Quantifiers (Barwise and Cooper 1981; Montague 1973) and Fuzzy Quantifiers (Zadeh 1983) do not provide satisfactory models for our situated-interaction scenarios, and we propose a more adequate semantic model based on fuzzy quantification.
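To make the notion of fuzzy quantification concrete: in Zadeh's tradition, a quantifier like "many" denotes a graded membership function over the proportion of objects satisfying a predicate, rather than a sharp threshold. The sketch below is a generic illustration of that idea (the sigmoid shape and its parameters are our assumption, not the paper's model):

```python
import math

def fuzzy_many(count, total, midpoint=0.5, steepness=10.0):
    """Hypothetical fuzzy semantics for 'many': the degree of truth
    rises smoothly with the proportion of objects that satisfy the
    predicate, instead of flipping at a crisp threshold."""
    proportion = count / total
    return 1.0 / (1.0 + math.exp(-steepness * (proportion - midpoint)))

# 9 of 10 objects are red: "many of the objects are red" is nearly true;
# 1 of 10: nearly false; 5 of 10: borderline (degree 0.5).
```

In a situated scenario, an embodied agent would feed perceptual counts (e.g., objects it segmented from a camera image) into such a function and use the resulting degree of truth to interpret or produce a quantified noun phrase.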
Article
One major lesson learned in the cognitive sciences is that even basic human cognitive capacities are extraordinarily complicated and elusive to mechanistic explanations. This is definitely the case for naming and identity. Nothing seems simpler than using a proper name to refer to a unique individual object in the world. But psychological research has shown that the criteria and mechanisms by which humans establish and use names are unclear and seemingly contradictory. Children only develop the necessary knowledge and skills after years of development and naming degenerates in unusual selective ways with strokes, schizophrenia, or Alzheimer disease. Here we present an operational model of social interaction patterns and cognitive functions to explain how naming can be achieved and acquired. We study the Grounded Naming Game as a particular example of a symbolic interaction that requires naming and present mechanisms that build up and use the semiotic networks necessary for performance in the game. We demonstrate in experiments with autonomous physical robots that the proposed dynamical systems indeed lead to the formation of an effective naming system and that the model hence explains how naming and identity can get socially constructed and shared by a population of embodied agents.
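The interaction pattern behind a naming game can be reduced to a very small sketch: a speaker produces its preferred name for an object (inventing one if necessary), and the hearer either recognizes it (success: both prune their inventories to the winning name) or adopts it (failure). This is a bare-bones illustration of the game's alignment dynamics, not the Grounded Naming Game model of the paper, which additionally grounds objects through perception; all identifiers are ours.

```python
import random

class Agent:
    """Minimal naming-game agent: keeps a set of candidate names
    per object, aligned through repeated interactions."""
    def __init__(self):
        self.names = {}  # object -> set of candidate names

    def name_for(self, obj):
        if not self.names.get(obj):
            self.names[obj] = {f"word{random.randint(0, 999)}"}  # invent
        return random.choice(sorted(self.names[obj]))

    def hear(self, obj, name):
        known = self.names.setdefault(obj, set())
        if name in known:
            self.names[obj] = {name}  # success: prune to the winning name
            return True
        known.add(name)  # failure: adopt the new name
        return False

speaker, hearer = Agent(), Agent()
for _ in range(20):  # repeated games drive the pair to a shared name
    name = speaker.name_for("ball")
    if hearer.hear("ball", name):
        speaker.names["ball"] = {name}
```

In a population larger than two, the same local pruning rule makes a single name per object spread globally, which is the self-organized convention formation the abstract refers to.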
Article
Full-text available
In this paper, we present Robot Entertainment as a new field of the entertainment industry using autonomous robots. For feasibility studies of Robot Entertainment, we have developed an autonomous quadruped pet-type robot named MUTANT. It has four legs, each with three degrees of freedom, and a head that also has three degrees of freedom. A micro camera, stereo microphone, touch sensors, and other sensor systems are coupled with a newly developed behavior generation system, which has an emotion module as a major component and generates highly complex, interactive behaviors. Agent architecture, real-world recognition technologies, software component technology, and some dedicated devices, such as the Micro Camera Unit, were developed and tested for this purpose. From the lessons learned in developing MUTANT, we refined its design concept to derive requirements for a general architecture and a set of interfaces for entertainment robot systems. Through these feasibility studies, we consider entertainment applications a significant target at this moment from both scientific and engineering points of view.
Article
Full-text available
The article focuses on animation control for real-time virtual humans. The computation speed and control methods needed to portray 3D virtual humans suitable for interactive applications have improved dramatically in recent years. Real-time virtual humans show increasingly complex features along the dimensions of appearance, function, time, autonomy and individuality. The virtual human architecture that researchers have been developing at the University of Pennsylvania is representative of an emerging generation of such architectures and includes low-level motor skills, a mid-level parallel automata controller and a high-level conceptual representation for driving virtual humans through complex tasks. The architecture, called Jack, provides a level of abstraction generic enough to encompass natural language instruction representation as well as direct links from those instructions to animation control. Building models of virtual humans involves application-dependent notions of fidelity.
Article
Full-text available
The aim of this paper is to identify the factors that we found to be crucial for the success of these experiments. These can be grouped into two subsets: internal factors relating to the individual architecture of the agents, and external factors relating to the group dynamics and the environments encountered. We report not only those factors that we explicitly incorporated in the experiment but also the ones that we expressly omitted in order to prove that they are not needed.
Article
Full-text available
This paper proposes a framework for constructing an operating system in an open and mobile computing environment. The framework provides object/metaobject separation and metahierarchy. In the framework, we view object migration as a basic mechanism to accommodate object heterogeneity. The relevance of the proposed framework to existing system structures is discussed. We then present a practical implementation of the Apertos operating system in this framework, where reflectors are introduced for metaobject programming and MetaCore for providing common primitives. We present some evaluation results of the Apertos operating system. We also present related work in terms of reflection mechanisms and systems.
Article
The paper proposes a set of principles and a general architecture that may explain how language and meaning may originate and complexify in a group of physically grounded distributed agents. An experimental setup is introduced for concretising and validating specific mechanisms based on these principles. The setup consists of two robotic heads that watch static or dynamic scenes and engage in language games, in which one robot describes to the other what they see. The first results from experiments showing the emergence of distinctions, of a lexicon, and of primitive syntactic structures are reported.
N.J. Nilsson, “Shakey the Robot,” tech. note 323, SRI AI Center, Menlo Park, Calif., 1984.
H.H. Clark and S.E. Brennan, “Grounding in Communication,” Perspectives on Socially Shared Cognition, American Psychological Assoc., 1991.