Article

Imagery in cognitive architecture: Representation and control at multiple levels of abstraction

Author: Samuel Wintermute

Abstract

In a cognitive architecture, intelligent behavior is contingent upon the use of an appropriate abstract representation of the task. When designing a general-purpose cognitive architecture, two basic challenges related to abstraction arise, which are introduced and examined in this article. The perceptual abstraction problem results from the difficulty of creating a single perception system able to induce appropriate abstract representations in any task the agent might encounter, and the irreducibility problem arises because some tasks are resistant to being abstracted at all. The first contribution of this paper is identifying these problems, and the second contribution is showing a means to address them. This is accomplished through the use of mental imagery.


... In recent years there have been a number of attempts to develop computational accounts of mental imagery from within the assumptions and constraints of cognitive architectures (e.g., Rosenbloom, 2012; Wintermute, 2012). Cognitive architectures are theories of the core memory and control structures, learning mechanisms, and perception-action processes required for general intelligence and how they are integrated into a "system of systems" to enable human cognition and autonomous, human-level artificial cognitive agents. ...
... The cognitive architecture with one of the most well-developed and comprehensive sets of representations for spatial reasoning and visual imagery is Soar (Laird, 2012) and its Spatial/Visual System (SVS) (Lathrop, Wintermute, & Laird, 2011; Wintermute, 2012). The SVS system contains two layers of representation: a visual depictive layer (a bitmap array representation of space and the topological structure of objects), and a quantitative spatial layer (an amodal symbolic/numerical representation of objects and their spatial coordinates, location, rotation and scaling). ...
... A more stringent test of the assumptions is therefore necessary, and this will come from modelling more challenging tasks, for example the Raven's Progressive Matrices (cf. Kunda et al., 2013), the pedestal blocks world or the nonholonomic car motion planning task (Wintermute, 2012), as these will provide richer behavioural data and will require more complex strategies involving a wider range of spatial transformations. This is the plan for the next stage of this project. ...
Conference Paper
Full-text available
I present two models of mental rotation created within the ACT-R theory of cognition, each of which implements one of the two main strategies identified in the literature. A holistic strategy rotates mental images as a whole unit, whereas a piecemeal strategy decomposes the mental image into pieces and rotates them individually. Both models provide a close fit to human response time data from a recent study of mental rotation strategies conducted by Khooshabeh, Hegarty, and Shipley (2013). This work provides an account of human mental rotation data and, in so doing, tests a new proposal for representing and processing spatial information to model mental imagery in ACT-R.
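To make the holistic/piecemeal distinction concrete, the following Python sketch (our own illustration, not code from the cited ACT-R models; names such as `rotate_piecemeal` are hypothetical) rotates a small 2D point figure either in a single step or piece by piece in angular increments; the additional incremental steps are what make the piecemeal strategy slower in strategy-based accounts.

```python
import numpy as np

def rotate(points, angle_deg):
    """Rotate an (N, 2) array of 2D points about the origin."""
    a = np.radians(angle_deg)
    r = np.array([[np.cos(a), -np.sin(a)],
                  [np.sin(a),  np.cos(a)]])
    return points @ r.T

def rotate_holistic(figure, angle_deg):
    # Holistic strategy: transform the whole figure in one step.
    return rotate(figure, angle_deg)

def rotate_piecemeal(figure, angle_deg, n_pieces=3, step_deg=30):
    # Piecemeal strategy: split the figure into pieces and rotate each
    # piece in small increments (more steps, hence longer response times).
    pieces = np.array_split(figure, n_pieces)
    out = []
    for piece in pieces:
        turned = 0.0
        while turned < angle_deg:
            inc = min(step_deg, angle_deg - turned)
            piece = rotate(piece, inc)
            turned += inc
        out.append(piece)
    return np.vstack(out)

figure = np.array([[0.0, 0.0], [1.0, 0.0], [1.0, 2.0], [0.5, 3.0]])
# Both strategies reach the same final orientation; they differ in steps taken.
assert np.allclose(rotate_holistic(figure, 90), rotate_piecemeal(figure, 90))
```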
... In recent years there have been a number of attempts to develop computational accounts of mental imagery from within the assumptions and constraints of cognitive architectures (e.g., Rosenbloom, 2012; Wintermute, 2012). Cognitive architectures are theories of the core memory and control structures, learning mechanisms, and perception-action processes required for general intelligence and how they are integrated into a "system of systems" to enable human cognition and autonomous, human-level artificial cognitive agents. ...
... The cognitive architecture with one of the most well-developed and comprehensive sets of representations for spatial reasoning and visual imagery is Soar (Laird, 2012) and its Spatial/Visual System (SVS) (Lathrop, Wintermute, & Laird, 2011; Wintermute, 2012). The SVS system contains two layers of representation: a visual depictive layer (a bitmap array representation of space and the topological structure of objects), and a quantitative spatial layer (an amodal symbolic/numerical representation of objects and their spatial coordinates, location, rotation and scaling). ...
... A more stringent test of the assumptions is therefore necessary, and this will come either from modelling different strategies in the mental rotation task or from different, more challenging tasks, for example the Raven's Progressive Matrices (cf. Kunda et al., 2013), the pedestal blocks world or the nonholonomic car motion planning task (Wintermute, 2012), as these require more complex strategies involving a wider range of spatial transformations and will provide richer behavioural data. This is the plan for the next stage of this project. ...
Conference Paper
Full-text available
I present a novel approach to modelling spatial mental imagery within the ACT-R cognitive architecture. The proposed method augments ACT-R's representation of visual objects to enable the processing of spatial extent and incorporates a set of linear and affine transformation functions to allow the manipulation of internal spatial representations. The assumptions of the modified architecture are then tested by using it to develop models of two classic mental imagery phenomena: the mental scanning study of Kosslyn, Ball, and Reiser (1978) and mental rotation (Shepard & Metzler, 1971). Both models provide very close fits to human response time data.
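As a rough picture of what linear and affine transformation functions over a spatial extent could look like, here is a minimal homogeneous-coordinate sketch in Python; it is an illustration under our own assumptions, not the modified ACT-R representation described in the abstract.

```python
import numpy as np

def translation(dx, dy):
    return np.array([[1.0, 0.0, dx], [0.0, 1.0, dy], [0.0, 0.0, 1.0]])

def rotation(angle_deg):
    a = np.radians(angle_deg)
    return np.array([[np.cos(a), -np.sin(a), 0.0],
                     [np.sin(a),  np.cos(a), 0.0],
                     [0.0, 0.0, 1.0]])

def scaling(sx, sy):
    return np.array([[sx, 0.0, 0.0], [0.0, sy, 0.0], [0.0, 0.0, 1.0]])

def apply(transform, extent):
    """Apply a 3x3 homogeneous transform to an (N, 2) spatial extent."""
    homogeneous = np.hstack([extent, np.ones((len(extent), 1))])
    return (homogeneous @ transform.T)[:, :2]

# Compose scale -> rotate -> translate (the rightmost factor applies first).
square = np.array([[0.0, 0.0], [1.0, 0.0], [1.0, 1.0], [0.0, 1.0]])
compound = translation(2.0, 1.0) @ rotation(45.0) @ scaling(2.0, 2.0)
print(np.round(apply(compound, square), 3))
```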
... With memory being represented as a shared repository of goals, problems and partial results, and modifiable by all agents, our paradigm can resemble the blackboard model. However, in CORTEX, the objective of this central representation is not only to be a means for moving/sharing information, but also to give a response to the three different meta-problems identified by Wintermute (2012): physical symbol grounding, perceptual abstraction and irreducibility. To achieve these goals, the DSR is also used to estimate how the current state changes when something perturbs it. ...
... It would be pretentious to say that the idea of integrating abstract and concrete concepts into a unique representation is a novel contribution in CORTEX. Apart from surveying previous work, Samuel Wintermute provided a meticulous analysis of the problems inherent to building an abstract representation of real world entities (Wintermute, 2012). Basically, these problems can be summarised into (i) physical symbol grounding, (ii) perceptual abstraction, and (iii) irreducibility. ...
... The imagery architecture by Wintermute (2012). ...
Article
CORTEX is a cognitive robotics architecture inspired by three key ideas: modularity, internal modelling and graph representations. CORTEX is also a computational framework designed to support early forms of intelligence in human-interacting robots by selecting, a priori, a functional decomposition of the capabilities of the robot. This set of abilities is then translated into computational modules or agents, each one built as a network of interconnected software components. The nature of these agents can range from pure reactive modules connected to sensors and/or actuators, to pure deliberative ones, but they can only communicate with each other through a graph structure called Deep State Representation (DSR). DSR is a short-term dynamic representation of the space surrounding the robot, the objects and the humans in it, and the robot itself. All these entities are perceived and transformed into different levels of abstraction, ranging from geometric data to high-level symbolic relations such as "the person is talking and gazing at me". The combination of symbolic and geometric information endows the architecture with the potential to simulate and anticipate the outcome of the actions executed by the robot. In this paper we present recent advances in the CORTEX architecture and several real-world human-robot interaction scenarios in which they have been tested. We describe our interpretation of the ideas inspiring the architecture and the reasons why this specific computational framework is a promising architecture for the social robots of tomorrow.
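The description of the DSR suggests a graph whose nodes carry both metric and symbolic information and whose edges mix geometric and semantic relations. The sketch below is only a schematic rendering of that idea in Python; the node kinds, edge labels and class names are our assumptions, not the CORTEX implementation.

```python
from dataclasses import dataclass, field
from typing import Optional

@dataclass
class Node:
    """A DSR-style node: a symbolic kind plus optional geometric attributes."""
    name: str
    kind: str                          # e.g. "robot", "person", "object"
    pose: Optional[tuple] = None       # metric data: (x, y, z, roll, pitch, yaw)
    attributes: dict = field(default_factory=dict)

@dataclass
class Edge:
    source: str
    target: str
    label: str                         # geometric ("RT") or symbolic ("talking_to")

class StateGraph:
    """Shared short-term representation, readable and writable by all agents."""
    def __init__(self):
        self.nodes, self.edges = {}, []

    def add_node(self, node):
        self.nodes[node.name] = node

    def add_edge(self, source, target, label):
        self.edges.append(Edge(source, target, label))

    def relations_of(self, name):
        return [e for e in self.edges if e.source == name]

g = StateGraph()
g.add_node(Node("robot", "robot", pose=(0, 0, 0, 0, 0, 0)))
g.add_node(Node("person_1", "person", pose=(1.2, 0.4, 0, 0, 0, 1.57)))
g.add_edge("robot", "person_1", "RT")              # geometric transform link
g.add_edge("person_1", "robot", "is_talking_to")   # symbolic relation
print([e.label for e in g.relations_of("person_1")])   # ['is_talking_to']
```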
... Robinson and El Kaliouby [77], finds favour. This is to be expected but it is at the expense of more challenging research directions such as the use of imagery in reasoning [78] or the establishment of truly generic tests of robot ability and generality. ... The abstract agency may actually reside on a robot with sufficient processing power and/or simultaneously elsewhere across a communication network (as indicated in Figure 3). ...
... Alternatively it may seem that some particular topic, for example the computational modelling of emotion [76] as described earlier and reviewed in Robinson and El Kaliouby [77], finds favour. This is to be expected but it is at the expense of more challenging research directions such as the use of imagery in reasoning [78] or the establishment of truly generic tests of robot ability and generality. ...
Chapter
Full-text available
The research work presented in this article investigates and explains the conceptual mechanisms of consciousness and common-sense thinking of animates. These mechanisms are computationally simulated on artificial agents as strategic rules to analyze and compare the performance of agents in critical and dynamic environments. Awareness and attention to specific parameters that affect the performance of agents specify the consciousness level in agents. Common sense is a set of beliefs that are accepted to be true among a group of agents that are engaged in a common purpose, with or without self-experience. The common sense agents are a kind of conscious agents that are given with few common sense assumptions. The so-created environment has attackers with dependency on agents in the survival-food chain. These attackers create a threat mental state in agents that can affect their conscious and common sense behaviors. The agents are built with a multi-layer cognitive architecture COCOCA (Consciousness and Common sense Cognitive Architecture) with five columns and six layers of cognitive processing of each precept of an agent. The conscious agents self-learn strategies for threat management and energy level maintenance. Experimentation conducted in this research work demonstrates animate-level intelligence in their problem-solving capabilities, decision making and reasoning in critical situations.
... • The Soar/SVS architecture [25] combines symbolic reasoning and prototype-based visual filters to infer environment state from computer-generated images depicting a simulated environment. This at least enforces a proper separation between environment and agent, as the latter must parse its perceptions in order to infer the former's state, but the idealized environments are still far removed from the complexity of the real world. ...
... Darker shades of gray represent higher response. 25 ...
Chapter
Self-location—recognizing one’s surroundings and reliably keeping track of current position relative to a known environment—is a fundamental cognitive skill for entities biological and artificial alike. At a minimum, it requires the ability to match current sensory (mainly visual) inputs to memories of previously visited places, and to correlate perceptual changes to physical movement. Both tasks are complicated by variations such as light source changes and the presence of moving obstacles. This article presents the Difference Image Correspondence Hierarchy (DICH), a biologically inspired architecture for enabling self-location in mobile robots. Experiments demonstrate DICH works effectively despite varying environment conditions.
... • The Soar/SVS architecture [25] combines symbolic reasoning and prototype-based visual filters to infer environment state from computer-generated images depicting a simulated environment. This at least enforces a proper separation between environment and agent, as the latter must parse its perceptions in order to infer the former's state, but the idealized environments are still far removed from the complexity of the real world. ...
... Darker shades of gray represent higher response. 25 ...
Article
Visual recognition of previously visited places is a basic cognitive skill for a wide variety of living beings, including humans. This requires a method to extract relevant cues from visual input and successfully match them to memories of known locations, disregarding environmental variations such as lighting changes, viewer pose differences, moving objects and scene occlusion. Interest point correlation is a visual place recognition method inspired by results from neuroscience and psychology; specifically, it addresses those challenges by converting raw visual inputs to a lowvariance representation, selecting regions-of-interest for representation matching, and identifying consistent matching trends. Real-world experiments employing a mobile robot demonstrate that interest point correlation is robust to visual changes, suggesting its founding principles are sound.
... A number of robot experiments have investigated the possibility of using simulations as specified by the simulation hypothesis to guide behaviour (Stening, Jakobsson, & Ziemke, 2005; Svensson, Morse, & Ziemke, 2009b; Ziemke, Jirenhed, & Hesslow, 2005), as well as other similar theories grounding cognition in sensorimotor processes (Gross, Heinze, Seiler, & Stephan, 1999; Hoffmann, 2007; Hoffmann & Möller, 2004; Wintermute, 2012). However, the task is far from trivial and the InSim hypothesis might be useful for constructing more robust simulations. ...
... Although different tasks have been investigated, three aspects are of central concern for simulation theory in general and mental-imagery-like processes in particular: prediction, abstraction and planning. Abstraction concerns the question of how concepts are formed and grounded in sensorimotor states (and simulations) (Holland & Goodman, 2003; Stening et al., 2005; Wintermute, 2012). Planning concerns how simulations can be used to plan, for example, an (optimal) route between places (Baldassarre, 2001). ...
Article
Full-text available
According to the simulation hypothesis, mental imagery can be explained in terms of predictive chains of simulated perceptions and actions, i.e., perceptions and actions are reactivated internally by our nervous system to be used in mental imagery and other cognitive phenomena. Our previous research shows that it is possible but not trivial to develop simulations in robots based on the simulation hypothesis. While there are several previous approaches to modelling mental imagery and related cognitive abilities, the origin of such internal simulations has hardly been addressed. The inception of simulation (InSim) hypothesis suggests that dreaming has a function in the development of simulations by forming associations between experienced, non-experienced but realistic, and even unrealistic perceptions. Here, we therefore develop an experimental set-up based on a simple simulated robot to test whether such dream-like mechanisms can be used to instruct research into the development of simulations and mental imagery-like abilities. Specifically, the hypothesis is that ‘dreams’ informing the construction of simulations lead to faster development of good simulations during waking behaviour. The paper presents initial results in favour of the hypothesis.
... Alternatively it may seem that some particular topic, for example the computational modelling of emotion [76] as described earlier and reviewed in Robinson and El Kaliouby [77], finds favour. This is to be expected but it is at the expense of more challenging research directions such as the use of imagery in reasoning [78] or the establishment of truly generic tests of robot ability and generality. ...
Article
Full-text available
In this paper an overview of the state of research into cognitive robots is given. This is driven by insights arising from research that has moved from simulation to physical robots over the course of a number of sub-projects. A number of major issues arising from seminal research in the area are explored. In particular in the context of advances in the field of robotics and a slowly developing model of cognition and behaviour that is being mapped onto robot colonies. The work presented is ongoing but major themes such as the veracity of data and information, and their effect on robot control architectures are explored. A small number of case studies are presented where the theoretical framework has been used to implement control of physical robots. The limitations of the current research and the wider field of behavioral and cognitive robots are explored.
... Two systems with currently the most well-developed and comprehensive sets of representations in this category are the Spatial/Visual System (SVS) (Lathrop, Wintermute, and Laird 2011; Wintermute 2012), an extension of the Soar cognitive architecture (Laird 2012), and the sketch understanding system CogSketch. ...
Conference Paper
Full-text available
The widely demonstrated ability of humans to deal with multiple representations of information has a number of important implications for a proposed standard model of the mind (SMM). In this paper we outline four and argue that a SMM must incorporate (a) multiple representational formats and (b) meta-cognitive processes that operate on them. We then describe current approaches to extend cognitive architectures with visual-spatial representations, in part to illustrate the limitations of current architectures in relation to the implications we raise but also to identify the basis upon which a consensus about the nature of these additional representations can be agreed. We believe that addressing these implications and outlining a specification for multiple representations should be a key goal for those seeking to develop a standard model of the mind.
... However, the concept of deep representations implies a unified, hierarchical organization of knowledge that ranges from the symbolic layer to the motor one, mapping abstract concepts to, or from, geometric environment models and sensor data structures of the robot. The presence of a detailed representation of the spatial state of the problem is also required in the work of S. Wintermute: ... actions can be simulated (imagined) in terms of this concrete representation, and the agent can derive abstract information by applying perceptual processes to the resulting concrete state [30]. The use of a situational representation of the outer world to endow the robot with the ability to understand the physical consequences of its actions can be extended, in a collaborative scenario, to support proactive robot behaviors. ...
Chapter
Full-text available
Enabling autonomous mobile manipulators to collaborate with people is a challenging research field with a wide range of applications. Collaboration means working with a partner to reach a common goal, and it involves performing both individual and joint actions with that partner. Human-robot collaboration requires at least two conditions to be efficient: a) a common plan, usually under-defined, for all involved partners; and b) for each partner, the capability to infer the intentions of the other in order to coordinate the common behavior. This is a hard problem for robotics since people can change their minds about their envisaged goal or interrupt a task without giving legible reasons. Also, collaborative robots should select their actions taking into account human-aware factors such as safety, reliability and comfort. Current robotic cognitive systems are usually limited in this respect, as they lack the rich dynamic representations and the flexible human-aware planning capabilities needed to succeed in these collaboration tasks. In this paper, we address this problem by proposing and discussing a deep hybrid representation, DSR, which is geometrically organized at several layers of abstraction (deep) and merges symbolic and geometric information (hybrid). This representation is part of a new agent-based robotics cognitive architecture called CORTEX. The agents that form part of CORTEX are in charge of high-level functionalities, reactive and deliberative, and share this representation among them. They keep it synchronized with the real world through sensor readings, and coherent with the internal domain knowledge by validating each update.
... Models of cognitive systems generally address selected aspects of cognition and often focus on specific findings from cognitive experiments (e.g., with respect to memory, attention, spatial imagery; for reviews see Langley et al. (2009) and Wintermute (2012)). Duch et al. (2008) introduced a distinction between different cognitive architectures. ...
Article
Full-text available
It has often been stated that for a neuronal system to become a cognitive one, it has to be large enough. In contrast, we argue that a basic property of a cognitive system, namely the ability to plan ahead, can already be fulfilled by small neuronal systems. As a proof of concept, we propose an artificial neural network, termed reaCog, that, first, is able to deal with a specific domain of behavior (six-legged walking). Second, we show how a minor expansion of this system enables it to plan ahead and deploy existing behavioral elements in novel contexts in order to solve current problems. To this end, the system invents new solutions that are not possible for the reactive network; rather, these solutions result from new combinations of given memory elements. This faculty does not rely on a dedicated system more or less independent of the reactive basis, but results from exploitation of the reactive basis by recruiting the lower-level control structures in such a way that motor planning becomes possible as an internal simulation relying on internal representations grounded in embodied experiences.
... The inclusion of a detailed physical layer in the representation will allow the robot to solve naive physics problems, which cannot be solved from abstractions alone, using temporal projection [26]. The presence of a detailed representation of the spatial state of the problem is also required in the work of Wintermute: ... actions can be simulated (imagined) in terms of this concrete representation, and the agent can derive abstract information by applying perceptual processes to the resulting concrete state [35]. The use of a situational representation of the outer world to endow the robot with the ability to understand the physical consequences of its actions can be extended, in a collaborative scenario, to support proactive robot behaviors. ...
Conference Paper
Full-text available
Collaboration is an essential feature of human social interaction. Briefly, when two or more people agree on a common goal and a joint intention to reach that goal, they have to coordinate their actions to engage in joint actions, planning their courses of actions according to the actions of the other partners. The same holds for teams where the partners are people and robots, resulting on a collection of technical questions difficult to answer. Human-robot collaboration requires the robot to coordinate its behavior to the behaviors of the humans at different levels, e.g., the semantic level, the level of the content and behavior selection in the interaction, and low-level aspects such as the temporal dynamics of the interaction. This forces the robot to internalize information about the motions, actions and intentions of the rest of partners, and about the state of the environment. Furthermore, collaborative robots should select their actions taking into account additional human-aware factors such as safety, reliability and comfort. Current cognitive systems are usually limited in this respect as they lack the rich dynamic representations and the flexible human-aware planning capabilities needed to succeed in tomorrow human-robot collaboration tasks. Within this paper, we provide a tool for addressing this problem by using the notion of deep hybrid representations and the facilities that this common state representation offers for the tight coupling of planners on different layers of abstraction. Deep hybrid representations encode the robot and environment state, but also a robot-centric perspective of the partners taking part in the joint activity.
... Although Pogamut in itself implements many cognitive functions, it is also a recommended middleware for the BotPrize contest and is used with modifications by other groups to implement the UT2004 bots ( [268], [269], [270], [271], [271], [272]). Other video games are Freeciv (REM [273]), Atari Frogger II (Soar [274]), Infinite Mario (Soar [275]), browser games (STAR [276]) and custom made games. ...
Article
Full-text available
In this paper we present a broad overview of the last 40 years of research on cognitive architectures. Although the number of existing architectures is nearing several hundred, most existing surveys do not reflect this growth and focus on a handful of well-established architectures. While their contributions are undeniable, they represent only a part of the research in the field. Thus, in this survey we wanted to shift the focus towards a more inclusive and high-level overview of the research in cognitive architectures. Our final set of 86 architectures includes 55 that are still actively developed, and they borrow from a diverse set of disciplines, spanning areas from psychoanalysis to neuroscience. To keep the length of this paper within reasonable limits we discuss only the core cognitive abilities, such as perception, attention mechanisms, learning and memory structure. To assess the breadth of practical applications of cognitive architectures we gathered information on over 700 practical projects implemented using the cognitive architectures in our list. We use various visualization techniques to highlight overall trends in the development of the field. Our analysis of practical applications shows that most architectures are very narrowly focused on a particular application domain. Furthermore, there is an apparent gap between general research in robotics and computer vision and research in these areas within the cognitive architectures field. It is very clear that biologically inspired models do not have the same range and efficiency as systems based on engineering principles and heuristics. Another observation is related to a general lack of collaboration. Several factors hinder communication, such as the closed nature of individual projects (only one-third of the architectures reviewed here are open-source) and terminological differences.
... Indeed, mental imagery can be viewed as a synonym for internal simulation: "all imagery is mental emulation" (Moulton & Kosslyn, 2009), p. 1276. For a computational perspective on mental imagery, specifically in the context of cognitive architectures, see (Wintermute, 2012). ...
Article
Full-text available
This paper complements Ron Sun's influential Desiderata for Cognitive Architectures by focussing on the desirable attributes of a biologically-inspired cognitive architecture for an agent with a capacity for autonomous development. Ten desiderata are identified, dealing with value systems & motives, embodiment, sensorimotor contingencies, perception, attention, prospective action, memory, learning, internal simulation, and constitutive autonomy. These desiderata are motivated by studies in developmental psychology, cognitive neuroscience, and enactive cognitive science. All ten focus on the ultimate aspects of cognitive development — why a feature is necessary and what it enables — rather than on the proximate mechanisms by which it can be realized. As such, the desiderata are for the most part neutral regarding the paradigm of cognitive science — cognitivist or emergent — that is adopted when designing a cognitive architecture. Where some element of a desideratum is specific to a particular paradigm, this is noted.
... The first is Soar/SVI [Lathrop 2009], which gives Soar the ability to create and reason about spatial representations and abstractions (imagery). The second is Soar/SVS [Wintermute 2010, Wintermute 2011], which adds the ability to simulate the effects of actions in the environment. This body of work is the most closely related work to ours. ...
... The ability to predict an intended action outcome and its consequences is important in action planning. It has also been hypothesized that motor imagery plays an important role in predicting actions [12][13][14][15]. For example, when planning an intended action, using imagery to consider important timing and biomechanical information helps predict the movement's sensory consequences. ...
Article
Full-text available
Abstract: Contemporary research findings indicate that in older persons (typically > 64 years) there are functional decrements in the ability to mentally represent and effectively plan motor actions. Actions, if poorly planned, can result in falling, a major health concern for the elderly. Whereas a number of factors may contribute to falls, over- or underestimation of reach abilities may lead to loss of postural control (balance) and pose a higher risk of falling. Our intent with this paper was to provide: (1) a brief background of the problem, (2) suggested strategies for mental (motor) imagery practice in the context of reach planning, and (3) general guidelines and a sample practice format of a training program for clinical use. Mental (motor) imagery practice of reach planning has potential for improving motor performance in reach-related everyday activities and reducing the risk of falls in older persons.
... They are an important starting point for achieving the situational knowledge that allows a spatial representation of the state of the problem. They also allow the robot to build, from this knowledge, simulations that let it anticipate situations, reactions and decisions, as indicated in the work proposed by S. Wintermute [Wintermute, 2012]. ...
... However, the concept of deep representations implies a unified, hierarchical organization of knowledge that ranges from the symbolic layer to the motor one, mapping abstract concepts to, or from, geometric environment models and sensor data structures of the robot. The presence of a detailed representation of the spatial state of the problem is also required in the work of S. Wintermute: ... actions can be simulated (imagined) in terms of this concrete representation, and the agent can derive abstract information by applying perceptual processes to the resulting concrete state [30]. The use of a situational representation of the outer world to endow the robot with the ability to understand the physical consequences of its actions can be extended, in a collaborative scenario, to support proactive robot behaviors. ...
Conference Paper
Full-text available
Enabling autonomous mobile manipulators to collaborate with people is a challenging research field with a wide range of applications. Collaboration means working with a partner to reach a common goal, and it involves performing both individual and joint actions with that partner. Human-robot collaboration requires at least two conditions to be efficient: a) a common plan, usually under-defined, for all involved partners; and b) for each partner, the capability to infer the intentions of the other in order to coordinate the common behavior. This is a hard problem for robotics since people can change their minds about their envisaged goal or interrupt a task without giving legible reasons. Also, collaborative robots should select their actions taking into account human-aware factors such as safety, reliability and comfort. Current robotic cognitive systems are usually limited in this respect, as they lack the rich dynamic representations and the flexible human-aware planning capabilities needed to succeed in these collaboration tasks. In this paper, we address this problem by proposing and discussing a deep hybrid representation, DSR, which is geometrically organized at several layers of abstraction (deep) and merges symbolic and geometric information (hybrid). This representation is part of a new agent-based robotics cognitive architecture called CORTEX. The agents that form part of CORTEX are in charge of high-level functionalities, reactive and deliberative, and share this representation among them. They keep it synchronized with the real world through sensor readings, and coherent with the internal domain knowledge by validating each update.
... However, reflecting the interdependence of perception and action, covert action often has elements of both motor imagery and visual imagery and, vice versa, the simulation of perception often has elements of motor imagery. Visual imagery and motor imagery are sometimes referred to collectively as mental imagery (Wintermute, 2012). Moulton and Kosslyn (2009) identify several different types of perceptual imagery and distinguish between two different types of simulation: instrumental simulation and emulative simulation. ...
Article
Full-text available
Prospection lies at the core of cognition: it is the means by which an agent – a person or a cognitive robot – shifts its perspective from immediate sensory experience to anticipate future events, be they the actions of other agents or the outcome of its own actions. Prospection, accomplished by internal simulation, requires mechanisms for both perceptual imagery and motor imagery. While it is known that these two forms of imagery are tightly entwined in the mirror neuron system, we do not yet have an effective model of the mentalizing network which would provide a framework to integrate declarative episodic and procedural memory systems and to combine experiential knowledge with skillful know-how. Such a framework would be founded on joint perceptuo-motor representations. In this paper, we examine the case for this form of representation, contrasting sensory-motor theory with ideo-motor theory, and we discuss how such a framework could be realized by joint episodic-procedural memory. We argue that such a representation framework has several advantages for cognitive robots. Since episodic memory operates by recombining imperfectly recalled past experience, this allows it to simulate new or unexpected events. Furthermore, by virtue of its associative nature, joint episodic-procedural memory allows the internal simulation to be conditioned by current context, semantic memory, and the agent’s value system. Context and semantics constrain the combinatorial explosion of potential perception-action associations and allow effective action selection in the pursuit of goals, while the value system provides the motives that underpin the agent’s autonomy and cognitive development. This joint episodic-procedural memory framework is neutral regarding the final implementation of these episodic and procedural memories, which can be configured sub-symbolically as associative networks or symbolically as content-addressable image databases and databases of motor-control scripts.
... The same process of retaining and combining information is likely to be necessary for building a mental model. Furthermore, VSTM is likely to have at least some imagery capability (Phillips, 1983;Wintermute, 2012). Phillips (1983), one of the first to introduce the concept of VSTM, emphasized that VSTM facilitates our ability to visualize problem space and is not just a sensory store. ...
Article
Full-text available
This paper introduces a framework of human reasoning and its ACT-R based implementation called the Human Reasoning Module (HRM). Inspired by the human mind, the framework seeks to explain how a single system can exhibit different forms of reasoning ranging from deduction to induction, from deterministic to probabilistic inference, from rules to mental models. The HRM attempts to unify the previously mentioned forms of reasoning into a single coherent system rather than treating them as loosely connected separate subsystems. The validity of the HRM is tested with cognitive models of three tasks involving simple causal deduction, reasoning on spatial relations and Bayesian-like inference of cause/effect. The first model explains why people use an inductive, probabilistic reasoning process even when using ostensibly deductive arguments such as Modus Ponens and Modus Tollens. The second model argues that visual bottom-up processes can do fast and efficient semantic processing. Based on this argument, the model explains why people perform worse in a spatial relation problem with ambiguous solutions than in a problem with a single solution. The third model demonstrates that statistics of Bayesian-like reasoning can be reproduced using a combination of rule-based reasoning and probabilistic declarative retrievals. All three models were validated successfully against human data. The HRM demonstrates that a single system can express different facets of reasoning exhibited by the human mind. As part of a cognitive architecture, the HRM promises to be a useful and accessible tool for exploring the depths of the human mind and modeling biologically inspired agents.
... These representations are hypothesized to be an integral part of action planning. Complementing the forward model idea and central to the present discussion is the widely acknowledged proposition that simulation in the form of MI provides a window into the process of action representation, that is, it reflects an internal action representation (Chabeauti, Assaiante, & Vaugoyeau, 2012;Jeannerod, 2001;Wintermute, 2012). ...
Article
Full-text available
Recent research findings indicate that with older adulthood there are functional decrements in spatial cognition and, more specifically, in the ability to mentally represent and effectively plan motor actions. A typical finding is a significant over- or underestimation of one's actual physical abilities during movement planning, planning that has implications for movement efficiency and physical safety. A practical, daily-life example is the estimation of reachability, a situation that for the elderly may be linked with fall incidence. A strategy used to mentally represent action is the use of motor imagery, an ability that also declines with advancing older age. This brief review highlights research findings on mental representation and motor imagery in the elderly and addresses the implications for improving movement efficiency and lowering the risk of movement-related injury.
... The result is processes that enable successful planning and execution of action. Accompanying the forward model propositions, and central to the discussion here, is the widely acknowledged observation that simulation in the form of motor imagery provides a window into the process of action representation [5][6][7][8]. ...
Article
Full-text available
Physiotherapy interventions have proven to play an important role in preventing and rehabilitating fall injury in the elderly. An important goal of such programs is to modify risk factors and thereby reduce the likelihood of future falls. Recent research observations suggest that one such factor is the ability to mentally represent and plan movements, an ability that declines with advancing age. In addition to physiotherapy exercise for balance, mobility and stabilizing strength, the use of motor imagery practice, a form of mental representation, has gained interest in the clinical community. Recent research findings highlight the merits of combining physiotherapy with motor imagery practice. Such practice has the potential to help the individual maintain motor (action) planning networks while recovering from brain and/or muscle injury. Another, more proactive, approach is to use motor imagery practice to improve action planning and subsequent movement efficiency. This brief review highlights research findings on mental representation and motor imagery, notes implications for the elderly, and provides recommendations for practice strategies to improve motor planning and potentially lower the risk of movement-related injury.
... Driving that research interest are bodies of evidence suggesting that, for example, motor control and motor simulation states are functionally equivalent (Burianová et al., 2013; Kunz, Creem-Regehr, & Thompson, 2009; Lorey et al., 2010). Furthermore, for those interested in the connection between action processing and underlying cognitive processes, recent research indicates that there may be a critical functional relationship between motor imagery and higher-level cognitive processes (Barsalou, 2008; see review by Madan & Singhal, 2012; Wintermute, 2012). Kosslyn, one of the foremost experts on the topic of mental imagery, declares that mental simulation of all types underscores memory, reasoning, and learning (Kosslyn, Thompson, & Ganis, 2006). ...
Article
Full-text available
This study examined the role of visual working memory when transforming visual representations to motor representations in the context of motor imagery. Participants viewed randomized number sequences of three, four, and five digits, and then reproduced the sequence by finger tapping using motor imagery or actually executing the movements; movement duration was recorded. One group viewed the stimulus for three seconds and responded immediately, while the second group had a three-second view followed by a three-second blank screen delay before responding. As expected, delay group times were longer with each condition and digit load. Whereas correlations between imagined and executed actions (temporal congruency) were significant in a positive direction for both groups, interestingly, the delay group's values were significantly stronger. That outcome prompts speculation that delay influenced the congruency between motor representation and actual execution.
Preprint
Learning image transformations is essential to the idea of mental simulation as a method of cognitive inference. We take a connectionist modeling approach, using planar neural networks to learn fundamental imagery transformations, like translation, rotation, and scaling, from perceptual experiences in the form of image sequences. We investigate how variations in network topology, training data, and image shape, among other factors, affect the efficiency and effectiveness of learning visual imagery transformations, including effectiveness of transfer to operating on new types of data.
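As a toy version of the idea of learning an image transformation from perceptual sequences, the sketch below trains a single linear layer by gradient descent to reproduce a one-pixel translation of small binary images and then applies it to an unseen input; the planar network topologies studied in the preprint are not reproduced here, and all sizes and rates are arbitrary choices of ours.

```python
import numpy as np

rng = np.random.default_rng(0)
H = W = 8
D = H * W

def shift_right(img):
    """Target transformation: translate the image one pixel to the right."""
    out = np.zeros_like(img)
    out[:, 1:] = img[:, :-1]
    return out

# Training data: random sparse binary images and their translated versions.
X = (rng.random((500, H, W)) < 0.1).astype(float)
Y = np.array([shift_right(x) for x in X])
Xf, Yf = X.reshape(-1, D), Y.reshape(-1, D)

# One linear layer (D x D) learned by batch gradient descent on squared error.
Wmat = np.zeros((D, D))
lr = 0.5
for _ in range(200):
    pred = Xf @ Wmat
    grad = Xf.T @ (pred - Yf) / len(Xf)
    Wmat -= lr * grad

# Transfer test: apply the learned transform to an image never seen in training.
test = np.zeros((H, W))
test[3, 2] = 1.0
result = (test.reshape(1, D) @ Wmat).reshape(H, W)
print("peak moved to column", int(result.argmax() % W))   # expect column 3
```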
Article
Full-text available
In this paper we present a broad overview of the last 40 years of research on cognitive architectures. To date, the number of existing architectures has reached several hundred, but most of the existing surveys do not reflect this growth and instead focus on a handful of well-established architectures. In this survey we aim to provide a more inclusive and high-level overview of the research on cognitive architectures. Our final set of 84 architectures includes 49 that are still actively developed, and borrow from a diverse set of disciplines, spanning areas from psychoanalysis to neuroscience. To keep the length of this paper within reasonable limits we discuss only the core cognitive abilities, such as perception, attention mechanisms, action selection, memory, learning, reasoning and metareasoning. In order to assess the breadth of practical applications of cognitive architectures we present information on over 900 practical projects implemented using the cognitive architectures in our list. We use various visualization techniques to highlight the overall trends in the development of the field. In addition to summarizing the current state-of-the-art in the cognitive architecture research, this survey describes a variety of methods and ideas that have been tried and their relative success in modeling human cognitive abilities, as well as which aspects of cognitive behavior need more research with respect to their mechanistic counterparts and thus can further inform how cognitive science might progress.
Article
I apply my proposed modification of Soar/Spatial/Visual System and Kosslyn’s (1983) computational operations on images to problems within a 2 × 2 taxonomy that classifies research according to whether the coding involves static or dynamic relations within an object or between objects (Newcombe & Shipley, 2015). I then repeat this analysis for problems that are included in mathematics and science curricula. Because many of these problems involve reasoning from diagrams Hegarty’s (2011) framework for reasoning from visual-spatial displays provides additional support for organizing this topic. Two more relevant frameworks specify reasoning at different levels of abstraction (Reed, 2016) and with different combinations of actions and objects (Reed, 2018). The article concludes with suggestions for future directions.
Book
This book constitutes revised selected papers from the Second International Workshop on Brain-Inspired Computing, BrainComp 2015, held in Cetraro, Italy, in July 2015. The 14 papers presented in this volume were carefully reviewed and selected for inclusion in this book. They deal with brain structure and function; computational models and brain-inspired computing methods with practical applications; high performance computing; and visualization for brain simulations.
Conference Paper
Full-text available
Abstraction is a core concept in cognitive science, representing a challenge for all theories of cognition. Conceptualization of abstraction is also complicated by the fact that it is an entity with several potential meanings and involved mechanisms. Abstraction occupies the agenda of many disciplines, including psychology, linguistics, artificial intelligence, and more recently, neuroscience. In this paper, we attempt to shed light on this topic by summarizing evidence accumulated in these disciplines.
Article
We describe the all-engine-out landing of Air Transat Flight 236 in the Azores Islands (August 24, 2001) and use certain aspects of that accident to motivate a conceptual framework for the organization and display of information in complex human-interactive systems. Four hours into the flight, the aircraft experienced unusual oil indications. Two hours later, a fuel system failure led to a full-blown emergency that was not evident to the crew until it was too late. Although all relevant data to avoid the emergency were available to the aircraft computer systems, the design choices made about what to display and how to display it kept the pilots in the dark. The framework proposed here consists of six levels, beginning from the extraction of data from physical signals, abstracting from raw data to form visual representations on the user interface, and finally integrating high-level elements and information structures. We illustrate how the framework can be used to analyze some of the shortcomings in current display design, and we discuss some principles of information organization and formal analysis of task logic that might help to improve design. Finally, we sketch a design for a helicopter engine display based on these principles.
Article
Full-text available
In problem solving, a goal/subgoal is either solved by generating needed information from current information, or further decomposed into additional subgoals. In traditional problem solving, goals, knowledge, and problem states are all modeled as expressions composed of symbolic predicates, and information generation is modeled as rule application based on matching of symbols. In problem solving with diagrams, on the other hand, an additional means of generating information is available, viz., visual perception on diagrams. A subgoal is solved opportunistically by whichever way of generating information is successful. Diagrams are especially effective because certain types of information entailed by the given information are explicitly available - as emergent objects and emergent relations - for pickup by visual perception. We add to the traditional problem solving architecture a component for representing the diagram as a configuration of diagrammatic objects of three basic types (point, curve and region); a set of perceptual routines that recognize emergent objects and evaluate a set of generic spatial relations between objects; and a set of action routines that create or modify the diagram. We discuss how domain-specific capabilities can be added on top of the generic capabilities of the diagram system. The working of the architecture is illustrated by means of an application scenario.
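To give a feel for the diagrammatic object types and generic perceptual routines described above, here is a small Python sketch; the geometric simplifications (polyline curves, rectangular regions) and routine names are ours, not the cited architecture's.

```python
from dataclasses import dataclass

@dataclass
class Point:
    x: float
    y: float

@dataclass
class Curve:          # a curve approximated as a polyline of (x, y) vertices
    vertices: list

@dataclass
class Region:         # a region simplified to an axis-aligned rectangle
    xmin: float
    ymin: float
    xmax: float
    ymax: float

# Generic perceptual routines: evaluate spatial relations over the diagram.
def inside(p: Point, r: Region) -> bool:
    return r.xmin <= p.x <= r.xmax and r.ymin <= p.y <= r.ymax

def left_of(a: Region, b: Region) -> bool:
    return a.xmax < b.xmin

def curve_enters(c: Curve, r: Region) -> bool:
    return any(inside(Point(x, y), r) for x, y in c.vertices)

# Action routine: add a new object to the diagram; emergent relations are then
# picked up by the perceptual routines rather than deduced symbolically.
diagram = {"room": Region(0, 0, 10, 10), "door": Point(10, 5)}
diagram["path"] = Curve([(12, 5), (10, 5), (5, 5)])
print(curve_enters(diagram["path"], diagram["room"]))   # True
print(inside(diagram["door"], diagram["room"]))         # True
```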
Article
Full-text available
Real-world planners must be able to temporally project their external actions internally. Typically, this has been done entirely at an abstract representational level. We investigate an alternative approach, performing projections on a concrete, property-based representation, and re-deriving the abstract level from it. We show that this approach can greatly alleviate the frame problem in certain types of spatial domains, while maintaining the advantages of planning at an abstract level. We present a comparison of the approaches in various object-manipulation domains, including the results of a re-implementation of the Robo-Soar system (Laird et al., 1989).
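A minimal sketch of the approach as we read the abstract: the action is applied to a concrete, coordinate-level state, and the abstract predicates are re-derived from the result instead of being updated directly, which is where the frame-problem savings come from. The blocks-world encoding and names below are our own illustration.

```python
# Concrete state: each block is an (x, y) position; blocks share one side length.
state = {"A": (0.0, 0.0), "B": (2.0, 0.0), "C": (2.0, 1.0)}
BLOCK = 1.0

def move_onto(state, block, target):
    """Concrete action: place `block` directly on top of `target`."""
    tx, ty = state[target]
    new = dict(state)
    new[block] = (tx, ty + BLOCK)
    return new

def extract_predicates(state):
    """Re-derive the abstract level from the concrete state."""
    on = set()
    for a, (ax, ay) in state.items():
        for b, (bx, by) in state.items():
            if a != b and abs(ax - bx) < 1e-6 and abs(ay - (by + BLOCK)) < 1e-6:
                on.add(("on", a, b))
    clear = {("clear", a) for a in state
             if not any(rel[2] == a for rel in on)}
    return on | clear

print(extract_predicates(state))
# e.g. {('on', 'C', 'B'), ('clear', 'A'), ('clear', 'C')}
projected = move_onto(state, "A", "C")     # temporally project the action
print(extract_predicates(projected))
# now includes ('on', 'A', 'C'); 'C' is no longer clear
```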
Article
Full-text available
The argument for a propositional over a pictorial representation for visual imagery has largely taken the form of an attack on the logical coherence of pictorial representations. These attacks have not been valid, since one can develop a coherent dual-code model involving pictorial and verbal (nonpropositional) representations. On the other hand, empirical demonstrations that are claimed to support pictorial representations fail to discriminate such representations from propositional ones. It is argued that the failure of the anti- and pro-pictorial arguments stems from a fundamental indeterminacy in deciding issues of representation. It is shown that wide classes of different representations, and in particular propositional vs dual-code models, can be made to yield identical behavior predictions. Criteria such as parsimony and efficiency, in addition to prediction of behavior, may yield further constraints on representation; in particular, it may be possible to establish whether there are two codes, one for visual information and one for verbal, or whether there is a single abstract code. It is concluded that, barring decisive physiological data, it will not be possible to establish whether an internal representation is pictorial or propositional.
Article
Full-text available
This paper presents an approach for integrating action in the world with general symbolic reasoning. Instead of working with task-specific symbolic abstractions of continuous space, our system mediates action through a simple spatial representation. Low-level action controllers work in the context of this representation, and a high-level symbolic system has access to it. By allowing actions to be spatially simulated, general reasoning about action is possible. Only very simple task-independent symbolic abstractions of space are necessary, and controllers can be used without the need for symbolic characterization of their behavior. We draw parallels between this system and a modern robotic motion planning algorithm, RRT. This algorithm is instantiated in our system, and serves as a case study showing how the architecture can effectively address real robotics problems.
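Since the abstract draws a parallel with RRT, a bare-bones 2D RRT for a point robot with one circular obstacle is sketched below; it is a generic textbook-style version under our own parameter choices, not the instantiation inside the described architecture (edge collision checking is omitted for brevity).

```python
import math
import random

random.seed(1)
START, GOAL = (0.0, 0.0), (9.0, 9.0)
OBSTACLE, RADIUS = (5.0, 5.0), 2.0      # one circular obstacle
STEP, GOAL_TOL = 0.5, 0.6

def collision_free(p):
    return math.dist(p, OBSTACLE) > RADIUS

def steer(src, dst):
    """Move from src toward dst by at most STEP."""
    d = math.dist(src, dst)
    if d <= STEP:
        return dst
    t = STEP / d
    return (src[0] + t * (dst[0] - src[0]), src[1] + t * (dst[1] - src[1]))

def rrt(max_iters=5000):
    nodes, parent = [START], {START: None}
    for _ in range(max_iters):
        sample = (random.uniform(0, 10), random.uniform(0, 10))
        nearest = min(nodes, key=lambda n: math.dist(n, sample))
        new = steer(nearest, sample)
        if not collision_free(new):
            continue
        nodes.append(new)
        parent[new] = nearest
        if math.dist(new, GOAL) < GOAL_TOL:
            path, node = [], new
            while node is not None:          # walk back to the start
                path.append(node)
                node = parent[node]
            return path[::-1]
    return None

print(len(rrt() or []), "waypoints found")
```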
Article
Full-text available
Inspired by mental imagery, we present results of extending a symbolic cognitive architecture (Soar) with general computational mechanisms to support reasoning with symbolic, quantitative spatial, and visual depictive representations. Our primary goal is to achieve new capabilities by combining and manipulating these representations using specialized processing units specific to a modality but independent of task knowledge. This paper describes the architecture supporting behavior in an environment where perceptual-based thought is inherent to problem solving. Our results show that imagery provides the agent with additional functional capabilities improving its ability to solve rich spatial and visual problems.
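The three kinds of representation named in the abstract (symbolic, quantitative spatial, visual depictive) can be pictured for a single toy scene as follows; this Python sketch is only meant to convey the flavor of the levels and is not the Soar/SVS data model (object names, grid size and the scene itself are ours).

```python
import numpy as np

# Quantitative spatial level: object centroids and sizes in scene coordinates.
spatial = {
    "pedestal": {"center": (3.0, 1.5), "size": (1.0, 3.0)},
    "block":    {"center": (3.0, 4.0), "size": (2.0, 2.0)},
}

# Visual depictive level: rasterize the scene into a coarse bitmap
# (row index grows with y here, so the printout appears vertically flipped).
def rasterize(spatial, width=8, height=8):
    grid = np.zeros((height, width), dtype=int)
    for obj in spatial.values():
        (cx, cy), (sx, sy) = obj["center"], obj["size"]
        x0, x1 = int(cx - sx / 2), int(cx + sx / 2)
        y0, y1 = int(cy - sy / 2), int(cy + sy / 2)
        grid[max(y0, 0):min(y1, height), max(x0, 0):min(x1, width)] = 1
    return grid

# Symbolic level: qualitative predicates extracted from the spatial level.
def above(a, b):
    return spatial[a]["center"][1] > spatial[b]["center"][1]

print(rasterize(spatial))
print(("above", "block", "pedestal"), above("block", "pedestal"))   # True
```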
Article
Full-text available
The authors have proposed a very simple autonomous learning system consisting of one neural network (NN), whose inputs are raw sensor signals and whose outputs are passed directly to actuators as control signals, and which is trained using reinforcement learning (RL). However, the current opinion seems to be that such simple learning systems do not actually work on complicated tasks in the real world. In this paper, with a view to developing higher functions in robots, the authors argue for the necessity of autonomous learning in a massively parallel and cohesively flexible system with massive inputs, based on considerations about brain architecture and the sequential property of our consciousness. The authors also argue that more importance should be placed on "optimization" of the total system under a uniform criterion than on "understandability" for humans. Thus, the authors attempt to stress the importance of their proposed system when considering future research on robot intelligence. The experimental result in a real-world-like environment shows that image recognition from as many as 6240 visual signals can be acquired through RL under various backgrounds and light conditions without providing any knowledge about image processing or the target object. It works even for camera image inputs that were not experienced in learning. In the hidden layer, template-like representations, division of roles between hidden neurons, and representations that detect the target uninfluenced by light condition or background were observed after learning. The autonomous acquisition of such useful representations or functions suggests potential for avoiding the frame problem and for the development of higher functions.
Conference Paper
Full-text available
An embodied agent senses the world at the pixel level through a large number of sense elements. In order to function intelligently, an agent needs high-level concepts, grounded in the pixel level. For human designers to program these concepts and their grounding explicitly is almost certainly intractable, so the agent must learn these foundational concepts autonomously. We describe an approach by which an autonomous learning agent can bootstrap its way from pixel-level interaction with the world, to individuating and tracking objects in the environment, to learning an effective policy for its behavior. We use methods drawn from computational scientific discovery to identify derived variables that support simplified models of the dynamics of the environment. These derived variables are abstracted to discrete qualitative variables, which serve as features for temporal difference learning. Our method bridges the gap between the continuous tracking of objects and the discrete state representation necessary for efficient and effective learning. We demonstrate and evaluate this approach with an agent experiencing a simple simulated world, through a sensory interface consisting of 60,000 time-varying binary variables in a 200 x 300 array, plus a three-valued motor signal and a real-valued reward signal.
Article
Full-text available
In this paper, we examine the motivations for research on cognitive architectures and review some candidates that have been explored in the literature. After this, we consider the capabilities that a cognitive architecture should support, some properties that it should exhibit related to representation, organization, performance, and learning, and some criteria for evaluating such architectures at the systems level. In closing, we discuss some open issues that should drive future research in this important area.
Article
Full-text available
In this paper, we describe an architectural modification to Soar that gives a Soar agent the opportunity to learn statistical information about the past success of its actions and utilize this information when selecting an operator. This mechanism serves the same purpose as production utilities in ACT-R, but the implementation is more directly tied to the standard definition of the reinforcement learning (RL) problem. The paper explains our implementation, gives a rationale for adding an RL capability to Soar, and shows results for Soar-RL agents’ performance on two tasks.
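As a rough sketch of the kind of mechanism this abstract describes, the following code learns numeric preferences over operators with a Q-learning-style update and selects among proposed operators epsilon-greedily. The state/operator encodings, parameter values, and function names are assumptions for illustration, not the cited implementation.

```python
# Minimal sketch of learning numeric operator preferences with a
# Q-learning-style update; parameters and encodings are illustrative.
import random
from collections import defaultdict

ALPHA, GAMMA, EPSILON = 0.1, 0.9, 0.1
q_values = defaultdict(float)  # (state, operator) -> learned preference


def select_operator(state, operators):
    """Epsilon-greedy selection over the currently proposed operators."""
    if random.random() < EPSILON:
        return random.choice(operators)
    return max(operators, key=lambda op: q_values[(state, op)])


def update(state, operator, reward, next_state, next_operators):
    """Move the selected operator's value toward reward + discounted best next value."""
    best_next = max((q_values[(next_state, op)] for op in next_operators), default=0.0)
    target = reward + GAMMA * best_next
    q_values[(state, operator)] += ALPHA * (target - q_values[(state, operator)])
```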
Conference Paper
Full-text available
Spatial reasoning is a fundamental aspect of intelligent behavior, which cognitive architectures must address in a problem-independent way. Bimodal systems, employing both qualitative and quantitative representations of spatial information, are efficient and psychologically plausible means for spatial reasoning. Any such system must employ a translation from the qualitative level to the quantitative, where new objects (images) are created through the process of predicate projection. This translation has received little scrutiny. We examine this issue in the context of a bimodal spatial reasoning system integrated with a cognitive architecture (Soar). As part of this system, we define an expressive language for predicate projection that supports general and flexible image creation. We demonstrate this system on multiple spatial reasoning problems in the ORTS real-time strategy game environment.
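To make the idea of predicate projection concrete, the sketch below instantiates a quantitative location for a new image so that a simple qualitative predicate holds. The predicate ("left-of" with a clearance margin) and the sampling scheme are assumptions for the example, not the projection language defined in the cited work.

```python
# Illustrative sketch of predicate projection: creating a concrete location
# in quantitative space so that a qualitative predicate is satisfied.
import random


def project_left_of(anchor_xy, clearance=1.0, spread=5.0):
    """Sample a concrete (x, y) for a new image strictly left of the anchor object."""
    ax, ay = anchor_xy
    x = ax - clearance - random.uniform(0.0, spread)
    y = ay + random.uniform(-spread, spread)
    return (x, y)


# Example: imagine a new object to the left of an object located at (10.0, 4.0).
new_location = project_left_of((10.0, 4.0))
```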
Conference Paper
Full-text available
Symbolic AI systems typically have difficulty reasoning about motion in continuous environments, such as determining whether a cornering car will clear a close obstacle. Bimodal systems, integrating a qualitative symbolic system with a quantitative diagram-like spatial representation, are capable of solving this sort of problem, but questions remain of how and where knowledge about fine-grained motion processes is represented, and how it is applied to the problem. In this paper, we argue that forward simulation of motion is an appropriate method, and introduce continuous motion models to enable this simulation. These motion-specific models control behavior of objects at the spatial level, while general mechanisms in the higher qualitative level control and monitor them. This interaction of low- and high-level activity allows for problem solving that is both precise in individual problems and general across multiple problems. In addition, this approach allows perception and action mechanisms to be reused in reasoning about hypothetical motion problems and abstract non-motion problems, and points to how symbolic AI can become more grounded in the real world. We demonstrate implemented systems that solve problems in diverse domains, and connections to action control are discussed.
Conference Paper
Full-text available
Rich representations in reinforcement learning have been studied for the purpose of enabling generalization and making learning feasible in large state spaces. We introduce Object-Oriented MDPs (OO-MDPs), a representation based on objects and their interactions, which is a natural way of modeling environments and offers important generalization opportunities. We introduce a learning algorithm for deterministic OO-MDPs and prove a polynomial bound on its sample complexity. We illustrate the performance gains of our representation and algorithm in the well-known Taxi domain, plus a real-life videogame.
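The sketch below shows the flavor of an object-oriented state: a collection of typed objects with attributes, with dynamics written against object classes and relations rather than ground states. The class names, the "touching" relation, and the effect rule are assumptions for illustration, not the cited formalism.

```python
# Sketch of an object-oriented state in the spirit of OO-MDPs: typed objects
# with attributes, and a transition rule expressed over object relations.
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Obj:
    cls: str               # object class, e.g. "taxi", "passenger", "wall"
    attrs: Dict[str, int]  # attribute values, e.g. {"x": 3, "y": 1}


def touching(a: Obj, b: Obj) -> bool:
    """A simple relation over object attributes (4-neighborhood adjacency)."""
    return abs(a.attrs["x"] - b.attrs["x"]) + abs(a.attrs["y"] - b.attrs["y"]) == 1


def move_east(state: List[Obj]) -> List[Obj]:
    """Effect rule: the taxi moves east unless a wall occupies the cell to its east."""
    taxi = next(o for o in state if o.cls == "taxi")
    blocked = any(
        o.cls == "wall"
        and o.attrs["y"] == taxi.attrs["y"]
        and o.attrs["x"] == taxi.attrs["x"] + 1
        for o in state
    )
    if not blocked:
        taxi.attrs["x"] += 1
    return state
```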
Conference Paper
Full-text available
One approach in pursuit of general intelligent agents has been to concentrate on the underlying cognitive architecture, of which Soar is a prime example. In the past, Soar has relied on a minimal number of architectural modules together with purely symbolic representations of knowledge. This paper presents the cognitive architecture approach to general intelligence and the traditional, symbolic Soar architecture. This is followed by major additions to Soar: non-symbolic representations, new learning mechanisms, and long-term memories.
Conference Paper
Full-text available
State abstraction (or state aggregation) has been extensively studied in the fields of artificial intelligence and operations research. Instead of working in the ground state space, the decision maker usually finds solutions in the abstract state space much faster by treating groups of states as a unit by ignoring irrelevant state information. A number of abstractions have been proposed and studied in the reinforcement-learning and planning literatures, and positive and negative results are known. We provide a unified treatment of state abstraction for Markov decision processes. We study five particular abstraction schemes, some of which have been proposed in the past in different forms, and analyze their usability for planning and learning.
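A minimal sketch of state aggregation follows: an abstraction function maps a ground state to an abstract state by discarding features, and value learning proceeds over the abstract space. The particular abstraction (coarse grid position) is an assumption about what counts as irrelevant for some hypothetical task.

```python
# Sketch of state aggregation: ground states mapped to abstract states by an
# abstraction function phi, with TD learning applied in the abstract space.
from collections import defaultdict


def phi(ground_state):
    """Abstraction: keep only the coarse grid cell, drop everything else."""
    x, y, *_irrelevant = ground_state
    return (x // 10, y // 10)


values = defaultdict(float)  # abstract state -> value estimate


def td_update(ground_state, reward, next_ground_state, alpha=0.1, gamma=0.95):
    """Temporal-difference update applied to the abstracted states."""
    s, s_next = phi(ground_state), phi(next_ground_state)
    values[s] += alpha * (reward + gamma * values[s_next] - values[s])
```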
Conference Paper
Full-text available
We investigate the problem of reinforcement learning (RL) in a challenging object-oriented environment, where the functional diversity of objects is high, and the agent must learn quickly by generalizing its experience to novel situations. We present a novel two-layer architecture, which can achieve efficient learning of value functions for such environments. The algorithm is implemented by integrating an unsupervised, hierarchical clustering component into the Soar cognitive architecture. Our system coherently incorporates several principles in machine learning and knowledge representation including: dimension reduction, competitive learning, hierarchical representation and sparse coding. We also explore the types of prior domain knowledge that can be used to regulate learning based on the characteristics of the environment. The system is empirically evaluated in an artificial domain consisting of interacting objects with diverse functional properties and multiple functional roles. The results demonstrate that the flexibility of hierarchical representation naturally integrates with our novel value function approximation scheme and together they can significantly improve the speed of RL.
Conference Paper
Full-text available
The paper presents a state-space perspective on the kinodynamic planning problem, and introduces a randomized path planning technique that computes collision-free kinodynamic trajectories for high degree-of-freedom problems. By using a state space formulation, the kinodynamic planning problem is treated as a 2n-dimensional nonholonomic planning problem, derived from an n-dimensional configuration space. The state space serves the same role as the configuration space for basic path planning. The basis for the approach is the construction of a tree that attempts to rapidly and uniformly explore the state space, offering benefits that are similar to those obtained by successful randomized planning methods, but applying to a much broader class of problems. Some preliminary results are discussed for an implementation that determines the kinodynamic trajectories for hovercrafts and satellites in cluttered environments, resulting in state spaces of up to twelve dimensions.
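As a rough illustration of the tree-construction idea, the sketch below repeatedly samples a state, finds the nearest tree node, and extends toward the sample by forward-simulating a randomly chosen control. The dynamics, control set, and collision check are placeholders chosen for the example, not the cited implementation.

```python
# Minimal RRT-style sketch: sample, find nearest node, extend by forward
# simulation. Dynamics and collision checking are illustrative placeholders.
import math
import random


def simulate(state, control, dt=0.1):
    """Placeholder kinodynamic model: unicycle-like forward integration."""
    x, y, theta = state
    speed, steer = control
    return (x + speed * math.cos(theta) * dt,
            y + speed * math.sin(theta) * dt,
            theta + steer * dt)


def collision_free(state):
    """Placeholder collision detector: keep states inside a 10x10 workspace."""
    x, y, _ = state
    return 0.0 <= x <= 10.0 and 0.0 <= y <= 10.0


def rrt(start, iterations=2000):
    """Grow a tree that explores the state space; parents records the tree edges."""
    nodes, parents = [start], {start: None}
    for _ in range(iterations):
        sample = (random.uniform(0, 10), random.uniform(0, 10),
                  random.uniform(-math.pi, math.pi))
        nearest = min(nodes, key=lambda n: (n[0] - sample[0]) ** 2 + (n[1] - sample[1]) ** 2)
        control = (random.uniform(0.0, 1.0), random.uniform(-1.0, 1.0))
        new = simulate(nearest, control)
        if collision_free(new):
            nodes.append(new)
            parents[new] = nearest
    return nodes, parents
```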
Conference Paper
Full-text available
I propose that the notion of cognitive state be broadened from the current predicate-symbolic, Language-of-Thought framework to a multi-modal one, where perception and kinesthetic modalities participate in thinking. In contrast to the roles assigned to perception and motor activities as modules external to central cognition in the currently dominant theories in AI and Cognitive Science, in the proposed approach, central cognition incorporates parts of the perceptual machinery. I motivate and describe the proposal schematically, and describe the implementation of a bi-modal version in which a diagrammatic representation component is added to the cognitive state. The proposal explains our rich multimodal internal experience, and can be a key step in the realization of embodied agents. The proposed multimodal cognitive state can significantly enhance the agent's problem solving.
Article
Full-text available
We propose a factored approach to mobile robot map-building that handles qualitatively different types of uncertainty by combining the strengths of topological and metrical approaches. Our framework is based on a computational model of the human cognitive map; thus it allows robust navigation and communication within several different spatial ontologies. This paper focuses exclusively on the issue of map-building using the framework. Our approach factors the mapping problem into natural sub-goals: building a metrical representation for local small-scale spaces; finding a topological map that represents the qualitative structure of large-scale space; and (when necessary) constructing a metrical representation for large-scale space using the skeleton provided by the topological map. We describe how to abstract a symbolic description of the robot’s immediate surround from local metrical models, how to combine these local symbolic models in order to build global symbolic models, and how to create a globally consistent metrical map from a topological skeleton by connecting local frames of reference.
Article
In this paper, we consider the problem of reinforcement learning in spatial tasks. These tasks have many states that can be aggregated together to improve learning efficiency. In an agent, this aggregation can take the form of selecting appropriate perceptual processes to arrive at a qualitative abstraction of the underlying continuous state. However, for arbitrary problems, an agent is unlikely to have the perceptual processes necessary to discriminate all relevant states in terms of such an abstraction. To help compensate for this, reinforcement learning can be integrated with an imagery system, where simple models of physical processes are applied within a low-level perceptual representation to predict the state resulting from an action. Rather than abstracting the current state, abstraction can be applied to the predicted next state. Formally, it is shown that this integration broadens the class of perceptual abstraction methods that can be used while preserving the underlying problem. Empirically, it is shown that this approach can be used in complex domains, and can be beneficial even when formal requirements are not met.
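The sketch below illustrates the central move described in this abstract: rather than abstracting only the current concrete state, a simple forward model ("imagery") predicts the concrete outcome of each candidate action, and the abstraction is applied to that predicted state. The forward model, abstraction, and action set are assumptions for illustration, not the cited system.

```python
# Sketch of imagery-augmented abstraction: abstract the *predicted* next
# concrete state of each action instead of the current state.
from collections import defaultdict

q = defaultdict(float)  # abstraction of an imagined next state -> learned value


def forward_model(concrete_state, action):
    """Imagery step: predict the next concrete state under a simple grid motion model."""
    x, y = concrete_state
    dx, dy = {"up": (0, 1), "down": (0, -1), "left": (-1, 0), "right": (1, 0)}[action]
    return (x + dx, y + dy)


def abstract(concrete_state, goal):
    """Perceptual abstraction applied to a (possibly imagined) concrete state."""
    x, y = concrete_state
    gx, gy = goal
    direction = ("E" if gx > x else "W" if gx < x else "-",
                 "N" if gy > y else "S" if gy < y else "-")
    near = abs(gx - x) + abs(gy - y) <= 2
    return (direction, near)


def choose_action(concrete_state, goal, actions=("up", "down", "left", "right")):
    """Score each action by the learned value of the abstraction of its imagined outcome."""
    return max(actions, key=lambda a: q[abstract(forward_model(concrete_state, a), goal)])
```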
Book
This book explores the intersection between cognitive sciences and social sciences. In particular, it explores the intersection between individual cognitive modeling and modeling of multi-agent interaction (social simulation). The two contributing fields - individual cognitive modeling (especially cognitive architectures) and modeling of multi-agent interaction (including social simulation and, to some extent, multi-agent systems) - have seen phenomenal growth in recent years. However, the interaction of these two fields has not been sufficiently developed. We believe that the interaction of the two may be more significant than either alone.
Article
This chapter presents an overview of a relatively recent cognitive architecture for modeling cognitive processes of individual cognitive agents (in a psychological sense) (see Sun et al., 1998, 2001; Sun, 2002). We will start with a look at some general ideas underlying this cognitive architecture as well as the relevance of these ideas to social simulation. To tackle a host of issues arising from computational cognitive modeling that are not adequately addressed by many other existing cognitive architectures, such as the implicit-explicit interaction, the cognitive-metacognitive interaction, and the cognitive-motivational interaction, CLARION, a modularly structured cognitive architecture, has been developed (Sun, 2002; Sun et al., 1998, 2001). Overall, CLARION is an integrative model. It consists of a number of functional subsystems (for example, the action-centered subsystem, the metacognitive subsystem, and the motivational subsystem). It also has a dual representational structure – implicit and explicit representations being in two separate components in each subsystem. Thus far, CLARION has been successful in capturing a variety of cognitive processes in a variety of task domains based on this division of modules (Sun et al., 2002). See Figure 4.1 for a sketch of the architecture. A key assumption of CLARION, which has been argued for amply before (see Sun et al., 1998, 2001; Sun, 2002), is the dichotomy of implicit and explicit cognition. Generally speaking, implicit processes are less accessible and more "holistic," whereas explicit processes are more accessible and crisp (Reber, 1989; Sun, 2002). This dichotomy is closely related to some other well-known dichotomies in cognitive science: the dichotomy of symbolic versus subsymbolic processing, the dichotomy of conceptual versus subconceptual processing, and so on (Smolensky, 1988; Sun, 1994).
Article
In this paper, we discuss the field of sampling-based motion planning. In contrast to methods that construct boundary representations of configuration space obstacles, sampling-based methods use only information from a collision detector as they search the configuration space. The simplicity of this approach, along with increases in computation power and the development of efficient collision detection algorithms, has resulted in the introduction of a number of powerful motion planning algorithms, capable of solving challenging problems with many degrees of freedom. First, we trace how sampling-based motion planning has developed. We then discuss a variety of important issues for sampling-based motion planning, including uniform and regular sampling, topological issues, and search philosophies. Finally, we address important issues regarding the role of randomization in sampling-based motion planning.
Article
Article
When we try to remember whether we left a window open or closed, do we actually see the window in our mind? If we do, does this mental image play a role in how we think? For almost a century, scientists have debated whether mental images play a functional role in cognition. The Case for Mental Imagery presents a complete and unified argument that mental images do depict information, and that these depictions do play a functional role in human cognition. It outlines a specific theory of how depictive representations are used in information processing, and shows how these representations arise from neural processes. To support this theory, it weaves together conceptual analyses and the many varied empirical findings from cognitive psychology and neuroscience. In doing so, the book presents the conceptual grounds for positing this type of internal representation, summarizing and refuting arguments to the contrary. Its argument also serves as a historical review of the imagery debate from its earliest inception to its most recent phases, and provides evidence that significant progress has been made in our understanding of mental imagery. In illustrating how scientists think about one of the most difficult problems in psychology and neuroscience, this book goes beyond the debate, to explore the nature of cognition and to draw out implications for the study of consciousness.
Article
This paper presents the first randomized approach to kinodynamic planning (also known as trajectory planning or trajectory design). The task is to determine control inputs to drive a robot from an initial configuration and velocity to a goal configuration and velocity while obeying physically based dynamical models and avoiding obstacles in the robot's environment. The authors consider generic systems that express the nonlinear dynamics of a robot in terms of the robot's high-dimensional configuration space. Kinodynamic planning is treated as a motion-planning problem in a higher dimensional state space that has both first-order differential constraints and obstacle based global constraints. The state space serves the same role as the configuration space for basic path planning; however, standard randomized path-planning techniques do not directly apply to planning trajectories in the state space. The authors have developed a randomized planning approach that is particularly tailored to trajectory planning problems in high-dimensional state spaces. The basis for this approach is the construction of rapidly exploring random trees, which offer benefits that are similar to those obtained by successful randomized holonomic planning methods but apply to a much broader class of problems. Theoretical analysis of the algorithm is given. Experimental results are presented for an implementation that computes trajectories for hovercrafts and satellites in cluttered environments, resulting in state spaces of up to 12 dimensions.
Article
We present a general cognitive architecture that tightly integrates symbolic, spatial, and visual representations. A key means to achieving this integration is allowing cognition to move freely between these modes, using mental imagery. The specific components and their integration are motivated by results from psychology, as well as the need for developing a functional and efficient implementation. We discuss functional benefits that result from the combination of multiple content-based representations and the specialized processing units associated with them. Instantiating this theory, we then discuss the architectural components and processes, and illustrate the resulting functional advantages in two spatially and visually rich domains. The theory is then compared to other prominent approaches in the area.
Article
Since its definition by McCarthy in 1969, the Frame Problem (FP) has been one of the more heavily debated problems in AI. Part of the debate has been on the exact definition of what the FP really is. The computational aspect of the FP can be thought of as reasoning about what changes and what doesn't change in a dynamic world. The "sleeping dog strategy" is considered to be a viable solution to this aspect of the FP. We intend to show that this strategy has a weakness that can be partially overcome using diagrammatic reasoning, under certain conditions. A related and equally important problem, called the Ramification Problem, is to be able to reason about the indirect effects of an action in the world. Our proposal provides a more efficient solution to the Ramification Problem when reasoning about spatial relations. To illustrate our solution, we introduce a problem solving architecture based on Soar that is augmented with a diagrammatic reasoning component. A problem state in this augmented Soar is bi-modal in nature, one part being symbolic and the other diagrammatic. We describe its use in certain problems and show how the use of diagrams can handle the frame and ramification problems with respect to spatial relations.
Article
We present an extension to biSoar, a bimodal version of the cognitive architecture Soar, by adding a bimodal version of chunking, Soar's basic learning mechanism. We show how this new biSoar is a useful tool in modeling cognitive phenomena involving spatial or diagrammatic elements by applying it to the modeling of problem solving involving large-scale space, such as way-finding. We suggest how such models can help in identifying variables to control for in human subject experiments.
Article
This paper describes the architecture and implementation of an autonomous passenger vehicle designed to navigate using locally perceived information in preference to potentially inaccurate or incomplete map data. The vehicle architecture was designed to handle the original DARPA Urban Challenge requirements of perceiving and navigating a road network with segments defined by sparse waypoints. The vehicle implementation includes many heterogeneous sensors with significant communications and computation bandwidth to capture and process high-resolution, high-rate sensor data. The output of the comprehensive environmental sensing subsystem is fed into a kinodynamic motion planning algorithm to generate all vehicle motion. The requirements of driving in lanes, three-point turns, parking, and maneuvering through obstacle fields are all generated with a unified planner. A key aspect of the planner is its use of closed-loop simulation in a rapidly exploring randomized trees algorithm, which can randomly explore the space while efficiently generating smooth trajectories in a dynamic and uncertain environment. The overall system was realized through the creation of a powerful new suite of software tools for message passing, logging, and visualization. These innovations provide a strong platform for future research in autonomous driving in global positioning system–denied and highly dynamic environments with poor a priori information.
Article
A new architecture for controlling mobile robots is described. Layers of control system are built to let the robot operate at increasing levels of competence. Layers are made up of asynchronous modules that communicate over low-bandwidth channels. Each module is an instance of a fairly simple computational machine. Higher-level layers can subsume the roles of lower levels by suppressing their outputs. However, lower levels continue to function as higher levels are added. The result is a robust and flexible robot control system. The system has been used to control a mobile robot wandering around unconstrained laboratory areas and computer machine rooms. Eventually it is intended to control a robot that wanders the office areas of our laboratory, building maps of its surroundings using an onboard arm to perform simple tasks.
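As a rough illustration of the layered-control idea, the following sketch has each layer propose a motor command from the current sensor readings, with higher layers able to override the layers beneath them while low-level obstacle avoidance keeps running. The specific layers and sensor fields are assumptions for the example, not the cited system.

```python
# Sketch of layered control: each layer maps sensors to a command or None,
# and higher layers can take over the output of lower ones. Layer names and
# sensor fields are illustrative assumptions.
def avoid_layer(sensors):
    """Lowest layer: turn away from imminent obstacles."""
    if sensors["front_distance"] < 0.3:
        return ("turn", 1.0)
    return None  # no opinion


def wander_layer(sensors):
    """Middle layer: drift forward when nothing else is happening."""
    return ("forward", 0.5)


def goal_layer(sensors):
    """Higher layer: head toward a target when one is visible, overriding wandering."""
    if sensors.get("target_bearing") is not None:
        return ("turn", sensors["target_bearing"])
    return None


def control(sensors):
    """Obstacle avoidance always runs; otherwise the highest layer with an opinion wins."""
    urgent = avoid_layer(sensors)
    if urgent is not None:
        return urgent
    for layer in (goal_layer, wander_layer):
        out = layer(sensors)
        if out is not None:
            return out
```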
Article
Spatial reasoning is ubiquitous in human problem solving. Significantly, many aspects of it appear to be qualitative. This paper describes a general framework for qualitative spatial reasoning and demonstrates how it can be used to understand complex mechanical systems, such as clocks. The framework is organized around three ideas. (1) We conjecture that no powerful, general-purpose, purely qualitative representation of spatial properties exists (the poverty conjecture). (2) We describe the MD/PV model of spatial reasoning, which overcomes this fundamental limitation by combining the power of diagrams with qualitative spatial representations. In particular, a metric diagram, which combines quantitative and symbolic information, is used as the foundation for constructing a place vocabulary, a symbolic representation of shape and space which supports qualitative spatial reasoning. (3) We claim that shape and connectivity are the central features of qualitative spatial representations for kinematics. We begin by exploring these ideas in detail, pointing out why simpler representations have not proven fruitful. We also describe how inferences can be organized using the MD/PV model. We demonstrate the utility of this model by describing CLOCK, a program which reasons about complex two-dimensional mechanisms. CLOCK starts with a CAD description of a mechanism's parts and constructs a qualitative simulation of how it can behave. CLOCK successfully performed the first complete qualitative simulation of a mechanical clock from first principles, a milestone in qualitative physics. We also examine other work on qualitative spatial reasoning, and show how it fits into this framework. Finally, we discuss new research questions this framework raises.
Article
The Spatial Semantic Hierarchy is a model of knowledge of large-scale space consisting of multiple interacting representations, both qualitative and quantitative. The SSH is inspired by the properties of the human cognitive map, and is intended to serve both as a model of the human cognitive map and as a method for robot exploration and map-building. The multiple levels of the SSH express states of partial knowledge, and thus enable the human or robotic agent to deal robustly with uncertainty during both learning and problem-solving. The control level represents useful patterns of sensorimotor interaction with the world in the form of trajectory-following and hill-climbing control laws leading to locally distinctive states. Local geometric maps in local frames of reference can be constructed at the control level to serve as observers for control laws in particular neighborhoods. The causal level abstracts continuous behavior among distinctive states into a discrete model consisting of states linked by actions. The topological level introduces the external ontology of places, paths and regions by abduction to explain the observed pattern of states and actions at the causal level. Quantitative knowledge at the control, causal and topological levels supports a "patchwork map" of local geometric frames of reference linked by causal and topological connections. The patchwork map can be merged into a single global frame of reference at the metrical level when sufficient information and computational resources are available. We describe the assumptions and guarantees behind the generality of the SSH across environments and sensorimotor systems. Evidence is presented from several partial implementations of the SSH on simulated and physical robots.
Article
Many stochastic planning problems can be represented using Markov Decision Processes (MDPs). A difficulty with using these MDP representations is that the common algorithms for solving them run in time polynomial in the size of the state space, where this size is extremely large for most real-world planning problems of interest. Recent AI research has addressed this problem by representing the MDP in a factored form. Factored MDPs, however, are not amenable to traditional solution methods that call for an explicit enumeration of the state space. One familiar way to solve MDP problems with very large state spaces is to form a reduced (or aggregated) MDP with the same properties as the original MDP by combining “equivalent” states. In this paper, we discuss applying this approach to solving factored MDP problems—we avoid enumerating the state space by describing large blocks of “equivalent” states in factored form, with the block descriptions being inferred directly from the original factored representation. The resulting reduced MDP may have exponentially fewer states than the original factored MDP, and can then be solved using traditional methods. The reduced MDP found depends on the notion of equivalence between states used in the aggregation. The notion of equivalence chosen will be fundamental in designing and analyzing algorithms for reducing MDPs. Optimally, these algorithms will be able to find the smallest possible reduced MDP for any given input MDP and notion of equivalence (i.e., find the “minimal model” for the input MDP). Unfortunately, the classic notion of state equivalence from non-deterministic finite state machines generalized to MDPs does not prove useful. We present here a notion of equivalence that is based upon the notion of bisimulation from the literature on concurrent processes. Our generalization of bisimulation to stochastic processes yields a non-trivial notion of state equivalence that guarantees the optimal policy for the reduced model immediately induces a corresponding optimal policy for the original model. With this notion of state equivalence, we design and analyze an algorithm that minimizes arbitrary factored MDPs and compare this method analytically to previous algorithms for solving factored MDPs. We show that previous approaches implicitly derive equivalence relations that we define here.
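A small sketch of the reduction idea follows: start with blocks of states grouped by reward, then repeatedly split any block whose states disagree on their transition probabilities into the current blocks, until the partition is stable. The explicit enumeration of states here is purely for illustration; the cited work operates on factored representations without enumerating the state space.

```python
# Sketch of bisimulation-style model minimization by partition refinement.
# P[(s, a)] is a dict {next_state: probability}; R[s] is the reward at s.
def minimize(states, actions, P, R):
    # Initial partition: group states with equal reward.
    blocks = {}
    for s in states:
        blocks.setdefault(R[s], set()).add(s)
    partition = list(blocks.values())

    def signature(s, partition):
        # For each action, the total probability of landing in each block.
        sig = []
        for a in actions:
            row = tuple(
                round(sum(p for s2, p in P[(s, a)].items() if s2 in block), 10)
                for block in partition
            )
            sig.append(row)
        return tuple(sig)

    changed = True
    while changed:
        changed = False
        new_partition = []
        for block in partition:
            groups = {}
            for s in block:
                groups.setdefault(signature(s, partition), set()).add(s)
            if len(groups) > 1:
                changed = True
            new_partition.extend(groups.values())
        partition = new_partition
    return partition  # blocks of mutually equivalent states
```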
Article
Artificial intelligence research has foundered on the issue of representation. When intelligence is approached in an incremental manner, with strict reliance on interfacing to the real world through perception and action, reliance on representation disappears. In this paper we outline our approach to incrementally building complete intelligent Creatures. The fundamental decomposition of the intelligent system is not into independent information processing units which must interface with each other via representations. Instead, the intelligent system is decomposed into independent and parallel activity producers which all interface directly to the world through perception and action, rather than interface to each other particularly much. The notions of central and peripheral systems evaporate—everything is both central and peripheral. Based on these principles we have built a very successful series of mobile robots which operate without supervision as Creatures in standard office environments.
Article
After many years of neglect, the topic of mental imagery has recently emerged as an active area of research and debate in the cognitive science community. This article proposes a concept of computational imagery, which has potential applications to problems whose solutions by humans involve the use of mental imagery. Computational imagery can be defined as the ability to represent, retrieve, and reason about spatial and visual information not explicitly stored in long-term memory. The article proposes a knowledge representation scheme for computational imagery that incorporates three representations: a long-term memory descriptive representation and two working-memory representations, corresponding to the distinct visual and spatial components of mental imagery. The three representations, and a set of primitive functions, are specified using a formal theory of arrays and implemented in the array-based language Nial. Although results of studies in mental imagery provide initial motivation for the representations and functionality of the scheme, our ultimate concerns are expressive power, inferential adequacy, and efficiency.
Article
We distinguish diagrammatic from sentential paper-and-pencil representations of information by developing alternative models of information-processing systems that are informationally equivalent and that can be characterized as sentential or diagrammatic. Sentential representations are sequential, like the propositions in a text. Diagrammatic representations are indexed by location in a plane. Diagrammatic representations also typically display information that is only implicit in sentential representations and that therefore has to be computed, sometimes at great cost, to make it explicit for use. We then contrast the computational efficiency of these representations for solving several illustrative problems in mathematics and physics. When two representations are informationally equivalent, their computational efficiency depends on the information-processing operators that act on them. Two sets of operators may differ in their capabilities for recognizing patterns, in the inferences they can carry out directly, and in their control strategies (in particular, the control of search). Diagrammatic and sentential representations support operators that differ in all of these respects. Operators working on one representation may recognize features readily or make inferences directly that are difficult to realize in the other representation. Most important, however, are differences in the efficiency of search for information and in the explicitness of information. In the representations we call diagrammatic, information is organized by location, and often much of the information needed to make an inference is present and explicit at a single location. In addition, cues to the next logical step in the problem may be present at an adjacent location. Therefore problem solving can proceed through a smooth traversal of the diagram, and may require very little search or computation of elements that had been implicit.
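The following small example illustrates the informational-equivalence point: the same facts stored sententially (a list of propositions searched sequentially) and diagrammatically (a mapping indexed by location, so "what is below X" becomes a direct lookup). The example scene and helper names are assumptions for illustration.

```python
# Informationally equivalent, computationally different: sequential search over
# propositions vs. direct access through a location-indexed structure.
sentential = [("at", "pulley", (2, 3)), ("at", "weight", (2, 2)), ("at", "rope", (2, 4))]

diagrammatic = {loc: name for (_, name, loc) in sentential}  # location-indexed


def below_sentential(name):
    """Sequential search: find the object's location, then search again for what is below it."""
    loc = next(l for (_, n, l) in sentential if n == name)
    target = (loc[0], loc[1] - 1)
    return next((n for (_, n, l) in sentential if l == target), None)


def below_diagrammatic(name):
    """Location-indexed access: compute the adjacent location and read it directly."""
    inverse = {v: k for k, v in diagrammatic.items()}
    x, y = inverse[name]
    return diagrammatic.get((x, y - 1))
```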
Article
We describe a new problem solver called STRIPS that attempts to find a sequence of operators in a space of world models to transform a given initial world model into one in which a given goal formula can be proven to be true. STRIPS represents a world model as an arbitrary collection of first-order predicate calculus formulas and is designed to work with models consisting of large numbers of formulas. It employs a resolution theorem prover to answer questions about particular models and uses means-ends analysis to guide it to the desired goal-satisfying model.
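As a minimal illustration of STRIPS-style operator application, the sketch below represents a world model as a set of ground atoms and an operator as preconditions, an add list, and a delete list. The original system proves preconditions with a resolution theorem prover; simple set membership here is a simplifying assumption, and the example operator is hypothetical.

```python
# Minimal sketch of STRIPS-style operator application over a set of ground atoms.
from dataclasses import dataclass
from typing import FrozenSet


@dataclass(frozen=True)
class Operator:
    name: str
    preconditions: FrozenSet[str]
    add_list: FrozenSet[str]
    delete_list: FrozenSet[str]


def applicable(op: Operator, world: frozenset) -> bool:
    """Preconditions hold if they are a subset of the world model (simplification)."""
    return op.preconditions <= world


def apply(op: Operator, world: frozenset) -> frozenset:
    """Remove the delete list, then add the add list."""
    assert applicable(op, world), f"{op.name} is not applicable"
    return (world - op.delete_list) | op.add_list


# Example: push a box from room A to room B.
world = frozenset({"at(box, roomA)", "at(robot, roomA)"})
push = Operator(
    name="push(box, roomA, roomB)",
    preconditions=frozenset({"at(box, roomA)", "at(robot, roomA)"}),
    add_list=frozenset({"at(box, roomB)", "at(robot, roomB)"}),
    delete_list=frozenset({"at(box, roomA)", "at(robot, roomA)"}),
)
world = apply(push, world)
```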
Conference Paper
In this paper we describe Icarus, a cognitive architecture for physical agents that integrates ideas from a number of traditions, but that has been especially influenced by results from cognitive psychology. We review Icarus' commitments to memories and representations, then present its basic processes for performance and learning. We illustrate the architecture's behavior on a task from in-city driving that requires interaction among its various components. In addition, we discuss Icarus' consistency with qualitative findings about the nature of human cognition. In closing, we consider the framework's relation to other cognitive architectures that have been proposed in the literature.
Conference Paper
AI has generally interpreted the organized nature of everyday activity in terms of plan-following. Nobody could doubt that people often make and follow plans. But the complexity, uncertainty, and immediacy of the real world require a central role for moment-to-moment improvisation. But before and beneath any planning ahead, one continually decides what to do now. Investigation of the dynamics of everyday routine activity reveals important regularities in the interaction of very simple machinery with its environment. We have used our dynamic theories to design a program, called Pengi, that engages in complex, apparently planful activity without requiring explicit models of the world.
Article
Diagrams are of substantial benefit to WHISPER, a computer problem-solving system, in testing the stability of a “blocks world” structure and predicting the event sequences which occur as that structure collapses. WHISPER's components include a high level reasoner which knows some qualitative aspects of Physics, a simulated parallel processing “retina” to “look at” its diagrams, and a set of re-drawing procedures for modifying these diagrams. Roughly modelled after the human eye, WHISPER's retina can fixate at any diagram location, and its resolution decreases away from its center. Diagrams enable WHISPER to work with objects of arbitrary shape, detect collisions and other motion discontinuities, discover coincidental alignments, and easily update its world model after a state change. A theoretical analysis is made of the role of diagrams interacting with a general deductive mechanism such as WHISPER's high level reasoner.
Article
This research aims to clarify, by constructing and testing a computer simulation, the use of multiple representations in problem solving, focusing on their role in visual reasoning. The model is motivated by extensive experimental evidence in the literature for the features it incorporates, but this article focuses on the system's structure. We illustrate the model's behavior by simulating the cognitive and perceptual processes of an economics expert as he teaches some well-learned economics principles while drawing a graph on a blackboard. Data in the experimental literature and concurrent verbal protocols were used to guide construction of a linked production system and parallel network, CaMeRa (Computation with Multiple Representations), that employs a “Mind's Eye” representation for pictorial information, consisting of a bitmap and associated node-link structures. Propositional list structures are used to represent verbal information and reasoning. Small individual pieces from the different representations are linked on a sequential and temporary basis to form a reasoning and inferencing chain, using visually encoded information recalled to the Mind's Eye from long-term memory and from cues recognized on an external display. CaMeRa, like the expert, uses the diagrammatic and verbal representations to complement one another, thus exploiting the unique advantages of each.
Article
Ever since the days of Shannon's proposal for a chess-playing algorithm [12] and Samuel's checkers-learning program [10] the domain of complex board games such as Go, chess, checkers, Othello, and backgammon has been widely regarded as an ideal testing ground for exploring a variety of concepts and approaches in artificial intelligence and machine learning. Such board games offer the challenge of tremendous complexity and sophistication required to play at expert level. At the same time, the problem inputs and performance measures are clear-cut and well defined, and the game environment is readily automated in that it is easy to simulate the board, the rules of legal play, and the rules regarding when the game is over and determining the outcome.
Article
This dissertation presents a theory describing the components of a cognitive architecture supporting intelligent behavior in spatial tasks. In this theory, an abstract symbolic representation serves as the basis for decisions. As a means to support abstract decision-making, imagery processes are also present. Here, a concrete (highly detailed) representation of the state of the problem is maintained in parallel with the abstract representation. Perceptual and action systems are decomposed into parts that operate between the environment and the concrete representation, and parts that operate between the concrete and abstract representations. Control processes can issue actions as a continuous function of information in the concrete representation, and actions can be simulated (imagined) in terms of it. The agent can then derive useful abstract information by applying perceptual processes to the resulting concrete state. This theory addresses two challenges in architecture design that arise due to the diversity and complexity of spatial tasks that an intelligent agent must address. The perceptual abstraction problem results from the difficulty of creating a single perception system able to induce appropriate abstract representations in each of the many tasks an agent might encounter, and the irreducibility problem arises because some tasks are resistant to being abstracted at all. Imagery works to mitigate the perceptual abstraction problem by allowing a given perception system to work in more tasks, as perception can be dynamically combined with imagery. Continuous control, and the simulation thereof via imagery, works to mitigate the irreducibility problem. The use of imagery to address these challenges differs from other approaches in AI, where imagery is considered as an alternative to abstract representation, rather than as a means to it. A detailed implementation of the theory is described, which is an extension of the Soar cognitive architecture. Agents instantiated in this architecture are demonstrated, including agents that use reinforcement learning and imagery to play arcade games, and an agent that performs sampling-based motion planning for a car-like vehicle. The performance of these agents is discussed in the context of the underlying architectural theory. Connections between this work and psychological theories of mental imagery are also discussed.