A biologically inspired meta-control navigation system for the Psikharpax rat robot

Institut des Systèmes Intelligents et de Robotique (ISIR), Université Pierre et Marie Curie, 4 place Jussieu, 75005 Paris, France.
Bioinspiration & Biomimetics (Impact Factor: 2.35). 06/2012; 7(2):025009. DOI: 10.1088/1748-3182/7/2/025009
Source: PubMed


A biologically inspired navigation system for the mobile rat-like robot Psikharpax is presented, allowing for self-localization and autonomous navigation in an initially unknown environment. The ability of parts of the model (e.g. the strategy selection mechanism) to reproduce rat behavioral data in various maze tasks has previously been validated in simulation, but the capacity of the model to work on a real robot platform had not been tested. This paper presents the implementation on the Psikharpax robot of two independent navigation strategies (a place-based planning strategy and a cue-guided taxon strategy) and a strategy selection meta-controller. We show how the robot can memorize which strategy was optimal in each situation by means of a reinforcement learning algorithm. Moreover, a context detector enables the controller to quickly adapt to changes in the environment (recognized as new contexts) and to restore previously acquired strategy preferences when a previously experienced context is recognized. This produces adaptivity closer to rat behavioral performance and constitutes a computational proposition of the role of the rat prefrontal cortex in strategy shifting. Such a brain-inspired meta-controller may also provide an advancement for learning architectures in robotics.
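
The idea of a meta-controller that learns per-context strategy preferences can be illustrated with a minimal sketch. This is not the algorithm used on Psikharpax: the class `MetaController`, the helper `run_trials`, the context and situation labels, the reward values, and the simple running-average update with epsilon-greedy exploration are all assumptions made for illustration. The context label is assumed to come from an external context detector, which is not modeled here.

```python
import random
from collections import defaultdict

# The two strategies named in the abstract; everything else is hypothetical.
STRATEGIES = ("planning", "taxon")


class MetaController:
    """Toy strategy selector keeping one preference table per context."""

    def __init__(self, alpha=0.2, epsilon=0.2, seed=0):
        self.alpha = alpha        # learning rate (assumed value)
        self.epsilon = epsilon    # exploration rate (assumed value)
        self.rng = random.Random(seed)
        # Preferences are keyed by (context, situation), so values learned
        # in one context are left untouched while another context is active.
        self.values = defaultdict(lambda: {s: 0.0 for s in STRATEGIES})

    def preferred(self, context, situation):
        """Return the currently highest-valued strategy (no exploration)."""
        prefs = self.values[(context, situation)]
        return max(STRATEGIES, key=lambda s: prefs[s])

    def select(self, context, situation):
        """Epsilon-greedy choice between the strategies."""
        if self.rng.random() < self.epsilon:
            return self.rng.choice(STRATEGIES)
        return self.preferred(context, situation)

    def update(self, context, situation, strategy, reward):
        """Move the chosen strategy's value toward the received reward."""
        prefs = self.values[(context, situation)]
        prefs[strategy] += self.alpha * (reward - prefs[strategy])


def run_trials(mc, context, situation, rewarded, n=200):
    """Hypothetical environment: only the `rewarded` strategy yields reward 1."""
    for _ in range(n):
        chosen = mc.select(context, situation)
        reward = 1.0 if chosen == rewarded else 0.0
        mc.update(context, situation, chosen, reward)


mc = MetaController(seed=1)
run_trials(mc, "context_A", "junction", rewarded="planning")
run_trials(mc, "context_B", "junction", rewarded="taxon")

# On returning to context A, the earlier preference is still available.
print(mc.preferred("context_A", "junction"))
print(mc.preferred("context_B", "junction"))
```

Keying the table by context is what lets the earlier preference be restored without relearning; a single shared table would have to unlearn the first preference before acquiring the second.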

    • "This works well in simple simulated tasks with less than 10 states but does not scale-up to real-world robotic tasks involving hundreds of states. Keramati and colleagues [7] proposed to avoid estimating the uncertainty of the MB system by considering that it always has 'perfect information', which again cannot be true in tasks where the large number of states imposes to compute approximations of the transition function, as we previously found in a robotic navigation implementation of these learning processes [3]. In a recent work, we proposed a new neuro-inspired cognitive architecture combining MB and MF RL applied to a robotic cube-pushing task [11] and a human-robot cooperation task [10]."
    ABSTRACT: Combining model-based and model-free reinforcement learning systems in robotic cognitive architectures appears as a promising direction to endow artificial agents with flexibility and decisional autonomy close to mammals. In particular, it could enable robots to build an internal model of the environment, plan within it in response to detected environmental changes, and avoid the cost and time of planning when the stability of the environment is recognized as enabling habit learning. However, previously proposed criteria for the coordination of these two learning systems do not scale up to the large, partial and uncertain models autonomously learned by robots. Here we precisely analyze the performances of these two systems in an asynchronous robotic simulation of a cube-pushing task requiring a permanent trade-off between speed and accuracy. We propose solutions to make learning successful in these conditions. We finally discuss possible criteria for their efficient coordination within robotic cognitive architectures.
    Article · Dec 2015 · Procedia Computer Science
    • "While there is some theoretical evidence that grid cells may help the accuracy of spatial navigation (Guanella and Verschure, 2007), there is little evidence that they do so experimentally (Hales et al., 2014). There is extensive research on spatial cognition models inspired by place cells coding in the rat's hippocampus used to evaluate goal-oriented spatial navigation with simulation and with real robots (Burgess et al., 1994; Brown and Sharp, 1995; Redish and Touretzky, 1997; Guazzelli et al., 1998; Gaussier et al., 2002; Filliat and Meyer, 2002; Arleo et al., 2004; Milford and Wyeth, 2009, 2007; Barrera and Weitzenfeld, 2008; Dollé et al., 2010; Caluwaerts et al., 2012; Tejera et al., 2013; Recce and Harris, 1996; Krichmar et al., 2005; Sukumar et al., 2012; Pata et al., 2014). However, few of them incorporate some aspects of multi-scale representation of space."
    ABSTRACT: There has been extensive research in recent years on the multi-scale nature of hippocampal place cells and entorhinal grid cells encoding which led to many speculations on their role in spatial cognition. In this paper we focus on the multi-scale nature of place cells and how they contribute to faster learning during goal-oriented navigation when compared to a spatial cognition system composed of single scale place cells. The task consists of a circular arena with a fixed goal location, in which a robot is trained to find the shortest path to the goal after a number of learning trials. Synaptic connections are modified using a reinforcement learning paradigm adapted to the place cells multi-scale architecture. The model is evaluated in both simulation and physical robots. We find that larger scale and combined multi-scale representations favor goal-oriented navigation task learning.
    Article · Nov 2015 · Neural networks: the official journal of the International Neural Network Society
    • "MB systems are then updated according to the action a taken by the full model in state s – even if the systems would have individually favoured different actions – and the resulting new state s′ and retrieved reward r, as previously done in other computational models involving a cooperation between model-free and model-based systems (Caluwaerts et al., 2012)."
    ABSTRACT: Gaining a better understanding of the biological mechanisms underlying the individual variation observed in response to rewards and reward cues could help to identify and treat individuals more prone to disorders of impulsive control, such as addiction. Variation in response to reward cues is captured in rats undergoing autoshaping experiments where the appearance of a lever precedes food delivery. Although no response is required for food to be delivered, some rats (goal-trackers) learn to approach and avidly engage the magazine until food delivery, whereas other rats (sign-trackers) come to approach and engage avidly the lever. The impulsive and often maladaptive characteristics of the latter response are reminiscent of addictive behaviour in humans. In a previous article, we developed a computational model accounting for a set of experimental data regarding sign-trackers and goal-trackers. Here we show new simulations of the model to draw experimental predictions that could help further validate or refute the model. In particular, we apply the model to new experimental protocols such as injecting flupentixol locally into the core of the nucleus accumbens rather than systemically, and lesioning of the core of the nucleus accumbens before or after conditioning. In addition, we discuss the possibility of removing the food magazine during the inter-trial interval. The predictions from this revised model will help us better understand the role of different brain regions in the behaviours expressed by sign-trackers and goal-trackers.
    Article · Jun 2014 · Journal of Physiology-Paris