Article

A biologically inspired meta-control navigation system for the Psikharpax rat robot.

Institut des Systèmes Intelligents et de Robotique (ISIR), Université Pierre et Marie Curie, 4 place Jussieu, 75005 Paris, France.
Bioinspiration & Biomimetics (Impact Factor: 2.53). 06/2012; 7(2):025009. DOI: 10.1088/1748-3182/7/2/025009
Source: PubMed

ABSTRACT A biologically inspired navigation system for the mobile rat-like robot named Psikharpax is presented, allowing for self-localization and autonomous navigation in an initially unknown environment. The ability of parts of the model (e.g. the strategy selection mechanism) to reproduce rat behavioral data in various maze tasks has been validated before in simulations. But the capacity of the model to work on a real robot platform had not been tested. This paper presents our work on the implementation on the Psikharpax robot of two independent navigation strategies (a place-based planning strategy and a cue-guided taxon strategy) and a strategy selection meta-controller. We show how our robot can memorize which was the optimal strategy in each situation, by means of a reinforcement learning algorithm. Moreover, a context detector enables the controller to quickly adapt to changes in the environment, recognized as new contexts, and to restore previously acquired strategy preferences when a previously experienced context is recognized. This produces adaptivity closer to rat behavioral performance and constitutes a computational proposition of the role of the rat prefrontal cortex in strategy shifting. Moreover, such a brain-inspired meta-controller may provide an advancement for learning architectures in robotics.
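
The abstract does not specify the learning rule, so the following is only an illustrative sketch of the general idea: a meta-controller that learns, per context and per situation, which of the two strategies tends to be rewarded, and that keeps a separate preference table for each context so earlier preferences are restored when a context reappears. The class name, the epsilon-greedy selection, the incremental value update, and the externally supplied context label are all assumptions made for illustration; they are not taken from the paper.

```python
import random
from collections import defaultdict

# The two strategies named in the abstract.
STRATEGIES = ("planning", "taxon")


class MetaController:
    """Hypothetical strategy selector: one value table per context."""

    def __init__(self, alpha=0.2, epsilon=0.1, seed=0):
        self.alpha = alpha      # learning rate (assumed value)
        self.epsilon = epsilon  # exploration rate (assumed value)
        self.rng = random.Random(seed)
        # context -> situation -> strategy -> estimated reward
        self.values = defaultdict(
            lambda: defaultdict(lambda: {s: 0.0 for s in STRATEGIES})
        )

    def preferred(self, context, situation):
        """Return the strategy with the highest learned value."""
        q = self.values[context][situation]
        return max(STRATEGIES, key=lambda s: q[s])

    def select(self, context, situation):
        """Epsilon-greedy choice between the strategies."""
        if self.rng.random() < self.epsilon:
            return self.rng.choice(STRATEGIES)
        return self.preferred(context, situation)

    def update(self, context, situation, strategy, reward):
        """Move the value of the chosen strategy toward the reward."""
        q = self.values[context][situation]
        q[strategy] += self.alpha * (reward - q[strategy])


if __name__ == "__main__":
    mc = MetaController()
    # Assumed toy reward schedule: planning pays off in context "A",
    # taxon pays off in context "B".
    payoff = {"A": {"planning": 1.0, "taxon": 0.0},
              "B": {"planning": 0.0, "taxon": 1.0}}
    for context in ("A", "B"):
        for _ in range(500):
            choice = mc.select(context, "maze_start")
            mc.update(context, "maze_start", choice, payoff[context][choice])
    for context in ("A", "B"):
        print(context, mc.preferred(context, "maze_start"))
```

Because each context has its own table, learning in context "B" leaves the values stored for context "A" untouched. This is one simple way to obtain the "restore previously acquired strategy preferences" behavior the abstract describes. A real context detector would have to infer the context label from experience rather than receive it as an argument.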

  •
    ABSTRACT: When encountering novel environments, animals perform complex yet structured exploratory behaviors. Despite their typical structuring, the principles underlying exploratory patterns are still not sufficiently understood. Here we analyzed exploratory behavioral data from two modalities: whisking and locomotion in rats and mice. We found that these rodents maximized novelty signal-to-noise ratio during each exploration episode, where novelty is defined as the accumulated information gain. We further found that these rodents maximized novelty during outbound exploration, used novelty-triggered withdrawal-like retreat behavior, and explored the environment in a novelty-descending sequence. We applied a hierarchical curiosity model, which incorporates these principles, to both modalities. We show that the model captures the major components of exploratory behavior in multiple timescales: single excursions, exploratory episodes, and developmental timeline. The model predicted that novelty is managed across exploratory modalities. Using a novel experimental setup in which mice encountered a novel object for the first time in their life, we tested and validated this prediction. Further predictions, related to the development of brain circuitry, are described. This study demonstrates that rodents select exploratory actions according to a novelty management framework and suggests a plausible mechanism by which mammalian exploration primitives can be learned during development and integrated in adult exploration of complex environments.
    The Journal of Neuroscience : The Official Journal of the Society for Neuroscience 09/2014; 34(38):12646-61. DOI:10.1523/JNEUROSCI.1872-14.2014 · 6.75 Impact Factor
  •
    ABSTRACT: Vicarious trial-and-error (VTE) is a behavior observed in rat experiments that seems to suggest self-conflict. This behavior is seen mainly when the rats are uncertain about making a decision. The presence of VTE is regarded as an indicator of a deliberative decision-making process, that is, searching, predicting, and evaluating outcomes. This process is slower than automated decision-making processes, such as reflex or habituation, but it allows for flexible and ongoing control of behavior. In this study, we propose for the first time a robotic model of VTE to see if VTE can emerge just from a body-environment interaction and to show the underlying mechanism responsible for the observation of VTE and the advantages provided by it. We tried several robots with different parameters, and we have found that they showed three different types of VTE: high numbers of VTE at the beginning of learning, decreasing numbers afterward (similar VTE pattern to experiments with rats), low during the whole learning period, and high numbers all the time. Therefore, we were able to reproduce the phenomenon of VTE in a model robot using only a simple dynamical neural network with Hebbian learning, which suggests that VTE is an emergent property of a plastic and embodied neural network. From a comparison of the three types of VTE, we demonstrated that 1) VTE is associated with chaotic activity of neurons in our model and 2) VTE-showing robots were robust to environmental perturbations. We suggest that the instability of neuronal activity found in VTE allows ongoing learning to rebuild its strategy continuously, which creates robust behavior. Based on these results, we suggest that VTE is caused by a similar mechanism in biology and leads to robust decision making in an analogous way.
    PLoS ONE 07/2014; 9(7):e102708. DOI:10.1371/journal.pone.0102708 · 3.53 Impact Factor
  •
    ABSTRACT: Gaining a better understanding of the biological mechanisms underlying the individual variation observed in response to rewards and reward cues could help to identify and treat individuals more prone to disorders of impulsive control, such as addiction. Variation in response to reward cues is captured in rats undergoing autoshaping experiments where the appearance of a lever precedes food delivery. Although no response is required for food to be delivered, some rats (goal-trackers) learn to approach and avidly engage the magazine until food delivery, whereas other rats (sign-trackers) come to approach and engage avidly the lever. The impulsive and often maladaptive characteristics of the latter response are reminiscent of addictive behaviour in humans. In a previous article, we developed a computational model accounting for a set of experimental data regarding sign-trackers and goal-trackers. Here we show new simulations of the model to draw experimental predictions that could help further validate or refute the model. In particular, we apply the model to new experimental protocols such as injecting flupentixol locally into the core of the nucleus accumbens rather than systemically, and lesioning of the core of the nucleus accumbens before or after conditioning. In addition, we discuss the possibility of removing the food magazine during the inter-trial interval. The predictions from this revised model will help us better understand the role of different brain regions in the behaviours expressed by sign-trackers and goal-trackers.
    Journal of Physiology-Paris 06/2014; DOI:10.1016/j.jphysparis.2014.06.001 · 2.35 Impact Factor