# Emmanuel Daucé

PhD Hab.

Aix-Marseille Université (AMU) · Institut de Neurosciences des Systèmes (UMR_S 1106 INS)

## About

- Publications: 54
- Reads: 2,360
- Citations: 272

## Publications (54)

The capability to widely sample the state and action spaces is a key ingredient toward building effective reinforcement learning algorithms. The variational optimization principles exposed in this paper emphasize the importance of an occupancy model that synthesizes the general distribution of the agent's environmental states over which it can act (d...

Convolutional Neural Networks have been considered the go-to option for object recognition in computer vision for the last couple of years. However, their invariance to object translations is still deemed a weak point and remains limited to small translations only, via their max-pooling layers. One bio-inspired approach considers the What/Where...

Stemming from the idea that a key objective in reinforcement learning is to invert a target distribution of effects, end-effect drives are proposed as an effective way to implement goal-directed motor learning, in the absence of an explicit forward model. An end-effect model relies on a simple statistical recording of the effect of the current policy...

Visual search is an essential cognitive ability, offering a prototypical control problem to be addressed with Active Inference. Under a Naive Bayes assumption, the maximization of the information gain objective is consistent with the separation of the visual sensory flow in two independent pathways, namely the “What” and the “Where” pathways. On th...

End-effect drives are proposed here as an effective way to implement goal-directed motor learning, in the absence of an explicit forward model. An end-effect model relies on a simple statistical recording of the effect of the current policy, here used as a substitute for the more resource-demanding forward models. When combined with a reward struct...

We develop a visuo-motor model that implements visual search as a focal accuracy-seeking policy across a crowded visual display. Stemming from the active inference framework, saccade-based visual exploration is idealized as an inference process, assuming that the target position and category are independently drawn from a common generative process....

What motivates an action in the absence of a definite reward? Taking the case of visuomotor control, we consider a minimal control problem: how to select the next saccade, in a sequence of discrete eye movements, when the final objective is to better interpret the current visual scene. The visual scene is modeled here as a partially-observed en...

We develop a comprehensive description of the active inference framework, as proposed by Friston (2010), under a machine-learning compliant perspective. Stemming from biological inspiration and auto-encoding principles, a sketch of a cognitive architecture is proposed that should provide ways to implement estimation-oriented control policies...

The objective of this dissertation is to shed light on some fundamental impediments in learning control laws in continuous state spaces. In particular, if one wants to build artificial devices capable of learning motor tasks the same way they learn to classify signals and images, one needs to establish control rules that do not necessitate comparisons...

Human biomedical research distinguishes between two principal types of variability, particularly in cognitive neuroscience and neuroimaging: intrasubject variability and intersubject variability. Research in cognitive neuroscience specifically aims to improve our understanding of the origins of intrasubject variability in beh...

The bandit classification problem considers learning the labels of an online data stream under mere "hit-or-miss" binary guiding. Adapting the OVA ("one-versus-all") hinge-loss setup, we develop a sparse and lightweight solution to this problem. The resulting sequential norm-minimal update solves the classification problem in finite time in the sepa...

We develop an algorithm for online learning of multiclass classifiers in the case where the classification information takes a binary form (correct or incorrect answer). The absence of explicit label information leads to a random sampling of the label space, along the lines of contextual bandits. The developed algorithm is based on the optimizati...

The signals used in non-invasive BCIs (EEG, MEG, ...) vary strongly over time for a given subject, both across different sessions of use and within a single session, depending on the subject's fatigue or motivation, or on hardware changes such as the position of the sensors and the...

Noise driven exploration of a brain network's dynamic repertoire has been hypothesized to be causally involved in cognitive function, aging and neurodegeneration. The dynamic repertoire crucially depends on the network's capacity to store patterns, as well as their stability. Here we systematically explore the capacity of networks derived from huma...

This paper presents a new online multiclass algorithm with bandit feedback, where, after making a prediction, the learning algorithm receives only partial feedback, i.e., the prediction is correct or not, rather than the true label. This algorithm, named Bandit Passive-Aggressive online algorithm (BPA), is based on the Passive-Aggressive Online alg...
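To make the bandit-feedback setting concrete, here is a minimal Banditron-style sketch (an illustrative baseline from this literature, not the BPA algorithm itself): the learner samples a guess from an exploration distribution and observes only whether that guess was correct, never the true label. All names, sizes, and parameter values below are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)

def banditron_step(W, x, y_true, gamma=0.2):
    """One round of multiclass learning under bandit feedback (Banditron-style).

    The learner samples a guess from an exploration distribution and only
    observes whether the guess was correct -- never the true label itself.
    """
    k = W.shape[0]
    y_hat = int(np.argmax(W @ x))
    # Exploit the current prediction, but explore uniformly with probability gamma
    p = np.full(k, gamma / k)
    p[y_hat] += 1.0 - gamma
    y_tilde = int(rng.choice(k, p=p))
    correct = bool(y_tilde == y_true)         # the only feedback available
    # Importance-weighted (unbiased) estimate of the perceptron update
    U = np.zeros_like(W)
    if correct:
        U[y_tilde] += x / p[y_tilde]
    U[y_hat] -= x
    return W + U, correct

# Toy stream: three well-separated Gaussian classes in 2-D
W = np.zeros((3, 2))
centers = np.array([[5.0, 0.0], [0.0, 5.0], [-5.0, -5.0]])
hits = []
for t in range(2000):
    y = int(rng.integers(3))
    x = centers[y] + rng.normal(scale=0.5, size=2)
    W, correct = banditron_step(W, x, y)
    hits.append(correct)
late_accuracy = float(np.mean(hits[-500:]))
```

On this toy stream the fraction of correct guesses rises well above chance in the later rounds, despite the ongoing forced exploration.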

We adapt a policy gradient approach to the problem of reward-based online learning of a non-invasive EEG-based "P300" speller. We first clarify the nature of the P300-speller classification problem and present a general regularized gradient ascent formula. We then show that when the reward is immediate and binary (namely "bad response" or "goo...
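The general idea of regularized gradient ascent on an immediate binary reward can be illustrated with a generic REINFORCE-style update on a softmax classifier; this is a sketch under assumed settings (toy 2-D stimuli, illustrative learning rate and L2 penalty), not the paper's P300-speller pipeline.

```python
import numpy as np

rng = np.random.default_rng(1)

def softmax(z):
    z = z - z.max()
    e = np.exp(z)
    return e / e.sum()

def reinforce_step(W, x, reward_fn, lr=0.5, l2=1e-3):
    """REINFORCE-style update with an immediate binary reward.

    reward_fn(a) returns 1.0 for a 'good response' and 0.0 for a 'bad
    response'; the true label itself is never revealed to the learner.
    """
    probs = softmax(W @ x)
    a = int(rng.choice(len(probs), p=probs))
    r = reward_fn(a)
    # Gradient of log pi(a | x) w.r.t. W for a softmax policy
    grad = (np.eye(len(probs))[a] - probs)[:, None] * x[None, :]
    # Center the 0/1 reward as a crude baseline, and apply L2 regularization
    W = W + lr * (r - 0.5) * grad - lr * l2 * W
    return W, r

# Toy run: two well-separated stimulus classes in 2-D
W = np.zeros((2, 2))
centers = np.array([[3.0, 0.0], [0.0, 3.0]])
rewards = []
for t in range(1500):
    y = int(rng.integers(2))
    x = centers[y] + rng.normal(scale=0.5, size=2)
    W, r = reinforce_step(W, x, lambda a: float(a == y))
    rewards.append(r)
late_reward = float(np.mean(rewards[-300:]))
```

The centered reward `(r - 0.5)` makes a wrong, rewarded-as-bad choice push the policy away from the sampled action, which is what lets the binary signal substitute for an explicit label.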

So far P300-speller design has put very little emphasis on the design of optimized flash patterns, a surprising fact given the importance of the sequence of flashes on the selection outcome. Previous work in this domain has consisted in studying consecutive flashes, to prevent the same letter or its neighbors from flashing consecutively. To this ef...

How does the brain learn to predict when an event is going to occur? We know from studies that vary the foreperiod – the time between a warning and a response stimulus – that people can model the temporal variability of stimulus onset, i.e. react faster when a stimulus is statistically more likely [1]. We also know that reaction time (RT) decreases...

Saccades are the fast eye movements dedicated to sight orientation toward targets. Robinson [1] suggested a model based on feedback control according to an internally estimated motor error relying on a "neuronal integrator". Although supported by numerous behavioral studies, this principle is still in search of biological confirmation. Indeed, neur...

We present a pilot simulation experiment in order to validate the use of reinforcement learning methods in the field of BCIs. We show that a direct policy-gradient approach combined with an appropriate spatial filter makes it possible to derive an autonomous online learning algorithm whose improvement is based on an error signal which can be, in principle, dir...

We present a neural architecture which combines a new reinforcement learning algorithm with a topographic encoding of the inputs, as inspired by kernel-based methods. This architecture is able to learn to control non-linear systems defined on a continuous space. Results on a reaching task are also given.

Various forms of noise are present in the brain. The role of noise in an exploration/exploitation trade-off is cast into the framework of reinforcement learning for a complex motor learning task. A neuro-controller using a linear transformation of the input, to which a Gaussian noise is added, is modeled as a stochastic controller that can be learn...

In machine learning, "kernel methods" give a consistent framework for applying the perceptron algorithm to non-linear problems. In reinforcement learning, an analog of the perceptron delta rule can be derived from the "policy-gradient" approach proposed by Williams in 1992 in the framework of stochastic neural networks. Despite its generality and...
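The dual ("kernelized") form of the perceptron mentioned above can be sketched as follows; the RBF kernel and the XOR toy task are illustrative choices, not taken from the paper.

```python
import numpy as np

def rbf(u, v, gamma=1.0):
    """Gaussian (RBF) kernel -- one illustrative choice of kernel."""
    return np.exp(-gamma * np.sum((u - v) ** 2))

def kernel_perceptron(X, y, kernel, epochs=10):
    """Dual-form perceptron: each mistake adds the example as a support point."""
    n = len(X)
    alpha = np.zeros(n)
    for _ in range(epochs):
        for i in range(n):
            score = sum(alpha[j] * y[j] * kernel(X[j], X[i]) for j in range(n))
            if np.sign(score) != y[i]:        # mistake (or undecided) -> update
                alpha[i] += 1.0
    return alpha

# XOR is linearly inseparable, but separable in the RBF feature space
X = np.array([[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0]])
y = np.array([-1.0, 1.0, 1.0, -1.0])
alpha = kernel_perceptron(X, y, rbf)
preds = [np.sign(sum(alpha[j] * y[j] * rbf(X[j], X[i]) for j in range(len(X))))
         for i in range(len(X))]
```

Because the update only touches the dual coefficients `alpha`, the same code works for any positive-definite kernel without ever computing the feature map explicitly.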

Despite the long and fruitful history of neuroscience, a global, multi-level description of cardinal brain functions is still far from reach. Using analytical or numerical approaches, Computational Neuroscience aims to bring out such common principles by using concepts from Dynamical Systems and Information Theory. The aim of this Special Is...

We study a model of neuronal specialization using a policy gradient reinforcement approach. (1) The neurons stochastically fire according to their synaptic input plus a noise term; (2) the environment is a closed-loop system composed of a rotating eye and a visual punctual target; (3) the network is composed of a foveated retina, a primary layer an...

We study a model of neuronal specialization using a policy gradient reinforcement approach. (1) The neurons stochastically fire according to their synaptic input plus a noise term; (2) the environment is a closed-loop system composed of a rotating eye and a visual punctual target; (3) the network is composed of a foveated retina directly connecte...

In this paper, we investigate how Spike-Timing Dependent Plasticity, when applied to a random recurrent neural network of leaky integrate-and-fire neurons, can affect its dynamical regime. We show that in an autonomous network with self-sustained activity, STDP has a regularization effect and simplifies the dynamics. We then look at two different w...
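For reference, the pair-based STDP window commonly used in this kind of study can be written as a small function; the amplitudes and time constants below are standard illustrative values, not those of the paper.

```python
import numpy as np

def stdp_dw(delta_t, a_plus=0.01, a_minus=0.012, tau_plus=20.0, tau_minus=20.0):
    """Pair-based STDP window (parameter values are illustrative).

    delta_t = t_post - t_pre in ms. A presynaptic spike shortly before a
    postsynaptic one (delta_t > 0) potentiates the synapse; the reverse
    ordering (delta_t < 0) depresses it, with exponentially decaying magnitude.
    """
    if delta_t >= 0:
        return a_plus * np.exp(-delta_t / tau_plus)
    return -a_minus * np.exp(delta_t / tau_minus)
```

Making `a_minus` slightly larger than `a_plus` is a common choice that biases the rule toward net depression for uncorrelated spike pairs, which is one route to the regularizing effect on network dynamics discussed above.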

This paper addresses the question of the functional role of the dual application of positive and negative Hebbian time-dependent plasticity rules, in the particular framework of reinforcement learning tasks. Our simulations take place in a recurrent network of spiking neurons with inhomogeneous synaptic weights. A spike-timing dependent plasticity (...

This paper is a presentation of neuronal control systems in terms of dynamical systems theory, where (1) the controller and its surrounding environment are seen as two co-dependent controlled dynamical systems and (2) the behavioral transitions that take place under adaptation processes are analyzed in terms of phase transitions. We present in...

We present in this paper a general model of recurrent networks of spiking neurons, composed of several populations, whose interaction pattern is set with a random draw. For simplicity, we use discrete-time neuron updating, and the emitted spikes are transmitted through randomly delayed lines. In excitatory-inhibitory networks, we show that inhom...
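A minimal discrete-time simulation of a random excitatory-inhibitory network of binary (spike / no-spike) neurons might look as follows; the population sizes, weight scale, and inhibitory gain are illustrative assumptions, and random transmission delays are omitted for brevity.

```python
import numpy as np

rng = np.random.default_rng(2)

def simulate(n_exc=80, n_inh=20, g=4.0, theta=0.0, steps=200):
    """Discrete-time random recurrent network of binary spiking neurons.

    Population sizes, the inhibitory gain g, and the weight scale are
    illustrative assumptions, not values taken from the paper.
    """
    n = n_exc + n_inh
    W = rng.normal(0.0, 1.0 / np.sqrt(n), size=(n, n))
    W[:, :n_exc] = np.abs(W[:, :n_exc])          # excitatory presynaptic columns
    W[:, n_exc:] = -g * np.abs(W[:, n_exc:])     # inhibitory columns, scaled by g
    s = (rng.random(n) < 0.5).astype(float)      # random initial spike pattern
    rates = []
    for _ in range(steps):
        s = (W @ s > theta).astype(float)        # synchronous threshold update
        rates.append(s.mean())
    return np.array(rates)

rates = simulate()
```

With the inhibitory gain roughly balancing the larger excitatory population, the network settles into sustained, fluctuating activity rather than saturating or dying out.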

Taking a global analogy with the structure of perceptual biological systems, we present a system composed of two layers of real-valued sigmoidal neurons. The primary layer receives stimulating spatiotemporal signals, and the secondary layer is a fully connected random recurrent network. This secondary layer spontaneously displays complex chaotic dy...

In this paper, we first present a new mathematical approach, based on large deviation techniques, for the study of a large random recurrent neural network with discrete time dynamics. In particular, we state a mean field property and a law of large numbers, in the most general case of random models with sparse connections and several populations. O...

After critical appraisal of mathematical and biological characteristics of the model, we discuss how a classical hippocampal neural network expresses functions similar to those of the chaotic model, and then present an alternative stimulus-driven chaotic random recurrent neural network (RRNN) that learns patterns as well as sequences, and controls...

We present a process that allows complex spatiotemporal signals to be learned in a large random neural system with self-generated chaotic dynamics. Our system models a generic sensory structure. We study the interplay between a periodic spatio-temporal stimulus, i.e., a sequence of spatial patterns (which are not necessarily orthogonal), and the inn...

Freeman's investigations of the olfactory bulb of the rabbit showed that its signal dynamics are chaotic, and that recognition of a learned stimulus is linked to a reduction in the dimension of the attractor of the dynamics. In this paper we address the question of whether this behavior is specific to this particular architecture, or whether it is a general property. We...

Although extraordinarily complex, mental processes can be regarded as products of the neuronal dynamical system. In this context, biological observations support the conjecture that the recognition of a form or stimulus leads to a reduction of the neuronal dynamics. This paper proposes a generic model for the study of such dynamics...

This paper presents the guidelines of an ongoing project of the "Movement Dynamics" team in the "Movement and perception" Lab, UMR6152, Marseille. We address the question of Hebbian learning in large recurrent networks. The aim of this research is to present new functional models of learning, through the use of well-known methods in a context of hi...

We present a multi-population dynamic neural network model with binary activation and a random interaction pattern. The weight parameters have been specified in order to distinguish excitatory populations from inhibitory populations. Under specific parameters, we design functional modules composed of two populations, one of excitatory neurons, one...

In the framework of dynamic neural networks, learning refers to the slow process by which a neural network modifies its own structure under the influence of environmental pressure. Our simulations take place on large random recurrent neural networks (RRNNs). We present several results obtained with the use of a TD (temporal difference) and STDP (Spike-...

## Projects (1)