Raphaël Feraud
Orange Innovation

PhD

About

80
Publications
44,284
Reads
1,276
Citations
Citations since 2017
24 Research Items
625 Citations
[Chart: citations per year, 2017 to 2023]
Introduction
My current research interests are machine learning, reinforcement learning and online learning. I have a background in image processing, especially face detection. I still have a weakness for neural networks.
Additional affiliations
January 2004 - present

Publications (80)
Conference Paper
Full-text available
To address the contextual bandit problem, we propose an online random forest algorithm. The analysis of the proposed algorithm is based on the sample complexity needed to find the optimal decision stump. Then, the decision stumps are recursively stacked in a random collection of decision trees, BANDIT FOREST. We show that the proposed algorithm is...
Conference Paper
Full-text available
Dialogue systems rely on a careful reinforcement learning design: the learning algorithm and its state space representation. Lacking more rigorous knowledge, the designer resorts to practical experience to choose the best option. In order to automate and to improve the performance of the aforementioned process, this article formalises the pr...
Article
Full-text available
We consider a variant of the stochastic multi-armed bandit with K arms where the rewards are not assumed to be identically distributed, but are generated by a non-stationary stochastic process. We first study the unique best arm setting when there exists one unique best arm. Second, we study the general switching best arm setting when a best arm sw...
Conference Paper
Full-text available
We consider the decentralized exploration problem: a set of players collaborate to identify the best arm by asynchronously interacting with the same stochastic environment. The objective is to ensure privacy in the best arm identification problem between asynchronous, collaborative, and thrifty players. In the context of a digital service, we advo...
Conference Paper
Full-text available
In this paper, we consider the problem of sequential change-point detection where both the change-points and the distributions before and after the change are assumed to be unknown. For this problem of primary importance in statistical and sequential learning theory, we derive a variant of the Bayesian Online Change Point Detector proposed by (Fear...
Preprint
Full-text available
In Batched Multi-Armed Bandits (BMAB), the policy is not allowed to be updated at each time step. Usually, the setting imposes a maximum number of allowed policy updates and the algorithm schedules them so as to minimize the expected regret. In this paper, we describe a novel setting for BMAB, with the following twist: the timing of the policy up...
Conference Paper
Full-text available
One characteristic of the Cloud is elasticity: it provides the ability to adapt the resources allocated to applications as needed at run-time. This capacity relies on scaling and scheduling. This article studies online horizontal scaling. The aim is to dynamically determine application deployment parameters and to adjust them in order to respect...
Conference Paper
Full-text available
In various recommender system applications, from medical diagnosis to dialog systems, due to observation costs only a small subset of a potentially large number of context variables can be observed at each iteration; however, the agent has the freedom to choose which variables to observe. In this paper, we analyze and extend an online learning framew...
Preprint
In this paper, we analyze and extend an online learning framework known as Context-Attentive Bandit, motivated by various practical applications, from medical diagnosis to dialog systems, where due to observation costs only a small subset of a potentially large number of context variables can be observed at each iteration; however, the agent has a f...
Presentation
Full-text available
In reinforcement learning, an agent chooses actions in order to maximize the rewards given by a dynamic environment. As the environment is initially unknown, the agent has to interact with it to gather information. Moreover, only the reward of the chosen actions is revealed. That is why the agent faces the exploration/exploitation dilemma: she has...
Presentation
Full-text available
Presentation of Decentralized Exploration in Multi-Armed Bandits
Conference Paper
Full-text available
The Industrial Internet of Things (IIoT) faces multiple challenges to achieve high reliability, low-latency and low power consumption. The IEEE 802.15.4 Time-Slotted Channel Hopping (TSCH) protocol aims to address these issues by using frequency hopping to improve the transmission quality when coping with low-quality channels. However, an optimized...
Presentation
Full-text available
Reinforcement Learning, Multi-Armed Bandits, and AB testing
Conference Paper
Full-text available
The use of Low Power Wide Area Networks (LPWANs) is growing due to their advantages in terms of low cost, energy efficiency and range. Although LPWANs attract the interest of industry and network operators, they face certain constraints related to energy consumption, network coverage and quality of service. In this paper we demonstrate the possibili...
Presentation
Full-text available
The optimization of LoRa transmission is cast as a reinforcement learning problem: several multi-armed bandit (MAB) algorithms are compared with Adaptive Data Rate (ADR), which is the algorithm defined in LoRa Network. In experiments on a realistic LoRa Network simulator, ADR is dominated by the MAB algorithms in terms of both energy consumption and packet losses.
Presentation
Full-text available
Presentation on AI and disability for the Femmes En Entreprise Adaptée awards
Poster
Full-text available
Thompson Sampling exhibits excellent results in practice and has been shown to be asymptotically optimal. The extension of the Thompson Sampling algorithm to the Switching Multi-Armed Bandit problem, proposed in [13], is a Thompson Sampling equipped with a Bayesian online changepoint detector [1]. In this paper, we propose another extension of th...
Conference Paper
Full-text available
Thompson Sampling exhibits excellent results in practice and has been shown to be asymptotically optimal. The extension of the Thompson Sampling algorithm to the Switching Multi-Armed Bandit problem, proposed in [13], is a Thompson Sampling equipped with a Bayesian online changepoint detector [1]. In this paper, we propose another extension of th...
Conference Paper
Full-text available
We consider a novel formulation of the multi-armed bandit model, which we call the contextual bandit with restricted context, where only a limited number of features can be accessed by the learner at every iteration. This novel formulation is motivated by different online problems arising in clinical trials, recommender systems and attention modeli...
Conference Paper
Full-text available
Contextual bandits can be viewed as a generalization of online classification models, where only the chosen class is observed. Selecting among learning experts makes it possible to find the best parametrization of an expert during its learning, within a set of predefined parameters, and reduces the bias of the hypothesis space, and hence improves the per...
Data
The purpose of this code is to test the BanditForest algorithm and to reproduce the experiments of the AISTATS paper "Random forest for the contextual bandit problem."
Technical Report
Long version available at: https://www.researchgate.net/publication/315522010_The_Non-stationary_Stochastic_Multi-armed_Bandit_Problem
Article
We consider a variant of the multi-armed bandit model, which we call multi-armed bandit problem with known trend, where the gambler knows the shape of the reward function of each arm but not its distribution. This new problem is motivated by different online problems like active learning, music and interface recommendation applications, where when...
Presentation
Full-text available
Bandit Forest and use cases
Conference Paper
Full-text available
The multi-armed bandit is a model of exploration and exploitation, where one must select, within a finite set of arms, the one which maximizes the cumulative reward up to the time horizon T. For the adversarial multi-armed bandit problem, where the sequence of rewards is chosen by an oblivious adversary, the notion of best arm during the time horiz...
Conference Paper
Full-text available
In the multi-armed bandit problem, a player chooses among several arms with different expected rewards. The goal is to maximize the reward obtained after T trials. The player must therefore explore to estimate the rewards of each machine while exploiting the arm estimated to be the best. This is the dilemma of exploration/ex...
Conference Paper
In the multi-armed bandit problem, a player chooses among several arms with different expected rewards. The goal is to maximize the reward obtained after T trials. The player must therefore explore to estimate the rewards of each machine while exploiting the arm estimated to be the best. This is the dilemma of exploration/ex...
Conference Paper
Labelling training examples is a costly task in supervised classification. Active learning strategies address this problem by selecting the most useful unlabelled examples to train a predictive model. The choice of examples to label can be seen as a dilemma between exploration and exploitation over the data space representation. In...
Article
To address the contextual bandit problem, we propose an online decision tree algorithm. We show that the proposed algorithm, KMD-Tree, incurs an expected cumulative regret on the order of O(log T) against the greedy decision tree built knowing the joint distribution of contexts and rewards. We show that this problem-dependent regret bound is optimal...
Conference Paper
Full-text available
HAL is a multi-disciplinary open access archive for the deposit and dissemination of scientific research documents, whether they are published or not. The documents may come from teaching and research institutions in France or abroad, or from public or private research centers...
Conference Paper
Full-text available
This paper presents a new contextual bandit algorithm, NeuralBandit, which requires no stationarity assumption on contexts and rewards. Several neural networks are trained to model the value of rewards given the context. Two variants, based on a multi-expert approach, are proposed to choose online the parameters of multi-layer perceptrons....
Conference Paper
Full-text available
In this article, we propose a new contextual bandit algorithm, NeuralBandit, which makes no stationarity assumption on contexts and rewards. The proposed algorithm uses several multilayer perceptrons, each learning the probability that an action, given the context, leads to a reward. In order to tune...
Data
Optimization of online decisions and its applications. Presentation given at the Laboratoire de Recherche en Informatique in Orsay, Paris XI.
Article
Full-text available
We consider a variant of the multi-armed bandit model, which we call scratch games, where the sequences of rewards are finite and drawn in advance with unknown starting dates. This new problem is motivated by online advertising applications where the number of ad displays is fixed according to a contract between the advertiser and the publisher, an...
Conference Paper
Full-text available
We study a stochastic online learning scheme with partial feedback where the utility of decisions is only observable through an estimation of the environment parameters. We propose a generic pure-exploration algorithm, able to cope with various utility functions from multi-armed bandits settings to dueling bandits. The primary application of this s...
Conference Paper
Full-text available
Stochastic multi-armed bandit algorithms are used to solve the exploration and exploitation dilemma in sequential optimization problems. The algorithms based on upper confidence bounds offer strong theoretical guarantees, are easy to implement, and are efficient in practice. We consider a new bandit setting, called "scratch-games", where arm budge...
Conference Paper
Full-text available
One of the most critical operators for a Data Stream Management System is the join operator. Unfortunately, the join operator between the streams A and B is a blocking operator: for each current tuple of stream A, the entire stream B has to be scanned. The usual technique for unblocking stream operators consists in restricting the processing...
Conference Paper
Full-text available
This paper addresses a task of variable selection which consists in choosing a subset of variables that is sufficient to predict the target label well. Here instead of trying to directly determine which variables are better, we make use of prior knowledge to learn the properties of good variables and guide the selection towards the most relevant d...
Conference Paper
Full-text available
In itself, the continuous exponential increase in data warehouse size does not necessarily lead to richer and finer-grained information, since processing capabilities do not increase at the same rate. Current state-of-the-art technologies require the user to strike a delicate balance between the processing cost and the information quality...
Conference Paper
Full-text available
Abstract: A major trend since the end of the last century is the exponential increase in the volume of stored data. This increase does not necessarily translate into richer information, since the capacity to process these data does not grow as quickly. With current technologies, a difficult trade-off must be...
Conference Paper
Full-text available
This paper presents a method to interpret the output of a classification (or regression) model. The interpretation is based on two concepts: the variable importance and the value importance of the variable. Unlike most state-of-the-art interpretation methods, our approach allows the interpretation of the model output for every instance. Understa...
Conference Paper
Full-text available
Abstract. This article presents a method for interpreting the output of a classification or regression model. The interpretation of the model output is based on two quantities: the importance of the variable and the importance of the value of the variable. Unlike most state-of-the-art interpretation methods, our...
Conference Paper
Full-text available
The influx of data on the use of products and services requires heavy processing to turn it into information. Yet the capacity to process data cannot keep up with the exponential growth of stored volumes. With current technologies, a difficult trade-off must be found between the implementation cost and the qua...
Conference Paper
Full-text available
In the field of neural networks, feature selection has been studied for the last ten years and classical as well as original methods have been employed. This paper reviews the efficiency of four approaches to feature selection applied to neural networks. We assess the efficiency of these methods when the number of examples is significantly lower th...
Conference Paper
Full-text available
In the field of neural networks, feature selection has been studied for the last ten years and classical as well as original methods have been employed. This paper reviews the efficiency of four approaches to driven forward feature selection on neural networks. We assess the efficiency of these methods compared to the simple Pearson criterion...
Patent
The invention concerns a method for extracting a cross table (22) from a database (24). The cross table comprises instances in rows (respectively columns) and indicators characterizing the instances in columns (respectively rows). The method comprises the following steps: a) obtaining (51) an initial specification of a...
Article
Full-text available
Neural networks are still frustrating tools in the data mining arsenal. They exhibit excellent modelling performance, but do not give a clue about the structure of their models. We propose a methodology to explain the classification obtained by a multilayer perceptron. We introduce the concept of 'causal importance' and define a saliency measuremen...
Article
Full-text available
Detecting faces in images with complex backgrounds is a difficult task. Our approach, which obtains state of the art results, is based on a neural network model: the constrained generative model (CGM). Generative, since the goal of the learning process is to evaluate the probability that the model has generated the input data, and constrained since...
Patent
The invention concerns an automatic system for sound and image recording in particular for videoconference, comprising means controlling (20) image and sound recording sensors (10) and sequence analysing means (40) monitoring said control means (20) to obtain automatic framing of the sequence being filmed. The invention is characterised in that an...
Conference Paper
Full-text available
Detecting faces in images with complex backgrounds is a difficult task. Our approach, which obtains state-of-the-art results, is based on a generative neural network model: the constrained generative model (CGM). To detect side-view faces and to decrease the number of false alarms, a conditional mixture of networks is used. To decrease the computat...
Conference Paper
Full-text available
Allocating resources to data traffic in telecommunication networks is a difficult problem because of the complex dynamics exhibited by this kind of traffic and because of the difficult trade-off between the delivered quality of service and the wasted bandwidth. We describe and compare the performance of two controllers of different designs (a Kalma...
Conference Paper
Full-text available
In a face-to-face meeting, participants are free to choose who they look at and who they listen to. In a videoconference, participants can only see or hear the audio-visual signal that the distant participants have decided or have been able to transmit. We present Panorama, a contactless visual man-machine interface which makes it possible to explore a distan...
Conference Paper
Full-text available
A real-time system is described for automatic detection and tracking of multiple persons, in the context of video-conferencing systems. This system, called MULTRAK (multiperson locating and tracking automatic kernel), is able to continuously detect and track the position of faces in its field of view. The heart of the system is a modular neural netw...
Thesis
Full-text available
Neural networks are statistical models that enable numerical learning from examples. A model for training a neural network is developed and applied to face detection. This approach is based on autoassociative neural networks. We will see that, to use this type of network as an es...
Conference Paper
Full-text available
We present a neural network approach to human face detection. Using a modular system, a conditional mixture of networks, we are able to detect front-view faces as well as turned faces (up to 50 degrees) with excellent performance. This modular network is integrated into LISTEN, our face tracking system. It enables this system to detect and track i...
Conference Paper
Full-text available
Visual and acoustic information is at the heart of (tele)communication between people. The face is the main source of information. Techniques for detecting motion and skin color delimit regions of interest where faces may be found. A neural network detects the face and provides the positio...
Article
Full-text available
A generative neural network model, constrained by non-face examples chosen by an iterative algorithm, is applied to face detection. To extend the detection ability in orientation and to decrease the number of false alarms, different combinations of networks are tested: ensemble, conditional ensemble and conditional mixture of networks. The use of a...
Article
Full-text available
A generative neural network model, constrained by non-face examples chosen by an iterative algorithm, is applied to face detection. To improve the generalization ability of the model, another constraint based on the minimum description length is added. This model is tested and compared with another state-of-the-art face detection system on a large...
Conference Paper
Full-text available
A new learning model based on autoassociative neural networks is developed and applied to face detection. To extend the detection ability in orientation and to decrease the number of false alarms, different combinations of networks are tested: ensemble, conditional ensemble and conditional mixture of networks. The use of a conditional mixture of n...
Conference Paper
Full-text available
Conference Paper
Full-text available
Both visual and acoustic information provide effective means of telecommunication between persons. In this context, the face is the most important part of the person both visually and acoustically. We describe how the cooperation of image and audio processing makes it possible to track a person's face and to collect the audio information it produces. We pr...

Questions (7)
