About
36 publications
5,561 reads
319 citations
Additional affiliations
October 2005 - June 2015
Publications (36)
Task-oriented dialogue systems are designed to achieve specific goals while conversing with humans. In practice, they may have to handle several domains and tasks simultaneously. The dialogue manager must therefore be able to take domain changes into account and plan over different domains and tasks in order to deal with multidomain dialogues. However,...
A learning dialogue agent can infer its behaviour from interactions with users. These interactions can be taken from either human-to-human or human-machine conversations. However, human interactions are scarce and costly, making learning from few interactions essential. One solution to speed up the learning process is to guide the agent's explor...
Smartphones are ubiquitous in our daily lives. They are a computing
resource within arm's reach, with direct access to a considerable amount of
personal information. They are a very valuable source of data for
telecommunication operators, but the highly decentralized nature of these
data and the...
We study a variant of the stochastic multi-armed bandit (MAB) problem in which the rewards are corrupted. In this framework, motivated by privacy preservation in online recommender systems, the goal is to maximize the sum of the (unobserved) rewards, based on the observation of a transformation of these rewards through a stochastic corruption process...
We formalize the asynchronous multi-armed bandits with known trend problem (AMABKT)
and propose a few empirical solutions, the most efficient one being based on finite-horizon
Gittins indices.
To address the contextual bandit problem, we propose an online random forest algorithm. The analysis of the proposed algorithm is based on the sample complexity needed to find the optimal decision stump. Then, the decision stumps are recursively stacked in a random collection of decision trees, BANDIT FOREST. We show that the proposed algorithm is...
We present an approach to structured prediction from bandit feedback, called
Bandit Structured Prediction, where only the value of a task loss function at a
single predicted point, instead of a correct structure, is observed in
learning. We present an application to discriminative reranking in Statistical
Machine Translation (SMT) where the learnin...
Bandit Forest and use cases
Partial monitoring is a generic framework for sequential decision-making with incomplete
feedback. It encompasses a wide class of problems such as dueling bandits, learning with
expert advice, dynamic pricing, dark pools, and label-efficient prediction. We study the
utility-based adversarial dueling bandit problem as an instance of partial monitori...
Partial monitoring is a generic framework for sequential decision-making with
incomplete feedback. It encompasses a wide class of problems such as dueling
bandits, learning with expert advice, dynamic pricing, dark pools, and
label-efficient prediction. We study the utility-based dueling bandit problem as an
instance of partial monitoring problem a...
We study the K-armed dueling bandit problem, a variation of the
classical Multi-Armed Bandit (MAB) problem in which the learner receives only
relative feedback about the selected pairs of arms. We propose a new algorithm,
the Relative Exponential-weight algorithm for Exploration and Exploitation
(REX3), to handle the adversarial utility-ba...
Optimization of online decisions and its applications.
Presentation given at the Laboratoire de Recherche en Informatique, Orsay (Paris XI).
We consider a variant of the multi-armed bandit model, which we call scratch games, where the sequences of rewards are finite and drawn in advance with unknown starting dates. This new problem is motivated by online advertising applications where the number of ad displays is fixed according to a contract between the advertiser and the publisher, an...
We study a stochastic online learning scheme with partial feedback where the utility of decisions is only observable through an estimation of the environment parameters. We propose a generic pure-exploration algorithm, able to cope with various utility functions, from multi-armed bandit settings to dueling bandits. The primary application of this s...
Social networks mirror public opinion. Thus, they are of great interest for opinion
mining and sentiment analysis. In most cases, sentiment is classified according to a polarity criterion
or a linear gradation. Is it possible to learn more complex patterns with limited itemsets?
In this article, we investigate brand equity. The goal is to analyze h...
Stochastic multi-armed bandit algorithms are used to solve the exploration and exploitation dilemma in sequential optimization problems. Algorithms based on upper confidence bounds offer strong theoretical guarantees, are easy to implement, and are efficient in practice. We consider a new bandit setting, called "scratch-games", where arm budge...
We propose a practical method of semi-supervised feature learning with kernels
constructed from combinations of non-linear weak rankers for classification
applications. While in kernel methods one usually avoids working in the
implicit feature space, we use the outputs of weak rankers as new features, and define
the kernel as scalar product...
The Pascal Exploration & Exploitation challenge 2011 seeks to evaluate algorithms for the online website content selection problem. This article presents the solution we used to achieve second place in this challenge and some side-experiments we performed. The methods we evaluated are all structured in three layers. The first layer provides an onli...
Fake content is flourishing on the Internet, ranging from basic random word salads to web scraping. Most of this fake content is generated for the purpose of nourishing fake web sites aimed at biasing search engine indexes: at the scale of a search engine, using automatically generated texts renders such sites harder to detect than using copies of e...
We present here the contribution of the MADSPAM consortium to the ECML/PKDD Discovery Challenge 2010. The submitted method is based on both a RankBoost algorithm and propagation techniques.
Automatically generated content is ubiquitous on the web: dynamic sites built using the three-tier paradigm are good examples (e.g., commercial sites, blogs and other sites edited using web authoring software), as well as less legitimate spamdexing attempts (e.g., link farms, faked directories).
Those pages built using the same generating method (t...
How can natural texts be distinguished from artificially generated ones? Fake content is commonly encountered on the Internet, ranging from web scraping to random word salads. Most of this fake content is generated for spam purposes. In this paper, we present two methods to deal with this problem. The first one uses classical language models, while t...
Automatically generated content is ubiquitous on the web: dynamic sites built using the three-tier paradigm are good examples (e.g. commercial sites, blogs and other sites powered by web authoring software), as well as less legitimate spamdexing attempts (e.g. link farms, faked directories...). Those pages built using the same generating method...
We propose to study two infinite graph transformations that we respectively call bounded and unbounded path transduction. These graph transformations are based on path substitutions and graph products. When graphs are considered as automata, path transductions correspond to rational word transductions on the accepted languages. They define strict s...
A natural way to describe a family of languages is to use rational transformations from a generator. From these transformations,
Ginsburg and Greibach have defined the Abstract Family of Languages (AFL). Infinite graphs (also called infinite automata)
are natural tools to study languages. In this paper, we study families of infinite graphs that are...
An important part of the knowledge in Go programs is based on two-dimensional patterns. When these patterns are used as heuristics in reading algorithms, the pattern matching can become time-critical. We present a fast pattern matching algorithm based on Deterministic Finite State Automata.
The aim of this article is to make a link between the congruential systems investigated by Conway and the theory of infinite graphs. We compare the graphs of congruential systems with a well-known family of infinite graphs: the regular graphs of finite degree. Those graphs, first considered by Muller and Schupp and then by Courcelle, are the transition...
A congruential graph is a finite union of labeled affine relations on the integers. Deterministic congruential graphs, studied by Conway, have the full power of computability. We consider the family of regular graphs of finite degree, whose monadic theory is decidable. They are the rational restrictions of the graphs of trans...
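Several of the abstracts above refer to stochastic multi-armed bandit algorithms based on upper confidence bounds. As a rough illustration only, here is a minimal sketch of the textbook UCB1 rule. It is not the scratch-games algorithm or any other method from the listed papers. The names `ucb1` and `make_bernoulli_arms`, the Bernoulli reward model, and the arm means are assumptions made for the example.

```python
import math
import random


def ucb1(pull, n_arms, horizon):
    """Textbook UCB1: `pull(arm)` must return a reward in [0, 1]."""
    counts = [0] * n_arms   # number of times each arm has been played
    sums = [0.0] * n_arms   # cumulative reward collected from each arm
    for t in range(1, horizon + 1):
        if t <= n_arms:
            # Initialisation: play every arm once.
            arm = t - 1
        else:
            # Play the arm with the largest empirical mean plus exploration bonus.
            arm = max(
                range(n_arms),
                key=lambda a: sums[a] / counts[a]
                + math.sqrt(2.0 * math.log(t) / counts[a]),
            )
        reward = pull(arm)
        counts[arm] += 1
        sums[arm] += reward
    return counts, sums


def make_bernoulli_arms(means, seed=0):
    """Hypothetical simulated environment: arm i pays 1 with probability means[i]."""
    rng = random.Random(seed)

    def pull(arm):
        return 1.0 if rng.random() < means[arm] else 0.0

    return pull


if __name__ == "__main__":
    counts, sums = ucb1(make_bernoulli_arms([0.2, 0.5, 0.8]), n_arms=3, horizon=5000)
    print("plays per arm:", counts)
```

In this sketch the exploration bonus shrinks as an arm is played more often, so play gradually concentrates on the arm with the highest estimated mean.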
Projects
Projects (3)
We have created a working group on time series that includes people from Orange as well as people from universities and Inria teams. The aim is to share ideas and to collaborate (for example through PhD students).
The last ten years have been prolific in the statistical learning and data mining field, and it is now easy to find learning algorithms that are fast and automatic. Historically, a strong hypothesis was that all examples were available or could be loaded into memory so that learning algorithms could use them straight away. Recently, however, new use cases generating large amounts of data have emerged, for example: monitoring of telecommunication networks, user modeling in dynamic social networks, web mining... The volume of data is increasing rapidly, and it is now necessary to use incremental learning algorithms on data streams. This project presents the main approaches to incremental supervised classification that we have developed... up to now