Tanguy Urvoy

France Télécom, France · Orange-Labs

PhD

About

36
Publications
5,561
Reads
319
Citations
Citations since 2017
6 Research Items
152 Citations
Additional affiliations
October 2005 - June 2015
Orange Labs
Position
  • Engineer


Publications (36)
Preprint
Full-text available
Task-oriented dialogue systems are designed to achieve specific goals while conversing with humans. In practice, they may have to handle simultaneously several domains and tasks. The dialogue manager must therefore be able to take into account domain changes and plan over different domains/tasks in order to deal with multidomain dialogues. However,...
Preprint
Full-text available
A learning dialogue agent can infer its behaviour from interactions with the users. These interactions can be taken from either human-to-human or human-machine conversations. However, human interactions are scarce and costly, making learning from few interactions essential. One solution to speed up the learning process is to guide the agent's explor...
Conference Paper
Full-text available
Smartphones are ubiquitous in our daily lives. They are a computing resource within arm's reach, with direct access to a considerable amount of personal information. They are a very valuable source of data for telecommunication operators, but the highly decentralized nature of these data and the...
Article
Full-text available
We study a variant of the stochastic multi-armed bandit (MAB) problem in which the rewards are corrupted. In this framework, motivated by privacy preservation in online recommender systems, the goal is to maximize the sum of the (unobserved) rewards, based on the observation of a transformation of these rewards through a stochastic corruption process...
Poster
Full-text available
We formalize the asynchronous multi-armed bandits with known trend problem (AMABKT) and propose a few empirical solutions, the most efficient one being based on finite-horizon Gittins indices.
Conference Paper
Full-text available
To address the contextual bandit problem, we propose an online random forest algorithm. The analysis of the proposed algorithm is based on the sample complexity needed to find the optimal decision stump. Then, the decision stumps are recursively stacked in a random collection of decision trees, BANDIT FOREST. We show that the proposed algorithm is...
Article
Full-text available
We present an approach to structured prediction from bandit feedback, called Bandit Structured Prediction, where only the value of a task loss function at a single predicted point, instead of a correct structure, is observed in learning. We present an application to discriminative reranking in Statistical Machine Translation (SMT) where the learnin...
Presentation
Full-text available
Bandit Forest and use cases
Conference Paper
Full-text available
Partial monitoring is a generic framework for sequential decision-making with incomplete feedback. It encompasses a wide class of problems such as dueling bandits, learning with expert advice, dynamic pricing, dark pools, and label efficient prediction. We study the utility-based adversarial dueling bandit problem as an instance of partial monitori...
Article
Full-text available
Partial monitoring is a generic framework for sequential decision-making with incomplete feedback. It encompasses a wide class of problems such as dueling bandits, learning with expert advice, dynamic pricing, dark pools, and label efficient prediction. We study the utility-based dueling bandit problem as an instance of partial monitoring problem a...
Conference Paper
Full-text available
We study the K-armed dueling bandit problem which is a variation of the classical Multi-Armed Bandit (MAB) problem in which the learner receives only relative feedback about the selected pairs of arms. We propose a new algorithm called Relative Exponential-weight algorithm for Exploration and Exploitation (REX3) to handle the adversarial utility-ba...
Data
Optimization of online decisions and its applications. Presentation given at the Laboratoire de Recherche en Informatique of Orsay, Paris XI.
Article
Full-text available
We consider a variant of the multi-armed bandit model, which we call scratch games, where the sequences of rewards are finite and drawn in advance with unknown starting dates. This new problem is motivated by online advertising applications where the number of ad displays is fixed according to a contract between the advertiser and the publisher, an...
Conference Paper
Full-text available
We study a stochastic online learning scheme with partial feedback where the utility of decisions is only observable through an estimation of the environment parameters. We propose a generic pure-exploration algorithm, able to cope with various utility functions from multi-armed bandits settings to dueling bandits. The primary application of this s...
Conference Paper
Full-text available
Social networks mirror public opinion. Thus, they are of great interest for opinion mining and sentiment analysis. In most cases, sentiment is classified according to a polarity criterion or a linear gradation. Is it possible to learn more complex patterns with limited itemsets? In this article, we investigate brand equity. The goal is to analyze h...
Conference Paper
Full-text available
Stochastic multi-armed bandit algorithms are used to solve the exploration and exploitation dilemma in sequential optimization problems. The algorithms based on upper confidence bounds offer strong theoretical guarantees; they are easy to implement and efficient in practice. We consider a new bandit setting, called "scratch-games", where arm budge...
Conference Paper
Full-text available
We propose a practical method of semi-supervised feature learning with constructed kernels from combinations of non-linear weak rankers for classification applications. While in kernel methods one usually avoids working in the implied implicit feature space, we use the outputs of weak rankers as new features, and define the kernel as scalar product...
Conference Paper
Full-text available
The Pascal Exploration & Exploitation challenge 2011 seeks to evaluate algorithms for the online website content selection problem. This article presents the solution we used to achieve second place in this challenge and some side-experiments we performed. The methods we evaluated are all structured in three layers. The first layer provides an onli...
Article
Full-text available
Fake content is flourishing on the Internet, ranging from basic random word salads to web scraping. Most of this fake content is generated for the purpose of nourishing fake web sites aimed at biasing search engine indexes: at the scale of a search engine, using automatically generated texts renders such sites harder to detect than using copies of e...
Article
Full-text available
We present here the contribution of the MADSPAM consortium to the ECML/PKDD Discovery Challenge 2010. The submitted method is based on both a RankBoost algorithm and on propagation techniques.
Article
Full-text available
Automatically generated content is ubiquitous in the web: dynamic sites built using the three-tier paradigm are good examples (e.g., commercial sites, blogs and other sites edited using web authoring software), as well as less legitimate spamdexing attempts (e.g., link farms, faked directories). Those pages built using the same generating method (t...
Conference Paper
Full-text available
How to distinguish natural texts from artificially generated ones? Fake content is commonly encountered on the Internet, ranging from web scraping to random word salads. Most of this fake content is generated for spam purposes. In this paper, we present two methods to deal with this problem. The first one uses classical language models, while t...
Conference Paper
Full-text available
Automatically generated content is ubiquitous in the web: dynamic sites built using the three-tier paradigm are good examples (e.g. commercial sites, blogs and other sites powered by web authoring software), as well as less legitimate spamdexing attempts (e.g. link farms, faked directories...). Those pages built using the same generating method...
Conference Paper
Full-text available
We propose to study two infinite graph transformations that we respectively call bounded and unbounded path transduction. These graph transformations are based on path substitutions and graph products. When graphs are considered as automata, path transductions correspond to rational word transductions on the accepted languages. They define strict s...
Conference Paper
Full-text available
A natural way to describe a family of languages is to use rational transformations from a generator. From these transformations, Ginsburg and Greibach have defined the Abstract Family of Languages (AFL). Infinite graphs (also called infinite automata) are natural tools to study languages. In this paper, we study families of infinite graphs that are...
Article
Full-text available
An important part of knowledge in Go programs is based on two-dimensional patterns. When these patterns are used as heuristics in reading algorithms, the pattern matching can become time critical. We present a fast pattern matching algorithm based on Deterministic Finite State Automata.
Conference Paper
Full-text available
The aim of this article is to make a link between the congruential systems investigated by Conway and the theory of infinite graphs. We compare the graphs of congruential systems with a well known family of infinite graphs: the regular graphs of finite degree. Those graphs, first considered by Muller and Schupp then by Courcelle, are the transition...
Conference Paper
Full-text available
A congruential graph is a finite union of labelled affine relations on the integers. Deterministic congruential graphs, studied by Conway, have full computational power. We are interested in the family of regular graphs of finite degree, whose monadic theory is decidable. These are the rational restrictions of the graphs of trans...


Projects (3)
Project
We have created a working group on time series that includes people from Orange as well as from universities and Inria teams. The aim is to share ideas and to collaborate (for example, through PhD students).
Archived project
The last ten years were prolific in the statistical learning and data mining field, and it is now easy to find learning algorithms which are fast and automatic. Historically, a strong hypothesis was that all examples were available or could be loaded into memory so that learning algorithms could use them straight away. But recently, new use cases generating lots of data have emerged, for example: monitoring of telecommunication networks, user modeling in dynamic social networks, web mining... The volume of data increases rapidly and it is now necessary to use incremental learning algorithms on data streams. This project presents the main approaches of incremental supervised classification we developed... up to now
Project
Octopus is a Multi-Armed Bandits server providing a decision API.