Planning and Acting under Uncertainty: A New Model for Spoken Dialogue Systems

Source: DBLP

ABSTRACT Uncertainty plays a central role in spoken dialogue systems. Stochastic models such as the Markov decision process (MDP) have been used to model the dialogue manager, but the partially observable system state and user intention hinder a natural representation of the dialogue state, and MDP-based systems degrade rapidly as uncertainty about the user's intention increases. We propose a novel dialogue model based on the partially observable Markov decision process (POMDP). We use hidden system states and user intentions as the state set; parser results and low-level information as the observation set; and domain actions and dialogue repair actions as the action set. Here the low-level information is extracted from different input modalities, including speech, keyboard, and mouse, using Bayesian networks. Because of the limitations of exact algorithms, we focus on heuristic approximation algorithms and their applicability to POMDPs for dialogue management. We also propose two methods for grid point selection in grid-based approximation algorithms.
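The belief update at the heart of such a POMDP dialogue manager can be illustrated with a toy sketch. Everything below (the two user intentions, the observation probabilities standing in for ASR/parser noise, and the assumption that the intention persists across turns) is invented for illustration and does not come from the paper:

```python
# Toy POMDP belief update for a dialogue manager.
# States, observations, and probabilities are hypothetical.

STATES = ["want_flight", "want_hotel"]          # hidden user intentions
OBS = ["said_flight", "said_hotel", "unclear"]  # parser/ASR observations

def transition(s_next, s, action):
    """P(s'|s,a): assume the user's intention persists across a turn."""
    return 1.0 if s_next == s else 0.0

# P(o|s',a): noisy observation model standing in for recognition errors
OBS_MODEL = {
    "want_flight": {"said_flight": 0.7, "said_hotel": 0.1, "unclear": 0.2},
    "want_hotel":  {"said_flight": 0.1, "said_hotel": 0.7, "unclear": 0.2},
}

def belief_update(belief, action, obs):
    """b'(s') ∝ P(o|s',a) * Σ_s P(s'|s,a) b(s), then normalize."""
    new_belief = {}
    for s_next in STATES:
        predicted = sum(transition(s_next, s, action) * belief[s]
                        for s in STATES)
        new_belief[s_next] = OBS_MODEL[s_next][obs] * predicted
    norm = sum(new_belief.values())
    return {s: p / norm for s, p in new_belief.items()}

b0 = {"want_flight": 0.5, "want_hotel": 0.5}    # uniform prior belief
b1 = belief_update(b0, "ask_goal", "said_flight")
print(b1)  # mass shifts toward want_flight, but some uncertainty remains
```

Because the belief is a full distribution rather than a single state hypothesis, a dialogue repair action (e.g. a confirmation question) can be chosen whenever the belief stays spread out after an update.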

Available from: Jianfeng Mao, Jul 06, 2014
    • "Even so, speech recognition technology remains imperfect: speech recognition errors are common and undermine dialog systems. To tackle this, the research community has begun applying POMDPs to dialog control (Roy, Pineau, and Thrun 2000; Zhang et al. 2001; Williams and Young 2007a; Young et al. 2007; Doshi and Roy 2007). POMDPs maintain a distribution over many hypotheses for the correct dialog state and choose actions using an optimization process, in which a developer specifies high-level goals via a reward function. "
    ABSTRACT: A common problem for real-world POMDP applications is how to incorporate expert knowledge and constraints such as business rules into the optimization process. This paper describes a simple approach created in the course of developing a spoken dialog system. A POMDP and conventional hand-crafted dialog controller run in parallel; the conventional dialog controller nominates a set of one or more actions, and the POMDP chooses the optimal action. This allows designers to express real-world constraints in a familiar manner, and also prunes the search space of policies. The method naturally admits compression, and the POMDP value function can draw on features from both the POMDP belief state and the hand-crafted dialog controller. The method has been used to build a full-scale dialog system which is currently running at AT&T Labs. An evaluation shows that this unified architecture yields better performance than using a conventional dialog manager alone, and also demonstrates an improvement in optimization speed and reliability vs. a pure POMDP.
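The parallel architecture this abstract describes can be sketched in a few lines: the hand-crafted controller supplies a nominated action set, and the POMDP side scores each nominated action by its expected value under the current belief. The belief, actions, and Q-values below are invented placeholders, not values from the paper:

```python
# Sketch: hand-crafted controller nominates actions; POMDP picks the best
# nominated action by expected value under the belief. All numbers invented.

belief = {"want_flight": 0.8, "want_hotel": 0.2}

# Q(s, a): expected long-term reward, normally produced by optimization
Q = {
    ("want_flight", "book_flight"): 10.0,
    ("want_flight", "book_hotel"): -5.0,
    ("want_flight", "confirm"): 4.0,
    ("want_hotel", "book_flight"): -5.0,
    ("want_hotel", "book_hotel"): 10.0,
    ("want_hotel", "confirm"): 4.0,
}

def choose_action(belief, nominated):
    """Among actions nominated by the hand-crafted controller,
    pick the one with the highest expected Q under the belief."""
    def expected_q(a):
        return sum(belief[s] * Q[(s, a)] for s in belief)
    return max(nominated, key=expected_q)

# A business rule in the hand-crafted controller restricts the choice,
# pruning the policy search space; the POMDP optimizes within that set.
print(choose_action(belief, ["confirm", "book_flight"]))
```

Restricting the argmax to the nominated set is what lets designers express hard constraints while still benefiting from POMDP action selection.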
    • "Short- and long-term objectives of the system are specified in the form of the POMDP's reward function, and actions are selected with the goal of maximizing the sum of rewards over time: i.e., the POMDP performs planning to determine an optimal course of action which balances short-term and long-term priorities. Maintaining multiple hypotheses for the current dialog state enables POMDPs to better interpret conflicting evidence, and in the literature POMDPs have been shown to outperform (automated and hand-crafted) techniques which maintain a single dialog state hypothesis [1], [2], [3], [4], [5], [6]. Even so, POMDPs face severe scalability limitations. "
    ABSTRACT: Control in spoken dialog systems is challenging largely because automatic speech recognition is unreliable, and hence the state of the conversation can never be known with certainty. Partially observable Markov decision processes (POMDPs) provide a principled mathematical framework for planning and control in this context; however, POMDPs face severe scalability challenges, and past work has been limited to trivially small dialog tasks. This paper presents a novel POMDP optimization technique, composite summary point-based value iteration (CSPBVI), which enables optimization to be performed on slot-filling POMDP-based dialog managers of a realistic size. Using dialog models trained on data from a tourist information domain, simulation results show that CSPBVI scales effectively, outperforms non-POMDP baselines, and is robust to estimation errors.
    IEEE Transactions on Audio, Speech, and Language Processing, 10/2007; DOI:10.1109/TASL.2007.902050
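Grid-based and point-based approximation, mentioned both in the original abstract and in the paper above, amount to representing the value function only at a finite set of belief points. A minimal nearest-neighbour sketch over a two-state belief space, with an invented grid and invented values:

```python
# Grid-based value approximation sketch for a two-state POMDP.
# A belief is summarized by P(state 0); the value function is stored
# only at grid points. All grid points and values are hypothetical.

GRID = [0.0, 0.25, 0.5, 0.75, 1.0]   # regular grid point selection
V_AT_GRID = {0.0: 10.0, 0.25: 6.0, 0.5: 2.0, 0.75: 6.0, 1.0: 10.0}

def approx_value(belief_p0):
    """Approximate V(b) by the value stored at the nearest grid point."""
    nearest = min(GRID, key=lambda g: abs(g - belief_p0))
    return V_AT_GRID[nearest]

print(approx_value(0.6))  # nearest grid point is 0.5
```

The shape of V here (high at certain beliefs, low near uniform) reflects the usual intuition that a confident belief permits a high-value domain action, while an uncertain one forces cheaper repair actions; how the grid points are chosen is exactly the selection problem the original abstract addresses.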
    • "As shown by the labelling in (8), the probability distribution for a_u is called the user action model. It allows the observation probability that is conditioned on a_u to be scaled by the probability that the user ... Note that alternative POMDP formulations can also be used for SDS, e.g. [15] [16]"
    ABSTRACT: This paper explains how partially observable Markov decision processes (POMDPs) can provide a principled mathematical framework for modelling the inherent uncertainty in spoken dialog systems. It briefly summarises the basic mathematics and explains why exact optimisation is intractable. It then describes a form of approximation called the Hidden Information State model which does scale and which can be used to build practical systems. Index Terms: statistical dialog modelling; partially observable Markov decision processes (POMDPs); hidden information state model.
    Spoken Language Technology Workshop, 2006. IEEE; 01/2007