Planning and Acting under Uncertainty: A New Model for Spoken Dialogue Systems

Source: DBLP

ABSTRACT Uncertainty plays a central role in spoken dialogue systems. Stochastic
models such as the Markov decision process (MDP) have been used to model the
dialogue manager, but the system state and the user's intention are only
partially observable, which hinders a natural representation of the dialogue
state. MDP-based systems degrade quickly as uncertainty about the user's
intention increases. We propose a novel dialogue model based on the partially
observable Markov decision process (POMDP). We use hidden system states and
user intentions as the state set, parser results and low-level information as
the observation set, and domain actions and dialogue repair actions as the
action set. The low-level information is extracted from different input
modalities, including speech, keyboard, and mouse, using Bayesian networks.
Because of the limitations of exact algorithms, we focus on heuristic
approximation algorithms and their applicability to POMDP-based dialogue
management. We also propose two methods for grid point selection in grid-based
approximation algorithms.
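
The abstract's state, observation, and action sets fit the standard POMDP belief update, in which the system keeps a probability distribution over hidden states and revises it after each action and observation. The following is a minimal sketch of that generic update, not the authors' implementation. The function name belief_update, the two-intention toy domain, and all probabilities are hypothetical and chosen only for illustration.

```python
def belief_update(belief, action, observation, T, O):
    """Generic POMDP belief update.

    b'(s') is proportional to O[a][s'][o] * sum_s T[a][s][s'] * b(s).
    T[a][s][s'] is the transition probability, O[a][s'][o] the
    observation probability. Both tables are assumed inputs.
    """
    n = len(belief)
    # Prediction step: push the current belief through the transition model.
    predicted = [
        sum(belief[s] * T[action][s][s2] for s in range(n))
        for s2 in range(n)
    ]
    # Correction step: weight each state by the observation likelihood.
    unnormalized = [
        O[action][s2][observation] * predicted[s2] for s2 in range(n)
    ]
    total = sum(unnormalized)
    if total == 0.0:
        raise ValueError("observation has zero probability under the model")
    return [u / total for u in unnormalized]


# Hypothetical toy domain (not from the paper):
# states: 0 = user wants flights, 1 = user wants hotels
# action 0: ask the user to repeat (a dialogue repair action)
# observations: 0 = parser reports "flights", 1 = parser reports "hotels"
T = [[[1.0, 0.0], [0.0, 1.0]]]  # the user's intention does not change
O = [[[0.8, 0.2], [0.3, 0.7]]]  # noisy parser results

belief = [0.5, 0.5]
new_belief = belief_update(belief, action=0, observation=0, T=T, O=O)
```

In this toy setting a single "flights" observation shifts the belief toward state 0 without committing to it, which is the property that lets a POMDP-based manager choose a repair action while uncertainty remains high.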


Available from: Jianfeng Mao, Jul 06, 2014
  • Source
    • "For both of these functions of the system, speech acts will be used to carry on dialogues in cases where the system attempts to uncover additional information from the user, such as whether they need help, etc. This requires some form of dialogue management encoded in the dynamics, which is an active area of research in POMDP modeling (Roy, Gordon, & Thrun 2003; Zhang et al. 2001; Williams, Poupart, & Young 2005). "
    ABSTRACT: This paper presents a general decision theoretic model of interactions between users and cognitive assistive technologies for various tasks of importance to the elderly population. The model is a partially observable Markov decision process (POMDP) whose goal is to work in conjunction with a user towards the completion of a given activity or task. This requires the model to monitor and assist the user, to maintain indicators of overall user health, and to adapt to changes. The key strengths of the POMDP model are that it is able to deal with uncertainty, it is easy to specify, it can be applied to different tasks with little modification, and it is able to learn and adapt to changing tasks and situations. This paper describes the model, gives a general learning method which enables the model to be learned from partially labeled data, and shows how the model can be applied within our research program on technologies for wellness. In particular, we show how the model is used in three tasks: assisted handwashing, health and safety monitoring, and wheelchair mobility. The paper gives an overview of ongoing work into each of these areas, and discusses future directions.
  • Source
    • "Even so, speech recognition technology remains imperfect: speech recognition errors are common and undermine dialog systems. To tackle this, the research community has begun applying POMDPs to dialog control (Roy, Pineau, and Thrun 2000; Zhang et al. 2001; Williams and Young 2007a; Young et al. 2007; Doshi and Roy 2007). POMDPs maintain a distribution over many hypotheses for the correct dialog state and choose actions using an optimization process, in which a developer specifies high-level goals via a reward function. "
    ABSTRACT: A common problem for real-world POMDP applications is how to incorporate expert knowledge and constraints such as business rules into the optimization process. This paper describes a simple approach created in the course of developing a spoken dialog system. A POMDP and conventional hand-crafted dialog controller run in parallel; the conventional dialog controller nominates a set of one or more actions, and the POMDP chooses the optimal action. This allows designers to express real-world constraints in a familiar manner, and also prunes the search space of policies. The method naturally admits compression, and the POMDP value function can draw on features from both the POMDP belief state and the hand-crafted dialog controller. The method has been used to build a full-scale dialog system which is currently running at AT&T Labs. An evaluation shows that this unified architecture yields better performance than using a conventional dialog manager alone, and also demonstrates an improvement in optimization speed and reliability vs. a pure POMDP.
  • Source
    • "Short- and long-term objectives of the system are specified in the form of the POMDP's reward function, and actions are selected with the goal of maximizing the sum of rewards over time: i.e., the POMDP performs planning to determine an optimal course of action which balances short-term and long-term priorities. Maintaining multiple hypotheses for the current dialog state enables POMDPs to better interpret conflicting evidence, and in the literature POMDPs have been shown to outperform (automated and hand-crafted) techniques which maintain a single dialog state hypothesis [1], [2], [3], [4], [5], [6]. Even so, POMDPs face severe scalability limitations. "
    ABSTRACT: Control in spoken dialog systems is challenging largely because automatic speech recognition is unreliable, and hence the state of the conversation can never be known with certainty. Partially observable Markov decision processes (POMDPs) provide a principled mathematical framework for planning and control in this context; however, POMDPs face severe scalability challenges, and past work has been limited to trivially small dialog tasks. This paper presents a novel POMDP optimization technique, composite summary point-based value iteration (CSPBVI), which enables optimization to be performed on slot-filling POMDP-based dialog managers of a realistic size. Using dialog models trained on data from a tourist information domain, simulation results show that CSPBVI scales effectively, outperforms non-POMDP baselines, and is robust to estimation errors.
    IEEE Transactions on Audio Speech and Language Processing 10/2007; 15(7-15):2116 - 2129. DOI:10.1109/TASL.2007.902050 · 2.48 Impact Factor