Planning and Acting under Uncertainty: A New Model for Spoken Dialogue Systems

Article (PDF available) · January 2013 · Source: arXiv
Uncertainty plays a central role in spoken dialogue systems. Stochastic models such as the Markov decision process (MDP) have been used to model the dialogue manager, but the partially observable system state and user intention hinder a natural representation of the dialogue state, and MDP-based systems degrade quickly as uncertainty about the user's intention increases. We propose a novel dialogue model based on the partially observable Markov decision process (POMDP). We use hidden system states and user intentions as the state set, parser results and low-level information as the observation set, and domain actions and dialogue repair actions as the action set. The low-level information is extracted from different input modalities, including speech, keyboard, and mouse, using Bayesian networks. Because exact algorithms do not scale, we focus on heuristic approximation algorithms and their applicability to POMDP-based dialogue management. We also propose two methods for selecting grid points in grid-based approximation algorithms.
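The model just described maps onto a standard belief-state controller: the system maintains a distribution over hidden states and updates it after every action and observation. Below is a minimal, self-contained Python sketch of that structure; the state, observation, and action names are hypothetical stand-ins for the paper's components, and the transition model T and observation model O are left as empty placeholders rather than the learned or hand-specified models the paper assumes.

```python
from collections import defaultdict

# Hypothetical component sets, standing in for the paper's model:
STATES = ["request_info", "confirm_slot", "done"]        # hidden system states / user intentions
OBSERVATIONS = ["parse_request", "parse_yes", "noise"]   # parser results + low-level information
ACTIONS = ["ask", "confirm", "repair", "submit"]         # domain actions + dialogue repair actions

# T[a][s][s'] = P(s' | s, a) and O[a][s'][o] = P(o | s', a);
# left empty here, they would be specified or estimated in practice.
T = defaultdict(lambda: defaultdict(dict))
O = defaultdict(lambda: defaultdict(dict))

def belief_update(belief, action, observation):
    """Standard POMDP belief update: b'(s') ~ O(o | s', a) * sum_s T(s' | s, a) * b(s)."""
    new_belief = {}
    for s_next in STATES:
        prior = sum(T[action][s].get(s_next, 0.0) * belief[s] for s in STATES)
        new_belief[s_next] = O[action][s_next].get(observation, 0.0) * prior
    norm = sum(new_belief.values())
    if norm == 0.0:
        return belief  # observation has zero probability under the model; keep the old belief
    return {s: p / norm for s, p in new_belief.items()}
```

Grid-based approximation algorithms, mentioned at the end of the abstract, evaluate the value function only at a finite set of belief points like those produced by this update; the paper's two grid point selection methods are not reproduced here.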

    • "The second equality in the equation, the product of probabilities, is due to the independence of words given an user intention. User intentions have been previously suggested to be used as states of dialogue POMDPs (Roy et al. 2000; Zhang et al. 2001b; Matsubara et al. 2002; Roy 2007, 2008). However, to the best of our knowledge, they have not been automatically extracted from real data. "
    ABSTRACT: The partially observable Markov decision process (POMDP) framework has been applied in dialogue systems as a formal framework to represent uncertainty explicitly while being robust to noise. In this context, estimating the dialogue POMDP model components is a significant challenge, as they have a direct impact on the optimized dialogue POMDP policy. To achieve such an estimation, we propose methods for learning dialogue POMDP model components using noisy and unannotated dialogues. Specifically, we introduce techniques to learn the set of possible user intentions from dialogues, use them as the dialogue POMDP states, and learn a maximum-likelihood POMDP transition model from the data. Since it is crucial to reduce the observation set size, we then propose two observation models: the keyword model and the intention model. Using these two models, the number of observations is reduced significantly while POMDP performance remains high, particularly for the intention POMDP. This first part (Part I) covers learning the states and observations sustaining a POMDP, evaluated on dialogues collected by SmartWheeler (an intelligent wheelchair that aims to help persons with disabilities). Part II covers learning the reward model required by the POMDP.
    Full-text · Article · Dec 2014
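The excerpt above quotes two components that can be estimated from data: an observation model where P(words | intention) factors into a product over words (conditional independence, i.e. naive Bayes), and a maximum-likelihood transition model. The Python sketch below illustrates both under assumptions that are mine, not the paper's: dialogues are given as sequences of intention labels, add-alpha smoothing is used, and the transition counts ignore the system action, whereas a full POMDP transition model is action-conditioned.

```python
from collections import Counter, defaultdict

def word_likelihood(words, intention, word_counts, vocab_size, alpha=1.0):
    """P(words | intention) = prod_i P(w_i | intention), assuming words are
    conditionally independent given the intention (the quoted equality),
    with add-alpha smoothing (an assumption of this sketch)."""
    counts = word_counts[intention]          # Counter of word frequencies per intention
    total = sum(counts.values())
    p = 1.0
    for w in words:
        p *= (counts[w] + alpha) / (total + alpha * vocab_size)
    return p

def ml_transitions(dialogues):
    """Maximum-likelihood T(s' | s): normalized counts of consecutive intention
    pairs; a full POMDP transition model would also condition on the action."""
    counts = defaultdict(Counter)
    for intentions in dialogues:             # each dialogue: a sequence of intention labels
        for s, s_next in zip(intentions, intentions[1:]):
            counts[s][s_next] += 1
    return {s: {s2: c / sum(nxt.values()) for s2, c in nxt.items()}
            for s, nxt in counts.items()}
```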
    • "Reinforcement learning (RL): RL in (partially observable) Markov decision processes, so called the (PO)MDPs, is a learning approach in sequential decision making. In particular, (PO)MDPs have been successfully applied in dialogue agents (Roy et al., 2000; Zhang et al., 2001; Williams, 2006; Thomson and Young, 2010; Gaši´Gaši´c, 2011). The (PO)MDP framework is a formal framework to represent uncertainty explicitly while supporting automated strategy solving. "
    Full-text · Article · Jan 2014 · International Journal of Speech Technology
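As background for the RL framing in this excerpt, here is a minimal tabular Q-learning loop for the fully observable (MDP) case. The environment interface (reset, step, actions) is an assumption of this sketch, not a standard API; POMDP dialogue agents plan over beliefs rather than raw states, but the sequential decision-making loop is the same idea.

```python
import random
from collections import defaultdict

def q_learning(env, episodes=1000, alpha=0.1, gamma=0.95, epsilon=0.1):
    """env is assumed to expose reset() -> state, step(state, action) ->
    (next_state, reward, done), and a list env.actions; this interface is
    an assumption of the sketch."""
    Q = defaultdict(float)
    for _ in range(episodes):
        s = env.reset()
        done = False
        while not done:
            # Epsilon-greedy action selection over the current Q estimates.
            if random.random() < epsilon:
                a = random.choice(env.actions)
            else:
                a = max(env.actions, key=lambda act: Q[(s, act)])
            s2, r, done = env.step(s, a)
            # Standard Q-learning temporal-difference update.
            best_next = max(Q[(s2, a2)] for a2 in env.actions)
            Q[(s, a)] += alpha * (r + gamma * best_next - Q[(s, a)])
            s = s2
    return Q
```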
    • "For both of these functions of the system, speech acts will be used to carry on dialogues in cases where the system attempts to uncover additional information from the user, such as whether they need help, etc. This requires some form of dialogue management encoded in the dynamics, which is an active area of research in POMDP modeling (Roy, Gordon, & Thrun 2003; Zhang et al. 2001; Williams, Poupart, & Young 2005). "
    ABSTRACT: This paper presents a general decision-theoretic model of interactions between users and cognitive assistive technologies for various tasks of importance to the elderly population. The model is a partially observable Markov decision process (POMDP) whose goal is to work in conjunction with a user towards the completion of a given activity or task. This requires the model to monitor and assist the user, to maintain indicators of overall user health, and to adapt to changes. The key strengths of the POMDP model are that it is able to deal with uncertainty, it is easy to specify, it can be applied to different tasks with little modification, and it is able to learn and adapt to changing tasks and situations. This paper describes the model, gives a general learning method which enables the model to be learned from partially labeled data, and shows how the model can be applied within our research program on technologies for wellness. In particular, we show how the model is used in three tasks: assisted handwashing, health and safety monitoring, and wheelchair mobility. The paper gives an overview of ongoing work in each of these areas and discusses future directions.
    Full-text · Article · Jan 2011 · International Journal of Speech Technology
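The monitor-and-assist behavior in this abstract can be illustrated with a toy belief-tracking loop: maintain the probability that the user is stuck on the current task step and prompt once it crosses a threshold. The observation likelihoods, threshold, and post-prompt reset below are illustrative assumptions, not the paper's learned model.

```python
def monitor_and_assist(observations, p_stuck=0.1, threshold=0.7):
    # P(obs | stuck) and P(obs | progressing) for two toy observations
    # (assumed values for the sketch, not learned from data).
    likelihood = {"idle":   (0.8, 0.2),
                  "active": (0.2, 0.8)}
    actions = []
    for obs in observations:
        l_stuck, l_ok = likelihood.get(obs, (0.5, 0.5))
        num = l_stuck * p_stuck
        p_stuck = num / (num + l_ok * (1.0 - p_stuck))   # Bayes update of the belief
        if p_stuck > threshold:
            actions.append("prompt_user")   # assistive speech act
            p_stuck = 0.1                   # assume prompting resets uncertainty (toy choice)
        else:
            actions.append("wait")
    return actions

# e.g. monitor_and_assist(["idle", "idle", "idle"]) -> ["wait", "wait", "prompt_user"]
```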