Jeffrey Zhao's scientific contributions

Publications (10)

Preprint
Full-text available
Language models are increasingly being deployed for general problem solving across a wide range of tasks, but are still confined to token-level, left-to-right decision-making processes during inference. This means they can fall short in tasks that require exploration, strategic lookahead, or where initial decisions play a pivotal role. To surmount...
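The limitation described above is easiest to see next to a sketch of the alternative: instead of committing to a single step at each point, a controller can branch over several candidate reasoning steps and keep only the most promising ones. The sketch below is a minimal illustration of that contrast, assuming hypothetical `propose_thoughts` and `score_state` stand-ins for language-model calls; it is not an API from the paper.

```python
# Minimal sketch contrasting left-to-right decoding with thought-level search.
# `propose_thoughts` and `score_state` are hypothetical stand-ins for LM calls.

def propose_thoughts(state, k=3):
    """Hypothetical: ask an LM for k candidate next reasoning steps."""
    return [f"{state} -> step{i}" for i in range(k)]

def score_state(state):
    """Hypothetical: ask an LM to rate how promising a partial solution is."""
    return -len(state)  # placeholder heuristic

def greedy_decode(problem, depth=3):
    # Left-to-right: commit to the single best-looking step at each point.
    state = problem
    for _ in range(depth):
        state = max(propose_thoughts(state), key=score_state)
    return state

def breadth_first_search(problem, depth=3, beam=2):
    # Exploration with lookahead: keep several partial solutions alive
    # and prune by score, so an early misstep is not fatal.
    frontier = [problem]
    for _ in range(depth):
        candidates = [t for s in frontier for t in propose_thoughts(s)]
        frontier = sorted(candidates, key=score_state, reverse=True)[:beam]
    return max(frontier, key=score_state)

print(greedy_decode("24 game: 4 9 10 13"))
print(breadth_first_search("24 game: 4 9 10 13"))
```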
Preprint
Full-text available
We propose AnyTOD, an end-to-end task-oriented dialog (TOD) system with zero-shot capability for unseen tasks. We view TOD as a program executed by a language model (LM), where the program logic and ontology are provided by a designer in the form of a schema. To enable generalization to unseen schemas and programs without prior training, AnyTOD adopts...
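As a rough illustration of the schema-as-program idea, the sketch below encodes a tiny booking schema whose designer-written rules are executed over a state that a language model is assumed to track. The schema layout and the `lm_extract_state` stub are assumptions for illustration, not the actual AnyTOD interface.

```python
# Illustrative sketch of treating task-oriented dialog as a schema-driven program.
# The schema format and `lm_extract_state` stub are assumptions, not AnyTOD's API.

schema = {
    "slots": ["restaurant_name", "time", "party_size"],
    "actions": ["ask_time", "ask_party_size", "confirm_booking"],
    "program": [  # designer-written rules evaluated over the tracked state
        ("time is None", "ask_time"),
        ("party_size is None", "ask_party_size"),
        ("True", "confirm_booking"),
    ],
}

def lm_extract_state(history, slots):
    """Hypothetical LM call: read the dialog and fill the schema slots."""
    return {"restaurant_name": "Cascal", "time": None, "party_size": 2}

def execute_program(history):
    state = lm_extract_state(history, schema["slots"])
    for condition, action in schema["program"]:
        if eval(condition, {}, dict(state)):  # tiny rule interpreter for the sketch
            return action
    return "no_op"

print(execute_program(["User: book Cascal for two"]))  # -> ask_time
```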
Preprint
Full-text available
Most research on task-oriented dialog modeling is based on written text input. However, users often interact with practical dialog systems using speech as input. Typically, systems convert speech into text with an Automatic Speech Recognition (ASR) system, which introduces errors. Furthermore, these systems do not address the differences in written and...
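A toy sketch of the pipeline described here, with a crude stand-in for ASR noise; the confusion table and substitution rate are illustrative assumptions, not a real recognizer.

```python
# Toy sketch: written-text DST training data vs. what arrives after ASR.
# The noise model here is a crude stand-in for real ASR errors (assumption).
import random

def simulate_asr(utterance, sub_rate=0.1):
    """Randomly corrupt words to mimic recognition errors (illustrative only)."""
    confusions = {"two": "too", "four": "for", "cheap": "chip"}
    words = []
    for w in utterance.lower().split():
        if w in confusions and random.random() < 0.5:
            words.append(confusions[w])
        elif random.random() < sub_rate:
            words.append("<unk>")
        else:
            words.append(w)
    return " ".join(words)

written = "Book a table for two at a cheap restaurant."
print(simulate_asr(written))
# A tracker trained only on the written form may mis-track slots such as
# party_size when it sees the noisy transcript instead.
```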
Preprint
Full-text available
While large language models (LLMs) have demonstrated impressive capabilities across tasks in language understanding and interactive decision making, their abilities for reasoning (e.g. chain-of-thought prompting) and acting (e.g. action plan generation) have primarily been studied as separate topics. In this paper, we explore the use of LLMs to gen...
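A minimal sketch of interleaving reasoning and acting in a loop: the model emits either an action or a final answer, and tool observations are appended back into the prompt to ground the next step. The `llm` and `search` stubs are hypothetical placeholders, not the paper's prompts or tools.

```python
# Minimal sketch of interleaving reasoning traces with actions (ReAct-style).
# `llm` and the `search` tool are hypothetical stubs, not a specific API.

def llm(prompt):
    """Hypothetical LM call that returns either an Action or a Finish line."""
    if "Observation" not in prompt:
        return "Action: search[height of the Eiffel Tower]"
    return "Finish: about 330 metres"

def search(query):
    """Hypothetical tool; a real agent would call a search API here."""
    return "The Eiffel Tower is roughly 330 metres tall."

def react_loop(question, max_steps=5):
    prompt = f"Question: {question}\n"
    for _ in range(max_steps):
        step = llm(prompt)
        prompt += step + "\n"
        if step.startswith("Finish:"):
            return step.split("Finish:", 1)[1].strip()
        if step.startswith("Action: search["):
            query = step[len("Action: search["):-1]
            prompt += f"Observation: {search(query)}\n"  # ground the next step
    return None

print(react_loop("How tall is the Eiffel Tower?"))
```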
Preprint
Full-text available
Building universal dialogue systems that can seamlessly operate across multiple domains/APIs and generalize to new ones with minimal supervision and maintenance is a critical challenge. Recent works have leveraged natural language descriptions for schema elements to enable such systems; however, descriptions can only indirectly convey schema semant...
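The sketch below shows one way natural-language descriptions of schema elements can be placed into a prompt so a model can handle an unseen service. The schema, slot descriptions, and prompt layout are assumptions for illustration, not a specific system's format.

```python
# Illustrative sketch of prompting a model with natural-language schema
# descriptions so it can generalize to unseen services (format is an assumption).

schema = {
    "service": "RideBooking",
    "slots": {
        "destination": "the place the user wants to be dropped off",
        "number_of_riders": "how many people will take the ride",
    },
}

def build_prompt(schema, dialog_history):
    lines = [f"Service: {schema['service']}"]
    for name, description in schema["slots"].items():
        lines.append(f"slot {name}: {description}")
    lines.append("Dialogue:")
    lines.extend(dialog_history)
    lines.append("Output the value of every slot mentioned so far.")
    return "\n".join(lines)

print(build_prompt(schema, ["User: get me a cab to the airport for 3 people"]))
```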
Preprint
Full-text available
Task-oriented dialogue (TOD) systems are required to identify key information from conversations in order to complete given tasks. Such information is conventionally specified in terms of intents and slots contained in a task-specific ontology or schema. Since these schemata are designed by system developers, the naming convention for slots and in...
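To make the naming-convention point concrete, the toy example below shows two hypothetical schemas for the same booking task whose slot names differ, and a crude alignment that relies on slot descriptions rather than names; a real system would use semantic similarity rather than exact string match.

```python
# Toy illustration: two developers describe the same task with different
# slot and intent names, so anything keyed to literal names does not transfer.
# Both schemas are hypothetical examples.

schema_a = {"intent": "book_restaurant",
            "slots": {"people": "number of diners", "time": "time of booking"}}
schema_b = {"intent": "restaurant_reservation",
            "slots": {"num_guests": "number of diners", "booking_time": "time of booking"}}

def align_by_description(source, target):
    """Map slots across schemas by their descriptions rather than their names."""
    mapping = {}
    for s_name, s_desc in source["slots"].items():
        for t_name, t_desc in target["slots"].items():
            if s_desc == t_desc:  # a real system would use semantic similarity
                mapping[s_name] = t_name
    return mapping

print(align_by_description(schema_a, schema_b))
# {'people': 'num_guests', 'time': 'booking_time'}
```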
Preprint
Full-text available
Sequence-to-sequence models have been applied to a wide variety of NLP tasks, but how to properly use them for dialogue state tracking has not been systematically investigated. In this paper, we study this problem from the perspectives of pre-training objectives as well as the formats of context representations. We demonstrate that the choice of pr...
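As a concrete illustration of what a "format of the context representation" means for a sequence-to-sequence DST model, the sketch below linearizes a short dialogue and its state into input and target strings. The exact tags and separators are assumptions, not the formats compared in the paper.

```python
# Sketch of one way to linearize dialogue context and state for a
# sequence-to-sequence DST model (the format itself is an assumption).

turns = [
    ("user", "I need a hotel in the centre"),
    ("system", "For how many nights?"),
    ("user", "Three nights for two people"),
]
state = {"hotel-area": "centre", "hotel-stay": "3", "hotel-people": "2"}

def encode_context(turns):
    # Concatenate the full history with speaker tags as the encoder input.
    return " ".join(f"[{speaker}] {text}" for speaker, text in turns)

def encode_target(state):
    # Flatten the dialogue state into a slot=value string for the decoder.
    return " ; ".join(f"{slot}={value}" for slot, value in sorted(state.items()))

print(encode_context(turns))
print(encode_target(state))
```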

Citations

... RADDLE (Peng et al., 2021b) was the first to evaluate model robustness by adding various kinds of ASR noise to the original MultiWOZ, but it is no longer publicly available. Shafran et al. (2022) extend the same idea and propose a speech-aware dialog system technology challenge. NSD aims to discover unknown or out-of-domain slot types for dialogue systems. ...
... Yao et al. 64 and Schick et al. 25 have shown that LLMs can be used as agents that autonomously make use of external tools such as Web APIs, a paradigm that some call MRKL (pronounced "miracle") systems: modular reasoning, knowledge, and language systems. 26 By giving LLMs access to tools and forcing them to think step by step, 65 we can convert LLMs from hyperconfident models that often hallucinate into systems that can reason based on observations made by querying robust tools. ...
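A rough sketch of the modular tool-use idea in this passage: route sub-tasks a language model handles poorly (exact arithmetic here) to a deterministic tool, and fall back to the model otherwise. The router, the regex, and the `llm_answer` stub are illustrative assumptions, not the MRKL or Toolformer implementation.

```python
# Rough sketch of routing between a reliable external tool and an LM fallback.
# `llm_answer` is a hypothetical stub; the routing rule is an assumption.
import re

def calculator(expression):
    """Deterministic tool for arithmetic; a real system might call an API."""
    return str(eval(expression, {"__builtins__": {}}, {}))

def llm_answer(question):
    """Hypothetical LM call used for everything the tools do not cover."""
    return "LM answer to: " + question

def route(question):
    match = re.fullmatch(r"\s*([\d\s\.\+\-\*/\(\)]+)\s*=?\s*", question)
    if match:  # purely numeric question -> trust the tool, not the LM
        return calculator(match.group(1))
    return llm_answer(question)

print(route("1234 * 5678 ="))
print(route("Who designed the Eiffel Tower?"))
```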
... While there are some existing prompt-based approaches for DST with different prompt designs, such as using the slot name [20,21,22,23], slot description [24], slot type [25], possible values [25], priming examples [26] and/or a slot-specific question [4,27,28,29,8,30] in the prompt, they all fine-tune the entire LM along with the prompt tokens for a new domain, which requires a significant amount of training time, system resources, and annotated data [31,32]. These compute and data demands are even more severe in real-world deployment, where LMs tuned for different domains and tasks need to be trained and hosted, and a typical dialogue system has to serve dozens of such LMs [33,34,35]. ...
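The prompt designs listed in this passage can be made concrete with a small sketch that builds several prompt variants (slot name, description, candidate values, slot-specific question) for one slot. All strings and the bracketed separator are illustrative assumptions.

```python
# Sketch of the prompt designs the passage lists; contents are assumptions.

slot = {
    "name": "hotel-pricerange",
    "description": "the price range of the hotel",
    "values": ["cheap", "moderate", "expensive"],
    "question": "What price range does the user want for the hotel?",
}
context = "User: I'm looking for a cheap place to stay."

prompts = {
    "slot_name":   f"{context} [slot] {slot['name']}",
    "description": f"{context} [slot] {slot['description']}",
    "values":      f"{context} [slot] {slot['name']} in {slot['values']}",
    "question":    f"{context} {slot['question']}",
}
for kind, prompt in prompts.items():
    print(kind, "->", prompt)
```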
... This process is known as adaptive pre-training [14], which is conducted between pre-training and fine-tuning. Previous work [43,66] gained consistent improvements by continually pre-training the language model with the Masked Language Model (MLM) objective [10]. Furthermore, Wu et al. [64] improved the overall performance of a dialogue understanding model by combining three pre-training objectives: the MLM, the Span Boundary Objective (SBO) [26], and the Perturbation Masking Objective (PMO). ...
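A toy sketch of the MLM objective used in adaptive pre-training: mask a fraction of tokens and keep the originals as labels. The 15% rate follows common practice; the whitespace tokenizer and mask handling here are simplifications.

```python
# Toy sketch of the Masked Language Model objective used for adaptive
# pre-training: hide a fraction of tokens and train the model to recover them.
import random

def mask_for_mlm(tokens, mask_rate=0.15, mask_token="[MASK]"):
    inputs, labels = [], []
    for tok in tokens:
        if random.random() < mask_rate:
            inputs.append(mask_token)   # model sees the mask...
            labels.append(tok)          # ...and must predict the original token
        else:
            inputs.append(tok)
            labels.append("-")          # ignored position in the loss
    return inputs, labels

tokens = "the user wants a table for two at seven".split()
print(mask_for_mlm(tokens))
```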