Shunyu Yao's research while affiliated with Princeton University and other places
What is this page?
This page lists the scientific contributions of an author, who either does not have a ResearchGate profile, or has not yet added these contributions to their profile.
It was automatically created by ResearchGate to create a record of this author's body of work. We create such pages to advance our goal of creating and maintaining the most comprehensive scientific repository possible. In doing so, we process publicly available (personal) data relating to the author as a member of the scientific community.
Publications (23)
Recent efforts have incorporated large language models (LLMs) with external resources (e.g., the Internet) or internal control flows (e.g., prompt chaining) for tasks requiring grounding or reasoning. However, these efforts have largely been piecemeal, lacking a systematic framework for constructing a fully-fledged language agent. To address this c...
Text generation under constraints has seen increasing interest in natural language processing, especially with the rapidly improving capabilities of large language models. However, existing benchmarks for constrained generation usually focus on fixed constraint types (e.g., generate a sentence containing certain words) that have proved to be easy...
Humans write code in a fundamentally interactive manner and rely on constant execution feedback to correct errors, resolve ambiguities, and decompose tasks. While LLMs have recently exhibited promising coding capabilities, current coding benchmarks mostly consider a static instruction-to-code sequence transduction process, which has the potential f...
We propose Referral-Augmented Retrieval (RAR), a simple technique that concatenates document indices with referrals, i.e. text from other documents that cite or link to the given document, to provide significant performance gains for zero-shot information retrieval. The key insight behind our method is that referrals provide a more complete, multi-...
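The core idea — concatenating a document with text from other documents that cite or link to it before indexing — can be sketched in a few lines. This is an illustrative toy, not the paper's implementation: the corpus, the `augment_with_referrals` helper, and the lexical-overlap scorer (a stand-in for any zero-shot retriever) are all assumptions.

```python
# Toy sketch of referral augmentation for retrieval. All names and data
# here are illustrative; the scorer is a stand-in for a real retriever.
from collections import Counter

def augment_with_referrals(docs, referrals):
    """Concatenate each document with referral text from documents citing it."""
    return {
        doc_id: text + " " + " ".join(referrals.get(doc_id, []))
        for doc_id, text in docs.items()
    }

def score(query, text):
    """Simple lexical-overlap score (multiset intersection of tokens)."""
    q, t = Counter(query.lower().split()), Counter(text.lower().split())
    return sum((q & t).values())

docs = {"d1": "neural retrieval models", "d2": "cooking pasta at home"}
referrals = {"d1": ["cited for zero-shot information retrieval benchmarks"]}

index = augment_with_referrals(docs, referrals)
best = max(index, key=lambda d: score("zero-shot retrieval", index[d]))
```

In this toy example the referral text is what lets the query "zero-shot retrieval" match `d1`, illustrating how referrals supply vocabulary the document itself lacks.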
Comprehending characters' personalities is a crucial aspect of story reading. As readers engage with a story, their understanding of a character evolves based on new events and information; and multiple fine-grained aspects of personalities can be perceived. This leads to a natural problem of situated and fine-grained personality understanding. The...
Language models are increasingly being deployed for general problem solving across a wide range of tasks, but are still confined to token-level, left-to-right decision-making processes during inference. This means they can fall short in tasks that require exploration, strategic lookahead, or where initial decisions play a pivotal role. To surmount...
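The contrast with token-level, left-to-right decoding can be illustrated by a deliberate search over intermediate steps. The sketch below is a generic beam-style tree search on a toy arithmetic task; `propose` and `value` are hypothetical stand-ins for the generator and evaluator calls described above, not the actual method.

```python
# Illustrative tree search over intermediate "thoughts": generate several
# candidate continuations, evaluate them, and keep only the most promising.
# propose() and value() are toy stand-ins for model calls.
def propose(thought):
    # Toy generator: extend a partial sum toward a target of 10.
    return [thought + step for step in (1, 2, 3)]

def value(thought, target=10):
    # Toy evaluator: states closer to the target score higher.
    return -abs(target - thought)

def tree_search(root=0, depth=3, beam=2):
    frontier = [root]
    for _ in range(depth):
        candidates = [c for t in frontier for c in propose(t)]
        # Lookahead: rank all candidates and keep the `beam` best.
        frontier = sorted(candidates, key=value, reverse=True)[:beam]
    return frontier[0]
```

A greedy left-to-right decoder corresponds to `beam=1` with no ranking; widening the beam is what enables exploration and recovery from a poor early choice.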
Embodied control requires agents to leverage multi-modal pre-training to quickly learn how to act in new environments, where video demonstrations contain visual and motion details needed for low-level perception and control, and language instructions support generalization with abstract, symbolic structures. While recent approaches apply contrastiv...
Yi Gu, Shunyu Yao, Chuang Gan, [...], Mo Yu
Text games present opportunities for natural language understanding (NLU) methods to tackle reinforcement learning (RL) challenges. However, recent work has questioned the necessity of NLU by showing random text hashes could perform decently. In this paper, we pursue a fine-grained investigation into the roles of text in the face of different RL ch...
While large language models (LLMs) have demonstrated impressive capabilities across tasks in language understanding and interactive decision making, their abilities for reasoning (e.g. chain-of-thought prompting) and acting (e.g. action plan generation) have primarily been studied as separate topics. In this paper, we explore the use of LLMs to gen...
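The interleaving of reasoning and acting can be sketched as a loop that alternates a "thought" step with a tool call whose observation is fed back into the trace. Everything below is a hedged toy: the heuristic thought step, the `lookup` tool, and the knowledge base are illustrative stand-ins, not the paper's agent.

```python
# Minimal sketch of alternating reasoning traces with actions. The
# "thought" heuristic replaces an LLM call; lookup() is a toy tool.
def lookup(entity, kb):
    """Stand-in external tool: query a tiny knowledge base."""
    return kb.get(entity, "no result")

def react_loop(question, kb, max_steps=3):
    trace = [f"Question: {question}"]
    for _ in range(max_steps):
        # Thought: decide what to look up (toy heuristic, not an LLM).
        entity = question.split()[-1].rstrip("?")
        trace.append(f"Thought: I should look up '{entity}'.")
        # Action: query the tool; Observation: append the result to the trace.
        obs = lookup(entity, kb)
        trace.append(f"Observation: {obs}")
        if obs != "no result":
            return obs, trace
    return None, trace

kb = {"Princeton": "a university in New Jersey"}
answer, trace = react_loop("Where is Princeton?", kb)
```

The point of the loop structure is that each observation can inform the next thought, rather than planning all actions up front.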
Existing benchmarks for grounding language in interactive environments either lack real-world linguistic elements, or prove difficult to scale up due to substantial human involvement in the collection of data or feedback signals. To bridge this gap, we develop WebShop, a simulated e-commerce website environment with 1.18 million real-world prod...
We propose a new task for assessing machines' skills of understanding fictional characters in narrative stories. The task, TVShowGuess, builds on the scripts of TV series and takes the form of guessing the anonymous main characters based on the backgrounds of the scenes and the dialogues. Our human study supports that this form of task covers compr...
The study of language emergence aims to understand how human languages are shaped by perceptual grounding and communicative intent. Computational approaches to emergent communication (EC) predominantly consider referential games in limited domains and analyze the learned protocol within the game framework. As a result, it remains unclear how the em...
Text adventure games present unique challenges to reinforcement learning methods due to their combinatorially large action spaces and sparse rewards. The interplay of these two factors is particularly demanding because large action spaces require extensive exploration, while sparse rewards provide limited feedback. This work proposes to tackle the...
Despite their impressive performance in NLP, self-attention networks were recently proved to be limited for processing formal languages with hierarchical structure, such as $\mathsf{Dyck}_k$, the language consisting of well-nested parentheses of $k$ types. This suggested that natural language can be approximated well with models that are too weak f...
Text-based games simulate worlds and interact with players using natural language. Recent work has used them as a testbed for autonomous language-understanding agents, with the motivation being that understanding the meanings of words or semantics is a key component of how humans understand, reason, and act in these worlds. However, it remains uncl...
Text-based games present a unique challenge for autonomous agents to operate in natural language and handle enormous action spaces. In this paper, we propose the Contextual Action Language Model (CALM) to generate a compact set of action candidates at each game state. Our key insight is to train language models on human gameplay, where people demon...
Citations
... Yao et al. 64 and Schick et al. 25 have shown that LLMs can be used as agents that autonomously make use of external tools such as Web APIs, a paradigm that some call MRKL (pronounced "miracle") systems: modular reasoning, knowledge, and language systems. 26 By giving LLMs access to tools and forcing them to think step-by-step, 65 we can thereby convert LLMs from hyperconfident models that often hallucinate into systems that can reason based on observations made by querying robust tools. ...
... They have been used to study tasks like discourse parsing and summarization (Afantenos et al., 2015; Manuvinakurike et al., 2021, i.a.) and coreference resolution (Walker and Reithinger, 1997; Jovanovic et al., 2005; Frampton et al., 2009; Choi and Chen, 2018). Despite the breadth of domains and formality across all datasets, each multiparty dataset itself is narrowly focused, like meetings (McCowan et al., 2005; Hsueh et al., 2006), board game play (Asher et al., 2016), fantasy storytelling (Rameshkumar and Bailey, 2020), technical or persuasive online forums (Li et al., 2020a; Wang et al., 2019), and sitcom transcripts (Choi and Chen, 2018; Sang et al., 2022). ...
... Such a joint system is regarded as a semi-parametric RL agent, which can evolve its abilities through its interaction experiences, analogously to a fully-parametric system, but without fine-tuning the LLM parameters. We evaluate REMEMBERER on two recent RL task sets on which LLM-based agents have shown promising performance, WebShop [Yao et al., 2022a] and WikiHow [Zhang et al., 2023]. The agent is trained on a few tasks and tested on others to check whether experience from different tasks can help it make decisions in unseen episodes. ...
... other works have investigated specific classes of algorithms, e.g. bounded-depth Dyck languages [50], modular prefix sums [2], adders [25], regular languages [4], and sparse logical predicates [11]. Liu et al. [22] provide a unified theory on understanding automata-like mechanisms within transformers. ...
Reference: Trainable Transformer in Transformer
... The overall idea is that instead of relying on reinforcement learning to learn how to play the games through repeated interactions with the same game (He et al., 2016; Ammanabrolu and Riedl, 2019; Guo et al., 2020), the model can rely on general knowledge it has learned about language to gain a good sense of which actions are possible and/or may be useful. Yao et al. (2021) explore the use of intrinsic motivation (Pathak et al., 2017), i.e., additional loss functions encouraging the model to explore more, to obtain more semantically relevant representations. ...
... The basis for this is the large amount of training data representing a wide range of human behaviour through language [11,8]. In relation to games, models such as OpenAI's Generative Pre-trained Transformer (GPT-2 and its successors) have shown early successes in the procedural generation of interactive stories [20], as well as text-based adventure dialogue and action candidates [66,12]. In a recent simulation study by Park et al. [44], language models were implemented in artificial agent architectures to populate a sandbox world reminiscent of The Sims [4]. ...