Preprint

Making Large Language Models into World Models with Precondition and Effect Knowledge


Abstract

World models, which encapsulate the dynamics of how actions affect environments, are foundational to the functioning of intelligent agents. In this work, we explore the potential of Large Language Models (LLMs) to operate as world models. Although LLMs are not inherently designed to model real-world dynamics, we show that they can be induced to perform two critical world model functions: determining the applicability of an action based on a given world state, and predicting the resulting world state upon action execution. This is achieved by fine-tuning two separate LLMs, one for precondition prediction and another for effect prediction, while leveraging synthetic data generation techniques. Through human-participant studies, we validate that the precondition and effect knowledge generated by our models aligns with human understanding of world dynamics. We also analyze the extent to which the world model trained on our synthetic data results in an inferred state space that supports the creation of action chains, a necessary property for planning.
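The abstract describes a world model assembled from two fine-tuned LLMs: a precondition model that decides whether an action is applicable in the current state, and an effect model that predicts the resulting state. The Python sketch below shows, under illustrative assumptions only, how such a pair of predictors could be composed into a world model that chains actions for planning. The set-of-facts state representation, the function names, and the toy stand-in predictors are assumptions for clarity; they are not the authors' interface or prompts.

# Minimal sketch (not the authors' code): composing a precondition predictor
# and an effect predictor into a world model that supports chaining actions.
# The two toy functions below stand in for the fine-tuned LLMs.

from dataclasses import dataclass
from typing import Callable, List, Set, Tuple

# A world state is modeled as a set of textual facts (an illustrative
# assumption; the paper works with natural-language descriptions).
State = Set[str]

@dataclass
class LLMWorldModel:
    # precondition_model(state, action) -> True if the action is applicable.
    precondition_model: Callable[[State, str], bool]
    # effect_model(state, action) -> (facts added, facts removed).
    effect_model: Callable[[State, str], Tuple[Set[str], Set[str]]]

    def is_applicable(self, state: State, action: str) -> bool:
        return self.precondition_model(state, action)

    def apply(self, state: State, action: str) -> State:
        if not self.is_applicable(state, action):
            raise ValueError(f"Preconditions not met for action: {action}")
        added, removed = self.effect_model(state, action)
        return (state - removed) | added

    def rollout(self, state: State, plan: List[str]) -> State:
        # Chain actions, checking applicability at every step.
        for action in plan:
            state = self.apply(state, action)
        return state

# Toy stand-ins for the two fine-tuned LLMs, hard-coded for one action.
def toy_preconditions(state: State, action: str) -> bool:
    return action == "open the chest" and "player has key" in state

def toy_effects(state: State, action: str) -> Tuple[Set[str], Set[str]]:
    return {"chest is open"}, {"chest is closed"}

if __name__ == "__main__":
    wm = LLMWorldModel(toy_preconditions, toy_effects)
    start = {"player has key", "chest is closed"}
    print(wm.rollout(start, ["open the chest"]))
    # {'player has key', 'chest is open'}

In the paper's setting, the two toy functions would presumably be replaced by calls to the separately fine-tuned precondition and effect LLMs, with states and actions expressed in natural language rather than as fact sets.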
