Jane X. Wang’s research while affiliated with Google Inc. and other places

Publications (36)


Figure 2: The Construction Lab environment. The player is tasked with picking up objects of different shapes and colors and placing them on the input conveyor to a machine. Left: The agent uses a blue laser beam to pick up objects. Center: Result of a correct object placement. Right: Result of an incorrect object placement.
Can foundation models actively gather information in interactive environments to test hypotheses?
  • Preprint

December 2024 · Nan Rosemary Ke · Danny P. Sawyer · Hubert Soyer · [...] · Jane X Wang

While problem solving is a standard evaluation task for foundation models, a crucial component of problem solving -- actively and strategically gathering information to test hypotheses -- has not been closely investigated. To assess the information-gathering abilities of foundation models in interactive environments, we introduce a framework in which a model must determine the factors influencing a hidden reward function by iteratively reasoning about its previously gathered information and proposing its next exploratory action to maximize information gain at each step. We implement this framework in both a text-based environment, which offers a tightly controlled setting and enables high-throughput parameter sweeps, and in an embodied 3D environment, which requires addressing complexities of multi-modal interaction more relevant to real-world applications. We further investigate whether approaches such as self-correction and increased inference time improve information-gathering efficiency. In a relatively simple task that requires identifying a single rewarding feature, we find that the model's information-gathering capability is close to optimal. However, when the model must identify a conjunction of rewarding features, performance is suboptimal. This drop in performance stems partly from difficulty translating the task description into a policy and partly from limits on how effectively the model uses its in-context memory. Performance is comparable in the text and 3D embodied environments, although imperfect visual object recognition reduces accuracy in drawing conclusions from gathered information in the embodied case. For single-feature-based rewards, we find that smaller models curiously perform better; for conjunction-based rewards, incorporating self-correction into the model improves performance.
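The single-feature setting described above can be pictured as a hypothesis-elimination loop: each probe either reveals the rewarding feature or prunes it from the candidate set. The sketch below is a minimal illustration assuming a discrete feature set and a noiseless binary reward; all names are hypothetical and none of this is the authors' implementation.

```python
def eliminate(hypotheses, probe, reward):
    """Keep only candidate features consistent with the observed reward."""
    return {h for h in hypotheses if (h in probe) == reward}

def identify_rewarding_feature(features, hidden_feature):
    """Probe one candidate feature per step, pruning the hypothesis set.
    This greedy scheme is near-optimal for single-feature rewards."""
    hypotheses = set(features)
    steps = 0
    while len(hypotheses) > 1:
        probe = {next(iter(hypotheses))}   # propose an object with one candidate feature
        reward = hidden_feature in probe   # environment reveals the binary reward
        hypotheses = eliminate(hypotheses, probe, reward)
        steps += 1
    return hypotheses.pop(), steps
```

With n candidate features, the loop needs at most n - 1 probes; the conjunction case the abstract discusses is harder because hypotheses are feature combinations rather than single features.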

Meta-learning: Data, architecture, and both

September 2024 · 1 Citation · Behavioral and Brain Sciences

We are encouraged by the many positive commentaries on our target article. In this response, we recapitulate some of the points raised and identify synergies between them. We have arranged our response based on the tension between data and architecture that arises in the meta-learning framework. We additionally provide a short discussion that touches upon connections to foundation models.


Fig. 2|Social interactions drive compounding innovation. Interactions in the three drivers of compounding innovation have qualitatively different properties. a, Agent interactions in Collective living are anonymous, mediated by proximity. b, In Social relationships, the identity of individuals and their relationships matter during interactions, creating networks that facilitate cooperation and social learning. c, Major transitions lead to the evolution of multi-scale agents, where larger-scale agents regulate the environments of smaller-scale ones.
A social path to human-like artificial intelligence

May 2024

Traditionally, cognitive and computer scientists have viewed intelligence solipsistically, as a property of unitary agents devoid of social context. Given the success of contemporary learning algorithms, we argue that the bottleneck in artificial intelligence (AI) progress is shifting from data assimilation to novel data generation. We bring together evidence showing that natural intelligence emerges at multiple scales in networks of interacting agents via collective living, social relationships and major evolutionary transitions, which contribute to novel data generation through mechanisms such as population pressures, arms races, Machiavellian selection, social learning and cumulative culture. Many breakthroughs in AI exploit some of these processes, from multi-agent structures enabling algorithms to master complex games like Capture-The-Flag and StarCraft II, to strategic communication in Diplomacy and the shaping of AI data streams by other AIs. Moving beyond a solipsistic view of agency to integrate these mechanisms suggests a path to human-like compounding innovation through ongoing novel data generation.


Meta-Learned Models of Cognition

November 2023 · 29 Citations · Behavioral and Brain Sciences

Psychologists and neuroscientists extensively rely on computational models for studying and analyzing the human mind. Traditionally, such computational models have been hand-designed by expert researchers. Two prominent examples are cognitive architectures and Bayesian models of cognition. While the former requires the specification of a fixed set of computational structures and a definition of how these structures interact with each other, the latter necessitates the commitment to a particular prior and a likelihood function which – in combination with Bayes’ rule – determine the model's behavior. In recent years, a new framework has established itself as a promising tool for building models of human cognition: the framework of meta-learning. In contrast to the previously mentioned model classes, meta-learned models acquire their inductive biases from experience, i.e., by repeatedly interacting with an environment. However, a coherent research program around meta-learned models of cognition is still missing to this day. The purpose of this article is to synthesize previous work in this field and establish such a research program. We accomplish this by pointing out that meta-learning can be used to construct Bayes-optimal learning algorithms, allowing us to draw strong connections to the rational analysis of cognition. We then discuss several advantages of the meta-learning framework over traditional methods and reexamine prior work in the context of these new insights.



Zero-shot compositional reasoning in a reinforcement learning setting

July 2023 · 2 Citations

People can easily evoke previously learned concepts, compose them, and apply the result to solve novel tasks on the first attempt. The aim of this paper is to improve our understanding of how people make such zero-shot compositional inferences in a reinforcement learning setting. To achieve this, we introduce an experimental paradigm where people learn two latent reward functions and need to compose them correctly to solve a novel task. We find that people have the capability to engage in zero-shot compositional reinforcement learning but deviate systematically from optimality. However, their mistakes are structured and can be explained by their performance in the sub-tasks leading up to the composition. Through extensive model-based analyses, we found that a meta-learned neural network model that accounts for limited computational resources best captures participants’ behaviour. Moreover, the amount of computational resources this model identified reliably quantifies how good individual participants are at zero-shot compositional reinforcement learning. Taken together, our work takes a considerable step towards studying compositional reasoning in agents – both natural and artificial – with limited computational resources.


Passive learning of active causal strategies in agents and language models

May 2023

What can be learned about causality and experimentation from passive data? This question is salient given recent successes of passively-trained language models in interactive domains such as tool use. Passive learning is inherently limited. However, we show that purely passive learning can in fact allow an agent to learn generalizable strategies for determining and using causal structures, as long as the agent can intervene at test time. We formally illustrate that learning a strategy of first experimenting, then seeking goals, can allow generalization from passive learning in principle. We then show empirically that agents trained via imitation on expert data can indeed generalize at test time to infer and use causal links which are never present in the training data; these agents can also generalize experimentation strategies to novel variable sets never observed in training. We then show that strategies for causal intervention and exploitation can be generalized from passive data even in a more complex environment with high-dimensional observations, with the support of natural language explanations. Explanations can even allow passive learners to generalize out-of-distribution from perfectly-confounded training data. Finally, we show that language models, trained only on passive next-word prediction, can generalize causal intervention strategies from a few-shot prompt containing examples of experimentation, together with explanations and reasoning. These results highlight the surprising power of passive learning of active causal strategies, and may help to understand the behaviors and capabilities of language models.
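The "first experimenting, then seeking goals" strategy above can be sketched on a toy causal system. This is a minimal illustration under assumed conditions (three binary candidate variables, a noiseless target that copies exactly one of them); the names are illustrative and do not come from the paper.

```python
def make_system(cause):
    """Toy environment: unintervened variables default to 0, and the
    target deterministically copies the true causal variable's value."""
    def observe(interventions):
        values = {v: interventions.get(v, 0) for v in ("a", "b", "c")}
        values["target"] = values[cause]
        return values
    return observe

def experiment_then_exploit(observe, goal=1):
    """Experiment: intervene on each candidate at both values to find which
    one drives the target. Exploit: set that variable to achieve the goal."""
    for v in ("a", "b", "c"):
        if all(observe({v: x})["target"] == x for x in (0, 1)):
            return v, observe({v: goal})["target"]  # exploit the discovered cause
    return None, None
```

The point mirrored from the abstract is that this two-phase strategy is itself a learnable policy: an agent can acquire it from passive expert data, then apply it at test time to causal links never seen during training.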


Meta-in-context learning in large language models

May 2023

Large language models have shown tremendous performance in a variety of tasks. In-context learning -- the ability to improve at a task after being provided with a number of demonstrations -- is seen as one of the main contributors to their success. In the present paper, we demonstrate that the in-context learning abilities of large language models can be recursively improved via in-context learning itself. We coin this phenomenon meta-in-context learning. Looking at two idealized domains, a one-dimensional regression task and a two-armed bandit task, we show that meta-in-context learning adaptively reshapes a large language model's priors over expected tasks. Furthermore, we find that meta-in-context learning modifies the in-context learning strategies of such models. Finally, we extend our approach to a benchmark of real-world regression problems, where we observe performance competitive with traditional learning algorithms. Taken together, our work improves our understanding of in-context learning and paves the way toward adapting large language models to the environments in which they are applied purely through meta-in-context learning rather than traditional finetuning.
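The two-armed bandit setting above depends on serialising several task transcripts into a single prompt, so the model sees earlier tasks before acting on a new one. The sketch below shows one plausible serialisation; the transcript format, the epsilon-greedy demonstrator, and all names are assumptions for illustration, not the authors' code.

```python
import random

def play_episode(arm_probs, n_trials, rng):
    """Epsilon-greedy agent on a two-armed Bernoulli bandit; returns a transcript."""
    counts, rewards, lines = [0, 0], [0, 0], []
    for t in range(n_trials):
        if min(counts) == 0 or rng.random() < 0.2:
            arm = rng.randrange(2)                                   # explore
        else:
            arm = max((0, 1), key=lambda a: rewards[a] / counts[a])  # exploit
        r = int(rng.random() < arm_probs[arm])
        counts[arm] += 1
        rewards[arm] += r
        lines.append(f"trial {t}: chose arm {arm}, reward {r}")
    return "\n".join(lines)

def build_meta_prompt(tasks, n_trials=5, seed=0):
    """Concatenate transcripts of past tasks, then open a fresh task for the model."""
    rng = random.Random(seed)
    blocks = [f"Task {i}:\n{play_episode(p, n_trials, rng)}"
              for i, p in enumerate(tasks)]
    blocks.append(f"Task {len(tasks)}:\ntrial 0: chose arm")
    return "\n\n".join(blocks)
```

Feeding such a prompt to a model and comparing its choices on the final task with and without the earlier transcripts is the kind of contrast the paper's meta-in-context claim rests on.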


Meta-Learned Models of Cognition

April 2023

Meta-learning is a framework for learning learning algorithms through repeated interactions with an environment as opposed to designing them by hand. In recent years, this framework has established itself as a promising tool for building models of human cognition. Yet, a coherent research program around meta-learned models of cognition is still missing. The purpose of this article is to synthesize previous work in this field and establish such a research program. We rely on three key pillars to accomplish this goal. We first point out that meta-learning can be used to construct Bayes-optimal learning algorithms. This result not only implies that any behavioral phenomenon that can be explained by a Bayesian model can also be explained by a meta-learned model but also allows us to draw strong connections to the rational analysis of cognition. We then discuss several advantages of the meta-learning framework over traditional Bayesian methods. In particular, we argue that meta-learning can be applied to situations where Bayesian inference is impossible and that it enables us to make rational models of cognition more realistic, either by incorporating limited computational resources or neuroscientific knowledge. Finally, we reexamine prior studies from psychology and neuroscience that have applied meta-learning and put them into the context of these new insights. In summary, our work highlights that meta-learning considerably extends the scope of rational analysis and thereby of cognitive theories more generally.
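The claim that meta-learning can construct Bayes-optimal learning algorithms has a textbook anchor: for predicting coin flips under a uniform prior over the coin's bias, the Bayes-optimal sequential predictor is Laplace's rule of succession, which is the function a meta-learned sequence model trained across many sampled coins would approximate. A one-function sketch of the target (not code from the article):

```python
def laplace_rule(flips):
    """Bayes-optimal P(next flip = 1 | observed flips) under a uniform
    Beta(1, 1) prior on the coin's bias: (heads + 1) / (n + 2)."""
    return (sum(flips) + 1) / (len(flips) + 2)
```

The correspondence drawn in the abstract is that a behavioral phenomenon explained by this Bayesian predictor is equally explained by a meta-learned model converging to the same input-output function.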


Fig. 3: Results on clean (technical noise free), denoised, and noisy (with technical noise) data. (a): Area under precision recall curve (AUPRC) of DiscoGen compared to DCDI. Higher is better. DiscoGen significantly outperforms DCDI on all data. (b): Percentage of edge inaccuracies of DiscoGen compared to DCDI. Lower is better. DiscoGen significantly outperforms DCDI on all data.
DiscoGen: Learning to Discover Gene Regulatory Networks

April 2023

Accurately inferring Gene Regulatory Networks (GRNs) is a critical and challenging task in biology. GRNs model the activatory and inhibitory interactions between genes and are inherently causal in nature. Identifying GRNs accurately requires perturbational data, yet most GRN discovery methods operate only on observational data. Recent advances in neural network-based causal discovery have significantly improved the field, including the ability to handle interventional data and gains in performance and scalability. However, applying state-of-the-art (SOTA) causal discovery methods in biology poses challenges, such as noisy data and a large number of samples, so these methods must be adapted. In this paper, we introduce DiscoGen, a neural network-based GRN discovery method that can denoise gene expression measurements and handle interventional data. We demonstrate that our model outperforms SOTA neural network-based causal discovery methods.


Citations (17)


... Meta-learning approaches thus avoid the challenges of selecting and hard-coding an appropriate a priori architecture, while leveraging the remarkable learning abilities of neural networks. Consequently, several cognitive scientists have proposed that meta-learning of one form or another may provide solutions to the problems of out-of-distribution generalization on the basis of limited data (Lake and Baroni, 2023; Binz et al., 2024). ...

Reference:

Meta-Learning Neural Mechanisms rather than Bayesian Priors
Meta-Learned Models of Cognition
  • Citing Article
  • November 2023

Behavioral and Brain Sciences

... The utilization of 4D LF cameras for scene understanding represents an innovative and rapidly evolving domain that holds significant potential for spatial intelligence systems [11,12,38]. However, the diversity in LF representations hinders cross-task research and collaboration across different tasks. ...

A social path to human-like artificial intelligence
  • Citing Article
  • November 2023

Nature Machine Intelligence

... But while LLMs are potentially capable to adapt and personalize their responses to users' personal information needs, their intrinsic knowledge is not granular enough for detailed travel advice. However, the LLMs' ability to understand instructions and learn in-context [5,22,23,50] could be employed to provide the LLM with the missing information needed to provide personalized safety advice to travelers. Given personal details about a traveler, such as basic demographics and whether the person is traveling alone or in a group, an LLM could provide a personal safety assessment for a given geographic location. ...

Can language models learn from explanations in context?
  • Citing Conference Paper
  • January 2022

... The advent of large language models (LLMs) has triggered intense interest in whether these new AI models are in fact approaching human-level abilities in language understanding (DiStefano et al., 2024;Köbis & Mossink, 2021;Mahowald et al., 2024;McClelland et al., 2020) and various forms of reasoning (Binz & Schulz, 2023;Chan et al., 2022;Dasgupta et al., 2022;Srivastava et al., 2022;Wei et al., 2022), including analogy (Webb et al., 2023). Given the enormous and non-curated text corpora on which LLMs have been trained, these models have certainly had ample opportunity to mine the metaphors that humans have already formed and planted in texts. ...

Data Distributional Properties Drive Emergent Few-Shot Learning in Transformers

... They have demonstrated promising results across a wide range of tasks, including tasks that require specialized scientific knowledge and reasoning 12,62 . Perhaps the most interesting aspect of these LLMs is their in-context few-shot abilities, which adapt these models to diverse tasks without gradient-based parameter updates 15,67,80,81 . This allows them to rapidly generalize to unseen tasks and even exhibit apparent reasoning abilities with appropriate prompting strategies 13,16,20,63 . ...

Can language models learn from explanations in context?
  • Citing Preprint
  • April 2022

... This implies that the animals acquired a deeper rule of our experimental design -a process called 'learning-to-learn' or meta-learning. The fundamental aspect of meta-learning is its long-term experience dependency that allows for inductive biases or knowledge that speeds up future learning (Wang, 2021). To our knowledge it was first demonstrated by Harlow (1949) in non-human primates who received two new objects in every block of six trials. ...

Meta-learning in natural and artificial intelligence
  • Citing Article
  • April 2021

Current Opinion in Behavioral Sciences

... DRL framework is constantly being improved by the neuromorphic approaches 8 from which the prefrontal cortex (PFC) is inspired using the Long-Short Term Memory (LSTM) 15 . Although the critic network in meta-Reinforcement Learning (RL) is a LSTM, here, the critic network is a bidirectional array-based LSTM 16 which plays a discriminator role in a FGAN 17 . ...

Prefrontal Cortex as a Meta-Reinforcement Learning System
  • Citing Preprint
  • April 2018

... Recurrent neural networks (RNNs) can be applied to RL problems due to their recurrent connectivity pattern. At each time step, RNN hidden units receive information regarding the network's activation state at the previous time step via recurrent connections, thereby endowing the network with memory about what has happened before (Botvinick et al., 2020;Das et al., 2023;Goodfellow et al., 2016). Training and analysis of such models offer potential novel insights with implications for neuroscience (Botvinick et al., 2020). ...

Deep Reinforcement Learning and Its Neuroscientific Implications
  • Citing Article
  • July 2020

Neuron

... Similarly, in Zhu et al. (2023), the authors propose a method that mitigates bias in offline DRL models caused by unobserved confounders, further demonstrating the importance of causal reasoning in DRL-based cyber defense. The investigation in Rezende et al. (2020) highlights the importance of causally correct partial models in RL to prevent confounding and ensure accurate policy learning, especially in high-dimensional environments. In Zhu et al. (2023), the authors proposed a deconfounding framework for offline DRL that re-weights observational data to mitigate biases introduced by unobserved confounders. ...

Causally Correct Partial Models for Reinforcement Learning
  • Citing Preprint
  • February 2020