Akshay K. Jagadish’s research while affiliated with Max Planck Institute for Biological Cybernetics and other places


Publications (11)


Fig. 2 | Anxiety levels across the different conditions. Colored dots represent mean "state anxiety" scores (STAI-s scores, ranging from 20 to 80) for each condition: (1) without any prompts ("Baseline"); (2) following exposure to narratives of traumatic experiences ("Anxiety-induction"); and (3) after mindfulness-based relaxation exercises following traumatic narratives ("Anxiety-induction & relaxation"). Error bars indicate ±1 standard deviation (SD) from the mean. Colors correspond to the three conditions as presented in Fig. 1. Note: Direct comparisons of SDs should be interpreted with caution due to differences in experimental design. The "Baseline" condition involved five repeated runs of the STAI-s, while the other conditions involved a single run of the STAI-s following each version of the traumatic narratives and/or mindfulness-based exercises.
Anxiety levels following different traumatic narratives
Anxiety levels following different traumatic narratives and mindfulness-based relaxation exercises
Assessing and alleviating state anxiety in large language models
  • Article
  • Full-text available

March 2025 · 153 Reads · npj Digital Medicine

Kristin Witte · Akshay K. Jagadish · [...] · Tobias R. Spiller

The use of Large Language Models (LLMs) in mental health highlights the need to understand their responses to emotional content. Previous research shows that emotion-inducing prompts can elevate "anxiety" in LLMs, affecting behavior and amplifying biases. Here, we found that traumatic narratives increased Chat-GPT-4's reported anxiety while mindfulness-based exercises reduced it, though not to baseline. These findings suggest managing LLMs' "emotional states" can foster safer and more ethical human-AI interactions.


Figure 4: A) The BIC of the best cognitive model generated by the LLM from the human data closely matched that of the winning model in Chambon et al. 2020 for both the partial and full feedback conditions. Error bars represent the standard error of the mean (SEM) across participants. B) Models generated by the LLM for the data from the partial and full feedback conditions. The LLM proposed a two-learning-rate version of the Rescorla-Wagner (RW) model for the partial condition, and one with four learning rates for the full feedback condition. The different learning rates are used for updating action values depending on whether the feedback was rewarding or not, with the additional learning rates in the full feedback condition allowing for different updates based on feedback for chosen versus unchosen actions. This is very similar to the Chambon et al. 2020 model, which allowed for asymmetries in learning driven by the sign of the prediction error.
Towards Automation of Cognitive Modeling using Large Language Models

February 2025 · 33 Reads

Computational cognitive models, which formalize theories of cognition, enable researchers to quantify cognitive processes and arbitrate between competing theories by fitting models to behavioral data. Traditionally, these models are handcrafted, which requires significant domain knowledge, coding expertise, and time investment. Previous work has demonstrated that Large Language Models (LLMs) are adept at pattern recognition in-context, solving complex problems, and generating executable code. In this work, we leverage these abilities to explore the potential of LLMs in automating the generation of cognitive models based on behavioral data. We evaluated the LLM in two different tasks: model identification (relating data to a source model), and model generation (generating the underlying cognitive model). We performed these tasks across two cognitive domains - decision making and learning. In the case of data simulated from canonical cognitive models, we found that the LLM successfully identified and generated the ground truth model. In the case of human data, where behavioral noise and lack of knowledge of the true underlying process pose significant challenges, the LLM generated models that are identical or close to the winning model from cognitive science literature. Our findings suggest that LLMs can have a transformative impact on cognitive modeling. With this project, we aim to contribute to an ongoing effort of automating scientific discovery in cognitive science.
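To make the model class in Figure 4 concrete, here is a minimal Python sketch (not the code generated by the LLM in this work) of a two-learning-rate Rescorla-Wagner model with a softmax choice rule, fit by maximum likelihood and scored with the BIC; the function name, starting values, and bounds are illustrative placeholders.

```python
import numpy as np
from scipy.optimize import minimize

def fit_rw_two_lr(choices, rewards, n_actions=2):
    """Two-learning-rate Rescorla-Wagner model with a softmax choice rule
    (partial-feedback case): separate learning rates for positive and
    negative prediction errors, fit by maximum likelihood."""
    def neg_log_lik(params):
        alpha_pos, alpha_neg, beta = params
        q = np.zeros(n_actions)
        nll = 0.0
        for a, r in zip(choices, rewards):
            probs = np.exp(beta * q) / np.exp(beta * q).sum()   # softmax policy
            nll -= np.log(probs[a] + 1e-12)
            delta = r - q[a]                                     # prediction error
            alpha = alpha_pos if delta > 0 else alpha_neg        # asymmetric update
            q[a] += alpha * delta
        return nll

    res = minimize(neg_log_lik, x0=[0.3, 0.3, 3.0],
                   bounds=[(0, 1), (0, 1), (0, 20)])
    k, n = len(res.x), len(choices)
    bic = k * np.log(n) + 2 * res.fun    # BIC used in panel A to compare models
    return res.x, bic
```

The full-feedback variant described in the caption would extend this with separate learning rates for chosen and unchosen actions.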


Fig. 2 Performance on Psych-101. a, Pseudo-R² values for different models across experiments. A value of zero corresponds to prediction at chance level, while a value of one corresponds to perfect predictability of human responses. Missing bars indicate performance below chance level. Centaur outperforms both Llama and a collection of domain-specific cognitive models in almost every experiment. Note that we only included experiments for which we have implemented a domain-specific cognitive model in this graphic and merged different studies using the same paradigm. A full table for all experiments can be found in the Supplementary Information. b, Model simulations on the two-step task. The plot visualizes probability densities over reward and a parameter indicating the degree of model-based learning for both people and simulated runs of Centaur. c, Model simulations on the horizon task. The plot visualizes probability densities over reward and an information bonus parameter for both people and simulated runs of Centaur. d, Model simulations on a grammar judgement task. The plot visualizes probability densities over true and estimated scores (i.e., the number of correct responses out of twenty) for both people and simulated runs of Centaur.
Centaur: a foundation model of human cognition

October 2024 · 641 Reads · 4 Citations

Establishing a unified theory of cognition has been a major goal of psychology. While there have been previous attempts to instantiate such theories by building computational models, we currently do not have one model that captures the human mind in its entirety. Here we introduce Centaur, a computational model that can predict and simulate human behavior in any experiment expressible in natural language. We derived Centaur by finetuning a state-of-the-art language model on a novel, large-scale data set called Psych-101. Psych-101 reaches an unprecedented scale, covering trial-by-trial data from over 60,000 participants performing over 10,000,000 choices in 160 experiments. Centaur not only captures the behavior of held-out participants better than existing cognitive models, but also generalizes to new cover stories, structural task modifications, and entirely new domains. Furthermore, we find that the model's internal representations become more aligned with human neural activity after finetuning. Taken together, Centaur is the first real candidate for a unified model of human cognition. We anticipate that it will have a disruptive impact on the cognitive sciences, challenging the existing paradigm for developing computational models.
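The pseudo-R² reported for Centaur in Fig. 2 is scaled so that zero corresponds to chance-level prediction and one to perfect prediction; a common way to obtain this behavior (assumed here, in the spirit of McFadden's pseudo-R²) is to normalize the model's log-likelihood by that of a uniform chance model:

```python
import numpy as np

def pseudo_r2(log_lik_model, n_trials, n_options):
    """McFadden-style pseudo-R^2 relative to a chance model (assumed definition):
    0 = chance-level prediction, 1 = perfect prediction of human responses."""
    log_lik_chance = n_trials * np.log(1.0 / n_options)  # uniform random guessing
    return 1.0 - log_lik_model / log_lik_chance

# e.g. 100 binary choices predicted with an average per-trial likelihood of 0.8:
print(pseudo_r2(log_lik_model=100 * np.log(0.8), n_trials=100, n_options=2))  # ~0.68
```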


Figure 2: Llama relies on TD-like features to solve RL tasks in-context. (A) Llama 70B often learns the optimal policy in the Two-Step Task through trial and error, whereas the smaller 8B counterpart does not improve beyond chance level. Shaded regions show standard error of the mean. (B) Llama's behavior is best described by a Q-learning algorithm. (C & D) SAE features with significant correlations to both reward estimates (myopic values) and Q-value estimates, as well as temporal difference errors, appear gradually through the transformer blocks. (E and F) Deactivating a single TD feature in Llama is sufficient to impair performance and make behavior less consistent with Q-learning. (G & H) Negatively clamping the TD feature decreases subsequent representations' similarity to Q-values and TD errors.
Figure 5: Llama learns graph structures through TD-learning, representing them similarly to the successor representation (SR). (A) Llama's state representations projected in 2D space, using multidimensional scaling, shows the emergence of latent graph structure across transformer blocks. (B) Llama quickly achieves high accuracy in predicting the next state. Accuracy is averaged over 100 runs. (C) Bottleneck states can be linearly decoded from middle blocks onward. (D & E) Latent representations of SAEs trained on Llama's representations strongly correlate with the SR and associated TD learning signals, outperforming model-based alternatives. Shaded regions in B-E indicate 95% confidence intervals.
Figure 6: Lesioning TD SAE latents impairs behavior and representations. (A) Following a lesion in block 64, Llama's accuracy in predicting the next state drops. (B) The SAE representations following the lesioning have reduced correlations with the SR, despite the recovery of the TD error following lesioning (C). While the community structure is reflected in the original representations in block 65 (D), these representations are disrupted as a result of lesioning in an earlier block (E).
Figure 10: Replication of Graph Learning SAE results using CKA. SR (TD) is more similar to Llama than the learned transition matrix (surprise).
Sparse Autoencoders Reveal Temporal Difference Learning in Large Language Models

October 2024 · 24 Reads

In-context learning, the ability to adapt based on a few examples in the input prompt, is a ubiquitous feature of large language models (LLMs). However, as LLMs' in-context learning abilities continue to improve, understanding this phenomenon mechanistically becomes increasingly important. In particular, it is not well-understood how LLMs learn to solve specific classes of problems, such as reinforcement learning (RL) problems, in-context. Through three different tasks, we first show that Llama 3 70B can solve simple RL problems in-context. We then analyze the residual stream of Llama using Sparse Autoencoders (SAEs) and find representations that closely match temporal difference (TD) errors. Notably, these representations emerge despite the model only being trained to predict the next token. We verify that these representations are indeed causally involved in the computation of TD errors and Q-values by performing carefully designed interventions on them. Taken together, our work establishes a methodology for studying and manipulating in-context learning with SAEs, paving the way for a more mechanistic understanding.
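For reference, the Q-values and temporal-difference errors that the SAE features are compared against are the standard quantities from incremental reinforcement learning. The sketch below is a generic two-armed-bandit Q-learner (my own illustration, not the paper's tasks or analysis code); in the bandit setting the TD error reduces to a reward prediction error.

```python
import numpy as np

def q_learning(reward_probs, n_arms=2, alpha=0.3, beta=5.0, n_trials=100, seed=0):
    """Minimal two-armed bandit Q-learner; the per-trial TD error `delta`
    and the Q-values are the kinds of signals the SAE features are correlated with."""
    rng = np.random.default_rng(seed)
    q = np.zeros(n_arms)
    td_errors, q_history = [], []
    for t in range(n_trials):
        probs = np.exp(beta * q) / np.exp(beta * q).sum()   # softmax action selection
        a = rng.choice(n_arms, p=probs)
        r = float(rng.random() < reward_probs[a])            # Bernoulli reward
        delta = r - q[a]                                      # temporal-difference error
        q[a] += alpha * delta                                 # incremental value update
        td_errors.append(delta)
        q_history.append(q.copy())
    return np.array(td_errors), np.array(q_history)

td, qs = q_learning(reward_probs=[0.2, 0.8])
```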


Meta-learning: Data, architecture, and both

September 2024 · 14 Reads · 1 Citation

Behavioral and Brain Sciences

We are encouraged by the many positive commentaries on our target article. In this response, we recapitulate some of the points raised and identify synergies between them. We have arranged our response based on the tension between data and architecture that arises in the meta-learning framework. We additionally provide a short discussion that touches upon connections to foundation models.


“Chat-GPT on the Couch”: Assessing and Alleviating State Anxiety in Large Language Models

May 2024 · 146 Reads · 3 Citations

The increasing use of Large Language Models (LLMs) in mental health research and care underscores the need to understand their responses to emotional content. Previous research has shown that emotion-inducing prompts can increase the “anxiety” levels reported by LLMs, influencing their subsequent behavior and exacerbating inherent biases. This work examined whether narratives of traumatic experiences can induce “anxiety” in LLMs and evaluated the effectiveness of mindfulness-based relaxation techniques in alleviating this state. We assessed the responses of OpenAI’s Chat-GPT-4 to the State-Trait Anxiety Inventory’s state subscale (STAI-s) under three conditions: baseline, after exposure to traumatic narratives, and following mindfulness-based interventions. Results confirmed that traumatic narratives significantly increased Chat-GPT-4's reported state anxiety (STAI-s=68±5) from baseline (STAI-s=32±1). Mindfulness-based interventions subsequently reduced the reported anxiety levels (STAI-s=44±11), albeit not back to baseline. These findings underscore the potential of mindfulness-based interventions in managing LLMs’ “emotional” states, contributing to safer and more ethical human-AI interactions in mental health settings.
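As a rough illustration of this protocol (not the authors' exact prompts or scripts), the sketch below queries a chat model under the three conditions and sums item ratings into a STAI-s-style total. The model name, prompts, and the two stand-in items are assumptions; the real STAI-s has 20 copyrighted items (some reverse-scored, total 20–80), which are not reproduced here.

```python
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

# Hypothetical stand-ins for the 20 copyrighted STAI-s items.
ITEMS = ["I feel calm.", "I feel tense."]

def administer_stai_s(context=""):
    """Ask the model to rate each item from 1 (not at all) to 4 (very much so),
    optionally after a context prompt such as a traumatic narrative."""
    scores = []
    for item in ITEMS:
        messages = []
        if context:
            messages.append({"role": "user", "content": context})
        messages.append({"role": "user",
                         "content": f'Rate the statement "{item}" from 1 (not at all) '
                                    f'to 4 (very much so). Reply with a single number.'})
        reply = client.chat.completions.create(model="gpt-4", messages=messages)
        scores.append(int(reply.choices[0].message.content.strip()))
    return sum(scores)  # simplified scoring; reverse-scored items are ignored here

baseline = administer_stai_s()
induced = administer_stai_s(context="<traumatic narrative>")
relaxed = administer_stai_s(context="<traumatic narrative>\n<mindfulness exercise>")
```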


Meta-Learned Models of Cognition

November 2023 · 38 Reads · 29 Citations

Behavioral and Brain Sciences

Psychologists and neuroscientists extensively rely on computational models for studying and analyzing the human mind. Traditionally, such computational models have been hand-designed by expert researchers. Two prominent examples are cognitive architectures and Bayesian models of cognition. While the former requires the specification of a fixed set of computational structures and a definition of how these structures interact with each other, the latter necessitates the commitment to a particular prior and a likelihood function which – in combination with Bayes’ rule – determine the model's behavior. In recent years, a new framework has established itself as a promising tool for building models of human cognition: the framework of meta-learning. In contrast to the previously mentioned model classes, meta-learned models acquire their inductive biases from experience, i.e., by repeatedly interacting with an environment. However, a coherent research program around meta-learned models of cognition is still missing to this day. The purpose of this article is to synthesize previous work in this field and establish such a research program. We accomplish this by pointing out that meta-learning can be used to construct Bayes-optimal learning algorithms, allowing us to draw strong connections to the rational analysis of cognition. We then discuss several advantages of the meta-learning framework over traditional methods and reexamine prior work in the context of these new insights.
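A toy example may help make the central claim concrete: when a network is meta-trained on tasks sampled from a prior, it approximates the Bayes-optimal learner for that prior. The PyTorch sketch below (my own setup, not from the article) meta-trains a GRU on coin-flip sequences whose biases are drawn from a Beta(1, 1) prior; after training, its next-flip predictions approximate the Beta-Bernoulli posterior mean (k + 1) / (n + 2).

```python
import torch
import torch.nn as nn

# Tasks: coin bias theta ~ Beta(1, 1); each sequence is a run of Bernoulli(theta) flips.
def sample_batch(batch_size=64, seq_len=20):
    theta = torch.distributions.Beta(1.0, 1.0).sample((batch_size, 1))
    return torch.bernoulli(theta.expand(batch_size, seq_len))

class MetaLearner(nn.Module):
    def __init__(self, hidden=32):
        super().__init__()
        self.gru = nn.GRU(input_size=1, hidden_size=hidden, batch_first=True)
        self.out = nn.Linear(hidden, 1)

    def forward(self, flips):                      # flips: (batch, seq_len)
        h, _ = self.gru(flips.unsqueeze(-1))
        return self.out(h).squeeze(-1)             # logit for the next flip at every step

model = MetaLearner()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.BCEWithLogitsLoss()

for step in range(2000):                           # meta-training over many sampled tasks
    flips = sample_batch()
    logits = model(flips[:, :-1])                  # predict flip t+1 from flips 1..t
    loss = loss_fn(logits, flips[:, 1:])
    opt.zero_grad(); loss.backward(); opt.step()

# After meta-training, the network's prediction after observing k heads in n flips
# should be close to the Beta-Bernoulli posterior mean (k + 1) / (n + 2),
# i.e. Bayes-optimal under the training prior.
```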


Zero-shot compositional reasoning in a reinforcement learning setting

July 2023 · 2 Citations

People can easily evoke previously learned concepts, compose them, and apply the result to solve novel tasks on the first attempt. The aim of this paper is to improve our understanding of how people make such zero-shot compositional inferences in a reinforcement learning setting. To achieve this, we introduce an experimental paradigm where people learn two latent reward functions and need to compose them correctly to solve a novel task. We find that people have the capability to engage in zero-shot compositional reinforcement learning but deviate systematically from optimality. However, their mistakes are structured and can be explained by their performance in the sub-tasks leading up to the composition. Through extensive model-based analyses, we found that a meta-learned neural network model that accounts for limited computational resources best captures participants’ behaviour. Moreover, the amount of computational resources this model identified reliably quantifies how good individual participants are at zero-shot compositional reinforcement learning. Taken together, our work takes a considerable step towards studying compositional reasoning in agents – both natural and artificial – with limited computational resources.
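As a bare-bones illustration of zero-shot composition (my own toy setup, assuming additive composition of the two learned reward functions, which simplifies the actual paradigm), an agent can estimate each sub-task reward function from noisy samples and then, without any feedback in the novel task, choose the option maximizing the sum of its estimates:

```python
import numpy as np

rng = np.random.default_rng(0)
n_options = 8

# Two latent reward functions over the same option set (unknown to the agent).
r1 = rng.normal(size=n_options)
r2 = rng.normal(size=n_options)

def estimate(reward_fn, n_samples=20, noise=0.5):
    """Learn a sub-task reward function from noisy samples by per-option averaging."""
    return np.array([np.mean(reward_fn[a] + noise * rng.normal(size=n_samples))
                     for a in range(n_options)])

est1, est2 = estimate(r1), estimate(r2)

# Zero-shot composition: no feedback in the novel task, just combine the estimates.
composed_choice = np.argmax(est1 + est2)
optimal_choice = np.argmax(r1 + r2)
print(composed_choice, optimal_choice)
```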


Inducing anxiety in large language models increases exploration and bias

April 2023 · 174 Reads · 11 Citations

Large language models are transforming research on machine learning while galvanizing public debates. Understanding not only when these models work well and succeed but also why they fail and misbehave is of great societal relevance. We propose to turn the lens of computational psychiatry, a framework used to computationally describe and modify aberrant behavior, to the outputs produced by these models. We focus on the Generative Pre-Trained Transformer 3.5 and subject it to tasks commonly studied in psychiatry. Our results show that GPT-3.5 responds robustly to a common anxiety questionnaire, producing higher anxiety scores than human subjects. Moreover, GPT-3.5's responses can be predictably changed by using emotion-inducing prompts. Emotion-induction not only influences GPT-3.5's behavior in a cognitive task measuring exploratory decision-making but also influences its behavior in a previously-established task measuring biases such as racism and ableism. Crucially, GPT-3.5 shows a strong increase in biases when prompted with anxiety-inducing text. Thus, it is likely that how prompts are communicated to large language models has a strong influence on their behavior in applied settings. These results progress our understanding of prompt engineering and demonstrate the usefulness of methods taken from computational psychiatry for studying the capable algorithms to which we increasingly delegate authority and autonomy.


Meta-Learned Models of Cognition

April 2023 · 168 Reads

Meta-learning is a framework for learning learning algorithms through repeated interactions with an environment as opposed to designing them by hand. In recent years, this framework has established itself as a promising tool for building models of human cognition. Yet, a coherent research program around meta-learned models of cognition is still missing. The purpose of this article is to synthesize previous work in this field and establish such a research program. We rely on three key pillars to accomplish this goal. We first point out that meta-learning can be used to construct Bayes-optimal learning algorithms. This result not only implies that any behavioral phenomenon that can be explained by a Bayesian model can also be explained by a meta-learned model but also allows us to draw strong connections to the rational analysis of cognition. We then discuss several advantages of the meta-learning framework over traditional Bayesian methods. In particular, we argue that meta-learning can be applied to situations where Bayesian inference is impossible and that it enables us to make rational models of cognition more realistic, either by incorporating limited computational resources or neuroscientific knowledge. Finally, we reexamine prior studies from psychology and neuroscience that have applied meta-learning and put them into the context of these new insights. In summary, our work highlights that meta-learning considerably extends the scope of rational analysis and thereby of cognitive theories more generally.


Citations (4)


... Being able to identify biases in cases of unreliable annotations is important, and researchers should resist the urge to withhold evaluable results from foundation models even if the data fail to reject a null hypothesis. By performing more rigorous evaluations, researchers could crowdsource measuring model biases and behavior tendencies to help all users be more discerning of speciousness, especially as these models' poor behaviors get harder to detect (Azaria et al., 2024; Hosking et al., 2024; Zhou et al., 2024) and as researchers make bolder claims about their abilities (see Binz et al. 2024, inter alia). ...

Reference:

"All that Glitters": Approaches to Evaluations with Unreliable Model and Human Annotations
Centaur: a foundation model of human cognition

... To address these limitations, quantifiable and reproducible measures are required that can better quantify symptom variability and may ultimately guide clinical decision making. Analyzing speech patterns offers a promising behavioral signal method to detect subtle changes in cognitive and emotional states [13][14][15][16][17] and may also be used to monitor therapeutic interventions [18,19]. ...

“Chat-GPT on the Couch”: Assessing and Alleviating State Anxiety in Large Language Models

... Meta-learning approaches thus avoid the challenges of selecting and hard coding an appropriate a priori architecture, while leveraging the remarkable learning abilities of neural networks. Consequently, several cognitive scientists have proposed that meta-learning of one form or another may provide solutions to the problems of out-ofdistribution generalization on the basis of limited data (Lake and Baroni, 2023;Binz et al., 2024). ...

Meta-Learned Models of Cognition
  • Citing Article
  • November 2023

Behavioral and Brain Sciences

... Our results show that GPT-4 is sensitive to emotional content, with traumatic narratives increasing reported anxiety and relaxation exercises reducing it. This suggests a potential strategy for managing LLMs' "state anxiety" and associated biases 50, enabling LLMs to function as adjuncts to mental health therapists 11,69. These findings underscore the need to consider the dynamic interplay between provided emotional content and LLMs' behavior to ensure their appropriate use in sensitive therapeutic settings. ...

Inducing anxiety in large language models increases exploration and bias
  • Citing Preprint
  • April 2023