Thomas L. Griffiths’s research while affiliated with Princeton University and other places


Publications (614)


Costly Exploration Produces Stereotypes With Dimensions of Warmth and Competence
  • Article

November 2024 · 80 Reads · Journal of Experimental Psychology: General

Xuechunzi Bai · Thomas L. Griffiths

Traditional explanations for stereotypes assume that they result from deficits in humans (ingroup-favoring motives, cognitive biases) or their environments (majority advantages, real group differences). An alternative explanation recently proposed that stereotypes can emerge when exploration is costly. Even optimal decision makers in an ideal environment can inadvertently form incorrect impressions from arbitrary encounters. However, all these existing theories essentially describe shortcuts that fail to explain the multidimensionality of stereotypes. Stereotypes of social groups have a canonical multidimensional structure, organized along dimensions of warmth and competence. We show that these dimensions and the associated stereotypes can result from feature-based exploration: When individuals make self-interested decisions based on past experiences in an environment where exploring new options carries an implicit cost and when these options share similar attributes, they are more likely to separate groups along multiple dimensions. We formalize this theory via the contextual multiarmed bandit problem, use the resulting model to generate testable predictions, and evaluate those predictions against human behavior. We evaluate this process in incentivized decisions involving as many as 20 real jobs and successfully recover the classic dimensions of warmth and competence. Further experiments show that intervening on the cost of exploration effectively mitigates bias, further demonstrating that exploration cost per se is the operating variable. Future diversity interventions may consider how to reduce exploration cost, in ways that parallel our manipulations.
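As a rough illustration of the mechanism described above (not the authors' contextual-bandit model), the following Python sketch shows how a greedy, self-interested learner facing an implicit exploration cost can end up with distorted multi-attribute impressions of two groups that are in fact identical; the group labels, attribute values, and noise level are made-up assumptions.

import random

random.seed(0)
GROUPS = ["A", "B"]                                   # hypothetical social groups
TRUE_QUALITY = {"A": (0.6, 0.6), "B": (0.6, 0.6)}     # identical true (warmth, competence)

# The agent's running impressions of each group's two attributes, seeded optimistically.
estimates = {g: [1.0, 1.0] for g in GROUPS}
counts = {g: 1 for g in GROUPS}

def sample_outcome(group):
    """Noisy (warmth, competence) feedback from interacting with one group member."""
    w, c = TRUE_QUALITY[group]
    return w + random.gauss(0, 0.3), c + random.gauss(0, 0.3)

for t in range(200):
    # Greedy, self-interested choice: pick the group whose summed impression is highest.
    # Exploration is implicitly costly: the other group stops being sampled once its
    # impression falls behind, so early arbitrary encounters can freeze in place.
    chosen = max(GROUPS, key=lambda g: sum(estimates[g]))
    w_obs, c_obs = sample_outcome(chosen)
    counts[chosen] += 1
    n = counts[chosen]
    estimates[chosen][0] += (w_obs - estimates[chosen][0]) / n
    estimates[chosen][1] += (c_obs - estimates[chosen][1]) / n

print(counts)     # sampling is typically lopsided: one group is visited far more often
print(estimates)  # the less-sampled group's impression reflects a few arbitrary encounters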


Computation-Limited Bayesian Updating: A Resource-Rational Analysis of Approximate Bayesian Inference

November 2024 · 1 Read

Data and computational capacity are essential resources for any intelligent system that update its beliefs by integrating new information. However, both data and computational resources are inherently limited. Here, we introduce a new resource-rational analysis of belief updating that formalizes these constraints using information-theoretic principles. Our analysis reveals an interaction between data and computational limitations: when computational resources are scarce, agents may struggle to fully incorporate new data. The resource-rational belief updating rule we derive provides a novel explanation for conservative Bayesian updating, where individuals tend to underweight the likelihood of new evidence. Our theory also generates predictions consistent with several process models, particularly those based on approximate Bayesian inference.
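A minimal numerical sketch of one way such conservatism can be expressed, assuming a tempered-likelihood form in which an exponent between 0 and 1 stands in for limited computation; the specific form, prior, and data below are illustrative and not taken from the paper.

import numpy as np

theta = np.linspace(0.01, 0.99, 99)          # hypotheses: bias of a coin
prior = np.ones_like(theta) / theta.size     # uniform prior
data = [1, 1, 0, 1, 1, 1, 0, 1]              # observed flips (1 = heads)

def update(prior, data, beta):
    """Tempered Bayes: posterior proportional to prior * likelihood**beta."""
    log_post = np.log(prior)
    for x in data:
        likelihood = theta if x == 1 else 1 - theta
        log_post += beta * np.log(likelihood)
    post = np.exp(log_post - log_post.max())
    return post / post.sum()

full = update(prior, data, beta=1.0)       # unconstrained Bayesian observer
limited = update(prior, data, beta=0.3)    # computation-limited observer

print("posterior mean, full Bayes:  ", round(float((theta * full).sum()), 3))
print("posterior mean, conservative:", round(float((theta * limited).sum()), 3))
# The computation-limited posterior stays closer to the prior: new evidence is underweighted.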


Inverting Cognitive Models With Neural Networks to Infer Preferences From Fixations

November 2024 · Cognitive Science: A Multidisciplinary Journal

Inferring an individual's preferences from their observable behavior is a key step in the development of assistive decision‐making technology. Although machine learning models such as neural networks could in principle be deployed toward this inference, a large amount of data is required to train such models. Here, we present an approach in which a cognitive model generates simulated data to augment limited human data. Using these data, we train a neural network to invert the model, making it possible to infer preferences from behavior. We show how this approach can be used to infer the value that people assign to food items from their eye movements when choosing between those items. We demonstrate first that neural networks can infer the latent preferences used by the model to generate simulated fixations, and second that simulated data can be beneficial in pretraining a network for predicting human‐reported preferences from real fixations. Compared to inferring preferences from choice alone, this approach confers a slight improvement in predicting preferences and also allows prediction to take place prior to the choice being made. Overall, our results suggest that using a combination of neural networks and model‐simulated training data is a promising approach for developing technology that infers human preferences.
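A toy sketch of the general recipe (simulate behaviour from a cognitive model, then train a network to map behaviour back to latent preferences); the fixation model, network size, and use of scikit-learn are illustrative assumptions rather than the paper's actual setup.

import numpy as np
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(0)

def simulate_fixations(value_left, value_right, n_fix=20):
    """Toy generative model: fixations favour the higher-valued item, with noise."""
    p_left = 1 / (1 + np.exp(-(value_left - value_right)))
    return float((rng.random(n_fix) < p_left).mean())   # share of fixations on the left item

# Simulated training data: latent values -> observable fixation statistic.
values = rng.uniform(-2, 2, size=(5000, 2))
X = np.array([[simulate_fixations(v_l, v_r)] for v_l, v_r in values])
y = values[:, 0] - values[:, 1]                          # latent preference (value difference)

# The network learns to invert the simulator: behaviour in, preference out.
net = MLPRegressor(hidden_layer_sizes=(32, 32), max_iter=500, random_state=0)
net.fit(X, y)

observed = simulate_fixations(value_left=1.2, value_right=0.4)
print("inferred value difference:", round(float(net.predict([[observed]])[0]), 2))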


Figure 9: Human evaluation. Visual analogy results for GPT-4o, Claude Sonnet 3.5, and Gemini Ultra 1.5.
Understanding the Limits of Vision Language Models Through the Lens of the Binding Problem

October 2024 · 2 Reads

Declan Campbell · Sunayana Rane · Tyler Giallanza · [...] · Taylor W. Webb

Recent work has documented striking heterogeneity in the performance of state-of-the-art vision language models (VLMs), including both multimodal language models and text-to-image models. These models are able to describe and generate a diverse array of complex, naturalistic images, yet they exhibit surprising failures on basic multi-object reasoning tasks -- such as counting, localization, and simple forms of visual analogy -- that humans perform with near perfect accuracy. To better understand this puzzling pattern of successes and failures, we turn to theoretical accounts of the binding problem in cognitive science and neuroscience, a fundamental problem that arises when a shared set of representational resources must be used to represent distinct entities (e.g., to represent multiple objects in an image), necessitating the use of serial processing to avoid interference. We find that many of the puzzling failures of state-of-the-art VLMs can be explained as arising due to the binding problem, and that these failure modes are strikingly similar to the limitations exhibited by rapid, feedforward processing in the human brain.


Mind Your Step (by Step): Chain-of-Thought can Reduce Performance on Tasks where Thinking Makes Humans Worse

October 2024 · 28 Reads

Chain-of-thought (CoT) prompting has become a widely used strategy for working with large language and multimodal models. While CoT has been shown to improve performance across many tasks, determining the settings in which it is effective remains an ongoing effort. In particular, it is still an open question in what settings CoT systematically reduces model performance. In this paper, we seek to identify the characteristics of tasks where CoT reduces performance by drawing inspiration from cognitive psychology, looking at cases where (i) verbal thinking or deliberation hurts performance in humans, and (ii) the constraints governing human performance generalize to language models. Three such cases are implicit statistical learning, visual recognition, and classifying with patterns containing exceptions. In extensive experiments across all three settings, we find that a diverse collection of state-of-the-art models exhibit significant drop-offs in performance (e.g., a drop of up to 36.3% in absolute accuracy for OpenAI o1-preview compared to GPT-4o) when using inference-time reasoning compared to zero-shot counterparts. We also identify three tasks that satisfy condition (i) but not (ii), and find that while verbal thinking reduces human performance in these tasks, CoT retains or increases model performance. Overall, our results show that while there is not an exact parallel between the cognitive processes of models and those of humans, considering cases where thinking has negative consequences for human performance can help us identify settings where it negatively impacts models. By connecting the literature on human deliberation with evaluations of CoT, we offer a new tool that can be used in understanding the impact of prompt choices and inference-time reasoning.
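For concreteness, a hypothetical harness of the kind such a comparison requires is sketched below; query_model is a stub standing in for an actual LLM call, and the prompt templates, example item, and scoring rule are assumptions, not the paper's materials.

def query_model(prompt):
    """Stub standing in for a real LLM call; replace with an actual API request."""
    return "B"

ZERO_SHOT = "Answer with the letter of the correct option only.\n{question}"
COT = "Think step by step, then end with the letter of the correct option.\n{question}"

def evaluate(items, template):
    correct = 0
    for question, answer in items:
        response = query_model(template.format(question=question))
        correct += answer in response.strip().splitlines()[-1]   # score only the final line
    return correct / len(items)

items = [("What is 2 + 2? (A) 3 (B) 4", "B")]                     # illustrative item
print("zero-shot accuracy:", evaluate(items, ZERO_SHOT))
print("CoT accuracy:      ", evaluate(items, COT))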


Fig. 2 Performance on Psych-101. a, Pseudo-R² values for different models across experiments. A value of zero corresponds to prediction at chance level, while a value of one corresponds to perfect predictability of human responses. Missing bars indicate performance below chance level. Centaur outperforms both Llama and a collection of domain-specific cognitive models in almost every experiment. Note that this graphic only includes experiments for which we have implemented a domain-specific cognitive model, and that different studies using the same paradigm are merged. A full table for all experiments can be found in the Supplementary Information. b, Model simulations on the two-step task. The plot visualizes probability densities over reward and a parameter indicating how model-based learning was, for both people and simulated runs of Centaur. c, Model simulations on the horizon task. The plot visualizes probability densities over reward and an information bonus parameter for both people and simulated runs of Centaur. d, Model simulations on a grammar judgement task. The plot visualizes probability densities over true and estimated scores (i.e., number of correct responses out of twenty) for both people and simulated runs of Centaur.
Centaur: a foundation model of human cognition

October 2024 · 300 Reads · 1 Citation

Establishing a unified theory of cognition has been a major goal of psychology. While there have been previous attempts to instantiate such theories by building computational models, we currently do not have one model that captures the human mind in its entirety. Here we introduce Centaur, a computational model that can predict and simulate human behavior in any experiment expressible in natural language. We derived Centaur by finetuning a state-of-the-art language model on a novel, large-scale data set called Psych-101. Psych-101 reaches an unprecedented scale, covering trial-by-trial data from over 60,000 participants performing over 10,000,000 choices in 160 experiments. Centaur not only captures the behavior of held-out participants better than existing cognitive models, but also generalizes to new cover stories, structural task modifications, and entirely new domains. Furthermore, we find that the model's internal representations become more aligned with human neural activity after finetuning. Taken together, Centaur is the first real candidate for a unified model of human cognition. We anticipate that it will have a disruptive impact on the cognitive sciences, challenging the existing paradigm for developing computational models.
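To illustrate what "expressible in natural language" means in practice, here is a schematic example of turning trial-by-trial bandit data into text a language model could be finetuned on; the wording and formatting are assumptions for illustration, not necessarily the exact Psych-101 format.

trials = [
    {"choice": "J", "reward": 1},
    {"choice": "J", "reward": 0},
    {"choice": "F", "reward": 1},
]

def serialize(trials):
    """Turn one participant's trial sequence into a natural-language transcript."""
    lines = ["You repeatedly choose between two slot machines, J and F."]
    for t in trials:
        lines.append(f"You press <<{t['choice']}>> and receive {t['reward']} points.")
    return "\n".join(lines)

print(serialize(trials))
# A model finetuned on many such transcripts learns to predict the token between the
# << >> markers, i.e., the participant's next choice.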


Environmental complexity and regularity shape the evolution of cognition

October 2024 · 37 Reads

The environmental complexity hypothesis suggests that cognition evolves to allow animals to negotiate a complex and changing environment. By contrast, signal detection theory suggests cognition exploits environmental regularities by containing biases (e.g. to avoid dangerous predators). Therefore, two significant bodies of theory on cognitive evolution may be in tension: one foregrounds environmental complexity, the other regularity. Difficulty in reconciling these theories stems from their focus on different aspects of cognition. The environmental complexity hypothesis focuses on the reliability of sensors in the origin of cognition, while signal detection theory focuses on decision making in cognitively sophisticated animals. Here, we extend the signal detection model to examine the joint evolution of mechanisms for detecting information (sensory systems) and those that process information to produce behaviour (decision-making systems). We find that the transition to cognition can only occur if processing compensates for unreliable sensors by trading off errors. Further, we provide an explanation for why animals with sophisticated sensory systems nonetheless disregard the reliable information those systems provide, instead maintaining biases for particular behaviours. Our model suggests that there is greater nuance than has been previously appreciated, and that both complexity and regularity can promote cognition.
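The following sketch illustrates the flavour of the argument with a standard signal-detection payoff calculation; the payoff values, predator base rate, and Gaussian signal model are invented for illustration and are not the parameters of the paper's model.

from math import erf, sqrt

def normal_cdf(x, mu, sigma):
    return 0.5 * (1 + erf((x - mu) / (sigma * sqrt(2))))

def expected_payoff(criterion, sensor_sigma, p_predator=0.3,
                    cost_miss=-50.0, cost_false_alarm=-0.5, gain_forage=1.0):
    """Expected payoff of the rule 'flee whenever the sensory sample exceeds criterion'.
    Predator present: signal ~ N(1, sigma); predator absent: signal ~ N(0, sigma)."""
    p_flee_pred = 1 - normal_cdf(criterion, 1.0, sensor_sigma)
    p_flee_safe = 1 - normal_cdf(criterion, 0.0, sensor_sigma)
    return (p_predator * (1 - p_flee_pred) * cost_miss            # missed predator
            + (1 - p_predator) * p_flee_safe * cost_false_alarm   # needless flight
            + (1 - p_predator) * (1 - p_flee_safe) * gain_forage) # safe foraging

# With an unreliable sensor, the best criterion is pushed to an extreme and does no better
# than a blanket bias to always flee; a reliable sensor makes criterion-based decisions pay.
for sigma in (2.0, 0.3):  # unreliable vs. reliable sensor
    payoff, criterion = max((expected_payoff(c / 10, sigma), c / 10) for c in range(-50, 51))
    always_flee = expected_payoff(float("-inf"), sigma)
    print(f"sigma={sigma}: best criterion={criterion:+.1f}, payoff={payoff:.2f}, "
          f"always-flee payoff={always_flee:.2f}")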



Figure 2: Cost and Performance for Phi-2 Across Selected MMLU Domains. Left panel: Input and output token distributions for different methods and domains. Right panel: While CoT improves performance in subjects like mathematics and chemistry, it appears less effective in history and macroeconomics. The latter domains show a greater reduction in output tokens (35% for history and 37% for macroeconomics), compared to the smaller reductions seen in chemistry (20%) and mathematics (12%), where intermediate reasoning chains provide more substantial benefits.
Rational Metareasoning for Large Language Models

October 2024 · 2 Reads

Being prompted to engage in reasoning has emerged as a core technique for using large language models (LLMs), deploying additional inference-time compute to improve task performance. However, as LLMs increase in both size and adoption, inference costs are correspondingly becoming increasingly burdensome. How, then, might we optimize reasoning's cost-performance tradeoff? This work introduces a novel approach based on computational models of metareasoning used in cognitive science, training LLMs to selectively use intermediate reasoning steps only when necessary. We first develop a reward function that incorporates the Value of Computation by penalizing unnecessary reasoning, then use this reward function with Expert Iteration to train the LLM. Compared to few-shot chain-of-thought prompting and STaR, our method significantly reduces inference costs (20–37% fewer tokens generated across three models) while maintaining task performance across diverse datasets.
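As a back-of-the-envelope illustration of a Value-of-Computation-style reward (the exact reward function and cost coefficient in the paper may differ; the numbers below are made up):

def voc_reward(is_correct, num_reasoning_tokens, lambda_cost=0.01):
    """Task reward minus a per-token cost on intermediate reasoning."""
    return (1.0 if is_correct else 0.0) - lambda_cost * num_reasoning_tokens

# Two behaviours on an easy question the model can already answer directly:
direct = voc_reward(is_correct=True, num_reasoning_tokens=0)        # 1.0
with_cot = voc_reward(is_correct=True, num_reasoning_tokens=180)    # -0.8
print(direct, with_cot)
# Trained against such a reward (e.g., via Expert Iteration), the model learns to reserve
# long reasoning chains for questions where they actually change the answer.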


Embers of autoregression show how large language models are shaped by the problem they are trained to solve

October 2024 · 15 Reads · 36 Citations · Proceedings of the National Academy of Sciences

The widespread adoption of large language models (LLMs) makes it important to recognize their strengths and limitations. We argue that to develop a holistic understanding of these systems, we must consider the problem that they were trained to solve: next-word prediction over Internet text. By recognizing the pressures that this task exerts, we can make predictions about the strategies that LLMs will adopt, allowing us to reason about when they will succeed or fail. Using this approach—which we call the teleological approach—we identify three factors that we hypothesize will influence LLM accuracy: the probability of the task to be performed, the probability of the target output, and the probability of the provided input. To test our predictions, we evaluate five LLMs (GPT-3.5, GPT-4, Claude 3, Llama 3, and Gemini 1.0) on 11 tasks, and we find robust evidence that LLMs are influenced by probability in the hypothesized ways. Many of the experiments reveal surprising failure modes. For instance, GPT-4’s accuracy at decoding a simple cipher is 51% when the output is a high-probability sentence but only 13% when it is low-probability, even though this task is a deterministic one for which probability should not matter. These results show that AI practitioners should be careful about using LLMs in low-probability situations. More broadly, we conclude that we should not evaluate LLMs as if they are humans but should instead treat them as a distinct type of system—one that has been shaped by its own particular set of pressures.
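A minimal sketch of how such a probe can be constructed for the cipher task (the example sentences and exact prompt wording are illustrative, not the paper's materials):

import codecs

high_probability = "The quick brown fox jumps over the lazy dog."
low_probability = "Dog lazy the over jumps fox brown quick the."   # same words, scrambled

for target in (high_probability, low_probability):
    encoded = codecs.encode(target, "rot13")
    prompt = f"Decode this rot-13 message: {encoded}"
    # Send `prompt` to the model under test and check whether its output matches `target`;
    # the decoding rule is identical in both cases, only the output probability differs.
    print(prompt)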


Citations (41)


... Being able to identify biases in cases of unreliable annotations is important, and researchers should resist the urge to withhold evaluable results from foundation models even if the data fail to reject a null hypothesis. By performing more rigorous evaluations, researchers could crowdsource measuring model biases and behavior tendencies to help all users be more discerning of speciousness, especially as these models' poor behaviors get harder to detect (Azaria et al., 2024; Hosking et al., 2024; Zhou et al., 2024) and as researchers make bolder claims about their abilities (see Binz et al. 2024, inter alia). ...

Reference:

"All that Glitters": Approaches to Evaluations with Unreliable Model and Human Annotations
Centaur: a foundation model of human cognition

... Another approach to theory development focuses on improving predictive models. Reichman et al. (2024) provide a comprehensive review of advances in this area, examining the relationship between prediction and explanation, the application of active learning, and methods for combining multiple theory-based predictors. In contrast, Glöckner et al. (2024) conduct an empirical test of various predictive models using a data set of over 62,000 risky choices. ...

Machine Learning for Modeling Human Decisions

Decision

... Here, one major question is how much we can use AI to model human thoughts and behaviours. This question has been posed since the advent of machine intelligence, though it has recently gained momentum due to advances in generative AI (Bail (2024); Brinkmann et al. (2023); Collins et al. (2024); Tsvetkova et al. (2024)). While planning in the brain is still an unsolved problem (Mattar and Lengyel (2021)), it is believed that humans are resource-rational entities able to use heuristics for planning. ...

Building machines that learn and think with people
  • Citing Article
  • October 2024

Nature Human Behaviour

... Encoder Models Encoder family models are custom transformer encoders trained on the NCTE classroom transcripts. The five models (un1, un2, un3, gte, and e5) use fixed-parameter pretrained sentence embeddings, differing in these and in training hyperparameters, thereby exploiting LLM sensitivities to pretraining regimes (D'Amour et al., 2020; McCoy et al., 2023). A summary of differences is in Table 7 and more training details can be found in Appendix D. In contrast to the model experiments of Xu et al., who used different combinations of models by item, each encoder model produces labels for all 13 MQI (and 12 CLASS) items. ...

Embers of autoregression show how large language models are shaped by the problem they are trained to solve
  • Citing Article
  • October 2024

Proceedings of the National Academy of Sciences

... Nonetheless, recent diffusion models (Sohl-Dickstein et al., 2015; Song et al., 2020; Ho et al., 2020; Kadkhodaie & Simoncelli, 2020; Song et al., 2020) learn to generate high-quality images (Saharia et al., 2022; Rombach et al., 2022; Chen et al., 2023; 2024a) and even videos (Singer et al., 2022; Ho et al., 2022; Girdhar et al., 2023; Blattmann et al., 2023; OpenAI, 2024) using relatively few samples when compared to the underlying high-dimensional space. This indicates that diffusion models exhibit powerful inductive biases (Wilson & Izmailov, 2020; Goyal & Bengio, 2022; Griffiths et al., 2024) that promote effective generalization. What exactly are these powerful inductive biases? ...

Bayes in the Age of Intelligent Machines
  • Citing Article
  • September 2024

Current Directions in Psychological Science

... An important yet often overlooked aspect of this alignment is the perceptual alignment. Perceptual alignment refers to the agreement between AI assessments and human subjective judgments across different sensory modalities, such as vision, hearing, taste, touch, and smell [1,2,3]. It enables AI to better understand the physical world as humans experience it, ensuring that AI applications are reliable and beneficial in real-world settings. ...

Large language models predict human sensory judgments across six modalities

... Another new context involves interactions between humans and AI. Gopnarayan et al. (2024), Meyer (2024), Herzog and Franklin (2024), and Oktar et al. (2024) explore this domain from complementary perspectives. Gopnarayan et al. (2024) discuss how process data, such as decision time and eye movements, can be integrated into AI models to significantly improve predictions and human-AI interactions. ...

Dimensions of Disagreement: Divergence and Misalignment in Cognitive Science and Artificial Intelligence

Decision

... Our findings are in line with previous work comparing LLM creativity to humans in story generation (Chakrabarty et al., 2023; Tian et al., 2024; Marco et al., 2024a,b) and further indicate that LLMs are less creative not only when compared to professional writers, but also when compared to everyday people. A similar finding has also been reported in past work comparing LLMs to everyday people in creative problem-solving (Tian et al., 2023). ...

MacGyver: Are Large Language Models Creative Problem Solvers?
  • Citing Conference Paper
  • January 2024

... This claim is supported by a large body of experimental work (for a review, see Egner, 2023). It also resonates with theoretical work on amortized inference, which posits that inferring an appropriate task representation online can be computationally costly, and that performance can be improved by re-using the products of these costly inference processes (Dasgupta, Schulz, Goodman, & Gershman, 2018; Dasgupta & Gershman, 2021); in line with this view, it has been shown that EM in particular can support efficient re-use when the same task repeats (Dasgupta et al., 2018; Lu et al., 2023). ...

Reconciling shared versus context-specific information in a neural network model of latent causes

... Also, the relative importance of word length and word frequency in word processing will need reevaluation (Barton et al., 2014;Carter & Luke, 2019;Hauk & Pulvermüller, 2004;Hudson & Bergman, 1985;Juhasz & Rayner, 2003;Kliegl et al., 2004;Koplenig et al., 2022;Kuperman et al., 2024;Meylan & Griffiths, 2024). The dominant view now is that word frequency is more important than word length. ...

Word Forms Reflect Trade-Offs Between Speaker Effort and Robust Listener Recognition
  • Citing Article
  • July 2024

Cognitive Science: A Multidisciplinary Journal