R. Thomas McCoy’s research while affiliated with Yale University and other places


Publications (44)


Overview of our approach
a Standard models of learning conflate strength of inductive biases with strength of representational commitments: Bayesian models have strong biases and strong representational commitments, while standard neural networks have weak biases and weak representational commitments. In this work, we create prior-trained neural networks—neural networks that have strong biases yet flexible representations. b The process of inductive bias distillation that we use to give strong inductive biases to neural networks. First, a target inductive bias is instantiated in a Bayesian model which gives a prior over hypotheses. Then, hypotheses are sampled from that prior to create tasks that instantiate the inductive bias in data. Finally, a neural network meta-learns from the data, a step which transfers the inductive bias into the network.
Meta-learning using MAML
We illustrate a model M as a point in parameter space. a A single episode of meta-learning. We sample one language L from a space of synthetic languages that we have defined and then sample two sets of sentences from this language: a training set and a test set. The model begins the episode with parameter values Mt. These parameter values are copied into a temporary model M′, which then learns from the training set for L using standard (non-meta) learning (dotted trajectory). The trained M′ is then evaluated on the test set of language L. Based on the errors that M′ makes on this test set, the parameters of the original model Mt are adjusted (following the solid arrow) to create a new set of parameters Mt+1, such that if Mt+1 were duplicated into a new M′, the new M′ would learn this language more effectively than if it had been copied from Mt. b A complete meta-learning process encompassing 10 episodes. This diagram shows the model starting with random initial parameters M0 and then going through 10 episodes (solid arrows) until arriving at its final parameters M10; note that our experiments actually use 25,000 meta-learning episodes, not 10. If meta-learning has succeeded, these final parameter values should serve as a useful initialization from which the model can easily learn any language in our space of languages. Here, each oval represents the region of parameter space that would lead to effective processing of a single language such as L2. The meta-learned parameter values M10 indeed position the model in a region where it can readily reach any of the languages shown through a small amount of standard learning (dotted arrows).
Assessing the ability of our model to learn formal languages
This plot averages over the 56 formal languages used by Yang and Piantadosi²³; see Supplementary Note 2 for results for individual languages. The Bayesian model results are taken from Yang and Piantadosi. The prior-trained neural network is our model that has undergone inductive bias distillation; the standard neural network has the same architecture but has not undergone distillation. For each neural network condition, the plot shows the mean over 40 re-runs, with the bottom right showing error bars (giving the full range) averaged over the five training set sizes.
Perplexity of neural networks trained on English text
For perplexity, lower is better. a Results for our largest model size (1024 hidden units) trained on the full dataset. The boxplots show summary statistics (center = median, bounds of box = first and third quartiles, whiskers = minimum and maximum) across 40 runs, while the dots show the values for the individual re-runs. The dotted line is the best model from prior literature⁷⁶, which was a Transformer. b Effect of varying model size and amount of training data. Each cell shows the mean perplexity and a 95% confidence interval for a standard model (S) and prior-trained model (P), across 20 runs for each model type in each cell. The shading shows the proportion by which inductive bias distillation changes the perplexity (a negative value—i.e., blue—indicates an improvement).
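For readers unfamiliar with the metric, perplexity here refers to the standard token-level measure: the exponentiated average negative log-probability that the model assigns to the held-out text. The formula below is a sketch of that standard definition, not an equation quoted from the paper.

```latex
% Standard definition: perplexity over a test set of N tokens w_1, ..., w_N,
% where p_theta is the probability the trained model assigns to each token.
\mathrm{PPL} = \exp\!\left( -\frac{1}{N} \sum_{i=1}^{N} \log p_\theta\bigl(w_i \mid w_1, \ldots, w_{i-1}\bigr) \right)
```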
Results on targeted linguistic evaluations
a Accuracy on four minimal pair datasets that each cover a broad range of syntactic phenomena, averaging across 40 re-runs. The p-values are based on two-sided two-sample t-tests; see “Methods” for details. b The extent to which models display priming in sentences that are either short or long and either semantically plausible or semantically implausible. The lower the value on the y-axis is, the more extensively priming has occurred. The boxplots show summary statistics (center = median, bounds of box = first and third quartiles, whiskers = minimum and maximum) across 40 re-runs. c Evaluations of recursion averaged across 40 re-runs; data are presented as mean values. The two model types score similarly when there are few levels of recursion, but at higher levels the prior-trained model often has a higher accuracy than the standard one.


Modeling rapid language learning by distilling Bayesian priors into artificial neural networks
  • Article
  • Full-text available

May 2025 · 24 Reads · 1 Citation

R. Thomas McCoy · Thomas L. Griffiths

Humans can learn languages from remarkably little experience. Developing computational models that explain this ability has been a major challenge in cognitive science. Existing approaches have been successful at explaining how humans generalize rapidly in controlled settings but are usually too restrictive to tractably handle naturalistic data. We show that learning from limited naturalistic data is possible with an approach that bridges the divide between two popular modeling traditions: Bayesian models and neural networks. This approach distills a Bayesian model’s inductive biases—the factors that guide generalization—into a neural network that has flexible representations. Like a Bayesian model, the resulting system can learn formal linguistic patterns from limited data. Like a neural network, it can also learn aspects of English syntax from naturally-occurring sentences. Thus, this model provides a single system that can learn rapidly and can handle naturalistic data.
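To make the meta-learning step concrete, here is a minimal first-order MAML sketch in the spirit of the procedure described above. It assumes a PyTorch model whose forward pass returns a scalar loss and a hypothetical sample_language() helper that draws a training and test set from one language sampled from the Bayesian prior; it is an illustrative sketch, not the authors' released implementation.

```python
# Minimal first-order MAML sketch (illustrative only, not the paper's code).
# Assumes `model` is a torch.nn.Module whose forward pass returns a scalar
# loss for a batch of sentences, and `sample_language()` returns a
# (train_batch, test_batch) pair for one language drawn from the prior.
import copy
import torch

def meta_step(model, meta_optimizer, sample_language,
              inner_lr=0.1, inner_steps=5):
    train_batch, test_batch = sample_language()

    # Inner loop: copy the current parameters M_t into a temporary model M'
    # and let M' learn the sampled language with ordinary gradient descent.
    adapted = copy.deepcopy(model)
    inner_opt = torch.optim.SGD(adapted.parameters(), lr=inner_lr)
    for _ in range(inner_steps):
        inner_opt.zero_grad()
        adapted(train_batch).backward()
        inner_opt.step()

    # Outer loop: evaluate M' on held-out sentences from the same language and
    # use its test-set gradients as the meta-gradient for the original M_t
    # (the first-order approximation to MAML's second-order update).
    adapted.zero_grad()
    test_loss = adapted(test_batch)
    test_loss.backward()
    for p, p_adapted in zip(model.parameters(), adapted.parameters()):
        p.grad = p_adapted.grad.clone()
    meta_optimizer.step()
    return test_loss.item()

# Example outer loop:
# meta_optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
# for episode in range(25_000):
#     meta_step(model, meta_optimizer, sample_language)
```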


Identifying and Mitigating the Influence of the Prior Distribution in Large Language Models

April 2025 · 7 Reads

Large language models (LLMs) sometimes fail to respond appropriately to deterministic tasks -- such as counting or forming acronyms -- because the implicit prior distribution they have learned over sequences of tokens influences their responses. In this work, we show that, in at least some cases, LLMs actually compute the information needed to perform these tasks correctly, and we identify some interventions that can allow them to access this information to improve their performance. First, we show that simply prompting the language model to not rely on its prior knowledge leads to dramatic improvements in prior-dominated tasks. We then use mechanistic interpretability techniques to localize the prior within the LLM and manipulate the extent to which that prior influences its responses. Specifically, we show that it is possible to identify layers of the underlying neural network that correlate with the prior probability of a response and that lightweight finetuning of these layers with basic prompts on prior-dominated tasks achieves high performance on held-out answers. These results suggest that the information required to produce a correct response is contained within the representations of the problems formed by the models. Furthermore, we show that this finetuning is significantly more effective for prior-dominated tasks, and that the error after finetuning is no longer correlated with the prior. Our results suggest that it may be possible to define effective methods for manipulating the extent to which LLMs rely upon their priors in solving problems, potentially increasing their performance in settings where LLMs hallucinate for reasons related to the prior probability of token sequences.
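As a concrete illustration of the layer-targeted finetuning idea described in the abstract, the sketch below freezes every parameter of a Hugging Face causal language model except a small band of transformer layers and then takes ordinary gradient steps on prompt/answer pairs. The model name, layer indices, and training loop are hypothetical placeholders, not the configuration or code used in the paper.

```python
# Hypothetical sketch: finetune only a few transformer layers of a causal LM,
# keeping the rest of the network frozen. Model name and layer indices are
# illustrative placeholders, not the paper's actual configuration.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "gpt2"          # placeholder model
target_layers = {8, 9, 10}   # hypothetical layers that track the prior

tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

# Freeze everything, then unfreeze only parameters inside the targeted layers.
for name, param in model.named_parameters():
    param.requires_grad = any(f".h.{i}." in name for i in target_layers)

optimizer = torch.optim.AdamW(
    [p for p in model.parameters() if p.requires_grad], lr=1e-5)

def finetune_step(prompt, answer):
    """One gradient step on a prompt/answer pair from a prior-dominated task."""
    batch = tokenizer(prompt + answer, return_tensors="pt")
    outputs = model(**batch, labels=batch["input_ids"])  # standard LM loss
    optimizer.zero_grad()
    outputs.loss.backward()
    optimizer.step()
    return outputs.loss.item()
```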


Figure 2: Meta-learning initial weights for generalization. A neural network with a standard initialization (θ₀) typically requires a large amount of training to learn a specific task. Meta-learning optimizes the network's initialization to create a meta-learned initialization θ*₀ from which a range of different tasks can be learned with a small amount of training. In our setting, we use meta-learning not for its typical purpose of enabling the rapid learning of many tasks but rather as a way to encourage abstraction.
Convolutional Neural Networks Can (Meta-)Learn the Same-Different Relation

March 2025 · 13 Reads

While convolutional neural networks (CNNs) have come to match and exceed human performance in many settings, the tasks these models optimize for are largely constrained to the level of individual objects, such as classification and captioning. Humans remain vastly superior to CNNs in visual tasks involving relations, including the ability to identify two objects as `same' or `different'. A number of studies have shown that while CNNs can be coaxed into learning the same-different relation in some settings, they tend to generalize poorly to other instances of this relation. In this work we show that the same CNN architectures that fail to generalize the same-different relation with conventional training are able to succeed when trained via meta-learning, which explicitly encourages abstraction and generalization across tasks.
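To make the meta-learning setup concrete, here is a hedged sketch of how same-different episodes might be constructed: each episode draws its own pool of randomly generated 'objects', and the learner must label paired images as same or different, so the only thing shared across episodes is the relation itself. Image sizes, pair counts, and the random-pattern objects are illustrative assumptions, not the paper's exact stimuli or protocol.

```python
# Hypothetical episode generator for the same-different relation. Each episode
# uses a fresh pool of random binary "objects"; inputs are two objects placed
# side by side and the label is 1 if they are identical, 0 otherwise.
import numpy as np

rng = np.random.default_rng(0)

def make_episode(n_objects=8, obj_size=16, n_support=32, n_query=32):
    objects = rng.integers(0, 2, size=(n_objects, obj_size, obj_size))

    def sample_pairs(n_pairs):
        images, labels = [], []
        for _ in range(n_pairs):
            same = rng.random() < 0.5
            i = int(rng.integers(n_objects))
            j = i if same else int(rng.choice([k for k in range(n_objects) if k != i]))
            images.append(np.concatenate([objects[i], objects[j]], axis=1))
            labels.append(int(same))
        return np.stack(images), np.array(labels)

    # Support pairs adapt the model within an episode; query pairs drive the
    # meta-update, as in the MAML-style loop sketched earlier.
    return sample_pairs(n_support), sample_pairs(n_query)

(support_x, support_y), (query_x, query_y) = make_episode()
```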


Neural Networks Can Capture Human Concept Learning Without Assuming Symbolic Representations

March 2025 · 3 Reads

People can learn new concepts from a small number of examples by drawing on inductive biases that favor some hypotheses over others. These inductive biases have previously been captured by Bayesian models that use symbolic representations such as logical formulas to define prior distributions over hypotheses. But does this imply that people must be using symbolic representations when solving these problems? We show that it is possible to create an artificial neural network that displays the same inductive biases as symbolic Bayesian models without making use of explicit symbolic representations. Our approach is based on distilling a prior distribution from a symbolic Bayesian model via meta-learning, a method for extracting the common structure from a set of tasks. This approach is used to create an artificial neural network with an inductive bias towards concepts expressed as short logical formulas. Analyzing results from previous behavioral experiments in which people learned logical concepts from a few examples shows that neural networks trained via meta-learning are able to capture human performance. These results suggest that while symbolic models are an effective tool for giving an interpretable description of human inductive biases, they do not imply that people must be using symbolic representations in order to make inferences that align with those inductive biases.
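A hedged sketch of the kind of symbolic prior being distilled: sample a short logical formula over binary features and use it to label a small object space, yielding one concept-learning task per sampled formula. The feature count, the restriction to conjunctions, and the length-based weighting are illustrative assumptions, not the paper's actual grammar over formulas.

```python
# Hypothetical generator of concept-learning tasks from a prior that favors
# short logical formulas over binary features (cf. the 16 objects in Figure 1).
import itertools
import random

FEATURES = 4
OBJECTS = list(itertools.product([0, 1], repeat=FEATURES))  # all 16 objects

def sample_literal():
    f = random.randrange(FEATURES)
    positive = random.random() < 0.5
    return lambda obj, f=f, positive=positive: (obj[f] == 1) == positive

def sample_formula(max_literals=3):
    """Conjunction of a few literals; shorter formulas are more probable."""
    weights = [2.0 ** -k for k in range(max_literals)]
    n = random.choices(range(1, max_literals + 1), weights=weights)[0]
    literals = [sample_literal() for _ in range(n)]
    return lambda obj: all(lit(obj) for lit in literals)

def sample_task():
    """One meta-learning task: every object labeled by a sampled concept."""
    concept = sample_formula()
    return [(obj, int(concept(obj))) for obj in OBJECTS]

task = sample_task()   # e.g., [((0, 0, 0, 0), 1), ((0, 0, 0, 1), 0), ...]
```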


Figure 2 Incoherent probability judgments from humans (a, b) and GPT-4 (c, d). Like human probability judgments (a), GPT-4's judgments systematically deviate from zero when combined into probabilistic identities (c). When repeatedly queried about the same event, the mean-variance relationship of probability judgments follows an inverted-U shape for both humans (b) and GPT-4 (d). Human data are adapted from (Zhu et al., 2020), GPT-4 results are from (Zhu and Griffiths, 2024a).
Figure 4 Exploring the sensory representations of large language models with similarity judgments. (a) For musical pitch, both humans and LLMs show a decrease in judged similarity with increases in the interval between tones, but also show an increase at tones an octave apart (a full similarity matrix for GPT-3 is shown inset). As a consequence, both human and LLM similarities are best captured by helical solutions when converted into spatial representations by multidimensional scaling. (b) Two-dimensional multidimensional scaling solutions for vocal consonants and colors for GPT-4 similarity matrices, showing that LLMs can reproduce patterns seen in human representations despite never having had direct experience of sound or color.
Figure 6 Both humans and large language models show reductions in performance when engaging in verbal reasoning (as resulting from chain-of-thought prompting) on these tasks. (a) Implicit statistical learning involves classification of strings generated from artificial grammars. (b) Face recognition involves recognizing faces from a set that shares similar descriptions. (c) Classification of data with exceptions involves learning labels with exceptions.
Using the Tools of Cognitive Science to Understand Large Language Models at Different Levels of Analysis

March 2025 · 47 Reads

Alexander Ku · Declan Campbell · Xuechunzi Bai · [...] · Thomas L. Griffiths

Modern artificial intelligence systems, such as large language models, are increasingly powerful but also increasingly hard to understand. Recognizing this problem as analogous to the historical difficulties in understanding the human mind, we argue that methods developed in cognitive science can be useful for understanding large language models. We propose a framework for applying these methods based on Marr's three levels of analysis. By revisiting established cognitive science techniques relevant to each level and illustrating their potential to yield insights into the behavior and internal organization of large language models, we aim to provide a toolkit for making sense of these new kinds of minds.


Figure 1: Input data for all 16 objects used in concept learning with their bitstring and image representations.
Figure 2: Input data for modular arithmetic for 4 example numbers, with number, image, and bitstring representations.
Hyperparameter search space for each architecture.
Teasing Apart Architecture and Initial Weights as Sources of Inductive Bias in Neural Networks

February 2025 · 11 Reads

Artificial neural networks can acquire many aspects of human knowledge from data, making them promising as models of human learning. But what those networks can learn depends upon their inductive biases -- the factors other than the data that influence the solutions they discover -- and the inductive biases of neural networks remain poorly understood, limiting our ability to draw conclusions about human learning from the performance of these systems. Cognitive scientists and machine learning researchers often focus on the architecture of a neural network as a source of inductive bias. In this paper we explore the impact of another source of inductive bias -- the initial weights of the network -- using meta-learning as a tool for finding initial weights that are adapted for specific problems. We evaluate four widely-used architectures -- MLPs, CNNs, LSTMs, and Transformers -- by meta-training 430 different models across three tasks requiring different biases and forms of generalization. We find that meta-learning can substantially reduce or entirely eliminate performance differences across architectures and data representations, suggesting that these factors may be less important as sources of inductive bias than is typically assumed. When differences are present, architectures and data representations that perform well without meta-learning tend to meta-train more effectively. Moreover, all architectures generalize poorly on problems that are far from their meta-training experience, underscoring the need for stronger inductive biases for robust generalization.



Embers of autoregression show how large language models are shaped by the problem they are trained to solve

October 2024 · 44 Reads · 127 Citations

Proceedings of the National Academy of Sciences

The widespread adoption of large language models (LLMs) makes it important to recognize their strengths and limitations. We argue that to develop a holistic understanding of these systems, we must consider the problem that they were trained to solve: next-word prediction over Internet text. By recognizing the pressures that this task exerts, we can make predictions about the strategies that LLMs will adopt, allowing us to reason about when they will succeed or fail. Using this approach—which we call the teleological approach—we identify three factors that we hypothesize will influence LLM accuracy: the probability of the task to be performed, the probability of the target output, and the probability of the provided input. To test our predictions, we evaluate five LLMs (GPT-3.5, GPT-4, Claude 3, Llama 3, and Gemini 1.0) on 11 tasks, and we find robust evidence that LLMs are influenced by probability in the hypothesized ways. Many of the experiments reveal surprising failure modes. For instance, GPT-4’s accuracy at decoding a simple cipher is 51% when the output is a high-probability sentence but only 13% when it is low-probability, even though this task is a deterministic one for which probability should not matter. These results show that AI practitioners should be careful about using LLMs in low-probability situations. More broadly, we conclude that we should not evaluate LLMs as if they are humans but should instead treat them as a distinct type of system—one that has been shaped by its own particular set of pressures.
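The cipher example illustrates why probability "should not matter" for this task: a simple shift cipher such as rot13 is a deterministic, invertible mapping, so the correct decoding is fully determined by the input regardless of how probable the decoded sentence is. The snippet below makes that point; the example sentences are made up for illustration and are not items from the paper's evaluation set.

```python
# Rot13 decoding is deterministic and self-inverse, so accuracy should not
# depend on how probable the decoded sentence is. Example sentences are
# illustrative only.
import codecs

high_prob = "The quick brown fox jumps over the lazy dog."
low_prob = "Dog lazy the over jumps fox brown quick the."

for sentence in (high_prob, low_prob):
    encoded = codecs.encode(sentence, "rot_13")
    decoded = codecs.decode(encoded, "rot_13")
    assert decoded == sentence   # exact recovery in both cases
    print(encoded, "->", decoded)
```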


When a language model is optimized for reasoning, does it still show embers of autoregression? An analysis of OpenAI o1

October 2024 · 10 Reads · 1 Citation

In "Embers of Autoregression" (McCoy et al., 2023), we showed that several large language models (LLMs) have some important limitations that are attributable to their origins in next-word prediction. Here we investigate whether these issues persist with o1, a new system from OpenAI that differs from previous LLMs in that it is optimized for reasoning. We find that o1 substantially outperforms previous LLMs in many cases, with particularly large improvements on rare variants of common tasks (e.g., forming acronyms from the second letter of each word in a list, rather than the first letter). Despite these quantitative improvements, however, o1 still displays the same qualitative trends that we observed in previous systems. Specifically, o1 - like previous LLMs - is sensitive to the probability of examples and tasks, performing better and requiring fewer "thinking tokens" in high-probability settings than in low-probability ones. These results show that optimizing a language model for reasoning can mitigate but might not fully overcome the language model's probability sensitivity.


Meta-learning as a bridge between neural networks and symbolic Bayesian models

September 2024 · 6 Reads · 1 Citation

Behavioral and Brain Sciences

Meta-learning is even more broadly relevant to the study of inductive biases than Binz et al. suggest: Its implications go beyond the extensions to rational analysis that they discuss. One noteworthy example is that meta-learning can act as a bridge between the vector representations of neural networks and the symbolic hypothesis spaces used in many Bayesian models.


Citations (25)


... One significant advantage of meta-learning is its ability to imbue neural network systems with inductive biases akin to symbolic models [238]. In the language domain, McCoy & Griffiths [239] demonstrated that meta-learning effectively transfers the strong inductive biases of Bayesian models into neural networks. Using a Model-Agnostic Meta-Learning (MAML) algorithm, researchers were able to integrate Bayesian priors (symbolic grammar) into neural networks, endowing them with the inductive biases of symbolic models. ...

Reference:

Teleology-Driven Affective Computing: A Causal Framework for Sustained Well-Being
Modeling rapid language learning by distilling Bayesian priors into artificial neural networks

... A critical condition for the advancement of artificial intelligence (AI) is that AI outputs align with human values [10]. Human feedback, a scarce and subjective resource [11,14,26,44], plays a vital role. LLMs, trained on vast datasets, utilize alignment techniques to generate more human-like and accurate responses [16,30,37]. ...

Deciphering the Factors Influencing the Efficacy of Chain-of-Thought: Probability, Memorization, and Noisy Reasoning
  • Citing Conference Paper
  • January 2024

... We construct a logistic mixed-effects regression predicting whether a given language model correctly assigns the possible sentence a higher probability than the impossible one. As predictors, we include the semantic relatedness of the possible and impossible critical words, the typicality of the possible and impossible critical words, and the frequency of the possible and impossible critical words (a possible confound; see, e.g., McCoy et al., 2024). We also include random intercepts for each language model and sentence context, as well as random uncorrelated slopes of each predictor for each of these. ...

Embers of autoregression show how large language models are shaped by the problem they are trained to solve
  • Citing Article
  • October 2024

Proceedings of the National Academy of Sciences

... This minimizes the risk of catastrophic forgetting and enhances long-term learning capabilities in dynamic environments [237]. One significant advantage of meta-learning is its ability to imbue neural network systems with inductive biases akin to symbolic models [238]. In the language domain, McCoy & Griffiths [239] demonstrated that meta-learning effectively transfers the strong inductive biases of Bayesian models into neural networks. ...

Meta-learning as a bridge between neural networks and symbolic Bayesian models
  • Citing Article
  • September 2024

Behavioral and Brain Sciences

... Bayesian inference is a strong computational-level theory (Griffiths, Chater, et al., 2024; Griffiths, Zhu, et al., 2024; Oaksford & Chater, 2007), appealing for rational solutions to uncertainty. However, human behavior (e.g., choices, response times, confidence judgments) is often measured to test algorithmic-level or process models, making it challenging to test computational-level theories directly. ...

Bayes in the Age of Intelligent Machines
  • Citing Article
  • September 2024

Current Directions in Psychological Science

... This has led to the assumption that the way caregivers speak to children is tailored to their developmental needs and functional for efficient language learning. Motivated by this long-standing assumption, recent computational modeling research has investigated how training on CDL vs. ADL affects syntactic learning and generalization in neural network-based language models (LMs) (Feng et al., 2024; Mueller and Linzen, 2023; Yedetore et al., 2023). Notably, Huebner et al. (2021) showed that BabyBERTa, a masked LM trained on 5M tokens of child-directed speech transcripts, achieves a level of syntactic ability similar to that of a much larger RoBERTa model trained on 30B tokens of ADL (Zhuang et al., 2021). ...

How poor is the stimulus? Evaluating hierarchical generalization in neural networks trained on child-directed speech
  • Citing Conference Paper
  • January 2023

... This allows the models to mimic formal and, to a lesser degree, functional linguistic competence [14] by imitating human language production. Sometimes this is achieved by reproducing chunks of models' training data [15]. In most cases, this involves conforming to very specific goals, such as the instantiation of a goal to "follow the user's instructions helpfully and safely" via instruction tuning and reinforcement learning with human feedback [16]. ...

How Much Do Language Models Copy From Their Training Data? Evaluating Linguistic Novelty in Text Generation Using RAVEN

Transactions of the Association for Computational Linguistics

... McCoy and Griffiths (2023) use meta-learning in the special case of language learning: they use meta-learning techniques to, as they put it, "distill" a Bayesian prior for a particular kind of simplicity into a neural network. They use model-agnostic meta-learning (MAML; Finn et al., 2017) to meta-train LSTMs to learn simple formal languages from small amounts of data in the same schema as Yang and Piantadosi (2022). ...

Modeling rapid language learning by distilling Bayesian priors into artificial neural networks
  • Citing Preprint
  • May 2023

... It is undeniable that language models have the potential to provide novel insights into the information that can be extracted from the statistics of human language. Moreover, language models have been applied to address issues that are central to linguistics and cognitive science: they have been used to empirically assess learnability claims (Lan et al., 2024; Wilcox et al., 2023; Yedetore et al., 2023), to generate hypotheses about how grammatical information might be compactly represented in a(n artificial) neural system (Frank, 2023; Lakretz et al., 2021), and as 'animal models' to test the relationship between computational mechanisms and behavior (McCloskey, 1991; Scholte, 2018). Each of these usages is informed by linguistic theory, in the sense that they target linguistic phenomena that have been discovered and described extensively in linguistics. ...

How poor is the stimulus? Evaluating hierarchical generalization in neural networks trained on child-directed speech
  • Citing Preprint
  • January 2023

... Others, like Fodor (1975), Bloom & Keil (2001) and Fedorenko et al. (2024) argue that the cognitive utility of language may be much smaller. Marcus (2018, 2022) takes an extreme position here; Lake et al. (2017), Quilty-Dunn et al. (2023), and Smolensky et al. (2022) are more cautious defenders of a role for symbolic architecture. For example, Google DeepMind's Go-playing systems like AlphaZero (Silver et al. 2018) rely on a classical stochastic search system to supplement the connectionist network. ...

Neurocompositional computing: From the Central Paradox of Cognition to a new generation of AI systems