Matthew M Botvinick

Princeton University | PU · Department of Psychology

About

106 Publications
32,096 Reads
31,239 Citations

Publications

Preprint
Full-text available
We propose a novel framework for multitask reinforcement learning based on the minimum description length (MDL) principle. In this approach, which we term MDL-control (MDL-C), the agent learns the common structure among the tasks with which it is faced and then distills it into a simpler representation which facilitates faster convergence and gener...
Preprint
Full-text available
Learned communication between agents is a powerful tool when approaching decision-making problems that are hard to overcome by any single agent in isolation. However, continual coordination and communication learning between machine agents or human-machine partnerships remains a challenging open problem. As a stepping stone toward solving the conti...
Preprint
Full-text available
Artificial intelligence systems increasingly involve continual learning to enable flexibility in general situations that are not encountered during system training. Human interaction with autonomous systems is broadly studied, but research has hitherto under-explored interactions that occur while the system is actively learning, and can noticeably...
Preprint
Full-text available
Evaluating choices in multi-step tasks is thought to involve mentally simulating trajectories. Recent theories propose that the brain simplifies these laborious computations using temporal abstraction: storing actions' consequences, collapsed over multiple timesteps (the Successor Representation; SR). Although predictive neural representations and,...
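The Successor Representation described in this abstract can be sketched directly: under a fixed policy, the SR is the discounted sum of expected future state occupancies, and values then follow from a single matrix product. The chain environment, discount factor, and reward vector below are illustrative assumptions, not taken from the paper.

```python
import numpy as np

gamma = 0.9  # discount factor (illustrative)

# Deterministic 4-state chain under a fixed policy: 0 -> 1 -> 2 -> 3,
# with state 3 absorbing.
n = 4
T = np.zeros((n, n))
for s in range(n - 1):
    T[s, s + 1] = 1.0
T[-1, -1] = 1.0

# Successor representation: M = sum_t gamma^t T^t = (I - gamma*T)^-1,
# i.e. expected discounted future occupancy of each state from each start.
M = np.linalg.inv(np.eye(n) - gamma * T)

# Collapsing consequences over timesteps pays off at evaluation: V = M @ r.
r = np.array([0.0, 0.0, 0.0, 1.0])
V = M @ r
print(V)  # values rise toward the rewarded end of the chain
```

Because the laborious multi-step simulation is folded into `M`, re-evaluating choices after a reward change only requires recomputing `M @ r`, not resimulating trajectories.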
Preprint
The recently proposed Perceiver model obtains good results on several domains (images, audio, multimodal, point clouds) while scaling linearly in compute and memory with the input size. While the Perceiver supports many kinds of inputs, it can only produce very simple outputs such as class scores. Perceiver IO overcomes this limitation without sacr...
Article
Full-text available
Exploration, consolidation and planning depend on the generation of sequential state representations. However, these algorithms require disparate forms of sampling dynamics for optimal performance. We theorize how the brain should adapt internally generated sequences for particular cognitive functions and propose a neural mechanism by which this ma...
Preprint
Full-text available
Humans and animals make predictions about the rewards they expect to receive in different situations. In formal models of behavior, these predictions are known as value representations, and they play two very different roles. Firstly, they drive choice: the expected values of available options are compared to one another, and the best option is se...
Preprint
Full-text available
Theories of dACC function have to contend with an increasingly long and diverse list of signals that have been tied to this region. To account for this apparent heterogeneity, we recently proposed a theory of dACC function that embraces that heterogeneity and offers a unifying function focused on the evaluation, motivation and allocation of cogniti...
Preprint
Full-text available
Hemodynamic activity in dorsal anterior cingulate cortex (dACC) correlates with conflict, suggesting it contributes to conflict processing. This correlation could be explained by multiple neural processes that can be disambiguated by population firing-rate patterns. We used targeted dimensionality reduction to characterize activity of populations...
Article
Full-text available
When making decisions we often face the need to adjudicate between conflicting strategies or courses of action. Our ability to understand the neuronal processes underlying conflict processing is limited on the one hand by the spatiotemporal resolution of functional MRI and, on the other hand, by imperfect cross-species homologies in animal model sy...
Preprint
Owing to their ability to both effectively integrate information over long time horizons and scale to massive amounts of data, self-attention architectures have recently shown breakthrough success in natural language processing (NLP), achieving state-of-the-art results in domains such as language modeling and machine translation. Harnessing the tra...
Preprint
Some of the most successful applications of deep reinforcement learning to challenging domains in discrete and continuous control have used policy gradient methods in the on-policy setting. However, policy gradients can suffer from large variance that may limit performance, and in practice require carefully tuned entropy regularization to prevent p...
Preprint
Humans make decisions and act alongside other humans to pursue both short-term and long-term goals. As a result of ongoing progress in areas such as computing science and automation, humans now also interact with non-human agents of varying complexity as part of their day-to-day activities; substantial work is being done to integrate increasingly i...
Conference Paper
The behavioral dynamics of multi-agent systems have a rich and orderly structure, which can be leveraged to understand these systems, and to improve how artificial agents learn to operate in them. Here we introduce Relational Forward Models (RFM) for multi-agent learning, networks that can learn to make accurate predictions of agents' future behavi...
Preprint
Full-text available
Cognitive models are a fundamental tool in computational neuroscience, embodying in software precise hypotheses about the algorithms by which the brain gives rise to behavior. The development of such models is often largely a hypothesis-first process, drawing on inspiration from the literature and the creativity of the individual researcher to cons...
Article
A longstanding view of the organization of human and animal behavior holds that behavior is hierarchically organized, in other words, directed toward achieving superordinate goals through the achievement of subordinate goals or subgoals. However, most research in neuroscience has focused on tasks without hierarchical structure. In past work, we hav...
Preprint
Full-text available
Recent research has placed episodic reinforcement learning (RL) alongside model-free and model-based RL on the list of processes centrally involved in human reward-based learning. In the present work, we extend the unified account of model-free and model-based RL developed by Wang et al. (2018) to further integrate episodic learning. In this accoun...
Article
Full-text available
In the version of this article initially published, the green label in Fig. 1c read "rightward choices" instead of "leftward choices." The error has been corrected in the HTML and PDF versions of the article.
Article
Full-text available
Decision-making is typically studied as a sequential process from the selection of what to attend (e.g., between possible tasks, stimuli, or stimulus attributes) to which actions to take based on the attended information. However, people often process information across these various levels in parallel. Here we scan participants while they simultan...
Article
Full-text available
In the version of this article initially published, equation (7) read.
Conference Paper
The seemingly infinite diversity of the natural world arises from a relatively small set of coherent rules, such as the laws of physics or chemistry. We conjecture that these rules give rise to regularities that can be discovered through primarily unsupervised experiences and represented as abstract concepts. If such representations are composition...
Article
Psychlab is a simulated psychology laboratory inside the first-person 3D game world of DeepMind Lab (Beattie et al. 2016). Psychlab enables implementations of classical laboratory psychological experiments so that they work with both human and artificial agents. Psychlab has a simple and flexible API that enables users to easily create their own ta...
Article
Full-text available
A cognitive map has long been the dominant metaphor for hippocampal function, embracing the idea that place cells encode a geometric representation of space. However, evidence for predictive coding, reward sensitivity and policy dependence in place cells suggests that the representation is not purely spatial. We approach this puzzle from a reinforc...
Article
Full-text available
Theories of reward learning in neuroscience have focused on two families of algorithms thought to capture deliberative versus habitual choice. ‘Model-based’ algorithms compute the value of candidate actions from scratch, whereas ‘model-free’ algorithms make choice more efficient but less flexible by storing pre-computed action values. We examine an...
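The model-based versus model-free contrast summarized here can be illustrated with a minimal sketch (the two-action task, learning rate, and devaluation step are my own illustrative assumptions, not the paper's): a model-free learner caches action values shaped by past reward, while a model-based learner recomputes values from a one-step model at choice time, so only the latter adjusts immediately when an outcome is devalued.

```python
# Illustrative deterministic task: action "a" leads to state sA, "b" to sB,
# and each outcome state pays a fixed reward.
r = {"sA": 1.0, "sB": 0.0}
next_state = {"a": "sA", "b": "sB"}

# Model-free: cached action values, nudged toward observed reward.
Q = {"a": 0.0, "b": 0.0}
alpha = 0.5
for _ in range(20):
    for a in Q:
        Q[a] += alpha * (r[next_state[a]] - Q[a])

# Model-based: compute the value from the model, from scratch, at choice time.
def mb_value(action):
    return r[next_state[action]]

# Reward devaluation: outcome sA becomes worthless.
r["sA"] = 0.0
print(mb_value("a"), Q["a"])  # model-based value updates at once; the cache is stale
```

The stale cache is the efficiency/flexibility trade-off the abstract describes: model-free lookup is cheap but keeps choosing "a" until new experience overwrites `Q["a"]`.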
Data
Preventing replay slows acquisition for both SR-Dyna and Dyna-Q. Both algorithms under the two sampling settings were simulated on the task displayed in S1 Fig. a) Results of simulations with SR-Dyna. b) Results of simulations with Dyna-Q. Both a) and b) show number of steps on each trial for an agent permitted to replay 20 samples between each decisi...
Data
Robustness of simulation results to varying parameters. Here, we display the results of simulating each task, using each algorithm under a wide variety of parameter settings. Each table below corresponds to a particular algorithm simulating a particular task. For a given parameter setting, the algorithm was simulated 500 times. A check indicates th...
Data
Advantage of TD learning over direct reward learning of weights. a) Task environment. On each trial, the agent was placed in state S. Trials ended when the agent reached state R, which contained a reward value of 10. Unlike the latent learning task in the main text, this task did not contain an exploratory period enabling the agent to learn the suc...
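A generic sketch of the TD idea behind this comparison (the chain task, one-hot features, and parameters are illustrative, not the paper's): rather than regressing weights directly on immediate reward, TD(0) moves the weight vector along the current feature vector by the one-step prediction error, which propagates reward information backward through the state sequence.

```python
import numpy as np

gamma = 0.95
alpha = 0.1

# Illustrative 3-state chain with one-hot features; reward only at the end.
phi = np.eye(3)
rewards = [0.0, 0.0, 1.0]

w = np.zeros(3)  # linear value estimate: V(s) = w @ phi[s]
for _ in range(500):          # repeated traversals of the chain
    for s in range(3):
        v = w @ phi[s]
        v_next = 0.0 if s == 2 else w @ phi[s + 1]  # terminal after state 2
        delta = rewards[s] + gamma * v_next - v     # TD prediction error
        w += alpha * delta * phi[s]
```

A direct reward-learner would leave the weights for the two unrewarded states at zero; TD bootstrapping instead drives them toward the discounted value of the reward ahead.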
Preprint
Humans and animals are capable of evaluating actions by considering their long-run future rewards through a process described using model-based reinforcement learning (RL) algorithms. The mechanisms by which neural circuits perform the computations prescribed by model-based RL remain largely unknown; however, multiple lines of evidence suggest that...
Preprint
Full-text available
Decision-making is typically studied as a sequential process from the selection of what to attend (e.g., between possible tasks, stimuli, or stimulus attributes) to the selection of which actions to take based on the attended information. However, people often gather information across these levels in parallel. For instance, even as they choose the...
Preprint
Full-text available
A cognitive map has long been the dominant metaphor for hippocampal function, embracing the idea that place cells encode a geometric representation of space. However, evidence for predictive coding, reward sensitivity, and policy dependence in place cells suggests that the representation is not purely spatial. We approach this puzzle from a reinfor...
Article
Full-text available
Planning can be defined as action selection that leverages an internal model of the outcomes likely to follow each possible action. Its neural mechanisms remain poorly understood. Here we adapt recent advances from human research for rats, presenting for the first time an animal task that produces many trials of planned behavior per session, making...
Preprint
Full-text available
Planning can be defined as a process of action selection that leverages an internal model of the environment. Such models provide information about the likely outcomes that will follow each selected action, and their use is a key function underlying complex adaptive behavior. However, the neural mechanisms supporting this ability remain poorly unde...
Chapter
This chapter reviews a number of important domains that encompass the intersection between decision making and cognitive control. It explains a decision-making framework that proposes a set of computational and neural mechanisms by which the costs and benefits of control are integrated in order to decide whether and how cognitive control should be...
Article
Full-text available
In spite of its familiar phenomenology, the mechanistic basis for mental effort remains poorly understood. Although most researchers agree that mental effort is aversive and stems from limitations in our capacity to exercise cognitive control, it is unclear what gives rise to those limitations and why they result in an experience of control as cost...
Article
Full-text available
A growing literature suggests that the hippocampus is critical for the rapid extraction of regularities from the environment. Although this fits with the known role of the hippocampus in rapid learning, it seems at odds with the idea that the hippocampus specializes in memorizing individual episodes. In particular, the Complementary Learning System...
Preprint
Full-text available
Recent years have seen a surge of research into the neuroscience of planning. Much of this work has taken advantage of a two-step sequential decision task developed by Daw et al. (2011), which gives the ability to diagnose whether or not subjects’ behavior is the result of planning. Here, we present simulations which suggest that the techniques mos...
Article
Full-text available
Humans and animals are capable of evaluating actions by considering their long-run future rewards through a process described using model-based reinforcement learning (RL) algorithms. The mechanisms by which neural circuits perform the computations prescribed by model-based RL remain largely unknown; however, multiple lines of evidence suggest that...
Article
Debates over the function(s) of dorsal anterior cingulate cortex (dACC) have persisted for decades. So too have demonstrations of the region's association with cognitive control. Researchers have struggled to account for this association and, simultaneously, dACC's involvement in phenomena related to evaluation and motivation. We describe a recent...
Article
Full-text available
Recent research has highlighted a distinction between sequential foraging choices and traditional economic choices between simultaneously presented options. This was partly motivated by observations in Kolling, Behrens, Mars, and Rushworth, Science, 336(6077), 95-98 (2012) (hereafter, KBMR) that these choice types are subserved by different circuit...
Preprint
Full-text available
Recent research has highlighted a distinction between sequential foraging choices and traditional economic choices between simultaneously presented options. This was partly motivated by observations in Kolling et al. (2012) [KBMR] that these choice types are subserved by different circuits, with dorsal anterior cingulate (dACC) preferentially invol...
Preprint
A growing literature suggests that the hippocampus is critical for the rapid extraction of regularities from the environment. Although this fits with the known role of the hippocampus in rapid learning, it seems at odds with the idea that the hippocampus specializes in memorizing individual episodes. In particular, the Complementary Learning System...
Article
Slots in CA1; Supplementary Figure 1. Timecourse of pair structure learning; Supplementary Figure 2. Undeveloped TSP; Supplementary Figure 3. Inhibition and pair learning; Supplementary Figure 4. Inhibition and community structure; Supplementary Table 1. Parameters for layer sizes and inhibition, as implemented in the Emergent simulation environmen...
Article
A growing literature suggests that the hippocampus is important for the rapid extraction of temporal structure in the environment (Bornstein & Daw, 2012; Curran, 1997; Harrison, Duggins, & Friston, 2006; Schapiro, Gregory, Landau, McCloskey, & Turk-Browne, 2014; Schapiro, Kustner, & Turk-Browne, 2012; Strange, Duggins, Penny, Dolan, & Friston, 2005...
Article
Research on the dynamics of reward-based, goal-directed decision making has largely focused on simple choice, where participants decide among a set of unitary, mutually exclusive options. Recent work suggests that the deliberation process underlying simple choice can be understood in terms of evidence integration: Noisy evidence in favor of each op...
Conference Paper
Full-text available
While cognitive control has long been known to adjust flexibly in response to signals like errors or conflict, when and how the decision is made to adjust control remains an open question. Recently, Shenhav and colleagues (1) described a theoretical framework whereby control allocation follows from a reward optimization process, according to which...
Article
Full-text available
Previous theories predict that human dorsal anterior cingulate (dACC) should respond to decision difficulty. An alternative theory has been recently advanced that proposes that dACC evolved to represent the value of 'non-default', foraging behavior, calling into question its role in choice difficulty. However, this new theory does not take into acc...
Article
Cognitive control has long been one of the most active areas of computational modeling work in cognitive science. The focus on computational models as a medium for specifying and developing theory predates the PDP books, and cognitive control was not one of the areas on which they focused. However, the framework they provided has injected work on c...
Article
Full-text available
Many people with schizophrenia exhibit avolition, a difficulty initiating and maintaining goal-directed behavior, considered to be a key negative symptom of the disorder. Recent evidence indicates that patients with higher levels of negative symptoms differ from healthy controls in showing an exaggerated cost of the physical effort needed to obtain...
Article
Full-text available
Recent years have seen a rejuvenation of interest in studies of motivation-cognition interactions arising from many different areas of psychology and neuroscience. The present issue of Cognitive, Affective, & Behavioral Neuroscience provides a sampling of some of the latest research from a number of these different areas. In this introductory artic...
Article
Full-text available
The field of computational reinforcement learning (RL) has proved extremely useful in research on human and animal behavior and brain function. However, the simple forms of RL considered in most empirical research do not scale well, making their relevance to complex, real-world behavior unclear. In computational RL, one strategy for addressing the...
Article
Full-text available
The capacity for self-control is critical to adaptive functioning, yet our knowledge of the underlying processes and mechanisms is presently only inchoate. Theoretical work in economics has suggested a model of self-control centering on two key assumptions: (1) a division within the decision-maker between two 'selves' with differing preferences; (2...
Article
The dorsal anterior cingulate cortex (dACC) has a near-ubiquitous presence in the neuroscience of cognitive control. It has been implicated in a diversity of functions, from reward processing and performance monitoring to the execution of control and action selection. Here, we propose that this diversity can be understood in terms of a single under...
Article
Full-text available
Our experience of the world seems to divide naturally into discrete, temporally extended events, yet the mechanisms underlying the learning and identification of events are poorly understood. Research on event perception has focused on transient elevations in predictive uncertainty or surprise as the primary signal driving event segmentation. We pr...
Article
Full-text available
To support reward-based decision-making, the brain must encode potential outcomes both in terms of their incentive value and their probability of occurrence. Recent research has made it clear that the brain bears multiple representations of reward magnitude, meaning that a single choice option may be represented differently-and even inconsistently-...
Article
Recent work has given rise to the view that reward-based decision making is governed by two key controllers: a habit system, which stores stimulus-response associations shaped by past reward, and a goal-oriented system that selects actions based on their anticipated outcomes. The current literature provides a rich body of computational theory addre...
Article
Human behavior displays hierarchical structure: simple actions cohere into subtask sequences, which work together to accomplish overall task goals. Although the neural substrates of such hierarchy have been the target of increasing research, they remain poorly understood. We propose that the computations supporting hierarchical behavior may relate...
Article
Grinband et al., 2011 compare evidence that they have collected from a neuroimaging study of the Stroop task with a simulation model of performance and conflict in that task, and interpret the results as providing evidence against the theory that activity in dorsal medial frontal cortex (dMFC) reflects monitoring for conflict. Here, we discuss seve...
Article
Intergroup competition makes social identity salient, which in turn affects how people respond to competitors' hardships. The failures of an in-group member are painful, whereas those of a rival out-group member may give pleasure-a feeling that may motivate harming rivals. The present study examined whether valuation-related neural responses to riv...
Article
Full-text available
Behavioral and economic theories have long maintained that actions are chosen so as to minimize demands for exertion or work, a principle sometimes referred to as the law of less work. The data supporting this idea pertain almost entirely to demands for physical effort. However, the same minimization principle has often been assumed also to apply t...
Article
Full-text available
Connectionist and dynamical systems approaches explain human thought, language and behavior in terms of the emergent consequences of a large number of simple noncognitive processes. We view the entities that serve as the basis for structured probabilistic approaches as abstractions that are occasionally useful but often misleading: they have no rea...
Article
Human choice behavior takes account of internal decision costs: people show a tendency to avoid making decisions in ways that are computationally demanding and subjectively effortful. Here, we investigate neural processes underlying the registration of decision costs. We report two functional MRI experiments that implicate lateral prefrontal cortex...
Article
J. S. Bowers, M. F. Damian, and C. J. Davis critiqued the computational model of serial order memory put forth in M. Botvinick and D. C. Plaut, purporting to show that the model does not generalize in a way that people do. They attributed this supposed failure to the model's dependence on context-dependent representations, translating this argument...