Beren Millidge
University of Oxford · Nuffield Department of Clinical Neurosciences

About

64 Publications · 10,011 Reads · 374 Citations (since 2017)

[Chart: citations per year, 2017–2023]

Publications (64)
Chapter
Recent work has uncovered close links between classical reinforcement learning (RL) algorithms, Bayesian filtering, and Active Inference, which let us understand value functions in terms of Bayesian posteriors. An alternative, but less explored, model-free RL algorithm is the successor representation, which expresses the value function in terms of...
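As a concrete illustration of what the successor representation computes, here is a minimal numpy sketch (the policy, sizes, and rewards are hypothetical, not from the chapter): for a fixed policy with state-transition matrix P, the successor matrix M = (I - γP)⁻¹ holds expected discounted future state occupancies, and the value function factorizes as V = M·r.

```python
import numpy as np

# Successor representation (SR) under a fixed policy:
# M[s, s'] = expected discounted number of future visits to s' starting from s.
n_states, gamma = 4, 0.9
P = np.full((n_states, n_states), 1.0 / n_states)  # uniform random-walk policy
r = np.array([0.0, 0.0, 0.0, 1.0])                 # reward only in the last state

M = np.linalg.inv(np.eye(n_states) - gamma * P)    # closed-form SR
V = M @ r                                          # value function from the SR

print(V)  # evaluating a new reward vector r only needs this matrix product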
Chapter
Predictive Coding Networks (PCNs) aim to learn a generative model of the world. Given observations, this generative model can then be inverted to infer the causes of those observations. However, when training PCNs, a noticeable pathology is often observed where inference accuracy peaks and then declines with further training. This cannot be explain...
Chapter
Capsule networks are a neural network architecture specialized for visual scene recognition. Features and pose information are extracted from a scene and then dynamically routed through a hierarchy of vector-valued nodes called ‘capsules’ to create an implicit scene graph, with the ultimate aim of learning vision directly as inverse graphics. Despi...
Conference Paper
Full-text available
Predictive coding networks (PCNs) have an inherent degree of biological plausibility and perform approximate backpropagation of error in supervised settings. It is less clear how predictive coding compares to state-of-the-art architectures, such as VAEs, in unsupervised and probabilistic settings. We propose a generalized PCN that, like its inspir...
Preprint
Full-text available
This white paper lays out a vision of research and development in the field of artificial intelligence for the next decade (and beyond). Its denouement is a cyber-physical ecosystem of natural and synthetic sense-making, in which humans are integral participants: what we call "shared intelligence". This vision is premised on active...
Preprint
Mechanistic interpretability aims to explain what a neural network has learned at a nuts-and-bolts level. What are the fundamental primitives of neural network representations? Previous mechanistic descriptions have used individual neurons or their linear combinations to understand the representations a network has learned. But there are clues that...
Preprint
Neuroscience-inspired models, such as predictive coding, have the potential to play an important role in the future of machine intelligence. However, they are not yet used in industrial applications due to limitations such as their lack of efficiency. In this work, we address this by proposing incremental predictive coding (iPC), a variation of...
Preprint
Full-text available
The computational principles adopted by the hippocampus in associative memory (AM) tasks have been one of the most studied topics in computational and theoretical neuroscience. Classical models of the hippocampal network assume that AM is performed via a form of covariance learning, where associations between memorized items are represented by en...
Preprint
A large amount of recent research has the far-reaching goal of finding training methods for deep neural networks that can serve as alternatives to backpropagation (BP). A prominent example is predictive coding (PC), which is a neuroscience-inspired method that performs inference on hierarchical Gaussian generative models. These methods, however, fa...
Preprint
Full-text available
Capsule networks are a neural network architecture specialized for visual scene recognition. Features and pose information are extracted from a scene and then dynamically routed through a hierarchy of vector-valued nodes called 'capsules' to create an implicit scene graph, with the ultimate aim of learning vision directly as inverse graphics. Despi...
Preprint
Full-text available
Predictive Coding Networks (PCNs) aim to learn a generative model of the world. Given observations, this generative model can then be inverted to infer the causes of those observations. However, when training PCNs, a noticeable pathology is often observed where inference accuracy peaks and then declines with further training. This cannot be explain...
Preprint
Predictive coding (PC) is an influential theory in computational neuroscience, which argues that the cortex forms unsupervised world models by implementing a hierarchical process of prediction error minimization. PC networks (PCNs) are trained in two phases. First, neural activities are updated to optimize the network's response to external stimuli...
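A minimal sketch of the two phases described above, assuming a single latent layer with a Gaussian generative model (sizes, learning rates, and iteration counts are hypothetical):

```python
import numpy as np

rng = np.random.default_rng(0)
W = rng.normal(scale=0.1, size=(8, 4))   # generative weights: latent -> observed
x_obs = rng.normal(size=8)               # observation, clamped during training
x_lat = np.zeros(4)                      # latent activity, inferred in phase 1

f = np.tanh
df = lambda z: 1.0 - np.tanh(z) ** 2

# Phase 1 (inference): relax activities to minimize prediction error.
for _ in range(100):
    err = x_obs - W @ f(x_lat)           # bottom-layer prediction error
    x_lat += 0.1 * (W.T @ err) * df(x_lat)

# Phase 2 (learning): local, Hebbian-like weight update on the residual error.
err = x_obs - W @ f(x_lat)
W += 0.01 * np.outer(err, f(x_lat))
```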
Preprint
Recent work has uncovered close links between classical reinforcement learning algorithms, Bayesian filtering, and Active Inference, which let us understand value functions in terms of Bayesian posteriors. An alternative, but less explored, model-free RL algorithm is the successor representation, which expresses the value function in terms...
Conference Paper
The backpropagation of error algorithm (BP) used to train deep neural networks has been fundamental to the successes of deep learning. However, it requires sequential backwards updates and non-local computations, which make it challenging to parallelize at scale, and it is unlike how learning works in the brain. Neuroscience-inspired learning algorithms...
Article
A large number of neural network models of associative memory have been proposed in the literature. These include the classical Hopfield networks (HNs), sparse distributed memories (SDMs), and more recently the modern continuous Hopfield networks (MCHNs), which possess close links with self-attention in machine learning. In this paper, we propose a...
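For concreteness, the MCHN retrieval dynamics and their self-attention-like form can be sketched with the standard update ξ ← X·softmax(β·Xᵀξ) (dimensions and β are hypothetical; this is the textbook MCHN rule, not necessarily the unified model this paper proposes):

```python
import numpy as np

def softmax(v):
    e = np.exp(v - v.max())
    return e / e.sum()

rng = np.random.default_rng(1)
X = rng.normal(size=(16, 5))                  # 5 stored patterns as columns
query = X[:, 2] + 0.3 * rng.normal(size=16)   # corrupted cue for pattern 2

beta = 4.0                                    # inverse temperature
for _ in range(3):                            # typically converges in one step
    query = X @ softmax(beta * (X.T @ query))

print(np.argmax(X.T @ query))                 # expected: 2 (pattern recovered)
```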
Preprint
Full-text available
Backpropagation (BP) is the most successful and widely used algorithm in deep learning. However, the computations required by BP are challenging to reconcile with known neurobiology. This difficulty has stimulated interest in more biologically plausible alternatives to BP. One such algorithm is the inference learning algorithm (IL). IL has close co...
Preprint
How the brain performs credit assignment is a fundamental unsolved problem in neuroscience. Many 'biologically plausible' algorithms have been proposed, which compute gradients that approximate those computed by backpropagation (BP), and which operate in ways that more closely satisfy the constraints imposed by neural circuitry. Many such algorithm...
Preprint
Full-text available
The aim of this paper is to introduce a field of study that has emerged over the last decade, called Bayesian mechanics. Bayesian mechanics is a probabilistic mechanics, comprising tools that enable us to model systems endowed with a particular partition (i.e., into particles), where the internal states (or the trajectories of internal states) of a...
Preprint
For both humans and machines, the essence of learning is to pinpoint which components in their information processing pipeline are responsible for an error in the output -- a challenge that is known as credit assignment. How the brain solves credit assignment is a key question in neuroscience, and also of significant importance for artificial intelli...
Article
Backpropagation of error (backprop) is a powerful algorithm for training machine learning architectures through end-to-end differentiation. Recently it has been shown that backprop in multilayer perceptrons (MLPs) can be approximated using predictive coding, a biologically plausible process theory of cortical computation that relies solely on local...
Preprint
Full-text available
An influential theory posits that dopaminergic neurons in the mid-brain implement a model-free reinforcement learning algorithm based on temporal difference (TD) learning. A fundamental assumption of this model is that the reward function being optimized is fixed. However, for biological creatures the ‘reward function’ can fluctuate subst...
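For reference, a minimal TD(0) sketch of the fixed-reward model being questioned above; the TD error δ is the reward prediction error that dopaminergic firing is proposed to report (environment details hypothetical):

```python
import numpy as np

n_states, alpha, gamma = 5, 0.1, 0.95
V = np.zeros(n_states)           # value estimates, one per state

def td_update(s, r, s_next):
    """One TD(0) update; returns the reward prediction error delta."""
    delta = r + gamma * V[s_next] - V[s]
    V[s] += alpha * delta
    return delta

td_update(s=0, r=1.0, s_next=1)  # positive delta: better than expected
```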
Preprint
Predictive coding is an influential model of cortical neural activity. It proposes that perceptual beliefs are furnished by sequentially minimising "prediction errors" - the differences between predicted and observed data. Implicit in this proposal is the idea that perception requires multiple cycles of neural activity. This is at odds with evidenc...
Article
Full-text available
The free energy principle (FEP) states that any dynamical system can be interpreted as performing Bayesian inference upon its surrounding environment. Although, in theory, the FEP applies to a wide variety of systems, there has been almost no direct exploration or demonstration of the principle in concrete systems. In this work, we examine in depth...
Preprint
The backpropagation of error algorithm used to train deep neural networks has been fundamental to the successes of deep learning. However, it requires sequential backward updates and non-local computations, which make it challenging to parallelize at scale, and it is unlike how learning works in the brain. Neuroscience-inspired learning algorithms, how...
Preprint
A large number of neural network models of associative memory have been proposed in the literature. These include the classical Hopfield networks (HNs), sparse distributed memories (SDMs), and more recently the modern continuous Hopfield networks (MCHNs), which possess close links with self-attention in machine learning. In this paper, we propose...
Preprint
Training with backpropagation (BP) in standard deep learning consists of two main steps: a forward pass that maps a data point to its prediction, and a backward pass that propagates the error of this prediction back through the network. This process is highly effective when the goal is to minimize a specific objective function. However, it does not...
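A minimal numpy sketch of the two steps just described, for a one-hidden-layer network with squared loss (all sizes and the learning rate are hypothetical):

```python
import numpy as np

rng = np.random.default_rng(0)
W1 = rng.normal(scale=0.1, size=(4, 3))
W2 = rng.normal(scale=0.1, size=(1, 4))
x, y = rng.normal(size=3), np.array([1.0])

# Forward pass: map the data point to a prediction.
h = np.tanh(W1 @ x)
y_hat = W2 @ h

# Backward pass: propagate the prediction error back through the network.
e_out = y_hat - y                      # gradient of 0.5 * ||y_hat - y||^2
e_hid = (W2.T @ e_out) * (1 - h ** 2)  # chain rule through tanh

lr = 0.1
W2 -= lr * np.outer(e_out, h)
W1 -= lr * np.outer(e_hid, x)
```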
Preprint
Full-text available
Active inference is an account of cognition and behavior in complex systems which brings together action, perception, and learning under the theoretical mantle of Bayesian inference. Active inference has seen growing applications in academic research, especially in fields that seek to model human or animal behavior. While in recent years, some of t...
Preprint
Full-text available
Active inference is a mathematical framework which originated in computational neuroscience as a theory of how the brain implements action, perception and learning. Recently, it has been shown to be a promising approach to the problems of state-estimation and control under uncertainty, as well as a foundation for the construction of goal-driven beh...
Article
Full-text available
This paper presents an active inference based simulation study of visual foraging. The goal of the simulation is to show the effect of the acquisition of culturally patterned attention styles on cognitive task performance, under active inference. We show how cultural artefacts like antique vase decorations drive cognitive functions such as percepti...
Preprint
In cognitive science, behaviour is often separated into two types. Reflexive control is habitual and immediate, whereas reflective control is deliberative and time-consuming. We examine the argument that Hierarchical Predictive Coding (HPC) can explain both types of behaviour as a continuum operating across a multi-layered network, removing the need for se...
Preprint
The Free-Energy-Principle (FEP) is an influential and controversial theory which postulates a deep and powerful connection between the stochastic thermodynamics of self-organization and learning through variational inference. Specifically, it claims that any self-organizing system which can be statistically separated from its environment, and which...
Preprint
Full-text available
Predictive coding offers a potentially unifying account of cortical function -- postulating that the core function of the brain is to minimize prediction errors with respect to a generative model of the world. The theory is closely related to the Bayesian brain framework and, over the last two decades, has gained substantial influence in the fields...
Preprint
In this PhD thesis, we explore and apply methods inspired by the free energy principle to two important areas in machine learning and neuroscience. The free energy principle is a general mathematical theory of the necessary information-theoretic behaviours of systems that maintain a separation from their environment. A core postulate of the theory...
Preprint
Full-text available
Intelligent agents must pursue their goals in complex environments with partial information and often limited computational capacity. Reinforcement learning methods have achieved great success by creating agents that optimize engineered reward functions, but which often struggle to learn in sparse-reward environments, generally require many environ...
Preprint
While the utility of well-chosen abstractions for understanding and predicting the behaviour of complex systems is well appreciated, precisely what an abstraction is has so far largely eluded mathematical formalization. In this paper, we aim to set out a mathematical theory of abstraction. We provide a precise characterisation of wha...
Preprint
The Free Energy Principle (FEP) states that any dynamical system can be interpreted as performing Bayesian inference upon its surrounding environment. Although the FEP applies in theory to a wide variety of systems, there has been almost no direct exploration of the principle in concrete systems. In this paper, we examine in depth the assumptions r...
Preprint
Full-text available
The exploration-exploitation trade-off is central to the description of adaptive behaviour in fields ranging from machine learning, to biology, to economics. While many approaches have been taken, one has been to equip agents with, or propose that they possess, an intrinsic 'exploratory drive', which is often implemented in te...
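The abstract is truncated before the specific formulation, but one common way such an intrinsic 'exploratory drive' is implemented is a count-based novelty bonus added to the extrinsic reward; a generic sketch (names and constants hypothetical, not the paper's proposal):

```python
import numpy as np
from collections import defaultdict

counts = defaultdict(int)  # visitation counts per (hashable) state

def shaped_reward(state, extrinsic_reward, beta=0.5):
    """Extrinsic reward plus a novelty bonus that decays with familiarity."""
    counts[state] += 1
    return extrinsic_reward + beta / np.sqrt(counts[state])

shaped_reward(state=(0, 0), extrinsic_reward=0.0)  # high bonus on first visit
```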
Preprint
Full-text available
The Kalman filter is a fundamental filtering algorithm that fuses noisy sensory data, a previous state estimate, and a dynamics model to produce a principled estimate of the current state. It assumes, and is optimal for, linear models and white Gaussian noise. Due to its relative simplicity and general effectiveness, the Kalman filter is widely use...
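A minimal sketch of the predict-update cycle just described, for a hypothetical 1D tracking problem (the matrices are illustrative, not from the paper):

```python
import numpy as np

A = np.array([[1.0, 1.0], [0.0, 1.0]])   # dynamics: position += velocity
H = np.array([[1.0, 0.0]])               # we observe position only
Q = 0.01 * np.eye(2)                     # process noise covariance
R = np.array([[0.5]])                    # observation noise covariance

def kalman_step(x, P, y):
    # Predict: propagate the previous estimate through the dynamics model.
    x_pred, P_pred = A @ x, A @ P @ A.T + Q
    # Update: fuse the prediction with the noisy observation y.
    S = H @ P_pred @ H.T + R             # innovation covariance
    K = P_pred @ H.T @ np.linalg.inv(S)  # Kalman gain
    x_new = x_pred + K @ (y - H @ x_pred)
    P_new = (np.eye(2) - K @ H) @ P_pred
    return x_new, P_new

x, P = np.zeros(2), np.eye(2)            # initial state estimate and covariance
x, P = kalman_step(x, P, y=np.array([1.2]))
```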
Article
Full-text available
The expected free energy (EFE) is a central quantity in the theory of active inference. It is the quantity that all active inference agents are mandated to minimize through action, and its decomposition into extrinsic and intrinsic value terms is key to the balance of exploration and exploitation that active inference agents evince. Despite its imp...
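For reference, one standard way the decomposition mentioned above is written (notation assumed: q denotes the agent's predictive distributions under policy π, and C its prior preferences):

```latex
G(\pi) = \underbrace{-\,\mathbb{E}_{q(o \mid \pi)}\!\left[\ln p(o \mid C)\right]}_{\text{extrinsic (pragmatic) value}}
\;-\; \underbrace{\mathbb{E}_{q(o \mid \pi)}\!\left[D_{\mathrm{KL}}\!\left(q(s \mid o, \pi)\,\middle\|\,q(s \mid \pi)\right)\right]}_{\text{intrinsic (epistemic) value}}
```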
Chapter
In cognitive science, behaviour is often separated into two types. Reflexive control is habitual and immediate, whereas reflective control is deliberative and time-consuming. We examine the argument that Hierarchical Predictive Coding (HPC) can explain both types of behaviour as a continuum operating across a multi-layered network, removing the need for se...
Chapter
Active Inference (AIF) is an emerging framework in the brain sciences which suggests that biological agents act to minimise a variational bound on model evidence. Control-as-Inference (CAI) is a framework within reinforcement learning which casts decision making as a variational inference problem. While these frameworks both consider action selecti...
Preprint
Full-text available
This paper presents an active inference based simulation study of visual foraging and transfer learning. The goal of the simulation is to show the effect of the acquisition of culturally patterned attention styles on cognitive task performance, under active inference. We show how cultural artifacts like antique vase decorations drive cognitive func...
Conference Paper
Full-text available
In this paper, we combine sophisticated and deep-parametric active inference to create an agent whose affective states change as a consequence of its Bayesian beliefs about how possible future outcomes will affect future beliefs. To achieve this, we augment Markov Decision Processes with a Bayes-adaptive deep-temporal tree search that is guided by...
Preprint
The recently proposed Activation Relaxation (AR) algorithm provides a simple and robust approach for approximating the backpropagation of error algorithm using only local learning rules. Unlike competing schemes, it converges to the exact backpropagation gradients, and utilises only a single type of computational unit and a single backwards relaxat...
Preprint
Predictive coding is an influential theory of cortical function which posits that the principal computation the brain performs, which underlies both perception and learning, is the minimization of prediction errors. While motivated by high-level notions of variational inference, detailed neurophysiological models of cortical microcircuits which can...
Preprint
Can the powerful backpropagation of error (backprop) algorithm be formulated in a manner suitable for implementation in neural circuitry? The primary challenge is to ensure that any candidate formulation uses only local information, rather than relying on global (error) signals, as in orthodox backprop. Recently several algor...
Preprint
Full-text available
This paper presents an active inference based simulation study of visual foraging and transfer learning. The goal of the simulation is to show the effect of the acquisition of culturally patterned attention styles on cognitive task performance, under active inference. We show how cultural artifacts like antique vase decorations drive cognitive func...
Preprint
The field of reinforcement learning can be split into model-based and model-free methods. Here, we unify these approaches by casting model-free policy optimisation as amortised variational inference, and model-based planning as iterative variational inference, within a 'control as hybrid inference' (CHI) framework. We present an implementation of C...
Preprint
Active Inference (AIF) is an emerging framework in the brain sciences which suggests that biological agents act to minimise a variational bound on model evidence. Control-as-Inference (CAI) is a framework within reinforcement learning which casts decision making as a variational inference problem. While these frameworks both consider action selecti...
Preprint
There are several ways to categorise reinforcement learning (RL) algorithms, such as either model-based or model-free, policy-based or planning-based, on-policy or off-policy, and online or offline. Broad classification schemes such as these help provide a unified perspective on disparate techniques and can contextualise and guide the development o...
Preprint
Backpropagation of error (backprop) is a powerful algorithm for training machine learning architectures through end-to-end differentiation. However, backprop is often criticised for lacking biological plausibility. Recently, it has been shown that backprop in multilayer-perceptrons (MLPs) can be approximated using predictive coding, a biologically-...
Article
Active Inference is a theory arising from theoretical neuroscience which casts action and planning as Bayesian inference problems to be solved by minimizing a single quantity — the variational free energy. The theory promises a unifying account of action and perception coupled with a biologically plausible process theory. However, despite these pot...
Preprint
This short letter is a response to a recent Forum article in Trends in Cognitive Sciences, by Sun and Firestone, which reprises the so-called 'Dark Room Problem' as a challenge to the explanatory value of predictive processing and free-energy-minimisation frameworks for cognitive science. Among many possible responses to Sun and Firestone, we expla...
Preprint
The Expected Free Energy (EFE) is a central quantity in the theory of active inference. It is the quantity that all active inference agents are mandated to minimize through action, and its decomposition into extrinsic and intrinsic value terms is key to the balance of exploration and exploitation that active inference agents evince. Despite its imp...
Preprint
Full-text available
The central tenet of reinforcement learning (RL) is that agents seek to maximize the sum of cumulative rewards. In contrast, active inference, an emerging framework within cognitive and computational neuroscience, proposes that agents act to maximize the evidence for a biased generative model. Here, we illustrate how ideas from active inference can...
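In symbols, the two objectives being contrasted (notation assumed, with G(π) the expected free energy of the agent's biased generative model):

```latex
\pi^{*}_{\mathrm{RL}} = \arg\max_{\pi}\; \mathbb{E}_{\pi}\Big[\sum_{t=0}^{\infty} \gamma^{t} r_{t}\Big]
\qquad \text{vs.} \qquad
\pi^{*}_{\mathrm{AIF}} = \arg\min_{\pi}\; G(\pi)
```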
Preprint
How can we understand vocal imitation, the rare ability of certain species to copy vocalizations and other environmental sounds? How can computational modelling assist us? Here we describe a step-by-step process of mutually accommodating biological data to an implemented computational model. We begin with observations of harbour seals and with a pu...
Preprint
Full-text available
Active Inference is a theory of action arising from neuroscience which casts action and planning as a Bayesian inference problem to be solved by minimizing a single quantity - the variational free energy. Active Inference promises a unifying account of action and perception coupled with a biologically plausible process theory. Despite these potenti...
Preprint
Fixational eye movements are ubiquitous and have a large impact on visual perception. Although their physical characteristics and, to some extent, neural underpinnings are well documented, their function, with the exception of preventing visual fading, remains poorly understood. In this paper, we propose that the visual system might utilize the rel...
Preprint
Initial and preliminary implementations of predictive processing and active inference models are presented. These include the baseline hierarchical predictive coding models of (Friston 2003, 2005), and dynamical predictive coding models using generalised coordinates (Friston 2008, 2010, Buckley 2017). Additionally, we re-implement and experiment wi...
Preprint
This paper combines the active inference formulation of action (Friston, 2009) with hierarchical predictive coding models (Friston, 2003) to provide a proof-of-concept implementation of an active inference agent able to solve a common reinforcement learning baseline -- the cart-pole environment in OpenAI gym. It demonstrates empirically that predic...
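For context, a minimal interaction loop with the cart-pole baseline referenced above (modern installs ship the maintained gymnasium fork of OpenAI gym; the random action below is a placeholder for the active inference agent, which is not shown):

```python
import gymnasium as gym

env = gym.make("CartPole-v1")
obs, info = env.reset(seed=0)
done = False
while not done:
    action = env.action_space.sample()   # placeholder policy
    obs, reward, terminated, truncated, info = env.step(action)
    done = terminated or truncated
env.close()
```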
Preprint
We propose a novel predictive processing account of bottom-up visual saliency in which salience is simply the low-level prediction error between the sense-data and the predictions produced by the generative models in the brain. We test this with modelling in which we use cross-predicting deep autoencoders to create salience maps in an entirely unsu...
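A sketch of the core idea: salience is the per-pixel squared error between the sense-data and the model's prediction. Any image-to-image predictor can stand in for the paper's cross-predicting autoencoders; the blur 'model' below is purely illustrative:

```python
import numpy as np

def salience_map(image, predict):
    """Per-pixel squared prediction error for any predictor image -> image."""
    return (image - predict(image)) ** 2

rng = np.random.default_rng(0)
img = rng.random((32, 32))
blur = lambda x: 0.25 * (np.roll(x, 1, 0) + np.roll(x, -1, 0)
                         + np.roll(x, 1, 1) + np.roll(x, -1, 1))
s = salience_map(img, blur)   # large where the 'model' predicts poorly
```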
