In this PhD thesis, we explore and apply methods inspired by the free energy principle to two important areas in machine learning and neuroscience. The free energy principle is a general mathematical theory of the necessary information-theoretic behaviours of systems that maintain a separation from their environment. A core postulate of the theory is that complex systems can be seen as performing variational Bayesian inference and minimizing an information-theoretic quantity called the variational free energy. The thesis is structured into three independent sections. Firstly, we focus on predictive coding, a neurobiologically plausible process theory derived from the free energy principle, which argues that the primary function of the brain is to minimize prediction errors. We show how predictive coding can be scaled up and extended to be more biologically plausible, and elucidate its close links with other methods such as Kalman filtering. Secondly, we study active inference, a neurobiologically grounded account of action through variational message passing, and investigate how these methods can be scaled up to match the performance of deep reinforcement learning methods. We additionally provide a detailed mathematical understanding of the nature and origin of the information-theoretic objectives that underlie exploratory behaviour. Finally, we investigate biologically plausible methods of credit assignment in the brain. We first demonstrate a close link between predictive coding and the backpropagation of error algorithm. We go on to propose novel and simpler algorithms which allow for backprop to be implemented in purely local, biologically plausible computations.
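
For orientation, the variational free energy mentioned above is commonly written as below. The notation is an assumption made here for illustration and is not taken from the thesis: $o$ denotes observations, $z$ hidden states, $q(z)$ the approximate posterior and $p(o, z)$ the generative model.

```latex
\begin{aligned}
F[q] &= \mathbb{E}_{q(z)}\big[\ln q(z) - \ln p(o, z)\big] \\
     &= D_{\mathrm{KL}}\big[\,q(z)\,\|\,p(z \mid o)\,\big] - \ln p(o)
\end{aligned}
```

Because the KL divergence is non-negative, $F$ is an upper bound on the surprise $-\ln p(o)$. Minimizing $F$ with respect to $q$ therefore tightens the bound while bringing $q(z)$ closer to the true posterior $p(z \mid o)$.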

[A heavily rewritten version of this paper has been published in BBS in 2021]
Markov blankets have been used to settle disputes central to philosophy of mind and cognition. Their development from a technical concept in Bayesian inference to a central concept within the free-energy principle is analysed. We propose to distinguish between instrumental Pearl blankets and realist Friston blankets. Pearl blankets are substantiated by the empirical literature but can do limited philosophical work. Friston blankets can do philosophical work, but require strong theoretical assumptions. Both are conflated in the current literature on the free-energy principle. Consequently, we propose that distinguishing between an instrumental and a realist research program will help clarify the literature.

The free energy principle (FEP) has seen extensive philosophical engagement, both from a general philosophy of science perspective and from the perspective of philosophies of specific sciences: cognitive science, neuroscience, and biology. The literature on the FEP has attempted to draw out specific philosophical commitments and entailments of the framework. But the most fundamental questions, from the perspective of philosophy of science, remain open: To what discipline(s) does the FEP belong? Does it make falsifiable claims? What sort of scientific object is it? Is it to be taken as a representation of contingent states of affairs in nature? Does it constitute knowledge? What role is it intended to play in relation to empirical research? Does the FEP even properly belong to the domain of science? To the extent that it has engaged with them at all, the extant literature has begged, dodged, dismissed, and skirted around these questions, without ever addressing them head-on. These questions must, I urge, be answered satisfactorily before we can make any headway on the philosophical consequences of the FEP. I take preliminary steps towards answering these questions in this paper, first by examining closely key formal elements of the framework and the implications they hold for its utility, and second, by highlighting potential modes of interpreting the FEP in light of an abundant philosophical literature on scientific modelling.

In this paper, we combine sophisticated and deep-parametric active inference to create an agent whose affective states change as a consequence of its Bayesian beliefs about how possible future outcomes will affect future beliefs. To achieve this, we augment Markov Decision Processes with a Bayes-adaptive deep-temporal tree search that is guided by a free energy functional which recursively scores counterfactual futures. Our model reproduces the common phenomenon of rumination over a situation until unlikely, yet aversive and arousing situations emerge in one's imagination. As a proof of concept, we show how certain hyperparameters give rise to neurocognitive dynamics that characterise imagination-induced anxiety.

An influential body of research in neuroscience and the philosophy of mind asserts that the brain is an organ for prediction error minimization. I clarify how this hypothesis should be understood, and I consider a prominent attempt to justify it, according to which prediction error minimization in the brain is a manifestation of a more fundamental imperative in all self-organizing systems to minimize (variational) free energy. I argue that this justification fails. The sense in which all self-organizing systems can be said to minimize free energy according to the free energy principle is fundamentally different from the alleged sense in which brains minimize prediction error. Thus, even if the free energy principle is true, it provides no support for a theory of the brain as an organ for prediction error minimization – or any other substantive theory of brain function.

Recurrently connected networks of spiking neurons underlie the astounding information processing capabilities of the brain. Yet in spite of extensive research, how they can learn through synaptic plasticity to carry out complex network computations remains unclear. We argue that two pieces of this puzzle were provided by experimental data from neuroscience. A mathematical result tells us how these pieces need to be combined to enable biologically plausible online network learning through gradient descent, in particular deep reinforcement learning. This learning method, called e-prop, approaches the performance of backpropagation through time (BPTT), the best-known method for training recurrent neural networks in machine learning. In addition, it suggests a method for powerful on-chip learning in energy-efficient spike-based hardware for artificial intelligence.

What is Optimal Control Theory?
Dynamic Systems: Evolving over time.
Time: Discrete or continuous.
Optimal way to control a dynamic system.
Prerequisites: Calculus, Vectors and Matrices, ODE and PDE.
Applications: Production, Finance, Economics, Marketing and others.

The segregation of neural processing into distinct streams has been interpreted by some as evidence in favour of a modular view of brain function. This implies a set of specialised ‘modules’, each of which performs a specific kind of computation in isolation of other brain systems, before sharing the result of this operation with other modules. In light of a modern understanding of stochastic non-equilibrium systems, like the brain, a simpler and more parsimonious explanation presents itself. Formulating the evolution of a non-equilibrium steady state system in terms of its density dynamics reveals that such systems appear on average to perform a gradient ascent on their steady state density. If this steady state implies a sufficiently sparse conditional independency structure, this endorses a mean-field dynamical formulation. This decomposes the density over all states in a system into the product of marginal probabilities for those states. This factorisation lends the system a modular appearance, in the sense that we can interpret the dynamics of each factor independently. However, the argument here is that it is factorisation, as opposed to modularisation, that gives rise to the functional anatomy of the brain or, indeed, any sentient system. In the following, we briefly overview mean-field theory and its applications to stochastic dynamical systems. We then unpack the consequences of this factorisation through simple numerical simulations and highlight the implications for neuronal message passing and the computational architecture of sentience.
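
The mean-field factorisation described above can be written compactly as follows. The symbols are assumptions introduced here for illustration: $x_1, \dots, x_n$ are the states of the system and $q_i$ is the marginal density assigned to the $i$-th state (or group of states).

```latex
p(x_1, \dots, x_n) \;\approx\; \prod_{i=1}^{n} q_i(x_i)
```

Each factor $q_i$ can then be analysed on its own, which is the sense in which the factorised system takes on a modular appearance.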

A growing body of work underlines striking similarities between biological neural networks and recurrent, binary neural networks. A relatively smaller body of work, however, addresses the similarities between learning dynamics employed in deep artificial neural networks and synaptic plasticity in spiking neural networks. This gap is largely caused by the discrepancy between the dynamical properties of synaptic plasticity and the requirements for gradient backpropagation. Learning algorithms that approximate gradient backpropagation using local error functions can overcome this challenge. Here, we introduce Deep Continuous Local Learning (DECOLLE), a spiking neural network equipped with local error functions for online learning with no memory overhead for computing gradients. DECOLLE is capable of learning deep spatiotemporal representations from spikes relying solely on local information, making it compatible with neurobiology and neuromorphic hardware. Synaptic plasticity rules are derived systematically from user-defined cost functions and neural dynamics by leveraging existing autodifferentiation methods of machine learning frameworks. We benchmark our approach on the event-based neuromorphic datasets N-MNIST and DvsGesture, on which DECOLLE performs comparably to the state-of-the-art. DECOLLE networks provide continuously learning machines that are relevant to biology and supportive of event-based, low-power computer vision architectures matching the accuracies of conventional computers on tasks where temporal precision and speed are essential.

This essay addresses Cartesian duality and how its implicit dialectic might be repaired using physics and information theory. Our agenda is to describe a key distinction in the physical sciences that may provide a foundation for the distinction between mind and matter, and between sentient and intentional systems. From this perspective, it becomes tenable to talk about the physics of sentience and ‘forces’ that underwrite our beliefs (in the sense of probability distributions represented by our internal states), which may ground our mental states and consciousness. We will refer to this view as Markovian monism, which entails two claims: (1) fundamentally, there is only one type of thing and only one type of irreducible property (hence monism). (2) All systems possessing a Markov blanket have properties that are relevant for understanding the mind and consciousness: if such systems have mental properties, then they have them partly by virtue of possessing a Markov blanket (hence Markovian). Markovian monism rests upon the information geometry of random dynamic systems. In brief, the information geometry induced in any system—whose internal states can be distinguished from external states—must acquire a dual aspect. This dual aspect concerns the (intrinsic) information geometry of the probabilistic evolution of internal states and a separate (extrinsic) information geometry of probabilistic beliefs about external states that are parameterised by internal states. We call these intrinsic (i.e., mechanical, or state-based) and extrinsic (i.e., Markovian, or belief-based) information geometries, respectively. Although these mathematical notions may sound complicated, they are fairly straightforward to handle, and may offer a means through which to frame the origins of consciousness.

For many years, the dominant theoretical framework guiding research into the neural origins of perceptual experience has been provided by hierarchical feedforward models, in which sensory inputs are passed through a series of increasingly complex feature detectors. However, the long-standing orthodoxy of these accounts has recently been challenged by a radically different set of theories that contend that perception arises from a purely inferential process supported by two distinct classes of neurons: those that transmit predictions about sensory states and those that signal sensory information that deviates from those predictions. Although these predictive processing (PP) models have become increasingly influential in cognitive neuroscience, they are also criticized for lacking the empirical support to justify their status. This limited evidence base partly reflects the considerable methodological challenges that are presented when trying to test the unique predictions of these models. However, a confluence of technological and theoretical advances has prompted a recent surge in human and nonhuman neurophysiological research seeking to fill this empirical gap. Here, we will review this new research and evaluate the degree to which its findings support the key claims of PP.

Nonlinear filtering is used in online estimation of a dynamic hidden variable from incoming data and has vast applications in different fields, ranging from engineering and machine learning to economics and the natural sciences. We start our review of the theory on nonlinear filtering from the simplest ‘filtering’ task we can think of, namely static Bayesian inference. From there we continue our journey through discrete-time models, which are usually encountered in machine learning, and generalize to continuous-time filtering theory. The idea of changing the probability measure connects and elucidates several aspects of the theory, such as the parallels between the discrete- and continuous-time problems and between different observation models. Furthermore, it provides insight into the construction of particle filtering algorithms. This tutorial is targeted at scientists and engineers and should serve as an introduction to the main ideas of nonlinear filtering, and as a segue to more advanced and specialized literature.
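
To make the discrete-time filtering task above concrete, here is a minimal sketch of a bootstrap particle filter using only the Python standard library. It is not the tutorial's own algorithm. The scalar autoregressive model, the parameter values and the function names `bootstrap_particle_filter` and `simulate` are all assumptions chosen for illustration.

```python
import math
import random


def bootstrap_particle_filter(observations, n_particles=500, a=0.9,
                              process_std=1.0, obs_std=1.0, seed=0):
    """Return the posterior-mean estimate of the hidden state at each step.

    Hypothetical model (an assumption for illustration):
        x_t = a * x_{t-1} + process noise,   y_t = x_t + observation noise.
    """
    rng = random.Random(seed)
    # Initial particles drawn from a broad Gaussian prior.
    particles = [rng.gauss(0.0, 2.0) for _ in range(n_particles)]
    means = []
    for y in observations:
        # 1. Propagate each particle through the transition model.
        particles = [a * x + rng.gauss(0.0, process_std) for x in particles]
        # 2. Weight by the observation likelihood (unnormalised Gaussian).
        weights = [math.exp(-0.5 * ((y - x) / obs_std) ** 2) for x in particles]
        total = sum(weights)
        if total == 0.0:  # all likelihoods underflowed; fall back to uniform
            weights = [1.0] * n_particles
            total = float(n_particles)
        weights = [w / total for w in weights]
        # 3. Posterior-mean estimate from the weighted sample.
        means.append(sum(w * x for w, x in zip(weights, particles)))
        # 4. Multinomial resampling to counter weight degeneracy.
        particles = rng.choices(particles, weights=weights, k=n_particles)
    return means


def simulate(n_steps=100, a=0.9, process_std=1.0, obs_std=1.0, seed=1):
    """Generate a hidden trajectory and noisy observations from the same model."""
    rng = random.Random(seed)
    x, states, observations = 0.0, [], []
    for _ in range(n_steps):
        x = a * x + rng.gauss(0.0, process_std)
        states.append(x)
        observations.append(x + rng.gauss(0.0, obs_std))
    return states, observations


states, observations = simulate()
estimates = bootstrap_particle_filter(observations)
rmse = math.sqrt(sum((e - s) ** 2 for e, s in zip(estimates, states)) / len(states))
print(f"filter RMSE: {rmse:.3f}")
```

The model here is linear and Gaussian only to keep the example short. The same propagate, weight and resample loop applies unchanged when the transition or observation model is nonlinear.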

This paper considers the relationship between thermodynamics, information and inference. In particular, it explores the thermodynamic concomitants of belief updating, under a variational (free energy) principle for self-organization. In brief, any (weakly mixing) random dynamical system that possesses a Markov blanket—i.e. a separation of internal and external states—is equipped with an information geometry. This means that internal states parametrize a probability density over external states. Furthermore, at non-equilibrium steady-state, the flow of internal states can be construed as a gradient flow on a quantity known in statistics as Bayesian model evidence. In short, there is a natural Bayesian mechanics for any system that possesses a Markov blanket. Crucially, this means that there is an explicit link between the inference performed by internal states and their energetics—as characterized by their stochastic thermodynamics.
This article is part of the theme issue ‘Harmonizing energy-autonomous computing and intelligence’.

Successful behaviour depends on the right balance between maximising reward and soliciting information about the world. Here, we show how different types of information-gain emerge when casting behaviour as surprise minimisation. We present two distinct mechanisms for goal-directed exploration that express separable profiles of active sampling to reduce uncertainty. ‘Hidden state’ exploration motivates agents to sample unambiguous observations to accurately infer the (hidden) state of the world. Conversely, ‘model parameter’ exploration compels agents to sample outcomes associated with high uncertainty, if they are informative for their representation of the task structure. We illustrate the emergence of these types of information-gain, termed active inference and active learning, and show how these forms of exploration induce distinct patterns of ‘Bayes-optimal’ behaviour. Our findings provide a computational framework for understanding how distinct levels of uncertainty systematically affect the exploration-exploitation trade-off in decision-making.

We show that deep networks can be trained using Hebbian updates yielding similar performance to ordinary back-propagation on challenging image datasets. To overcome the unrealistic symmetry in connections between layers, implicit in back-propagation, the feedback weights are separate from the feedforward weights. The feedback weights are also updated with a local rule, the same as the feedforward weights: a weight is updated solely based on the product of activity of the units it connects. With fixed feedback weights, as proposed in Lillicrap et al. (2016), performance degrades quickly as the depth of the network increases. If the feedforward and feedback weights are initialized with the same values, as proposed in Zipser and Rumelhart (1990), they remain the same throughout training, thus precisely implementing back-propagation. We show that even when the weights are initialized differently and at random, and the algorithm is no longer performing back-propagation, performance is comparable on challenging datasets. We also propose a cost function whose derivative can be represented as a local Hebbian update on the last layer. Convolutional layers are updated with tied weights across space, which is not biologically plausible. We show that similar performance is achieved with untied layers, also known as locally connected layers, corresponding to the connectivity implied by the convolutional layers, but where weights are untied and updated separately. In the linear case we show theoretically that the convergence of the error to zero is accelerated by the update of the feedback weights.

It has long been speculated that the backpropagation-of-error algorithm (backprop) may be a model of how the brain learns. Backpropagation-through-time (BPTT) is the canonical temporal-analogue to backprop used to assign credit in recurrent neural networks in machine learning, but there's even less conviction about whether BPTT has anything to do with the brain. Even in machine learning the use of BPTT in classic neural network architectures has proven insufficient for some challenging temporal credit assignment (TCA) problems that we know the brain is capable of solving. Nonetheless, recent work in machine learning has made progress in solving difficult TCA problems by employing novel memory-based and attention-based architectures and algorithms, some of which are brain inspired. Importantly, these recent machine learning methods have been developed in the context of, and with reference to BPTT, and thus serve to strengthen BPTT's position as a useful normative guide for thinking about temporal credit assignment in artificial and biological systems alike.

In the past few decades, probabilistic interpretations of brain functions have become widespread in cognitive science and neuroscience. In particular, the free energy principle and active inference are increasingly popular theories of cognitive functions that claim to offer a unified understanding of life and cognition within a general mathematical framework derived from information and control theory, and statistical mechanics. However, we argue that if the active inference proposal is to be taken as a general process theory for biological systems, it is necessary to understand how it relates to existing control theoretical approaches routinely used to study and explain biological systems. For example, recently, PID (Proportional-Integral-Derivative) control has been shown to be implemented in simple molecular systems and is becoming a popular mechanistic explanation of behaviours such as chemotaxis in bacteria and amoebae, and robust adaptation in biochemical networks. In this work, we will show how PID controllers can fit a more general theory of life and cognition under the principle of (variational) free energy minimisation when using approximate linear generative models of the world. This more general interpretation also provides a new perspective on traditional problems of PID controllers, such as parameter tuning and the need to balance the performance and robustness of a controller. Specifically, we show how these problems can be understood in terms of the optimisation of the precisions (inverse variances) modulating different prediction errors in the free energy functional.
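
For readers unfamiliar with PID control, the following is a minimal textbook sketch of a discrete-time PID controller. It is not the free-energy formulation developed in the paper. The class name `PIDController`, the gains and the first-order plant are assumptions chosen for illustration.

```python
class PIDController:
    """Textbook discrete-time PID controller (not the paper's free-energy derivation)."""

    def __init__(self, kp, ki, kd, dt):
        self.kp, self.ki, self.kd, self.dt = kp, ki, kd, dt
        self.integral = 0.0
        self.prev_error = None

    def step(self, setpoint, measurement):
        error = setpoint - measurement
        # Integral term: running sum of the error over time.
        self.integral += error * self.dt
        # Derivative term: finite difference of the error (zero on the first call).
        if self.prev_error is None:
            derivative = 0.0
        else:
            derivative = (error - self.prev_error) / self.dt
        self.prev_error = error
        return self.kp * error + self.ki * self.integral + self.kd * derivative


def run(setpoint=1.0, steps=2000, dt=0.01):
    """Drive a hypothetical first-order plant dx/dt = -x + u toward the setpoint."""
    pid = PIDController(kp=2.0, ki=1.0, kd=0.05, dt=dt)
    x = 0.0
    for _ in range(steps):
        u = pid.step(setpoint, x)
        x += dt * (-x + u)  # forward-Euler integration of the plant
    return x


final_state = run()
print(f"state after 20 simulated seconds: {final_state:.4f}")
```

The three gains `kp`, `ki` and `kd` are the quantities that parameter tuning has to choose. The abstract above proposes understanding that tuning problem through the precisions that weight prediction errors, a mapping this sketch does not attempt to reproduce.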

Every day, our brain is facing the challenge of making sense of the rich and dynamical stream of sensory inputs, and combining it with prior knowledge about its environment. Recent psychophysical evidence suggests that it performs its computations according to Bayesian statistics, which is commonly referred to as the ‘Bayesian brain hypothesis’. A similar statistical problem arises for an external observer, who has access to neuronal recordings and wants to infer the underlying stimulus that gave rise to these recordings. Both problems, perception from the view of the brain and decoding from the view of the observer, can be formulated in the context of nonlinear filtering theory, i.e. dynamical Bayesian inference.
In this thesis, we start from a review of filtering theory with continuous-time and point-process observations. The formal solutions of these filtering problems are infinite-dimensional, and thus require a finite-dimensional approximation. One important class of numerical algorithms, so-called particle filtering methods, rely on an approximation of the posterior density in terms of weighted empirical samples. Though asymptotically exact, these methods are known to suffer from the ‘curse of dimensionality’ (COD), i.e. a required number of particles that exponentially increases with problem dimensionality. We investigate the reason for the COD and assign it to the number of (observable) dimensions, which negatively influence the dynamics of particle weights towards weight degeneracy. We therefore propose unweighted particle filtering methods for perception and decoding, which do not suffer from weight degeneracy and thus exhibit a favorable scaling with observable dimensions.
Specifically, for the task of perception we propose the Neural Particle Filter (NPF), which identifies neuronal activities with (equally weighted) samples from the posterior density according to the ‘neural sampling hypothesis’. We show that this filter can be interpreted as the neuronal dynamics of a recurrently connected rate-based neural network receiving feed-forward input from sensory neurons. Further, it captures properties of temporal and multi-sensory integration that are crucial for perception, and it allows for online parameter learning with a maximum likelihood approach.
Our approximate solution to the decoding task relies on similar dynamics for the empirical samples, the spike-based Neural Particle Filter (sNPF), albeit for point-process observations. Further, we employ a comparable maximum likelihood approach to derive learning rules for the parameters of the sNPF, which allows both online and offline unsupervised learning of the model parameters. The favorable scaling of the sNPF with the number of observable dimensions makes the sNPF particularly suited for decoding from large-scale neuronal recordings.
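
The weight degeneracy that this thesis abstract attributes to the number of observable dimensions can be illustrated with a small standard-library sketch. It is a toy importance-sampling experiment, not the thesis's analysis. The setup (standard normal prior, each dimension observed directly with Gaussian noise) and the function name `effective_sample_size` are assumptions for illustration.

```python
import math
import random


def effective_sample_size(n_dims, n_particles=500, obs_std=1.0, seed=0):
    """ESS of importance weights for particles drawn from a standard normal prior.

    Hypothetical setup: each of the n_dims state components is observed
    directly with independent Gaussian noise of standard deviation obs_std.
    """
    rng = random.Random(seed)
    truth = [rng.gauss(0.0, 1.0) for _ in range(n_dims)]
    obs = [x + rng.gauss(0.0, obs_std) for x in truth]
    log_w = []
    for _ in range(n_particles):
        particle = [rng.gauss(0.0, 1.0) for _ in range(n_dims)]
        # Log-likelihood of the observation given this particle.
        log_w.append(-0.5 * sum(((y - x) / obs_std) ** 2
                                for y, x in zip(obs, particle)))
    # Normalise in log space to avoid numerical underflow.
    m = max(log_w)
    w = [math.exp(lw - m) for lw in log_w]
    total = sum(w)
    w = [wi / total for wi in w]
    # ESS = 1 / sum of squared normalised weights; ranges from 1 to n_particles.
    return 1.0 / sum(wi * wi for wi in w)


for d in (1, 5, 20, 50):
    print(d, round(effective_sample_size(d), 1))
```

An effective sample size close to 1 means that a single particle carries almost all of the weight. This is the degenerate regime that unweighted schemes such as those proposed above are designed to avoid.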

Neuronal computations rely upon local interactions across synapses. For a neuronal network to perform inference, it must integrate information from locally computed messages that are propagated among elements of that network. We review the form of two popular (Bayesian) message passing schemes and consider their plausibility as descriptions of inference in biological networks. These are variational message passing and belief propagation – each of which is derived from a free energy functional that relies upon different approximations (mean-field and Bethe respectively). We begin with an overview of these schemes and illustrate the form of the messages required to perform inference using Hidden Markov Models as generative models. Throughout, we use factor graphs to show the form of the generative models and of the messages they entail. We consider how these messages might manifest neuronally and simulate the inferences they perform. While variational message passing offers a simple and neuronally plausible architecture, it falls short of the inferential performance of belief propagation. In contrast, belief propagation allows exact computation of marginal posteriors at the expense of the architectural simplicity of variational message passing. As a compromise between these two extremes, we offer a third approach – marginal message passing – that features a simple architecture, while approximating the performance of belief propagation. Finally, we link formal considerations to accounts of neurological and psychiatric syndromes in terms of aberrant message passing.
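
As a concrete instance of exact marginal computation by message passing on a Hidden Markov Model, the sketch below implements the standard forward-backward recursion, which is what belief propagation reduces to on a chain. It is not the marginal message passing scheme proposed in the paper. The two-state model and all of its probabilities are hypothetical values chosen for illustration.

```python
from itertools import product


def forward_backward(prior, transition, emission, observations):
    """Exact posterior marginals p(z_t | o_1..o_T) for a discrete HMM.

    prior[i]         = p(z_1 = i)
    transition[i][j] = p(z_{t+1} = j | z_t = i)
    emission[i][o]   = p(o_t = o | z_t = i)
    """
    n, T = len(prior), len(observations)
    # Forward messages, normalised at each step for numerical stability.
    alphas = []
    for t, o in enumerate(observations):
        if t == 0:
            a = [prior[i] * emission[i][o] for i in range(n)]
        else:
            a = [emission[j][o] * sum(alphas[-1][i] * transition[i][j] for i in range(n))
                 for j in range(n)]
        s = sum(a)
        alphas.append([x / s for x in a])
    # Backward messages, also normalised.
    betas = [[1.0] * n for _ in range(T)]
    for t in range(T - 2, -1, -1):
        o_next = observations[t + 1]
        b = [sum(transition[i][j] * emission[j][o_next] * betas[t + 1][j] for j in range(n))
             for i in range(n)]
        s = sum(b)
        betas[t] = [x / s for x in b]
    # Combine forward and backward messages into marginals.
    marginals = []
    for a, b in zip(alphas, betas):
        m = [x * y for x, y in zip(a, b)]
        s = sum(m)
        marginals.append([x / s for x in m])
    return marginals


def brute_force_marginals(prior, transition, emission, observations):
    """Reference: enumerate every hidden path (only feasible for tiny models)."""
    n, T = len(prior), len(observations)
    marg = [[0.0] * n for _ in range(T)]
    for path in product(range(n), repeat=T):
        p = prior[path[0]] * emission[path[0]][observations[0]]
        for t in range(1, T):
            p *= transition[path[t - 1]][path[t]] * emission[path[t]][observations[t]]
        for t, z in enumerate(path):
            marg[t][z] += p
    return [[x / sum(row) for x in row] for row in marg]


# Hypothetical two-state model with binary observations.
prior = [0.6, 0.4]
transition = [[0.7, 0.3], [0.2, 0.8]]
emission = [[0.9, 0.1], [0.3, 0.7]]
observations = [0, 1, 1, 0]

marginals = forward_backward(prior, transition, emission, observations)
for t, row in enumerate(marginals, start=1):
    print(t, [round(p, 4) for p in row])
```

The brute-force enumeration is included only as a check that the message passing result is exact. Its cost grows exponentially with sequence length, whereas the forward-backward recursion is linear in it.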

To infer the causes of its sensations, the brain must call on a generative (predictive) model. This necessitates passing local messages between populations of neurons to update beliefs about hidden variables in the world beyond its sensory samples. It also entails inferences about how we will act. Active inference is a principled framework that frames perception and action as approximate Bayesian inference. This has been successful in accounting for a wide range of physiological and behavioral phenomena. Recently, a process theory has emerged that attempts to relate inferences to their neurobiological substrates. In this paper, we review and develop the anatomical aspects of this process theory. We argue that the form of the generative models required for inference constrains the way in which brain regions connect to one another. Specifically, neuronal populations representing beliefs about a variable must receive input from populations representing the Markov blanket of that variable. We illustrate this idea in four different domains: perception, planning, attention, and movement. In doing so, we attempt to show how appealing to generative models enables us to account for anatomical brain architectures. Ultimately, committing to an anatomical theory of inference ensures we can form empirical hypotheses that can be tested using neuroimaging, neuropsychological, and electrophysiological experiments.

When modeling goal-directed behavior in the presence of various sources of uncertainty, planning can be described as an inference process. A solution to the problem of planning as inference was previously proposed in the active inference framework in the form of an approximate inference scheme based on variational free energy. However, this approximate scheme was based on the mean-field approximation, which assumes statistical independence of hidden variables and is known to show overconfidence and may converge to local minima of the free energy. To better capture the spatiotemporal properties of an environment, we reformulated the approximate inference process using the so-called Bethe approximation. Importantly, the Bethe approximation allows for representation of pairwise statistical dependencies. Under these assumptions, the minimizer of the variational free energy corresponds to the belief propagation algorithm, commonly used in machine learning. To illustrate the differences between the mean-field approximation and the Bethe approximation, we have simulated agent behavior in a simple goal-reaching task with different types of uncertainties. Overall, the Bethe agent achieves higher success rates in reaching goal states. We relate the better performance of the Bethe agent to more accurate predictions about the consequences of its own actions. Consequently, active inference based on the Bethe approximation extends the application range of active inference to more complex behavioral tasks.

How do we navigate a deeply structured world? Why are you reading this sentence first - and did you actually look at the fifth word? This review offers some answers by appealing to active inference based on deep temporal models. It builds on previous formulations of active inference to simulate behavioural and electrophysiological responses under hierarchical generative models of state transitions. Inverting these models corresponds to sequential inference, such that the state at any hierarchical level entails a sequence of transitions in the level below. The deep temporal aspect of these models means that evidence is accumulated over nested time scales, enabling inferences about narratives (i.e., temporal scenes). We illustrate this behaviour with Bayesian belief updating - and neuronal process theories - to simulate the epistemic foraging seen in reading. These simulations reproduce perisaccadic delay period activity and local field potentials seen empirically. Finally, we exploit the deep structure of these models to simulate responses to local (e.g., font type) and global (e.g., semantic) violations; reproducing mismatch negativity and P300 responses respectively.

This paper introduces an active inference formulation of planning and navigation. It illustrates how the exploitation–exploration dilemma is dissolved by acting to minimise uncertainty (i.e. expected surprise or free energy). We use simulations of a maze problem to illustrate how agents can solve quite complicated problems using context sensitive prior preferences to form subgoals. Our focus is on how epistemic behaviour—driven by novelty and the imperative to reduce uncertainty about the world—contextualises pragmatic or goal-directed behaviour. Using simulations, we illustrate the underlying process theory with synthetic behavioural and electrophysiological responses during exploration of a maze and subsequent navigation to a target location. An interesting phenomenon that emerged from the simulations was a putative distinction between ‘place cells’—that fire when a subgoal is reached—and ‘path cells’—that fire until a subgoal is reached.

The cerebral cortex predicts visual motion to adapt human behavior to surrounding objects moving in real time. Although the underlying mechanisms are still unknown, predictive coding is one of the leading theories. Predictive coding assumes that the brain's internal models (which are acquired through learning) predict the visual world at all times and that errors between the prediction and the actual sensory input further refine the internal models. In the past year, deep neural networks based on predictive coding were reported for a video prediction machine called PredNet. If the theory substantially reproduces the visual information processing of the cerebral cortex, then PredNet can be expected to represent the human visual perception of motion. In this study, PredNet was trained with natural scene videos of the self-motion of the viewer, and the motion prediction ability of the obtained computer model was verified using unlearned videos. We found that the computer model accurately predicted the magnitude and direction of motion of a rotating propeller in unlearned videos. Surprisingly, it also represented the rotational motion for illusion images that were not moving physically, much like human visual perception. While the trained network accurately reproduced the direction of illusory rotation, it did not detect motion components in negative control pictures wherein people do not perceive illusory motion. This research supports the exciting idea that the mechanism assumed by the predictive coding theory is one basis of motion illusion generation. Using sensory illusions as indicators of human perception, deep neural networks are expected to contribute significantly to the development of brain research.

Given that eye movement control can be framed as an inferential process, how are the requisite forces generated to produce anticipated or desired fixation? Starting from a generative model based on simple Newtonian equations of motion, we derive a variational solution to this problem and illustrate the plausibility of its implementation in the oculomotor brainstem. We show, through simulation, that the Bayesian filtering equations that implement 'planning as inference' can generate both saccadic and smooth pursuit eye movements. Crucially, the associated message passing maps well onto the known connectivity and neuroanatomy of the brainstem - and the changes in these messages over time are strikingly similar to single unit recordings of neurons in the corresponding nuclei. Furthermore, we show that simulated lesions to axonal pathways reproduce eye movement patterns of neurological patients with damage to these tracts.

Visual neglect is a debilitating neuropsychological phenomenon that has many clinical implications and, in cognitive neuroscience, offers an important lesion deficit model. In this article, we describe a computational model of visual neglect based upon active inference. Our objective is to establish a computational and neurophysiological process theory that can be used to disambiguate among the various causes of this important syndrome; namely, a computational neuropsychology of visual neglect. We introduce a Bayes optimal model based upon Markov decision processes that reproduces the visual searches induced by the line cancellation task (used to characterize visual neglect at the bedside). We then consider 3 distinct ways in which the model could be lesioned to reproduce neuropsychological (visual search) deficits. Crucially, these 3 levels of pathology map nicely onto the neuroanatomy of saccadic eye movements and the systems implicated in visual neglect.

Biological systems—like ourselves—are constantly faced with uncertainty. Despite noisy sensory data, and volatile environments, creatures appear to actively maintain their integrity. To account for this remarkable ability to make optimal decisions in the face of a capricious world, we propose a generative model that represents the beliefs an agent might possess about their own uncertainty. By simulating a noisy and volatile environment, we demonstrate how uncertainty influences optimal epistemic (visual) foraging. In our simulations, saccades were deployed less frequently to regions with a lower sensory precision, while a greater volatility led to a shorter inhibition of return. These simulations illustrate a principled explanation for some cardinal aspects of visual foraging—and allow us to propose a correspondence between the representation of uncertainty and ascending neuromodulatory systems, complementing that suggested by Yu & Dayan (Yu & Dayan 2005 Neuron 46, 681–692. (doi:10.1016/j.neuron.2005.04.026)).

A long-standing goal of artificial intelligence is an algorithm that learns, tabula rasa, superhuman proficiency in challenging domains. Recently, AlphaGo became the first program to defeat a world champion in the game of Go. The tree search in AlphaGo evaluated positions and selected moves using deep neural networks. These neural networks were trained by supervised learning from human expert moves, and by reinforcement learning from self-play. Here we introduce an algorithm based solely on reinforcement learning, without human data, guidance or domain knowledge beyond game rules. AlphaGo becomes its own teacher: a neural network is trained to predict AlphaGo's own move selections and also the winner of AlphaGo's games. This neural network improves the strength of the tree search, resulting in higher quality move selection and stronger self-play in the next iteration. Starting tabula rasa, our new program AlphaGo Zero achieved superhuman performance, winning 100-0 against the previously published, champion-defeating AlphaGo.

An innovative theoretical framework for stochastic dynamics based on the decomposition of a stochastic differential equation (SDE) into a dissipative component, a detailed-balance-breaking component, and a dual-role potential landscape has been developed, which has fruitful applications in physics, engineering, chemistry, and biology. It introduces the A-type stochastic interpretation of the SDE beyond the traditional Ito or Stratonovich interpretation or even the α-type interpretation for multidimensional systems. The potential landscape serves as a Hamiltonian-like function in nonequilibrium processes without detailed balance, which extends this important concept from equilibrium statistical physics to the nonequilibrium region. A question on the uniqueness of the SDE decomposition was recently raised. Our review of both the mathematical and physical aspects shows that uniqueness is guaranteed. The demonstration leads to a better understanding of the robustness of the novel framework. In addition, we discuss related issues including the limitations of an approach to obtaining the potential function from a steady-state distribution.

At the inception of human brain mapping, two principles of functional anatomy underwrote most conceptions – and analyses – of distributed brain responses: namely functional segregation and integration. There are currently two main approaches to characterising functional integration. The first is a mechanistic modelling of connectomics in terms of directed effective connectivity that mediates neuronal message passing and dynamics on neuronal circuits. The second phenomenological approach usually characterises undirected functional connectivity (i.e., measurable correlations), in terms of intrinsic brain networks, self-organised criticality, dynamical instability, etc. This paper describes a treatment of effective connectivity that speaks to the emergence of intrinsic brain networks and critical dynamics. It is predicated on the notion of Markov blankets that play a fundamental role in the self-organisation of far from equilibrium systems. Using the apparatus of the renormalisation group, we show that much of the phenomenology found in network neuroscience is an emergent property of a particular partition of neuronal states, over progressively coarser scales. As such, it offers a way of linking dynamics on directed graphs to the phenomenology of intrinsic brain networks.

Active Inference is a theory arising from theoretical neuroscience which casts action and planning as Bayesian inference problems to be solved by minimizing a single quantity — the variational free energy. The theory promises a unifying account of action and perception coupled with a biologically plausible process theory. However, despite these potential advantages, current implementations of Active Inference can only handle small policy and state spaces and typically require the environmental dynamics to be known. In this paper we propose a novel deep Active Inference algorithm that approximates key densities using deep neural networks as flexible function approximators, which enables our approach to scale to significantly larger and more complex tasks than any before attempted in the literature. We demonstrate our method on a suite of OpenAI Gym benchmark tasks and obtain performance comparable with common reinforcement learning baselines. Moreover, our algorithm evokes similarities with maximum-entropy reinforcement learning and the policy gradients algorithm, which reveals interesting connections between the Active Inference framework and reinforcement learning.
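The variational free energy mentioned in this abstract can be illustrated with a minimal sketch, assuming a single discrete hidden state and one fixed observation. The function name, the two-state model, and all numbers below are hypothetical and are not taken from the cited paper. The sketch shows that the free energy F = E_q[log q(s) - log p(o, s)] equals the negative log evidence when q is the exact posterior and is larger for any other q.

```python
import numpy as np

def variational_free_energy(q, prior, likelihood_o):
    """F = E_q[log q(s) - log p(o, s)] for a discrete hidden state s.

    q            : approximate posterior over states, shape (S,), no zeros
    prior        : p(s), shape (S,)
    likelihood_o : p(o | s) evaluated at the observed o, shape (S,)
    """
    joint = prior * likelihood_o  # p(o, s) for the observed o
    return float(np.sum(q * (np.log(q) - np.log(joint))))

# Hypothetical two-state example (illustrative numbers only).
prior = np.array([0.5, 0.5])
likelihood_o = np.array([0.9, 0.2])

evidence = float(np.sum(prior * likelihood_o))     # p(o)
exact_posterior = prior * likelihood_o / evidence  # p(s | o)

# Free energy at the exact posterior and at a deliberately poor guess.
f_exact = variational_free_energy(exact_posterior, prior, likelihood_o)
f_uniform = variational_free_energy(np.array([0.5, 0.5]), prior, likelihood_o)
```

Because F = KL[q(s) || p(s | o)] - log p(o), minimizing F over q both tightens the bound on the log evidence and drives q towards the true posterior.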

In this thesis, we appeal to recent developments in theoretical neurobiology – namely, active inference – to understand the active visual system and its disorders. Chapter 1 reviews the neurobiology of active vision. This introduces some of the key conceptual themes around attention and inference that recur through subsequent chapters. Chapter 2 provides a technical overview of active inference, and its interpretation in terms of message passing between populations of neurons. Chapter 3 applies the material in Chapter 2 to provide a computational characterisation of the oculomotor system. This deals with two key challenges in active vision: deciding where to look, and working out how to look there. The homology between this message passing and the brain networks solving these inference problems provides a basis for in silico lesion experiments, and an account of the aberrant neural computations that give rise to clinical oculomotor signs (including internuclear ophthalmoplegia). Chapter 4 picks up on the role of uncertainty resolution in deciding where to look, and examines the role of beliefs about the quality (or precision) of data in perceptual inference. We illustrate how abnormal prior beliefs influence inferences about uncertainty and give rise to neuromodulatory changes and visual hallucinatory phenomena (of the sort associated with synucleinopathies). We then demonstrate how synthetic pharmacological perturbations that alter these neuromodulatory systems give rise to the oculomotor changes associated with drugs acting upon these systems. Chapter 5 develops a model of visual neglect, using an oculomotor version of a line cancellation task. We then test a prediction of this model using magnetoencephalography and dynamic causal modelling. Chapter 6 concludes by situating the work in this thesis in the context of computational neurology. This illustrates how the variational principles used here to characterise the active visual system may be generalised to other sensorimotor systems and their disorders.

Spiking neural networks (SNNs) are nature's versatile solution to fault-tolerant, energy-efficient signal processing. To translate these benefits into hardware, a growing number of neuromorphic spiking NN processors have attempted to emulate biological NNs. These developments have created an imminent need for methods and tools that enable such systems to solve real-world signal processing problems. Like conventional NNs, SNNs can be trained on real, domain-specific data; however, their training requires the overcoming of a number of challenges linked to their binary and dynamical nature. This article elucidates step-by-step the problems typically encountered when training SNNs and guides the reader through the key concepts of synaptic plasticity and data-driven learning in the spiking setting. Accordingly, it gives an overview of existing approaches and provides an introduction to surrogate gradient (SG) methods, specifically, as a particularly flexible and efficient method to overcome the aforementioned challenges.

To exhibit social intelligence, animals have to recognize whom they are communicating with. One way to make this inference is to select among internal generative models of each conspecific who may be encountered. However, these models also have to be learned via some form of Bayesian belief updating. This induces an interesting problem: When receiving sensory input generated by a particular conspecific, how does an animal know which internal model to update? We consider a theoretical and neurobiologically plausible solution that enables inference and learning of the processes that generate sensory inputs (e.g., listening and understanding) and reproduction of those inputs (e.g., talking or singing), under multiple generative models. This is based on recent advances in theoretical neurobiology—namely, active inference and post hoc (online) Bayesian model selection. In brief, this scheme fits sensory inputs under each generative model. Model parameters are then updated in proportion to the probability that each model could have generated the input (i.e., model evidence). The proposed scheme is demonstrated using a series of (real zebra finch) birdsongs, where each song is generated by several different birds. The scheme is implemented using physiologically plausible models of birdsong production. We show that generalized Bayesian filtering, combined with model selection, leads to successful learning across generative models, each possessing different parameters. These results highlight the utility of having multiple internal models when making inferences in social environments with multiple sources of sensory information.
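The idea of updating each model in proportion to the probability that it generated the input can be sketched with a toy example. This is not the generalized Bayesian filtering scheme of the cited paper: it assumes two hypothetical one-parameter Gaussian models, uses the likelihood at the current parameter as a stand-in for model evidence, and assumes a uniform prior over models. All names and numbers are illustrative.

```python
import numpy as np

def gaussian_log_likelihood(x, mean, var=1.0):
    """Log density of x under a Gaussian with the given mean and variance."""
    return -0.5 * (np.log(2 * np.pi * var) + (x - mean) ** 2 / var)

def model_posterior(log_evidence):
    """Posterior probability of each model under a uniform model prior."""
    logits = np.asarray(log_evidence, dtype=float)
    logits = logits - logits.max()  # subtract the max for numerical stability
    weights = np.exp(logits)
    return weights / weights.sum()

# Two hypothetical internal models, each summarised by a single mean.
means = np.array([0.0, 5.0])
x = 4.5              # a new observation
learning_rate = 0.5  # assumed step size

log_ev = gaussian_log_likelihood(x, means)
weights = model_posterior(log_ev)

# Each model moves towards the observation in proportion to its weight,
# so the model that explains the input best absorbs most of the update.
updated = means + learning_rate * weights * (x - means)
```

In this example the second model explains the observation far better, so its mean moves to roughly 4.75 while the first model is left almost unchanged.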

Since their popularization in the 1990s, Markov chain Monte Carlo (MCMC) methods have revolutionized statistical computing and have had an especially profound impact on the practice of Bayesian statistics. Furthermore, MCMC methods have enabled the development and use of intricate models in an astonishing array of disciplines as diverse as fisheries science and economics. The wide-ranging practical importance of MCMC has sparked an expansive and deep investigation into fundamental Markov chain theory. The Handbook of Markov Chain Monte Carlo provides a reference for the broad audience of developers and users of MCMC methodology interested in keeping up with cutting-edge theory and applications. The first half of the book covers MCMC foundations, methodology, and algorithms. The second half considers the use of MCMC in a variety of practical applications including in educational research, astrophysics, brain imaging, ecology, and sociology. The in-depth introductory section of the book allows graduate students and practicing scientists new to MCMC to become thoroughly acquainted with the basic theory, algorithms, and applications. The book supplies detailed examples and case studies of realistic scientific problems presenting the diversity of methods used by the wide-ranging MCMC community. Those familiar with MCMC methods will find this book a useful refresher of current theory and recent developments.

This article characterizes impulsive behavior using a patch-leaving paradigm and active inference, a framework for describing Bayes optimal behavior. This paradigm comprises different environments (patches) with limited resources that decline over time at different rates. The challenge is to decide when to leave the current patch for another to maximize reward. We chose this task because it offers an operational characterization of impulsive behavior, namely, maximizing proximal reward at the expense of future gain. We use a Markov decision process formulation of active inference to simulate behavioral and electrophysiological responses under different models and prior beliefs. Our main finding is that there are at least three distinct causes of impulsive behavior, which we demonstrate by manipulating three different components of the Markov decision process model. These components comprise (i) the depth of planning, (ii) the capacity to maintain and process information, and (iii) the perceived value of immediate (relative to delayed) rewards. We show how these manipulations change beliefs and subsequent choices through variational message passing. Furthermore, we appeal to the process theories associated with this message passing to simulate neuronal correlates. In future work, we will use this scheme to identify the prior beliefs that underlie different sorts of impulsive behavior, and ask whether different causes of impulsivity can be inferred from the electrophysiological correlates of choice behavior.

This perspective describes predictive processing as a computational framework for understanding cortical function in the context of emerging evidence, with a focus on sensory processing. We discuss how the predictive processing framework may be implemented at the level of cortical circuits and how its implementation could be falsified experimentally. Lastly, we summarize the general implications of predictive processing on cortical function in healthy and diseased states. In this perspective, Keller and Mrsic-Flogel describe the advantages of predictive processing as a computational framework for understanding cortical function in the context of emerging evidence with a focus on sensory processing.

Background:
Artificial intelligence has recently attained humanlike performance in a number of gamelike domains. These advances have been spurred by brain-inspired architectures and algorithms such as hierarchical filtering and reinforcement learning. OpenAI Gym is an open-source platform in which to train, test, and benchmark algorithms; it provides a range of tasks, including those of classic arcade games such as Doom. Here we describe how the platform might be used as a simulation, test, and diagnostic paradigm for psychiatric conditions.
Methods:
To illustrate how active inference models of game play could be used to test mechanistic and algorithmic properties of psychiatric disorders, we provide two exemplar analyses. The first speaks to the impact of aging on cognition, examining game-play behaviors in a model of aging in which we compared age-dependent changes of younger (n = 9, 22 ± 1 years of age) and older (n = 7, 56 ± 5 years of age) adult players. The second is an illustration of a putative feature of anhedonia in which we simulated diminished sensitivity to reward.
Results:
These simulations demonstrate how active inference can be used to test predicted changes in both neurobiology and beliefs in psychiatric cohorts. We show that, as well as behavioral measures, putative neural correlates of active inference can be simulated, and hypothesized (model-based) differences in local field potentials and blood oxygen level-dependent responses can be produced.
Conclusions:
We show that active inference, through epistemic and value-based goals, enables simulated subjects to actively develop detailed representations of gaming environments, and we demonstrate the use of a principled algorithmic and neurobiological framework for testing hypotheses in psychiatric illness.

Modern decision neuroscience offers a powerful and broad account of human behaviour using computational techniques that link psychological and neuroscientific approaches to the ways that individuals can generate near-optimal choices in complex controlled environments. However, until recently, relatively little attention has been paid to the extent to which the structure of experimental environments relates to natural scenarios, and the survival problems that individuals have evolved to solve. This situation not only risks leaving decision-theoretic accounts ungrounded but also makes various aspects of the solutions, such as hard-wired or Pavlovian policies, difficult to interpret in the natural world. Here, we suggest importing concepts, paradigms and approaches from the fields of ethology and behavioural ecology, which concentrate on the contextual and functional correlates of decisions made about foraging and escape and address these lacunae.

Two theoretical ideas have emerged recently with the ambition to provide a unifying functional explanation of neural population coding and dynamics: predictive coding and Bayesian inference. Here, we describe the two theories and their combination into a single framework: Bayesian predictive coding. We clarify how the two theories can be distinguished, despite sharing core computational concepts and addressing an overlapping set of empirical phenomena. We argue that predictive coding is an algorithmic / representational motif that can serve several different computational goals of which Bayesian inference is but one. Conversely, while Bayesian inference can utilize predictive coding, it can also be realized by a variety of other representations. We critically evaluate the experimental evidence supporting Bayesian predictive coding and discuss how to test it more directly.

We propose a new framework for estimating generative models via an adversarial process, in which we simultaneously train two models: a generative model G that captures the data distribution, and a discriminative model D that estimates the probability that a sample came from the training data rather than G. The training procedure for G is to maximize the probability of D making a mistake. This framework corresponds to a minimax two-player game. In the space of arbitrary functions G and D, a unique solution exists, with G recovering the training data distribution and D equal to 1/2 everywhere. In the case where G and D are defined by multilayer perceptrons, the entire system can be trained with backpropagation. There is no need for any Markov chains or unrolled approximate inference networks during either training or generation of samples. Experiments demonstrate the potential of the framework through qualitative and quantitative evaluation of the generated samples.
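The minimax game described in this abstract is usually written as the value function below. This is the standard formulation of the adversarial objective; the noise prior $p_z$ and the generator distribution $p_g$ are notation introduced here and do not appear in the abstract text above.

```latex
\min_G \max_D V(D, G)
  = \mathbb{E}_{x \sim p_{\mathrm{data}}}\big[\log D(x)\big]
  + \mathbb{E}_{z \sim p_z}\big[\log\big(1 - D(G(z))\big)\big]

% For a fixed generator, the optimal discriminator is
D^{*}_{G}(x) = \frac{p_{\mathrm{data}}(x)}{p_{\mathrm{data}}(x) + p_g(x)}
```

When the generator recovers the data distribution, $p_g = p_{\mathrm{data}}$, the optimal discriminator reduces to $D^{*}_{G}(x) = 1/2$ everywhere, which matches the unique solution stated in the abstract.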

We trained a large, deep convolutional neural network to classify the 1.2 million high-resolution images in the ImageNet LSVRC-2010 contest into the 1000 different classes. On the test data, we achieved top-1 and top-5 error rates of 37.5% and 17.0% which is considerably better than the previous state-of-the-art. The neural network, which has 60 million parameters and 650,000 neurons, consists of five convolutional layers, some of which are followed by max-pooling layers, and three fully-connected layers with a final 1000-way softmax. To make training faster, we used non-saturating neurons and a very efficient GPU implementation of the convolution operation. To reduce overfitting in the fully-connected layers we employed a recently-developed regularization method called dropout that proved to be very effective. We also entered a variant of this model in the ILSVRC-2012 competition and achieved a winning top-5 test error rate of 15.3%, compared to 26.2% achieved by the second-best entry.

While great strides have been made in using deep learning algorithms to solve supervised learning tasks, the problem of unsupervised learning - leveraging unlabeled examples to learn about the structure of a domain - remains a difficult unsolved challenge. Here, we explore prediction of future frames in a video sequence as an unsupervised learning rule for learning about the structure of the visual world. We describe a predictive neural network ("PredNet") architecture that is inspired by the concept of "predictive coding" from the neuroscience literature. These networks learn to predict future frames in a video sequence, with each layer in the network making local predictions and only forwarding deviations from those predictions to subsequent network layers. We show that these networks are able to robustly learn to predict the movement of synthetic (rendered) objects, and that in doing so, the networks learn internal representations that are useful for decoding latent object parameters (e.g. pose) that support object recognition with fewer training views. We also show that these networks can scale to complex natural image streams (car-mounted camera videos), capturing key aspects of both egocentric movement and the movement of objects in the visual scene, and generalizing across video datasets. These results suggest that prediction represents a powerful framework for unsupervised learning, allowing for implicit learning of object and scene structure.

We describe an iterative procedure for optimizing policies, with guaranteed monotonic improvement. By making several approximations to the theoretically-justified procedure, we develop a practical algorithm, called Trust Region Policy Optimization (TRPO). This algorithm is similar to natural policy gradient methods and is effective for optimizing large nonlinear policies such as neural networks. Our experiments demonstrate its robust performance on a wide variety of tasks: learning simulated robotic swimming, hopping, and walking gaits; and playing Atari games using images of the screen as input. Despite its approximations that deviate from the theory, TRPO tends to give monotonic improvement, with little tuning of hyperparameters.
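The practical algorithm summarised in this abstract is commonly stated as a constrained optimisation problem. The symbols below are notation assumed here rather than taken from the abstract: $\pi_{\theta_{\mathrm{old}}}$ is the current policy, $A_{\theta_{\mathrm{old}}}$ its advantage function, and $\delta$ the trust-region size.

```latex
\max_{\theta} \;
  \mathbb{E}_{s, a \sim \pi_{\theta_{\mathrm{old}}}}
  \left[ \frac{\pi_{\theta}(a \mid s)}{\pi_{\theta_{\mathrm{old}}}(a \mid s)}
         \, A_{\theta_{\mathrm{old}}}(s, a) \right]
\quad \text{subject to} \quad
  \mathbb{E}_{s \sim \pi_{\theta_{\mathrm{old}}}}
  \left[ D_{\mathrm{KL}}\big( \pi_{\theta_{\mathrm{old}}}(\cdot \mid s)
         \,\big\|\, \pi_{\theta}(\cdot \mid s) \big) \right] \le \delta
```

The constraint keeps each new policy close to the previous one in average KL divergence, which is what allows the surrogate objective to be trusted as a guide to actual improvement.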