West et al. (2020): ABC for Inference of Brain Networks
26/03/2021 rev2_RL 1
Inference of Brain Networks with Approximate Bayesian Computation – assessing face validity with an example application in Parkinsonism

Timothy O. West*1,2,3, Luc Berthouze4,5, Simon F. Farmer6,7, Hayriye Cagnan1,2,3, Vladimir Litvak3

1. Nuffield Department of Clinical Neurosciences, Medical Sciences Division, University of Oxford, Oxford OX3 9DU
2. Medical Research Council Brain Network Dynamics Unit, University of Oxford, Oxford, OX1 3TH, United Kingdom.
3. Wellcome Trust Centre for Human Neuroimaging, UCL Institute of Neurology, Queen Square, London, WC1N 3BG, UK.
4. Centre for Computational Neuroscience and Robotics, University of Sussex, Falmer, UK.
5. UCL Great Ormond Street Institute of Child Health, Guildford St., London, WC1N 1EH, UK.
6. Department of Neurology, National Hospital for Neurology & Neurosurgery, Queen Square, London WC1N 3BG, UK.
7. Department of Clinical and Movement Neurosciences, Institute of Neurology, Queen Square, UCL, London, WC1N 3BG, UK.

*Corresponding Author
Abstract

This paper describes and validates a novel framework using the Approximate Bayesian Computation (ABC) algorithm for parameter estimation and model selection in models of mesoscale brain network activity. We provide a proof-of-principle, first-pass validation of this framework using a set of neural mass models of the cortico-basal ganglia-thalamic circuit, inverted upon spectral features from experimental in vivo recordings. This optimization scheme relaxes an assumption of fixed-form posteriors (i.e. the Laplace approximation) taken in previous approaches to inverse modelling of spectral features. This enables the exploration of model dynamics beyond those approximated from local linearity assumptions, and so allows fitting to explicit, numerical solutions of the underlying non-linear system of equations. In this first paper, we establish a face validation of the optimization procedures in terms of: (i) the ability to approximate posterior densities over parameters that are plausible given the known causes of the data; (ii) the ability of the model comparison procedures to yield posterior model probabilities that can identify the model structure known to generate the data; and (iii) the robustness of these procedures to local minima in the face of different starting conditions. Finally, as an illustrative application, we show (iv) that model comparison can yield plausible conclusions given the known neurobiology of the cortico-basal ganglia-thalamic circuit in Parkinsonism. These results lay the groundwork for future studies utilizing highly nonlinear or brittle models that can explain time-dependent dynamics, such as oscillatory bursts, in terms of the underlying neural circuits.
Keywords

Inverse modelling, networks, brain dynamics, oscillations, circuits, Parkinsonism
1 Introduction

Models of mesoscale brain activity (Vogels et al., 2005; Deco et al., 2015; Breakspear, 2017) provide ways to understand how the interaction between (a) the function of neurons (dictated by their intrinsic biophysical properties) and (b) the structure of the synaptic network that connects them can modulate neural communication. Typically, the integration of activity across spatially distributed networks has been estimated using the tools of functional connectivity (i.e. determining the statistical dependencies between brain activity; Friston, 2011). However, these approaches are descriptive and are unable to explore the causes of correlated neural activity that could be explained through changes in either structure, function, or a combination of both.

By building generative models of neural circuit dynamics and then inverting them from data, it is possible to gain insight into the mechanisms underlying the rich spatiotemporal patterning of brain activity (Horwitz et al., 2000). Inverse modelling not only allows for the prediction of an individual model's parameters, but also the comparison of models, so allowing different hypotheses to be evaluated given some data (Jaqaman and Danuser, 2006). In its simplest form, combining a priori knowledge alongside hand tuning of unknown parameters has led to a number of sophisticated models (Traub et al., 1991; De Schutter and Bower, 1994). This approach can be formalized using algorithmic schemes for parameter estimation (Rowe et al., 2004; Wendling et al., 2009). More recently, Bayesian optimization schemes provide a principled way of including prior knowledge of a system when computing an inverse model (Moran et al., 2009; Hadida et al., 2018; Hashemi et al., 2018). These approaches can estimate the posterior distribution over model parameters (i.e. parameter estimation) as well as over a space of models (i.e. model evidence).
Many approaches make assumptions to render a model amenable to a particular optimization scheme. For instance, in Dynamic Causal Modelling (DCM; Friston et al., 2012), the optimization algorithm (variational Bayes) makes an assumption of fixed-form posteriors (the Laplace approximation). This simplifies the optimization problem by reducing the description of the approximate posterior density to its first two moments, but precludes the examination of highly nonlinear or stochastic models, where this assumption is likely to be violated by the existence of multimodal posteriors (Daunizeau et al., 2009; Sengupta et al., 2016). To help ensure this assumption is met, a model can be linearized and its behaviour approximated by computing its transfer function around a fixed point (Valdes et al., 1999; Robinson et al., 2001; Friston et al., 2012). This helps to ensure that posterior densities conform to a multivariate normal, but can limit the model dynamics that may be explored (although see discussion for existing approaches to this problem). Importantly, highly nonlinear or stochastic dynamics are thought to underpin important features observed in the functional organization of brain activity, such as the transitions between resting states (Deco et al., 2009, 2011) or the transient bursting of synchronous activity (e.g. Palmigiano et al., 2017). In this work we describe a framework for inverse modelling of large-scale brain dynamics that avoids: (a) appeals to the Laplace approximation; and (b) approximating model dynamics from local linear behaviour.
To these ends, we set up a framework using the Approximate Bayesian Computation optimization algorithm (ABC; Beaumont et al., 2002), which provides a method of "simulation-based" inference (Cranmer et al., 2020) and is well suited to complex models that have a large state and/or parameter space, exhibit stochastic or highly nonlinear dynamics, or require numerically expensive integration schemes to solve. This method has been successfully employed and validated across a number of such models in systems biology (Excoffier, 2009; Toni and Stumpf, 2009; Turner and Sederberg, 2012; Liepe et al., 2014), but is yet to see wide usage in neuroscience. Usefully, the scheme allows models to be inverted upon hypothetically any summary statistic of neural recordings, such as spectra, spike density, or measures of connectivity (although see discussion for a description of the risk of "insufficiency" in these features). Specifically, we use a variant of ABC called sequential Monte Carlo ABC (ABC-SMC; Toni et al., 2009).
We aim to provide a first-pass evaluation of the face validity of the proposed framework. To do this, we build a set of examples using models of Parkinsonian circuit dynamics. These examples are derived from a previously reported model of the cortical-basal ganglia-thalamic circuit (van Wijk et al., 2018) and constrained using data from an experimental, rodent model of Parkinsonism (West et al., 2018). We use a reduced set of data features (the magnitude of the power spectra and directed functional connectivity) that can be derived from the full complex cross-spectra, and simplify the estimation of the observation model (see methods). Note that we retain relatively simple, time-averaged data features, as well as a well-established generative model, in order to focus our examination upon the validity of the optimization scheme itself. This then paves the way for further validations of this method in terms of more complicated models and data (see discussion).

Specifically, we first examine the properties of the inversion scheme, investigating parameter estimates and convergence. We then take a similar approach to that used previously in the validation of methods such as DCM by first testing the so-called face validity (Moran et al., 2009) through examination of: (a) the ability of the parameter estimation procedure to yield plausible posterior distributions over parameters given those known to generate the data; (b) the robustness of the parameter estimation method to the existence of local minima in the face of multiple realizations of the same data and different starting conditions; and (c) the ability of the model comparison procedures to recover plausible model architectures given the one known to generate the data. Finally, we demonstrate that the scheme can yield neurobiologically plausible conclusions given the structure of the circuits known to underlie oscillatory dynamics in Parkinsonism.
2 Methods

2.1 Overview of Sequential Monte Carlo Approximate Bayesian Computation for Inverse Modelling of Neural Data

We present an overview of the framework using ABC-SMC, and its adaptations for application to large-scale neural models, in figure 1. The algorithm takes a form in which several processes are repeated multiple times within their parent process (figure 1; inset). The scheme is contingent on simulation of pseudo-data by a generative forward model (a description of the neural dynamics; figure 1A, green box) given a set of proposal parameters sampled from a prior (multivariate Gaussian) distribution (figure 1C; turquoise box). This pseudo-data can then be compared against the empirical data by first applying a common data transform (i.e. a summary statistic of the data) and then assessing their similarity by computing the objective function (goodness-of-fit) (figure 1B; blue box). This model fit provides a measure by which parameter samples are either rejected or carried forward, depending on a threshold on the goodness-of-fit, in order to generate the next proposal distribution in the sequence. When the process in figure 1C is iterated with a shrinking tolerance schedule, ABC can be used to approximate the posterior parameter distribution at convergence (figure 1C; orange box). Finally, if the process described above is repeated over several competing models then the approximate
West et al. (2020): ABC for Inference of Brain Networks
26/03/2021 rev2_RL 5
Figure 1 Framework for application of Approximate Bayesian Computation for simulation-based inference of
brain network dynamics. (Inset) This schematic gives an overview of the framework described in this paper. Individual
generative models are specified as a set of state equations (yellow boxes) and prior distribution over parameters (light blue
boxes) that will be used to approximate the posterior density over the parameters (orange boxes) for the system generating
the observed data (red boxes) with varying degrees of fit (blue boxes) using ABC. The approximate posterior distribution
can then be used to compare models and decide on a winning model or family of models (purple box). (A) Generation of
pseudo-data by integrating the state equations parameterized by a sample drawn from the prior or proposal distribution.
Models can incorporate stochastic innovations as well as a separate observation model to produce samples of pseudo-data
(green boxes). (B) Pseudo-data is compared against the real data using a common data transform that provides a
summary statistic of the time series data (i.e. spectra and directed functional connectivity). The simulated and empirical
data are then compared by computing the objective function that can be used to score the model fit (blue boxes). (C) ABC
sequentially repeats the processes in boxes A and B by iteratively updating a proposal distribution formed from accepted
samples. Samples are rejected depending on an adaptive threshold of the objective scores aiming to reduce the distance
between summary statistics of the data and pseudo-data. This process iterates until the convergence criterion is met and the
proposal distribution is taken as an approximation of the posterior distribution. (D) By repeating the ABC process in box
(C) over multiple models, the approximate posteriors can be used to evaluate the model probabilities. This process samples
from the posterior many times to compute the probability of each model exceeding the median accuracy of all models
tested. This acceptance probability can then be used to compare the model’s ability to accurately fit the data and select the
best candidate model given the data.
posterior distribution may be used to assess each model's fitness via model comparison (figure 1D; purple box). The exact details of each process outlined in the figure are given below. The existing methods that form the basis for this work are outlined in appendix I.
2.2 Forward Model for Generation of Neural Pseudo-Data

The fitting algorithm is based upon sampling from a sequence of proposal distributions over parameters to generate realizations of the generative process, which we refer to as pseudo-data (figure 1A; green box). A model M is specified by the state equations of the dynamical system F and the observation model G:

ẋ = F(x, θ),  y = G(x, φ)

EQUATION 1

The equations of the model F describe the evolution of the states x with parameters θ. This model describes the underlying neuronal dynamics that give rise to the evolution of the states. The observation model G describes the effects upon the signals that are introduced by processes such as experimental acquisition or recording, and is parameterized by φ. The observation model can account for confounds introduced beyond the generative model of the data, such as changes in signal-to-noise ratio (SNR), crosstalk, or other distortions of the underlying generative process. In the examples provided here, we use a simple observation model comprising a model of sensor noise, in which the gain on additive noise remains a free parameter used to estimate differences in the SNR between signals. We use Gaussian white noise and assume identity covariance between the sensors. In this example, we avoided estimation of the lead-field parameters (Kiebel et al., 2006) by our choice of summary statistic describing the interaction between sources (see section Directed Functional Connectivity).

In general, the model M could describe any dynamical system for the time evolution of neural data, such as spiking networks, conductance models, or phenomenological models (e.g. phase oscillators, Markov chains). In this paper we use coupled neural mass equations (Jansen and Rit, 1995; David and Friston, 2003) to model population activity of the cortico-basal ganglia-thalamic circuit, of which the biological and theoretical basis has been previously described (Moran et al., 2011; Marreiros et al., 2013; van Wijk et al., 2018). The original equations were adapted to explicitly incorporate stochastic inputs and finite transmission delays. This yielded a system of stochastic delay differential equations that could be solved using the Euler-Maruyama method. For details of the model formulation, state equations, and numerical solver, please see Supplementary Information I.
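The Euler-Maruyama scheme mentioned above can be sketched for a scalar stochastic delay differential equation. The drift function, noise gain, and constant-history initialization below are illustrative assumptions; the paper's actual models are the multi-dimensional neural mass equations of Supplementary Information I.

```python
import math
import random

def euler_maruyama_delay(f, g, x0, delay_s, dt, t_end, seed=0):
    """Euler-Maruyama integration of a scalar stochastic delay
    differential equation dx = f(x(t), x(t - delay)) dt + g dW.
    A constant history x(t) = x0 is assumed for t <= 0."""
    rng = random.Random(seed)
    n_delay = int(round(delay_s / dt))
    n_steps = int(round(t_end / dt))
    x = [x0] * (n_delay + 1)              # constant history for t <= 0
    for _ in range(n_steps):
        x_now, x_del = x[-1], x[-1 - n_delay]
        dW = rng.gauss(0.0, math.sqrt(dt))    # Wiener increment
        x.append(x_now + f(x_now, x_del) * dt + g * dW)
    return x[n_delay:]                    # trajectory from t = 0 onwards
```

For example, `euler_maruyama_delay(lambda x, xd: -xd, 0.1, 1.0, 0.2, 0.01, 1.0)` simulates a noisy, delayed negative-feedback process of the kind that, in higher dimensions, underlies the delayed coupling between neural masses.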
Parameters of both the generative and observation models can either be fixed or variable. In the case of variable parameters (parameters to be fit), a prior density encoding a priori beliefs about the values that the parameters take must be specified. This is encoded through the mean and variance for each parameter, with the variance encoding the inverse precision of a prior belief. In this way fixed parameters can be thought of as being known with complete confidence. Note that we assume identity covariance of the priors. We take the majority of prior values for parameters of the cortico-basal ganglia-thalamic network from van Wijk et al. (2018) but adjusted some delays and connection strengths given updated knowledge in the literature. A table of parameters can be found in Supplementary Information II.

For the results in sections 3.1 through to the first part of section 3.3, we use a reduction of the full model comprising the reciprocally coupled STN/GPe. This model can again be divided into separate models (for the purposes of performing face validation and example model comparison) by constraining priors on the connectivity between the STN and GPe. The latter part of section 3.3 and section 3.4 use a wider model space comprising a set of systematic variations upon the full model (see figure 5).
2.3 Model Inversion with Sequential Monte Carlo ABC

2.3.1 Algorithm Overview

In order to estimate the parameters of the model M given the information provided by the empirical recordings, we use an algorithm based upon ABC-SMC (Toni et al., 2009; Del Moral et al., 2012). ABC is a "likelihood free" algorithm (Marin et al., 2012). Most generally, the algorithm forms a sample of draws taken from a prior distribution and then goes on to estimate an intermediate sequence of proposal distributions via iterative rejection of the parameter draws. Given a suitable shrinking tolerance schedule, the simulated pseudo-data (generated from the sample-parameterized forward model) and the empirical data should converge as the proposal distribution approaches the true posterior.

The ABC algorithm is illustrated in figure 1C (orange box) and follows the procedure below. Probability densities are given by p; parameters are indicated by θ; models by M; data by D; and distances by ρ. Samples are indicated by hat notation (i.e. θ̂); subscripts indicate the sample number; proposal distributions are indicated by an asterisk (i.e. p*(θ)); and subscripts equal to zero denote belonging to the empirical data (i.e. s_1 and s_0 are summary statistics of sample 1 and of the empirical data, respectively).
1. Specify prior distribution over parameters, p(θ), of the model M. (model prior)
2. Randomly sample N times from the prior to yield samples θ̂_i. (sampler)
3. Simulate pseudo-data D_i ~ p(D | θ̂_i, M). (simulation of joint distribution)
4. Compute summary statistic s_i of pseudo-data D_i. (data transform)
5. Compute distance ρ(s_i, s_0) of s_i from s_0. (assess model fit)
6. Reject θ̂_i if distance ρ(s_i, s_0) > ε. (rejection sampling)
7. Form proposal distribution, p*(θ), from the accepted parameter samples. (form proposal)
8. Iterate for q = 1, …, Q, setting p(θ) = p*(θ), and shrinking the distance threshold ε. (adaptive tolerance schedule)
9. At the convergence criterion, accept the proposal distribution as the posterior: p(θ | D, M) ≈ p*(θ). (estimation of approximate posterior)
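The rejection steps (1)-(6) above can be illustrated with a minimal, self-contained toy example: a hypothetical one-parameter Gaussian model with the sample mean as summary statistic, rather than the neural mass models and spectral features used in the paper.

```python
import random
import statistics

def abc_rejection(data, n_samples=5000, eps=0.1, seed=1):
    """Minimal sketch of ABC rejection sampling: draw theta from the
    prior, simulate pseudo-data, compare summary statistics (here the
    sample mean), and keep draws whose distance rho(s_i, s_0) falls
    below the tolerance eps. The accepted samples approximate the
    posterior over theta."""
    rng = random.Random(seed)
    s0 = statistics.mean(data)                 # empirical summary statistic
    accepted = []
    for _ in range(n_samples):
        theta = rng.gauss(0.0, 2.0)            # sample from the prior
        pseudo = [rng.gauss(theta, 1.0) for _ in range(len(data))]
        if abs(statistics.mean(pseudo) - s0) < eps:   # rejection step
            accepted.append(theta)
    return accepted
```

With data generated at some true theta, the mean of the accepted samples concentrates near the empirical summary statistic; ABC-SMC replaces the single fixed eps with the shrinking sequence of proposal distributions described above.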
To avoid sample wastage across iterations, we store the samples from step (2) and their resulting distances from the data (step 5) across the Q iterations, such that they are propagated through a sequence of intermediate distributions. In this way, at step (7) the updated proposal distribution comprises samples from both current and past draws, selected on the basis of a threshold calculated over the current draw.

We estimate the distance or error of the pseudo-data from the real data using the pooled mean squared error (MSEpooled) as an objective function (see Supplementary Information II for the equation).

Setting a shrinking distance threshold ε ensures that the posterior estimates converge upon solutions that most accurately reproduce the summary statistics of the observed data (Del Moral et al., 2012). With non-negligible ε, the algorithm samples from an approximate posterior distribution p(θ | ρ(s, s_0) ≤ ε, M) rather than the true posterior p(θ | D, M), which is recovered only as ε → 0. The upper bound on the error of the parameter estimates is therefore determined by how far ε is from zero (Dean et al., 2014).
2.3.2 Adaptive Tolerance Schedule

To facilitate incremental sampling from a sequence of increasingly constrained target distributions, we set an adaptive tolerance schedule. This is specified by determining a predicted gradient for the average distance of the next set of samples:

EQUATION 2

where the expected change in the distance of the new samples, Δρ̂, is given by:

EQUATION 3

where N_A is the number of accepted samples and T is a minimum criterion on the accepted sample size required to carry forward the tolerance shrinkage at its current gradient. If N_A < T, then this gradient is assumed to be too steep and the expected gradient is recalculated using a modified tolerance ε̃ that is computed using the median distance between the sample pseudo-data and the real data (i.e. ε̃ = ρ̃, where ˜ indicates the median). Thus T parameterizes the coarseness of the optimization. If T is very large (e.g. >99% of N) then the algorithm will converge slowly but accurately, whereas if T is very small (e.g. 1% of N) the algorithm will be inaccurate and biased. We set T to be two times the estimated rank of the parameter covariance matrix (for details of estimation see section 2.3.3).
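The median-fallback rule described above can be sketched as follows. This is a minimal reading of the text only: the fixed-ratio shrink used when enough samples are accepted is a placeholder assumption, standing in for the gradient-based update of equations 2-3.

```python
import statistics

def next_tolerance(prev_eps, distances, accepted_distances, T, shrink=0.8):
    """Sketch of the adaptive tolerance update of section 2.3.2.
    If fewer than T samples were accepted, fall back to the median
    distance between pseudo-data and real data (eps~ = median(rho)).
    Otherwise keep shrinking; the fixed ratio here is an assumption
    in place of the paper's predicted-gradient rule."""
    if len(accepted_distances) < T:
        return statistics.median(distances)   # modified tolerance eps~
    return prev_eps * shrink                  # assumed shrink rule
```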
2.3.3 Formation of Proposal Distributions

Following rejection sampling, the proposal density p*(θ) is formed from the accepted parameter sets. We use a density approximation to the marginals and a copula for the joint, similar to that described in Li et al. (2017). We take the initial draw of samples from the prior, formed as a multivariate normal:

p(θ) = N(μ, Σ)

EQUATION 4

where μ is a vector of the prior expectations and Σ their covariances. In subsequent iterations, once a minimum sample is accumulated, we estimate the marginal densities over each parameter using a non-parametric kernel density estimator (Silverman, 2003). This approach allows for free-form approximation to probability densities (e.g. multimodal or long-tailed distributions). This flexibility allows for sampling across multiple possible maxima at once, particularly at intermediate stages of the optimization. The bandwidth (determining the smoothness) of the kernel density estimator is optimized using a log-likelihood, cross-validation approach (Bowman, 1984).

We then form the multivariate proposal distribution using the t-copula (Nelsen, 1999). Copula theory provides a mathematically convenient way of creating the joint probability distribution whilst preserving the original marginal distributions. Data are transformed to the copula scale (unit square) using the kernel density estimate of the cumulative distribution function of each parameter, and then transformed to the joint space with the t-copula.

The copula estimation of the correlation structure of the parameter distributions acts to effectively reduce the dimensionality of the problem by binding correlated parameters into modes. The effective rank of the posterior parameter space (used in the computation of the adaptive tolerance schedule, and reported in the results as a post-hoc assessment of parameter learning) can be estimated by taking the eigenvalues of the covariance matrix and normalizing the coefficients by their sum. Using a cumulative sum of the ordered coefficients, we can then determine the number of modes that explain 95% of the variance of the parameter samples.
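The effective-rank estimate described above can be sketched as follows (assuming the samples are arranged with one draw per row and one parameter per column):

```python
import numpy as np

def effective_rank(samples, var_explained=0.95):
    """Number of eigen-modes of the sample covariance needed to explain
    `var_explained` of the total variance of the parameter samples:
    eigenvalues are normalized by their sum, ordered, and accumulated."""
    cov = np.cov(samples, rowvar=False)
    eigvals = np.linalg.eigvalsh(cov)[::-1]      # descending order
    frac = eigvals / eigvals.sum()               # normalize by the sum
    return int(np.searchsorted(np.cumsum(frac), var_explained) + 1)
```

Two perfectly correlated parameters contribute a single mode, so a sample with two identical columns and one independent column has an effective rank of 2 rather than 3.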
2.4 Model Comparison

In the process of model-based inference, hypotheses may be compared in their capacity to explain the observed data. Models fit with ABC can be formally compared using either "joint-space" or "marginal likelihood" based approaches (Grelaud et al., 2009; Toni and Stumpf, 2009). Here we use the latter approach, which estimates the marginal likelihood (model evidence) for each jth model:

p(D | M_j) ≈ (1/N) Σ_{i=1…N} 1[ρ(s_i, s_0) < ε]

EQUATION 5

where ε is a threshold on the distance metric that is suitably small to give an acceptable fit to the data and is common across models, and 1[·] is the indicator function. We refer to the outcome of equation 5 as the acceptance rate of a particular model. For derivation of the ABC-SMC approximation to the marginal likelihood, please see Toni and Stumpf (2009). In practice we set ε to be the median of the distances across all sets of models. The marginal posterior probability of a model is then given by combining the marginal model likelihoods with the prior model probability P(M_j), and then normalizing across the space of J models:

P(M_j | D) = p(D | M_j) P(M_j) / Σ_{k=1…J} p(D | M_k) P(M_k)

EQUATION 6

In all cases described, we assume that prior model probabilities are uniform.
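Equations 5 and 6 can be sketched as follows, using the pooled-median threshold described in the text (the function and variable names are illustrative):

```python
from statistics import median

def model_posteriors(distances_by_model, priors=None):
    """distances_by_model[j] holds the distances rho(s_i, s_0) for N
    posterior draws from model j. Epsilon is set to the median distance
    pooled across all models; each model's evidence is its acceptance
    rate (equation 5), normalized across models with the model priors
    (equation 6; uniform by default)."""
    pooled = [d for ds in distances_by_model for d in ds]
    eps = median(pooled)
    evidence = [sum(d < eps for d in ds) / len(ds) for ds in distances_by_model]
    if priors is None:
        priors = [1 / len(evidence)] * len(evidence)
    weighted = [e * p for e, p in zip(evidence, priors)]
    total = sum(weighted)
    return [w / total for w in weighted]
```

Because ε is common across the model space, a model whose posterior draws consistently beat the pooled median accuracy accrues most of the posterior model probability.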
We provide a post-hoc estimation of model complexity in terms of the divergence of the posterior from the prior (Friston et al., 2007; Penny, 2012). Specifically, we estimate the Kullback-Leibler divergence DKL of the posterior density p*(θ) from the prior density p(θ) over F discretized bins of the density:

DKL(p* ‖ p) = Σ_{f=1…F} p*(θ_f) log( p*(θ_f) / p(θ_f) )

EQUATION 8

This is a simplification of the full multivariate divergence and ignores the dependencies between variables encoded in the posterior covariance. In practice, we use the full multivariate divergence (given in Supplementary Information IV), which uses a multivariate Gaussian approximation to the ABC-estimated posterior density (taking the mean and covariance of N samples from the posterior). The DKL can then be used for a post-hoc discrimination of model performance. To do this we build a complexity-adjusted goodness-of-fit heuristic (accuracy-complexity score; ACS), similar in form to an information criterion such as the Bayesian Information Criterion, but refined to consider posterior divergence as a measure of complexity, rather than the absolute number of parameters:

EQUATION 9

where D̃KL_j is the divergence of posteriors from priors for the jth model, normalized by the sum of divergences across the whole model space. Models that contribute exactly 1/J of the summed divergence have zero complexity penalty. Please note that the ACS does not provide the objective function for optimization (which is the pooled MSE), but rather a heuristic for post-hoc discrimination of models via the addition of an Occam factor to account for model parsimony (MacKay, 2003).
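The discretized divergence of equation 8 can be sketched directly (assuming the posterior and prior have already been binned into probability masses over the same F bins):

```python
from math import log

def kl_divergence(posterior, prior):
    """Discretized Kullback-Leibler divergence of the posterior from
    the prior over F bins (each list of masses summing to ~1). Bins
    where the posterior mass is zero contribute nothing."""
    return sum(p * log(p / q) for p, q in zip(posterior, prior) if p > 0)
```

A posterior identical to the prior scores zero; a posterior concentrated far from the prior mass scores highly, flagging potential overfitting as in the ACS penalty described above.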
West et al. (2020): ABC for Inference of Brain Networks
26/03/2021 rev2_RL 11
2.5 Empirical Data: Recordings from Parkinsonian Rats
Summary statistics are computed from empirical data and then used to fit the generative forward model. In the example implementation used in this paper we use multisite basal ganglia and single-site cerebral cortex recordings in rats (n = 9) that have undergone a 6-hydroxydopamine (6-OHDA) induced dopamine depletion of the midbrain, a lesion model of the degeneration associated with Parkinsonism in humans (Magill et al., 2004, 2006). The animals were implanted with two electrodes to measure local field potentials (LFP) from multiple structures in the basal ganglia: dorsal striatum (STR), external segment of the globus pallidus (GPe), and the subthalamic nucleus (STN). Additionally, electrocorticography was measured over area M2 of the motor cortex, a homologue of the Supplementary Motor Area (SMA) in humans (Paxinos and Watson, 2007). Animals were recorded under isoflurane anaesthesia and during periods of “cortical-activation” induced by a hind-paw pinch (Steriade, 2000). The details of the experimental procedures were previously published (Magill et al., 2004, 2006). Experimental procedures were performed on adult male Sprague Dawley rats (Charles River) and were conducted in accordance with the Animals (Scientific Procedures) Act, 1986 (UK), and with Society for Neuroscience Policies on the Use of Animals in Neuroscience Research. Anaesthesia was induced with 4% v/v isoflurane (Isoflo; Schering-Plough) in O2 and maintained with urethane (1.3 g/kg, i.p.; ethyl carbamate, Sigma), and supplemental doses of ketamine (30 mg/kg, i.p.; Ketaset; Willows Francis) and xylazine (3 mg/kg, i.p.; Rompun, Bayer).
Pre-processing of time series data (LFP and ECoG) was done as follows: data were 1) truncated to remove the first 1 second (to avoid filter artefacts); 2) mean corrected; 3) band-pass filtered between 4-100 Hz with a finite impulse response, two-pass (zero-lag) filter of optimal order; 4) split into 1 second epochs, with each epoch subjected to a Z-score threshold criterion such that epochs containing high-amplitude artefacts were removed.
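A minimal Python sketch of these four steps follows (the published pipeline is MATLAB; the filter order and Z-score threshold used here are illustrative assumptions):

```python
import numpy as np
from scipy.signal import filtfilt, firwin

def preprocess(x, fs, epoch_len=1.0, z_thresh=3.0):
    """Steps 1-4: truncate, mean-correct, band-pass, epoch + artefact reject."""
    x = x[int(fs):]                                 # 1) drop first second
    x = x - x.mean()                                # 2) mean correction
    ntaps = int(3 * fs) | 1                         # odd order (assumed choice)
    b = firwin(ntaps, [4.0, 100.0], pass_zero=False, fs=fs)
    x = filtfilt(b, [1.0], x)                       # 3) two-pass FIR, 4-100 Hz
    n = int(fs * epoch_len)
    epochs = x[: (len(x) // n) * n].reshape(-1, n)  # 4) 1 s epochs
    z = np.abs(epochs - epochs.mean()) / epochs.std()
    return epochs[z.max(axis=1) < z_thresh]         # reject artefact epochs
```

The two-pass (forward-backward) filtering via `filtfilt` reproduces the zero-lag property described above; phase delays introduced on the forward pass are cancelled on the backward pass.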
2.6 Computation of Summary Statistics
We derive a set of summary statistics from signal analyses of the experimental and simulated time series. These statistics transform both the data and pseudo-data into the same feature space such that they can be directly compared (figure 1B; blue box). It is important to note that the choice of summary statistic is vital in determining the outcome of the inverse modelling with ABC (Beaumont et al., 2002; and see discussion). The set of statistics must effectively encode all phenomena of the original data that the experimenter wishes to be modelled.
2.6.1 Frequency Spectra
We use the autospectra to constrain the oscillatory activity of each neural mass. Auto-spectral analyses were made using the averaged periodogram method across 1 second epochs, using a Hanning taper to reduce the effects of spectral leakage. Frequencies between 49-51 Hz were removed so that there was no contribution from 50 Hz line noise. 1/f background noise was removed by first performing a linear regression on the log-log spectra (at 4-48 Hz) and then subtracting the linear component from the spectra (Le Van Quyen et al., 2003; Nikulin and Brismar, 2006). Note that only empirical data underwent removal of the 1/f background. This ensured that the inversion scheme was focused upon fitting the spectral peaks in the data and not the profile of 1/f background noise. To simplify observation modelling of differences in experimental recording gains between sites, all spectra were normalized by dividing through by their summed power at 4-48 Hz.
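These steps can be sketched in Python (a stand-in for the MATLAB implementation; the simple masking of the 49-51 Hz bins and the use of the 4-48 Hz band for both the 1/f fit and the normalization are assumptions about the exact procedure):

```python
import numpy as np
from scipy.signal import periodogram

def autospectrum(epochs, fs):
    """Averaged periodogram over 1 s epochs with Hanning taper,
    line-noise removal, 1/f detrending, and band-power normalization."""
    f, pxx = periodogram(epochs, fs=fs, window='hann', axis=-1)
    pxx = pxx.mean(axis=0)                      # average across epochs
    band = (f >= 4) & (f <= 48)
    pxx[(f >= 49) & (f <= 51)] = np.nan         # remove 50 Hz line-noise bins
    # Remove 1/f background: linear fit to the log-log spectrum at 4-48 Hz,
    # then subtract the fitted component in log space
    coef = np.polyfit(np.log(f[band]), np.log(pxx[band]), 1)
    pxx[band] = np.exp(np.log(pxx[band]) - np.polyval(coef, np.log(f[band])))
    # Normalize by summed power in the 4-48 Hz band
    return f, pxx / np.nansum(pxx[band])
```

As in the text, the detrending would be applied to empirical spectra only; simulated spectra would skip that step but share the same normalization.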
2.6.2 Directed Functional Connectivity
To quantify interactions between populations, we use non-parametric directionality (NPD; Halliday, 2015), a directed functional connectivity metric which describes frequency-resolved, time-lagged correlations between time series. The NPD was chosen as it makes it possible to remove the zero-lag component of coherence, so that interactions between signals are not corrupted by signal mixing/volume conduction, which was predominant in the empirical data used here (see West et al., 2018). Thus, using NPD simplifies the observation problem by removing the need to estimate mixing terms or a lead field.
Estimates of NPD were obtained using the Neurospec toolbox (http://www.neurospec.org/). This analysis combines Minimum Mean Square Error (MMSE) pre-whitening with forward and reverse Fourier transforms to decompose coherence estimates at each frequency into three components: forward, reverse, and zero lag. These components are defined according to the corresponding time lags in the cross-correlation function derived from the MMSE pre-whitened cross-spectrum. This approach allows the decomposition of the signal into distinct forward and reverse components of coherence, separate from the zero-lag (or instantaneous) component of coherence which can reflect volume conduction. The method uses temporal precedence to determine directionality. For a detailed formulation of the method see Halliday (2015); for its validation see West et al. (2020b). We ignored the instantaneous component of the NPD and used only the forward and reverse components for all further analyses. Note that NPD accounts for the relative phase between activities by segregating the contribution to the cross-spectrum into either leading (forward) or lagging (reverse) components.
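The lag-based decomposition can be illustrated with a toy Python sketch. This is a simplification of the Neurospec estimator: pre-whitening is reduced to normalizing spectral amplitudes, epoch averaging is omitted, and the component definitions are assumptions about the general scheme rather than the toolbox's exact formulae:

```python
import numpy as np

def npd_components(x, y):
    """Toy decomposition of the whitened cross-spectrum into forward
    (positive-lag), reverse (negative-lag), and zero-lag parts."""
    X, Y = np.fft.rfft(x), np.fft.rfft(y)
    eps = np.finfo(float).eps
    Xw, Yw = X / (np.abs(X) + eps), Y / (np.abs(Y) + eps)  # pre-whitening
    rho = np.fft.irfft(np.conj(Xw) * Yw)    # whitened cross-correlation
    n = len(rho)
    parts = {k: np.zeros(n) for k in ('forward', 'reverse', 'zero')}
    parts['zero'][0] = rho[0]               # instantaneous component
    parts['forward'][1:n // 2] = rho[1:n // 2]   # positive lags: x leads y
    parts['reverse'][n // 2:] = rho[n // 2:]     # negative lags: y leads x
    # squared magnitude per frequency of each lag-restricted component
    return {k: np.abs(np.fft.rfft(v)) ** 2 for k, v in parts.items()}
```

For a pair of signals where one is a delayed copy of the other, the forward component dominates, reflecting the temporal-precedence logic described above.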
2.6.3 Data Pooling and Smoothing
In all procedures using empirical data to constrain models, we used the group-averaged statistics computed from recordings from a group of unilaterally 6-OHDA-lesioned animals. As a final processing step, both the autospectra and NPD were smoothed to remove noise, such that fitting was focused on the dominant peaks of the features (Rowe et al., 2004). This was achieved by convolving spectra with a 4 Hz wide Gaussian kernel. Empirical and simulated data were transformed identically to produce equivalent autospectra and NPD. We assume smoothness of the spectral features as a way to separate actual features from noise. Spectral estimates in neuroscience (and in general) become increasingly smooth with large sample sizes (with error decreasing in proportion to 1/√N for N epochs), thus the smoothing can be considered a correction for finite-time spectral estimates.
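The smoothing step can be sketched as follows (illustrative Python; interpreting the "4 Hz wide" kernel as a full-width-at-half-maximum and the kernel discretization are assumptions):

```python
import numpy as np

def smooth_spectrum(pxx, freqs, fwhm_hz=4.0):
    """Convolve a spectrum with a unit-area Gaussian kernel of the
    given full-width-at-half-maximum (in Hz)."""
    df = freqs[1] - freqs[0]                        # frequency resolution
    sigma = fwhm_hz / (2.0 * np.sqrt(2.0 * np.log(2.0))) / df   # FWHM -> sd, in bins
    half = int(np.ceil(4 * sigma))
    k = np.exp(-0.5 * (np.arange(-half, half + 1) / sigma) ** 2)
    k /= k.sum()                                    # unit-area kernel
    return np.convolve(pxx, k, mode='same')
```

Because the kernel integrates to one, smoothing preserves total power away from the edges of the frequency axis; applying the identical kernel to empirical and simulated features keeps the comparison unbiased.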
2.6.4 Software Availability
All analyses and procedures were written in MATLAB (Mathworks, Natick, MA). The toolbox is publicly available and maintained as a Github repository (https://github.com/twestWTCN/ABCNeuralModellingToolbox.git). For a list of external dependencies and their authors see appendix I. The procedures used for constructing the figures in this paper can be run using the ‘West2021_Neuroimage_Figures.m’ script. Please see appendix IV for a short guide to the repository and its key scripts.
2.7 Validation of ABC Procedures for Parameter Inference and Model Identification
2.7.1 Testing the Face Validity of the Model Inversion Procedures
To test whether the ABC estimator will: a) yield parameter estimates that are unique to the data from which they have been optimized; and b) yield consistent estimation of parameters across multiple instances, we performed two procedures of eight multi-starts using two separate datasets. The datasets were created by first defining two forward models with different parameter sets: (1) the MAP estimate of a reciprocally coupled STN/GPe model after fitting to the empirical data and (2) the same model but with each parameter's log scaling factor randomly adjusted by ±1. These models were then used to generate synthetic datasets by simulating 256 s of data (with separate realizations for each multi-start). We could then track the error of parameter estimates from the known parameters of the original forward model to examine the accuracy of the inference.
When testing point (b), that parameter estimates are consistent across instances, we performed an eight-fold cross-validation procedure in which we used a one-sample Hotelling procedure to test for a significant difference of each fold's mean from that of the left-out sample. We report the probability of the folds that yielded a significant test, with high probability indicating that the left-out MAP estimates are likely to deviate from the rest of the fold. In this way we can identify the probability of an ABC initialization yielding a non-consistent sample. Secondly, we tested point (a), that MAP estimates are unique to the data on which they have been fitted, using the Szekely and Rizzo energy test (Aslan and Zech, 2005) between the samples from data A and B, with the null hypothesis that the multi-start samples derived from different data arise from the same distribution. Finally, we used a Multivariate Analysis of Variance (MANOVA) procedure to test for a difference in means between the two multivariate samples.
2.7.2 Testing the Face Validity of the Model Comparison Procedures
To test the face validity of the model comparison framework, we constructed a confusion matrix, an approach commonly used in machine learning to examine classification accuracy. Three different models of the STN/GPe network were fit to the empirical data, and then three synthetic data sets were simulated using the fitted parameters. We chose a model with reciprocal connectivity: (1) STN ↔ GPe; and then two models in which one connection was predominant: (2) STN → GPe and (3) GPe → STN. The three models (with the original priors) were then fitted back onto the synthetic data. Model comparison was then performed to see whether it could correctly identify the original model that had generated the data. The model comparison outcomes (accuracy of the fit; DKL of posteriors from priors; and the combined ACS measure) were then plotted in a 3 x 3 matrix of data versus models. In the case of valid model selection, the best fitting models should lie on the diagonal of the confusion matrix. We also performed a second analysis in which more complex models were fitted (incorporating between 4 and 6 sources). Specifically, we used models M 1.1, M 2.2, and M 5.2, described in detail in the next section (2.7.3).
2.7.3 Testing the Scalability of the Framework with Application to the Full Model Space
In order to demonstrate the scalability of the optimization and model comparison framework, we used the space of 12 models described below. We individually fitted the models and then performed a (marginal likelihood-based; see methods) model comparison to select the best of the candidate models. A set of null models that are anatomically implausible was included. If model selection performed correctly, then it is expected that these models would perform poorly.
To investigate the importance of known anatomical pathways in reconstructing the observed steady state statistics of the empirical local field potentials (i.e. autospectra and NPD), we considered a set of competing models. Specifically, we looked at the role of five pathways and their interactions: the cortico-striatal indirect; the cortico-striatal direct; the cortico-subthalamic hyperdirect; the thalamocortical relay; and the subthalamo-pallidal feedback. In total we tested 6 families of models (presented later in the results section; figure 5):
1. + indirect.
2. + indirect / + hyperdirect.
3. − indirect / + hyperdirect.
4. + indirect / + direct / − hyperdirect / + thalamocortical.
5. + indirect / + direct / + hyperdirect / + thalamocortical.
6. − indirect / + direct / + hyperdirect / + thalamocortical.
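The resulting 12-model space can be enumerated programmatically. The sketch below is a hypothetical Python encoding (the names and flags are illustrative, not the toolbox's actual data structures):

```python
# Hypothetical encoding of the six families as sets of included pathways;
# family structure follows the list above.
families = {
    1: {'indirect'},
    2: {'indirect', 'hyperdirect'},
    3: {'hyperdirect'},                                    # null: no indirect
    4: {'indirect', 'direct', 'thalamocortical'},
    5: {'indirect', 'direct', 'hyperdirect', 'thalamocortical'},
    6: {'direct', 'hyperdirect', 'thalamocortical'},       # null: no indirect
}
# Sub-families x.1 / x.2 exclude / include subthalamo-pallidal feedback
models = {f'M{f}.{s}': {'pathways': p, 'stn_gpe_feedback': s == 2}
          for f, p in families.items() for s in (1, 2)}
```

Encoding the space this way makes the factorial structure explicit: six pathway families crossed with the presence or absence of the STN → GPe feedback connection.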
We considered these six families and further divided each into two sub-families that do or do not include the subthalamo-pallidal (STN → GPe) feedback connection. Family (1) investigates whether the indirect pathway alone can explain the pattern of observed spectra and functional connectivity. In the case of family (2), previous work has highlighted the importance of hyperdirect connections in the functional connectivity (Jahfari et al., 2011; Nambu et al., 2015), yet anatomical research has shown dopamine to weaken the density of synaptic projections (Chu et al., 2017). Thus, family (2) provides an ideal set of models to examine the nonlinear mapping of anatomical to functional connectivity described
in the introduction of this paper. Families (3) and (6) represent null models in which the indirect pathway is excluded; these are used as implausible models to test whether the model comparison procedure yields valid results given the known neurobiology, because the indirect pathway is thought to be vital to explaining activity within the network following the dopamine depletion associated with PD (Alexander et al., 1986; Albin et al., 1989; Bolam et al., 2000). The functional role of the thalamocortical subnetwork is relatively unknown (but see recent work: Reis et al., 2019), and so families (4) and (5) provide an examination of whether the addition of the thalamic relay can better explain the empirical data. The second level of families (i.e. M x.1-2) investigates whether the reciprocal network formed by the STN and GPe is required to explain observed patterns of connectivity in the data. This network has been the subject of much study and is hypothesized to play an important role in the generation and/or amplification of pathological beta rhythms (Plenz and Kitai, 1999; Bevan et al., 2002; Cruz et al., 2011).

Figure 2 Examining the convergence of ABC optimization upon summary statistics from recordings of the STN and GPe in Parkinsonian rats. Parameters of neuronal state space models were optimized using the ABC method detailed in the text. Snapshots of the optimization are taken at the 1st, 15th, and 30th iteration, at which the optimization converges. (A) Schematic of the STN/GPe neural mass model. (B) The iteration history of the ABC algorithm is presented as a sequence of box plots indicating the distribution of fits (MSEpooled) at each sampling step, with mean and interquartile range indicated by individual crosses and boxes. (C) Power spectra of the empirical data (bold) and simulated data (dashed) are shown. The best fitting parameter sample for each iteration is given by the bold dashed line. (D) Similarly, the functional connectivity (non-parametric directionality; NPD) is shown in red and blue with the same line coding. (E) Examples of the prior (dashed) and proposal (bold) marginal distributions for a selection of five parameters are shown (note some priors have identical specifications and so overlap). It is seen that over iterations the proposal and posterior deviate from the prior as the latent parameter densities are estimated. (F) Correlation matrices from copula estimation of joint densities over parameters. Colour bar at bottom indicates the correlation coefficient. Correlated modes appear between parameters as optimization progresses.
3 Results
3.1 Properties of Fitting Procedure and Convergence when Applied to a Simple Model of the Pallido-Subthalamic Subcircuit
Figure 2 shows the results of an example model inversion, demonstrating how the ABC algorithm iteratively converges to yield an approximation to the summary statistics of the empirical data. This example uses a simple model comprising the reciprocally connected subthalamic nucleus (STN) and external segment of the globus pallidus (GPe) shown in figure 2A. The autospectra and directed functional connectivity were fit to the group-averaged results originally reported in West et al. (2018), which described an analysis of local field potentials recorded from a rodent model of Parkinsonism (see methods for experimental details). By tracking the value of the objective function (i.e. the MSEpooled) over multiple iterations (figure 2B) we demonstrate a fast-rising initial trajectory in the first 15 iterations that eventually plateaus towards convergence, which is well approximated by a logistic function (shown by the purple dashed line). In figure 2C and D the simulated features (autospectra and NPD respectively) gradually move closer to the empirically estimated features with each iteration of the algorithm.
The evolution of the proposed marginal densities (figure 2E) demonstrates that over the optimization, parameter means and variances deviate significantly from the prior. Estimation of some parameters is better informed by the data than for others, as indicated by the different precision of the proposal densities. Additionally, learnt multivariate structure in the joint parameter densities is apparent in the parameter correlation matrices (see methods; figure 2F). The evolution of these matrices shows the emergence of distinct correlated modes. These modes reduce the dimensionality of the optimization problem: by estimating the number of significant principal components of the parameters (see methods) we find that optimized models show a reduction of 50-70% from that of the prior. Note, however, that due to the identity covariance of the prior, increased correlation in parameters entails an increase in the complexity penalty in the ACS metric used to discriminate between models (see Methods and below).
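The dimensionality-reduction point can be sketched as follows (an illustrative Python sketch; the variance threshold used to declare a component significant is an assumption, not the paper's criterion):

```python
import numpy as np

def effective_dims(corr, var_explained=0.95):
    """Number of principal components needed to explain a fraction of
    total variance in a parameter correlation matrix."""
    eigvals = np.sort(np.linalg.eigvalsh(corr))[::-1]   # descending spectrum
    cum = np.cumsum(eigvals) / eigvals.sum()
    return int(np.searchsorted(cum, var_explained) + 1)

# A prior with identity covariance needs all dimensions;
# strongly correlated posteriors need far fewer.
assert effective_dims(np.eye(4)) == 4
```

Under this kind of criterion, the emergence of correlated modes during optimization directly shrinks the number of effective dimensions relative to the uncorrelated prior.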
3.2 Testing the Internal Consistency of Data Dependent Estimation of ABC Optimized Posteriors Using a Multi-start Procedure
This section of the results examines the face validity of parameter estimation with ABC, i.e. that the scheme will (a) make a consistent estimation of posterior model parameters across multiple realizations of optimization; and (b) yield posterior estimates of parameters that are plausible given the known causes of the data. This is achieved using a multi-start procedure (Baritompa and Hendrix, 2005), described in the methods (section 2.7.1), in which we initialize the algorithm eight times for two separate datasets generated by different underlying models. The results of the multi-starts are shown in Figure 3.
The evolution of the objective function (MSEpooled) over the progress of the optimization is presented in figure 3A. Multi-starts of the optimization for both dataset A and B exhibited consistent degrees of error in the posterior summary statistics. In figure 3B, the average log-precision of the marginal densities is tracked over the progress of the optimization. These data show that across all initializations, the average precision of the posterior densities (1/σ = 20) was 2.5 times greater than that of the priors (1/σ = 8), demonstrating increased confidence in parameter estimates that were constrained by the data.
In figure 3C, we present the maximum a posteriori (MAP) estimates for each parameter across the multi-starts. There are clear differences between parameters inverted upon the two separate sets of data (red versus blue bars; asterisks indicate significant t-tests). For instance, the mean GPe time constant (1st group of bars from the left) is smaller for data A compared to B. Other parameters were well informed by the data, but not significantly different between either data set (e.g. GPe sigmoidal slope; 2nd set of bars from the left). The accuracy of the parameter optimization procedure was assessed by comparing the MAP estimates to the known parameters of the underlying forward models. This can be seen in figure 3C, where the parameter values are plotted as crosses alongside the ABC posterior expectations. Estimates of the STN and GPe time constants and the GPe → STN delay were well recovered (the actual value falls within the spread of multi-start estimates). The slope of the nonlinear activation function, however, was poorly recovered, showing consistent over-estimation, likely due to the real value falling far out of the bounds of the prior. An 8-fold cross-validated Hotelling test found 75% and 62.5% null-rates for datasets A and B respectively, indicating the majority of parameter estimates were consistently distributed with the underlying forward model.
Figure 3 Multi-start analysis to test face validity of the ABC-based estimation of model parameters by demonstrating consistency of estimation and the data specificity of parameter estimates. A two-node model of the STN/GPe circuit (inset) was fit to two different data sets: dataset A (blue) and dataset B (red) that were generated by different underlying models. Each estimation was performed 10 times with identical specification of prior distributions for all initializations. (A) Tracking of the goodness-of-fit (shown as -log10(-MSEpooled)) over the iterations demonstrated consistent convergence. Posterior estimates of the summary statistic were on average more accurate for dataset A than for B but were consistent across multi-starts. (B) Optimization showed a consistent increase in the average precision (equivalent to a decrease in the logarithm of the inverse standard deviation of the data) of the posteriors, indicating that the data were informative in constraining parameter estimates. (C) Examination of the MAP estimates demonstrated a consistent inference of parameter values. Some parameters were drawn to common values with both data A and B (e.g. GPe time constant), whilst others show differences informed by the data (e.g. STN time constant). MAP values are given as log scaling parameters of the prior mean. The prior values were set to equal zero. Error bars give the standard deviations of the estimates across initializations. Asterisks indicate significant t-tests for difference in means between parameters estimated from data A and B. (D) To visualize trajectories of the multi-starts, the high-dimensional parameter space was reduced to two dimensions using multi-dimensional scaling (MDS). Evolutions of the means of the proposal parameters exhibit a clear divergence between data sets A and B that were significantly different (MANOVA, see main text).
To estimate the internal consistency of the parameter estimates, we applied a one-sample Hotelling test within an eight-fold, leave-one-out cross-validation to each of the MAP estimates from the multi-start. For both samples of parameters estimated from data A and B we find a 0% rejection rate of the null hypothesis that the fold mean and the left-out sample share a common mean. This result indicates that every initialization of the multi-start fell within the variance defined by the remainder of the multi-starts. This does not test whether the variance is unacceptably large; we test this in part by examining (b): that the posteriors were differentiated by the data on which they were estimated, using a Szekely and Rizzo energy test. We find there to be a significant difference in the means of the two samples (Φ = 5.13; P = 0.001). This finding is supported by a MANOVA test that demonstrates that the two data sets are significantly segregated by their posterior parameter means (D = 1, P < 0.001). This suggests that the spread of estimates was, at least, sufficient to distinguish posteriors derived from different underlying generative models. Visualization of the parameter space using multidimensional scaling (MDS; figure 3D) confirms the segregation of the posterior samples into two clusters determined by the datasets from which they are estimated. These results confirm that the ABC optimized posteriors are consistent across multiple initializations and that the output is determined by differences in the underlying model generating the given data.
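The two-sample energy statistic used above can be sketched in Python with a permutation test (an illustrative implementation of the general statistic; Aslan and Zech (2005) describe the full procedure and the test statistic actually reported):

```python
import numpy as np

def energy_stat(a, b):
    """Szekely-Rizzo energy distance statistic between samples a and b."""
    d = lambda u, v: np.linalg.norm(u[:, None, :] - v[None, :, :], axis=-1)
    n, m = len(a), len(b)
    return (2 * d(a, b).mean() - d(a, a).mean() - d(b, b).mean()) * n * m / (n + m)

def energy_test(a, b, n_perm=500, seed=0):
    """Permutation p-value for the null that a and b share a distribution."""
    rng = np.random.default_rng(seed)
    obs = energy_stat(a, b)
    pooled = np.vstack([a, b])
    count = 0
    for _ in range(n_perm):
        idx = rng.permutation(len(pooled))
        count += energy_stat(pooled[idx[:len(a)]], pooled[idx[len(a):]]) >= obs
    return obs, (count + 1) / (n_perm + 1)
```

A small p-value rejects the null that the two multi-start samples arise from the same distribution, matching the use of the test to show that posteriors are differentiated by the data on which they were estimated.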
3.3 Testing Face Validity of the Model Comparison Approach
To verify the face validity of the model comparison approach, i.e. that it can identify the correct structure of the generative model of the data, we constructed a confusion matrix (as detailed in the methods section 2.7.2), first using variations on the STN/GPe model presented in the previous sections and shown in figure 4A. In the case of correct model identification, the best model scores should lie along the diagonal of the confusion matrix.
In figure 4B we present the posterior model probabilities P(M|D) (see methods for details of their calculation). When normalizing across the joint space to compute the marginal posterior probability of a model, we consider only the three models tested per dataset (i.e. the probabilities across each column of figure 4B, C, F and G sum to one). This analysis demonstrates that, in terms of accuracy, the most probable models lie on the diagonal of the confusion matrix, showing that the posterior accuracies are sufficient to correctly identify the generating model. In figure 4C, we present the model complexity as a proportion of the summed divergence across all models (i.e. the second term of equation 9). These analyses show the divergence of each model's posteriors from its priors (so-called complexity, measured in terms of the DKL). In the case of model 1 (which is the most flexible in terms of numbers of free parameters) there are inflated divergences in the first column that result from a large deviation of posteriors when attempting to fit the data generated from the alternative models. This shows that a post-hoc analysis of the divergence of the posterior (using DKL) can be used to discriminate models which have been overfitted. When combining these two measures into the ACS metric (summarising model accuracy minus complexity) in figure 4D, it is seen that the best fitting models are still correctly identified even when accounting for the increased complexity of posterior parameter densities. Note that the most flexible model (model 1) was unable to fit data from models 2 and 3; this occurred because the posteriors required to effectively decouple the STN/GPe feedback were very far from the model 1 priors on connectivity. This is a known limitation of ABC; please see the discussion.
To determine whether this face validation held for larger models, we performed an identical analysis with three models ranging in complexity (figure 4E). Model accuracy was again highest along the diagonal (figure 4F), with the complexity-adjusted goodness-of-fit (ACS; figure 4H) maintaining correct model identification. These results demonstrate that the model comparison approach can properly identify the models from which the data originated, thus providing a face validation of the model comparison procedures.
3.4 Scaling up to Larger Model Spaces: Application to Models of the Cortico-Basal Ganglia-Thalamic Circuit
Finally, we applied the ABC framework to a larger and more complex model space to test the scalability of the methodology. Specifically, we devised a set of 12 models (illustrated in figure 5) incorporating combinations of pathways in the cortico-basal ganglia-thalamic circuit amongst a set of neural populations: motor cortex (M2), striatum (STR), GPe, STN, and thalamus (Thal.). Models were split into sets including/excluding the indirect (M2 → STR → GPe → STN); hyperdirect (M2 → STN); and thalamocortical relay (M2 ↔ Thal.) pathways. Models were further subdivided to include or exclude the subthalamo-pallidal feedback connection (STN → GPe; models labelled M x.2 to denote inclusion of the connection). For a full description and defence of the model choices please see methods section 2.7.3. These models were fit individually to the empirical data, and model comparison was then used to determine the best candidate model.

Figure 4 Testing face validity of the ABC model comparison approach to model identification. Confusion matrices were constructed by fitting the three models of the STN/GPe circuit. Synthetic data was generated using the fitted models and then the three original models were fitted back to the synthetic data to test whether model comparison could identify the generating model. (A) Schematic of the neural mass model to be fitted. Annotations of connections indicate the presence of each for models 1-3. (B) Matrix of posterior model probabilities 1-P(M|D), normalized across the joint model space (for each column demarcated by dashed lines). (C) Matrix of normalized divergences of posteriors from priors (see second term of equation 9). (D) Combined scoring to simultaneously account for model accuracies and divergence (ACS). Large values indicate better fits with more parsimonious posteriors (small DKL). (E-H) Same as for (A-D) but for the more complex model set.
In figure 6 we show the resulting model fits and the subsequent model comparison performed to determine the best model, or set of models, from the proposed model space. From visual inspection of the fits to the data features in figure 6A, as well as the distribution of posterior model accuracies in figure 6B, there is a wide range of model performances with regard to accurate fitting of the models to the data. Inspection of the posterior model fits to the data features in 6A shows the best fitting model (M 5.2) was able to account for the multiple peaks in the autospectra at around 20 Hz and 30 Hz. There was, however, a systematic underestimation of the directed functional connectivity (NPD) from subcortex to cortex. When the MSEpooled of the fitted models was segregated between autospectra and functional connectivity, we found that the autospectra were more accurately fit than the functional connectivity.
Figure 5 Illustration of the model space of the cortico-basal ganglia network fitted with ABC and compared with Bayesian model selection. The model space comprises six families which can be further subdivided into two subfamilies, yielding 12 models in total. Family (1) models the indirect pathway; family (2) contains models with both the indirect and hyperdirect pathways; family (3) contains models with the hyperdirect pathway but not the indirect pathway; family (4) contains models with the indirect, direct, and thalamocortical pathways; family (5) contains models with the indirect, direct, hyperdirect, and thalamocortical pathways; family (6) contains models with the hyperdirect, direct, and thalamocortical pathways but no indirect pathway. Finally, each family comprises two sub-families that either exclude (M x.1) or include (M x.2) subthalamo-pallidal feedback excitation. Excitatory projections are indicated by ball-ended connections, whilst inhibitory connections are flat-ended.
In all cases the models containing the subthalamopallidal excitatory connection (M x.2) performed better than those without (M x.1), in good agreement with the known Parkinsonian electrophysiology (Cruz et al., 2011). Notably, we found that model families (3) and (6), the null models, yielded poor accuracy, with many of the posterior distributions of model MSEpooled falling far below the median of the whole model space (figure 6B), translating to a reduced model probability (figure 6C). In the case of M 6.2 we see that this is accompanied by a high KL divergence (figure 6D). M 4.2 and M 5.2 are the strongest models, with distributions of fits tightly clustered around high values yielding high model evidences. This suggests the importance of including thalamocortical feedback connections in the model. Scoring with ACS suggests model 4.2 is the best of the models owing to its smaller model complexity (parameter divergence from prior). These results further underwrite our face validation of the ABC procedure by demonstrating that its posterior estimates of model probability match well with the known neurobiology of the circuit.
Figure 6 Scaling up the ABC model comparison framework investigating models of the cortico-basal ganglia-thalamic network. 12 competing models (six families, each subdivided into two sub-families) were fitted to empirical data from Parkinsonian rats. Models were fitted to summary statistics of recordings from the motor cortex (M2), striatum (STR), subthalamic nucleus (STN), and external segment of the globus pallidus (GPe). Models were first fit using ABC to estimate the approximate posterior distributions over parameters. To assess relative model performance, 1000 draws were made from each model posterior and the corresponding data were simulated. (A) The posterior model fits for the top three performing models are shown, with autospectra on the diagonal and NPD on the off-diagonal (M 5.2 in light green; M 4.2 in turquoise; and M 1.2 in red). Bounds indicate the interquartile range of the simulated features. (B) Violin plots of the distributions of model accuracies (MSEpooled) of the simulated pseudo-data from the empirical data. (C) The acceptance probability approximation to the model evidence 1-P(M|D) is determined by computing the number of samples from the posterior that exceed the median model accuracy (MSEpooled). (D) The joint space normalized Kullback-Leibler divergence of the posterior from the prior is shown for each model (for the formulation see the second term of equation 9). Large values indicate high divergence and overfitting. (E) Combined scores for accuracy and divergence from priors using ACS.
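To make the scoring concrete, the comparison pipeline described above (posterior accuracy against the pooled-median MSE, penalized by posterior-to-prior divergence) can be sketched as follows. This is an illustrative re-implementation, not the authors' published code, and the exact ACS combination rule used here is an assumption:

```python
import numpy as np

def model_scores(mse_by_model, kl_by_model):
    """Illustrative sketch of the model scoring described in the text.
    mse_by_model maps a model name to the pooled MSE of each of its
    posterior draws; kl_by_model maps the same names to the
    posterior-from-prior KL divergence."""
    # Median accuracy over the whole model space (MSEpooled threshold)
    threshold = np.median(np.concatenate(list(mse_by_model.values())))
    scores = {}
    for name, mse in mse_by_model.items():
        # Acceptance-probability approximation to the model evidence:
        # fraction of posterior draws whose error beats the pooled median
        p_accept = np.mean(mse < threshold)
        # ACS-style combination: accuracy penalized by divergence from
        # the prior (the exact combination rule is an assumption here)
        scores[name] = p_accept - kl_by_model[name]
    return scores
```

A model whose posterior draws all beat the space-wide median accuracy, with no divergence penalty, scores 1.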
4 Discussion
4.1 Summary of the Results
In this paper we formulated a novel framework (figure 1) for the inverse modelling of neural dynamics based upon the ABC-SMC algorithm (Toni et al., 2009). We have provided a face validation of this method when applied to models and data types typically encountered in systems neuroscience. We first demonstrated that the algorithm converges to yield best-fit approximations to the summary statistics of empirical data, yielding posterior estimates over parameters (figure 2). We assessed the accuracy of parameter estimation by confirming that posterior estimates were plausible given the parameters that were known to generate the data (figure 3). Additionally, we used a multi-start procedure to demonstrate that the optimization was robust to local minima and thus generalizable across different realizations of the data. Next, we examined the face validity of the model selection procedures (figure 4). These results demonstrated that the model comparison approach can reliably identify the model that generated the data, even in cases in which more complex models (going up to six sources) were included. Finally, we demonstrated the capacity of the framework to investigate the structure of real-world neuronal circuits using a set of models of the cortico-basal-ganglia-thalamic circuit fit to empirical data (figure 5). Conclusions drawn from this model comparison matched well with the known neurobiology and further underwrite the feasibility of applying this method to answer biologically relevant problems (figure 6).
4.2 ABC for Parameter Estimation of Neural Circuit Models
ABC has established itself as a key tool for parameter estimation in systems biology (Excoffier, 2009; Ratmann et al., 2009; Toni et al., 2009; Turner and Sederberg, 2012; Liepe et al., 2014) but is yet to see wide adoption in systems neuroscience. It is known that ABC will not perform well under certain conditions (for a critical review see Sunnåker et al., 2013). Specifically, it has been shown that the simplest form of ABC algorithm, based upon a rejection-sampling approach, is inefficient in the case where the prior densities lie far from the true posterior (Lintusaari et al., 2016). This problem is alleviated to some degree in biological models where a good amount of a priori knowledge regarding plausible model structures or parameter values exists. This motivates the use of neurobiologically grounded models over phenomenological models, where the ranges of potential parameter values are often unknown.
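The inefficiency of plain rejection sampling under diffuse priors is easy to demonstrate. The following is a minimal toy sketch (a one-parameter Gaussian model, not one of the neural mass models used in this paper, and not the ABC-SMC scheme itself):

```python
import numpy as np

rng = np.random.default_rng(0)

def rejection_abc(observed, simulate, prior_sample, eps, n_draws=10000):
    """Minimal rejection-sampling ABC: keep parameter draws whose
    simulated summary statistic lands within eps of the observed one."""
    accepted = []
    for _ in range(n_draws):
        theta = prior_sample()
        if abs(simulate(theta) - observed) < eps:
            accepted.append(theta)
    return np.array(accepted)

# Toy problem: estimate the mean of a Gaussian from its sample mean.
simulate = lambda theta: rng.normal(theta, 1.0, 50).mean()
# A diffuse prior far wider than the posterior wastes most simulations,
# which is why sequential schemes (ABC-SMC) refine the proposal instead.
posterior = rejection_abc(observed=0.0, simulate=simulate,
                          prior_sample=lambda: rng.uniform(-20, 20),
                          eps=0.5)
```

With this prior, only a few percent of the 10,000 simulations are accepted, illustrating the cost that ABC-SMC is designed to avoid.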
A caveat of simulation-based inverse modelling concerns the timescale of the simulation and the data features to be fitted. The necessarily finite simulation time precludes the examination of slow modes or switching behaviour1 occurring at time-scales beyond those captured in the empirical recordings or simulation duration (see also the discussion below regarding sufficiency of the summary statistics). In the deployment of ABC here, we use a model cast as a set of stochastic delay differential equations, in which the finite-time realization of each noise process will lead to differences in the trajectory of the system between instances (i.e. forward uncertainty). In these cases, optimization with ABC will be drawn towards regions of parameter space where stochasticity does not result in large deviations between realizations, as such deviations increase the uncertainty in the posterior parameter estimates. Corroborating this, it was found that independent realizations of the stochastic model led to highly consistent summary statistics (appendix III).
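As a toy illustration of forward uncertainty in a stochastic delay system, a linear SDDE can be integrated with an Euler-Maruyama step. The equation and parameter values below are illustrative stand-ins, far simpler than the neural mass models used in the paper:

```python
import numpy as np

def simulate_sdde(a=1.0, b=0.8, tau=1.0, sigma=0.3, dt=0.01, T=50.0, seed=0):
    """Euler-Maruyama integration of a toy linear stochastic delay
    differential equation dx = (-a*x(t) + b*x(t - tau)) dt + sigma dW,
    with zero history. Illustrative only."""
    rng = np.random.default_rng(seed)
    n = int(T / dt)
    lag = int(tau / dt)
    x = np.zeros(n)
    for t in range(1, n):
        # Delayed state (zero before the history buffer fills)
        delayed = x[t - 1 - lag] if t - 1 - lag >= 0 else 0.0
        drift = -a * x[t - 1] + b * delayed
        x[t] = x[t - 1] + drift * dt + sigma * np.sqrt(dt) * rng.standard_normal()
    return x

# Distinct noise realizations give different trajectories (forward
# uncertainty) but, for stable parameters, similar summary statistics.
x1, x2 = simulate_sdde(seed=1), simulate_sdde(seed=2)
```

Comparing summary statistics (e.g. variance or spectra) of `x1` and `x2` gives a direct feel for how much forward uncertainty a given parameter set induces.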
Furthermore, the consistency in parameter estimates between multi-starts can also be taken as evidence of low inverse uncertainty, as the values of the estimated parameters did not deviate significantly between realizations of the underlying generative process. It would be of interest (but beyond the scope of this paper) to evaluate the extent to which the variance between experimental samples (e.g. recordings from different animals within the same experimental treatment, or changes in sensor noise) can affect the consistency of parameter estimates (i.e. an examination of predictive validity). Schemes exist for DCM where data features may be weighted in terms of the estimated noise term; a similar extension is likely to be of use for ABC-inverted models, especially in the case where multiple types of summary statistics are combined.
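A minimal sketch of such a noise-weighted combination of features might look as follows. The feature names and the simple inverse-variance weighting are hypothetical, chosen only to illustrate the idea, and do not reproduce a scheme from the paper or from DCM:

```python
import numpy as np

def weighted_cost(features_sim, features_obs, noise_sd):
    """Combine heterogeneous summary statistics (e.g. autospectra and
    NPD) into one cost, weighting each feature by an estimated noise
    term. Hypothetical sketch of the extension discussed in the text."""
    cost = 0.0
    for key in features_obs:
        resid = features_sim[key] - features_obs[key]
        # Features with larger estimated noise contribute less
        cost += np.mean((resid / noise_sd[key]) ** 2)
    return cost / len(features_obs)
```

Doubling the estimated noise on one feature quarters its contribution to the combined cost, so poorly measured features cannot dominate the fit.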
4.3 Sufficiency of the Summary Statistic and ABC Model Selection
The selection of summary statistics is well known to be a vital factor in determining the outcomes of the ABC-estimated posterior (Beaumont et al., 2002; Sunnåker et al., 2013), as well as in model selection, where insufficiency of the statistic can affect models non-uniformly (Robert et al., 2011). Thus we can only interpret the results of a model comparison in terms of each model's capacity to explain the given summary statistic as an abstraction of the complete data. The choice of summary statistic will always introduce a degree of parameter non-identifiability; for instance, an investigation of model behaviour exhibiting switching or chaotic dynamics is unlikely to accurately identify the responsible parameters from a feature such as the finite-time estimated spectrum. In this work we used the directed functional connectivity (NPD) as a data feature by which to constrain our model(s), rather than the complex cross-spectra (from which it can be derived). The NPD exhibits robustness to zero-lag effects arising from volume conduction, which in turn simplifies the estimation of mixing terms in the observer model. In our validation here, we showed that the feature was sufficient to recover known parameters, but it likely entails an increase in the degree of non-identifiability that could be examined in future work.
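As an example of a finite-time spectral summary statistic of the kind discussed, a Welch-style averaged periodogram can be computed with numpy alone. The sampling rate and segment length below are arbitrary choices for illustration; the paper's actual features are the autospectra and NPD:

```python
import numpy as np

def averaged_spectrum(x, fs, seg_len=256):
    """Finite-time spectral summary statistic: a Welch-style averaged
    periodogram over non-overlapping Hann-windowed segments."""
    n_seg = len(x) // seg_len
    window = np.hanning(seg_len)
    scale = fs * (window ** 2).sum()  # power spectral density scaling
    psd = np.zeros(seg_len // 2 + 1)
    for i in range(n_seg):
        seg = x[i * seg_len:(i + 1) * seg_len] * window
        psd += np.abs(np.fft.rfft(seg)) ** 2 / scale
    freqs = np.fft.rfftfreq(seg_len, 1.0 / fs)
    return freqs, psd / n_seg

# A 20 Hz (beta-band) oscillation in noise peaks near 20 Hz.
fs = 200.0
t = np.arange(0, 60, 1 / fs)
x = np.sin(2 * np.pi * 20 * t) + 0.5 * np.random.default_rng(0).standard_normal(t.size)
freqs, psd = averaged_spectrum(x, fs)
```

Note that the 0.78 Hz frequency resolution (fs/seg_len) of this statistic is exactly the kind of finite-time limit that hides slow modes from the inversion.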
1 In appendix III we investigated whether the posterior models exhibited any of these dynamics, for which we found no evidence.
Furthermore, sampling approaches to the estimation of marginal likelihoods in order to perform Bayesian model comparison are challenging to compute (Chib, 1995), and common approximations have been demonstrated to be poor (Penny, 2012). Moreover, sampling approximations to the model evidence such as that described here are highly dependent upon the distance from the true posterior and the sufficiency of the summary statistic. Further work would need to be done in order to understand how ABC estimates of model evidence are limited by a non-vanishing error tolerance (i.e. ε > 0), in which posteriors are only approximate (Dean et al., 2014).
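The dependence of the approximate posterior on a non-vanishing tolerance can be illustrated with a conjugate toy example, where widening ε visibly inflates the ABC posterior relative to the exact one. All modelling choices here are illustrative:

```python
import numpy as np

rng = np.random.default_rng(1)

def abc_posterior_sd(eps, n=20000):
    """Spread of a toy ABC posterior as a function of the error
    tolerance eps. True parameter: the mean of a Gaussian; summary
    statistic: the sample mean of 25 unit-variance observations;
    observed statistic fixed at 0."""
    theta = rng.normal(0.0, 5.0, n)              # draws from the prior
    stat = rng.normal(theta, 1.0 / np.sqrt(25))  # simulated sample means
    accepted = theta[np.abs(stat - 0.0) < eps]   # ABC rejection step
    return accepted.std()

# A larger tolerance yields a visibly over-dispersed approximate
# posterior, which is one way the evidence approximation degrades.
```

Comparing `abc_posterior_sd(0.2)` with `abc_posterior_sd(2.0)` shows the posterior spread growing with ε, directly reflecting the approximation error discussed above.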
4.4 Future Directions for ABC and Mesoscale Neural Modelling
This work follows on from a number of previous works that have performed inference from large-scale models of brain activity and spectral-domain summary statistics of neural recordings, such as their cross-spectra or functional connectivity (Valdes et al., 1999; Rowe et al., 2004; van Albada et al., 2010; Friston et al., 2012; Hadida et al., 2018; Hashemi et al., 2018). Whilst similar in their aims, the computational challenge of the inverse problem has meant that the techniques adopted to solve it have dictated the types of questions to which they can be applied. Previous approaches to constraining models from spectral features have often bypassed finding explicit numerical solutions to models, instead opting to approximate dynamics by estimating the system's transfer function around a local linearization (Valdes et al., 1999; Rowe et al., 2004). Beyond reducing the computational burden of numerical integration, this approach also facilitates the use of techniques such as variational Bayes (Friston et al., 2012) by ensuring that posterior densities conform to a multivariate Gaussian (the Laplace assumption).
Whilst this technique has proven powerful (e.g. Moran et al., 2011; Bastos et al., 2015), it precludes the examination of highly nonlinear models that exhibit structural instabilities (i.e. bifurcations or phase transitions), which result in a non-convex cost function and are thus unlikely to conform to the Laplace assumption. Importantly, these bifurcations are known to exist in neural mass models of the type used here (Aburn et al., 2012) and have been demonstrated to yield multimodal posteriors (Hadida et al., 2018). It would be of future interest to systematically delineate the conditions under which the above approximations hold. For instance, a comparison of posterior parameter estimates computed with ABC and with DCM (i.e. a construct validation), in a model approaching a transition point, would address the question of which approach is best suited to a particular modelling scenario.
Current approaches to the inverse modelling of phenomena such as state transitions or time-dependent fluctuations with DCM discretize these phenomena into either sliding windows (Rosch et al., 2018) or a succession of states that evolve according to a matrix of transition probabilities (Zarghami and Friston, 2020). This follows from an assumption that time-varying behaviour can be separated into fast local dynamics under the control of some slow mode that dictates the succession (Rabinovich et al., 2012). Whilst this approach is useful for understanding the states, it somewhat abstracts the mechanisms that lie behind the transitions, whether due to slow changes in connectivity or parameters (e.g. plasticity), evolution of a slow variable (c.f. an order parameter; Haken et al., 1985), or switching induced by stochastic drives to a model. Examination of these transitions is, for instance, important in models looking to interact with ongoing brain states through stimulation (see for instance West et al., 2020a). The framework described here provides an opportunity to investigate the mechanisms behind these transitions and paves the way for future studies investigating the types of mechanisms that underwrite the statistics of, for instance, electrophysiological bursts (Powanwe and Longtin, 2019; Duchet et al., 2020) or neural microstates (Baker et al., 2014). Previous work has shown that ABC is well suited to applications using highly nonlinear or stochastic systems (see Toni et al., 2009 for an example).
4.5 Conclusions
Overall, we have introduced a framework for parameter estimation and model comparison that draws upon a number of recent developments in simulation-based inference that make it attractive for the inverse modelling of large-scale neural activity. This framework provides a robust method by which large-scale brain activity can be understood in terms of the underlying structure of the circuits that generate it. The scheme avoids appeals to local-linear behaviour and thus opens the way to future studies exploring the mechanisms underlying itinerant or stochastic neural dynamics. We have demonstrated that this framework provides consistent estimation of parameters over multiple instances; that it can reliably identify the most plausible model to have generated an observed set of data; and, through an example application, that it has the potential to answer neurobiologically relevant questions. Whilst this paper constitutes a first validation and description of the method, more work will be required to establish its validity in the context of more complex models, as well as with statistics of time-dependent properties of neural dynamics.
5 Acknowledgments and Funding
T.O.W. acknowledges the UCL CoMPLEX doctoral training program, the Moger Moves donation, and the UCL Bogue Fellowship for funding part of this work. H.C. receives funding from an MRC Career Development Award (MR/R020418/1). S.F.F. receives funding support from the UCL/UCLH NIHR Biomedical Research Centre. L.B. acknowledges funding support from the Leverhulme Trust (RPG-2017-370). The Wellcome Trust Centre for Neuroimaging is funded by core funding from the Wellcome Trust (539208). We thank Peter Magill and Andrew Sharott at the Brain Network Dynamics Unit, Oxford University for making available the experimental data used in this study. We thank all authors of the publicly available toolboxes used in this paper (listed in the supplementary information VI). We thank Eugene P. Duff, Benoit Duchet, and Karl J. Friston for their helpful comments on the manuscript.
6 References
Ableidinger M, Buckwar E, Hinterleitner H (2017) A Stochastic Version of the Jansen and Rit Neural Mass Model: Analysis and Numerics. J Math Neurosci 7:8.
Aburn MJ, Holmes CA, Roberts JA, Boonstra TW, Breakspear M (2012) Critical fluctuations in cortical models near instability. Front Physiol 3:331.
Albin RL, Young AB, Penney JB (1989) The functional anatomy of basal ganglia disorders. Trends Neurosci 12:366–375.
Alexander GE, DeLong MR, Strick PL (1986) Parallel Organization of Functionally Segregated Circuits Linking Basal Ganglia and Cortex. Annu Rev Neurosci 9:357–381.
Aslan B, Zech G (2005) New test for the multivariate two-sample problem based on the concept of minimum energy. J Stat Comput Simul 75:109–119.
Baker AP, Brookes MJ, Rezek IA, Smith SM, Behrens T, Probert Smith PJ, Woolrich M (2014) Fast transient networks in spontaneous human brain activity. Elife 3.
Baker CTH, Buckwar E (2000) Numerical Analysis of Explicit One-Step Methods for Stochastic Delay Differential Equations. LMS J Comput Math 3:315–335.
Baritompa B, Hendrix EMT (2005) On the Investigation of Stochastic Global Optimization Algorithms. J Glob Optim 31:567–578.
Bastos AM, Litvak V, Moran R, Bosman CA, Fries P, Friston KJ (2015) A DCM study of spectral asymmetries in feedforward and feedback connections between visual areas V1 and V4 in the monkey. Neuroimage 108:460–475.
Beaumont MA, Cornuet J-M, Marin J-M, Robert CP (2009) Adaptive approximate Bayesian computation. Biometrika 96:983–990.
Beaumont MA, Zhang W, Balding DJ (2002) Approximate Bayesian Computation in Population Genetics. Genetics 162:2025–2035.
Bevan MD, Magill PJ, Terman D, Bolam JP, Wilson CJ (2002) Move to the rhythm: oscillations in the subthalamic nucleus–external globus pallidus network. Trends Neurosci 25:525–531.
Bolam JP, Hanley JJ, Booth PA, Bevan MD (2000) Synaptic organisation of the basal ganglia. J Anat:527–542.
Bowman AW (1984) An Alternative Method of Cross-Validation for the Smoothing of Density Estimates. Biometrika 71:353.
Breakspear M (2017) Dynamic models of large-scale brain activity. Nat Neurosci 20:340–352.
Buckwar E (2000) Introduction to the numerical analysis of stochastic delay differential equations. J Comput Appl Math 125:297–307.
Chib S (1995) Marginal likelihood from the Gibbs output. J Am Stat Assoc 90:1313–1321.
Chu HY, McIver EL, Kovaleski RF, Atherton JF, Bevan MD (2017) Loss of Hyperdirect Pathway Cortico-Subthalamic Inputs Following Degeneration of Midbrain Dopamine Neurons. Neuron 95:1306–1318.e5.
Cranmer K, Brehmer J, Louppe G (2020) The frontier of simulation-based inference. Proc Natl Acad Sci U S A 117:30055–30062.
Cruz AV, Mallet N, Magill PJ, Brown P, Averbeck BB (2011) Effects of dopamine depletion on information flow between the subthalamic nucleus and external globus pallidus. J Neurophysiol 106:2012–2023.
Daunizeau J, Friston KJ, Kiebel SJ (2009) Variational Bayesian identification and prediction of stochastic nonlinear dynamic causal models. Phys D Nonlinear Phenom 238:2089–2118.
David O, Friston KJ (2003) A neural mass model for MEG/EEG: coupling and neuronal dynamics. Neuroimage 20:1743–1755.
De Schutter E, Bower JM (1994) An active membrane model of the cerebellar Purkinje cell I. Simulation of current clamps in slice. J Neurophysiol 71:375–400.
Dean TA, Singh SS, Jasra A, Peters GW (2014) Parameter Estimation for Hidden Markov Models with Intractable Likelihoods. Scand J Stat 41:970–987.
Deco G, Jirsa V, McIntosh AR, Sporns O, Kötter R (2009) Key role of coupling, delay, and noise in resting brain fluctuations. Proc Natl Acad Sci U S A 106:10302–10307.
Deco G, Jirsa VK, McIntosh AR (2011) Emerging concepts for the dynamical organization of resting-state activity in the brain. Nat Rev Neurosci 12:43–56.
Deco G, Tononi G, Boly M, Kringelbach ML (2015) Rethinking segregation and integration: contributions of whole-brain modelling. Nat Rev Neurosci 16:430–439.
Del Moral P, Doucet A, Jasra A (2012) An adaptive sequential Monte Carlo method for approximate Bayesian computation. Stat Comput 22:1009–1020.
Duchet B, Ghezzi F, Weerasinghe G, Tinkhauser G, Kuhn AA, Brown P, Bick C, Bogacz R (2020) Average beta burst duration profiles provide a signature of dynamical changes between the ON and OFF medication states in Parkinson's disease. bioRxiv:2020.04.27.064246.
Excoffier L, Leuenberger C, Wegmann D (2009) Bayesian Computation and Model Selection in Population Genetics. arXiv:0901.2231.
Friston K, Mattout J, Trujillo-Barreto N, Ashburner J, Penny W (2007) Variational free energy and the Laplace approximation. Neuroimage 34:220–234.
Friston KJ (2011) Functional and Effective Connectivity: A Review. Brain Connect 1:13–36.
Friston KJ, Bastos A, Litvak V, Stephan KE, Fries P, Moran RJ (2012) DCM for complex-valued data: cross-spectra, coherence and phase-delays. Neuroimage 59:439–455.
Grelaud A, Robert CP, Marin J-M, Rodolphe F, Taly J-F (2009) ABC likelihood-free methods for model choice in Gibbs random fields. Bayesian Anal 4:317–335.
Hadida J, Sotiropoulos SN, Abeysuriya RG, Woolrich MW, Jbabdi S (2018) Bayesian Optimisation of Large-Scale Biophysical Networks. Neuroimage 174:219–236.
Haken H, Kelso JAS, Bunz H (1985) A theoretical model of phase transitions in human hand movements. Biol Cybern 51:347–356.
Halliday DM (2015) Nonparametric directionality measures for time series and point process data. J Integr Neurosci 14:253–277.
Halliday DM, Senik MH, Stevenson CW, Mason R (2016) Non-parametric directionality analysis: Extension for removal of a single common predictor and application to time series. J Neurosci Methods 268:87–97.
Hansen JA, Penland C (2006) Efficient Approximate Techniques for Integrating Stochastic Differential Equations. Mon Weather Rev 134:3006–3014.
Hashemi M, Hutt A, Buhry L, Sleigh J (2018) Optimal Model Parameter Estimation from EEG Power Spectrum Features Observed during General Anesthesia. Neuroinformatics 16:231–251.
Horwitz B, Friston KJ, Taylor JG (2000) Neural modeling and functional brain imaging: An overview. Neural Networks 13:829–846.
Jahfari S, Waldorp L, van den Wildenberg WPM, Scholte HS, Ridderinkhof KR, Forstmann BU (2011) Effective Connectivity Reveals Important Roles for Both the Hyperdirect (Fronto-Subthalamic) and the Indirect (Fronto-Striatal-Pallidal) Fronto-Basal Ganglia Pathways during Response Inhibition. J Neurosci 31:6891–6899.
Jansen BH, Rit VG (1995) Electroencephalogram and visual evoked potential generation in a mathematical model of coupled cortical columns. Biol Cybern 73:357–366.
Jaqaman K, Danuser G (2006) Linking data to models: Data regression. Nat Rev Mol Cell Biol 7:813–819.
Kiebel SJ, David O, Friston KJ (2006) Dynamic causal modelling of evoked responses in EEG/MEG with lead field parameterization. Neuroimage 30:1273–1284.
Le Van Quyen M, Chavez M, Rudrauf D, Martinerie J (2003) Exploring the nonlinear dynamics of the brain. In: Journal of Physiology Paris, pp 629–639. Elsevier.
Li J, Nott DJ, Fan Y, Sisson SA (2017) Extending approximate Bayesian computation methods to high dimensions via a Gaussian copula model. Comput Stat Data Anal 106:77–89.
Liepe J, Kirk P, Filippi S, Toni T, Barnes CP, Stumpf MPH (2014) A framework for parameter estimation and model selection from experimental data in systems biology using approximate Bayesian computation. Nat Protoc 9:439–456.
Lintusaari J, Gutmann MU, Dutta R, Kaski S, Corander J (2016) Fundamentals and Recent Developments in Approximate Bayesian Computation. Syst Biol 66:syw077.
Lopes da Silva F (1991) Neural mechanisms underlying brain waves: from neural membranes to networks. Electroencephalogr Clin Neurophysiol 79:81–93.
Lopes da Silva FH, Hoeks A, Smits H, Zetterberg LH (1974) Model of brain rhythmic activity. Kybernetik 15:27–37.
MacKay DJC (2003) Information theory, inference, and learning algorithms. Cambridge University Press.
Magill PJ, Pogosyan A, Sharott A, Csicsvari J, Bolam JP, Brown P (2006) Changes in functional connectivity within the rat striatopallidal axis during global brain activation in vivo. J Neurosci 26:6318–6329.
Magill PJ, Sharott A, Bolam JP, Brown P (2004) Brain State-Dependency of Coherent Oscillatory Activity in the Cerebral Cortex and Basal Ganglia of the Rat. J Neurophysiol 92:2122–2136.
Mallet N, Pogosyan A, Sharott A, Csicsvari J, Bolam JP, Brown P, Magill PJ (2008) Disrupted Dopamine Transmission and the Emergence of Exaggerated Beta Oscillations in Subthalamic Nucleus and Cerebral Cortex. J Neurosci 28:4795–4806.
Marin JM, Pudlo P, Robert CP, Ryder RJ (2012) Approximate Bayesian computational methods. Stat Comput 22:1167–1180.
Marreiros AC, Cagnan H, Moran RJ, Friston KJ, Brown P (2013) Basal ganglia-cortical interactions in Parkinsonian patients. Neuroimage 66:301–310.
Moran RJ, Mallet N, Litvak V, Dolan RJ, Magill PJ, Friston KJ, Brown P (2011) Alterations in brain connectivity underlying beta oscillations in parkinsonism (Kording KP, ed). PLoS Comput Biol 7:e1002124.
Moran RJ, Stephan KE, Seidenbecher T, Pape H-C, Dolan RJ, Friston KJ (2009) Dynamic causal models of steady-state responses. Neuroimage 44:796–811.
Nambu A, Tachibana Y, Chiken S (2015) Cause of parkinsonian symptoms: Firing rate, firing pattern or dynamic activity changes? Basal Ganglia 5:1–6.
Nelsen RB (1999) An Introduction to Copulas. New York, NY: Springer New York.
Nikulin VV, Brismar T (2006) Phase synchronization between alpha and beta oscillations in the human electroencephalogram. Neuroscience 137:647–657.
Palmigiano A, Geisel T, Wolf F, Battaglia D (2017) Flexible information routing by transient synchrony. Nat Neurosci 20:1014–1022.
Paxinos G, Watson C (2007) The rat brain in stereotaxic coordinates. Elsevier.
Penny WD (2012) Comparing dynamic causal models using AIC, BIC and free energy. Neuroimage 59:319–330.
Plenz D, Kitai ST (1999) A basal ganglia pacemaker formed by the subthalamic nucleus and external globus pallidus. Nature 400:677–682.
Powanwe AS, Longtin A (2019) Determinants of Brain Rhythm Burst Statistics. Sci Rep 9:1–23.
Rabinovich MI, Afraimovich VS, Bick C, Varona P (2012) Information flow dynamics in the brain. Phys Life Rev 9:51–73.
Ratmann O, Andrieu C, Wiuf C, Richardson S (2009) Model criticism based on likelihood-free inference, with an application to protein network evolution. Proc Natl Acad Sci U S A 106:10576–10581.
Reis C, Sharott A, Magill PJ, van Wijk BCM, Parr T, Zeidman P, Friston KJ, Cagnan H (2019) Thalamocortical dynamics underlying spontaneous transitions in beta power in Parkinsonism. Neuroimage 193:103–114.
Robert CP, Cornuet J-M, Marin J-M, Pillai NS (2011) Lack of confidence in approximate Bayesian computation model choice. Proc Natl Acad Sci U S A 108:15112–15117.
Robinson PA, Rennie CJ, Wright JJ, Bahramali H, Gordon E, Rowe DL (2001) Prediction of electroencephalographic spectra from neurophysiology. Phys Rev E 63:021903.
Rosch RE, Hunter PR, Baldeweg T, Friston KJ, Meyer MP (2018) Calcium imaging and dynamic causal modelling reveal brain-wide changes in effective connectivity and synaptic dynamics during epileptic seizures (Jbabdi S, ed). PLOS Comput Biol 14:e1006375.
Rowe DL, Robinson PA, Rennie CJ (2004) Estimation of neurophysiological parameters from the waking EEG using a biophysical model of brain dynamics. J Theor Biol 231:413–433.
Sengupta B, Friston KJ, Penny WD (2016) Gradient-based MCMC samplers for dynamic causal modelling. Neuroimage 125:1107–1118.
Silverman BW (2003) Density estimation for statistics and data analysis. Chapman and Hall/CRC.
West et al. (2020): ABC for Inference of Brain Networks
26/03/2021 rev2_RL 33
Sisson SA, Fan Y, Beaumont M (2018) ABC Samplers. In: Handbook of approximate Bayesian computation. Boca Raton: Chapman and Hall/CRC.
Steriade M (2000) Corticothalamic resonance, states of vigilance and mentation. Neuroscience 101:243–276.
Sunnåker M, Busetto AG, Numminen E, Corander J, Foll M, Dessimoz C (2013) Approximate Bayesian computation. PLoS Comput Biol 9:e1002803 Available at: http://www.ncbi.nlm.nih.gov/pubmed/23341757 [Accessed September 4, 2018].
Toni T, Stumpf MPH (2009) Simulation-based model selection for dynamical systems in systems and population biology. Bioinformatics 26:104–110.
Toni T, Welch D, Strelkowa N, Ipsen A, Stumpf MPH (2009) Approximate Bayesian computation scheme for parameter inference and model selection in dynamical systems. J R Soc Interface 6:187–202 Available at: http://rsif.royalsocietypublishing.org/content/6/31/187 [Accessed July 25, 2017].
Traub RD, Wong RKS, Miles R, Michelson H (1991) A model of a CA3 hippocampal pyramidal neuron incorporating voltage-clamp data on intrinsic conductances. J Neurophysiol 66:635–650.
Turner BM, Sederberg PB (2012) Approximate Bayesian computation with differential evolution. J Math Psychol 56:375–385 Available at: http://dx.doi.org/10.1016/j.jmp.2012.06.004.
Valdes PA, Jimenez JC, Riera J, Biscay R, Ozaki T (1999) Nonlinear EEG analysis based on a neural mass model. Biol Cybern 81:415–424.
van Albada SJ, Kerr CC, Chiang AKI, Rennie CJ, Robinson PA (2010) Neurophysiological changes with age probed by inverse modeling of EEG spectra. Clin Neurophysiol 121:21–38.
van Wijk BCM, Cagnan H, Litvak V, Kühn AA, Friston KJ (2018) Generic dynamic causal modelling: An illustrative application to Parkinson's disease. Neuroimage 181:818–830 Available at: https://www.sciencedirect.com/science/article/pii/S1053811918307377?via%3Dihub [Accessed November 13, 2018].
Vogels TP, Rajan K, Abbott LF (2005) Neural Network Dynamics. Annu Rev Neurosci 28:357–376 Available at: http://www.annualreviews.org/doi/10.1146/annurev.neuro.28.061604.135637 [Accessed November 25, 2020].
Wendling F, Ansari-Asl K, Bartolomei F, Senhadji L (2009) From EEG signals to brain connectivity: A model-based evaluation of interdependence measures. J Neurosci Methods 183:9–18.
Wendling F, Bellanger JJ, Bartolomei F, Chauvel P (2000) Relevance of nonlinear lumped-parameter models in the analysis of depth-EEG epileptic signals. Biol Cybern 83:367–378 Available at: http://link.springer.com/10.1007/s004220000160 [Accessed July 4, 2016].
West TO, Berthouze L, Halliday DM, Litvak V, Sharott A, Magill PJ, Farmer SF (2018) Propagation of Beta/Gamma Rhythms in the Cortico-Basal Ganglia Circuits of the Parkinsonian Rat. J Neurophysiol:jn.00629.2017 Available at: http://www.physiology.org/doi/10.1152/jn.00629.2017 [Accessed April 20, 2018].
West TO, Farmer SF, Magill PJ, Sharott A, Litvak V, Cagnan H (2020a) State Dependency of Beta Oscillations in the Cortico-Basal-Ganglia Circuit and their Neuromodulation under Phase Locked Inputs. bioRxiv:2020.03.20.000711.
West TO, Halliday DM, Bressler SL, Farmer SF, Litvak V (2020b) Measuring Directed Functional Connectivity Using Non-Parametric Directionality Analysis: Validation and Comparison with Non-Parametric Granger Causality. Neuroimage:116796 Available at: https://linkinghub.elsevier.com/retrieve/pii/S1053811920302834 [Accessed April 20, 2020].
Zarghami TS, Friston KJ (2020) Dynamic effective connectivity. Neuroimage 207:116453.
Supplementary Information

1 Simulation of Neural Model

1.1.1 Notation

We denote spike rates by x, membrane potentials by v, synaptic gains by H, time constants by τ, and external inputs by A. First order derivatives are indicated using Newton's notation. Subscripts for intrinsic parameters/states denote the ith population. The total output activity of the kth source is denoted G_k (see Section 1.3). Throughout we will use a plain letter for a vector (e.g. y) and a letter with a subscript for a vector element (e.g. y_i).
1.1.2 Modelling of Neural Dynamics

In order to simulate neural signals generated by a neuronal source, we used a coupled network of neural mass models that simulate the field potentials generated by the synchronized activity of large ensembles of homogeneous neurons (Lopes da Silva, 1991; Jansen and Rit, 1995). These models rest on the assumption that an incoming volley of spikes arriving at a population can be converted to a postsynaptic potential by convolution with a synaptic response kernel. Following temporal integration of the postsynaptic potentials, the population can then in turn generate a spike density via a sigmoidal mapping of membrane voltage to spike output frequency. Each of the ith populations may be coupled via intrinsic connectivity to simulate the dynamics within cortical columns. These intrinsically coupled populations are divided into k sources that are themselves extrinsically connected to simulate inter-areal coupling (Wendling et al., 2000; David and Friston, 2003). The ith population within the lth source is given generally by the form:

\dot{v}_i = x_i,
\dot{x}_i = \frac{H_i}{\tau_i}\Bigl(u_i + \sum_j H_{i,j}\,S(v_j) + A^{(l)}\Bigr) - \frac{2}{\tau_i}\,x_i - \frac{1}{\tau_i^2}\,v_i,

EQUATION S1

where the average postsynaptic membrane potential of the ith population is given by v_i, and is parameterized by a synaptic gain H_i and a lumped postsynaptic time constant τ_i. The input to the mass is given between the outer set of brackets and comprises some background noise u_i plus a combined input A^{(l)} from the other K sources. Populations within each kth source (e.g. one cortical column) are intrinsically coupled (instantaneously) via gain factors H that allow inhibitory and excitatory populations to interact. Inhibitory cells have negative connection weights and vice versa for excitatory cells.
To couple distant sources (extrinsic connectivity), the total input to the lth source is the weighted sum (set by the adjacency matrix A) of inputs across all K sources:

A^{(l)} = \sum_{k=1}^{K} A^{(l,k)}\,G_k,

EQUATION S2
where G_k is the total output from the kth source (see below). Membrane potentials are converted to spike densities via the sigmoid operator:

S(v_i) = \frac{1}{1 + e^{-\rho_i v_i}} - \frac{1}{2},

EQUATION S3

which is parameterised by ρ_i to determine the slope of the activation function (a parameter specific to the ith population) and effectively models the variance of the population's firing thresholds. The final terms of Equation S1 reflect the fact that these equations are equivalent to the convolution of the input with an exponential kernel (see Jansen and Rit 1995 for details of the derivation).
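As an illustration, the sigmoid operator can be sketched in Python (the toolbox itself is written in MATLAB; the centred form with a −1/2 offset is assumed here, following related neural mass modelling work, and the slope value 2/3 is taken from the prior table in Section 3):

```python
import math

def sigmoid_rate(v, rho=2/3):
    """Map a membrane potential v to a (centred) firing-rate density.

    The -1/2 offset centres the output so that a population at its
    resting potential (v = 0) emits zero net output; large
    depolarizations saturate at +1/2.
    """
    return 1.0 / (1.0 + math.exp(-rho * v)) - 0.5

print(sigmoid_rate(0.0))              # 0.0 at resting potential
print(round(sigmoid_rate(100.0), 3))  # 0.5 at saturation
```

Increasing ρ sharpens the voltage-to-rate transition, i.e. it narrows the effective spread of firing thresholds in the population.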
To create the full model describing the basal ganglia and motor cortex, masses are coupled with the structure outlined in the schematic of the full model shown in figure 6. We model inhibitory connections by flipping the sign of the corresponding entry of the adjacency matrix, such that they have a subtractive influence. When connectivity is simulated between sources (extrinsic connectivity), the neural activity (given by V) propagates from the kth to the lth source according to a weighted adjacency matrix with entries A^{(l,k)} indicating the strength and polarity of the connection. The adjacency matrix of the full model is given below:

A = \begin{bmatrix} 0 & 0 & 0 & 0 & 0 & A^{(1,6)} \\ A^{(2,1)} & 0 & 0 & 0 & 0 & 0 \\ 0 & -A^{(3,2)} & 0 & A^{(3,4)} & 0 & 0 \\ A^{(4,1)} & 0 & -A^{(4,3)} & 0 & 0 & 0 \\ 0 & -A^{(5,2)} & -A^{(5,3)} & A^{(5,4)} & 0 & 0 \\ A^{(6,1)} & 0 & 0 & 0 & -A^{(6,5)} & 0 \end{bmatrix}

EQUATION S4

where column 1 gives connections projecting from M2; column 2 from the STR; column 3 from the GPe; column 4 from the STN; column 5 from the GPi; and column 6 from the thalamus. Equivalently, the rows give the weights of the inputs to the populations. Variants of this full model can then be created by adjusting the parameters or removing coefficients A^{(l,k)} from the matrix.
1.2 Transmission Delays

We incorporate finite transmission delays by formulating the state space equations to explicitly depend on the past activity of the source from which each connection originates. This was achieved by modifying the extrinsic connectivity matrices to index past values with a delay matrix D, with elements D^{(l,k)} specifying the delay for the connection from source k to source l. Thus, the total external input to source l at time t is:

A^{(l)}(t) = \sum_{k=1}^{K} A^{(l,k)}\,G_k\bigl(t - D^{(l,k)}\bigr),

EQUATION S5

with the constraint that D^{(l,k)} \geq 0.
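A minimal sketch of this delayed coupling in Python (the toolbox itself is written in MATLAB; function and variable names here are hypothetical): delays are held as integer sample counts and past source outputs are read from a stored history buffer.

```python
import numpy as np

def extrinsic_input(G_hist, A, D_samples, t):
    """Delayed extrinsic input to each source (cf. Equation S5).

    G_hist    : (T, K) array of source outputs up to the current step
    A         : (K, K) signed adjacency matrix, A[l, k] = weight k -> l
    D_samples : (K, K) integer delays in samples for each connection
    t         : current time index
    Returns a length-K vector of total delayed inputs.
    """
    K = A.shape[0]
    inp = np.zeros(K)
    for l in range(K):
        for k in range(K):
            if A[l, k] != 0.0:
                # read the past output of source k, delayed by D[l, k] samples
                inp[l] += A[l, k] * G_hist[t - D_samples[l, k], k]
    return inp

# Two sources, 1 -> 0 with weight 2 and a delay of 3 samples
G = np.arange(20, dtype=float).reshape(10, 2)   # G[t, k] = 2*t + k
A = np.array([[0.0, 2.0], [0.0, 0.0]])
D = np.array([[0, 3], [0, 0]])
print(extrinsic_input(G, A, D, t=9))  # source 0 receives 2 * G[6, 1] = 26
```

In a fixed-step integration scheme the delays in ms simply divide by the step size to give `D_samples`, which is why the burn-in period must exceed the longest delay.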
1.3 State Equations of Full Model

In total there are 14 state equations that are an adaptation of the full model described in van Wijk et al. (2018). Each mass (population) comprises two equations (decomposing the second order differential equation into two first order ones) that describe the voltage change of the population. Again, the model is divided into K extrinsically coupled sources (M2, STR, GPe, STN, GPi, and Thal.) that each comprise a set of intrinsically coupled populations.

The 1st source (M2) model consists of 4 populations of neurons (8 states). Each layer is connected via intrinsic connectivity with synaptic gain parameters (H). All populations in the motor cortex have a self-inhibiting connection. In order to notate intrinsic connections from the jth to the ith population, we extend our subscripts for synaptic gains to H_{i,j}. All layers receive independent stochastic inputs u_i. The cortical source comprises:
1) A middle layer composed of middle pyramidal cells with inhibitory self-connection (with strength parameterized by H_{1,1}):

\dot{v}_1 = x_1,
\dot{x}_1 = \frac{H_1}{\tau_1}\Bigl(u_1 - H_{1,1}S(v_1) + \sum_{j \neq 1} H_{1,j}S(v_j) + A^{(1)}\Bigr) - \frac{2}{\tau_1}x_1 - \frac{1}{\tau_1^2}v_1;

EQUATION S6
2) A supra-granular layer composed of superficial pyramidal cells with inhibitory self-connection (with strength parameterized by H_{2,2}):

\dot{v}_2 = x_2,
\dot{x}_2 = \frac{H_2}{\tau_2}\Bigl(u_2 - H_{2,2}S(v_2) + \sum_{j \neq 2} H_{2,j}S(v_j)\Bigr) - \frac{2}{\tau_2}x_2 - \frac{1}{\tau_2^2}v_2;

EQUATION S7
3) The supra-granular layer also contains a separate inhibitory interneuron population, again with an inhibitory self-connection (with strength parameterized by H_{3,3}):

\dot{v}_3 = x_3,
\dot{x}_3 = \frac{H_3}{\tau_3}\Bigl(u_3 - H_{3,3}S(v_3) + \sum_{j \neq 3} H_{3,j}S(v_j)\Bigr) - \frac{2}{\tau_3}x_3 - \frac{1}{\tau_3^2}v_3;

EQUATION S8
4) Finally, the infra-granular layer is made up of deep pyramidal cells, also with an inhibitory self-connection (with strength parameterized by H_{4,4}):

\dot{v}_4 = x_4,
\dot{x}_4 = \frac{H_4}{\tau_4}\Bigl(u_4 - H_{4,4}S(v_4) + \sum_{j \neq 4} H_{4,j}S(v_j)\Bigr) - \frac{2}{\tau_4}x_4 - \frac{1}{\tau_4^2}v_4.

EQUATION S9
Overall, the output of the cortex is equal to the voltage in the deep pyramidal layer, thus:

G_1 = v_4.

EQUATION S10
The 2nd source (STR) is modelled as a single inhibitory population with a self-inhibitory connection (with strength parameterized by H_{5,5}):

\dot{v}_5 = x_5,
\dot{x}_5 = \frac{H_5}{\tau_5}\bigl(u_5 - H_{5,5}S(v_5) + A^{(2)}\bigr) - \frac{2}{\tau_5}x_5 - \frac{1}{\tau_5^2}v_5;

EQUATION S11

and total output:

G_2 = v_5.

EQUATION S12
The 3rd source (GPe) is modelled as a single inhibitory population:

\dot{v}_6 = x_6,
\dot{x}_6 = \frac{H_6}{\tau_6}\bigl(u_6 + A^{(3)}\bigr) - \frac{2}{\tau_6}x_6 - \frac{1}{\tau_6^2}v_6,

EQUATION S13

and total output:

G_3 = v_6.

EQUATION S14
The 4th source (STN) is modelled as a single excitatory population:

\dot{v}_7 = x_7,
\dot{x}_7 = \frac{H_7}{\tau_7}\bigl(u_7 + A^{(4)}\bigr) - \frac{2}{\tau_7}x_7 - \frac{1}{\tau_7^2}v_7,

EQUATION S15

and total output:

G_4 = v_7.

EQUATION S16
The 5th source (GPi) is taken to be a single inhibitory population:

\dot{v}_8 = x_8,
\dot{x}_8 = \frac{H_8}{\tau_8}\bigl(u_8 + A^{(5)}\bigr) - \frac{2}{\tau_8}x_8 - \frac{1}{\tau_8^2}v_8,

EQUATION S17

and total output:

G_5 = v_8.

EQUATION S18
The 6th source (Thal.) is taken to be a single excitatory population with self-inhibition (with strength parameterized by H_{9,9}):

\dot{v}_9 = x_9,
\dot{x}_9 = \frac{H_9}{\tau_9}\bigl(u_9 - H_{9,9}S(v_9) + A^{(6)}\bigr) - \frac{2}{\tau_9}x_9 - \frac{1}{\tau_9^2}v_9,

EQUATION S19

and total output:

G_6 = v_9.

EQUATION S20
1.4 Integration of Stochastic Delay Differential Equations

The model also incorporates a stochastic input for each population which represents endogenous background activity. This input is given by:

u_i(t) = \gamma_i\,\xi_i(t),

EQUATION S21

where

\xi_i(t) \sim \mathcal{N}(0, \sigma^2),

EQUATION S22

and γ_i represents a gain factor scaling the noise for population i. The noise is drawn from a zero-mean normal distribution, with a standard deviation σ that is set for the whole model. In the model presented here the stochastic innovations are independent of the state variable (i.e. they are additive) and a Euler-Maruyama (EM) scheme with a suitably small step size (h = 0.001ms; less than half of the fastest time constant) is appropriate. See Appendix II for an examination of the choice of step size. This numerical scheme has been demonstrated to yield accurate results in similar models (Ableidinger et al., 2017; Palmigiano et al., 2017). A formal assessment of the convergence of the EM scheme is beyond the remit of this paper but we refer the technical reader to (Baker and Buckwar, 2000; Buckwar, 2000). For additive noise, this scheme follows on naturally from forward Euler and deploys a rescaling of the stochastic component by the square root of the integration step to ensure the fluctuations obey a proper Wiener process (as per Hansen et al. 2006):

z(t + h) = z(t) + h\,f\bigl(z(t)\bigr) + \sqrt{h}\,u(t).

EQUATION S23

To allow for settling of the state equations, we set the initial states to zero and then remove the initial transient (3s) as a burn-in.
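The EM update can be sketched on a scalar leaky process with additive noise (a Python illustration with hypothetical names, not the toolbox code); note that the deterministic drift is scaled by h while the noise term is scaled by √h:

```python
import numpy as np

def euler_maruyama(f, x0, sigma, h, n_steps, rng):
    """Integrate dx = f(x) dt + sigma dW with the Euler-Maruyama scheme.

    The drift f is scaled by the step h, while the additive noise is
    scaled by sqrt(h) so that the increments obey a proper Wiener process.
    """
    x = np.empty(n_steps + 1)
    x[0] = x0
    for n in range(n_steps):
        x[n + 1] = x[n] + h * f(x[n]) + np.sqrt(h) * sigma * rng.standard_normal()
    return x

# Example: leaky dynamics x' = -x / tau with additive background noise
tau = 0.01                      # a 10 ms time constant
rng = np.random.default_rng(0)
x = euler_maruyama(lambda v: -v / tau, x0=1.0, sigma=0.5, h=1e-4, n_steps=5000, rng=rng)
print(x.shape)  # (5001,)
```

With state-independent (additive) noise this reduces exactly to forward Euler plus a √h-scaled Gaussian increment, which is why no higher-order stochastic scheme is needed here.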
2 Formulation of Objective Function

The objective function computes the error between the summary statistics of the simulated pseudo-data and those of the empirical data. To do this we use the pooled mean squared error:

\mathrm{MSE}_{pooled} = \frac{1}{W}\sum_{w=1}^{W}\frac{1}{n_w}\sum_{i=1}^{n_w}\bigl(\hat{Y}_w(i) - Y_w(i)\bigr)^2,

EQUATION S24

where \hat{Y}_w and Y_w are the wth simulated and empirical data features respectively, and n_w is the length of the wth data feature, over W features. For instance, in the case of the data features used in this paper, W = 4 for two signals, as there are two power spectra and two directed functional connectivity spectra.
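A sketch of this objective in Python (hypothetical names; this assumes "pooled" means averaging the per-feature MSEs so that features of different lengths contribute equally — the MATLAB toolbox implementation may differ in detail):

```python
import numpy as np

def pooled_mse(sim_features, emp_features):
    """Pooled mean squared error across data features of unequal length.

    Each feature (e.g. an autospectrum or a directed functional
    connectivity spectrum) contributes its own MSE; these are then
    averaged so that long features do not dominate the objective.
    """
    per_feature = [np.mean((np.asarray(s) - np.asarray(e)) ** 2)
                   for s, e in zip(sim_features, emp_features)]
    return float(np.mean(per_feature))

# Two signals -> W = 4 features (two autospectra + two NPD spectra)
emp = [np.zeros(64), np.zeros(64), np.zeros(32), np.zeros(32)]
sim = [np.full(64, 0.1), np.zeros(64), np.zeros(32), np.full(32, 0.2)]
print(pooled_mse(sim, emp))  # (0.01 + 0 + 0 + 0.04) / 4 = 0.0125
```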
3 Table of Prior Parameter Values

Parameter | Mean Value(s) | Units | Log Precision²

Cortical Model
Intrinsic connectivity of middle layer | [400 400 400 0] | s⁻¹ | 1/4 σ²
Intrinsic connectivity of supra-granular (pyramidal) layer | [800 400 400 400] | s⁻¹ | 1/4 σ²
Intrinsic connectivity of supra-granular (inhibitory) layer | [400 400 400 400] | s⁻¹ | 1/4 σ²
Intrinsic connectivity of infra-granular layer | [0 800 400 400] | s⁻¹ | 1/4 σ²
Time constants | [3 2 12 18] | ms | 1/4 σ²
Sigmoid slope | all 2/3 | - | 1/4 σ²

STR Model
Self-inhibition | 400 | s⁻¹ | 1/8 σ²
Time constant | 8 | ms | 1/8 σ²
Sigmoid slope | 2/3 | - | 1/4 σ²

GPe Model
Self-inhibition | 0 | s⁻¹ | 1/8 σ²
Time constant | 8 | ms | 1/8 σ²
Sigmoid slope | 2/3 | - | 1/8 σ²

STN Model
Self-inhibition | 0 | s⁻¹ | 1/8 σ²
Time constant | 4 | ms | 1/8 σ²
Sigmoid slope | 2/3 | - | 1/8 σ²

GPi Model
Self-inhibition | 0 | s⁻¹ | 1/8 σ²
Time constant | 8 | ms | 1/8 σ²
Sigmoid slope | 2/3 | - | 1/8 σ²

Thal. Model
Self-inhibition | 400 | s⁻¹ | 1/8 σ²
Time constant | 8 | ms | 1/8 σ²
Sigmoid slope | 2/3 | - | 1/8 σ²

Delays
M2 → STR | 3 | ms | 1/4 σ²
M2 → STN | 3 | ms | 1/4 σ²
STR → GPe | 7 | ms | 1/4 σ²
STR → GPi | 12 | ms | 1/4 σ²
GPe → STN | 1 | ms | 1/4 σ²
GPe → GPi | 1 | ms | 1/4 σ²
STN → GPe | 3 | ms | 1/4 σ²
STN → GPi | 3 | ms | 1/4 σ²
GPi → Thal. | 3 | ms | 1/4 σ²
Thal. → M2 | 3 | ms | 1/4 σ²
M2 → Thal. | 8 | ms | 1/4 σ²

Connections
M2 → STR | (+) 2000 | s⁻¹ | 1/4 σ²
M2 → STN | (+) 2000 | s⁻¹ | 1/4 σ²
STR → GPe | (-) 1600 | s⁻¹ | 1/4 σ²
STR → GPi | (-) 1600 | s⁻¹ | 1/4 σ²
GPe → STN | (-) 2000 | s⁻¹ | 1/4 σ²
GPe → GPi | (-) 2000 | s⁻¹ | 1/4 σ²
STN → GPe | (+) 2000 | s⁻¹ | 1/4 σ²
STN → GPi | (+) 2000 | s⁻¹ | 1/4 σ²
GPi → Thal. | (-) 1600 | s⁻¹ | 1/4 σ²
Thal. → M2 | (+) 1000 | s⁻¹ | 1/4 σ²
M2 → Thal. | (+) 2000 | s⁻¹ | 1/4 σ²

Observation Model
Observation noise (per source) | [0.2 0.2 0.2 0.2] | scalar | 1 σ²

² These values indicate the variance of the scaling constant c, where c ~ N(μ, σ²). Thus, the log-scaled parameter X with mean value x is given by X = x·exp(c).
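The footnoted log-scaling scheme can be illustrated as follows (a Python sketch assuming the standard log-normal parameterization X = x·exp(c), with c ~ N(μ, σ²); names are hypothetical):

```python
import numpy as np

def draw_scaled_parameter(x, mu, sigma2, rng):
    """Draw a parameter X = x * exp(c) with scaling constant c ~ N(mu, sigma2).

    x is the prior mean value from the table (e.g. a 400 s^-1 gain);
    the log-normal scaling keeps rates and time constants positive.
    """
    c = rng.normal(mu, np.sqrt(sigma2))
    return x * np.exp(c)

rng = np.random.default_rng(1)
draws = np.array([draw_scaled_parameter(400.0, 0.0, 1/8, rng) for _ in range(10000)])
print(draws.min() > 0)  # True: log-normal scaling preserves positivity
```

Under this scheme a larger σ² (i.e. a lower log precision in the table) corresponds to a wider, less informative prior around the tabulated mean.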
4 Computation of Kullback-Leibler Divergence for Multivariate Normal Distribution

In order to compute the full multivariate divergence between the posterior q(θ) = N(μ_q, Σ_q) and prior p(θ) = N(μ_p, Σ_p) distributions over parameters, we make an approximation to a k-dimensional multivariate normal distribution by estimating the mean and covariance of N parameter draws from the joint kernel estimate of the posterior. The Kullback-Leibler divergence is then evaluated from the means and covariances of the distributions:

D_{KL}(q \,\|\, p) = \frac{1}{2}\left[\operatorname{tr}\bigl(\Sigma_p^{-1}\Sigma_q\bigr) + (\mu_p - \mu_q)^{\top}\Sigma_p^{-1}(\mu_p - \mu_q) - k + \ln\frac{\det\Sigma_p}{\det\Sigma_q}\right].

EQUATION S25
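This is the standard closed form for the divergence between two Gaussians; a Python sketch (hypothetical names):

```python
import numpy as np

def kl_mvn(mu_q, cov_q, mu_p, cov_p):
    """KL divergence KL(q || p) between two multivariate normal densities.

    Used here to measure how far a Gaussian approximation of the ABC
    posterior has diverged from the prior; large values flag overfitting.
    """
    k = mu_q.size
    cov_p_inv = np.linalg.inv(cov_p)
    diff = mu_p - mu_q
    term_trace = np.trace(cov_p_inv @ cov_q)
    term_mahal = diff @ cov_p_inv @ diff
    # slogdet avoids overflow of the determinant in high dimensions
    _, logdet_p = np.linalg.slogdet(cov_p)
    _, logdet_q = np.linalg.slogdet(cov_q)
    return 0.5 * (term_trace + term_mahal - k + logdet_p - logdet_q)

mu = np.zeros(3); cov = np.eye(3)
print(kl_mvn(mu, cov, mu, cov))        # 0.0: identical distributions
print(kl_mvn(mu + 1.0, cov, mu, cov))  # 1.5: a shifted posterior diverges
```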
5 Computation of Pooled Mean Squared Error

To compute the distance between summary statistics computed from simulated data and those of the target data, we compute the pooled mean squared error:

\mathrm{MSE}_{pooled} = \frac{1}{N_f}\sum_{n=1}^{N_f}\frac{1}{L_n}\sum_{i=1}^{L_n}\bigl(\hat{Y}_n(i) - Y_n(i)\bigr)^2,

EQUATION S26

where N_f is the number of features (i.e. N_f = 16 for data comprising 4 channels: 4 autospectra + 12 directed functional connectivity spectra), and L_n is the length of data feature n.
6 List of Toolboxes Used

We thank all authors of the toolboxes below:

Toolbox Name | Author | License | Year | Accessed
allcomb | 'Jos' | BSD-3-Clause | 2018 | https://uk.mathworks.com/matlabcentral/fileexchange/10064-allcomb-varargin
boundedline-pkg | Kelly Kearney | MIT | 2015 | https://github.com/kakearney/boundedline-pkg
bplot | Jonathan C. Lansey | BSD-3-Clause | 2015 | https://uk.mathworks.com/matlabcentral/fileexchange/42470-box-and-whiskers-plot-without-statistics-toolbox
brewermap | Stephen Cobeldick | Apache 2.0 | 2014 | https://github.com/DrosteEffect/BrewerMap
export_fig | Oliver J. Woodford, Yair M. Altman | BSD-3-Clause | 2014 | https://github.com/altmany/export_fig
highdim | Brian Lau | GNU-3 | 2017 | https://github.com/brian-lau/highdim
hotellingT2 | Antonio Trujillo-Ortiz | BSD-3-Clause | 2002 | https://uk.mathworks.com/matlabcentral/fileexchange/2844-hotellingt2
linspecer | Jonathan C. Lansey | BSD-3-Clause | 2015 | https://github.com/davidkun/linspecer
neurospec 2.2 | David Halliday | GNU-2 | 2018 | http://www.neurospec.org/
ParforProgMon | Dylan Muir, Willem-Jan de Goeij, The MathWorks, Inc. | BSD-3-Clause | 2016 | https://github.com/DylanMuir/ParforProgMon
splitvec | Bruno Luong | BSD-3-Clause | 2009 | https://uk.mathworks.com/matlabcentral/fileexchange/24255-splitvec
SPM-12 | The FIL Methods Group | GNU-2 | 2020 | https://www.fil.ion.ucl.ac.uk/spm/software/spm12/
violin | Holger Hoffmann | BSD-3-Clause | 2015 | https://uk.mathworks.com/matlabcentral/fileexchange/45134-violin-plot
weightedcov | Liber Eleutherios | BSD-3-Clause | 2008 | https://uk.mathworks.com/matlabcentral/fileexchange/37184-weighted-covariance-matrix
7 Appendices

7.1 Appendix I - Table of Methods Used

Method | Notes | Reference(s)
Convolution based neural mass models | Mean field approximation to homogeneous neural population activity. | (Lopes da Silva et al., 1974; Jansen and Rit, 1995; David and Friston, 2003)
Neural mass model of the cortico-basal ganglia-thalamic circuit | Population model with structure and parameterization defended and explained in given references. | (Moran et al., 2011; van Wijk et al., 2018)
Likelihood free inference with Approximate Bayesian Computation | References are for introductions/tutorials. | (Beaumont et al., 2002; Sunnåker et al., 2013; Sisson et al., 2018)
Sequential Approximate Bayesian Computation | Improvement on ABC to aid convergence and computational efficiency. | (Beaumont et al., 2009; Toni et al., 2009)
Kernel density approximation to ABC marginals and copula estimation of dependence | We use a cross-validated log-likelihood optimization of the kernel density bandwidth for estimation of the marginals, with approximate maximum likelihood to fit a t-copula to estimate the joint. | (Li et al., 2017)
Model selection with ABC optimized models | Using acceptance rates as an approximation to the marginal likelihood. | (Grelaud et al., 2009; Toni and Stumpf, 2009)
Electrophysiological recordings from Parkinsonian rats | Field recordings made in the experimental 6-OHDA model of Parkinsonism. | (Mallet et al., 2008; Moran et al., 2011)
Non-parametric Directionality for directed functional connectivity estimates | Used as a summary statistic of between-signal interactions. | (Halliday et al., 2016; West et al., 2020b)
7.2 Appendix II - Examination of Integration Step-size

The posterior model 5.2 from figure 5 was simulated (256s) with MAP parameters but with different integration step-sizes. The Euler-Maruyama step size was varied from 2ms to 0.1ms. Simulations show that the numerical solutions are convergent with smaller step-sizes. Simulations in this paper use a 0.5ms step size; no qualitative change in the simulated summary statistics was found for a smaller step size of 0.1ms.
7.3 Appendix III - Examination of forward uncertainty of posterior model

The posterior model 5.2 from figure 5 was repeatedly (n = 25) simulated (256s) with MAP parameters but with different realizations of the underlying noise process. The figure shows the averaged simulated data features alongside their interquartile range in the bounds. No major deviations were found across realizations, with all giving consistent spectral features.
7.4 Appendix IV - Outline of Software and Scripts

We provide scripts to perform the analyses described in this paper in the openly accessible GitHub repository. The script "West2021_Neuroimage_Figures.m" provides a master file linking to the respective computations required for each figure. Note that in some circumstances, the functions can be parallelized to run across multiple instances of MATLAB, running for instance on a distributed system (e.g. "Figure5_6_i_modelfitting.m"). These scripts use a shared working list, called 'WML.mat', that must be accessible to all MATLAB instances.

The general structure of the software is given in the schematic below. The software can be broken down into (1) project specific files and (2) generic procedures written to be applicable across a range of different datasets and models. Individual project folders must include: (A) a dataset given as time series data, or data preprocessed and already transformed to the feature space of the summary statistics; (B) a set of model functions describing the equations of motion and an integration scheme that outputs a time series; (C) model priors specified in terms of their expectations and precisions; (D) a folder of model specifications outlining the parameterization of the different models to be compared. All functions take a common configuration structure 'R' that sets various parameters and paths for the analyses. These are set up using "ABC_setup_projectname.m", which must be written for each new project. Using the given template project, it should be possible to adapt the files to suit other sets of data and models. For bug-fixes, troubleshooting, or pull requests please see the GitHub repository at: https://github.com/twestWTCN/ABCNeuralModellingToolbox.git.
Schematic of software structure and key functions. The software is split into two main branches: (A) project specific files designed for a particular model and dataset; (B) generic procedures written to be broadly applicable across projects. Functions often use a configuration variable 'R', created using "ABC_setup_....m", that defines various important parameters for the fitting and analysis.

ABCNeuralModellingToolbox
  /Projects/project name... - project specific files
    /data - store of empirical data
    /model_fx - model dynamics and integration
      ABC_fx_... - equations of motion for model dynamics
      ABC_fx_compile_... - parameterization, coupling, integration
      getStateDetails - retrieve dimensions of model states and parameters
    /priors - model priors
      getModelPriors - sets model priors
    /routine - individual routines; sub-project folder
      ABC_setup_... - configuration structure
      ModelSpecs - sets of files describing individual model parameters
    /outputs - saved fits, analyses, etc.
  /sim_machinery - generic ABC procedures
    /datafeatures - construct summary statistics
      constructGenCrossMatrix - generic function for several features
    /graph_outputs - graphical outputs
      genplotter_200420 - plots data features
    /inversion - model inversion
      SimAn_ABC_201120 - the main ABC algorithm
      postEstCopula - estimates kernel density proposal
      postDrawMVN - makes draws from multivariate normal proposal
      postDrawCopula - makes draws from multivariate proposal
      compareData_180520 - estimates error of summary statistics from data
    /model_analysis - model comparison
      modelCompMaster_160620 - model comparison procedures
      modelProbs_160620 - runs draws from posterior
    /observer - observation modelling
      observe_data_280220 - observation model
    /simulations - related to integration of model
      innovate_timeseries - sets up stochastic innovations
      setSimTime - set time vector for simulations
  /ABC_dependencies - external toolboxes