# Dynamical estimation of neuron and network properties II: Path integral Monte Carlo methods.


Biol Cybern (2012) 106:155–167

DOI 10.1007/s00422-012-0487-5

ORIGINAL PAPER

Dynamical estimation of neuron and network properties II: path integral Monte Carlo methods

Mark Kostuk · Bryan A. Toth · C. Daniel Meliza · Daniel Margoliash · Henry D. I. Abarbanel

Received: 16 January 2012 / Accepted: 14 March 2012 / Published online: 13 April 2012

© Springer-Verlag 2012

Abstract Hodgkin–Huxley (HH) models of neuronal membrane dynamics consist of a set of nonlinear differential equations that describe the time-varying conductance of various ion channels. Using observations of voltage alone we show how to estimate the unknown parameters and unobserved state variables of an HH model in the expected circumstance that the measurements are noisy, the model has errors, and the state of the neuron is not known when observations commence. The joint probability distribution of the observed membrane voltage and the unobserved state variables and parameters of these models is a path integral through the model state space. The solution to this integral allows estimation of the parameters and thus a characterization of many biological properties of interest, including channel complement and density, that give rise to a neuron's electrophysiological behavior. This paper describes a method for directly evaluating the path integral using a Monte Carlo numerical approach. This provides estimates not only of the expected values of model parameters but also of their posterior uncertainty. Using test data simulated from neuronal models comprising several common channels, we show that short (<50 ms) intracellular recordings from neurons stimulated with a complex time-varying current yield accurate and precise estimates of the model parameters as well as accurate predictions of the future behavior of the neuron. We also show that this method is robust to errors in model specification, supporting model development for biological preparations in which the channel expression and other biophysical properties of the neurons are not fully known.

M. Kostuk · B. A. Toth
Department of Physics, University of California, San Diego, 9500 Gilman Drive, La Jolla, CA 92093-0402, USA

C. D. Meliza (B) · D. Margoliash
Department of Organismal Biology and Anatomy, University of Chicago, 1027 E 57th Street, Chicago, IL 60637, USA
e-mail: dmeliza@uchicago.edu

H. D. I. Abarbanel
Marine Physical Laboratory, Department of Physics (Scripps Institution of Oceanography), Center for Theoretical Biological Physics, University of California, San Diego, 9500 Gilman Drive, La Jolla, CA 92093-0374, USA
e-mail: habarbanel@ucsd.edu

Keywords Ion channel properties · Markov Chain Monte Carlo · Data assimilation · Neuronal dynamics

1 Introduction

The dynamical responses of neurons and nervous system networks to time-varying inputs depend on ion channels that are gated by membrane voltage and chemical ligands. The complex, nonlinear dynamics of the voltage-gated channels (Johnston and Wu 1995; Graham 2002) control the generation of action potentials and determine their shape and frequency. Determining which channels are present in a class of neuron, their biophysical properties, and how these contribute to the phenomenological behavior of the neurons is generally a painstaking process involving extensive pharmacological manipulation.

An alternative approach that we explore here is to record a neuron's response to a time-varying current and use these data to estimate properties of a biophysical model of the neuron. Accurate predictions of the neuron's response to novel stimuli by such a model indicate that the model may be used to make inferences about the biological properties of the neuron. The measurements are necessarily limited to a small subset of the many state variables in such a model, and there can be many unknown parameters. We have developed a systematic approach to this problem that provides an exact statistical setting for transferring information from measurements to a model, which we described in Part I of this paper (Toth et al. 2011), along with a variational approximation to the high-dimensional path integral involved in this approach.

Here, in Part II, we move beyond the variational approximation to a numerical evaluation (also approximate) of the integral over paths of the model state through and beyond the temporal window of observations. By sampling from the distribution of likely paths we can obtain an estimate of the state and parameter values that best represent the data, as well as statistics about the uncertainty of these estimates. This posterior uncertainty indicates how much the data, as well as the dynamics of the model, constrain the parameter and state estimates, thereby providing additional information that can be used to guide model selection and inference as well as experimental design.

Small error bounds indicate that the amount of data is sufficient to make confident statements about the underlying biophysical properties of the neuron; large error bounds for a parameter indicate either that the data is insufficient in quantity or dynamical range, that the behavior of the neuron is insensitive to the value of that parameter, or that the model is in error. We can also integrate forward in time beyond the observation interval using samples from the full posterior distribution to obtain a predictive distribution, which is useful for model validation and selection.

After a brief summary of the family of Hodgkin–Huxley (HH) biophysical models we utilize and of the path integral formulation of the data assimilation problem (for more detail see Toth et al. 2011), we describe a Markov Chain Monte Carlo (MCMC) algorithm for sampling from the distribution of paths conditioned on the recorded data. We also present an implementation of the algorithm suitable for highly parallelized devices such as graphical processing units (GPUs). We then repeat a number of the "twin experiments" from Part I, in which data simulated from a model is used in our procedure and the estimates are compared to the parameters used to generate the data as well as the observed and unobserved states.

The goal of these numerical experiments is to determine under what conditions the data assimilation procedure is accurate. We do not pursue two questions addressed in Part I related to the frequency of the input current and the amount of noise in the measurements, taking values that we demonstrated work well with the variational method. Instead we focus on the robustness of the method to model errors: when the model contains channels that are absent in the data and vice versa. As in Toth et al. (2011) we analyze two simple HH models: one has Na, K, and leak currents (NaKL model), and the other has in addition an Ih current (NaKLh model). We test for model robustness by generating data using one model and estimating parameters using the other, and show that we can identify missing or extra channels in the model. The numerical approach to evaluating the path integral described here is particularly suited to these situations, because it takes into account such model errors, which are inevitable in the study of biological neurons.

2 Methods

2.1 General framework

In studying the biological properties of neurons, we can typically measure only the membrane voltage, V(t), while injecting a known stimulus current Iapp(t). We would like to infer properties of the voltage-gated ion channels that open and close in response to the current injection. The task is to select a model that is consistent with the observations and to estimate the values of the parameters that have some connection to biologically interesting properties of the neuron and the system of which it is a part. Other types of measurements are possible, including voltage-clamp measurements of current flow or optical measurements from fluorescent reporters; these are not within the scope of the current paper. The basic problem, of estimating unobservable parameters and states from a limited set of observations, in principle remains the same, although the kinetics of some measurements (such as calcium indicators) may be sufficiently slow so as to challenge the modeling effort.

In making such inferences, models that are explicitly based on biophysical entities are preferable to more abstract phenomenological ones (e.g., integrate-and-fire). The biophysical models typically comprise a set of ordinary differential equations, for current conservation across the membrane (and possibly between different compartments) and for the kinetics of the gating variables that describe the opened and closed configurations of various voltage-gated channels (Sect. 2.5). In general, the system is described by a D-dimensional state vector that includes the observable voltage V(t) and a set of unobservable variables ai(t); i = 1, 2, ..., D − 1 associated with other compartments and the permeability of each of the voltage-gated channels included in the model. The models we examine in this paper are relatively simple, with a single compartment and up to five state variables associated with three channel types. The methods developed here and in our earlier paper (Toth et al. 2011) can be applied to specific, more complex settings, but will require richer models and possibly additional data.

The second component of the model is the process of observation. We make measurements of the voltage at discrete times over some interval tn = {t0, t1, ..., tm = T}. We label these observations y(tn) = y(n); n = 0, 1, ..., m, and they are related to the true voltage of the neuron through some measurement function h(x) that incorporates noise arising from various sources. The discrete-time nature of the observations suggests that the model be stated as a rule in discrete time taking the system at time tn to the state at time tn+1. This rule may be the specific formulation of a numerical solution algorithm for the underlying differential equations. Both the discrete-time rule and the observation function may involve a collection of unknown parameters, which we denote as p.

Using the observations y(n) of the voltage, we wish to establish the full collection of state variables x(n) at times in the observation window, in particular at the end of the measurements t = T. With estimates of x(T), of the parameters p, and knowledge of the stimulus for t > T, we may use the model to predict the neuron's behavior for t > T. Our goal is to achieve y(t) ≈ h(x(t)) for t > T, as these are the observed quantities. If these predictions are accurate for a broad range of biologically plausible stimuli, then the estimates of the model parameters provide a parsimonious, biophysically interpretable description of the neuron's behavior.

In the present paper the data are simulated from a model which has the form of a set of ordinary differential equations (Sect. 2.5), which we solve by discretizing time. After choosing a realistic set of parameters and initial states, we generate a solution to the equations. We select a subset of the output, here the voltage alone, and add noise to yield the observed quantities y(n). This transfer of information from measurements to models is called data assimilation, following the name given in the extensive and well developed geophysical literature on this subject (Evensen 2009). The technique for testing data assimilation methods in cases where the data is generated by the model is known as a "twin experiment."

2.2 Path integral formulation

As discussed in Toth et al. (2011), one can cast the general set of questions in data assimilation when one has noisy data, model errors, and uncertainty about the initial state of the model x(t0) = x(0) into an integral over the states x(tn) = x(n); {t0, t1, ..., tm = T} during the measurement interval [0, T] at the measurement times tn. If we denote the path in state space through this time interval as X = {x(0), x(1), ..., x(m)}, then the path is a location in (m + 1)D-dimensional space.

The expected value, conditional on the measurements Y = {y(0), y(1), ..., y(m)}, of any function G(X) along the path is given by the (m + 1)D-dimensional integral

$$
E[G(X)\mid Y] = \frac{\int dX\, e^{-A_0(X,Y)}\, G(X)}{\int dX\, e^{-A_0(X,Y)}}. \qquad (1)
$$

The action A0(X, Y) is given in terms of (1) the conditional mutual information of a state x(n) and a measurement y(n), conditioned on measurements up to the time tn−1, Y(n − 1) = {y(0), y(1), ..., y(n − 1)}; (2) the transition probability P(x(n + 1)|x(n)) to arrive in the state x(n + 1) at time tn+1 given the state x(n) at tn; and (3) the distribution of states at t0, P(x(0)), as

$$
\begin{aligned}
-A_0(X,Y) &= \sum_{n=0}^{m} \log\left[\frac{P(x(n), y(n)\mid Y(n-1))}{P(x(n)\mid Y(n-1))\, P(y(n)\mid Y(n-1))}\right] + \sum_{n=0}^{m-1} \log P(x(n+1)\mid x(n)) + \log P(x(0)) \\
&= \sum_{n=0}^{m} \log\left[\frac{P(y(n)\mid x(n), Y(n-1))}{P(y(n)\mid Y(n-1))}\right] + \sum_{n=0}^{m-1} \log P(x(n+1)\mid x(n)) + \log P(x(0)). \qquad (2)
\end{aligned}
$$

The first sum gives the information transferred from the measurement y(n) to the state x(n), conditioned on previous measurements Y(n − 1). The second term represents the underlying dynamics of the model that moves the state forward one step in time, and the last term is the uncertainty in the state at the time t0 of the initial measurement.

Approximations to A0(X, Y) were discussed in Toth et al. (2011):

– If the dynamical rule is written as ga(x(n + 1), x(n), p) = 0; a = 1, 2, ..., D in the case of no model errors, then the transition probability P(x(n + 1)|x(n)) is a delta function, P(x(n + 1)|x(n)) = δ^D(g(x(n + 1), x(n), p)). With model errors this is broadened, and with a Gaussian approximation to these model errors we write

$$
P(x(n+1)\mid x(n)) \propto \exp\left[-\frac{1}{2} \sum_{a=1}^{D} R_{fa}\, \big(g_a(x(n+1), x(n), p)\big)^2\right]. \qquad (3)
$$

When the differential equation solver is explicit, g(x(n + 1), x(n), p) = x(n + 1) − f(x(n), p).

– If the measurements are independent at different times, and the noise in each measurement is Gaussian, then the information transfer contribution to A0(X, Y) is proportional to

$$
\frac{R_m}{2} \sum_{n=0}^{m} \sum_{l=1}^{L} \big(y_l(n) - x_l(n)\big)^2. \qquad (4)
$$

These standard assumptions may be easily replaced within the context of the path integral, introducing little additional computational challenge using the Monte Carlo approach utilized below.

For these standard assumptions

$$
A_0(X,Y) = \frac{R_m}{2} \sum_{n=0}^{m} \sum_{l=1}^{L} \big(y_l(n) - x_l(n)\big)^2 + \frac{1}{2} \sum_{n=0}^{m-1} \sum_{a=1}^{D} R_{fa}\, \big(g_a(x(n+1), x(n), p)\big)^2 - \log P(x(0)). \qquad (5)
$$

Further, we take the initial distribution of states to reflect total ignorance of the initial state: P(x(0)) is a uniform distribution. It then factors out of all calculations such as those in Eq. (1). Note that A0(X, Y) is not Gaussian in the state variables, as the function f(x, p) is nonlinear in the x.

The expected value of a state variable comes from G(X) = X, and the marginal distribution of xc(n), Pc,n(z), arises from G(X) = δ(z − xc(n)). From quantities such as these we can answer important sets of questions in the data assimilation process.
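The terms of Eq. (5) are concrete enough to code directly. Below is a minimal sketch of the action for a scalar state observed at every time point, with an explicit one-step map f; the toy dynamics, and the choice of scalar Rm and Rf, are stand-ins for illustration only:

```python
import numpy as np

def action(x, y, f, Rm, Rf):
    """A0(X, Y) of Eq. (5) for a scalar state observed at every time point,
    with the uniform P(x(0)) dropped as a constant."""
    measurement = 0.5 * Rm * np.sum((y - x) ** 2)   # information-transfer term
    g = x[1:] - f(x[:-1])                           # explicit solver: g = x(n+1) - f(x(n))
    model_error = 0.5 * Rf * np.sum(g ** 2)         # model-error term
    return measurement + model_error

# a path that satisfies the toy dynamics exactly, with noiseless data, has zero action
f = lambda x: 0.9 * x
x_true = np.empty(5)
x_true[0] = 1.0
for n in range(4):
    x_true[n + 1] = f(x_true[n])
assert action(x_true, x_true, f, Rm=4.0, Rf=1e3) == 0.0
```

Any deviation of the path from the data raises the first term, and any violation of the one-step map raises the second, which is what the Monte Carlo sampler below trades off.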

2.3 Monte Carlo evaluation

Using our formulation of the conditional probability density P(X|Y) for a path through the state space, we would like to evaluate approximations to quantities such as means, covariances about those means, and marginal distributions of parameters or state variables. These can be used to make estimates and predictions of quantities in the model, consistent with the observed data. Each of these quantities can be written as path integrals of the form given in Eq. (1), where G(X) is chosen to be some interesting function of the path. The numerical challenge is to evaluate these path integrals.

One approach is to seek a stationary path where

$$
\frac{\partial A_0(X, Y)}{\partial X} = 0, \qquad (6)
$$

and we have explored this in Toth et al. (2011) when the dynamics is deterministic, namely Rf → ∞.

In working with data from experiments, the model is an incomplete representation of the underlying processes, and the deterministic variational method, where Rf → ∞, imposes an equality constraint in the optimization that is likely to be too strong. The minimization of the action in Eq. (6) should be better as it embodies the notion of model errors. In practice, one can consider using Eq. (6) to select an initial path from which to begin the Monte Carlo procedure we outline here. Indeed, that is how we selected an initial guess for the path in the iterative algorithm described below.

An alternative to the variational approach is to generate a series of paths {X(1), ..., X(Npaths)} that are distributed in (m + 1)D-dimensional path space according to P(X|Y) ∝ exp[−A0(X, Y)]. We can use these paths to approximate the distribution with

$$
P(X\mid Y) \approx \frac{1}{N_{\text{paths}}} \sum_{j=1}^{N_{\text{paths}}} \delta\big(X - X^{(j)}\big).
$$

An estimate of the expected value of a function G(X) on the path follows as

$$
\langle G(X) \rangle = \int dX\, G(X)\, P(X\mid Y) \approx \frac{1}{N_{\text{paths}}} \sum_{j=1}^{N_{\text{paths}}} G\big(X^{(j)}\big). \qquad (7)
$$

These paths can be thought of as representing the many possible time evolutions of the system state from an ensemble of initial conditions. They each evolve through path space according to the dynamics embodied in the transition probabilities entering the action A0(X) (we drop the explicit reference to the observations Y now). The parameters are considered as constants with respect to the dynamics, i.e., dp(t)/dt = 0, and the distribution that is obtained reflects their influence on the state vector x(n) through the model equations f(x(n), p).

We require a method which will produce paths distributed according to exp[−A0(X)]. There are several path integral Monte Carlo (PIMC) methods, such as Metropolis–Hastings or Hybrid Monte Carlo, that are designed to achieve this (Metropolis et al. 1953; MacKay 2003; Neal 1993). These methods make a biased random walk through the path (search) space that approaches the desired distribution.
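Once sample paths are in hand, Eq. (7) turns expectations into plain averages. A small stand-in illustration, with "paths" drawn directly from a known distribution in place of MCMC output (all names here are hypothetical):

```python
import numpy as np

rng = np.random.default_rng(0)

# stand-in for MCMC output: Npaths sampled paths, each a point in path space
Npaths, dim = 100_000, 4
paths = rng.normal(loc=2.0, scale=0.5, size=(Npaths, dim))

# Eq. (7): <G(X)> ~ (1/Npaths) * sum_j G(X^(j)); here G reads out one component
G = lambda X: X[:, 0]
estimate = G(paths).mean()
assert abs(estimate - 2.0) < 0.01   # matches the known mean of the sampler
```

The same average with G chosen as an indicator function yields marginal histograms, and with centered products it yields covariances.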

2.4 Metropolis–Hastings Monte Carlo

One of the earliest developed methods for sampling from a high-dimensional distribution is the Metropolis–Hastings Monte Carlo method (Metropolis et al. 1953; Hastings 1970). This generates a sequence of paths X(0), X(1), ..., X(k) from a random walk that is biased through an acceptance criterion that depends upon the distribution of interest. It is an example of an MCMC method because the sequence of paths may themselves be considered as states of a Markov process. A Markov chain consists of a set of accepted paths, the set being indexed by the iterations k of the procedure.

The Metropolis–Hastings Monte Carlo method works by generating a new path X(k+1) from the current path X(k) in two steps. First, a candidate path Xproposed is proposed by adding an unbiased random displacement to the current path Xcurrent. The displacement may be to any subset of the components of Xcurrent; we restrict the distribution of this perturbation to be symmetric, assuring that the transition Xcurrent → Xproposed is as likely as the reverse Xproposed → Xcurrent, to ensure that the chain remains Markov.


Next, the proposed path is either accepted as the next path in the sequence, X(k+1) = Xproposed, or it is rejected, so X(k+1) = X(k). The probability of acceptance is

$$
\min\Big[1,\ \exp\big(-\big(A_0(X^{\text{proposed}}) - A_0(X^{\text{current}})\big)\big)\Big]. \qquad (8)
$$

This says that proposed changes that lower the action are accepted, while those that increase the action are accepted with probability exp[−ΔA0].

2.4.1 General procedure

An initial path X(0) is supplied by the user and set to be the current path at iteration k = 0; the observed time series Y is data that is loaded in from a file. MCMC algorithms may take some time to converge to the correct distribution, and selecting an initial path that is close to the true solution generally leads to much faster convergence rates. In principle, the Metropolis–Hastings algorithm will eventually sample from the entire posterior distribution, but when distributions are multimodal, in practice the sampling chains will tend to stay in local minima corresponding to the initial guess. Our approach to this problem, which is common to all MCMC algorithms, is to use another optimization scheme (Toth et al. 2011) to perform a coarse search over broad parameter bounds and supply the result of this procedure as the starting point for the MCMC chains.

The MCMC calculation proceeds in two distinct phases. The first, "burn-in" phase iterates this initial path guess while adjusting the size of the random perturbation α so that the acceptance rate of paths according to the Metropolis–Hastings criterion is approximately one-half. Adjusting the step size between iterations violates the Markovian requirement for symmetric transitions between the subsequent paths in the chain; however, it also allows the chain to evolve more efficiently. Due to this violation, no statistics are gathered during the first phase.

During the second phase we uniformly sample from the chain to collect the constituent paths of our distribution. It is therefore important that α be held fixed during this period, at a value determined during the first phase. Other than that, the iterations proceed identically during both phases.

A single iteration consists of the following steps:

(1) If k = 0, evaluate A0(X(0)).
(2) If k > 0, propose a change to component(s) of X: Xi_proposed ← Xi(k) + U(−α/2, α/2) · ri.
(3) Evaluate A0(Xproposed) and ΔA0 = A0(Xproposed) − A0(X(k)).
(4) If U(0, 1) < min[1, exp(−ΔA0)], then X(k+1) ← Xproposed (accept the change). Otherwise let X(k+1) ← X(k) (reject the change).

This loop is repeated until k = kfinal, when sufficient statistics on the paths distributed as exp[−A0(X)] have been collected. The subscript i refers to any component of the path X, including individual time steps and the global parameters; ri is a scale factor for perturbations to the ith component of X.

The perturbation step size α is a dimensionless parameter between zero and one; it is set to be the same for all dimensions of X, allowing any inter-dimensional scaling to be done by the user via the bounds. This is a simplification, not a requirement of the procedure. In this way, the actual perturbation to a component of X is given by a random number U(−α/2, α/2) times the bounds-range ri, where U is a uniformly distributed random value between −α/2 and α/2.
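The steps above can be sketched compactly. The following is an illustrative scalar-state implementation, not the GPU code used in the paper: the quadratic stand-in action (a Gaussian measurement term plus a random-walk model term), the single-component update, and all parameter values are choices made for this sketch.

```python
import numpy as np

rng = np.random.default_rng(1)

def metropolis_hastings(x0, y, action, r, alpha=0.5, iters=40_000):
    """Biased random walk through path space; accepted paths are
    distributed according to exp(-A0(X))."""
    x = x0.copy()
    A = action(x, y)                                   # step (1)
    chain = []
    for _ in range(iters):
        i = rng.integers(x.size)                       # perturb one component of X
        proposal = x.copy()
        proposal[i] += rng.uniform(-alpha / 2, alpha / 2) * r[i]   # step (2)
        dA = action(proposal, y) - A                   # step (3)
        if dA <= 0 or rng.uniform() < np.exp(-dA):     # step (4): min[1, exp(-dA)]
            x, A = proposal, A + dA
        chain.append(x.copy())
    return np.array(chain)

# stand-in action: Gaussian measurement term plus a random-walk "model" term
Rm, Rf = 4.0, 10.0
def action(x, y):
    return 0.5 * Rm * np.sum((y - x) ** 2) + 0.5 * Rf * np.sum(np.diff(x) ** 2)

y = np.full(20, 1.5)            # noiseless "observations" of a quiescent state
chain = metropolis_hastings(np.zeros(20), y, action, r=np.ones(20))
posterior_mean = chain[len(chain) // 2:].mean()   # discard the first half as burn-in
assert abs(posterior_mean - 1.5) < 0.2
```

The accepted half of the chain plays the role of the statistics-collection phase: its sample mean estimates the posterior path, and its spread estimates the posterior uncertainty.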

2.4.2 Implementation for parallel architectures

The Metropolis–Hastings Monte Carlo method is simple and powerful, but it requires many path updates to achieve accurate statistics. One way to deal with this is to take advantage of parallel computing technology, using a GPU. With GPU technology it is possible to execute hundreds of threads of execution simultaneously on a single GPU. Typically each thread will perform the same operations, but on different pieces of the data. Because the paths are updated sequentially, the iteration steps cannot be parallelized in their entirety. However, during each iteration the path is perturbed in all (m + 1)D + K dimensions to give the candidate path Xproposed, and A0(Xproposed) is calculated, both of which can be efficiently parallelized across the dimensions of X.

All MCMC calculations reported here first generated 10^7 sample paths as a burn-in phase for the Metropolis–Hastings iterations. These were followed by a statistics-collection phase of 10^8 iterations, during which 10^3 accepted paths were uniformly sampled to create the approximate distribution P(X|Y). The data assimilation window comprised m + 1 = 4,096 points. The expression of the model error g(x(n)) is a discrete-time version of the deterministic equations for the neuron. We selected a fourth-order Runge–Kutta integration scheme using a time step of Δt = 0.01 ms.

When the assimilation procedure was completed, a prediction using a fourth-order Runge–Kutta scheme was performed for each of the accepted paths, using the state variables x(T) and the parameters associated with that path. The predicted trajectories were averaged to determine the expected value of x(t > T), and the standard deviation was evaluated about this mean. This gives the predicted quantities and their RMS errors as reported in the figures.
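The prediction phase can be sketched as follows: each accepted sample contributes an initial state x(T) and a parameter set, each is integrated forward with a fourth-order Runge–Kutta step, and the ensemble mean and standard deviation give the predicted trajectory and its error bars. The one-dimensional model dx/dt = −px below is a stand-in for the neuron equations, and the "posterior samples" are fabricated for illustration:

```python
import numpy as np

def rk4_step(f, x, t, dt, p):
    """One fourth-order Runge-Kutta step for dx/dt = f(x, t, p)."""
    k1 = f(x, t, p)
    k2 = f(x + 0.5 * dt * k1, t + 0.5 * dt, p)
    k3 = f(x + 0.5 * dt * k2, t + 0.5 * dt, p)
    k4 = f(x + dt * k3, t + dt, p)
    return x + (dt / 6.0) * (k1 + 2.0 * k2 + 2.0 * k3 + k4)

def predict(samples, f, dt, nsteps):
    """Integrate each accepted (x(T), p) sample forward; return ensemble mean and std."""
    trajectories = []
    for x, p in samples:
        traj, t = [x], 0.0
        for _ in range(nsteps):
            traj.append(rk4_step(f, traj[-1], t, dt, p))
            t += dt
        trajectories.append(traj)
    trajectories = np.array(trajectories)
    return trajectories.mean(axis=0), trajectories.std(axis=0)

# stand-in posterior samples: states near 1.0, decay rates near 2.0
rng = np.random.default_rng(2)
samples = [(1.0 + 0.01 * rng.normal(), 2.0 + 0.01 * rng.normal()) for _ in range(200)]
mean, std = predict(samples, lambda x, t, p: -p * x, dt=0.01, nsteps=100)
assert abs(mean[-1] - np.exp(-2.0)) < 0.01   # ensemble mean near exp(-2) at t = 1
```

The spread `std` grows with the posterior uncertainty in the sampled states and parameters, which is the predictive error bar reported in the figures.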

In order to assign values for Rf, the normalized deviation of the noise was estimated at 1 part in 10^3 for all dimensions of the model error. This normalized deviation was then scaled by the full range of the state variable and squared to get the variance for that dimension; Rf is the inverse of this variance, so for V ∈ [−200, 200] mV, Rf = 6.25, while for the gating variables (with range [0, 1]), Rf = 10^6.
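Both quoted values follow from treating Rf as the inverse of the model-error variance, as a quick arithmetic check confirms (the function name is a convenience for this sketch):

```python
def Rf(state_range, normalized_deviation=1e-3):
    """Inverse variance of the model-error noise: (deviation * range)^-2."""
    sigma = normalized_deviation * state_range
    return 1.0 / sigma ** 2

assert abs(Rf(400.0) - 6.25) < 1e-9   # voltage: V in [-200, 200] mV
assert abs(Rf(1.0) - 1e6) < 1.0       # gating variables: range [0, 1]
```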

We considered an experimental error of ±0.5 mV, giving Rm ≈ 1. We adjusted the Monte Carlo step size using a scaling factor α to achieve an acceptance rate near 0.5. Each of our reported calculations, with 10^8 candidate paths of dimension 16,402 (NaKL) to 20,498 (NaKLh), took about 10 h to complete on a single NVIDIA GTX 470 GPU. In our experience, provided that the dimension of the problem is roughly constant, the amount of time for a calculation scales roughly linearly with the number of GPUs.

In practice, as the Metropolis–Hastings procedure seeks paths distributed about the maxima of the probability distribution exp[−A0(X)], a statistical minimization of A0(X) occurs as paths are accepted and rejected. This makes it a natural generalization of the procedure used in Toth et al. (2011), applicable to the situation where there is model error as well. As emphasized in this paper, the MCMC approach also results in expected errors for estimations and predictions.

2.5 Model neurons

2.5.1 NaKL model

The simplest HH model describes the dynamics of the voltage V(t) across the membrane of a one-compartment neuron containing two voltage-gated ion channels, Na and K, a passive leak current, and an electrode through which an external current Iapp(t) can be applied. The model consists of an equation for voltage (Johnston and Wu 1995)

dV(t)/dt = FV(V(t), m(t), h(t), n(t))
         = (1/C)[gNa m(t)³h(t)(ENa − V(t)) + gK n(t)⁴(EK − V(t))
                 + gL(EL − V(t)) + IDC + Iapp(t)],   (9)

where the g terms indicate maximal conductances and the E terms reversal potentials, for each of the Na, K, and leak channels. IDC is a DC current. Equations for the voltage-dependent gating variables ai(t) = {m(t), h(t), n(t)} complete the model. We refer to this as the NaKL model.

Each gating variable ai(t) = {m(t), h(t), n(t)} satisfies a first-order kinetic equation of the form

dai(t)/dt = (ai0(V(t)) − ai(t)) / τi(V(t)).   (10)

The kinetic terms ai0(V) and τi(V) are taken here in the form

ai0(V) = ½[1 + tanh((V − va)/dva)]
τi(V) = ta0 + ta1[1 − tanh²((V − vat)/dvat)] + (ta2/2)[1 + tanh((V − vatt)/dvatt)].   (11)

The constants va, dva, ... are selected to match the familiar forms for the kinetic coefficients usually given in terms of sigmoidal functions (1 ± exp((V − V1)/V2))⁻¹. As discussed in Part I, the tanh forms are numerically the same over the dynamic range of the neuron models but have better-controlled derivatives when one goes out of that range during the required search procedures. This is less important here because the MCMC method does not use the derivatives, but we retain the same form for consistency.
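For concreteness, the steady-state and time-constant functions of Eq. (11) can be written as below. This is a minimal sketch that omits the third (ta2) term of τi(V); function names are illustrative.

```python
import numpy as np

def a_inf(V, va, dva):
    """Steady-state gating function: 0.5 * (1 + tanh((V - va)/dva))."""
    return 0.5 * (1.0 + np.tanh((V - va) / dva))

def tau_a(V, ta0, ta1, vat, dvat):
    """Voltage-dependent time constant (first two terms of Eq. 11)."""
    return ta0 + ta1 * (1.0 - np.tanh((V - vat) / dvat) ** 2)
```

At V = va the gate sits at its half-activation value 0.5, and at V = vat the time constant takes its maximum ta0 + ta1.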

In terms of the formalism of Sect. 2.1, the model state variables are x(t) = {V(t), m(t), h(t), n(t)}, and the parameters are p = {C, gNa, ENa, gK, EK, ..., dvatt}. In a

twin experiment, the data are generated by solving these HH

equations for some initial condition x(0) = {V(0), m(0),

h(0), n(0)} and some choice for the parameters, the DC cur-

rent and Iapp(t), the stimulating current. The measured data

y(t) = h(x(t)) is here just the voltage V(t) produced by

the model plus additive, independent noise at each time step

drawn from a Gaussian distribution with zero mean and var-

iance of 1mV.
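A twin-experiment data set of this kind can be generated with a short script. This is a sketch: the parameter names follow Table 1, the stimulus and initial condition below are placeholders, and the same center and width are used for a∞ and τ (as in Table 1, where vm = vmt and dvm = dvmt).

```python
import numpy as np

def nakl_rhs(x, t, p, i_app):
    """NaKL vector field (Eqs. 9-10); p holds the Table 1 parameters."""
    V, m, h, n = x
    dV = (p["gNa"] * m**3 * h * (p["ENa"] - V)
          + p["gK"] * n**4 * (p["EK"] - V)
          + p["gL"] * (p["EL"] - V) + p["IDC"] + i_app(t)) / p["C"]
    def gate(a, va, dva, t0, t1):
        a_inf = 0.5 * (1 + np.tanh((V - va) / dva))
        tau = t0 + t1 * (1 - np.tanh((V - va) / dva) ** 2)
        return (a_inf - a) / tau
    dm = gate(m, p["vm"], p["dvm"], p["tm0"], p["tm1"])
    dh = gate(h, p["vh"], p["dvh"], p["th0"], p["th1"])
    dn = gate(n, p["vn"], p["dvn"], p["tn0"], p["tn1"])
    return np.array([dV, dm, dh, dn])

def rk4_step(f, x, t, dt, *args):
    """One explicit fourth-order Runge-Kutta step."""
    k1 = f(x, t, *args)
    k2 = f(x + 0.5 * dt * k1, t + 0.5 * dt, *args)
    k3 = f(x + 0.5 * dt * k2, t + 0.5 * dt, *args)
    k4 = f(x + dt * k3, t + dt, *args)
    return x + (dt / 6.0) * (k1 + 2 * k2 + 2 * k3 + k4)

def twin_data(x0, p, i_app, dt=0.01, n=4096, noise_sd=1.0, seed=0):
    """Integrate the model and return noisy voltage observations y(t)."""
    rng = np.random.default_rng(seed)
    x, ys = np.asarray(x0, float), []
    for k in range(n):
        x = rk4_step(nakl_rhs, x, k * dt, dt, p, i_app)
        ys.append(x[0] + rng.normal(0.0, noise_sd))
    return np.array(ys)
```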

2.5.2 NaKLh model

Most neurons express a number of voltage-gated channels

in addition to the sodium and potassium channels directly

responsible for action potential generation (Graham 2002).

These additional channels contribute to bursting, firing rate

adaptation, and other behaviors. As in Toth et al. (2011),

we are interested in whether the data assimilation procedure

described here can be applied to more complex models than

NaKL, and whether it can be used in the context of model specification to determine which channels should be included in the model. A model incorporating all the channels likely to be in a typical neuron is beyond the scope of this paper, but as a first step we extended the NaKL model to include the Ih current, which has moderately slower kinetics than the other channels in the model, and is activated by hyperpolarization

(McCormick and Pape 1990). The Ih current was represented by an additional term in the HH voltage equation (9),

Ih(t) = gh hc(t)(Eh − V(t)),   (12)

as well as an additional equation for the dynamics of the Ih gating variable:

dhc(t)/dt = (hc0(V(t)) − hc(t)) / τhc(V),


hc0(V) = ½[1 + tanh((V − vhc)/dvhc)]
τhc(V) = thc0 + thc1 tanh((V − vhct)/dvhct).   (13)

3 Results

3.1 NaKL model parameters estimated from NaKL data

We begin with an examination of the PIMC data assimila-

tion procedure using the NaKL model. We selected a set of

parameters (Table 1, “Data”) using standard textbook values

for maximal conductances (gNa, gK, gL) and reversal poten-

tials (ENa, EK, EL). The parameters in the kinetic equations

for the gating variables {m(t), h(t), n(t)} came from a fit to

standard expressions for the time constants τi(V) and driv-

ing functions ai0(V) using the hyperbolic tangent functions

in Eq. (11). Choosing appropriate initial conditions, we inte-

grated the NaKL HH equations with an input current Iapp(t)

consisting of a scaled waveform taken from the solution to

a chaotic dynamical system. The amplitude of the waveform

was selected so it depolarized and hyperpolarized the model

neuron, evoking action potentials and traversing the biolog-

ically realistic regions of the neuron’s state space. Based on

our findings in Toth et al. (2011), the frequency of Iapp(t)

was slow relative to the response rate of the neuron.

Table 1  Parameter values in simulated data from NaKL HH model and estimates from the PIMC algorithm

Parameter     'Data'    Estimate   SD       Units
gNa           120.0     108.6      7.3      mS/cm²
ENa            55.0      55.8      0.70     mV
gK             20.0      18.2      1.0      mS/cm²
EK            −77.0     −78.0      0.67     mV
gL              0.3       0.316    0.021    mS/cm²
EL            −54.4     −55.3      1.5      mV
vm = vmt      −34.0     −36.5      1.5      mV
dvm = dvmt     34.0      35.6      0.77     mV
tm0             0.01      0.136    0.096    ms
tm1             0.5       0.407    0.087    ms
vh = vht      −60.0     −62.1      1.8      mV
dvh = dvht    −19.0     −23.6      2.0      mV
th0             0.2       0.157    0.071    ms
th1             8.5       8.3      0.16     ms
vn = vnt      −65.0     −64.2      1.3      mV
dvn = dvnt     45.0      44.6      0.40     mV
tn0             0.8       0.35     0.24     ms
tn1             5.0       4.84     0.086    ms

The SD column indicates the standard deviation of the posterior distribution

To construct A0(X, Y) we took m + 1 = 4,096 data points with Δt = 0.01 ms, writing g(x(n), x(n+1), p) = x(n+1) − Δt f(x(n), p), where f(x(n), p) is represented as an explicit fourth-order Runge–Kutta integration scheme. Using the methods described in Sect. 2.3, we evaluated expected values for the state variables ⟨xa(n)⟩ and parameters through the observation period, and also evaluated second moments to yield standard deviations about these expected values. The dimension of the integral we are approximating is 4(4,095) + 18 = 16,398.
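The structure of the discretized action (measurement error plus model error) can be sketched as follows. The scalar weights and the single observed voltage component are simplifications of the paper's full action, and `step` stands in for the explicit RK4 map.

```python
import numpy as np

def action(X, p, y, Rm, Rf, step):
    """Discrete action A0(X, Y): a measurement-error term over the
    observed voltage plus a model-error term penalizing the mismatch
    between x(n+1) and the one-step model map applied to x(n).

    X: (m+1, D) candidate path; y: (m+1,) voltage observations.
    """
    meas = 0.5 * Rm * np.sum((y - X[:, 0]) ** 2)
    pred = np.array([step(x, p) for x in X[:-1]])
    model = 0.5 * Rf * np.sum((X[1:] - pred) ** 2)
    return meas + model
```

Each Metropolis–Hastings proposal is accepted or rejected according to the change in this quantity, so accepted paths concentrate near the minima of A0.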

Using the model with these parameters and state vari-

ables at T as initial conditions, we predict forward from

the end of the observation/assimilation time interval, also

using 4th-order Runge–Kutta integration. In the data assim-

ilation procedure we have many accepted paths distributed

as exp[−A0(X)]. We predict into t > T using x(T) for each

path and the parameters estimated with that accepted path.

This permits us to calculate a mean and standard deviation

for each state variable as the system continues to evolve.

Figure 1a shows the estimated membrane voltage (red dots) ± standard deviation (green band) overlaid on the true voltage (black line) for data generated with the NaKL model. The data assimilation window consists of the time points before the vertical blue line. Time points after the blue line compare the predicted response of the model to the true voltage after time T. Figure 1b compares the estimated and predicted values for the Na⁺ activation variable m(t), an unobserved state variable, with the known values.

The accuracy with which the path integral estimates track

the observed and unobserved states through the observa-

tion window is clear in Fig. 1. In the prediction window

the expected voltages and m(t) give quite good values for

the times when spikes occur, indicating that the Na and K

currents are well represented when an action potential is

generated. The dynamical response in the regions of hyper-

polarized response is less accurate, though all estimates lie

within one standard deviation of the expected value. As in

Toth et al. (2011) we could take this as a signal that the stimulus current did not spend enough time in the hyperpolarized voltage region to stimulate the dynamics there very well.

Table 1 compares the parameter values used to simulate

the data from the NaKL model with the estimates (± stan-

dard deviation) made from the noisy voltage alone. For all

but tm0 and tn0 the expected values are almost identical to the true values.

3.1.1 Bias in the conditional expected values arising from

model error

It is noteworthy that while the posterior errors are small in the estimates in Table 1 (and later tables of parameter estimates) the expected or mean value appears to be biased away from the known value. This bias comes from our procedure and


Fig. 1 NaKL model. a Comparison of the membrane voltage and the estimates and predictions. In the assimilation window (t < 40.95 ms, blue line), the expected value of the voltage conditioned on the observations (Vest) is shown by red dots, with the standard deviation of this distribution shown in green. For t ≥ 40.95 ms, the red and green symbols show the mean and standard deviation of the distribution of responses predicted forward in time using the estimates of the parameters and the state variables at T = 40.95 ms. The membrane voltage is shown as a black line. b Comparison of the known value of the Na⁺ activation variable (black trace) and the distribution of the estimates (red ± green; prior to blue line) and predictions from the model (red ± green; after blue line). (Color figure online)

is associated with having model error as part of the action

A0(X).

The distribution of paths exp[−A0(X)] is the stationary distribution of a Langevin equation of the form

dX(s)/ds = −∂A0(X(s))/∂X(s) + √2 N(0, 1),   (14)

where X(s) is the state-space path as a function of "time" s, and N(0, 1) is Gaussian white noise with zero mean and variance unity. An equivalent to our Metropolis–Hastings Monte Carlo procedure is to solve this stochastic differential equation in (m + 1)D dimensions, where one can show that as s → ∞ the distribution of paths is precisely exp[−A0(X)]. The Monte Carlo algorithm can thus be seen as a search for minima of the action, together with an accounting of the fluctuations about those minima, all guided by the observations as they enter the action.
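An Euler–Maruyama discretization of Eq. (14) makes the equivalence concrete. This is a generic sketch of sampling from exp[−A0(X)] via the action's gradient, not the authors' GPU implementation.

```python
import numpy as np

def langevin_sample(grad_A, X0, eps=1e-2, n_iter=20000, seed=0):
    """Integrate dX/ds = -grad A0(X) + sqrt(2) * noise. For small step
    size eps the samples approach the distribution exp[-A0(X)]."""
    rng = np.random.default_rng(seed)
    X = np.asarray(X0, float).copy()
    out = np.empty((n_iter,) + X.shape)
    for k in range(n_iter):
        X = X - eps * grad_A(X) + np.sqrt(2.0 * eps) * rng.normal(size=X.shape)
        out[k] = X
    return out

# Quadratic action A0(X) = X^2/2 has stationary density exp(-X^2/2),
# i.e. a unit Gaussian; its gradient is simply X:
s = langevin_sample(lambda X: X, np.zeros(1))
```

For the quadratic test action, the sampled variance in the second half of the run should approach unity, illustrating that the dynamics relax to exp[−A0].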

To demonstrate the point about biases in the estimation, suppose we had two measurements y1, y2 and two model outputs with the model taken as linear, x2 = Mx1. Then the action we would associate with this, including model error, is

A(x1, x2) = ½[(y1 − x1)² + (y2 − x2)² + R(x2 − Mx1)²],   (15)

and this has its minimum at

x1 = ((1 + R)y1 + RMy2) / (1 + R(1 + M²)),
x2 = ((1 + RM²)y2 + RMy1) / (1 + R(1 + M²)).   (16)

This clearly shows the bias we anticipated. As R → ∞, the bias remains, but x2 = Mx1 is enforced. If R = 0, however, the minimum is at x1 = y1, x2 = y2, and if the dynamics underlying the data source satisfies y2 = My1, the same holds for the model.
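The stationarity conditions of Eq. (15) form a 2×2 linear system, so the biased minimum of Eq. (16) is easy to verify numerically (the numbers below are arbitrary):

```python
import numpy as np

# Check Eqs. (15)-(16): setting dA/dx1 = dA/dx2 = 0 gives the system
#   (1 + R M^2) x1 - R M x2 = y1
#   -R M x1 + (1 + R) x2    = y2
y1, y2, M, R = 1.0, 3.0, 2.0, 4.0
A = np.array([[1 + R * M**2, -R * M],
              [-R * M, 1 + R]])
x1, x2 = np.linalg.solve(A, np.array([y1, y2]))

# Compare with the closed-form biased minimum of Eq. (16)
den = 1 + R * (1 + M**2)
assert np.isclose(x1, ((1 + R) * y1 + R * M * y2) / den)
assert np.isclose(x2, ((1 + R * M**2) * y2 + R * M * y1) / den)
```

Note that x1 ≠ y1 and x2 ≠ y2 whenever R > 0 and y2 ≠ My1: the model-error term pulls the estimates away from the observations, which is exactly the bias discussed above.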

3.2 NaKLh model parameters estimated from NaKLh data

To increase the complexity of the model, an additional voltage-gated current was added (Ih; see Sect. 2.5). As in the previous section, data were simulated from known parameters and initial conditions (Table 2, 'Data'). Based on our findings in Toth et al. (2011), we used a strong stimulus current to ensure that the Ih current was sufficiently activated. Then, using the noisy voltage output from the simulation, the PIMC algorithm was used to estimate the parameters and unobserved states of the model. These estimates are as good as or better than those for the simpler NaKL model (Table 2). Moreover, the additional 4,096 + 8 dimensions added to the integral did not increase the posterior variance or substantially decrease the speed of the calculation.

Figure 2 compares the known values for V(t) and the unobserved gating variable hc(t) of the newly added Ih current against the estimates and their posterior error. As with the NaKL model, the estimates of voltage (a) and the Ih gating variable (b) closely follow the true values during the data assimilation window and beyond.

3.3 NaKLh model parameters estimated from NaKL data

A critical consideration in applying this method to experi-

mentally obtained data is model selection. If the goal is to

make inferences about biologically relevant properties such

as the set of channels expressed by an individual neuron,

then we need some method of determining which chan-

nels contribute to a neuron’s electrophysiological behav-

ior and should be included in the model. One approach


Table 2  Parameter values in simulated data from NaKLh HH model and estimates from the PIMC algorithm

Parameter     'Data'    Estimate   SD       Units
gNa           120.0     116.0      2.8      mS/cm²
ENa            55.0      55.17     0.19     mV
gK             20.0      19.40     0.20     mS/cm²
EK            −77.0     −77.38     0.27     mV
gL              0.30      0.2987   0.0098   mS/cm²
EL            −54.4     −56.8      1.8      mV
vm = vmt      −34.0     −33.64     1.2      mV
dvm = dvmt     34.0      36.58     1.15     mV
tm0             0.01      0.051    0.023    ms
tm1             0.5       0.45     0.060    ms
vh = vht      −60.0     −67.29     3.2      mV
dvh = dvht    −19.0     −18.9      1.1      mV
th0             0.2       0.34     0.14     ms
th1             8.5       6.25     2.5      ms
vn = vnt      −65.0     −68.2      2.6      mV
dvn = dvnt     45.0      44.6      0.63     mV
tn0             0.8       0.97     0.12     ms
tn1             5.0       4.8      0.38     ms
gh              1.21      1.18     0.029    mS/cm²
Eh             40.0      37.8      0.65     mV
vhc           −75.0     −77.7      0.64     mV
dvhc          −11.0     −11.1      0.26     mV
thc0            0.1       0.118    0.043    ms
thc1          193.5     177.24     3.1      ms
vhct          −80.0     −81.3      0.67     mV
dvhct          21.0      21.4      0.26     mV

The SD column indicates the standard deviation of the posterior distribution

is to start with a relatively complex model that includes all of the currents likely to be in the neuron of interest (possibly using genetic expression data to restrict the list of candidates). All such currents would have the HH form

Icurrent(t) = gcurrent mc(t)^q hc(t)^p (Ecurrent − V(t)),   (17)

where mc(t) is a state variable associated with transitions to the open state, hc(t) is a state variable associated with transitions to closed states, and q and p are integers. Differential equations for the kinetics of the state variables would need to be specified as well. If this term is included in the model but gcurrent is not distinguishable from zero, or if the other parameters are nonsensical, then we could infer that the current does not contribute appreciably to the neuron's behavior.
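A current of the generic form in Eq. (17) is a one-liner; the names here are generic, and the gating exponents q and p are passed in:

```python
def hh_current(V, g, E, mc, hc, q, p):
    """Generic HH-form membrane current: g * mc^q * hc^p * (E - V)."""
    return g * mc ** q * hc ** p * (E - V)

# e.g. the NaKL sodium current of Eq. (9) uses q = 3, p = 1:
# I_Na = hh_current(V, gNa, ENa, m, h, 3, 1)
```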

Using the variational approximation introduced in Part I, we found that if we estimated parameters of an NaKLh model from data that were generated from an NaKL model, the estimated conductance for the missing Ih current was extremely close to zero, supporting this approach. We repeated this experiment using the numerical approximation to the full integral and obtained the same result. NaKL data were simulated by using an NaKLh model in which gh was fixed at zero. Parameters associated with the Ih current were left at their estimated values, though they had no effect on the voltage behavior of the neuron. The responses of this model were identical to the NaKL model in which no Ih current was specified. Table 3 compares values for the parameters used to simulate the data with the values estimated using the PIMC method. The maximal conductance gh is small, indicating that Ih does not contribute to the responses of the neuron. We note that the estimate of gh in Toth et al. (2011) is much smaller, but in that experiment we did not include observational noise. Reducing the observational noise in this experiment leads to a corresponding decrease in the gh estimate (not shown).

As important as the absolute value of the estimated con-

ductance is the posterior error associated with the estimate,

which we are able to estimate using the PIMC method.

Fig. 2 NaKLh model. a Comparison of the membrane voltage with estimates during the assimilation window (t < 40.95 ms) and predictions from the model (t ≥ 40.95 ms). As in Fig. 1, the known values are indicated by the black trace, and the mean and standard deviation of the posterior distributions are indicated by red dots and green bars, respectively. b Comparison of the known values for the Ih activation gating variable hc(t) with the estimates and predictions from the model. (Color figure online)


Table 3  Parameter values in simulated data from NaKLh HH model and estimates from the PIMC algorithm

Parameter     'Data'    Estimate   SD       Units
gNa           120.0     114.1      2.2      mS/cm²
ENa            55.0      55.2      0.12     mV
gK             20.0      19.9      0.19     mS/cm²
EK            −77.0     −77.1      0.33     mV
gL              0.30      0.292    0.0059   mS/cm²
EL            −54.4     −55.3      0.51     mV
vm = vmt      −34.0     −31.1      1.7      mV
dvm = dvmt     34.0      34.0      1.2      mV
tm0             0.01      0.071    0.030    ms
tm1             0.5       0.58     0.025    ms
vh = vht      −60.0     −52.2      2.7      mV
dvh = dvht    −19.0     −20.1      0.66     mV
th0             0.2       1.1      0.42     ms
th1             8.5       8.8      2.3      ms
vn = vnt      −65.0     −63.9      1.2      mV
dvn = dvnt     45.0      43.8      1.5      mV
tn0             0.8       0.88     0.083    ms
tn1             5.0       5.3      0.15     ms
gh              0.0       0.019    0.015    mS/cm²
Eh            −40.0     −27.0      1.3      mV
vhc           −75.0       7.4      1.1      mV
dvhc          −11.0     −43.5      0.55     mV
thc0            0.1       2.8      0.057    ms
thc1          193.5     207.6      1.9      ms
vhct          −80.0     −99.6      0.27     mV
dvhct          21.0      56.0      0.42     mV

gh = 0.0 in the simulated data, indicating the absence of this channel

In contrast to the other parameters, where the posterior error is a small fraction of the expected value, the error for gh is nearly as large as the expected value. The estimates for the Ih kinetics are wrong, but can be ignored because gh ≈ 0. If the current does not contribute to the data, estimates of its properties are unreliable. The idea that we could build a "large" model of all neurons and use the data to prune off currents that are absent is plausible and supported by this calculation. Whether this optimistic viewpoint will persist as we confront laboratory data with our methods is yet to be seen.

Figure 3 compares the measured voltage and K⁺ activation variable against the estimated values during the observation window (0–40.95 ms). The expected values follow the true values closely, with small posterior errors, indicating that the presence of the Ih current in the estimation model does not negatively impact the estimation procedure even though this current is not present in the data. More importantly, the predicted voltage obtained by integrating forward using the estimated parameters and state variables at T = 40.95 ms closely follows the true voltage. The quality of the prediction would support the inference, based on the voltage data alone, that Ih does not contribute substantially to the behavior of this neuron.

3.4 NaKL model parameters estimated from NaKLh data

The complementary approach to the model selection prob-

lem is to build up from a simple model, adding complexity

to address aspects of the data that are not fit well. To address

the feasibility of this approach, we used data simulated from

the NaKLh model to fit parameters from the NaKL model,

which is missing key parameters and state variables. Figure 4a shows a comparison of the estimated and predicted voltage from the NaKL model with the true values from the NaKLh model. In the observation window there is no indication that there is anything wrong with the model, but the dramatic failure to predict the response of the neuron after the observation window leaves no doubt that something is not consistent.

Although one cannot do this easily in a laboratory experi-

ment, in a twin experiment we are able to examine how well

Fig. 3 NaKLh model parameters estimated from NaKL data. a Comparison of the membrane voltage with estimates during the assimilation window (t < 40.95 ms) and predictions from the model (t ≥ 40.95 ms). This panel is almost identical to Fig. 2a, except that the Ih current does not contribute to the simulated data. b Comparison of the known values for the K⁺ activation gating variable n(t) with the estimates and predictions from the model


Fig. 4 NaKL model parameters estimated from NaKLh data. a Comparison of the known membrane voltage with estimates during the assimilation window (t < 40.95 ms) and predictions from the model (t ≥ 40.95 ms). This panel is almost identical to Fig. 2a, except that the model does not include the Ih current that is present in the simulated data. b Comparison of the known values for the Na⁺ inactivation gating variable h(t) with the estimates and predictions from the model

the estimations behave when estimating the gating variables. Figure 4b shows the estimated Na⁺ inactivation variable h(t) and its known value from our knowledge of the full state of the observed neuron model. It is clear that the estimates of the unobserved variable during the observation window fail to reflect the underlying system. One interpretation is that because the model is too simple to represent the actual system, the estimated parameters and state variables are driven to unrealistic values, which is revealed when the model is used to predict novel data. Although the failure of the NaKL model to capture the NaKLh behavior does not give any direct indication of what is missing, it does provide a clear basis for comparing families of models to determine which one best represents the underlying process.

4 Discussion

The equations describing the dynamics of a broad range of

voltage-gated ion channels and their effect on the membrane

voltage of neurons have been known for many years, but due

to the large number of channels expressed in nature and the

overall nonlinearity of neuronal systems it has not been pos-

sible to use recordings of membrane voltage alone to deter-

mine what channels are present and their kinetic parameters.

The problem is a general one of finding the paths through

a model state space that are consistent with observations of

some subset of the state variables (here, voltage), as well as

with the internal dynamics of the model.

When the measurements are noisy, the model has errors,

and the state of the model is uncertain when observations

commence, this is a problem in statistical physics. We presented an exact formulation of the path integral that describes this problem previously (Toth et al. 2011; Abarbanel 2009), along with a variational approximation based on a saddle-path estimation. We found that this variational method provided

accurate estimates of channel densities and kinetic parame-

ters when applied to simulated data, and that the estimation

procedure was robust to additive noise as well as errors in

the model specification.

We have extended our earlier work here with effective

numerical approximations to the full path integral, using

a Metropolis–Hastings Monte Carlo technique. The chief

advantage of this approach is that it allows one to find not

only the optimal path through the state space (and the asso-

ciated parameters), but to sample from the joint probability

distribution of the paths and parameters, conditioned on the

data observations. The variance of this posterior distribution

provides valuable information about the degree to which the

observed data constrains the model, and can indicate when

additional data may be needed. It also allows generation

of posterior predictive distributions—the expected behavior

given the model and the observed data—which are useful in

model validation and selection. In other words, the numerical approximation to the full integral not only allows transfer of information from the data to the model, but also indicates how much information was transferred.

Interestingly, the Metropolis–Hastings method, as with other approaches (Andrieu et al. 2010), seeks paths near the maxima of the distribution exp[−A0(X)], so that it represents, in effect, a statistical version of the stationary path method discussed in Toth et al. (2011). It has the computational advantage of not requiring any derivatives of the action, so models containing thresholds or "switches" pose no problem for the PIMC method.

Using this method, we repeated several of the numeri-

cal experiments with simulated data described earlier (Toth

et al. 2011) in order to show that estimates are accurate and

provide good forward predictions for simple HH-type bio-

physical models. The posterior variance of these estimates

was small, indicating that a small amount of data (41ms)

was sufficient to provide a high degree of confidence in the


estimates, and furthermore, that the behavior of the models

was strongly dependent on the parameter values.

We also demonstrated, by estimating model states and

parameters with the wrong complement of channels, one too

many or one too few, that we could easily identify these situ-

ations. In the case where the model contained a channel that

was not present in the data, the estimated conductance for

that additional channel was small and the posterior uncer-

tainty was large, allowing the “extra” channel to be pruned

from the model with confidence. When the model contained

too few channels the forward predictions were highly inac-

curate, indicating that the action of the missing channels

plays an important dynamical role. From these observations,

it appears that in dealing with preparations where the full complement of channels is not known, it is preferable to start with a larger model that includes any channel with a reasonable probability of being present (informed by the known biology of the system, such as prior pharmacological experiments or genetic expression patterns), and then remove channels for which the maximal conductance is estimated to be close to zero or has a high error. We recommend this as a general strategy for selecting good models when experimental data are used.

We noted, as exhibited by our estimates in Tables 1, 2, and 3, that because the estimation procedure seeks the minima of A0(X) along with the fluctuations about those minima, and since the action contains terms representing the error in the models, the expected values of the estimates will be biased, even when the posterior error about the expected values is small. This bias is discernible in the twin experiments we discuss here, but would not be known in the application of our methods to laboratory data. It is important to be cognizant of the bias, however.

From a practical standpoint, the major disadvantage in using the full path integral rather than the variational approximation is that it can require much more computational time. This is largely ameliorated by the parallel GPU algorithm we utilized, and the low cost of using a GPU for the computation. Furthermore, given that both methods provide similar levels of accuracy, the variational method can be used for exploratory analysis, followed by a subsequent in-depth analysis of the full distribution. We have used the variational principle, whether implemented through IPOPT or via another optimization routine, as a source of the first guess for a path in the PIMC method. This appears to provide a much better starting path than a random guess: better, in the sense that the PIMC method converges in fewer iterations to a distribution with good expectation values. Other strategies are explored by Andrieu et al. (2010).

There has been substantial interest in recent years in using noisy measurements of neurons to infer biologically relevant properties, using a variety of optimization methods (Druckmann et al. 2007; Abarbanel 2009; Huys and Paninski 2009; Lepora et al. 2011). The ability to make such inferences opens possibilities of using very brief intracellular recordings to closely characterize individual neurons, to reveal the distribution of various biological properties over large populations of neurons, or to track changes in these properties through learning or changes in behavioral state. The methods we describe here and in Part I have several new advantages. They provide estimates not only of the maximal conductances of fixed channel types, but also of the parameters governing the gating kinetics of unknown channels. The Monte Carlo method also provides information about the posterior uncertainty in the parameter estimates. These features may be of particular value in analyzing systems where the candidate channels are not well known.

The experiments we describe here necessarily focused on relatively simple and generalizable model neurons. Extending the method to more complex systems will require the incorporation of additional knowledge about the types of channels most likely to be present or the anatomy of the neurons. Dendritic dynamics are not very important in responses to injected current, but will become so when we consider synaptic inputs and network dynamics (Johnston and Narayanan 2008). Twin experiments like the ones we have presented here will continue to play an important role in studying biological data: no matter how complex the model becomes, twin experiments can be used to generate simulated data to determine what conditions are necessary for obtaining a good estimate of the parameters, which in turn can be used to optimize stimulation protocols and other aspects of experimental design.

Acknowledgements  Support from the US Department of Energy (Grant DE-SC0002349) and the National Science Foundation (Grants IOS-0905076, IOS-0905030, and PHY-0961153) is gratefully acknowledged. Partial support from the NSF-sponsored Center for Theoretical Biological Physics is also appreciated. Discussions with Jack Quinn on GPU computing were very valuable in the numerical work reported here; he provided us with the GPU computing strategy we have employed.

References

Abarbanel HD (2009) Effective actions for statistical data assimilation. Phys Lett A 373(44):4044–4048

Andrieu C, Doucet A, Holenstein R (2010) Particle Markov chain Monte Carlo methods. J R Stat Soc B 72(3):269–342. doi:10.1111/j.1467-9868.2009.00736.x

Druckmann S, Banitt Y, Gidon A, Schürmann F, Markram H, Segev I (2007) A novel multiple objective optimization framework for constraining conductance-based neuron models by experimental data. Front Neurosci 1(1):7–18. doi:10.3389/neuro.01.1.1.001.2007

Evensen G (2009) Data assimilation: the ensemble Kalman filter, 2nd edn. Springer, Berlin

Graham L (2002) In: Arbib MA (ed) The handbook of brain theory and neural networks, 2nd edn. MIT Press, Cambridge, pp 164–170

Hastings WK (1970) Monte Carlo sampling methods using Markov chains and their applications. Biometrika 57(1):97–109. doi:10.1093/biomet/57.1.97

Huys QJM, Paninski L (2009) Smoothing of, and parameter estimation from, noisy biophysical recordings. PLoS Comput Biol 5(5):e1000379. doi:10.1371/journal.pcbi.1000379

Johnston D, Narayanan R (2008) Active dendrites: colorful wings of the mysterious butterflies. Trends Neurosci 31(6):309–316. doi:10.1016/j.tins.2008.03.004

Johnston D, Wu SMS (1995) Foundations of cellular neurophysiology. MIT Press, Cambridge

Lepora NF, Overton PG, Gurney K (2011) Efficient fitting of conductance-based model neurons from somatic current clamp. J Comput Neurosci. doi:10.1007/s10827-011-0331-2

Mackay DJC (2003) Information theory, inference and learning algorithms. Cambridge University Press, Cambridge

McCormick DA, Pape HC (1990) Properties of a hyperpolarization-activated cation current and its role in rhythmic oscillation in thalamic relay neurones. J Physiol 431(1):291–318

Metropolis N, Rosenbluth AW, Rosenbluth MN, Teller AH, Teller E (1953) Equation of state calculations by fast computing machines. J Chem Phys 21:1087–1092. doi:10.1063/1.1699114

Neal RM (1993) Probabilistic inference using Markov chain Monte Carlo methods. Tech Rep CRG-TR-93-1, University of Toronto

Toth BA, Kostuk M, Meliza CD, Abarbanel HDI, Margoliash D (2011) Dynamical estimation of neuron and network properties I: variational methods. Biol Cybern 105(3):217–237


Available from C Daniel Meliza · Nov 11, 2014 · doi:10.1007/s00422-012-0487-5