
Flexible Timing with Delay Networks – The Scalar Property and Neural Scaling

Joost de Jong (j.de.jong.53@student.rug.nl)1

Aaron R. Voelker (arvoelke@uwaterloo.ca)2

Hedderik van Rijn (d.h.van.rijn@rug.nl)1

Terrence C. Stewart (tcstewar@uwaterloo.ca)2

Chris Eliasmith (celiasmith@uwaterloo.ca)2

1Experimental Psychology, Grote Kruisstraat 2/1, Groningen, 9712 TS, the Netherlands

2Centre for Theoretical Neuroscience, University of Waterloo, 200 University Avenue West, Waterloo, ON, N2L 3G1, Canada

Abstract

We propose a spiking recurrent neural network model of flexible human timing behavior based on the delay network. The well-known 'scalar property' of timing behavior arises from the model in a natural way, and critically depends on how many dimensions are used to represent the history of stimuli. The model also produces heterogeneous firing patterns that scale with the timed interval, consistent with available neural data. This suggests that the scalar property and neural scaling are tightly linked. Further extensions of the model are discussed that may capture additional behavior, such as continuative timing, temporal cognition, and learning how to time.

Keywords: Interval Timing; Scalar Property; Spiking Recurrent Neural Networks; Neural Engineering Framework; Delay Network

Introduction

Time is a fundamental dimension against which our mental lives play out: we remember the past, experience the present, and anticipate the future. Humans are sensitive to a wide range of temporal scales, from microseconds in sound localization to tens of hours in circadian rhythms. It is somewhere in between, on the order of hundreds of milliseconds to several seconds, that we consciously perceive time and coordinate actions within our environment (van Rijn, 2018). How does our brain represent time as accurately as possible, and how does it flexibly deal with different temporal intervals?

Scalar Property

Given the centrality of time to our experience, it is no wonder that timing and time perception have been the subject of extensive empirical study over the past 150 years. Many perceptual, cognitive, and neural mechanisms related to time perception have been studied, and perhaps the most well-known finding from the literature is the scalar property (Gibbon, 1977). The scalar property of variance states that the standard deviation of time estimates is linearly proportional to the mean of the estimated time. The scalar property has been confirmed by a wide variety of experimental data (Wearden & Lejeune, 2008). However, some research suggests that the scalar property does not always hold. Allan and Kristofferson (1974) already observed that, for well-practiced subjects in interval discrimination tasks, the standard deviation was constant over a range of relatively short intervals. Similar results were observed with pigeons, where the standard deviation remained flat for intervals up to around 500 ms (Fetterman & Killeen, 1992). Grondin (2014) further notes that the scalar property of variance critically depends on the range of intervals under consideration, and cites many examples in which the slope increases after intervals of about 1.3 seconds.

Most models of timing take the scalar property as a starting point, or consider conformity to the scalar property a crucial test. This seriously undermines their ability to explain violations of the scalar property. Here, we do not assume the scalar property a priori, but instead construct a biologically plausible model that is trained to optimally represent time. We then systematically explore the ranges of model parameters that lead the scalar property to be satisfied or violated, and provide a theoretical framework that aims to unify a variety of empirical observations.

Neural Scaling

Variance is not the only property of timing that scales with the estimated interval. The firing patterns of individual neurons also stretch or compress in proportion to the timed interval. In a recent study, Wang, Narain, Hosseini, and Jazayeri (2018) showed that neurons in striatum and medial prefrontal cortex (mPFC) scale in this manner. During the timed interval, individual neurons display ramping, decaying, oscillating, or more complex firing patterns. In general, the specific shape of the temporal firing pattern for a given neuron remains the same, but is stretched for longer intervals and compressed for shorter intervals. Additionally, neurons in the thalamus display a different kind of scaling: their mean level of activity correlates with the timed interval. Both findings have been explained using a recurrent neural network (RNN) model (corresponding to neurons in striatum or mPFC) that receives a tonic input (originating from the thalamus) to scale the temporal dynamics of the network (Wang et al., 2018). The units in this network exhibit firing patterns and scaling similar to those observed experimentally. The model of timing we propose reproduces the same findings as the RNN model described in Wang et al. (2018). These findings suggest that, in order to perform timed actions as accurately as possible, the brain is able to flexibly scale its temporal dynamics. This implies a tight connection between the scalar property of variance and the temporal scaling of individual neurons.

Neural Models of Timing

Many neurally inspired models of timing and time perception have been proposed. Some models are based on ramping neural activity (Simen, Balci, deSouza, Cohen, & Holmes, 2011), some on decaying neural activity (Shankar & Howard, 2010), and some on oscillating neural activity (Matell & Meck, 2004). Interestingly, all of these neural firing patterns (and more complex ones) were observed by Wang et al. (2018) in striatum and mPFC during a motor timing task. Appealing to only one of these firing patterns may therefore be insufficient to fully explain timing performance. In line with this observation, the recurrent neural network model of Wang et al. (2018) exhibits a wide variety of firing patterns. However, their model does not show why this heterogeneity of firing patterns is important for timing performance, or what role ramping, decaying, or oscillating neurons play in it. Randomly connected recurrent neural networks, referred to as reservoir computers, produce a wide variety of dynamics that can subsequently be extracted by a read-out population (Buonomano & Maass, 2009). A more structured approach to building a recurrent neural network may highlight the functional relevance of different neural firing patterns for timing performance.

One candidate for such a structured approach is the delay network (Voelker & Eliasmith, 2018). The delay network is a spiking recurrent neural network that approximates a rolling window of its input history by compressing that history into a q-dimensional state-vector. It has been observed that individual neurons in the delay network show responses similar to time-cells (MacDonald, Lepage, Eden, & Eichenbaum, 2011). Here, we use the delay network to explain both the scalar property of timing and the scaling of individual neural responses, by comparing delay network data to empirical data from Wang et al. (2018).

Methods

We first discuss the mathematics behind the delay network. Then, we show how to implement the delay network as a spiking recurrent neural network using the Neural Engineering Framework (NEF; Eliasmith & Anderson, 2003). Lastly, we discuss the details of our simulations, which follow the experimental setup of Wang et al. (2018).

The Delay Network

The delay network is a dynamical system that maintains a temporal memory of its input across a rolling window of θ seconds (Voelker & Eliasmith, 2018; Voelker, 2019). It does so by optimally compressing its input history into a q-dimensional state-vector. This vector continuously evolves through time in a way that captures the sliding window of history, while being amenable to representation by a population of spiking neurons using the NEF (as explained in the following subsection).

We consider the problem of computing the function y(t) = u(t − θ), where u(t) is the input to the network, y(t) is the output of the network, and θ > 0 is the length of the window in time to be stored in memory. In order to compute such a function, the network must necessarily maintain a history of its input across all intermediate moments in time, u(t − θ′), for θ′ ranging from the start of the window (θ′ = 0) back to the end of the window (θ′ = θ). This window must then slide forwards in time once t > θ, thus always preserving the input over an interval of length θ. Computing this function in continuous time is challenging, as one cannot merely sample a finite number of time-points and shift them along; the time-step of the system could be arbitrarily small, or there may not even be an internal time-step, as in the case of implementation on mixed-analog neuromorphic hardware (Neckar et al., 2019).

The approach taken by Voelker and Eliasmith (2018) is to convert this problem into a set of differential equations, dx/dt = θ⁻¹(Ax + Bu), where x is a q-dimensional state-vector and (A, B) are matrices governing the dynamics of x. We use the (A, B) matrices from Voelker (2019; section 6.1.3). This results in the approximate reconstruction u(t − θ′) ≈ P(θ′/θ) · x(t), where P are the shifted Legendre polynomials. Importantly, the dimensionality q determines the quality of the approximation. This free parameter controls the number of polynomials used to represent the window, analogous to a Taylor series expansion of the input using polynomials up to degree q − 1. Thus, q determines how much of the input's frequency spectrum, with respect to the period 1/θ, should be maintained in memory. Another notable property is that 1/θ corresponds to a gain factor on the integration of x(t) that can be controlled in order to dynamically adjust the length of the window on the fly.
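To make the state-space formulation concrete, the following sketch (not part of the published model code) constructs the standard (A, B) matrices of the delay network, integrates the ideal (non-spiking) dynamics with a simple Euler scheme, and reconstructs a delayed copy of the input using the shifted Legendre polynomials. The specific settings (q = 6, θ = 1 s, a Gaussian input bump, dt = 1 ms) are illustrative and chosen to mirror the example in Figure 1; the helper names are ours.

import numpy as np
from scipy.special import binom

def ldn_matrices(q):
    # State-space matrices (A, B) of the delay network (Voelker, 2019, section 6.1.3).
    A = np.zeros((q, q))
    B = np.zeros((q, 1))
    for i in range(q):
        B[i, 0] = (2 * i + 1) * (-1) ** i
        for j in range(q):
            A[i, j] = (2 * i + 1) * (-1.0 if i < j else (-1.0) ** (i - j + 1))
    return A, B

def decode_window(x, rel_delays):
    # Reconstruct u(t - theta') from the state x, with r = theta'/theta in [0, 1],
    # using the shifted Legendre polynomials P_i(r).
    q = len(x)
    P = np.array([[(-1) ** i * sum(binom(i, j) * binom(i + j, j) * (-r) ** j
                                   for j in range(i + 1))
                   for i in range(q)]
                  for r in rel_delays])
    return P @ x

# Ideal (non-spiking) simulation of dx/dt = (A x + B u) / theta.
q, theta, dt = 6, 1.0, 0.001
A, B = ldn_matrices(q)
ts = np.arange(0.0, 2.0, dt)
u = np.exp(-((ts - 0.5) / 0.05) ** 2)              # Gaussian input bump at t = 0.5 s
x = np.zeros((q, 1))
delayed = np.zeros_like(ts)
for k, u_t in enumerate(u):
    x = x + dt * (A @ x + B * u_t) / theta         # Euler step of the delay dynamics
    delayed[k] = decode_window(x[:, 0], [0.5])[0]  # estimate of u(t - 0.5 s)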

The Neural Engineering Framework (NEF)

Given this mathematical formulation of the computations that the neurons must perform in order to represent their past input, we turn to the question of how to recurrently connect neurons such that they perform this computation. For this, we use the NEF (Eliasmith & Anderson, 2003).

In the NEF, the activity of a group of neurons forms a distributed representation of some underlying vector space x. In particular, each neuron i has an encoder (or preferred direction vector) e_i such that the neuron fires most strongly when x is similar to e_i. To produce heterogeneity in the neural population, each neuron has a randomly assigned gain α_i and bias β_i. Overall, the current entering each neuron would ideally be α_i e_i · x + β_i. This input current determines the spiking activity of the neuron, based on the neuron model. In this work, we use the standard leaky integrate-and-fire (LIF) model. This results in a pattern of neural activity over time, a_i(t), that encodes some continuous vector over time, x(t).
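As an illustration of this encoding step, the sketch below builds a small heterogeneous LIF population and computes its steady-state firing rates for a given x. The parameter ranges for the gains and biases are placeholder choices for illustration (in the NEF they are normally derived from sampled maximum rates and intercepts), and the helper names are ours.

import numpy as np

def lif_rate(J, tau_rc=0.02, tau_ref=0.002):
    # Steady-state LIF firing rate for input current J (rate is 0 below threshold J = 1).
    rate = np.zeros_like(J)
    above = J > 1.0
    rate[above] = 1.0 / (tau_ref + tau_rc * np.log1p(1.0 / (J[above] - 1.0)))
    return rate

rng = np.random.default_rng(seed=0)
n_neurons, d = 100, 2
encoders = rng.standard_normal((n_neurons, d))
encoders /= np.linalg.norm(encoders, axis=1, keepdims=True)   # unit-length e_i
gains = rng.uniform(0.5, 2.0, size=n_neurons)                 # alpha_i (placeholder range)
biases = rng.uniform(-1.0, 1.0, size=n_neurons)               # beta_i (placeholder range)

x = np.array([0.3, -0.7])
J = gains * (encoders @ x) + biases    # ideal current alpha_i * (e_i . x) + beta_i
rates = lif_rate(J)                    # activities a_i for this x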

If we have two groups of neurons, one representing x and one representing y, and we want y to be some function of x, then we can form connections from the first population to the second. In particular, we want to connect neuron i to neuron j with weight ω_ij such that the total summed input gives the same result as the ideal current assumed above. In other words, we want Σ_i a_i(t) ω_ij = α_j e_j · y(t) for all j (the bias current β_j is supplied separately). The ideal ω_ij are found using regularized least-squares optimization.
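Continuing the previous sketch, the following shows one way to carry out this regularized least-squares step: sample the represented space, solve for decoders, and expand them into full weights ω_ij = α_j (e_j · d_i). The regularization scheme (a ridge term scaled by the maximum firing rate) and all names are illustrative assumptions, not the authors' exact procedure.

def solve_decoders(activities, targets, reg=0.1):
    # Regularized least squares: find D such that activities @ D ~ targets.
    n_samples, n_neurons = activities.shape
    sigma = reg * activities.max()                       # ridge term scaled by the peak rate
    G = activities.T @ activities + n_samples * sigma ** 2 * np.eye(n_neurons)
    U = activities.T @ targets
    return np.linalg.solve(G, U)                         # shape (n_neurons, d_out)

# Sample the represented space and solve for decoders of the identity function.
xs = rng.uniform(-1.0, 1.0, size=(500, d))
acts = lif_rate(gains * (xs @ encoders.T) + biases)      # (500, n_neurons)
D = solve_decoders(acts, xs)
# Full weights omega_ij (using the same population for pre and post, for brevity).
W = D @ (gains[:, None] * encoders).T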

Furthermore, this method for finding connection weights can be extended to support recurrent connections (i.e., connections from the neurons in one population back to that same population). These connections are solved for in the same manner, and, as has been shown by Eliasmith and Anderson (2003), the resulting network approximates a dynamical system of the form dx/dt = f(x) + g(u), where x is the vector represented by the group of neurons, u is the vector represented by the group of neurons providing input to this group, and the functions f and g depend on both the functions used to find the connection weights (as per the previous paragraph) and the temporal properties of the synapses involved (most importantly, the postsynaptic time constant).
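For a linear system such as the delay network and a first-order lowpass synapse with time constant τ_syn, the standard NEF mapping is to decode f′(x) = (τ_syn/θ) Ax + x across the recurrent connection and g′(u) = (τ_syn/θ) Bu across the input connection. A minimal sketch, reusing ldn_matrices from the earlier example and assuming an illustrative τ_syn of 100 ms:

import numpy as np

tau_syn, theta, q = 0.1, 1.0, 6
A, B = ldn_matrices(q)                       # from the earlier sketch
A_rec = tau_syn * A / theta + np.eye(q)      # linear map decoded across the recurrent connection
B_in = tau_syn * B / theta                   # linear map applied to the input u
# A_rec and B_in are the transforms one would hand to the decoder/weight solver above.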

The result is that the NEF provides a method for generating a population of neurons (to represent the q-dimensional state) and finding the ideal recurrent connections between those neurons such that they compute the differential equations required by the delay network.

It should be noted that the resulting network is structured exactly like a standard reservoir computer: a large number of neurons are recurrently connected, an input is supplied to the network, and information can be decoded from the dynamics of the network by computing weighted sums of the overall neural activity. However, rather than randomly generating the recurrent weights, we use the NEF to find the optimal weights for storing information over a rolling window in time. This method has been shown to be far more computationally efficient and accurate than various forms of reservoir computing for computing delays (Voelker, 2019).

An example of the resulting system is shown in Figure 1. Here the network is optimized to represent the past θ = 1 s of its own input using q = 6 dimensions. Part A shows the (one-dimensional) input to the network over time. In this case, the input is a Gaussian bump centred at t = 0.5 seconds. The resulting neural activity (for 50 randomly chosen neurons) is shown in Part B. Note that the neural activity at the beginning (before the input bump) and at the end (after t > 1.5 s) is fairly constant. This is the stable background activity of the network in the absence of any input. Since the network only stores the last second, in the absence of any input it settles back to this state in about 1 second.

Part C shows one example of decoding information out of this network. In particular, we are decoding the function y(t) = u(t − 0.5); that is, the output should be the same as the input θ′ = 0.5 seconds ago. This output is found by computing the weighted sum of the spikes that best approximates this value, again using least-squares optimization to find these weights. That is, y(t) = Σ_i a_i(t) d_i, where d_i is the decoding weight for the ith neuron. We see that the network accurately represents the overall shape, although the Gaussian bump has become a bit wider, and the output dips slightly negative before and after the bump. These are side-effects of the neurons approximating the ideal mathematics of the delay network, and of its compression of the input into 6 dimensions.

Figure 1: The Delay Network – optimized to represent the past 1 second of input history using 6 dimensions. (A): The input to the network. (B): Neural activity of 50 randomly chosen neurons within the network. (C): Decoding information from the network by taking the weighted sum of neural activity that best approximates the input from 0.5 seconds ago. (D): Decoding all information from the past 1 second. Each row is a different slice in time (from 0 to 1 second), and uses a different weighted sum of the same neural activity. The graph in part (C) is a slice through this image, indicated by a dotted line. (E): The underlying low-dimensional state information that represents the window.

In Part D, we show the same process as in Part C, but for all times in the past, from right now (θ′ = 0 s) to the furthest point back in time (θ′ = 1 s). This is to show that we can decode all of the different points in time in the past; the particular case shown in Part C is just one example (indicated with a dotted line). Each of these different outputs uses the same underlying neural activity, but different decoders d_i out of the recurrent population.

Finally, Part E shows that we can also decode the q-dimensional state representation x(t) that the delay network uses for its representation. These are the values that govern the dynamics of the delay network, and they form a nonlinear basis for all the possible functions that could be decoded out of the neural activity. Indeed, each row in Part D can also be interpreted as a different linear transformation of the data shown in Part E. Voelker and Eliasmith (2018) derive the closed-form mathematical expression that provides such a transformation, thus relating all time-points within the rolling window to this underlying state-vector.

These different views of the delay network can be seen as a very clear example of David Marr's Tri-Level Hypothesis (Marr, 1982), which we use here to understand the system at varying levels of abstraction. For instance, we may consider only the implementational level, which consists of leaky integrate-and-fire neurons with recurrent connection weights between them, a set of input weights from the stimulus, and multiple sets of output weights. Or we may consider the algorithmic level, where the system represents a q-dimensional state-vector x and changes that vector over time according to the differential equations given in the previous section. Or we may consider the computational level, where the network stores a (compressed) history of its own input, and different slices of that input can be extracted from that memory. All of these are correct characterizations of the same system.

Simulation Experiment

In the original experiment by Wang et al. (2018), monkeys were presented with a “cue” signal that indicated the interval to be reproduced: red for a short interval (800 ms) and blue for a long interval (1500 ms). They were then presented with a “set” signal that marked the start of the interval, and had to issue a response after the cued interval had elapsed. We have attempted to match the relevant details of their experimental setup as follows. The delay network (with q = 4) continually receives input from a control population that scales θ in order to produce intervals around 800 ms or 1500 ms. In effect, this gain population controls the length of the window on the fly: the effective value of θ is 1 divided by the value that the gain population represents. When the value represented by the gain population is greater than 1, the window becomes shorter; when it is smaller than 1, the window becomes longer. This lets us choose values for the gain population such that the delay network times intervals around 800 ms or 1500 ms. The delay network receives an input signal that is continually represented along with its history; the input signal is a rectangular impulse of 500 ms. The same read-out population decodes the delayed input signal as θ is varied.
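A minimal (non-spiking) sketch of this gain control, continuing the earlier example: the gain g multiplies the integration of the state, so the effective window length is 1/g. The particular gain values below are only illustrative of how windows of roughly 800 ms and 1500 ms would be produced.

import numpy as np

q, dt = 4, 0.001
A, B = ldn_matrices(q)                            # from the earlier sketch
ts = np.arange(0.0, 3.0, dt)
u = ((ts >= 0.1) & (ts < 0.6)).astype(float)      # 500 ms rectangular "set" impulse

def simulate(gain):
    # The gain rescales the dynamics: dx/dt = gain * (A x + B u), so theta_eff = 1 / gain.
    x = np.zeros((q, 1))
    states = np.zeros((len(ts), q))
    for k, u_t in enumerate(u):
        x = x + dt * gain * (A @ x + B * u_t)
        states[k] = x[:, 0]
    return states

x_short = simulate(1.0 / 0.8)    # window of roughly 800 ms
x_long = simulate(1.0 / 1.5)     # window of roughly 1500 ms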

Results

Scalar Property in the Delay Network

In order to quantify the scalar property in the spiking implementation of the delay network, we calculated the mean and standard deviation of the decoded output at θ seconds. We performed this analysis for delay networks with a range of values for θ and q, while keeping the number of neurons per dimension fixed at 500. We considered only positive values around the peak of the decoded output. If the scalar property holds, we should observe a linear relationship between θ and the standard deviation of the impulse response. Our data suggest that the scalar property critically depends on q (Figure 2). The relationship between the standard deviation, θ, and q can be described as follows: the standard deviation remains constant for a range of short intervals and starts to increase linearly after some value of θ, and both the location of this transition and the slope of the linear increase depend on q. This helps explain some previously divergent experimental findings. For example, the flat standard deviation for intervals ≤ 500 ms observed by Fetterman and Killeen (1992) can be explained by assuming that q = 2 within our model.
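One way to compute such a spread measure is sketched below, under our reading that the positive part of the decoded impulse response around its peak is treated as an unnormalized density over time; the exact analysis procedure may differ from this illustration.

import numpy as np

def temporal_spread(ts, decoded):
    # Treat the positive lobe of the decoded impulse response as an unnormalized
    # density over time and return its temporal mean and standard deviation.
    y = np.clip(decoded, 0.0, None)
    w = y / y.sum()
    mean = np.sum(w * ts)
    std = np.sqrt(np.sum(w * (ts - mean) ** 2))
    return mean, std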

Figure 2: Scalar Property. The standard deviation of the impulse response plotted against θ for different values of q.

Neural Scaling in the Delay Network

Our simulations of the Wang et al. (2018) experiment produced results that qualitatively fit the empirical data (Figure 3). First, the standard deviation of the decoded output increased with θ (see also the previous section). Second, the neural responses were highly heterogeneous, with ramping, decaying, and oscillating neurons. These firing profiles arise because they are linear combinations of the underlying state vector x(t) (see Figure 1E). Third, the responses of individual neurons stretched or compressed with the length of the timed response, similar to the empirical data from Wang et al. (2018).

Figure 3: Neural Scaling. A square input was provided to the delay network while varying the value of the gain input. The peak and standard deviation of the decoded output scale with the gain. The heterogeneous firing patterns of individual neurons also scale with the gain. Here, firing patterns of three example neurons are shown that qualitatively fit the data from Wang et al. (2018). We focused on the first period of the neural response to the “set” stimulus. The top neuron shows ramping activity, the middle neuron shows decaying activity, and the bottom neuron shows oscillatory activity.

Discussion

The aim of the present study is to use the delay network to explain two findings in the timing literature: the scalar property of variance and neural scaling. We did not assume the scalar property a priori, but systematically explored the parameters of the delay network that lead the scalar property to be satisfied or violated. Our results suggest that the scalar property critically depends on q. Notably, the time-cell data analyzed in earlier work fit best for q = 6 (Voelker & Eliasmith, 2018). The temporal range of conformity to the scalar property and the slope of the scalar property may both be explained by the number of dimensions (q) the delay network uses: for higher q, the range of short intervals with a constant standard deviation increases, whereas the slope of the scalar property decreases. We also found that scaling the dynamics of the delay network produces scaling of neural firing patterns, matching empirical data (Wang et al., 2018). Our model suggests that when the delay network represents its input history with more dimensions, neural firing patterns become more complex, as additional linear combinations of higher-degree Legendre polynomials are encoded by individual neurons. Furthermore, these findings suggest that the scalar property and the adaptive control of neural dynamics are tightly linked.

Previous Models

The delay network shares some features with previous neural models of timing, but there are also critical differences. First, like previous RNN models, the delay network is an RNN that uses population-level dynamics to time intervals. However, previous RNN models use random connectivity to generate the dynamics necessary for accurate timing, whereas the delay network explicitly defines the required dynamics and optimizes neural connectivity to implement them. Previous RNN models of timing also do not characterize how the input history is represented; like memory models of timing (Shankar & Howard, 2010), the delay network makes this explicit. Yet even though memory models and the delay network both specify how input history is represented, the memory models do not specify how to optimally scale the dynamics of the network or how to compute arbitrary functions over the represented history. In contrast, the delay network is optimized to recurrently represent time, and comes with a general framework that links the input history, the network representation, and spiking neural activity. In sum, we believe that the delay network improves on previous models of timing by both explicitly specifying how time is represented and implementing that representation in a flexible neural framework.

Extending the Delay Network

In this work, we have used the delay network to explain the scalar property and neural scaling in a simple motor timing task. However, the delay network may be used to explain a wide variety of timing phenomena, including continuative timing, temporal cognition, and learning how to time.

Continuative Timing First, the delay network can be extended to account for time perception in a wide variety of realistic situations. A classic dichotomy in the timing literature is between prospective and retrospective timing. Prospective timing is explicitly estimating an interval with the knowledge beforehand that attention should be focused on time. Retrospective timing, on the other hand, is estimating in hindsight how long ago an event happened. However, this distinction may be arbitrary, since in realistic situations one often notices the duration of an ongoing interval. For instance, you may notice that a web page is taking too long to load, but wait an additional amount of time before checking your signal reception. When this happens, one neither has prior knowledge that time should be attended to (prospective) nor the instruction to estimate how much time has passed since an event (retrospective). Therefore, a more appropriate term for timing in realistic situations is continuative timing (van Rijn, 2018). The delay network, at any point in time, serves as a rich source of information regarding the temporal structure of ongoing events, including how long ago an event started and stopped. This information can be used to infer how much time has elapsed since a salient event, and can be compared to the typical temporal structure of such an event in memory. Such comparisons could then facilitate decision-making, such as deciding whether to wait for an additional amount of time.

Temporal Cognition Second, time is a crucial factor in a wide variety of cognitive processes. Timing models have been successfully integrated into ACT-R (Taatgen, van Rijn, & Anderson, 2007) and into models of decision-making (Balcı & Simen, 2016). The delay network, built with the NEF, is compatible with other cognitive models that have been developed in the same framework, or indeed with any cognitive models that can incorporate neural networks. Therefore, a future avenue of research will be to incorporate the delay network into existing models of cognitive processes, such as action selection (Stewart, Bekolay, & Eliasmith, 2012) and working memory (Singh & Eliasmith, 2006).

Learning to Time Third, the delay network may be used to explain how timing is learned. In the experiment by Wang et al. (2018), the monkeys trained extensively before they could accurately perform the motor timing task, receiving rewards according to the accuracy of their performance. An open question is how an optimal mapping between cues and the gain population can be learned. Future work will therefore focus on modeling how timing is mastered through reinforcement learning.

References

Allan, L. G., & Kristofferson, A. B. (1974). Psychophysical theories of duration discrimination. Perception & Psychophysics, 16(1), 26–34.

Balcı, F., & Simen, P. (2016). A decision model of timing. Current Opinion in Behavioral Sciences, 8, 94–101.

Buonomano, D. V., & Maass, W. (2009). State-dependent computations: Spatiotemporal processing in cortical networks. Nature Reviews Neuroscience, 10(2), 113–125.

Eliasmith, C., & Anderson, C. H. (2003). Neural engineering: Computation, representation, and dynamics in neurobiological systems. Cambridge, MA: MIT Press.

Fetterman, J. G., & Killeen, P. R. (1992). Time discrimination in Columba livia and Homo sapiens. Journal of Experimental Psychology: Animal Behavior Processes, 18(1), 80–94.

Gibbon, J. (1977). Scalar expectancy theory and Weber's law in animal timing. Psychological Review, 84(3), 279–325.

Grondin, S. (2014). About the (non)scalar property for time perception. In H. Merchant & V. de Lafuente (Eds.), Neurobiology of interval timing (Vol. 829, pp. 17–32). New York, NY: Springer.

MacDonald, C., Lepage, K., Eden, U., & Eichenbaum, H. (2011). Hippocampal time cells bridge the gap in memory for discontiguous events. Neuron, 71(4), 737–749.

Marr, D. (1982). Vision: A computational investigation into the human representation and processing of visual information. San Francisco: W. H. Freeman.

Matell, M. S., & Meck, W. H. (2004). Cortico-striatal circuits and interval timing: Coincidence detection of oscillatory processes. Cognitive Brain Research, 21(2), 139–170.

Neckar, A., Fok, S., Benjamin, B. V., Stewart, T. C., Oza, N. N., Voelker, A. R., … Boahen, K. (2019). Braindrop: A mixed-signal neuromorphic architecture with a dynamical systems-based programming model. Proceedings of the IEEE, 107(1), 144–164.

Shankar, K. H., & Howard, M. W. (2010). Timing using temporal context. Brain Research, 1365, 3–17.

Simen, P., Balci, F., deSouza, L., Cohen, J. D., & Holmes, P. (2011). A model of interval timing by neural integration. Journal of Neuroscience, 31(25), 9238–9253.

Singh, R., & Eliasmith, C. (2006). Higher-dimensional neurons explain the tuning and dynamics of working memory cells. Journal of Neuroscience, 26(14), 3667–3678.

Stewart, T. C., Bekolay, T., & Eliasmith, C. (2012). Learning to select actions with spiking neurons in the basal ganglia. Frontiers in Neuroscience, 6.

Taatgen, N. A., van Rijn, H., & Anderson, J. (2007). An integrated theory of prospective time interval estimation: The role of cognition, attention, and learning. Psychological Review, 114(3), 577–598.

van Rijn, H. (2018). Towards ecologically valid interval timing. Trends in Cognitive Sciences, 22(10), 850–852.

Voelker, A. R. (2019). Dynamical systems in spiking neuromorphic hardware. Unpublished doctoral dissertation, University of Waterloo, Waterloo, ON.

Voelker, A. R., & Eliasmith, C. (2018). Improving spiking dynamical networks: Accurate delays, higher-order synapses, and time cells. Neural Computation, 30(3), 569–609.

Wang, J., Narain, D., Hosseini, E. A., & Jazayeri, M. (2018). Flexible timing by temporal scaling of cortical responses. Nature Neuroscience, 21(1), 102–110.

Wearden, J. H., & Lejeune, H. (2008). Scalar properties in human timing: Conformity and violations. Quarterly Journal of Experimental Psychology, 61(4), 569–587.