SpiNNaker 2: A 10 Million Core Processor
System for Brain Simulation and Machine
Learning
Christian Mayr a,1, Sebastian Höppner a and Steve Furber b
a Chair of Highly-Parallel VLSI-Systems and Neuromorphic Circuits, Institute of Circuits
and Systems, Technische Universität Dresden, Dresden, Germany
b Advanced Processor Technologies Group, School of Computer Science, University of
Manchester, Manchester, United Kingdom
Abstract. SpiNNaker is an ARM-based processor platform optimized for the simulation
of spiking neural networks. This brief describes the roadmap in going from the current
SpiNNaker1 system, a 1 Million core machine in 130nm CMOS, to SpiNNaker2, a 10 Million
core machine in 22nm FDSOI. Apart from pure scaling, we will take advantage of specific
technology features, such as runtime adaptive body biasing, to deliver cutting-edge
power efficiency. Power management of the cores allows a wide range of workload
adaptivity, i.e. processor power scales with the complexity and activity of the spiking
network. Additional numerical accelerators will enhance the utility of SpiNNaker2 for
simulation of spiking neural networks as well as for executing conventional deep neural
networks. These measures should increase the simulation capacity of the machine by a
factor >50. The interplay between the two domains, i.e. spiking and rate based, will
provide an interesting field for algorithm exploration on SpiNNaker2. Apart from the
platform's traditional usage as a neuroscience exploration tool, the extended
functionality opens up new application areas such as automotive AI, tactile internet,
industry 4.0 and biomedical processing.
Keywords. MPSoC, neuromorphic computing, SpiNNaker2, power management,
22nm FDSOI, numerical accelerators
Introduction
The "Spiking Neural Network Architecture" SpiNNaker is a processor platform optimized
for the simulation of neural networks. A large number of ARM cores is integrated in a system
architecture optimized for communication and memory access. Specifically, to take advantage
of the asynchronous, naturally parallel and independent subcomputations of biological
neurons, each core simulates neurons independently and communicates via a lightweight,
spike-optimized asynchronous communication protocol [1]. Neurons are simulated for a
certain timestep (typically 1ms), and then activity patterns are exchanged between cores,
on the assumption that time models itself, i.e. an exchange of activity every millisecond
is assumed to represent biological real time. This allows the energy-efficient simulation
of neural network models in real time, with SpiNNaker significantly outperforming
conventional high performance computing with respect to both of these aspects. The first
generation of SpiNNaker was designed by the University of Manchester and is currently
operational at its intended maximum system size, i.e. 1 Million ARM processors, as well as
in the form of smaller boards in mobile applications (see lower row of images in Fig. 1).
One Million cores allow the simulation of spiking neural networks on the order of 1% of the
human brain. Since 2013, Technische Universität Dresden and the University of Manchester
have been jointly developing the next generation SpiNNaker2 system in the framework of the
EU flagship Human Brain Project (upper row of images in Fig. 1). The specific target is a
scaling of the number of cores by a factor of 10, while staying within the same power
budget (i.e. 10x better power efficiency). The overall increase in system capacity
(simulated numbers of neurons) is expected to be a factor of approx. 50.
1 Corresponding Author: Christian Mayr, TU Dresden, Mommsenstr. 12, 01062 Dresden, Germany. Tel.: +49
351 463 42392; Fax: +49 351 463 37794; E-mail: christian.mayr@tu-dresden.de.
arXiv:1911.02385v1 [cs.ET] 6 Nov 2019
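The time-stepped scheme described in the Introduction (simulate locally for one tick, then exchange activity) can be illustrated with a minimal sketch. This is not SpiNNaker software: the `Core` class, the leaky integrate-and-fire update and all parameter values (`weight`, `leak`, `threshold`, the external drive) are assumptions chosen only to make the two phases visible.

```python
from dataclasses import dataclass, field

TIMESTEP_MS = 1.0  # tick length from the text; one tick stands for 1 ms of biological time


@dataclass
class Core:
    """Hypothetical stand-in for one core holding a few leaky integrate-and-fire neurons."""
    n_neurons: int
    weight: float = 0.6      # assumed input weight per received spike
    leak: float = 0.9        # assumed per-tick decay of the membrane potential
    threshold: float = 1.0   # assumed firing threshold
    v: list = field(default_factory=list)

    def __post_init__(self):
        self.v = [0.0] * self.n_neurons

    def step(self, n_incoming_spikes, external_input=0.0):
        """Advance all neurons by one tick and return the indices of neurons that fired."""
        fired = []
        for i in range(self.n_neurons):
            self.v[i] = self.v[i] * self.leak + self.weight * n_incoming_spikes + external_input
            if self.v[i] >= self.threshold:
                fired.append(i)
                self.v[i] = 0.0  # reset after a spike
        return fired


def simulate(n_ticks):
    """Run two cores in lock-step: compute one tick locally, then exchange spikes."""
    a, b = Core(n_neurons=2), Core(n_neurons=2)
    inbox_a, inbox_b = 0, 0  # number of spikes delivered at the start of the tick
    history = []
    for tick in range(n_ticks):
        # Phase 1: each core updates its own neurons independently.
        spikes_a = a.step(inbox_a, external_input=0.5)  # only core A receives external drive
        spikes_b = b.step(inbox_b)
        # Phase 2: exchange activity; it becomes visible to the other core in the next tick.
        inbox_a, inbox_b = len(spikes_b), len(spikes_a)
        history.append((tick * TIMESTEP_MS, spikes_a, spikes_b))
    return history
```

In this toy setup, `simulate(5)` lets core A charge up over the first ticks, fire, and trigger core B one tick later, which is the one-tick communication delay implied by exchanging activity once per millisecond.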
1. SpiNNaker2 capabilities and main new building blocks
[Figure 1 panels. SpiNNaker1 (lower row): SpiNNaker1 chip (18 cores); SpiNNaker1 board
(864 ARM cores); "Large Scale" SpiNNaker1 machine (1 Million processors); mobile systems.
SpiNNaker2 (upper row): SpiNNaker2 chip (ca. 144 ARM M4F processors, 18 MByte SRAM,
8 GByte DRAM in PoP package); large scale SpiNNaker2 machine (56 chips x 25 boards x
5 racks x 10 cabinets ≈ 10 Million processors) for "brain-size" network simulations;
mobile systems. Middle: approx. 50x size scaling. Image label: capocaccia.ethz.ch]
Figure 1. First and second generation of the SpiNNaker system. Please note: The above numbers for
SpiNNaker2 represent the current state of design and are subject to change.
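As a quick check of the numbers quoted in Fig. 1, the arithmetic can be written out as follows. The constant names are illustrative, and reading "56 chips x 25 boards x 5 racks x 10 cabinets" as a nested hierarchy (chips per board, boards per rack, racks per cabinet) is an assumption; the SpiNNaker2 figures are preliminary according to the caption.

```python
# Figures taken from Fig. 1; names and the hierarchy reading are assumptions.
CORES_PER_SPINNAKER2_CHIP = 144   # "ca. 144", subject to change per the caption
CHIPS_PER_BOARD = 56
BOARDS_PER_RACK = 25
RACKS_PER_CABINET = 5
CABINETS = 10

SPINNAKER1_CORES_PER_CHIP = 18
SPINNAKER1_CORES_PER_BOARD = 864
SPINNAKER1_TOTAL_CORES = 1_000_000  # nominal "1 Million" from the text


def spinnaker2_total_chips():
    """Number of chips in the full-size SpiNNaker2 machine."""
    return CHIPS_PER_BOARD * BOARDS_PER_RACK * RACKS_PER_CABINET * CABINETS


def spinnaker2_total_cores():
    """Number of cores in the full-size SpiNNaker2 machine."""
    return spinnaker2_total_chips() * CORES_PER_SPINNAKER2_CHIP


def spinnaker1_chips_per_board():
    """Chips on one SpiNNaker1 board, derived from the per-board and per-chip core counts."""
    return SPINNAKER1_CORES_PER_BOARD // SPINNAKER1_CORES_PER_CHIP


print(spinnaker2_total_chips())       # 70000
print(spinnaker2_total_cores())       # 10080000
print(spinnaker1_chips_per_board())   # 48
print(spinnaker2_total_cores() / SPINNAKER1_TOTAL_CORES)  # 10.08
```

Under these assumptions the machine comes out at roughly 10 Million cores, i.e. about ten times the nominal core count of SpiNNaker1, which matches the scaling target stated in the Introduction.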
The 50x capacity scaling is exemplified in the middle of Fig. 1, i.e. the simulation
capacity of a current 48-chip (864-core) SpiNNaker1 board is expected to be carried inside
a single SpiNNaker2 chip. The additional factor of 5 on top of the factor of 10 (i.e. the
increase in core number) is primarily expected from a higher clock rate and numerical
accelerators for common synaptic operations. The general approach for SpiNNaker2 can be
broken down into four main aspects: (1) Keep the processor-based flexibility of SpiNNaker1,
which has been a major advantage compared to more streamlined neuromorphic approaches such
as TrueNorth [2] or analog neuromorphic circuits [3,4]. (2) Don't do everything in software
on the processors, i.e. incorporate numerical accelerators for the most common operations.
Current prototypes contain accelerators for exponential functions [5] and derive random
numbers from the thermal noise of the clock generators inherent in each core, i.e. at
virtually zero circuit overhead [6]. For an example of the overall benefit of the
accelerators in terms of clock cycles, see [7] and other upcoming papers. Other
accelerators, e.g. a log function, are under discussion. (3) Use the latest technologies
and features for energy efficiency. We are targeting a 22nm FDSOI technology for
SpiNNaker2, with on-chip adaptive body biasing (ABB) enabling reliable operation down to
0.4V under virtually all operating conditions. ABB can narrow down the transistor threshold
voltage spread at runtime (i.e. over aging, temperature, manufacturing corner) to enable
robust near-threshold logic operation. (4) Allow workload adaptivity on all levels. Similar
to the brain, power consumption should be proportional to the task being carried out.
Specifically, all processors in SpiNNaker2 operate under dynamic voltage and frequency
scaling [8]. Their operating voltage and frequency are individually adjusted to the load of
incoming spikes per millisecond and the expected clock cycles required to compute this
load. Thus, at times of low computational load, a task is stretched out over time and can
run at a lower supply voltage, reducing both dynamic and leakage power. Besides this
computational adaptivity, communication has also been streamlined. For example, the
chip-to-chip links feature very fast power up and down functionality, allowing energy
proportionality with regard to the bits transmitted. In addition to the above four main
development lines, multiply-accumulate (MAC) arrays have been incorporated in the latest
prototype chip to enhance the usefulness of SpiNNaker2 beyond simulation of spiking neural
networks. With the MACs, the synaptic weighting and accumulation of entire layers of
conventional deep neural networks can be offloaded from the processors, freeing them to
carry out spike-based simulations in parallel. For a discussion on how to parallelize these
new network types across processors, see [7].
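The workload-adaptive voltage and frequency selection described under aspect (4) can be sketched as follows. This is a minimal illustration, not the scheme of [8]: the operating points in `LEVELS`, the per-spike and baseline cycle costs, and all function names are hypothetical. The energy estimate uses the textbook approximation that dynamic energy per cycle scales with the square of the supply voltage and ignores leakage.

```python
from typing import NamedTuple

TICK_S = 1e-3  # 1 ms tick, as in the text


class PerformanceLevel(NamedTuple):
    name: str
    voltage_v: float
    freq_hz: float


# Hypothetical operating points, slowest first; not taken from the SpiNNaker2 design.
LEVELS = [
    PerformanceLevel("PL1", 0.5, 100e6),
    PerformanceLevel("PL2", 0.6, 200e6),
    PerformanceLevel("PL3", 0.8, 400e6),
]


def required_cycles(n_spikes, cycles_per_spike=500, base_cycles=50_000):
    """Estimate the clock cycles needed in this tick (both cost figures are assumptions)."""
    return base_cycles + n_spikes * cycles_per_spike


def select_level(n_spikes, levels=LEVELS):
    """Pick the slowest level that still finishes the tick's work within 1 ms."""
    cycles = required_cycles(n_spikes)
    for level in levels:
        if cycles <= level.freq_hz * TICK_S:
            return level
    return levels[-1]  # overloaded: fall back to the fastest level


def relative_dynamic_energy(level, reference=LEVELS[-1]):
    """Dynamic energy for the same work, relative to running it at the reference level.

    With E ~ cycles * C * V^2, the cycle count and capacitance cancel in the ratio.
    """
    return (level.voltage_v / reference.voltage_v) ** 2


for spikes in (50, 200, 600):
    level = select_level(spikes)
    print(spikes, level.name, round(relative_dynamic_energy(level), 3))
```

With these made-up numbers, a tick with few incoming spikes is stretched over the full millisecond at the lowest voltage, while a busy tick selects the fastest level; the ratio printed in the last column shows why running slower at lower voltage saves dynamic energy for the same amount of work.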
2. Brain simulation and other applications
Sheer scaling of SpiNNaker2 compared to SpiNNaker1 should allow the simulation of
significantly larger and more complex spiking neural networks. In an upcoming publication,
we will show an implementation of synaptic sampling [9], i.e. the most complex synaptic
plasticity ever implemented on SpiNNaker. This implementation is significantly aided by the
addition of the new numerical accelerators. In addition, there is scope for extending these
spiking simulations to multiscale networks, by using the MACs to simulate entire networks
as mesoscopic, black box modules. SpiNNaker2 should be able to run models like BioSpaun
[10] on levels of abstraction ranging from multicompartment neurons via spiking point
neurons and rate-based neurons up to mesoscopic models. Beyond these extensions of
traditional use cases, the upper left corner of Fig. 1 shows other future uses.
Specifically, SpiNNaker2 combines high throughput machine learning, sensor-actuator
processing with inherent millisecond latency and IoT-device-level energy efficiency, which
represents a breakthrough in the field of real-time, mobile human-machine interaction.
Targeted applications for SpiNNaker2 in this area include the tactile internet (e.g.
tele-learning, robotics interaction), autonomous driving, industry 4.0 (e.g. real time
predictive maintenance) and biomedical processing (e.g. closed-loop neural implants with
spiking and machine learning functionality). For an overview of the current SpiNNaker2
development and technical details (MACs, FDSOI approach, accelerators, etc.), see [11]. A
last word about the timetable: After the current phase of prototyping, we expect the final
SpiNNaker2 chip to become available sometime in late 2020 or early 2021, so look out for
the 2021 Capo Caccia and Telluride neuromorphic workshops.
Acknowledgements
The authors thank ARM for IP contributions. Discussions at the yearly Capo Caccia workshop
have contributed to the prototype designs and benchmarks. The research leading to these
results has received funding from the European Union Seventh Framework Programme (FP7)
under grant agreement No 604102 and the EU's Horizon 2020 research and innovation
programme under grant agreements No 720270 and 785907 (Human Brain Project, HBP).
References
[1] Steve B Furber, David R Lester, Luis A Plana, Jim D Garside, Eustace Painkras, Steve Temple, and
Andrew D Brown. Overview of the SpiNNaker system architecture. IEEE Transactions on Computers,
62(12):2454–2467, 2013.
[2] Paul A Merolla, John V Arthur, Rodrigo Alvarez-Icaza, Andrew S Cassidy, Jun Sawada, Filipp Akopyan,
Bryan L Jackson, Nabil Imam, Chen Guo, Yutaka Nakamura, et al. A million spiking-neuron integrated
circuit with a scalable communication network and interface. Science, 345(6197):668–673, 2014.
[3] Mihai A Petrovici, Sebastian Schmitt, Johann Klähn, David Stöckel, Anna Schroeder, Guillaume
Bellec, Johannes Bill, Oliver Breitwieser, Ilja Bytschok, Andreas Grübl, et al. Pattern representation
and recognition with accelerated analog neuromorphic systems. In Circuits and Systems (ISCAS), 2017
IEEE International Symposium on, pages 1–4. IEEE, 2017.
[4] A König, C Mayr, T Bormann, and C Klug. Dedicated implementation of embedded vision systems
employing low-power massively parallel feature computation. In Proceedings of the 3rd VIVA-Workshop
on Low-Power Information Processing, pages 1–8, 2002.
[5] Johannes Partzsch, Sebastian Höppner, Matthias Eberlein, Rene Schüffny, Christian Mayr, David R
Lester, and Steve Furber. A fixed point exponential function accelerator for a neuromorphic many-core
system. In Circuits and Systems (ISCAS), 2017 IEEE International Symposium on, pages 1–4. IEEE, 2017.
[6] Felix Neumärker, Sebastian Höppner, Andreas Dixius, and Christian Mayr. True random number
generation from bang-bang ADPLL jitter. In Nordic Circuits and Systems Conference (NORCAS), 2016 IEEE,
pages 1–5. IEEE, 2016.
[7] Chen Liu, Guillaume Bellec, Bernhard Vogginger, David Kappel, Johannes Partzsch, Sebastian Höppner,
Wolfgang Maass, Steve B Furber, Robert Legenstein, Christian G Mayr, et al. Memory-efficient deep
learning on a SpiNNaker 2 prototype. Frontiers in Neuroscience, 12:840, 2018.
[8] Sebastian Höppner, Yexin Yan, Bernhard Vogginger, Andreas Dixius, Johannes Partzsch, Felix Neumärker,
Stephan Hartmann, Stefan Schiefer, Stefan Scholze, Georg Ellguth, et al. Dynamic voltage and frequency
scaling for neuromorphic many-core systems. In Circuits and Systems (ISCAS), 2017 IEEE International
Symposium on, pages 1–4. IEEE, 2017.
[9] David Kappel, Robert Legenstein, Stefan Habenschuss, Michael Hsieh, and Wolfgang Maass. A dynamic
connectome supports the emergence of stable computational function of neural circuits through
reward-based learning. eNeuro, 5(2):ENEURO–0301, 2018.
[10] Chris Eliasmith, Jan Gosmann, and Xuan Choo. BioSpaun: A large-scale behaving brain model with
complex neurons. arXiv preprint arXiv:1602.05220, 2016.
[11] Sebastian Höppner and Christian Mayr. SpiNNaker2 - towards extremely efficient digital neuromorphics
and multi-scale brain emulation. Neuro Inspired Computational Elements (NICE 2018), 2018.