Pier Stanislao Paolucci

INFN - Istituto Nazionale di Fisica Nucleare | INFN · Rome I

About

Publications: 175
Reads: 15,189
Citations: 3,128
Citations since 2016: 784 (54 research items)

Publications (175)
Preprint
Full-text available
Neuroscience is moving towards a more integrative discipline, where understanding brain function requires consolidating the accumulated evidence seen across experiments, species, and measurement techniques. A remaining challenge on that path is integrating such heterogeneous data into analysis workflows such that consistent and comparable conclusio...
Preprint
Full-text available
Sleep is known to play a central role in learning and cognition, yet the mechanisms underlying its role in stabilizing learning and improving energetic management are still to be clarified. It is characterized by patterns of cortical activity alternating between the stages of slow wave sleep (NREM) and rapid eye movement sleep (REM). In this work,...
Preprint
Full-text available
The brain can efficiently learn a wide range of tasks, motivating the search for biologically inspired learning rules for improving current artificial intelligence technology. Most biological models are composed of point neurons, and cannot achieve the state-of-the-art performances in machine learning. Recent works have proposed that segregation of...
Article
Full-text available
Working Memory (WM) is a cognitive mechanism that enables temporary holding and manipulation of information in the human brain. This mechanism is mainly characterized by a neuronal activity during which neuron populations are able to maintain an enhanced spiking activity after being triggered by a short external cue. In this study, we implement, us...
Article
In the near future, Exascale systems will need to bridge three technology gaps to achieve high performance while remaining under tight power constraints: energy efficiency and thermal control; extreme computation efficiency via HW acceleration and new arithmetic; methods and tools for seamless integration of reconfigurable accelerators in heterogen...
Conference Paper
Full-text available
The brain can learn to solve a wide range of tasks with high temporal and energetic efficiency. However, most biological models are composed of simple single-compartment neurons and cannot achieve the state-of-the-art performances of artificial intelligence. We propose a multi-compartment model of pyramidal neuron, in which bursts and dendritic inp...
Article
Full-text available
Spiking neural network models are increasingly establishing themselves as an effective tool for simulating the dynamics of neuronal populations and for understanding the relationship between these dynamics and brain function. Furthermore, the continuous development of parallel computing technologies and the growing availability of computational res...
Article
Full-text available
The field of recurrent neural networks is over-populated by a variety of proposed learning rules and protocols. The scope of this work is to define a generalized framework, to move a step forward towards the unification of this fragmented scenario. In the field of supervised learning, two opposite approaches stand out, error-based and target-based....
Preprint
Full-text available
Humans and animals can learn new skills after practicing for a few hours, while current reinforcement learning algorithms require a large amount of data to achieve good performances. Recent model-based approaches show promising results by reducing the number of necessary interactions with the environment to learn a desirable policy. However, these...
Preprint
Full-text available
The brain can learn to solve a wide range of tasks with high temporal and energetic efficiency. However, most biological models are composed of simple single-compartment neurons and cannot achieve the state-of-the-art performances of artificial intelligence. We propose a multi-compartment model of pyramidal neuron, in which bursts and dendritic input s...
Preprint
Full-text available
The APEnet+ board delivers a point-to-point, low-latency, 3D torus network interface card. In this paper we describe the latest generation of APEnet NIC, APEnet v5, integrated in a PCIe Gen3 board based on a state-of-the-art, 28 nm Altera Stratix V FPGA. The NIC features a network architecture designed following the Remote DMA paradigm and tailored...
Conference Paper
Full-text available
To achieve high performance and high energy efficiency on near-future exascale computing systems, three key technology gaps need to be bridged. These gaps include: energy efficiency and thermal control; extreme computation efficiency via HW acceleration and new arithmetics; methods and tools for seamless integration of reconfigurable accelerators...
Preprint
Full-text available
Learning in biological or artificial networks means changing the laws governing the network dynamics in order to better behave in a specific situation. In the field of supervised learning, two complementary approaches stand out: error-based and target-based learning. However, there exists no consensus on which is better suited for which task, and w...
Article
Full-text available
The brain exhibits capabilities of fast incremental learning from few noisy examples, as well as the ability to associate similar memories in autonomously-created categories and to combine contextual hints with sensory perceptions. Together with sleep, these mechanisms are thought to be key components of many high-level cognitive functions. Yet, li...
Preprint
Full-text available
Recent enhancements in neuroscience, like the development of new and powerful recording techniques of the brain activity combined with the increasing anatomical knowledge provided by atlases and the growing understanding of neuromodulation principles, allow studying the brain at a whole new level, paving the way to the creation of extremely detaile...
Article
Full-text available
Recurrent spiking neural networks (RSNN) in the brain learn to perform a wide range of perceptual, cognitive and motor tasks very efficiently in terms of energy consumption and their training requires very few examples. This motivates the search for biologically inspired learning rules for RSNNs, aiming to improve our understanding of brain computa...
Article
Full-text available
Over the past decade there has been a growing interest in the development of parallel hardware systems for simulating large-scale networks of spiking neurons. Compared to other highly-parallel systems, GPU-accelerated solutions have the advantage of a relatively low cost and a great versatility, thanks also to the possibility of using the CUDA-C/C+...
Preprint
Full-text available
Recurrent spiking neural networks (RSNN) in the human brain learn to perform a wide range of perceptual, cognitive and motor tasks very efficiently in terms of energy consumption and require very few examples. This motivates the search for biologically inspired learning rules for RSNNs to improve our understanding of brain computation and the effi...
Preprint
Full-text available
Over the past decade there has been a growing interest in the development of parallel hardware systems for simulating large-scale networks of spiking neurons. Compared to other highly-parallel systems, GPU-accelerated solutions have the advantage of a relatively low cost and a great versatility, thanks also to the possibility of using the CUDA-C/C+...
Preprint
Full-text available
The brain exhibits capabilities of fast incremental learning from few noisy examples, as well as the ability to associate similar memories in autonomously-created categories and to combine contextual hints with sensory perceptions. Together with sleep, these mechanisms are thought to be key components of many high-level cognitive functions. Yet, li...
Article
Full-text available
Slow waves (SWs) are spatio-temporal patterns of cortical activity that occur both during natural sleep and anesthesia and are preserved across species. Even though electrophysiological recordings have been largely used to characterize brain states, they are limited in spatial resolution and cannot target specific neuronal populations. Recently,...
Article
Full-text available
EuroEXA is a major European FET research initiative that aims to deliver a proof-of-concept of a next generation Exa-scalable HPC platform. EuroEXA leverages the results of previous projects (ExaNeSt, ExaNoDe and ECOSCALE) to design a medium scale but scalable, fully working HPC system prototype exploiting state-of-the-art FPGA devices that integrate c...
Article
Full-text available
Cortical slow oscillations (≲1 Hz) are an emergent property of the cortical network that integrate connectivity and physiological features. This rhythm, highly revealing of the characteristics of the underlying dynamics, is a hallmark of low complexity brain states like sleep, and represents a default activity pattern. Here, we present a methodolog...
Article
Full-text available
Cortical synapse organization supports a range of dynamic states on multiple spatial and temporal scales, from synchronous slow wave activity (SWA), characteristic of deep sleep or anesthesia, to fluctuating, asynchronous activity during wakefulness (AW). Such dynamic diversity poses a challenge for producing efficient large-scale simulations that...
Article
Full-text available
The occurrence of sleep passed through the evolutionary sieve and is widespread in animal species. Sleep is known to be beneficial to cognitive and mnemonic tasks, while chronic sleep deprivation is detrimental. Despite the importance of the phenomenon, a complete understanding of its functions and underlying mechanisms is still lacking. In this pa...
Preprint
Full-text available
Cortical synapse organization supports a range of dynamic states on multiple spatial and temporal scales, from synchronous slow wave activity (SWA), characteristic of deep sleep or anesthesia, to fluctuating, asynchronous activity during wakefulness (AW). Such dynamic diversity poses a challenge for producing efficient large-scale simulations that...
Preprint
Full-text available
Cortical slow oscillations are an emergent property of the cortical network, hallmark of low complexity brain states like sleep, and representing a default activity pattern. Here, we present a methodological approach for quantifying the spatial and temporal properties of this emergent activity. We improved and enriched a robust analysis procedure t...
Preprint
Full-text available
We profile the impact of computation and inter-processor communication on the energy consumption and on the scaling of cortical simulations approaching the real-time regime on distributed computing platforms. Also, the speed and energy consumption of processor architectures typical of standard HPC and embedded platforms are compared. We demonstrate...
Preprint
Full-text available
Slow waves (SWs) occur both during natural sleep and anesthesia and are universal across species. Even though electrophysiological recordings have been largely used to characterize brain states, they are limited in spatial resolution and cannot target specific neuronal populations. Recently, large-scale optical imaging techniques coupled with fu...
Preprint
Full-text available
The occurrence of sleep passed through the evolutionary sieve and is widespread in animal species. Sleep is known to be beneficial to cognitive and mnemonic tasks, while chronic sleep deprivation is detrimental. Despite the importance of the phenomenon, a theoretical and computational approach demonstrating the underlying mechanisms is still lackin...
Article
Full-text available
The use of GPUs to implement general purpose computational tasks, known as GPGPU for some fifteen years, has reached maturity. Applications take advantage of the parallel architectures of these devices in many different domains. Over the last few years several works have demonstrated the effectiveness of the integration of GPU-based systems in the...
Article
The ExaNeSt project started in December 2015 and is funded by the EU H2020 research framework (call H2020-FETHPC-2014, n. 671553) to study the adoption of low-cost, Linux-based, power-efficient 64-bit ARM processor clusters for Exascale-class systems. The ExaNeSt consortium pools partners with industrial and academic research expertise in storage, inte...
Article
Full-text available
The deployment of the next generation computing platform at ExaFlops scale requires solving new technological challenges, mainly related to the impressive number (up to 10^6) of compute elements required. This impacts system power consumption, in terms of feasibility and costs, as well as system scalability and computing efficiency. In this perspect...
Article
Full-text available
Efficient brain simulation is a scientific grand challenge, a parallel/distributed coding challenge and a source of requirements and suggestions for future computing architectures. Indeed, the human brain includes about 10^15 synapses and 10^11 neurons activated at a mean rate of several Hz. Full brain simulation poses Exascale challenges even if s...
Article
Full-text available
We measured the impact of long-range exponentially decaying intra-areal lateral connectivity on the scaling and memory occupation of a distributed spiking neural network simulator compared to that of short-range Gaussian decays. While previous studies adopted short-range connectivity, recent experimental neuroscience studies are pointing out the r...
Article
Full-text available
In this paper we present the status of the 3rd generation design of the APEnet board (V5) built upon the 28 nm Altera Stratix V FPGA; it features a PCIe Gen3 x8 interface and enhanced embedded transceivers with a maximum capability of 12.5 Gbps each. The network architecture is designed in accordance with the Remote DMA paradigm. The APEnet+ V5 prototy...
Article
Full-text available
With processor architecture evolution, the HPC market has undergone a paradigm shift. The adoption of low-cost, Linux-based clusters extended the reach of HPC from its roots in modelling and simulation of complex physical systems to a broader range of industries, from biotechnology, cloud computing, computer analytics and big data challenges to man...
Article
Full-text available
Energy consumption is today one of the most relevant issues in operating HPC systems for scientific applications. The use of unconventional computing systems is therefore of great interest for several scientific communities looking for a better tradeoff between time-to-solution and energy-to-solution. In this context, the performance assessment of...
Article
NaNet is a framework for the development of FPGA-based PCI Express (PCIe) Network Interface Cards (NICs) with real-time data transport architecture that can be effectively employed in TRIDAQ systems. Key features of the architecture are the flexibility in the configuration of the number and kind of the I/O channels, the hardware offloading of the n...
Article
This project aims to exploit the parallel computing power of a commercial Graphics Processing Unit (GPU) to implement fast pattern matching in the Ring Imaging Cherenkov (RICH) detector for the level 0 (L0) trigger of the NA62 experiment. In this approach, the ring-fitting algorithm is seedless, being fed with raw RICH data, with no previous inform...
Article
The parallel computing power of commercial Graphics Processing Units (GPUs) is exploited to perform real-time ring fitting at the lowest trigger level using information coming from the Ring Imaging Cherenkov (RICH) detector of the NA62 experiment at CERN. To this purpose, direct GPU communication with a custom FPGA-based board has been used to redu...
Article
Full-text available
A commercial Graphics Processing Unit (GPU) is used to build a fast Level 0 (L0) trigger system tested parasitically with the TDAQ (Trigger and Data Acquisition systems) of the NA62 experiment at CERN. In particular, the parallel computing power of the GPU is exploited to perform real-time fitting in the Ring Imaging CHerenkov (RICH) detector. Dire...
Article
Full-text available
Over the last few years the GPGPU (General-Purpose computing on Graphics Processing Units) paradigm represented a remarkable development in the world of computing. Computing for High-Energy Physics is no exception: several works have demonstrated the effectiveness of the integration of GPU-based systems in high level trigger of different experiment...
Article
Full-text available
A GPU-based low level (L0) trigger is currently integrated in the experimental setup of the RICH detector of the NA62 experiment to assess the feasibility of building more refined physics-related trigger primitives and thus improve the trigger discriminating power. To ensure the real-time operation of the system, a dedicated data transport mechanis...
Article
Full-text available
The KM3NeT-Italia underwater neutrino detection unit, the tower, consists of 14 floors. Each floor supports 6 Optical Modules containing front-end electronics needed to digitize the PMT signal, format and transmit the data and 2 hydrophones that reconstruct in real-time the position of Optical Modules, for a maximum tower throughput of more than 60...
Article
General-purpose computing on GPUs is emerging as a new paradigm in several fields of science, although so far applications have been tailored to employ GPUs as accelerators in offline computations. With the steady decrease of GPU latencies and the increase in link and memory throughputs, time is ripe for real-time applications using GPUs in high-en...
Article
Full-text available
NaNet-10 is a four-port 10GbE PCIe Network Interface Card designed for low-latency real-time operations with GPU systems. To this purpose the design includes a UDP offload module, for fast and clock-cycle deterministic handling of the transport layer protocol, plus a GPUDirect P2P/RDMA engine for low-latency communication with NVIDIA Tesla GPU de...
Article
Full-text available
In the attempt to develop an interconnection architecture optimized for hybrid HPC systems dedicated to scientific computing, we designed APEnet+, a point-to-point, low-latency and high-performance network controller supporting 6 fully bidirectional off-board links over a 3D torus topology. The first release of APEnet+ (named V4) was a board based...
Article
Full-text available
Recent experimental neuroscience studies are pointing out the role of long-range intra-areal connectivity that can be modeled by a distance dependent exponential decay of the synaptic probability distribution. This short report provides a preliminary measure of the impact of exponentially decaying lateral connectivity compared to that of shorter-ra...
Article
High Performance Computing is becoming increasingly relevant for industry and academia. With the current development in the processor market, modern systems quickly grow in size, i.e. in number of cores, but only little in terms of performance, i.e. in actual execution speed. The reason for that is the increasing impact of the memory and communication...
Article
General-purpose computing on GPUs (Graphics Processing Units) is emerging as a new paradigm in several fields of science, although so far applications have been tailored to the specific strengths of such devices as accelerators in offline computation. With the steady reduction of GPU latencies, and the increase in link and memory throughput, the use...
Article
Full-text available
This short report describes the scaling, up to 1024 software processes and hardware cores, of a distributed simulator of plastic spiking neural networks. A previous report demonstrated good scalability of the simulator up to 128 processes. Herein we extend the speed-up measurements and strong and weak scaling analysis of the simulator to the range...
Article
In the next decade, a growing number of scientific and industrial applications will require power-efficient systems providing unprecedented computation, memory, and communication resources. A promising paradigm foresees the use of heterogeneous many-tile architectures. The resulting computing systems are complex: they must be protected against seve...