Enrico Calore

INFN - Istituto Nazionale di Fisica Nucleare | INFN · Ferrara

PhD

About

100
Publications
12,648
Reads
1,758
Citations
Introduction
I received my BSc and MSc in Computer Engineering from the University of Padova in 2006 and 2010, respectively, and my PhD in Computer Science from the University of Milan in 2014. I was a postdoc at INFN and the University of Ferrara until 2019, and I am now a Research Engineer at INFN Ferrara. My main research interests are HPC, parallel computing, performance evaluation, scientific computing, code portability, and code optimization for performance and energy efficiency.
Additional affiliations
January 2020 - present
INFN - Istituto Nazionale di Fisica Nucleare
Position
  • Engineer
January 2019 - December 2019
INFN - Istituto Nazionale di Fisica Nucleare
Position
  • PostDoc Position
January 2015 - December 2018
University of Ferrara
Position
  • PostDoc Position
Education
January 2014
University of Milan
Field of study
  • Computer Science
November 2010
University of Padova
Field of study
  • Computer Engineering
July 2006
University of Padova
Field of study
  • Computer Engineering

Publications

Article
Improper camera orientation produces convergent vertical lines (keystone distortion) and skewed horizon lines (horizon distortion) in digital pictures; a-posteriori processing is then necessary to obtain appealing pictures. We show here that, after accurate calibration, the camera on-board accelerometer can be used to automatically generate an a...
Article
This paper describes a massively parallel code for a state-of-the-art thermal lattice–Boltzmann method. Our code has been carefully optimized for performance on one GPU and to have a good scaling behavior extending to a large number of GPUs. Versions of this code have already been used for large-scale studies of convective turbulence. GPUs are beco...
Article
Energy efficiency is becoming increasingly important for computing systems, in particular for large-scale HPC facilities. In this work we evaluate, from a user perspective, the use of Dynamic Voltage and Frequency Scaling (DVFS) techniques, assisted by the power and energy monitoring capabilities of modern processors, in order to tune applications...
Article
Full-text available
Nowadays, the use of hardware accelerators to boost the performance of HPC applications is a consolidated practice, and among others, GPUs are by far the most widespread. More recently, some data centers have successfully deployed also FPGA accelerated systems, especially to boost machine learning inference algorithms. Given the growing use of mach...
Article
Full-text available
Calcification of the aortic valve (CAVDS) is a major cause of aortic stenosis (AS) leading to loss of valve function which requires the substitution by surgical aortic valve replacement (SAVR) or transcatheter aortic valve intervention (TAVI). These procedures are associated with high post-intervention mortality, then the corresponding risk assessm...
Article
Quantum Sensing is a rapidly expanding research field that finds one of its applications in Fundamental Physics, such as the search for Dark Matter. Devices based on superconducting qubits have already been successfully applied in detecting few-GHz single photons via Quantum Non-Demolition measurement (QND). This technique allows us to perform repeatabl...
Article
Full-text available
We unveil the multifractal behavior of Ising spin glasses in their low-temperature phase. Using the Janus II custom-built supercomputer, the spin-glass correlation function is studied locally. Dramatic fluctuations are found when pairs of sites at the same distance are compared. The scaling of these fluctuations, as the spin-glass coherence length...
Conference Paper
The Computational Storage paradigm is attracting increasing interest in many applications because of the performance and the energy-efficiency improvement, given by the tight coupling of processing elements with Solid State Drives through proper interconnection fabrics. In this work, we study a computational storage architecture aimed to boost the...
Article
Full-text available
Agriculture acts as a catalyst for comprehensive economic growth, boosting income levels, mitigating poverty, and contrasting hunger. For these reasons, it is important to monitor agricultural practices and the use of parcels carefully and automatically to support the development of sustainable use of natural resources. The deployment of high-resol...
Preprint
Rejuvenation and memory, long considered the distinguishing features of spin glasses, have recently been proven to result from the growth of multiple length scales. This insight, enabled by simulations on the Janus II supercomputer, has opened the door to a quantitative analysis. We combine numerical simulations with comparable experiments to intro...
Preprint
Full-text available
Recent trends in deep learning (DL) imposed hardware accelerators as the most viable solution for several classes of high-performance computing (HPC) applications such as image classification, computer vision, and speech recognition. This survey summarizes and classifies the most recent advances in designing DL accelerators suitable to reach the pe...
Article
The extended principle of superposition has been a touchstone of spin-glass dynamics for almost 30 years. The Uppsala group has demonstrated its validity for the metallic spin glass, CuMn, for magnetic fields H up to 10 Oe at the reduced temperature Tr=T/Tg=0.95, where Tg is the spin-glass condensation temperature. For H>10 Oe, they observe a depar...
Preprint
We unveil the multifractal behavior of Ising spin glasses in their low-temperature phase. Using the Janus II custom-built supercomputer, the spin-glass correlation function is studied locally. Dramatic fluctuations are found when pairs of sites at the same distance are compared. The scaling of these fluctuations, as the spin-glass coherence length...
Article
Full-text available
Memory and rejuvenation effects in the magnetic response of off-equilibrium spin glasses have been widely regarded as the doorway into the experimental exploration of ultrametricity and temperature chaos. Unfortunately, despite more than twenty years of theoretical efforts following the experimental discovery of memory and rejuvenation, these effec...
Article
Full-text available
Precise assessment of calcification lesions in the Aortic Root (AR) is relevant for the success of the Transcatheter Aortic Valve Implantation (TAVI) procedure. To this end, the radiologists analyze the Cardiac Computed Tomography (CCT) scans of patients, and detect the position and extent of the calcium deposits. In this contribution, we develop a...
Article
Full-text available
One of the objectives fostered in medical science is the so-called precision medicine, which requires the analysis of a large amount of survival data from patients to deeply understand treatment options. Tools like machine learning (ML) and deep neural networks are becoming a de-facto standard. Nowadays, computing facilities based on the Von Neuman...
Preprint
Time-reversal symmetry is spontaneously broken in spin glasses below their glass temperature. Under such conditions, the standard assumption about the equivalence of the most standard protocols (i.e., "no big difference between switching the field on or off", as it is sometimes said) is not really justified. In fact, we show here that the spin-gl...
Preprint
Full-text available
Memory and rejuvenation effects in the magnetic response of off-equilibrium spin glasses have been widely regarded as the doorway into the experimental exploration of ultrametricity and temperature chaos (maybe the most exotic features in glassy free-energy landscapes). Unfortunately, despite more than twenty years of theoretical efforts following...
Article
Full-text available
Experiments featuring non-equilibrium glassy dynamics under temperature changes still await interpretation. There is a widespread feeling that temperature chaos (an extreme sensitivity of the glass to temperature changes) should play a major role but, up to now, this phenomenon has been investigated solely under equilibrium conditions. In fact, the...
Article
The synergy between experiment, theory, and simulations enables a microscopic analysis of spin-glass dynamics in a magnetic field in the vicinity of and below the spin-glass transition temperature $T_\mathrm{g}$. The spin-glass correlation length, $\xi(t,t_\mathrm{w};T)$, is analysed both in experiments and in simulations in terms of the waiting time $t_\mathrm{w}$ after the...
Preprint
Full-text available
The synergy between experiment, theory, and simulations enables a microscopic analysis of spin-glass dynamics in a magnetic field in the vicinity of and below the spin-glass transition temperature $T_\mathrm{g}$. The spin-glass correlation length, $\xi(t,t_\mathrm{w};T)$, is analysed both in experiments and in simulations in terms of the waiting ti...
Article
The correlation length ξ, a key quantity in glassy dynamics, can now be precisely measured for spin glasses both in experiments and in simulations. However, known analysis methods lead to discrepancies either for large external fields or close to the glass temperature. We solve this problem by introducing a scaling law that takes into account both...
Preprint
Full-text available
We find a dynamic effect in the non-equilibrium dynamics of a spin glass that closely parallels equilibrium temperature chaos. This effect, that we name dynamic temperature chaos, is spatially heterogeneous to a large degree. The key controlling quantity is the time-growing spin-glass coherence length. Our detailed characterization of dynamic tempe...
Preprint
Full-text available
The correlation length $\xi$, a key quantity in glassy dynamics, can now be precisely measured for spin glasses both in experiments and in simulations. However, known analysis methods lead to discrepancies either for large external fields or close to the glass temperature. We solve this problem by introducing a scaling law that takes into account b...
Article
Full-text available
This paper presents the performance analysis for both the computing performance and the energy efficiency of a Lattice Boltzmann Method (LBM) based application, used to simulate three-dimensional multicomponent turbulent systems on massively parallel architectures for high-performance computing. Extending results reported in previous works, the ana...
Article
Full-text available
We illustrate the application of quantum computing techniques to the investigation of the thermodynamical properties of a simple system, made up of three quantum spins with frustrated pair interactions and affected by a hard sign problem when treated within classical computational schemes. We show how quantum algorithms completely solve the problem...
Article
Full-text available
In the last years, the energy efficiency of HPC systems is increasingly becoming of paramount importance for environmental, technical, and economical reasons. Several projects have investigated the use of different processors and accelerators in the quest of building systems able to achieve high energy efficiency levels for data centers and HPC ins...
Chapter
Full-text available
In this work we describe a method to measure the computing performance and energy-efficiency to be expected of an FPGA device. The motivation of this work is given by their possible usage as accelerators in the context of floating-point intensive HPC workloads. In fact, FPGA devices in the past were not considered an efficient option to address flo...
Chapter
Full-text available
Reconfigurable computing, exploiting Field Programmable Gate Arrays (FPGAs), has become of great interest for both academic and industrial research thanks to the possibility of greatly accelerating a variety of applications. The interest has been further boosted by recent developments of FPGA programming frameworks which allow designing applications at...
Chapter
Full-text available
In this paper we report results of the analysis of the computational performance and energy efficiency of a Lattice Boltzmann method (LBM) based application on the Intel KNL family of processors. In particular we analyse the impact of the main memory (DRAM) while using optimised memory access patterns to access data on the on-chip memory (MCDRAM) c...
Chapter
Energy-efficiency is already of paramount importance for High Performance Computing (HPC) systems operation, and tools to monitor power usage and tune relevant hardware parameters are already available and in use at major supercomputing centres. On the other hand, HPC application developers and users still usually focus just on performance, even if...
Chapter
This paper presents an early performance assessment of the ThunderX2, the most recent Arm-based multi-core processor designed for HPC applications. We use as benchmarks well known stencil-based LBM and LQCD algorithms, widely used to study respectively fluid flows, and interaction properties of elementary particles. We run benchmark kernels derived...
Preprint
Full-text available
We illustrate the application of Quantum Computing techniques to the investigation of the thermodynamical properties of a simple system, made up of three quantum spins with frustrated pair interactions and affected by a hard sign problem when treated within classical computational schemes. We show how quantum algorithms completely solve the problem...
Article
Full-text available
We investigate the fate of the Roberge-Weiss endpoint transition and its connection with the restoration of chiral symmetry as the chiral limit of Nf=2+1 QCD is approached. We adopt a stout staggered discretization on lattices with Nt=4 sites in the temporal direction; the chiral limit is approached maintaining a constant physical value of the stra...
Preprint
Full-text available
We investigate the fate of the Roberge-Weiss endpoint transition and its connection with the restoration of chiral symmetry as the chiral limit of $N_f = 2+1$ QCD is approached. We adopt a stout staggered discretization on lattices with $N_t = 4$ sites in the temporal direction; the chiral limit is approached maintaining a constant physical value o...
Article
Full-text available
Experiments on spin glasses can now make precise measurements of the exponent z(T) governing the growth of glassy domains, while our computational capabilities allow us to make quantitative predictions for experimental scales. However, experimental and numerical values for z(T) have differed. We use new simulations on the Janus II computer to resol...
Chapter
In this contribution we measure the computing and energy performance of the recently developed DAVIDE HPC-cluster, a massively parallel machine based on IBM POWER CPUs and NVIDIA Pascal GPUs. We use as an application benchmark the OpenStaPLE Lattice QCD code, written using the OpenACC programming framework. Our code exploits the computing performan...
Article
Full-text available
Energy consumption of processors and memories is quickly becoming a limiting factor in the deployment of large computing systems. For this reason, it is important to understand the energy performance of these processors and to study strategies allowing their use in the most efficient way. In this work, we focus on the computing and energy performan...
Article
Full-text available
Performance analysis tools allow application developers to identify and characterize the inefficiencies that cause performance degradation in their codes, allowing for application optimizations. Due to the increasing interest in the High Performance Computing (HPC) community towards energy-efficiency issues, it is of paramount importance to be able...
Chapter
GPUs deliver higher performance than traditional processors, offering remarkable energy efficiency, and are quickly becoming very popular processors for HPC applications. Still, writing efficient and scalable programs for GPUs is not an easy task as codes must adapt to increasingly parallel architecture features. In this chapter, the authors descri...
Article
Full-text available
Significance: The Mpemba effect, wherein an initially hotter system relaxes faster when quenched to lower temperatures than an initially cooler system, has attracted much attention. Paradoxically, its very existence is a hot topic. Using massive numerical simulations, we show unambiguously that the Mpemba effect is present in the archetypical model...
Article
Full-text available
Energy consumption is increasingly becoming a limiting factor to the design of faster large-scale parallel systems, and development of energy-efficient and energy-aware applications is today a relevant issue for HPC code-developer communities. In this work we focus on energy performance of the Knights Landing (KNL) Xeon Phi, the latest many-core ar...
Preprint
Full-text available
Energy consumption of processors and memories is quickly becoming a limiting factor in the deployment of large computing systems. For this reason, it is important to understand the energy performance of these processors and to study strategies allowing their use in the most efficient way. In this work we focus on computing and energy performance o...
Chapter
The Knights Landing (KNL) is the codename for the latest generation of Intel processors based on Intel Many Integrated Core (MIC) architecture. It relies on massive thread and data parallelism, and fast on-chip memory. This processor operates in standalone mode, booting an off-the-shelf Linux operating system. The KNL peak performance is very high...
Article
Full-text available
Experiments on spin glasses can now make precise measurements of the exponent $z(T)$ governing the growth of glassy domains, while our computational capabilities allow us to make quantitative predictions for experimental scales. However, experimental and numerical values for $z(T)$ have differed. We use new simulations on the Janus II computer to r...
Article
Full-text available
This paper describes a state-of-the-art parallel Lattice QCD Monte Carlo code for staggered fermions, purposely designed to be portable across different computer architectures, including GPUs and commodity CPUs. Portability is achieved using the OpenACC parallel programming model, used to develop a code that can be compiled for several processor ar...
Conference Paper
Performance analysis tools allow application developers to identify and characterize the inefficiencies that cause performance degradation in their codes. Due to the increasing interest in the High Performance Computing (HPC) community towards energy-efficiency issues, it is of paramount importance to be able to correlate performance and power figu...
Article
Full-text available
Varying from multi-core CPU processors to many-core GPUs, the present scenario of HPC architectures is extremely heterogeneous. In this context, code portability is increasingly important for easy maintainability of applications; this is relevant in scientific computing where code changes are numerous and frequent. In this talk we present the desig...
Article
Full-text available
Energy consumption is today one of the most relevant issues in operating HPC systems for scientific applications. The use of unconventional computing systems is therefore of great interest for several scientific communities looking for a better tradeoff between time-to-solution and energy-to-solution. In this context, the performance assessment of...
Article
Full-text available
We first reproduce on the Janus and Janus II computers a milestone experiment that measures the spin-glass coherence length through the lowering of free-energy barriers induced by the Zeeman effect. Secondly we determine the scaling behavior that allows a quantitative analysis of a new experiment reported in the companion Letter [S. Guchhait and R....
Article
Full-text available
High-performance computing systems are more and more often based on accelerators. Computing applications targeting those systems often follow a host-driven approach in which hosts offload almost all compute-intensive sections of the code onto accelerators; this approach only marginally exploits the computational resources available on the host CPUs...
Article
Full-text available
Significance: The unifying feature of glass formers (such as polymers, supercooled liquids, colloids, granulars, spin glasses, superconductors, etc.) is a sluggish dynamics at low temperatures. Indeed, their dynamics are so slow that thermal equilibrium is never reached in macroscopic samples: in analogy with living beings, glasses are said to age....
Article
The present panorama of HPC architectures is extremely heterogeneous, ranging from traditional multi-core CPU processors, supporting a wide class of applications but delivering moderate computing performance, to many-core GPUs, exploiting aggressive data-parallelism and delivering higher performances for streaming computing applications. In this sc...
Article
An increasingly large number of HPC systems rely on heterogeneous architectures combining traditional multi-core CPUs with power-efficient accelerators. Designing efficient applications for these systems has been troublesome in the past as accelerators could usually be programmed using specific programming languages threatening maintainability, po...
Conference Paper
Current development trends of fast processors call for an increasing number of cores, each core featuring wide vector processing units. Applications must then exploit both directions of parallelism to run efficiently. In this work we focus on the efficient use of vector instructions. These process several data-elements in parallel, and memory data...
Article
Whereas overt visuospatial attention is customarily measured with eye tracking, covert attention is assessed by various methods. Here we exploited SSVEPs – the oscillatory responses of the visual cortex to incoming flickering stimuli – to record the movements of covert visuospatial attention in a way operatively similar to eye tracking (at...
Conference Paper
An increasingly large number of HPC systems rely on heterogeneous architectures combining traditional multi-core CPUs with power efficient accelerators. Designing efficient applications for these systems has been troublesome in the past as accelerators could usually be programmed only using specific programming languages – such as CUDA – threatenin...
Conference Paper
Energy efficiency is becoming more and more important in the HPC field; high-end processors are quickly evolving towards more advanced power-saving and power-monitoring technologies. On the other hand, low-power processors, designed for the mobile market, attract interest in the HPC area for their increasing computing capabilities, competitive pric...
Conference Paper
An increasingly large number of scientific applications run on large clusters based on GPU systems. In most cases the large scale parallelism of the applications uses MPI, widely recognized as the de-facto standard for building parallel applications, while several programming languages are used to express the parallelism available in the applicatio...
Conference Paper
Accelerators are quickly emerging as the leading technology to further boost computing performances; their main feature is a massively parallel on-chip architecture. NVIDIA and AMD GPUs and the Intel Xeon-Phi are examples of accelerators available today. Accelerators are power-efficient and deliver up to one order of magnitude more peak performance...
Conference Paper
Many scientific software applications that solve complex compute- or data-intensive problems, such as large parallel simulations of physics phenomena, increasingly use HPC systems in order to achieve scientifically relevant results. An increasing number of HPC systems adopt heterogeneous node architectures, combining traditional multi-core CPUs wit...
Article
Full-text available
$2^+$ and $1^-$ states in $^{90}$Zr were populated via the ($^{17}$O, $^{17}$O$'\gamma$) reaction at 340 MeV. The gamma decay was measured with high resolution using the AGATA (Advanced GAmma Tracking Array) demonstrator array. Differential cross sections were obtained at a few different angles for the scattered particle. The results of the elastic scattering and in...
Article
Full-text available
The architecture of high performance computing systems is becoming more and more heterogeneous, as accelerators play an increasingly important role alongside traditional CPUs. Programming heterogeneous systems efficiently is a complex task, that often requires the use of specific programming environments. Programming frameworks supporting codes por...
Experiment Findings
Full-text available
Brain-Computer Interfaces (BCIs) implement a direct communication pathway between the brain of a user and an external device, such as a computer or a machine in general. One of the most used brain responses to implement non-invasive BCIs is the so-called steady-state visually evoked potential (SSVEP). This periodic response is generated when a user ga...
Conference Paper
http://hdl.handle.net/10077/10529 We sought to provide direct evidence of the attention movements during dynamic mental imagery. Observers extrapolated in imagery the horizontal motion of a target with the gaze in central fixation. We recorded the steady-state visual-evoked potentials (SSVEP) generated by flickering the left and right sides of the...
Conference Paper
An increasing number of massively parallel machines adopt heterogeneous node architectures combining traditional multicore CPUs with energy-efficient and fast accelerators. Programming heterogeneous systems can be cumbersome and designing efficient codes often becomes a hard task. The lack of standard programming frameworks for accelerator based...
Conference Paper
High performance computing increasingly relies on heterogeneous systems, based on multi-core CPUs, tightly coupled to accelerators: GPUs or many core systems. Programming heterogeneous systems raises new issues: reaching high sustained performances means that one must exploit parallelism at several levels; at the same time the lack of a standard pr...