
Marco D. Santambrogio - Politecnico di Milano
About
Publications: 387
Reads: 56,632
Citations: 4,219
Publications (387)
Quantum computing represents an exciting computing paradigm that promises to solve problems intractable for a classical computer. The main limiting factor for quantum devices is the noise impacting qubits, which hinders the promised superpolynomial speedup. Thus, although Quantum Error Correction (QEC) mechanisms are paramount, QEC demands high spee...
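To make the speed pressure concrete, here is a minimal sketch (not the decoder from this work) of syndrome-based correction for the 3-qubit bit-flip repetition code; real-time QEC must make this kind of decision faster than the qubits decohere.

```python
# Minimal sketch: lookup-table decoder for the 3-qubit bit-flip code.
# Map the two parity checks (q0 XOR q1, q1 XOR q2) to the most likely error.
SYNDROME_TABLE = {
    (0, 0): None,  # no error detected
    (1, 0): 0,     # flip on qubit 0
    (1, 1): 1,     # flip on qubit 1
    (0, 1): 2,     # flip on qubit 2
}

def decode(bits):
    """Return the corrected 3-bit codeword, assuming at most one bit flip."""
    syndrome = (bits[0] ^ bits[1], bits[1] ^ bits[2])
    flip = SYNDROME_TABLE[syndrome]
    if flip is not None:
        bits = list(bits)
        bits[flip] ^= 1
    return tuple(bits)

print(decode((0, 1, 0)))  # -> (0, 0, 0)
```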
Quantum computing is a new paradigm of computation that exploits principles from quantum mechanics to achieve an exponential speedup compared to classical logic. However, noise strongly limits current quantum hardware, reducing achievable performance and limiting the scaling of the applications. For this reason, current noisy intermediate-scale qua...
Event-Related Potential (ERP) studies are powerful and widespread tools in neuroscience. The standard pipeline involves identifying relevant components and computing discrete features that characterize them, such as latency and amplitude. Nonetheless, this approach only evaluates one aspect of the signal at a time, without considering...
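As an illustration of the discrete-feature approach the abstract refers to, the sketch below extracts peak amplitude and latency of a component inside a time window; the sampling rate, window, and synthetic waveform are placeholders, not data from this work.

```python
import numpy as np

# Illustrative only: the classic "discrete feature" approach to an ERP
# component -- find its peak amplitude and latency inside a time window.
fs = 250.0                                  # sampling rate (Hz), assumed
t = np.arange(-0.2, 0.8, 1.0 / fs)          # epoch from -200 ms to 800 ms
erp = np.exp(-((t - 0.3) ** 2) / 0.002)     # synthetic component around 300 ms

window = (t >= 0.25) & (t <= 0.5)           # search window for a P300-like peak
idx = np.argmax(erp[window])
amplitude = erp[window][idx]
latency = t[window][idx]
print(f"amplitude={amplitude:.2f}, latency={latency * 1000:.0f} ms")
```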
Anatomical complexity and data dimensionality present major issues when analysing brain connectivity data. The functional and anatomical aspects of the connections taking place in the brain are in fact equally relevant and strongly intertwined. However, due to theoretical challenges and computational issues, their relationship is often overlooked i...
Semantic segmentation and classification are pivotal in many clinical applications, such as radiation dose quantification and surgery planning. While manually labeling images is highly time-consuming, the advent of Deep Learning (DL) has introduced a valuable alternative. Nowadays, DL model inference is run on Graphics Processing Units (GPUs), whi...
Reconfigurable computing is an expanding field that, over the last decades, has evolved from a relatively closed community, where highly skilled developers deployed high-performance systems based on their knowledge of the underlying physical system, to an attractive solution for both industry and academia. With this chapter, we explore the differen...
Medical practice is shifting towards the automation and standardization of the most repetitive procedures to speed up the time-to-diagnosis. Semantic segmentation represents a critical stage in identifying a broad spectrum of regions of interest within medical images. Indeed, it identifies relevant objects by attributing to each image pixel a val...
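A minimal sketch of what pixel-wise labeling amounts to: each pixel receives the class with the highest score. The array shapes and random scores stand in for the output of any DL segmentation model and are not from this work.

```python
import numpy as np

# Sketch only: semantic segmentation assigns every pixel the class with the
# highest score. `logits` stands in for a model output of shape (classes, H, W).
num_classes, H, W = 4, 128, 128
logits = np.random.randn(num_classes, H, W)

label_map = np.argmax(logits, axis=0)   # (H, W) array of class indices per pixel
print(label_map.shape, label_map.min(), label_map.max())
```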
Mental calculations involve various areas of the brain. The frontal, parietal and temporal lobes of the left hemisphere play a principal role in completing this type of task. Their level of activation varies based on the mathematical competence and attentiveness of the subject under examination and the perceived difficulty of the task....
“Cloud-native” is the umbrella adjective describing the standard approach for developing applications that best exploit cloud infrastructures’ scalability and elasticity. As application complexity and user bases grow, designing for performance becomes a first-class engineering concern. In answer to these needs, heterogeneous computi...
Field Programmable Gate Arrays (FPGAs) are spatial architectures with a heterogeneous reconfigurable fabric. They are state of the art for prototyping, telecommunications, and embedded systems, and an emerging alternative for cloud-scale acceleration. However, FPGA adoption has been limited by their programmability and the expertise they require. Therefore, researcher...
Image registration is a well-defined computation paradigm widely applied to align one or more images to a target image. This paradigm, which builds upon three main components, is particularly compute-intensive and represents the bottleneck of many image processing pipelines. State-of-the-art solutions leverage hardware acceleration to speed up image reg...
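A toy sketch of the three components commonly meant here (assumed to be a transformation model, a similarity metric, and an optimizer): aligning a moving image to a reference by brute-force search over integer shifts. It is an illustration of the structure, not the accelerated method from this work.

```python
import numpy as np

def ssd(a, b):
    """Similarity metric: sum of squared differences (lower is better)."""
    return float(np.sum((a - b) ** 2))

def transform(img, shift):
    """Transformation model: integer translation along both axes."""
    return np.roll(img, shift, axis=(0, 1))

def register(moving, reference, max_shift=5):
    """Optimizer: exhaustive search over the (small) parameter space."""
    best = None
    for dy in range(-max_shift, max_shift + 1):
        for dx in range(-max_shift, max_shift + 1):
            cost = ssd(transform(moving, (dy, dx)), reference)
            if best is None or cost < best[0]:
                best = (cost, (dy, dx))
    return best[1]

reference = np.zeros((32, 32)); reference[10:20, 10:20] = 1.0
moving = np.roll(reference, (3, -2), axis=(0, 1))
print(register(moving, reference))   # -> (-3, 2), undoing the displacement
```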
Left ventricular remodeling is a mechanism common to various cardiovascular diseases affecting myocardial morphology. It is often overlooked in clinical practice, since the parameters routinely employed in the diagnostic process (e.g., the ejection fraction) mainly focus on evaluating volumetric aspects. Nevertheless, the integration of a quanti...
Regular Expression (RE) matching is a computational kernel used in several applications. Since RE complexity and data volumes are steadily increasing, hardware acceleration is also gaining attention for this problem. Existing approaches have limited flexibility, as they require a different implementation for each RE. On the other hand, it is complex...
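The flexibility trade-off can be sketched as follows: a table-driven matcher treats the RE as data (a transition table), so changing the pattern does not change the engine, whereas a hardwired accelerator bakes one specific RE into its logic. The DFA below, for the hypothetical pattern "ab+", is purely illustrative.

```python
# Transition table for the pattern "ab+" over the alphabet {a, b}.
TRANSITIONS = {
    (0, 'a'): 1,
    (1, 'b'): 2,
    (2, 'b'): 2,
}
ACCEPTING = {2}

def matches(text):
    state = 0
    for ch in text:
        state = TRANSITIONS.get((state, ch))
        if state is None:
            return False
    return state in ACCEPTING

print(matches("abbb"), matches("ba"))   # -> True False
```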
Stencil-based algorithms are a relevant class of computational kernels in high-performance systems, as they appear in a plethora of fields, from image processing to seismic simulations, from numerical methods to physical modeling. Among the various incarnations of stencil-based computations, Iterative Stencil Loops (ISLs) and Convolutional Neural N...
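As a minimal example of an Iterative Stencil Loop, the sketch below repeatedly applies a 5-point averaging stencil to a 2D grid; each output cell depends only on a fixed neighborhood from the previous iteration, the regular access pattern that both ISLs and convolutions expose to accelerators. Grid size, boundary values, and iteration count are arbitrary.

```python
import numpy as np

def jacobi_step(grid):
    """One 5-point (von Neumann) averaging sweep over the interior cells."""
    new = grid.copy()
    new[1:-1, 1:-1] = 0.25 * (grid[:-2, 1:-1] + grid[2:, 1:-1] +
                              grid[1:-1, :-2] + grid[1:-1, 2:])
    return new

grid = np.zeros((64, 64))
grid[0, :] = 100.0                 # fixed boundary condition on one edge
for _ in range(200):               # iterate the same stencil many times
    grid = jacobi_step(grid)
print(round(float(grid[32, 32]), 3))
```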
The HPCG benchmark represents a modern complement to the HPL benchmark in the performance evaluation of HPC systems, as it has been recognized as a more representative benchmark to reflect real-world applications. While typical workloads become more and more challenging, the semiconductor industry is battling with performance scaling and power effi...
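HPCG is built around a preconditioned conjugate-gradient solve over a sparse system; the bare (unpreconditioned) CG loop below is not the benchmark itself, but it shows the kernels such a workload stresses: matrix-vector products and vector updates, both memory-bandwidth bound rather than compute bound.

```python
import numpy as np

def cg(A, b, tol=1e-8, max_iter=1000):
    """Plain conjugate gradient for a symmetric positive-definite system."""
    x = np.zeros_like(b)
    r = b - A @ x
    p = r.copy()
    rs = r @ r
    for _ in range(max_iter):
        Ap = A @ p                      # the SpMV-like kernel
        alpha = rs / (p @ Ap)
        x += alpha * p
        r -= alpha * Ap
        rs_new = r @ r
        if np.sqrt(rs_new) < tol:
            break
        p = r + (rs_new / rs) * p       # vector update kernels
        rs = rs_new
    return x

n = 100                                 # 1D Laplacian as a tiny test system
A = (np.diag(np.full(n, 2.0)) + np.diag(np.full(n - 1, -1.0), 1)
     + np.diag(np.full(n - 1, -1.0), -1))
b = np.ones(n)
x = cg(A, b)
print(round(float(np.linalg.norm(A @ x - b)), 10))
```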
GPUs are readily available in cloud computing and personal devices, but their use for data processing acceleration has been slowed down by their limited integration with common programming languages such as Python or Java. Moreover, using GPUs to their full capabilities requires expert knowledge of asynchronous programming. In this work, we present...
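To illustrate the kind of asynchronous style GPU acceleration from Python usually demands, here is a sketch using CuPy streams; CuPy is not named in the abstract and stands in for the general idea only, and the snippet requires a CUDA GPU with CuPy installed.

```python
import numpy as np
import cupy as cp   # illustrative stand-in, not the framework from this work

x_host = np.random.rand(1 << 20).astype(np.float32)

stream = cp.cuda.Stream(non_blocking=True)
with stream:
    x_dev = cp.asarray(x_host)          # host-to-device copy on this stream
    y_dev = cp.sqrt(x_dev) * 2.0        # kernels enqueued, not yet finished
stream.synchronize()                    # explicit sync before reading back
print(float(y_dev[:10].sum()))
```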
Microservices changed cloud computing by moving the applications’ complexity from one monolithic executable to thousands of network interactions between small components. Given the increasing deployment sizes, the architectural exploitation challenges, and the impact on data-centers’ power consumption, we need to efficiently track this complexity....
Sparse matrix-vector multiplication is often employed in many data-analytic workloads in which low latency and high throughput are more valuable than exact numerical convergence. FPGAs provide quick execution times while offering precise control over the accuracy of the results thanks to reduced-precision fixed-point arithmetic. In this work, we pr...
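The sketch below is a software model, not the FPGA design from this work: sparse matrix-vector multiplication in CSR format, with values quantized to a coarse fixed-point grid to mimic the accuracy knob that reduced-precision arithmetic provides.

```python
import numpy as np

def quantize(x, frac_bits=8):
    """Round values to a fixed-point grid with `frac_bits` fractional bits."""
    scale = 1 << frac_bits
    return np.round(x * scale) / scale

def spmv_csr(values, col_idx, row_ptr, x):
    """y = A @ x with A stored in Compressed Sparse Row format."""
    y = np.zeros(len(row_ptr) - 1)
    for row in range(len(y)):
        for k in range(row_ptr[row], row_ptr[row + 1]):
            y[row] += values[k] * x[col_idx[k]]
    return y

# 3x3 matrix [[2, 0, 1], [0, 3, 0], [4, 0, 5]] in CSR form
values  = quantize(np.array([2.0, 1.0, 3.0, 4.0, 5.0]))
col_idx = np.array([0, 2, 1, 0, 2])
row_ptr = np.array([0, 2, 3, 5])
x = quantize(np.array([1.0, 2.0, 3.0]))
print(spmv_csr(values, col_idx, row_ptr, x))   # -> [ 5.  6. 19.]
```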
The increase in computational power of embedded devices and the latency demands of novel applications brought a paradigm shift in how and where computation is performed. Although AI inference is slowly moving from the cloud to end devices with limited resources, time-centric recurrent networks like Long Short-Term Memory remain too complex to b...
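A single LSTM cell step, sketched below with random placeholder weights, shows where the complexity comes from: every time step needs four gate computations, each backed by matrix multiplies, which is heavy for resource-constrained devices.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x, h, c, W, U, b):
    """One step; W, U, b stack the input, forget, output and candidate gates."""
    gates = W @ x + U @ h + b                     # 4 * hidden values
    hidden = h.shape[0]
    i, f, o = (sigmoid(gates[k * hidden:(k + 1) * hidden]) for k in range(3))
    g = np.tanh(gates[3 * hidden:])
    c_new = f * c + i * g
    h_new = o * np.tanh(c_new)
    return h_new, c_new

hidden, n_in = 16, 8
rng = np.random.default_rng(0)                    # placeholder, untrained weights
W = rng.normal(size=(4 * hidden, n_in))
U = rng.normal(size=(4 * hidden, hidden))
b = np.zeros(4 * hidden)
h = c = np.zeros(hidden)
h, c = lstm_step(rng.normal(size=n_in), h, c, W, U, b)
print(h.shape, c.shape)
```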
Some of the most recent applications and services revolve around the analysis of time series, which generally exhibit chaotic characteristics. This behavior has brought back the need to simplify their representation in order to discover meaningful patterns and extract information efficiently. Furthermore, recent trends show how computation is moving back...
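One common way to simplify a time-series representation, named here only as an illustration since the abstract is truncated, is Piecewise Aggregate Approximation: each fixed-length segment of the series is replaced by its mean.

```python
import numpy as np

def paa(series, n_segments):
    """Piecewise Aggregate Approximation: one mean value per segment."""
    segments = np.array_split(np.asarray(series, dtype=float), n_segments)
    return np.array([seg.mean() for seg in segments])

t = np.linspace(0, 4 * np.pi, 400)
noisy = np.sin(t) + 0.3 * np.random.randn(t.size)
print(np.round(paa(noisy, 8), 2))   # 8 coarse values summarizing 400 samples
```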
In the last few years, Internet of Things (IoT) applications have been moving from the cloud-sensor paradigm to a more varied structure in which IoT nodes interact with an intermediate fog computing layer. To enable compute-intensive tasks to be executed near the source of the data, fog computing nodes should provide enough performance and be sufficientl...
In a quest to make FPGA technology more accessible to the software community, Xilinx recently released PYNQ, a framework for Zynq that relies on Python and overlays to ease the integration of programmable-logic functionalities into applications. In this work, we build upon this framework to enable transparent hardware acceleration for scie...
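For context, this is roughly how PYNQ exposes the programmable logic to Python through an overlay; it runs only on a PYNQ-enabled Zynq board, and the bitstream name, DMA instance name, and buffer sizes below are placeholders rather than details from this work.

```python
from pynq import Overlay, allocate

overlay = Overlay("accelerator.bit")        # load the programmable-logic design
dma = overlay.axi_dma_0                     # IP cores appear as attributes

in_buf = allocate(shape=(1024,), dtype="u4")
out_buf = allocate(shape=(1024,), dtype="u4")
in_buf[:] = range(1024)

dma.sendchannel.transfer(in_buf)            # stream data into the accelerator
dma.recvchannel.transfer(out_buf)           # and collect the results
dma.sendchannel.wait()
dma.recvchannel.wait()
print(out_buf[:8])
```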
Virtualization is the main building block of many architectures and systems, from embedded computing to large-scale data centers. Efficiently managing computing resources and their power consumption is fundamental to optimizing the performance of the workloads running on those systems; however, hardware tools like Intel RAPL can only introduce po...
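On Linux, the RAPL package energy counters are exposed through the powercap sysfs interface; the sketch below reads the counter before and after a workload to estimate average power. The exact path and required permissions vary across machines, and counter wraparound is ignored in this minimal version.

```python
import time

# Linux-only sketch: read the Intel RAPL package-0 energy counter.
RAPL_PATH = "/sys/class/powercap/intel-rapl:0/energy_uj"

def read_energy_uj():
    with open(RAPL_PATH) as f:
        return int(f.read())

before = read_energy_uj()
time.sleep(1.0)                              # stand-in for the measured workload
after = read_energy_uj()
print(f"average power ~ {(after - before) / 1e6:.2f} W over 1 s")
```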
Pairwise sequence alignment is one of the most computationally intensive kernels in genomic data analysis, accounting for more than 90% of the runtime for key bioinformatics applications. This method is particularly expensive for third-generation sequences due to the high computational cost of analyzing sequences of length between 1Kb and 1Mb. Give...
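The quadratic dynamic-programming core is what makes long-read alignment so costly: the table has len(a) * len(b) cells, prohibitive for reads of 1 Kb to 1 Mb. Below is a minimal global-alignment (Needleman-Wunsch) score computation; the scoring values are arbitrary placeholders, not the parameters used in this work.

```python
def nw_score(a, b, match=2, mismatch=-1, gap=-2):
    """Global alignment score with linear scoring, keeping only two DP rows."""
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        curr = [i * gap]
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            curr.append(max(diag, prev[j] + gap, curr[j - 1] + gap))
        prev = curr
    return prev[-1]

print(nw_score("GATTACA", "GCATGCU"))
```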