D. Sciuto’s research while affiliated with Politecnico di Milano and other places


Publications (206)


Building High-Performance, Easy-to-Use Polymorphic Parallel Memories with HLS
  • Chapter

June 2019

·

40 Reads

IFIP Advances in Information and Communication Technology

With the increased interest in energy efficiency, a lot of application domains experiment with Field Programmable Gate Arrays (FPGAs), which promise customized hardware accelerators with high performance and low power consumption. These experiments are possible due to the development of High-Level Languages (HLLs) for FPGAs, which permit non-experts in hardware design languages (HDLs) to program reconfigurable hardware for general purpose computing. However, some of the expert knowledge remains difficult to integrate in HLLs, eventually leading to performance loss for HLL-based applications. One example of such a missing feature is the efficient exploitation of the local memories on FPGAs. A solution to address this challenge is PolyMem, an easy-to-use polymorphic parallel memory that uses BRAMs. In this work, we present HLS-PolyMem, the first complete implementation and in-depth evaluation of PolyMem optimized for the Xilinx Design Suite. Our evaluation demonstrates that HLS-PolyMem is a viable alternative to HLS memory partitioning, the current approach for memory parallelism in Vivado HLS. Specifically, we show that PolyMem offers the same performance as HLS partitioning for simple access patterns, and outperforms partitioning by as much as 13x when combining multiple access patterns for the same data structure. We further demonstrate the use of PolyMem for two different case studies, highlighting the superior capabilities of HLS-PolyMem in terms of performance, resource utilization, flexibility, and usability. Based on all the evidence provided in this work, we conclude that HLS-PolyMem enables the efficient use of BRAMs as parallel memories, without compromising the HLS level or the achievable performance.
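
The difference between partitioning for one access pattern and serving several patterns on the same data can be illustrated with a small bank-conflict model. The sketch below is only an illustration under stated assumptions: it does not reproduce PolyMem's actual mapping schemes, and the names `cyclic_bank`, `skewed_bank`, and `cycles_needed`, as well as the choice of four single-port banks, are hypothetical.

```python
from typing import Callable, List, Tuple

N_BANKS = 4  # assumption: four independent single-port memory banks

Coord = Tuple[int, int]


def cyclic_bank(row: int, col: int) -> int:
    """Cyclic partitioning along the column dimension only (hypothetical helper)."""
    return col % N_BANKS


def skewed_bank(row: int, col: int) -> int:
    """Skewed mapping: each row is rotated by one bank relative to the previous one."""
    return (row + col) % N_BANKS


def row_access(row: int, start_col: int) -> List[Coord]:
    """N_BANKS consecutive elements of one row."""
    return [(row, start_col + k) for k in range(N_BANKS)]


def col_access(start_row: int, col: int) -> List[Coord]:
    """N_BANKS consecutive elements of one column."""
    return [(start_row + k, col) for k in range(N_BANKS)]


def cycles_needed(bank_of: Callable[[int, int], int], coords: List[Coord]) -> int:
    """With single-port banks, the busiest bank determines the number of cycles."""
    counts = [0] * N_BANKS
    for r, c in coords:
        counts[bank_of(r, c)] += 1
    return max(counts)


if __name__ == "__main__":
    for name, fn in (("cyclic", cyclic_bank), ("skewed", skewed_bank)):
        print(name,
              "row:", cycles_needed(fn, row_access(0, 0)),
              "column:", cycles_needed(fn, col_access(0, 0)))
```

In this toy model, column-wise cyclic partitioning serves a row in one cycle but serializes a column access, while the skewed mapping serves both patterns without conflicts. It is meant only to show why combining access patterns on one data structure is harder than optimizing for a single pattern.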




Fig. 1. Screenshot of the GUI application, with DFG exploration.
Fig. 2. Overview of the Design Space Exploration.
Fig. 3. Architecture of the STM Spear Development board.  
Fig. 4. The figure shows the main buses involved in data transmission between main memory and computing cores.  
Fig. 10. Transfer speed of data copy between memory and hardware cores with driver.  

FPGA-based design using the FASTER toolchain: The case of STM spear development board
  • Conference Paper
  • Full-text available

August 2014

·

199 Reads

·

1 Citation

Even though FPGAs are becoming more and more popular as they are used in many different scenarios like communications and HPC, the steep learning curve needed to work with this technology is still the major limiting factor to their full success. Many works have proposed to mitigate this problem by creating a companion set of tools to support the designer during the development phase for this technology. The EU FASTER Project aims at realizing an integrated toolchain that assists the designer in the steps of the design flow that are necessary to port a given application onto an FPGA device. The novelty of the framework lies in the fact that partial dynamic reconfiguration, which FPGA devices can exploit, is seen as a first-class citizen throughout the whole design flow. This work reports a case study in which the FASTER toolchain has been used to port a raytracer application onto the STM Spear prototyping embedded platform. The paper discusses the steps taken to realize the prototype and the results obtained on the target device. It finally reports some changes that can be exploited to improve the performance of the hardware implementation that has been realized.

Fig. 1: The FASTER tool-chain 
Fig. 2: RTSM inputs, characteristics and operations 
Effective Reconfigurable Design: the FASTER Approach

Lecture Notes in Computer Science

While fine-grain, reconfigurable devices have been available for years, they are mostly used in a fixed functionality, “asic-replacement” manner. To exploit opportunities for flexible and adaptable run-time exploitation of fine grain reconfigurable resources (as implemented currently in dynamic, partial reconfiguration), better tool support is needed. The FASTER project aims to provide a methodology and a tool-chain that will enable designers to efficiently implement a reconfigurable system on a platform combining software and reconfigurable resources. Starting from a high-level application description and a target platform, our tools analyse the application, evaluate reconfiguration options, and implement the designer choices on underlying vendor tools. In addition, FASTER addresses micro-reconfiguration, verification, and the run-time management of system resources. We use industrial applications to demonstrate the effectiveness of the proposed framework and identify new opportunities for reconfigurable technologies.


Fig. 1. Internal representation of the power generator. 
Fig. 2. Power switch behavior. CPU socket specifies which socket has to provide the energy to the appliance. 
A SystemC-Based Framework for the Simulation of Appliances Networks in Energy-Aware Smart Spaces

March 2014

·

209 Reads

·

4 Citations

The efficient energy management of buildings is nowadays a crucial point to move toward a sustainable planet. Unfortunately, the design of smart buildings able to optimize their energy consumption is a quite complex task. Since this exploration cannot be performed on the field, simulation methodologies are usually adopted to study the behavior of buildings and their energy sustainability during the design phase. This paper proposes a simulation framework based on SystemC to easily evaluate different policies to control the energy consumption of a smart space. In particular, SystemC makes it possible to obtain a flexible representation of the system, allowing the designer to easily evaluate different configurations of appliances and policies, and it directly works with the commonly-used C programming language.


Figure 1: MPower Android application user interface: (a) is a screenshot of the application main screen, and (b) is an example notification, which suggests to the user that the Bluetooth system should be disabled because it is unused.
Table 1 : Configuration parameters: values and considered configurations.
Figure 2: MPower statistics about devices that currently have MPower installed and are transmitting the collected data. These statistics are taken from the official Google Play page.
Figure 4: Prediction of y, obtained from d as explained in Section 4.2. Note that, for easing the visual comparison between estimated and actual values, we rounded the prediction to an integer.
Adaptive and Flexible Smartphone Power Modeling

October 2013

·

494 Reads

·

22 Citations

Mobile Networks and Applications

Mobile devices have become the main means of interaction between users and the surrounding environment. An indirect measure of this trend is the increasing number of security threats against mobile devices, which in turn has created a demand for protection tools. Protection tools, unfortunately, place an additional burden on the smartphone's battery power, which is a precious resource. This observation motivates the need for smarter (security) applications, designed for and capable of running within adaptive energy goals. Although this problem affects other areas, in the security area this research direction is referred to as "green security". In general, a fundamental need of research toward energy-aware applications is to have appropriate power models that capture the full dynamics of devices and users. This is not an easy task because of the highly dynamic environment and usage habits. In practice, this goal requires easy mechanisms to measure the power consumption and approaches to create accurate models. The existing approaches that tackle this problem are either not accurate or not applicable in practice due to their limiting requirements. We propose MPower, a power-sensing and adaptive power modeling platform for Android mobile devices. The MPower approach creates an adequate and precise knowledge base of the power "behavior" of several different devices and users, which allows us to create better device-centric power models that consider the main hardware components and how they contribute to the overall power consumption. In this paper we consolidate our perspective work on MPower by providing the implementation details and an evaluation on 278 users and about 22.5 million power-related data points. Also, we explain how MPower is useful in those scenarios where low-power, unobtrusive, accurate power modeling is necessary (e.g., green security applications).


Towards a performance-as-a-service Cloud

October 2013

·

147 Reads

·

3 Citations

Motivation: While the pay-as-you-go model of Infrastructure-as-a-Service (IaaS) clouds is more flexible than an in-house IT infrastructure, it still has a resource-based interface towards users, who can rent virtual computing resources over relatively long time scales. There is a fundamental mismatch between this resource-based interface and what users really care about: performance.


Morphone.OS: Context-Awareness in Everyday Life

September 2013

·

318 Reads

·

7 Citations

Mobile devices, due to their wide distribution, their increasing smartness, and the availability of computational power, can become the interaction point between users and their surrounding environments. However, current mobile device OSes lack the ability to anticipate and overcome internal and external changes. Integrating mechanisms of self-awareness and self-adaptability into today's smartphones is an attractive way to meet these requirements. Moreover, adaptive behaviors can help the mobile device itself make the best use of the available resources, e.g., the battery life. This paper envisions various situations in which a self-aware mobile device can interact with the surrounding environment and support the user in performing everyday actions. A prototype of such an adaptive device, called morphone.os and based on the Android OS, has been designed and implemented to verify the reaction of the device in different situations, providing convincing and promising preliminary results.


D-RECS: A complete methodology to implement Self Dynamic Reconfigurable FPGA-based systems

July 2013

·

15 Reads

·

2 Citations

Dynamic self-reconfigurable embedded systems are attracting increasing interest from both the scientific and the industrial world. At the same time, however, the need for a comprehensive and easy-to-use tool that can guide designers through the whole implementation process is becoming stronger. Up to now, every proposed methodology for implementing dynamic self-reconfigurable systems has been architecture-centered. In most cases the system development process is time consuming and requires a very specific technical background. The aim of this work is to provide a fast brain-to-bit design flow that simplifies the dynamic reconfigurable system development process by shifting the designer's focus from the architecture point of view to the application point of view: designers will not need dynamic reconfigurability expertise, only skill in the application domain.


Citations (70)


... Data encoding techniques like bus invert (BI) [29] and INC-XOR [23] aim to reduce only the self-switching activity for random data patterns. Gray code [30], T0 [3], working-zone encoding [17], and T0-XOR [11] also aim to reduce the self-switching activity, but for correlated data patterns. Application-specific approaches are presented in [4] [5] [1] [27] [36]. ...

Reference:

Scopus-IJRTE-Data Encoding Techniques to Reduce - Mar-2019
Power optimization of system-level address buses based on software profiling
  • Citing Conference Paper
  • January 2000
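
For readers unfamiliar with the encodings named in the excerpt above, the following is a minimal sketch of two of them: a simplified bus-invert encoder and decoder, and the standard binary-to-Gray conversion. It is not the method of the cited paper. The 8-bit bus width, the all-zero initial bus state, and the function names are assumptions made for illustration, and this simplified bus-invert variant ignores toggles on the invert line itself.

```python
from typing import List, Tuple

WIDTH = 8                  # assumed bus width in bits
MASK = (1 << WIDTH) - 1


def popcount(x: int) -> int:
    """Number of set bits in x."""
    return bin(x).count("1")


def bus_invert_encode(words: List[int]) -> List[Tuple[int, int]]:
    """Return (value_on_bus, invert_flag) pairs for a sequence of data words."""
    prev = 0  # assumed initial state of the data lines
    out = []
    for w in words:
        w &= MASK
        # If more than half of the lines would toggle, send the complement.
        if popcount(w ^ prev) > WIDTH // 2:
            sent, inv = (~w) & MASK, 1
        else:
            sent, inv = w, 0
        out.append((sent, inv))
        prev = sent
    return out


def bus_invert_decode(encoded: List[Tuple[int, int]]) -> List[int]:
    """Undo the inversion wherever the invert flag is set."""
    return [(~s) & MASK if inv else s for s, inv in encoded]


def data_line_toggles(values: List[int]) -> int:
    """Total bit transitions on the data lines, starting from all zeros."""
    prev, total = 0, 0
    for v in values:
        total += popcount(v ^ prev)
        prev = v
    return total


def gray_encode(n: int) -> int:
    """Binary-to-Gray conversion: consecutive integers differ in exactly one bit."""
    return n ^ (n >> 1)


if __name__ == "__main__":
    words = [0x00, 0xFF, 0x00, 0xFE]
    enc = bus_invert_encode(words)
    print("raw toggles:    ", data_line_toggles(words))
    print("encoded toggles:", data_line_toggles([s for s, _ in enc]))
```

Bus invert bounds the number of data-line transitions per transfer at half the bus width, which helps for uncorrelated data, while Gray coding helps when values are mostly sequential, as with instruction addresses.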

... Our new HLS PolyMem is an alternative HLL solution, proven to be easily integrated with the Xilinx toolchains. The current work is an extension of our previous implementation presented in (7). Figure 1 depicts the architecture of a system using (HLS-)PolyMem. ...

HLS Support for Polymorphic Parallel Memories
  • Citing Conference Paper
  • October 2018

... In fact, this kind of algorithm is often trapped in a local optimum of the cost function, and never achieves a global optimum. Recent works have been published [14][15][16][17][18][19] in the partitioning area, which suggests that the problem is still open. ...

On-line fault detection in a hardware/software co-design environment: system partitioning
  • Citing Conference Paper
  • January 2001

... Future activities will also be focused on considering compiler optimization (constants propagation, dead code elimination, loop splitting, loop tiling, etc.), architectural aspects (pipelining, stalls, cache miss/hit, superscalarity, etc.), and logic synthesis optimization. To deal with these aspects, we will explore the results produced by methodologies developed for power modeling [43], [44], [45], [46]. These latter methodologies, extended to the metrics presented in this paper, make possible a more accurate estimation and partitioning at the cost of implementing a more sophisticated analysis. ...

Dynamic modeling of inter-instruction effects for execution time estimation
  • Citing Conference Paper
  • January 2001

... Granularity of an application can be used as a basic criterion of the partitioning and mapping process. The granularity-based partitioning and mapping process determines the utilization pattern of hardware resources [2]. The utilization pattern of CMCM can be realized through configuration of available hardware resources in the presence of application's granularity. ...

Online fault detection in a hardware/software co-design environment: system partitioning
  • Citing Article
  • January 2001

... In opposition to all the methods mentioned, the following approach moves the problem of reliability to a higher abstraction level (i.e. the function level). The authors of [10] developed a new data type that introduces the so-called self-checking (i.e. the error detection technique) into data-paths of HLS generated systems. The authors also consider the suitability of moving such problem to a higher level of abstraction in the context of the complexity of today's systems. ...

Reliable System Specification for Self-Checking Data-Paths

... The HERA (Hardware Evolution over Reconfigurable Architectures) project is one of the research projects aimed at creating a FPGA-based evolvable hardware system. Several recent papers [15]- [18] describe the goals and the progresses of the HERA project towards the design and the implementation of such a self-evolvable hardware system. ...

A Highly Parallel FPGA-based Evolvable Hardware Architecture
  • Citing Article
  • January 2010

... Verification. The architecture/allocation solution provided by the previous step is verified by means of the Timing Simulation (such a simulation methodology, derived from [11], [41], [42], also contributes to the definition of the interconnection network and the scheduling policy). If such a combination is not a valid solution to the partitioning problem (i.e., it does not meet the timing constraints), the whole process is repeated, starting from a finer granularity clustering. ...

HW/SW co-simulation for fast design-space exploration of multiprocessor embedded systems
  • Citing Article
  • July 2001

Canadian Journal of Electrical and Computer Engineering

... The proposed framework already solved a number of issues from which other similar toolchains suffer. For example, they could be used for only a dedicated application domain [2][3] [4][5] [6] or used for different applications without an optimized interconnect [7] [8]. One of the most important contributions of our framework is to optimize the interconnect of hardware cores because data communication is one of the two main sources of overhead in multicore systems [9]. ...

FASTER: Facilitating Analysis and Synthesis Technologies for Effective Reconfiguration
  • Citing Article
  • November 2014

Microprocessors and Microsystems

... To select the proper resource for loading and triggering hardware task reconfiguration and execution in partially reconfigurable systems with FPGAs, efficient and flexible runtime system support is needed [1],[2]. In this work we present the realization of the Partial Reconfiguration (PR) utility on the SPEAr prototyping development platform, we integrate the Run-Time System Manager (RTSM) presented in [3] and check the correctness of the resulting system with the use of a synthetic task graph, using hardware tasks from a RayTracer application [4] executed in parallel with a software edge detection application. Our main contributions are: (i) an architecture that achieves partial reconfiguration (PR) on the SPEAr's FPGA daughter-board, based on a basic DMA module provided by [4], (ii) the integration of a state of the art RTSM on the SPEAr side, and (iii) concurrent execution of HW and SW applications on the SPEAr platform. ...

FPGA-based design using the FASTER toolchain: The case of STM spear development board