Fig 1 - uploaded by Kim Grüttner
Proposed Rapid Prototyping Framework 

Source publication
Conference Paper
Abstract: Consideration of an embedded system’s timing behaviour and power consumption at system-level is an ambitious task. Sophisticated tools and techniques exist for power and timing estimations of individual components such as custom hard- and software as well as IP components. But prediction of the composed system behaviour can hardly be made...

Contexts in source publication

Context 1
... proposed concept for a rapid prototyping framework (based on [7]) is illustrated in Figure 1, which follows the Platform-Based Design approach with a separation of application model (a), platform model (c), and mapping description (b). The platform model is a graph consisting of processing element, interconnect, and memory nodes. ...
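As a rough illustration of the separation described in Context 1, the following sketch keeps an application model, a platform graph, and a mapping as three independent Python structures. All names here (`Node`, `Platform`, `validate_mapping`, and the task and node identifiers) are hypothetical and are not taken from the source publication; the rule that tasks may only be mapped to processing elements is likewise an assumption made for the example.

```python
from dataclasses import dataclass, field

# Node kinds named in the text: processing element, interconnect, memory.
KINDS = {"processing_element", "interconnect", "memory"}

@dataclass(frozen=True)
class Node:
    name: str
    kind: str

@dataclass
class Platform:
    nodes: dict = field(default_factory=dict)  # name -> Node
    edges: set = field(default_factory=set)    # undirected links as frozensets

    def add_node(self, name, kind):
        if kind not in KINDS:
            raise ValueError(f"unknown node kind: {kind}")
        self.nodes[name] = Node(name, kind)

    def connect(self, a, b):
        if a not in self.nodes or b not in self.nodes:
            raise KeyError("both endpoints must exist in the platform")
        self.edges.add(frozenset((a, b)))

def validate_mapping(tasks, platform, mapping):
    """Return the tasks that are unmapped or mapped to a non-processing node."""
    bad = []
    for task in tasks:
        target = platform.nodes.get(mapping.get(task))
        if target is None or target.kind != "processing_element":
            bad.append(task)
    return bad

# Platform model: two processing elements and a memory on a shared bus.
platform = Platform()
platform.add_node("cpu0", "processing_element")
platform.add_node("acc0", "processing_element")
platform.add_node("bus", "interconnect")
platform.add_node("ram", "memory")
for name in ("cpu0", "acc0", "ram"):
    platform.connect(name, "bus")

# Application model (task names only) and a deliberately faulty mapping.
tasks = ["decode", "filter", "output"]
mapping = {"decode": "cpu0", "filter": "acc0", "output": "ram"}
invalid = validate_mapping(tasks, platform, mapping)
print(invalid)
```

Because the mapping is a separate structure, the same application and platform can be re-evaluated under a different mapping without touching either model.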
Context 2
... on the user-defined mapping, each task of the parallel application model is characterized regarding its timing and power properties. During hardware/software task separation (Figure 1, d), the functional C code is extracted from each task for analysis of its timing and power consumption. The following three sections describe the estimation and extra-functional model generation for Software (e), Hardware (f), and IP components (g). ...
Context 3
... model is functionally equivalent to the initial input source, but in its internal structure it follows the generated RTL model, allowing an accurate estimation of the behaviour in terms of power and timing. The characterisation and model generation flow is shown in Figure 4. Using a high-level synthesis tool (PowerOpt), the initial source code is transformed into a control and data flow graph (CDFG) (Figure 4, step 1). The CDFG serves as input for the scheduling, allocation, and binding phases (Figure 4, step 2). ...
Context 4
... generation of the virtual system, annotated sources from (e) and (f) as well as the selected models from (g) are combined into a virtual prototype (Figure 1). The annotated sources for the HW and SW components have to be wrapped in appropriate TLM-2.0 models. ...

Similar publications

Article
Consideration of an embedded system’s timing behavior and power consumption at system-level is an ambitious task. Sophisticated tools and techniques exist for power and timing estimations of individual components such as custom hard- and software as well as IP components. In this article we present an ESL framework for timing and power aware virtua...

Citations

... Our approach is different as our emphasis is not on a low-level estimation including register-transfer level (RTL) techniques such as power-gating or instruction set building (often leading to a long simulation run-time), but rather on a fast high-level mechanistic simulation. Further work regarding power and time consumption has also been done by Grüttner et al. in [11]. Their focus, however, lies on rapid virtual system prototyping of SoCs using C/C++ generated virtual executable prototypes utilizing code annotation. ...
Preprint
In recent years, due to a higher demand for portable devices, which provide restricted amounts of processing capacity and battery power, the need for energy and time efficient hard- and software solutions has increased. Preliminary estimations of time and energy consumption can thus be valuable to improve implementations and design decisions. To this end, this paper presents a method to estimate the time and energy consumption of a given software solution, without having to rely on the use of a traditional Cycle Accurate Simulator (CAS). Instead, we propose to utilize a combination of high-level functional simulation with a mechanistic extension to include non-functional properties: Instruction counts from virtual execution are multiplied with corresponding specific energies and times. By evaluating two common image processing algorithms on an FPGA-based CPU, where a mean relative estimation error of 3% is achieved for cacheless systems, we show that this estimation tool can be a valuable aid in the development of embedded processor architectures. The tool allows the developer to reach well-suited design decisions regarding the optimal processor hardware configuration for a given algorithm at an early stage in the design process.
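The mechanistic step named in this abstract (instruction counts multiplied by specific energies and times) can be sketched as a weighted sum. The instruction classes and the per-instruction time and energy values below are invented placeholders, not figures from the preprint, and `estimate` is a hypothetical helper name.

```python
def estimate(counts, specific_time_ns, specific_energy_nj):
    """Sum count * specific cost over all instruction classes.

    Returns (total time in ns, total energy in nJ).
    """
    total_time = 0.0
    total_energy = 0.0
    for instr, n in counts.items():
        total_time += n * specific_time_ns[instr]
        total_energy += n * specific_energy_nj[instr]
    return total_time, total_energy

# Placeholder instruction counts, as a functional simulation might report them.
counts = {"add": 1000, "mul": 200, "load": 300}
# Placeholder per-instruction costs (assumed values for illustration only).
time_ns = {"add": 10.0, "mul": 30.0, "load": 20.0}
energy_nj = {"add": 1.0, "mul": 3.0, "load": 2.5}

t, e = estimate(counts, time_ns, energy_nj)
print(t, e)  # 22000.0 ns and 2350.0 nJ for the placeholder values above
```

A model this simple ignores cache effects, which fits the abstract's note that the reported 3% mean error applies to cacheless systems.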
... The second state-of-the-art approach uses simulation for the power analysis [6][7][8]. This approach allows the power analysis to be performed for all power states and in a reproducible way. ...
... This approach allows the power analysis to be performed for all power states and in a reproducible way. However, with simulation, the power analysis cannot be done in real time and depends heavily on the accuracy of the model, e.g., cycle-accurate [7], instruction-accurate [6], or component-based [8]. Creating the simulation model for the power analysis might be time-consuming. ...
Article
Achieving a good estimate of the power consumption of an embedded system, including its firmware, is a crucial step in the development of systems with a severely constrained power supply. This is especially true for cases where the device is powered by a small battery or through energy harvesting. The state-of-the-art approaches to measuring or estimating the power consumption are formal methods, power debugging tools used with the real hardware, and simulation-based estimation. In the work at hand, a novel method to estimate the power consumption is proposed; it utilizes the sensor-in-the-loop architecture and enhances it with a power estimation functionality. The proposed method combines the benefits of former methods, allowing for run-time analysis of the power consumption in a reproducible way using recorded data without the need for power debugging hardware. In the experiments it is shown that, once set up, the proposed method is able to estimate the power consumption with an error of less than 1% compared to power debugging hardware. Thus, the proposed method provides a reliable and fast way to estimate the system's power consumption.
... For these reasons, a methodology and modelling infrastructure are required that allow integration of timing and power information from RT-level estimations into a fast executable virtual platform at system-level. This article extends [11], which presents a methodology for estimating execution times and power consumption of hardware and software components in multiprocessor systems. To simulate the timing and power behaviour of hardware and software for a given target architecture, low-level timing and power properties are annotated to the source code of the functional model. ...
... This article extends [11] in the following way: ...
... Mapping of input models [11] ...
Article
Consideration of an embedded system’s timing behavior and power consumption at system-level is an ambitious task. Sophisticated tools and techniques exist for power and timing estimations of individual components such as custom hard- and software as well as IP components. In this article we present an ESL framework for timing and power aware virtual system prototyping of heterogeneous MPSoCs consisting of software, custom hardware and 3rd party IP components. In virtual platforms, previously only used for functional software verification, our proposed timed value streams enable a hierarchical and composable power model. Our proposed ESL framework supports the integration of a broad range of system-level timing and power models into virtual platforms. Power and timing models can either be generated from a functional C/C++ description or add state-machine-based power models to existing functional and timed virtual platform (black-box) components. Our timed value stream based power model supports the run-time analysis of different platform power management strategies with configurable temporal abstraction, supporting simulation speed and accuracy trade-offs. This work evaluates timing and power back-annotation and power state machine based approaches with timed value streams in two use-cases: an MP3 decoder, compared to a power-aware ISS and gate-level simulation, and an FPGA-based many-core architecture against measurements. Finally, the simulation time overhead of the proposed stream-based power model is analyzed and discussed.
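To make the idea of a state-machine-based power model with configurable temporal abstraction more concrete, the sketch below represents a power trace as a list of (duration, power) pairs and re-samples it into coarser windows while preserving total energy. This is only an illustration under assumed names and values (`STATE_POWER_W`, `to_stream`, `energy_j`, `coarsen`); it is not the timed value stream implementation of the article.

```python
# Assumed power per state in watts (placeholder values).
STATE_POWER_W = {"off": 0.0, "idle": 0.5, "active": 2.0}

def to_stream(state_trace):
    """Turn [(state, duration_s), ...] into [(duration_s, power_w), ...]."""
    return [(duration, STATE_POWER_W[state]) for state, duration in state_trace]

def energy_j(stream):
    """Integrate power over time: sum of duration * power."""
    return sum(duration * power for duration, power in stream)

def coarsen(stream, window):
    """Re-sample a stream into fixed windows holding the average power.

    The last window may be shorter. Total energy is preserved.
    """
    if window <= 0:
        raise ValueError("window must be positive")
    out = []
    acc_e = 0.0  # energy accumulated in the current window
    acc_t = 0.0  # time already filled in the current window
    for duration, power in stream:
        remaining = duration
        while remaining > 0:
            step = min(remaining, window - acc_t)
            acc_e += step * power
            acc_t += step
            remaining -= step
            if acc_t >= window:
                out.append((window, acc_e / window))
                acc_e, acc_t = 0.0, 0.0
    if acc_t > 0:
        out.append((acc_t, acc_e / acc_t))
    return out

trace = [("idle", 2.0), ("active", 1.0), ("idle", 1.0), ("off", 4.0)]
fine = to_stream(trace)
coarse = coarsen(fine, window=4.0)
print(energy_j(fine), coarse)
```

A larger window yields fewer samples to process at the cost of hiding short power peaks, which is one way to picture the speed and accuracy trade-off the abstract mentions.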
... An analogous approach can also be used when utilizing more modern transaction-level modeling (TLM), where the transactions are used instead of the instructions for power estimation (e.g., [12][13][14][15]). Even more abstract tasks or applications can be used to trace the activity and map to the power model for energy estimation; however, these are mainly used for energy profiling of some application running on an embedded processor. ...
Article
Power estimation is one of the key aspects that can help designers create digital circuits more effectively. If a designer is able to estimate circuit parameters during the early stages of development, correct decisions can be made that can significantly shorten the design time. The early design stages are represented by modeling at the system level of abstraction. However, existing system-level power/energy estimation methods are either too complicated, or they do not consider power management when estimating power consumption, meaning they are inaccurate. Therefore, in this paper we propose a method for a more accurate system-level estimation of the dynamic energy consumption by considering the impact of power management. The SystemC description of a power-managed system and the simulation results (in the form of the value change dump (VCD)) are inputs to the estimation method. The proposed method is based on an activity profile using the modified Hamming distance computation. The method is especially useful for the exploration of alternative power-management strategies, and it helps the designer to select the most efficient strategy.
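The abstract does not say how its Hamming distance computation is modified, so the sketch below uses the plain Hamming distance between successive signal values as a stand-in activity measure. The function names and the sample values are assumptions for illustration; real input would come from parsing a VCD file, which is omitted here.

```python
def hamming(a, b):
    """Number of bit positions in which two non-negative integers differ."""
    return bin(a ^ b).count("1")

def toggle_count(values):
    """Total bit toggles across a sequence of successive signal values."""
    return sum(hamming(prev, cur) for prev, cur in zip(values, values[1:]))

# Placeholder value changes of a 4-bit signal, as they might be read from a VCD.
samples = [0b0000, 0b1010, 0b1111, 0b1111]
toggles = toggle_count(samples)
print(toggles)  # 2 + 2 + 0 = 4 bit toggles
```

A toggle count of this kind is commonly scaled by a per-toggle energy to approximate dynamic energy; any accounting for power management, as proposed in the paper, would need to be layered on top.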
... We borrowed some ideas from TLM Power, but the latter does not allow temperature management. Another approach using the power-state model on SystemC programs is presented in [24], with advanced techniques for software integration like source-level simulation with backannotations. We extended the idea to support cosimulation with a thermal solver, including closed-loop cosimulation where the software has access to non-functional values through sensors. ...
... We extended the idea to support cosimulation with a thermal solver, including closed-loop cosimulation where the software has access to non-functional values through sensors. Also, [34,55,23,24] target precisely timed models while we allow an analysis on temporally decoupled or loosely timed models. ...
Article
Many techniques and tools exist to estimate the power consumption and the temperature map of a chip. These tools help the hardware designers develop power efficient chips in the presence of temperature constraints. For this task, the application can be ignored or at least abstracted by some high level scenarios; at this stage, the actual embedded software is generally not available yet. However, after the hardware is defined, the embedded software can still have a significant influence on the power consumption; i.e., two implementations of the same application can consume more or less power. Moreover, the actual software power manager ensuring the temperature constraints, usually by acting dynamically on the voltage and frequency, must itself be validated. Validating such power management policy requires a model of both actuators and sensors, hence a closed-loop simulation of the functional model with a non-functional one. In this paper, we present and compare several tools to simulate the power and thermal behavior of a chip together with its functionality. We explore several levels of abstraction and study the impact on the precision of the analysis.
... Our methodology currently assumes that the programmer is a parallel programmer who only needs to explore a few hardware/software codesigns; otherwise, a design space exploration strategy should be analyzed to reduce the number of possible solutions, such as using back annotations [11], [19]. ...
... Other works propose electronic system level timing and power estimation that combines system-level timing and power estimation techniques with platform-based rapid prototyping [11], [19]. However, the annotated task has to be specified in a particular language and/or has to be mapped to a specific component of the system. ...
... The automatic generation of the different granularities is beyond the scope of this paper's contribution. However, a beginning programmer may need to analyze a large number of granularities and mappings, and in this case, a system to automate the design space exploration would be helpful [11]. ...
Article
Heterogeneous computing is emerging as a mandatory requirement for power-efficient system design. With this aim, modern heterogeneous platforms such as the Zynq All-Programmable SoC, which integrates an ARM-based SMP and programmable logic, have been designed. However, those platforms introduce long design cycles consisting of hardware/software partitioning, decisions on granularity and number of hardware accelerators, hardware/software integration, bitstream generation, etc. This paper presents a heterogeneous parallel performance estimation for systems where hardware/software co-design and run-time heterogeneous task scheduling are key. The results show that the programmer can quickly decide, based only on her/his OmpSs (OpenMP + extensions) application, which co-design achieves nearly optimal heterogeneous parallel performance, based on the methodology presented and considering only synthesis estimation results. The methodology presented reduces the programmer's co-design decision time from hours to minutes and shows high potential for hardware/software heterogeneous parallel performance estimation on the Zynq All-Programmable SoC.
... Last, but not least, we intend to integrate energy cost into the characterization vector, to enable the integration into our ESL simulation framework for heterogeneous MPSoCs [10]. ...
Conference Paper
The early performance evaluation of complex platforms and software stacks requires fast and sufficiently accurate workload representations. In the literature, two different approaches have been proposed: host-based simulation with abstract performance annotations, enabling fast and functional simulations with limited architectural accuracy, and abstract workload models (or traffic generators) with more detailed platform resource usage patterns. In this work, we present an approach for automatic workload extraction from functional application code, combining the benefits of both approaches. First, the algorithmic behaviour of the embedded software is characterised statically both in terms of target processor usage and target memory access patterns, resulting in an abstracted, control-flow-aware workload model. Secondly, this model can be used on the target architecture itself as well as within a host-based simulation environment. We demonstrate the effectiveness of our approach by running our performance model on a virtual platform with and without a target Instruction Set Simulator (ISS) and comparing the simulation traces with the unaltered target processor binary execution.
... The entire flow starts with an executable functional specification and a structural platform specification model taken from [12], [13]. First, the functional specification model is transformed into a process network representation, then partitioned, mapped, and scheduled on the target platform model that represents the available processing elements, memories and communication elements of the targeted SoC. ...
Conference Paper
We present a new system-level design methodology enabling the consideration of process variations and degradation due to aging in early stages of the design process. By mapping an executable system specification to SoC processing, communication and memory components, in combination with component-wise timing and power characterization with a source-level back-annotation, we enable efficient full-SoC simulations of power and temperature over time. Based on the resulting temporal and spatial power and temperature distribution, we use a high-level multiphysics simulation to assess the impact of degradation and aging. We evaluate our approach using an ARM7-based SoC design.
Conference Paper
In this work, we present an exploration platform for microcoded RISC-V cores leveraging the One Instruction Set Computer (OISC) principle. Following the industry-proven virtual prototyping approach, we have realized our exploration platform by implementing an extensible and configurable Instruction Set Simulator (ISS). The developed ORISCV-ISS combines the advanced ecosystem of RISC-V with the ultimate minimalism of OISCs. ORISCV-ISS serves as a development platform for both hardware architecture and microcode procedures, and provides the basis for early design space exploration. Using ORISCV-ISS, we developed SUBLEQ microcode that is fully RISC-V compliant and ready to be run on real hardware. We evaluate how multiple hardware configurations and OISC extensions affect the performance, providing key information to balance between area savings and system performance.