-
ABSTRACT: Adaptive multiprocessor systems are emerging as a promising solution for dealing with complex and unpredictable scenarios. Given the large variety of possible use cases that these platforms must support and the resulting workload variability, offline approaches are no longer sufficient because they cannot cope with time-varying workloads. This letter presents a novel approach based on PI and PID controllers, widely used in control automation, for optimizing resource utilization in Multiprocessor Systems-on-Chip (MPSoCs). Several architecture characteristics, such as response time during frequency changes, noise and perturbations, are modeled and validated in a high-level model, and results are compared to information obtained on a homogeneous MPSoC platform prototype. Power and energy consumption figures are discussed and two controllers are proposed: 1) PI-based and 2) PID-based. Results show the system's capability to adapt under disturbing conditions while meeting application performance constraints and reducing energy consumption.
IEEE embedded systems letters 10/2011;
-
ABSTRACT: In this paper, we point out the importance of considering performance sensitivity to process variations and operating voltage conditions during the design process. We demonstrate that such considerations significantly decrease the read timing margin introduced by the traditional corner method. However, this implies using a statistical design technique, which is introduced herein. The memory considered is a 256 kb SRAM designed in a 90 nm technology node.
Symposium on VLSI, 2008. ISVLSI '08. IEEE Computer Society Annual; 05/2008
-
ABSTRACT: In this article, we present an original MPI-based adaptive task migration support for the HS-Scale system. Our previous communication API was modified to make it MPI compliant. To enable task migration without any MMU, a Position Independent Code compilation technique is implemented. Self-adaptability is based on monitoring information collected at run-time by each processing element (PE). Each PE is endowed with the same decision-making capability, ensuring the scalability of the solution. An MJPEG case study validated on a multi-FPGA prototyping platform is presented. Observation of the dynamic behavior of HS-Scale shows that the system is able to find by itself a stable task placement that provides the best performance in terms of processing throughput.
Symposium on VLSI, 2008. ISVLSI '08. IEEE Computer Society Annual; 05/2008
-
ABSTRACT: Aggressive scaling of transistors is often accompanied by an increase in the variability of their intrinsic parameters. In this paper, we point out the importance of considering performance sensitivity to process variations during SRAM design. We propose a novel dummy bitline driver, an essential component in a self-timed memory, which is less sensitive to process variations. A statistical sizing method for this dummy bitline driver is introduced so as to improve the read timing margin while ensuring a high timing yield. The memory considered is a 256 kb SRAM designed in a 90 nm technology node.
Electronic Design, Test and Applications, 2008. DELTA 2008. 4th IEEE International Symposium on; 02/2008
-
ABSTRACT: Process variation constitutes a serious hindrance to the performance of SRAMs, since memories require bigger design margins to operate properly. In this paper, we propose a new dummy bit line driver structure and its statistical sizing method to reduce the sensitivity of the memory to process variations while improving the read timing margin. The dummy bit line driver is an essential component in a self-timed memory during a read operation. It triggers the sense amplifier at the appropriate time when the bit line is discharged. We considered a 256 kb SRAM in a 90 nm technology node.
Memory Technology, Design and Testing, 2007. MTDT 2007. IEEE International Workshop on; 01/2008
-
ABSTRACT: The migration of transistors into the very deep submicron region has allowed the integration of billions of transistors on a single chip. However, this relentless march has caused the rapid emergence of variability problems, which have adverse effects on circuit performance. This paper highlights the importance of taking process variability into account in the design of an eSRAM in order to reduce the excessive design margin introduced by the corner analysis method. We show that the sensitivity dispersions of the memory to process variations can be mitigated through the use of an appropriate dummy bit line driver (DBD). This component is in fact an essential element in a self-timed memory. We made use of the DBD in a 256 kb SRAM in a 90 nm technology process.
Integrated Reliability Workshop Final Report, 2007. IRW 2007. IEEE International; 11/2007
-
ABSTRACT: Dual rail logic is considered a relevant hardware countermeasure against Differential Power Analysis (DPA) because it makes power consumption data-independent. In this paper, we deduce from a thorough analysis of the robustness of dual rail logic against DPA the design range in which it can be considered effectively robust. Surprisingly, this secure design range is quite narrow. We therefore propose the use of an improved logic, called Secure Triple Track Logic, as an alternative to more conventional dual rail logics. To validate the claimed benefits of the logic introduced herein, we have implemented a sensitive block of the Data Encryption Standard algorithm (DES) and carried out DPA attacks by simulation.
Very Large Scale Integration, 2007. VLSI - SoC 2007. IFIP International Conference on; 11/2007
-
ABSTRACT: Multi-Processor Systems-on-Chip are becoming increasingly popular in embedded systems for the high degree of performance and flexibility they permit. While most MPSoCs today are highly heterogeneous to better fit the target applications, homogeneous systems may become a viable alternative in the near future, bringing other benefits such as run-time load balancing, high performance and low power consumption. The work presented in this paper relies on a homogeneous NoC-based MPSoC framework we developed, which allows us to conduct cycle-accurate evaluations of two different techniques: proactive and reactive communications.
Field-Programmable Custom Computing Machines, Annual IEEE Symposium on. 08/2007;
-
ABSTRACT: Scalability of architecture, programming model and task control management will be a major challenge for MP-SOC designs in the coming years. The contribution presented in this paper is HS-Scale, a hardware/software framework to study, define and experiment with scalable solutions for next-generation MP-SOC. The hardware architecture, H-Scale, is a homogeneous MP-SOC based on RISC processors, distributed memories and a globally asynchronous/locally synchronous network on chip. S-Scale is the software support used to program H-Scale. It is a multithreaded sequential programming model with dedicated communication primitives handled at run-time by a simple operating system we developed. The hardware validations on FPGA and in CMOS 90 nm technology, together with experimental case studies on several applications (FIR, DES and MJPEG), demonstrate the scalability of our approach and open interesting perspectives for automating task placement and duplication.
Embedded Computer Systems: Architectures, Modeling and Simulation, 2007. IC-SAMOS 2007. International Conference on; 08/2007
-
ABSTRACT: Multiprocessor systems-on-chip are becoming increasingly popular in embedded systems for the high degree of performance and flexibility they permit. While most MPSoCs today are highly heterogeneous to better fit the target applications, homogeneous systems may become a viable alternative in the near future, bringing other benefits such as run-time load balancing, high performance and low power consumption. The work presented in this paper relies on a homogeneous NoC-based MPSoC framework we developed, which allows us to conduct cycle-accurate evaluations of two different techniques: proactive and reactive communications.
Field-Programmable Custom Computing Machines, 2007. FCCM 2007. 15th Annual IEEE Symposium on; 05/2007
-
ABSTRACT: This work addresses the problem of hardware attacks against cryptographic circuits. The most dangerous side-channel attack, differential power analysis (DPA), is discussed, as well as state-of-the-art countermeasures. A new reconfigurable system on chip resistant to DPA attacks is then proposed. Results show that our architecture is not only effective against DPA attacks but also outperforms the classical implementation of modular exponentiation for key sizes exceeding 2048 bits, with a reasonable extra area overhead.
System-on-Chip, 2006. International Symposium on; 12/2006
-
ABSTRACT: Hardware implementations of cryptographic algorithms may leak some information that can be used to recover cryptographic keys. This work combines reconfigurable techniques with the recently proposed leak resistant arithmetic (LRA) to thwart some side channel attacks (SCA). The introduced architecture outperforms the classical implementation of modular multiplication for key sizes exceeding 2048 bits, with a reasonable extra area overhead. Nevertheless, this overhead is not a drawback but a cost, since the main point of the proposed architecture is its improved robustness in terms of security.
Field Programmable Logic and Applications, 2006. FPL '06. International Conference on; 09/2006
-
ABSTRACT: Dynamic reconfiguration provides interesting features offering hardware flexibility and adaptability. Unfortunately, the lack of programming tools to manage it has limited its use in current SoCs. This paper presents a method to abstract dynamic reconfiguration management at design-time. Dynamic hardware multiplexing is a generic principle based on a scheduler dedicated to managing reconfigurable resources at run-time. Formal background, implementation, simulation results and validations are presented to illustrate the contribution of this study.
Emerging VLSI Technologies and Architectures, 2006. IEEE Computer Society Annual Symposium on; 04/2006
-
ABSTRACT: The electronics industry is moving toward the design and realization of systems-on-silicon. This kind of integrated circuit is generally built around a processor core. Logic functions are added to this core in order to realize an Application Specific Integrated Processor (ASIP).
In the signal processing domain, simulating such systems requires very long digital patterns, so timing validation by means of a simulator can become prohibitive. To cope with this problem, a prototyping system has been developed. It allows concurrent validation of the hardware and software under development. It is based on a well-known Digital Signal Processor (TMS320C40) and on reconfigurable digital circuits (XILINX XC4013).
01/2006: pages 410-414;
-
ABSTRACT: The physical implementation of cryptographic algorithms may leak security information to an attacker through side-channel data such as power consumption, timing, temperature or electromagnetic emanation. Differential power analysis (DPA) is a powerful side-channel attack based only on power consumption information. Some countermeasures have been proposed at the algorithmic or architectural level, but they are expensive and/or complex. This paper addresses the DPA attack problem with a novel and efficient transistor-level method based on power consumption control, without any modification to the cryptographic algorithms, messages or keys.
Integrated Circuits and Systems Design, 18th Symposium on; 10/2005
-
ABSTRACT: The authors addressed the multi-tasking issue for reconfigurable coprocessors in random application contexts. A scheduling algorithm was proposed that handles a set of random tasks simultaneously and maximizes resource usage even when the task load is low. For this, processes are considered relocatable: a configuration controller applies a simple transformation scheme to the initial configuration in order to relocate or duplicate the task when necessary. In this paper, the proposed method is implemented on a coarse grain reconfigurable architecture with 8 and 32 processing elements. A large number of random scenarios have been simulated, and the statistical results presented here clearly show real advantages of the proposed method, but also some limitations that outline directions for future work.
Field Programmable Logic and Applications, 2005. International Conference on; 09/2005
-
ABSTRACT: This paper introduces a novel design methodology for compact, high-performance and secure dual rail primitives widely used in quasi-delay insensitive circuits. An example application of this design methodology to basic quasi-delay insensitive primitives is given on a 130 nm process. The performance and the security properties of the resulting cells are then compared, using electrical simulations, to the implementations proposed in earlier works.
Journal of Low Power Electronics 03/2005; 1(1):20-26.
-
ABSTRACT: In this article we present a model of a coarse grained reconfigurable architecture dedicated to accelerating data-flow oriented applications. The proliferation of new academic and industrial architectures implies a large variety of solutions for platform-based designers. Thus, efficient metrics to compare and qualify these architectures are increasingly necessary. Several metrics, Throughput Density [3][12], Remanence [4] and Operative Density, are then used to perform comparisons on different architectures. Architectures are often customisable and expose several parameters. Therefore, it is crucial to characterize the architectural model according to these parameters. This paper proposes the Systolic Ring as a case study and gives a set of metrics as functions of the architecture parameters. The methodology illustrated is generic and proved very efficient at highlighting architectural properties such as scalability.
09/2003: pages 722-732;
-
ABSTRACT: A platform and methodology for real-time system-on-chip prototyping is presented. The JPEG case study is presented as an example of prototyping. The proposed methodology overcomes limitations of the simulation approach and can often achieve real-time validation. However, the case study highlights drawbacks of the prototyping platform and points out some remaining work to make prototyping more effective.
Rapid System Prototyping, 1999. IEEE International Workshop on; 08/1999
-
ABSTRACT: Dual rail logic is considered a relevant hardware countermeasure against Differential Power Analysis (DPA) because it makes power consumption data-independent. In this paper, we deduce from a thorough analysis of the robustness of dual rail logic against DPA the design range in which it can be considered effectively robust. Surprisingly, this secure design range is quite narrow. We therefore propose the use of an improved logic, called Secure Triple Track Logic, as an alternative to more conventional dual rail logics. To validate the claimed benefits of the logic introduced herein, we have implemented a sensitive block of the Data Encryption Standard algorithm (DES) and carried out DPA attacks by simulation.
01/1970: pages 340-351;