-
IEEE Trans. on CAD of Integrated Circuits and Systems. 01/2012; 31:121-131.
-
ABSTRACT: Actor model-based design is actively researched for parallel embedded SW design since the model exposes the potential parallelism explicitly in an architecture-neutral form. In most actor-oriented models, actors are self-contained, data channels are the only sharable objects between actors, and the actors compose a system in a flat layer. In contrast, it is common in practice to use shared library functions and to construct vertically layered software for efficiency and modularity. To fill this gap between modeling and implementation, we propose a special actor, the library task, with new types of ports: the library master port and the library slave port. A library task is a sharable and mappable object that defines a set of function interfaces inside. An N:1 master-slave connection allows a library task to be shared, and the master-slave connection can naturally specify vertically layered software and client-server applications. To support the library task in our embedded software design environment, we develop an automatic mapping algorithm as well as an automatic code generator. The design environment with the library task is applied to two target platforms: the IBM Cell Broadband Engine and an ARM-based multicore simulator. Preliminary experiments show that the library task extends the expressive power of the previous actor model while the generated code remains efficient.
IEEE Transactions on Industrial Informatics 06/2011; · 2.99 Impact Factor
-
ABSTRACT: Multiprocessor systems-on-chip (MPSoCs) are evolving toward processor pool-based architecture that employs a hierarchical on-chip network for inter-processor and intra-processor pool communication. This letter presents a systematic exploration method of the cascaded bus matrix-based on-chip network design for processor pool-based MPSoCs. It uses an evolutionary algorithm to find optimal architectures in terms of on-chip area while satisfying a given performance constraint. Since simulation is too time-consuming to evaluate the performance of complex on-chip networks during architecture exploration, we propose to prune the design space efficiently using two novel static analysis techniques: 1) bandwidth analysis considering task execution dependences, and 2) memory contention analysis for accurate performance estimation. Thanks to fast and accurate evaluation by the proposed analysis techniques, we achieved an order of magnitude speed improvement for the architecture exploration without performance loss, compared with a simulation-based approach.
IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems 04/2011; · 1.27 Impact Factor
-
2011 International Conference on Embedded Computer Systems: Architectures, Modeling, and Simulation, SAMOS XI, Samos, Greece, July 18-21, 2011; 01/2011
-
IEEE Trans. Industrial Informatics. 01/2011; 7:340-353.
-
IEEE Trans. on CAD of Integrated Circuits and Systems. 01/2011; 30:468-472.
-
2011 Conference on Design and Architectures for Signal and Image Processing, DASIP 2011, Tampere, Finland, November 2-4, 2011; 01/2011
-
Proceedings of the 48th Design Automation Conference, DAC 2011, San Diego, California, USA, June 5-10, 2011; 01/2011
-
2011 9th IEEE Symposium on Embedded Systems for Real-Time Multimedia (ESTIMedia), Taipei, Taiwan, October 13-14, 2011; 01/2011
-
Proceedings of the 16th Asia South Pacific Design Automation Conference, ASP-DAC 2011, Yokohama, Japan, January 25-27, 2011; 01/2011
-
ABSTRACT: This paper determines a static schedule and the minimal size of arc buffers for a given synchronous dataflow (SDF) graph, satisfying a throughput constraint. Unlike previous work, we assume that the target architecture and the mapping information are given. In addition, we consider unfolding the SDF graph to improve the throughput. To solve this problem, we adopt answer set programming (ASP) combined with a constraint programming (CP) technique, which always finds optimal solutions. The proposed ASP+CP formulation is compact enough to list the scheduling rules in 27 lines and could be applied successfully to coarse-grain SDF graphs of small but practical size.
SoC Design Conference (ISOCC), 2010 International; 12/2010
-
ABSTRACT: For real-time applications, it is necessary to estimate the worst-case performance early in the design process without actual hardware implementation. While non-preemptive task scheduling is pertinent to multi-core platforms because of its easy implementation and high performance, its scheduling anomaly behavior makes worst-case performance estimation extremely difficult. In this paper, we propose an analysis technique based on mixed integer linear programming (MILP) to estimate the worst-case performance of each task in a non-preemptive multitask application on a multi-processor system-on-chip architecture. MILP provides a systematic way to describe the complex interaction among task scheduling, communication architecture, and task execution, which affects the worst-case behavior dynamically. The proposed analysis technique overcomes several limitations of previous work: it allows multiple tasks with different periods and models contention on the communication architecture. We show that the proposed analysis takes affordable computation time, making it of practical value even though it has exponential complexity in theory. The proposed technique estimates a safe bound on task latency statistically, which is demonstrated by extensive random simulations.
IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems 11/2010; · 1.27 Impact Factor
-
ACM Trans. Design Autom. Electr. Syst. 01/2010; 15.
-
IEEE Trans. on CAD of Integrated Circuits and Systems. 01/2010; 29:1600-1613.
-
8th IEEE Workshop on Embedded Systems for Real-Time Multimedia, ESTIMedia 2010, 28-29 October 2010, Scottsdale, AZ, USA; 01/2010
-
Signal Processing Systems. 01/2010; 58:193-213.
-
Proceedings of the 8th International Conference on Hardware/Software Codesign and System Synthesis, CODES+ISSS 2010, part of ESWeek '10 Sixth Embedded Systems Week, Scottsdale, AZ, USA, October 24-28, 2010; 01/2010
-
ABSTRACT: Embedded software design for a multicore platform involves parallel programming for heterogeneous multiprocessors with diverse communication architectures under design constraints such as hardware cost, power, and timeliness. Since the classical von Neumann programming model assumes sequential execution of programs, it is not adequate for MPSoC SW development. Thus, new programming models and corresponding SW development tools are necessary that can exploit the available parallelism and ensure that the design constraints are satisfied.
IEEE Signal Processing Magazine 12/2009; · 4.07 Impact Factor
-
ABSTRACT: In system-on-chip (SoC) design, it is essential to verify the correctness of the design before a chip is fabricated. While conventional hardware emulators validate the functional correctness of hardware components quickly, only a few studies use hardware emulators for timing verification, since synchronization between the hardware emulator and the other parts easily overwhelms the gain from the hardware emulator. In this paper, we propose a novel hardware/software coemulation framework for fast yet accurate system verification based on the virtual synchronization technique. For virtual synchronization, we propose an interface protocol and interface logic between a hardware emulator and the HW/SW coemulation kernel. Experiments with real-life examples demonstrate the effectiveness of the proposed technique.
Systems, Architectures, Modeling, and Simulation, 2009. SAMOS '09. International Symposium on; 08/2009
-
Design, Automation and Test in Europe, DATE 2009, Nice, France, April 20-24, 2009; 01/2009