Conference Paper

Memory Models for an Application-Specific Instruction-set Processor Design Flow

Dept. of Inf. Eng. & Comput. Sci., Feng Chia Univ., Taichung;
DOI: 10.1109/ICESS.2008.40 Conference: Embedded Software and Systems, 2008. ICESS '08. International Conference on
Source: IEEE Xplore

ABSTRACT To optimize system performance for a specific target application, embedded system designers may add some new instructions, called application-specific instructions (ASIs), by automatic design flow. In past days, most application-specific instruction-set processor (ASIP) researches focus on reducing instruction latency to improve performance regardless of the impact of memory access. In this paper, a design flow is proposed to automatically generate ASIs and to compare the performance between considering register transferring and regardless of it. The experiment results show the proposed approach can achieve up to 14% performance improvement and 10% memory access reduction comparing to no register transferring consideration.

  • Source
    [Show abstract] [Hide abstract]
    ABSTRACT: Instruction set extensions (ISEs) can be used effectively to accelerate the performance of embedded processors. The critical, and difficult task of ISE selection is often performed manually by designers. A few automatic methods for ISE generation have shown good capabilities, but are still limited in the handling of memory accesses, and so they fail to directly address the memory wall problem. We present here the first ISE identification technique that can automatically identify state-holding application-specific functional units (AFUs) comprehensively, thus being able to eliminate a large portion of memory traffic from cache and main memory. Our cycle-accurate results obtained by the SimpleScalar simulator show that the identified AFUs with architecturally visible storage gain significantly more than previous techniques, and achieve an average speedup of 2.8times over pure software execution. Moreover, the number of required memory-access instructions is reduced by two thirds on average, suggesting corresponding benefits on energy consumption
    Proceedings of the Conference on Design, Automation and Test in Europe, DATE 2006, Munich, Germany, March 6-10, 2006; 01/2006
  • Source
    [Show abstract] [Hide abstract]
    ABSTRACT: Configurable processors are becoming increasingly popular for modern embedded systems (especially for the field-programmable system-on-a-chip). While steady progress has been made in the tools and methodologies of automatic instruction set extension for configurable processors, the limited data bandwidth available in the core processor (e.g., the number of simultaneous accesses to the register file) becomes a potential performance bottleneck. In this paper we first present a quantitative analysis of the data bandwidth limitation in configurable processors, and then propose a novel low-cost architectural extension and associated compilation techniques to address the problem. The application of our approach results in a promising performance improvement.
    Proceedings of the ACM/SIGDA 13th International Symposium on Field Programmable Gate Arrays, FPGA 2005, Monterey, California, USA, February 20-22, 2005; 01/2005