Conference Proceeding

Introducing control-flow inclusion to support pipelining in custom instruction set extensions.

01/2009; pp. 114-121. In: Proceedings of the IEEE 7th Symposium on Application Specific Processors, SASP 2009, San Francisco, CA, USA, July 27-28, 2009
Source: DBLP
  • Conference Proceeding: Automatic application-specific instruction-set extensions under microarchitectural constraints
    ABSTRACT: Many commercial processors now offer the possibility of extending their instruction set for a specific application - that is, to introduce customized functional units. There is a need to develop algorithms that decide automatically, from high-level application code, which operations are to be carried out in the customized extensions. A few algorithms exist but are severely limited in the type of operation clusters they can choose and hence reduce significantly the effectiveness of specialization. In this paper, we introduce a more general algorithm which selects maximal-speedup convex subgraphs of the application dataflow graph under fundamental microarchitectural constraints, and which improves significantly on the state of the art.
    Design Automation Conference, 2003. Proceedings; 07/2003
  • Article: Introduction of Architecturally Visible Storage in Instruction Set Extensions
    ABSTRACT: Instruction set extensions (ISEs) can be used effectively to accelerate the performance of embedded processors. The critical and difficult task of ISE selection is often performed manually by designers. A few automatic methods for ISE generation have shown good capabilities but are still limited in the handling of memory accesses, and so they fail to directly address the memory wall problem. We present here the first ISE identification technique that can automatically identify state-holding application-specific functional units (AFUs) comprehensively, thus being able to eliminate a large portion of memory traffic from cache and the main memory. Our cycle-accurate results obtained by the SimpleScalar simulator show that the identified AFUs with architecturally visible storage gain significantly more than previous techniques and achieve an average speedup of 2.8× over pure software execution with little area overhead. Moreover, the number of required memory-access instructions is reduced by two thirds on average, suggesting corresponding benefits on energy consumption.
    IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems 04/2007; · 1.27 Impact Factor
  • Conference Proceeding: Code transformation strategies for extensible embedded processors.
    Proceedings of the 2006 International Conference on Compilers, Architecture, and Synthesis for Embedded Systems, CASES 2006, Seoul, Korea, October 22-25, 2006; 01/2006
