Assigning programs to a cache-aware scratchpad considering the influence of the whole embedded system
ABSTRACT: The scratch-pad memory (SPM) is becoming increasingly popular in embedded systems and has a great impact on performance. This paper proposes a method to make good use of the SPM by taking the influence of the whole embedded system into account. Results show that our benchmarks average a 27% speedup without OS optimization and 31% with optimized kernel allocation.
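The abstract above does not specify the allocation algorithm, but the general problem it describes (choosing which program objects to place in a size-limited SPM) is often approached by ranking objects by profiled access frequency. The following is a minimal, hypothetical sketch of such a greedy allocator; the `Object` structure, its fields, and all numbers are illustrative assumptions, not the paper's method.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical profile entry: a program object with its size and a
 * measured access count (names and fields are illustrative only). */
typedef struct {
    const char *name;
    int size;        /* bytes */
    long accesses;   /* profiled access count */
} Object;

/* Compare by access density (accesses per byte), descending. */
static int by_density(const void *a, const void *b) {
    const Object *x = a, *y = b;
    double dx = (double)x->accesses / x->size;
    double dy = (double)y->accesses / y->size;
    return (dy > dx) - (dy < dx);
}

/* Greedily place the most densely accessed objects into the SPM
 * until the capacity budget is exhausted; returns bytes used. */
int allocate_spm(Object *objs, int n, int spm_capacity) {
    qsort(objs, n, sizeof(Object), by_density);
    int used = 0;
    for (int i = 0; i < n; i++) {
        if (used + objs[i].size <= spm_capacity) {
            used += objs[i].size;
            printf("SPM <- %s (%d bytes)\n", objs[i].name, objs[i].size);
        }
    }
    return used;
}
```

A system-aware method such as the one described would refine the ranking with OS and kernel-level effects; this sketch captures only the capacity-constrained selection step.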
ABSTRACT: This paper presents the first-ever compile-time method for allocating a portion of the heap data to scratch-pad memory. A scratch-pad is a fast, directly addressed, compiler-managed SRAM memory that replaces the hardware-managed cache. It is motivated by its better real-time guarantees versus cache and by its significantly lower overheads in access time, energy consumption, area and overall runtime. Existing compiler methods for allocating data to scratch-pad are able to place only global and stack data in scratch-pad memory; heap data is allocated entirely in DRAM, resulting in poor performance. Runtime methods based on software caching can place heap data in scratch-pad, but because of their high overheads from software address translation, they have not been successful, especially for heap data. In this paper we present a dynamic yet compiler-directed allocation method for heap data that, for the first time, (i) is able to place a portion of the heap data in scratch-pad; (ii) has no software-caching tags; (iii) requires no run-time per-access extra address translation; and (iv) is able to move heap data back and forth between scratch-pad and DRAM to better track the program's locality characteristics. With our method, global, stack and heap variables can share the same scratch-pad. When compared to placing all heap variables in DRAM and only global and stack data in scratch-pad, our results show that our method reduces the average runtime of our benchmarks by 34.6%, and the average power consumption by 39.9%, for the same size of scratch-pad fixed at 5% of total data size.
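The key idea in the abstract above, moving heap data between scratch-pad and DRAM at compiler-chosen program points with no per-access translation, can be illustrated with a toy model. The `spm` array, the offsets, and the two helper functions below are all assumptions made for illustration; the paper's actual compiler transformation is far more involved.

```c
#include <string.h>

#define SPM_SIZE 1024

/* Toy model: this array stands in for the scratch-pad SRAM; a real
 * compiler would reserve an address range in the SPM instead. */
static char spm[SPM_SIZE];

/* Copy a hot heap block into scratch-pad before a heavily accessed
 * program region, so subsequent accesses hit fast SRAM, not DRAM.
 * The returned pointer is used directly: no tags, no per-access
 * address translation. */
void *spm_bring_in(void *heap_block, int size, int spm_offset) {
    memcpy(spm + spm_offset, heap_block, size);
    return spm + spm_offset;
}

/* Write the (possibly modified) block back to DRAM when the hot
 * region ends, freeing the scratch-pad slot for other data. */
void spm_write_back(void *heap_block, int size, int spm_offset) {
    memcpy(heap_block, spm + spm_offset, size);
}
```

In this model the compiler would insert a `spm_bring_in` call before a hot loop over a heap array, rewrite the loop to use the returned SPM pointer, and insert `spm_write_back` afterwards; global, stack and heap data could all share the one `spm` region by using disjoint offsets.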
Conference Paper: Cache-aware scratchpad allocation algorithm
ABSTRACT: In the context of portable embedded systems, reducing energy is one of the prime objectives. Most high-end embedded microprocessors include on-chip instruction and data caches, along with a small energy-efficient scratchpad. Previous approaches for utilizing scratchpad did not consider caches and hence fail for the current architecture. In the presented work, we use the scratchpad for storing instructions and propose a generic cache-aware scratchpad allocation (CASA) algorithm. We report an average reduction of 8-29% in instruction memory energy consumption compared to a previously published technique for benchmarks from the MediaBench suite. The scratchpad in the presented architecture is similar to a preloaded loop cache. Comparing the energy consumption of our approach against preloaded loop caches, we report average energy savings of 20-44%. Design, Automation and Test in Europe Conference and Exhibition, 2004. Proceedings; 03/2004
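The allocation problem described above can be viewed, in much simplified form, as a knapsack over instruction blocks: pick the set of blocks whose move to scratchpad saves the most fetch energy within the size budget. The sketch below is a greedy stand-in under assumed, made-up energy numbers; the published CASA algorithm additionally models cache conflicts, which this toy version collapses into a single per-fetch cache energy figure.

```c
/* Hypothetical candidate: a basic block with its size, execution
 * count, and per-fetch energy in cache vs scratchpad (made-up units). */
typedef struct {
    const char *name;
    int size;           /* bytes of instructions */
    long execs;         /* profiled execution count */
    double e_cache;     /* energy per fetch from cache (incl. conflict misses) */
    double e_spm;       /* energy per fetch from scratchpad */
} Block;

/* Simplified stand-in for CASA: greedily pick blocks by energy saving
 * per byte under the scratchpad size budget; returns total saved energy. */
double select_blocks(Block *b, int n, int capacity) {
    double saved = 0.0;
    int used = 0;
    /* selection sort by saving density, descending */
    for (int i = 0; i < n; i++) {
        int best = i;
        for (int j = i + 1; j < n; j++) {
            double dj = b[j].execs * (b[j].e_cache - b[j].e_spm) / b[j].size;
            double db = b[best].execs * (b[best].e_cache - b[best].e_spm) / b[best].size;
            if (dj > db) best = j;
        }
        Block tmp = b[i]; b[i] = b[best]; b[best] = tmp;
        if (used + b[i].size <= capacity) {
            used += b[i].size;
            saved += b[i].execs * (b[i].e_cache - b[i].e_spm);
        }
    }
    return saved;
}
```

A greedy density heuristic like this is only an approximation; the cache-aware aspect of the real algorithm matters precisely because moving one block to scratchpad changes the conflict behavior, and hence `e_cache`, of the blocks left behind.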