Conference Paper

Geyser-1: A MIPS R3000 CPU core with fine grain runtime power gating

DOI: 10.1109/ASSCC.2009.5357257
Conference: IEEE Asian Solid-State Circuits Conference, 2009 (A-SSCC 2009)
Source: IEEE Xplore

ABSTRACT Geyser-1 is a prototype MIPS R3000 CPU with fine-grain runtime power gating (PG) for the major computational components in the execution stage. Function units such as the CLU, shifter, multiplier and divider are power-gated and controlled at runtime so that only the function unit in use is powered on, which minimizes leakage power. Evaluation on the real chip shows that the fine-grain runtime PG mechanism works without electrical problems. It reduces leakage power by 7% at 25°C and by 24% at 80°C. Evaluation with benchmark programs shows that power consumption can be reduced by 3% at 25°C and by 30% at 80°C.
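As a rough illustration of the idea in the abstract (only the function unit an instruction needs is powered on), the following sketch estimates how often each gated unit could sleep over an instruction trace. The opcode-to-unit mapping, the `sleep_fraction` helper, and the sample trace are all hypothetical and are not taken from the paper; the model also ignores wake-up latency and the energy cost of switching a unit on and off, both of which a real design has to account for.

```python
from collections import Counter

# The four power-gated function units named in the abstract.
UNITS = ("CLU", "shifter", "multiplier", "divider")

# HYPOTHETICAL mapping from MIPS mnemonics to the unit they would use.
# The actual Geyser-1 decode logic is not described in the abstract.
UNIT_OF = {
    "add": "CLU", "addu": "CLU", "sub": "CLU", "and": "CLU", "or": "CLU", "slt": "CLU",
    "sll": "shifter", "srl": "shifter", "sra": "shifter",
    "mult": "multiplier", "multu": "multiplier",
    "div": "divider", "divu": "divider",
}


def sleep_fraction(trace):
    """Return, per unit, the fraction of instructions that do not use it.

    Instructions missing from UNIT_OF (for example "nop") are treated as
    using none of the gated units.
    """
    if not trace:
        raise ValueError("trace must contain at least one instruction")
    used = Counter(UNIT_OF.get(op) for op in trace)
    total = len(trace)
    return {unit: 1 - used[unit] / total for unit in UNITS}


if __name__ == "__main__":
    # Illustrative trace, not a benchmark from the paper.
    trace = ["addu", "sll", "addu", "mult", "nop", "addu", "div", "addu"]
    for unit, fraction in sleep_fraction(trace).items():
        print(f"{unit:<10} idle for {fraction:.1%} of instructions")
```

An idle fraction is only an upper bound on the achievable saving: a unit that is idle for just a few cycles at a time may cost more energy to switch off and on than it saves by sleeping.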

  • ABSTRACT: This paper proposes ultrafine-grained run-time power gating of on-chip routers, in which the power supply to each router component (e.g., virtual-channel buffer, virtual-channel multiplexer, and crossbar multiplexer and output latch) can be individually controlled based on the applied workload. Since only the router components that are transferring a packet are activated, the leakage power of the on-chip network can be reduced to a near-optimal level. However, such techniques inherently increase the communication latency and degrade application performance, since a certain amount of wakeup latency is required to activate the sleeping components. To mitigate this wakeup latency, an early wakeup method that can detect the next packet arrival in advance and activate the corresponding components is essential. We designed and implemented an ultrafine-grained power-gating router using a commercial 65 nm process. We propose four early wakeup methods and combine them with the power-gating router. The proposed router with the early wakeup methods is evaluated in terms of its application performance, area overhead, and leakage power reduction, taking into account the on/off energy overhead. The simulation results show that it reduces the leakage power by 54.4-59.9% on average even when the application programs are fully running, at the expense of a 4.6% area overhead and a 0.7-3.7% performance overhead, assuming 1 GHz operation.
    IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems 05/2011; · 1.20 Impact Factor
  • ABSTRACT: Leakage power already accounts for a considerable portion of the active power in recent process technologies. In this chapter, we survey various power gating techniques to reduce the leakage power of on-chip routers. Then we introduce a run-time fine-grained power-gating router, in which the power supply to each router component (e.g., virtual-channel buffer, crossbar’s multiplexer, and output latch) can be individually controlled in response to the applied workload. The fine-grained power-gating router with 35 micro-power domains is designed using a commercial 65 nm process and evaluated in terms of the area overhead, application performance, and leakage power reduction.
    12/2010: pages 21-43;
  • ABSTRACT: Static power dissipation has been identified as a limiting factor in future microprocessor technologies. This paper presents Loop-Directed Mothballing (LDM) to reduce static power by power-gating execution units. The method accurately predicts the resource requirements and limits performance degradation by focussing on inner loops. In simulation, the energy-delay product (EDP) of the processor is reduced by 10.3%. Two prior methods show worse EDP despite having greater power savings.
    Index Terms: Microprocessor, static power, power-gating, execution unit, simulation and EDP
    Static power dissipation is a key issue for microprocessor technology scaling in the near future [5]. Increasing clock frequency and transistor count drive the power consumption higher, but the related problems of heat dissipation, energy costs and battery life could limit the practicalities of such future technologies. Existing approaches [4], [8] address this problem by power-gating (switching off) execution units to reduce the static power dissipation, which is present in all logic that is powered even if it exhibits no dynamic switching activity. The energy-delay product (EDP) takes both performance and energy into account, and is used in this work as power can often be saved at the expense of performance.
    Execution units are among the most power-hungry devices in the microprocessor [4], [6], but power-gating them is a nontrivial problem; the heterogeneity of the units requires more analysis to match the application’s requirements to the resources, and a poor match could result in a costly loss in performance due to contention for units or stalls while required units are powered up. In this paper, a method that analyses innermost loop bodies is presented, which exploits similar resource utilisation across loop iterations and hides the costs of unit power up in the branch misprediction from the loop exit.
    II. LOOP-DIRECTED MOTHBALLING
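The last abstract above reports that two prior methods have worse EDP despite greater power savings. A short derivation shows how that can happen. The symbols are assumptions introduced here, not notation from the paper: $\bar{P}$ is average power over a run, $T$ is execution time, $s$ is the fractional saving in average power, and $d$ is the fractional slowdown.

```latex
\begin{align*}
\mathrm{EDP} &= E \cdot T = (\bar{P}\,T)\cdot T = \bar{P}\,T^{2} \\
\bar{P}' &= (1-s)\,\bar{P}, \qquad T' = (1+d)\,T \\
\frac{\mathrm{EDP}'}{\mathrm{EDP}} &= (1-s)\,(1+d)^{2}
\end{align*}
```

Because delay enters squared, EDP improves only when $(1-s)(1+d)^{2} < 1$. With purely illustrative numbers, a 10% power saving at a 6% slowdown gives $0.9 \times 1.1236 \approx 1.011$, a slightly worse EDP, while the same saving at a 2% slowdown gives $0.9 \times 1.0404 \approx 0.936$, an improvement of about 6.4%.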