ABSTRACT: Protecting on-chip cache memories against soft errors has become an increasing challenge in designing new generation reliable microprocessors. Previous efforts have mainly focused on improving the reliability of the cache data arrays. Due to its crucial importance to the correctness of cache accesses, the tag array also demands high reliability against soft errors. Exploiting the address locality of memory accesses, we propose to duplicate the most recently accessed tag entries in a small tag replication buffer (TRB) to protect the information integrity of the tag array in the data cache. Experimental results show that our proposed TRB scheme achieves a high 90% access-with-replica (AWR) rate with low performance (0%), energy (16.3%), and area (19.9%) overheads. We also conduct a detailed design space exploration for the TRB design and propose a selective TRB scheme that achieves a higher AWR rate (97.4%) for the dirty cache lines with negligible overheads. To provide a comprehensive evaluation of the tag-array reliability, we further conduct an architectural vulnerability factor (AVF) analysis for the tag array in the data cache and propose a refined metric, detected-without-replica-AVF (DOR-AVF), which combines the AVF and AWR analysis. Based on our DOR-AVF analysis, a selective TRB scheme with early write-back (S-TRB-EWB) is proposed, which achieves a zero DOR-AVF and a 100% AWR rate at a negligible performance overhead. Results from statistical fault/error injection experiments also confirm the effectiveness of our TRB schemes and the achieved reliability of the cache tag array, which recovers 100% of detected errors. Index Terms: cache tag array, reliability, soft error, tag replication buffer (TRB).
IEEE Transactions on Very Large Scale Integration (VLSI) Systems 01/2012; 1. · 1.22 Impact Factor
ABSTRACT: With the continuous scaling down of semiconductor technology, soft errors induced by energetic particles have become an increasing challenge in designing current and next-generation reliable microprocessors. Due to their large share of the transistor budget and die area, cache memories are increasingly vulnerable to soft errors. Previous work based on vulnerability factor (VF) analysis proposed analytical models to evaluate the reliability of on-chip data and instruction caches. However, no system-level study of the vulnerability of instruction caches has been available. In this paper, we propose a new analytical model to estimate the system-level vulnerability factor for on-chip instruction caches. In our model, the error masking/detection effects in instructions based on the Instruction Set Architecture (ISA) are studied. Our experimental results using the SPEC benchmark suite show that self-error-masking/detection in instructions reduces the VF of the instruction caches compared to the previous study. We also evaluate the effectiveness of reliability optimization techniques for instruction caches under our system-level VF characterization. Our proposed vulnerability model can provide insightful guidance for reliable instruction cache and ISA design.
2011 IEEE International Symposium on Defect and Fault Tolerance in VLSI and Nanotechnology Systems, DFT 2011, Vancouver, BC, Canada, October 3-5, 2011; 01/2011
ABSTRACT: Protecting the on-chip cache memories against soft errors has become an increasing challenge in designing new generation reliable microprocessors. Previous efforts have mainly focused on improving the reliability of the cache data arrays. Due to its crucial importance to the correctness of cache accesses, the tag array demands high reliability against soft errors while the data array is fully protected. Exploiting the address locality of memory accesses, we propose to duplicate the most recently accessed tag entries in a small Tag Replication Buffer (TRB) to protect the information integrity of the tag array in the data cache with low performance, energy, and area overheads. A Selective-TRB scheme is further proposed to protect only the tag entries of dirty cache lines. The experimental results show that the Selective-TRB scheme achieves a higher access-with-replica (AWR) rate of 97.4% for the tags of dirty cache lines. To provide a comprehensive evaluation of the tag-array reliability, we also conduct an architectural vulnerability factor (AVF) analysis for the tag array and propose a refined metric, detected-without-replica-AVF (DOR-AVF), which combines the AVF and AWR analysis. Based on our DOR-AVF analysis, a TRB scheme with early write-back (EWB) is proposed, which achieves a zero DOR-AVF at a negligible performance overhead.
IEEE Computer Society Annual Symposium on VLSI, ISVLSI 2010, 5-7 July 2010, Lixouri Kefalonia, Greece; 01/2010
ABSTRACT: Protecting the register value and its data buses is crucial to reliable computing in high-performance microprocessors due to the increasing susceptibility of CMOS circuitry to soft errors induced by high-energy particle strikes. Since the register file is in the critical path of the processor pipeline, any reliable design that increases either the pressure on the register file or the register file access latency is not desirable. In this paper, we propose to exploit narrow-width register values, which represent the majority of the generated values, to duplicate the value within the same data item; this in-register duplication (IRD) eliminates the requirement for additional copy registers. The data path pipeline is augmented to efficiently incorporate parity encoding and parity checking such that error recovery is seamlessly supported in IRD, and the parity checking is overlapped with the execution stage to avoid lengthening the critical path. A detailed architectural vulnerability factor (AVF) analysis shows that IRD significantly reduces the AVF from 8.4% in a conventional unprotected register file to 0.1% in an IRD register file. Our experimental evaluation using the SPEC CINT2000 benchmark suite also shows that IRD provides superior read-with-duplicate (RWD) and error detection/recovery rates under heavy error injection as compared to previous reliability schemes, while only incurring a small power overhead.
IEEE Transactions on Very Large Scale Integration (VLSI) Systems 08/2009; · 1.22 Impact Factor
ABSTRACT: Soft errors induced by energetic particle strikes in on-chip cache memories have become an increasing challenge in designing new generation reliable microprocessors. Previous efforts have exploited information redundancy via parity/ECC coding or cache line duplication for information integrity in on-chip cache memories. Due to the differing performance, area/size, and energy constraints of different target systems, many existing unoptimized protection schemes may eventually prove significantly inadequate and ineffective. In this paper, we propose a new framework for conducting comprehensive studies and characterization of the reliability behavior of cache memories, in order to provide insight into cache vulnerability to soft errors as well as design guidance to architects for highly efficient reliable on-chip cache memory design. Our work is based on the development of new lifetime models for data and tag arrays residing in both the data and instruction caches. Those models facilitate the characterization of the vulnerability of stored items at various lifetime phases. We then exemplify this design methodology by proposing reliability schemes targeting specific vulnerable phases. Benchmarking is carried out to showcase the effectiveness of our approach.
IEEE Transactions on Computers 01/2009; 58:1171-1184. · 1.38 Impact Factor
ABSTRACT: Soft-error-induced reliability problems have become a major challenge in designing new generation microprocessors. Due to the on-chip caches' dominant share in die area and transistor budget, protecting them against soft errors is of paramount importance. Recent research has focused on the design of cost-effective reliable data caches in terms of performance, energy, and area overheads, based on the assumption of fixed error rates. However, for systems in operating environments that vary with time or location, those schemes will be either insufficient or overdesigned for the changing error rates. In this paper, we explore the design of a self-adaptive reliable data cache that dynamically adapts its employed reliability schemes to the changing operating environments in order to maintain a target reliability. The proposed data cache is implemented with three levels of error protection schemes, a monitoring mechanism, and a control component that decides whether to upgrade, downgrade, or keep the current protection level based on the feedback from the monitor. Our experimental evaluation using a set of SPEC CPU2000 benchmarks shows that our self-adaptive data cache achieves similar reliability to a cache protected by the most reliable scheme, while simultaneously minimizing the performance and power overheads.
IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems 09/2008; · 1.09 Impact Factor
ABSTRACT: Powerful branch predictors along with a large branch target buffer (BTB) are employed in superscalar processors for instruction-level parallelism exploitation. However, the large BTB not only dominates the predictor energy consumption, but also becomes a major roadblock in achieving faster clock frequencies at deep sub-micron technologies. In this paper, we propose a filtering scheme that reduces the accesses to the BTB to achieve a significant dynamic energy reduction in the BTB while maintaining performance. Our experimental evaluation using the SPEC2000 benchmark suite shows that our BTB Access Filtering (BAF) design achieves an 88.5% dynamic energy reduction over a default 2K-entry 2-way BTB at the cost of a negligible 0.1% performance loss, on average across all benchmarks. We also study the leakage behavior and its control in our BAF design. The results show that by applying a drowsy strategy, we can achieve very effective leakage control in the BTB: an 83% leakage reduction at a marginal 0.3% performance overhead. For high-performance designs, our BAF can also improve the BTB's performance scalability at new technologies. In deeply-pipelined designs, the BAF design yields a 2.7% (and 8.1%) performance improvement over a conventional 2-cycle (and 3-cycle) BTB, with its energy efficiency fully exploited.
IEEE Computer Society Annual Symposium on VLSI, ISVLSI 2008, 7-9 April 2008, Montpellier, France; 01/2008
ABSTRACT: Designing high-performance low-power register files is of critical importance to the continuation of current performance advances in wide-issue and deeply-pipelined superscalar microprocessors. In this paper, we propose a new microarchitecture, the asymmetrically-banked value-aware register file (AB-VARF), to exploit the prevailing narrow-width register values for low-latency and power-efficient register file designs. The register bit-widths of different banks in our AB-VARF register files are specifically customized to capture different narrow-width values. Augmented with a value width predictor, the register renaming logic is slightly tuned to rename predicted narrow-width registers to the corresponding narrow-width banks. Our experimental evaluation with the SPEC CINT2000 benchmark suite shows that AB-VARF reduces the energy consumption by 92.6% over a conventional register file, on average, at the cost of a 6.6% performance loss relative to an ideal 1-cycle monolithic register file.
2007 IEEE Computer Society Annual Symposium on VLSI (ISVLSI 2007), May 9-11, 2007, Porto Alegre, Brazil; 01/2007
ABSTRACT: Energetic-particle-induced soft errors in on-chip cache memories have become a major challenge in designing new generation reliable microprocessors. Uniformly applying conventional protection schemes such as error correcting codes (ECC) to SRAM caches may not be practical where performance, power, and die area are highly constrained, especially for embedded systems. In this paper, we propose to analyze the lifetime behavior of the data cache to identify its temporal vulnerability. For this vulnerability analysis, we develop a new lifetime model. Based on the new lifetime model, we evaluate the effectiveness of several existing schemes in reducing the vulnerability of the data cache. Furthermore, we propose to periodically invalidate clean cache lines to reduce the probability of errors being read in by the CPU. Combined with previously proposed early writeback strategies (1), our schemes achieve a substantially reduced vulnerability in the data cache, which indicates the necessity of different protection schemes for data items during various phases in their lifetime.
Proceedings of 2006 International Conference on Embedded Computer Systems: Architectures, Modeling and Simulation (IC-SAMOS 2006), Samos, Greece, July 17-20, 2006; 01/2006