Conference Paper

An Analytical Model for Reliability Evaluation of NoC Architectures.

Univ. of Tehran, Tehran
DOI: 10.1109/IOLTS.2007.13 Conference: 13th IEEE International On-Line Testing Symposium (IOLTS 2007), 8-11 July 2007, Heraklion, Crete, Greece
Source: DBLP

ABSTRACT This paper proposes an analytical model to assess Reliability Factor of an NoC based System-on-Chip design. Reliability Factor is the probability that faults in the NoC infrastructure can be recovered without any effect on system functionality. The proposed method classifies switch faults of an NoC according to their impact on system functionality. Based on this classification, the contribution of each transient fault lowering the reliability of the NoC is calculated. This model can be used to decide which fault tolerant techniques cause more improvement on system reliability.

1 Bookmark
  • Source
    [Show abstract] [Hide abstract]
    ABSTRACT: We present comparative reliability analysis between shared-bus AMBA and network-on-chip (NoC) in the presence of single-event upsets (SEUs) using MPEG-2 video decoder as a case study. Employing SystemC-based cycle-accurate fault simulations, we investigate how the decoder reliability is affected when SEUs are injected into the computation cores and communication interconnects of the decoder. We show that for a given soft error rate, AMBA-based decoder experiences higher SEUs during computation due to higher execution time. On the other hand, NoC-based decoder experiences higher SEUs during inter-core communication due to higher channel latency and resource usage in the interconnects. Furthermore, we evaluate the impact of total SEUs at application-level for NoC- and AMBA-based decoders.
  • [Show abstract] [Hide abstract]
    ABSTRACT: Advances in technology scaling increasingly make Network-on-Chips (NoCs) more susceptible to failures that cause various reliability challenges. With increasing area occupied by different on-chip memories, strategies for maintaining fault-tolerance of distributed on-chip memories become a major design challenge. We propose a system-level design methodology for scalable fault-tolerance of distributed on-chip memories in NoCs. We introduce a novel reliability clustering model for fault-tolerance analysis and shared redundancy management of on-chip memory blocks. We perform extensive design space exploration applying the proposed reliability clustering on a block-redundancy fault-tolerant scheme to evaluate the tradeoffs between reliability, performance, and overheads. Evaluations on a 64-core chip multiprocessor (CMP) with an 8x8 mesh NoC show that distinct strategies of our case study may yield up to 20% improvements in performance gains and 25% improvement in energy savings across different benchmarks, and uncover interesting design configurations.
    Design, Automation & Test in Europe Conference & Exhibition (DATE), 2013; 01/2013
  • Source
    [Show abstract] [Hide abstract]
    ABSTRACT: In this paper, we present reliability analysis and comparison between on-chip communication architectures: dominant shared-bus AMBA and emerging Network-on-Chip (NoC); in the presence of single-event upsets (SEUs) using MPEG-2 video decoder as a case study. Employing SystemC-based fault simulations, reliability of the decoders is studied in terms of SEUs experienced in the computation cores and communication interconnects. We show that for a given soft error rate (SER), NoC-based decoder experiences lower SEUs than AMBA-based decoder. Using peak signal-to-noise ratio (PSNR) and frame error ratio (FER) metrics to evaluate the impact of SEUs at application-level, we show that NoC-based decoder gives up to 4 dB higher PSNR, while AMBA experiences up to 3% lower FER. Furthermore, we investigate the impact of routing, application task mapping (distribution of tasks among computation cores) and architecture allocation (choice of number of computation cores) on the reliability of the decoders in the presence of SEUs.
    Microprocessors and Microsystems 01/2011; · 0.55 Impact Factor