Conference Paper

A highly resilient routing algorithm for fault-tolerant NoCs

DOI: 10.1109/DATE.2009.5090627 Conference: Design, Automation and Test in Europe, DATE 2009, Nice, France, April 20-24, 2009
Source: DBLP


Current trends in technology scaling foreshadow worsening transistor reliability as well as greater numbers of transistors in each system. The combination of these factors will soon make long-term product reliability extremely difficult in complex modern systems such as systems on a chip (SoC) and chip multiprocessor (CMP) designs, where even a single device failure can cause fatal system errors. Resiliency to device failure will be a necessary condition at future technology nodes. In this work, we present a network-on-chip (NoC) routing algorithm to boost the robustness in interconnect networks, by reconfiguring them to avoid faulty components while maintaining connectivity and correct operation. This distributed algorithm can be implemented in hardware with less than 300 gates per network router. Experimental results over a broad range of 2D-mesh and 2D-torus networks demonstrate 99.99% reliability on average when 10% of the interconnect links have failed.
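
As a rough illustration of the idea of reconfiguring routes around failed links, the sketch below rebuilds next-hop entries for a small 2D mesh after links are marked faulty. It is not the paper's algorithm: the abstract describes a distributed hardware scheme, whereas this is a centralized breadth-first search, and it does not address deadlock freedom. All names (`mesh_links`, `build_next_hops`, `route`) are hypothetical.

```python
from collections import deque


def mesh_links(width, height):
    """Return the set of undirected links in a width x height 2D mesh."""
    links = set()
    for x in range(width):
        for y in range(height):
            if x + 1 < width:
                links.add(frozenset({(x, y), (x + 1, y)}))
            if y + 1 < height:
                links.add(frozenset({(x, y), (x, y + 1)}))
    return links


def build_next_hops(width, height, faulty_links, dest):
    """Breadth-first search outward from dest over healthy links only.

    Returns a dict mapping every router that can still reach dest to the
    neighbor it should forward to. Routers cut off by faults are absent.
    """
    healthy = mesh_links(width, height) - set(faulty_links)
    next_hop = {dest: dest}
    queue = deque([dest])
    while queue:
        node = queue.popleft()
        x, y = node
        for nbr in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)):
            if frozenset({node, nbr}) in healthy and nbr not in next_hop:
                next_hop[nbr] = node  # nbr forwards toward dest via node
                queue.append(nbr)
    return next_hop


def route(src, dest, next_hop):
    """Follow next-hop entries from src to dest; None if src is cut off."""
    if src not in next_hop:
        return None
    path = [src]
    while path[-1] != dest:
        path.append(next_hop[path[-1]])
    return path


# Example: a 3x3 mesh where the link between (0, 0) and (1, 0) has failed.
faulty = {frozenset({(0, 0), (1, 0)})}
table = build_next_hops(3, 3, faulty, dest=(2, 0))
path = route((0, 0), (2, 0), table)
```

In this example the packet from (0, 0) can no longer take the direct two-hop route and is steered through a detour that uses only healthy links. A router whose links have all failed simply has no entry in the table.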


Available from: Valeria Bertacco, Apr 25, 2015
  • Source
    • "References for comparison are the schemes proposed by Gomez et al. in [25] named Finter, which is an intermediate node based multi-phase routing using a different escape channel for each phase. A flag transmission and routing entry update mechanism presented by Fick et al. named DATE09 [13]. A fine-resolution detection and reconfiguration strategy named uDIREC proposed by Parikh and Bertacco [28], that makes detection decisions and stores the topology information in a software-maintained scoreboard at the "supervisor node" before applying Up*/Down* routing to avoid deadlock."

    Full-text · Dataset · Feb 2016
  • Source
    • "This restricts the possible routing decisions for fault tolerant routing algorithms even more. Deadlock free routing algorithms are often based on the turn model [13], where some turns are prohibited, or on virtual channels [6] (VCs). In contrast, deflection routing is inherently deadlock free."

    Full-text · Conference Paper · Sep 2015
    • "Once a packet faces a faulty component, it is rerouted around the fault to reach the destination. The methods using this technique can also be divided into two subgroups, depending on the mechanism they use to create a deadlock-free path around the faulty region; some approaches [12], [15] are designed based on the turn model, and some [4], [16] use VCs. As discussed earlier, the turn model and VCs were first introduced to guarantee the deadlock-freedom and provide adaptivity in routing."
    ABSTRACT: Networks-on-Chip (NoCs) are becoming more susceptible to faults due to the increasing density in the VLSI circuits. As a result, designing reliable and efficient routing methods is highly desirable. Most of the existing fault-tolerant routing techniques use nonminimal paths to reroute the packets around the faulty regions. Using these approaches, the network performance degrades drastically not only by taking unnecessary longer paths, but also by creating hotspots around the faults. Moreover, they are designed statically and cannot adapt to the dynamic traffic distribution in the network. In this paper, a reconfigurable and fault-tolerant routing method is proposed which is designed based on the Abacus Turn Model (AbTM). The presented deadlock-free routing technique is dynamically tuned based on the location of faults and congestion in the network. Thus, it is able to tolerate all single router failures without exploiting virtual channels. Moreover, it can grant full adaptiveness to the hotspot regions of the network. Using this scheme, the rerouting is minimized by forwarding the packets through the available shortest paths. This efficiency makes the proposed method a powerful asset for reliable routing in NoCs.
    Article · Feb 2015