ABSTRACT: The on-chip network is becoming critical to the scalability of future many-core architectures. Recently, nanophotonics has been proposed for on-chip networks because of its low latency and high bandwidth. However, nanophotonics has relatively high static power consumption, which can lead to inefficient architectures. In this work, we propose FlexiShare - a nanophotonic crossbar architecture that minimizes static power consumption by fully sharing a reduced number of channels across the network. To enable efficient global sharing, we decouple the allocation of the channels and the buffers, and introduce a novel photonic token-stream mechanism for channel arbitration and credit distribution. The flexibility of FlexiShare introduces additional router complexity and electrical power consumption. However, with the reduced number of optical channels, the overall power consumption is reduced without loss in performance. Our evaluation shows that the proposed token-stream arbitration applied to a conventional crossbar design improves network throughput by 5.5× under permutation traffic. In addition, FlexiShare achieves performance similar to that of a token-stream arbitrated conventional crossbar using only half the number of channels under balanced, distributed traffic. With the extracted trace traffic from MineBench and SPLASH-2, FlexiShare can further reduce the number of channels by up to 87.5%, while still providing better performance - resulting in up to 72% reduction in power consumption compared to the best alternative.
16th International Conference on High-Performance Computer Architecture (HPCA-16 2010), 9-14 January 2010, Bangalore, India; 01/2010
ABSTRACT: Sharing on-chip network resources efficiently is critical in the design of a cost-efficient network-on-chip (NoC). Concentration has been proposed for on-chip networks, but the trade-off between concentration implementation and performance has not been well understood. In this paper, we describe cost-efficient implementations of concentration and show how external concentration provides a significant reduction in complexity (47% and 36% reduction in area and energy, respectively) compared to the previously assumed integrated (high-radix) concentration while degrading overall performance by only 10%. Hybrid implementations of concentration are also presented, which provide an additional trade-off between complexity and performance. To further reduce the cost of the NoC, we describe how channel slicing can be used together with concentration. We propose virtual concentration, which further reduces the complexity - saving area and energy by 69% and 32% compared to a baseline mesh and 88% and 35% over a baseline concentrated mesh.
Third International Symposium on Networks-on-Chips, NOCS 2009, May 10-13 2009, La Jolla, CA, USA. Proceedings; 01/2009
ABSTRACT: Future many-core processors will require high-performance yet energy-efficient on-chip networks to provide a communication substrate for the increasing number of cores. Recent advances in silicon nanophotonics create new opportunities for on-chip networks. To efficiently exploit the benefits of nanophotonics, we propose Firefly - a hybrid, hierarchical network architecture. Firefly consists of clusters of nodes that are connected using conventional electrical signaling, while the inter-cluster communication is done using nanophotonics - exploiting the benefits of electrical signaling for short, local communication while nanophotonics is used only for global communication to realize an efficient on-chip network. A crossbar architecture is used for inter-cluster communication. However, to avoid global arbitration, the crossbar is partitioned into multiple logical crossbars and their arbitration is localized. Our evaluations show that Firefly improves the performance by up to 57% compared to an all-electrical concentrated mesh (CMESH) topology on adversarial traffic patterns and up to 54% compared to an all-optical crossbar (OP XBAR) on traffic patterns with locality. When the energy-delay product is compared, Firefly improves the efficiency of the on-chip network by up to 51% and 38% compared to CMESH and OP XBAR, respectively.
36th International Symposium on Computer Architecture (ISCA 2009), June 20-24, 2009, Austin, TX, USA; 01/2009