Conference Paper

Corona: System Implications of Emerging Nanophotonic Technology

Univ. of Wisconsin - Madison, Madison, WI
DOI: 10.1109/ISCA.2008.35 Conference: 35th International Symposium on Computer Architecture (ISCA 2008), June 21-25, 2008, Beijing, China
Source: DBLP


We expect that many-core microprocessors will push performance per chip from the 10 gigaflop to the 10 teraflop range in the coming decade. To support this increased performance, memory and inter-core bandwidths will also have to scale by orders of magnitude. Pin limitations, the energy cost of electrical signaling, and the non-scalability of chip-length global wires are significant bandwidth impediments. Recent developments in silicon nanophotonic technology have the potential to meet these off- and on-stack bandwidth requirements at acceptable power levels. Corona is a 3D many-core architecture that uses nanophotonic communication for both inter-core communication and off-stack communication to memory or I/O devices. Its peak floating-point performance is 10 teraflops. Dense wavelength division multiplexed optically connected memory modules provide 10 terabytes per second of memory bandwidth. A photonic crossbar fully interconnects its 256 low-power multithreaded cores at 20 terabytes per second of bandwidth. We have simulated a 1024-thread Corona system running synthetic benchmarks and scaled versions of the SPLASH-2 benchmark suite. We believe that in comparison with an electrically connected many-core alternative that uses the same on-stack interconnect power, Corona can provide 2 to 6 times more performance on many memory-intensive workloads, while simultaneously reducing power.
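The headline figures in the abstract can be combined into a few per-core ratios. The following is a rough sketch only: it assumes decimal prefixes (tera = 10^12), an even split of bandwidth and compute across the 256 cores, and that the simulated 1024-thread system uses all 256 cores. None of these assumptions is stated in the abstract, and the variable names are illustrative.

```python
# Back-of-the-envelope ratios from the abstract's headline numbers.
# Assumptions (not stated in the abstract): decimal prefixes (tera = 1e12),
# resources divided evenly across cores, and 1024 threads spread over 256 cores.

PEAK_FLOPS = 10e12    # 10 teraflops peak floating-point performance
MEMORY_BW = 10e12     # 10 terabytes/s optically connected memory bandwidth
CROSSBAR_BW = 20e12   # 20 terabytes/s photonic crossbar bandwidth
CORES = 256
THREADS = 1024


def per_core(total, cores=CORES):
    """Return an even per-core share of a chip-wide total."""
    return total / cores


threads_per_core = THREADS // CORES        # hardware threads per core
bytes_per_flop = MEMORY_BW / PEAK_FLOPS    # memory bytes available per peak flop
flops_per_core = per_core(PEAK_FLOPS)      # peak flop/s per core
crossbar_per_core = per_core(CROSSBAR_BW)  # crossbar bytes/s per core
memory_per_core = per_core(MEMORY_BW)      # memory bytes/s per core

if __name__ == "__main__":
    print(f"threads per core:       {threads_per_core}")
    print(f"memory bytes per flop:  {bytes_per_flop:.2f}")
    print(f"peak GFLOP/s per core:  {flops_per_core / 1e9:.2f}")
    print(f"crossbar GB/s per core: {crossbar_per_core / 1e9:.2f}")
    print(f"memory GB/s per core:   {memory_per_core / 1e9:.2f}")
```

Under these assumptions the design point works out to about one byte of memory bandwidth per peak flop, which is the balance the abstract's bandwidth scaling argument is aiming for.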



Available from: Marco Fiorentino
  • Source
    • "Photonic Network-on-Chip (PNoC) [1], [2], [3] is a novel concept enabling ultra-high communication bandwidth in the terabits per second range, low power, and low communication latency. When combined with Wavelength Division Multiplexing (WDM), multiple parallel optical streams of data are concurrently transferred through a single waveguide. "
    ABSTRACT: In conventional hybrid PNoC systems, the end-to-end optical data transfer is accompanied by electrical control functions including path setup, acknowledgment, and tear-down. These functions directly affect the performance and power characteristics of these circuit-switching based systems. In this paper, we propose a novel hybrid PNoC system, named PHENIC-II. Thanks to the adopted non-blocking photonic switch and light-weight electronic router, PHENIC-II is capable of alleviating the congestion in the electronic control layer, which is considered the main source of latency and power overhead in hybrid PNoC systems. From the performance evaluation, we demonstrate that the proposed system has better performance and lower energy dissipation when compared to the previously proposed systems.
    Full-text · Conference Paper · Oct 2015
  • Source
    • "The NoC employs the multiple-write-single-read (MWSR) mechanism for L1-to-L2 communication with a dedicated silicon-photonic channel having a width of 512 bits for each L2 cache bank. A token-based protocol is used to arbitrate between the L1 caches for getting access to L1-to-L2 communication channels [5]. Tokens are assigned in a round-robin fashion for fairness. "
    ABSTRACT: In manycore systems, the silicon-photonic link technology is projected to replace electrical link technology for global communication in network-on-chip (NoC) as it can provide as much as an order of magnitude higher bandwidth density and lower data-dependent power. However, a large amount of fixed power is dissipated in the laser sources required to drive these silicon-photonic links, which negates any bandwidth density advantages. This large laser power dissipation depends on the number of on-chip silicon-photonic links, the bandwidth of each link, and the photonic losses along each link. In this paper, we propose to reduce the laser power dissipation at runtime by dynamically activating/deactivating L2 cache banks and switching ON/OFF the corresponding silicon-photonic links in the NoC. This method effectively throttles the total on-chip NoC bandwidth at runtime according to the memory access features of the applications running on the manycore system. Full-system simulation utilizing Princeton application repository for shared-memory computers and Stanford parallel applications for shared-memory-2 parallel benchmarks reveals that our proposed technique achieves on average 23.8% (peak value 74.3%) savings in laser power, and 9.2% (peak value 26.9%) lower energy-delay product for the whole system at the cost of 0.65% loss (peak value 2.6%) in instructions per cycle on average when compared to the cases where all L2 cache banks are always active.
    Full-text · Article · Jun 2015 · IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems
  • Source
    • "Pioneering contributions try to reduce distances with 3D [12] [15] or reduce traveling time with optic [13] [19] and RF [3] [2] "

    Full-text · Article · May 2015