Conference Paper

Corona: System Implications of Emerging Nanophotonic Technology.

Univ. of Wisconsin - Madison, Madison, WI
DOI: 10.1109/ISCA.2008.35 Conference: 35th International Symposium on Computer Architecture (ISCA 2008), June 21-25, 2008, Beijing, China
Source: DBLP

ABSTRACT We expect that many-core microprocessors will push performance per chip from the 10 gigaflop to the 10 teraflop range in the coming decade. To support this increased performance, memory and inter-core bandwidths will also have to scale by orders of magnitude. Pin limitations, the energy cost of electrical signaling, and the non-scalability of chip-length global wires are significant bandwidth impediments. Recent developments in silicon nanophotonic technology have the potential to meet these off- and on-stack bandwidth requirements at acceptable power levels. Corona is a 3D many-core architecture that uses nanophotonic communication for both inter-core communication and off-stack communication to memory or I/O devices. Its peak floating-point performance is 10 teraflops. Dense wavelength division multiplexed optically connected memory modules provide 10 terabyte per second memory bandwidth. A photonic crossbar fully interconnects its 256 low-power multithreaded cores at 20 terabyte per second bandwidth. We have simulated a 1024 thread Corona system running synthetic benchmarks and scaled versions of the SPLASH-2 benchmark suite. We believe that in comparison with an electrically-connected many-core alternative that uses the same on-stack interconnect power, Corona can provide 2 to 6 times more performance on many memory intensive workloads, while simultaneously reducing power.
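
The headline figures in the abstract imply a few balance ratios that are easy to check by hand. The Python sketch below works them out as back-of-the-envelope arithmetic only. It assumes decimal units (1 tera = 10^12), an even split of the aggregate bandwidth and peak performance across the 256 cores, and that the 1024 simulated threads run on those 256 cores; none of these assumptions is stated in the abstract, and the names used are illustrative.

```python
# Figures quoted in the abstract. Decimal units are an assumption (1 tera = 1e12).
PEAK_FLOPS = 10e12    # 10 teraflops peak floating-point performance
MEMORY_BW = 10e12     # 10 terabytes per second to optically connected memory
CROSSBAR_BW = 20e12   # 20 terabytes per second across the photonic crossbar
CORES = 256           # low-power multithreaded cores
THREADS = 1024        # threads in the simulated system


def derived_ratios():
    """Return balance ratios implied by the abstract's aggregate numbers.

    Per-core values assume an even split across cores, which the
    abstract does not state; treat them as rough averages.
    """
    return {
        "threads_per_core": THREADS // CORES,
        "memory_bytes_per_flop": MEMORY_BW / PEAK_FLOPS,
        "crossbar_bytes_per_flop": CROSSBAR_BW / PEAK_FLOPS,
        "peak_gflops_per_core": PEAK_FLOPS / CORES / 1e9,
        "crossbar_gb_per_s_per_core": CROSSBAR_BW / CORES / 1e9,
    }


if __name__ == "__main__":
    for name, value in derived_ratios().items():
        print(f"{name}: {value}")
```

Under these assumptions the design offers about one byte of memory bandwidth and two bytes of crossbar bandwidth per peak floating-point operation, which is the kind of balance the abstract's claim about memory-intensive workloads relies on.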

  • ABSTRACT: Emerging non-volatile memory technologies such as MRAM are promising design solutions for energy-efficient memory architecture, especially for mobile systems. However, building commodity MRAM by reusing DRAM designs is not straightforward. The existing memory interfaces are incompatible with MRAM's small page size, and they fail to leverage MRAM's unique properties, causing unnecessary performance and energy overhead. In this article, we propose four techniques to enable and optimize an LPDDRx-compatible MRAM solution: ComboAS to solve the pin incompatibility; DynLat to avoid unnecessary access latencies; and EarlyPA and BufW to further improve performance by exploiting MRAM's unique features of non-destructive read and an independent write path. Combining all these techniques, we boost MRAM performance by 17% and provide a DRAM-compatible MRAM solution that consumes 21% less energy.
    ACM Transactions on Architecture and Code Optimization 12/2014; 11(4):1-22. DOI:10.1145/2667105 · 0.60 Impact Factor
  • ABSTRACT: High-end computing systems are expected to scale from petascale to exascale over the next decade. We describe requirements and architectures for high-bandwidth interconnects and high-radix switches based on photonics that could enable this performance growth.
    2011 IEEE Photonics Society Summer Topical Meeting Series; 07/2011
  • IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems 01/2015; DOI:10.1109/TCAD.2015.2402172 · 1.20 Impact Factor
