# WiPLASH: Architecting More Than Moore – Wireless Plasticity for Heterogeneous Massive Computer Architectures

Goal: The main design principles in computer architecture have shifted from a monolithic, scaling-driven approach towards heterogeneous architectures that tightly co-integrate multiple specialized computing and memory units. This shift is motivated by the urgent need for very high parallelism and by energy constraints. Such heterogeneous hardware specialization requires interconnection mechanisms that integrate the architecture. State-of-the-art approaches are 3D stacking and 2D architectures complemented with a Network-on-Chip (NoC) to interconnect the components. However, such interconnects are fundamentally monolithic and rigid, and are unable to provide the efficiency and architectural flexibility required by current and future key ICT applications. The main challenge is to introduce diversification and specialization in heterogeneous processor architectures while ensuring their generality and scalability.

In order to achieve this, the WiPLASH project aims to pioneer an on-chip wireless communication plane able to provide architectural plasticity, reconfigurability and adaptation to the application requirements with near-ASIC efficiency but without any loss of generality. For this, the WiPLASH consortium will provide solid experimental foundations of the key enablers of on-chip wireless communication at the functional unit level as well as their technological and architectural integration. The main goals are: (i) prototype a miniaturized and tunable graphene antenna in the terahertz band, (ii) co-integrate graphene RF components with submillimeter-wave transceivers and (iii) demonstrate low-power reconfigurable wireless chip-scale networks. The culminating goal is to demonstrate that the wireless plane offers the plasticity required by future computing platforms by improving at least one key application (mainly biologically-plausible deep learning architectures) by 10X in terms of execution speed and energy-delay product over a state-of-the-art baseline.


## Project log

Hyperdimensional computing (HDC) is an emerging computing paradigm that represents, manipulates, and communicates data using very long random vectors (aka hypervectors). Among the hardware platforms capable of executing HDC algorithms, in-memory computing (IMC) systems have recently been shown to be one of the most energy-efficient options, because hypervector manipulations take place in the memory itself, which reduces data movement. Although HDC has been implemented on single IMC cores, its parallelization is still unresolved due to the communication challenges that these novel architectures impose and that traditional Networks-on-Chip and Networks-in-Package were not designed for. To cope with this difficulty, we propose the use of wireless on-chip communication technology in unique ways. We are particularly interested in physically distributing a large number of IMC cores performing similarity search across a chip, and in maintaining the classification accuracy when each of them is queried with a slightly different version of a bundled hypervector. To achieve this, we introduce a novel over-the-air computing scheme that consists of defining different binary decision regions in the receivers so as to compute the logical majority operation (i.e., bundling, or superposition) required in HDC. It introduces the moderate overhead of a single antenna and receiver per IMC core. By doing so, we achieve joint broadcast distribution and computation with a performance and efficiency unattainable with wired interconnects, which in turn enables massive parallelization of the architecture. It is demonstrated that the proposed approach makes it possible both to bundle at least three hypervectors and to scale similarity search to 64 IMC cores seamlessly, while incurring an average bit error ratio of 0.01 without any impact on the accuracy of a generic HDC-based classifier working with 512-bit vectors.
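The logical operations named in this abstract (majority bundling and similarity search) can be illustrated with a minimal sketch. It models only the bit-level logic, not the over-the-air decision regions of the paper. The prototype count, the independent bit-flip channel model, and all function names are assumptions made for illustration; the 512-bit width, the three bundled hypervectors, the 64 cores, and the 0.01 bit error ratio are taken from the abstract.

```python
import random

DIM = 512  # hypervector width mentioned in the abstract

def random_hv(rng, dim=DIM):
    """Draw a random binary hypervector."""
    return [rng.randint(0, 1) for _ in range(dim)]

def bundle(hvs):
    """Bitwise majority over an odd number of binary hypervectors."""
    assert len(hvs) % 2 == 1, "majority needs an odd count to avoid ties"
    half = len(hvs) // 2
    return [1 if sum(bits) > half else 0 for bits in zip(*hvs)]

def flip_bits(hv, ber, rng):
    """Flip each bit independently with probability ber (assumed toy channel)."""
    return [b ^ 1 if rng.random() < ber else b for b in hv]

def hamming(a, b):
    """Number of positions in which two hypervectors differ."""
    return sum(x != y for x, y in zip(a, b))

def nearest(query, prototypes):
    """Similarity search: index of the prototype closest in Hamming distance."""
    return min(range(len(prototypes)), key=lambda i: hamming(query, prototypes[i]))

rng = random.Random(0)
prototypes = [random_hv(rng) for _ in range(10)]  # hypothetical class prototypes
query = bundle(prototypes[:3])                    # bundle three hypervectors
# Each of 64 cores receives its own slightly corrupted copy of the query.
noisy_queries = [flip_bits(query, 0.01, rng) for _ in range(64)]
# Count the cores whose nearest prototype is one of the bundled constituents.
hits = sum(nearest(q, prototypes) in (0, 1, 2) for q in noisy_queries)
print(f"{hits} of {len(noisy_queries)} cores matched a bundled constituent")
```

In this toy setting a bundled vector stays much closer to its three constituents (about a quarter of the bits differ) than to unrelated prototypes (about half differ), which is why a 1% bit error ratio leaves the search result essentially unchanged.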
Analog In-Memory Computing (AIMC) is emerging as a disruptive paradigm for heterogeneous computing, potentially delivering orders of magnitude better peak performance and efficiency than traditional digital signal processing architectures on matrix-vector multiplication. However, to sustain this throughput in real-world applications, AIMC tiles must be supplied with data at very high bandwidth and low latency; this puts unprecedented pressure on the on-chip communication infrastructure, which becomes the system's performance and efficiency bottleneck. In this context, the performance and plasticity of emerging on-chip wireless communication paradigms provide the required breakthrough to up-scale on-chip communication in large AIMC devices. This work presents a many-tile AIMC architecture with inter-tile wireless communication that integrates multiple heterogeneous computing clusters, embedding a mix of parallel RISC-V cores and AIMC tiles. We perform an extensive design space exploration of the proposed architecture and discuss the benefits of exploiting emerging on-chip communication technologies such as wireless transceivers in the millimeter-wave and terahertz bands.
This paper brings the concept of smart radio environments, currently intensely studied for wireless communication in metasurface‐programmable meter‐scaled environments (e.g., inside rooms), to the chip scale. Wireless networks‐on‐chips (WNoCs) are a candidate technology to improve inter‐core communication on chips, but current proposals are plagued by a dilemma: either the received signal is weak, or it is significantly reverberated such that the on–off‐keying modulation speed must be throttled. Here, this vexing problem is overcome by endowing the wireless on‐chip environment with in situ programmability, which enables the shaping of the channel impulse response (CIR); thereby, a pulse‐like CIR shape can be imposed despite strong multipath propagation and without entailing a reduced received signal strength. First, a programmable metasurface suitable for integration in the on‐chip environment (“on‐chip reconfigurable intelligent surface”) is designed and characterized. Second, its configuration is optimized to equalize selected wireless on‐chip channels “over the air.” Third, by conducting a rigorous communication analysis, the feasibility of significantly higher modulation speeds with shaped CIRs is evidenced. The results introduce a programmability paradigm to WNoCs which boosts their competitiveness as a complementary on‐chip interconnect solution. A programmable metasurface is included inside a chip package, and suitable metasurface configurations are identified that equalize wireless channels on the chip over‐the‐air to mitigate inter‐symbol interference. The largely improved data transfer rates boost the competitiveness of wireless networks‐on‐chips (WNoCs) as a complementary interconnect technology. WNoCs aim to avert the risk of communication‐limited performance of multicore chips.
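Why a reverberated channel throttles on-off keying, and why a pulse-like CIR helps, can be seen in a noise-free toy sketch. The tap values, the bit pattern, the symbol-spaced channel model, and the fixed mid-level threshold are all hypothetical choices made for illustration and are not taken from the paper's communication analysis.

```python
def transmit(bits, cir):
    """Convolve on-off-keyed symbols with a symbol-spaced channel impulse response."""
    out = [0.0] * len(bits)
    for n in range(len(bits)):
        for k, tap in enumerate(cir):
            if n - k >= 0:
                out[n] += tap * bits[n - k]
    return out

def detect(samples, threshold):
    """Hard-decision detector: a sample above the threshold is read as 1."""
    return [1 if s > threshold else 0 for s in samples]

def bit_errors(bits, cir):
    """Count decision errors caused purely by inter-symbol interference."""
    threshold = cir[0] / 2  # midpoint between "off" and an isolated "on" symbol
    received = detect(transmit(bits, cir), threshold)
    return sum(a != b for a, b in zip(bits, received))

bits = [1, 1, 1, 0, 1, 0, 0, 1, 1, 0, 0, 0, 1, 1, 1, 0]

# Hypothetical tap sets: energy trailing over later symbols vs. a pulse-like response.
reverberant_cir = [1.0, 0.45, 0.3, 0.15]
shaped_cir = [1.0, 0.05, 0.02, 0.01]

print("errors with reverberant CIR:", bit_errors(bits, reverberant_cir))
print("errors with shaped CIR:", bit_errors(bits, shaped_cir))
```

With the reverberant taps, the echoes of preceding "on" symbols can push an "off" symbol above the threshold; with the pulse-like taps the residual interference stays well below it. Lowering the symbol rate would have a similar effect to shortening the taps, which is the throttling the abstract refers to.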
Diodes made of heterostructures of the 2D material graphene and conventional 3D materials are reviewed in this manuscript. Several applications in high frequency electronics and optoelectronics are highlighted. In particular, advantages of metal–insulator–graphene (MIG) diodes over conventional metal–insulator–metal diodes are discussed with respect to relevant figures‐of‐merit. The MIG concept is extended to 1D diodes. Several experimentally implemented radio frequency circuit applications with MIG diodes as active elements are presented. Furthermore, graphene‐silicon Schottky diodes as well as MIG diodes are reviewed in terms of their potential for photodetection. Here, graphene‐based diodes have the potential to outperform conventional photodetectors in several key figures‐of‐merit, such as overall responsivity or dark current levels. Obviously, advantages in some areas may come at the cost of disadvantages in others, so that 2D/3D diodes need to be tailored in application‐specific ways. Diodes made of heterostructures of the 2D material graphene and conventional 3D materials are reviewed in this article. In particular, metal–insulator–graphene diodes and graphene‐silicon Schottky diodes are discussed with relevant figures‐of‐merit. Several applications in high frequency electronics and optoelectronics are highlighted, such as power detectors, mixers, frequency doublers, receivers, and photodetectors.
This work presents the design, implementation, and characterization of the first thin-film integrated tunable microwave harmonic generator. The design is realized by exploiting the nonlinearity of four chemical vapor deposition (CVD) graphene-based diodes arranged in a nonlinear transmission-line (NLTL) approach. The used thin-film monolithic microwave integrated circuit (MMIC) technology is substrate independent. The fabricated prototype is realized on a 500-$\mu$m transparent quartz substrate and occupies less than 1.2 mm$^2$ of chip area including pads. Measurement results show a wide input frequency range from 0 to 2.8 GHz with measured $S_{11}$ better than −10 dB. The measured second-harmonic conversion gain (CG) for an output frequency of 3.4 GHz is −21.6 dB. The measured third-harmonic CG for an output frequency of 3.15 GHz is −31 dB. To the best knowledge of the authors, the proposed circuit is the first tunable microwave harmonic generator combining graphene diodes in a NLTL topology on thin-film technology.
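How a diode nonlinearity produces harmonics can be sketched with a textbook memoryless model. This is a simplification for illustration only, not the NLTL circuit analysis of the paper. The polynomial coefficients $a_1, a_2, a_3$, the drive amplitude $V$, and the power symbols $P_{\text{in}}$ and $P_{\text{out}}$ are assumed symbols that do not appear in the abstract. For a current $i(v) = a_1 v + a_2 v^2 + a_3 v^3$ driven by $v(t) = V\cos\omega t$:

$$
i(t) = \frac{a_2 V^2}{2}
     + \left(a_1 V + \frac{3 a_3 V^3}{4}\right)\cos\omega t
     + \frac{a_2 V^2}{2}\cos 2\omega t
     + \frac{a_3 V^3}{4}\cos 3\omega t
$$

The quadratic term feeds the second harmonic and the cubic term feeds the third. Using the usual definition of conversion gain for the $n$-th harmonic,

$$
\mathrm{CG}_n = 10\log_{10}\frac{P_{\text{out}}(n f_{\text{in}})}{P_{\text{in}}(f_{\text{in}})}\ \text{dB},
$$

the reported output frequencies correspond to input tones of $3.4/2 = 1.7$ GHz for the second harmonic and $3.15/3 = 1.05$ GHz for the third, both inside the stated 0 to 2.8 GHz input range.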
Deep neural network (DNN) models continue to grow in size and complexity, demanding higher computational power to enable real-time inference. To meet such computational demands efficiently, hardware accelerators are being developed and deployed across scales. This naturally requires an efficient scale-out mechanism for increasing compute density as required by the application. 2.5D integration over interposer has emerged as a promising solution, but as we show in this work, the limited interposer bandwidth and multiple hops in the Network-on-Package (NoP) can diminish the benefits of the approach. To cope with this challenge, we propose WIENNA, a wireless NoP-based 2.5D DNN accelerator. In WIENNA, the wireless NoP connects an array of DNN accelerator chiplets to the global buffer chiplet, providing high-bandwidth multicasting capabilities. Here, we also identify the dataflow style that most efficiently exploits the wireless NoP's high-bandwidth multicasting capability on each layer. With modest area and power overheads, WIENNA achieves 2.2X-5.1X higher throughput and 38.2% lower energy than an interposer-based NoP design.
The main design principles in computer architecture have recently shifted from a monolithic scaling-driven approach to the development of heterogeneous architectures that tightly co-integrate multiple specialized processor and memory chiplets. In such data-hungry multi-chip architectures, current Networks-in-Package (NiPs) may not be enough to cater to their heterogeneous and fast-changing communication demands. This position paper makes the case for wireless in-package nanonetworking as the enabler of efficient and versatile wired-wireless interconnect fabrics for massive heterogeneous processors. To that end, the use of graphene-based antennas and transceivers with unique frequency-beam reconfigurability in the terahertz band is proposed. The feasibility of such a nanonetworking vision and the main research challenges towards its realization are analyzed from the technological, communications, and computer architecture perspectives.