Jiaqi Gu
University of Texas at Austin · Department of Electrical & Computer Engineering

Bachelor of Engineering

About

98 Publications
5,873 Reads
1,534 Citations
Education
September 2018 - May 2023
University of Texas at Austin
Field of study
  • Electrical and Computer Engineering
September 2014 - June 2018
Fudan University
Field of study
  • Microelectronics Science and Engineering

Publications (98)
Preprint
Electronic-photonic integrated circuits (EPICs) offer transformative potential for next-generation high-performance AI but require interdisciplinary advances across devices, circuits, architecture, and design automation. The complexity of hybrid systems makes it challenging even for domain experts to understand distinct behaviors and interactions a...
Article
In recent decades, the demand for computational power has surged, particularly with the rapid expansion of artificial intelligence (AI). As we navigate the post-Moore's law era, the limitations of traditional electrical digital computing, including process bottlenecks and power consumption issues, are propelling the search for alternative computing...
Preprint
Nanophotonic device design aims to optimize photonic structures to meet specific requirements across various applications. Inverse design has unlocked non-intuitive, high-dimensional design spaces, enabling the discovery of high-performance devices beyond heuristic or analytic methods. The adjoint method, which calculates gradients for all variable...
Preprint
Electromagnetic field simulation is central to designing, optimizing, and validating photonic devices and circuits. However, costly computation associated with numerical simulation poses a significant bottleneck, hindering scalability and turnaround time in the photonic circuit design process. Neural operators offer a promising alternative, but exi...
Article
Full-text available
Optimization methods are frequently exploited in the design of silicon photonic devices. In this paper, we demonstrate that pushing the objective function to its minimum during optimization often results in devices that gradually become more sensitive to perturbations of design variables. The dominant strategy of selecting the design with the small...
Preprint
The finite-difference time-domain (FDTD) method, an important step in the photonic hardware design flow, is widely adopted to solve the time-domain Maxwell equations. However, FDTD is known for its prohibitive runtime cost, taking minutes to hours to simulate a single device. Recently, AI has been applied to realize orders-of-magnitude speedup in partial...
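To make the cost concrete, the following is a minimal 1-D FDTD time-stepping loop in NumPy (an illustrative Yee-scheme sketch in normalized units, not the paper's simulator); realistic 3-D device simulations scale this leapfrog update to millions of grid cells and many more time steps, which is where the minutes-to-hours runtime comes from.

    import numpy as np

    # 1-D FDTD sketch (Yee scheme, normalized units). Ez and Hy live on
    # staggered grids and are updated in a leapfrog fashion in time.
    nx, nt = 400, 1000
    ez = np.zeros(nx)
    hy = np.zeros(nx - 1)
    c = 0.5  # Courant number (c0*dt/dx); must stay <= 1 for stability

    for t in range(nt):
        hy += c * np.diff(ez)            # update H from the curl of E
        ez[1:-1] += c * np.diff(hy)      # update E from the curl of H
        ez[nx // 2] += np.exp(-((t - 30.0) / 10.0) ** 2)  # soft Gaussian source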
Article
Full-text available
Photonic computing shows promise for transformative advancements in machine learning (ML) acceleration, offering ultrafast speed, massive parallelism, and high energy efficiency. However, current photonic tensor core (PTC) designs based on standard optical components hinder scalability and compute density due to their large spatial footprint. To ad...
Article
Full-text available
Optical neural networks (ONNs) are promising hardware platforms for next-generation neuromorphic computing due to their high parallelism, low latency, and low energy consumption. However, previous integrated photonic tensor cores (PTCs) consume numerous single-operand optical modulators for signal and weight encoding, leading to large area costs an...
Conference Paper
We present a hardware-efficient optical computing architecture for structured neural networks (OSNNs). The performance of our neural chip was validated experimentally on a photonic-electronic testing platform, demonstrating reduced optical component utilization and small deviation.
Conference Paper
Employing integrated photonic chip-based ONNs for early pancreatic cancer detection, we achieved an 80% Dice score, demonstrating an efficient, high-speed alternative to traditional electrical training systems for medical imaging.
Preprint
The optical neural network (ONN) is a promising hardware platform for next-generation neuromorphic computing due to its high parallelism, low latency, and low energy consumption. However, previous integrated photonic tensor cores (PTCs) consume numerous single-operand optical modulators for signal and weight encoding, leading to large area costs an...
Preprint
Photonic computing shows promise for transformative advancements in machine learning (ML) acceleration, offering ultra-fast speed, massive parallelism, and high energy efficiency. However, current photonic tensor core (PTC) designs based on standard optical components hinder scalability and compute density due to their large spatial footprint. To a...
Preprint
The wide adoption and significant computing resource consumption of attention-based Transformers, e.g., Vision Transformer and large language models, have driven the demands for efficient hardware accelerators. While electronic accelerators have been commonly used, there is a growing interest in exploring photonics as an alternative technology due...
Preprint
Transformers have achieved great success in machine learning applications. Normalization techniques, such as Layer Normalization (LayerNorm, LN) and Root Mean Square Normalization (RMSNorm), play a critical role in accelerating and stabilizing the training of Transformers. While LayerNorm recenters and rescales input vectors, RMSNorm only rescales...
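The contrast between the two normalizations is compact in code; a minimal PyTorch sketch (illustrative only, not the paper's implementation):

    import torch

    def layer_norm(x, g, b, eps=1e-6):
        # LayerNorm: recenter (subtract mean) and rescale (divide by std)
        mu = x.mean(-1, keepdim=True)
        var = x.var(-1, unbiased=False, keepdim=True)
        return g * (x - mu) / torch.sqrt(var + eps) + b

    def rms_norm(x, g, eps=1e-6):
        # RMSNorm: only rescale by the root mean square; no mean subtraction
        rms = torch.sqrt(x.pow(2).mean(-1, keepdim=True) + eps)
        return g * x / rms

    x = torch.randn(2, 8)
    g, b = torch.ones(8), torch.zeros(8)
    print(layer_norm(x, g, b).mean(-1))  # ~0 everywhere: recentered
    print(rms_norm(x, g).mean(-1))       # generally nonzero: not recentered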
Conference Paper
We deploy a compact butterfly-style photonic-electronic neural chip on ResNet-20 and achieve >85% measured accuracy on the CIFAR-10 dataset with only 3-bit weight programming resolution, showing its practicality for implementing complicated deep learning tasks.
Preprint
Transformers have attained superior performance in natural language processing and computer vision. Their self-attention and feedforward layers are overparameterized, limiting inference speed and energy efficiency. Tensor decomposition is a promising technique to reduce parameter redundancy by leveraging tensor algebraic properties to express the p...
Article
Transformers have attained superior performance in natural language processing and computer vision. Their self-attention and feedforward layers are overparameterized, limiting inference speed and energy efficiency. Tensor decomposition is a promising technique to reduce parameter redundancy by leveraging tensor algebraic properties to express the p...
Preprint
Full-text available
Among different quantum algorithms, parameterized quantum circuits (PQCs) for quantum machine learning (QML) show promise on near-term devices. To facilitate QML and PQC research, a recent Python library called TorchQuantum has been released. It can construct, simulate, and train PQCs for machine learning tasks with high speed and convenient debugging support. Besides quantum for ML, we want to raise th...
Preprint
Optical computing is an emerging technology for next-generation efficient artificial intelligence (AI) due to its ultra-high speed and efficiency. Electromagnetic field simulation is critical to the design, optimization, and validation of photonic devices and circuits. However, costly numerical simulation significantly hinders the scalability and t...
Preprint
Analog computing has been recognized as a promising low-power alternative to digital counterparts for neural network acceleration. However, conventional analog computing operates mainly in a mixed-signal manner, and the tedious analog/digital (A/D) conversion cost significantly limits the overall system's energy efficiency. In this work, we devise an efficient a...
Preprint
As deep learning models and datasets rapidly scale up, network training is extremely time-consuming and resource-costly. Instead of training on the entire dataset, learning with a small synthetic dataset becomes an efficient solution. Extensive research has explored the direction of dataset condensation, among which gradient matching achiev...
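For intuition, gradient matching optimizes synthetic samples so that the training gradient they induce mimics the gradient of real data. A toy PyTorch sketch under assumed settings (a hypothetical linear model with four synthetic samples; not the paper's setup):

    import torch
    import torch.nn.functional as F

    torch.manual_seed(0)
    w = torch.randn(10, 2, requires_grad=True)       # toy linear classifier
    real_x, real_y = torch.randn(64, 10), torch.randint(0, 2, (64,))
    syn_x = torch.randn(4, 10, requires_grad=True)   # 4 learnable samples
    syn_y = torch.tensor([0, 0, 1, 1])
    opt = torch.optim.SGD([syn_x], lr=0.1)

    for _ in range(100):
        g_real = torch.autograd.grad(
            F.cross_entropy(real_x @ w, real_y), w)[0].detach()
        g_syn = torch.autograd.grad(
            F.cross_entropy(syn_x @ w, syn_y), w, create_graph=True)[0]
        # drive the synthetic gradient toward the real one
        loss = 1 - F.cosine_similarity(g_real.flatten(), g_syn.flatten(), dim=0)
        opt.zero_grad(); loss.backward(); opt.step()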
Preprint
Analog/mixed-signal circuit design is one of the most complex and time-consuming stages in the whole chip design process. Due to various process, voltage, and temperature (PVT) variations from chip manufacturing, analog circuits inevitably suffer from performance degradation. Although there has been plenty of work on automating analog circuit desig...
Article
In the post-Moore era, conventional electronic digital computing platforms have encountered escalating challenges in supporting massively parallel and energy-hungry artificial intelligence (AI) workloads. Intelligent applications in data centers, edge devices, and autonomous vehicles have strict requirements on throughput, power, and latency, wh...
Preprint
Full-text available
Quantum Neural Network (QNN) is drawing increasing research interest thanks to its potential to achieve quantum advantage on near-term Noisy Intermediate Scale Quantum (NISQ) hardware. In order to achieve scalable QNN learning, the training process needs to be offloaded to real quantum machines instead of using exponential-cost classical simulators...
Article
Optical phase change material (PCM) has emerged as a promising way to enable photonic in-memory neurocomputing in optical neural network (ONN) designs. However, massive photonic tensor core (PTC) reuse is required to implement large matrix multiplication due to the limited single-core scale. The resultant large number of PCM writes during inference incurs s...
Article
Optical neural networks (ONNs) are promising hardware platforms for next-generation artificial intelligence acceleration with ultrafast speed and low energy consumption. However, previous ONN designs are bounded by one multiply–accumulate operation per device, showing unsatisfactory scalability. In this work, we propose a scalable ONN architecture, d...
Conference Paper
We propose a 2-bit electronic-photonic shifter based on microdisk add-drop switches with experimental demonstrations. The proposed shifter can be deployed in future high speed and energy efficient electronic-photonic arithmetic logic units.
Preprint
Photonic tensor cores (PTCs) are essential building blocks for optical artificial intelligence (AI) accelerators based on programmable photonic integrated circuits. PTCs can achieve ultra-fast and efficient tensor operations for neural network (NN) acceleration. Current PTC designs are either manually constructed or based on matrix decomposition th...
Preprint
With the recent advances in optical phase change material (PCM), photonic in-memory neurocomputing has demonstrated its superiority in optical neural network (ONN) designs with near-zero static power consumption, time-of-light latency, and compact footprint. However, photonic tensor cores require massive hardware reuse to implement large matrix mul...
Preprint
As deep learning has shown revolutionary performance in many artificial intelligence applications, its escalating computation demand requires hardware accelerators for massive parallelism and improved throughput. The optical neural network (ONN) is a promising candidate for next-generation neurocomputing due to its high parallelism, low latency, an...
Preprint
Vision transformers (ViTs) have attracted much attention for their superior performance on computer vision tasks. To address their limitations of single-scale low-resolution representations, prior work adapts ViTs to high-resolution dense prediction tasks with hierarchical architectures to generate pyramid features. However, multi-scale representat...
Preprint
The silicon-photonics-based optical neural network (ONN) is a promising hardware platform that could represent a paradigm shift in efficient AI with its CMOS compatibility, flexibility, ultra-low execution latency, and high energy efficiency. In-situ training on online-programmable photonic chips is appealing but still encounters challenging issues...
Preprint
Full-text available
Quantum Neural Network (QNN) is a promising application towards quantum advantage on near-term quantum hardware. However, due to the large quantum noises (errors), the performance of QNN models has a severe degradation on real quantum devices. For example, the accuracy gap between noise-free simulation and noisy results on IBMQ-Yorktown for MNIST-4...
Preprint
Discrete cosine transform (DCT) and other Fourier-related transforms have broad applications in scientific computing. However, off-the-shelf high-performance multi-dimensional DCT (MD DCT) libraries are not readily available in parallel computing systems. Public MD DCT implementations leverage a straightforward method that decomposes the computatio...
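The straightforward decomposition referred to above is the separable one: an N-D DCT is a 1-D DCT applied along each axis in turn. A short SciPy illustration:

    import numpy as np
    from scipy.fft import dct, dctn

    x = np.random.rand(32, 32, 32)

    # Apply a 1-D type-II DCT along each axis in turn...
    y = x
    for axis in range(x.ndim):
        y = dct(y, type=2, axis=axis, norm='ortho')

    # ...which matches SciPy's native multi-dimensional routine.
    assert np.allclose(y, dctn(x, type=2, norm='ortho'))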
Preprint
Deep neural networks (DNN) have shown superior performance in a variety of tasks. As they rapidly evolve, their escalating computation and memory demands make it challenging to deploy them on resource-constrained edge devices. Though extensive efficient accelerator designs, from traditional electronics to emerging photonics, have been successfully...
Preprint
Full-text available
Quantum noise is the key challenge in Noisy Intermediate-Scale Quantum (NISQ) computers. Limited research efforts have explored a higher level of optimization by making the quantum circuit resilient to noise. We propose and experimentally implement QuantumNAS, the first comprehensive framework for noise-adaptive co-search of variational circuit and...
Article
Integrated photonics has shown extraordinary potential in optical computing. Various optical computing modules have been investigated, among which the digital comparator is a significant element for any arithmetic logic unit. In this study, an architecture of a wavelength‐division‐multiplexing‐based (WDM‐based) electronic–photonic digital comparato...
Article
Optical neural networks (ONNs) have demonstrated record-breaking potential in high-performance neuromorphic computing due to their ultra-high execution speed and low energy consumption. However, current learning protocols fail to provide scalable and efficient solutions to photonic circuit optimization in practical applications. In this work, we pr...
Preprint
Machine learning frameworks adopt iterative optimizers to train neural networks. Conventional eager execution separates the updating of trainable parameters from forward and backward computations. However, this approach introduces nontrivial training time overhead due to the lack of data locality and computation parallelism. In this work, we propos...
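One concrete way to realize such fusion (a sketch built on PyTorch's per-parameter post-accumulate-grad hooks; the paper's actual mechanism may differ) is to step each parameter's optimizer the moment its gradient is produced during backward, instead of in a separate update loop afterwards:

    import torch

    model = torch.nn.Sequential(torch.nn.Linear(32, 64), torch.nn.ReLU(),
                                torch.nn.Linear(64, 10))
    # one tiny optimizer per parameter so each can step independently
    opts = {p: torch.optim.SGD([p], lr=0.01) for p in model.parameters()}

    def fused_step(p):
        opts[p].step()       # update this parameter immediately...
        opts[p].zero_grad()  # ...and free its gradient right away

    for p in model.parameters():
        p.register_post_accumulate_grad_hook(fused_step)

    x, y = torch.randn(8, 32), torch.randint(0, 10, (8,))
    loss = torch.nn.functional.cross_entropy(model(x), y)
    loss.backward()          # parameters are updated during backward itself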
Conference Paper
We propose and experimentally demonstrate a 3-to-8 wavelength-division-multiplexing (WDM) based optical decoder using microring-based add-drop switches and filters. The proposed decoder has a smaller footprint and consumes less power compared with previous designs.
Preprint
Optical neural networks (ONNs) have demonstrated record-breaking potential in high-performance neuromorphic computing due to their ultra-high execution speed and low energy consumption. However, current learning protocols fail to provide scalable and efficient solutions to photonic circuit optimization in practical applications. In this work, we pr...
Preprint
Full-text available
Logic synthesis is a fundamental step in hardware design whose goal is to find structural representations of Boolean functions while minimizing delay and area. If the function is completely-specified, the implementation accurately represents the function. If the function is incompletely-specified, the implementation has to be true only on the care...
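To make "true only on the care set" concrete, a tiny hypothetical example (not from the paper): a 3-input function specified on only four minterms leaves the other four as don't-cares, and any implementation that matches the care set is a valid realization.

    # minterm -> value; the four unlisted minterms are don't-cares
    care = {0b000: 0, 0b011: 1, 0b101: 1, 0b110: 1}
    impl = lambda x: bin(x).count("1") >= 2   # a candidate "majority" cover
    # valid iff it agrees with the specification on every care-set minterm
    assert all(impl(m) == bool(v) for m, v in care.items())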
Article
Full-text available
The recent rapid progress in integrated photonics has catalyzed the development of integrated optical computing in this post-Moore's law era. Electronic-photonic digital computing, as a new paradigm to achieve high-speed and power-efficient computation, has begun to attract attention. In this paper, we systematically investigate the optical sequent...
Chapter
As machine learning models and datasets rapidly escalate in scale, the huge memory footprint impedes efficient training. Reversible operators can reduce memory consumption by discarding intermediate feature maps in forward computations and recovering them via their inverse functions in backward propagation. They save memory at the cost of computat...
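A minimal sketch of the idea with an additive coupling block (illustrative PyTorch, not the chapter's code): the block's outputs alone reconstruct its inputs exactly, so intermediate activations can be discarded in the forward pass and recomputed on demand in the backward pass.

    import torch

    f = torch.nn.Linear(4, 4)
    g = torch.nn.Linear(4, 4)

    def forward(x1, x2):
        y1 = x1 + f(x2)
        y2 = x2 + g(y1)
        return y1, y2

    def inverse(y1, y2):
        x2 = y2 - g(y1)   # exact inverse: recovers the discarded inputs
        x1 = y1 - f(x2)
        return x1, x2

    x1, x2 = torch.randn(2, 4), torch.randn(2, 4)
    r1, r2 = inverse(*forward(x1, x2))
    assert torch.allclose(x1, r1, atol=1e-6) and torch.allclose(x2, r2, atol=1e-6)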
Article
As a promising neuromorphic framework, the optical neural network (ONN) demonstrates ultra-high inference speed with low energy consumption. However, previous ONN architectures have high area overhead, which limits their practicality. In this paper, we propose an area-efficient ONN architecture based on structured neural networks, leveraging opti...
Article
Full-text available
Integrated photonics offers attractive solutions for realizing combinational logic for high-performance computing. The integrated photonic chips can be further optimized using multiplexing techniques such as wavelength-division multiplexing (WDM). In this paper, we propose a WDM-based electronic-photonic switching network (EPSN) to realize the func...
Article
Placement for very large-scale integrated (VLSI) circuits is one of the most important steps for design closure. We propose a novel GPU-accelerated placement framework, DREAMPlace, by casting the analytical placement problem equivalently to training a neural network. Implemented on top of the widely adopted deep learning toolkit PyTorch, with custo...
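The casting is easy to illustrate in miniature (a toy sketch, not DREAMPlace itself; the real framework adds density penalties, legalization, and custom CUDA kernels): treat cell coordinates as trainable parameters and minimize a smooth wirelength surrogate with a stock deep-learning optimizer.

    import torch

    torch.manual_seed(0)
    pos = torch.nn.Parameter(torch.rand(100, 2))             # 100 movable cells
    nets = [torch.randint(0, 100, (4,)) for _ in range(50)]  # random 4-pin nets
    gamma = 0.05                                             # smoothing factor

    def wirelength(p):
        # log-sum-exp approximation of half-perimeter wirelength per net
        total = 0.0
        for net in nets:
            xy = p[net]
            total = total + gamma * (
                torch.logsumexp(xy / gamma, dim=0)
                + torch.logsumexp(-xy / gamma, dim=0)).sum()
        return total

    opt = torch.optim.Adam([pos], lr=0.01)
    for _ in range(200):
        opt.zero_grad()
        wirelength(pos).backward()
        opt.step()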
Article
Full-text available
The past two decades have witnessed the stagnation of the clock speed of microprocessors followed by the recent faltering of Moore’s law as nanofabrication technology approaches its unavoidable physical limit. Vigorous efforts from various research areas have been made to develop power-efficient and ultrafast computing machines in this post-Moore’s...
Article
Placement is an important step in modern very-large-scale integrated (VLSI) designs. Detailed placement is a placement refining procedure intensively called throughout the design flow, thus its efficiency has a vital impact on design closure. However, since most detailed placement techniques are inherently greedy and sequential, they are generally...