Zishen Wan

Doctor of Philosophy

About

66
Publications
7,315
Reads
558
Citations
Introduction
Personal website: https://zishenwan.github.io. I’m a PhD student in Electrical and Computer Engineering at Georgia Tech. My research interests span from high-level algorithms to low-level computer architecture and VLSI, focusing on building efficient and reliable hardware and systems for autonomous machines, edge intelligence, and ML. Before joining Georgia Tech, I received an M.S. from Harvard University in 2020 and a B.S. from Harbin Institute of Technology in 2018, both in Electrical Engineering.

Publications (66)
Preprint
Full-text available
Recent research on robotics has shown significant improvement, spanning from algorithms and mechanics to hardware architectures. Robots, including manipulators, legged robots, drones, and autonomous vehicles, are now widely applied in diverse scenarios. However, the high computation and data complexity of robotic algorithms poses great challenges...
Preprint
Full-text available
Reliability and safety are critical in autonomous machine services, such as autonomous vehicles and aerial drones. In this paper, we first present an open-source Micro Aerial Vehicle (MAV) reliability analysis framework, MAVFI, to characterize the impact of transient faults on end-to-end flight metrics, e.g., flight time and success rate. Based on ou...
Preprint
Full-text available
Quantization can help reduce the memory, compute, and energy demands of deep neural networks without significantly harming their quality. However, whether these prior techniques, applied traditionally to image-based models, work with the same efficacy for the sequential decision-making process in reinforcement learning remains an unanswered question...
Article
This book provides a thorough overview of state-of-the-art field-programmable gate array (FPGA)-based robotic computing accelerator designs and summarizes the optimization techniques they adopt. This book consists of ten chapters, delving into the details of how FPGAs have been utilized in robotic perception, localization, planning, and multi-robot...
Preprint
The next ubiquitous computing platform, following personal computers and smartphones, is poised to be inherently autonomous, encompassing technologies like drones, robots, and self-driving cars. Ensuring reliability for these autonomous machines is critical. However, current resiliency solutions make fundamental trade-offs between reliability and c...
Preprint
Full-text available
The remarkable advancements in artificial intelligence (AI), primarily driven by deep neural networks, are facing challenges surrounding unsustainable computational trajectories, limited robustness, and a lack of explainability. To develop next-generation cognitive AI systems, neuro-symbolic AI emerges as a promising paradigm, fusing neural and sym...
Article
The remarkable advancements in artificial intelligence (AI), primarily driven by deep neural networks, are facing challenges surrounding unsustainable computational trajectories, limited robustness, and a lack of explainability. To develop next-generation cognitive AI systems, neuro-symbolic AI emerges as a promising paradigm, fusing neural and sym...
Article
Moving toward reliable autonomous machines.
Article
The prediction accuracy of deep neural networks (DNNs) deployed at the edge can deteriorate over time due to shifts in the data distribution. For heightened robustness, it’s crucial for DNNs to continually refine and improve their predictive capabilities. However, adaptation in resource-limited edge environments is fraught with challenges: (i) new...
Conference Paper
Full-text available
The remarkable advancements in artificial intelligence (AI), primarily driven by deep neural networks, are facing challenges surrounding unsustainable computational trajectories, limited robustness, and a lack of explainability. To develop next-generation cognitive AI systems, neuro-symbolic AI emerges as a promising paradigm, fusing neural and sym...
Preprint
Full-text available
We introduce RobotPerf, a vendor-agnostic benchmarking suite designed to evaluate robotics computing performance across a diverse range of hardware platforms using ROS 2 as its common baseline. The suite encompasses ROS 2 packages covering the full robotics pipeline and integrates two distinct benchmarking approaches: black-box testing, which meas...
Preprint
Full-text available
Autonomous systems, such as Unmanned Aerial Vehicles (UAVs), are expected to run complex reinforcement learning (RL) models to execute fully autonomous position-navigation-time tasks within stringent onboard weight and power constraints. We observe that reducing onboard operating voltage can benefit the energy efficiency of both the computation and...
Preprint
Full-text available
While deep neural networks are being utilized heavily for autonomous driving, they need to be adapted to new unseen environmental conditions for which they were not trained. We focus on a safety-critical application of lane detection, and propose a lightweight, fully unsupervised, real-time adaptation approach that only adapts the batch-normalizati...
Article
Accurate identification of the target and tracking it at high speeds using drone-mounted cameras and compute hardware finds military and commercial applications. Conventional frame-based cameras and convolutional neural networks (CNNs) extract detailed spatial information to show high accuracy but suffer from lower throughput caused by large models...
Article
Safety and resiliency are essential components of autonomous vehicles. In this research, we introduce ROSFI, the first robot operating system (ROS) resilience analysis methodology, to assess the effect of silent data corruption (SDC) on mission metrics. We use unmanned aerial vehicles (UAVs) as a case study to demonstrate that system-level paramete...
Article
Recent research on robotics has shown significant improvement, spanning from algorithms and mechanics to hardware architectures. Robots, including manipulators, legged robots, drones, and autonomous vehicles, are now widely applied in diverse scenarios. However, the high computation and data complexity of robotic algorithms poses great challenges...
Preprint
Full-text available
Robotic computing has reached a tipping point, with a myriad of robots (e.g., drones, self-driving cars, logistic robots) being widely applied in diverse scenarios. The continuous proliferation of robotics, however, critically depends on efficient computing substrates, driven by real-time requirements, robotic size-weight-and-power constraints, cyb...
Preprint
Full-text available
We introduce an early-phase bottleneck analysis and characterization model called the F-1 for designing computing systems that target autonomous Unmanned Aerial Vehicles (UAVs). The model provides insights by exploiting the fundamental relationships between various components in the autonomous UAV, such as sensor, compute, and body dynamics. To gua...
Preprint
Full-text available
Swarm intelligence is being increasingly deployed in autonomous systems, such as drones and unmanned vehicles. Federated reinforcement learning (FRL), a key swarm intelligence paradigm where agents interact with their own environments and cooperatively learn a consensus policy while preserving privacy, has recently shown potential advantages and ga...
Preprint
Full-text available
As we march towards the age of ubiquitous intelligence, we note that AI and intelligence are progressively moving from the cloud to the edge. The success of Edge-AI is pivoted on innovative circuits and hardware that can enable inference and limited learning in resource-constrained edge autonomous systems. This paper introduces a series of ultra-lo...
Preprint
Full-text available
Simultaneous Localization and Mapping (SLAM) estimates agents' trajectories and constructs maps, and localization is a fundamental kernel in autonomous machines at all computing scales, from drones, AR, VR to self-driving cars. In this work, we present an energy-efficient and runtime-reconfigurable FPGA-based accelerator for robotic localization. W...
Conference Paper
Full-text available
Learning-based navigation systems are widely used in autonomous applications, such as robotics, unmanned vehicles and drones. Specialized hardware accelerators have been proposed for high performance and energy efficiency in such navigational tasks. However, transient and permanent faults are increasing in hardware systems and can catastrophically...
Preprint
Full-text available
Learning-based navigation systems are widely used in autonomous applications, such as robotics, unmanned vehicles and drones. Specialized hardware accelerators have been proposed for high performance and energy efficiency in such navigational tasks. However, transient and permanent faults are increasing in hardware systems and can catastrophically...
Preprint
Full-text available
We present a bottleneck analysis tool for designing compute systems for autonomous Unmanned Aerial Vehicles (UAVs). The tool provides insights by exploiting the fundamental relationships between various components in the autonomous UAV, such as sensor, compute, and body dynamics. To guarantee safe operation while maximizing the performance (e.g., velocit...
Preprint
Full-text available
Aerial autonomous machines (drones) have a plethora of promising applications and use cases. While the popularity of these autonomous machines continues to grow, there are many challenges, such as endurance and agility, that could hinder the practical deployment of these machines. The closed-loop control frequency must be high to achieve high agilit...
Conference Paper
Full-text available
Over the past few years of commercial deployment experience, we have identified localization as a critical task in autonomous machine applications, and a great acceleration target. In this paper, based on the observation that the visual frontend is a major performance and energy consumption bottleneck, we present our design and implementation of an energy...
Conference Paper
Full-text available
Stereo matching is a critical task for robot navigation and autonomous vehicles, providing the depth estimation of surroundings. Among all stereo matching algorithms, Efficient Large-scale Stereo (ELAS) offers one of the best tradeoffs between efficiency and accuracy. However, due to the inherent iterative process and unpredictable memory access pa...
Preprint
Full-text available
Stereo matching is a critical task for robot navigation and autonomous vehicles, providing the depth estimation of surroundings. Among all stereo matching algorithms, Efficient Large-scale Stereo (ELAS) offers one of the best tradeoffs between efficiency and accuracy. However, due to the inherent iterative process and unpredictable memory access pa...
Preprint
Full-text available
Over the past few years of commercial deployment experience, we have identified localization as a critical task in autonomous machine applications, and a great acceleration target. In this paper, based on the observation that the visual frontend is a major performance and energy consumption bottleneck, we present our design and implementation of an energy...
Preprint
Full-text available
Building domain-specific architectures for autonomous aerial robots is challenging due to a lack of systematic methodology for designing onboard compute. We introduce a novel performance model called the F-1 roofline to help architects understand how to build a balanced computing system for autonomous aerial robots considering both its cyber (senso...
Chapter
Before we delve into utilizing FPGAs for accelerating robotic workloads, in this chapter we first provide the background of FPGA technologies so that readers without prior knowledge can grasp the basic understanding of what an FPGA is and how an FPGA works. We also introduce partial reconfiguration, a technique that exploits the flexibility of FPGA...
Chapter
FPGAs provide rich I/O interfaces, flexibility, and capability of handling complex workloads with high performance and low energy consumption, thus FPGAs are ideal compute substrates to deploy in autonomous driving systems. In this chapter, we present a detailed case study on building a commercial autonomous driving compute system, especially the c...
Chapter
Cameras are widely used in intelligent robot systems because they are lightweight and provide rich information for perception. Cameras can be used to complete a variety of basic tasks of intelligent robots, such as visual odometry (VO), place recognition, object detection, and recognition. With the development of convolutional neural networks (CNNs), we ca...
Chapter
The commercialization of autonomous robots is a thriving sector, and likely to be the next major compute demand driver, after PC, cloud computing, and mobile computing. After examining various compute substrates for robotic computing, we believe that FPGAs are currently the best compute substrate for robotic applications for several reasons: first,...
Chapter
Thus far, we have focused on the utilization of FPGAs in single-robot applications. In this chapter, we consider collaborative exploration through a team of robots, in which the robots share information with each other or even with the infrastructure [301]. In particular, we discuss how FPGAs can be utilized to accelerate multi-robot acceleration work...
Chapter
Localization, i.e., ego-motion estimation, is one of the most fundamental tasks for autonomous machines, in which an agent calculates its own position and orientation in a given frame of reference, i.e., a map. For general robotic software stacks, localization is the building block of many tasks. Knowing the translational pose enables a robot or...
Chapter
Motion planning is the module that computes how a robot or autonomous vehicle maneuvers itself. The task of motion planning is to generate a trajectory that does not collide with any obstacles and send it to the feedback control for physical robot control execution. The planned trajectory is usually specified and represented as a sequence of planned trajec...
Chapter
The last decade has seen significant progress in the development of robotics, spanning from algorithms and mechanics to hardware platforms. Various robotic systems, like manipulators, legged robots, unmanned aerial vehicles, and self-driving cars, have been designed for search and rescue [1, 2], exploration [3, 4], package delivery [5], entertainment [...
Chapter
Perception is related to many robotic applications where sensory data and artificial intelligence techniques are involved. The goal of perception is to sense the dynamic environment surrounding the robot and to build a reliable and detailed representation of this environment based on sensory data. Since all subsequent localization, planning, and co...
Chapter
Due to radiation tolerance requirements, the compute power of space-grade ASICs is usually decades behind the state-of-the-art commercial off-the-shelf processors. On the other hand, space-grade FPGAs deliver high reliability, adaptability, processing power, and energy efficiency, and are expected to close the two-decade performance gap between com...
Article
We introduce the "Formula-1" (F-1) roofline model to understand the role of computing in aerial autonomous machines. The model provides insights by exploiting the fundamental relationships between various components in an aerial robot, such as sensor framerate, compute performance, and body dynamics (physics). The model serves as a tool that can ai...
Article
Full-text available
Time-lens technology is of significant interest in signal processing and optical communication. The impacts of group velocity dispersion (GVD) on ultrafast pulse shaping in a time-lens system based on four-wave mixing are explored in this paper. The output signals of temporal magnification and time-to-frequency conversion under different GVDs are t...
Preprint
Full-text available
Conventional hardware-friendly quantization methods, such as fixed-point or integer, tend to perform poorly at very low word sizes as their shrinking dynamic ranges cannot adequately capture the wide data distributions commonly seen in sequence transduction models. We present AdaptivFloat, a floating-point inspired number representation format for...
