Zhizhen Zhong

Verified
Zhizhen verified their affiliation via an institutional email.
  • Doctor of Philosophy
  • PostDoc at Massachusetts Institute of Technology

Postdoc at MIT CSAIL

About

48
Publications
6,518
Reads
477
Citations
Introduction
I am a postdoctoral researcher at MIT. My research focuses on realizing the next generation of networked computer systems by engineering the unique properties of light and its fundamental particles (photons). Towards this vision, my work takes an application-centric approach to co-design different stacks of networked computer systems: from fundamental photonic/electronic integrated circuits and devices, to computer hardware architecture, all the way to control algorithms and software systems.
Current institution
Massachusetts Institute of Technology
Current position
  • PostDoc
Additional affiliations
August 2010 - August 2014
Tsinghua University
Position
  • PhD Student
December 2019 - October 2020
Meta
Position
  • Consultant
February 2018 - July 2018
University of California, Davis
Position
  • Visiting Scholar

Publications

Publications (48)
Conference Paper
Full-text available
We demonstrate a practical Bayesian Optimization system for wavelength reconfiguration on Facebook's backbone. Our system uses a firewall for safe deployment. It is open-source, compatible with any vendor, and achieves 4.76× faster wavelength reconfiguration.
Conference Paper
Full-text available
Fiber cut events reduce the capacity of wide-area networks (WANs) by several Tbps. In this paper, we revive the lost capacity by reconfiguring the wavelengths from cut fibers into healthy fibers. We highlight two challenges that made prior solutions impractical and propose a system called Arrow to address them. First, our measurements show that con...
Conference Paper
Full-text available
We present In-network Optical Inference (IOI), a system providing low-latency machine learning inference by leveraging programmable switches and optical matrix multiplication. IOI consists of a novel transceiver module designed specifically to perform linear operations such as matrix multiplication in the optical domain. IOI’s transceivers are plug...
Article
Advanced machine learning models are currently impossible to run on edge devices such as smart sensors and unmanned aerial vehicles owing to constraints on power, processing, and memory. We introduce an approach to machine learning inference based on delocalized analog processing across networks. In this approach, named Netcast, cloud-based "smart...
Conference Paper
Full-text available
The massive growth of machine learning-based applications and the end of Moore's law have created a pressing need to redesign computing platforms. We propose Lightning, the first reconfigurable photonic-electronic smartNIC to serve real-time deep neural network inference requests. Lightning uses a fast datapath to feed traffic from the NIC into the...
Preprint
Full-text available
Mixture-of-Expert (MoE) models outperform conventional models by selectively activating different subnets, named experts, on a per-token basis. This gated computation generates dynamic communications that cannot be determined beforehand, challenging the existing GPU interconnects that remain static during the distributed training proc...
Article
This paper analyzes the performance and energy efficiency of Netcast, a recently proposed optical neural-network architecture designed for edge computing. Netcast performs deep neural network inference by dividing the computational task into two steps, which are split between the (cloud) server and (edge) client: (1) the server employs a wavelength...
Conference Paper
Full-text available
We demonstrate Lightning, a reconfigurable photonic-electronic deep learning smartNIC that serves real-time inference requests at 4.055 GHz compute frequency. To do so, Lightning uses a novel datapath to feed traffic from the NIC into its photonic computing cores without incurring digital data movement bottlenecks. Lightning achieves this by employ...
Conference Paper
Full-text available
The rising demand for WAN capacity driven by the rapid growth of inter-data center traffic poses new challenges for costly optical networks. Today, cloud providers rely on fixed optical backbones, where all hardware devices operate on a rigid spectrum grid, leading to the waste of expensive optical resources and subpar performance in handling failu...
Conference Paper
We propose a photonic edge computing architecture based on WDM, broadband modulation, and output-stationary integration. Using this scheme, we demonstrate 98.8%-accurate DNN inference over an 86-km deployed fiber link with 3 THz optical bandwidth.
Preprint
This paper analyzes the performance and energy efficiency of Netcast, a recently proposed optical neural-network architecture designed for edge computing. Netcast performs deep neural network inference by dividing the computational task into two steps, which are split between the server and (edge) client: (1) the server employs a wavelength-multipl...
Preprint
Full-text available
Advances in deep neural networks (DNNs) are transforming science and technology. However, the increasing computational demands of the most powerful DNNs limit deployment on low-power devices, such as smartphones and sensors – and this trend is accelerated by the simultaneous move towards Internet-of-Things (IoT) devices. Numerous efforts are underw...
Preprint
Full-text available
Advances in deep neural networks (DNNs) are transforming science and technology. However, the increasing computational demands of the most powerful DNNs limit deployment on low-power devices, such as smartphones and sensors -- and this trend is accelerated by the simultaneous move towards Internet-of-Things (IoT) devices. Numerous efforts are under...
Preprint
Full-text available
We explore a novel approach for building DNN training clusters using commodity optical devices. Our proposal, called TopoOpt, co-optimizes the distributed training process across three dimensions: computation, communication, and network topology. TopoOpt uses a novel alternating optimization technique and a group theory-inspired algorithm to find t...
Conference Paper
We present experimental demonstrations of ultra-low power edge computing enabled by wavelength division multiplexed optical links and time-integrating optical receivers. Initial experimental demonstrations show ≲ 10 fJ of optical energy per MAC.
Conference Paper
Full-text available
As the COVID-19 pandemic reshapes our social landscape, its lessons have far-reaching implications on how online service providers manage their infrastructure to mitigate risks. This paper presents Facebook's risk-driven backbone management strategy to ensure high service performance throughout the COVID-19 pandemic. We describe Risk Simulation Sys...
Article
Full-text available
The “pay-as-you-grow” cloud computing model has become popular for today’s enterprises. Cloud computing not only frees end users from complex operations, but also allows higher resource utilization, lower investment, and increased energy efficiency. However, with some emerging technologies, cloud computing is unable to meet the required latency lev...
Article
Full-text available
Physical layer attacks threaten services transmitted through optical networks. To detect attacks, we present an investigation of optical spectrum feature analysis (OSFA) and recognition. By analyzing the spectral features of optical signals, recognition and detection of unauthorized signals can be realized. In this paper, (1) we theoretically analy...
Article
Full-text available
Transient traffic spikes are becoming a crucial challenge for network operators, from both user-experience and network-maintenance perspectives. Unlike long-term traffic growth, short-term traffic fluctuations are bursty, which makes them difficult to provision for effectively. Luckily, next-generation elastic optical networks (EONs) prov...
Article
Full-text available
In future optical satellite networks with various service requirements, the bandwidth of a single traffic request occupies part of an inter-satellite link (ISL) channel capacity, thus leading to a greater demand for flexible resource allocation. The switching scheme is the most important determinant for flexible resource allocation in optical satel...
Conference Paper
Full-text available
We first propose a novel multi-domain routing paradigm that transforms the routing problem from heuristic-algorithm-based computation to artificial-intelligence-based data analytics. Numerical results prove that our proposal can achieve excellent routing accuracy, and significant signaling reduction.
Conference Paper
We propose joint allocation of computation resource and optical transmission time slices to realize ultralow-latency optical interconnection in time-synchronized HPC systems. Results show that over 80% reduction in buffering time is achieved at high load.
Conference Paper
We propose a crosstalk tracing method using deep neural networks for weakly-coupled MDM optical networks. Results show that over 95% tracing accuracy is achieved and the impact of time consistency in data collection is revealed.
Conference Paper
Elastic Optical Networks (EONs) represent a new approach for dealing with the enormous traffic demand in core networks as they can offer bandwidth granularities closer to those requested by the user and hence improve spectral utilization. In current literature there is a lack of dynamic strategies for service degradation which is a possible measure...
Conference Paper
Modal crosstalk is the main bottleneck in MMF-enabled optical datacenter networks with direct detection. A novel time-slicing-based crosstalk-mitigated MDM scheme is first proposed, then theoretically analyzed and experimentally demonstrated.
Conference Paper
We propose an in-service crosstalk monitoring and tracing method using fine-grained monitoring optical time slices for SDM-enabled intra-datacenter and HPC systems. Modal crosstalk below -36.01 dB was successfully monitored and traced in an MMF transmission system.
Conference Paper
Full-text available
A flexible time-synchronized TWDM-PON (TS-TWDM-PON) architecture is proposed and implemented for low-latency metro-access communication. Results show that a two-order-of-magnitude reduction in end-to-end delay can be achieved with the new TS-TWDM-PON architecture.
Preprint
Modal crosstalk is the main bottleneck in MMF-enabled optical datacenter networks with direct detection. A novel time-slicing-based crosstalk-mitigated MDM scheme is first proposed, then theoretically analyzed and experimentally demonstrated.
Preprint
In this paper, we propose a novel OTSS-assisted optical network architecture for smart-grid communication networks, which have unique requirements for low-latency connections. Illustrative results show that OTSS can provide substantially better latency and blocking-probability performance than conventional flexi-grid optical networks.
Conference Paper
Full-text available
In this paper, we propose a novel OTSS-assisted optical network architecture for smart-grid communication networks, which have unique requirements for low-latency connections. Illustrative results show that OTSS can provide substantially better latency and blocking-probability performance than conventional flexi-grid optical networks.
Article
Full-text available
Energy-efficient Time- and Wavelength- Division Multiplexed Passive Optical Network (TWDM-PON) has been intensely investigated. However, conventional schemes aimed at energy efficiency may bring about repeated power-state transitions between sleep mode and active mode, resulting in periodic device-temperature cycling and frequent wavelength reassig...
Conference Paper
We propose a Fast-Reconfigurable Optical Interconnect (FROI) architecture enabled by time-synchronized node coordination for high performance computing. Experimental results show that an ultra-low reconfiguration time of 45.6 μs can be achieved after traffic pattern changes.
Conference Paper
Full-text available
The emergence of new network applications is driving network operators to not only fulfill dynamic bandwidth requirements, but offer various grades of service. Degraded provisioning provides an effective solution to flexibly allocate resources in various dimensions to reduce blocking for differentiated demands when network congestion occurs. In thi...
Preprint
The emergence of new network applications is driving network operators to not only fulfill dynamic bandwidth requirements, but offer various grades of service. Degraded provisioning provides an effective solution to flexibly allocate resources in various dimensions to reduce blocking for differentiated demands when network congestion occurs. In thi...
Article
Full-text available
The growing popularity of high-speed mobile communications, cloud computing, and the Internet of Things (IoT) has reinforced the tidal traffic phenomenon, which induces spatio-temporal disequilibrium in the network traffic load. The main reason for tidal traffic is the large-scale population migration between business areas during the day and resid...
Conference Paper
Full-text available
We present a software-defined unified control architecture for interconnecting heterogeneous packet-optical networks. This architecture supports hybrid packet- and circuit-switched networks employing various switching technologies and can achieve fast and seamless connection establishment.
Conference Paper
Full-text available
Tidal traffic, caused by large-scale population migration between workplaces during the day and residences at night, is becoming a crucial problem for metro network control and management. We introduce an effective tidal traffic dispatching scheme with a novel TIDAL model based on software-defined architecture. Simulation results show that our pro...
Conference Paper
Full-text available
We propose a software-defined unified control architecture for IP over optical transport networks. Using this scheme, we successfully demonstrate end-to-end dynamic connection establishment across both the IP and OTN layers in a network experiment.
