Eitan Zahavi

  • PhD: Cluster Forwarding (2015). Advisors: A. Kolodny, I. Keslassy, I. Cidon
  • Distinguished Architect at Mellanox Technologies, Yokneam, Israel

About

  • Publications: 42
  • Reads: 42,929
  • Citations: 1,038
Introduction
I am a Senior Principal Engineer at Amazon Network. I used to manage the Mellanox end-to-end performance architecture group, which focuses on features that improve overall system performance for both Ethernet and InfiniBand, lossy and lossless. We also study optical data center networks.
Current institution
Mellanox Technologies, Yokneam, Israel
Current position
  • Distinguished Architect
Additional affiliations
January 2008 - April 2015
Mellanox Technologies
Position
  • Senior Principal Engineer
Description
  • Study lossless networks, topologies, congestion control (CC), adaptive routing (AR), etc.
May 1999 - January 2008
Mellanox Technologies
Position
  • Co-founder, Design Automation Group Leader
Description
  • Led the Mellanox Computer-Aided Design group, providing internal automation tools for the VLSI group, and set the company's chip design methodology.
November 1995 - December 1996
Intel
Position
  • Researcher
Education
September 2008 - December 2015
Technion – Israel Institute of Technology
Field of study
  • Electrical Engineering
March 1983 - March 1987
Technion – Israel Institute of Technology
Field of study
  • Electrical Engineering

Publications

Preprint
Full-text available
The Dragonfly topology is currently one of the most popular network topologies in high-performance parallel systems. The interconnection networks of many of these systems are built from components based on the InfiniBand specification. However, due to some constraints in this specification, the available versions of the InfiniBand network controlle...
Article
Full-text available
We explore optical switching to extend network programmability to the physical layer and discuss applications of a Layer-1 software-defined network (SDN) in AI/HPC clusters. In this context we identify two applications for optical circuit switches (OCSs): failure resilience and reconfigurable topologies for deep learning workloads. We present exper...
Conference Paper
We explore optical switching to extend network programmability to the physical layer. We present applications of our Layer-1 SDN for improving fabric resilience against hardware failures and saving network power and cost in Deep-Learning training.
Conference Paper
We propose a programmable optical fabric design for data center networks that extends SDN to L1. We present experiments on our HPC/ML testbed that leverage the programmable network to automatically fail over from hardware or software failures.
Article
Full-text available
The Dragonfly topology is currently one of the most popular network topologies in high-performance parallel systems. The interconnection networks of many of these systems are built from components based on the InfiniBand specification. However, due to some constraints in this specification, the available versions of the InfiniBand network controlle...
Article
Full-text available
The NEPHELE hybrid electro-optical datacenter network (DCN) architecture is proposed as a dynamic network solution to provide high capacity, scalability, and cost efficiency in comparison to the existing DCN infrastructures. The details of the NEPHELE DCN architecture and its various key parts are introduced, and the performance of its implementati...
Article
Full-text available
The most demanding tenants of shared clouds require complete isolation from their neighbors, in order to guarantee that their application performance is not affected by other tenants. Unfortunately, while shared clouds can offer an option, whereby tenants obtain dedicated servers, they do not offer any network provisioning service, which would shie...
Conference Paper
The advent of RoCE (RDMA over Converged Ethernet) has led to a significant increase in the use of RDMA in datacenter networks. To achieve good performance, RoCE requires a lossless network which is in turn achieved by enabling Priority Flow Control (PFC) within the network. However, PFC brings with it a host of problems such as head-of-the-line blo...
Preprint
The advent of RoCE (RDMA over Converged Ethernet) has led to a significant increase in the use of RDMA in datacenter networks. To achieve good performance, RoCE requires a lossless network which is in turn achieved by enabling Priority Flow Control (PFC) within the network. However, PFC brings with it a host of problems such as head-of-the-line blo...
Article
The efficient integration of optical switching in data center networks is being studied as a means to cope with surging traffic demands. Optically switched, flatter network architectures more efficiently handle the east-west traffic profiles of modern data centers. Limitations in the port count and reconfiguration speed of optical switches require...
Article
Full-text available
Dragonfly topologies are gathering great interest as one of the most promising interconnect options for High-Performance Computing systems. Dragonflies contain physical cycles that may lead to traffic deadlocks unless the routing algorithm prevents them properly. Previous topology-aware algorithms are difficult to implement, or even unfeasible, in...
Conference Paper
Full-text available
In recent years, the usage of RDMA in data center networks has increased significantly, with RDMA over Converged Ethernet (RoCE) emerging as the canonical approach for deploying RDMA in Ethernet-based data centers. Initial implementations of RoCE required a lossless fabric for optimal performance. This is typically achieved by enabling Priority Flo...
Patent
Full-text available
A communication network includes multiple nodes, which are arranged in groups such that the nodes in each group are interconnected in a bipartite topology and the groups are interconnected in a mesh topology. The nodes are configured to convey traffic between source hosts and respective destination hosts by routing packets among the nodes on paths...
Conference Paper
Full-text available
The Dragonfly topology was introduced by Kim et al. [1], aiming to decrease the cost and diameter of the network. The topology divides routers into groups connected by long links. Each group strives to implement a high-radix virtual router, and the groups are connected in a completely connected topology. In this paper, we propose an extended Dragonfly+ network in which rout...
Conference Paper
Full-text available
The recently emerging Converged Enhanced Ethernet (CEE) data center networks rely on layer-2 flow control in order to support packet loss sensitive transport protocols, such as RDMA and FCoE. Although lossless networks were proven to improve end-to-end network performance, without careful design and operation, they might suffer from in-network dead...
Conference Paper
Full-text available
This paper raises the data center designer's question of the trade-off between high-buffer switches and low-latency switches. Packet buffer hardware dictates this trade-off due to the constraints of DRAM and SRAM technologies. While designers who prefer robust network solutions would typically choose large-buffer switches, settling for high l...
Conference Paper
Full-text available
The most demanding tenants of shared clouds require complete isolation from their neighbors, in order to guarantee that their application performance is not affected by other tenants. Unfortunately, while shared clouds can offer an option whereby tenants obtain dedicated servers, they do not offer any network provisioning service, which would shiel...
Article
Full-text available
The most demanding tenants of shared clouds require complete isolation from their neighbors, in order to guarantee that their application performance is not affected by other tenants. Unfortunately, while shared clouds can offer an option whereby tenants obtain dedicated servers, they do not offer any network provisioning service, which would shiel...
Article
Full-text available
Data center networks demand high-bandwidth switches. These networks also sustain common incast scenarios, which require large switch buffers. Therefore, network and switch designers encounter a buffer-bandwidth trade-off, as follows. Large switch buffers allow absorbing larger incast workloads. However, higher switch bandwidth allows both faster bu...
Conference Paper
Full-text available
High-Performance Computing (HPC) Clusters and Data Center Networks often rely on fat-tree topologies. However, fat trees and their known variants are not designed for concurrent small jobs. As a result, in recent years, HPC designers have introduced ad-hoc topologies to offer better performance for these concurrent small jobs. In this paper, we pre...
Article
Full-text available
Similar to off-chip networks, current NoC architectures are based on the store and forward of uncoordinated end-to-end packet transmissions through autonomous buffered routers. However, the monolithic nature and the small physical dimensions of on-chip networks open up the opportunity for much more tightly controlled architectures. We present GANA,...
Conference Paper
Full-text available
With the growing popularity of big-data applications, Data Center Networks increasingly carry larger and longer traffic flows. As a result of this increased flow granularity, static routing cannot efficiently load-balance traffic, resulting in increased network contention and reduced throughput. Unfortunately, while adaptive routing can solve...
Conference Paper
Full-text available
With the growing popularity of big-data applications, Data Center Networks increasingly carry larger and longer traffic flows. As a result of this increased flow granularity, static routing cannot efficiently load-balance traffic, resulting in increased network contention and reduced throughput. Unfortunately, while adaptive routing can solve...
Conference Paper
Full-text available
We present HNOCS (Heterogeneous Network-on-Chip Simulator), an open-source NoC simulator based on OMNeT++. To the best of our knowledge, HNOCS is the first simulator to support modeling of heterogeneous NoCs with variable link capacities and number of VCs per unidirectional port. The HNOCS simulation platform provides an open-source, modular, scala...
Conference Paper
Full-text available
In a lossless interconnection network, network congestion needs to be detected and resolved to ensure high performance and good utilization of network resources at high network load. If no countermeasure is taken, congestion at a node in the network will stimulate the growth of a congestion tree that not only affects contributors to congestion, but...
Conference Paper
Full-text available
In lossless interconnection networks such as InfiniBand, congestion control (CC) can be an effective mechanism to achieve high performance and good utilization of network resources. The InfiniBand standard describes CC functionality for detecting and resolving congestion, but the design decisions on how to implement this functionality are left to...
Conference Paper
Full-text available
As the size of High-Performance Computing clusters grows, the increasing probability of interconnect hot spots degrades the latency and effective bandwidth the network provides. This paper presents a solution to this scalability problem for real-life constant bisectional-bandwidth fat-tree topologies. It is shown that maximal bandwidth and cut-thro...
Article
Full-text available
In lossless interconnection networks such as InfiniBand, congestion control (CC) can be an effective mechanism to achieve high performance and good utilization of network resources. The InfiniBand standard describes CC functionality for detecting and resolving congestion, but the design decisions on how to implement this functionality are left to...
Conference Paper
Full-text available
As chip density keeps doubling every process generation, the use of Network-on-Chip becomes the prevalent architecture of SoC, MPSoC and large scale CMP designs. To that end, diverse NoC solutions are developed by the industry and the research community in order to meet heterogeneous on-chip communication requirements. Consequently, there is a grow...
Article
Clustered systems have become a dominant architecture of scalable high-performance supercomputers. In these large-scale computers, network performance and scalability are as critical as the speed of the compute nodes. InfiniBand™ has become a commodity networking solution supporting the stringent latency, bandwidth and scalability requirements of the...
Article
Clustered systems have become a dominant architecture of scalable high-performance supercomputers. In these large-scale computers, network performance and scalability are as critical as the speed of the compute nodes. InfiniBand™ has become a commodity networking solution supporting the stringent latency, bandwidth and scalability requirements of the...
