
Eitan Zahavi
- PhD: Cluster Forwarding (2015). Advisors: A. Kolodny, I. Keslassy, I. Cidon
- Distinguished Architect at Mellanox Technologies, Yokneam, Israel
Publications: 42
Reads: 42,929
Citations: 1,038
Introduction
I am a Senior Principal Engineer at Amazon Network.
I used to manage the Mellanox end-to-end performance architecture group, which focuses on features that improve overall system performance for both Ethernet and InfiniBand networks, lossy and lossless. We also study optical data center networks.
Current institution
Mellanox Technologies, Yokneam, Israel
Current position
- Distinguished Architect
Additional affiliations
November 1995 - December 1996
Education
September 2008 - December 2015
March 1983 - March 1987
Publications (42)
The Dragonfly topology is currently one of the most popular network topologies in high-performance parallel systems. The interconnection networks of many of these systems are built from components based on the InfiniBand specification. However, due to some constraints in this specification, the available versions of the InfiniBand network controlle...
We explore optical switching to extend network programmability to the physical layer and discuss applications of a Layer-1 software-defined network (SDN) in AI/HPC clusters. In this context we identify two applications for optical circuit switches (OCSs): failure resilience and reconfigurable topologies for deep learning workloads. We present exper...
We explore optical switching to extend network programmability to the physical layer. We present applications of our Layer-1 SDN for improving fabric resilience against hardware failures and saving network power and cost in Deep-Learning training.
We propose a programmable optical fabric design for data center networks that extends SDN to Layer 1. We present experiments on our HPC/ML testbed that leverage the programmable network to fail over automatically from hardware or software failures.
The NEPHELE hybrid electro-optical datacenter network (DCN) architecture is proposed as a dynamic network solution to provide high capacity, scalability, and cost efficiency in comparison to the existing DCN infrastructures. The details of the NEPHELE DCN architecture and its various key parts are introduced, and the performance of its implementati...
The most demanding tenants of shared clouds require complete isolation from their neighbors, in order to guarantee that their application performance is not affected by other tenants. Unfortunately, while shared clouds can offer an option, whereby tenants obtain dedicated servers, they do not offer any network provisioning service, which would shie...
The advent of RoCE (RDMA over Converged Ethernet) has led to a significant increase in the use of RDMA in datacenter networks. To achieve good performance, RoCE requires a lossless network which is in turn achieved by enabling Priority Flow Control (PFC) within the network. However, PFC brings with it a host of problems such as head-of-the-line blo...
The efficient integration of optical switching in data center networks is being studied as a means to cope with surging traffic demands. Optically switched, flatter network architectures more efficiently handle the east-west traffic profiles of modern data centers. Limitations in the port count and reconfiguration speed of optical switches require...
Dragonfly topologies are gathering great interest as one of the most promising interconnect options for High-Performance Computing systems. Dragonflies contain physical cycles that may lead to traffic deadlocks unless the routing algorithm prevents them properly. Previous topology-aware algorithms are difficult to implement, or even unfeasible, in...
In recent years, the usage of RDMA in data center networks has increased significantly, with RDMA over Converged Ethernet (RoCE) emerging as the canonical approach for deploying RDMA in Ethernet-based data centers. Initial implementations of RoCE required a lossless fabric for optimal performance. This is typically achieved by enabling Priority Flo...
A communication network includes multiple nodes, which are arranged in groups such that the nodes in each group are interconnected in a bipartite topology and the groups are interconnected in a mesh topology. The nodes are configured to convey traffic between source hosts and respective destination hosts by routing packets among the nodes on paths...
The Dragonfly topology was introduced by Kim et al. [1], aiming to decrease the cost and diameter of the network. The topology divides routers into groups connected by long links. Each group strives to implement a high-radix virtual router, and the groups are connected by a completely connected topology. In this paper, we propose an extended Dragonfly+ network in which rout...
The recently emerging Converged Enhanced Ethernet (CEE) data center networks rely on layer-2 flow control in order to support packet loss sensitive transport protocols, such as RDMA and FCoE. Although lossless networks were proven to improve end-to-end network performance, without careful design and operation, they might suffer from in-network dead...
This paper raises the data center designer's question of the trade-off between high-buffer switches and low-latency switches. Packet buffer hardware dictates this trade-off due to the constraints of DRAM and SRAM technologies. While designers who prefer robust network solutions would typically choose large-buffer switches and settle for high l...
The most demanding tenants of shared clouds require complete isolation from their neighbors, in order to guarantee that their application performance is not affected by other tenants. Unfortunately, while shared clouds can offer an option whereby tenants obtain dedicated servers, they do not offer any network provisioning service, which would shiel...
Data center networks demand high-bandwidth switches. These networks must also sustain common incast scenarios, which require large switch buffers. Therefore, network and switch designers encounter a buffer-bandwidth trade-off as follows. Large switch buffers allow absorbing larger incast workloads. However, higher switch bandwidth allows both faster bu...
High-Performance Computing (HPC) Clusters and Data Center Networks often rely on fat-tree topologies. However, fat trees and their known variants are not designed for concurrent small jobs. As a result, in recent years, HPC designers have introduced ad-hoc topologies to offer better performance for these concurrent small jobs. In this paper, we pre...
Similar to off-chip networks, current NoC architectures are based on the store and forward of uncoordinated end-to-end packet transmissions through autonomous buffered routers. However, the monolithic nature and the small physical dimensions of on chip networks open up the opportunity for much more tightly controlled architectures. We present GANA,...
With the growing popularity of big-data applications, data center networks increasingly carry larger and longer traffic flows. As a result of this increased flow granularity, static routing cannot efficiently load-balance traffic, resulting in increased network contention and reduced throughput. Unfortunately, while adaptive routing can solve...
We present HNOCS (Heterogeneous Network-on-Chip Simulator), an open-source NoC simulator based on OMNeT++. To the best of our knowledge, HNOCS is the first simulator to support modeling of heterogeneous NoCs with variable link capacities and number of VCs per unidirectional port. The HNOCS simulation platform provides an open-source, modular, scala...
In a lossless interconnection network, network congestion needs to be detected and resolved to ensure high performance and good utilization of network resources at high network load. If no countermeasure is taken, congestion at a node in the network will stimulate the growth of a congestion tree that not only affects contributors to congestion, but...
In lossless interconnection networks such as InfiniBand, congestion control (CC) can be an effective mechanism to achieve high performance and good utilization of network resources. The InfiniBand standard describes CC functionality for detecting and resolving congestion, but the design decisions on how to implement this functionality are left to...
As the size of High Performance Computing clusters grows, the increasing probability of interconnect hot spots degrades the latency and effective bandwidth the network provides. This paper presents a solution to this scalability problem for real-life constant bisectional-bandwidth fat-tree topologies. It is shown that maximal bandwidth and cut-thro...
As chip density keeps doubling every process generation, the use of Network-on-Chip becomes the prevalent architecture of SoC, MPSoC and large scale CMP designs. To that end, diverse NoC solutions are developed by the industry and the research community in order to meet heterogeneous on-chip communication requirements. Consequently, there is a grow...
Clustered systems have become a dominant architecture of scalable high-performance supercomputers. In these large-scale computers, network performance and scalability are as critical as compute-node speed. InfiniBand™ has become a commodity networking solution supporting the stringent latency, bandwidth and scalability requirements of the...