Figure - available from: The Journal of Supercomputing
Average packet latency from generation vs. accepted traffic for uniform traffic and two dimensions for direct topologies. (a) 256 processing nodes. (b) 4K processing nodes. (c) 64K processing nodes.
Source publication
In large-scale supercomputers, the interconnection network plays a key role in system performance. The network topology largely determines the performance and cost of the interconnection network. Direct topologies are sometimes used due to their reduced hardware cost, but the number of network dimensions is limited by the physical 3D space, which leads to an...
Similar publications
Supercomputers with ever-increasing computing power are being built for scientific applications. As the system size scales up, so does the size of the interconnect network. As a result, communication in supercomputers becomes increasingly expensive due to the long distance between nodes and network contention. Topology mapping, which maps parallel appl...
BXI, Bull eXascale Interconnect, is the new interconnection network developed by Atos for high-performance computing. It has been designed to meet the requirements of exascale supercomputers. At such scale, faults have to be expected and dealt with transparently so that applications remain unaffected by them. BXI features various mechanisms for thi...
This paper presents INRFlow, a mature, frugal, flow-level simulation framework for modelling large-scale networks and computing systems. INRFlow is designed to carry out performance-related studies of interconnection networks for both high performance computing systems and datacentres. It features a completely modular design in which adding new top...
Citations
... The family of KNS topologies was first proposed in [15], and its performance and cost scalability have been evaluated using simulation models in [16] and [17]. However, no known implementation of a KNS network in a real system using InfiniBand exists. ...
... Regarding the routing algorithms for KNS topologies, an adaptation of the dimension-order routing (DOR) algorithm used in meshes and tori is proposed in [17]. The main idea behind DOR routing is that packets traverse the network in the order given by the available dimensions, first the X dimension, then the Y dimension, etc. ...
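The dimension-order idea described in this excerpt can be illustrated with a minimal sketch for a plain mesh. This is not the Hybrid-DOR adaptation from [17]. The function name `dor_path` and the use of coordinate tuples to identify routers are assumptions made for illustration only.

```python
from typing import List, Tuple

Coord = Tuple[int, ...]


def dor_path(src: Coord, dst: Coord) -> List[Coord]:
    """Return the routers visited from src to dst in a mesh under
    dimension-order routing: fully correct dimension 0 (X), then
    dimension 1 (Y), and so on. Hypothetical helper, not from the paper."""
    if len(src) != len(dst):
        raise ValueError("src and dst must have the same number of dimensions")
    path = [tuple(src)]
    current = list(src)
    for dim in range(len(src)):
        # Move one hop at a time along this dimension until it matches.
        step = 1 if dst[dim] > current[dim] else -1
        while current[dim] != dst[dim]:
            current[dim] += step
            path.append(tuple(current))
    return path


if __name__ == "__main__":
    # X is corrected first (0 -> 2), then Y (0 -> 1).
    print(dor_path((0, 0), (2, 1)))
```

Because every packet resolves the dimensions in the same fixed order, the route between a given source and destination is always the same, which is why DOR is classed as a deterministic routing algorithm.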
... If s > 1 the indirect network is formed by more than one switch, and it can be built using a Fat-Tree topology. In this case, the implementation of the routing algorithm is a bit more complex, but it can be used with some of the existing ones, as proposed in [17]. ...
The InfiniBand (IB) interconnection technology is widely used in the networks of modern supercomputers and data centers. Among other advantages, the IB-based network devices allow for building multiple network topologies, and the IB control software (subnet manager) supports several routing engines suitable for the most common topologies. However, the implementation of some novel topologies in IB-based networks may be difficult if suitable routing algorithms are not supported, or if the IB switch or NIC architectures are not directly applicable for that topology. This work describes the implementation of the network topology known as KNS in a real HPC cluster using an IB network. As far as we know, this is the first implementation of this topology in an IB-based system. In more detail, we have implemented the KNS routing algorithm in the OpenSM software distribution of the subnet manager, and we have adapted the available IB-based switches to the particular structure of this topology. We have evaluated the correctness of our implementation through experiments in the real cluster, using well-known benchmarks. The obtained results, which match the expected performance for the KNS topology, show that this topology can be implemented in IB-based clusters as an alternative to other interconnection patterns.
... For the sake of simplicity, we only show the nodes on the corner routers. Regarding the routing algorithms for KNS topologies, an adaptation of the dimension-order routing (DOR) algorithm used in meshes and tori has been proposed [7]. The main idea behind DOR routing is that packets traverse the network in the order given by the available dimensions, first the X dimension, then the Y dimension, etc. ...
... The Hybrid-DOR pseudo-code for k-ary n-direct 1-direct topologies is shown below: [Algorithm 2: Hybrid-DOR for Switches [7]; listing not recoverable from the excerpt] According to this definition of the KNS topology, a new type of network is obtained, which scales to a large number of nodes at a lower cost than indirect topologies and without the loss of efficiency that direct topologies suffer as the number of nodes grows. Instead of traversing each dimension node by node until reaching the destination, as in direct networks, in KNS all the nodes of each dimension are directly connected through an indirect network, which reduces the average distance of the network. ...
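The claim about reduced average distance can be checked with a rough count. The sketch below assumes a simplified model that is not taken from the cited papers: in a mesh, a packet crosses one link per coordinate step, while in a KNS network with a single switch per dimension line (s = 1), each dimension in which source and destination differ costs two link traversals (router to switch, switch to router). All function names are hypothetical.

```python
from itertools import product
from typing import Callable, Tuple

Coord = Tuple[int, ...]


def mesh_hops(src: Coord, dst: Coord) -> int:
    """Link traversals in a mesh: one per coordinate step."""
    return sum(abs(a - b) for a, b in zip(src, dst))


def kns_hops(src: Coord, dst: Coord) -> int:
    """Link traversals in the assumed s = 1 KNS model: two per
    dimension whose coordinates differ (router-switch-router)."""
    return 2 * sum(1 for a, b in zip(src, dst) if a != b)


def average_hops(k: int, n: int, hops: Callable[[Coord, Coord], int]) -> float:
    """Average over all ordered (src, dst) pairs, including src == dst."""
    nodes = list(product(range(k), repeat=n))
    total = sum(hops(s, d) for s in nodes for d in nodes)
    return total / (len(nodes) ** 2)


if __name__ == "__main__":
    k, n = 8, 2
    print("mesh:", average_hops(k, n, mesh_hops))
    print("KNS :", average_hops(k, n, kns_hops))
```

Under this model the KNS cost per dimension is constant, whereas the mesh cost grows with k, so the gap widens as each dimension gets longer. For very small k the two-hop detour through the switch can make KNS slightly worse on average.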
InfiniBand networking technology is widely utilized in modern high-performance systems. This work describes the implementation of the hybrid network topology known as KNS in a real HPC cluster using an InfiniBand interconnection network. We have used the CELLIA cluster (Cluster for the Evaluation of Low-Latency Architectures), which consists of 50 compute and storage nodes equipped with InfiniBand network cards, and up to 50 8-port InfiniBand switches, which allow us to build several topologies. We have implemented the KNS routing algorithm in OpenSM, the subnet manager provided by the OpenFabrics Software (OFS). We also evaluate the performance of the KNS topology using well-known benchmarks, such as HPCC, HPCG, Graph500, and Netgauge. The obtained results show that the low-diameter KNS topology is an efficient and cost-effective alternative to interconnect the computing and storage nodes in HPC clusters. As far as we know, no InfiniBand system has implemented this topology before.
... The most important modern low-diameter networks are Slim Fly (SF) [35], Dragonfly (DF) [137], Jellyfish (JF) [203], Xpander (XP) [219], and HyperX (Hamming graph) (HX) [4]. Other proposed topologies in this family include Flexfly [227], Galaxyfly [140], Megafly [77], projective topologies [52], HHS [22], and others [184], [128], [172]. All these networks have different structure and thus different potential for multipathing [43]; in Figure 3, we illustrate example paths between a pair of routers. ...
The recent line of research into topology design focuses on lowering network diameter. Many low-diameter topologies such as Slim Fly or Jellyfish that substantially reduce cost, power consumption, and latency have been proposed. A key challenge in realizing the benefits of these topologies is routing. On one hand, these networks provide shorter path lengths than established topologies such as Clos or torus, leading to performance improvements. On the other hand, the number of shortest paths between each pair of endpoints is much smaller than in Clos, but there is a large number of non-minimal paths between router pairs. This hampers or even makes it impossible to use established multipath routing schemes such as ECMP. In this article, to facilitate high-performance routing in modern networks, we analyze existing routing protocols and architectures, focusing on how well they exploit the diversity of minimal and non-minimal paths. We first develop a taxonomy of different forms of support for multipathing and overall path diversity. Then, we analyze how existing routing schemes support this diversity. Among others, we consider multipathing with both shortest and non-shortest paths, support for disjoint paths, or enabling adaptivity. To address the ongoing convergence of HPC and “Big Data” domains, we consider routing protocols developed for both HPC systems and for data centers as well as general clusters. Thus, we cover architectures and protocols based on Ethernet, InfiniBand, and other HPC networks such as Myrinet. Our review will foster developing future high-performance multipathing routing protocols in supercomputers and data centers.
... The most important modern low-diameter networks are Slim Fly (SF) [35], Dragonfly (DF) [137], Jellyfish (JF) [203], Xpander (XP) [219], and HyperX (Hamming graph) (HX) [4]. Other proposed topologies in this family include Flexfly [227], Galaxyfly [140], Megafly [77], projective topologies [52], HHS [22], and others [184], [128], [172]. All these networks have different structure and thus different potential for multipathing [43]; in Figure 3, we illustrate example paths between a pair of routers. ...
The recent line of research into topology design focuses on lowering network diameter. Many low-diameter topologies such as Slim Fly or Jellyfish that substantially reduce cost, power consumption, and latency have been proposed. A key challenge in realizing the benefits of these topologies is routing. On one hand, these networks provide shorter path lengths than established topologies such as Clos or torus, leading to performance improvements. On the other hand, the number of shortest paths between each pair of endpoints is much smaller than in Clos, but there is a large number of non-minimal paths between router pairs. This hampers or even makes it impossible to use established multipath routing schemes such as ECMP. In this work, to facilitate high-performance routing in modern networks, we analyze existing routing protocols and architectures, focusing on how well they exploit the diversity of minimal and non-minimal paths. We first develop a taxonomy of different forms of support for multipathing and overall path diversity. Then, we analyze how existing routing schemes support this diversity. Among others, we consider multipathing with both shortest and non-shortest paths, support for disjoint paths, or enabling adaptivity. To address the ongoing convergence of HPC and "Big Data" domains, we consider routing protocols developed for both traditional HPC systems and supercomputers, and for data centers and general clusters. Thus, we cover architectures and protocols based on Ethernet, InfiniBand, and other HPC networks such as Myrinet. Our review will foster developing future high-performance multipathing routing protocols in supercomputers and data centers.
... IODET [95] considers the torus topology [16] and its dimension order routing algorithm. BBQ [96] is designed for the KNS topology [97] with the Hybrid-DOR routing algorithm. H2LQ [98] is tailored to Dragonfly topology using its minimal routing [29]. ...
In recent years, energy has become one of the most important factors for designing and operating large-scale computing systems. This is particularly true in high-performance computing, where systems often consist of thousands of nodes. Especially after the end of Dennard’s scaling, the demand for energy proportionality in components, where energy consumption depends linearly on utilization, increases continuously. As the main contributor to the overall power consumption, processors have received the main attention so far. The increasing energy proportionality of processors, however, shifts the focus to other components such as interconnection networks. Their share of the overall power consumption is expected to increase to 20% or more while other components further increase their efficiency in the near future. Hence, it is crucial to improve energy proportionality in interconnection networks as well, to reduce overall power and energy consumption. To facilitate these attempts, this work provides comprehensive studies about energy saving in interconnection networks at different levels. First, interconnection networks differ fundamentally from other components in their underlying technology. To gain a deeper understanding of these differences and to identify targets for energy savings, this work provides a detailed power analysis of current network hardware. Furthermore, various applications at different scales are analyzed regarding their communication patterns and locality properties. The findings show that communication makes up only a small fraction of the execution time and networks are actually idling most of the time. Another observation is that point-to-point communication often only occurs within various small subsets of all participants, which indicates that a coordinated mapping could further decrease network traffic. Based on these studies, three different energy-saving policies are designed, which differ in their implementation and focus. 
Then, these policies are evaluated in an event-based, power-aware network simulator. While the two policies that operate completely locally at link level enable significant energy savings of more than 90% in most analyses, the hybrid one provides no further benefit despite significant additional design effort. Additionally, these studies include network design parameters, such as transition time between different link configurations, as well as the three most common topologies in supercomputing systems. The final part of this work addresses the interactions of congestion management and energy-saving policies. Although the two network management strategies pursue different goals and use opposite approaches, they complement each other: combined, they increase energy efficiency in all studies and reduce the performance overhead compared with plain energy saving.
... In recent years, designers and researchers have proposed several hierarchical network topologies, such as Dragonflies [23] or Slim-flies [6], which focus on reducing the number of required network devices by employing connection patterns of reduced diameter. Based on this idea, hybrid network topologies, such as KNS [28], have also been proposed in recent years for interconnecting thousands of processing and storage end nodes. Like Dragonflies and Slim-flies, KNS topologies offer an excellent performance/cost ratio, mainly because they allow short routes and provide path diversity, which can be leveraged by efficient routing algorithms. ...
... These "tailored" queuing schemes are known as topology- and routing-aware. In that sense, we proposed the topology- and routing-aware queuing scheme called BBQ (Band-based Queuing) [34], which is tailored to KNS topologies using the Hybrid-DOR deterministic routing algorithm [28]. BBQ significantly reduces the HoL blocking utilizing a small number of queues per port. ...
... In general, the objective of hybrid topologies is to provide high performance, like indirect topologies, but at a cost similar to that of direct topologies. With this objective in mind, the k-ary n-direct s-indirect (KNS) family of topologies was proposed [28]. Specifically, KNS topologies organize end nodes in n dimensions, as in a direct network topology, each dimension having k end nodes, but the end nodes of a given dimension are not interconnected as in meshes or tori. ...
Hybrid and direct topologies are cost-efficient and scalable options to interconnect thousands of end nodes in high-performance computing (HPC) systems. They offer a rich path diversity, high bisection bandwidth, and a reduced diameter guaranteeing low latency. In these topologies, efficient deterministic routing algorithms can be used to smartly balance the traffic flows among the available routes. Unfortunately, congestion leads these networks to saturation, where the HoL blocking effect degrades their performance dramatically. Among the proposed solutions to deal with HoL blocking, routing algorithms that select alternative routes, such as adaptive and oblivious ones, can mitigate the congestion effects. Other techniques use queues to separate congested flows from non-congested ones, thus reducing the HoL blocking. In this article, we propose a new approach that reduces HoL blocking in hybrid and direct topologies using source-adaptive and oblivious routing. This approach also guarantees deadlock-freedom as it uses virtual networks to break potential cycles generated by the routing policy in the topology. Specifically, we propose two techniques, called Source-Adaptive Solution for Head-of-Line Blocking Avoidance (SASHA) and Oblivious Solution for Head-of-Line Blocking Avoidance (OSHA). Experimental results, obtained through simulations under different traffic scenarios, show that SASHA and OSHA can significantly reduce the HoL blocking.
... Existing datacenter network topologies are hierarchical, comparable to the ones applied in traditional telephony networks. Recently, various alternative topologies have been proposed in different projects, such as fat trees [69], hyper-cubes [70] and randomized small-world topologies [71]. Regardless of the network topology used, the target is engineering a scalable topology in which the delivered bisection bandwidth increases linearly with the number of network ports. ...
Recently, cloud computing has appeared as a modern technology used to host and deliver services over the Internet. Business owners see the cloud as an interesting technology because it removes the need for customers to plan ahead for provisioning. In addition, the cloud simplifies infrastructure planning for new companies starting as small businesses and enables extra resources to be added only if there are many requests for services. Cloud computing can be regarded as a technological revolution in the IT industry; however, cloud evolution is presently in its infancy, accompanied by many challenges that should be addressed. In this paper, an inclusive study of cloud computing is presented, highlighting its main concepts, including its definition and classifications, architecture, well-known applications, serious challenges and popularly used simulators. The goal of this study is to offer better comprehension of the cloud computing design issues and to identify significant research trends in this increasingly significant area.
... • k-ary n-direct s-indirect (KNS) [17]: 4096-node network interconnected by 128 switches with 64 ports and 4096 small switches with 3 ports (parameters: ary = 2, direct = 64, indirect = 1). ...
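The component counts quoted in this excerpt can be reproduced with a short calculation. The sketch below rests on a counting model that is an assumption, not something stated in the excerpts: with s = 1, every line of k routers along a dimension shares one k-port switch, and every end node attaches to a small router with one port per dimension plus one port for the node. The quoted figures come out when the sketch is called with k = 64 and n = 2. The names `KnsInventory` and `kns_inventory` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class KnsInventory:
    end_nodes: int
    routers: int        # small routers, one per end node (assumed)
    router_ports: int   # n switch-facing ports + 1 node-facing port
    switches: int       # one switch per dimension line (assumes s = 1)
    switch_ports: int   # each switch serves the k routers of its line


def kns_inventory(k: int, n: int) -> KnsInventory:
    """Count components of a k-ary n-direct 1-indirect network
    under the assumptions described in the surrounding text."""
    if k < 2 or n < 1:
        raise ValueError("need k >= 2 and n >= 1")
    end_nodes = k ** n
    # There are k**(n-1) lines of k routers in each of the n dimensions.
    switches = n * k ** (n - 1)
    return KnsInventory(
        end_nodes=end_nodes,
        routers=end_nodes,
        router_ports=n + 1,
        switches=switches,
        switch_ports=k,
    )


if __name__ == "__main__":
    print(kns_inventory(64, 2))
```

With these inputs the sketch yields 4096 end nodes, 128 switches with 64 ports and 4096 routers with 3 ports, matching the figures in the excerpt. The parameter labels in the excerpt are left as quoted.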