Huixia Zhang’s research while affiliated with Xidian University and other places


Publications (2)


Alarm: An Adaptive Routing Algorithm Based on One-Way Delay for Infiniband
  • Article

July 2024 · 15 Reads

IEEE Transactions on Network Science and Engineering

Changyun Luo · [...] · Weigang Hou

Based on the most recent TOP500 rankings, InfiniBand (IB) is the dominant network architecture among the top 10 supercomputers. Yet it primarily employs deterministic routing, which tends to balance network traffic poorly. Whereas deterministic routing always chooses the same forwarding path, adaptive routing lets packets take different paths for each source-destination pair. Current adaptive routing methods in HPC networks typically select paths based on switch queue occupancy. Queue length reflects local congestion, but it is hard to combine such fragmented information into an accurate picture of the full path. In this paper, we introduce Alarm, an adaptive routing system that uses probabilistic path selection based on one-way delay. One-way delay gives a more complete view of congestion from source to destination and also captures the characteristics of network flows. Alarm measures the one-way delay of each path using data packets, so it needs neither separate delay-probe packets nor clock synchronization. The probabilistic selection uses weights determined by the one-way delay, which prevents bottleneck links from forming during congestion updates. Routing decisions under Alarm are made per flowlet. Guided by the delay measurements, the gap between flowlets is dynamically adjusted to match the maximum delay variation across paths, which prevents out-of-order packet delivery. Simulation results show that Alarm achieves 2.0× and 1.7× better average and p99 flow completion time (FCT) slowdown, respectively, than existing adaptive routing.
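The two ideas in the abstract above, delay-weighted probabilistic path selection and a flowlet gap tied to the delay spread, can be sketched roughly as follows. This is only an illustration under stated assumptions, not the paper's algorithm: the abstract does not give the weight formula, so inverse-delay weights are assumed here, and `path_weights`, `flowlet_gap`, `pick_path`, and `FlowletRouter` are hypothetical names. Per-path one-way delays are taken as given inputs in microseconds.

```python
import random


def path_weights(delays_us):
    """Assumed rule: weight each path by 1/delay, normalized to sum to 1."""
    inverse = [1.0 / d for d in delays_us]
    total = sum(inverse)
    return [w / total for w in inverse]


def flowlet_gap(delays_us):
    """Gap that covers the largest delay difference between any two paths."""
    return max(delays_us) - min(delays_us)


def pick_path(delays_us, rng=random):
    """Pick a path index at random, favoring lower-delay paths."""
    weights = path_weights(delays_us)
    return rng.choices(range(len(delays_us)), weights=weights, k=1)[0]


class FlowletRouter:
    """Hypothetical per-flowlet router: a flow keeps its path until it
    pauses for longer than the flowlet gap, then it may be re-routed."""

    def __init__(self, delays_us, rng=None):
        self.delays_us = list(delays_us)
        self.rng = rng or random.Random()
        self.last_seen = {}  # flow id -> (time of last packet, path index)

    def route(self, flow_id, now_us):
        gap = flowlet_gap(self.delays_us)
        previous = self.last_seen.get(flow_id)
        if previous is not None and now_us - previous[0] <= gap:
            path = previous[1]  # same flowlet: stay on the current path
        else:
            path = pick_path(self.delays_us, self.rng)  # new flowlet
        self.last_seen[flow_id] = (now_us, path)
        return path
```

With delays of 10, 20, and 40 µs, the assumed rule gives the 10 µs path the largest selection probability, and a flow is only eligible for a new path after it has been idle for more than 30 µs, which is the delay spread between the fastest and slowest path.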


FlowStar: Fast Convergence Per-Flow State Accurate Congestion Control for InfiniBand

June 2024 · 5 Reads · 3 Citations

IEEE/ACM Transactions on Networking

According to the latest TOP500 list, InfiniBand (IB) is the most widely used network architecture in the top 10 supercomputers. IB relies on Credit-based Flow Control (CBFC) to provide a lossless network and on InfiniBand congestion control (IB CC) to relieve congestion. However, this combination can create victim flows, because messages are mixed in the same queue, and long-lived congestion spreading, because convergence is slow. To address these problems, we propose FlowStar, a fast convergence per-flow state accurate congestion control for InfiniBand. FlowStar includes two core mechanisms: 1) an optimized per-flow CBFC mechanism that provides flow state control to detect real congestion; and 2) rate adjustment rules that make up for the mismatch between the original IB CC rate regulation and the per-hop CBFC to alleviate congestion spreading. FlowStar implements a per-flow congestion state on switches and can obtain in-flight packet information without additional parameter settings to ensure a lossless network. Evaluations show that FlowStar improves average and tail message completion time under different workloads.
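As a rough illustration of why per-flow credit accounting avoids victim flows, the sketch below models a single hop with generic credit-based flow control in which credits are tracked per flow. It is a simplified assumption, not FlowStar's mechanism or the InfiniBand specification: `CreditLink`, `send`, and `drain` are hypothetical names, and real CBFC operates on buffer blocks per virtual lane rather than on whole packets per flow.

```python
from collections import deque


class CreditLink:
    """One hop with per-flow credits (a hypothetical simplification)."""

    def __init__(self, credits_per_flow):
        self.credits_per_flow = credits_per_flow
        self.credits = {}  # flow id -> credits left at the sender
        self.buffers = {}  # flow id -> packets held at the receiver

    def send(self, flow_id, packet):
        """Send only if the flow holds a credit, so nothing is dropped."""
        left = self.credits.setdefault(flow_id, self.credits_per_flow)
        if left == 0:
            return False  # sender waits; the receiver buffer cannot overflow
        self.credits[flow_id] = left - 1
        self.buffers.setdefault(flow_id, deque()).append(packet)
        return True

    def drain(self, flow_id):
        """Receiver forwards one packet and returns one credit to the sender."""
        queue = self.buffers.get(flow_id)
        if not queue:
            return None
        packet = queue.popleft()
        self.credits[flow_id] += 1
        return packet
```

In this model a congested flow that runs out of credits stalls only itself: another flow on the same link can still send, which is the behavior a shared queue cannot provide.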

Citations (1)


... Congestion control efficiency: High BDP may cause traditional TCP congestion control strategies to scale poorly. Underutilization of the network or, on the other hand, significant congestion might result from ineffective congestion control [174]. Congestion control mechanism optimization is a prerequisite for attaining effective performance in high-speed networks [175]. ...

Reference:

An Investigation of TCP Security and Performance Issues when Deployed in High Speed Networks
FlowStar: Fast Convergence Per-Flow State Accurate Congestion Control for InfiniBand
  • Citing Article
  • June 2024

IEEE/ACM Transactions on Networking