Yiu-Wing Leung’s research while affiliated with Hong Kong Baptist University and other places


Publications (136)


Spectrum Handoff Without Forced Termination in Cognitive Radio Networks
  • Conference Paper

April 2024 · 3 Reads

Yiu-Wing Leung


Energy-Aware Non-Preemptive Task Scheduling With Deadline Constraint in DVFS-Enabled Heterogeneous Clusters

December 2022 · 48 Reads · 22 Citations

IEEE Transactions on Parallel and Distributed Systems

[...] · Xiaowen Chu

Energy conservation of large data centers for high performance computing workloads, such as deep learning with Big Data, is of critical significance, where cutting down a few percent of electricity translates into million-dollar savings. This work studies energy conservation on emerging CPU-GPU hybrid clusters through dynamic voltage and frequency scaling (DVFS). We aim at minimizing the total energy consumption of processing a batch of offline tasks or a sequence of real-time tasks under deadline constraints. We derive a fast and accurate analytical model to compute the appropriate voltage/frequency setting for each task, and assign multiple tasks to the cluster with heuristic scheduling algorithms. In particular, our model stresses the nonlinear relationship between task execution time and processor speed for GPU-accelerated applications, for more accurately capturing real-world GPU energy consumption. In performance evaluation driven by real-world power measurement traces, our scheduling algorithm shows comparable energy savings to the theoretical upper bound. With a GPU scaling interval where analytically at most 36% of energy can be saved, we record 33-35% of energy savings. Our results are applicable to energy management on modern heterogeneous clusters.


Joint Access Point Placement and Power-Channel-Resource-Unit Assignment for IEEE 802.11ax-Based Dense WiFi Network with QoS Requirements

November 2021 · 40 Reads · 6 Citations

IEEE Transactions on Mobile Computing

IEEE 802.11ax is the standard for the new generation WiFi networks. In this paper, we formulate the problem of joint access point (AP) placement and power-channel-resource unit assignment for 802.11ax-based dense WiFi. The objective is to minimize the number of APs. Two quality-of-service (QoS) requirements are to be fulfilled: (1) a two-tier throughput requirement which ensures that the throughput of each station is good enough, and (2) a fault tolerance requirement which ensures that the stations could still use WiFi even when some APs fail. We prove that this problem is NP-hard. To tackle this problem, we first develop an analytic model to derive the throughput of each station under the OFDMA mechanism and a widely used interference model. We then design a heuristic algorithm to find high-quality solutions with polynomial time complexity. Simulation results under both fixed-user and mobile-user cases show that: (1) when the area is small (50 × 50 m²), our algorithm gives the optimal solutions; when the area is larger (80 × 60 m²), our algorithm can reduce the number of APs by 34.9-87.7% as compared to the Random and Greedy algorithms. (2) Our algorithm can always get feasible solutions that fulfill the QoS requirements.



Fig. 2. Our studied heterogeneous CPU-GPU cluster with m servers.
Fig. 3. When memory frequency is fixed, the minimum energy depends on the core voltage only. The data is obtained with $P = 100 + 50 f^{Gm} + 150 (V^{Gc})^2 f^{Gc}$; $t = 25(0.5/f^{Gc} + 0.5/f^{Gm}) + 5$; $g_1(V^{Gc}) = (V^{Gc} - 0.5)/2 + 0.5$ and $f^{Gm}_{o} = f^{Gm}_{\max} = 1.2$. Note that although we use a specific function for demonstration, the finding holds for other general functions of our GPU DVFS modeling scheme.
Fig. 4. The energy consumption and the optimal voltage/frequency setting of the 20 benchmark applications. The x-axis stands for the application index.
Fig. 11. Comparison between the energy consumption of the non-DVFS and DVFS scheduling algorithms.
Fig. 12. The energy consumption with runtime readjustments.
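
The demonstration functions quoted in the Fig. 3 caption can be evaluated numerically to see how energy varies with core voltage once the memory frequency is fixed. The following is a minimal sketch, not the authors' implementation. It assumes that the core runs at the frequency cap g1(V) for each voltage, that energy is power times execution time, and that the voltage range is 0.5 to 1.2. The range, the grid search, and all function names are hypothetical choices made for illustration.

```python
# Illustrative sketch only. The power, time, and g1 formulas copy the
# demonstration settings in the Fig. 3 caption. Running the core at
# f_Gc = g1(V_Gc), the 0.5..1.2 voltage range, and the grid search are
# assumptions that do not come from the paper.

F_GM = 1.2  # fixed memory frequency taken from the caption


def g1(v_gc: float) -> float:
    """Core frequency associated with core voltage v_gc (caption's g1)."""
    return (v_gc - 0.5) / 2 + 0.5


def power(v_gc: float, f_gc: float, f_gm: float = F_GM) -> float:
    """Power model from the caption: P = 100 + 50*f_Gm + 150*V_Gc^2*f_Gc."""
    return 100 + 50 * f_gm + 150 * v_gc ** 2 * f_gc


def exec_time(f_gc: float, f_gm: float = F_GM) -> float:
    """Time model from the caption: t = 25*(0.5/f_Gc + 0.5/f_Gm) + 5."""
    return 25 * (0.5 / f_gc + 0.5 / f_gm) + 5


def energy(v_gc: float) -> float:
    """Energy as power times time, with the core at f_Gc = g1(v_gc) (assumed)."""
    f_gc = g1(v_gc)
    return power(v_gc, f_gc) * exec_time(f_gc)


def best_voltage(v_min: float = 0.5, v_max: float = 1.2, steps: int = 701) -> float:
    """Grid-search the assumed voltage range for the lowest-energy setting."""
    step = (v_max - v_min) / (steps - 1)
    candidates = [v_min + i * step for i in range(steps)]
    return min(candidates, key=energy)


if __name__ == "__main__":
    v_star = best_voltage()
    print(f"lowest-energy core voltage on the grid: {v_star:.3f}")
    print(f"energy at that voltage: {energy(v_star):.1f}")
```

Because the memory frequency is held constant and the core frequency is tied to the voltage, the energy in this sketch is a function of a single variable, which is the point the caption makes.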


Energy-aware Task Scheduling with Deadline Constraint in DVFS-enabled Heterogeneous Clusters
  • Preprint
  • File available

April 2021 · 138 Reads

Energy conservation of large data centers for high-performance computing workloads, such as deep learning with big data, is of critical significance, where cutting down a few percent of electricity translates into million-dollar savings. This work studies energy conservation on emerging CPU-GPU hybrid clusters through dynamic voltage and frequency scaling (DVFS). We aim at minimizing the total energy consumption of processing a batch of offline tasks or a sequence of real-time tasks under deadline constraints. We derive a fast and accurate analytical model to compute the appropriate voltage/frequency setting for each task and assign multiple tasks to the cluster with heuristic scheduling algorithms. In particular, our model stresses the nonlinear relationship between task execution time and processor speed for GPU-accelerated applications, for more accurately capturing real-world GPU energy consumption. In performance evaluation driven by real-world power measurement traces, our scheduling algorithm shows comparable energy savings to the theoretical upper bound. With a GPU scaling interval where analytically at most 36% of energy can be saved, we record 33-35% of energy savings. Our results are applicable to energy management on modern heterogeneous clusters.




ESetStore: An Erasure-Coded Storage System with Fast Data Recovery

March 2020 · 91 Reads · 16 Citations

IEEE Transactions on Parallel and Distributed Systems

Erasure codes have been used extensively in large-scale storage systems to reduce the storage overhead of triplication-based storage systems. One key performance issue introduced by erasure codes is the long time needed to recover from a single failure, which occurs constantly in large-scale storage systems. We present ESetStore, a prototype erasure-coded storage system that aims to achieve fast recovery from failures. ESetStore is novel in the following aspects. We proposed a data placement algorithm named ESet for our ESetStore that can aggregate adequate I/O resources from available storage servers to recover from each single failure. We designed and implemented efficient read and write operations on our erasure-coded storage system via effective use of available I/O and computation resources. Our proposed fast data recovery solution can also enhance existing solutions. We evaluated the performance of ESetStore with extensive experiments on a cluster with 50 storage servers. The evaluation results demonstrate that our recovery performance can obtain linear performance growth by harvesting available I/O resources. With our defined parameter recovery I/O parallelism I under some mild conditions, we can achieve optimal recovery performance, in which ESet enables minimal recovery time. Rather than being an alternative to improve recovery performance, our work can be an enhancement for existing solutions, such as Partial-parallel-repair (PPR), to further improve recovery performance.



Citations (71)


... To mitigate this challenge, the concept of mobile edge computing (MEC) [2] has been proposed as a potential technology to provide real-time services at the wireless network edges (e.g., base stations). In MEC and edge intelligence [3] systems, the dynamic voltage and frequency scaling (DVFS) [4] technique is commonly used to balance the performance of processors by adjusting the computing frequency based on the real-time energy consumption. In addition, by fully or partially offloading DNN inference tasks from mobile devices to the edge servers (e.g., DNN partitioning), we can potentially reduce both inference time and energy consumption of mobile devices. ...

Reference:

DVFS-Aware DNN Inference on GPUs: Latency Modeling and Performance Analysis
Energy-Aware Non-Preemptive Task Scheduling With Deadline Constraint in DVFS-Enabled Heterogeneous Clusters
  • Citing Article
  • December 2022

IEEE Transactions on Parallel and Distributed Systems

... In [46], Qiu et al. presented a joint optimization method for the placement of access points, power assignment, channel assignment, and resource unit assignment in dense IEEE 802.11ax WLAN. ...

Joint Access Point Placement and Power-Channel-Resource-Unit Assignment for IEEE 802.11ax-Based Dense WiFi Network with QoS Requirements
  • Citing Article
  • November 2021

IEEE Transactions on Mobile Computing

... The MFMCF system [22] proposed by Yuan et al. introduces the concept of multi-fingerprints, but these multi-fingerprints essentially involve the fusion of different data sources rather than the features of the same data source under different conditions. Yu et al. propose a solution for time-varying environments [23], where fingerprint databases are constructed separately for different time periods of the day, but this approach can be considered as a pseudo multi-fingerprints method. While Yu et al. considered complex pedestrian environments by training models with data collected under different pedestrian densities [24], they did not take into account the impact of complex spatial environments, which is far more significant than the influence of pedestrians. ...

Multi-Fingerprint for Wireless Localization in Time-Varying Indoor Environment
  • Citing Conference Paper
  • December 2020

... Regarding the traditional TCP/IP-based DCN, there are many recent works proposed to deal with the issue of data storage from different perspectives [10-28]. For example, the random-based storage strategies are simple and popular among several existing systems, such as the Google File System (GFS) [10], Cassandra [11], and the Hadoop Distributed File System (HDFS) [12]. ...

ESet: Placing Data Towards Efficient Recovery for Large-Scale Erasure-Coded Storage Systems
  • Citing Conference Paper
  • August 2016

... In [32], Qiu et al. proposed an energy-efficient method for dense WiFi networks based on IEEE 802.11ax. This method achieved energy savings by jointly optimizing the AP placement and power-channel-resource unit (RU) assignments. ...

Joint Access Point Placement and Power-Channel-Resource-Unit Assignment for 802.11ax-Based Dense WiFi with QoS Requirements
  • Citing Conference Paper
  • July 2020

... Even if some fragments are lost due to network issues, the original block can be reconstructed through erasure coding as long as at least k out of the n fragments are received. This resilience has led to the widespread adoption of erasure coding in the development of distributed storage systems, particularly in network environments where nodes frequently become unavailable or packet loss is common [44][45][46][47][48][49]. Although existing distributed storage systems have focused on providing reliable storage by ensuring data availability, they have not been designed to support streaming services for TV programs. ...

ESetStore: An Erasure-Coded Storage System with Fast Data Recovery
  • Citing Article
  • March 2020

IEEE Transactions on Parallel and Distributed Systems

... Another approach for HS construction is the use of random number generation, by which each SU selects a random seed that is used for CH generation. This idea is used by [17,26,38,44,61-63,70,72,74,76,86,87,100,104,109,110]. The need for random generators adds complexity to the CH sequence generation algorithms and imposes an extra computational cost on hardware-constrained devices. ...

ZOS: A Fast Rendezvous Algorithm Based on Set of Available Channels for Cognitive Radios
  • Citing Conference Paper
  • September 2018

... The transition to the use of UAV groups controlled by a single operator is clearly on the agenda, and it should be noted that the development of appropriate algorithms for controlling UAV groups has been discussed in the literature for a long time (Cheah et al., 2009; Bayindir, 2016; Dorigo et al., 2021). Various methods are used to solve this problem, in particular, those based on self-organization (self-adaptive collective motion) of UAV groups (Zhao et al., 2018), on machine learning (Ding et al., 2023), and on the use of graph theory (Li et al., 2024). There are known works that consider algorithms using a virtual leader who is tracked by all UAVs forming a group (Liu & Gao, 2020). ...

Self-Adaptive Collective Motion of Swarm Robots

IEEE Transactions on Automation Science and Engineering

... However, the timeslot assignment on a given channel is not uniform and may result in a single point of failure. To gather the local information, a multi-radio based approach has been considered in [14,15]. The robustness against PUs' activity can be achieved by profiling (ranking) the available channels based on local channel sensing information. A few CH sequences that consider channel quality in their design can be found in the existing literature: AMRCC [9], gQ-RDV [16], Cross layer RDV [17], NCQ-CH [18] and SUBSET [19]. ...

Cooperative rendezvous protocol for multiple user-pairs in cognitive radio networks
  • Citing Conference Paper
  • April 2018