Victor C. M. Leung’s research while affiliated with Shenzhen University and other places


Publications (546)


A Text Detection Method Based on Multi-Scale Selective Fusion Feature Pyramid and Multi-Semantic Spatial Network for Visual IoT
  • Article

January 2025

IEEE Internet of Things Journal

Manli Wang · Zeya Dou · [...] · Victor C. M. Leung

With the rapid development of the Visual Internet of Things (VIoT) and text detection technology, the two have been widely combined and applied at many industrial production sites, for example in label text detection, with impressive results. However, text detection technology still has several shortcomings: 1) existing VIoT systems have very limited detection precision for text with large scale changes, especially small-scale text; 2) existing text detection algorithms do not match real conditions, because labels often contain handwritten text and the text to be detected is arbitrarily shaped; 3) in practice, text labels often have creases or defects. To solve these problems, this paper designs a text detection method based on a multi-scale selective fusion feature pyramid and a multi-semantic spatial network to assist the VIoT system in detecting label text. First, a multi-scale selective fusion feature pyramid is designed, which not only uses a texture extraction module to improve text texture feature and multi-scale feature extraction, but also uses a cross-scale selective fusion block to selectively fuse the features of different stages and reduce the influence of pollution on detection. In addition, a multi-semantic spatial network is designed to capture the multi-semantic spatial information of each feature channel using multi-scale deep shared one-dimensional convolution, which effectively integrates global context dependence and multi-semantic spatial priors. Experimental results show that the comprehensive F-measure on the public datasets ICDAR2015, Total-Text, and CTW1500 increases by 5.7%, 3.3%, and 3.8%, respectively. Furthermore, the precision, recall, and F-measure on the Label-Text dataset are 94.6%, 90.7%, and 92.6%, respectively. The label text detection VIoT system we designed has been deployed in the field and achieved excellent performance.
The code of our proposed method can be found in: https://github.com/rebornone1/MSNet
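
The relationship between the three Label-Text figures quoted in the abstract can be checked with a short sketch. It assumes the F-measure is the standard balanced harmonic mean of precision and recall (F1); the abstract does not state the definition, and the function name `f_measure` is illustrative rather than taken from the linked repository.

```python
def f_measure(precision: float, recall: float) -> float:
    """Balanced F-measure (F1): harmonic mean of precision and recall.

    Assumption: the abstract's "F-measure" is the standard F1 score.
    """
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Precision and recall (in percent) as reported for the Label-Text dataset.
label_text_f = f_measure(94.6, 90.7)
print(round(label_text_f, 1))  # 92.6
```

Under that assumption, the reported precision and recall reproduce the reported F-measure of 92.6%.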


Adaptive Density Estimation for Personalized Recommendations Across Varied User Activity Levels

January 2025

IEEE Transactions on Computational Social Systems

Top-N recommendation systems are recognized as highly effective for delivering personalized services that cater to the varied interests of users. Nonetheless, current state-of-the-art (SOTA) analyses reveal a marked variability in their performance across users with differing levels of activity, which substantially undermines the quality of personalized recommendation services. Prevailing research tends to overlook this discrepancy, often presuming a uniform probability distribution in user preferences and employing a static model (such as a single latent vector) for user representation. This oversimplification impedes the adaptability of existing models to accommodate the spectrum of user activity levels. In our research, we introduce the variational kernel density estimation (VKDE) approach, an innovative nonparametric method designed to accurately capture the unique preference distributions of individual users. The VKDE framework integrates multiple local distributions to construct a comprehensive global preference profile for each user. We have developed a novel variational kernel function that delineates user-specific interests and constructs each local distribution accordingly. Additionally, we present a tailored sampling strategy that simplifies the complexity of the training process while preserving the efficacy of the recommendations. Empirical evaluations conducted on four widely recognized public datasets demonstrate that our VKDE model achieves superior performance over the SOTA alternatives, significantly enhancing accuracy for users with a broad range of activity levels.
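
As background for the abstract above, the following is a minimal sketch of classical one-dimensional Gaussian kernel density estimation, in which a global density is built by averaging local kernels centred on observed samples. It is not the VKDE method: the variational kernel and sampling strategy are not described in enough detail here to reproduce, and the function name, bandwidth, and sample values are hypothetical.

```python
import math
from typing import Callable, Sequence


def gaussian_kde(samples: Sequence[float], bandwidth: float) -> Callable[[float], float]:
    """Return a density function that averages one Gaussian kernel per sample.

    Classical KDE only; an illustrative stand-in, not the paper's VKDE.
    """
    if not samples or bandwidth <= 0:
        raise ValueError("need at least one sample and a positive bandwidth")
    norm = len(samples) * bandwidth * math.sqrt(2.0 * math.pi)

    def density(x: float) -> float:
        # Each sample contributes a local bell curve; their mean is the global estimate.
        return sum(math.exp(-0.5 * ((x - s) / bandwidth) ** 2) for s in samples) / norm

    return density


# Hypothetical 1-D "interest" positions for one user, clustered around two tastes.
user_density = gaussian_kde([-2.1, -1.9, -2.0, 3.0], bandwidth=0.5)
print(user_density(-2.0) > user_density(0.5))  # density is higher near the cluster
```

A user with many interactions contributes many kernels and therefore a richer multi-modal profile, which is the intuition a single latent vector cannot express.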


When Generative AI Meets Semantic Communication: Optimizing Radio Map Construction and Distribution in Future Mobile Networks

January 2025 · 2 Reads

IEEE Network

With the rapid development of the Internet of Things and smart cities, the demand for effective spectrum collaboration has grown significantly. Radio maps play a crucial role in understanding the spatial radio environment, which is essential for wireless applications such as cell planning and radio resource management. However, generating radio maps is a resource-intensive task, as the construction process is complex and the distribution process consumes substantial bandwidth. This article proposes a method that combines semantic communication (SemCom) and generative artificial intelligence (GAI) to optimize the construction and distribution processes of radio maps. By incorporating SemCom, our method transmits only key semantic information during the distribution process, significantly reducing bandwidth usage while ensuring accurate and efficient information transfer. Although the diverse generative capabilities of GAI introduce some instability, which could be a limitation in radio map construction, this article achieves precise content decoding through prompts based on multi-modal semantic information, enabling accurate construction and restoration of radio maps. By utilizing SemCom and GAI technologies, the cost of radio map applications can be significantly reduced, which helps accelerate the evolution of intelligence in future mobile networks.


Efficient Detection Framework Adaptation for Edge Computing: A Plug-and-play Neural Network Toolbox Enabling Edge Deployment

December 2024 · 11 Reads

Edge computing has emerged as a key paradigm for deploying deep learning-based object detection in time-sensitive scenarios. However, existing edge detection methods face challenges: 1) difficulty balancing detection precision with lightweight models, 2) limited adaptability of generalized deployment designs, and 3) insufficient real-world validation. To address these issues, we propose the Edge Detection Toolbox (ED-TOOLBOX), which utilizes generalizable plug-and-play components to adapt object detection models for edge environments. Specifically, we introduce a lightweight Reparameterized Dynamic Convolutional Network (Rep-DConvNet) featuring weighted multi-shape convolutional branches to enhance detection performance. Additionally, we design a Sparse Cross-Attention (SC-A) network with a localized-mapping-assisted self-attention mechanism, enabling a well-crafted joint module for adaptive feature transfer. For real-world applications, we incorporate an Efficient Head into the YOLO framework to accelerate edge model optimization. To demonstrate practical impact, we identify a gap in helmet detection -- overlooking band fastening, a critical safety factor -- and create the Helmet Band Detection Dataset (HBDD). Using ED-TOOLBOX-optimized models, we address this real-world task. Extensive experiments validate the effectiveness of ED-TOOLBOX, with edge detection models outperforming six state-of-the-art methods in visual surveillance simulations, achieving real-time and accurate performance. These results highlight ED-TOOLBOX as a superior solution for edge object detection.


Facial Expression Analysis and Its Potentials in IoT Systems: A Contemporary Survey
  • Preprint
  • File available

December 2024 · 21 Reads

Facial expressions convey human emotions and can be categorized into macro-expressions (MaEs) and micro-expressions (MiEs) based on duration and intensity. While MaEs are voluntary and easily recognized, MiEs are involuntary, rapid, and can reveal concealed emotions. The integration of facial expression analysis with Internet of Things (IoT) systems has significant potential across diverse scenarios. In smart healthcare, IoT-enhanced MaE analysis enables real-time monitoring of patient emotions, facilitating improved mental health care. Similarly, in smart security, IoT-based MiE detection enhances surveillance accuracy and threat detection. This work provides a comprehensive overview of research progress in facial expression analysis and explores its integration with IoT systems. We discuss the distinctions between our work and existing surveys, elaborate on advancements in MaE and MiE techniques across various learning paradigms, and examine their potential applications in IoT. We highlight challenges and future directions for the convergence of facial expression-based technologies and IoT systems, aiming to foster innovation in this domain. By presenting recent developments and practical applications, this study offers a systematic understanding of how facial expression analysis can enhance IoT systems in healthcare, security, and beyond.


UAV Virtual Antenna Array Deployment for Uplink Interference Mitigation in Data Collection Networks

December 2024 · 23 Reads

Unmanned aerial vehicles (UAVs) have gained considerable attention as a platform for establishing aerial wireless networks and communications. However, the line-of-sight dominance in air-to-ground communications often leads to significant interference with terrestrial networks, reducing communication efficiency among terrestrial terminals. This paper explores a novel uplink interference mitigation approach based on the collaborative beamforming (CB) method in multi-UAV network systems. Specifically, the UAV swarm forms a UAV-enabled virtual antenna array (VAA) to transmit the gathered data to multiple base stations (BSs) for data backup and distributed processing. However, there is a trade-off between the effectiveness of CB-based interference mitigation and the energy conservation of UAVs. Thus, by jointly optimizing the excitation current weights and hover positions of the UAVs as well as the sequence of data transmission to the various BSs, we formulate an uplink interference mitigation multi-objective optimization problem (MOOP) that simultaneously reduces interference, enhances transmission efficiency, and improves energy efficiency. In response to the computational demands of the formulated problem, we introduce an evolutionary computation method, namely the chaotic non-dominated sorting genetic algorithm II (CNSGA-II) with multiple improved operators. Simulation results show that the proposed CNSGA-II efficiently addresses the formulated MOOP and outperforms several comparison algorithms. Moreover, the proposed CB-based uplink interference mitigation approach can significantly reduce the interference caused by UAVs to non-receiving BSs.
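
The non-dominated sorting that NSGA-II-style algorithms rely on rests on a Pareto dominance test, sketched below. This is only that generic building block, not CNSGA-II and not the paper's operators; it assumes every objective is minimized, and the objective tuples (interference, energy) are made-up illustrative values.

```python
from typing import List, Sequence, Tuple

Objectives = Tuple[float, ...]


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """True if a is no worse than b on every objective and strictly better on one.

    Assumption: all objectives are to be minimized.
    """
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def pareto_front(points: List[Objectives]) -> List[Objectives]:
    """Keep the points that no other point dominates (the first non-dominated front)."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


# Hypothetical (interference, energy) values for four candidate deployments.
candidates = [(1.0, 5.0), (2.0, 2.0), (3.0, 3.0), (5.0, 1.0)]
print(pareto_front(candidates))  # [(1.0, 5.0), (2.0, 2.0), (5.0, 1.0)]
```

Here (3.0, 3.0) is discarded because (2.0, 2.0) is better on both objectives, while the remaining three represent different trade-offs that a multi-objective solver would keep.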




Reliable or Green? Continual Individualized Inference Provisioning in Fabric Metaverse via Multi-Exit Acceleration

December 2024 · 5 Reads · 1 Citation

IEEE Transactions on Mobile Computing

The fabric metaverse employs intelligent fibers embedded with flexible sensors to unknowingly gather and transmit massive hypermodal data around humans to a deep neural network-based metaverse inference service (DMS) for continual and real-time analysis. Each DMS has one primary branch and multiple side branches that allow early termination of service with differential accuracy and energy consumption. However, the continual provisioning of compute-intensive DMSs with varying requirements for service model, accuracy, delay, and reliability poses a challenge for edge servers characterized by restricted computing resources and intermittent green energy. In this paper, we focus on a continual individualized DMS provisioning problem in the fabric metaverse, consisting of a side branch insertion subproblem and a server activation and service deployment subproblem, and formulate them as an Integer Linear Program and a Markov Decision Process, respectively. Then, we propose a green continual inference (GCI) system, where a pruner with provable approximation ratios trims superfluous branches of every model to the given number K to minimize total overflow accuracy between accuracy demands and reserved branches assigned to users. Based on this exit result, each DMS is further divided into several blocks with dependencies to exploit constrained computing and energy resources in a fine-grained manner. Finally, a learning-based scheduler is merged into GCI to maximize request throughput while minimizing the number of activated edge servers under different demand scenarios, by adaptively activating suitable servers and deploying required blocks and their corresponding backups on selected servers. Theoretical analyses, simulations, and experiments demonstrate that GCI is promising compared with baseline algorithms.


Multi-Objective Optimization for Multi-UAV-Assisted Mobile Edge Computing

December 2024 · 18 Reads · 1 Citation

IEEE Transactions on Mobile Computing

Recent developments in unmanned aerial vehicles (UAVs) and mobile edge computing (MEC) have provided users with flexible and resilient computing services. However, meeting the computation-intensive and delay-sensitive demands of users poses a significant challenge due to the limited resources of UAVs. To address this challenge, we consider a multi-UAV-assisted MEC system. Based on this system, we formulate a multi-objective optimization problem that aims to minimize the total task completion delay, reduce the total UAV energy consumption, and maximize the total number of offloaded tasks. Since the problem is a mixed-integer non-linear programming (MINLP) problem and NP-hard, we propose a joint task offloading, computation resource allocation, and UAV trajectory control (JTORATC) approach. To cope with the coupling of the decision variables, the problem is split into three sub-problems, each of which is solved individually to obtain the corresponding decisions. Specifically, the task offloading sub-problem is solved using distributed splitting and threshold rounding methods, the computation resource allocation sub-problem is solved with the Karush-Kuhn-Tucker (KKT) method, and the UAV trajectory control sub-problem is solved with the successive convex approximation (SCA) method. Simulation results show that the proposed JTORATC outperforms the other benchmark methods.


Citations (32)


... Furthermore, in [26], the authors utilized CB to address the long-distance and energy-saving uplink transmission problem in UAV-assisted mobile wireless sensor networks. Additionally, Li et al. [27] explored an uplink communication approach from terrestrial terminal to satellite based on the distributed CB. Although the above works can effectively solve some problems in wireless networks by applying the CB method, they ignore a key issue, namely the interference of the data propagation process to non-target receivers. ...

Reference:

UAV Virtual Antenna Array Deployment for Uplink Interference Mitigation in Data Collection Networks
Collaborative Ground-Space Communications via Evolutionary Multi-Objective Deep Reinforcement Learning
  • Citing Article
  • December 2024

IEEE Journal on Selected Areas in Communications

... Retaining the principal components of the pre-trained weights aligns the direction and magnitude of weight updates across different clients to handle data heterogeneity. Additionally, FFA-LoRA [102] freezes one low-rank matrix and fine-tunes only the other. This reduces inconsistency during server aggregation of LoRA gradients, alleviating the optimization instability caused by non-IID data. ...

FedFMSL: Federated Learning of Foundation Models With Sparsely Activated LoRA
  • Citing Article
  • December 2024

IEEE Transactions on Mobile Computing

... The 6G system collects operating data in real time from machine tools, workshops, and accessory components by using ultra-high bandwidth, extremely low latency, and great dependability. Through the integration of edge computing and AI technology, the system enables the direct monitoring and transmission of data at the terminal level for real-time order execution [43,44]. In 6G, blockchain technology facilitates the direct exchange of data among all terminals in a smart factory without the need for an intermediary transportation center. ...

Dependency-Aware Microservice Deployment for Edge Computing: A Deep Reinforcement Learning Approach With Network Representation
  • Citing Article
  • December 2024

IEEE Transactions on Mobile Computing

... Existing network optimization methods primarily include numerical approaches based on optimization theory [5], [8], [10], [12]- [14] and fitting algorithms based on machine learning [4], [9], [11], [15]- [22], with some work exploring the use of deep learning to enhance numerical methods [23]- [25]. For problems without complex characteristics, it is often straightforward to apply classical algorithms. ...

Multi-Objective Optimization for Multi-UAV-Assisted Mobile Edge Computing
  • Citing Article
  • December 2024

IEEE Transactions on Mobile Computing

... The F²NAS method is then applied to enhance the learning process and improve the overall performance of the request scheduling by incorporating its key components. For the purpose of this experiment, we employ real-world data traces collected from a reputable Edge Computing Service Provider (ESP) [54], [55] to ensure the validity and reliability of our results. The experimental setup consists of two clusters of edge servers situated in distinct geographic regions to mimic spatial distribution. ...

Cur-CoEdge: Curiosity-Driven Collaborative Request Scheduling in Edge-Cloud Systems
  • Citing Conference Paper
  • May 2024

... The Stackelberg game model, based on the economic theory, has been successfully applied to the incentive mechanism of task offloading in VEC scenarios [22]. Sun et al. [23] designed a collaborative algorithm, named BARGAIN-MATCH, to fuse a bargaining-based incentive method for resource allocation and a horizontal and vertical collaboration approach based on matching rules to effectively manage task offloading requests. ...

Stackelberg Game-Based Dependency-Aware Task Offloading and Resource Pricing in Vehicular Edge Networks
  • Citing Article
  • October 2024

IEEE Internet of Things Journal

... Many such methods involving machine learning models like support vector machines (SVM), decision trees, random forest and neural networks have been employed to classify the network traffic messages as spam or real. By training these models on a labeled dataset that consists of spam and non-spam messages, they essentially learn to recognize more complex patterns that cannot be easily recognized using traditional techniques [20][21][22][23]. ...

A Cross-Field Deep Learning-based Fuzzy Spamming Detection Approach Via Collaboration of Behavior Modeling and Sentiment Analysis
  • Citing Article
  • December 2024

IEEE Transactions on Fuzzy Systems

... In edge-cloud collaboration, edge devices and the cloud cooperate to balance system performance with resource constraints. Time-sensitive tasks are processed locally on edge devices, while more complex or less time-critical workloads are uploaded to the cloud for additional computational support (e.g., [17], [18], [19]). This cooperative mechanism has been leveraged to accelerate inference for large language models (LLMs) [19]. ...

Large Language Models (LLMs) Inference Offloading and Resource Allocation in Cloud-Edge Computing: An Active Inference Approach
  • Citing Article
  • December 2024

IEEE Transactions on Mobile Computing

... With the advancement of network technology, Unmanned Aerial Vehicles (UAVs) have been used as flying nodes to monitor the environment remotely [1][2][3]. These UAVs are part of an ad hoc network called the Flying Ad hoc Network (FANET) [4,5]. There are two types of communication in FANET: UAV-to-UAV communication [6] and UAV-to-ground station communication [6]. ...

Collaborative computation offloading and wireless charging scheduling in multi-UAV-assisted MEC networks: A TD3-based approach
  • Citing Article
  • June 2024

Computer Networks

... The authors of [27,28] jointly optimized RIS and transmit beamforming to focus signal energy on legitimate users while disrupting the eavesdropper's channel. In [29], the BS was used to perform beamforming on private information for legitimate users while employing AN to jam eavesdroppers, thereby ensuring the transmission security of ISAC systems assisted by aerial RIS. The authors of [30,31] demonstrated that deploying RIS on UAVs can further expand transmission coverage and enhance system performance, revealing new degrees of freedom (DoFs) for ISAC systems. ...

Active Aerial Reconfigurable Intelligent Surface Assisted Secure Communications: Integrating Sensing and Positioning
  • Citing Article
  • October 2024

IEEE Journal on Selected Areas in Communications