Xuan Liu’s research while affiliated with Hunan University and other places


Publications (93)


Matching Gains with Pays: Effective and Fair Learning in Multi-Agent Public Goods Dilemmas
  • Chapter

October 2024 · 4 Reads

Yitian Chen · Xuan Liu · [...]

The training of multi-agent reinforcement learning (MARL) tasks with the public goods dilemma (PGD) is difficult because the selfish actions of individual agents seeking high personal rewards may reduce the collective utility of the whole group. Existing solutions to this problem, e.g., reward gifting or intrinsic rewards, induce cooperation among agents in small groups but cannot guarantee fairness among agents’ policies and fail to achieve optimal group utility in large-scale systems. In this paper, we propose F4PGD, an effective method to train large-scale MARL tasks with PGD in a decentralized manner. It is inspired by Adams’ equity theory, which holds that the match between a person’s payoff and their contribution is the key incentive for people to contribute to the common good. In F4PGD, a mechanism is designed to match an agent’s reward with its contribution, which discourages agents from free riding while encouraging well-learned agents to contribute to public goods. Experimental results show that F4PGD effectively learns optimal policies for the whole group and guarantees fairness among agents in several typical MARL tasks with PGD.


Selective Learning for Sample-Efficient Training in Multi-Agent Sparse Reward Tasks (Extended Abstract)

August 2024

Learning effective strategies in sparse reward tasks is one of the fundamental challenges in reinforcement learning. This becomes extremely difficult in multi-agent environments, as the concurrent learning of multiple agents induces the non-stationarity problem and a sharply increased joint state space. Existing works have attempted to promote multi-agent cooperation through experience sharing. However, learning from a large collection of shared experiences is inefficient as there are only a few high-value states in sparse reward tasks, which may instead lead to the curse of dimensionality in large-scale multi-agent systems. This paper focuses on sparse-reward multi-agent cooperative tasks and proposes an effective experience-sharing method, Multi-Agent Selective Learning (MASL), to boost sample-efficient training by reusing valuable experiences from other agents. MASL adopts a retrogression-based selection method to identify high-value traces of agents from the team rewards, based on which some recall traces are generated and shared among agents to motivate effective exploration. Moreover, MASL selectively considers information from other agents to cope with the non-stationarity issue while enabling efficient training for large-scale agents. Experimental results show that MASL significantly improves sample efficiency compared with state-of-the-art MARL algorithms in cooperative tasks with sparse rewards.


Gradient Diffusion: A Perturbation-Resilient Gradient Leakage Attack

July 2024 · 2 Reads

Recent years have witnessed the vulnerability of Federated Learning (FL) against gradient leakage attacks, where the private training data can be recovered from the exchanged gradients, making gradient protection a critical issue for the FL training process. Existing solutions often resort to perturbation-based mechanisms, such as differential privacy, where each participating client injects a specific amount of noise into local gradients before aggregating to the server, and the global distribution variation finally conceals the gradient privacy. However, perturbation is not always the panacea for gradient protection since the robustness heavily relies on the injected noise. This intuition raises an interesting question: is it possible to deactivate existing protection mechanisms by removing the perturbation inside the gradients? In this paper, we answer yes and propose the Perturbation-resilient Gradient Leakage Attack (PGLA), the first attempt to recover the perturbed gradients without additional access to the original model structure or third-party data. Specifically, we leverage the inherent diffusion property of gradient perturbation protection and construct a novel diffusion-based denoising model to implement PGLA. Our insight is that capturing the disturbance level of perturbation during the diffusion reverse process can release the gradient denoising capability, which enables the diffusion model to generate gradients that approximate the original clean version through adaptive sampling steps. Extensive experiments demonstrate that PGLA effectively recovers the protected gradients and exposes the FL training process to the threat of gradient leakage, achieving the best quality in gradient denoising and data recovery compared to existing models. We hope to draw public attention to PGLA and its defense.




Cautiously-Optimistic Knowledge Sharing for Cooperative Multi-Agent Reinforcement Learning

March 2024 · 1 Citation

Proceedings of the AAAI Conference on Artificial Intelligence

While decentralized training is attractive in multi-agent reinforcement learning (MARL) for its excellent scalability and robustness, its inherent coordination challenges in collaborative tasks result in numerous interactions for agents to learn good policies. To alleviate this problem, action advising methods make experienced agents share their knowledge about what to do, while less experienced agents strictly follow the received advice. However, this method of sharing and utilizing knowledge may hinder the team's exploration of better states, as agents can be unduly influenced by suboptimal or even adverse advice, especially in the early stages of learning. Inspired by the fact that humans can learn not only from the success but also from the failure of others, this paper proposes a novel knowledge sharing framework called Cautiously-Optimistic kNowledge Sharing (CONS). CONS enables each agent to share both positive and negative knowledge and cautiously assimilate knowledge from others, thereby enhancing the efficiency of early-stage exploration and the agents' robustness to adverse advice. Moreover, considering the continuous improvement of policies, agents value negative knowledge more in the early stages of learning and shift their focus to positive knowledge in the later stages. Our framework can be easily integrated into existing Q-learning based methods without introducing additional training costs. We evaluate CONS in several challenging multi-agent tasks and find it excels in environments where optimal behavioral patterns are difficult to discover, surpassing the baselines in terms of convergence rate and final performance.


CMMR: A Composite Multidimensional Models Robustness Evaluation Framework for Deep Learning

February 2024 · 5 Reads · 2 Citations

Lecture Notes in Computer Science

Accurately evaluating the defense models against adversarial examples has been proven to be a challenging task. We have recognized the limitations of mainstream evaluation standards, which fail to account for the discrepancies in evaluation results arising from different adversarial attack methods, experimental setups, and metrics sets. To address these disparities, we propose the Composite Multidimensional Model Robustness (CMMR) evaluation framework, which integrates three evaluation dimensions: attack methods, experimental settings, and metrics sets. By comprehensively evaluating the model’s robustness across these dimensions, we aim to effectively mitigate the aforementioned variations. Furthermore, the CMMR framework allows evaluators to flexibly define their own options for each evaluation dimension to meet their specific requirements. We provide practical examples to demonstrate how the CMMR framework can be utilized to assess the performance of models in enhancing robustness through various approaches. The reliability of our methodology is assessed through both practical examinations and theoretical validations. The experimental results demonstrate the excellent reliability of the CMMR framework and its significant reduction of variations encountered in evaluating model robustness in practical scenarios.


Universal and Scalable Weakly-Supervised Domain Adaptation

February 2024 · 13 Reads · 1 Citation

IEEE Transactions on Image Processing

Domain adaptation leverages labeled data from a source domain to learn an accurate classifier for an unlabeled target domain. Since the data collected in practical applications usually contain noise, weakly-supervised domain adaptation, which tolerates source domains with label noise and/or feature noise, has attracted widespread attention from researchers. Several weakly-supervised domain adaptation methods have been proposed to mitigate the difficulty of obtaining high-quality source domains that are highly related to the target domain. However, these methods assume that the accurate noise rate is known in advance in order to reduce the negative transfer caused by noise in the source domain, which limits their application in the real world, where the noise rate is unknown. Meanwhile, since source data usually comes from multiple domains, the naive application of single-source domain adaptation algorithms may lead to sub-optimal results. We hence propose a universal and scalable weakly-supervised domain adaptation method called PDCAS to relax such assumptions and make it more general. Specifically, PDCAS includes two stages: progressive distillation and domain alignment. In the progressive distillation stage, we iteratively distill out potentially clean samples whose annotated labels are highly consistent with the model's predictions and correct the labels of noisy source samples. This process is unsupervised, exploiting intrinsic similarity to measure and extract the initial corrected samples. In the domain alignment stage, we consider Class-Aligned Sampling, which balances the samples for both source and target domains along with the global feature distributions to alleviate the shift of label distributions. Finally, we apply PDCAS in the multi-source noisy scenario and propose a novel multi-source weakly-supervised domain adaptation method called MSPDCAS, which shows the scalability of our framework. Extensive experiments on the Office-31 and Office-Home datasets demonstrate the effectiveness and robustness of our method compared to state-of-the-art methods.


Mutual Gradient Inversion: Unveiling Privacy Risks of Federated Learning on Multi-Modal Signals

January 2024 · 5 Reads

IEEE Signal Processing Letters

Federated Learning (FL) preserves privacy by training a global model via gradient exchange between the parameter server and local clients rather than raw data sharing. Edge devices with sensors serve as local clients receiving multimodal signals and contributing multimodal data for training. Despite the privacy-centric design of FL, it remains vulnerable to gradient leakage attacks. However, existing studies predominantly focus on single-modality data recovery from gradients, leaving a critical research void in multimodal data scenarios. In this letter, we propose MGIS: Mutual Gradient Inversion Strategy, the first gradient inversion attack and defense paradigm dealing with multimodal data. Inspired by knowledge distillation, MGIS utilizes common information (e.g., labels) between different modalities to extract multimodal data from gradients. Experimental results demonstrate that MGIS outperforms single-modality gradient attacks in the quality of private data recovery and highlight the increased privacy leakage risk associated with multi-modality data compared to single-modality data.


Advancing RFID Technology for Virtual Boundary Detection

January 2024 · 1 Read

IEEE Transactions on Mobile Computing

A boundary is a physical or virtual line that marks the edge or limit of a specific region, and it has been widely used in many applications, such as autonomous driving, virtual walls, and robotic lawn mowers. However, no existing work balances the deployability and the scalability of a boundary well. In this paper, we propose a new RFID-based virtual boundary scheme together with its detection algorithm, called RF-Boundary, which has the competitive advantages of being battery-free and easy to maintain. We develop two techniques, phase gradient and dual-antenna AoA, to address the key challenges posed by RF-Boundary: the lack of calibration information and multi-edge interference. In addition, we consider the presence of multipath in real-world applications, model its effect on signals in dynamic scenarios, and demonstrate the robustness of our phase gradient-based scheme under multipath. We implement a prototype of RF-Boundary with commercial RFID systems and a mobile robot. Extensive experiments verify the feasibility as well as the good performance of RF-Boundary, with a mean detection error of only 8.6 cm.


Citations (66)


... Existing frameworks for teacher-student communities are commonly employed in Reinforcement Learning (RL) [7], [8], [9], [10]. However, few methods have been proposed for multinode supervised or semi-supervised teacher-student learning [11], [12]. ...

Reference:

Collaborative Knowledge Distillation via a Learning-by-Education Node Community
Cautiously-Optimistic Knowledge Sharing for Cooperative Multi-Agent Reinforcement Learning
  • Citing Article
  • March 2024

Proceedings of the AAAI Conference on Artificial Intelligence

... As illustrated in Fig. 1, the overall framework of the model consists of local training processes on the client side and global aggregation processes on the server side, with the client and server exchanging weight calculations through uploads and downloads [18], [19]. For the clients, they must compute differential privacy on the model parameters after local training, and then upload the noise-added parameters to the server. ...

LDCSF: Local depth convolution-based Swim framework for classifying multi-label histopathology images
  • Citing Conference Paper
  • Full-text available
  • December 2023

... These models construct a generalizable architecture through large-scale pre-training on protein sequences and extract diverse and complementary features as embeddings. Numerous prediction models utilizing pLM embeddings have been reported in current literature, such as bindEmbed21DL [39], DeepProSite [50], EquiPNAS [27], ULDNA [51], ESM-NBR [52], and CLAPE [53]. These models consistently outperform those without language model embeddings. ...

ESM-NBR: fast and accurate nucleic acid-binding residue prediction via protein language model feature representation and multi-task learning
  • Citing Conference Paper
  • December 2023

... However, recent studies indicate that federated learning does not provide full privacy protection [7,8]. Therefore, it is essential to implement further privacy protection techniques. ...

MGIA: Mutual Gradient Inversion Attack in Multi-Modal Federated Learning (Student Abstract)
  • Citing Article
  • June 2023

Proceedings of the AAAI Conference on Artificial Intelligence

... The section thoroughly analyses areas that the studies focused on using passwordless authentication. This paper organizes these areas into eight main domains consisting of Security, IoT, Public Sector and Services, Network and Infrastructure, Business and Economy, Education and Research, Lifestyle and Tourism and Agriculture. Each domain is accompanied by a specific count of research papers and the corresponding percentage representation, providing a comprehensive overview of the research landscape as shown in Figure 5 and TABLE 4. [19], [20], [21], [22], [23], [24], [25], [26], [27], [28], [29], [30], [31], [32], [33], [34], [35], [36], [37], [38], [39], [40], [41], [42], [43], [44], [45], [46], [47], [48], [49], [50], [51], [52], [53], [54], [55], [56], [57], [58], [59], [60], [61], [62], [63], [64], [65], [66], [67], [68], [69]. ...

RFID Authentication System Based on User Biometric Information

Applied Sciences

... In the rapidly evolving fields of the Internet of Things (IoT) [1]- [3] and smart devices, precise indoor positioning technology has become a cornerstone for numerous applications, including smart manufacturing, automated logistics, emergency rescue, and personal navigation. Although GPS and other satellite navigation systems provide high precision location services in outdoor environments, their effectiveness diminishes indoors due to signal attenuation and multipath effects. ...

RF-Siamese: Approaching Accurate RFID Gesture Recognition With One Sample

IEEE Transactions on Mobile Computing

... Everyday objects' sensing capabilities using long-range RFID in the IoT are identified in a study 17 , detecting user presence at 96.7% and daily activities at 82.8%. An RFID-based gesture recognition system is proposed 18 , achieving an experimental accuracy of 97.2% with 18 different gestures. Furthermore, RFID tattoos 14 are also used for speech recognition. ...

Real-Time and Accurate Gesture Recognition With Commercial RFID Devices

IEEE Transactions on Mobile Computing

... A commercial RFID is utilised for speech recognition 16 with multiple tags embedded on a transparent sheet to detect a single word. The system achieves an accuracy of 0.95% in detecting user speech and can recognise a vocabulary of 20 words with an average accuracy classification of 0.88%. ...

HearMe: Accurate and Real-time Lip Reading based on Commercial RFID Devices

IEEE Transactions on Mobile Computing