Xue Liu’s research while affiliated with York University and other places


Publications (583)


Figure 3. Discrete state space: system reward and service quality comparison of various LLMs.
Figure 4. Continuous state space: system reward comparison of various LLMs.
Figure 5. Continuous state space: power consumption comparison of various LLMs.
Figure 6. Continuous state space: average reward comparison under different data rate constraints.
Figures 3 to 8 show the simulation results and comparisons; two further figures are not reproduced here.

Prompting Wireless Networks: Reinforced In-Context Learning for Power Control
  • Preprint
  • File available

June 2025 · Dun Yuan · [...] · Zhang

To manage and optimize constantly evolving wireless networks, existing machine learning (ML)-based studies operate as black-box models, leading to increased computational costs during training and a lack of transparency in decision-making, which limits their practical applicability in wireless networks. Motivated by recent advancements in large language model (LLM)-enabled wireless networks, this paper proposes ProWin, a novel framework that leverages reinforced in-context learning to design task-specific demonstration Prompts for Wireless Network optimization, relying on the inference capabilities of LLMs without the need for dedicated model training or fine-tuning. The task-specific prompts incorporate natural language descriptions of the task and its formulation, enhancing interpretability and eliminating the need for specialized expertise in network optimization. We further propose a reinforced in-context learning scheme that incorporates a set of advisable examples into task-specific prompts, wherein informative examples capturing historical environment states and decisions are adaptively selected to guide current decision-making. Evaluations on a case study of base station power control show that the proposed ProWin outperforms reinforcement learning (RL)-based methods, highlighting its potential for next-generation wireless network optimization.
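To make the reinforced in-context learning idea concrete, here is a minimal sketch of how such a prompt might be assembled, assuming a simple similarity-plus-reward heuristic for example selection; the Example fields, select_examples, and build_prompt are illustrative names, not the paper's API.

```python
# Illustrative sketch (not the authors' code): assembling a task-specific
# prompt for base-station power control with adaptively selected examples.
from dataclasses import dataclass
import random

@dataclass
class Example:
    state: list[float]   # e.g., channel gains observed in a past round
    action: float        # power level chosen in that round
    reward: float        # system reward obtained

def select_examples(pool: list[Example], state: list[float], k: int = 4) -> list[Example]:
    """Pick the k highest-reward examples whose states are closest to the
    current state -- a simple stand-in for the paper's reinforced selection."""
    def distance(e: Example) -> float:
        return sum((a - b) ** 2 for a, b in zip(e.state, state))
    nearest = sorted(pool, key=distance)[: 2 * k]          # shortlist by similarity
    return sorted(nearest, key=lambda e: -e.reward)[:k]    # keep the best performers

def build_prompt(task: str, state: list[float], examples: list[Example]) -> str:
    lines = [f"Task: {task}", "Past decisions:"]
    for e in examples:
        lines.append(f"  state={e.state} -> power={e.action:.2f} (reward={e.reward:.2f})")
    lines.append(f"Current state: {state}")
    lines.append("Choose a transmit power level and explain briefly.")
    return "\n".join(lines)

pool = [Example([random.random()], round(random.uniform(0.1, 1.0), 2), random.random())
        for _ in range(32)]
print(build_prompt("Minimize power subject to a data-rate constraint", [0.4],
                   select_examples(pool, [0.4])))
```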


Hierarchical Debate-Based Large Language Model (LLM) for Complex Task Planning of 6G Network Management

June 2025

·

8 Reads

6G networks have become increasingly complicated due to novel network architectures and newly emerging signal processing and transmission techniques, placing a significant burden on 6G network management. Large language models (LLMs) have recently been considered a promising technique to equip 6G networks with AI-native intelligence. Unlike most existing studies, which consider only a single LLM, this work proposes a multi-LLM debate-based scheme for 6G network management, in which multiple LLMs collaboratively and sequentially improve an initial solution. Considering the complex nature of the 6G domain, we propose a novel hierarchical debate scheme: the LLMs first debate the sub-task decomposition and then debate each sub-task step by step. This hierarchical approach significantly reduces the overall debate difficulty through sub-task decomposition, aligning well with the complex nature of 6G networks and ensuring the quality of the final solution. In addition, to better evaluate the proposed technique, we define a novel dataset named 6GPlan, including 110 complex 6G network management tasks and 5000 keyword solutions. Finally, experiments show that the proposed hierarchical debate significantly improves performance over baseline techniques, e.g., by more than 30% in coverage rate and global recall rate.
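A minimal sketch of the hierarchical debate scheme as we read it from the abstract (not the authors' implementation). ask() is a stand-in for any chat-completion call; the model names, prompts, and sub-task parsing are illustrative.

```python
# Two-level debate: level 1 debates the decomposition, level 2 debates each sub-task.
def ask(model: str, prompt: str) -> str:
    # Replace with a real LLM API call; here we echo a canned reply so the sketch runs.
    return f"[{model}] draft answer to: {prompt.splitlines()[0]}"

def debate(models: list[str], prompt: str) -> str:
    """Sequential refinement: each LLM critiques and improves the current answer."""
    answer = ask(models[0], prompt)
    for m in models[1:]:
        answer = ask(m, f"{prompt}\nCurrent answer: {answer}\nCritique and improve it.")
    return answer

def hierarchical_debate(models: list[str], task: str) -> list[str]:
    # Level 1: debate the decomposition of the management task into sub-tasks.
    plan = debate(models, f"Decompose this 6G management task into sub-tasks: {task}")
    subtasks = [s.strip() for s in plan.split(";") if s.strip()]  # toy parsing
    # Level 2: debate each sub-task step by step.
    return [debate(models, f"Solve sub-task '{s}' of task '{task}'") for s in subtasks]

print(hierarchical_debate(["llm-a", "llm-b", "llm-c"], "Re-plan the cell sleep schedule"))
```

The point of the two levels is that each individual debate is over a much smaller question than the original task, which is how the decomposition reduces overall debate difficulty.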


Fig. 1. Illustration and comparison of conventional vs. structured entity extraction in the telecom context.
Fig. 3. The overall pipeline of the 6GTech dataset design.
Understanding 6G through Language Models: A Case Study on LLM-aided Structured Entity Extraction in Telecom Domain

May 2025

·

21 Reads

Knowledge understanding is a foundational part of envisioned 6G networks to advance network intelligence and AI-native network architectures. In this paradigm, information extraction plays a pivotal role in transforming fragmented telecom knowledge into well-structured formats, empowering diverse AI models to better understand network terminologies. This work proposes a novel language model-based information extraction technique, aiming to extract structured entities from the telecom context. The proposed telecom structured entity extraction (TeleSEE) technique applies a token-efficient representation method to predict entity types and attribute keys, reducing the number of output tokens and improving prediction accuracy. Meanwhile, TeleSEE employs a hierarchical parallel decoding method, improving the standard encoder-decoder architecture by integrating additional prompting and decoding strategies into entity extraction tasks. In addition, to better evaluate the performance of the proposed technique in the telecom domain, we further design a dataset named 6GTech, including 2,390 sentences and 23,747 words drawn from more than 100 6G-related technical publications. Finally, experiments show that the proposed TeleSEE method achieves higher accuracy than other baseline techniques and processes samples 5 to 9 times faster.
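To see why a compact code-based representation saves output tokens, consider this illustrative round-trip; the schema and codes below are hypothetical stand-ins, not the 6GTech/TeleSEE vocabulary.

```python
# Entity types and attribute keys are emitted as short codes instead of full
# words, so a decoder produces far fewer tokens per extracted entity.
TYPE_CODES = {"technology": "T0", "standard": "T1", "metric": "T2"}       # hypothetical
KEY_CODES = {"name": "k0", "frequency_band": "k1", "latency_target": "k2"}

def encode(entity: dict) -> str:
    """Flatten one structured entity into a compact code string."""
    parts = [TYPE_CODES[entity["type"]]]
    parts += [f"{KEY_CODES[k]}={v}" for k, v in entity["attrs"].items()]
    return "|".join(parts)

def decode(code: str) -> dict:
    inv_t = {v: k for k, v in TYPE_CODES.items()}
    inv_k = {v: k for k, v in KEY_CODES.items()}
    head, *pairs = code.split("|")
    attrs = dict(p.split("=", 1) for p in pairs)
    return {"type": inv_t[head], "attrs": {inv_k[k]: v for k, v in attrs.items()}}

e = {"type": "technology", "attrs": {"name": "RIS", "frequency_band": "sub-6 GHz"}}
assert decode(encode(e)) == e
print(encode(e))  # T0|k0=RIS|k1=sub-6 GHz
```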


Exploring Multimodal Foundation AI and Expert-in-the-Loop for Sustainable Management of Wild Salmon Fisheries in Indigenous Rivers

May 2025

·

6 Reads

Wild salmon are essential to the ecological, economic, and cultural sustainability of the North Pacific Rim. Yet climate variability, habitat loss, and data limitations in remote ecosystems that lack basic infrastructure support pose significant challenges to effective fisheries management. This project explores the integration of multimodal foundation AI and expert-in-the-loop frameworks to enhance wild salmon monitoring and sustainable fisheries management in Indigenous rivers across the Pacific Northwest. By leveraging video- and sonar-based monitoring, we develop AI-powered tools for automated species identification, counting, and length measurement, reducing manual effort, expediting delivery of results, and improving decision-making accuracy. Expert validation and active learning frameworks ensure ecological relevance while reducing annotation burdens. To address unique technical and societal challenges, we bring together a cross-domain, interdisciplinary team of university researchers, fisheries biologists, Indigenous stewardship practitioners, government agencies, and conservation organizations. Through these collaborations, our research fosters ethical AI co-development, open data sharing, and culturally informed fisheries management.
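The expert-in-the-loop pattern described above can be illustrated with a small sketch: low-confidence model outputs are routed to human annotators, whose labels then feed retraining. This is a generic reconstruction of the pattern, not the project's code; the threshold and tuple layout are illustrative.

```python
# Route only low-confidence detections to expert review; auto-accept the rest.
def route_detections(detections, confidence_threshold=0.8):
    """Split model outputs into auto-accepted results and an expert queue."""
    auto, expert_queue = [], []
    for det in detections:  # det: (species, confidence, frame_id)
        (auto if det[1] >= confidence_threshold else expert_queue).append(det)
    return auto, expert_queue

detections = [("sockeye", 0.96, 101), ("coho", 0.55, 102), ("chinook", 0.83, 103)]
auto, queue = route_detections(detections)
print(f"auto-accepted: {auto}")
print(f"sent to experts for validation: {queue}")  # validated labels feed retraining
```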


What Makes Teamwork Work? A Multimodal Case Study on Emotions and Diagnostic Expertise in an Intelligent Tutoring System

May 2025

·

14 Reads

Teamwork is pivotal in medical settings, where professionals with diverse skills and emotional states collaborate to make critical decisions. This case study examines the interplay between emotions and professional skills in group decision-making during collaborative medical diagnosis within an Intelligent Tutoring System (ITS). By comparing verbal and physiological data between high-performing and low-performing teams of medical professionals working on a patient case within the ITS, alongside individuals' retrospective collaboration experiences, we employ multimodal data analysis to identify patterns in team emotional climate and their impact on diagnostic efficiency. Specifically, we investigate how emotion-driven dialogue and professional expertise influence both the information-seeking process and the final diagnostic decisions. Grounded in the socially shared regulation of learning framework and utilizing sentiment analysis, we find that social-motivational interactions are key drivers of a positive team emotional climate. Furthermore, through content analysis of dialogue and physiological signals to pinpoint emotional fluctuations, we identify episodes where knowledge exchange and skill acquisition are most likely to occur. Our findings offer valuable insights into optimizing group collaboration in medical contexts by harmonizing emotional dynamics with adaptive strategies for effective decision-making, ultimately enhancing diagnostic accuracy and teamwork effectiveness.
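As a toy illustration of sentiment scoring over dialogue turns (the study's actual sentiment pipeline and lexicon are not specified in the abstract, so everything below is a stand-in), one can track a session's emotional climate as the mean per-turn sentiment:

```python
# Lexicon-based sentiment over dialogue turns; positive mean = positive climate.
POSITIVE = {"great", "agree", "good", "thanks", "right"}
NEGATIVE = {"wrong", "confused", "disagree", "unsure"}

def turn_sentiment(utterance: str) -> int:
    words = [w.strip(",.!?'") for w in utterance.lower().split()]
    return sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)

def team_climate(dialogue: list[tuple[str, str]]) -> float:
    """Mean per-turn sentiment across a diagnostic session."""
    scores = [turn_sentiment(text) for _, text in dialogue]
    return sum(scores) / len(scores)

session = [("A", "I agree, great catch on the lab values"),
           ("B", "I'm confused about the ECG though"),
           ("A", "Good point, let's recheck it together")]
print(round(team_climate(session), 2))  # ~0.67 for this toy session
```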


Transtreaming: Adaptive Delay-aware Transformer for Real-time Streaming Perception

April 2025

·

1 Read

·

2 Citations

Proceedings of the AAAI Conference on Artificial Intelligence

Real-time object detection is critical to decision-making in many real-world applications, such as collision avoidance and path planning in autonomous driving. This work presents an innovative real-time streaming perception method, Transtreaming, which addresses the challenge of real-time object detection under dynamic computational delays. The core innovation of Transtreaming lies in its adaptive delay-aware transformer, which can concurrently predict multiple future frames and select the output that best matches the real-world present time, compensating for any system-induced computational delays. The proposed model outperforms existing state-of-the-art methods, even in single-frame detection scenarios, by leveraging a transformer-based methodology. It demonstrates robust performance across a range of devices, from the powerful V100 to the modest 2080Ti, achieving the highest level of perceptual accuracy on all platforms. Unlike most state-of-the-art methods, which struggle to complete computation within a single frame on less powerful devices, Transtreaming meets stringent real-time processing requirements on all kinds of devices. The experimental results emphasize the system's adaptability and its potential to significantly improve the safety and reliability of many real-world systems, such as autonomous driving.
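A hedged sketch of our reading of the delay-aware selection idea: the model emits one prediction per future horizon, and the runtime picks the horizon closest to the actual elapsed inference delay. Function names and the horizon grid are illustrative, not the released implementation.

```python
# Predict several future horizons at once, then pick the one matching "now".
import time

def predict_future_frames(frame, horizons_ms=(0, 33, 66, 100)):
    """Stand-in for the transformer head that emits one detection set per
    future horizon; here we just tag the input with each horizon."""
    return {h: f"detections(frame={frame}, +{h}ms)" for h in horizons_ms}

def delay_aware_output(frame, t_capture: float):
    preds = predict_future_frames(frame)                  # computation takes time...
    elapsed_ms = (time.monotonic() - t_capture) * 1000
    best = min(preds, key=lambda h: abs(h - elapsed_ms))  # horizon closest to real present
    return preds[best]

t0 = time.monotonic()
time.sleep(0.05)  # simulate ~50 ms of inference delay
print(delay_aware_output("frame_0", t0))  # selects the +33 ms or +66 ms head
```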


Fig. 1. Change in the subsidy factor K_i as a function of total computing power A_i, with parameters λ = 0.8 and k = 2, when the completed difficulty D = 10.
Opportunity-Cost-Driven Reward Mechanisms for Crowd-Sourced Computing Platforms

April 2025

·

14 Reads

This paper introduces a game-theoretic model tailored for reward distribution on crowd-sourced computing platforms. It explores a repeated game framework where miners, as computation providers, decide their computation power contribution in each round, guided by the platform's designed reward distribution mechanism. The reward for each miner in every round is based on the platform's randomized task payments and the miners' computation transcripts. Specifically, it defines Opportunity-Cost-Driven Incentive Compatibility (OCD-IC) and Dynamic OCD-IC (DOCD-IC) for scenarios where strategic miners might allocate some computation power to more profitable activities, such as Bitcoin mining. The platform must also achieve Budget Balance (BB), aiming for a non-negative total income over the long term. This paper demonstrates that traditional Pay-Per-Share (PPS) reward schemes require assumptions about task demand and miners' opportunity costs to ensure OCD-IC and BB, yet they fail to satisfy DOCD-IC. The paper then introduces Pay-Per-Share with Subsidy (PPSS), a new reward mechanism that allows the platform to provide subsidies to miners, thus eliminating the need for assumptions on opportunity cost to achieve OCD-IC, DOCD-IC, and long-term BB.
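To illustrate the flavor of PPSS (the exact subsidy schedule K_i(A_i) from Fig. 1 is not reproduced in this listing, so the decay law below is a placeholder), a per-round payout might combine a per-share payment with a subsidy that depends on a miner's computing power:

```python
# Illustrative PPSS-style payout: share payment plus a power-dependent subsidy.
import math

LAMBDA, K_PARAM, D = 0.8, 2, 10  # parameters as labelled in Fig. 1

def subsidy_factor(total_power: float) -> float:
    """Hypothetical decaying subsidy: largest for small contributors whose
    power could otherwise go to more profitable activities (e.g., Bitcoin
    mining), vanishing as contribution grows."""
    return K_PARAM * math.exp(-LAMBDA * total_power / D)

def round_payout(share_payment: float, power: float) -> float:
    return share_payment * (1 + subsidy_factor(power))

for a in (1, 5, 10, 20):
    print(f"A_i={a:>2}  payout per unit share: {round_payout(1.0, a):.3f}")
```

The design intent this mirrors is that the subsidy, not any assumption about opportunity costs, is what keeps contributing to the platform incentive-compatible round after round.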


What, How, Where, and How Well? A Survey on Test-Time Scaling in Large Language Models

March 2025

·

8 Reads

As enthusiasm for scaling computation (data and parameters) in the pretraining era has gradually diminished, test-time scaling (TTS), also referred to as "test-time computing," has emerged as a prominent research focus. Recent studies demonstrate that TTS can further elicit the problem-solving capabilities of large language models (LLMs), enabling significant breakthroughs not only in specialized reasoning tasks, such as mathematics and coding, but also in general tasks like open-ended Q&A. However, despite the explosion of recent efforts in this area, there remains an urgent need for a comprehensive survey offering a systematic understanding. To fill this gap, we propose a unified, multidimensional framework structured along four core dimensions of TTS research: what to scale, how to scale, where to scale, and how well to scale. Building upon this taxonomy, we conduct an extensive review of methods, application scenarios, and assessment aspects, and present an organized decomposition that highlights the unique functional roles of individual techniques within the broader TTS landscape. From this analysis, we distill the major developmental trajectories of TTS to date and offer hands-on guidelines for practical deployment. Furthermore, we identify several open challenges and offer insights into promising future directions, including further scaling, clarifying the functional essence of techniques, generalizing to more tasks, and more attributions.
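As one concrete instance of the "how to scale" dimension, best-of-N sampling spends extra test-time compute by drawing several candidates and keeping the one a verifier scores highest. The sketch below is generic; sample() and score() are placeholders for a real LLM and reward model.

```python
# Best-of-N sampling: more test-time compute (larger n) buys a better answer.
import random

def sample(prompt: str) -> str:
    return f"candidate answer #{random.randint(0, 9999)} to: {prompt}"

def score(prompt: str, answer: str) -> float:
    return random.random()  # stand-in for a learned reward model / verifier

def best_of_n(prompt: str, n: int = 8) -> str:
    candidates = [sample(prompt) for _ in range(n)]
    return max(candidates, key=lambda a: score(prompt, a))

print(best_of_n("Prove that the sum of two even integers is even."))
```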


Enhancing Large Language Models (LLMs) for Telecommunications using Knowledge Graphs and Retrieval-Augmented Generation

March 2025

·

6 Reads

Large language models (LLMs) have made significant progress in general-purpose natural language processing tasks. However, LLMs still face challenges when applied to domain-specific areas like telecommunications, which demand specialized expertise and adaptability to evolving standards. This paper presents a novel framework that combines knowledge graph (KG) and retrieval-augmented generation (RAG) techniques to enhance LLM performance in the telecom domain. The framework leverages a KG to capture structured, domain-specific information about network protocols, standards, and other telecom-related entities, comprehensively representing their relationships. By integrating the KG with RAG, LLMs can dynamically access and utilize the most relevant and up-to-date knowledge during response generation. This hybrid approach bridges the gap between structured knowledge representation and the generative capabilities of LLMs, significantly enhancing accuracy, adaptability, and domain-specific comprehension. Our results demonstrate the effectiveness of the KG-RAG framework in addressing complex technical queries with precision: the proposed KG-RAG model attained an accuracy of 88% on question-answering tasks over a frequently used telecom-specific dataset, compared to 82% for the RAG-only and 48% for the LLM-only approaches.
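A minimal sketch of the KG + RAG pattern, assuming toy triples and naive word-overlap retrieval (the paper's actual retriever, KG schema, and prompt template are not given here):

```python
# Fetch entity triples from a small knowledge graph, then prepend them to the
# user's question as grounded context for the LLM.
KG = [  # (subject, relation, object) triples -- toy telecom facts
    ("5G NR", "defined_by", "3GPP Release 15"),
    ("5G NR", "uses_waveform", "CP-OFDM"),
    ("URLLC", "targets_latency", "1 ms"),
]

def retrieve_triples(question: str, k: int = 3):
    """Naive retrieval: rank triples by word overlap with the question."""
    q_words = set(question.lower().split())
    def overlap(t):
        return len(q_words & set(" ".join(t).lower().split()))
    return sorted(KG, key=overlap, reverse=True)[:k]

def build_kg_rag_prompt(question: str) -> str:
    facts = "\n".join(f"- {s} {r.replace('_', ' ')} {o}"
                      for s, r, o in retrieve_triples(question))
    return f"Known facts:\n{facts}\n\nQuestion: {question}\nAnswer using the facts above."

print(build_kg_rag_prompt("Which waveform does 5G NR use?"))
```

In a real system the overlap heuristic would be replaced by embedding-based retrieval over the graph, but the prompt-assembly step looks essentially like this.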


TransDiffSBDD: Causality-Aware Multi-Modal Structure-Based Drug Design

March 2025

·

12 Reads

Structure-based drug design (SBDD) is a critical task in drug discovery, requiring the generation of molecular information across two distinct modalities: discrete molecular graphs and continuous 3D coordinates. However, existing SBDD methods often overlook two key challenges: (1) the multi-modal nature of this task and (2) the causal relationship between these modalities, limiting their plausibility and performance. To address both challenges, we propose TransDiffSBDD, an integrated framework combining autoregressive transformers and diffusion models for SBDD. Specifically, the autoregressive transformer models discrete molecular information, while the diffusion model samples continuous distributions, effectively resolving the first challenge. To address the second challenge, we design a hybrid-modal sequence for protein-ligand complexes that explicitly respects the causality between modalities. Experiments on the CrossDocked2020 benchmark demonstrate that TransDiffSBDD outperforms existing baselines.
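Schematically, the causality the paper enforces can be sketched as "discrete first, continuous second": an autoregressive stage emits atom tokens, and a diffusion stage samples coordinates conditioned on them. The stand-ins below mirror only that ordering, not the actual model.

```python
# Hybrid-modal generation order: discrete graph tokens, then continuous 3D coordinates.
import random

def autoregressive_atoms(n_atoms: int) -> list[str]:
    """Stand-in for the transformer: emit a discrete atom-type sequence."""
    vocab = ["C", "N", "O", "S"]
    return [random.choice(vocab) for _ in range(n_atoms)]

def diffusion_coordinates(atoms: list[str]) -> list[tuple[float, float, float]]:
    """Stand-in for the diffusion model: sample coordinates given the atoms."""
    return [tuple(random.gauss(0.0, 1.0) for _ in range(3)) for _ in atoms]

atoms = autoregressive_atoms(5)        # modality 1: discrete molecular graph
coords = diffusion_coordinates(atoms)  # modality 2: continuous 3D, caused by modality 1
for a, (x, y, z) in zip(atoms, coords):
    print(f"{a}: ({x:+.2f}, {y:+.2f}, {z:+.2f})")
```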


Citations (47)


... Hardware-software optimization fragmentation: 40% reduction in system efficiency (L. Leonardi et al. [20]); real-time processing barriers: >200 ms processing delay (X. Zhang et al. [21]); limited cross-platform validation: <60% reproducibility across systems (T. Glatard et al. [22]). Performance limitations: signal degradation below 10^-15 S/m: SNR reduction >50% (A. O'Brien et al. [23]); computational overhead in multifrequency processing: processing time >500 ms (T. Qin et al. [24]); depth penetration constraints: limited to <10 cm depth (J. E. Simms et al. [25]). Methodological ... The focus on materials with conductivities ranging from 10^-18 to 10^-12 S/m is strategically chosen based on several critical factors. First, this range encompasses key materials in emerging biomedical and industrial applications, including polymer-based medical devices (10^-16 to 10^-14 S/m), advanced composite materials (10^-15 to 10^-13 S/m), and novel semiconductor compounds (10^-14 to 10^-12 S/m). ...

Reference:

A Novel Hybrid Algorithm for Enhanced Low-Conductivity Material Imaging in Magnetic Induction Tomography
Transtreaming: Adaptive Delay-aware Transformer for Real-time Streaming Perception
  • Citing Article
  • April 2025

Proceedings of the AAAI Conference on Artificial Intelligence

... Although identical base prompts were used for all models in our study to maintain consistency and minimize variability due to prompt design, these prompts were intentionally kept basic. It is well-established in the literature that the quality of LLM outputs depends heavily on the quality and specificity of the prompts given [24][25][26]. More complex or detailed prompts could potentially elicit more accurate or nuanced responses from the models [27]. ...

Large Language Models for Wireless Networks: An Overview from the Prompt Engineering Perspective

IEEE Wireless Communications

... Effectively capturing feature interactions is one of the key factors in improving the performance of CTR prediction models [6,16,66,76]. Early CTR models are typically limited to capturing low-order feature interactions with bounded degrees [45,53,59,70] or rely on hand-crafted feature combinations based on expert knowledge [6,55]. ...

Fusion Matters: Learning Fusion in Deep Click-through Rate Prediction Models
  • Citing Conference Paper
  • March 2025

... Alongside the prompt engineering experiments described above, we also evaluated a well-known language model technique called retrieval-augmented generation (RAG) (Lewis et al., 2021; Wu et al., 2024), which enhances language model outputs by supplying domain-specific information alongside the input prompt. A basic implementation of RAG involves creating a system where a user poses a question and the system retrieves relevant excerpts from a content database to provide as additional context to the language model for response generation (Figure 2D). ...

Retrieval-Augmented Generation for Natural Language Processing: A Survey
  • Citing Preprint
  • January 2025

... For practical applications, vertical structures of physical quantities are important, requiring SR of three-dimensional data. Since neural operators can be applied to three-dimensional data (e.g., Qin et al., 2025), three-dimensional zero-shot SR is ... [figure caption:] Transformer-based neural operator (TNO). "Linear" represents a linear transformation, "SiLU" represents the Sigmoid Linear Unit, "Norm" represents instance normalization (Ulyanov et al., 2016), "MM" represents matrix multiplication, and "ReLU" represents the Rectified Linear Unit. ...

Modeling multivariable high-resolution 3D urban microclimate using localized Fourier neural operator
  • Citing Article
  • February 2025

Building and Environment

... Qu et al. [164] examine limitations of on-device LLMs due to edge devices' constrained capacity, advocating mobile edge intelligence to reduce latency and privacy issues. Chen et al. [165] explore LLM integration into edge intelligence, focusing on adaptive applications and throughput challenges for small models on edge devices. Lee et al. [166] discuss adapting Vision Transformers for mobile and edge devices, emphasizing computational efficiency and implementation challenges on resource-limited devices. ...

Towards Edge General Intelligence via Large Language Models: Opportunities and Challenges
  • Citing Article
  • January 2025

IEEE Network

... Alsadat et al. [10] applied LLMs to enhance multi-agent coordination in stochastic games. Yan et al. [11] designed an LLM-RL framework for vehicle routing and communication and showed that it effectively improves the satisfaction rate of time-window constraints. However, existing LLM-enhanced RL approaches often fail to take full advantage of chain-of-thought (CoT) reasoning or to avoid LLM hallucinations. ...

Hybrid LLM-DDQN Based Joint Optimization of V2I Communication and Autonomous Driving

IEEE Wireless Communications Letters

... Capturing and integrating these correlations is crucial for building accurate prediction models. Although numerous deep learning-based traffic prediction models have been developed, most employ a decoupled architecture in which temporal and spatial correlations are extracted separately by independent modules [12,13,14], or use joint modules to synchronously extract temporal and spatial correlations [15,16,17], but fail to consider spatial-temporal correlations. Moreover, models that do consider joint spatial-temporal correlations (incorporating temporal, spatial, and spatial-temporal correlations) [18,19,20,21] often face bottlenecks in accuracy and computational efficiency (refer to the evaluation metrics and training or inference times provided in the original papers), failing to demonstrate the superiority of the joint spatial-temporal architecture. ...

T-Graphormer: Using Transformers for Spatiotemporal Forecasting

... DSCMM [19] and OMM-OBDSC [4] used front-view images for driving scenario recognition, obtaining road class probabilities. In [20] and [21], an Elevation-Aware Unit is proposed that utilizes front-view images and IMU data to acquire elevation information for diverse urban roads. While these methods can mitigate errors between elevated roads and the ordinary urban roads underneath them, they are less effective for road splits. ...

Elevation-Aware Map Matching Model Leveraging Transfer Learning in Sparse Data Conditions
  • Citing Article
  • January 2025

IEEE Transactions on Intelligent Transportation Systems

... Agentic AI [1], often powered by large language models (LLMs), is gaining significant traction in both industry and academia due to its wide range of applications that enable autonomous decision-making, streamline complex workflows, and assist with tasks such as reasoning, planning, and real-time adaptation on behalf of users or organizations across various domains, including customer service, cybersecurity, healthcare, and business operations. Thanks to recent advancements in wireless communication and cloud computing [2], along with inspiration from [3], these agents can be instantiated as cloud-based services [4], embedded service applications, or distributed systems that operate across digital and physical environments [5]. They process language-based input and generate human-readable output, facilitating easier access to AI-based techniques. ...

Generative AI as a Service in 6G Edge-Cloud: Generation Task Offloading by In-Context Learning
  • Citing Article
  • January 2024

IEEE Wireless Communications Letters