
Guihai Chen - Shanghai Jiao Tong University
About
725 Publications
57,215 Reads
14,639 Citations
Publications (725)
As cloud computing continues to evolve, the adoption of multi-NUMA (Non-Uniform Memory Access) architecture by cloud service providers has introduced new challenges in virtual machine (VM) scheduling. To address these challenges and more accurately reflect the complexities faced by modern cloud environments, we introduce the Dynamic VM Allocation p...
The conventional cloud-based large model learning framework is increasingly constrained by latency, cost, personalization, and privacy concerns. In this survey, we explore an emerging paradigm: collaborative learning between on-device small model and cloud-based large model, which promises low-latency, cost-efficient, and personalized intelligent s...
The emergence of long-context text applications utilizing large language models (LLMs) has presented significant scalability challenges, particularly in memory footprint. The linear growth of the Key-Value (KV) cache, which stores attention keys and values to reduce redundant computations, can significantly increase memory usage and may prevent mod...
The proliferation of GPS-enabled devices has led to the accumulation of a substantial corpus of historical trajectory data. By leveraging these data for training machine learning models, researchers have devised novel data-driven methodologies that address the personalized route recommendation (PRR) problem. In contrast to conventional algorithms su...
Learned indexes, which model key-value data structures by machine learning models, have been extensively studied. However, the fastest immutable learned indexes (e.g., RMI) do not provide the same tight lookup bounds as classical indexes such as B-trees. There are learned indexes that provide tight bounds (e.g., PGM) but those fall short in query p...
The context caching technique is employed to accelerate the Multimodal Large Language Model (MLLM) inference by prevailing serving platforms currently. However, this approach merely reuses the Key-Value (KV) cache of the initial sequence of prompt, resulting in full KV cache recomputation even if the prefix differs slightly. This becomes particular...
In many practical natural language applications, user data are highly sensitive, requiring anonymous uploads of text data from mobile devices to the cloud without user identifiers. However, the absence of user identifiers restricts the ability of cloud-based language models to provide personalized services, which are essential for catering to diver...
Congestion control protocols are crucial for optimizing the performance of datacenter network applications. Although reactive congestion control (RCC) protocols are commonly used in commercial datacenters, researchers have been exploring token-based proactive congestion control (TCC) protocols to further enhance network performance. Despite the dev...
The map matching of cellular data reconstructs real trajectories of users by exploiting the sequential connections between mobile devices and cell towers. The difficulty in obtaining paired cellular-GPS data and the cellular variation compromise the accuracy and reliability of existing map matching approaches. In this paper, we propose a novel uns...
For Unmanned Aerial Vehicles (UAVs) monitoring tasks, capturing high quality images of target objects is important for subsequent recognition. Concerning the problem, many prior works study placement/trajectory planning for UAVs to maximize the quality of captured images. However, all of them overlook a fact that UAV monitoring may cause a huge ris...
Federated learning (FL) can be implemented in large-scale wireless networks in a hierarchical way, introducing edge servers as relays between the cloud server and devices. These devices are dispersed within multiple clusters coordinated by edges. However, the devices are typically mobile users with unpredictable trajectories, and the impact of thei...
While the flexibility of programmable switches brings opportunities, it also introduces security risks. Hence, it is vital to conduct effective troubleshooting in the programmable switch to mitigate frequent network failures. However, troubleshooting programmable switch failures is challenging due to their enhanced flexibility and functionality com...
As one of the pillars in cluster computing frameworks, coflow scheduling algorithms can effectively shorten the network transmission time of cluster computing jobs, thus reducing the job completion times and improving the execution performance. However, most of existing coflow scheduling algorithms failed to consider the influences of concurrent fl...
3D Gaussian splatting (3DGS) suggests the use of explicit point-based 3D representations for high-fidelity novel view synthesis, with training and rendering speeds that are better than prior neural radiance fields. However, 3DGS relies heavily on synthetic point clouds generated by structure from motion (SfM) or multi view stereo (MVS) techniques,...
With the popularization of smartphones, mobile crowdsourcing has emerged and gained growing attention in recent years. Mobile users are now able to conduct complex tasks by communicating with each other. In this paper, we study the task scheduling problem in the mobile crowdsourcing systems based on the spontaneously formed mobile social...
Traditional fine-grained classification focuses on visible light domains, such as animals and cars. However, these methods often perform poorly when applied to radar images and images of satellites because of challenges such as distinguishing between noise and objects and the significant scale differences among object components. To address these u...
With the popularity of mobile devices, spatial crowdsourcing has attracted widespread attention, which collects spatial tasks with location constraints and assigns them to workers who can travel to certain locations to participate in and obtain profits. One of the core issues is task assignment, in which tasks should be assigned to proper workers t...
Context: The rise of artificial intelligence of things (AIoT) has enabled smart cities and industries, and UAV‐assisted edge computing networks are an important technology to support the above scenarios. UAV‐assisted refers to leveraging UAVs as a dynamic, flexible infrastructure to assist edge network data processing and communication tasks. Multip...
Existing work on large language model (LLM) personalization assigned different responding roles to LLM, but overlooked the diversity of questioners. In this work, we propose a new form of questioner-aware LLM personalization, generating different responses even for the same query from different questioners. We design a dual-tower model architecture...
The emergence of long-context text applications utilizing large language models (LLMs) has presented significant scalability challenges, particularly in memory footprint. The linear growth of the Key-Value (KV) cache responsible for storing attention keys and values to minimize redundant computations can lead to substantial increases in memory cons...
RDMA over Converged Ethernet (RoCEv2) has been widely deployed to data centers (DCs) for its better compatibility with Ethernet/IP than Infiniband (IB). As cross-DC applications emerge, they also demand high throughput, low latency, and lossless network for cross-DC data transmission. However, RoCEv2’s underlying lossless mechanism Priority-based F...
Nowadays, there exists a lot of cross-region data transmission demand on the cloud. It is promising to use serverless computing for data compressing to save the total data size. However, it is challenging to estimate the data transmission time and monetary cost with serverless compression. In addition, minimizing the data transmission cost is non-t...
This paper proposes a novel contrastive cross-modal knowledge transfer framework, SemiCMT, for multi-modal IoT sensing applications. It effectively transfers the feature extraction capability (also called knowledge) learned from a source modality (e.g., acoustic signals) with abundant unlabeled training data, to a target modality (e.g., seismic sig...
At ByteDance, where we execute over a million Spark jobs and handle 500PB of shuffled data daily, ensuring resource efficiency is paramount for cost savings. However, achieving optimization of resource efficiency in large-scale production environments poses significant challenges. Drawing from our practical experiences, we have identified three key...
With the improvement of edge-based autonomous systems such as mobile Industrial IoT (IIoT) networks, edge devices can capture and upload videos with increasing bitrates. Massive edge-computing end nodes are eager for adequate multimedia data to satisfy the requirements of real-time video services. However, existing encoding standards for video servi...
Large Multimodal Models (LMMs) have shown significant progress in various complex vision tasks with the solid linguistic and reasoning capacity inherited from large language models (LLMs). Low-rank adaptation (LoRA) offers a promising method to integrate external knowledge into LMMs, compensating for their limitations on domain-specific tasks. Howe...
In modern mobile applications, users frequently encounter various new contexts, necessitating on-device continual learning (CL) to ensure consistent model performance. While existing research predominantly focused on developing lightweight CL frameworks, we identify that data scarcity is a critical bottleneck for on-device CL. In this work, we expl...
Ensuring verifiable and interpretable safety of deep reinforcement learning (DRL) is crucial for its deployment in real-world applications. Existing approaches like verification-in-the-loop training, however, face challenges such as difficulty in deployment, inefficient training, lack of interpretability, and suboptimal performance in property sati...
Truck-drone systems, wherein trucks carrying drones drive to pre-planned positions and then free drones equipped with cameras to monitor a known number of objects with reported positions, have been used for various scenarios. An object's quality of monitoring (QoM) by a camera is defined as a function of camera focal length and monitoring distance....
The approximate membership query (AMQ) data structure is a kind of space-efficient probabilistic data structure. It can approximately indicate whether an element exists in a set. The AMQ data structure has been widely used in network measurements, network security, network caching, etc. Resizing is an extensively utilized operation of the AMQ da...
Consistent routing updates through Software-Defined Networking (SDN) can be difficult due to the asynchronous and distributed nature of the data plane. Recent studies have achieved consistent unicast routing updates. However, achieving consistent updates with drop-freeness and duplicate-freeness remains a challenge for multicast with fewer known re...
Learning to rank (LTR) is widely employed in web searches to prioritize pertinent webpages from retrieved content based on input queries. However, traditional LTR models encounter two principal obstacles that lead to suboptimal performance: (1) the lack of well-annotated query-webpage pairs with ranking scores covering a diverse range of search que...
Patients with Parkinson's disease (PD) often show gait impairments including shuffling gait, festination, and lack of arm and leg coordination. Quantitative gait analysis can provide valuable insights for PD diagnosis and monitoring. Prior work has utilized 3D motion capture, foot pressure sensors, IMUs, etc. to assess the severity of gait impairme...
The surging demand for cloud computing resources, driven by the rapid growth of sophisticated large-scale models and data centers, underscores the critical importance of efficient and adaptive resource allocation. As major tech enterprises deploy massive infrastructures with thousands of GPUs, existing cloud platforms still struggle with low resour...
Index structures are powerful tools for improving query performance and reducing disk access in database systems. Multi-dimensional indexes, in particular, are used to filter records effectively based on multiple attributes. Classical multi-dimensional index structures, such as KD-Tree, Quadtree, and R-Tree, have been widely used in modern database...
In-air gesture control extends a touch screen and enables contactless interaction, thus has become a popular research direction in the past few years. Prior work has implemented this functionality based on cameras, acoustic signals, and Wi-Fi via existing hardware on commercial devices. However, these methods have low user acceptance. Solutions bas...
Machine learning (ML) models have been deployed in mobile networks to deal with massive data from different layers to enable automated network management. To overcome high communication cost and severe privacy concerns of centralized ML, federated learning (FL) has been proposed to achieve distributed ML among numerous networked devices. While the...
Finding special items in data streams, like heavy hitters, top-$k$ items, and persistent items, has always been a hot topic in the field of network measurement. While data streams nowadays are usually high-dimensional, most prior works optimize data structures to accurately find special items according to a certain primary dimension and yield lit...
Multipath TCP (MPTCP) is a burgeoning transport protocol which enables the server to transmit the traffic across multiple network interfaces in parallel. Classic MPTCPs have good friendliness and practicality such as relatively low overhead, but are hard to achieve consistent high-throughput and adaptability, especially for the ability to flexibly...
Cloud services have shifted from monolithic designs to microservices running on cloud-native infrastructure with monitoring systems to ensure service level agreements (SLAs). However, traditional monitoring systems no longer meet the demands of cloud-native monitoring. In Alibaba’s “double eleven” shopping festival, it is observed that the monitor...
While Learning to Rank (LTR) is widely employed in web searches to prioritize pertinent webpages from the retrieved contents based on input queries, traditional LTR models stumble over two principal stumbling blocks leading to subpar performance: 1) the lack of well-annotated query-webpage pairs with ranking scores to cover search queries of variou...
Airlines commonly need to take into consideration maximizing their profit while designing the hub-and-spoke network to obtain more market share and promote healthy development of aviation industry. Hence, in this article, we study the problem of multiple-allocation HUb and spoke network design for ROuting flight flows to maximize airline profit uti...
Low-bit quantization has become widespread for compressing image super-resolution (SR) models for edge deployment, which allows advanced SR models to enjoy compact low-bit parameters and efficient integer/bitwise constructions for storage compression and inference acceleration, respectively. However, it is notorious that low-bit quantization degrad...
Gases in the environment can significantly affect our health and safety. As mobile devices gain popularity, we explore a human-centered gas detection system that can be integrated into commercial mobile devices to realize ubiquitous gas detection. However, existing gas sensors either have too long response delays or are too cumbersome....
Travel route recommendation is an important part of electronic tour guides and map applications. It aims to recommend a sequence of points of interest (POIs) to users based on their interests. The variety of users' historical records and their requirements makes the problem challenging and most existing works fail to satisfy these two aspects at th...
Spatial crowdsourcing is an increasingly popular category in the era of mobile Internet and sharing economy, where tasks have spatio-temporal constraints and must be completed at specific locations. In this paper, we focus on the Multi-Objective Spatio-Temporal task assignment (MOST) problem considering the worker heterogeneity in...
Flow monitoring is widely applied in software-defined networks (SDNs) for monitoring network performance. Especially, detecting heavy hitters can prevent the Distributed Denial of Service (DDoS) attack. However, many existing approaches fall into one of two undesirable extremes: (i) inefficient collection where only accuracy is concerned in the met...
While learning to rank (LTR) is widely employed in web searches to prioritize pertinent webpages from the retrieved contents based on input queries, traditional LTR models stumble over two principal stumbling blocks leading to subpar performance: (1) the lack of well-annotated query-webpage pairs with ranking scores to cover search queries of vario...
Wireless Charger Network (WCN) emerges as a promising networking paradigm, employing wireless chargers with Wireless Power Transfer (WPT) technology to provide long-term and sustainable energy supply for future networks. Although extensive research has been conducted in this area over the last decade, there is currently no comprehensive survey to c...
Estimating per-flow cardinality from high-speed data streams has many applications such as anomaly detection and resource allocation. Yet although approximation algorithms exist for tracking single-flow cardinality, there remain algorithmic challenges for monitoring multi-flows especially under unbalanced cardinality distribution: existing meth...
With the increase of diversity in application preferences and networks, existing congestion control algorithms (CCAs) do not accommodate this complicated reality. Previous classic CCAs are designed for a specific domain with fixed rules, failing to adapt to such diversities. Recently surged learning-based CCAs have great potential in adaptability a...
In many real-life applications, wireless chargers are deployed outdoors, in public areas, or even in unattended environments such as hotels, restaurants, and retail stores. They are exposed to various risks and malicious attacks that may break them down and further incur significant cost (e.g., battery replacement and maintenance) or performance degradation...
Carpooling route planning becomes an important problem with the growth of low-carbon traffic systems. When each passenger has several potential locations to get on and off the car, the problem will be more challenging. In the paper, we discussed a simplified carpooling route planning problem, namely the Shortest Path Tour Problem (SPTP), whose aim...
Motivated by carpooling services, we investigate a new and more challenging scenario for carpooling and model it as the Multi-candidate Carpooling Routing Problem (MCRP). The MCRP can be regarded as a new variant of TSP called Generalized Precedence-Constraint Asymmetric Subset Traveling Salesman Path Problem (GPAS-TSPP) and we construct complex...
Approximate query processing (AQP) is one of the key techniques to cope with big data querying problem on account that it obtains approximate answers efficiently. To address non-trivial sample selection and heavy sampling cost issues in AQP, we propose ShadowAQP, an efficient and accurate approach based on attribute-oriented sample size allocation...
A major challenge confronting today’s distributed metadata management schemes is how to meet the dynamic requirements of various applications through effectively mapping and migrating metadata nodes to different metadata servers (MDS’s). Most of the existing works dynamically reallocate nodes to different servers adopting history-based coarse-grain...
While learning to rank (LTR) has been widely used in web search to prioritize most relevant webpages among the retrieved contents subject to the input queries, the traditional LTR models fail to deliver decent performance due to two main reasons: 1) the lack of well-annotated query-webpage pairs with ranking scores to cover search queries of vari...