
Shanliang Yao
University of Liverpool | UoL · Faculty of Science and Engineering
Doctor of Philosophy
Working on multi-sensor fusion
About
26 Publications
5,536 Reads
154 Citations
Publications (26)
The 3rd Workshop on Maritime Computer Vision (MaCVi) 2025 addresses maritime computer vision for Unmanned Surface Vehicles (USVs) and underwater settings. This report offers a comprehensive overview of the findings from the challenges. We provide both statistical and qualitative analyses, evaluating trends from over 700 submissions. All datasets, evaluation...
Waterway perception is critical for the special operations and autonomous navigation of Unmanned Surface Vessels (USVs), but current perception schemes are sensor-based, neglecting the interaction between humans and USVs for embodied perception in various operations. Therefore, inspired by visual grounding, we present WaterVG, the inaugural visual...
While Gaussian Splatting (GS) demonstrates efficient, high-quality scene rendering and small-area surface extraction, it falls short in handling large-scale aerial image surface extraction tasks. To overcome this, we present ULSR-GS, a framework dedicated to high-fidelity surface extraction in ultra-large-scale scenes, addressing the lim...
Recently, visual grounding and multi-sensor settings have been incorporated into perception systems for terrestrial autonomous driving and Unmanned Surface Vehicles (USVs), yet the high complexity of modern learning-based visual grounding models using multiple sensors prevents such models from being deployed on USVs in real life. To this end, we d...
In the rapidly evolving field of 3D reconstruction, 3D Gaussian Splatting (3DGS) and 2D Gaussian Splatting (2DGS) represent significant advancements. Although 2DGS compresses 3D Gaussian primitives into 2D Gaussian surfels to effectively enhance mesh extraction quality, this compression can potentially lead to a decrease in rendering quality. Addit...
Autonomous driving on water surfaces plays an essential role in executing hazardous and time-consuming missions, such as maritime surveillance, survivor rescue, environmental monitoring, hydrography mapping and waste cleaning. This work presents WaterScenes, the first multi-task 4D radar-camera fusion dataset for autonomous driving on water surface...
The perception of waterways based on human intent is significant for autonomous navigation and operations of Unmanned Surface Vehicles (USVs) in water environments. Inspired by visual grounding, we introduce WaterVG, the first visual grounding dataset designed for USV-based waterway perception based on human prompts. WaterVG encompasses prompts des...
Water resources have spurred the need for advanced maritime technologies, particularly for effective exploration and harnessing. Traditional maritime resource investigation methodologies are hampered by challenges such as elevated costs and risks. Unmanned Surface Vehicles (USVs) have emerged as a transformative solution, offering applications prev...
Multi-task panoptic perception leveraging multi-sensor fusion is crucial for comprehensively understanding waterway environments, which enhances the robust monitoring and autonomous navigation of Unmanned Surface Vehicles. However, the fragmented design inherent in multimodal and multi-task neural networks inevitably leads to decreased inference sp...
With the development of Unmanned Surface Vehicles (USVs), the perception of inland waterways has become significant to autonomous navigation. RGB cameras can capture images with rich semantic features, but they fail in adverse weather and at night. As a perception sensor that has emerged in recent years, 4D millimeter-wave radar (4D...
Vehicle detection based on deep learning has developed rapidly and largely settled into an established pattern. Almost all work in vehicle detection concentrates on single-label object detection. However, in the real world, a vehicle has multiple attributes from the perspective of a human being. When we observe a car, we tend to perceive its type,...
The Digital Elevation Model (DEM) super-resolution approach aims to improve the spatial resolution or detail of an existing DEM by applying techniques such as machine learning or spatial interpolation. Convolutional Neural Networks and Generative Adversarial Networks have exhibited remarkable capabilities in generating high-resolution DEMs from cor...
Panoptic perception is essential to unmanned surface vehicles (USVs) for autonomous navigation. The current panoptic perception scheme is mainly based on vision only, that is, object detection and semantic segmentation are performed simultaneously based on camera sensors. Nevertheless, the fusion of camera and radar sensors is regarded as a promisi...
Natural language (NL) based vehicle retrieval is a task aiming to retrieve the vehicle that is most consistent with a given NL query from among all candidate vehicles. Because NL queries can be easily obtained, such a task holds promise for building an interactive intelligent traffic system (ITS). Current solutions mainly focus on extracting...
Current perception models for different tasks usually exist in modular forms on Unmanned Surface Vehicles (USVs). Running in parallel on edge devices, they infer extremely slowly, causing asynchrony between perception results and USV position and leading to erroneous autonomous navigation decisions. Compared with Unmanned Ground Vehicles (UGVs), the ro...
Autonomous driving on water surfaces plays an essential role in executing hazardous and time-consuming missions, such as maritime surveillance, survivor rescue, environmental monitoring, hydrography mapping and waste cleaning. This work presents WaterScenes, the first multi-task 4D radar-camera fusion dataset for autonomous driving on water surfac...
Building a high-precision bathymetry digital elevation model is essential for navigation planning, marine and lake resource planning, port construction, and underwater archaeological projects. However, existing bathymetry methods have yet to be effectively and comparatively analyzed. This paper comprehensively reviews state-of-the-art bathymetry me...
Natural language (NL) based vehicle retrieval is a task aiming to retrieve the vehicle that is most consistent with a given NL query from among all candidate vehicles. Because NL queries can be easily obtained, such a task holds promise for building an interactive intelligent traffic system (ITS). Current solutions mainly focus on extracting...
Driven by deep learning techniques, perception technology in autonomous driving has developed rapidly in recent years. To achieve accurate and robust perception capabilities, autonomous vehicles are often equipped with multiple sensors, making sensor fusion a crucial part of the perception system. Among these fused sensors, radars and cameras enabl...
Driven by deep learning techniques, perception technology in autonomous driving has developed rapidly in recent years, enabling vehicles to accurately detect and interpret the surrounding environment for safe and efficient navigation. To achieve accurate and robust perception capabilities, autonomous vehicles are often equipped with multiple sensors, m...
Most high-performance semantic segmentation networks are based on complicated deep convolutional neural networks, leading to severe latency in real-time detection. However, the state-of-the-art semantic segmentation networks with low complexity are still far from detecting objects accurately. In this paper, we propose a real-time semantic segmentat...
Deep learning has been widely used to classify and detect FMCW radar targets such as traffic vehicles. However, recognition of traffic accidents based on millimeter-wave radar is much rarer and more complicated. Besides, constructing complex target datasets such as traffic accidents is expensive and laborious. Therefore, this paper proposes a...
CNNs have achieved remarkable image classification and object detection results over the past few years. Due to the locality of the convolution operation, although CNNs can extract rich features of the object itself, they can hardly obtain global context in images. This means the CNN-based network is not a good candidate for detecting objects by utili...
In the medical field, pathological carcinoma images look much more complicated than other medical images. Identifying carcinoma pathology images is a time-consuming and error-prone task for regular doctors and even for some specialists. Nowadays, deep learning has been widely applied in medicine, which could significantly reduce the time cost and i...